S3 Storage/Data Retrieval Pricing - FinOps tips

#s3 #aws #finops #costmanagement

In this article, I will talk about S3, a complex and highly available service. As in the other articles of my FinOps series, I will share some tips for cost-effective architecture design.

Because S3 is a huge topic, today I will only cover the cost of data storage and data retrieval. The other charges (requests, data transfer) will be covered in a future article.

Monitor storage metrics with Storage Lens.

When you work with S3, you need some insight into the objects you store, for example:

The size of your buckets.
The total number of encrypted bytes.
The number of objects you have.

With Storage Lens, you can analyze your storage with ~29 metrics and interactive dashboards to aggregate data for your entire organization.

Storage Lens can also publish its metrics to CloudWatch, where you can use them to create alarms and trigger actions.

A real-world example:

Using multipart upload to back up your data to S3.

When you use S3 multipart upload (worth considering for objects of 100 MB or more) and the upload fails, the parts already uploaded stay in your bucket, and you keep paying for them until the upload is completed or aborted.

Consequences:

You will not see this data when you list the objects in your bucket, but you will see a high cost in Cost Explorer.

Solution:

Use Storage Lens to identify this issue, and publish the metrics to CloudWatch so that you can create an alarm.

Choose the right encryption strategy: do not use KMS for a bucket with a high volume of data (performance impact and high cost).

A real-world example:

You have a data lake workload and use S3 to store the data.

As part of the company's security guidelines, you need to encrypt your data at rest and in transit. A customer managed KMS key (CMK) is used to encrypt all of your S3 buckets at rest.

Consequences:

After a while, you notice that the KMS bill is very high.

Solution:

For big data workloads, I do not recommend using a CMK KMS key for encryption. Why? Because S3 has to call KMS to encrypt and decrypt objects when you access them.

For a bucket with terabytes or petabytes of data, this means a performance impact and a high KMS cost.
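To get a feel for the numbers, here is a rough back-of-the-envelope estimate. The price of $0.03 per 10,000 KMS requests and the request volume are assumptions for illustration, so check the current KMS pricing for your region. The function name is hypothetical.

```python
def monthly_kms_request_cost(requests_per_month, price_per_10k_usd=0.03):
    """Estimate the monthly KMS request charge in USD.

    price_per_10k_usd is an assumed list price, not a figure from this article.
    """
    return round(requests_per_month / 10_000 * price_per_10k_usd, 2)


# Hypothetical data lake: 1 billion object reads/writes per month,
# each one assumed to trigger one KMS request.
cost = monthly_kms_request_cost(1_000_000_000)
print(f"Estimated KMS request cost: ${cost:,.2f}/month")
```

The estimate scales linearly with the number of requests, which is why read-heavy analytics workloads are the ones that feel it most.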

From my experience, with a very large S3 bucket, we saved $6K/month by switching the encryption from SSE-KMS to SSE-S3.

If your security guidelines do not allow that, consider using S3 Bucket Keys for SSE-KMS, which reduce the number of requests S3 makes to KMS.
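As a sketch of what this looks like in practice, the helper below builds a default-encryption configuration with the bucket key turned on, in the shape expected by boto3's `put_bucket_encryption`. The key ARN and bucket name are placeholders, and the helper names are made up for this example.

```python
def bucket_key_encryption_config(kms_key_arn):
    """Build a default-encryption config using SSE-KMS with an S3 Bucket Key."""
    return {
        "Rules": [
            {
                "ApplyServerSideEncryptionByDefault": {
                    "SSEAlgorithm": "aws:kms",
                    "KMSMasterKeyID": kms_key_arn,
                },
                # Lets S3 reuse a bucket-level key instead of calling KMS per object.
                "BucketKeyEnabled": True,
            }
        ]
    }


def apply_bucket_key(bucket, kms_key_arn):
    """Apply the config to a bucket with boto3 (needs AWS credentials)."""
    import boto3  # third-party; imported lazily so the builder runs without it

    boto3.client("s3").put_bucket_encryption(
        Bucket=bucket,
        ServerSideEncryptionConfiguration=bucket_key_encryption_config(kms_key_arn),
    )


# Placeholder ARN, for illustration only.
config = bucket_key_encryption_config(
    "arn:aws:kms:eu-west-1:111122223333:key/EXAMPLE"
)
print(config["Rules"][0]["BucketKeyEnabled"])
```

Keep in mind that default encryption settings apply to newly written objects, so objects already in the bucket keep their existing encryption until they are rewritten.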

Prefer the Intelligent-Tiering storage class, depending on your retrieval performance requirements.

As you can see in the animation, with Intelligent-Tiering you do not pay to retrieve data, except in expedited mode.

If your use case has no specific performance requirements and you don't know how your workload will evolve, Intelligent-Tiering is the right class to start with.
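For new objects, the storage class can be chosen at upload time. Here is a minimal sketch, assuming boto3 and placeholder file, bucket, and key names. The helper names are illustrative.

```python
def intelligent_tiering_args(extra_args=None):
    """Return upload ExtraArgs that place the object in S3 Intelligent-Tiering."""
    args = dict(extra_args or {})
    args["StorageClass"] = "INTELLIGENT_TIERING"
    return args


def upload_to_intelligent_tiering(path, bucket, key):
    """Upload a local file with boto3 (needs AWS credentials)."""
    import boto3  # third-party; imported lazily so the helper above runs without it

    boto3.client("s3").upload_file(
        path, bucket, key, ExtraArgs=intelligent_tiering_args()
    )


print(intelligent_tiering_args({"ContentType": "application/json"}))
```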