Salma Aga Shaik
AWS S3 Storage Classes (Start to End)

1) What is Amazon S3?

Amazon S3 (Simple Storage Service) is an AWS service used to store files like images, videos, logs, backups, datasets, and reports as objects inside buckets.

  • Bucket = main container (like a top-level folder)
  • Object = the actual file (data + metadata)

S3 is widely used for:

  • Data lakes
  • Backups and disaster recovery
  • Application logs
  • Static website files
  • Analytics and machine learning datasets
  • Long-term archiving and compliance

2) Why does S3 have multiple storage classes?

Not all data is used in the same way:

  • Some data is used daily (hot data)
  • Some data is used sometimes (cold data)
  • Some data is almost never used (archive data)

So AWS provides different S3 storage classes to help you balance:

  • Cost – how much you pay for storage
  • Speed – how fast you can read data
  • Availability – how often data is accessible
  • Risk – multi-AZ vs single-AZ
  • Retrieval fee – extra cost when you download data in some classes

3) Key Terms

  • Durability – how safe your data is from being lost. Even if disks fail, AWS still keeps your file safe.
  • 11 nines durability (99.999999999%) – extremely high safety: your data is almost never lost.
  • Availability – how often data is accessible. 99.99% means very little downtime.
  • Latency – how fast you can access data. Milliseconds = very fast.
  • Throughput – how much data can be read/written per second. Important for big analytics jobs.
  • Retrieval fee – extra cost when you download data. Some classes charge when you read.
  • Availability Zone (AZ) – one or more data centers inside a region. Multi-AZ is safer than single-AZ.
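To get a feel for what "11 nines" means in practice, here is a quick sketch of the arithmetic, using AWS's own illustrative figure of 10,000,000 stored objects (the numbers below are an illustration of the published durability design target, not a guarantee):

```python
# Rough illustration of what 11 nines of durability means.
durability = 0.99999999999          # 99.999999999% ("11 nines")
annual_loss_rate = 1 - durability   # chance a given object is lost in a year

objects_stored = 10_000_000         # AWS's own example figure
expected_losses_per_year = objects_stored * annual_loss_rate
years_per_single_loss = 1 / expected_losses_per_year

print(f"Expected losses per year: {expected_losses_per_year:g}")
print(f"On average, one object lost every {years_per_single_loss:,.0f} years")
```

This works out to an expected loss of about one object every 10,000 years, which is why the table above says "almost never lost".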

4) The 8 S3 Storage Classes

4.1 S3 Standard – Hot Data

  • Used for frequently accessed and business-critical data.

Key Features:

  • Very fast access (milliseconds): Suitable for real-time applications and user-facing systems.
  • High availability: Designed to be available almost all the time for applications.
  • Multi-AZ durability: Data is safely stored across multiple data centers.
  • No retrieval fee: You don’t pay extra when reading or downloading data.

Use Cases:

  • Website images and videos served to users
  • Daily application logs used by engineers
  • Active analytics datasets queried many times per day
  • Frequently used ML training and inference data

Example: Today’s sales data used every hour → S3 Standard

Remember: Standard = Hot + Fast


4.2 S3 Intelligent-Tiering – AWS Decides Automatically

  • For data where you don’t know how often it will be accessed.

Key Features:

  • Automatic movement between tiers: AWS moves objects to cheaper tiers when access reduces.
  • No performance impact: Applications access data the same way.
  • Small monitoring fee: Charged for AWS to track access patterns.

Use Cases

  • Data lakes where new data is hot and old data becomes cold
  • ML datasets where some features are used more than others
  • Analytics history that changes in access frequency

Example: Some months of logs are queried often, others not → Intelligent-Tiering

Remember: Intelligent = “I don’t know access pattern”


4.3 S3 Standard-IA – Cold but Fast

  • For data accessed rarely, but must be accessed immediately when needed.

Key Features:

  • Lower storage cost than Standard: Helps save money for infrequently used data.
  • Fast access: Still milliseconds when you retrieve data.
  • Retrieval fee applies: Extra cost when you download data.
  • Multi-AZ durability: Safe across multiple data centers.

Use Cases:

  • Backups used only during failures
  • Disaster recovery data
  • Old reports accessed occasionally

Example: Weekly backups restored only during failure → Standard-IA

Remember: IA = Rare, but fast


4.4 S3 One Zone-IA – Cheaper but Risky

  • Same as Standard-IA, but stored in one Availability Zone only.

Key Features:

  • Cheaper than Standard-IA: Cost saving for non-critical data.
  • Single AZ risk: If that AZ goes down, data can be unavailable.
  • Fast access: Still millisecond latency.
  • Retrieval fee applies.

Use Cases:

  • Re-creatable ETL outputs
  • Temporary pipeline files
  • Secondary backups

Example: Temporary pipeline files → One Zone-IA

Remember: One Zone = Cheap + Risk


4.5 S3 Glacier Instant Retrieval – Archive + Fast

  • S3 Glacier Instant Retrieval is a storage class for archived data that is rarely accessed, but when you need it, you can open it immediately. It is mainly used for long-term storage where data is kept for compliance or record-keeping, but still needs instant access sometimes.

Key Features

  • Very low storage cost
  • Instant (milliseconds) access
  • Retrieval fee applies
  • Multi-AZ durability

Use Cases:

  • Compliance documents that must open quickly
  • Audit logs needed during investigations

Example: Legal docs opened only during audits → Glacier Instant

Remember: Glacier Instant = Archive + Fast


4.6 S3 Glacier Flexible Retrieval – Archive + Wait

  • S3 Glacier Flexible Retrieval is used for archived data that is almost never accessed, and when it is accessed, you are okay to wait some time before getting the data back. This class is mainly for long-term backups and historical data.

Key Features:

  • Very low cost for long-term storage
  • Multiple retrieval speeds: expedited, standard, bulk
  • Suitable for large archive restores

Use Cases:

  • Old backups
  • Historical logs

Remember: Flexible = Waiting is okay


4.7 S3 Glacier Deep Archive – Cheapest + Slowest

  • S3 Glacier Deep Archive is the lowest-cost storage class in Amazon S3. It is used for data that must be kept for many years and is almost never accessed. This is mainly for legal, regulatory, and compliance requirements.

Key Features:

  • Cheapest storage class
  • Retrieval time 12–48 hours
  • Best for compliance and legal retention

Use Cases:

  • Financial records
  • Government data

Remember: Deep Archive = Coldest + Slowest + Cheapest


4.8 S3 Express One Zone – Extra Fast, Single AZ

S3 Express One Zone is a storage class designed for very high-performance workloads. It is used when applications need very low latency and very high request rates for reading and writing data. Data is stored in only one Availability Zone, so it is faster but less resilient compared to multi-AZ classes.

Key Features:

  • Ultra-fast performance for request-heavy workloads
  • High throughput for many small reads/writes
  • Stored in one AZ only (less resilient)

Use Cases:

  • Real-time analytics
  • ML feature stores
  • Hot ETL intermediate data

Example: Pipeline reading millions of small files → Express One Zone

Remember: Express = Extra fast, One Zone = Single AZ


5) Comparison Table for All 8 S3 Storage Classes

Each entry below lists: access pattern; retrieval speed; storage cost; extra cost; availability/risk; best for.

  • S3 Standard – frequently accessed; milliseconds; high cost; no extra fee; Multi-AZ, very safe; hot data, websites, active logs
  • S3 Intelligent-Tiering – unknown / changing access; milliseconds; medium cost; monitoring fee; Multi-AZ; unpredictable workloads
  • S3 Standard-IA – infrequent but needs fast access; milliseconds; lower cost; retrieval fee; Multi-AZ; backups, DR
  • S3 One Zone-IA – infrequent, non-critical; milliseconds; cheaper; retrieval fee; single-AZ risk; re-creatable data
  • S3 Glacier Instant Retrieval – rare but needs instant access; milliseconds; very low cost; retrieval fee; Multi-AZ; compliance archives
  • S3 Glacier Flexible Retrieval – very rare access; minutes to hours; very low cost; retrieval fee; Multi-AZ; old backups, logs
  • S3 Glacier Deep Archive – almost never accessed; 12–48 hours; lowest cost; retrieval fee; Multi-AZ; legal and long-term records
  • S3 Express One Zone – very frequent, high-performance; ultra-fast; higher cost; request-based pricing; single AZ; high-performance analytics, ML
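When you upload an object through the S3 API, the class is chosen with the StorageClass parameter. The right-hand values below are the identifiers the S3 API expects for each class; the helper function, bucket, and key names are illustrative:

```python
# Mapping from the classes above to the StorageClass identifiers
# used by the S3 API (e.g. the PutObject operation).
STORAGE_CLASS_IDS = {
    "S3 Standard": "STANDARD",
    "S3 Intelligent-Tiering": "INTELLIGENT_TIERING",
    "S3 Standard-IA": "STANDARD_IA",
    "S3 One Zone-IA": "ONEZONE_IA",
    "S3 Glacier Instant Retrieval": "GLACIER_IR",
    "S3 Glacier Flexible Retrieval": "GLACIER",
    "S3 Glacier Deep Archive": "DEEP_ARCHIVE",
    "S3 Express One Zone": "EXPRESS_ONEZONE",
}

def put_object_args(bucket: str, key: str, storage_class: str) -> dict:
    """Build keyword arguments for an S3 PutObject call (sketch only)."""
    return {
        "Bucket": bucket,
        "Key": key,
        "StorageClass": STORAGE_CLASS_IDS[storage_class],
    }

# Example: upload a weekly backup as Standard-IA (hypothetical bucket/key).
args = put_object_args("company-data-bucket", "backups/week-07.tar.gz", "S3 Standard-IA")
print(args["StorageClass"])  # STANDARD_IA
```

With boto3 these arguments would be passed along with the file body, e.g. `s3.put_object(Body=data, **args)`.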

6) How to Choose Quickly

Ask yourself these 3 simple questions:

i) How often will the data be accessed?

  • Daily or many times a day → S3 Standard
  • Not sure / changes over time → S3 Intelligent-Tiering
  • Rarely → Use IA or Glacier classes

ii) When needed, how fast must I get the data?

  • Instant (milliseconds) → Standard, Standard-IA, Glacier Instant
  • Can wait minutes or hours → Glacier Flexible
  • Can wait 1–2 days → Glacier Deep Archive

iii) Is the data critical or can it be recreated?

  • Critical data → Choose multi-AZ classes
  • Non-critical or re-creatable data → Choose single-AZ classes
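The three questions above can be sketched as a small decision function. The logic is a simplification of the discussion in this post, not an official AWS rule (for instance, it folds the Glacier Instant case into Standard-IA when instant access is required):

```python
def choose_storage_class(access: str, max_wait: str, critical: bool) -> str:
    """Pick an S3 storage class from the three questions in the text.

    access:   "frequent", "unknown", or "rare"
    max_wait: "instant", "hours", or "days" (acceptable retrieval delay)
    critical: False means the data can be recreated if lost
    """
    if access == "frequent":
        return "S3 Standard"
    if access == "unknown":
        return "S3 Intelligent-Tiering"
    # Rare access: decide by how long you can wait for retrieval.
    if max_wait == "days":
        return "S3 Glacier Deep Archive"
    if max_wait == "hours":
        return "S3 Glacier Flexible Retrieval"
    # Instant access still needed: IA classes; single-AZ only if re-creatable.
    return "S3 Standard-IA" if critical else "S3 One Zone-IA"

print(choose_storage_class("rare", "instant", critical=False))  # S3 One Zone-IA
```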

Quick Mapping Table

  • App serving images every second → S3 Standard
  • Logs with changing access patterns → S3 Intelligent-Tiering
  • Weekly backups → S3 Standard-IA
  • Temporary ETL output → S3 One Zone-IA
  • Compliance docs needing instant access → S3 Glacier Instant Retrieval
  • Large archive restores → S3 Glacier Flexible Retrieval
  • 10-year legal retention → S3 Glacier Deep Archive
  • High-performance ML feature reads → S3 Express One Zone

7) How to Remember

  • Hot → Standard
  • Unknown → Intelligent
  • Cold → IA
  • Very Cold → Glacier
  • Coldest → Deep Archive
  • Ultra-fast hot data → Express One Zone

8) What is Amazon S3 and What is a Bucket?

Amazon S3 (Simple Storage Service) is a cloud storage service provided by AWS. It is used to store files and data such as images, videos, logs, backups, datasets, and documents.

An Amazon S3 bucket is the main container where all your files (objects) are stored. You cannot upload a file directly to S3 without a bucket. Every file must be inside a bucket.

  • Bucket is like a main folder
  • Object is like a file inside the folder

Example: You create a bucket named company-data-bucket.
Inside this bucket, you store:

  • logs/app-logs-2026.json
  • reports/sales-jan.csv
  • images/profile.png

Here, company-data-bucket is the bucket, and each file is an object.


9) Basic Structure of Amazon S3

  • Bucket – the top-level container. Example: company-analytics-bucket
  • Object – the actual file stored. Example: 2026/jan/sales.csv
  • Key – the full path of the file inside the bucket. Example: 2026/jan/sales.csv
  • Region – the AWS location where the bucket lives. Example: us-east-1, ap-south-1

Important points:

  • Each bucket belongs to one AWS region
  • Your data is physically stored in that region
  • You can access the bucket from anywhere if permissions allow it

10) Why Do We Need Amazon S3 Buckets?

Amazon S3 buckets are used to store and manage almost all types of data in the cloud.

Common real-world use cases:

  • Data lakes – store raw data, logs, CSV, JSON, and Parquet files

  • Backups – store database backups, server backups, and application backups

  • Application files – store images, videos, and documents used by web and mobile apps

  • Analytics and big data – store data for Athena, Glue, EMR, and Redshift Spectrum

  • Static website hosting – store HTML, CSS, and JavaScript files for static websites

In short, Amazon S3 buckets are the foundation of data storage in AWS.


11) Amazon S3 Bucket Naming Rules

S3 bucket names follow strict global rules. These rules exist because bucket names are used in URLs and must work with the internet DNS system.

Rule 1: Globally Unique Name

Every bucket name must be globally unique across all AWS accounts and regions. If someone else has already created a bucket with a name, you cannot use that name.

Example:

  • mybucket may already be taken
  • mycompany-analytics-2026 is more likely to be available

Rule 2: Length Rules

Bucket name length must be between 3 and 63 characters.


Rule 3: Allowed Characters

  • You can use only: lowercase letters (a–z), numbers (0–9), hyphens, and dots

  • You cannot use: uppercase letters, underscores, spaces, or special characters

Valid examples: my-data-bucket, company.logs.backup, analytics2026

Invalid examples:

  • MyBucket
  • my_bucket
  • my bucket

Rule 4: Start and End with Letter or Number

Bucket name must start and end with a letter or number. It should not start or end with a hyphen or dot.


Rule 5: No IP Address Format

Bucket names cannot look like an IP address such as 192.168.1.1. This is because bucket names are used in URLs.
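Rules 2–5 can be checked with a short validation function. This is a sketch: it covers the syntactic rules above, but it cannot check Rule 1 (global uniqueness), which only AWS can verify at bucket-creation time, and the real service has a few additional restrictions (for example on certain reserved prefixes):

```python
import re

# Rules 3 and 4: lowercase letters, digits, hyphens, and dots only;
# must start and end with a letter or digit.
_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9.-]*[a-z0-9]$")
# Rule 5: four dot-separated number groups, i.e. it looks like an IP address.
_IP_RE = re.compile(r"^\d{1,3}(\.\d{1,3}){3}$")

def is_valid_bucket_name(name: str) -> bool:
    """Check S3 bucket naming rules 2-5 (uniqueness is checked by AWS)."""
    if not (3 <= len(name) <= 63):        # Rule 2: length
        return False
    if not _NAME_RE.fullmatch(name):      # Rules 3 and 4
        return False
    if _IP_RE.fullmatch(name):            # Rule 5: no IP-address form
        return False
    return True

for candidate in ["my-data-bucket", "MyBucket", "my_bucket", "192.168.1.1"]:
    print(candidate, is_valid_bucket_name(candidate))
```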


12) Why These Rules Exist

Amazon S3 buckets are accessed using web URLs like:

https://my-data-bucket.s3.amazonaws.com/file.csv

To make sure these URLs work correctly with:

  • internet routing, DNS system, SSL certificates

AWS enforces strict bucket naming rules.
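To see why DNS compatibility matters, here is a sketch of how a virtual-hosted-style S3 URL is formed: the bucket name becomes a DNS subdomain, which is exactly why it must follow the naming rules above (the bucket, key, and region values are illustrative):

```python
def object_url(bucket: str, key: str, region: str = "us-east-1") -> str:
    """Build a virtual-hosted-style S3 URL: the bucket name becomes a
    subdomain, so it must be a valid DNS label."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{key}"

print(object_url("my-data-bucket", "file.csv"))
# https://my-data-bucket.s3.us-east-1.amazonaws.com/file.csv
```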


13) Important Features of Amazon S3 Buckets

Region

When you create a bucket, you select a region. Your data stays in that region. This helps with low latency, cost control, and legal compliance.


Access Control

By default, buckets are private. You control access using:

  • IAM users and roles, Bucket policies

Public access is usually used only for public website content.


Versioning

Versioning keeps multiple versions of the same file. If someone overwrites or deletes a file, older versions are still stored. This helps with data recovery and mistake protection.


Encryption

Amazon S3 supports encryption to protect your data. Data can be encrypted:

  • at rest, in transit

Encryption is important for security and compliance requirements.


Lifecycle Rules

Lifecycle rules help you automate storage management. You can move old data to cheaper storage classes or delete data after a fixed time. This helps reduce storage cost automatically.
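As an illustration, a lifecycle configuration like the one below moves objects to Standard-IA after 30 days, to Glacier Flexible Retrieval after 90 days, and deletes them after roughly 10 years. The rule ID, bucket, and "logs/" prefix are made-up examples; the dictionary shape follows the structure boto3's put_bucket_lifecycle_configuration expects:

```python
# Sketch of a lifecycle configuration; names and day counts are illustrative.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-old-logs",
            "Status": "Enabled",
            "Filter": {"Prefix": "logs/"},
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},  # cold after a month
                {"Days": 90, "StorageClass": "GLACIER"},      # archive after a quarter
            ],
            "Expiration": {"Days": 3650},                     # delete after ~10 years
        }
    ]
}

# With boto3 (not run here):
# s3 = boto3.client("s3")
# s3.put_bucket_lifecycle_configuration(
#     Bucket="company-raw-logs", LifecycleConfiguration=lifecycle_config)
print(lifecycle_config["Rules"][0]["Transitions"][0]["StorageClass"])  # STANDARD_IA
```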


14) Real-Life Example from Data Engineering

In a real data engineering project:

  • New logs come every day
  • Old logs are accessed rarely
  • Compliance rules require keeping data for many years

You may create different buckets:

  • company-raw-logs for daily logs
  • company-processed-data for transformed data
  • company-archive-data for long-term storage

Lifecycle rules can move old files automatically to cheaper storage classes.


15) How to Remember Amazon S3 Bucket Rules

Use the word BUCKET as a memory trick:

  • B means Bucket is the main container
  • U means Unique globally
  • C means Characters allowed are lowercase letters, numbers, hyphens, and dots
  • K means Keep name length between 3 and 63
  • E means End with a letter or number
  • T means Tied to one AWS region
