DEV Community

kranthi

Mastering AWS Storage Services: S3, EBS, and EFS

1. Amazon S3 (Simple Storage Service)

• Object storage designed to store and retrieve any volume of data from anywhere on the web.
• Provides 99.999999999% durability across multiple AZs.
• Ideal for backups, data lakes, static websites, machine learning datasets, and log archives.

Amazon S3 Storage Classes

1. S3 Standard
This is the default and most commonly used storage class in Amazon S3. It's ideal for data that is frequently accessed, such as active application files, dynamic websites, and hot data analytics. It offers high availability, millisecond retrieval, and 11 nines (99.999999999%) of durability.
Security features include SSE-S3, SSE-KMS encryption, IAM policies, ACLs, and bucket policies.

2. S3 Intelligent-Tiering
Designed for data with unpredictable access patterns, this class automatically moves objects between frequent and infrequent tiers based on usage. It offers the same high availability and millisecond latency as S3 Standard and maintains 11 nines of durability.
It supports SSE-KMS encryption and features like Object Lock for compliance and immutability. Ideal when you want to optimize cost without affecting performance.

3. S3 Standard-IA (Infrequent Access)
This class is perfect for data that is not accessed frequently but must be quickly retrievable when needed, such as backups or older logs. It provides high availability, millisecond retrieval times, and 11 nines of durability.
Security options include MFA Delete, SSE-KMS encryption, and fine-grained access controls. It is more cost-effective than S3 Standard, but retrieval costs apply.

4. S3 One Zone-IA
This is a cheaper version of Standard-IA but stores data in a single Availability Zone instead of multiple ones. It’s suitable for non-critical, infrequently accessed data like secondary backups or easily reproducible data.
While it maintains millisecond access latency and 11 nines of durability within its zone, it offers lower availability, and data can be lost if that single AZ is impaired. Encryption and access features include SSE-KMS, Object Lock, and IAM controls.

5. S3 Glacier Instant Retrieval
Used for archival data that still requires instant access (e.g., medical images, financial records). It combines low-cost storage with millisecond retrieval, while still offering 11 nines durability.
You get the benefits of Object Lock, SSE-KMS, and legal hold capabilities. Ideal when long-term storage is needed, but latency cannot be compromised.

6. S3 Glacier Flexible Retrieval
Formerly known as "S3 Glacier", this class is great for long-term archives that are rarely accessed but occasionally needed. Retrieval times range from minutes to hours, and it’s cheaper than Instant Retrieval.
It provides 11 nines durability, high availability, and supports event-based restore, encryption via SSE-KMS, and audit control through logging and monitoring tools.

7. S3 Glacier Deep Archive
The lowest-cost storage option for rarely accessed data, such as compliance archives or historical records. Standard retrievals complete within 12 hours, but it still guarantees 11 nines of durability and high availability.
Used with audit logging, SSE-KMS, and lifecycle policies, it’s perfect for long-term cold storage where retrieval time is not urgent.
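The trade-offs across these seven classes can be condensed into a small decision helper. This is an illustrative sketch, not official AWS guidance: the thresholds are made up, but the return values are the real `StorageClass` identifiers the S3 API accepts.

```python
def pick_s3_storage_class(accesses_per_year: float,
                          max_wait_hours: float,
                          multi_az: bool = True) -> str:
    """Map a rough access pattern to an S3 storage class.

    Thresholds are illustrative only; the returned strings are the
    StorageClass values used by the S3 API.
    """
    if accesses_per_year >= 12:          # hot data, touched monthly or more
        return "STANDARD"
    if max_wait_hours >= 12:             # cold archive, retrieval can wait
        return "DEEP_ARCHIVE"
    if max_wait_hours >= 1:              # minutes-to-hours retrieval is fine
        return "GLACIER"                 # i.e. Glacier Flexible Retrieval
    if accesses_per_year >= 4:           # infrequent but needs ms access
        return "STANDARD_IA" if multi_az else "ONEZONE_IA"
    return "GLACIER_IR"                  # rarely read, still needs ms access
```

When the access pattern is genuinely unknown, S3 Intelligent-Tiering (`INTELLIGENT_TIERING`) sidesteps this decision entirely, as described above.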

Advanced Features:
• Lifecycle Policies: Automate transitions between classes to reduce costs.
• Versioning & MFA Delete: Protects against accidental overwrites/deletions.
• S3 Object Lock: WORM (Write Once Read Many) for regulatory compliance.
• Cross-Region Replication (CRR): For DR and latency optimization.
• Event Notifications: Trigger Lambda, SNS, or SQS for automated workflows.
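As a concrete example of a lifecycle policy, here is a minimal sketch of a configuration that tiers objects down over time and eventually expires them. The rule ID, prefix, bucket name, and day thresholds are all illustrative; the dict matches the shape boto3's `put_bucket_lifecycle_configuration` expects.

```python
import json

# Transition "logs/" objects down the storage tiers over time, then expire.
lifecycle_config = {
    "Rules": [
        {
            "ID": "archive-logs",            # hypothetical rule name
            "Filter": {"Prefix": "logs/"},   # hypothetical key prefix
            "Status": "Enabled",
            "Transitions": [
                {"Days": 30, "StorageClass": "STANDARD_IA"},
                {"Days": 90, "StorageClass": "GLACIER"},
            ],
            "Expiration": {"Days": 365},
        }
    ]
}

print(json.dumps(lifecycle_config, indent=2))

# To apply it (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="my-log-bucket",  # placeholder bucket name
#       LifecycleConfiguration=lifecycle_config)
```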

Security Best Practices:
• Enforce least-privilege access using IAM policies.
• Enable default encryption (SSE-KMS) at the bucket level.
• Block all public access unless explicitly required.
• Use S3 Access Analyzer and AWS Config rules for compliance checks.
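Two of these practices can be expressed directly as API payloads. The sketch below shows the four Public Access Block settings and a bucket policy that denies any request made over plain HTTP; the bucket name is a placeholder, and the dicts follow the shapes used by the S3 `put_public_access_block` and `put_bucket_policy` calls.

```python
import json

BUCKET = "example-secure-bucket"  # placeholder bucket name

# Block-public-access settings: all four toggles enabled.
public_access_block = {
    "BlockPublicAcls": True,
    "IgnorePublicAcls": True,
    "BlockPublicPolicy": True,
    "RestrictPublicBuckets": True,
}

# Bucket policy denying any request that is not made over TLS.
tls_only_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyInsecureTransport",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [f"arn:aws:s3:::{BUCKET}", f"arn:aws:s3:::{BUCKET}/*"],
        "Condition": {"Bool": {"aws:SecureTransport": "false"}},
    }],
}

print(json.dumps(tls_only_policy, indent=2))
```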

When to Use S3:
• Hosting static content like HTML/CSS for websites.
• Storing backups, logs, or data for ML pipelines.
• Archiving compliance data (e.g., logs, financial records).

2. Amazon EBS (Elastic Block Store)

• High-performance block storage used in conjunction with EC2.
• Provides persistent storage with low latency.
• Supports dynamic scaling and high IOPS workloads.

Amazon EBS Volume Types

1. gp3 (General Purpose SSD)
The gp3 volume is the default choice for most general-purpose workloads on AWS. It is designed for use cases like boot volumes, small databases, and development/test environments. gp3 delivers a baseline of 3,000 IOPS and 125 MB/s at any size, and can be provisioned up to 16,000 IOPS and 1,000 MB/s independently of capacity, making it more performant and cost-effective than its predecessor, gp2. It's an SSD-backed volume that supports encryption and snapshots, letting you tailor performance to workload needs without increasing capacity.
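Because gp3 decouples performance from size, the performance knobs appear as explicit parameters when creating a volume. A sketch of the parameters boto3's `create_volume` accepts, with a placeholder Availability Zone:

```python
# Parameters for a 100 GiB gp3 volume with performance raised above the
# 3,000 IOPS / 125 MB/s baseline, independent of volume size.
gp3_volume_params = {
    "AvailabilityZone": "us-east-1a",  # placeholder AZ
    "Size": 100,                       # GiB
    "VolumeType": "gp3",
    "Iops": 6000,                      # up to 16,000
    "Throughput": 500,                 # MB/s, up to 1,000
    "Encrypted": True,                 # uses the account's default KMS key
}

# To create it (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("ec2").create_volume(**gp3_volume_params)
```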

2. io2 and io2 Block Express (Provisioned IOPS SSD)
These high-performance SSD volumes are built for mission-critical applications such as large-scale relational and NoSQL databases, SAP HANA, and latency-sensitive transactional workloads. io2 supports up to 256,000 IOPS and 4,000 MB/s throughput, especially with Block Express architecture, which delivers consistent sub-millisecond latency. They offer higher durability (99.999%), multi-attach capability, and enhanced resiliency, making them ideal for enterprise-grade deployments. Encryption and snapshot support are fully integrated.

3. st1 (Throughput Optimized HDD)
st1 volumes are HDD-based and designed for high-throughput workloads such as big data, data warehouses, and log processing systems. With up to 500 MB/s throughput and 500 IOPS, they are a cost-effective option for workloads that require sequential access over random I/O. While not ideal for boot volumes or transactional databases, st1 is a good fit for large-scale data lakes or analytical platforms.

4. sc1 (Cold HDD)
The sc1 volume is optimized for infrequently accessed data, offering the lowest-cost magnetic storage on EBS. It is suitable for archival workloads, large-volume cold storage, and backups that are rarely retrieved. Performance is lower than st1, with up to 250 IOPS and 250 MB/s throughput, making it unsuitable for active use cases but valuable for minimizing costs in long-term storage scenarios.

5. Magnetic (Standard – Deprecated)
This legacy magnetic volume type is still available for older EC2 instances or legacy applications that were designed with it. It offers low performance, both in terms of throughput and IOPS, and is not recommended for new workloads unless absolutely required for compatibility reasons. AWS recommends moving to gp3 or st1 for modern applications.

Advanced Features:
• Snapshots: Point-in-time backups that can be copied across regions.
• Encryption: Fully managed using AWS KMS.
• Elastic Volumes: Modify size, IOPS, or type without downtime.
• Multi-Attach: Attach a single io1/io2 volume to up to 16 Nitro-based Linux instances in the same AZ (requires a cluster-aware file system).
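The snapshot workflow above is a two-step API call: create a snapshot in the source region, then copy it to the DR region. A minimal sketch with placeholder volume and snapshot IDs:

```python
# Step 1: point-in-time snapshot of a volume in the source region.
snapshot_params = {
    "VolumeId": "vol-0123456789abcdef0",  # placeholder volume ID
    "Description": "nightly backup",
}

# Step 2: copy the snapshot to another region for disaster recovery.
copy_params = {
    "SourceRegion": "us-east-1",
    "SourceSnapshotId": "snap-0123456789abcdef0",  # placeholder snapshot ID
    "Encrypted": True,  # copies can be (re-)encrypted in the target region
}

# With boto3 and AWS credentials:
#   import boto3
#   boto3.client("ec2", region_name="us-east-1").create_snapshot(**snapshot_params)
#   boto3.client("ec2", region_name="us-west-2").copy_snapshot(**copy_params)
```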

Security Best Practices:
• Always encrypt volumes using KMS.
• Implement automated snapshots using AWS Backup.
• Enable CloudTrail for auditing volume operations.
• Use tags for cost allocation and access control.

When to Use EBS:
• Boot volume for EC2 instances.
• Databases like MySQL, PostgreSQL, Oracle.
• Applications requiring high-throughput and low-latency.

3. Amazon EFS (Elastic File System)

• Managed NFS-based file system that auto-scales to petabytes.
• Accessible from multiple EC2s, ideal for shared workloads.
• Available in multiple AZs for HA.

EFS Performance & Throughput Modes

1. General Purpose Mode
This is the default performance mode for Amazon EFS. It's optimized for low-latency operations and is ideal for workloads like CMS platforms, developer home directories, web serving, and shared development environments. Paired with the default bursting throughput mode, it allows short-term spikes in throughput and IOPS for small to medium-sized workloads, providing a good balance between cost and performance.

2. Max I/O Mode
The Max I/O performance mode is designed for highly parallel, large-scale workloads such as big data analytics, media processing pipelines, and machine learning datasets. While it provides higher aggregate throughput and IOPS, it can introduce slightly higher latencies than General Purpose mode. It's best suited for environments where scale-out parallelism matters more than per-operation latency.
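The performance mode is fixed when the file system is created, so it surfaces as a creation parameter. A sketch of the parameters boto3's `create_file_system` expects, with a placeholder creation token:

```python
# Parameters for an encrypted EFS file system. PerformanceMode must be
# chosen at creation time: "generalPurpose" (default) or "maxIO".
efs_params = {
    "CreationToken": "demo-efs-001",   # placeholder idempotency token
    "PerformanceMode": "generalPurpose",
    "ThroughputMode": "bursting",      # or "provisioned" / "elastic"
    "Encrypted": True,                 # encryption at rest via KMS
}

# To create it (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("efs").create_file_system(**efs_params)
```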

Storage Classes:
• Standard: Primary tier for hot data.
• IA (Infrequent Access): Lower-cost tier for cold data.

Lifecycle Management:
• Automatically moves files between Standard and IA based on access patterns.
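These transitions are configured as lifecycle policies on the file system. A minimal sketch in the shape boto3's `put_lifecycle_configuration` expects, with a placeholder file-system ID:

```python
# Move files untouched for 30 days to IA; move them back to Standard
# the first time they are accessed again.
lifecycle_policies = [
    {"TransitionToIA": "AFTER_30_DAYS"},
    {"TransitionToPrimaryStorageClass": "AFTER_1_ACCESS"},
]

# With boto3 and AWS credentials:
#   import boto3
#   boto3.client("efs").put_lifecycle_configuration(
#       FileSystemId="fs-0123456789abcdef0",  # placeholder
#       LifecyclePolicies=lifecycle_policies)
```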

Security and Compliance:
• Supports encryption in transit and at rest (using KMS).
• IAM policies + VPC security groups for access control.
• POSIX file permissions for Linux workloads.

When to Use EFS:
• Shared storage for containers, Lambda, EC2.
• Hosting WordPress, MediaWiki, or similar.
• Real-time data processing or analytics.

Best Practices:
• Use lifecycle management to control costs.
• Enable logging with CloudTrail.
• Choose the right performance mode based on workload.

Conclusion
Choosing the right AWS storage service depends on workload type, access
pattern, and cost-performance tradeoffs.

• S3 is unmatched for object storage, backups, and analytics.
• EBS is critical for applications needing block-level performance and
persistence.
• EFS simplifies scalable file sharing and NFS-based workloads.

Always benchmark performance, enable encryption, and apply access control policies across all services for a secure and optimized architecture.
