S3 Fundamentals
1. Amazon S3 (Simple Storage Service) is an object storage service offering industry-leading scalability, data availability, security, and performance.
2. S3 Storage Classes:
Storage Class | Designed For | Availability | Min Storage Duration | Retrieval Fee | Use Cases |
---|---|---|---|---|---|
S3 Standard | Frequently accessed data | 99.99% | None | None | Big data analytics, content distribution |
S3 Intelligent-Tiering | Data with unknown or changing access patterns | 99.9% | None | None | Long-lived data with unpredictable access patterns |
S3 Standard-IA | Infrequently accessed data | 99.9% | 30 days | Per GB | Backups, disaster recovery |
S3 One Zone-IA | Infrequently accessed, non-critical data | 99.5% (single AZ) | 30 days | Per GB | Secondary backups, easily recreatable data |
S3 Glacier Instant Retrieval | Archive data needing immediate access | 99.9% | 90 days | Per GB | Media archives, healthcare records |
S3 Glacier Flexible Retrieval | Archive data that rarely needs access | 99.99% | 90 days | Per GB + retrieval | Digital preservation, compliance archives |
S3 Glacier Deep Archive | Long-term archive | 99.99% | 180 days | Per GB + retrieval | Financial records, healthcare data |
S3 Outposts | On-premises S3 storage | Varies | None | None | Local data processing with S3 compatibility |
3. S3 Bucket Naming Rules: Globally unique, 3-63 characters, lowercase letters, numbers, hyphens, no IP-address format, must begin and end with a letter or number.
4. S3 Object Properties: Key (name), Value (data), Version ID, Metadata, Subresources (such as ACLs; the legacy Torrent subresource is deprecated).
5. S3 Object Size Limits: Individual objects can be from 0 bytes to 5 TB; objects larger than 5 GB must use multipart upload.
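The multipart requirement above is easiest to meet through boto3's transfer manager, which switches to multipart automatically past a threshold. A minimal sketch, assuming boto3 is installed; the bucket and key names are placeholders:

```python
def upload_large_file(path, bucket, key):
    """Upload a file, letting boto3 split it into parallel parts past 100 MB."""
    import boto3  # imported lazily so the pure helper below works without boto3
    from boto3.s3.transfer import TransferConfig

    config = TransferConfig(
        multipart_threshold=100 * 1024 * 1024,  # use multipart above 100 MB
        multipart_chunksize=100 * 1024 * 1024,  # 100 MB parts
        max_concurrency=10,                     # parts uploaded in parallel
    )
    boto3.client("s3").upload_file(path, bucket, key, Config=config)

def part_count(object_size, part_size):
    """How many parts a multipart upload would use (limit: 10,000 parts)."""
    return -(-object_size // part_size)  # ceiling division
```

With 100 MB parts, a 5 GB object uses 50 parts, comfortably under the 10,000-part ceiling.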
S3 Performance and Optimization
6. S3 Performance: S3 automatically scales to high request rates, with first-byte latency typically in the 100-200 ms range.
7. S3 Request Rates:
- 3,500 PUT/COPY/POST/DELETE requests per second per prefix
- 5,500 GET/HEAD requests per second per prefix
8. S3 Performance Optimization Techniques:
Technique | Description | Best For |
---|---|---|
Prefix Parallelization | Use multiple prefixes to increase throughput | High-throughput applications |
Multipart Upload | Split large objects into parts for parallel upload | Objects > 100 MB |
S3 Transfer Acceleration | Fast transfer over long distances using CloudFront | Global data transfers |
Byte-Range Fetches | Parallel downloads of specific byte ranges | Large file partial access |
S3 Select | Server-side filtering to reduce data transfer | Analytics on subset of data |
S3 Inventory | Scheduled flat-file output of objects and metadata | Large bucket management |
Partitioning Strategy | Randomized prefixes to distribute load | Very high throughput needs |
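The Prefix Parallelization and Partitioning Strategy rows above can be sketched as a small key-naming helper; the hash-prefix scheme and prefix count here are illustrative, not an AWS-prescribed layout:

```python
import hashlib

def partitioned_key(filename, n_prefixes=16):
    """Spread object keys across n_prefixes hash-derived prefixes.

    Each distinct prefix gets its own per-prefix request budget
    (3,500 writes / 5,500 reads per second), so 16 prefixes raise the
    aggregate ceiling roughly 16x.
    """
    slot = int(hashlib.md5(filename.encode()).hexdigest(), 16) % n_prefixes
    return f"{slot:02x}/{filename}"
```

The hash makes placement deterministic, so readers can recompute the same key from the filename alone.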
9. Multipart Upload Calculation Example:
- 5 GB file with 100 MB parts = 50 parts uploaded in parallel
- With a 500 Mbps connection: ~80 seconds when the parallel parts saturate the link, versus ~400 seconds for a single-stream upload that reaches only a fraction of link capacity
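The arithmetic behind those figures can be checked directly; the 20% single-stream link utilization below is an assumption chosen to reproduce the ~400 s number, not an AWS-published value:

```python
def transfer_seconds(size_bytes, link_mbps, utilization=1.0):
    """Ideal transfer time over a link of link_mbps; `utilization` models
    what fraction of the link the transfer actually achieves."""
    return size_bytes * 8 / (link_mbps * 1_000_000 * utilization)

# 50 parallel parts can saturate a 500 Mbps link: ~80 s for 5 GB.
parallel = transfer_seconds(5_000_000_000, 500)

# A single stream often achieves only a fraction of capacity; at an assumed
# 20% utilization the same upload takes ~400 s.
single = transfer_seconds(5_000_000_000, 500, utilization=0.2)
```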
10. S3 Transfer Acceleration: Uses CloudFront's globally distributed edge locations to accelerate uploads to S3 by 50-500% for long-distance transfers.
S3 Data Management
11. S3 Lifecycle Policies automate transitioning objects between storage classes or expiring objects based on age.
12. S3 Lifecycle Transitions:
From | To | Minimum Days |
---|---|---|
Standard | Standard-IA | 30 days |
Standard | Intelligent-Tiering | None |
Standard | One Zone-IA | 30 days |
Standard | Glacier Instant Retrieval | 30 days |
Standard | Glacier Flexible Retrieval | 30 days |
Standard | Glacier Deep Archive | 90 days |
Standard-IA | Glacier Instant Retrieval | 30 days |
Standard-IA | Glacier Flexible Retrieval | 30 days |
Standard-IA | Glacier Deep Archive | 90 days |
Intelligent-Tiering | Glacier Instant Retrieval | 90 days |
Intelligent-Tiering | Glacier Flexible Retrieval | 90 days |
Intelligent-Tiering | Glacier Deep Archive | 180 days |
13. S3 Versioning keeps multiple variants of objects in the same bucket, allowing recovery from accidental deletions or overwrites.
14. S3 Replication:
- Cross-Region Replication (CRR): Replicate objects across regions for compliance, lower latency, or disaster recovery
- Same-Region Replication (SRR): Replicate objects within the same region for log aggregation or production/test sync
- Replication requires versioning enabled on both source and destination buckets
15. S3 Batch Operations perform bulk operations on existing S3 objects with a single request, such as copying objects, setting ACLs, or restoring from Glacier.
S3 Security and Access Control
16. S3 Security Features:
Feature | Description | Use Case |
---|---|---|
IAM Policies | Identity-based policies | User/role access control |
Bucket Policies | Resource-based policies | Cross-account access |
ACLs | Legacy access control | Simple permission grants |
Presigned URLs | Temporary access to objects | Temporary download/upload |
VPC Endpoints | Private connection from VPC | No internet access needed |
Access Points | Named network endpoints | Simplified access management |
Object Lock | WORM (Write Once Read Many) | Compliance requirements |
S3 Block Public Access | Prevent public access | Data protection |
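Of the features above, presigned URLs are the one most often generated in application code. A hedged sketch using boto3's `generate_presigned_url`; the bucket and key are placeholders, and the URL inherits the permissions of the credentials that signed it:

```python
def make_download_url(s3_client, bucket, key, expires=3600):
    """Presigned GET URL: anyone holding it can fetch the object until
    the URL expires."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires,
    )

# e.g. make_download_url(boto3.client("s3"), "my-bucket", "reports/q1.csv")
```

Passing the client in (rather than constructing it inside) keeps the helper easy to exercise without AWS credentials.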
17. S3 Encryption Options:
Encryption Type | Description | Key Management |
---|---|---|
SSE-S3 | Server-side encryption with S3-managed keys | AWS manages keys |
SSE-KMS | Server-side encryption with KMS keys | Customer controls via KMS |
SSE-C | Server-side encryption with customer-provided keys | Customer provides keys |
Client-side encryption | Encryption before uploading to S3 | Customer manages keys |
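Requesting SSE-KMS per object comes down to two extra `put_object` parameters. A minimal sketch; the bucket, key, and KMS key ID are placeholders:

```python
def kms_put_params(bucket, key, body, kms_key_id=None):
    """Build put_object kwargs for SSE-KMS; omit kms_key_id to fall back
    to the account's default aws/s3 KMS key."""
    params = {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
    }
    if kms_key_id:
        params["SSEKMSKeyId"] = kms_key_id
    return params

# Usage (requires credentials):
#   boto3.client("s3").put_object(**kms_put_params("my-bucket", "a.txt", b"data"))
```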
18. S3 Default Encryption is enabled automatically for all new buckets with SSE-S3 (AES-256).
19. S3 Object Lock provides WORM (Write Once Read Many) model with two retention modes:
- Governance mode: Users with special permissions can override
- Compliance mode: No one can override during retention period, including AWS account root user
S3 Data Processing and Analytics
20. S3 Select and Glacier Select allow you to use SQL expressions to retrieve only a subset of data from an object, reducing data transfer and improving query performance by up to 400%.
21. S3 Event Notifications can trigger workflows when objects are created, deleted, or restored:
- Destinations: SNS, SQS, Lambda
- Event types: ObjectCreated, ObjectRemoved, ObjectRestore, Replication, LifecycleExpiration
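A Lambda destination receives the notification as a JSON event with a `Records` list; note that object keys arrive URL-encoded. A minimal handler sketch:

```python
import urllib.parse

def lambda_handler(event, context):
    """Handle an S3 ObjectCreated notification (minimal sketch)."""
    processed = []
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        # Keys are URL-encoded in the event payload (spaces become '+'),
        # so decode before using them in further S3 calls.
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        processed.append((bucket, key))
    return processed
```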
22. S3 Inventory provides scheduled reports of objects and metadata, useful for business, compliance, and regulatory needs.
23. S3 Analytics helps analyze storage access patterns to decide when to transition objects to appropriate storage class.
24. S3 Storage Lens provides organization-wide visibility into object storage usage and activity with customizable dashboards.
S3 Data Transfer and Integration
25. AWS DataSync provides high-speed data transfer between on-premises storage and S3 with automatic encryption and data validation.
26. AWS Transfer Family provides SFTP, FTPS, and FTP access to S3, enabling file transfers over these protocols directly to and from S3 buckets.
27. S3 Integration with AWS Services:
AWS Service | Integration with S3 |
---|---|
AWS Glue | Catalog and ETL for data in S3 |
Amazon Athena | SQL queries directly on S3 data |
Amazon EMR | Big data processing on S3 data |
AWS Lambda | Process S3 events |
Amazon QuickSight | Visualize data stored in S3 |
AWS Lake Formation | Build, secure, and manage data lakes |
Amazon Redshift | Query data in S3 with Redshift Spectrum |
Amazon SageMaker | ML model training with S3 data |
AWS Backup | Centralized backup of S3 data |
28. S3 Data Ingestion Patterns:
Pattern | Description | Best For |
---|---|---|
Direct API | Applications use S3 API directly | Simple workflows |
S3 Transfer Acceleration | Fast long-distance uploads | Global data sources |
Kinesis Data Firehose | Streaming data delivery to S3 | Real-time data capture |
AWS DataSync | Scheduled transfers from on-premises | Large dataset migration |
AWS Snowball/Snowmobile | Physical data transfer | Petabyte-scale transfers |
AWS DMS | Database migration to S3 | Database archiving |
S3 Cost Management
29. S3 Pricing Components:
- Storage pricing (per GB-month)
- Request pricing (per 1,000 requests)
- Data transfer pricing (per GB)
- Management features and analytics
- Retrieval fees (for IA and Glacier classes)
30. S3 Cost Optimization Strategies:
Strategy | Description | Savings Potential |
---|---|---|
Storage Class Analysis | Identify optimal storage class | 20-50% |
Lifecycle Policies | Automate transitions and expirations | 30-70% |
S3 Intelligent-Tiering | Automatic tiering based on access patterns | 15-40% |
S3 Storage Lens | Identify cost optimization opportunities | Varies |
S3 Inventory | Identify objects for cleanup | Varies |
S3 Batch Operations | Bulk delete unused objects | Varies |
S3 Same-Region Replication | Replicate only necessary data | Varies |
31. S3 Storage Cost Calculation Example:
- 100 TB in S3 Standard: ~$2,300/month
- Same data with 80% in S3 Standard-IA: ~$1,500/month (35% savings)
- With lifecycle policy moving 50% to Glacier after 90 days: ~$1,000/month (56% savings)
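The savings above can be reproduced with the approximate per-GB prices from the cost table later in these notes. The split in the third scenario (20% Standard / 30% Standard-IA / 50% Glacier Flexible) is an assumption chosen to match the quoted figure:

```python
PRICES = {  # approximate $/GB-month
    "standard": 0.023,
    "standard_ia": 0.0125,
    "glacier_flexible": 0.0036,
}

def monthly_cost(gb_by_class):
    """Monthly storage cost for a {storage_class: GB} allocation."""
    return sum(PRICES[cls] * gb for cls, gb in gb_by_class.items())

TB = 1024  # GB per TB
all_standard = monthly_cost({"standard": 100 * TB})                      # ~$2,355
mostly_ia = monthly_cost({"standard": 20 * TB, "standard_ia": 80 * TB})  # ~$1,495
with_glacier = monthly_cost({
    "standard": 20 * TB, "standard_ia": 30 * TB, "glacier_flexible": 50 * TB,
})                                                                       # ~$1,039
```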
S3 Limits and Quotas
32. S3 Service Limits:
Limit | Value | Can be increased? |
---|---|---|
Maximum buckets per account | 100 | Yes (service quota) |
Maximum object size | 5 TB | No |
Maximum object size (console upload) | 160 GB | No |
Maximum parts in multipart upload | 10,000 | No |
Minimum part size (except last part) | 5 MB | No |
Maximum part size | 5 GB | No |
Maximum bucket policy size | 20 KB | No |
Maximum number of access points per Region | 10,000 | Yes |
Maximum lifecycle rules per bucket | 1,000 | No |
Maximum tags per object | 10 | No |
33. S3 Rate Limits and Throttling:
- Default limits can handle extremely high request rates
- S3 automatically scales to accommodate sustained request rates
- For workloads that exceed the per-prefix limits (3,500 writes or 5,500 reads per second), consider prefix partitioning
34. Overcoming S3 Rate Limits:
- Implement exponential backoff for 503 errors
- Distribute load across multiple prefixes
- Use randomized prefixes for high-throughput workloads
- Consider S3 Transfer Acceleration for uploads
S3 Data Consistency Model
35. S3 Data Consistency: S3 provides strong read-after-write consistency for all operations as of December 2020.
36. S3 Consistency Guarantees:
- New objects: Immediate visibility after successful write
- Overwrite PUTs and DELETEs: Immediate consistency for reads
- LIST operations: Consistent view of all objects
S3 Glacier Features
37. S3 Glacier Retrieval Options:
Retrieval Type | Retrieval Time | Cost |
---|---|---|
Expedited | 1-5 minutes | Highest |
Standard | 3-5 hours | Medium |
Bulk | 5-12 hours | Lowest |
38. S3 Glacier Vault Lock provides WORM (Write Once Read Many) protection with compliance controls that even the root user cannot modify.
39. S3 Glacier Restore Calculation Example (prices approximate):
- 1 TB with Standard retrieval: ~$10 retrieval fee, plus the temporary restored copy billed at S3 Standard rates (~$24 for a 30-day copy)
- Same data with Bulk retrieval: minimal or no retrieval fee (bulk retrievals from Glacier Flexible Retrieval are free), plus the same temporary-copy charge
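A restore is initiated per object with `restore_object`; a hedged sketch of the request payload (bucket, key, and tier values are placeholders):

```python
def restore_request(tier="Bulk", days=30):
    """RestoreRequest payload for s3.restore_object.

    Tier is one of Expedited, Standard, or Bulk; `days` is how long the
    temporary restored copy stays available in S3.
    """
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}

# Usage (requires credentials):
#   boto3.client("s3").restore_object(
#       Bucket="archive-bucket", Key="2019/records.tar",
#       RestoreRequest=restore_request("Standard", days=30))
```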
S3 Data Protection
40. S3 Versioning keeps multiple variants of objects, allowing recovery from accidental deletions or overwrites.
41. S3 MFA Delete requires additional authentication for permanently deleting object versions or suspending versioning.
42. S3 Cross-Region Replication (CRR) automatically replicates data across regions for compliance, lower latency, or disaster recovery.
43. S3 Same-Region Replication (SRR) replicates data within the same region for log aggregation or production/test sync.
44. S3 Object Lock prevents objects from being deleted or overwritten for a fixed time or indefinitely.
45. S3 Replication Time Control (RTC) replicates 99.99% of objects within 15 minutes with SLA backing.
S3 Performance Monitoring
46. Key CloudWatch Metrics for S3:
Metric | Description | Threshold Recommendation |
---|---|---|
BucketSizeBytes | Total bucket size | Set alerts based on expected growth |
NumberOfObjects | Total object count | Monitor for unexpected changes |
AllRequests | Total request count | Baseline + 20% for alerts |
4xxErrors | Client errors | <1% of total requests |
5xxErrors | Server errors | <0.01% of total requests |
FirstByteLatency | Time to first byte | P90 < 200ms |
TotalRequestLatency | Total request time | P90 < 300ms |
BytesDownloaded | Data downloaded | Monitor for cost management |
BytesUploaded | Data uploaded | Monitor for cost management |
ReplicationLatency | Time for replication | <15 minutes (with RTC) |
47. S3 Request Metrics can be enabled for specific prefixes, objects, or entire buckets to track request counts, latencies, and errors.
48. S3 Replication Metrics track pending operations, latency, and bytes pending replication.
S3 Data Lake Integration
49. S3 as a Data Lake Foundation:
- Unlimited scalability for any data type
- Cost-effective with storage classes
- Centralized access control
- Integration with analytics services
50. S3 Data Lake Architecture Components:
Component | AWS Service | Purpose |
---|---|---|
Storage | S3 | Raw data storage |
Catalog | AWS Glue | Metadata management |
Security | Lake Formation | Fine-grained access control |
Processing | EMR, Athena, Redshift Spectrum | Data processing |
Orchestration | Step Functions, Airflow | Workflow management |
Ingestion | Kinesis, DataSync, Transfer Family | Data acquisition |
51. S3 Data Partitioning Strategies:
Strategy | Format | Best For |
---|---|---|
Time-based | s3://bucket/data/year=2023/month=05/day=01/ | Time-series data |
Category-based | s3://bucket/data/region=us-east-1/product=widget/ | Dimensional data |
Hive-style | s3://bucket/table_name/key1=val1/key2=val2/ | Compatibility with Hive |
Nested | s3://bucket/data/year=2023/month=05/region=us-east/ | Multi-dimensional analysis |
52. S3 Data Formats for Analytics:
Format | Compression | Splittable | Schema Evolution | Best For |
---|---|---|---|---|
Parquet | Yes | Yes | Yes | Column-oriented analytics |
ORC | Yes | Yes | Yes | Hive workloads |
Avro | Yes | Yes | Yes | Schema evolution |
JSON | Yes | No | Yes | Flexibility, human-readable |
CSV | Yes | No | Limited | Simple data, compatibility |
S3 Data Ingestion Patterns
53. S3 Batch Operations perform bulk operations on existing S3 objects with a single request.
54. S3 Batch Operations Job Properties:
- Operation type (copy, invoke Lambda, restore, etc.)
- Manifest (list of objects to process)
- Priority (numeric value for job ordering)
- RoleArn (IAM role with permissions)
- Report configuration (completion report details)
55. S3 Event Notifications can trigger workflows when objects are created, deleted, or restored.
56. S3 Event Notification Filtering supports prefix and suffix matching to process only relevant objects.
57. S3 Event Notification Destinations:
- SNS Topics: Fan-out to multiple subscribers
- SQS Queues: Reliable message processing
- Lambda Functions: Custom code execution
- EventBridge: Advanced filtering and routing
58. Kinesis Data Firehose to S3 provides real-time streaming data delivery with:
- Automatic batching for efficiency
- Format conversion (JSON to Parquet/ORC)
- Data transformation via Lambda
- Error handling with backup bucket
59. S3 Data Ingestion Pipeline Replayability Strategies:
Strategy | Implementation | Pros | Cons |
---|---|---|---|
Source-based replay | Reread from source system | Complete fidelity | Source system dependency |
S3 versioning | Maintain object versions | Simple implementation | Storage costs |
Backup copy | Duplicate data to another bucket | Isolation from production | Storage costs |
Event-driven | Store events in SQS/Kinesis | Decoupled processing | Additional complexity |
Manifest-based | Track processed files | Precise control | Requires manifest management |
60. Throttling Implementation for S3 Data Ingestion:
- Client-side rate limiting
- SQS as a buffer with controlled processing
- Lambda concurrency limits
- API Gateway throttling for web-based uploads
S3 Advanced Features
61. S3 Access Points simplify managing access to shared datasets with dedicated access policies.
62. S3 Object Lambda transforms data retrieved from S3 before returning to the application, enabling:
- Redacting PII
- Converting formats
- Enriching data
- Filtering rows/columns
63. S3 Requester Pays buckets require the requester to pay for data transfer and request costs instead of the bucket owner.
64. S3 Inventory provides scheduled flat-file output listing objects and metadata.
65. S3 Batch Operations with Inventory enables bulk operations on objects identified in inventory reports.
66. S3 Storage Class Analysis helps identify when to transition objects to lower-cost storage classes.
67. S3 Select Query Examples:
- CSV (fields are read as strings, so numeric comparisons need a cast):
SELECT s._1, s._2 FROM S3Object s WHERE CAST(s._3 AS INT) > 100
- JSON:
SELECT s.name, s.age FROM S3Object s WHERE s.age > 25
- Parquet:
SELECT * FROM S3Object WHERE age > 30 LIMIT 100
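Expressions like these are executed through the `select_object_content` API, which returns an event stream rather than a plain body. A hedged sketch for CSV input; bucket and key are placeholders:

```python
def select_csv(s3_client, bucket, key, expression):
    """Run an S3 Select SQL expression against a CSV object and return
    the filtered rows as bytes."""
    resp = s3_client.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=expression,
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    # The response payload is a stream of events; only "Records" events
    # carry result data (others report stats and progress).
    return b"".join(
        event["Records"]["Payload"]
        for event in resp["Payload"]
        if "Records" in event
    )

# e.g. select_csv(boto3.client("s3"), "my-bucket", "sales.csv",
#                 "SELECT s.name FROM S3Object s WHERE CAST(s.amount AS INT) > 100")
```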
S3 Performance Best Practices
68. S3 Performance Best Practices:
Best Practice | Implementation | Benefit |
---|---|---|
Prefix Parallelization | Use multiple prefixes | Higher throughput |
Range GETs | Parallel byte-range fetches | Faster large object access |
Transfer Acceleration | Enable for bucket | Faster long-distance transfers |
Multipart Upload | Split large files | Parallel upload, resiliency |
S3 Select | Server-side filtering | Reduced network transfer |
Compression | Compress objects | Lower storage costs, faster transfer |
Caching | CloudFront or ElastiCache | Lower latency, reduced load |
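The Range GETs row above can be sketched as a parallel download using HTTP `Range` headers; the 8 MB chunk size and thread count are illustrative tuning choices, not AWS recommendations:

```python
from concurrent.futures import ThreadPoolExecutor

def ranged_get(s3_client, bucket, key, size, chunk=8 * 1024 * 1024):
    """Download an object of known size in parallel byte ranges."""
    ranges = [(start, min(start + chunk, size) - 1)
              for start in range(0, size, chunk)]

    def fetch(rng):
        start, end = rng
        resp = s3_client.get_object(
            Bucket=bucket, Key=key, Range=f"bytes={start}-{end}")
        return resp["Body"].read()

    with ThreadPoolExecutor(max_workers=8) as pool:
        # pool.map preserves input order, so the chunks join correctly.
        return b"".join(pool.map(fetch, ranges))
```

The object size can come from a prior `head_object` call.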
69. S3 Multipart Upload Thresholds:
- Recommended for objects > 100 MB
- Required for objects > 5 GB
- Optimal part size: 25-100 MB for typical networks
- Maximum of 10,000 parts per upload
70. S3 Performance Testing Methodology:
- Establish baseline with single-threaded transfers
- Test with multiple threads/connections
- Experiment with different part sizes
- Measure with different prefix strategies
- Compare with/without Transfer Acceleration
S3 Data Processing Patterns
71. S3 Event-Driven Processing Patterns:
Pattern | Implementation | Use Case |
---|---|---|
Direct Lambda | S3 event → Lambda | Simple transformations |
Queue-based | S3 event → SQS → Lambda | Rate limiting, retry handling |
Fan-out | S3 event → SNS → multiple endpoints | Multiple consumers |
Orchestrated | S3 event → Step Functions | Complex workflows |
Stream processing | S3 event → Kinesis → processors | Real-time analytics |
72. S3 Batch Processing Patterns:
Pattern | Implementation | Use Case |
---|---|---|
EMR | Spark/Hadoop on EMR reading from S3 | Big data processing |
Glue ETL | AWS Glue jobs reading from S3 | Serverless ETL |
Batch Operations | S3 Batch Operations with Lambda | Object-level operations |
Athena | SQL queries directly on S3 | Interactive analysis |
Redshift Spectrum | Redshift external tables on S3 | Data warehousing |
73. S3 Data Lake Processing Layers:
Layer | Description | S3 Implementation |
---|---|---|
Raw/Bronze | Original unmodified data | S3 Standard with lifecycle to IA/Glacier |
Processed/Silver | Cleansed, validated data | S3 Standard with partitioning |
Curated/Gold | Business-ready datasets | S3 Standard with optimized formats |
Application | Purpose-built data products | S3 Standard with CloudFront |
S3 Security Best Practices
74. S3 Security Best Practices:
Best Practice | Implementation | Benefit |
---|---|---|
Block Public Access | Enable at account level | Prevent accidental exposure |
Default Encryption | Enable SSE-S3 or SSE-KMS | Data protection at rest |
VPC Endpoints | Create Gateway Endpoint for S3 | Private network access |
Access Logging | Enable S3 access logging | Audit and compliance |
IAM Policies | Least privilege principle | Controlled access |
Bucket Policies | Explicit allow/deny | Resource-level control |
Presigned URLs | Time-limited access | Temporary permissions |
Object Lock | Enable for critical data | Immutability |
75. S3 Security Monitoring:
- CloudTrail for API activity
- S3 Access Logs for object-level access
- CloudWatch Metrics for operation counts
- AWS Config for configuration compliance
- Macie for sensitive data detection
S3 Data Migration and Transfer
76. S3 Data Migration Options:
Option | Transfer Speed | Data Size Range | Use Case |
---|---|---|---|
Direct Upload | Depends on bandwidth | MB to GB | Small files, good connectivity |
AWS CLI | Depends on bandwidth | GB to TB | Command-line automation |
S3 Transfer Acceleration | 50-500% faster | GB to TB | Long-distance transfers |
AWS DataSync | Up to 10 Gbps | TB to PB | Scheduled migrations |
AWS Transfer Family | Depends on bandwidth | GB to TB | FTP/SFTP compatibility |
AWS Storage Gateway | Depends on bandwidth | TB to PB | Hybrid cloud integration |
AWS Snowcone | Offline | Up to 8 TB | Edge locations |
AWS Snowball | Offline | Up to 80 TB | Large datasets |
AWS Snowmobile | Offline | Up to 100 PB | Massive data centers |
77. S3 Transfer Acceleration Performance Comparison:
- 1 TB transfer from US to Australia:
- Standard transfer: ~12 hours
- With Transfer Acceleration: ~2.5 hours
78. AWS DataSync Performance:
- Up to 10 Gbps throughput
- Parallel processing of files
- Automatic retry mechanism
- Built-in validation
S3 Compliance and Governance
79. S3 Compliance Features:
Feature | Implementation | Compliance Need |
---|---|---|
Object Lock | WORM protection | SEC Rule 17a-4, FINRA, CFTC |
Glacier Vault Lock | Immutable vault policy | Long-term records retention |
Access Logging | Detailed access logs | Audit requirements |
Inventory Reports | Scheduled metadata reports | Asset management |
Replication | Cross-region or same-region | Data residency, DR |
Versioning | Object version history | Change tracking |
Lifecycle Policies | Automated retention | Records management |
Macie Integration | Sensitive data discovery | PII protection |
80. S3 Object Lock Modes:
- Governance mode: Special permissions can override
- Compliance mode: No overrides during retention period
- Legal hold: Indefinite retention independent of retention period
S3 Integration with Data Engineering Services
81. S3 Integration with AWS Glue:
- Glue crawlers scan S3 data to populate the Glue Data Catalog
- Glue ETL jobs read from and write to S3
- Glue Data Catalog provides metadata for S3 objects
- Glue schema registry manages schemas for S3 data
82. S3 Integration with Amazon Athena:
- Serverless SQL queries directly on S3 data
- Supports CSV, JSON, ORC, Avro, Parquet
- Pay only for data scanned
- Federated queries to other data sources
83. S3 Integration with Amazon EMR:
- EMRFS provides S3 access from EMR clusters
- S3DistCp for efficient data copying
- EMR File System (EMRFS) consistency view
- EMR supports direct processing of S3 data
84. S3 Integration with Amazon Redshift:
- COPY command loads data from S3 to Redshift
- Redshift Spectrum queries data directly in S3
- Unload command exports data to S3
- Automatic compression encoding
85. S3 Integration with AWS Lake Formation:
- Centralized permissions for S3 data lakes
- Column-level, row-level, and cell-level security
- Tag-based access control
- Data location registration
S3 Throughput and Latency Characteristics
86. S3 Throughput Characteristics:
- Unlimited total throughput (scales with request rate)
- 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second per prefix
- No bandwidth limits for a single bucket
- Multipart uploads recommended for objects > 100 MB
87. S3 Latency Characteristics:
- First-byte latency: typically 100-200 ms
- Varies by region and request type
- GET operations faster than PUT operations
- S3 Transfer Acceleration improves latency for distant clients
88. S3 Performance Comparison:
Operation | Average Latency | Throughput Limit | Optimization |
---|---|---|---|
GET | 100-200 ms | 5,500 req/s per prefix | CloudFront caching |
PUT | 200-300 ms | 3,500 req/s per prefix | Multipart upload |
LIST | Varies with objects | Rate limited | Prefix organization |
DELETE | 200-300 ms | 3,500 req/s per prefix | Batch operations |
89. S3 Performance Monitoring Metrics:
- FirstByteLatency: Time to first byte
- TotalRequestLatency: End-to-end latency
- BytesDownloaded/BytesUploaded: Data transfer volume
- 4xxErrors/5xxErrors: Error counts
- ReplicationLatency: Time for replication
S3 Cost Optimization
90. S3 Intelligent-Tiering automatically moves objects between access tiers based on usage patterns, with no retrieval charges; a small per-object monitoring and automation fee applies.
91. S3 Lifecycle Configuration Example:
{
"Rules": [
{
"ID": "Move to IA after 30 days, Glacier after 90, expire after 365",
"Status": "Enabled",
"Prefix": "logs/",
"Transitions": [
{"Days": 30, "StorageClass": "STANDARD_IA"},
{"Days": 90, "StorageClass": "GLACIER"}
],
"Expiration": {"Days": 365}
}
]
}
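The policy above can be applied programmatically; a hedged boto3 sketch (the bucket name is a placeholder, and current versions of the API expect the prefix inside a `Filter` element rather than as a legacy top-level `Prefix`):

```python
def apply_log_lifecycle(s3_client, bucket):
    """Apply the logs/ tiering-and-expiration rule to a bucket."""
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={
            "Rules": [{
                "ID": "Move to IA after 30 days, Glacier after 90, expire after 365",
                "Status": "Enabled",
                "Filter": {"Prefix": "logs/"},
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            }]
        },
    )

# e.g. apply_log_lifecycle(boto3.client("s3"), "my-log-bucket")
```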
92. S3 Storage Cost Comparison (prices approximate):
Storage Class | Price per GB-month | Retrieval Cost | Min Duration | Min Size |
---|---|---|---|---|
Standard | $0.023 | None | None | None |
Intelligent-Tiering | $0.023 + monitoring | None | None | None |
Standard-IA | $0.0125 | $0.01/GB | 30 days | 128 KB |
One Zone-IA | $0.01 | $0.01/GB | 30 days | 128 KB |
Glacier Instant | $0.004 | $0.03/GB | 90 days | 128 KB |
Glacier Flexible | $0.0036 | $0.02/GB (std) | 90 days | 40 KB |
Glacier Deep | $0.00099 | $0.02/GB (std) | 180 days | 40 KB |
93. S3 Request Pricing (prices approximate):
- PUT/COPY/POST/LIST: $0.005 per 1,000 requests
- GET: $0.0004 per 1,000 requests
- Lifecycle transitions: $0.01 per 1,000 requests
- Data retrieval: Varies by storage class
S3 Mind Map
94. AWS S3 Service Mind Map:
Amazon S3
├── Storage Classes
│ ├── Standard
│ ├── Intelligent-Tiering
│ ├── Standard-IA
│ ├── One Zone-IA
│ ├── Glacier Instant Retrieval
│ ├── Glacier Flexible Retrieval
│ └── Glacier Deep Archive
├── Data Management
│ ├── Lifecycle Policies
│ ├── Versioning
│ ├── Replication (CRR/SRR)
│ ├── Storage Lens
│ ├── Inventory
│ └── Batch Operations
├── Security
│ ├── IAM Policies
│ ├── Bucket Policies
│ ├── ACLs
│ ├── Encryption (SSE-S3, SSE-KMS, SSE-C)
│ ├── Object Lock
│ ├── Access Points
│ └── VPC Endpoints
├── Performance
│ ├── Transfer Acceleration
│ ├── Multipart Upload
│ ├── Byte-Range Fetches
│ ├── S3 Select
│ └── Prefix Optimization
├── Analytics & Monitoring
│ ├── S3 Analytics
│ ├── CloudWatch Metrics
│ ├── CloudTrail
│ ├── Access Logs
│ └── Event Notifications
└── Integration
├── Data Lake Services (Athena, Glue)
├── Processing (Lambda, EMR)
├── Streaming (Kinesis)
├── Migration (DataSync, Transfer Family)
└── Content Delivery (CloudFront)
S3 Additional Features
95. S3 Requester Pays buckets require the requester to pay for data transfer and request costs instead of the bucket owner.
96. S3 Website Hosting provides static website hosting with custom domain support.
97. S3 Directory Buckets (introduced 2023) are a bucket type built for the S3 Express One Zone storage class, providing single-digit-millisecond access latency for performance-critical workloads.
98. S3 Access Points simplify managing access to shared datasets with dedicated access policies.
99. S3 Object Lambda transforms data retrieved from S3 before returning to the application.
100. S3 Replayability for Data Ingestion Pipelines:
- Use S3 versioning to maintain historical versions
- Implement SQS dead-letter queues for failed processing
- Store processing metadata with objects using S3 object tags
- Use manifest files to track processing state
- Implement idempotent processors that can safely reprocess data
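The last bullet, idempotent processing, reduces to checking a record identifier against a store of already-processed IDs before doing any work. A minimal sketch; in practice the set would be a DynamoDB table, manifest file, or object-tag lookup rather than in-memory state:

```python
def process_once(record_id, payload, processed_ids, handler):
    """Run `handler(payload)` only if record_id has not been seen before.

    Returns True if the record was processed, False if it was skipped,
    so replays of the same events become harmless no-ops.
    """
    if record_id in processed_ids:
        return False
    handler(payload)
    processed_ids.add(record_id)  # mark done only after a successful run
    return True
```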