AWS Series: What's in the Bucket?

Helen Anderson ・4 min read

AWS Associate Certifications (13 Part Series)

1) AWS Series: The Journey to Certification 2) AWS concepts from A to Z 3) AWS Series: Don't you know who IAM? 4) AWS Series: What's in the Bucket? 5) AWS Series: Which EC2 is right for you? 6) AWS Series: SNS, SQS or both? 7) AWS Series: Gently Down the Stream with Kinesis 8) AWS Series: Why a VPC is like the London Underground 9) AWS Series: CloudWatch or CloudTrail? 10) AWS Series: Free AWS Services 11) AWS Series: All About Security 12) AWS Series: All About Cost Optimisation 13) I Sat the AWS Cloud Practitioner Exam Online: Here's What Happened

S3 (Simple Storage Service) is used to store objects and flat files in 'buckets' in the Cloud.

There is unlimited storage available, across a default of 100 buckets per account, and each object can be from 0 bytes to 5 TB.

Use Cases
How Data is Stored
Storage Class Options
Security
Encryption
Versioning
Replication
Getting started with Free Tier



Use cases

S3 is one of the oldest services AWS offers and is incredibly flexible with multiple ways to use it.


Analytics / Data Lake

  • Decouple storage from compute so either can scale up or down as needed, with Amazon Athena as the query service over the top and AWS Glue as the data catalogue.

Archive

  • When data goes from 'hot', frequently accessed, to 'cold', infrequently accessed, it can be moved to Amazon Glacier for a more cost-effective option.

Data Staging

  • Temporary data storage before being loaded into Amazon Redshift.

Static website

  • Host a website using S3 for storage and Route 53 as the DNS.


How data is stored

Each bucket needs a globally unique name and is addressed with a URL formatted as:

https://s3-(region).amazonaws.com/(bucketname)
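As a small illustration of the format above, the helper below fills in the two placeholders. The function name `bucket_url` and the sample region and bucket name are made up for this sketch.

```python
def bucket_url(region: str, bucket_name: str) -> str:
    """Build a URL in the https://s3-(region).amazonaws.com/(bucketname) format."""
    return f"https://s3-{region}.amazonaws.com/{bucket_name}"


# Hypothetical region and bucket name, for illustration only.
print(bucket_url("eu-west-1", "my-example-bucket"))
```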

Each object consists of:

  • Key (the name of the object),
  • Value (the data in the file itself made of bytes),
  • VersionID,
  • Metadata

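Amazon S3 originally provided read-after-write consistency for new objects and eventual consistency for updates and deletes, because data is replicated across at least three Availability Zones (AZs) and changes could take time to flow through. Since December 2020, S3 provides strong read-after-write consistency for all reads, writes, and deletes.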



Storage Class Options

S3

  • The most expensive option, designed for 'hot' data, with 11 9's (99.999999999%) of durability and the highest availability.
  • Cloud apps, big data analytics, websites, content distribution.
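To get a feel for what 11 9's means, here is a rough piece of arithmetic. It assumes durability can be read as an average annual, per-object loss probability, which is a simplification. The function names are made up for this sketch.

```python
from fractions import Fraction

# 11 9's of durability: 99.999999999% of objects are expected to survive a year.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)
ANNUAL_LOSS_PROBABILITY = 1 - DURABILITY  # one in 100 billion


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Average number of objects lost per year for a given object count."""
    return object_count * ANNUAL_LOSS_PROBABILITY


def years_per_lost_object(object_count: int) -> Fraction:
    """Average number of years between losing a single object."""
    return 1 / expected_objects_lost_per_year(object_count)


# With 10 million objects stored, expect to lose one about every 10,000 years.
print(years_per_lost_object(10_000_000))  # prints 10000
```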

S3:Infrequent Access

  • For storing non-critical data that CANNOT be easily reproduced and needs to be retrieved quickly. Costs 50% less because of the reduced availability.
  • Disaster recovery, backups.

S3:Infrequent Access - One Zone

  • For storing non-critical data that CAN be easily reproduced and needs to be retrieved quickly.
  • Useful for secondary backups as objects are only stored in one zone.
  • Cheaper than S3:IA as durability is reduced.

Glacier

  • For long-term storage with a 3 - 5 hour retrieval time for 'cold' data.

Glacier Deep Archive

  • For long-term storage with a 12 hour retrieval time for 'cold' data.
  • Documents that need to be kept for compliance reasons for 7+ years.
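Moving data down through these classes as it cools is usually done with a lifecycle rule rather than by hand. The sketch below builds one as a plain dictionary, in the shape that boto3's `put_bucket_lifecycle_configuration` takes as its `LifecycleConfiguration` argument. The function name, the `logs/` prefix, and the day thresholds are assumptions chosen for illustration, not recommendations.

```python
import json


def build_lifecycle_configuration(
    prefix: str,
    ia_days: int = 30,
    glacier_days: int = 90,
    deep_archive_days: int = 365,
) -> dict:
    """Return a lifecycle configuration that steps objects down to colder classes."""
    return {
        "Rules": [
            {
                # Rule ID is a free-form label; this naming scheme is made up.
                "ID": f"cool-down-{prefix.strip('/') or 'all'}",
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                    {"Days": deep_archive_days, "StorageClass": "DEEP_ARCHIVE"},
                ],
            }
        ]
    }


config = build_lifecycle_configuration("logs/")
print(json.dumps(config, indent=2))
```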


Security

S3 is secure by default. Each new bucket and the objects in it are private. To control access further, use bucket policies, which are similar to IAM policies, and Access Control Lists (ACLs).
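A bucket policy is a JSON document attached to the bucket. As a minimal sketch, the function below builds one that lets a single principal read objects and nothing else. The function name, bucket name, and account ID are placeholders for illustration.

```python
import json


def read_only_bucket_policy(bucket_name: str, principal_arn: str) -> dict:
    """Build a bucket policy allowing one principal to download objects."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowReadForOnePrincipal",
                "Effect": "Allow",
                "Principal": {"AWS": principal_arn},
                "Action": "s3:GetObject",
                # The /* suffix targets the objects in the bucket, not the bucket itself.
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            }
        ],
    }


# Placeholder bucket name and account ID.
policy = read_only_bucket_policy(
    "my-example-bucket", "arn:aws:iam::123456789012:user/analyst"
)
print(json.dumps(policy, indent=2))
```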

Presigned URLs are another option when temporary access to an object is required. A URL is generated via the AWS CLI or an SDK and can then be used to upload or download object data until it expires.
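In practice the SDK does the work for you (boto3's `generate_presigned_url`, for example). To show what is inside such a URL, here is a standard-library sketch of a presigned GET, assuming AWS Signature Version 4 query-string signing with long-term credentials. The function name `presign_get_url` is made up, and the bucket, key, and credentials are placeholder values.

```python
import datetime
import hashlib
import hmac
from urllib.parse import quote


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def presign_get_url(bucket, key, region, access_key, secret_key,
                    expires_in=3600, now=None):
    """Return a time-limited download URL signed with Signature Version 4."""
    now = now or datetime.datetime.now(datetime.timezone.utc)
    amz_date = now.strftime("%Y%m%dT%H%M%SZ")
    date_stamp = now.strftime("%Y%m%d")
    host = f"{bucket}.s3.{region}.amazonaws.com"
    scope = f"{date_stamp}/{region}/s3/aws4_request"

    # The query parameters are part of what gets signed, so build them first.
    canonical_uri = "/" + quote(key, safe="/-_.~")
    params = {
        "X-Amz-Algorithm": "AWS4-HMAC-SHA256",
        "X-Amz-Credential": f"{access_key}/{scope}",
        "X-Amz-Date": amz_date,
        "X-Amz-Expires": str(expires_in),
        "X-Amz-SignedHeaders": "host",
    }
    canonical_query = "&".join(
        f"{quote(name, safe='-_.~')}={quote(value, safe='-_.~')}"
        for name, value in sorted(params.items())
    )
    canonical_request = "\n".join(
        ["GET", canonical_uri, canonical_query,
         f"host:{host}", "", "host", "UNSIGNED-PAYLOAD"]
    )
    string_to_sign = "\n".join(
        ["AWS4-HMAC-SHA256", amz_date, scope,
         hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()]
    )

    # Derive the signing key by chaining HMACs over date, region and service.
    signing_key = ("AWS4" + secret_key).encode("utf-8")
    for part in (date_stamp, region, "s3", "aws4_request"):
        signing_key = _hmac_sha256(signing_key, part)
    signature = hmac.new(
        signing_key, string_to_sign.encode("utf-8"), hashlib.sha256
    ).hexdigest()

    return f"https://{host}{canonical_uri}?{canonical_query}&X-Amz-Signature={signature}"


# Placeholder credentials, bucket and key: not real values.
print(presign_get_url("examplebucket", "test.txt", "us-east-1",
                      "AKIAIOSFODNN7EXAMPLE",
                      "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
                      expires_in=900))
```

The expiry is baked into the signed query string, so anyone holding the URL can use it until `X-Amz-Expires` seconds have passed, and changing any parameter invalidates the signature.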



Encryption

Client-Side

The client encrypts the objects before uploading them to Amazon S3.

Server Side

S3 encrypts the data when it is written and decrypts it when it is read.

  • SSE-S3 (sometimes written SSE-AES) - S3 manages the keys and uses the AES-256 algorithm
  • SSE-KMS - Envelope encryption via AWS KMS and you manage the keys
  • SSE-C - Customer provided key (you manage the keys)
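On the wire, the server-side option is chosen with request headers on the upload. The sketch below builds those headers for each of the three options, using the name SSE-S3 (as AWS's documentation does) for the option where S3 handles the key. The function name `sse_headers` and the sample key are made up for illustration. An SDK normally sets these headers for you.

```python
import base64
import hashlib
from typing import Dict, Optional


def sse_headers(mode: str,
                kms_key_id: Optional[str] = None,
                customer_key: Optional[bytes] = None) -> Dict[str, str]:
    """Return the upload headers that request a server-side encryption mode."""
    if mode == "SSE-S3":
        # S3 manages the key; AES256 names the algorithm.
        return {"x-amz-server-side-encryption": "AES256"}
    if mode == "SSE-KMS":
        headers = {"x-amz-server-side-encryption": "aws:kms"}
        if kms_key_id:
            # Optional: without it, the account's default KMS key for S3 is used.
            headers["x-amz-server-side-encryption-aws-kms-key-id"] = kms_key_id
        return headers
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        # The key travels with every request, plus an MD5 digest as an integrity check.
        return {
            "x-amz-server-side-encryption-customer-algorithm": "AES256",
            "x-amz-server-side-encryption-customer-key":
                base64.b64encode(customer_key).decode("ascii"),
            "x-amz-server-side-encryption-customer-key-MD5":
                base64.b64encode(hashlib.md5(customer_key).digest()).decode("ascii"),
        }
    raise ValueError(f"unknown encryption mode: {mode}")


print(sse_headers("SSE-S3"))
print(sse_headers("SSE-C", customer_key=bytes(32)))  # all-zero key, demo only
```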


Versioning

  • When versioning is turned on, deleting a file adds a delete marker that hides the file rather than removing it. To restore the file, delete the marker.

  • Each version takes up storage space, so three versions of a 1 GB file take up 3 GB of space.

  • Once turned on, versioning can only be suspended, not removed.

  • Deleting a specific version, on the other hand, removes it permanently. Enabling MFA Delete gives extra protection, as it requires MFA before a version can be deleted.
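The behaviour in the points above can be sketched with a toy in-memory model. This is not how S3 is implemented: the class names are made up, and version IDs are simple integers here, whereas real version IDs are opaque strings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Version:
    version_id: int
    data: Optional[bytes]  # None means this entry hides the file after a delete


@dataclass
class VersionedBucket:
    versions: Dict[str, List[Version]] = field(default_factory=dict)
    _next_id: int = 1

    def _append(self, key: str, data: Optional[bytes]) -> int:
        version = Version(self._next_id, data)
        self._next_id += 1
        self.versions.setdefault(key, []).append(version)
        return version.version_id

    def put(self, key: str, data: bytes) -> int:
        """Every write adds a new version; older ones are kept."""
        return self._append(key, data)

    def delete(self, key: str) -> int:
        """A plain delete only adds a marker that hides the file."""
        return self._append(key, None)

    def delete_version(self, key: str, version_id: int) -> None:
        """Deleting a specific version (or marker) really removes it."""
        self.versions[key] = [
            v for v in self.versions[key] if v.version_id != version_id
        ]

    def get(self, key: str) -> Optional[bytes]:
        history = self.versions.get(key, [])
        if not history or history[-1].data is None:
            return None  # missing, or hidden by the latest marker
        return history[-1].data

    def stored_bytes(self, key: str) -> int:
        return sum(len(v.data) for v in self.versions.get(key, []) if v.data)


bucket = VersionedBucket()
for body in (b"v1", b"v2", b"v3"):
    bucket.put("report.csv", body)
marker_id = bucket.delete("report.csv")
print(bucket.get("report.csv"))           # None: hidden, not gone
bucket.delete_version("report.csv", marker_id)
print(bucket.get("report.csv"))           # b'v3': restored
print(bucket.stored_bytes("report.csv"))  # 6: all three versions are billed
```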



Replication

  • Cross-Region Replication lets you automatically replicate the contents of a bucket from one region to another.

  • Versioning must be enabled on both the source and destination buckets. Existing files aren't copied when replication is turned on; only new files and new versions written afterwards are replicated, along with their permissions.



Getting started

To get started with S3, the Free Tier offers 12 months of free usage within the limits below. If you exceed the limits, the standard rates apply.

  • 5 GB of Standard Storage
  • 20,000 Get Requests
  • 2,000 Put Requests
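A quick way to see whether a month's usage stays inside those limits is to subtract them from your own figures. The sketch below does that, treating the limits as monthly allowances. The dictionary keys, the function name, and the usage numbers are made up for illustration.

```python
FREE_TIER_LIMITS = {
    "storage_gb": 5,
    "get_requests": 20_000,
    "put_requests": 2_000,
}


def free_tier_overage(usage: dict) -> dict:
    """Return how far each usage figure exceeds the Free Tier (0 if within it)."""
    return {
        name: max(0, usage.get(name, 0) - limit)
        for name, limit in FREE_TIER_LIMITS.items()
    }


# Hypothetical usage for one month.
usage = {"storage_gb": 7.5, "get_requests": 15_000, "put_requests": 2_500}
print(free_tier_overage(usage))
```

Anything above zero in the result is what standard rates would apply to.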



Useful Links

S3 Documentation

S3 FAQ



This post first appeared on helenanderson.co.nz

Posted on Mar 26 '19 by:

Helen Anderson

@helenanders26

Making applications go faster at Raygun, AWS Data Hero, and tag moderator on Dev.to. Database concept you don’t understand? Let me know, I’ll write a post!

Discussion

Fantastic introduction to S3. Some of us are AWS certified so it's safe to say we're fans.

Have you seen how insanely scalable S3 is also? You could place a static website on it, get 4 million hits in a day and it would just automatically scale up. It's madness. Not sure it would be particularly cost effective to do that, but it's possible.

 

Very informative. I like the "Storage Class Options" section where you list out some real-world usage scenarios for each option.

👍 Once again!