
Ayo

Beginner's AWS Guide: Storage Services (Part 3)

Objective:

This section builds on a key component of servers, storage, which we touched on in the Beginner's AWS Guide: Virtual Servers (Part 2). Here, we delve into the main data storage services in AWS: S3, EBS, and EFS.


S3: Simple Storage Service 🪣

Servers need a place to store and retrieve data, and AWS S3 is a very popular option. We can store and access many data types, including:

  • Images (JPG, PNG)
  • Documents (PDF, DOCX)
  • Videos (MP4, MOV)
  • Audio (MP3)
  • Static files (HTML, CSS, JS)

More precisely, S3 is an object storage service that allows us to store files (called objects) in buckets, which act like top-level folders, and to access objects directly via URL.

Each object can be up to 5TB in size, and buckets can hold an unlimited number of objects.

Each object consists of:

  1. The data itself (e.g. a photo, video, document).
  2. A unique key (how we identify it in the bucket).
  3. Optional metadata (e.g. content type, author, timestamp).

💡 Each newly created bucket must have a globally unique name. When we create a bucket, it becomes part of a public web address (URL). Just like two websites can't share the same domain name, two S3 buckets can't have the same name, otherwise AWS wouldn't know which bucket to route the request to!

EXAMPLE:

https://my-GLOBALLY-unique-bucket-name.s3.us-east-1.amazonaws.com/photo.jpg

We can make object URLs publicly accessible or restrict access using bucket/object policies, which we cover in IAM and Security Fundamentals (Part 6).
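Because bucket names end up in URLs, they also follow strict naming rules: 3 to 63 characters; lowercase letters, numbers, and hyphens; starting and ending with a letter or number. As a rough sketch (the helper function below is made up, and only AWS can confirm global uniqueness when the bucket is actually created), we could pre-check a candidate name in the shell:

```shell
# Hypothetical helper (not an AWS tool): checks a candidate name against
# S3's naming rules: 3-63 chars, lowercase letters/numbers/hyphens, and it
# must start and end with a letter or number. (Dots are also allowed by S3
# but discouraged, so this sketch rejects them.)
is_valid_bucket_name() {
  echo "$1" | grep -Eq '^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$'
}

is_valid_bucket_name "my-globally-unique-bucket-name" && echo "valid"
is_valid_bucket_name "My_Bucket" || echo "invalid"
```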


S3 FAQs 🪣

✨ Q. Can we directly upload 5TB in one go? ✨

No. AWS limits a single PUT request per object to 5GB, so anything larger must use a multipart upload — splitting the file into parts that are uploaded separately.
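To see the idea behind multipart upload, here is a scaled-down sketch using the standard `split` tool: a 25 MB dummy file cut into 10 MB chunks, standing in for a >5GB file cut into parts of up to 5GB each.

```shell
# Create a 25 MB dummy file to stand in for our "too big for one PUT" object
dd if=/dev/zero of=bigfile.bin bs=1M count=25 2>/dev/null

# Split it into 10 MB parts: bigfile.part.aa, bigfile.part.ab, bigfile.part.ac
split -b 10M bigfile.bin bigfile.part.

ls -l bigfile.part.*   # two 10 MB parts and one 5 MB part
```

In practice we rarely do this by hand: `aws s3 cp` performs multipart uploads automatically for large files, while the `aws s3api create-multipart-upload`, `upload-part`, and `complete-multipart-upload` commands expose the parts directly.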

✨ Q. Is S3 free? ✨

Not quite, but it's very low cost, and we only pay for what we use:

  1. Storage used (GB/month)
  2. Data transferred (in/out)

We can also enable Requester Pays, which charges users for downloading our data.
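As a back-of-envelope sketch of the pay-for-what-you-use model, we can estimate a monthly bill in the shell. The rates below are illustrative assumptions, not current AWS prices; always check the S3 pricing page.

```shell
# Assumed rates (illustrative, not quotes): ~$0.023 per GB-month of
# S3 Standard storage, ~$0.09 per GB transferred out to the internet.
STORAGE_GB=100
TRANSFER_GB=20

awk -v s="$STORAGE_GB" -v t="$TRANSFER_GB" \
    'BEGIN { printf "Monthly estimate: $%.2f\n", s * 0.023 + t * 0.09 }'
# prints: Monthly estimate: $4.10
```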

Image showcasing how Requester Pays works in AWS

✨ Q. Does S3 come with additional features? ✨

Yes — AWS offers a range of additional features to manage and protect our data, including:

  • Cross-region replication
  • Access logging
  • Versioning
  • Lifecycle rules
  • Transfer Acceleration for faster uploads worldwide

These are advanced features that I can cover in future posts!


S3 Storage Classes 🪣

When we upload an object, it's stored in S3 Standard (General Purpose) by default. But S3 offers several storage classes to optimise cost and performance based on how frequently we access the data.

Image showing the seven storage class options. From top to bottom: S3 Standard (General Purpose), S3 Standard-Infrequent Access, S3 One Zone-Infrequent Access, S3 Glacier (Instant Retrieval, Flexible Retrieval, Deep Archive), and S3 Intelligent-Tiering

💡 Intelligent-Tiering is used if we're unsure of access patterns. It will move objects between tiers automatically (for a small monitoring fee).
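Storage classes pair naturally with the lifecycle rules mentioned earlier. As a sketch (the rule ID and `logs/` prefix are made up), this is the shape of the JSON that `aws s3api put-bucket-lifecycle-configuration` accepts: objects move to Standard-IA after 30 days, to Glacier after 90, and are deleted after a year.

```json
{
  "Rules": [
    {
      "ID": "archive-old-logs",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```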


EBS Volume: Elastic Block Store 💽

Think of an EBS Volume as a cloud-based external hard drive (like a USB stick) that we can attach to EC2 servers. It gives us persistent storage, which survives even if we delete the server.

Just like physical drives, we can attach/detach EBS volumes from servers, format them, and use them to store logs, databases, or other data.

EBS volumes are created within a specific Availability Zone (AZ). We can attach/detach volumes between EC2 instances in the same AZ. But to move a volume to another AZ, we must:

  1. Create a snapshot (a copy) of the volume.
  2. Use the snapshot to create a new EBS volume in the target AZ.
  3. Attach the new volume to an EC2 instance in the target AZ.
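The steps above can be sketched with the AWS CLI. All IDs below are hypothetical placeholders, and the commands need valid credentials, so treat this as an outline rather than something to paste in:

```shell
# 1. Snapshot the source volume (in its original AZ)
aws ec2 create-snapshot --volume-id vol-0abc123 --description "AZ migration copy"

# 2. Create a new volume from that snapshot in the target AZ
#    (snapshots are regional, so any AZ in the region works)
aws ec2 create-volume --snapshot-id snap-0def456 --availability-zone us-east-1b

# 3. Attach the new volume to an instance in the target AZ
aws ec2 attach-volume --volume-id vol-0new789 --instance-id i-0aaa111 --device /dev/sdf
```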

Image showing how a server can have multiple EBS volumes attached to it


EBS: Volume Types 💽

AWS offers different volume types based on performance and price. We determine the type of volume we want based on the following:

  1. Size (GB) — how much overall space we want.

  2. Throughput (MB/s) — how much data can be read/written per second. A lower throughput means it takes longer to move large amounts of data, while higher throughput allows for faster bulk data transfers.

  3. IOPS — how many read/write operations per second (important for database-style workloads). Higher IOPS means our storage can handle more operations at once, resulting in faster application performance.
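The last two metrics are related: throughput is roughly IOPS multiplied by the I/O size. A quick sketch with illustrative numbers (not the limits of any particular volume type):

```shell
# Rough relationship: throughput (MiB/s) ~= IOPS x I/O size (KiB) / 1024
IOPS=3000
IO_SIZE_KIB=256

awk -v iops="$IOPS" -v kib="$IO_SIZE_KIB" \
    'BEGIN { printf "Approx throughput: %.1f MiB/s\n", iops * kib / 1024 }'
# prints: Approx throughput: 750.0 MiB/s
```

The same IOPS with small 4 KiB I/Os would yield far less throughput, which is why database workloads care about IOPS while bulk transfers care about MB/s.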

Image highlighting each volume type (gp2/gp3, io1/io2, st1, sc1) and their associated benefits and use cases

💡 For further info on EBS Volume specs, please check out - https://digitalcloud.training/amazon-ebs/


EFS: Elastic File System 📁

Whereas S3 is object storage and EBS is block storage, EFS is network file storage - just like a regular file system on our computer.

💡 A file system is where we can store and access files in folders, and further place those folders inside other folders, and so forth.

EFS lets multiple servers simultaneously mount and access the same folder structure, which makes it perfect for shared storage scenarios (e.g. logs shared across multiple instances).

How EFS works

  1. We create an EFS file system in AWS. AWS provides a DNS name (per file system, region, and VPC) to use for mounting.
  2. We install the EFS utilities on our Linux EC2 instances, which let us communicate with the file system using the mount.efs helper.
  3. We configure mount targets across Availability Zones within a VPC (one per subnet/AZ). This ensures EC2 instances in each AZ can access EFS with low latency.
  4. Finally, we mount the file system using the DNS-based mount target.

EXAMPLE

# Example EFS Mount Target DNS
fs-12345678.efs.us-east-1.amazonaws.com

# Install EFS utilities (Amazon Linux)
sudo yum install -y amazon-efs-utils

# Make a directory on the EC2 instance to mount the EFS
sudo mkdir -p /mnt/efs

# Mount the EFS file system (with encryption in transit)
sudo mount -t efs -o tls fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs

# Once mounted, it's just a folder path!
# We can create directories and write files in EFS from the instance:

sudo mkdir -p /mnt/efs/app-logs
sudo mkdir -p /mnt/efs/shared-data
sudo mkdir -p /mnt/efs/backups

echo "log data" | sudo tee /mnt/efs/app-logs/app.log
sudo cp important-file.txt /mnt/efs/shared-data/

Image showing the inter-connectivity potential of EFS across Linux instances and on-premise servers

💡 EFS has different performance and throughput modes depending on our use case.

Image showing the different performance options when setting up EFS as a shared storage option in the cloud


🎯 TL;DR

  • S3: Scalable object storage for files, accessible via URL β€” great for backups, media, and static content.
  • EBS: Block-level storage (like a cloud hard drive) attached to one server at a time.
  • EFS: Shared file system that multiple servers can access at once β€” perfect for collaboration and distributed workloads.
  • Choose storage based on how often, how fast, and by whom the data needs to be accessed.

✨ This is part of a mini-series where I delve into everything cloud-related. Check out my other posts for further learning! ✨
