Cloud computing often feels difficult to learn because many tutorials focus on individual services in isolation.
You create an S3 bucket in one tutorial, invoke a Lambda function in another, and experiment with DynamoDB somewhere else. While each service makes sense individually, it can still be hard to understand how they work together in a real application.
Instead of learning services one by one, I wanted to build something that connected them.
Even better, I wanted to do it without creating an AWS account or worrying about cloud costs.
That's where Floci, an open-source AWS emulator, came in.
The Goal
The objective wasn't to recreate AWS perfectly.
It was to understand the interaction between services by building a simple document processing pipeline.
The architecture looked like this:
User
│
▼
Upload Document
│
▼
Amazon S3
│
▼
AWS Lambda
│
Extract Metadata
│
▼
Amazon DynamoDB
Although the final automatic Lambda → DynamoDB write couldn't be completed due to a networking limitation inside Floci, the overall architecture mirrors how the same workflow would be built on AWS.
Why Learn AWS Locally?
Running AWS services locally offers several advantages while learning:
- No cloud costs
- Safe experimentation
- Fast iteration
- Ability to inspect every component
- Easy debugging
Using the AWS CLI against a local endpoint also helped reinforce an important idea:
The AWS CLI is simply a client that sends API requests. Whether those requests go to Amazon's cloud or a local emulator depends on the configured endpoint.
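The same idea can be sketched with the Python SDK: the only thing that changes between "real AWS" and "local emulator" is the endpoint URL handed to the client. This is a minimal sketch, not the project's actual code. The address `http://localhost:4566` and the placeholder credentials are assumptions for illustration, so use whatever address your Floci instance actually listens on. `boto3` is imported lazily so the helper can be read and run without it installed.

```python
from typing import Optional

# Assumption: the emulator listens here. Adjust to your own setup.
LOCAL_ENDPOINT = "http://localhost:4566"


def client_kwargs(endpoint_url: Optional[str] = None, region: str = "us-east-1") -> dict:
    """Build keyword arguments for boto3.client().

    With endpoint_url=None the SDK resolves the real AWS endpoint for the
    region; with a URL, every request goes to that address instead.
    """
    kwargs = {"region_name": region}
    if endpoint_url is not None:
        kwargs["endpoint_url"] = endpoint_url
        # Assumption: the emulator accepts placeholder credentials.
        kwargs["aws_access_key_id"] = "test"
        kwargs["aws_secret_access_key"] = "test"
    return kwargs


def make_client(service: str, endpoint_url: Optional[str] = None):
    """Create a boto3 client; the code path is identical for cloud and emulator."""
    import boto3  # imported here so this module loads without boto3 installed

    return boto3.client(service, **client_kwargs(endpoint_url))


if __name__ == "__main__":
    print(client_kwargs())                # real AWS: no endpoint override
    print(client_kwargs(LOCAL_ENDPOINT))  # emulator: same call, different target
```

The AWS CLI exposes the same switch through its `--endpoint-url` option.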
What Each Service Taught Me
Amazon S3
The first service I explored was Amazon S3.
Once I stopped thinking of S3 as generic "cloud storage" and started treating it as object storage, it became much easier to understand.
A bucket acts as a container, and every uploaded file is stored in it as an object identified by a key.
Practical exercises included:
- Creating buckets
- Uploading files
- Listing bucket contents
- Downloading objects
- Deleting objects
These simple operations clarified how applications persist documents before any further processing occurs.
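The bucket and object model behind those exercises can be sketched as a toy in-memory store. This is deliberately not the S3 API; the class and method names are illustrative stand-ins that loosely mirror the real `boto3` calls (`create_bucket`, `put_object`, `list_objects_v2`, `get_object`, `delete_object`).

```python
from typing import Dict, List


class ToyObjectStore:
    """In-memory stand-in for the bucket -> key -> bytes model (not the S3 API)."""

    def __init__(self) -> None:
        self._buckets: Dict[str, Dict[str, bytes]] = {}

    def create_bucket(self, bucket: str) -> None:
        # Creating an existing bucket is a no-op in this toy model.
        self._buckets.setdefault(bucket, {})

    def put_object(self, bucket: str, key: str, body: bytes) -> None:
        # A key is one flat string; "docs/report.pdf" is a name, not a folder path.
        self._buckets[bucket][key] = body

    def list_objects(self, bucket: str, prefix: str = "") -> List[str]:
        # Prefix filtering is what makes flat keys look like folders.
        return sorted(k for k in self._buckets[bucket] if k.startswith(prefix))

    def get_object(self, bucket: str, key: str) -> bytes:
        return self._buckets[bucket][key]

    def delete_object(self, bucket: str, key: str) -> None:
        # Deleting a missing key is not an error, matching S3's behaviour.
        self._buckets[bucket].pop(key, None)


if __name__ == "__main__":
    store = ToyObjectStore()
    store.create_bucket("documents")
    store.put_object("documents", "docs/report.pdf", b"%PDF-1.7 ...")
    print(store.list_objects("documents", prefix="docs/"))
    store.delete_object("documents", "docs/report.pdf")
    print(store.list_objects("documents"))
```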
Amazon DynamoDB
Once files could be stored, the next step was understanding structured data.
Unlike S3, DynamoDB doesn't store files; it stores items (records) in tables.
Creating tables, inserting items, retrieving data, and scanning tables helped reinforce the difference between object storage and NoSQL databases.
Instead of storing the document itself, DynamoDB became the place to store information about the document.
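As a rough illustration of what "information about the document" looks like on the wire, here is a sketch that converts a plain Python dict into the typed attribute format used by DynamoDB's low-level API (`{"S": ...}` for strings, `{"N": ...}` for numbers, which are sent as strings). The attribute names are assumptions for illustration, not necessarily the ones used in the project.

```python
from typing import Any, Dict


def to_dynamodb_item(metadata: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Convert a flat dict of str/int/float/bool values to DynamoDB's typed format."""
    item: Dict[str, Dict[str, Any]] = {}
    for name, value in metadata.items():
        if isinstance(value, bool):
            # Checked first because bool is a subclass of int in Python.
            item[name] = {"BOOL": value}
        elif isinstance(value, (int, float)):
            # DynamoDB numbers travel as strings to avoid precision loss.
            item[name] = {"N": str(value)}
        else:
            item[name] = {"S": str(value)}
    return item


if __name__ == "__main__":
    # Hypothetical metadata record for one uploaded document.
    metadata = {
        "document_id": "doc-001",
        "filename": "report.pdf",
        "size_bytes": 2048,
        "uploaded_at": "2024-01-01T00:00:00Z",
    }
    print(to_dynamodb_item(metadata))
```

A dict in this shape is what the low-level client's `put_item(TableName=..., Item=...)` call expects.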
AWS Lambda
Lambda introduced a completely different mindset.
Instead of managing servers, you package your code and upload it as a deployment artifact.
The Lambda function processed uploaded documents and generated metadata such as:
- Document ID
- Filename
- File size
- Upload timestamp
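A handler that produces those four fields might look roughly like the sketch below. It assumes the function is triggered by a standard S3 event notification and uses the event's `eventTime` as the upload timestamp; the output field names are illustrative and the repository's actual handler may differ.

```python
import uuid
from urllib.parse import unquote_plus


def extract_metadata(record: dict) -> dict:
    """Pull document metadata out of one S3 event notification record."""
    s3_object = record["s3"]["object"]
    return {
        "document_id": str(uuid.uuid4()),
        # Object keys arrive URL-encoded in S3 events (spaces become "+").
        "filename": unquote_plus(s3_object["key"]),
        "size_bytes": s3_object["size"],
        # Assumption: the event time is close enough to the upload time.
        "uploaded_at": record["eventTime"],
    }


def lambda_handler(event: dict, context: object) -> dict:
    """Entry point: one metadata dict per record in the event."""
    documents = [extract_metadata(r) for r in event.get("Records", [])]
    return {"statusCode": 200, "documents": documents}


if __name__ == "__main__":
    # Trimmed-down sample event containing only the fields used above.
    sample_event = {
        "Records": [
            {
                "eventTime": "2024-01-01T00:00:00.000Z",
                "s3": {
                    "bucket": {"name": "documents"},
                    "object": {"key": "my+report.pdf", "size": 2048},
                },
            }
        ]
    }
    print(lambda_handler(sample_event, None))
```

Keeping `extract_metadata` separate from the handler makes it testable on a laptop without any Lambda runtime at all.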
This was also where I encountered some of the most interesting debugging challenges.
Debugging Was the Real Teacher
Building the project wasn't just about writing code.
It involved understanding how different environments interact.
Some issues I encountered included:
- Missing AWS CLI inside the Lambda runtime
- Updating deployment packages correctly
- Lambda timeout while communicating with DynamoDB
- Docker networking behaviour inside Floci
Each issue forced me to understand the difference between:
- My Linux machine
- Docker containers
- Lambda runtime environments
- AWS SDK (boto3)
- AWS CLI
Those distinctions aren't always obvious from documentation alone, but debugging made them much clearer.
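A common cause of timeouts like the DynamoDB one, though not necessarily the cause here, is that `localhost` inside a container refers to the container itself rather than the host machine. A small standard-library probe can help tell "the endpoint is unreachable from this environment" apart from "the request is slow". The endpoint URLs below are hypothetical examples.

```python
import socket
from typing import Tuple
from urllib.parse import urlparse


def host_and_port(endpoint_url: str) -> Tuple[str, int]:
    """Split an endpoint URL into (host, port), defaulting the port by scheme."""
    parsed = urlparse(endpoint_url)
    if parsed.hostname is None:
        raise ValueError(f"No host found in endpoint URL: {endpoint_url!r}")
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    return parsed.hostname, port


def can_reach(endpoint_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host:port succeeds."""
    try:
        with socket.create_connection(host_and_port(endpoint_url), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, DNS failures, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical candidates: the same emulator seen from different environments.
    for url in ("http://localhost:4566", "http://host.docker.internal:4566"):
        print(url, "->", host_and_port(url))
```

Calling `can_reach(url)` from inside the Lambda code, with the endpoint the function is actually configured to use, shows whether that address is visible from the runtime at all.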
Understanding IAM
IAM was another concept that became easier through practice.
Rather than viewing it as just another AWS service, I started thinking of IAM as the system that answers three questions:
- Who is making the request?
- What action is being performed?
- Is that action allowed?
Learning about users, groups, policies, and roles also clarified why a Lambda function executes with an explicitly assigned IAM role rather than automatically inheriting the permissions of whoever created or invoked it.
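Those three questions map onto a very small evaluator. The sketch below is a heavily simplified toy, not IAM's real evaluation logic: it keeps only identity-based statements, wildcard matching, default deny, and "explicit deny wins". The role name, bucket, table, and account ID are all hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

# Hypothetical policy attached to a hypothetical Lambda execution role.
POLICIES: Dict[str, List[dict]] = {
    "lambda-metadata-role": [
        {"Effect": "Allow", "Action": ["s3:GetObject"],
         "Resource": ["arn:aws:s3:::documents/*"]},
        {"Effect": "Allow", "Action": ["dynamodb:PutItem"],
         "Resource": ["arn:aws:dynamodb:us-east-1:000000000000:table/DocumentMetadata"]},
        {"Effect": "Deny", "Action": ["s3:*"],
         "Resource": ["arn:aws:s3:::documents/private/*"]},
    ]
}


def is_allowed(policies: Dict[str, List[dict]], principal: str,
               action: str, resource: str) -> bool:
    """Toy check: who is asking, what do they want to do, is it allowed?"""
    statements = policies.get(principal, [])  # who is making the request?
    matching = [
        s for s in statements
        # what action, on which resource? (real IAM matches actions
        # case-insensitively; this toy version is case-sensitive)
        if any(fnmatchcase(action, pattern) for pattern in s["Action"])
        and any(fnmatchcase(resource, pattern) for pattern in s["Resource"])
    ]
    if any(s["Effect"] == "Deny" for s in matching):
        return False  # an explicit deny always wins
    # Default deny: allowed only if some statement explicitly allows it.
    return any(s["Effect"] == "Allow" for s in matching)


if __name__ == "__main__":
    role = "lambda-metadata-role"
    print(is_allowed(POLICIES, role, "s3:GetObject", "arn:aws:s3:::documents/report.pdf"))
    print(is_allowed(POLICIES, role, "s3:DeleteObject", "arn:aws:s3:::documents/report.pdf"))
```

The default-deny rule is the part that explains the Lambda behaviour: a function with no role statements matching a request can do nothing, regardless of who deployed it.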
The Bigger Picture
One realization stood out throughout the project:
AWS services don't communicate because they're "inside AWS."
They communicate through well-defined APIs.
Whether the services are running in Amazon's cloud or emulated locally, the interaction model remains largely the same.
Understanding those interactions felt much more valuable than memorizing individual commands.
What This Project Reinforced
Working through this pipeline reinforced several ideas:
- Building teaches more than reading documentation.
- Debugging is part of learning cloud computing.
- IAM is fundamentally about identities and permissions.
- Lambda runs inside an isolated execution environment.
- Cloud services are loosely coupled and communicate through APIs.
Final Thoughts
Cloud computing can seem overwhelming because of the sheer number of services available.
Building even a small end-to-end workflow makes those services feel much less abstract.
By connecting object storage, serverless compute, databases, and identity management into a single project, I gained a much clearer understanding of how these pieces fit together.
For anyone beginning their cloud journey, building a small pipeline, even locally, can often teach far more than reading documentation alone.
GitHub Repository
If you'd like to explore the project or contribute, here's the repository:
https://github.com/micheal000010000-hub/aws-document-processing-pipeline
Feedback, suggestions, and contributions are always welcome.