I have started practising on AWS, and what's more productive than building projects from the AWS console? Cloud computing is my new favourite, along with AI. After several ideation sessions with Claude about what could combine AI with a cloud-native application, we settled on an image intelligence pipeline. Claude gave me the names of all the AWS services that fall under the free tier, so yes, my project costs $0 and is 100% working. This made me realise how much we depend on AI bots to think and act. If I had unlimited AWS credits, I would definitely have deployed the project.
Hi world! This is a simple project, but the happiness of seeing the application work is boundless, because I take forever to debug. Nowadays everyone is talking about agentic engineering, which helps me find bugs faster, but the concept still goes a bit over my head even though I have used it a little. I can tell you about Claude skills and their wonders some other day, but first let me get acquainted with the new SDLC itself.
What it can identify
Faces, with an age range and confidence scores for emotions
Objects, scenes, people, activities and graphic elements. The best part is that it recognises text and writes down whatever is written in the image, so you can upload your receipts too.
Demonstration
GitHub repository
s17anushka/image-intelligence
An AI image intelligence pipeline built fully on AWS free tier and frontend on Next.js
🧠 AI Image Intelligence Pipeline
A serverless AI-powered image analysis app built on AWS. Upload any image and get instant analysis — objects, faces, emotions, text, and receipt line items — all in under 5 seconds.
✨ What it does
Upload any image and the pipeline automatically:
- Detects objects & scenes — labels everything visible with confidence scores
- Reads faces — estimates age range, dominant emotion, gender, smile
- Extracts text — reads any words visible in the image
- Parses receipts — extracts line items and amounts from receipts/invoices
- Moderates content — flags unsafe or explicit content automatically
The Lambda function acts as an AI agent — it branches conditionally based on what Rekognition finds. If text is detected, it automatically fires a Textract call to extract structured data.
🏗️ Architecture
User uploads image
↓
API Gateway → Lambda (presign) → S3 presigned URL
↓
Browser PUT → S3 bucket
↓
…
Tech Stack
Frontend
- Next.js 15 with the App Router. My frontend is basic, just enough to demonstrate my idea.
- TypeScript for type safety
- Tailwind CSS for the styling
- SWR for data fetching and polling
Backend
Serverless and AWS-managed
- Amazon S3 - stores the uploaded images and triggers the pipeline through an event notification
- AWS Lambda (Node.js 24) - two functions: image-intelligence-orchestrator (the AI agent, triggered by S3) and image-intelligence-api-handler (handles REST API requests)
- Amazon Rekognition - AWS's computer vision AI service, accessed through the Lambda IAM role
- Amazon Textract - AWS's document text extraction service
- Amazon DynamoDB - NoSQL database storing analysis results
- API Gateway (HTTP API) - REST endpoints connecting the frontend to Lambda
Apart from these services, I set up an IAM role with scoped permissions so that Lambda can access all of them.
I kept in mind (actually no, this one was suggested by Claude) that presigned URLs let the browser upload images directly to S3 without exposing credentials.
The whole pipeline runs in 3 to 5 seconds.
What the Project Does - Below written by Claude!!
When you upload an image, here's exactly what happens:
Step 1 - Upload
Browser requests a presigned URL from API Gateway → Lambda generates a secure temporary S3 upload URL → browser uploads the image directly to S3.
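To make the two hops in Step 1 concrete, here is a minimal browser-side sketch. It is not the project's actual code: the `/presign` path, the `PresignResponse` shape (`uploadUrl`, `imageId`) and the `FetchLike` type are assumptions made for illustration, and the fetch function is injected so the sketch does not depend on DOM typings.

```typescript
// Minimal fetch-like signature (an assumption for this sketch, not a real library type).
type FetchLike = (
  url: string,
  init?: { method?: string; headers?: Record<string, string>; body?: unknown },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Hypothetical response body of the presign endpoint.
interface PresignResponse {
  uploadUrl: string;
  imageId: string;
}

async function uploadViaPresignedUrl(
  fetchFn: FetchLike,
  apiBase: string,
  file: { name: string; type: string; data: unknown },
): Promise<string> {
  // 1. Ask the API (API Gateway -> Lambda) for a short-lived upload URL.
  //    The "/presign" path is an assumed name.
  const presignRes = await fetchFn(`${apiBase}/presign`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ fileName: file.name, contentType: file.type }),
  });
  if (!presignRes.ok) {
    throw new Error(`Presign failed with status ${presignRes.status}`);
  }
  const { uploadUrl, imageId } = (await presignRes.json()) as PresignResponse;

  // 2. PUT the bytes straight to S3. The browser never holds AWS credentials;
  //    the signature embedded in the URL authorises this single upload.
  const putRes = await fetchFn(uploadUrl, {
    method: "PUT",
    headers: { "Content-Type": file.type },
    body: file.data,
  });
  if (!putRes.ok) {
    throw new Error(`Upload failed with status ${putRes.status}`);
  }
  return imageId;
}
```

In a real page, `fetchFn` would be a thin wrapper around the browser's `fetch`, and the returned `imageId` is what the frontend would later use to look up the result.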
Step 2 - Trigger
S3 detects the new file and automatically triggers the orchestrator Lambda.
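As a rough illustration of what the orchestrator sees in Step 2, the sketch below pulls the bucket and key out of an S3 event notification. The `Records[].s3.bucket.name` and `Records[].s3.object.key` fields are part of the S3 event format; the function and interface names are hypothetical and only the fields used here are typed.

```typescript
// Only the parts of the S3 event notification that this sketch reads.
interface S3EventRecord {
  s3: {
    bucket: { name: string };
    object: { key: string };
  };
}

interface S3Event {
  Records: S3EventRecord[];
}

interface UploadedObject {
  bucket: string;
  key: string;
}

// S3 URL-encodes object keys in event notifications and writes spaces as "+",
// so the key has to be decoded before it is passed to another AWS call.
function extractUploadedObjects(event: S3Event): UploadedObject[] {
  return event.Records.map((record) => ({
    bucket: record.s3.bucket.name,
    key: decodeURIComponent(record.s3.object.key.replace(/\+/g, " ")),
  }));
}
```

Skipping the decode step is a common reason a Lambda cannot find a freshly uploaded file whose name contains spaces or brackets.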
Step 3 - AI Analysis (the agentic part)
Lambda runs 4 Rekognition calls in parallel:
DetectLabels — identifies objects, scenes, concepts in the image
DetectFaces — finds faces, estimates age range, reads emotions and attributes
DetectText — reads any text visible in the image
DetectModerationLabels — flags unsafe content
Then it makes a decision — if more than 3 words were found, it fires a 5th call to Textract which extracts structured data like receipt line items and tables. This conditional branching is the "agentic behaviour" — Lambda decides what to do next based on what it finds.
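The branching in Step 3 can be sketched as follows. This is a simplified illustration, not the repository's code: `ImageAnalyzer` is a hypothetical wrapper around the Rekognition and Textract clients, the result shapes are trimmed down, and counting `WORD`-type detections is only one plausible reading of "more than 3 words".

```typescript
// Simplified, hypothetical result shapes; real Rekognition responses carry more fields.
interface TextDetection {
  text: string;
  type: "LINE" | "WORD"; // Rekognition DetectText reports both lines and words
}

interface FaceSummary {
  dominantEmotion: string;
}

// Hypothetical wrapper so the sketch can run without the AWS SDK.
interface ImageAnalyzer {
  detectLabels(imageKey: string): Promise<string[]>;
  detectFaces(imageKey: string): Promise<FaceSummary[]>;
  detectText(imageKey: string): Promise<TextDetection[]>;
  detectModerationLabels(imageKey: string): Promise<string[]>;
  analyzeDocument(imageKey: string): Promise<string[]>; // the Textract call
}

interface AnalysisResult {
  imageKey: string;
  labels: string[];
  faces: FaceSummary[];
  text: TextDetection[];
  moderation: string[];
  document: string[] | null; // null when Textract was skipped
}

const WORD_THRESHOLD = 3;

// Assumption: only WORD-type detections count towards the threshold.
function shouldRunTextract(detections: TextDetection[]): boolean {
  return detections.filter((d) => d.type === "WORD").length > WORD_THRESHOLD;
}

async function analyzeImage(analyzer: ImageAnalyzer, imageKey: string): Promise<AnalysisResult> {
  // The four Rekognition calls do not depend on each other, so they run in parallel.
  const [labels, faces, text, moderation] = await Promise.all([
    analyzer.detectLabels(imageKey),
    analyzer.detectFaces(imageKey),
    analyzer.detectText(imageKey),
    analyzer.detectModerationLabels(imageKey),
  ]);

  // Conditional fifth call: only pay for Textract when there is enough text.
  const document = shouldRunTextract(text) ? await analyzer.analyzeDocument(imageKey) : null;

  return { imageKey, labels, faces, text, moderation, document };
}
```

Keeping the decision in a small pure function like `shouldRunTextract` makes the threshold easy to test and tune without touching any AWS calls.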
Step 4 - Store
All results are aggregated into a single DynamoDB item keyed by imageId.
Step 5 - Display
Frontend polls the API every 2 seconds until DynamoDB has the result, then renders the analysis card with labels, faces, text and receipt data.
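One way to express "poll every 2 seconds until the result exists" with SWR is to pass a function as `refreshInterval`: SWR calls it with the latest data and stops polling when it returns 0. The sketch below shows only that function. The `AnalysisItem` shape and the `/results/...` path in the comment are assumptions, and it presumes the API returns nothing usable until the item has been written.

```typescript
// Hypothetical shape of the stored analysis item returned by the API.
interface AnalysisItem {
  imageId: string;
  labels: string[];
}

const POLL_INTERVAL_MS = 2000;

// Returns the delay before the next poll, or 0 to stop polling.
// `latest` is undefined/null while DynamoDB does not have the result yet.
function refreshIntervalFor(latest: AnalysisItem | null | undefined): number {
  return latest ? 0 : POLL_INTERVAL_MS;
}

// In a component this could be wired up roughly as (path and fetcher are assumed):
//   useSWR(`/results/${imageId}`, fetcher, { refreshInterval: refreshIntervalFor });
```

Stopping the polling once the item arrives avoids needless API Gateway and Lambda invocations, which matters when staying inside the free tier.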
More projects like this are coming, and I will be dropping the bombs later.
Stay tuned!




