🎯 Overview
In this project, I built a complete analytics pipeline using AWS services.
The goal was simple:
Read CSV files from S3 → Convert to tables using Glue → Query using Athena → Visualize in Grafana
This is a real-world data analytics workflow used widely in Cloud + DevOps environments.
🚀 Architecture Diagram
🏗️ Services Used
| Service | Purpose |
| --------------------- | ------------------------------------- |
| S3 | Stores raw CSV files |
| Glue Crawler | Detects schema & creates Athena table |
| Glue Data Catalog | Manages table metadata |
| Athena | SQL queries on S3 files |
| Grafana | Visualizes Athena queries |
🧩 Step 1 — Upload CSV Data to S3
I created a folder structure:
s3/
└── s3-athena-analytics-abhishek31/raw/
    ├── bd-dec22-births-deaths-natural-increase.csv
    ├── bd-dec22-deaths-by-sex-and-age.csv
    └── electronic-card-transactions.csv
Upload using AWS CLI:
aws s3 cp your_file_name.csv s3://s3-athena-analytics-abhishek31/raw/
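The same upload can also be scripted. The following is a minimal sketch, assuming boto3 is installed and AWS credentials are already configured; `build_upload_plan` and `upload_all` are hypothetical helper names, and only the bucket name and `raw/` prefix come from this post.

```python
from pathlib import PurePosixPath

BUCKET = "s3-athena-analytics-abhishek31"  # bucket name from this post
PREFIX = "raw"                             # key prefix ("folder") from this post


def build_upload_plan(local_files, prefix=PREFIX):
    """Map each local file path to the S3 object key it would be uploaded to."""
    return [(path, f"{prefix}/{PurePosixPath(path).name}") for path in local_files]


def upload_all(local_files, bucket=BUCKET, prefix=PREFIX):
    """Upload every file to S3; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the planning helper works without boto3

    s3 = boto3.client("s3")
    for path, key in build_upload_plan(local_files, prefix):
        s3.upload_file(path, bucket, key)  # arguments: Filename, Bucket, Key


if __name__ == "__main__":
    # Dry run: only print where each file would go.
    for path, key in build_upload_plan(["electronic-card-transactions.csv"]):
        print(f"{path} -> s3://{BUCKET}/{key}")
```

Printing the plan before calling `upload_all` is a cheap way to check the target keys without touching the bucket.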
🤖 Step 2 — Create AWS Glue Crawler
The crawler:
- Points to the S3 folder
- Detects the CSV schema
- Creates a table in the AWS Glue Data Catalog
- Database name: s3_log_db
- Table name: s3_athena_analytics_abhishek31

After running the crawler, I validated the schema.
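The crawler setup above can be sketched in Python as well. This is an illustration under assumptions, not the exact configuration used in this project: the crawler name and IAM role ARN below are hypothetical placeholders, and `crawler_request` / `create_and_start` are made-up helper names. The boto3 calls used are `create_crawler` and `start_crawler` on the Glue client.

```python
def crawler_request(name, role_arn, database, s3_path):
    """Build the keyword arguments for Glue's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,          # IAM role the crawler assumes to read S3
        "DatabaseName": database,  # Data Catalog database that receives the table
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start(request):
    """Create the crawler and start it; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so crawler_request works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


if __name__ == "__main__":
    request = crawler_request(
        name="csv-raw-crawler",                                    # hypothetical
        role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",  # hypothetical
        database="s3_log_db",                                      # from this post
        s3_path="s3://s3-athena-analytics-abhishek31/raw/",        # from this post
    )
    print(request)
```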
🔍 Step 3 — Query Data in Athena
Athena reads the S3 data using SQL.
Example query:
SELECT
series_reference,
period,
series_title_2,
value
FROM s3_log_db.s3_athena_analytics_abhishek31
ORDER BY series_title_2 ASC;
The query returned the expected columns, which confirmed that the data is properly structured.
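The same query can be run outside the console. Here is a minimal sketch, assuming boto3 and configured credentials; the results bucket `s3://example-athena-results/` and the helper names `query_request` and `run_query` are hypothetical, while the database and table names come from this post. It uses the Athena client's `start_query_execution`, `get_query_execution`, and `get_query_results` calls.

```python
import time

SQL = """
SELECT series_reference, period, series_title_2, value
FROM s3_log_db.s3_athena_analytics_abhishek31
ORDER BY series_title_2 ASC
"""


def query_request(sql, database, output_location, workgroup="primary"):
    """Build the keyword arguments for Athena's start_query_execution call."""
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
        "WorkGroup": workgroup,
    }


def run_query(request, poll_seconds=1.0):
    """Start the query, poll until it finishes, and return the first page of rows."""
    import boto3  # imported lazily so query_request works without boto3

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(**request)["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        status = execution["QueryExecution"]["Status"]
        if status["State"] in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)
    if status["State"] != "SUCCEEDED":
        raise RuntimeError(status.get("StateChangeReason", status["State"]))
    # The first returned row holds the column headers.
    return athena.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]


if __name__ == "__main__":
    # Hypothetical results bucket; replace with your own before calling run_query.
    print(query_request(SQL, "s3_log_db", "s3://example-athena-results/"))
```

Athena runs queries asynchronously, which is why the sketch polls for a terminal state before fetching results.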
📊 Step 4 — Connect Grafana to Athena
In Grafana:
- Add datasource → AWS Athena
- Configure:
  - AWS Region
  - S3 query results bucket
  - Workgroup
- Test connection → Success
📈 Step 5 — Build Dashboards in Grafana
I created multiple visualizations using raw queries (not JSON imports):
- Bar chart for age-group distribution
- Time series analysis
- Gender-wise population trends
- Total population per year
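As an illustration of what a raw panel query might look like, the sketch below generates a simple aggregate over the columns seen in Step 3. It is not one of the actual dashboard queries from this project: `grouped_total_sql` is a hypothetical helper, and the use of `TRY_CAST` assumes the crawler may have typed `value` as a string.

```python
TABLE = "s3_log_db.s3_athena_analytics_abhishek31"          # table name from this post
COLUMNS = {"series_reference", "period", "series_title_2"}  # columns seen in Step 3


def grouped_total_sql(group_column, table=TABLE):
    """Return SQL that sums `value` for each distinct `group_column`."""
    if group_column not in COLUMNS:
        raise ValueError(f"unknown column: {group_column}")
    return (
        f"SELECT {group_column}, SUM(TRY_CAST(value AS double)) AS total_value\n"
        f"FROM {table}\n"
        f"GROUP BY {group_column}\n"
        f"ORDER BY {group_column}"
    )


if __name__ == "__main__":
    # Paste the printed SQL into a Grafana panel's query editor.
    print(grouped_total_sql("period"))
```

Restricting `group_column` to a known set keeps the generated SQL limited to columns that exist in the table.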
🗂️ Project Folder Structure
AWS-Athena-S3-Grafana-Analytics/
├── docs/
│   └── README.md
├── s3/
│   └── README.md
├── sql/
│   └── athena_queries.sql
├── grafana/
│   └── README.md
└── README.md
📌 Key Learnings
✔ How to automate schema detection with Glue
✔ How Athena queries S3 without a database server
✔ How to integrate AWS & Grafana
✔ How to visualize analytics with live SQL queries
A great project for a Cloud + DevOps profile! 💯
🔗 GitHub Repository:
https://github.com/abhikorde31/aws-s3-athena-grafana-analytics
🏁 Conclusion
This project shows how to build a production-grade analytics pipeline using AWS serverless services + Grafana.
If you want a visualization dashboard, real-time updates with Lambda, or Terraform automation, I can help you extend it.



