📊 AWS S3 + AWS Glue + Athena + Grafana — End-to-End Analytics Project

#grafana #monitoring #aws #cloud

🎯 Overview

In this project, I built a complete analytics pipeline using AWS services.
The goal was simple:

Read CSV files from S3 → Convert to tables using Glue → Query using Athena → Visualize in Grafana

This is a real-world data analytics workflow used widely in Cloud + DevOps environments.

🚀 Architecture Diagram

[Diagram: S3 (raw CSV) → Glue Crawler → Glue Data Catalog → Athena → Grafana]

🏗️ Services Used
| Service | Purpose |
| --------------------- | ------------------------------------- |
| S3 | Stores raw CSV files |
| Glue Crawler | Detects schema & creates the table that Athena queries |
| Glue Data Catalog | Manages table metadata |
| Athena | SQL queries on S3 files |
| Grafana | Visualizes Athena queries |

🧩 Step 1 — Upload CSV Data to S3

I created a folder structure:
s3/
└── s3-athena-analytics-abhishek31/raw/
    ├── bd-dec22-births-deaths-natural-increase.csv
    ├── bd-dec22-deaths-by-sex-and-age.csv
    └── electronic-card-transactions.csv

Upload using AWS CLI:
aws s3 cp your_file_name.csv s3://s3-athena-analytics-abhishek31/raw/
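
If you prefer scripting the upload, here is a minimal Python sketch of the same step. It assumes boto3 is installed and AWS credentials are already configured; the helper names `object_key` and `upload_csvs` are my own, not part of any AWS API.

```python
from pathlib import Path

BUCKET = "s3-athena-analytics-abhishek31"  # bucket name used in this post
PREFIX = "raw"                             # folder (key prefix) used in this post


def object_key(local_path: str, prefix: str = PREFIX) -> str:
    """Build the S3 object key for a local file, e.g. raw/data.csv."""
    return f"{prefix.strip('/')}/{Path(local_path).name}"


def upload_csvs(paths, bucket: str = BUCKET, prefix: str = PREFIX) -> None:
    """Upload each local CSV file to s3://<bucket>/<prefix>/<file name>."""
    import boto3  # imported lazily so object_key() works without boto3

    s3 = boto3.client("s3")
    for path in paths:
        s3.upload_file(str(path), bucket, object_key(str(path), prefix))
```

Calling `upload_csvs(["electronic-card-transactions.csv"])` should place the file under the same `raw/` prefix as the CLI command.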

🤖 Step 2 — Create AWS Glue Crawler

The crawler:

Points to the S3 folder
Detects the CSV schema
Creates a table in the AWS Glue Data Catalog

Database name: s3_log_db
Table name: s3_athena_analytics_abhishek31

After running the crawler, I validated the schema.
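
I set the crawler up in the console, but the same configuration can be sketched with boto3. This is only an illustration: the crawler name and the IAM role ARN below are hypothetical placeholders, and the role must already exist with Glue and S3 read permissions.

```python
def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build the keyword arguments for Glue's create_crawler call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start(request: dict) -> None:
    """Create the crawler, then start one run (needs boto3 and credentials)."""
    import boto3  # imported lazily so crawler_request() works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])


REQUEST = crawler_request(
    name="s3-athena-analytics-crawler",                         # hypothetical name
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",  # hypothetical role
    database="s3_log_db",                                       # from this post
    s3_path="s3://s3-athena-analytics-abhishek31/raw/",         # from this post
)
```

`create_and_start(REQUEST)` would then create and run the crawler. The table name is chosen by the crawler itself, so it is not part of the request.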

🔍 Step 3 — Query Data in Athena

Athena runs SQL directly against the CSV files in S3, using the table definition from the Glue Data Catalog.

Example query:

SELECT 
    series_reference,
    period,
    series_title_2,
    value
FROM s3_log_db.s3_athena_analytics_abhishek31
ORDER BY series_title_2 ASC;

This verified that the data was properly structured.
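
The same check can be run outside the console. Below is a rough boto3 sketch that submits the query, polls until it finishes, and turns the first page of results into dictionaries. The `output_location` argument and the `primary` workgroup default are assumptions; use whatever your account is configured with.

```python
import time

QUERY = """
SELECT series_reference, period, series_title_2, value
FROM s3_log_db.s3_athena_analytics_abhishek31
ORDER BY series_title_2 ASC
"""


def rows_to_dicts(result_set: dict) -> list:
    """Convert an Athena ResultSet to dicts; for SELECTs the first row is the header."""
    rows = [[cell.get("VarCharValue") for cell in row["Data"]]
            for row in result_set["Rows"]]
    if not rows:
        return []
    header, *data = rows
    return [dict(zip(header, values)) for values in data]


def run_query(sql: str, output_location: str, workgroup: str = "primary",
              poll_seconds: float = 1.0) -> list:
    """Run a query in Athena and return the first page of results."""
    import boto3  # imported lazily so rows_to_dicts() works without boto3

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
        WorkGroup=workgroup,
    )["QueryExecutionId"]

    # Poll until Athena reports a terminal state.
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Athena query ended in state {state}")

    # Only the first page is fetched; larger results need pagination.
    results = athena.get_query_results(QueryExecutionId=query_id)
    return rows_to_dicts(results["ResultSet"])
```

Athena returns every value as a string here, so numeric columns such as `value` still need converting before any arithmetic.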

📊 Step 4 — Connect Grafana to Athena

In Grafana:

Add datasource → AWS Athena
Configure:
AWS Region
S3 query results bucket
Workgroup
Test connection → Success
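
I used the Grafana UI for this, but the settings above can also be expressed as a data source definition. The sketch below builds one as a Python dict and prints it as JSON. The field names follow my understanding of the Athena data source plugin's provisioning format, so check them against your plugin version. The region and results bucket are hypothetical placeholders, and `authType: default` assumes Grafana can pick up AWS credentials from its environment.

```python
import json


def athena_datasource(region: str, workgroup: str, database: str,
                      output_location: str) -> dict:
    """Build a Grafana data source definition for the Athena plugin."""
    return {
        "name": "Athena",
        "type": "grafana-athena-datasource",
        "access": "proxy",
        "jsonData": {
            "authType": "default",              # use the AWS SDK default credentials
            "defaultRegion": region,
            "catalog": "AwsDataCatalog",
            "database": database,
            "workgroup": workgroup,
            "outputLocation": output_location,  # S3 query results bucket
        },
    }


payload = athena_datasource(
    region="us-east-1",                                  # hypothetical region
    workgroup="primary",                                 # assumed default workgroup
    database="s3_log_db",                                # from this post
    output_location="s3://your-query-results-bucket/",   # hypothetical bucket
)
print(json.dumps(payload, indent=2))
```

The printed JSON could be sent to Grafana's data source HTTP API or adapted into a provisioning file, which keeps the connection settings in version control.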

📈 Step 5 — Build Dashboards in Grafana

I created multiple visualizations using raw queries (not JSON imports):

Bar chart for age-group distribution

Time series analysis

Gender-wise population trends

Total population per year

🗂️ Project Folder Structure
AWS-Athena-S3-Grafana-Analytics/
│── docs/
│ └── README.md
│── s3/
│ └── README.md
│── sql/
│ └── athena_queries.sql
│── grafana/
│ └── README.md
└── README.md

📌 Key Learnings

✔ How to automate schema detection with Glue
✔ How Athena queries S3 without a database server
✔ How to integrate AWS & Grafana
✔ How to visualize analytics with live SQL queries

A great project for a Cloud + DevOps profile! 💯

🔗 GitHub Repository:
https://github.com/abhikorde31/aws-s3-athena-grafana-analytics

🏁 Conclusion

This project shows how to build a production-grade analytics pipeline using AWS serverless services + Grafana.

If you want a visualization dashboard, real-time updates with Lambda, or Terraform automation — I can help you extend it.