Abhishek Korde

📊 AWS S3 + AWS Glue + Athena + Grafana: End-to-End Analytics Project

🎯 Overview

In this project, I built a complete analytics pipeline using AWS services.
The goal was simple:

Read CSV files from S3 → Convert to tables using Glue → Query using Athena → Visualize in Grafana

This is a common pattern for serverless data analytics in Cloud + DevOps environments: there is no database server to run, and the data stays in S3.


🚀 Architecture Diagram

[Architecture diagram: S3 (raw CSV files) → Glue Crawler → Glue Data Catalog → Athena → Grafana]


🏗️ Services Used

| Service | Purpose |
| --------------------- | ------------------------------------- |
| S3 | Stores raw CSV files |
| Glue Crawler | Detects schema & creates Athena table |
| Glue Data Catalog | Manages table metadata |
| Athena | SQL queries on S3 files |
| Grafana | Visualizes Athena queries |


🧩 Step 1: Upload CSV Data to S3

I created the following folder structure:

s3/
└── s3-athena-analytics-abhishek31/raw/
    ├── bd-dec22-births-deaths-natural-increase.csv
    ├── bd-dec22-deaths-by-sex-and-age.csv
    └── electronic-card-transactions.csv

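Upload each file using the AWS CLI (the destination must be an s3:// URI, otherwise the CLI treats it as a local path):

aws s3 cp your_file_name.csv s3://s3-athena-analytics-abhishek31/raw/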
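
The same upload can also be scripted. The following is a minimal Python sketch, assuming boto3 is installed and AWS credentials are configured; the helper names `object_key` and `upload_csv` are illustrative and not part of the original project.

```python
from pathlib import Path

BUCKET = "s3-athena-analytics-abhishek31"  # bucket name from the post
PREFIX = "raw"                             # folder (key prefix) from the post


def object_key(local_path: str, prefix: str = PREFIX) -> str:
    """Build the S3 object key as <prefix>/<file name>."""
    return f"{prefix.strip('/')}/{Path(local_path).name}"


def upload_csv(local_path: str, bucket: str = BUCKET, prefix: str = PREFIX) -> str:
    """Upload one CSV file and return its s3:// URI."""
    import boto3  # imported lazily so object_key() works without boto3

    key = object_key(local_path, prefix)
    boto3.client("s3").upload_file(local_path, bucket, key)
    return f"s3://{bucket}/{key}"
```

Calling `upload_csv("electronic-card-transactions.csv")` would place the file under the `raw/` prefix, mirroring the CLI command above.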


🤖 Step 2: Create AWS Glue Crawler

The crawler:

  • Points to the S3 folder
  • Detects the CSV schema
  • Creates a table in the AWS Glue Data Catalog
  • Database name: s3_log_db
  • Table name: s3_athena_analytics_abhishek31

After the crawler finished, I validated the detected schema.
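
For readers who prefer code over the console, the crawler setup can be sketched with boto3. This is only a sketch under assumptions: the crawler name, the IAM role ARN, and the exact S3 target path are placeholders, since the post does not state them. Only the database name `s3_log_db` comes from the post.

```python
CRAWLER_NAME = "s3-athena-analytics-crawler"                 # hypothetical name
ROLE_ARN = "arn:aws:iam::123456789012:role/GlueCrawlerRole"  # hypothetical role
S3_TARGET = "s3://s3-athena-analytics-abhishek31/"           # assumed crawl path


def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build the keyword arguments for glue.create_crawler()."""
    return {
        "Name": name,
        "Role": role_arn,            # role the crawler assumes to read S3
        "DatabaseName": database,    # Data Catalog database for the new table
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(request: dict) -> None:
    """Create the crawler, then start one run. Needs boto3 and credentials."""
    import boto3  # imported lazily so crawler_request() works without boto3

    glue = boto3.client("glue")
    glue.create_crawler(**request)
    glue.start_crawler(Name=request["Name"])
```

A call such as `create_and_start_crawler(crawler_request(CRAWLER_NAME, ROLE_ARN, "s3_log_db", S3_TARGET))` would create the crawler and start a single run; the role must be allowed to read the bucket.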

🔍 Step 3: Query Data in Athena

Athena queries the CSV data in S3 directly with SQL, using the table definition that the crawler stored in the Glue Data Catalog.

Example query:

SELECT 
    series_reference,
    period,
    series_title_2,
    value
FROM s3_log_db.s3_athena_analytics_abhishek31
ORDER BY series_title_2 ASC;

This confirmed that the crawler had produced a correctly structured table.
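
The same check can be run outside the console. Below is a minimal Python sketch, assuming boto3 and configured credentials; the results location passed to `query_request` is a placeholder, because the post does not name the query results bucket.

```python
import time

QUERY = """
SELECT series_reference, period, series_title_2, value
FROM s3_log_db.s3_athena_analytics_abhishek31
ORDER BY series_title_2 ASC
"""


def query_request(sql: str, database: str, output_location: str) -> dict:
    """Build the keyword arguments for athena.start_query_execution()."""
    return {
        "QueryString": sql.strip(),
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: dict, poll_seconds: float = 1.0) -> str:
    """Start the query, poll until it finishes, and return the final state."""
    import boto3  # imported lazily so query_request() works without boto3

    athena = boto3.client("athena")
    query_id = athena.start_query_execution(**request)["QueryExecutionId"]
    while True:
        execution = athena.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            return state
        time.sleep(poll_seconds)
```

Athena runs queries asynchronously, which is why the sketch polls `get_query_execution` instead of expecting rows back from the first call.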

📊 Step 4: Connect Grafana to Athena

In Grafana:

  1. Add data source → AWS Athena
  2. Configure:
       • AWS Region
       • S3 query results bucket
       • Workgroup
  3. Test connection → Success
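
The steps above can also be scripted against Grafana's HTTP API (`POST /api/datasources`). The sketch below is an illustration under stated assumptions: the `jsonData` field names follow my reading of the Athena data source plugin's provisioning format and should be checked against the plugin documentation, and the region, results location, Grafana URL, and token are placeholders not given in the post.

```python
import json
import urllib.request


def athena_datasource(region: str, workgroup: str,
                      output_location: str, database: str) -> dict:
    """Build a data source payload for the Athena plugin (assumed field names)."""
    return {
        "name": "Athena",
        "type": "grafana-athena-datasource",
        "access": "proxy",
        "jsonData": {
            "authType": "default",         # use the AWS SDK default credentials
            "defaultRegion": region,
            "catalog": "AwsDataCatalog",   # Athena's default catalog name
            "database": database,
            "workgroup": workgroup,
            "outputLocation": output_location,
        },
    }


def create_datasource(grafana_url: str, token: str, payload: dict) -> dict:
    """POST the payload to Grafana and return the parsed JSON response."""
    request = urllib.request.Request(
        f"{grafana_url.rstrip('/')}/api/datasources",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Keeping the payload builder separate from the HTTP call makes it easy to inspect the configuration before sending anything to Grafana.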


📈 Step 5: Build Dashboards in Grafana

I created multiple visualizations using hand-written SQL queries (not imported dashboard JSON):

  • Bar chart for age-group distribution
  • Time series analysis
  • Gender-wise population trends
  • Total population per year
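
As an illustration of the kind of query behind such a panel, here is a small Python sketch that builds an aggregate query over the columns used in the Step 3 example. It assumes the crawler inferred `value` as a numeric column; the function name and the series title passed in are hypothetical, and the actual panel queries live in the repository's SQL file.

```python
TABLE = "s3_log_db.s3_athena_analytics_abhishek31"  # table name from the post


def totals_per_period_sql(series_title: str, table: str = TABLE) -> str:
    """Return a query that sums `value` per `period` for one series."""
    # Double any single quote so the title is a valid SQL string literal.
    safe_title = series_title.replace("'", "''")
    return (
        "SELECT period, SUM(value) AS total_value\n"
        f"FROM {table}\n"
        f"WHERE series_title_2 = '{safe_title}'\n"
        "GROUP BY period\n"
        "ORDER BY period"
    )
```

The returned text can be pasted into a Grafana panel's query editor or run through Athena directly.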


🗂️ Project Folder Structure

AWS-Athena-S3-Grafana-Analytics/
├── docs/
│   └── README.md
├── s3/
│   └── README.md
├── sql/
│   └── athena_queries.sql
├── grafana/
│   └── README.md
└── README.md


📌 Key Learnings

✔ How to automate schema detection with Glue
✔ How Athena queries S3 without a database server
✔ How to integrate AWS with Grafana
✔ How to visualize analytics with live SQL queries

A great project for a Cloud + DevOps profile! 💯


🔗 GitHub Repository:
https://github.com/abhikorde31/aws-s3-athena-grafana-analytics


🏁 Conclusion

This project shows how to build an end-to-end analytics pipeline using serverless AWS services and Grafana.

If you want to add more dashboards, real-time updates with Lambda, or Terraform automation, I can help you extend it.
