Building a Serverless Data Analytics Pipeline with AWS: Premier League Dashboard

Inspired by AWS Cookbook by John Culkin & Mike Zazon - Chapter 7: Big Data


My Journey into Data Analytics

While exploring AWS AI/ML services, I realized that artificial intelligence and machine learning are fundamentally built upon quality data foundations. This insight led me to step back and master the data analytics fundamentals first. What better way than to build a complete serverless pipeline?

In this post, I'll share how I created a scalable analytics solution using Amazon S3, Athena, and QuickSight to analyze Premier League data.

Why Premier League data? As a passionate football enthusiast, I'm fascinated by the rich statistical narratives that unfold each season. Every match generates meaningful data points, from goals and assists to tactical formations and player performance metrics. This abundance of structured, real-world data makes football analytics an ideal playground for learning data engineering concepts while working with something I genuinely care about.

What we'll build:

  • Serverless data storage with Amazon S3
  • SQL querying with Amazon Athena
  • Interactive dashboards with Amazon QuickSight

⚠️ Disclaimer: The Premier League data used in this project is completely fictional and for demonstration purposes only. If you see Manchester City with 150 points or Tottenham actually winning something, that's just my creative data generation at work! 😄 Please don't use this for your fantasy football decisions - you've been warned! For real Premier League data, check the official sources (and prepare for more realistic disappointment).

Architecture Overview

AWS Analytics

This serverless architecture eliminates infrastructure complexity while providing:

  • Scalability: Automatic scaling without server management
  • Cost-efficiency: Pay-per-query pricing model
  • Speed: Query results in seconds

Implementation Highlights

1. S3 Data Lake Setup

I stored Premier League CSV files in S3, creating a scalable data foundation:

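# Create a uniquely named bucket and upload the data
BUCKET="premier-league-data-$(openssl rand -hex 3)"
# Outside us-east-1, add: --create-bucket-configuration LocationConstraint=<your-region>
aws s3api create-bucket --bucket "$BUCKET"
# --recursive keeps any subfolders under data/; "your-bucket" below stands for $BUCKET
aws s3 cp data/ "s3://$BUCKET/raw-data/" --recursive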
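S3 bucket names are globally unique, which is why the command above appends a random suffix. As a minimal sketch, the same naming step could be done in Python with the standard library. Here make_bucket_name is a hypothetical helper, and the regex covers only the basic rules (3 to 63 characters, lowercase letters, digits, hyphens), not every S3 naming rule.

```python
import re
import secrets

# Simplified S3 naming check: 3-63 chars, lowercase letters, digits, hyphens,
# starting and ending with a letter or digit. S3 enforces further rules.
BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9-]{1,61}[a-z0-9]$")


def make_bucket_name(prefix: str = "premier-league-data") -> str:
    """Return prefix plus 6 random hex characters, like `openssl rand -hex 3`."""
    name = f"{prefix}-{secrets.token_hex(3)}"
    if not BUCKET_NAME_RE.match(name):
        raise ValueError(f"invalid bucket name: {name}")
    return name


if __name__ == "__main__":
    print(make_bucket_name())
```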

Data Source

2. Athena SQL Querying

Created External Tables:
Athena's power lies in querying data directly from S3 without moving it. Here's how I created the tables:

-- Athena reads every file under a table's LOCATION, so each table gets its own prefix.
-- This assumes the CSVs were uploaded to raw-data/standings/ and raw-data/match_results/.

-- Create standings table
CREATE EXTERNAL TABLE IF NOT EXISTS standings (
    team_name STRING,
    matches_played INT,
    wins INT,
    draws INT,
    losses INT,
    goals_for INT,
    goals_against INT,
    goal_difference INT,
    points INT
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://your-bucket/raw-data/standings/'
TBLPROPERTIES ('skip.header.line.count'='1');

-- Create match results table
CREATE EXTERNAL TABLE IF NOT EXISTS match_results (
    team_name STRING,
    result_type STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION 's3://your-bucket/raw-data/match_results/'
TBLPROPERTIES ('skip.header.line.count'='1');

Athena Tables
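Because the standings table uses a plain delimited format, Athena maps CSV columns to table columns by position, so the file's column order has to match the table definition. The sketch below generates a fictional standings file in that order using only the Python standard library. The team names, the 38-match season, and the goal ranges are illustrative assumptions, not the data used in this post.

```python
import csv
import io
import random

# Column order mirrors the standings table; delimited files are read by position.
COLUMNS = ["team_name", "matches_played", "wins", "draws", "losses",
           "goals_for", "goals_against", "goal_difference", "points"]


def make_row(team, rng, matches=38):
    """Build one internally consistent (but fictional) standings row."""
    wins = rng.randint(0, matches)
    draws = rng.randint(0, matches - wins)
    losses = matches - wins - draws
    goals_for = rng.randint(wins, wins + 60)          # at least one goal per win
    goals_against = rng.randint(losses, losses + 60)  # at least one conceded per loss
    return {
        "team_name": team,
        "matches_played": matches,
        "wins": wins,
        "draws": draws,
        "losses": losses,
        "goals_for": goals_for,
        "goals_against": goals_against,
        "goal_difference": goals_for - goals_against,
        "points": 3 * wins + draws,
    }


def standings_csv(teams, seed=0):
    """Return CSV text with a header row (the table skips it on read)."""
    rng = random.Random(seed)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    for team in teams:
        writer.writerow(make_row(team, rng))
    return buf.getvalue()


if __name__ == "__main__":
    print(standings_csv(["Team A", "Team B", "Team C"]))
```

One caveat: the csv module quotes values that contain commas, and the simple delimited format does not understand quoting, so team names with commas would need a different SerDe (for example OpenCSVSerde).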

Sample Analytics Queries:

Once the tables were created, I ran a few analytical queries:

-- Top 6 teams analysis
SELECT team_name, points, goal_difference 
FROM standings 
ORDER BY points DESC 
LIMIT 6;

Top 6 teams

-- Win percentage calculation
-- CAST keeps the division in floating point; NULLIF avoids dividing by zero
SELECT team_name,
       ROUND(CAST(wins AS DOUBLE) * 100 / NULLIF(matches_played, 0), 2) AS win_percentage
FROM standings
ORDER BY win_percentage DESC;

Win Percentage
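Queries like the top-teams and win-percentage ones can be tried locally before running them against Athena. The sketch below loads a few made-up rows into an in-memory SQLite database and runs both. SQLite is only a stand-in here: Athena runs a different SQL engine, so this checks the query logic, not Athena-specific syntax or types. The sample rows and function names are invented for illustration.

```python
import sqlite3

SAMPLE_ROWS = [
    # team_name, matches_played, wins, draws, losses, gf, ga, gd, points
    ("Team A", 38, 28, 5, 5, 90, 30, 60, 89),
    ("Team B", 38, 20, 8, 10, 65, 45, 20, 68),
    ("Team C", 38, 10, 9, 19, 40, 62, -22, 39),
]


def load_standings(rows):
    """Create an in-memory standings table and fill it with the given rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE standings (
               team_name TEXT, matches_played INTEGER, wins INTEGER,
               draws INTEGER, losses INTEGER, goals_for INTEGER,
               goals_against INTEGER, goal_difference INTEGER, points INTEGER)"""
    )
    conn.executemany("INSERT INTO standings VALUES (?,?,?,?,?,?,?,?,?)", rows)
    return conn


def top_teams(conn, n=6):
    """Top n teams by points, as in the top-six query."""
    return conn.execute(
        "SELECT team_name, points, goal_difference FROM standings "
        "ORDER BY points DESC LIMIT ?",
        (n,),
    ).fetchall()


def win_percentages(conn):
    """Win percentage per team, highest first."""
    return conn.execute(
        "SELECT team_name, "
        "ROUND(wins * 100.0 / matches_played, 2) AS win_percentage "
        "FROM standings ORDER BY win_percentage DESC"
    ).fetchall()


if __name__ == "__main__":
    conn = load_standings(SAMPLE_ROWS)
    print(top_teams(conn))
    print(win_percentages(conn))
```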

-- Verify data integrity
SELECT * FROM standings;
SELECT * FROM match_results;

Standings-results
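Beyond eyeballing the query output, a few arithmetic rules can confirm that a standings row is internally consistent: matches played should equal wins plus draws plus losses, goal difference should equal goals for minus goals against, and points should equal three per win plus one per draw. Here is a minimal sketch of those checks, where check_row is a hypothetical helper and the sample rows are invented:

```python
def check_row(row):
    """Return a list of human-readable problems found in one standings row."""
    problems = []
    if row["wins"] + row["draws"] + row["losses"] != row["matches_played"]:
        problems.append("matches_played != wins + draws + losses")
    if row["goals_for"] - row["goals_against"] != row["goal_difference"]:
        problems.append("goal_difference != goals_for - goals_against")
    if 3 * row["wins"] + row["draws"] != row["points"]:
        problems.append("points != 3 * wins + draws")
    return problems


GOOD_ROW = {
    "team_name": "Team A", "matches_played": 38, "wins": 28, "draws": 5,
    "losses": 5, "goals_for": 90, "goals_against": 30,
    "goal_difference": 60, "points": 89,
}
# Same row with an impossible points total, to show a failing check.
BAD_ROW = dict(GOOD_ROW, points=150)

if __name__ == "__main__":
    for row in (GOOD_ROW, BAD_ROW):
        print(row["team_name"], row["points"], check_row(row) or "ok")
```

Real league tables can include points deductions, so the points rule is best treated as a warning rather than a hard failure.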

Key Athena Benefits:

  • No data movement required
  • Standard SQL interface
  • Pay only for data scanned ($5/TB)
  • Results in seconds
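To make the pay-per-scan model concrete, here is a rough per-query estimator. It uses the $5/TB rate quoted above and assumes Athena's billing rule of rounding each query up to the nearest megabyte with a 10 MB minimum. Treating 1 TB as 1024^4 bytes is also an assumption, and rates vary by region, so take the output as an estimate only.

```python
import math

PRICE_PER_TB_USD = 5.0        # rate quoted in this post
MB = 1024 ** 2
TB = 1024 ** 4                # assumption: binary terabyte
MIN_BILLED_BYTES = 10 * MB    # assumed per-query minimum


def athena_query_cost(bytes_scanned: int) -> float:
    """Estimate the USD cost of one query scanning `bytes_scanned` bytes."""
    if bytes_scanned < 0:
        raise ValueError("bytes_scanned must be non-negative")
    rounded = math.ceil(bytes_scanned / MB) * MB  # round up to whole megabytes
    billed = max(rounded, MIN_BILLED_BYTES)       # apply the per-query minimum
    return billed / TB * PRICE_PER_TB_USD


if __name__ == "__main__":
    for size in (50_000, 1024 ** 3, TB):
        print(f"{size:>16,d} bytes -> ${athena_query_cost(size):.6f}")
```

Under these assumptions, small CSV files like the ones in this project fall below the minimum, so every query is billed as 10 MB regardless of file size.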

3. QuickSight Dashboards

Built interactive visualizations, including:

  • League standings table
  • Points comparison charts
  • Goal difference analysis
  • Team performance metrics

QuickSight Dashboard

Business Value for Management

QuickSight delivers immediate ROI through:

Decision Speed: Real-time dashboards eliminate waiting for IT reports
Cost Savings: $9/user/month vs. $70+/user/month for traditional BI tools like Tableau
Self-Service Analytics: Business users create their own insights without technical dependencies
Mobile Access: Executive dashboards available anywhere, anytime
Scalability: Handles 10 users or 10,000 users with the same architecture
Security: Enterprise-grade AWS security and compliance built in

What QuickSight can do
Image source: amazon.com QuickSight page

Management Benefits:

  • Reduce reporting cycle from weeks to minutes
  • Democratize data access across all departments
  • Lower total cost of ownership by 60-80% vs traditional solutions
  • Eliminate server maintenance and upgrade costs

Results & Insights

Cost Breakdown:

  • S3 Storage: ~$0.05/month
  • Athena Queries: ~$0.25/month
  • QuickSight: $9/user/month

Total: ~$9.30/month (single user) for enterprise-grade analytics!
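The total is simple addition, but it is worth seeing how it scales: storage and query costs are shared, while the QuickSight fee is per user. Here is a small sketch using the figures above. The function name is hypothetical, and the storage and Athena amounts are the approximate monthly figures from this post, held constant for simplicity.

```python
S3_STORAGE_USD = 0.05           # approximate monthly figure from this post
ATHENA_QUERIES_USD = 0.25       # approximate monthly figure from this post
QUICKSIGHT_PER_USER_USD = 9.00  # per user, per month


def monthly_cost(users: int = 1) -> float:
    """Estimated monthly total in USD for the given number of QuickSight users."""
    if users < 1:
        raise ValueError("users must be at least 1")
    total = S3_STORAGE_USD + ATHENA_QUERIES_USD + QUICKSIGHT_PER_USER_USD * users
    return round(total, 2)


if __name__ == "__main__":
    for n in (1, 5, 20):
        print(f"{n:>2} user(s): ${monthly_cost(n):.2f}/month")
```

With these inputs the per-user fee dominates almost immediately, so the number of dashboard users matters far more than data volume at this scale.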

Key Learnings:
✅ Setup completed in under 2 hours
✅ Serverless = zero infrastructure management
✅ SQL familiarity accelerated development
⚠️ QuickSight permissions needed some troubleshooting during initial setup

Next Steps

This foundation opens doors to:

  • Real-time data integration
  • Machine learning predictions
  • Advanced ETL pipelines with AWS Glue

Final Reflections

Starting with data fundamentals before diving into AI/ML proved invaluable. This serverless analytics pipeline demonstrates that powerful data solutions don't require complex infrastructure - just the right AWS services working together.

The S3 + Athena + QuickSight combination delivers enterprise-grade analytics at startup costs, making it perfect for both learning and production use cases.

Resources


Building your own data pipeline? Connect with me on LinkedIn. I'd love to hear about your experience!
