Enrique Uribe

Interested in Football Analytics?

I've recently started diving into football analytics and have written a sample Python program that scrapes single-game shot data from https://understat.com/.

This marks the beginning of my journey into data manipulation. I’m excited to dive deeper into this field and look forward to sharing more updates as I progress.

Repo:
https://github.com/UribeJr/football-data-scraper-to-csv-exporter

#!/usr/bin/env python
# coding: utf-8

# In[2]:


#import modules and packages
import requests
from bs4 import BeautifulSoup
import json
import pandas as pd


# In[3]:


#scrape single game shots
base_url = 'https://understat.com/match/'
match = input("Enter your match ID: ").strip()  # input() already returns a str
url = base_url + match


# In[16]:


res = requests.get(url)
res.raise_for_status()  # stop early on a bad match ID or a network error
soup = BeautifulSoup(res.content, 'lxml')
script = soup.find_all('script')  # every <script> tag on the match page


# In[18]:


# this script relies on the second <script> tag holding the shot data
string = script[1].string


# In[26]:


#strip symbols so we only have json data
index_start = string.index("('") + 2
index_end = string.index("')")

json_data = string[index_start:index_end]
json_data = json_data.encode('utf8').decode('unicode_escape')
data = json.loads(json_data)


# In[35]:


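# 'h' holds the home team's shots; 'a' is assumed to mirror it for the away team
df_h = pd.DataFrame(data['h'])
df_a = pd.DataFrame(data['a'])
print("Home Team DataFrame:")
print(df_h.head())
print("Away Team DataFrame:")
print(df_a.head())


# In[37]:


# Save each team's DataFrame to its own CSV file
df_h.to_csv('home_team_shots.csv', index=False)
df_a.to_csv('away_team_shots.csv', index=False)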


# In[ ]:
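The script above picks the shot data by position (script[1]) and then strips everything around the first "('" and "')". As a rough sketch of a less position-dependent approach, the helper below searches the raw page source for a named variable instead. The function name extract_json_var, the variable name shotsData, and the JSON.parse('...') wrapper are assumptions for illustration, not something the original script confirms, and the sample HTML is made up so the block runs without a network call.

```python
import json
import re


def extract_json_var(html, var_name):
    """Return the decoded JSON assigned to `var_name` via JSON.parse('...').

    Hypothetical helper: assumes the page embeds the data as
    `var_name = JSON.parse('<escaped json>')`.
    """
    pattern = re.compile(
        r"\b" + re.escape(var_name) + r"\s*=\s*JSON\.parse\('(.*?)'\)",
        re.DOTALL,
    )
    found = pattern.search(html)
    if found is None:
        raise ValueError(f"{var_name} not found in page source")
    # Same decoding step as the script above: turn \xNN escapes into characters
    raw = found.group(1).encode("utf8").decode("unicode_escape")
    return json.loads(raw)


# Made-up page fragment standing in for a real match page
sample_html = r"""<script>
var shotsData = JSON.parse('\x7B\x22h\x22\x3A\x5B\x5D,\x22a\x22\x3A\x5B\x5D\x7D');
</script>"""

print(extract_json_var(sample_html, "shotsData"))  # {'h': [], 'a': []}
```

With a real response, the call would look like extract_json_var(res.text, "shotsData"). If the page layout differs from these assumptions, the helper raises a ValueError rather than silently parsing the wrong script tag.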

How To

  • Install and import the required packages: requests, pandas, and BeautifulSoup (bs4). The script also uses the lxml parser, so install lxml too.
  • Go to https://understat.com/ and open the match you want shot data for. The match URL should look like https://understat.com/match/{match-id}
  • Run data_scraping.py and enter the match-id when prompted.
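The match-id is the last part of the match URL, so it is easy to paste the whole URL by mistake. Here is a small sketch of a helper that accepts either form. The name match_url is hypothetical, and the pattern assumes match IDs are purely numeric, which the steps above do not state.

```python
import re

BASE_URL = "https://understat.com/match/"


def match_url(user_input):
    """Accept a bare match ID or a full match URL and return the match URL.

    Assumption: match IDs consist only of digits.
    """
    text = user_input.strip()
    found = re.fullmatch(r"(?:https?://understat\.com/match/)?(\d+)/?", text)
    if found is None:
        raise ValueError(f"not a match ID or match URL: {user_input!r}")
    return BASE_URL + found.group(1)


# 12345 is a placeholder ID, not a real match
print(match_url("12345"))
print(match_url("https://understat.com/match/12345"))
```

Both calls print the same URL, so the rest of the script can keep building on a single url value regardless of what was typed.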

Congratulations!

The program then scrapes the shot data for the match and loads the Home and Away teams' data into separate Data Frames. The Data Frames are then exported as separate CSV files for reference.

Data Frame:

[Screenshot: resulting Data Frame]

CSV:

[Screenshot: exported CSV file]

Top comments (1)

Camila Monteiro da Costa

If you're into soccer analytics, you're definitely not alone! From making predictions to analyzing team strategy, analytics can provide cool insights. It's a great way to get a deeper understanding of the game and maybe find unexpected bets. By the way, I recently found a cool app that you might like! It gives you access to stats, results and other data useful for analyzing matches. You can check it out on this link - a handy tool for anyone who is into soccer and wagers.
