Hello, I’m sea_turt1e.
In this post, I’ll share the process and results of building a machine learning model that predicts chemistry between players in the NBA, the American professional basketball league that I love.
Overview
- Used Graph Neural Networks (GNNs) for chemistry prediction.
- Adopted AUC (Area Under the Curve) as the evaluation metric.
- The AUC at convergence was approximately 0.73.
- Training data covered the 1996–97 to 2021–22 seasons, with the 2022–23 season reserved for testing.
Note: About the NBA
For readers unfamiliar with the NBA, some parts of this post may be difficult to follow. However, you can think of "chemistry" in a more intuitive sense. Additionally, while this post focuses on the NBA, the approach can be applied to other sports or even interpersonal chemistry predictions.
Chemistry Prediction Results
Let’s start with the prediction results. I’ll dive into the dataset and technical details later.
Explanation of Edges and Scores
In chemistry predictions, red edges indicate good chemistry, black edges indicate moderate chemistry, and blue edges indicate poor chemistry.
The number on each edge is the predicted chemistry score, which ranges from 0 to 1.
Chemistry Predictions for Star Players
Here are the chemistry predictions for star players. The graph only includes pairs of players who have never been on the same team.
Looking at the predictions for star players who have never played together, the results may not always align with intuition.
For instance, LeBron James and Stephen Curry displayed excellent synergy during the Olympics, suggesting high chemistry. On the other hand, it’s surprising that Nikola Jokić was predicted to have low chemistry with other players.
Chemistry Predictions for Major Trades in the 2022–23 Season
To make the predictions more relatable, I tested chemistry between players involved in actual trades during the 2022–23 season.
Since the 2022–23 season wasn’t included in the training data, predictions aligning with real-world impressions could indicate the model's validity.
The 2022–23 season saw several significant trades.
Here are the predictions for key players such as Kevin Durant, Kyrie Irving, and Rui Hachimura.
The chemistry predictions for their new teams were as follows:
- Lakers: Rui Hachimura – LeBron James (Red edge: good chemistry)
- Suns: Kevin Durant – Chris Paul (Black edge: moderate chemistry)
- Mavericks: Kyrie Irving – Luka Doncic (Blue edge: poor chemistry)
Considering the 2022–23 season’s dynamics, these results seem reasonably accurate.
(Although the situation changed for the Suns and Mavericks in later seasons.)
Technical Details
From here, I’ll explain the technical aspects, including the GNN framework and dataset preparation.
What is a GNN?
A GNN (Graph Neural Network) is a neural network designed to process graph-structured data.
In this model, "chemistry" between players is represented as graph edges, and learning was conducted as follows:
- Positive edges: Pairs of players with high assist counts.
- Negative edges: Pairs of players with low assist counts.
For negative edges, the model prioritized "teammates with low assist counts" and de-emphasized "players from different teams."
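To make the idea concrete, here is a minimal sketch of how a GNN-style model can turn a graph into an edge score between 0 and 1. It is an illustration only, not the architecture used in this project: the mean aggregation, the dot-product decoder, and the toy players and features are all assumptions, and a real GNN would add learned weights and nonlinearities.

```python
import math

def mean_aggregate(features, neighbors):
    """One round of message passing: average a node's features with its neighbors'."""
    updated = {}
    for node, feat in features.items():
        group = [feat] + [features[n] for n in neighbors.get(node, [])]
        updated[node] = [sum(vals) / len(group) for vals in zip(*group)]
    return updated

def edge_score(emb_a, emb_b):
    """Dot-product decoder squashed into (0, 1) with a sigmoid."""
    dot = sum(x * y for x, y in zip(emb_a, emb_b))
    return 1.0 / (1.0 + math.exp(-dot))

# Hypothetical toy graph: three players with 2-dimensional features.
# A and B are connected; C has never played with either of them.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [2.0, 2.0]}
neighbors = {"A": ["B"], "B": ["A"], "C": []}

embeddings = mean_aggregate(features, neighbors)

# Score a pair that has no edge in the graph, as in link prediction.
score_ac = edge_score(embeddings["A"], embeddings["C"])
print(f"Predicted chemistry A-C: {score_ac:.3f}")
```

The key point is that each player's representation depends on who they are connected to, so a score can be produced even for pairs that have never shared a team.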
What is AUC?
AUC (Area Under the Curve) refers to the area under the ROC curve and serves as a metric to evaluate model performance.
An AUC of 1 means the model ranks every positive edge above every negative edge, while 0.5 is equivalent to random guessing. In this study, the model achieved an AUC of about 0.73, a moderately good result.
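One way to build intuition for AUC is that it equals the probability that a randomly chosen positive edge receives a higher score than a randomly chosen negative edge. The sketch below computes it by brute force over all positive/negative pairs. The labels and scores are made-up toy values, not results from this model, and in practice a library routine would normally be used instead.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy example: 1 = positive edge, 0 = negative edge.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.2]

auc = roc_auc(labels, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly -> 0.75
```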
Learning Curve and AUC Progression
Here are the learning curve and AUC progression from the training process:
Dataset
The primary innovation was in constructing the dataset.
To quantify chemistry, I hypothesized that "high assist counts" indicate good chemistry. Based on this hypothesis, the dataset was structured as follows:
- Positive edges: Pairs of players with high assist counts.
- Negative edges: Pairs of players with low assist counts.
Additionally, teammates with low assist counts were explicitly treated as having poor chemistry.
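The labeling rule above can be sketched roughly as follows. This is an illustrative reconstruction, not the code from the repository: the thresholds `HIGH_ASSISTS` and `LOW_ASSISTS`, the function `build_edges`, the choice to sum assists in both directions, and the toy players are all hypothetical.

```python
from itertools import combinations

# Hypothetical thresholds; the actual cut-offs are not stated in this post.
HIGH_ASSISTS = 50
LOW_ASSISTS = 5

def build_edges(assists, team_of):
    """Split player pairs into positive, hard-negative, and easy-negative edges.

    assists: dict mapping (passer, scorer) -> assist count
    team_of: dict mapping player -> team
    """
    positive, hard_negative, easy_negative = [], [], []
    for a, b in combinations(sorted(team_of), 2):
        # Assumption: chemistry is symmetric, so count assists in both directions.
        count = assists.get((a, b), 0) + assists.get((b, a), 0)
        if team_of[a] != team_of[b]:
            # Players on different teams: low-priority negatives.
            easy_negative.append((a, b))
        elif count >= HIGH_ASSISTS:
            positive.append((a, b))
        elif count <= LOW_ASSISTS:
            # Teammates who rarely assist each other: prioritized negatives.
            hard_negative.append((a, b))
    return positive, hard_negative, easy_negative

# Toy data with made-up players and counts.
team_of = {"P1": "X", "P2": "X", "P3": "X", "P4": "Y"}
assists = {("P1", "P2"): 40, ("P2", "P1"): 30, ("P1", "P3"): 2}

positive, hard_negative, easy_negative = build_edges(assists, team_of)
print(positive)       # teammates with many assists
print(hard_negative)  # teammates with few assists
print(easy_negative)  # pairs from different teams
```

Keeping the two kinds of negatives in separate lists makes it easy to sample low-assist teammates more often than cross-team pairs during training.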
Code Details
All code is available on GitHub.
Following the instructions in the README should allow you to replicate the training process and plot the graphs described here.
https://github.com/sea-turt1e/NBANetwork
Future Prospects
There is still room for improvement, and I aim to achieve the following goals:
- Expand the definition of chemistry: Incorporate factors beyond assists to capture player relationships more accurately.
- Improve accuracy: Raise the AUC through better training methods and expanded datasets.
- Integrate natural language processing: Analyze player interviews and social media posts to add new perspectives.
- Write English articles: Publish content in English to reach a broader international audience.
- Develop a GUI for graph visualization: Create a web application that lets users explore player chemistry interactively.
Conclusion
In this post, I introduced my attempt to predict NBA player chemistry.
Although the model is still a work in progress, I hope to achieve even more interesting results with further improvements.
I’d love to hear your thoughts and advice in the comments!