Over the past few weeks, I’ve been experimenting with a question that kept coming to mind while watching Formula 1:
Could a local AI system act like a race engineer and make real-time strategy calls from live telemetry?
That idea eventually turned into a fully local AI pit-wall prototype capable of:
- Streaming telemetry at 10Hz
- Predicting tire cliff drop-offs one lap early
- Retrieving historical race knowledge using RAG
- Generating radio-style strategy calls with a local LLM
- Producing automated post-race PDF reports
And the entire stack runs offline.
The Goal
Most AI systems today rely heavily on cloud inference.
I wanted to explore something different:
- low latency
- local inference
- explainable predictions
- real-time telemetry processing
The project combines sequential ML prediction with retrieval-augmented contextual reasoning in a motorsport-inspired environment.
System Architecture
The pipeline is divided into five major components:
1. Telemetry Ingestion Layer
Historical F1 telemetry from the FastF1 library is replayed through FastAPI WebSockets at 10Hz.
Supported inputs:
- CSV
- JSON
- XLSX
The telemetry stream includes:
- tire temperatures
- sector timing
- lap pace
- degradation patterns
- speed traces
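At its core, the ingestion layer is a pacing loop that emits one frame every 100 ms. A minimal standard-library sketch of that loop, with hypothetical field names (the real stream is served through a FastAPI WebSocket route):

```python
import asyncio
import json

# Illustrative frames -- field names are hypothetical, not the
# project's actual telemetry columns.
SAMPLE_ROWS = [
    {"lap": 1, "tire_temp_fl": 96.2, "sector": 1, "speed_kph": 287.0},
    {"lap": 1, "tire_temp_fl": 96.8, "sector": 2, "speed_kph": 301.5},
    {"lap": 1, "tire_temp_fl": 97.1, "sector": 3, "speed_kph": 242.3},
]

async def stream_frames(rows, hz=10):
    """Yield telemetry rows as JSON strings, paced at `hz` frames per second."""
    interval = 1.0 / hz
    for row in rows:
        yield json.dumps(row)
        await asyncio.sleep(interval)

async def collect():
    # Consume the stream the way a WebSocket handler would.
    return [json.loads(f) async for f in stream_frames(SAMPLE_ROWS, hz=10)]

frames = asyncio.run(collect())
```

In the actual service, each JSON frame would be pushed with `websocket.send_text(...)` inside a FastAPI `@app.websocket` handler instead of being collected into a list.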
2. Predictive ML Engine
The prediction system uses:
- a 2-layer LSTM
- multi-head attention
- sequential telemetry windows
The model predicts:
- next-lap pace degradation
- a probability/confidence score
- possible tire cliff onset
The attention mechanism helped significantly when modeling nonlinear degradation patterns.
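That architecture can be sketched in PyTorch. Layer sizes, window length, and head counts below are assumptions for illustration; the post does not specify them:

```python
import torch
import torch.nn as nn

class DegradationPredictor(nn.Module):
    """Sketch: 2-layer LSTM + multi-head self-attention over a telemetry window."""

    def __init__(self, n_features=8, hidden=64, heads=4):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2, batch_first=True)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.pace_head = nn.Linear(hidden, 1)   # next-lap pace degradation
        self.cliff_head = nn.Linear(hidden, 1)  # tire-cliff probability

    def forward(self, x):                  # x: (batch, window, n_features)
        seq, _ = self.lstm(x)              # (batch, window, hidden)
        ctx, _ = self.attn(seq, seq, seq)  # self-attention over the window
        last = ctx[:, -1]                  # context at the most recent step
        pace = self.pace_head(last).squeeze(-1)
        cliff_p = torch.sigmoid(self.cliff_head(last)).squeeze(-1)
        return pace, cliff_p

model = DegradationPredictor()
window = torch.randn(4, 30, 8)  # e.g. 4 stints x 30-sample windows
pace, cliff_p = model(window)
```

The attention output is pooled at the final timestep here; other pooling choices (mean over the window, a learned query) are equally plausible.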
3. RAG Knowledge Base
To make strategy outputs more contextual, I added a Retrieval-Augmented Generation pipeline using:
- ChromaDB
- HuggingFace all-MiniLM-L6-v2 embeddings
The vector database stores:
- FIA rulebooks
- historical race reports
- strategy notes
This allows the system to retrieve contextual racing knowledge dynamically before generating strategy recommendations.
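The retrieve-before-generate pattern can be shown with a self-contained sketch. To keep it dependency-free, a bag-of-words cosine similarity stands in for the all-MiniLM-L6-v2 embeddings and a plain list stands in for ChromaDB; the documents are invented examples:

```python
import math
from collections import Counter

# Toy corpus standing in for the real knowledge base (FIA rulebooks,
# race reports, strategy notes).
DOCS = [
    "Soft compound tires typically hit a performance cliff after 15-20 laps.",
    "A pit stop under safety car costs roughly half the time of a green-flag stop.",
    "The undercut works best when the out-lap on fresh tires is clearly faster.",
]

def embed(text):
    """Bag-of-words stand-in for a sentence-transformer embedding."""
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=2):
    """Return the k most similar documents -- the same shape of result a
    ChromaDB query would produce."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

context = retrieve("when do soft tires hit the cliff")
```

The retrieved snippets are then prepended to the LLM prompt, so the strategy call is grounded in racing knowledge rather than the model's priors alone.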
4. AI Race Engineer
The strategy layer runs locally using:
- Ollama
- LLaMA 3
Instead of returning raw ML predictions, the system converts outputs into radio-style race engineering calls.
Example:
“Tire degradation trend indicates high cliff probability within 2 laps. Box window optimal between laps 18–20.”
One of the most interesting challenges was balancing:
- concise outputs
- explainability
- low latency
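Turning a raw prediction into a radio call is mostly prompt construction plus one local generate request. A sketch of the prompt builder, with hypothetical prediction keys; the payload shape matches Ollama's `/api/generate` endpoint, though nothing is actually sent here:

```python
import json

def build_strategy_prompt(prediction, context_docs):
    """Compose the prompt for the local model. The `prediction` keys are
    illustrative -- they mirror the ML engine's outputs described above,
    not the project's exact schema."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return (
        "You are an F1 race engineer. Using the prediction and context below, "
        "give one concise radio-style strategy call.\n\n"
        f"Prediction: next-lap pace loss {prediction['pace_delta_s']:+.2f}s, "
        f"tire-cliff probability {prediction['cliff_prob']:.0%} "
        f"within {prediction['cliff_laps']} laps.\n\n"
        f"Context:\n{context}\n\nCall:"
    )

prediction = {"pace_delta_s": 0.45, "cliff_prob": 0.82, "cliff_laps": 2}
docs = ["Soft compound tires typically hit a performance cliff after 15-20 laps."]
prompt = build_strategy_prompt(prediction, docs)

# Request body for a POST to http://localhost:11434/api/generate
# (Ollama's non-streaming generate API); not sent in this sketch.
payload = json.dumps({"model": "llama3", "prompt": prompt, "stream": False})
```

Keeping the numeric prediction verbatim in the prompt is what makes the final call explainable: the model paraphrases the numbers rather than inventing them.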
5. Frontend + Visualization
Frontend stack:
- Next.js 14
- Tailwind CSS
- Recharts
- Framer Motion
The UI focuses heavily on:
- real-time telemetry readability
- pit-wall inspired dashboards
- prediction visibility
- strategy clarity
Automated Race Debrief
At the end of a session, the system generates a complete PDF race report client-side using:
- jsPDF
- html2canvas
The report includes:
- tire performance
- prediction history
- telemetry snapshots
- strategic recommendations
Tech Stack
AI / ML
- PyTorch
- LSTM + Attention
- LangChain
- Ollama
- LLaMA 3
Backend
- FastAPI
- Uvicorn
- SQLAlchemy
- SQLite
Data + Retrieval
- ChromaDB
- HuggingFace embeddings
- FastF1 telemetry
Frontend
- Next.js 14
- Tailwind CSS
- Recharts
- Framer Motion
Key Challenges
Some of the hardest parts were:
- Maintaining low-latency telemetry streaming
- Combining ML predictions with RAG retrieval cleanly
- Running everything locally without cloud services
- Making outputs interpretable instead of “black box” predictions
Future Improvements
Some things I’m currently exploring:
- reinforcement learning for strategy optimization
- live multi-car telemetry simulation
- voice-based AI engineer interaction
- better uncertainty estimation
- more advanced degradation modeling
Demo + Repository
Public showcase repository:
GitHub Repository
(Full implementation remains private for now.)
Final Thoughts
This project started as an experiment around local AI systems and real-time telemetry, but it became one of the most interesting engineering challenges I’ve worked on so far.
I’d genuinely appreciate feedback from:
- ML engineers
- motorsport enthusiasts
- telemetry specialists
- frontend developers
- local AI builders
Especially around:
- telemetry realism
- RAG usefulness
- UI/UX improvements
- inference architecture