DEV Community

kanta13jp1

Building a Fully Automated Horse Racing AI Prediction Pipeline with Flutter + Supabase

Why Horse Racing?

Horse racing data is rich, structured, and updated daily — a perfect playground for building an automated AI prediction pipeline. I built one into my Flutter Web app, covering both JRA (Japan Racing Association) and NAR (regional tracks, 15 venues).

Here's the full technical breakdown.


Architecture

[JRA/NAR Data Fetch]   → fetch_horse_racing.py (Python, EUC-JP decode)
        ↓
[tools-hub Edge Fn]    → horseracing.today / predict_all / predictions / accuracy
        ↓
[Supabase DB]          → horse_races / horse_results tables
        ↓
[GitHub Actions]       → horse-racing-update.yml (every hour)
        ↓
[Flutter UI]           → horse_racing_predictor_page.dart (3-tab layout)

Data Fetching: JRA + NAR (15 Regional Tracks)

The Python script fetch_horse_racing.py handles both JRA and NAR data:

import requests

# url and headers are defined elsewhere in the script
response = requests.get(url, headers=headers, timeout=10)
# Japanese horse racing sites use EUC-JP encoding
# errors='replace' substitutes U+FFFD for undecodable bytes instead of raising
content = response.content.decode('euc-jp', errors='replace')

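The encoding gotcha: on Japanese-locale Windows, Python's default text encoding is CP932, and EUC-JP bytes decoded as CP932 come out garbled. Passing 'euc-jp' explicitly makes the decode independent of the system locale, which matters because the script runs both on GitHub Actions (Ubuntu) and on local Windows. errors='replace' is a separate safeguard: it substitutes U+FFFD for any invalid byte instead of raising UnicodeDecodeError.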
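To see the mismatch concretely, the following minimal sketch round-trips a sample string through EUC-JP. The sample text and variable names are illustrative assumptions, not taken from fetch_horse_racing.py.

```python
# Illustrative sample; real input would be response.content from a racing site.
original = "東京競馬場"
raw = original.encode("euc-jp")

# Decoding with the matching codec recovers the text on any OS or locale.
decoded_ok = raw.decode("euc-jp", errors="replace")

# Decoding the same bytes as CP932 produces different, garbled characters.
decoded_wrong = raw.decode("cp932", errors="replace")

# errors="replace" only comes into play for bytes that are invalid in EUC-JP:
# they become U+FFFD instead of raising UnicodeDecodeError.
damaged = raw + b"\xff"
decoded_damaged = damaged.decode("euc-jp", errors="replace")

print(decoded_ok == original)
print(decoded_wrong == original)
print("\ufffd" in decoded_damaged)
```

The codec name is what decides whether the text is readable; the error handler only decides what happens to bytes that cannot be decoded at all.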


Edge Function: Action Dispatch in tools-hub

To stay under the 50 Edge Function hard cap, all horse racing features live as actions inside tools-hub:

// tools-hub/index.ts
switch (action) {
  case 'horseracing.today':
    return await getHorseRacingToday(supabase);
  case 'horseracing.predict_all':
    return await predictAllRaces(supabase, body);
  case 'horseracing.predictions':
    return await getPredictions(supabase, body);
  case 'horseracing.accuracy':
    return await getAccuracyStats(supabase);
}

This is the hub pattern: one deployed function, multiple behaviors selected by an action parameter. The project currently has 16 Edge Functions in total (hard cap: 50).
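The same hub idea can also be expressed as a lookup table rather than a switch. This is a language-neutral illustration written in Python, not the author's TypeScript; the handler bodies and the response shape are hypothetical placeholders.

```python
from typing import Any, Callable, Dict

# Hypothetical handlers standing in for getHorseRacingToday, getAccuracyStats, etc.
def get_today(body: Dict[str, Any]) -> Dict[str, Any]:
    return {"races": []}

def get_accuracy(body: Dict[str, Any]) -> Dict[str, Any]:
    return {"hit_rate": None}

# One entry per action: a new feature adds a row here, not a new deployed function.
ACTIONS: Dict[str, Callable[[Dict[str, Any]], Dict[str, Any]]] = {
    "horseracing.today": get_today,
    "horseracing.accuracy": get_accuracy,
}

def dispatch(action: str, body: Dict[str, Any]) -> Dict[str, Any]:
    handler = ACTIONS.get(action)
    if handler is None:
        # Unknown actions get an explicit error rather than an empty response.
        return {"status": 400, "error": f"unknown action: {action}"}
    return {"status": 200, "data": handler(body)}

print(dispatch("horseracing.today", {}))
print(dispatch("horseracing.unknown", {}))
```

A table also makes the unknown-action case explicit, which a switch without a default branch leaves undefined.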

Auth Zone Design

GitHub Actions calls these endpoints without a user JWT, so today and predictions are in the no-auth zone:

const NO_AUTH_ACTIONS = ['horseracing.today', 'horseracing.predictions'];

Both actions were originally placed in the auth zone, so GitHub Actions received a 401. Moving them to the no-auth list fixed it.
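The 401 follows directly from where an action sits relative to the allowlist. Here is a minimal Python sketch of such a gate, assuming a hypothetical authorize helper and a token check reduced to presence only; the real tools-hub logic is not shown in this post.

```python
from typing import Optional

# Actions callable without a user JWT (mirrors NO_AUTH_ACTIONS above).
NO_AUTH_ACTIONS = frozenset({"horseracing.today", "horseracing.predictions"})

def authorize(action: str, jwt: Optional[str]) -> int:
    """Return the HTTP status the gate would produce before any handler runs."""
    if action in NO_AUTH_ACTIONS:
        return 200  # public action: no token required
    if not jwt:
        return 401  # everything else is denied by default without a token
    # A real gate would verify the token here; this sketch only checks presence.
    return 200

print(authorize("horseracing.today", None))
print(authorize("horseracing.predict_all", None))
print(authorize("horseracing.predict_all", "some-token"))
```

Keeping the public list short and explicit means every open endpoint is a deliberate choice.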

Fixing the 500 Error on horse_results

Fetching every race result in one unfiltered SELECT timed out once the table grew. I switched to one query per race, run in parallel:

// Before: bulk SELECT → timeout
const { data } = await supabase.from('horse_results').select('*');

// After: parallel individual queries → fast
const results = await Promise.all(
  raceIds.map(id =>
    supabase.from('horse_results').select('*').eq('race_id', id)
  )
);
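One caveat with a query per race is that the number of in-flight requests grows with the number of races. The sketch below shows the same fan-out pattern in Python with a cap on concurrency; fetch_results_for_race is a hypothetical stand-in for one scoped query, and the limit of 5 is an arbitrary assumption.

```python
import asyncio
from typing import Dict, List

async def fetch_results_for_race(race_id: str) -> List[dict]:
    """Hypothetical stand-in for one scoped query on horse_results."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [{"race_id": race_id, "rank": 1}]

async def fetch_all(race_ids: List[str], max_concurrency: int = 5) -> Dict[str, List[dict]]:
    # The semaphore lets at most max_concurrency queries run at once.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(race_id: str) -> List[dict]:
        async with semaphore:
            return await fetch_results_for_race(race_id)

    # gather returns results in the same order as the inputs.
    per_race = await asyncio.gather(*(bounded(r) for r in race_ids))
    return dict(zip(race_ids, per_race))

race_ids = [f"race-{n}" for n in range(12)]
results = asyncio.run(fetch_all(race_ids))
print(len(results))
```

With a cap in place, a busy race day slows the fetch down gradually instead of opening an unbounded number of connections at once.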

GitHub Actions: Hourly Full Pipeline

# .github/workflows/horse-racing-update.yml
on:
  schedule:
    - cron: "0 * * * *"  # Every hour, on the hour (UTC)

jobs:
  update:  # job name is illustrative
    runs-on: ubuntu-latest
    steps:
      # Checkout and Python setup steps omitted for brevity
      - name: Run full pipeline
        run: |
          python fetch_horse_racing.py --mode today    # Fetch today's races
          python fetch_horse_racing.py --mode predict  # Generate AI predictions
          python fetch_horse_racing.py --mode accuracy # Update hit rate stats

One job, three phases. Data → Predictions → Stats. Runs every hour automatically.
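The three --mode invocations suggest a single entry point that dispatches on a flag. Here is a minimal sketch of that shape with argparse, where the handler functions and their return values are hypothetical placeholders rather than the real script's internals.

```python
import argparse
from typing import Callable, Dict, List

# Hypothetical phase handlers; the real ones would fetch, predict, and score.
def run_today() -> str:
    return "fetched today's races"

def run_predict() -> str:
    return "generated predictions"

def run_accuracy() -> str:
    return "updated hit rate stats"

MODES: Dict[str, Callable[[], str]] = {
    "today": run_today,
    "predict": run_predict,
    "accuracy": run_accuracy,
}

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Horse racing pipeline (sketch)")
    # choices= makes argparse exit with a non-zero status on a mistyped mode,
    # which in turn fails the CI step.
    parser.add_argument("--mode", required=True, choices=sorted(MODES))
    return parser

def main(argv: List[str]) -> str:
    args = build_parser().parse_args(argv)
    return MODES[args.mode]()

# Run the phases in the same order as the workflow step.
for mode in ("today", "predict", "accuracy"):
    print(main(["--mode", mode]))
```

Because each phase is a separate process invocation in the workflow, a failure in the fetch phase stops the step before predictions are generated from stale data.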


Flutter UI: 3-Tab Layout

// horse_racing_predictor_page.dart
TabBar(tabs: [
  Tab(text: 'Today\'s Races'),
  Tab(text: 'Prediction History'),
  Tab(text: 'Accuracy'),
])

Grade Color Badges

Color _gradeColor(String grade) => switch (grade) {
  'G1' => Colors.red.shade700,
  'G2' => Colors.blue.shade700,
  'G3' => Colors.green.shade700,
  _    => Colors.grey.shade600,
};

Previous Race Info (Latest Addition)

Added horse details to the race card — previous race, weight, age/sex:

ListTile(
  title: Text('Previous: ${horse.prevRaceName}'),
  subtitle: Text(
    'Previous rank: ${horse.prevRaceRank} | '
    'Weight: ${horse.weight}kg | '
    '${horse.age}yo ${horse.sex}'
  ),
)

Schema migration:

ALTER TABLE horse_races
  ADD COLUMN prev_race_name text,
  ADD COLUMN prev_race_rank int,
  ADD COLUMN horse_weight   int,
  ADD COLUMN horse_age      int,
  ADD COLUMN horse_sex      text;

Lessons Learned

| Problem | Root Cause | Fix |
| 401 from GitHub Actions | Auth zone restricted the action | Move to NO_AUTH_ACTIONS |
| 500 on race results fetch | Bulk SELECT timeout | Parallel individual queries |
| Garbled Japanese text | EUC-JP vs CP932 mismatch | decode('euc-jp', errors='replace') |

Current Status

| Feature | Status |
| JRA data fetch | ✅ EUC-JP stable |
| NAR regional tracks (15 venues) | |
| AI prediction generation | ✅ tools-hub EF |
| Hourly auto-update | ✅ GitHub Actions cron |
| Previous race + weight + age | ✅ Added recently |
| Hit rate dashboard | ✅ Flutter 3-tab UI |

The pipeline is fully automated. Data flows from Japanese racing sites → AI predictions → Flutter UI with zero manual intervention.


Building in public: https://my-web-app-b67f4.web.app/

#Flutter #Supabase #buildinpublic #automation #MachineLearning

Top comments (2)

Pavel Gajvoronski

Nice iteration: the parallel queries fix is a good pattern. Bulk SELECTs that work in dev always time out in production with real data volumes. We hit the same thing with our vault search and switched to scoped queries per business.
The NO_AUTH_ACTIONS split is interesting. We solved a similar problem with deny-by-default middleware + explicit public path whitelist — forces you to consciously decide which endpoints are open.
The previous race data addition is where it gets exciting for predictions — horse history, weight trends, track conditions. That's the kind of feature engineering that separates 60% accuracy from 80%.

Oluwafemi Adedayo

Can this be done for football?