Dillon Huston

Posted on Feb 8

Why Use Weighted Averages for Journey Arrival Predictions?

#machinelearning #coding #backend #python

Release: V1.1 – 08-02-2026

Accurately predicting arrival times is harder than it looks. Static timetables rarely reflect reality, and user-reported data can be noisy and inconsistent. In V1.1, the prediction system takes a big step forward by introducing weighted averaging of recent user journey data, alongside confidence scoring and improved fallback logic.

This release focuses on one core idea:

Recent, real-world behaviour matters more than theoretical schedules.

The Problem
Traditional arrival prediction systems rely heavily on static timetables:

They assume journeys behave the same every day
They don’t adapt to traffic, delays, or personal habits
They fail badly when conditions change

On the other hand, user-reported journey data is closer to reality. But it comes with its own issues:

Some data points are old
Some are outliers (missed stops, breaks, anomalies)
Sometimes there’s not enough data at all

So the real challenge is:

How do you trust real-world data without letting bad or outdated data ruin predictions?

The Solution: Weighted Averages

Instead of treating all journey data equally, V1.1 introduces weighted averaging, where:

Recent journeys have more influence
Older journeys gradually matter less
Outliers are naturally diluted
Predictions adapt over time

This allows the system to learn without overreacting.
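
To make "older journeys gradually matter less" concrete, here is a minimal sketch of one way such weights could be generated for any number of journeys. Geometric decay, the `decay_weights` name and the `decay=0.6` default are assumptions for illustration; the post itself only shows fixed example weights, so this is not necessarily what V1.1 does.

```python
def decay_weights(n, decay=0.6):
    """Return n weights, newest journey first, shrinking geometrically.

    Hypothetical helper: each older journey gets `decay` times the weight
    of the journey after it, and the weights are normalised to sum to 1.
    """
    if n <= 0:
        raise ValueError("n must be a positive number of journeys")
    if not 0 < decay <= 1:
        raise ValueError("decay must be in the range (0, 1]")
    raw = [decay ** age for age in range(n)]  # age 0 = most recent journey
    total = sum(raw)
    return [w / total for w in raw]


weights = decay_weights(3)
print([round(w, 2) for w in weights])
```

A smaller `decay` makes the system react faster to new reports, while a value close to 1 behaves more like a plain average.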

How Weighted Averaging Works (Concept)
Each past journey contributes to the prediction, but not equally.

Example weighting strategy:

Last journey: weight 0.5

Journey before that: weight 0.3

Older journey: weight 0.2

The predicted arrival time becomes:

predicted_time =
(journey_1 × 0.5) +
(journey_2 × 0.3) +
(journey_3 × 0.2)

Here journey_1 is the most recent journey and journey_3 is the oldest of the three.

If a user’s commute suddenly changes (roadworks, a new route, a different traffic pattern) and a report is submitted, the new journey-event data quickly pulls the prediction in the right direction without instantly discarding historical context.
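
The adaptation described above can be sketched as a small simulation: a commute that used to take 30 minutes starts taking 40, and the prediction follows over a few reports. The `predict` function, the three-journey window and the renormalising of weights when fewer journeys exist are all assumptions for illustration, not the actual V1.1 code.

```python
from collections import deque

WEIGHTS = (0.5, 0.3, 0.2)  # newest first, matching the example weighting


def predict(recent):
    """Weighted average of up to three journey durations, newest first."""
    used = list(recent)[:len(WEIGHTS)]
    if not used:
        raise ValueError("need at least one journey")
    weights = WEIGHTS[:len(used)]
    # Dividing by the weight total keeps the result sensible when fewer
    # than three journeys are available (an assumption of this sketch).
    return sum(d * w for d, w in zip(used, weights)) / sum(weights)


# Three historical journeys of 30 minutes each, newest on the left.
history = deque([30, 30, 30], maxlen=3)
trace = [round(predict(history), 1)]

# The commute changes: the next three reports are all 40 minutes.
for new_duration in (40, 40, 40):
    history.appendleft(new_duration)  # the oldest journey drops off the right
    trace.append(round(predict(history), 1))

print(trace)  # moves from 30 towards 40 in steps, not all at once
```

The first new report moves the prediction halfway, and the old value only disappears once it has been pushed out of the window.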

Real-World Explanation

Think about estimating how long it takes you to get to work.

You wouldn’t say:

“Google Maps says 30 minutes, so it’s always 30 minutes.”

You’d think:

Yesterday it took 38 minutes (traffic)
The day before it took 35
Last month it was closer to 28

Naturally, you’d trust yesterday more than last month.

That’s weighted averaging. We humans do it instinctively.

Example in Practice
Let’s say a user reports these recent arrival times:

Journey | Arrival time
Most recent | 37 mins
Previous | 35 mins
Older | 29 mins

With weights applied:
(37 × 0.5) + (35 × 0.3) + (29 × 0.2)
= 18.5 + 10.5 + 5.8
= 34.8 minutes

Instead of blindly predicting 29 (the oldest report) or 37 (the last journey), the system predicts ~35 minutes, which sits closer to the two most recent journeys without ignoring the older one.
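
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; `weighted_average` is a hypothetical helper name, and the plain mean is included only for comparison.

```python
def weighted_average(values, weights):
    """Return sum(value * weight) / sum(weights) for paired inputs."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


durations = [37, 35, 29]   # minutes, most recent first
weights = [0.5, 0.3, 0.2]  # same order as durations

weighted = weighted_average(durations, weights)
plain = sum(durations) / len(durations)

print(f"weighted: {weighted:.1f} min, plain mean: {plain:.1f} min")
# prints "weighted: 34.8 min, plain mean: 33.7 min"
```

The weighted result lands about a minute above the plain mean because the two most recent, slower journeys carry 80% of the weight.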

What’s New in V1.1?

Added
Weighted averaging of recent journey data
Confidence scoring for arrival predictions

Clear UI distinction between:
Predicted times
User-reported events and times

Changed
Prediction logic now prioritises recent user data over static timetables
Improved fallback behaviour when data is sparse or unavailable

Fixed

Edge cases where no recent journey data exists
Improved journey creation process
Better error handling for invalid or malformed data
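
The post does not describe how confidence scoring or the fallback works internally, so the following is only a sketch of how the pieces in this changelog could fit together. Everything in it is an assumption: the `predict_with_confidence` name, the rule for discarding malformed entries, the timetable fallback with zero confidence, and the confidence formula (how much of the window is filled, scaled down by how spread out the journeys are).

```python
import math
from statistics import pstdev

WEIGHTS = (0.5, 0.3, 0.2)  # newest first, matching the example weighting


def predict_with_confidence(durations, timetable_minutes):
    """Return (predicted_minutes, confidence between 0 and 1).

    Hypothetical sketch: `durations` holds reported journey times in
    minutes, newest first, and may contain malformed entries.
    """
    # Keep only finite, positive numbers; drop strings, None, NaN, etc.
    clean = [
        d for d in durations
        if isinstance(d, (int, float))
        and not isinstance(d, bool)
        and math.isfinite(d)
        and d > 0
    ]
    recent = clean[:len(WEIGHTS)]

    # Fallback: with no usable reports, use the timetable and say so.
    if not recent:
        return float(timetable_minutes), 0.0

    weights = WEIGHTS[:len(recent)]
    prediction = sum(d * w for d, w in zip(recent, weights)) / sum(weights)

    # Assumed confidence: fuller windows and tighter spreads score higher.
    coverage = len(recent) / len(WEIGHTS)
    consistency = 1.0 / (1.0 + pstdev(recent) / prediction)
    return prediction, round(coverage * consistency, 2)


print(predict_with_confidence([37, 35, 29], timetable_minutes=30))
print(predict_with_confidence(["bad", None, 40], timetable_minutes=30))
print(predict_with_confidence([], timetable_minutes=30))
```

Returning the confidence alongside the prediction lets the UI show a sparse-data estimate differently from one backed by several consistent reports.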

Why does this matter?

This approach makes the system:
More adaptive
More honest about uncertainty
More reflective of real-world behaviour

It builds user trust over time and creates a more accurate model.

Instead of pretending predictions are always correct, V1.1 embraces probability, confidence, and learning, which is exactly how good systems should behave.
