Imagine walking outside on a quiet afternoon. You hear a sharp roar overhead, pull out your phone, and open a flight-tracking app. You find a tiny airplane icon ✈️ gliding smoothly along a solid line. It looks clean, structured, and completely predictable.
But what if that airplane icon suddenly starts making frantic, tight loops over a residential area? What if it begins a terrifyingly rapid descent, or behaves in a way that defies normal flight paths?
If a human air traffic controller isn’t watching that specific screen, how can software automatically flag that something is wrong?
When I set out to build SkyWatch Live, an open-source airspace and satellite tracking dashboard, my focus quickly shifted from simply drawing dots on a map to a much more interesting engineering problem: How do you build a real-time machine learning pipeline that can detect unusual airborne behavior across thousands of concurrent flights using chaotic public data?
Repo: https://github.com/debjit450/skywatch-live
Whether you are an AI engineer or someone who has never looked at aviation data in your life, the architecture behind processing volatile, high-frequency telemetry into clean, explainable alerts contains lessons that apply to any real-time streaming system.
The Operational Blueprint
Before diving into the math, it helps to see the pipeline that keeps everything alive. The application runs a split architecture:
- The UI Layer: A React 19 and TanStack Start dashboard using MapLibre GL and deck.gl to paint real-time positions and historical playback tracks smoothly at 60 frames per second.
- The Ingestion & Inference Layer: A Django ASGI core backed by Celery background workers, a fast Redis state cache, and a durable PostgreSQL time-series database.
Every 15 seconds, a scheduled Celery Beat task pulls raw telemetry vectors from multiple fragmented public radio networks. It de-duplicates them, caches the latest snapshot in Redis, and broadcasts them down a live WebSocket channel directly to the map.
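The de-duplication step can be sketched roughly as follows. This is a minimal illustration and not the project's code: the `StateVector` fields, the `icao24` key, and the `dedupe_latest` helper are all assumptions, and a plain dict stands in for the Redis snapshot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StateVector:
    """One position report; field names are assumed for illustration."""
    icao24: str        # airframe transponder address
    timestamp: float   # seconds since epoch, as reported by the source
    lat: float
    lon: float
    source: str        # which public feed delivered the report


def dedupe_latest(reports):
    """Keep only the newest report per aircraft across all feeds."""
    latest = {}
    for report in reports:
        current = latest.get(report.icao24)
        if current is None or report.timestamp > current.timestamp:
            latest[report.icao24] = report
    return latest


reports = [
    StateVector("abc123", 100.0, 51.50, -0.12, "feed_a"),
    StateVector("abc123", 105.0, 51.51, -0.10, "feed_b"),  # newer duplicate
    StateVector("def456", 101.0, 40.64, -73.78, "feed_a"),
]
snapshot = dedupe_latest(reports)  # stand-in for the cached Redis snapshot
```

Keying on the airframe address and keeping the newest timestamp means two feeds reporting the same aircraft collapse into one map marker, whichever feed arrived first.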
But as soon as that data hits the database, a post-commit hook hands the raw flight data over to our dedicated machine learning pipeline.
The 3-Gate Anomaly Pipeline
If you feed raw, noisy public data straight into a complex neural network, your system will instantly drown in false positives. External sensors glitch, transponders experience signal dropouts, and rate limits throttle incoming coordinates.
To prevent false alarms, SkyWatch Live runs data through three computational gates.
Gate 1: Deterministic Physical Rules
The first line of defense doesn't use AI at all. It uses fast, binary checks written in raw Python to catch immediate high-signal events:
- Emergency Squawk Codes: Radio transponders broadcasting specific numbers like 7700 (general emergency) or 7600 (radio communication failure).
- Kinematic Violations: Physical impossibilities, such as a non-military cargo plane suddenly executing a 90-degree turn mid-air or dropping altitude faster than its structural limits allow.
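A deterministic gate of this kind might look like the sketch below. The threshold values, the dict keys, and the `rule_flags` function are hypothetical and chosen only for illustration; real limits would depend on the aircraft type. The 7500 code (unlawful interference) is a standard emergency squawk that the list above does not mention.

```python
EMERGENCY_SQUAWKS = {
    "7500": "unlawful interference",
    "7600": "radio communication failure",
    "7700": "general emergency",
}

# Illustrative limits only; real values would depend on aircraft type.
MAX_TURN_RATE_DEG_S = 6.0
MAX_DESCENT_RATE_M_S = 40.0


def heading_delta(prev_deg, curr_deg):
    """Smallest signed difference between two headings, in degrees."""
    return (curr_deg - prev_deg + 180.0) % 360.0 - 180.0


def rule_flags(prev, curr):
    """Return the names of the rules triggered between two samples."""
    flags = []
    if curr.get("squawk") in EMERGENCY_SQUAWKS:
        flags.append("emergency_squawk")
    dt = curr["t"] - prev["t"]
    if dt <= 0:
        return flags  # out-of-order or duplicate sample: skip kinematics
    turn_rate = abs(heading_delta(prev["heading"], curr["heading"])) / dt
    if turn_rate > MAX_TURN_RATE_DEG_S:
        flags.append("turn_rate")
    descent_rate = (prev["alt_m"] - curr["alt_m"]) / dt
    if descent_rate > MAX_DESCENT_RATE_M_S:
        flags.append("rapid_descent")
    return flags


prev = {"t": 0.0, "heading": 350.0, "alt_m": 10000.0, "squawk": "1200"}
calm = {"t": 10.0, "heading": 10.0, "alt_m": 9990.0, "squawk": "7700"}
sharp = {"t": 10.0, "heading": 80.0, "alt_m": 9500.0, "squawk": "1200"}
```

Wrapping the heading difference matters here: a change from 350 to 10 degrees is a 20-degree turn, not a 340-degree one, so the `calm` sample trips only the squawk rule while `sharp` trips both kinematic rules.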
Gate 2: Feature Engineering & Spatial Grids
If a flight path passes basic physics checks, the pipeline extracts hidden behavioral features out of raw coordinate sequences (latitude, longitude, altitude, heading, velocity). This happens inside backend/ml/features.py:
- Spatial Indexing: The engine dynamically hashes coordinate points into a geometric grid to calculate local proximity metrics (spotting close-proximity events or dense airspace deviations).
- Angular Velocity Tracking: By computing statistical variance across rolling windows of an aircraft's heading history, the system converts raw directional degrees into a "loitering score" that exposes circling or tracking patterns.
- The Behavioral Baseline: The telemetry is cross-referenced with a historical AircraftProfile table, checking whether the current aircraft type is operating outside its typical operational envelope.
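The first two features can be sketched with the standard library alone. This is not the code in backend/ml/features.py: the function names and the 0.5-degree cell size are assumptions, and circular variance is swapped in for plain variance because headings wrap around at 360 degrees.

```python
import math
from collections import defaultdict


def grid_cell(lat, lon, cell_deg=0.5):
    """Hash a coordinate into an integer (row, col) grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def build_grid(positions, cell_deg=0.5):
    """Group aircraft ids by cell so nearby traffic is found without an all-pairs scan."""
    grid = defaultdict(list)
    for aircraft_id, (lat, lon) in positions.items():
        grid[grid_cell(lat, lon, cell_deg)].append(aircraft_id)
    return grid


def loitering_score(headings_deg):
    """Circular variance of a heading window: near 0 when straight, near 1 when circling."""
    if not headings_deg:
        return 0.0
    radians = [math.radians(h) for h in headings_deg]
    mean_cos = sum(math.cos(r) for r in radians) / len(radians)
    mean_sin = sum(math.sin(r) for r in radians) / len(radians)
    return 1.0 - math.hypot(mean_cos, mean_sin)


positions = {"a": (51.47, -0.45), "b": (51.48, -0.46), "c": (40.64, -73.78)}
grid = build_grid(positions)
straight = loitering_score([90.0] * 8)
circling = loitering_score([0.0, 45.0, 90.0, 135.0, 180.0, 225.0, 270.0, 315.0])
```

A real proximity check would also look at the eight cells around each aircraft, since two planes can be close together while sitting on opposite sides of a cell boundary.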
Gate 3: The Statistical Machine Learning Ensemble
Once the features are compiled into a normalized vector, they are passed to a three-model ensemble powered by scikit-learn. Blending different algorithmic approaches balances out the blind spots of any single model:
- Isolation Forest: A tree-based model that isolates anomalies by randomly partitioning features. Because anomalies require fewer splits to isolate than normal data, they appear near the shallow roots of the trees. It is great for spotting overall global outliers.
- Local Outlier Factor (LOF): A density-based algorithm that measures how locally isolated a data point is relative to its surrounding neighborhood. This catches contextual anomalies—like a plane flying at a speed that is normal globally, but highly irregular for that specific crowded corridor.
- MLP Autoencoder: A neural network that attempts to compress the multi-dimensional feature vector down into a tiny bottleneck layer and reconstruct it perfectly on the other side.
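A rough sketch of such an ensemble with scikit-learn is shown below. The synthetic training data, the 2-unit bottleneck, and the equal-weight average of z-scores are assumptions made for illustration; the article does not say how the project actually blends the three scores.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X_train = rng.normal(size=(300, 4))  # synthetic stand-in for normalized features

iso = IsolationForest(n_estimators=100, random_state=0).fit(X_train)
lof = LocalOutlierFactor(n_neighbors=20, novelty=True).fit(X_train)
# An MLP trained to reproduce its own input through a 2-unit bottleneck.
ae = MLPRegressor(hidden_layer_sizes=(2,), max_iter=500, random_state=0)
ae.fit(X_train, X_train)


def raw_scores(X):
    """Per-model anomaly scores where higher always means more anomalous."""
    recon_error = np.mean((X - ae.predict(X)) ** 2, axis=1)
    return np.column_stack([
        -iso.score_samples(X),  # scikit-learn returns higher = more normal
        -lof.score_samples(X),
        recon_error,
    ])


train_scores = raw_scores(X_train)
mu, sigma = train_scores.mean(axis=0), train_scores.std(axis=0)


def ensemble_score(X):
    """Average of per-model z-scores, standardized against the training data."""
    return ((raw_scores(X) - mu) / sigma).mean(axis=1)


candidates = np.array([[0.0, 0.0, 0.0, 0.0],       # typical point
                       [10.0, 10.0, 10.0, 10.0]])  # far from the training cloud
scores = ensemble_score(candidates)
```

Standardizing each model's output before averaging keeps one detector from dominating just because its raw scores live on a larger scale. `novelty=True` is what allows `LocalOutlierFactor` to score points it was not fitted on.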
The autoencoder flags structural anomalies by calculating the reconstruction error E, the Mean Squared Error (MSE) between the original feature vector x and the reconstructed output x̂ across n dimensions:
$$E = \frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2$$
If the structural configuration of a flight path is weird, the autoencoder fails to reconstruct it accurately, causing the error E to spike beyond a dynamic, self-calibrating threshold.
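The reconstruction error and one possible self-calibrating threshold can be sketched as follows. The article does not specify how the threshold is calibrated, so the rolling mean-plus-k-standard-deviations rule, the `RollingThreshold` class, and its default parameters are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def reconstruction_error(x, x_hat):
    """Mean squared error between a feature vector and its reconstruction."""
    if len(x) != len(x_hat):
        raise ValueError("vectors must have the same length")
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


class RollingThreshold:
    """Flags errors above mean + k * std of recently seen errors."""

    def __init__(self, window=500, k=3.0, min_samples=30):
        self.errors = deque(maxlen=window)
        self.k = k
        self.min_samples = min_samples

    def is_anomalous(self, error):
        history = list(self.errors)
        self.errors.append(error)
        if len(history) < self.min_samples:
            return False  # not enough history to calibrate yet
        return error > mean(history) + self.k * pstdev(history)


threshold = RollingThreshold()
warmup = [threshold.is_anomalous(0.1 if i % 2 == 0 else 0.2) for i in range(40)]
spike = threshold.is_anomalous(5.0)
```

One caveat of this simple version is that flagged errors still enter the window, so a burst of anomalies raises the threshold. Excluding flagged samples from the history is a common refinement.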
Going Deeper: Time-Series Sequence Analysis
While spatial snapshots catch immediate deviations, flight data is fundamentally a time-dependent sequence. To track subtle, slow-building irregularities over time, SkyWatch Live features an optional deep learning path using a Long Short-Term Memory (LSTM) network inside backend/ml/lstm.py.
[State at t-3] ──► [State at t-2] ──► [State at t-1] ──► [Current State t]
│
▼
Deep Sequence Inference
│
▼
Trajectory Anomaly Score
Engineering for Accessibility: TensorFlow and Keras are intentionally excluded from the project's default dependency file. This ensures open-source contributors can download, run, and modify the UI or standard ingestion loops without requiring massive deep-learning runtimes or expensive GPUs. The LSTM modules load conditionally, initializing only when a user explicitly activates them via python manage.py train_lstm_anomaly.
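Conditional loading of a heavy dependency is commonly done with `importlib`. The sketch below shows the general pattern only; the helper names are hypothetical and this is not the contents of backend/ml/lstm.py.

```python
import importlib
import importlib.util


def sequence_scoring_enabled(module_name="tensorflow"):
    """True only when the optional runtime is actually installed."""
    return importlib.util.find_spec(module_name) is not None


def optional_backend(module_name):
    """Import and return a module if it is installed, otherwise None."""
    if not sequence_scoring_enabled(module_name):
        return None
    return importlib.import_module(module_name)


# Demonstrated with stand-in module names so nothing heavy is imported here.
missing = optional_backend("definitely_not_installed_pkg_xyz")
present = optional_backend("json")
```

Because `find_spec` only locates the package without importing it, the default code path never pays the import cost of the deep-learning runtime.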
Explainability: Busting the "Black Box"
An anomaly alert is completely useless if a user doesn't know why it went off. If the map flashes red, an operator needs to know what triggered the alarm instantly.
To fix this, our explainability module decomposes the ensemble's mathematical scoring matrix into an explicit plain-text payload. When the UI hits /api/v1/anomalies/<id>/explanation/, it receives a clean breakdown:
{
  "anomaly_id": "8f3b2a-7c",
  "detector_type": "Ensemble_LOF",
  "severity": "CRITICAL",
  "confidence_score": 0.91,
  "explanation": "Triggered due to an 87% deviation in rolling heading variance (circling behavior) paired with atypical low-velocity thresholds for this airframe profile."
}
The frontend reads this payload to display clear warnings alongside the live flight profile. Users can even submit structured feedback to mark a detection as a false alarm, creating a clean, labeled dataset that our automated Celery jobs use to retrain the models every week.
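One way to turn per-feature deviations into a payload of that shape is sketched here. The `build_explanation` function, the severity cut-offs, and the "WARNING" and "INFO" levels are hypothetical; only the field names and the "CRITICAL" value come from the example response.

```python
def build_explanation(anomaly_id, detector_type, confidence, deviations, top_n=2):
    """Turn per-feature deviations into a payload shaped like the example response."""
    ranked = sorted(deviations.items(), key=lambda item: abs(item[1]), reverse=True)
    reasons = [f"{abs(value):.0%} deviation in {name}" for name, value in ranked[:top_n]]
    if confidence >= 0.9:
        severity = "CRITICAL"
    elif confidence >= 0.7:
        severity = "WARNING"
    else:
        severity = "INFO"
    if reasons:
        explanation = "Triggered due to " + " and ".join(reasons) + "."
    else:
        explanation = "No single feature dominated the score."
    return {
        "anomaly_id": anomaly_id,
        "detector_type": detector_type,
        "severity": severity,
        "confidence_score": round(confidence, 2),
        "explanation": explanation,
    }


payload = build_explanation(
    "8f3b2a-7c",
    "Ensemble_LOF",
    0.91,
    {"rolling heading variance": 0.87, "ground speed": -0.42, "altitude": 0.05},
)
```

Ranking by absolute deviation and reporting only the top few features keeps the message short enough for an operator to read at a glance.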
Building Systems That Expect Uncertainty
SkyWatch Live shows that you can build highly performant, intelligent monitoring tools out of raw public data if you architect your pipeline to expect imperfections. By separating your fast live-state caches from your analytical data stores and guarding your machine learning models with strict validation layers, you build a system that tells the truth about chaotic real-world inputs.
If you have a laptop, a curious mindset, and want to dig into the background tasks, training engines, or the mapping layer, the repository is open source and ready for setup. Hop into the codebase, explore how the real-time data flows, and let me know your thoughts!
👉 GitHub Repository: debjit450/skywatch-live

Top comments (2)
Solid real-world systems project - Django + Celery + ML for real-time anomaly detection is a pragmatic, production-shaped stack (Celery doing the async heavy lifting so the request path stays fast is the right call). Anomaly detection on streaming data has a notoriously tricky core problem you've probably hit: the threshold. Too sensitive and you drown ops in false alarms until they ignore the system; too loose and you miss the real anomaly. Calibrating that line - and handling concept drift as "normal" shifts over time - is where these systems quietly succeed or fail.
The thing that makes anomaly engines trustworthy long-term is the feedback loop: capturing which flagged anomalies were real vs noise and feeding that back to retune, so it doesn't decay. That observe-and-recalibrate discipline is the same principle I lean on in Moonshift (a multi-agent pipeline that ships a prompt to a deployed SaaS) - a detection/verification layer is only as good as its ongoing calibration. Genuinely meaty build. How are you handling threshold tuning and drift - static thresholds, or does the model adapt as the baseline of "normal" flight behavior shifts? That's usually the make-or-break for real-time anomaly detection.
Thanks, really appreciate this. And yes, you pointed at exactly the hard part: the model is not the whole problem — calibration is.
In SkyWatch Live, I’ve handled it as a layered detection system rather than relying on one black-box anomaly score. The backend combines rule-based aviation checks with ML scoring and explainability payloads. The rule layer catches things like emergency squawks, low/fast profiles, rapid descent, signal loss, unusual kinematics, and position-quality issues. On the ML side, the ensemble path supports Isolation Forest, Local Outlier Factor, and an autoencoder-style component, with optional LSTM sequence scoring kept separate so the default stack stays practical.
Threshold-wise, the current version is closer to controlled/static + explainable than fully self-adaptive. I preferred that for v1 because in an ops-style system, trust matters more than pretending the model can magically decide everything. Every alert needs a reason attached to it.
The feedback loop is already part of the product shape too: anomalies can receive feedback through the backend, so the next step is using that captured “real vs noise” signal to recalibrate thresholds and reduce repeat false positives over time.
For drift, I’d treat it as a roadmap item rather than overclaiming it as solved. The direction is rolling baselines over recent flight behavior, ideally phase-aware and route/aircraft-aware, because “normal” during climb, cruise, approach, or oceanic tracking is not the same thing.
So the honest answer is: current system is rule + ML scoring with explainability and feedback capture; the next evolution is feedback-backed adaptive calibration so the engine improves instead of slowly becoming noisy or stale.