Deny Herianto
Building Disaster Pulse: What Happened When I Let AI Decide If a Disaster Is Real

This is a submission for the Built with Google Gemini: Writing Challenge


I live in Indonesia. We sit on the Ring of Fire, an archipelago of 17,000 islands, home to 40% of the world's active volcanoes, and a long, painful history of disasters that moved too fast and warned too late.

In November 2025, Tropical Cyclone Senyar made landfall in northeastern Sumatra, leaving an estimated 1,090–1,207 people dead. Over 1.1 million people were displaced. The cyclone itself was relatively small; the real damage came from deforested watersheds that had no natural buffer left. Information about the disaster spread across TikTok, WhatsApp, news sites, and government agencies, all at once, in fragments, with varying levels of accuracy.

That's the origin of Disaster Pulse, an AI-powered disaster intelligence platform I built for the Gemini 3 Hackathon.

What I Built with Google Gemini

Agents Flow

Disaster Pulse is a real-time disaster detection and alert system designed for Indonesia. When Cyclone Senyar hit Sumatra, information came from everywhere: BMKG (Indonesia's meteorological agency), TikTok videos of people filming floodwater from rooftops, forwarded WhatsApp messages, news RSS feeds, and user reports from people trying to locate missing relatives. Most of it was noise. Some of it was life-or-death signal.

The app's job is to separate those two.

Here's where Gemini comes in: I built a 5-agent reasoning pipeline powered by Gemini to process every incoming signal before it becomes an alert on the dashboard.

Observer → Classifier → Skeptic → Synthesizer → Action

Each agent has a single job:

  • Observer: Reads the raw signal (text, video frame, user report) and writes factual observations
  • Classifier: Assigns event type, severity, and affected radius
  • Skeptic: Actively looks for reasons the previous agents are wrong (hallucination prevention)
  • Synthesizer: Weighs all evidence and produces a confidence-scored verdict
  • Action: Decides whether to create a new incident, update an existing one, or discard the signal
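The chaining itself is simple: each agent sees the raw signal plus every prior agent's output, which is what lets the Skeptic critique the Classifier and the Synthesizer weigh the whole trace. A minimal sketch (types and names are my illustration, not the actual Disaster Pulse code):

```typescript
// Illustrative sketch of the sequential agent chain. Each agent
// receives the raw signal plus the trace of all prior agent outputs.
type Signal = { source: string; content: string; capturedAt: Date };

interface AgentResult {
  agent: string;
  output: Record<string, unknown>;
}

type Agent = (signal: Signal, trace: AgentResult[]) => AgentResult;

// Run agents in order, accumulating the trace so downstream agents
// (Skeptic, Synthesizer, Action) can reason over upstream outputs.
function runPipeline(signal: Signal, agents: Agent[]): AgentResult[] {
  const trace: AgentResult[] = [];
  for (const agent of agents) {
    trace.push(agent(signal, trace));
  }
  return trace;
}
```

In the real system each `Agent` wraps a Gemini call; the important property is that the accumulated trace is also what powers the transparency features described later.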

The VideoAnalysisAgent is the one I'm most proud of. It uses Gemini's multimodal capabilities to analyze video frames from social media, checking for visual flood indicators, fire signatures, and structural damage, then applies a freshness rule: if the video metadata suggests it was filmed more than 6 hours ago, the severity score gets downgraded. Old content shouldn't trigger live alerts.
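The freshness rule itself is tiny. A hypothetical sketch, assuming the 6-hour threshold described above and a simple one-level severity downgrade (the actual implementation may differ):

```typescript
// Hypothetical sketch of the freshness rule: severity is downgraded
// when video metadata says the footage is older than 6 hours.
type Severity = "low" | "medium" | "high";

const FRESHNESS_LIMIT_MS = 6 * 60 * 60 * 1000; // 6 hours

function applyFreshnessRule(
  severity: Severity,
  filmedAt: Date,
  now: Date = new Date()
): Severity {
  const ageMs = now.getTime() - filmedAt.getTime();
  if (ageMs <= FRESHNESS_LIMIT_MS) return severity; // fresh: keep as-is
  // Stale footage: step severity down so old content can't trigger
  // live high-severity alerts on its own.
  return severity === "high" ? "medium" : "low";
}
```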

Beyond the AI pipeline, the app includes:

  • Real-time dashboard with incident severity cards
  • Live map showing affected zones
  • Community verification system (human-in-the-loop)
  • SSE-powered live intelligence ticker showing agent activity
  • Mobile-first PWA for field workers

Tech stack: NestJS (API), Next.js 15 (web), PostgreSQL, Drizzle ORM, Gemini API, Turborepo monorepo.

Demo

https://disaster-pulse.denyherianto.com

Here's a quick look at the core pipeline in action. The Live Intelligence Ticker shows each agent processing a signal in real time:

[Observer]    Reading signal: TikTok video, 0:42, location metadata: North Sumatra
[Classifier]  Event type: flood | Severity: high | Confidence: 0.84
[Skeptic]     ⚠ Video upload timestamp 14h ago, possible recirculation of 2018 Palu footage
[Synthesizer] Verdict: DOWNGRADE severity to low | Freshness penalty applied
[Action]      Signal discarded, insufficient confidence threshold

What I Learned

Multi-agent architecture is harder than it looks

I thought about building a single "smart prompt" to classify disasters. I'm glad I didn't.

The Skeptic agent alone probably saves the most embarrassment. Early on, without it, the pipeline would happily classify someone's video of a smoke machine at a concert as "fire, high severity." The Skeptic looks at the Classifier's output and asks: what if this is wrong? That adversarial framing is the difference between a toy and something I'd trust in production.

But chains have failure modes. If the Observer writes a vague summary, the Classifier gets bad input, the Skeptic has nothing solid to critique, and the Synthesizer is guessing. I learned to write very explicit output schemas for each agent, not just "describe what you see" but "output JSON with fields: event_type, confidence, evidence_list, contradictions."

Structured output with Gemini was a game-changer here.
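To make that concrete, here is the kind of schema I mean, written in the OpenAPI-style object shape that Gemini's structured output accepts via `generationConfig.responseSchema`. The field names are my own illustration, not the project's actual schema:

```typescript
// Illustrative response schema for the Classifier agent. Passing a
// schema like this (plus responseMimeType: "application/json") asks
// Gemini to return JSON matching these fields instead of free text.
const classifierSchema = {
  type: "object",
  properties: {
    event_type: {
      type: "string",
      enum: ["flood", "fire", "earthquake", "landslide", "other"],
    },
    severity: { type: "string", enum: ["low", "medium", "high"] },
    confidence: { type: "number" },
    evidence_list: { type: "array", items: { type: "string" } },
    contradictions: { type: "array", items: { type: "string" } },
  },
  required: ["event_type", "severity", "confidence", "evidence_list"],
} as const;
```

The payoff is that every agent's output becomes a typed value the next agent can consume directly, instead of prose you have to parse.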

Multimodal processing isn't free

Video analysis sounds straightforward until you're pulling frames from a TikTok video at 3 AM, sampling every 2 seconds, and sending each frame to the Gemini API with a detailed prompt. Token costs add up fast. I had to be strategic: sample at lower frequency for longer videos, cache identical frames (TikTok compression creates a lot of duplicates), and set hard limits on video length per analysis job.
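The strategy above can be sketched in a few lines: a sampling interval that coarsens with video length, duplicate-frame skipping via content hashing, and a hard frame cap per job. The thresholds here are illustrative, not the production values:

```typescript
import { createHash } from "node:crypto";

// Longer videos get a coarser sampling interval to bound token cost.
function sampleIntervalSeconds(durationSeconds: number): number {
  if (durationSeconds <= 60) return 2;  // short clips: every 2s
  if (durationSeconds <= 300) return 5; // mid-length: every 5s
  return 10;                            // long videos: every 10s
}

// Keep only frames we haven't seen before, up to a hard limit.
// TikTok re-compression produces many byte-identical frames,
// so a content hash catches them cheaply.
function selectFrames(frames: Buffer[], maxFrames: number): Buffer[] {
  const seen = new Set<string>();
  const selected: Buffer[] = [];
  for (const frame of frames) {
    if (selected.length >= maxFrames) break;
    const digest = createHash("sha256").update(frame).digest("hex");
    if (seen.has(digest)) continue; // duplicate frame, skip
    seen.add(digest);
    selected.push(frame);
  }
  return selected;
}
```

Only the surviving frames go to the Gemini API, each tagged with its timestamp so the model can reason about change over time.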

The freshness rule emerged from a real problem: during Cyclone Senyar, videos from the 2022 Cianjur earthquake and the 2018 Palu tsunami were getting recirculated on TikTok alongside actual Sumatra flood footage. People were sharing old disasters in the panic of a new one. Without the freshness check, the pipeline would treat a 2018 video as a live signal. Metadata-based time-checking was a simple fix that meaningfully improved signal quality.

The empty state kills demos

I almost demoed with no data. I had built the entire pipeline and UI but hadn't thought about what judges see when they open the app for the first time.

Nothing.

I scrambled and built a demo seed system, a script that populates the database with a realistic scenario based on Cyclone Senyar: flood alerts across North Sumatra, multiple user verification reports with conflicting severity estimates, landslide signals from Mandailing Natal, and full agent traces showing the reasoning chain. A hidden trigger in the frontend (tap the logo 5 times) resets and re-seeds everything.
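The hidden trigger is just a tap counter with a time window. A minimal sketch of that pattern (the window and tap count here are assumptions, and the real version lives in a React handler):

```typescript
// Hypothetical sketch of the hidden re-seed trigger: N taps on the
// logo within a short window fire the reset.
function makeTapTrigger(requiredTaps = 5, windowMs = 2000) {
  let taps: number[] = [];
  return function registerTap(now: number): boolean {
    taps = taps.filter((t) => now - t < windowMs); // drop stale taps
    taps.push(now);
    if (taps.length >= requiredTaps) {
      taps = []; // reset so the trigger can fire again later
      return true; // caller would hit the demo re-seed endpoint here
    }
    return false;
  };
}
```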

Lesson: build for the demo from day one. The feature doesn't matter if the judge sees a blank screen.

The hardest part wasn't the AI; it was making the AI visible

I had a sophisticated 5-agent pipeline running, but from the user's perspective, disaster cards just... appeared on a dashboard. There was no way to see why something was classified high severity or what the Skeptic had flagged.

This is the black box problem. In disaster alerting, a black box is a liability. With Cyclone Senyar, the difference between "flooding in Mandailing Natal" and "flooding in Mandailing Natal, confidence 0.91, corroborated by BMKG water level data + 3 user reports + video analysis" is the difference between a coordinator acting or waiting for more information.

Building the AI Transparency Panel, a "Why?" button that exposes the full agent trace, changed the product from a dashboard into a decision-support tool. That distinction matters enormously for real-world impact.

One moment that stuck

At some point during the hackathon, I was debugging the Skeptic agent and fed it a signal I thought was obviously a live Sumatra flood report, a news article with dramatic photos of floodwater submerging houses.

The Skeptic came back: "The article references events from 2022. The images match historical flood patterns from that period. This signal should not trigger a live alert."

It was right. I had been testing with old data I'd scraped as examples, and the agent caught it.

There's something a little unsettling about an AI being more careful than I was. But mostly it made me trust the architecture. That's what I want to keep building toward: systems that are genuinely more careful, not just faster.

Google Gemini Feedback

What worked really well

Structured output is exceptional. Gemini's ability to reliably return JSON matching a schema is the foundation the entire multi-agent pipeline is built on. Without it, I'd be writing fragile regex parsers to extract data from free-text responses. With it, every agent output is predictable, typeable, and chainable. This is the feature I'd highlight to any developer building agentic systems.

Multimodal + long context together. The combination of sending video frames and having the model hold enough context to reason across them is genuinely powerful. For the VideoAnalysisAgent, I send multiple frames with temporal metadata and ask the model to reason about change over time. It handles this well.

Speed on Gemini Flash. For a real-time alert system, latency matters. Flash is fast enough that the agent pipeline completes within a few seconds for most signals, which means dashboard updates feel live rather than batched.

Where I hit friction

Rate limits during development are brutal. When you're building a pipeline that chains 5 API calls per signal and testing with a seed of 20 signals, you hit limits fast. The error messages could also be more actionable: "quota exceeded" without any indication of which quota (per-minute, per-day, per-project) cost me real debugging time.
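My workaround was the usual one: wrap every Gemini call in a retry with exponential backoff. A sketch of the idea (deterministic backoff shown here; real use would add jitter, and the names are mine):

```typescript
// Exponential backoff schedule with a cap: 500ms, 1s, 2s, 4s, ...
// up to capMs. Deterministic for clarity; add jitter in production.
function backoffMs(attempt: number, baseMs = 500, capMs = 30_000): number {
  return Math.min(baseMs * 2 ** attempt, capMs);
}

// Retry an async call (e.g. one agent's Gemini request), sleeping
// between attempts; rethrows the last error if all attempts fail.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 4
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err; // e.g. a 429 "quota exceeded" response
      await new Promise((r) => setTimeout(r, backoffMs(attempt)));
    }
  }
  throw lastError;
}
```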

Structured output edge cases aren't well documented. What happens when the model can't produce valid output for a schema? Sometimes the API returns a malformed response silently rather than throwing an error. I had to add defensive validation at every agent step to catch this. It's a correctness issue I didn't expect going in.
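The defensive validation looks roughly like this: never trust that a "structured" response is valid JSON matching the schema, and return a sentinel instead of letting a malformed response flow downstream. Field names are illustrative:

```typescript
// Defensive parse of a supposedly-structured model response.
// Returns null on anything malformed so the pipeline can retry
// or discard instead of propagating garbage.
interface ClassifierOutput {
  event_type: string;
  severity: "low" | "medium" | "high";
  confidence: number;
}

function parseClassifierOutput(raw: string): ClassifierOutput | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not even valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  if (typeof d.event_type !== "string") return null;
  if (d.severity !== "low" && d.severity !== "medium" && d.severity !== "high") {
    return null;
  }
  if (typeof d.confidence !== "number" || d.confidence < 0 || d.confidence > 1) {
    return null;
  }
  return d as unknown as ClassifierOutput;
}
```

A schema-validation library like Zod would do the same job with less boilerplate; the point is that the check has to exist at every agent boundary.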

Grounding/search integration has a learning curve. I wanted to cross-reference incidents with historical BMKG data, but integrating Google Search grounding into a multi-step pipeline, where only some steps should use grounding, required more plumbing than I expected. I ended up implementing a separate SignalEnrichmentAgent just for this, which added complexity.

Streaming in chained calls needs better SDK support. I wanted to show real-time agent "thinking" in the UI, streaming tokens from each agent as they processed. Gemini supports streaming, but building it into a pipeline where one agent's output feeds the next's input required significant SSE infrastructure on my side. I got it working, but the SDK could make this pattern easier.
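For anyone curious what "SSE infrastructure" means concretely: each agent update has to be framed as a `text/event-stream` message before it goes down the wire to the ticker. A minimal framing helper (event names are illustrative):

```typescript
// Frame one agent update as a server-sent event. SSE messages are
// "event:" and "data:" lines terminated by a blank line; multi-line
// payloads would need one "data:" line per line of text.
function formatSseEvent(event: string, data: object): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

On the NestJS side this string is written to a long-lived response stream; the Next.js ticker consumes it with a standard `EventSource`.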


This project isn't going back in a drawer. Next steps: Cloud Run deployment, formal BMKG API integration, offline PWA capability, and eventually, getting it in front of people who actually coordinate disaster response in Indonesia.

Disaster Pulse is open source. Feedback welcome, especially from anyone working in disaster response.

GitHub: https://github.com/denyherianto/disaster-pulse

Built with Gemini API, NestJS, Next.js, and a lot of coffee.
