Harish Kotra (he/him)

Posted on May 19

Building Last Message: A Local-First Gemma Emergency Intelligence App

#ai #productivity #python #dailybuild2026

Last Message is a Streamlit app designed for high-stress disaster communication. The problem is simple: during emergencies, people panic and communication quality collapses. The goal was to convert chaotic speech and text into structured, actionable rescue intelligence with Gemma.

This post explains the architecture, model routing, multimodal analysis, stress-adaptive prompting, and UX choices that made the app practical for hackathon judging and real-world constraints.

1) Design constraints

We intentionally stayed lightweight:

no database
no auth
no orchestration framework
no vector store
no additional backend

Everything runs in a single Streamlit app with modular Python utilities and embedded browser components.

2) System architecture

3) Local-first model routing with fallback

Emergency resilience requires operation under degraded network conditions. The app routes inference based on environment availability:

local Gemma first (LM Studio)
cloud fallback (OpenRouter)
optional simulated network failure mode to force local path

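import streamlit as st

def run_text_inference(system_prompt: str, user_prompt: str, cfg: ModelConfig) -> str:
    # provider_state reports which provider to try first and whether cloud fallback exists.
    primary, cloud_available = provider_state(cfg)
    if primary is None:
        raise InferenceError("No model provider configured in .env")

    try:
        if primary == "local":
            return run_lm_studio_inference(system_prompt, user_prompt, cfg)
        return run_openrouter_inference(system_prompt, user_prompt, cfg)
    except InferenceError:
        # Fall back to the cloud only when local failed and the simulated outage is off.
        # .get() avoids an AttributeError if the flag was never initialised.
        offline = st.session_state.get("network_failure_mode", False)
        if primary == "local" and cloud_available and not offline:
            return run_openrouter_inference(system_prompt, user_prompt, cfg)
        raise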
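
The routing function relies on provider_state to decide which provider goes first. The post does not show that helper, so the following is only a minimal sketch of how it could work. The ModelConfig fields (lm_studio_base_url, openrouter_api_key) and the force_local parameter are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ModelConfig:
    # Hypothetical fields; the real ModelConfig in the project may differ.
    lm_studio_base_url: str = ""
    openrouter_api_key: str = ""


def provider_state(cfg: ModelConfig, force_local: bool = False) -> Tuple[Optional[str], bool]:
    """Return (primary provider or None, whether a cloud fallback is usable)."""
    local_ok = bool(cfg.lm_studio_base_url)
    # The simulated network failure mode (force_local) disables the cloud path.
    cloud_ok = bool(cfg.openrouter_api_key) and not force_local
    if local_ok:
        return "local", cloud_ok
    if cloud_ok:
        return "cloud", True
    return None, False
```

With both providers configured, local stays primary and cloud is only a fallback. With neither, the caller gets None and can raise a configuration error.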

4) Stress-adaptive prompting

The app estimates panic severity from transcript signals and adapts model instruction style:

high panic -> short, calmer steps
moderate panic -> concise complete guidance
high clarity -> slightly more detail

state = emotional_state(st.session_state.panic_input)
style_line = (
    "Use very short step-by-step instructions and calming language."
    if state["response_style"] == "short"
    else "Use concise but complete instructions with calm tone."
)
system_prompt = load_emergency_system_prompt() + "\nAdaptive response mode: " + style_line

This is not a new model capability; it is a response-policy layer optimized for cognitive load.
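
For readers wondering what "transcript signals" might look like, here is a rough heuristic sketch of an emotional_state function. The keyword list, weights, and thresholds are illustrative assumptions and have not been tuned. Only the "short" value of response_style comes from the snippet above; the other style names are hypothetical.

```python
import re

# Illustrative keyword list; an assumption, not the project's actual vocabulary.
PANIC_WORDS = {"help", "trapped", "dying", "bleeding", "hurry", "please", "fire"}


def emotional_state(text: str) -> dict:
    """Score panic from keywords, exclamation marks, and capitalisation."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return {"panic_score": 0.0, "response_style": "detailed"}

    keyword_hits = sum(1 for w in words if w in PANIC_WORDS)
    exclamations = text.count("!")
    letters = [c for c in text if c.isalpha()]
    caps_ratio = sum(c.isupper() for c in letters) / len(letters) if letters else 0.0

    # Assumed weights; the score is capped at 1.0.
    score = min(1.0, 0.15 * keyword_hits + 0.1 * exclamations + 0.5 * caps_ratio)

    if score >= 0.6:
        style = "short"      # high panic: short, calmer steps
    elif score >= 0.3:
        style = "concise"    # moderate panic: concise complete guidance
    else:
        style = "detailed"   # high clarity: slightly more detail
    return {"panic_score": score, "response_style": style}
```

A heuristic like this is cheap and works offline, which matters when the scoring itself should not depend on a model call.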

5) Multimodal scene analysis

We added image understanding for disaster scenes using OpenAI-compatible multimodal payloads, which work with both the local and the cloud provider.

payload = {
    "model": model_config.lm_studio_model,
    "messages": [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "text", "text": user_text},
                {"type": "image_url", "image_url": {"url": image_data_url}},
            ],
        },
    ],
}
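
The image_data_url value in the payload is a base64 data URL. A minimal sketch of how one could build it from uploaded bytes follows. The helper name to_image_data_url and the default MIME type are assumptions for illustration.

```python
import base64


def to_image_data_url(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a data URL for an image_url content part."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

In a Streamlit app the bytes would typically come from an uploaded file object, for example via its getvalue() method.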

Output is normalized into tactical fields:

visible hazards
structural risks
injury indicators
escape recommendations
safety warnings
rescue priority
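
One way to normalize model output into the fields listed above is to ask the model for JSON and then coerce whatever comes back into a fixed shape. The sketch below assumes that approach. The snake_case key names and the normalize_scene_analysis helper are hypothetical, since the post does not show the actual parser.

```python
import json

# Assumed key names, derived from the field list above.
TACTICAL_FIELDS = (
    "visible_hazards",
    "structural_risks",
    "injury_indicators",
    "escape_recommendations",
    "safety_warnings",
    "rescue_priority",
)


def normalize_scene_analysis(raw: str) -> dict:
    """Extract the first JSON object from model text and fill missing fields."""
    data = {}
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        try:
            parsed = json.loads(raw[start:end + 1])
            if isinstance(parsed, dict):
                data = parsed
        except json.JSONDecodeError:
            data = {}

    result = {}
    for field in TACTICAL_FIELDS:
        value = data.get(field)
        if field == "rescue_priority":
            # Single label; fall back to "unknown" rather than guessing.
            result[field] = str(value) if value else "unknown"
        elif isinstance(value, list):
            result[field] = [str(v) for v in value]
        elif value:
            result[field] = [str(value)]
        else:
            result[field] = []
    return result
```

Defaulting to empty lists and "unknown" keeps the UI from crashing on malformed output and avoids implying certainty the model did not express.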

6) Multi-agent consensus without frameworks

Instead of introducing a heavyweight agent framework, we used role-separated prompt variants:

Medic Agent
Structural Agent
Rescue Coordinator Agent

We then synthesize the three role outputs into a final command-level decision. This gives coordinated reasoning while keeping the runtime simple.
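
The role-separated pattern can be sketched in a few lines. The prompt wording, the run_consensus name, and the infer callable (any function taking a system prompt and a user prompt and returning text) are assumptions for illustration, not the project's actual prompts.

```python
from typing import Callable, Dict

# Hypothetical role prompts; the real wording is not shown in the post.
ROLE_PROMPTS = {
    "medic": "You are a medic. Assess injuries and immediate first aid.",
    "structural": "You are a structural engineer. Assess collapse and hazard risk.",
    "coordinator": "You are a rescue coordinator. Assess access and extraction order.",
}
SYNTHESIS_PROMPT = "Combine the role assessments into one command-level decision."


def run_consensus(report: str, infer: Callable[[str, str], str]) -> Dict[str, str]:
    """Run each role prompt on the same report, then synthesize a final decision."""
    opinions = {role: infer(prompt, report) for role, prompt in ROLE_PROMPTS.items()}
    briefing = "\n".join(f"[{role}] {text}" for role, text in opinions.items())
    final = infer(SYNTHESIS_PROMPT, f"Report:\n{report}\n\nRole assessments:\n{briefing}")
    return {**opinions, "final": final}
```

Passing the inference function in as a parameter means the same code works with the local or cloud path and can be tested with a stub.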

7) Responder HUD and cognitive readability

A key learning: technically correct output is useless if it cannot be read in a panic.

We refactored rendering into “emergency chunks”:

severity chip
1-line summary
2–3 action bullets

No long paragraphs. No dense blocks.
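
As a rough illustration of the chunk format, the helper below collapses a summary to one line and caps the action list. The function name format_emergency_chunk and the plain-text layout are assumptions. The app itself renders styled Streamlit components.

```python
from typing import List


def format_emergency_chunk(severity: str, summary: str, actions: List[str],
                           max_actions: int = 3) -> str:
    """Build a severity tag, a one-line summary, and at most max_actions bullets."""
    one_line = " ".join(summary.split())  # collapse newlines and repeated spaces
    lines = [f"[{severity.upper()}] {one_line}"]
    lines += [f"- {a.strip()}" for a in actions[:max_actions] if a.strip()]
    return "\n".join(lines)
```

Enforcing the cap in code, rather than trusting the model to be brief, keeps the display readable even when the model over-generates.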

The Responder View moved from a paragraph report to HUD-like metric tiles:

victims
extraction priority
structural risk
injury severity
equipment
rescue difficulty

8) Browser voice capture reliability

Web Speech API behavior differs across browsers. We handled instability with:

explicit recorder state machine
transient error retries
forced transcript commit on stop
manual fallback dictation input if browser path fails

This improved demo reliability significantly.
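
The real recorder runs in the browser, so the following is only a Python model of the state transitions described above, written to make the logic concrete. The state names, the max_retries default, and the method names are all assumptions.

```python
from enum import Enum


class RecorderState(Enum):
    IDLE = "idle"
    LISTENING = "listening"
    RETRYING = "retrying"
    STOPPED = "stopped"
    FAILED = "failed"


class Recorder:
    def __init__(self, max_retries: int = 2):
        self.state = RecorderState.IDLE
        self.max_retries = max_retries
        self.retries = 0
        self.interim = ""     # latest unconfirmed transcript
        self.committed = ""   # transcript handed to the app

    def start(self) -> None:
        self.retries, self.interim, self.committed = 0, "", ""
        self.state = RecorderState.LISTENING

    def on_result(self, text: str) -> None:
        if self.state in (RecorderState.LISTENING, RecorderState.RETRYING):
            self.interim = text
            self.state = RecorderState.LISTENING

    def on_error(self, transient: bool) -> None:
        if self.state not in (RecorderState.LISTENING, RecorderState.RETRYING):
            return
        if transient and self.retries < self.max_retries:
            self.retries += 1
            self.state = RecorderState.RETRYING
        else:
            self.state = RecorderState.FAILED  # UI should offer manual dictation

    def stop(self) -> str:
        # Forced commit: keep whatever was heard, even without a final result.
        self.committed = self.interim.strip()
        if self.state is not RecorderState.FAILED:
            self.state = RecorderState.STOPPED
        return self.committed

    @property
    def needs_manual_fallback(self) -> bool:
        return self.state is RecorderState.FAILED
```

Making the states explicit means every browser event has a defined outcome, which is what prevents the recorder from getting stuck mid-demo.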

9) Geo context

The app requests browser geolocation permission by default and attempts reverse geocoding to auto-fill:

latitude
longitude
city
nearby landmark

This removes hardcoded location assumptions and improves responder usefulness.
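
Since reverse geocoding can fail or be unavailable offline, the auto-fill step benefits from explicit fallbacks. The sketch below assumes a reverse_geocode callable that returns a dict with "city" and "landmark" keys. That interface and the build_geo_context name are hypothetical, and no specific geocoding service is implied.

```python
from typing import Callable, Dict, Optional


def build_geo_context(
    lat: Optional[float],
    lon: Optional[float],
    reverse_geocode: Callable[[float, float], Dict[str, str]],
) -> Dict[str, object]:
    """Fill location fields, keeping raw coordinates if the lookup fails."""
    context: Dict[str, object] = {
        "latitude": lat,
        "longitude": lon,
        "city": "unknown",
        "nearby_landmark": "unknown",
    }
    if lat is None or lon is None:
        return context  # permission denied or no fix yet
    try:
        place = reverse_geocode(lat, lon)
    except Exception:
        return context  # degraded network: coordinates alone are still useful
    context["city"] = place.get("city") or "unknown"
    context["nearby_landmark"] = place.get("landmark") or "unknown"
    return context
```

Raw coordinates survive every failure path, so responders still get a usable position when the lookup does not complete.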

10) Developer workflow

The project remains easy to fork:

git clone https://github.com/harishkotra/last-message.git
cd last-message
cp .env.example .env
pip install -r requirements.txt
streamlit run app.py

11) What we would build next

on-device text-to-speech for emergency steps
map pins + safe route overlays
incident timeline snapshots for responders
red-team prompt hardening for false certainty control
multilingual quality tuning per region

12) Final take

Last Message demonstrates a practical principle for AI in disasters:

The best emergency AI is not the one that talks the most. It is the one that reduces chaos into clear next actions.

When words fail, AI helps humans be heard.

Code and more: https://www.dailybuild.xyz/project/137-last-message
