Esther Studer

Posted on Mar 25

I Built an AI Pet Symptom Checker — Here's What the Data Taught Me About Pet Owner Anxiety

#python #ai #webdev #showdev

Last year I shipped a side project: an AI-powered pet symptom checker. I expected to learn about NLP. What I actually learned was about human psychology.

Here's the breakdown — including the code patterns that made it actually useful.

The Problem

Pet owners Google symptoms at 2am in a panic. They get SEO-optimized horror stories. They either rush to an expensive ER or convince themselves everything is fine when it isn't.

The signal-to-noise ratio is terrible. I wanted to fix that.

The Architecture

The core is a simple pipeline:

import json
from dataclasses import dataclass
from enum import Enum

import openai

class UrgencyLevel(Enum):
    MONITOR = "monitor"
    CALL_VET = "call_vet"
    EMERGENCY = "emergency"

@dataclass
class SymptomResult:
    urgency: UrgencyLevel
    explanation: str
    next_steps: list[str]
    confidence: float

def analyze_pet_symptoms(
    symptoms: str,
    species: str,
    age_years: float,
    weight_kg: float
) -> SymptomResult:
    prompt = f"""
    You are a veterinary triage assistant.

    Patient: {species}, {age_years} years old, {weight_kg}kg
    Owner reports: {symptoms}

    Classify urgency as one of:
    - MONITOR: Watch at home, no immediate vet visit needed
    - CALL_VET: Call your vet within 24 hours
    - EMERGENCY: Go to emergency vet immediately

    Respond in JSON with keys: urgency, explanation, next_steps (list), confidence (0-1)
    """

    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"}
    )

    data = json.loads(response.choices[0].message.content)
    return SymptomResult(
        urgency=UrgencyLevel(data["urgency"].lower()),
        explanation=data["explanation"],
        next_steps=data["next_steps"],
        confidence=data["confidence"]
    )

Clean, right? But this naive version had a 22% overtriage rate in early testing, meaning it flagged cases as more urgent than they turned out to be.
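The parsing step in that pipeline also trusts the model to return exactly the requested keys. One way to harden it is sketched below. The helper name `parse_triage_json` and the choice to fall back to CALL_VET on malformed output are illustrative assumptions, not the production code.

```python
import json
from enum import Enum


class UrgencyLevel(Enum):
    MONITOR = "monitor"
    CALL_VET = "call_vet"
    EMERGENCY = "emergency"


def parse_triage_json(raw: str) -> dict:
    """Validate model output; fall back to CALL_VET on anything unexpected."""
    # Assumed policy: when the output is unusable, ask the owner to call a vet.
    fallback = {
        "urgency": UrgencyLevel.CALL_VET,
        "explanation": "The automated check could not produce a reliable answer.",
        "next_steps": ["Call your vet and describe the symptoms."],
        "confidence": 0.0,
    }
    try:
        data = json.loads(raw)
        urgency = UrgencyLevel(str(data["urgency"]).strip().lower())
        confidence = float(data["confidence"])
        explanation = str(data["explanation"])
        steps = data["next_steps"]
        if not isinstance(steps, list):
            raise ValueError("next_steps must be a list")
    except (json.JSONDecodeError, KeyError, ValueError, TypeError):
        return fallback
    return {
        "urgency": urgency,
        "explanation": explanation,
        "next_steps": [str(step) for step in steps],
        # Clamp to the 0-1 range the prompt asks for
        "confidence": min(1.0, max(0.0, confidence)),
    }
```

The returned dict has the same keys as `SymptomResult`, so it could be unpacked straight into the dataclass.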

What Went Wrong (And How I Fixed It)

Problem 1: Context collapse

A dog limping after a hike and a dog that starts limping for no apparent reason are very different cases. The model needed temporal context.

@dataclass
class SymptomContext:
    symptoms: str
    onset: str  # "sudden" | "gradual" | "after_activity"
    duration_hours: int
    prior_incidents: bool
    recent_changes: str  # diet, environment, medication

@dataclass
class PetProfile:
    # Not shown in the simplified snippet; fields assumed from analyze_pet_symptoms
    species: str
    age_years: float
    weight_kg: float

def analyze_with_context(ctx: SymptomContext, patient: PetProfile) -> SymptomResult:
    # Much richer prompt with temporal and contextual signals
    ...

Adding onset + duration dropped overtriage to 8%.
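The richer prompt itself is not shown here. As a rough sketch of how the context fields could be rendered into prompt lines, assuming a hypothetical helper named `build_context_block` and wording of my own choosing:

```python
from dataclasses import dataclass


@dataclass
class SymptomContext:
    symptoms: str
    onset: str  # "sudden" | "gradual" | "after_activity"
    duration_hours: int
    prior_incidents: bool
    recent_changes: str  # diet, environment, medication


def build_context_block(ctx: SymptomContext) -> str:
    """Render the temporal and contextual fields as plain prompt lines."""
    lines = [
        f"Owner reports: {ctx.symptoms}",
        f"Onset: {ctx.onset.replace('_', ' ')}",
        f"Duration: {ctx.duration_hours} hours",
        f"Has happened before: {'yes' if ctx.prior_incidents else 'no'}",
        # An empty string means the owner reported nothing
        f"Recent changes: {ctx.recent_changes or 'none reported'}",
    ]
    return "\n".join(lines)
```

The resulting block would replace the single "Owner reports" line in the original prompt, so the model sees onset and duration next to the symptoms.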

Problem 2: Breed blindness

Some breeds have dramatically different normal ranges. Heavy breathing in a Bulldog is often just the breed's baseline. A Labrador breathing heavily even after rest is concerning.

from dataclasses import replace

BREED_RISK_MODIFIERS = {
    "bulldog": {"respiratory": -0.3},      # Expected baseline higher
    "dachshund": {"back_pain": +0.4},      # Higher risk for IVDD
    "great_dane": {"bloat": +0.5},         # Bloat risk is real
    "persian_cat": {"respiratory": -0.2},  # Brachycephalic baseline
}

def apply_breed_modifier(result: SymptomResult, breed: str, symptom_category: str) -> SymptomResult:
    # Normalize "Great Dane" -> "great_dane" so multi-word breeds match the keys above
    breed_key = breed.strip().lower().replace(" ", "_")
    modifier = BREED_RISK_MODIFIERS.get(breed_key, {}).get(symptom_category, 0)
    # Shift the confidence score by the breed modifier, clamped to [0, 1]
    adjusted_confidence = min(1.0, max(0.0, result.confidence + modifier))
    return replace(result, confidence=adjusted_confidence)
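The modifier lookup needs a `symptom_category`, and how that category gets assigned is not covered above. A deliberately simple sketch is a keyword lookup. The keyword lists and the `categorize_symptom` name are hypothetical, and only the three category names come from the table.

```python
from typing import Optional

# Hypothetical keyword lists; a real system would need far better coverage
CATEGORY_KEYWORDS = {
    "respiratory": ("breathing", "panting", "wheezing", "coughing"),
    "back_pain": ("back pain", "spine", "hunched"),
    "bloat": ("bloated", "swollen belly", "retching"),
}


def categorize_symptom(description: str) -> Optional[str]:
    """Return the first category whose keywords appear in the text, else None."""
    text = description.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return None
```

Substring matching like this is brittle (it misses synonyms and misspellings), so in practice the category might be better requested from the model as an extra JSON key.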

Problem 3: The anxiety amplifier

This was the surprising one. Anxious owners describe symptoms more dramatically. "My dog is DYING" might mean a soft stool.

I added a lightweight sentiment calibration step:

def calibrate_owner_language(raw_input: str) -> tuple[str, float]:
    """
    Returns (normalized_description, anxiety_factor)
    anxiety_factor: 1.0 = neutral, >1.0 = amplified, <1.0 = downplaying
    """
    calibration_prompt = f"""
    Rewrite this pet symptom description in neutral clinical language.
    Also rate owner anxiety from 0.5 (downplaying) to 2.0 (highly anxious).

    Input: "{raw_input}"

    JSON response: {{"normalized": "...", "anxiety_factor": 1.0}}
    """
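    # Assumed: same model and call pattern as analyze_pet_symptoms above
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": calibration_prompt}],
        response_format={"type": "json_object"}
    )
    data = json.loads(response.choices[0].message.content)
    return data["normalized"], float(data["anxiety_factor"])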

This single addition improved accuracy more than any other change.
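Since the calibration prompt asks for a factor between 0.5 and 2.0, the parsed value is worth validating before anything downstream relies on it. A minimal sketch, assuming a hypothetical `parse_calibration` helper that falls back to the owner's original text and a neutral factor when the output is unusable:

```python
import json


def parse_calibration(raw: str, original: str) -> tuple[str, float]:
    """Parse the calibration JSON; fall back to (original, 1.0) on bad output."""
    try:
        data = json.loads(raw)
        normalized = str(data["normalized"]).strip()
        factor = float(data["anxiety_factor"])
    except (json.JSONDecodeError, KeyError, ValueError, TypeError):
        return original, 1.0
    if not normalized:
        # An empty rewrite is useless, so keep the owner's own words
        normalized = original
    # Clamp to the 0.5-2.0 range requested in the prompt
    return normalized, min(2.0, max(0.5, factor))
```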

The Lesson Nobody Talks About

Building AI tools for emotional use cases is different from building for productivity.

When someone types "my cat won't eat and I'm scared" — they don't just need information. They need to feel heard before they can receive information.

I rewrote the response format three times before I got it right:

RESPONSE_TEMPLATE = """
### I hear you — let's figure this out together.

**What you're seeing:** {normalized_symptoms}

**What this likely means:** {explanation}

**Urgency level:** {urgency_display}

**Your next steps:**
{formatted_next_steps}

---
*Remember: you know your pet best. Trust your gut alongside this guidance.*
"""

Conversion from "viewed result" to "took recommended action" went from 31% to 67%.

Numbers After 90 Days

4,200+ symptom checks
8% overtriage rate (down from 22%)
3% undertriage rate (the scary one — we monitor this obsessively)
Average session: 4.2 minutes
Most common symptom: lethargy (31%)
Most common emergency trigger: breathing difficulty (64% of EMERGENCY classifications)
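For reference, here is one way overtriage and undertriage rates can be computed, assuming each check is paired with a reference urgency level (for example, from a vet's follow-up). The `triage_rates` helper and the severity ordering are illustrative, not the tool's actual metrics code.

```python
# Order the urgency levels so predictions can be compared with outcomes
SEVERITY = {"monitor": 0, "call_vet": 1, "emergency": 2}


def triage_rates(pairs: list[tuple[str, str]]) -> tuple[float, float]:
    """Return (overtriage_rate, undertriage_rate) for (predicted, actual) pairs."""
    total = len(pairs)
    if total == 0:
        return 0.0, 0.0
    # Overtriage: the tool was more urgent than the reference label
    over = sum(1 for pred, actual in pairs if SEVERITY[pred] > SEVERITY[actual])
    # Undertriage: the tool was less urgent than the reference label
    under = sum(1 for pred, actual in pairs if SEVERITY[pred] < SEVERITY[actual])
    return over / total, under / total
```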

What I'd Do Differently

Start with the emotional layer — the empathetic framing should be designed first, not bolted on
Breed database from day one — retrofitting it was painful
Build a feedback loop early — I added vet outcome tracking too late to have statistically useful data
Rate limit aggressively — some users were stress-testing with 30 queries/session at 3am (you know who you are)
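On the rate limiting point, a per-session sliding window is one simple option. This is a minimal in-memory sketch. The class name, the default limits, and the idea of keying on a session ID are all assumptions, and a multi-process deployment would need shared storage instead.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class SessionRateLimiter:
    """Allow at most max_requests per session within a sliding time window."""

    def __init__(self, max_requests: int = 5, window_seconds: float = 600.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, session_id: str, now: Optional[float] = None) -> bool:
        # `now` can be injected for testing; defaults to a monotonic clock
        now = time.monotonic() if now is None else now
        hits = self._hits[session_id]
        # Drop timestamps that have aged out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

For a triage tool, a rejected request should probably still show emergency vet contact details rather than a bare error.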

Try It

If you have a pet and want to see what this looks like in production, the tool is live at mypettherapist.com — it's free to use.

The code patterns above are simplified but represent the real architecture. Happy to answer questions in the comments — especially curious if others have built AI tools for high-anxiety use cases and how you handled the emotional design layer.

What's the most surprising thing your users taught you about how they actually use your AI tool? Drop it below. 👇
