DEV Community

Esther Studer

I Used OpenAI + FastAPI to Build a Pet Symptom Checker — Here's the Full Stack Breakdown


Pet owners panic. It's 11 PM, your dog just ate something suspicious, and Google gives you 47 tabs of conflicting advice ranging from "it's fine" to "call 911."

So I built an AI that actually helps. Here's the complete technical breakdown — including the parts that didn't work the first time.

The Problem

Most pet health tools are either:

  • Glorified keyword matching ("vomiting" → "go to vet")
  • $100+/month subscription vet chat services
  • Reddit threads from 2014

I wanted something that actually reasons about symptoms, considers species/breed/age, and gives actionable triage advice in plain English.

The Stack

Frontend:  Next.js 14 + Tailwind
Backend:   FastAPI (Python 3.11)
AI:        OpenAI GPT-4o with structured outputs
DB:        Supabase (Postgres + pgvector)
Infra:     Railway + Vercel

The Core: Structured AI Outputs

The biggest mistake most devs make with AI health tools? Trusting unstructured text output.

Here's the pattern that actually works:

from openai import OpenAI
from pydantic import BaseModel
from enum import Enum

client = OpenAI()

class UrgencyLevel(str, Enum):
    MONITOR = "monitor"      # Watch at home
    VET_SOON = "vet_soon"    # Within 24-48h
    VET_NOW = "vet_now"      # Today, urgent
    EMERGENCY = "emergency"  # Go NOW

class PetTriageResult(BaseModel):
    urgency: UrgencyLevel
    reasoning: str
    possible_causes: list[str]
    home_actions: list[str]
    red_flags_to_watch: list[str]
    disclaimer: str

def analyze_pet_symptoms(
    species: str,
    breed: str,
    age_years: float,
    weight_kg: float,
    symptoms: list[str],
    duration_hours: int
) -> PetTriageResult:

    system_prompt = """You are a veterinary triage assistant. 
    Analyze symptoms conservatively — when in doubt, escalate urgency.
    Never diagnose, always triage. Focus on actionable next steps."""

    user_prompt = f"""
    Patient: {breed} {species}, {age_years} years old, {weight_kg}kg
    Symptoms: {', '.join(symptoms)}
    Duration: {duration_hours} hours

    Provide a structured triage assessment.
    """

    response = client.beta.chat.completions.parse(
        model="gpt-4o-2024-08-06",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ],
        response_format=PetTriageResult,
        temperature=0.2  # Low temp for medical context
    )

    return response.choices[0].message.parsed

The beta.chat.completions.parse() endpoint validates the model's output against your Pydantic class and hands back a typed instance — no more json.loads() and praying.
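One caveat worth handling: `parsed` can be `None` when the model refuses to answer, in which case the refusal text lives on the message's `refusal` attribute. A minimal defensive unwrapper (this helper is mine, not from the original code) might look like:

```python
def extract_triage(response):
    """Return the parsed PetTriageResult, or raise with the model's refusal text.

    Hypothetical helper: `parsed` is None when the model declines, and
    `refusal` then carries the refusal message.
    """
    message = response.choices[0].message
    if message.parsed is None:
        raise ValueError(f"Model refused: {message.refusal}")
    return message.parsed
```

Wiring this into `analyze_pet_symptoms` turns a silent `None` into a loggable, user-visible error instead of a downstream AttributeError.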

The FastAPI Endpoint

from fastapi import FastAPI, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from pydantic import BaseModel
import asyncio

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["https://mypettherapist.com"],
    allow_methods=["POST"],
    allow_headers=["*"]
)

class SymptomRequest(BaseModel):
    species: str
    breed: str
    age_years: float
    weight_kg: float
    symptoms: list[str]
    duration_hours: int

@app.post("/api/triage")
async def triage_symptoms(request: SymptomRequest):
    try:
        # Run in thread pool to avoid blocking the event loop
        result = await asyncio.to_thread(
            analyze_pet_symptoms,
            request.species,
            request.breed,
            request.age_years,
            request.weight_kg,
            request.symptoms,
            request.duration_hours
        )
        return result.model_dump()
    except Exception as e:
        # In production, log the real exception and return a generic message
        # instead of echoing provider errors back to the client
        raise HTTPException(status_code=500, detail=str(e))
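For reference, a request/response pair for this endpoint might look like the following. The field names mirror SymptomRequest and PetTriageResult above; the values are illustrative, not real model output:

```python
# Illustrative payloads only — field names come from the post's models,
# the values are made up for demonstration.
example_request = {
    "species": "dog",
    "breed": "beagle",
    "age_years": 4.5,
    "weight_kg": 12.0,
    "symptoms": ["vomiting", "lethargy"],
    "duration_hours": 6,
}

example_response = {
    "urgency": "vet_soon",  # one of: monitor, vet_soon, vet_now, emergency
    "reasoning": "Vomiting plus lethargy over 6 hours warrants a vet visit.",
    "possible_causes": ["dietary indiscretion", "gastroenteritis"],
    "home_actions": ["withhold food briefly", "offer small sips of water"],
    "red_flags_to_watch": ["blood in vomit", "collapse"],
    "disclaimer": "Not a diagnosis; consult a veterinarian.",
}
```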

The Semantic Memory Layer (The Secret Sauce)

Here's where it gets interesting. I store anonymized past triage sessions in pgvector and use them for few-shot context:

import os

from supabase import create_client

SUPABASE_URL = os.environ["SUPABASE_URL"]
SUPABASE_KEY = os.environ["SUPABASE_KEY"]

def get_similar_cases(symptom_embedding: list[float], limit: int = 3) -> list[dict]:
    """Fetch similar historical cases for few-shot context."""

    supabase = create_client(SUPABASE_URL, SUPABASE_KEY)

    result = supabase.rpc(
        'match_pet_cases',
        {
            'query_embedding': symptom_embedding,
            'match_threshold': 0.78,
            'match_count': limit
        }
    ).execute()

    return result.data

def build_context_prompt(similar_cases: list[dict]) -> str:
    if not similar_cases:
        return ""

    examples = []
    for case in similar_cases:
        examples.append(
            f"Similar case: {case['symptoms']} | "
            f"Urgency: {case['urgency']} | "
            f"Outcome: {case['outcome_summary']}"
        )

    return "\n".join(["Reference cases:"] + examples)
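The code above takes `symptom_embedding` as given. A sketch of how it could be produced with OpenAI's embeddings API — the canonicalization helper and the model choice are my assumptions, not shown in the post:

```python
def canonical_symptom_text(species: str, symptoms: list[str]) -> str:
    """Normalize a case into a stable string so similar cases embed similarly."""
    return f"{species.lower()}: " + ", ".join(sorted(s.lower() for s in symptoms))

def get_symptom_embedding(species: str, symptoms: list[str]) -> list[float]:
    """Embed the canonical symptom text for pgvector similarity search."""
    # Lazy import keeps canonical_symptom_text usable without the openai package.
    from openai import OpenAI
    response = OpenAI().embeddings.create(
        model="text-embedding-3-small",  # assumed model; dims must match the pgvector column
        input=canonical_symptom_text(species, symptoms),
    )
    return response.data[0].embedding
```

Sorting and lowercasing the symptom list means "Vomiting, lethargy" and "lethargy, vomiting" embed identically, which noticeably tightens the similarity matches.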

This reduced hallucinations by ~40% in my testing — especially for rare breed-specific conditions.

What Failed (Honest Post-Mortem)

Attempt 1: GPT-3.5-turbo — Too inconsistent. Would sometimes classify "lethargic after eating" as EMERGENCY. Switched to GPT-4o, problem mostly solved.

Attempt 2: Streaming responses — Users loved seeing words appear in real-time... until the partial JSON broke everything. Switched to full response + loading spinner. Boring but reliable.

Attempt 3: Multi-species single prompt — Turns out cat and dog physiology is different enough that one prompt performed poorly. Separate system prompts per species, ~30% accuracy improvement.
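The species split from Attempt 3 doesn't need anything fancy — a prompt registry with a conservative default covers it. The prompt text here is illustrative; the real prompts aren't in the post:

```python
# Illustrative per-species prompt registry (the actual prompts are not published).
SPECIES_PROMPTS = {
    "dog": (
        "You are a veterinary triage assistant for dogs. "
        "Weigh breed-specific risks (e.g. bloat in deep-chested breeds). "
        "Triage conservatively; never diagnose."
    ),
    "cat": (
        "You are a veterinary triage assistant for cats. "
        "Cats mask pain; treat subtle behavior changes as significant. "
        "Triage conservatively; never diagnose."
    ),
}

DEFAULT_PROMPT = (
    "You are a veterinary triage assistant. "
    "Triage conservatively; never diagnose."
)

def system_prompt_for(species: str) -> str:
    """Pick the species-specific system prompt, falling back to a generic one."""
    return SPECIES_PROMPTS.get(species.strip().lower(), DEFAULT_PROMPT)
```

The generic fallback matters for the exotic-pet traffic: an unknown species still gets the conservative baseline rather than a KeyError.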

Attempt 4: Asking users to rate urgency — Confirmation bias is real. People rate "my dog seems tired" as MONITOR even when it's 5 days of lethargy. Removed user urgency input entirely.

Performance Numbers

  • Average response time: 1.8s (GPT-4o) / 0.9s (GPT-4o-mini fallback)
  • P99 latency: 4.2s
  • Structured output parse failures: < 0.1%
  • Cost per triage: ~$0.008
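The GPT-4o-mini fallback implied by those latency numbers could be wrapped like this. The retry policy is my assumption; the post doesn't show how the fallback is triggered:

```python
def call_with_fallback(call, models=("gpt-4o-2024-08-06", "gpt-4o-mini")):
    """Try each model in order; `call` is any function taking a model name.

    Hypothetical wrapper: falls back to the cheaper/faster model only when
    the primary call raises.
    """
    last_error = None
    for model in models:
        try:
            return call(model)
        except Exception as exc:  # in production, catch openai.APIError specifically
            last_error = exc
    raise last_error
```

Keeping the model name as a parameter of `call` means the same wrapper works for both the triage completion and the embedding call.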

Rate Limiting (Don't Skip This)

from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address
from fastapi import Request

limiter = Limiter(key_func=get_remote_address)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post("/api/triage")
@limiter.limit("10/minute")
async def triage_symptoms(request: Request, body: SymptomRequest):
    # ... same as before

Without this, a single viral Reddit post will drain your OpenAI credits in 20 minutes. Ask me how I know.

The Actual App

If you want to see this in action (not just the code), the live version is at mypettherapist.com — free to use, no signup required. Works for dogs, cats, and a surprisingly wide range of exotic pets.

TL;DR

  1. Use structured outputs — Pydantic + beta.chat.completions.parse() is non-negotiable for production
  2. Separate system prompts per species — don't fight physiology with clever prompting
  3. pgvector for few-shot context — surprisingly effective for domain-specific accuracy
  4. Low temperature + conservative bias — medical context demands it
  5. Rate limit before launch — seriously

Building something similar? Drop your questions in the comments — happy to share more implementation details.

What's the weirdest symptom your pet has ever had? 👇
