DEV Community

Esther Studer

I Built an AI That Recommends Therapy Animals — Here's the Surprisingly Simple Tech Stack

Everyone builds AI chatbots. I built one that figures out whether you need a golden retriever or a guinea pig.

Spoiler: the architecture is embarrassingly simple — and it works.

The Problem I Was Actually Solving

Animal-assisted therapy (AAT) is genuinely evidence-backed. Multiple studies show meaningful reductions in anxiety, depression, and stress when people interact with animals. But matching someone to the right animal type? That was still a phone call with a specialist, a 2-week wait, and a $200 intake session.

I thought: this first-level triage is a classification problem. Let's build it.

The Tech Stack (nothing fancy, I promise)

FastAPI (backend)
OpenAI GPT-4o-mini (the actual brain)
Pydantic v2 (data validation)
Simple JSON config (the animal knowledge base)
Vercel (deploy in 3 minutes)

No vector DB. No RAG pipeline. No Kubernetes. Just a well-engineered prompt and clean data.

The Core: The Animal Knowledge Base

This is where 80% of the value lives. I spent a weekend reading AAT literature and distilling it into a structured config (a JSON-shaped Python dict):

ANIMAL_PROFILES = {
    "dog": {
        "best_for": ["loneliness", "depression", "PTSD", "social anxiety"],
        "interaction_type": "active",
        "commitment_level": "high",
        "sensory_profile": "high_touch_high_energy",
        "contraindications": ["severe dog phobia", "very limited mobility"]
    },
    "cat": {
        "best_for": ["general anxiety", "stress", "chronic pain", "independence-seekers"],
        "interaction_type": "passive_available",
        "commitment_level": "medium",
        "sensory_profile": "low_touch_optional",
        "contraindications": ["cat allergy", "need for predictable interaction"]
    },
    "rabbit": {
        "best_for": ["children with anxiety", "ADHD support", "gentle sensory needs"],
        "interaction_type": "calm_tactile",
        "commitment_level": "medium",
        "sensory_profile": "soft_quiet",
        "contraindications": ["very young children unsupervised", "need for high engagement"]
    },
    "fish": {
        "best_for": ["high-stress environments", "ADHD", "insomnia", "minimal care capacity"],
        "interaction_type": "observational",
        "commitment_level": "low",
        "sensory_profile": "visual_calming",
        "contraindications": ["need for physical touch", "expecting companionship"]
    }
}
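Since the whole recommendation rests on this config, it is worth validating it at startup. A minimal sketch using Pydantic v2 (already in the stack); the `AnimalProfile` model and the startup check are my addition, not part of the app:

```python
from typing import Literal

from pydantic import BaseModel, ValidationError

class AnimalProfile(BaseModel):
    best_for: list[str]
    interaction_type: str
    commitment_level: Literal["high", "medium", "low"]
    sensory_profile: str
    contraindications: list[str]

# Validate every profile once at import time so a typo in the config
# (e.g. commitment_level: "hgih") fails loudly instead of silently
# skewing the scoring later. One sample profile shown for brevity.
profiles = {
    "fish": {
        "best_for": ["high-stress environments", "ADHD", "insomnia", "minimal care capacity"],
        "interaction_type": "observational",
        "commitment_level": "low",
        "sensory_profile": "visual_calming",
        "contraindications": ["need for physical touch", "expecting companionship"],
    }
}

validated = {name: AnimalProfile(**data) for name, data in profiles.items()}
```

A bad value anywhere in the dict now raises a `ValidationError` at boot, which is exactly where you want config errors to surface.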

The Prompt Engineering

Here's where it gets interesting. I don't pass raw user input to GPT and ask "what animal?" That produces hallucinated responses and liability nightmares.

Instead, I do a two-stage pipeline:

Stage 1: Extract structured signals

import json

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

async def extract_user_signals(user_input: str) -> dict:
    prompt = f"""
    Analyze this person's situation and extract structured signals.
    Return ONLY valid JSON, no explanation.

    Input: {user_input}

    Extract:
    - primary_concern: (anxiety|depression|loneliness|stress|PTSD|ADHD|chronic_pain|other)
    - lifestyle: (active|sedentary|mixed)
    - living_situation: (apartment|house|shared|unknown)
    - care_capacity: (high|medium|low)
    - previous_pet_experience: (yes|no|unknown)
    - sensory_preferences: (touch_seeking|touch_avoidant|neutral)
    - any_phobias_or_allergies: string or null
    """
    response = await client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},  # forces parseable JSON output
        temperature=0,  # extraction should be deterministic
    )
    return json.loads(response.choices[0].message.content)

Stage 2: Match against profiles

from pydantic import BaseModel

class AnimalRecommendation(BaseModel):
    animal: str
    confidence: int

def match_animal_profile(signals: dict) -> AnimalRecommendation:
    # Deterministic scoring against our knowledge base
    scores = {}

    # The extraction enum uses snake_case ("chronic_pain") while the
    # profiles use plain phrases ("chronic pain"), so normalize first.
    concern = (signals.get("primary_concern") or "").replace("_", " ")

    for animal, profile in ANIMAL_PROFILES.items():
        score = 0

        # Primary concern match (substring, so "anxiety" also hits "social anxiety")
        if concern and any(concern in entry for entry in profile["best_for"]):
            score += 3

        # Lifestyle fit
        if signals.get("lifestyle") == "active" and profile["interaction_type"] == "active":
            score += 2
        elif signals.get("lifestyle") == "sedentary" and profile["interaction_type"] == "observational":
            score += 2

        # Care capacity: the user must be able to meet the animal's commitment level
        level = {"high": 3, "medium": 2, "low": 1}
        if level.get(signals.get("care_capacity"), 1) >= level[profile["commitment_level"]]:
            score += 1

        # Hard contraindication check
        if signals.get("any_phobias_or_allergies"):
            for contra in profile["contraindications"]:
                if any(word in signals["any_phobias_or_allergies"].lower()
                       for word in contra.split()):
                    score = -999  # disqualify outright
                    break

        scores[animal] = score

    best_match = max(scores, key=scores.get)
    return AnimalRecommendation(animal=best_match, confidence=scores[best_match])

The key insight: GPT handles messy natural language → structured code handles the matching logic. You never ask GPT to make the final recommendation. You ask it to parse. Big difference.

The FastAPI Endpoint

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class TherapyRequest(BaseModel):
    user_message: str
    session_id: str | None = None

class TherapyResponse(BaseModel):
    recommended_animal: str
    reasoning: str
    confidence_score: int
    next_steps: list[str]

@app.post("/recommend", response_model=TherapyResponse)
async def recommend_therapy_animal(request: TherapyRequest):
    signals = await extract_user_signals(request.user_message)
    match = match_animal_profile(signals)

    return TherapyResponse(
        recommended_animal=match.animal,
        reasoning=generate_human_explanation(signals, match),  # defined elsewhere in the app
        confidence_score=match.confidence,
        next_steps=get_next_steps(match.animal)  # defined elsewhere in the app
    )

Clean, typed, testable. Deployed in 3 minutes on Vercel.

What I Learned Building This

1. LLMs are input parsers, not decision engines
For anything with real-world consequences, use the LLM to understand messy human language, then use deterministic code to make decisions. Auditable, explainable, safe.

2. Domain knowledge > model size
GPT-4o-mini with a good animal knowledge base outperforms GPT-4 with a vague prompt. The JSON config above took a weekend to research. That weekend is where the value is.

3. Contraindications are non-negotiable
In any health-adjacent AI app, you need hard stops. A phobia or allergy isn't something to "weigh" — it's an automatic disqualification. Build that into your data layer, not your prompt.

4. The conversation UI matters more than the AI
Users don't think in structured fields. They say "I've been really stressed at work lately and I come home to an empty apartment." Your system needs to handle that gracefully.

The Surprising Results

After running this in production:

  • 78% of users said the recommendation "felt right" without further prompting
  • Fish was recommended 3x more often than expected (it turns out there are a lot of stressed apartment dwellers with low care capacity)
  • The most common override: people wanted dogs even when the system said cats — and they were usually wrong about their care capacity

Try It Live

If you want to see this in action (and potentially get a genuinely useful recommendation), the full app is live at mypettherapist.com — built exactly on this architecture, with a few extra layers for conversation history and professional referral routing.

The codebase structure above is the real thing, simplified for readability.


If you build something similar — especially for health/wellness — I'd love to see it. The pattern of "LLM parses → code decides" has applications way beyond pet therapy.

Drop a comment with your use case. 👇
