Esther Studer
I Built a Pet Emotional Support AI — Here Are the 5 Wildest Edge Cases We Hit

TL;DR

We built an AI-powered pet wellness companion. Users love it. Our error logs are... a lot.

Here are five real architectural decisions we got wrong — and how we fixed them.


Background

My team built MyPetTherapist — an AI companion that helps pet owners understand, support, and connect with their animals on a deeper level.

Think: behavioral guidance, emotional check-ins, and personalized routines — all powered by LLMs.

We expected the hard parts to be the AI. The hard parts were the humans.


Edge Case #1: The Grieving Owner

Our first prompt design was optimized for happy, curious pet owners.

system_prompt = """
You are a cheerful pet wellness assistant.
Help the user understand their pet's behavior and emotional needs.
Keep responses warm, fun, and encouraging.
"""

Within the first week, a user wrote: "My dog passed away yesterday. I don't know what to do."

Our AI responded with: "That's wonderful! Let's explore some fun enrichment activities!"

The fix: Sentiment pre-screening before routing to the main LLM.

import openai

VALID_TONES = {"GRIEF", "CRISIS", "NORMAL", "CELEBRATION"}

def route_message(user_input: str) -> str:
    classifier = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Classify this message as exactly one of: GRIEF, CRISIS, NORMAL, or CELEBRATION. Reply with the label only."},
            {"role": "user", "content": user_input}
        ],
        max_tokens=10
    )
    tone = classifier.choices[0].message.content.strip().upper()
    if tone not in VALID_TONES:
        tone = "NORMAL"  # Fail safe: unexpected classifier output falls back to the standard flow

    if tone in ("GRIEF", "CRISIS"):
        return get_compassionate_response(user_input)
    if tone == "CELEBRATION":
        return get_enthusiastic_response(user_input)
    return get_standard_response(user_input)

Lesson: Always pre-screen emotional tone before routing to personality-locked prompts.


Edge Case #2: The "My Pet Is Fine" Denial Loop

Users would describe clearly anxious behavior ("she hides under the bed every day, shakes constantly, won't eat") and then say "but she's totally fine, right?"

Our AI, trained to be agreeable, kept validating this: "It sounds like she has her own quirky personality!"

The fix: Behavioral anomaly detection with a gentle flag system.

ANXIETY_SIGNALS = [
    "hides", "shaking", "won't eat", "won't drink",
    "aggressive", "biting", "trembling", "lethargic"
]

def check_behavioral_flags(description: str) -> list[str]:
    flags = []
    lower = description.lower()
    for signal in ANXIETY_SIGNALS:
        if signal in lower:
            flags.append(signal)
    return flags

# In your response pipeline:
flags = check_behavioral_flags(user_input)
if len(flags) >= 2:
    prepend_vet_recommendation = True

Lesson: AI shouldn't just agree. Sometimes the kindest thing is a gentle reality check.
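The snippet above sets `prepend_vet_recommendation` but never shows what consumes it. Here's a minimal sketch of how the flag can feed the final response. The `VET_NOTE` wording and the `apply_behavioral_flags` name are illustrative, not our exact production code:

```python
VET_NOTE = (
    "Gentle heads-up: a couple of the behaviors you described can have "
    "medical causes. A quick vet visit is worth ruling those out."
)

def apply_behavioral_flags(response: str, flags: list[str]) -> str:
    # Two or more anxiety signals: surface the vet recommendation first,
    # then continue with the normal (still warm) response.
    if len(flags) >= 2:
        return f"{VET_NOTE}\n\n{response}"
    return response
```

The recommendation is prepended rather than appended so it can't get buried under three paragraphs of enrichment ideas.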


Edge Case #3: Token Cost Explosion from Long Pet Backstories

We let users build profiles for their pets. Cute idea. Expensive mistake.

One user wrote a 2,400-word backstory for her rabbit named Gerald. Every single API call included Gerald's entire lore in the context window.

Monthly token spend was 6x our projections.

The fix: Profile summarization pipeline.

def summarize_pet_profile(full_profile: str, max_tokens: int = 200) -> str:
    if len(full_profile.split()) < 150:
        return full_profile  # Short enough, keep as-is

    summary = openai.chat.completions.create(
        model="gpt-4o-mini",
        max_tokens=max_tokens,  # Pass the cap through so the summary actually respects it
        messages=[
            {
                "role": "system",
                "content": "Summarize this pet profile in under 150 words. Keep key personality traits, health issues, and behavioral patterns."
            },
            {"role": "user", "content": full_profile}
        ]
    )
    return summary.choices[0].message.content

# Cache the summary, re-generate monthly or on major updates

Lesson: Summarize long-lived context objects. Your wallet will thank you.
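That "cache the summary" comment deserves a concrete shape. A sketch of the cache we landed on, hedged: the helper names are ours, and the key idea is that hashing the profile text means any edit invalidates the entry automatically:

```python
import hashlib
import time

CACHE_TTL = 30 * 24 * 3600  # roughly monthly re-generation
_summary_cache: dict[str, tuple[float, str]] = {}

def _cache_key(pet_id: str, full_profile: str) -> str:
    # Hash the profile text so any edit produces a new key (automatic invalidation)
    digest = hashlib.sha256(full_profile.encode("utf-8")).hexdigest()[:16]
    return f"{pet_id}:{digest}"

def cached_summary(pet_id: str, full_profile: str, summarize) -> str:
    key = _cache_key(pet_id, full_profile)
    hit = _summary_cache.get(key)
    if hit is not None and time.time() - hit[0] < CACHE_TTL:
        return hit[1]
    summary = summarize(full_profile)  # the LLM call, paid only on a miss
    _summary_cache[key] = (time.time(), summary)
    return summary
```

In production you'd back this with Redis or your database rather than a module-level dict, but the invalidation logic is the same.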


Edge Case #4: The Multi-Pet Household Context Bleed

User has three pets: a calm senior dog, an anxious rescue cat, and a parrot who mimics the cat's stress sounds.

Our system was mixing up which advice belonged to which animal mid-conversation.

The fix: Strict pet-scoped conversation IDs.

from dataclasses import dataclass, field
from uuid import uuid4

@dataclass
class PetConversation:
    pet_id: str
    user_id: str
    # default_factory so each instance gets its own ID and history.
    # A plain default like str(uuid4()) is evaluated once at class
    # definition and silently shared by every instance.
    session_id: str = field(default_factory=lambda: str(uuid4()))
    history: list = field(default_factory=list)

    def add_turn(self, role: str, content: str):
        self.history.append({"role": role, "content": content})

    def get_context_window(self, max_turns: int = 10) -> list:
        # Always prepend the pet-specific system message so every window is scoped
        system = {"role": "system", "content": f"You are advising about pet {self.pet_id} only."}
        return [system] + self.history[-max_turns:]

# Never mix pet_id across sessions

Lesson: One pet = one context thread. Don't share session history across animals.
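To enforce that rule mechanically rather than by convention, we keep a registry keyed by `(user_id, pet_id)` so there is exactly one thread per animal. A sketch; `ConversationRegistry` is our name for it, not a library type:

```python
from uuid import uuid4

class ConversationRegistry:
    """Exactly one conversation thread per (user_id, pet_id) pair."""

    def __init__(self):
        self._threads: dict[tuple[str, str], dict] = {}

    def thread_for(self, user_id: str, pet_id: str) -> dict:
        # setdefault guarantees the same pet always maps to the same thread,
        # and two pets can never share history
        return self._threads.setdefault(
            (user_id, pet_id),
            {"session_id": str(uuid4()), "history": []},
        )
```

With this in place, the "which animal are we talking about?" question is answered at lookup time, not by hoping the LLM keeps them straight.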


Edge Case #5: The Anthropomorphism Spiral

This one is philosophical but had real product implications.

Users started attributing human emotions to their pets based on our AI outputs. "The AI said my cat feels betrayed when I work late."

We were generating emotionally resonant language that felt true — but wasn't scientifically grounded.

The fix: A calibration layer in the prompt.

PET_LANGUAGE_GUIDELINES = """
When describing pet emotions:
- Use behaviorally observable terms: 'may be experiencing stress', 'shows signs of anxiety'
- Avoid direct human emotion attribution: never say 'your cat feels jealous'
- Always include behavioral context: what the pet is DOING, not just FEELING
- When uncertain, acknowledge it: 'This behavior could indicate...'
"""

This reduced user over-projection AND improved trust — because being accurate felt more honest than being emotionally satisfying.

Lesson: In pet wellness AI, scientific calibration builds more trust than emotional resonance.
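For completeness, a sketch of how the guidelines actually reach the model: we append them to the end of the system prompt so they read as the final word on tone. `build_system_prompt` and `BASE_PROMPT` are illustrative names, not our exact prompt:

```python
BASE_PROMPT = "You are a warm, evidence-minded pet wellness assistant."

def build_system_prompt(pet_summary: str, guidelines: str) -> str:
    # Guidelines go last: in our experience, later instructions
    # tend to win ties on tone and phrasing
    return "\n\n".join([
        BASE_PROMPT,
        f"Pet profile (summarized):\n{pet_summary}",
        guidelines,
    ])
```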


The Architecture That Survived

After all these fixes, here's what our production stack looks like (simplified):

User Input
    ↓
[Tone Classifier] → GRIEF/CRISIS → Compassionate Flow
    ↓ NORMAL
[Behavioral Flag Detector] → Flags? → Vet Recommendation Prepend
    ↓
[Pet Profile Loader] → Summarized Profile (≤150 words)
    ↓
[Scoped Conversation Thread] → pet_id + session_id
    ↓
[Main LLM] + Pet Language Guidelines
    ↓
Response
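The same flow as a single function, with each stage injected as a callable so the control flow is visible in one place. This is a sketch of the diagram, not our production code, and every name in it is illustrative:

```python
def handle_message(user_input, *, classify, detect_flags, respond,
                   compassionate, vet_note):
    # Stage 1: tone classifier short-circuits straight to the compassionate flow
    tone = classify(user_input)
    if tone in ("GRIEF", "CRISIS"):
        return compassionate(user_input)

    # Stage 2: behavioral flags decide whether the vet note gets prepended
    flags = detect_flags(user_input)

    # Stages 3-5 (summarized profile, scoped thread, language guidelines)
    # all live inside `respond` in this sketch
    reply = respond(user_input)
    if len(flags) >= 2:
        reply = f"{vet_note}\n\n{reply}"
    return reply
```

The point of the callable-injection shape is testability: every branch of the pipeline can be exercised with lambdas, no API key required.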

Simple. Defensive. Surprisingly resilient.


What I'd Tell My Past Self

  1. Your AI will meet grief before it meets joy. Design for both from day one.
  2. Agreeable AI is dangerous AI. Build in friction where friction is kind.
  3. Token costs don't scale linearly with users. They scale with user enthusiasm.
  4. Multi-entity contexts are a nightmare. Enforce strict scoping early.
  5. Accuracy > resonance when stakes are emotional.

If you're building in the pet wellness space or anything emotionally adjacent — these lessons apply directly.

We're still learning at MyPetTherapist.com — an AI-powered companion for pet owners who want to understand their animals on a deeper level.

Drop your own edge case war stories in the comments. I want to know what broke yours. 🐾
