So I shipped a project that most people told me was a dumb idea: an AI that listens to your pet and helps you understand what's going on with them emotionally and behaviorally.
Spoiler: it turned out to be one of the most technically interesting things I've built.
Here's what I learned — including a few things that surprised even me.
The Problem Nobody Talks About
Pet owners spend billions every year on vet visits for issues that turn out to be behavioral, not medical. Anxiety, stress, boredom — these masquerade as physical symptoms. The average vet appointment is 15 minutes. There's no time for a deep behavioral interview.
I wanted to build something that could bridge that gap: a lightweight AI layer that helps owners understand their pet before (or instead of) that expensive trip.
The Stack: Simpler Than You Think
I see a lot of posts overengineering AI projects. Here's what actually powers this:
```python
# Core analysis pipeline
from openai import OpenAI
import json

client = OpenAI()

def analyze_pet_behavior(symptoms: list[str], species: str, age_years: float) -> dict:
    system_prompt = """
    You are a veterinary behavioral specialist AI.
    Analyze the provided symptoms and context.
    Return structured JSON with:
    - likely_cause (string)
    - severity (low|medium|high)
    - recommended_actions (list)
    - when_to_see_vet (bool)
    Be concise. Be accurate. Never diagnose medical conditions.
    """
    user_input = f"""
    Species: {species}
    Age: {age_years} years
    Reported behaviors: {', '.join(symptoms)}
    """
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # Fast + cheap for this use case
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_input},
        ],
    )
    return json.loads(response.choices[0].message.content)

# Example
result = analyze_pet_behavior(
    symptoms=["hiding under furniture", "not eating", "excessive grooming"],
    species="cat",
    age_years=3.5,
)
print(result)
# Output:
# {
#   "likely_cause": "Stress or environmental change",
#   "severity": "medium",
#   "recommended_actions": ["Create a safe space", "Maintain feeding schedule", "Reduce household noise"],
#   "when_to_see_vet": false
# }
```
That's the core. The magic isn't in the model call — it's in the prompt engineering and the guardrails.
The Part That Actually Took Time: Guardrails
The naive version of this project is a liability nightmare. An AI confidently misdiagnosing a sick animal is not a feature — it's a crisis.
Here's the guardrail layer I built:
```python
SEVERE_SYMPTOMS = [
    "seizure", "not breathing", "bleeding heavily",
    "paralyzed", "collapsed", "unconscious", "pale gums",
]

def is_emergency(symptoms: list[str]) -> bool:
    """Flag immediately life-threatening situations before AI analysis."""
    symptoms_lower = [s.lower() for s in symptoms]
    return any(
        any(severe in symptom for symptom in symptoms_lower)
        for severe in SEVERE_SYMPTOMS
    )

def safe_analyze(symptoms: list[str], species: str, age_years: float) -> dict:
    if is_emergency(symptoms):
        return {
            "emergency": True,
            "message": "⚠️ Please contact your emergency vet immediately.",
            "analysis": None,
        }
    result = analyze_pet_behavior(symptoms, species, age_years)
    # Always include a human-review flag for high severity
    if result.get("severity") == "high":
        result["note"] = "This assessment suggests professional consultation."
    return result
```
The rule: AI handles the 80% of behavioral questions that don't need a vet. Emergencies are hard-coded off-ramps.
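A detail that matters here: the check is substring-based and case-insensitive, so it trips on free-text phrasing like "had a seizure this morning", not just exact keyword matches. A quick self-contained demo (reproducing the check from above so it runs on its own):

```python
# Same emergency check as above, shown standalone to demonstrate
# that substring matching catches free-text phrasings.
SEVERE_SYMPTOMS = [
    "seizure", "not breathing", "bleeding heavily",
    "paralyzed", "collapsed", "unconscious", "pale gums",
]

def is_emergency(symptoms: list[str]) -> bool:
    symptoms_lower = [s.lower() for s in symptoms]
    return any(
        any(severe in symptom for symptom in symptoms_lower)
        for severe in SEVERE_SYMPTOMS
    )

print(is_emergency(["Had a SEIZURE this morning"]))   # caught despite casing/phrasing
print(is_emergency(["hiding under furniture"]))       # behavioral, goes to the AI
```

The trade-off is deliberate: substring matching over-triggers occasionally ("seizure-free for a year" would flag), and for a safety layer, false positives are the failure mode you want.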
What Surprised Me: Embeddings for Symptom Similarity
One unexpected win was using embeddings to cluster similar behavioral reports. When 50 golden retriever owners report similar anxious behaviors, that's a signal.
```python
from openai import OpenAI
import numpy as np

# `client` is the same OpenAI client from the analysis pipeline above

def get_symptom_embedding(symptom_text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=symptom_text,
    )
    return response.data[0].embedding

def cosine_similarity(a: list[float], b: list[float]) -> float:
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Find similar historical cases
def find_similar_cases(new_symptom: str, case_library: list[dict], top_k: int = 3):
    new_embedding = get_symptom_embedding(new_symptom)
    scored = []
    for case in case_library:
        sim = cosine_similarity(new_embedding, case["embedding"])
        scored.append({**case, "similarity": sim})
    return sorted(scored, key=lambda x: x["similarity"], reverse=True)[:top_k]
```
This let me surface: "47 other cats with similar symptoms improved when their owner did X" — which is way more compelling than generic advice.
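To see the ranking mechanics without hitting the API, here's an offline sketch: hand-made 3-d vectors stand in for the real `text-embedding-3-small` vectors (which are 1536-d), and the case IDs are invented for illustration.

```python
import numpy as np

def cosine_similarity(a, b):
    a, b = np.array(a, dtype=float), np.array(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings — in production these come from the embeddings API.
case_library = [
    {"id": "hiding + not eating",   "embedding": [0.9, 0.1, 0.0]},
    {"id": "excessive barking",     "embedding": [0.0, 0.2, 0.9]},
    {"id": "hiding + overgrooming", "embedding": [0.8, 0.3, 0.1]},
]
new_embedding = [0.85, 0.2, 0.05]  # pretend: "cat hiding, grooming a lot"

scored = sorted(
    ({**c, "similarity": cosine_similarity(new_embedding, c["embedding"])}
     for c in case_library),
    key=lambda x: x["similarity"],
    reverse=True,
)
top = [c["id"] for c in scored[:2]]
print(top)  # the two hiding-related cases rank above the barking one
```

Cosine similarity ignores vector magnitude and compares direction only, which is exactly what you want when comparing texts of different lengths.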
The Business Reality: Speed Over Perfection
I almost didn't ship this. I kept wanting to add:
- A fine-tuned model on veterinary literature
- Multi-modal audio analysis of vocalizations
- A RAG pipeline with 10,000 case studies
None of that was needed to launch. The MVP — GPT-4o-mini + structured prompts + hard guardrails — went live in a weekend.
Ship the 80%. The remaining 20% is future features, not launch blockers.
The site now handles real queries from real pet owners daily, and the feedback loop is teaching me exactly what to build next.
TL;DR — What Actually Works
| Approach | Reality |
|---|---|
| Fancy fine-tuned model | Overkill for MVP |
| GPT-4o-mini + good prompt | Ships in a day, works great |
| No guardrails | Liability nightmare |
| Hard-coded emergency off-ramp | Non-negotiable safety layer |
| RAG pipeline from day 1 | Premature optimization |
| Embeddings for similarity | Surprising win, add in v2 |
What's Next
If you're curious about the full product (not just the code), it's live at mypettherapist.com — real AI behavioral analysis for dogs and cats, built on exactly the principles above.
Happy to answer questions about the architecture in the comments. What would you build differently?
Have you built anything with AI + a niche domain (pets, plants, fitness)? Drop it below — I love seeing what people ship.