How I Built an Email-to-Linear Auto-Triage Agent with pydantic-ai and FastAPI
Support engineers at most companies share a quiet frustration: they spend a chunk of every morning doing work that feels robotic. Read email, decide what type it is, guess the priority, open Linear, create a ticket, paste in the details, and maybe ping someone on Slack if it looks urgent. The work itself is mechanical. The judgment it requires is not always trivial, but the process absolutely is.
I built a system that eliminates that loop using pydantic-ai, FastAPI, Gmail IMAP, the Linear API, and the Slack API. This article explains the architecture, the key code pattern, and the honest tradeoffs you should know before using something like this in production.
The Problem: Manual Triage Still Lives in Every Support Team
Here is what actually happens without automation: a support email arrives at 2:47 AM. It says something like "our entire checkout flow is broken, no orders are going through." It sits in a shared inbox. Someone sees it at 8 AM. They manually create a Linear ticket, label it P1, assign it to the on-call engineer, and then fire off a Slack message. By that point, checkout has been down for more than five hours with nobody working on it.
The frustrating part is that most teams have tried to fix this. Zapier rules break when email subjects change slightly. Regex-based classifiers require constant maintenance as new email patterns appear. Full LangChain pipelines feel like overkill and introduce significant prompt engineering overhead when all you need is a structured classification step.
The result: support teams manually drag emails into ticket systems because existing integrations are either too brittle or too heavy. What you actually need is a lightweight agent that can read an email, make a judgment call about its type and priority, and take structured action without requiring a custom rule for every new ticket category that emerges over time.
That gap is exactly what pydantic-ai is designed to close.
The Approach: Structured Outputs as the Glue Layer
The core insight here is that pydantic-ai lets you define exactly what you want an LLM to return, enforced at the library level. You are not hoping the model formats its response correctly. You are not parsing JSON out of a Markdown code block. The model's output is validated against a Pydantic model before your code ever sees it.
Here is why that matters for email triage specifically: classification is only useful if downstream systems can consume it reliably. Linear's API expects specific field types. Slack's alert logic needs a boolean or an enum, not a string that might say "critical" or "Critical" or "very urgent" depending on the day. Structured output makes the LLM behave like a typed function.
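To make the "critical" versus "Critical" problem concrete, here is a minimal sketch using only the standard library. The `Priority` enum mirrors the one defined later in this article; `parse_priority` and `is_valid_priority` are hypothetical helpers, shown only to illustrate what a typed boundary rejects.

```python
from enum import Enum


class Priority(str, Enum):
    P1 = "P1"
    P2 = "P2"
    P3 = "P3"
    P4 = "P4"


def parse_priority(raw: str) -> Priority:
    """Accept only exact enum values; anything else raises ValueError."""
    return Priority(raw)


def is_valid_priority(raw: str) -> bool:
    """Return True only when the string is one of the four enum values."""
    try:
        parse_priority(raw)
        return True
    except ValueError:
        return False
```

Free-text values such as "Critical" or "very urgent" never reach the routing code in this setup. They fail at the boundary, which is the behavior a schema-validated LLM output is meant to give you.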
The architecture is straightforward:
- FastAPI exposes a webhook endpoint that receives incoming email data (polled from Gmail via IMAP on a background scheduler).
- A pydantic-ai agent receives the raw email text, runs it through an LLM with a strict output schema, and returns a TriageResult object.
- The TriageResult is used to create a Linear issue via their GraphQL API.
- If priority is P1 or P2, a Slack alert fires to the on-call channel.
Why this over LangChain? LangChain's output parsers work, but they add layers of abstraction that obscure what is actually happening. When the parser fails in production, debugging is painful. pydantic-ai is closer to the metal: you define a Pydantic model, you get that model back. The failure modes are explicit and easy to handle.
Why FastAPI over a cron script? You get health check endpoints, async support, and easy deployment to any container environment. The IMAP polling runs as a background task, keeping the architecture clean and testable.
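As a rough illustration of the background polling idea, here is a framework-agnostic sketch built on asyncio alone. The names `poll_forever`, `fetch_unread`, and `handle_email` are assumptions for this example, not part of FastAPI or of the scaffold. In a FastAPI app you would typically start a coroutine like this at application startup and set the stop event at shutdown.

```python
import asyncio
from typing import Awaitable, Callable, Iterable, Optional


async def poll_forever(
    fetch_unread: Callable[[], Awaitable[Iterable[str]]],
    handle_email: Callable[[str], Awaitable[None]],
    interval_seconds: float = 60.0,
    stop: Optional[asyncio.Event] = None,
) -> None:
    """Fetch unread emails on a fixed interval until `stop` is set."""
    if stop is None:
        stop = asyncio.Event()
    while not stop.is_set():
        for raw_email in await fetch_unread():
            try:
                await handle_email(raw_email)
            except Exception as exc:  # one bad email must not kill the loop
                print(f"triage failed: {exc!r}")
        try:
            # Sleep for the interval, but wake early if shutdown is requested.
            await asyncio.wait_for(stop.wait(), timeout=interval_seconds)
        except asyncio.TimeoutError:
            pass
```

Injecting the fetch and handle functions keeps the loop testable without a real mailbox, which is the main practical benefit over a bare cron script.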
The Code Pattern: Defining the Agent with a Typed Output Schema
This is the piece developers need to understand before anything else. The entire system depends on this pattern working correctly.
from enum import Enum

from pydantic import BaseModel
from pydantic_ai import Agent


class TicketType(str, Enum):
    BUG = "bug"
    BILLING = "billing"
    FEATURE_REQUEST = "feature_request"
    OUTAGE = "outage"
    GENERAL = "general"


class Priority(str, Enum):
    P1 = "P1"
    P2 = "P2"
    P3 = "P3"
    P4 = "P4"


class TriageResult(BaseModel):
    ticket_type: TicketType
    priority: Priority
    summary: str  # one sentence, max 120 chars
    suggested_team: str  # e.g. "backend", "billing", "platform"
    requires_immediate_alert: bool


# Note: newer pydantic-ai releases rename result_type to output_type
# (and result.data to result.output); adjust to your installed version.
triage_agent = Agent(
    model="openai:gpt-4o-mini",
    result_type=TriageResult,
    system_prompt=(
        "You are a support triage agent. Given an email, classify it accurately. "
        "Mark requires_immediate_alert=True only for outages or data loss scenarios. "
        "Keep summary under 120 characters. Be conservative with P1: reserve it for "
        "confirmed production outages affecting multiple users."
    ),
)


async def triage_email(raw_email_text: str) -> TriageResult:
    """Classify one raw email and return the validated result."""
    result = await triage_agent.run(raw_email_text)
    return result.data
A few things worth explaining here:
result_type=TriageResult is where the magic lives. pydantic-ai passes the schema derived from this model to the LLM (by default as a tool definition rather than as free-form prompt text) and validates the response against it. If validation fails, it sends the error back to the model and retries, up to a configurable limit.
The requires_immediate_alert boolean is intentional. Keeping alert logic inside the LLM's classification means you can tune it through the system prompt rather than adding conditional branches in your routing code. Want to tighten or loosen the alert threshold? Update the prompt. No code changes needed.
The suggested_team field is a free string rather than an enum because team names vary by organization. You validate it loosely downstream before routing.
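What "validate it loosely" might look like in practice is sketched below. The team names come from the comment in the schema above; `KNOWN_TEAMS`, `FALLBACK_TEAM`, and `normalize_team` are hypothetical names, and the fallback queue is an assumption you would replace with your own.

```python
# Assumption: team names taken from the schema comment in this article.
KNOWN_TEAMS = {"backend", "billing", "platform"}
# Hypothetical catch-all queue for suggestions that match no known team.
FALLBACK_TEAM = "support"


def normalize_team(suggested: str) -> str:
    """Map a free-text team suggestion onto a known team, or fall back."""
    cleaned = suggested.strip().lower().removesuffix(" team")
    return cleaned if cleaned in KNOWN_TEAMS else FALLBACK_TEAM
```

The point of the fallback is that an unexpected suggestion still produces a routable ticket instead of an exception in the routing step.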
The Integration: Email In, Linear Out, Slack on Fire
The data flow looks like this:
Gmail IMAP poll (every 60s)
-> raw email extracted (subject + body)
-> FastAPI background task queued
-> pydantic-ai agent runs classification
-> TriageResult returned
-> Linear GraphQL mutation creates issue
-> if requires_immediate_alert: Slack webhook fires
-> email marked as read / label applied in Gmail
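The flow above can be sketched as one orchestration function. This is an illustration under assumptions, not the scaffold's actual code: `TriageStub` is a reduced stand-in for TriageResult, and the four callables (`triage`, `create_issue`, `send_alert`, `mark_processed`) are hypothetical seams for the agent, Linear, Slack, and Gmail steps.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable


@dataclass
class TriageStub:
    """Stand-in for TriageResult with only the fields this sketch reads."""
    summary: str
    requires_immediate_alert: bool


async def process_email(
    raw_email: str,
    triage: Callable[[str], Awaitable[TriageStub]],
    create_issue: Callable[[TriageStub], Awaitable[str]],
    send_alert: Callable[[TriageStub, str], Awaitable[None]],
    mark_processed: Callable[[], Awaitable[None]],
) -> str:
    """Run one email through the flow and return the created issue URL."""
    result = await triage(raw_email)
    issue_url = await create_issue(result)
    if result.requires_immediate_alert:
        await send_alert(result, issue_url)
    # Mark the email only after the issue exists, so a failure above
    # leaves it unread and it is picked up again on the next poll.
    await mark_processed()
    return issue_url
```

Marking the email last is a deliberate choice in this sketch: a crash midway costs you a retry rather than a silently dropped email.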
The Linear integration uses their GraphQL API. Creating an issue looks roughly like:
import httpx

LINEAR_API_URL = "https://api.linear.app/graphql"


async def create_linear_issue(result: TriageResult, team_id: str, api_key: str):
    # Linear's numeric scale: 1 = urgent, 2 = high, 3 = medium, 4 = low
    # (0 would mean "no priority").
    priority_map = {"P1": 1, "P2": 2, "P3": 3, "P4": 4}
    mutation = """
    mutation CreateIssue($title: String!, $description: String!,
                         $teamId: String!, $priority: Int!) {
      issueCreate(input: {
        title: $title,
        description: $description,
        teamId: $teamId,
        priority: $priority
      }) {
        issue { id url }
      }
    }
    """
    variables = {
        "title": result.summary,
        # Use .value so the text is "bug", not "TicketType.BUG", on every
        # Python version.
        "description": (
            f"Type: {result.ticket_type.value}\n"
            f"Suggested team: {result.suggested_team}"
        ),
        "teamId": team_id,
        "priority": priority_map[result.priority.value],
    }
    async with httpx.AsyncClient() as client:
        response = await client.post(
            LINEAR_API_URL,
            json={"query": mutation, "variables": variables},
            # Personal API keys are sent as-is; OAuth tokens need "Bearer ".
            headers={"Authorization": api_key},
        )
        # Surface HTTP-level failures instead of returning an error body.
        response.raise_for_status()
        return response.json()
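A GraphQL endpoint can report failures inside the JSON body (a top-level "errors" list) even when the HTTP request itself succeeds, so the parsed response is worth checking before you treat the issue as created. The helper below is a hypothetical sketch; `LinearError` and `extract_issue` are names invented for this example, and the expected shape follows the mutation's selection set (`issue { id url }`).

```python
from typing import Any


class LinearError(RuntimeError):
    """Raised when the GraphQL response does not contain a created issue."""


def extract_issue(payload: dict[str, Any]) -> dict[str, str]:
    """Return the issue dict ({'id': ..., 'url': ...}) or raise LinearError."""
    if payload.get("errors"):
        messages = "; ".join(
            err.get("message", "unknown error") for err in payload["errors"]
        )
        raise LinearError(messages)
    # Tolerate missing or null intermediate keys instead of raising KeyError.
    issue = ((payload.get("data") or {}).get("issueCreate") or {}).get("issue")
    if not issue:
        raise LinearError("issueCreate returned no issue")
    return issue
```

You would call this on the parsed JSON returned by the issue-creation function and pass the resulting URL on to the Slack alert.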
One gotcha worth knowing: Gmail IMAP with OAuth2 needs XOAUTH2 support (the IMAPClient library handles this for you) plus token refresh handling. If you rely on simple password authentication, which Google has been phasing out for standard accounts, auth can fail in some environments without an obvious error. Build in token refresh logic from day one, not as an afterthought.
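For reference, the XOAUTH2 handshake can also be done with the standard library's imaplib, which is the technique swapped in here instead of IMAPClient. This is a minimal sketch: `build_xoauth2_string` and `connect_gmail` are hypothetical helper names, and obtaining or refreshing the access token is deliberately out of scope.

```python
import imaplib


def build_xoauth2_string(user: str, access_token: str) -> bytes:
    """Build the SASL XOAUTH2 initial response (before base64 encoding)."""
    return f"user={user}\x01auth=Bearer {access_token}\x01\x01".encode()


def connect_gmail(user: str, access_token: str) -> imaplib.IMAP4_SSL:
    """Open an authenticated IMAP session with a current access token."""
    conn = imaplib.IMAP4_SSL("imap.gmail.com", 993)
    # imaplib base64-encodes whatever the callback returns.
    conn.authenticate(
        "XOAUTH2",
        lambda _challenge: build_xoauth2_string(user, access_token),
    )
    return conn
```

Because access tokens expire, a long-running poller would need to refresh the token and reconnect when authentication fails, which is the part that tends to get skipped in first drafts.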
Tradeoffs and Limitations
This architecture works well for well-defined triage scenarios, but it has real limitations you should understand before deploying it.
LLM cost at volume: If you are processing thousands of emails per day, even gpt-4o-mini adds up. For very high volume, you would want to add a fast pre-filter (keyword matching or a fine-tuned small model) before hitting the LLM classification step.
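A keyword pre-filter of the kind mentioned above could be as small as the sketch below. The two patterns are illustrative assumptions only; real ones would come from looking at what actually lands in your inbox, and `needs_llm` is a hypothetical name.

```python
import re

# Assumption: auto-replies announce themselves at the start of the subject.
AUTO_REPLY_SUBJECT = re.compile(
    r"^\s*(auto(matic)?[- ]reply|out of office)", re.IGNORECASE
)
# Assumption: bulk mail carries an unsubscribe link somewhere in the body.
NEWSLETTER_BODY = re.compile(r"\bunsubscribe\b", re.IGNORECASE)


def needs_llm(subject: str, body: str) -> bool:
    """Return False for emails that obviously need no triage."""
    if AUTO_REPLY_SUBJECT.search(subject):
        return False
    if NEWSLETTER_BODY.search(body):
        return False
    return True
```

The filter only decides whether to skip the LLM call. Anything it lets through still gets the full classification, so a miss costs one extra request, not a lost ticket.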
Hallucinated summaries: The summary field is free text generated by the model. Occasionally it will produce a summary that misrepresents the original email. This matters if your Linear issues are the system of record. Consider storing the raw email body as an attachment to the issue.
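One lightweight variation on that idea, sketched here as an assumption and not as the scaffold's behavior, is to append the verbatim email body to the issue description instead of uploading a separate attachment. `build_description` and `MAX_BODY_CHARS` are hypothetical names, and the cap is arbitrary.

```python
# Assumption: an arbitrary cap to keep issue descriptions readable.
MAX_BODY_CHARS = 10_000


def build_description(ticket_type: str, suggested_team: str, raw_body: str) -> str:
    """Combine the model's classification with the untouched email text."""
    body = raw_body.strip()
    if len(body) > MAX_BODY_CHARS:
        body = body[:MAX_BODY_CHARS] + "\n[truncated]"
    return (
        f"Type: {ticket_type}\n"
        f"Suggested team: {suggested_team}\n\n"
        "Original email (verbatim):\n"
        f"{body}"
    )
```

With the original text in the ticket, a reader can check the generated summary against the source without going back to the inbox.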
No threading awareness: The system treats each email as independent. Reply chains and escalations require additional logic that this template does not handle.
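If you wanted to add basic threading awareness, the standard email headers are a reasonable starting point. The sketch below uses Python's email package; `thread_root_id` is a hypothetical helper, and it assumes the sending client populates References or In-Reply-To, which not every client does.

```python
from email import message_from_string
from email.message import Message
from typing import Optional


def thread_root_id(msg: Message) -> Optional[str]:
    """Return the Message-ID this email's thread started from, or None."""
    references = msg.get("References", "").split()
    if references:
        # By convention the first entry is the oldest message in the thread.
        return references[0]
    in_reply_to = msg.get("In-Reply-To", "").strip()
    return in_reply_to or None
```

A non-None result could be used to look up an existing Linear issue and add a comment there, where a None result would create a new issue as before.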
When to choose something simpler: If your email types are genuinely stable (three or four categories that never change), a rule-based system with regex matching will be cheaper, faster, and more predictable. LLM classification earns its complexity when the input space is messy and evolving.
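For comparison, a rule-based classifier for a stable set of categories might look like the sketch below. The labels reuse the TicketType values from earlier; the patterns themselves are illustrative assumptions, and `classify` is a hypothetical name.

```python
import re

# Rules are checked in order, so earlier labels win on overlap.
RULES = [
    ("outage", re.compile(r"\b(down|outage|not loading)\b", re.IGNORECASE)),
    ("billing", re.compile(r"\b(invoice|refund|charge[ds]?)\b", re.IGNORECASE)),
    ("bug", re.compile(r"\b(error|crash(es|ed)?|broken)\b", re.IGNORECASE)),
]


def classify(text: str) -> str:
    """Return the first matching label, or 'general' when nothing matches."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return "general"
```

The limits show up quickly. With these patterns, the checkout email from the opening example ("our entire checkout flow is broken, no orders are going through") is labeled a bug and not an outage, which is exactly the kind of judgment call where the LLM step earns its cost.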
Get the Code and Share What You Build
I packaged this as an open-source scaffold on GitHub: https://github.com/Reactance0083/pydantic-ai-email-linear-auto-triage
The scaffold gives you the core structure: the pydantic-ai agent definition, the FastAPI app skeleton, and stub integrations for Linear and Slack.
The full production version with complete error handling, OAuth2 Gmail auth, retry logic, test coverage, and deployment docs is available here: https://reactance0083.gumroad.com/l/dcror
If you are already running something like this in production, or if you have hit edge cases I did not cover here (multi-language emails, CRM integration, SLA tracking), I would genuinely like to hear about it in the comments. The design decisions here are not the only valid ones, and the tradeoffs look different at different scales.