<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: parupati madhukar reddy</title>
    <description>The latest articles on DEV Community by parupati madhukar reddy (@parupati).</description>
    <link>https://dev.to/parupati</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2890899%2Ff8c95dc1-8f92-4d3b-9917-a3011dadf0b3.jpg</url>
      <title>DEV Community: parupati madhukar reddy</title>
      <link>https://dev.to/parupati</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/parupati"/>
    <language>en</language>
    <item>
      <title>The Token Tax Problem: How I Built a Super Memory Layer for AI Coding Assistants using LLM Wiki</title>
      <dc:creator>parupati madhukar reddy</dc:creator>
      <pubDate>Wed, 06 May 2026 21:40:49 +0000</pubDate>
      <link>https://dev.to/parupati/the-token-tax-problem-how-i-built-a-super-memory-layer-for-ai-coding-assistants-using-llm-wiki-3c5g</link>
      <guid>https://dev.to/parupati/the-token-tax-problem-how-i-built-a-super-memory-layer-for-ai-coding-assistants-using-llm-wiki-3c5g</guid>
      <description>&lt;h2&gt;
  
  
  The Token Tax Problem: How I Built a Super Memory Layer for AI Coding Assistants
&lt;/h2&gt;

&lt;h2&gt;
  
  
  We Solved the Wrong Problem First
&lt;/h2&gt;

&lt;p&gt;When AI coding assistants arrived, we celebrated. Faster delivery. Less repetitive work. Developers doing more meaningful things.&lt;/p&gt;

&lt;p&gt;Then the invoices arrived.&lt;/p&gt;

&lt;p&gt;Token utilization had quietly become one of the fastest-growing line items in engineering costs. Every session, every agent, every code suggestion — all of it burning through context tokens. And the root cause was embarrassingly simple: &lt;strong&gt;we were paying for AI tools to re-learn our codebase from scratch, over and over again.&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Round One: The Obvious Fixes
&lt;/h2&gt;

&lt;p&gt;We started with the basics. Things that genuinely helped:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Context window hygiene&lt;/strong&gt; — Being deliberate about &lt;em&gt;what&lt;/em&gt; goes into context rather than dumping entire file trees at every agent invocation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model switching&lt;/strong&gt; — Using faster, cheaper models for repetitive low-complexity tasks and reserving powerful models for architecture decisions and complex debugging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Preprocessed context&lt;/strong&gt; — Writing structured markdown instruction files that encode team conventions once and reuse them everywhere, instead of expecting agents to infer them from raw code&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scoped agents&lt;/strong&gt; — Purpose-built agents for specific tasks (test generation, code review, planning) rather than one general-purpose agent doing everything&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These helped. But they didn't solve the fundamental issue. Agents were still spending tokens &lt;em&gt;exploring&lt;/em&gt; the codebase before doing any real work.&lt;/p&gt;

&lt;p&gt;We needed something closer to a cache layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Idea: A Super Memory Layer
&lt;/h2&gt;

&lt;p&gt;The inspiration came from &lt;strong&gt;Andrej Karpathy's concept of the LLM Wiki&lt;/strong&gt; — the idea that an AI system benefits enormously from a persistent, structured knowledge index rather than re-reading raw source on every request.&lt;/p&gt;

&lt;p&gt;Think of it like &lt;strong&gt;CloudFront or Redis in front of your origin server&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Instead of every agent making expensive round trips into raw source code, they read from a pre-built knowledge graph. That graph becomes a &lt;strong&gt;shared memory layer&lt;/strong&gt; — a single source of architectural truth accessible by any AI tool: Copilot, Factory, Claude, Cursor, or whatever comes next.&lt;/p&gt;

&lt;p&gt;For the implementation, I used &lt;strong&gt;Graphify&lt;/strong&gt; (&lt;a href="https://github.com/safishamsi/graphify" rel="noopener noreferrer"&gt;github.com/safishamsi/graphify&lt;/a&gt;), an open-source tool that converts a codebase into a knowledge graph:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nodes&lt;/strong&gt; — functions, components, hooks, utilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Edges&lt;/strong&gt; — relationships between them (imports, calls, dependencies)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output&lt;/strong&gt; — a plain-language report, interactive visualization, and GraphRAG-ready JSON&lt;/li&gt;
&lt;/ul&gt;
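&lt;p&gt;To make this concrete, here is a minimal sketch of what such a graph looks like and how a script might summarize it. The field names (&lt;code&gt;nodes&lt;/code&gt;, &lt;code&gt;edges&lt;/code&gt;, &lt;code&gt;type&lt;/code&gt;) are assumptions about the JSON shape, not Graphify's documented schema; check your actual output.&lt;/p&gt;

```python
from collections import Counter

# A tiny graph in the general shape described above. Field names here
# are assumptions, not Graphify's documented schema.
graph = {
    "nodes": [
        {"id": "useAuth", "type": "hook"},
        {"id": "formatDate", "type": "utility"},
        {"id": "LoginForm", "type": "component"},
    ],
    "edges": [
        {"source": "LoginForm", "target": "useAuth", "type": "calls"},
        {"source": "LoginForm", "target": "formatDate", "type": "imports"},
    ],
}

def summarize(graph):
    """Count nodes and tally edge relationship types."""
    edge_types = Counter(e["type"] for e in graph["edges"])
    return len(graph["nodes"]), len(graph["edges"]), edge_types

n_nodes, n_edges, edge_types = summarize(graph)
print(n_nodes, n_edges, dict(edge_types))  # 3 2 {'calls': 1, 'imports': 1}
```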




&lt;h2&gt;
  
  
  The POC: Steps We Actually Followed
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Step 1 — Full Codebase Attempt (Hit a Wall)
&lt;/h3&gt;

&lt;p&gt;First instinct: run it on the entire codebase at once.&lt;/p&gt;

&lt;p&gt;The corpus (900+ files) immediately exceeded the tool's recommended limits. This is actually a healthy constraint — feeding an LLM a massive undifferentiated codebase produces poor graph quality anyway.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Large codebases need a per-module strategy.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  Step 2 — Module-by-Module Analysis
&lt;/h3&gt;

&lt;p&gt;We split the codebase by independent modules and ran the graph pipeline on each one separately.&lt;/p&gt;

&lt;p&gt;Each run was &lt;strong&gt;completely free&lt;/strong&gt; — Graphify's AST extraction is pure static analysis with zero LLM API calls. The graph structure emerged from the code itself:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Module&lt;/th&gt;
&lt;th&gt;Source Files&lt;/th&gt;
&lt;th&gt;Nodes&lt;/th&gt;
&lt;th&gt;Edges&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Module A&lt;/td&gt;
&lt;td&gt;354&lt;/td&gt;
&lt;td&gt;606&lt;/td&gt;
&lt;td&gt;1,599&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Module B&lt;/td&gt;
&lt;td&gt;318&lt;/td&gt;
&lt;td&gt;549&lt;/td&gt;
&lt;td&gt;1,501&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Module C&lt;/td&gt;
&lt;td&gt;166&lt;/td&gt;
&lt;td&gt;248&lt;/td&gt;
&lt;td&gt;509&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Module D&lt;/td&gt;
&lt;td&gt;108&lt;/td&gt;
&lt;td&gt;193&lt;/td&gt;
&lt;td&gt;514&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Module E&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;37&lt;/td&gt;
&lt;td&gt;60&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  Step 3 — Debugging the Tool Itself
&lt;/h3&gt;

&lt;p&gt;During a couple of runs, report generation failed due to API signature changes between Graphify versions. We patched the calls and kept moving.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Lesson:&lt;/strong&gt; Pin your open-source tooling versions. APIs shift.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  Step 4 — Merging Module Graphs
&lt;/h3&gt;

&lt;p&gt;With individual module graphs ready, we wrote a merge script to combine them into a single unified knowledge graph.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;First attempt had a subtle bug&lt;/strong&gt; — the script accidentally read the same module's extract file multiple times (once per module), producing a graph full of duplicates. We caught it because every module's node set turned out to be identical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fix:&lt;/strong&gt; Rebuilt the merge from each module's actual AST cache files, prefixing node IDs with the module name to prevent collisions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;node_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;module_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;::&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;original_node_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
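&lt;p&gt;A sketch of the corrected merge logic, with in-memory stand-ins for the per-module AST cache files (the real script reads them from disk; the dict shapes here are illustrative assumptions):&lt;/p&gt;

```python
def merge_module_graphs(module_graphs):
    """Merge per-module graphs into one unified graph, prefixing every
    node ID with its module name ('module::node') so identical local
    IDs from different modules cannot collide."""
    merged = {"nodes": [], "edges": []}
    seen = set()
    for module_name, graph in module_graphs.items():
        for node in graph["nodes"]:
            node_id = f"{module_name}::{node['id']}"
            if node_id in seen:  # guards against re-reading the same extract
                continue
            seen.add(node_id)
            merged["nodes"].append({**node, "id": node_id})
        for edge in graph["edges"]:
            merged["edges"].append({
                "source": f"{module_name}::{edge['source']}",
                "target": f"{module_name}::{edge['target']}",
            })
    return merged

# Two modules that both define a local 'utils' node: after merging they
# stay distinct ('module_a::utils' vs 'module_b::utils').
module_graphs = {
    "module_a": {"nodes": [{"id": "utils"}], "edges": []},
    "module_b": {"nodes": [{"id": "utils"}],
                 "edges": [{"source": "api", "target": "utils"}]},
}
merged = merge_module_graphs(module_graphs)
print(len(merged["nodes"]), len(merged["edges"]))  # 2 1
```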



&lt;p&gt;&lt;strong&gt;Correct merged result:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1,600+ unique nodes&lt;/strong&gt; across all modules&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;4,000+ edges&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;28 structural communities&lt;/strong&gt; detected&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero LLM tokens&lt;/strong&gt; consumed to build it&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Step 5 — Discovering the God Nodes
&lt;/h3&gt;

&lt;p&gt;The most valuable output wasn't the graph itself — it was what the graph &lt;strong&gt;revealed&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;God nodes&lt;/strong&gt; are the most connected abstractions in the codebase. The functions, utilities, and components that everything else depends on. Most experienced developers know these intuitively but have never seen them mapped explicitly.&lt;/p&gt;

&lt;p&gt;Once you know your god nodes, you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prioritise documentation specifically for these high-impact functions&lt;/li&gt;
&lt;li&gt;Instruct agents to proceed carefully whenever changes touch them&lt;/li&gt;
&lt;li&gt;Use them as &lt;strong&gt;architectural anchors&lt;/strong&gt; in any context window&lt;/li&gt;
&lt;/ul&gt;
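&lt;p&gt;God nodes fall out of a simple degree count over the edge list. A minimal sketch (the edge list below is hypothetical):&lt;/p&gt;

```python
from collections import Counter

def find_god_nodes(edges, top_n=3):
    """Rank nodes by total degree (incoming + outgoing edges); the most
    connected abstractions are the 'god nodes'."""
    degree = Counter()
    for edge in edges:
        degree[edge["source"]] += 1
        degree[edge["target"]] += 1
    return degree.most_common(top_n)

# Hypothetical edge list for illustration.
edges = [
    {"source": "LoginForm", "target": "apiClient"},
    {"source": "Dashboard", "target": "apiClient"},
    {"source": "Dashboard", "target": "useAuth"},
    {"source": "useAuth", "target": "apiClient"},
]
print(find_god_nodes(edges))  # apiClient ranks first with degree 3
```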




&lt;h3&gt;
  
  
  Step 6 — Wiring the Graph into Agent Instructions
&lt;/h3&gt;

&lt;p&gt;We updated the agent instruction files used by each tool (GitHub Copilot, Factory/droid, etc.) to point at the merged graph report as their primary architecture reference:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;Before answering architecture or codebase questions,
read the merged graph report at graphify-out/GRAPH_REPORT_MERGED.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This means any agent that loads these instructions &lt;strong&gt;starts with architectural knowledge already loaded&lt;/strong&gt; — without scanning source files to build that understanding itself.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A &lt;strong&gt;9KB markdown report&lt;/strong&gt; replacing several megabytes of source scanning. Every session.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  Step 7 — Running the Token Experiment
&lt;/h3&gt;

&lt;p&gt;To quantify the impact, we set up an A/B test. We commented out the graph instructions from all agent configuration files, then ran identical tasks in both configurations and compared token consumption.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="c"&gt;&amp;lt;!-- graphify section disabled for token utilization analysis
     re-enable when experiment is complete --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Results from the experiment will follow in a separate post.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Three Outputs: What Agents Actually Consume
&lt;/h2&gt;

&lt;p&gt;Every Graphify run produces three files, each serving a different consumer:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;File&lt;/th&gt;
&lt;th&gt;Typical Size&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;GRAPH_REPORT.md&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~9 KB&lt;/td&gt;
&lt;td&gt;Copilot, Cursor, any LLM reading markdown&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;graph.json&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~1 MB&lt;/td&gt;
&lt;td&gt;GraphRAG queries, programmatic traversal, MCP tools&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;graph.html&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~1 MB&lt;/td&gt;
&lt;td&gt;Human review, architecture walkthroughs&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For token efficiency, agents read only the markdown (~9KB). The JSON is available for tools that can query it selectively.&lt;/p&gt;




&lt;h2&gt;
  
  
  Honest Pros and Cons
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ✅ What Works
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero build cost&lt;/strong&gt; — AST extraction consumes no LLM tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool-agnostic&lt;/strong&gt; — Works with any tool that reads files (Copilot, Factory, Claude, Cursor)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shared memory&lt;/strong&gt; — One knowledge base, many consumers; no duplication of analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;God node awareness&lt;/strong&gt; — Agents automatically know which abstractions are highest-impact&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Community detection&lt;/strong&gt; — Related code clusters surface naturally without manual documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  ⚠️ What Doesn't
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stale risk&lt;/strong&gt; — Graph must be regenerated after structural changes; a stale graph actively misleads agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Velocity tension&lt;/strong&gt; — Codebases with rapid daily structural changes will find frequent regeneration expensive in &lt;em&gt;time&lt;/em&gt;, even if not in tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Corpus size limits&lt;/strong&gt; — Large repos must be split by module; cross-module edges are inferred, not extracted&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No semantic understanding&lt;/strong&gt; — AST-only extraction misses business intent and domain meaning; semantic extraction adds LLM cost&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Merge complexity&lt;/strong&gt; — Combining module graphs requires care; duplicate nodes and ID collisions are easy mistakes to make&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Suggestions to Take This Further
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Incremental updates&lt;/strong&gt; — Re-extract only changed files after each commit, not the full module&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automate in CI&lt;/strong&gt; — Regenerate affected module graphs as a post-merge pipeline step, triggered only when source files change&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Selective semantic enrichment&lt;/strong&gt; — Run LLM-assisted extraction only on shared utilities and god nodes, not on every file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Add a wiki navigation layer&lt;/strong&gt; — Generate a navigable &lt;code&gt;index.md&lt;/code&gt; so agents load only the relevant section of the graph rather than the full report&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit only the report&lt;/strong&gt; — The markdown report is the token-saver; the JSON and HTML can stay gitignored to avoid bloating the repo&lt;/li&gt;
&lt;/ol&gt;
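&lt;p&gt;Suggestions 1 and 2 boil down to mapping changed files to stale module graphs. A minimal sketch of that mapping step (the module roots and file paths are hypothetical; in CI the changed list would come from something like &lt;code&gt;git diff --name-only&lt;/code&gt;):&lt;/p&gt;

```python
def modules_to_regenerate(changed_files, module_roots):
    """Return the modules whose graphs are stale, so CI re-extracts
    only the affected modules instead of the whole codebase."""
    stale = set()
    for path in changed_files:
        for module, root in module_roots.items():
            if path.startswith(root):
                stale.add(module)
    return sorted(stale)

# Hypothetical inputs: two tracked modules, three changed files.
changed = ["src/auth/login.ts", "src/auth/token.ts", "docs/readme.md"]
roots = {"auth": "src/auth/", "billing": "src/billing/"}
print(modules_to_regenerate(changed, roots))  # ['auth']
```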




&lt;h2&gt;
  
  
  The Bigger Picture
&lt;/h2&gt;

&lt;p&gt;Token cost is the new technical debt of the AI-assisted development era.&lt;/p&gt;

&lt;p&gt;Every pattern that reduces it — pre-processed context, structured instructions, scoped agents — points in the same direction:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Give agents knowledge, not raw data.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A project knowledge graph is one concrete implementation of that principle. It is not magic, and it is not free to maintain. But as a cache layer between your codebase and your AI tools, it fundamentally changes the economics of agent-assisted development.&lt;/p&gt;

&lt;p&gt;The experiment is ongoing. I'll share the token comparison numbers once the A/B test wraps up.&lt;/p&gt;

&lt;p&gt;If you're working on similar token efficiency problems or have taken a different approach, I'd love to hear about it in the comments.&lt;/p&gt;




&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;🔧 &lt;a href="https://github.com/safishamsi/graphify" rel="noopener noreferrer"&gt;Graphify on GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💡 Concept inspiration: Andrej Karpathy — LLM Wiki&lt;/li&gt;
&lt;li&gt;📊 Token experiment results: coming soon&lt;/li&gt;
&lt;/ul&gt;




</description>
      <category>aiengineering</category>
      <category>productivity</category>
      <category>llm</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Tracing the agent flow in Openai-agents</title>
      <dc:creator>parupati madhukar reddy</dc:creator>
      <pubDate>Wed, 06 May 2026 21:39:18 +0000</pubDate>
      <link>https://dev.to/parupati/tracing-the-agent-flow-in-openai-agents-5e3k</link>
      <guid>https://dev.to/parupati/tracing-the-agent-flow-in-openai-agents-5e3k</guid>
      <description></description>
      <category>ai</category>
      <category>a2a</category>
      <category>openai</category>
      <category>agents</category>
    </item>
    <item>
      <title>AI Job Hunt Match Agent in n8n (Using AI_Job_Hunt_Agent_N8N)</title>
      <dc:creator>parupati madhukar reddy</dc:creator>
      <pubDate>Sat, 11 Apr 2026 02:48:24 +0000</pubDate>
      <link>https://dev.to/parupati/ai-job-hunt-match-agent-in-n8n-using-aijobhuntagentn8n-1fnh</link>
      <guid>https://dev.to/parupati/ai-job-hunt-match-agent-in-n8n-using-aijobhuntagentn8n-1fnh</guid>
      <description>&lt;p&gt;I updated my workflow to use the &lt;strong&gt;&lt;code&gt;AI_Job_Hunt_Agent_N8N&lt;/code&gt;&lt;/strong&gt; file as the source of truth.&lt;/p&gt;

&lt;p&gt;Instead of generating tailored resumes for every role, this version focuses on &lt;strong&gt;job-match intelligence&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;pull jobs&lt;/li&gt;
&lt;li&gt;compare JD vs resume profile&lt;/li&gt;
&lt;li&gt;score fit&lt;/li&gt;
&lt;li&gt;send ranked opportunities&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/parupati/AI_Job_Hunt_N8N" rel="noopener noreferrer"&gt;https://github.com/parupati/AI_Job_Hunt_N8N&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What the workflow does
&lt;/h2&gt;

&lt;p&gt;Every day at &lt;strong&gt;7:00 AM&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Scrapes fresh jobs from SerpAPI (&lt;code&gt;google_jobs&lt;/code&gt;) for &lt;strong&gt;AI / Sr Full Stack Engineer&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Loads a structured resume profile from a code node (summary, skills, experience, achievements)&lt;/li&gt;
&lt;li&gt;Sends each job description + resume profile to GPT-4o&lt;/li&gt;
&lt;li&gt;Parses AI response into structured fields like:

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;match_score&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;match_tier&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;apply_recommendation&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Sorts by score and selects the top 5 opportunities&lt;/li&gt;
&lt;li&gt;Sends a daily HTML email report with:

&lt;ul&gt;
&lt;li&gt;company&lt;/li&gt;
&lt;li&gt;role&lt;/li&gt;
&lt;li&gt;location&lt;/li&gt;
&lt;li&gt;posting time&lt;/li&gt;
&lt;li&gt;match %&lt;/li&gt;
&lt;li&gt;tier&lt;/li&gt;
&lt;li&gt;recommendation&lt;/li&gt;
&lt;li&gt;job link&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  n8n node flow
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Schedule Trigger&lt;/strong&gt; (daily at 7 AM)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP Request&lt;/strong&gt; (SerpAPI jobs endpoint)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code: Load Resume Data&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code: Prepare Jobs for Match Analysis&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI (GPT-4o)&lt;/strong&gt; for JD-vs-resume fit analysis&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Code: Parse &amp;amp; Enrich Result&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IF + Code&lt;/strong&gt; (filter/sort/top 5)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Aggregate&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gmail&lt;/strong&gt; (send report)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Local setup
&lt;/h2&gt;

&lt;p&gt;I run n8n with Docker:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;n8n&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;n8nio/n8n:latest&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;5678:5678"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Open:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;http://localhost:5678
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why this helped
&lt;/h2&gt;

&lt;p&gt;The workflow doesn’t auto-apply to jobs.&lt;br&gt;&lt;br&gt;
It automates job triage so I can spend time only on high-fit opportunities.&lt;/p&gt;

&lt;p&gt;This gave me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;daily ranked shortlist instead of random browsing&lt;/li&gt;
&lt;li&gt;consistent JD-vs-resume evaluation&lt;/li&gt;
&lt;li&gt;faster decision-making on where to apply&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Next improvements
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;enforce score threshold directly in IF node&lt;/li&gt;
&lt;li&gt;add company blacklist/whitelist&lt;/li&gt;
&lt;li&gt;generate optional cover note for top matches&lt;/li&gt;
&lt;li&gt;send Slack + email notifications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Happy to share my importable &lt;code&gt;AI_Job_Hunt_Agent_N8N.sanitized.json&lt;/code&gt; workflow and setup checklist if anyone wants it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>career</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Building a RAG System from Scratch: Turning Aviation Disruption Data into an AI-Powered Q&amp;A App</title>
      <dc:creator>parupati madhukar reddy</dc:creator>
      <pubDate>Mon, 09 Mar 2026 04:37:04 +0000</pubDate>
      <link>https://dev.to/parupati/building-a-rag-system-from-scratch-turning-aviation-disruption-data-into-an-ai-powered-qa-app-4e8n</link>
      <guid>https://dev.to/parupati/building-a-rag-system-from-scratch-turning-aviation-disruption-data-into-an-ai-powered-qa-app-4e8n</guid>
      <description>&lt;p&gt;I recently built a Retrieval-Augmented Generation (RAG) system that lets you ask natural language questions about the 2026 Iran-US conflict's impact on global civil aviation — and get accurate, source-backed answers in seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try the live demo:&lt;/strong&gt; &lt;a href="https://parupati.com/aviationRag" rel="noopener noreferrer"&gt;https://parupati.com/aviationRag&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Source code:&lt;/strong&gt; &lt;a href="https://github.com/parupati/IranUSAviationDisruptionRAG" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In this article, I'll walk through the architecture, the decisions I made, and what I learned along the way.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://www.kaggle.com/datasets/zkskhurram/global-civil-aviation-disruption2026-iranus-war" rel="noopener noreferrer"&gt;Global Civil Aviation Disruption 2026&lt;/a&gt; dataset on Kaggle contains 6 CSV files with 218 records covering airline financial losses, airport disruptions, airspace closures, flight cancellations, reroutes, and a timeline of conflict events.&lt;/p&gt;

&lt;p&gt;Raw CSV data isn't exactly user-friendly. If you wanted to know "Which airline suffered the most?" or "What airports in Iran were closed?", you'd have to manually dig through spreadsheets. I wanted to make this data conversational — ask a question, get a clear answer with sources.&lt;/p&gt;

&lt;p&gt;That's exactly what RAG does.&lt;/p&gt;




&lt;h2&gt;
  
  
  What is RAG?
&lt;/h2&gt;

&lt;p&gt;RAG (Retrieval-Augmented Generation) is a pattern that combines two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Retrieval&lt;/strong&gt; — Find the most relevant pieces of information from your data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generation&lt;/strong&gt; — Feed those pieces to an LLM to produce a human-readable answer&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key insight: instead of fine-tuning a model on your data (expensive, slow), you just give the LLM the right context at query time. The model doesn't need to "know" your data — it just needs to read it.&lt;/p&gt;
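&lt;p&gt;The retrieval half of that pattern can be sketched in a few lines: embed everything once, then rank stored chunks by cosine similarity to the query vector. The toy 3-dimensional vectors below stand in for real embedding-model output.&lt;/p&gt;

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, corpus, top_k=2):
    """Step 1 (retrieval): rank stored chunks by similarity to the query.
    Step 2 (generation) would paste the winners into the LLM prompt."""
    ranked = sorted(corpus, key=lambda c: cosine(query_vec, c["vec"]), reverse=True)
    return [c["text"] for c in ranked[:top_k]]

# Toy 3-d 'embeddings' standing in for real model output.
corpus = [
    {"text": "Emirates lost $4.2M daily", "vec": [0.9, 0.1, 0.0]},
    {"text": "Tehran airport closed",      "vec": [0.1, 0.9, 0.0]},
    {"text": "Flights rerouted via Egypt", "vec": [0.2, 0.2, 0.9]},
]
print(retrieve([1.0, 0.0, 0.1], corpus, top_k=1))  # the airline-loss chunk
```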




&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Here's what I built:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CSV Files (6 tables, 218 records)
  → Python ingestion script converts each row to natural language
  → HuggingFace sentence-transformers embeds each chunk (all-MiniLM-L6-v2)
  → ChromaDB stores the vectors locally
  → FastAPI serves the /query endpoint
  → Angular frontend provides the chat UI
  → Deployed on Hugging Face Spaces (Docker)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tech Stack
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Orchestration&lt;/td&gt;
&lt;td&gt;LangChain&lt;/td&gt;
&lt;td&gt;Mature RAG framework, pluggable components&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embeddings&lt;/td&gt;
&lt;td&gt;HuggingFace all-MiniLM-L6-v2&lt;/td&gt;
&lt;td&gt;Fast, runs on CPU, no GPU needed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vector Store&lt;/td&gt;
&lt;td&gt;ChromaDB&lt;/td&gt;
&lt;td&gt;Zero-config, file-based, perfect for small-medium datasets&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM&lt;/td&gt;
&lt;td&gt;OpenAI GPT-4o&lt;/td&gt;
&lt;td&gt;Best answer quality for generation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API&lt;/td&gt;
&lt;td&gt;FastAPI&lt;/td&gt;
&lt;td&gt;Async, auto-generates Swagger docs, production-ready&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Angular&lt;/td&gt;
&lt;td&gt;Integrated into my existing portfolio site&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Deployment&lt;/td&gt;
&lt;td&gt;Hugging Face Spaces (Docker)&lt;/td&gt;
&lt;td&gt;Free tier, auto-scaling, git-based deploys&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Interesting Part: Structured Data + RAG
&lt;/h2&gt;

&lt;p&gt;Most RAG tutorials use PDFs or text documents. My dataset was &lt;strong&gt;structured CSV data&lt;/strong&gt; — rows and columns, not paragraphs. This required an extra step: converting each row into a natural language sentence before embedding.&lt;/p&gt;

&lt;p&gt;For example, a row from &lt;code&gt;airline_losses_estimate.csv&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Emirates, UAE, 4200000, 18, 62, 2835200, 9180
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Becomes:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Emirates (UAE) faces an estimated daily financial loss of $4,200,000 USD due to the Iran-US conflict. 18 flights were cancelled and 62 were rerouted, incurring $2,835,200 in additional fuel costs. Approximately 9,180 passengers were impacted."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is important because embedding models understand natural language, not CSV columns. Each of the 6 CSV files has its own conversion function that produces a descriptive sentence with all the context needed for retrieval.&lt;/p&gt;




&lt;h2&gt;
  
  
  Building It: Step by Step
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Ingestion
&lt;/h3&gt;

&lt;p&gt;The ingestion script reads all 6 CSVs, converts each row to a natural language chunk, and stores it in ChromaDB with metadata (source file, category, original field values).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Each CSV file has a dedicated row-to-text converter
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;row_to_text_airline_losses&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;airline&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;country&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;) faces an estimated daily &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;financial loss of $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;estimated_daily_loss_usd&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;,.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; USD...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;218 documents across 6 categories — small enough to fit in a single ChromaDB collection, large enough to need proper retrieval.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Embedding
&lt;/h3&gt;

&lt;p&gt;I used &lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt; from HuggingFace's sentence-transformers. It produces 384-dimensional vectors and runs comfortably on CPU. No GPU, no cloud embedding API, no cost.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;embeddings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HuggingFaceEmbeddings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all-MiniLM-L6-v2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model_kwargs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;device&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cpu&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Retrieval + Generation
&lt;/h3&gt;

&lt;p&gt;At query time, the user's question is embedded with the same model, and ChromaDB returns the top-k most similar chunks. These chunks are injected into a prompt template and sent to GPT-4o:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;chain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;retriever&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;format_docs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;question&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;RunnablePassthrough&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;
    &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="nc"&gt;StrOutputParser&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The prompt instructs the model to act as an aviation intelligence analyst and to answer using ONLY the provided context, which keeps responses grounded in the retrieved documents rather than the model's prior knowledge.&lt;/p&gt;
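&lt;p&gt;As a sketch, a grounding prompt along these lines could look like the following (the project's exact wording may differ):&lt;/p&gt;

```python
# Hypothetical sketch of the grounding prompt template -- the wording the
# project actually uses may differ.
PROMPT_TEMPLATE = """You are an aviation intelligence analyst.
Answer the question using ONLY the context below.
If the context does not contain the answer, say you don't know.

Context:
{context}

Question: {question}
Answer:"""


def build_prompt(context: str, question: str) -> str:
    """Fill the template with retrieved chunks and the user's question."""
    return PROMPT_TEMPLATE.format(context=context, question=question)
```

&lt;p&gt;The filled-in prompt is what gets piped into the LLM step of the chain shown above.&lt;/p&gt;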

&lt;h3&gt;
  
  
  4. API
&lt;/h3&gt;

&lt;p&gt;FastAPI wraps the RAG pipeline into a clean REST endpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;POST /query
{
  "question": "Which airline had the highest financial loss?",
  "k": 5
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Response includes the answer and the source documents used to generate it — full transparency.&lt;/p&gt;
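&lt;p&gt;The response shape can be sketched as a plain dictionary (field names here are illustrative, not necessarily the endpoint's exact schema):&lt;/p&gt;

```python
# Illustrative response assembly -- field names are assumptions, not the
# API's exact schema.
def build_response(answer: str, source_docs: list[dict]) -> dict:
    """Bundle the generated answer with the chunks used to produce it."""
    return {
        "answer": answer,
        "sources": [
            {"content": d["content"], "category": d.get("category")}
            for d in source_docs
        ],
    }
```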

&lt;h3&gt;
  
  
  5. Deployment
&lt;/h3&gt;

&lt;p&gt;The entire system is containerized with Docker and deployed on Hugging Face Spaces (free tier). The vector store is built during the Docker build phase, so it's baked into the image — no cold-start database initialization.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Structured data needs extra love in RAG.&lt;/strong&gt; You can't just throw CSVs at an embedding model. Converting rows to natural language sentences dramatically improves retrieval quality.&lt;/p&gt;
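&lt;p&gt;The row-to-sentence step can be sketched like this (the column names mirror the loss figures shown earlier, but the exact CSV schema is an assumption):&lt;/p&gt;

```python
import csv
import io

# Illustrative row-to-sentence conversion; column names are assumptions
# modeled on the dataset's loss figures.
def row_to_sentence(row: dict) -> str:
    """Turn one structured record into prose an embedding model handles well."""
    return (
        f"On {row['date']}, {row['airline']} cancelled "
        f"{row['cancelled_flights']} flights, an estimated financial loss of "
        f"${int(row['estimated_daily_loss_usd']):,} USD."
    )

csv_text = (
    "date,airline,cancelled_flights,estimated_daily_loss_usd\n"
    "2026-03-01,Emirates,42,1500000\n"
)
sentences = [row_to_sentence(r) for r in csv.DictReader(io.StringIO(csv_text))]
```

&lt;p&gt;Each sentence then gets embedded as one document, instead of asking the model to make sense of raw comma-separated values.&lt;/p&gt;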

&lt;p&gt;&lt;strong&gt;2. You don't need a GPU for embeddings.&lt;/strong&gt; &lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt; runs in milliseconds on CPU for small datasets. Don't over-engineer the infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. ChromaDB is perfect for prototyping.&lt;/strong&gt; Zero config, runs embedded in your Python process, persists to disk. For 218 documents, it's instant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Hugging Face Spaces is underrated for API hosting.&lt;/strong&gt; Free Docker-based deployment with auto-generated URLs. The cold-start after inactivity (30-60 seconds) is the main trade-off.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Context-stuffing beats RAG for small data.&lt;/strong&gt; I also built a portfolio chatbot endpoint on the same API — it just stuffs the entire markdown file into the system prompt. No embeddings, no vector store. When your data fits in the context window, keep it simple.&lt;/p&gt;
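&lt;p&gt;Context-stuffing is as simple as it sounds; a minimal sketch (file path and message shape are illustrative):&lt;/p&gt;

```python
from pathlib import Path

# Illustrative context stuffing: put the whole document into the system
# prompt instead of retrieving chunks. Path and message shape are assumptions.
def build_messages(doc_path: str, question: str) -> list[dict]:
    """Build a chat payload with the entire file as system context."""
    doc = Path(doc_path).read_text(encoding="utf-8")
    return [
        {"role": "system",
         "content": f"Answer questions using this document:\n\n{doc}"},
        {"role": "user", "content": question},
    ]
```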




&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Live Demo:&lt;/strong&gt; &lt;a href="https://parupati.com/aviationRag" rel="noopener noreferrer"&gt;https://parupati.com/aviationRag&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example questions to try:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Which airline suffered the highest daily financial loss?"&lt;/li&gt;
&lt;li&gt;"What airports in Iran were closed?"&lt;/li&gt;
&lt;li&gt;"How many flights were cancelled from Dubai on March 1st?"&lt;/li&gt;
&lt;li&gt;"What was the aviation impact of the Natanz airstrike?"&lt;/li&gt;
&lt;li&gt;"Which countries closed their airspace and for how long?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Source Code:&lt;/strong&gt; &lt;a href="https://github.com/parupati/IranUSAviationDisruptionRAG" rel="noopener noreferrer"&gt;https://github.com/parupati/IranUSAviationDisruptionRAG&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API Docs:&lt;/strong&gt; &lt;a href="https://parupati-iran-us-aviation-rag.hf.space/docs" rel="noopener noreferrer"&gt;https://parupati-iran-us-aviation-rag.hf.space/docs&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Adding &lt;strong&gt;hybrid search&lt;/strong&gt; (vector + keyword) via Azure AI Search for better retrieval&lt;/li&gt;
&lt;li&gt;Exploring &lt;strong&gt;streaming responses&lt;/strong&gt; for a more interactive chat experience&lt;/li&gt;
&lt;li&gt;Evaluating retrieval quality with metrics like precision@k and MRR&lt;/li&gt;
&lt;/ul&gt;
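&lt;p&gt;Hybrid search fuses a vector ranking with a keyword ranking. One common recipe is reciprocal rank fusion, sketched here in plain Python (Azure AI Search performs its own fusion internally; this only illustrates the idea):&lt;/p&gt;

```python
# Reciprocal rank fusion (RRF): combine two ranked lists of document ids.
# k=60 is the conventional smoothing constant for RRF.
def rrf_fuse(vector_ranked: list[str], keyword_ranked: list[str],
             k: int = 60) -> list[str]:
    """Score each doc by summed 1/(k + rank) across both rankings."""
    scores: dict[str, float] = {}
    for ranking in (vector_ranked, keyword_ranked):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

&lt;p&gt;Documents that rank well in both lists float to the top, which is exactly the behavior you want when a query mixes semantic intent with exact terms like airport codes.&lt;/p&gt;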

&lt;p&gt;If you're building your first RAG system, start small — a few CSVs, a local vector store, and a cloud LLM. Get the pipeline working end-to-end, then optimize. The fundamentals transfer directly to production-scale systems.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Python, LangChain, ChromaDB, HuggingFace, OpenAI GPT-4o, FastAPI, Angular, and Hugging Face Spaces.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Connect with me on &lt;a href="https://www.linkedin.com/in/parupati/" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or check out more projects on &lt;a href="https://github.com/parupati" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>rag</category>
      <category>webdev</category>
      <category>python</category>
    </item>
    <item>
      <title>Building Production-Ready AI Agents with OpenAI Agents SDK and FastAPI</title>
      <dc:creator>parupati madhukar reddy</dc:creator>
      <pubDate>Mon, 20 Oct 2025 02:41:39 +0000</pubDate>
      <link>https://dev.to/parupati/building-production-ready-ai-agents-with-openai-agents-sdk-and-fastapi-abd</link>
      <guid>https://dev.to/parupati/building-production-ready-ai-agents-with-openai-agents-sdk-and-fastapi-abd</guid>
      <description>&lt;h2&gt;
  
  
  Overview
&lt;/h2&gt;

&lt;p&gt;This guide demonstrates how to leverage the OpenAI Agents SDK with FastAPI to create scalable, production-ready AI agent systems. The OpenAI Agents SDK provides a robust framework for building structured AI agents, while FastAPI offers high-performance API exposure with automatic documentation and validation.&lt;/p&gt;

&lt;h2&gt;
  
  
  🤖 OpenAI Agents SDK: The Foundation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Core Components
&lt;/h3&gt;

&lt;p&gt;The OpenAI Agents SDK provides several key abstractions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent&lt;/strong&gt;: the core AI entity with specific instructions and capabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runner&lt;/strong&gt;: the execution engine for running agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AgentOutputSchema&lt;/strong&gt;: structured output validation using Pydantic&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model Integration&lt;/strong&gt;: seamless integration with OpenAI's latest models&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  🏗️ Architecture Overview
&lt;/h2&gt;

&lt;p&gt;The system follows a clean separation of concerns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: Python-based AI agents orchestrated through FastAPI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Modern web application with responsive design&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Integration&lt;/strong&gt;: RESTful APIs connecting the two layers&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  🤖 Building AI Agents: The Core Pattern
&lt;/h2&gt;
&lt;h3&gt;
  
  
  1. Agent Definition Structure
&lt;/h3&gt;

&lt;p&gt;Every agent in the system follows a consistent pattern using the &lt;code&gt;agents&lt;/code&gt; framework:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;AgentOutputSchema&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentOutput&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Structured output schema for the agent&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;result_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="n"&gt;AGENT_INSTRUCTIONS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are a specialized AI agent that performs specific tasks.
Your instructions define the agent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s behavior and expertise.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;SpecializedAgent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;AGENT_INSTRUCTIONS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;AgentOutputSchema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentOutput&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;strict_json_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Real-World Example: Database Schema Agent
&lt;/h3&gt;

&lt;p&gt;Here's how the project implements a database schema generation agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;database_research.create_db_schema&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DatabaseSchema&lt;/span&gt;

&lt;span class="n"&gt;SCHEMA_GENERATION_INSTRUCTIONS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are a senior database architect specialized in creating database schemas 
from functional requirements. Analyze requirements and generate normalized 
database schemas with proper relationships and constraints.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;create_db_schema_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CreateDBSchemaAgent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SCHEMA_GENERATION_INSTRUCTIONS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;output_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;AgentOutputSchema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DatabaseSchema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;strict_json_schema&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Agent Execution Pattern
&lt;/h3&gt;

&lt;p&gt;Agents are executed using the &lt;code&gt;Runner&lt;/code&gt; pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Runner&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_database_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;requirements&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;DatabaseSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;input_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Functional Requirements:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;requirements&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;Runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;create_db_schema_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;input_text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;final_output_as&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DatabaseSchema&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🚀 Exposing Agents via FastAPI
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. FastAPI Server Setup
&lt;/h3&gt;

&lt;p&gt;The project uses FastAPI to create a production-ready API server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;HTTPException&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AI Agents API&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AgentRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;input_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/agent/process&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_with_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Execute agent logic
&lt;/span&gt;        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;run_agent_workflow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Multi-Agent Orchestration
&lt;/h3&gt;

&lt;p&gt;The system implements complex workflows that chain multiple agents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PRDResearchManager&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prd_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db_instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="c1"&gt;# Step 1: Extract requirements
&lt;/span&gt;        &lt;span class="n"&gt;requirements&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extract_functional_requirements&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prd_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Step 2: Generate database schema
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;requirements&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_database_schema&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;requirements&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;requirements&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db_instructions&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Step 3: Generate API contracts
&lt;/span&gt;        &lt;span class="n"&gt;contracts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_api_contracts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prd_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;requirements&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;requirements&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;contracts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;contracts&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Streaming Responses
&lt;/h3&gt;

&lt;p&gt;For long-running agent operations, the system supports streaming responses:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;fastapi.responses&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StreamingResponse&lt;/span&gt;

&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/research/stream&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;research_stream_endpoint&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ResearchRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_updates&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="n"&gt;manager&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ResearchManager&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;update&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;update&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;StreamingResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nf"&gt;generate_updates&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; 
        &lt;span class="n"&gt;media_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text/plain&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Cache-Control&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no-cache&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🔧 Configuration Management
&lt;/h2&gt;

&lt;p&gt;The system implements secure configuration management:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_openai_api_key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Try config file first
&lt;/span&gt;        &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;openai_api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Fallback to environment variable
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set_environment_variables&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_openai_api_key&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🎨 Frontend Integration
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Modern Web Interface
&lt;/h3&gt;

&lt;p&gt;The frontend provides an intuitive interface for interacting with AI agents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;callAgentAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`/api/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Example: Generate database schema&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;generateDatabaseSchema&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;callAgentAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;database&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;prd_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;prdContent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;db_instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;instructions&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nf"&gt;displayResults&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Proxy Architecture
&lt;/h3&gt;

&lt;p&gt;The UI server acts as a proxy to the AI agents backend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// UI Server (Node.js/Express)&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/database&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;http://localhost:8000/database&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🛠️ Key Agent Types in the System
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Research Agent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Purpose&lt;/strong&gt;: Web search and information gathering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt;: Search queries and research topics&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output&lt;/strong&gt;: Comprehensive research reports&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Database Schema Agent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Purpose&lt;/strong&gt;: Generate database schemas from requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt;: Functional requirements and constraints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output&lt;/strong&gt;: Complete database schema with tables and relationships&lt;/li&gt;
&lt;/ul&gt;
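
&lt;p&gt;To make the input shape concrete, here is a minimal sketch of the payload this agent receives, reusing the &lt;code&gt;prd_text&lt;/code&gt; and &lt;code&gt;db_instructions&lt;/code&gt; fields from the frontend example above. The helper function itself is hypothetical:&lt;/p&gt;

```python
# Hypothetical helper: builds the request body for the /database
# endpoint. The field names (prd_text, db_instructions) come from the
# frontend example; the validation is illustrative.
def build_database_payload(prd_text, db_instructions=""):
    if not prd_text.strip():
        raise ValueError("prd_text must not be empty")
    return {"prd_text": prd_text, "db_instructions": db_instructions}

payload = build_database_payload("Users can create and share notes.")
```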

&lt;h3&gt;
  
  
  3. Contract Generation Agent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Purpose&lt;/strong&gt;: Create OpenAPI specifications from database schemas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt;: Database schema and business requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output&lt;/strong&gt;: Complete OpenAPI 3.0.3 specification&lt;/li&gt;
&lt;/ul&gt;
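
&lt;p&gt;For orientation, this is the minimal envelope of an OpenAPI 3.0.3 document that a contract agent would populate with paths and schemas. Only the spec version comes from the article; the builder function is a sketch:&lt;/p&gt;

```python
# Illustrative skeleton of an OpenAPI 3.0.3 document. A real contract
# agent would fill in paths and component schemas derived from the
# database schema; this only shows the required top-level structure.
def openapi_skeleton(title, version="1.0.0"):
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": version},
        "paths": {},
        "components": {"schemas": {}},
    }

spec = openapi_skeleton("Notes API")
```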

&lt;h3&gt;
  
  
  4. Sequence Diagram Agent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Purpose&lt;/strong&gt;: Generate PlantUML sequence diagrams&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt;: Architecture requirements and constraints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output&lt;/strong&gt;: PlantUML diagram code&lt;/li&gt;
&lt;/ul&gt;
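
&lt;p&gt;LLM output for diagrams often needs light post-processing before rendering. A hypothetical cleanup step, assuming the agent returns raw PlantUML text:&lt;/p&gt;

```python
# Hypothetical post-processing: make sure the agent's PlantUML output
# is wrapped in @startuml/@enduml markers so a renderer accepts it.
def wrap_plantuml(body):
    body = body.strip()
    if not body.startswith("@startuml"):
        body = "@startuml\n" + body + "\n@enduml"
    return body

diagram = wrap_plantuml("User -> API: POST /database")
```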

&lt;h2&gt;
  
  
  📊 Benefits of This Architecture
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Scalability
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Each agent is independently deployable&lt;/li&gt;
&lt;li&gt;FastAPI provides async support for concurrent requests&lt;/li&gt;
&lt;li&gt;Streaming responses prevent timeouts on long operations&lt;/li&gt;
&lt;/ul&gt;
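
&lt;p&gt;The streaming point can be sketched with a plain async generator: yielding a long response in chunks keeps the connection alive instead of timing out. In the real backend such a generator would feed a FastAPI streaming response; here it is shown standalone with only the standard library:&lt;/p&gt;

```python
import asyncio

# Sketch of the streaming pattern: yield the result in fixed-size
# chunks, handing control back to the event loop between chunks.
async def stream_chunks(text, size=8):
    for i in range(0, len(text), size):
        yield text[i:i + size]
        await asyncio.sleep(0)  # give the event loop a turn

async def collect(gen):
    return [chunk async for chunk in gen]

chunks = asyncio.run(collect(stream_chunks("a long generated schema...", 8)))
```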

&lt;h3&gt;
  
  
  Maintainability
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Clear separation between agent logic and API exposure&lt;/li&gt;
&lt;li&gt;Consistent patterns across all agents&lt;/li&gt;
&lt;li&gt;Comprehensive error handling and logging&lt;/li&gt;
&lt;/ul&gt;
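
&lt;p&gt;One concrete form the "consistent patterns" idea can take is a shared error envelope, so the frontend handles every agent's failures the same way. The shape and field names below are illustrative, not taken from the project:&lt;/p&gt;

```python
# Hedged sketch of a uniform error envelope returned by every agent
# endpoint. Field names are illustrative.
def error_response(message, status_code=500):
    return {
        "status": "error",
        "status_code": status_code,
        "detail": message,
    }

resp = error_response("OpenAI API key missing", status_code=401)
```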

&lt;h3&gt;
  
  
  Flexibility
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Framework-agnostic OpenAPI specifications&lt;/li&gt;
&lt;li&gt;Multiple output formats (JSON, streaming, files)&lt;/li&gt;
&lt;li&gt;Easy to add new agents following established patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  🚀 Getting Started
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Set Up the Backend&lt;/strong&gt;:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;cd &lt;/span&gt;server
   pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt
   &lt;span class="nb"&gt;cp &lt;/span&gt;config.template.json config.json
   &lt;span class="c"&gt;# Add your OpenAI API key to config.json&lt;/span&gt;
   python server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="2"&gt;
&lt;li&gt;
&lt;strong&gt;Set Up the Frontend&lt;/strong&gt;:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;cd &lt;/span&gt;UI
   npm &lt;span class="nb"&gt;install
   &lt;/span&gt;npm start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol start="3"&gt;
&lt;li&gt;
&lt;strong&gt;Test the System&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;Navigate to &lt;code&gt;http://localhost:3000&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Try generating a database schema from requirements&lt;/li&gt;
&lt;li&gt;Explore the generated API contracts&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  💡 Best Practices Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Agent Design&lt;/strong&gt;: Keep agents focused on single responsibilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Handling&lt;/strong&gt;: Always provide meaningful error responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Configuration&lt;/strong&gt;: Use secure configuration management&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Design&lt;/strong&gt;: Follow REST conventions and provide clear documentation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend Integration&lt;/strong&gt;: Use proxy patterns for clean separation&lt;/li&gt;
&lt;/ol&gt;
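
&lt;p&gt;The configuration practice can be sketched as: read the key from &lt;code&gt;config.json&lt;/code&gt; (matching the setup steps above) but let an environment variable override it, so no secret is hard-coded. The function and the &lt;code&gt;openai_api_key&lt;/code&gt; field name are assumptions for illustration:&lt;/p&gt;

```python
import json
import os

# Illustrative config loader: environment variable wins, config.json
# is the fallback. The "openai_api_key" field name is an assumption.
def load_api_key(path="config.json"):
    key = os.environ.get("OPENAI_API_KEY")
    if key:
        return key
    with open(path) as f:
        return json.load(f)["openai_api_key"]
```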

&lt;p&gt;This architecture demonstrates how to build production-ready AI agent systems that can scale from prototype to enterprise deployment. The combination of structured agents, robust APIs, and modern web interfaces creates a powerful platform for AI-driven development workflows.&lt;/p&gt;

</description>
      <category>api</category>
      <category>openai</category>
      <category>python</category>
      <category>ai</category>
    </item>
  </channel>
</rss>
