<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: maryu0</title>
    <description>The latest articles on DEV Community by maryu0 (@maryu0).</description>
    <link>https://dev.to/maryu0</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1902798%2F745433e7-d6aa-4f0d-973d-fb5bcacf2b4b.png</url>
      <title>DEV Community: maryu0</title>
      <link>https://dev.to/maryu0</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/maryu0"/>
    <language>en</language>
    <item>
      <title>Python AsyncIO Explained: Coroutines, Tasks, Queues, Locks &amp; Semaphores with Examples</title>
      <dc:creator>maryu0</dc:creator>
      <pubDate>Fri, 05 Jun 2026 14:49:10 +0000</pubDate>
      <link>https://dev.to/maryu0/python-asyncio-explained-coroutines-tasks-queues-locks-semaphores-with-examples-533b</link>
      <guid>https://dev.to/maryu0/python-asyncio-explained-coroutines-tasks-queues-locks-semaphores-with-examples-533b</guid>
      <description>&lt;p&gt;Asynchronous programming is one of those topics that feels confusing until it suddenly clicks.&lt;/p&gt;

&lt;p&gt;When I first started learning Python's &lt;code&gt;asyncio&lt;/code&gt;, I understood the syntax but struggled to understand &lt;strong&gt;why&lt;/strong&gt; things behaved the way they did. Why does &lt;code&gt;await&lt;/code&gt; sometimes run sequentially? Why do we need tasks? When should we use locks, queues, or semaphores?&lt;/p&gt;

&lt;p&gt;To build a stronger intuition, I created a small repository of focused examples that explore the core concepts of &lt;code&gt;asyncio&lt;/code&gt; step by step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repository:&lt;/strong&gt; &lt;a href="https://github.com/maryu0/python-asyncio" rel="noopener noreferrer"&gt;https://github.com/maryu0/python-asyncio&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why AsyncIO?
&lt;/h2&gt;

&lt;p&gt;Traditional Python code executes one operation at a time.&lt;/p&gt;

&lt;p&gt;For CPU-heavy work, this is often fine. However, many modern applications spend most of their time waiting for external resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API calls&lt;/li&gt;
&lt;li&gt;Database queries&lt;/li&gt;
&lt;li&gt;File operations&lt;/li&gt;
&lt;li&gt;Network requests&lt;/li&gt;
&lt;li&gt;Message queues&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While the program is waiting, the CPU is mostly idle.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;asyncio&lt;/code&gt; allows Python to switch to other work during these waiting periods, improving efficiency for I/O-bound applications.&lt;/p&gt;




&lt;h2&gt;
  
  
  Learning Path
&lt;/h2&gt;

&lt;p&gt;The repository is organized from beginner-friendly concepts to more practical concurrency patterns.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Coroutines
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;coroutine.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The starting point of asyncio.&lt;/p&gt;

&lt;p&gt;You'll learn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How to create async functions&lt;/li&gt;
&lt;li&gt;What &lt;code&gt;await&lt;/code&gt; does&lt;/li&gt;
&lt;li&gt;How the event loop executes coroutines&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;greet&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;World&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;greet&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Understanding coroutines is the foundation for everything else.&lt;/p&gt;




&lt;h3&gt;
  
  
  2. Why Tasks Matter
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;Need_for_TASKS.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;One of the biggest beginner misconceptions is assuming that multiple &lt;code&gt;await&lt;/code&gt; statements automatically run concurrently.&lt;/p&gt;

&lt;p&gt;They don't.&lt;/p&gt;

&lt;p&gt;Consider:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;task1&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;task2&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;task3&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This executes sequentially.&lt;/p&gt;

&lt;p&gt;This example demonstrates why &lt;code&gt;asyncio.create_task()&lt;/code&gt; exists and how tasks enable concurrent execution.&lt;/p&gt;
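&lt;p&gt;To make the difference concrete, here is a minimal sketch that times both approaches. The &lt;code&gt;fetch()&lt;/code&gt; helper and the delays are assumptions for illustration and are not taken from the repository file; &lt;code&gt;asyncio.sleep&lt;/code&gt; stands in for real I/O.&lt;/p&gt;

```python
import asyncio
import time


async def fetch(name, delay):
    # Hypothetical stand-in for an I/O call; asyncio.sleep simulates the wait.
    await asyncio.sleep(delay)
    return name


async def run_sequentially():
    # Each await finishes before the next coroutine even starts.
    start = time.perf_counter()
    results = [await fetch("a", 0.2), await fetch("b", 0.2), await fetch("c", 0.2)]
    return results, time.perf_counter() - start


async def run_with_tasks():
    # create_task schedules all three coroutines on the event loop right away.
    start = time.perf_counter()
    tasks = [asyncio.create_task(fetch(name, 0.2)) for name in ("a", "b", "c")]
    results = [await task for task in tasks]
    return results, time.perf_counter() - start


if __name__ == "__main__":
    print(asyncio.run(run_sequentially()))  # roughly 0.6 s in total
    print(asyncio.run(run_with_tasks()))    # roughly 0.2 s in total
```

&lt;p&gt;Both versions return the same results. Only the total waiting time changes, because the tasks overlap their waits.&lt;/p&gt;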




&lt;h3&gt;
  
  
  3. Running Concurrent Work
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;tasks.py&lt;/code&gt;&lt;/p&gt;

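&lt;p&gt;Once tasks are introduced, multiple coroutines can make progress concurrently: while one is waiting at an &lt;code&gt;await&lt;/code&gt;, the event loop runs another on the same thread.&lt;/p&gt;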

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;t1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;span class="n"&gt;t2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;worker&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;t1&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;t2&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because both workers wait at the same time, the total runtime is close to that of the slower worker rather than the sum of both.&lt;/p&gt;




&lt;h3&gt;
  
  
  4. gather() and TaskGroup
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;gather.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;When managing multiple concurrent operations, Python provides powerful abstractions.&lt;/p&gt;

&lt;h4&gt;
  
  
  asyncio.gather()
&lt;/h4&gt;

&lt;p&gt;Run multiple awaitables concurrently and collect their results in a list, in the same order as the arguments.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nf"&gt;task1&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nf"&gt;task2&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nf"&gt;task3&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  TaskGroup
&lt;/h4&gt;

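&lt;p&gt;Introduced in Python 3.11, &lt;code&gt;asyncio.TaskGroup&lt;/code&gt; provides structured concurrency: the &lt;code&gt;async with&lt;/code&gt; block waits for every task it started, and if one task fails, the remaining tasks are cancelled.&lt;/p&gt;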

&lt;p&gt;This file compares both approaches and explains when each is useful.&lt;/p&gt;
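&lt;p&gt;As a rough side-by-side sketch (not the contents of &lt;code&gt;gather.py&lt;/code&gt;), the two approaches can look like this. The &lt;code&gt;square()&lt;/code&gt; worker is a hypothetical placeholder, and the &lt;code&gt;TaskGroup&lt;/code&gt; half assumes Python 3.11 or newer.&lt;/p&gt;

```python
import asyncio


async def square(n):
    # Hypothetical worker: pretend to wait on I/O, then return a value.
    await asyncio.sleep(0.01)
    return n * n


async def with_gather():
    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(square(1), square(2), square(3))


async def with_task_group():
    # TaskGroup (Python 3.11+) waits for every task when the block exits
    # and cancels the remaining tasks if one of them raises.
    async with asyncio.TaskGroup() as tg:
        tasks = [tg.create_task(square(n)) for n in (1, 2, 3)]
    return [task.result() for task in tasks]


if __name__ == "__main__":
    print(asyncio.run(with_gather()))  # [1, 4, 9]
    if hasattr(asyncio, "TaskGroup"):
        print(asyncio.run(with_task_group()))  # [1, 4, 9]
```

&lt;p&gt;A reasonable rule of thumb: &lt;code&gt;gather()&lt;/code&gt; is convenient when you just want a list of results, while &lt;code&gt;TaskGroup&lt;/code&gt; is the safer choice when a failure in one task should stop the others.&lt;/p&gt;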




&lt;h3&gt;
  
  
  5. Protecting Shared Resources
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;Lock.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Concurrency introduces a new challenge: &lt;strong&gt;race conditions&lt;/strong&gt;.&lt;/p&gt;

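&lt;p&gt;Coroutines only switch at &lt;code&gt;await&lt;/code&gt; points, but when a coroutine reads shared data, awaits something, and then writes the data back, another coroutine can change the data in between and the update is lost.&lt;/p&gt;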

&lt;p&gt;&lt;code&gt;asyncio.Lock&lt;/code&gt; ensures only one coroutine modifies a shared resource at a time.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;lock&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;shared_counter&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern is essential whenever multiple tasks update shared state.&lt;/p&gt;
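&lt;p&gt;Here is a small sketch of a lost update and the lock that prevents it. The function names and the &lt;code&gt;asyncio.sleep(0)&lt;/code&gt; call (used only to force a switch between the read and the write) are assumptions for illustration, not code from &lt;code&gt;Lock.py&lt;/code&gt;.&lt;/p&gt;

```python
import asyncio

counter = 0


async def unsafe_increment():
    global counter
    current = counter        # read the shared value
    await asyncio.sleep(0)   # yield to the event loop; other tasks run here
    counter = current + 1    # write back a possibly stale value


async def safe_increment(lock):
    global counter
    async with lock:
        # Only one coroutine at a time can run this read-await-write sequence.
        current = counter
        await asyncio.sleep(0)
        counter = current + 1


async def run_increments(n, use_lock):
    global counter
    counter = 0
    lock = asyncio.Lock()
    if use_lock:
        jobs = [safe_increment(lock) for _ in range(n)]
    else:
        jobs = [unsafe_increment() for _ in range(n)]
    await asyncio.gather(*jobs)
    return counter


if __name__ == "__main__":
    print(asyncio.run(run_increments(100, use_lock=False)))  # lost updates: less than 100
    print(asyncio.run(run_increments(100, use_lock=True)))   # 100
```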




&lt;h3&gt;
  
  
  6. Practical AsyncIO Patterns
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;Practice.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This file combines multiple concepts into realistic examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Concurrent execution with &lt;code&gt;gather&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Timeout handling using &lt;code&gt;wait_for&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Fallback strategies&lt;/li&gt;
&lt;li&gt;Async generators&lt;/li&gt;
&lt;li&gt;Streaming-style output&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These patterns are commonly used in production systems interacting with APIs and external services.&lt;/p&gt;
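&lt;p&gt;Two of these patterns fit in a short sketch: a timeout with a fallback value, and an async generator that streams chunks. The names &lt;code&gt;slow_service()&lt;/code&gt;, &lt;code&gt;fetch_with_fallback()&lt;/code&gt; and &lt;code&gt;stream_chunks()&lt;/code&gt; are hypothetical and may differ from what &lt;code&gt;Practice.py&lt;/code&gt; actually contains.&lt;/p&gt;

```python
import asyncio


async def slow_service():
    # Hypothetical slow dependency; the sleep simulates network latency.
    await asyncio.sleep(0.2)
    return "fresh"


async def fetch_with_fallback(timeout):
    try:
        # wait_for cancels slow_service() if the timeout expires.
        return await asyncio.wait_for(slow_service(), timeout=timeout)
    except asyncio.TimeoutError:
        return "cached"  # fallback value, purely illustrative


async def stream_chunks(chunks):
    # Async generator: yields one chunk at a time instead of a full list.
    for chunk in chunks:
        await asyncio.sleep(0.01)
        yield chunk


async def main():
    print(await fetch_with_fallback(0.05))  # cached
    print(await fetch_with_fallback(1.0))   # fresh
    async for chunk in stream_chunks(["Hel", "lo"]):
        print(chunk, end="")
    print()


if __name__ == "__main__":
    asyncio.run(main())
```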




&lt;h3&gt;
  
  
  7. Producer-Consumer Queues
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;queue.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Real-world systems often produce work faster than it can be processed.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;asyncio.Queue&lt;/code&gt; acts as a buffer between producers and consumers.&lt;/p&gt;

&lt;p&gt;Common use cases include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Job processing systems&lt;/li&gt;
&lt;li&gt;Event pipelines&lt;/li&gt;
&lt;li&gt;Background workers&lt;/li&gt;
&lt;li&gt;Message handling&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This example demonstrates how queues help smooth bursts of incoming work.&lt;/p&gt;
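&lt;p&gt;A minimal producer-consumer sketch, assuming two workers and a small bounded queue. The sizes, the doubling step and the function names are illustrative choices, not the repository's code.&lt;/p&gt;

```python
import asyncio


async def producer(queue, items):
    for item in items:
        # put() waits while the queue is full, which applies backpressure.
        await queue.put(item)


async def consumer(queue, results):
    while True:
        item = await queue.get()
        await asyncio.sleep(0.01)  # simulate processing time
        results.append(item * 2)
        queue.task_done()          # tell join() this item is finished


async def main():
    queue = asyncio.Queue(maxsize=2)
    results = []
    workers = [asyncio.create_task(consumer(queue, results)) for _ in range(2)]
    await producer(queue, range(5))
    await queue.join()             # wait until every item has been processed
    for worker in workers:
        worker.cancel()            # consumers loop forever, so stop them explicitly
    await asyncio.gather(*workers, return_exceptions=True)
    return sorted(results)


if __name__ == "__main__":
    print(asyncio.run(main()))  # [0, 2, 4, 6, 8]
```

&lt;p&gt;The bounded &lt;code&gt;maxsize&lt;/code&gt; is the part that smooths bursts: a fast producer is paused at &lt;code&gt;put()&lt;/code&gt; instead of growing the queue without limit.&lt;/p&gt;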




&lt;h3&gt;
  
  
  8. Limiting Concurrency with Semaphores
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;semaphore.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Sometimes running everything concurrently is actually a bad idea.&lt;/p&gt;

&lt;p&gt;Imagine sending 1,000 API requests simultaneously.&lt;/p&gt;

&lt;p&gt;You might:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hit rate limits&lt;/li&gt;
&lt;li&gt;Overload a service&lt;/li&gt;
&lt;li&gt;Consume excessive resources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;asyncio.Semaphore&lt;/code&gt; limits how many coroutines can be inside the guarded block at once.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;semaphore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Semaphore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;semaphore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;make_request&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern is extremely useful when working with external APIs.&lt;/p&gt;
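&lt;p&gt;To check that the limit really holds, this sketch starts ten simulated requests and records the peak number running at once. Here &lt;code&gt;make_request()&lt;/code&gt; is a hypothetical stand-in that sleeps instead of calling a real API, and the &lt;code&gt;state&lt;/code&gt; dictionary exists only for the measurement.&lt;/p&gt;

```python
import asyncio


async def make_request(i, semaphore, state):
    async with semaphore:
        # Track how many requests are inside the guarded block right now.
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # simulated network call
        state["active"] -= 1
    return i


async def main(total=10, limit=3):
    semaphore = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}
    jobs = [make_request(i, semaphore, state) for i in range(total)]
    results = await asyncio.gather(*jobs)
    return results, state["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(main())
    print(results)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(peak)     # 3
```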




&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;After working through these examples, a few ideas became much clearer:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Coroutines define asynchronous work.&lt;/li&gt;
&lt;li&gt;Tasks enable concurrent execution.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gather()&lt;/code&gt; and &lt;code&gt;TaskGroup&lt;/code&gt; help coordinate multiple tasks.&lt;/li&gt;
&lt;li&gt;Locks prevent race conditions.&lt;/li&gt;
&lt;li&gt;Queues provide buffering between producers and consumers.&lt;/li&gt;
&lt;li&gt;Semaphores prevent excessive concurrency.&lt;/li&gt;
&lt;li&gt;Async generators enable streaming-style workflows.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Most importantly, asyncio isn't about making code magically faster.&lt;/p&gt;

&lt;p&gt;It's about making better use of waiting time.&lt;/p&gt;




&lt;h2&gt;
  
  
  Repository Structure
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python-asyncio/
│
├── coroutine.py
├── Need_for_TASKS.py
├── tasks.py
├── gather.py
├── Lock.py
├── Practice.py
├── queue.py
├── semaphore.py
└── Concepts.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Recommended Learning Order
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;code&gt;coroutine.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Need_for_TASKS.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tasks.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;gather.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Lock.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Practice.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;queue.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;semaphore.py&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Concepts.md&lt;/code&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Following this order helps build intuition gradually, from basic coroutines to advanced concurrency control patterns.&lt;/p&gt;




&lt;h2&gt;
  
  
  Who Is This For?
&lt;/h2&gt;

&lt;p&gt;This repository is intended for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Python beginners learning asyncio&lt;/li&gt;
&lt;li&gt;Students exploring concurrent programming&lt;/li&gt;
&lt;li&gt;Developers preparing for backend engineering&lt;/li&gt;
&lt;li&gt;Anyone who wants hands-on asyncio practice before using frameworks or production systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The examples are intentionally small and educational, focusing on clarity rather than production architecture.&lt;/p&gt;




&lt;h2&gt;
  
  
  Explore the Repository
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/maryu0/python-asyncio" rel="noopener noreferrer"&gt;https://github.com/maryu0/python-asyncio&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're learning asyncio, I'd love to know:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Which asyncio concept was the hardest for you to understand when you first started?&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;#python&lt;/code&gt; &lt;code&gt;#asyncio&lt;/code&gt; &lt;code&gt;#beginners&lt;/code&gt; &lt;code&gt;#programming&lt;/code&gt;&lt;/p&gt;

</description>
      <category>resources</category>
      <category>api</category>
      <category>python</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>I built an AI debugging assistant with Llama 3.3 — here's what actually worked</title>
      <dc:creator>maryu0</dc:creator>
      <pubDate>Fri, 15 May 2026 19:13:56 +0000</pubDate>
      <link>https://dev.to/maryu0/i-built-an-ai-debugging-assistant-with-llama-33-heres-what-actually-worked-ind</link>
      <guid>https://dev.to/maryu0/i-built-an-ai-debugging-assistant-with-llama-33-heres-what-actually-worked-ind</guid>
      <description>&lt;p&gt;Every developer has been there. It's 2am, your CI pipeline is red, and you're staring at a wall of error logs trying to figure out which of the 47 things that could be wrong is actually wrong.&lt;/p&gt;

&lt;p&gt;That pain is what made me build &lt;strong&gt;FailSense&lt;/strong&gt; — an AI debugging assistant that ingests error logs and returns ranked, actionable fixes using Llama 3.3. Here's an honest breakdown of what I built, the mistakes I made, and what I'd do differently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;~40% reduction in debugging time · ~99% uptime on AWS · 2 services, one pipeline&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem with debugging + LLMs
&lt;/h2&gt;

&lt;p&gt;The naive approach is obvious: dump the error into ChatGPT and hope for the best. It kind of works. But it breaks down quickly when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your error spans multiple files and stack frames&lt;/li&gt;
&lt;li&gt;The root cause is buried 3 levels deep in a dependency&lt;/li&gt;
&lt;li&gt;You need ranked fixes, not a monologue&lt;/li&gt;
&lt;li&gt;You want this in your own pipeline, not a chat UI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I decided to build something purpose-built for error log analysis, with structured output, confidence-ranked fixes, and a real deployment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Architecture: keep it boring
&lt;/h2&gt;

&lt;p&gt;The stack is deliberately simple. Two services. One job each.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Next.js (Frontend) → FastAPI (Backend) → Llama 3.3 via Groq
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Next.js frontend handles log input and renders ranked fixes. The FastAPI backend owns all the prompt logic, output parsing, and error handling. Llama 3.3 runs on Groq for low-latency inference — this matters more than you'd think when users are already frustrated.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Lesson learned:&lt;/strong&gt; Don't add a third service just because you can. Every hop between services is a new failure point, a new auth layer, and a new thing to monitor at 2am.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The prompt that actually works
&lt;/h2&gt;

&lt;p&gt;This took the most iteration. The first version just said "here's an error, fix it." The output was verbose, unstructured, and hard to parse programmatically. Here's the version that works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
You are a senior software engineer debugging production errors.
Given an error log, return ONLY a JSON array of fixes, ranked by likelihood.
Each fix must have:
  - rank (int): 1 = most likely cause
  - cause (str): one sentence root cause
  - fix (str): exact steps to resolve
  - confidence (float): 0.0 to 1.0

Return nothing else. No preamble. No markdown. Raw JSON only.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things made this work:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Explicit output format&lt;/strong&gt;: telling the model to return raw JSON (not markdown-wrapped JSON) saved me a ton of parsing headaches&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Role framing&lt;/strong&gt;: "senior software engineer" shifts the model toward precise, opinionated output over safe hedging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ranked by likelihood&lt;/strong&gt;: forcing a ranking means the most actionable fix is always first, which is what a tired developer actually wants&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Parsing LLM output without going insane
&lt;/h2&gt;

&lt;p&gt;LLMs are not deterministic JSON machines. Sometimes Llama 3.3 returns perfect JSON. Sometimes it adds a sentence before it. Sometimes the confidence is a string instead of a float. Here's the defensive parsing layer I built:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;parse_fixes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Strip markdown fences if present
&lt;/span&gt;    &lt;span class="n"&gt;clean&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;```(?:json)?|```&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;fixes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clean&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Try to extract the JSON array from within a larger string
&lt;/span&gt;        &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;\[.*\]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;clean&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DOTALL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;fixes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;match&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;group&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;match&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# Normalize confidence to float
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;confidence&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rank&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Hot take:&lt;/strong&gt; If you're not writing a fallback parser for LLM output, you're writing a bug. Models drift, prompts drift, and what works today breaks next month.&lt;/p&gt;
&lt;/blockquote&gt;
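&lt;p&gt;In that spirit, here is a slightly more defensive variant as a sketch. It is not the parser FailSense ships, and the name &lt;code&gt;parse_fixes_safely()&lt;/code&gt; is hypothetical. It assumes the same JSON shape as the prompt above and additionally tolerates a non-numeric confidence, a missing rank, and a fallback match that still isn't valid JSON.&lt;/p&gt;

```python
import json
import re


def parse_fixes_safely(raw):
    # Remove markdown code fences (three backticks, optionally tagged json).
    clean = re.sub(r"`{3}(?:json)?", "", raw).strip()
    try:
        fixes = json.loads(clean)
    except json.JSONDecodeError:
        # Fall back to the first "[ ... ]" span inside a larger string.
        match = re.search(r"\[.*\]", clean, re.DOTALL)
        try:
            fixes = json.loads(match.group()) if match else []
        except json.JSONDecodeError:
            fixes = []
    if not isinstance(fixes, list):
        return []

    normalized = []
    for position, fix in enumerate(fixes, start=1):
        if not isinstance(fix, dict):
            continue
        try:
            confidence = float(fix.get("confidence", 0.5))
        except (TypeError, ValueError):
            confidence = 0.5  # same default as a missing confidence
        try:
            rank = int(fix.get("rank", position))
        except (TypeError, ValueError):
            rank = position   # keep the model's original ordering
        normalized.append({**fix, "rank": rank, "confidence": confidence})
    return sorted(normalized, key=lambda fix: fix["rank"])


if __name__ == "__main__":
    reply = 'Sure: [{"rank": 1, "cause": "a", "fix": "x", "confidence": "high"}]'
    print(parse_fixes_safely(reply))
```

&lt;p&gt;Returning an empty list instead of raising keeps the API response predictable. Whether a silent default is the right call for a bad confidence value is a judgment call, and logging those cases would be a sensible addition.&lt;/p&gt;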




&lt;h2&gt;
  
  
  Deployment: boring is good
&lt;/h2&gt;

&lt;p&gt;Next.js on Vercel. FastAPI on Railway. Both wired up with GitHub Actions for CI/CD. Every push to main triggers a deploy. The whole thing costs under $5/month to run.&lt;/p&gt;

&lt;p&gt;The ~99% uptime wasn't magic — it was just not doing anything clever. No custom load balancers, no exotic infra. Just two managed services that restart themselves when they crash.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd do differently
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Add evals from day one.&lt;/strong&gt; I had no systematic way to know if a prompt change made things better or worse. I was eyeballing it. Don't eyeball it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stream the response.&lt;/strong&gt; Waiting 3-4 seconds for the full JSON response feels slow. Streaming partial results, even just a loading state with intermediate tokens, makes it feel snappy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log everything.&lt;/strong&gt; What errors are users pasting in? What fixes are they ignoring? This data is gold for improving the prompt, and I threw it away by not logging it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The takeaway
&lt;/h2&gt;

&lt;p&gt;Building production AI tools is less about the model and more about the scaffolding around it. The prompt, the output parser, the fallback handling, the latency — that's where the real engineering happens.&lt;/p&gt;

&lt;p&gt;FailSense isn't magic. It's a well-prompted LLM with a defensive parser and a boring deployment. That's enough to cut debugging time by ~40% and actually ship something people use.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Check out the full source on &lt;a href="https://github.com/maryu0/FailSense.git" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; · Built with Next.js, FastAPI, Groq, and Llama 3.3&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>productivity</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
