<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: 박준희</title>
    <description>The latest articles on DEV Community by 박준희 (@junhee916).</description>
    <link>https://dev.to/junhee916</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3964655%2F447f5509-845c-4cd0-8de8-a2cf635e18bb.jpg</url>
      <title>DEV Community: 박준희</title>
      <link>https://dev.to/junhee916</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/junhee916"/>
    <language>en</language>
    <item>
      <title>Cloud TTS Chirp3-HD with Caching: Fixing Voice Readout for Accessibility</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Tue, 23 Jun 2026 10:10:04 +0000</pubDate>
      <link>https://dev.to/junhee916/cloud-tts-chirp3-hd-with-caching-fixing-voice-readout-for-accessibility-3paf</link>
      <guid>https://dev.to/junhee916/cloud-tts-chirp3-hd-with-caching-fixing-voice-readout-for-accessibility-3paf</guid>
      <description>&lt;h2&gt;
  
  
  Cloud TTS Chirp3-HD with Caching: Fixing Voice Readout for Accessibility
&lt;/h2&gt;

&lt;p&gt;As a solo developer, I treat keeping the product lean and accessible as paramount. A recent request highlighted a critical need: having AI chat responses read aloud. This wasn't just about adding a feature; it was about making the platform usable for someone with a visual impairment, specifically a user's mother-in-law who struggles to read text on screen. The initial thought was simple text-to-speech (TTS), but implementing it well, especially on a single small VM, presented several engineering challenges.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Genesis: A Need for Voice
&lt;/h3&gt;

&lt;p&gt;The request was clear: "When I ask a question via text, if I don't have time to check it, let me hear the answer via voice." This immediately told me it wasn't about real-time conversational voice, but rather a playback feature for existing text responses. This distinction is crucial for architecture and cost management.&lt;/p&gt;

&lt;h3&gt;
  
  
  Design Iterations: From Browser to Cloud
&lt;/h3&gt;

&lt;p&gt;The first instinct might be to use the browser's built-in TTS capabilities. However, user feedback quickly shut that down: the native browser voices sounded too robotic and were actively disliked. This pointed towards a cloud-based neural TTS engine. The key requirements that emerged after a couple of back-and-forth discussions were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Natural Voice:&lt;/strong&gt; The voice needed to be human-like and pleasant to listen to.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Cleaning:&lt;/strong&gt; Tables, code blocks, and other non-prose elements in the AI's response would create noise if read directly. These needed to be cleaned or handled gracefully.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficiency:&lt;/strong&gt; Repeatedly generating the same audio for the same response was wasteful, both in terms of processing time and cost. A caching mechanism was essential.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Engine Selection: Benchmarking Latency and Quality
&lt;/h3&gt;

&lt;p&gt;I evaluated three cloud TTS options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vertex AI Gemini TTS:&lt;/strong&gt; Offered a natural voice but had higher latency, taking between 7 and 19 seconds per generation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloud TTS Chirp3-HD:&lt;/strong&gt; This engine matched Gemini's naturalness while generating audio much faster, in around 2 seconds. It used the &lt;code&gt;ko-KR-Chirp3-HD-Charon/Kore&lt;/code&gt; voices, which were consistent with other natural voices used in the product.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Neural2:&lt;/strong&gt; This was the fastest at around 0.5 seconds but sounded noticeably more like a traditional TTS engine.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Given the balance of speed, naturalness, and voice consistency, &lt;strong&gt;Cloud TTS Chirp3-HD was the clear winner&lt;/strong&gt;. The implementation would involve using its REST API with a service account (SA) that had the &lt;code&gt;cloud-platform&lt;/code&gt; scope for authentication, and storing generated audio in Google Cloud Storage (GCS).&lt;/p&gt;
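
&lt;p&gt;As a rough illustration of what a call to the Cloud Text-to-Speech REST API involves, the sketch below builds a &lt;code&gt;text:synthesize&lt;/code&gt; request body and decodes the base64 &lt;code&gt;audioContent&lt;/code&gt; field of the response. The helper names and the choice of the Kore voice as the default are assumptions for illustration, not the project's actual code; sending the request with a service account bearer token is left out.&lt;/p&gt;

```python
import base64

# Public REST endpoint for Cloud Text-to-Speech synthesis.
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(text, voice_name="ko-KR-Chirp3-HD-Kore"):
    """Build the JSON body for text:synthesize (hypothetical helper)."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": "ko-KR", "name": voice_name},
        "audioConfig": {"audioEncoding": "MP3"},
    }


def decode_audio(response_json):
    """Turn the base64 audioContent field of the response into MP3 bytes."""
    return base64.b64decode(response_json["audioContent"])
```

&lt;p&gt;The decoded bytes are what would be written to GCS and later returned to the client.&lt;/p&gt;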

&lt;h3&gt;
  
  
  Implementation Details: Cleaning, Caching, and Delivery
&lt;/h3&gt;

&lt;p&gt;The implementation involved several components:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Text Cleaning:&lt;/strong&gt; A new function, &lt;code&gt;clean_for_tts&lt;/code&gt;, was introduced. This function preprocesses the AI's response to remove or rephrase elements that don't translate well to speech. For instance, code blocks and tables would be replaced with a message like "Please refer to the screen for tables and code." Links, emphasis, and list markers were also stripped to leave only plain text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching Logic:&lt;/strong&gt; A cache was implemented using GCS. When a request for audio comes in, the cleaned text is hashed. If an MP3 file with that hash exists under &lt;code&gt;tts-cache/&lt;/code&gt; in GCS, it's served immediately. This is a cache HIT: it incurs no extra cost and doesn't deduct from the daily quota. If the file doesn't exist (a cache MISS), the Cloud TTS API is called to generate the audio. The generated audio is then saved to GCS and logged in the &lt;code&gt;usage_logs&lt;/code&gt; table with &lt;code&gt;source='tts'&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost Management:&lt;/strong&gt; To prevent abuse and manage costs on my small VM, a daily cap (&lt;code&gt;TTS_DAILY_CAP&lt;/code&gt;) was set. This cap applies only to MISSes, not HITs. The cost for TTS generation is approximately $30 per 1 million characters.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Endpoint:&lt;/strong&gt; A new API endpoint, &lt;code&gt;POST /chat/tts&lt;/code&gt;, was created. This endpoint is publicly accessible (as it's an accessibility feature) and returns the audio data as a base64 encoded string. It handles cache HITs and MISSes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nginx Configuration:&lt;/strong&gt; To ensure the new &lt;code&gt;/api/chat/tts&lt;/code&gt; endpoint was correctly routed and not intercepted by other services (like SSE), an exact match configuration was added to Nginx, drawing from lessons learned from previous video upload handling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend Integration:&lt;/strong&gt; A "Listen to response" (🔊) button was added to each chat bubble. This button, visible by default for accessibility, triggers the API call. Upon receiving the audio data, it's played back using the browser's &lt;code&gt;new Audio()&lt;/code&gt; constructor. The user's selected voice preference is also respected.&lt;/li&gt;
&lt;/ul&gt;
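
&lt;p&gt;The cleaning and caching steps above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the production code: the regular expressions are a guess at what &lt;code&gt;clean_for_tts&lt;/code&gt; might strip, SHA-256 is one possible hash, including the voice name in the key is an assumption, a plain dict stands in for the GCS store, and &lt;code&gt;synthesize&lt;/code&gt; is a hypothetical callable wrapping the Cloud TTS call.&lt;/p&gt;

```python
import hashlib
import re

SCREEN_NOTICE = "Please refer to the screen for tables and code."
FENCE = "`" * 3  # Markdown code fence, built here to keep this sketch fence-safe

_CODE_BLOCK = re.compile(re.escape(FENCE) + r".*?" + re.escape(FENCE), re.DOTALL)
_TABLE = re.compile(r"^(?:\|.*\n?)+", re.MULTILINE)
_LINK = re.compile(r"\[([^\]]+)\]\([^)]+\)")
_LIST_MARKER = re.compile(r"^\s*(?:[-*+]|\d+\.)\s+", re.MULTILINE)
_HEADING = re.compile(r"^#+\s*", re.MULTILINE)


def clean_for_tts(markdown_text):
    """Reduce a Markdown chat response to plain prose suitable for speech."""
    text = _CODE_BLOCK.sub("\n" + SCREEN_NOTICE + "\n", markdown_text)
    text = _TABLE.sub(SCREEN_NOTICE + "\n", text)  # runs of lines starting with "|"
    text = _LINK.sub(r"\1", text)                  # keep the link label, drop the URL
    text = _LIST_MARKER.sub("", text)
    text = _HEADING.sub("", text)
    text = re.sub(r"[*`]", "", text)               # emphasis and inline-code markers
    return re.sub(r"\s+", " ", text).strip()


def cache_key(cleaned_text, voice):
    """Derive the object name from the cleaned text (voice included by assumption)."""
    digest = hashlib.sha256((voice + "\n" + cleaned_text).encode("utf-8")).hexdigest()
    return "tts-cache/" + digest + ".mp3"


def get_audio(text, voice, store, synthesize, used_today, daily_cap):
    """Return (audio_bytes, was_hit); only a MISS counts against the daily cap."""
    cleaned = clean_for_tts(text)
    key = cache_key(cleaned, voice)
    if key in store:
        return store[key], True   # cache HIT: no API call, no quota used
    if used_today >= daily_cap:
        raise RuntimeError("daily TTS cap reached")
    audio = synthesize(cleaned, voice)
    store[key] = audio            # save so the next request is a HIT
    return audio, False
```

&lt;p&gt;Hashing the cleaned text rather than the raw response means two responses that differ only in formatting share one cached file.&lt;/p&gt;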

&lt;h3&gt;
  
  
  A Hidden Bug: Date Encoding Woes
&lt;/h3&gt;

&lt;p&gt;During testing, a critical bug surfaced in the daily quota tracking. The system was failing to record usage for sessions that timed out or had no speech output, causing a potential leak in the daily cap. The cause was how dates were handled in the database query that counts daily generations. Specifically, passing a date as a Python string to a &lt;code&gt;$2::date&lt;/code&gt; parameter made &lt;code&gt;asyncpg&lt;/code&gt; raise a &lt;code&gt;DataError&lt;/code&gt;, because &lt;code&gt;asyncpg&lt;/code&gt; expects a &lt;code&gt;datetime.date&lt;/code&gt; object for a date-typed parameter and does not convert strings itself. The fix was to perform the date calculation directly within SQL using &lt;code&gt;(now() AT TIME ZONE 'Asia/Seoul')::date&lt;/code&gt; instead of relying on string conversion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lesson learned:&lt;/strong&gt; When using &lt;code&gt;asyncpg&lt;/code&gt; for date comparisons, it's safer to perform date calculations within SQL or ensure you're passing proper date objects, rather than relying on string casting, especially across different time zones.&lt;/p&gt;
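
&lt;p&gt;A minimal sketch of the two safe options follows. The &lt;code&gt;usage_logs&lt;/code&gt; table and &lt;code&gt;source='tts'&lt;/code&gt; come from the description above; the &lt;code&gt;created_at&lt;/code&gt; column, the function names, and the query shape are assumptions for illustration. &lt;code&gt;conn&lt;/code&gt; is assumed to be an &lt;code&gt;asyncpg&lt;/code&gt; connection.&lt;/p&gt;

```python
from datetime import date, datetime, timedelta, timezone

# Korea Standard Time is a fixed UTC+9 offset with no daylight saving time.
KST = timezone(timedelta(hours=9))

# Option A: let PostgreSQL compute "today in Seoul", so no date parameter is passed.
# The created_at column name is an assumption.
COUNT_TODAY_SQL = """
SELECT count(*)
FROM usage_logs
WHERE source = $1
  AND (created_at AT TIME ZONE 'Asia/Seoul')::date
      = (now() AT TIME ZONE 'Asia/Seoul')::date
"""


def seoul_today(now=None):
    """Option B: build a real datetime.date to pass as a date parameter."""
    now = now or datetime.now(timezone.utc)
    return now.astimezone(KST).date()


async def count_tts_today(conn):
    """Count today's TTS generations; conn is assumed to be an asyncpg connection."""
    return await conn.fetchval(COUNT_TODAY_SQL, "tts")
```

&lt;p&gt;With Option B, the value returned by &lt;code&gt;seoul_today()&lt;/code&gt; can be bound to a date parameter directly, whereas a string such as &lt;code&gt;"2026-06-23"&lt;/code&gt; would be rejected.&lt;/p&gt;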

&lt;h3&gt;
  
  
  Honest Limitations
&lt;/h3&gt;

&lt;p&gt;While this feature significantly enhances accessibility, it's important to be transparent about its limitations:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The voice used is a natural neural voice, not the device's default.&lt;/li&gt;
&lt;li&gt;Tables and code blocks are skipped with a "Please refer to the screen" message.&lt;/li&gt;
&lt;li&gt;Automatic playback is not enabled to prevent accidental costs and ensure user control.&lt;/li&gt;
&lt;li&gt;While cache HITs are free and unlimited, generating new audio deducts from the daily quota.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Furthermore, a note was added that any replies generated by the user's request would require explicit approval before being sent, a standard procedure for user-facing replies.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;...building aicoreutility.com in the open...&lt;/em&gt;&lt;/p&gt;

</description>
      <category>tts</category>
      <category>cloudtts</category>
      <category>caching</category>
      <category>a11y</category>
    </item>
    <item>
      <title>Documenting the AI Feature Development Journey and Lessons Learned: Sharing Practical Experience</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Mon, 22 Jun 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/junhee916/documenting-the-ai-feature-development-journey-and-lessons-learned-sharing-practical-experience-3a4g</link>
      <guid>https://dev.to/junhee916/documenting-the-ai-feature-development-journey-and-lessons-learned-sharing-practical-experience-3a4g</guid>
      <description>&lt;p&gt;Documenting AI Feature Development History and Lessons Learned: Sharing Practical Experience&lt;/p&gt;

&lt;p&gt;I felt the need to systematically document the history of AI feature development and the lessons learned along the way. In particular, I wanted the project's Claude-related documentation to record each feature addition and the insights gained while building it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;At first, I just wanted to list the features. But I realized that documenting the problems encountered and the solutions found during the development of each feature would help me avoid getting stuck if I faced similar situations later.&lt;/p&gt;

&lt;p&gt;For example, while developing the budget management audit feature, unexpected data consistency issues arose.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example code (not actual working code)
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;audit_budget&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_date&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_transactions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;start_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;expense&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;total_income&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;amount&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;transactions&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;income&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;total_spent&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;total_income&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.2&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# Excessive spending detection logic
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;warning&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;There&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s a risk of exceeding the budget.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ok&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Budget management is good&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;During this process, there was a bug in the transaction aggregation logic for a specific date range, and it took me over 3 hours to find it.&lt;/p&gt;

&lt;p&gt;As I added various features like TTS integration, drawing board functionality, separating the guide hub, improving the settings modal, developing voice tone selection, and showing remaining voice time, I encountered unexpected problems at each stage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Causes
&lt;/h2&gt;

&lt;p&gt;The problems encountered during development mainly stemmed from the following reasons:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;AI-UI-Tutorial Mismatch:&lt;/strong&gt; It was difficult to properly explain the complexity of AI features through the UI or tutorials.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemini Voice Tone Consistency Issues:&lt;/strong&gt; There were difficulties in consistently managing the various voice tones of the Gemini model.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native Audio Voice XOR Input Transcription Conflicts:&lt;/strong&gt; Conflicts occurred between certain audio libraries and the voice input transcription feature.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Solutions
&lt;/h2&gt;

&lt;p&gt;Ultimately, the solution was to systematically document the history of major feature development and the lessons learned during development in the Claude-related documentation. For each feature, I included the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Feature Name:&lt;/strong&gt; Clearly stated which feature was developed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Development Background:&lt;/strong&gt; Explained why this feature was necessary.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Development Process:&lt;/strong&gt; Outlined the main problems encountered and the solutions attempted.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Final Solution:&lt;/strong&gt; Provided the actual working code and the reasoning behind it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lessons Learned:&lt;/strong&gt; Detailed what was learned while developing this feature.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As an example, the conflict issues encountered while developing the voice tone selection feature were resolved as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example code (not actual working code)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;some_audio_library&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AudioSynthesizer&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;speech_recognition_library&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SpeechRecognizer&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;VoiceManager&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;synthesizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AudioSynthesizer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recognizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SpeechRecognizer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;available_voices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;synthesizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_available_voices&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;set_voice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;voice_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;voice_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;available_voices&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;synthesizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_voice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;voice_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Voice tone set to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;voice_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error: Voice tone &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;voice_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; not found.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;speak&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;synthesizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;speak&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;recognize_speech&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;recognizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recognize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Separating voice tone settings and voice output to prevent conflicts
&lt;/span&gt;&lt;span class="n"&gt;voice_manager&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VoiceManager&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;voice_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_voice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gemini-standard&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Set Gemini voice tone
&lt;/span&gt;&lt;span class="n"&gt;voice_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;speak&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Hello.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Voice output
&lt;/span&gt;&lt;span class="n"&gt;recognized_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;voice_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recognize_speech&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# Voice input transcription
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This code clearly separated &lt;code&gt;AudioSynthesizer&lt;/code&gt; and &lt;code&gt;SpeechRecognizer&lt;/code&gt; so they wouldn't interfere with each other's operations, and prevented conflicts by managing voice tone settings and voice output in separate methods.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The history of AI feature development and the lessons learned during the development process have been systematically documented in the Claude-related documentation.&lt;/li&gt;
&lt;li&gt;The problems encountered and solutions found during the development of each feature are now clear, allowing for quicker responses to similar issues in the future.&lt;/li&gt;
&lt;li&gt;Insights gained during development (AI-UI-Tutorial mismatch, Gemini voice tone consistency, native-audio voice XOR input transcription conflicts, etc.) have been documented, contributing to knowledge sharing within the team.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary — How to Avoid Falling into the Same Traps
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] When developing new AI features, check from the initial stages how well they fit the UI/UX and tutorials.&lt;/li&gt;
&lt;li&gt;[ ] When using multiple AI models (especially for voice), establish a strategy beforehand to maintain consistency in voice tones and output.&lt;/li&gt;
&lt;li&gt;[ ] When integrating external libraries or SDKs, test with the possibility of conflicts with existing features in mind.&lt;/li&gt;
&lt;li&gt;[ ] Make it a habit to document in detail the problems, dead ends (wasted effort), and solutions encountered during development.&lt;/li&gt;
&lt;li&gt;[ ] When documenting, strive to present generalized causes and solutions rather than relying on specific infrastructure or app names.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>claude</category>
    </item>
    <item>
      <title>Reducing Unnecessary Costs by Automatically Ending Voice Call Sessions</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Sat, 20 Jun 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/junhee916/reducing-unnecessary-costs-by-automatically-ending-voice-call-sessions-10g6</link>
      <guid>https://dev.to/junhee916/reducing-unnecessary-costs-by-automatically-ending-voice-call-sessions-10g6</guid>
      <description>&lt;p&gt;Ever faced the issue of voice call sessions staying active unnecessarily, leading to unexpected costs? This often happens, especially in Live session environments. In this post, I'll share my experience implementing automatic session termination for silence or explicit end-of-call signals to reduce costs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;Initially, I thought of a simple logic: if there's no speech for a certain period, just terminate the session. But it turned out to be trickier than I expected. It was hard to distinguish between a user briefly pausing their speech and them actually wanting to end the call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Initial attempt: Simple silence timeout
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_silence_and_terminate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_activity_time&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout_duration&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;last_activity_time&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;total_seconds&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;timeout_duration&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Session &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; terminated due to silence.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Session termination logic...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this code, there was a risk of the session being terminated even when the user was just catching their breath or taking a moment to think. Ultimately, I spent about 3 hours fiddling with this approach without achieving satisfactory results.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Root Cause
&lt;/h2&gt;

&lt;p&gt;The core of the problem was understanding 'user intent.' Relying solely on the absence of speech led to false positives. We needed to differentiate between cases where the user clearly expressed their intention to end the call (e.g., "I'll hang up," "End") and cases where they simply remained silent.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;So, in the end, I implemented the session termination logic by combining two conditions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Silence Detection&lt;/strong&gt;: When there's no voice input for a specific duration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Explicit End-of-Call Intent Detection&lt;/strong&gt;: Detecting speech containing specific keywords (e.g., "종료" (end), "끊어" (hang up), "안녕" (bye)).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I modified the logic to terminate the session when either condition is met, with a 60-second silence window so that a brief pause alone no longer ends the call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Improved session termination logic
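&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;should_terminate_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_data&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;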
    &lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;last_activity_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last_activity_time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;silence_timeout&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;  &lt;span class="c1"&gt;# 60 seconds of silence
&lt;/span&gt;    &lt;span class="n"&gt;last_utterance&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session_data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last_utterance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;keywords_to_terminate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;종료&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;끊어&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;안녕&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;마칠게&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# 1. After a certain period of silence
&lt;/span&gt;    &lt;span class="nf"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_time&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;last_activity_time&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;total_seconds&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;silence_timeout&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Session &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; is a candidate for termination due to silence.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="c1"&gt;# 2. When the user explicitly expresses intent to end
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;keyword&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;keywords_to_terminate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;keyword&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;last_utterance&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Session &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; detected explicit intent to terminate: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;last_utterance&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

&lt;span class="c1"&gt;# Example data
&lt;/span&gt;&lt;span class="n"&gt;session_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;session_abc&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last_activity_time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last_utterance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Yes, I understand. See you next time.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;should_terminate_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_info&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Proceeding with session termination...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Session ongoing...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;session_info_2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;session_xyz&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last_activity_time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nf"&gt;timedelta&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;last_utterance&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;네, 이제 끊어요.&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;  &lt;span class="c1"&gt;# "Yes, I'll hang up now." (contains the keyword '끊어')&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;should_terminate_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session_info_2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Proceeding with session termination...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Session ongoing...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Thanks to this logic, sessions are no longer unnecessarily terminated when the user pauses briefly, while still accurately detecting when the user intends to end the call. This has allowed for smoother, more cost-effective operation.&lt;/p&gt;
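&lt;p&gt;For testing, the same two checks can be written as a pure function that receives the current time as an argument. The following is a minimal sketch, not the production code: the function name &lt;code&gt;termination_reason&lt;/code&gt; and its parameters are assumptions made for illustration, while the 60-second timeout and the keyword list are taken from the example above.&lt;/p&gt;

```python
from datetime import datetime, timedelta

# Values taken from the example above; tune them for your own service.
SILENCE_TIMEOUT = timedelta(seconds=60)
END_KEYWORDS = ("종료", "끊어", "안녕", "마칠게")


def termination_reason(last_activity_time, last_utterance, now,
                       timeout=SILENCE_TIMEOUT, keywords=END_KEYWORDS):
    """Return 'silence', 'keyword', or None if the session should continue.

    Hypothetical helper: `now` is injected so tests do not depend on the clock.
    """
    # 1. Silence longer than the timeout
    if now - last_activity_time > timeout:
        return "silence"
    # 2. Explicit intent to end, detected by substring match
    if any(keyword in last_utterance for keyword in keywords):
        return "keyword"
    return None
```

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; explicitly makes the silence check deterministic in unit tests, and returning the reason instead of a bare boolean lets the caller log why a session ended.&lt;/p&gt;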

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Reduced unnecessary extension of Live session durations.&lt;/li&gt;
&lt;li&gt;Confirmed cost savings for voice call-related services.&lt;/li&gt;
&lt;li&gt;Increased cost-efficiency without compromising user experience.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary — To Avoid the Same Pitfalls
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] When implementing voice session termination logic, do not rely solely on silence duration.&lt;/li&gt;
&lt;li&gt;[ ] Consider explicit end-of-call intent signals from the user (e.g., keyword detection) as well.&lt;/li&gt;
&lt;li&gt;[ ] During testing, thoroughly validate with various speech patterns (brief pauses, intentional closing remarks, etc.).&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>live</category>
    </item>
    <item>
      <title>Node.js Speech Transcription Bug: Unsaved Content Due to Missing `conversation_id`, Resolved by Bridge Forwarding</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Fri, 19 Jun 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/junhee916/nodejs-speech-transcription-bug-unsaved-content-due-to-missing-conversationid-resolved-by-224o</link>
      <guid>https://dev.to/junhee916/nodejs-speech-transcription-bug-unsaved-content-due-to-missing-conversationid-resolved-by-224o</guid>
      <description>&lt;p&gt;Did you run into an issue where transcribed speech content wasn't being saved? I had a similar experience, and it turned out to be caused by not properly passing the &lt;code&gt;conversation_id&lt;/code&gt;. I'd like to share how I solved it in this post.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;At first, I suspected a problem with the speech transcription API response itself. I hypothesized that if the &lt;code&gt;conversation_id&lt;/code&gt; wasn't passed correctly during the API call, the server wouldn't be able to properly match the transcription results. So, I tried explicitly passing the &lt;code&gt;conversation_id&lt;/code&gt; through a bridge when making the API call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Previous code (assumed)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;transcribeAudio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;audioBlob&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/transcribe&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;audioBlob&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;audio/wav&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="c1"&gt;// conversation_id missing or incorrectly passed&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, contrary to my expectations, passing the &lt;code&gt;conversation_id&lt;/code&gt; via the bridge in this way didn't resolve the issue. There were no specific error messages in the server logs, and the transcription results were simply empty. After about 3 hours of struggling, I realized that the problem was how the &lt;code&gt;conversation_id&lt;/code&gt; was being transmitted: it had to be sent as an explicit, dedicated field that the server actually reads.&lt;/p&gt;

&lt;h2&gt;
  
  
  Cause
&lt;/h2&gt;

&lt;p&gt;The root cause of the problem was that the &lt;code&gt;conversation_id&lt;/code&gt; was not being transmitted correctly as part of the API request. The speech transcription service uses the &lt;code&gt;conversation_id&lt;/code&gt; to uniquely identify each speech session. If this value was missing or passed through the wrong channel, the server wouldn't know which conversation the transcription results belonged to, and thus couldn't save the data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Solution
&lt;/h2&gt;

&lt;p&gt;The solution was to clearly specify and pass the &lt;code&gt;conversation_id&lt;/code&gt; as a dedicated field within the API request. Additionally, I added &lt;code&gt;WARNING&lt;/code&gt; logging as a safeguard and modified the code to store a fallback response text if the &lt;code&gt;text&lt;/code&gt; field was missing in the response.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Modified code&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;transcribeAudio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;audioBlob&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;conversationId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;formData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FormData&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;formData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;audio&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;audioBlob&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;formData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;conversation_id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;conversationId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Explicitly passed as a field&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/transcribe&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;formData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

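    &lt;span class="c1"&gt;// Fail fast on non-2xx responses instead of parsing an error body as a result&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[Transcribe] HTTP &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;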

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[Transcribe] No text found for conversation_id: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;conversationId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;. Response:`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="c1"&gt;// Fallback (example): if the text field is missing, keep whatever raw payload the API returned&lt;/span&gt;
      &lt;span class="c1"&gt;// Actual implementation may vary depending on the situation&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;raw_response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="c1"&gt;// Arbitrary field name, depends on actual API response&lt;/span&gt;
         &lt;span class="c1"&gt;// Save logic...&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`[Transcribe] Error transcribing audio for conversation_id: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;conversationId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By explicitly adding the &lt;code&gt;conversation_id&lt;/code&gt; to &lt;code&gt;FormData&lt;/code&gt; and modifying the server to parse that field, the transcription content started being saved correctly.&lt;/p&gt;
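&lt;p&gt;The server-side change is not shown in this post. As a rough sketch of the idea, written in Python for brevity, the handler could refuse to save anything without a &lt;code&gt;conversation_id&lt;/code&gt; and fall back to a raw payload when &lt;code&gt;text&lt;/code&gt; is empty. Every name here (&lt;code&gt;build_transcript_record&lt;/code&gt;, &lt;code&gt;form_fields&lt;/code&gt;, &lt;code&gt;raw_response&lt;/code&gt;) is hypothetical and depends on your framework and on the actual API response.&lt;/p&gt;

```python
import logging

logger = logging.getLogger("transcribe")


def build_transcript_record(form_fields, api_result):
    """Return a record to save, or None when conversation_id is missing.

    Hypothetical helper: `form_fields` is the parsed multipart form and
    `api_result` is the decoded response of the transcription service.
    """
    conversation_id = (form_fields.get("conversation_id") or "").strip()
    if not conversation_id:
        # Without an id the transcript cannot be matched to a conversation.
        logger.warning("conversation_id missing; transcript not saved")
        return None

    text = api_result.get("text")
    if not text:
        # Fall back to an assumed raw field so content is not silently lost.
        logger.warning("no text for conversation_id=%s", conversation_id)
        text = api_result.get("raw_response") or ""

    return {"conversation_id": conversation_id, "text": text}
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for a missing id turns the silent failure described above into an explicit, logged branch.&lt;/p&gt;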

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The bug where transcribed speech content wasn't being saved has been completely resolved.&lt;/li&gt;
&lt;li&gt;Adding &lt;code&gt;WARNING&lt;/code&gt; logging and fallback save logic has made it much easier to identify the cause and debug when issues arise.&lt;/li&gt;
&lt;li&gt;Clarifying the API request structure has improved code readability and maintainability.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary — How to Avoid the Same Pitfall
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] When using speech transcription APIs, double-check the API documentation to confirm how the &lt;code&gt;conversation_id&lt;/code&gt; should be passed.&lt;/li&gt;
&lt;li&gt;[ ] The &lt;code&gt;conversation_id&lt;/code&gt; typically needs to be explicitly sent as a specific field in the request body or as a header.&lt;/li&gt;
&lt;li&gt;[ ] When problems occur, don't just rely on server logs; enhance client-side logging to thoroughly inspect request and response data.&lt;/li&gt;
&lt;li&gt;[ ] Consider implementing fallback logic to handle unexpected values in API responses.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>node</category>
      <category>conversationid</category>
      <category>api</category>
      <category>backend</category>
    </item>
    <item>
      <title>Next.js 14: 'Could not find the module in the React Client Manifest' — the real cause nobody tells you</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Fri, 19 Jun 2026 05:42:33 +0000</pubDate>
      <link>https://dev.to/junhee916/nextjs-14-could-not-find-the-module-in-the-react-client-manifest-the-real-cause-nobody-tells-5coc</link>
      <guid>https://dev.to/junhee916/nextjs-14-could-not-find-the-module-in-the-react-client-manifest-the-real-cause-nobody-tells-5coc</guid>
      <description>&lt;h2&gt;
  
  
  Next.js 14: 'Could not find the module in the React Client Manifest' — the real cause nobody tells you
&lt;/h2&gt;

&lt;p&gt;As a solo developer running a full AI product on a single, modest VM, every deployment is a high-stakes operation. When things go wrong, there's no team to swarm the problem; it's just me, the logs, and a rapidly ticking clock. Recently, a routine Next.js 14 deployment failed spectacularly with an error message that, at first glance, offered little help: &lt;code&gt;'Could not find the module in the React Client Manifest'&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This error typically surfaces during the build process of a Next.js application, especially when using React Server Components (RSC). The React Client Manifest is a crucial artifact that Next.js uses to map client-side components and their dependencies. When it can't find a module it expects, it means something has gone awry in how the build is being generated or where it's being executed.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Symptom: A Cryptic Build Failure
&lt;/h3&gt;

&lt;p&gt;The build pipeline, which relies on GitHub Actions to deploy changes to my single VM, ground to a halt. The logs were filled with this specific error, pointing to an issue with the client manifest generation. My initial thoughts went to the usual suspects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Corrupted cache?&lt;/li&gt;
&lt;li&gt;Dependency mismatch?&lt;/li&gt;
&lt;li&gt;A bug in a recent Next.js update?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I meticulously checked my &lt;code&gt;package.json&lt;/code&gt;, ran &lt;code&gt;npm ci&lt;/code&gt; to ensure clean dependencies, and even tried reverting to a previous, known-good commit. Nothing. The error persisted, mocking me with its vagueness. The application itself was running fine in development, so the issue was clearly environment or build-specific.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Wrong Turns: Chasing Ghosts
&lt;/h3&gt;

&lt;p&gt;My first instinct was to dive deep into the Next.js documentation and community forums. I found similar errors, often attributed to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Incorrectly configured &lt;code&gt;next.config.js&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Issues with the &lt;code&gt;app&lt;/code&gt; directory structure.&lt;/li&gt;
&lt;li&gt;Problems with serverless function configurations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I spent hours scrutinizing my configuration files, ensuring the &lt;code&gt;app&lt;/code&gt; directory was correctly set up, and verifying that my serverless function settings (though not strictly applicable to my single VM setup, I checked for any residual configuration) were sound. Still, no luck. The error message was a red herring, leading me down paths that were irrelevant to the actual problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Real Root Cause: The Build's Working Directory
&lt;/h3&gt;

&lt;p&gt;The breakthrough came when I started thinking about the execution context of the build command itself. My GitHub Actions workflow was designed to check out the code, install dependencies, and then run the Next.js build command. The error message mentioned the &lt;em&gt;client manifest&lt;/em&gt;, which is generated during the build. What if the build command wasn't running in the expected directory?&lt;/p&gt;

&lt;p&gt;The key insight came from a subtle detail in the GitHub Actions logs. The build command was being executed, but it seemed to be operating from a different working directory than anticipated. Specifically, the build was likely running from the root of the repository &lt;em&gt;after&lt;/em&gt; a checkout operation, but perhaps some cached or intermediate build artifacts were expected to be in a specific sub-directory, or the build process itself was sensitive to the current working directory.&lt;/p&gt;

&lt;p&gt;The Next.js build, particularly with RSCs, can be sensitive to the current working directory (&lt;code&gt;cwd&lt;/code&gt;) during execution. If the build script isn't explicitly run from the project root where &lt;code&gt;package.json&lt;/code&gt; and &lt;code&gt;next.config.js&lt;/code&gt; reside, or if it's executed in a way that prevents the build tools from inferring the project root, manifest generation can fail.&lt;/p&gt;

&lt;p&gt;In my case, the GitHub Actions workflow had recently been updated, and a subtle change in how the code was checked out or how subsequent commands were executed likely shifted the default working directory for the build step. The Next.js build process, when invoked without explicitly setting the correct working directory, might not correctly locate necessary files or generate artifacts in the expected locations, leading to the manifest error.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Reproducible Fix: Explicitly Setting the Working Directory
&lt;/h3&gt;

&lt;p&gt;The fix was surprisingly simple, yet elusive because it wasn't directly indicated by the error message itself. I needed to ensure that the Next.js build command was executed from the correct directory within the GitHub Actions runner. This is often done by changing the directory before running the command.&lt;/p&gt;

&lt;p&gt;My updated GitHub Actions workflow step now looks something like this (simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Build Next.js App&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;|&lt;/span&gt;
    &lt;span class="s"&gt;cd my-app-directory # Explicitly change to the app's root directory&lt;/span&gt;
    &lt;span class="s"&gt;npm run build&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By adding the &lt;code&gt;cd my-app-directory&lt;/code&gt; command before &lt;code&gt;npm run build&lt;/code&gt;, I explicitly told the runner where to find the project's root, including &lt;code&gt;package.json&lt;/code&gt; and other configuration files. This ensured that the Next.js build process could correctly locate all its dependencies and generate the client manifest without errors.&lt;/p&gt;
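&lt;p&gt;A cheap guard against this class of failure is to check the working directory before building. The sketch below assumes a small Python helper run as a pre-build step; the function name &lt;code&gt;is_next_project_root&lt;/code&gt; is hypothetical, and it only checks for &lt;code&gt;package.json&lt;/code&gt; and a &lt;code&gt;next.config.*&lt;/code&gt; file.&lt;/p&gt;

```python
from pathlib import Path


def is_next_project_root(directory="."):
    """Return True if the directory looks like a Next.js project root.

    Hypothetical pre-build check: it only verifies that package.json and
    some next.config.* file exist directly inside the directory.
    """
    directory = Path(directory)
    has_package_json = (directory / "package.json").is_file()
    # Accept next.config.js, next.config.mjs, and similar variants.
    has_next_config = any(directory.glob("next.config.*"))
    return has_package_json and has_next_config
```

&lt;p&gt;A CI step could call this and exit with a clear message when it returns &lt;code&gt;False&lt;/code&gt;, so the failure points at the working directory rather than at the client manifest. GitHub Actions also lets you set &lt;code&gt;working-directory&lt;/code&gt; on a &lt;code&gt;run&lt;/code&gt; step, which has the same effect as the &lt;code&gt;cd&lt;/code&gt; without relying on shell state.&lt;/p&gt;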

&lt;h3&gt;
  
  
  The Lesson: Beware of Implicit Working Directories
&lt;/h3&gt;

&lt;p&gt;This incident was a stark reminder that even with modern tooling, the fundamentals of execution context matter. When running builds, especially in CI/CD environments, always be mindful of the current working directory. Implicit assumptions about where commands are run can lead to obscure errors that are difficult to debug.&lt;/p&gt;

&lt;p&gt;For anyone else encountering the &lt;code&gt;'Could not find the module in the React Client Manifest'&lt;/code&gt; error in Next.js 14, especially within a CI environment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verify your build command's working directory.&lt;/strong&gt; Ensure it's run from the root of your Next.js project.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check your CI/CD configuration&lt;/strong&gt; for any recent changes that might affect the execution context of your build steps.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clean builds are good, but context is king.&lt;/strong&gt; Sometimes, the issue isn't what's in your code, but where your code is being built.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's the unglamorous reality of solo development: a single line change in a CI script can bring the whole operation down, and the debugging process is a lonely journey through cryptic error messages.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;...building aicoreutility.com in the open...&lt;/em&gt;&lt;/p&gt;

</description>
      <category>nextjs</category>
      <category>react</category>
      <category>deployment</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Chatbot Zero-Downtime Deployment Failure? Solved with Backend Rolling Restart (2026)</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Thu, 18 Jun 2026 16:00:01 +0000</pubDate>
      <link>https://dev.to/junhee916/chatbot-zero-downtime-deployment-failure-solved-with-backend-rolling-restart-2026-372c</link>
      <guid>https://dev.to/junhee916/chatbot-zero-downtime-deployment-failure-solved-with-backend-rolling-restart-2026-372c</guid>
      <description>&lt;p&gt;Ever been frustrated by service downtime when deploying your chatbot? If you've aimed for zero-downtime deployments but ended up inconveniencing users repeatedly, this post might help. I want to share my experience implementing stable deployments using a backend rolling restart strategy.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;Initially, I tried to stick with the existing deployment method. However, because a chatbot service has to process user requests in real time, even a brief pause was unacceptable. So I adopted a rolling restart approach that gradually replaces the backend servers.&lt;/p&gt;

&lt;p&gt;Similar to Blue-Green deployments, this involves preparing a new version (Green) and gradually shifting traffic from the old version (Blue). I also included health checks to ensure the new version was healthy before taking over.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Example: Start deploying the new version&lt;/span&gt;
kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; new-backend-deployment.yaml

&lt;span class="c"&gt;# Example: Wait until new version Pods are ready&lt;/span&gt;
kubectl rollout status deployment/backend-deployment &lt;span class="nt"&gt;-n&lt;/span&gt; chatbot &lt;span class="nt"&gt;--timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;5m

&lt;span class="c"&gt;# Example: Gradually terminate old Pods and switch traffic&lt;/span&gt;
&lt;span class="c"&gt;# (This is where an unexpected issue occurred)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;However, an unforeseen problem arose during this process. Even though all the new version Pods were ready, traffic wasn't being switched correctly when the old Pods were terminated, causing some requests to be dropped. It seemed like the rolling restart wasn't functioning as intended. After three hours of struggling, I discovered the cause...&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cause
&lt;/h2&gt;

&lt;p&gt;The issue wasn't with the rolling restart logic itself, but with the &lt;strong&gt;service's state management approach&lt;/strong&gt;. The chatbot service maintained specific session information in memory. When the old Pods were terminated, this information disappeared abruptly. When requests then went to the new Pods, sessions were broken. In essence, the Pods themselves were terminating correctly, but the service's state led to a poor user experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;To fix this, I moved session information into an external store such as Redis. Session data now survives the termination of old Pods, and new Pods can read it back and resume existing sessions seamlessly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example: Kubernetes Deployment configuration (partial)&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;backend-chatbot&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt; &lt;span class="c1"&gt;# Example: running with 3 Pods&lt;/span&gt;
  &lt;span class="na"&gt;strategy&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;RollingUpdate&lt;/span&gt;
    &lt;span class="na"&gt;rollingUpdate&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;maxUnavailable&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt; &lt;span class="c1"&gt;# Set to ensure only one Pod is unavailable at a time&lt;/span&gt;
      &lt;span class="na"&gt;maxSurge&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;       &lt;span class="c1"&gt;# Allow up to one new Pod to be created above desired replicas&lt;/span&gt;
  &lt;span class="c1"&gt;# ... (other configurations)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example: Python code to load session data from Redis (based on Flask)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;flask&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Flask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__name__&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;SECRET_KEY&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;your_secret_key&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="c1"&gt;# Use a secure key in production
&lt;/span&gt;&lt;span class="n"&gt;redis_client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;StrictRedis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;your-redis-host&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;6379&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@app.before_request&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;load_session_from_redis&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;session_id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Or other session identifier
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;session_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;session:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;session_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Logic to decode and restore session data
&lt;/span&gt;            &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="nd"&gt;@app.route&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;index&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="k"&gt;pass&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With session information externalized like this, user sessions continue without interruption while old Pods are terminated and new ones start. The health checks still apply: a new Pod receives traffic only after it reports healthy.&lt;/p&gt;
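
&lt;p&gt;As a rough illustration of the save/load round trip, here is a minimal sketch. It assumes session data is a JSON-serializable dict and reuses the &lt;code&gt;session:{id}&lt;/code&gt; key pattern from the snippet above. &lt;code&gt;SessionStore&lt;/code&gt;, &lt;code&gt;InMemoryClient&lt;/code&gt;, and the 30-minute TTL are hypothetical names and values chosen for illustration; the in-memory client only stands in for a redis-py client so the sketch runs without a server.&lt;/p&gt;

```python
import json
from typing import Any, Dict, Optional


class InMemoryClient:
    """Hypothetical stand-in exposing the two redis-py calls used below."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def setex(self, name: str, time: int, value: str) -> None:
        # A real Redis server would expire the key after `time` seconds.
        self._data[name] = value.encode("utf-8")

    def get(self, name: str) -> Optional[bytes]:
        return self._data.get(name)


class SessionStore:
    """Keeps session state outside the Pod so any replica can resume it."""

    def __init__(self, client: Any, ttl_seconds: int = 1800) -> None:
        self.client = client
        self.ttl_seconds = ttl_seconds  # assumed TTL, tune for your service

    def save(self, session_id: str, data: Dict[str, Any]) -> None:
        key = f"session:{session_id}"
        self.client.setex(key, self.ttl_seconds, json.dumps(data))

    def load(self, session_id: str) -> Dict[str, Any]:
        raw = self.client.get(f"session:{session_id}")
        # A missing key means an expired or unknown session: start empty.
        return json.loads(raw) if raw else {}


# Two "Pods" sharing one store: the old one writes, the new one reads.
shared = InMemoryClient()
old_pod = SessionStore(shared)
new_pod = SessionStore(shared)
old_pod.save("abc123", {"user": "demo", "turns": 3})
restored = new_pod.load("abc123")
```

&lt;p&gt;In a real deployment the shared client would be a Redis connection rather than the stand-in, and every replica would point at the same Redis host.&lt;/p&gt;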

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Downtime during deployment has been reduced to zero.&lt;/li&gt;
&lt;li&gt;User experience has significantly improved, with no more session interruptions.&lt;/li&gt;
&lt;li&gt;The rolling restart strategy has been successfully applied, enhancing deployment stability.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary — How to Avoid the Same Pitfall
&lt;/h2&gt;

&lt;p&gt;If you're attempting rolling restarts for zero-downtime deployments, make sure to check the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Verify that your service's state (like session information) is managed independently of the Pod's lifecycle (e.g., using external storage).&lt;/li&gt;
&lt;li&gt;[ ] Ensure your &lt;code&gt;Deployment&lt;/code&gt; strategy is configured with &lt;code&gt;maxUnavailable&lt;/code&gt; and &lt;code&gt;maxSurge&lt;/code&gt; to allow for gradual traffic shifting.&lt;/li&gt;
&lt;li&gt;[ ] Validate that your &lt;code&gt;readinessProbe&lt;/code&gt; and &lt;code&gt;livenessProbe&lt;/code&gt; accurately reflect the actual health of your service.&lt;/li&gt;
&lt;li&gt;[ ] After deployment, continuously monitor actual user traffic to detect any unexpected errors.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>kubernetes</category>
      <category>redis</category>
    </item>
    <item>
      <title>Measuring Backend Power Levels: Securing Objectivity by Reflecting Real-time Congestion</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Wed, 17 Jun 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/junhee916/measuring-backend-power-levels-securing-objectivity-by-reflecting-real-time-congestion-5485</link>
      <guid>https://dev.to/junhee916/measuring-backend-power-levels-securing-objectivity-by-reflecting-real-time-congestion-5485</guid>
      <description>&lt;p&gt;Want to reflect real-time congestion when measuring power scale? Or have you questioned the objectivity of existing measurement methods? This post covers how to enhance the objectivity of power scale measurement and reflect real-time congestion using Seoul's urban data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;Initially, I changed the power scale measurement to simply record the maximum value for each date. The goal was to ensure objectivity by using Seoul's urban data as the primary indicator.&lt;/p&gt;

&lt;p&gt;However, I realized that this could not reflect real-time congestion. For instance, if an event caused congestion to surge for part of a day, a record holding only the daily maximum could look much like an ordinary day.&lt;/p&gt;

&lt;p&gt;So, I considered adding real-time congestion data for Olympic Park as a secondary indicator. I attempted to retrieve real-time congestion information through the Seoul Urban Data API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example: Calling Seoul Urban Data API (Actual API endpoint may vary)
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;olympic_park_data_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.seoul.go.kr/v1/citydata/olympic_park_congestion?apiKey=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;olympic_park_data_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="c1"&gt;# Raise an exception for HTTP errors
&lt;/span&gt;    &lt;span class="n"&gt;congestion_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;congestion_data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exceptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestException&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;API call error: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;During this process, unexpected issues arose with the API response format and data parsing. Sometimes the data was empty, or it came in a format different from what was expected. After about 3 hours of struggling, I figured out a way to reliably handle the API responses.&lt;/p&gt;
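
&lt;p&gt;One way to handle empty or oddly shaped responses is to validate the payload before using it. The sketch below is an assumption about how that could look, not the actual parsing code: the &lt;code&gt;congestion&lt;/code&gt; field name and the &lt;code&gt;extract_congestion&lt;/code&gt; helper are hypothetical, and the real API schema may differ.&lt;/p&gt;

```python
from typing import Any, Optional


def extract_congestion(payload: Any, key: str = "congestion") -> Optional[float]:
    """Return the congestion value as a float, or None if it is unusable.

    The `congestion` key is a hypothetical field name, not the real schema.
    """
    if not isinstance(payload, dict):
        return None  # e.g. an empty list or an error string
    value = payload.get(key)
    if value is None or isinstance(value, bool):
        return None  # missing field, or a type that is not a measurement
    try:
        return float(value)
    except (TypeError, ValueError):
        return None  # e.g. "" or "N/A"


# Shapes that mimic the failure modes described above (all made up).
samples = [
    {"congestion": 0.8},
    {"congestion": "0.35"},
    {},
    [],
    {"congestion": "N/A"},
]
parsed = [extract_congestion(s) for s in samples]
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; instead of raising lets the caller skip a bad sample and keep collecting, which suits a periodic job.&lt;/p&gt;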

&lt;h2&gt;
  
  
  The Cause
&lt;/h2&gt;

&lt;p&gt;The measurement had to satisfy two requirements at once: stay objective, and react to real-time congestion.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Ensuring Objectivity of the Primary Indicator&lt;/strong&gt;: Recording the daily maximum value for each date could enhance objectivity by showing the peak at a specific point in time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reflecting Real-time Changes&lt;/strong&gt;: It was necessary to detect and incorporate unexpected congestion situations into the measurement by utilizing real-time congestion data from highly populated areas like Olympic Park as a secondary indicator.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;Ultimately, I changed the power scale measurement method as follows and updated the related documentation.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Primary Indicator&lt;/strong&gt;: Maintain the recording of the daily maximum value for each date.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secondary Indicator&lt;/strong&gt;: Add real-time congestion data for Olympic Park. This data is periodically collected through the Seoul Urban Data API.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Example: Data collection and storage logic (simplified)
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_olympic_park_congestion&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="c1"&gt;# ... (API call and data parsing similar to previous code)
&lt;/span&gt;    &lt;span class="c1"&gt;# Return actual congestion value (e.g., 0.8 - very congested)
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;0.8&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_power_scale_measurement&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;today&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;today&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;current_max_scale&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_daily_max_power_scale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Existing function to get daily max
&lt;/span&gt;    &lt;span class="n"&gt;current_congestion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_olympic_park_congestion&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Logic to update daily maximum
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_congestion&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;current_max_scale&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;update_daily_max_power_scale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;current_congestion&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Store secondary indicator data (e.g., in a separate table or log)
&lt;/span&gt;    &lt;span class="nf"&gt;log_congestion_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;today&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;current_congestion&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Power scale measurement data update complete.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Call periodically during actual execution
# update_power_scale_measurement()
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach allows us to understand the usual power scale through the daily maximum while also detecting sudden congestion changes with the secondary indicator, thus increasing the realism of the measurement.&lt;/p&gt;
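
&lt;p&gt;To make the split between the two indicators concrete, here is a small self-contained sketch. It is one possible way to store them, assuming in-memory containers in place of a real database; &lt;code&gt;record_sample&lt;/code&gt;, &lt;code&gt;daily_max&lt;/code&gt;, and &lt;code&gt;congestion_log&lt;/code&gt; are hypothetical names, and the sample values are made up.&lt;/p&gt;

```python
import datetime
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-ins for the real storage.
daily_max: Dict[datetime.date, float] = {}
congestion_log: List[Tuple[datetime.datetime, float]] = []


def record_sample(ts: datetime.datetime, power_scale: float, congestion: float) -> None:
    """Update the primary indicator and append the secondary one."""
    day = ts.date()
    # Primary indicator: keep only the highest value seen on each date.
    daily_max[day] = max(daily_max.get(day, power_scale), power_scale)
    # Secondary indicator: keep every sample so short spikes stay visible.
    congestion_log.append((ts, congestion))


day = datetime.date(2026, 6, 17)
record_sample(datetime.datetime(2026, 6, 17, 9, 0), 0.4, 0.3)
record_sample(datetime.datetime(2026, 6, 17, 19, 0), 0.7, 0.9)
record_sample(datetime.datetime(2026, 6, 17, 22, 0), 0.5, 0.2)

peak_congestion = max(value for _, value in congestion_log)
```

&lt;p&gt;The daily maximum collapses the day into one number, while the log keeps the evening spike that the maximum alone would hide.&lt;/p&gt;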

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The objectivity of the power scale measurement method has been improved.&lt;/li&gt;
&lt;li&gt;The realism of the measurement has increased with the incorporation of real-time congestion data.&lt;/li&gt;
&lt;li&gt;The related documentation has been updated to describe the new data source and the revised measurement method.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary — To Avoid the Same Pitfalls
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] When measuring power scale, consider real-time congestion data as a secondary indicator in addition to the daily maximum.&lt;/li&gt;
&lt;li&gt;[ ] When integrating external APIs, implement robust exception handling for potential response format issues or data omissions.&lt;/li&gt;
&lt;li&gt;[ ] When restarting the web application server (e.g., gunicorn), verify that data collection and measurement logic resume normally. (Previously, there was an issue where data was lost after restarts.)&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>api</category>
    </item>
    <item>
      <title>How to Fix Search Engine Indexing Issues Caused by robots.txt Block Errors</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Tue, 16 Jun 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/junhee916/how-to-fix-search-engine-indexing-issues-caused-by-robotstxt-block-errors-5981</link>
      <guid>https://dev.to/junhee916/how-to-fix-search-engine-indexing-issues-caused-by-robotstxt-block-errors-5981</guid>
      <description>&lt;p&gt;Is your search engine not indexing important pages on your site properly? You might be experiencing issues with certain paths being blocked by &lt;code&gt;robots.txt&lt;/code&gt; settings, causing them to be omitted from search results. In this post, I'll share a similar situation I encountered and how I resolved it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;At first, I naturally assumed there was a syntax error in the &lt;code&gt;robots.txt&lt;/code&gt; file itself, or that it contained incorrect directives. So, I meticulously reviewed the file's contents again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;User&lt;/span&gt;-&lt;span class="n"&gt;agent&lt;/span&gt;: *
&lt;span class="n"&gt;Disallow&lt;/span&gt;: /&lt;span class="n"&gt;chat&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I suspected that a setting like this, blocking the &lt;code&gt;/chat&lt;/code&gt; path, was the culprit. This path indeed contained a lot of content related to the user interface.&lt;/p&gt;

&lt;p&gt;However, the &lt;code&gt;robots.txt&lt;/code&gt; syntax was perfect, and there seemed to be no issues with other search engine-related settings. I spent hours poring over documentation related to &lt;code&gt;robots.txt&lt;/code&gt;, but struggled to find a clear solution. The "Indexed, though blocked by robots.txt" warning kept appearing in the search engine's developer tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cause
&lt;/h2&gt;

&lt;p&gt;In the end, the problem wasn't an error in the &lt;code&gt;robots.txt&lt;/code&gt; file itself, but rather that &lt;strong&gt;the blocking rule was unintentionally keeping crawlers away from important pages&lt;/strong&gt;. Specifically, some pages under the &lt;code&gt;/chat&lt;/code&gt; path contained content that the search engine needed to crawl in order to index it properly, and blocking the entire path with &lt;code&gt;Disallow&lt;/code&gt; was the mistake.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;The solution was surprisingly simple. Instead of blocking the entire &lt;code&gt;/chat&lt;/code&gt; path, I modified the settings to explicitly block only the specific sub-paths that I genuinely wanted search engines to avoid.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;&lt;span class="n"&gt;User&lt;/span&gt;-&lt;span class="n"&gt;agent&lt;/span&gt;: *
&lt;span class="n"&gt;Disallow&lt;/span&gt;: /&lt;span class="n"&gt;chat&lt;/span&gt;/&lt;span class="n"&gt;private&lt;/span&gt;-&lt;span class="n"&gt;conversations&lt;/span&gt;/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this change, other pages under &lt;code&gt;/chat&lt;/code&gt; can still be indexed, while only the sensitive content located in the &lt;code&gt;/chat/private-conversations/&lt;/code&gt; path is blocked.&lt;/p&gt;
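
&lt;p&gt;A change like this can be checked locally before deploying it. The sketch below uses Python's standard &lt;code&gt;urllib.robotparser&lt;/code&gt; to compare the old and new rules; the &lt;code&gt;example.com&lt;/code&gt; URLs and the &lt;code&gt;is_allowed&lt;/code&gt; helper are hypothetical, and real crawlers may interpret rules slightly differently from this parser.&lt;/p&gt;

```python
from typing import List
from urllib.robotparser import RobotFileParser


def is_allowed(rules: List[str], url: str, agent: str = "*") -> bool:
    """Check a URL against robots.txt rules given as a list of lines."""
    parser = RobotFileParser()
    parser.parse(rules)
    return parser.can_fetch(agent, url)


old_rules = ["User-agent: *", "Disallow: /chat"]
new_rules = ["User-agent: *", "Disallow: /chat/private-conversations/"]

# Hypothetical URLs: one page that should be crawlable, one that should not.
public_url = "https://example.com/chat/guide"
private_url = "https://example.com/chat/private-conversations/42"

before = (is_allowed(old_rules, public_url), is_allowed(old_rules, private_url))
after = (is_allowed(new_rules, public_url), is_allowed(new_rules, private_url))
```

&lt;p&gt;With these inputs, &lt;code&gt;before&lt;/code&gt; is &lt;code&gt;(False, False)&lt;/code&gt; and &lt;code&gt;after&lt;/code&gt; is &lt;code&gt;(True, False)&lt;/code&gt;: the old rule blocks everything under &lt;code&gt;/chat&lt;/code&gt;, the new one blocks only the private sub-path.&lt;/p&gt;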

&lt;h2&gt;
  
  
  The Result
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Search engines began indexing the relevant pages of my site correctly.&lt;/li&gt;
&lt;li&gt;The "Indexed, though blocked by robots.txt" warning in the developer tools disappeared.&lt;/li&gt;
&lt;li&gt;I observed an overall improvement in my site's search visibility.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  In Summary — To Avoid the Same Pitfall
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] When configuring &lt;code&gt;robots.txt&lt;/code&gt;, double-check if the paths specified in &lt;code&gt;Disallow&lt;/code&gt; are unintentionally blocking access to important pages.&lt;/li&gt;
&lt;li&gt;[ ] Consider explicitly specifying only the sub-paths that absolutely need to be blocked, rather than blocking an entire path.&lt;/li&gt;
&lt;li&gt;[ ] After making changes to &lt;code&gt;robots.txt&lt;/code&gt;, always verify the changes using search engine developer tools, including the indexing status and the &lt;code&gt;robots.txt&lt;/code&gt; tester.&lt;/li&gt;
&lt;li&gt;[ ] Remember that &lt;code&gt;robots.txt&lt;/code&gt; is a 'request' to search engines not to crawl, not a 'command' that forces them.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>robotstxt</category>
      <category>seo</category>
      <category>infra</category>
    </item>
    <item>
      <title>Resolving CP949 Errors in Local LLM Benchmarking and Building an Automatic Model Recommendation System</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Mon, 15 Jun 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/junhee916/resolving-cp949-errors-in-local-llm-benchmarking-and-building-an-automatic-model-recommendation-128g</link>
      <guid>https://dev.to/junhee916/resolving-cp949-errors-in-local-llm-benchmarking-and-building-an-automatic-model-recommendation-128g</guid>
      <description>&lt;p&gt;Ever run into CP949 encoding errors when benchmarking local LLMs, or felt frustrated by the lack of model management features? In this post, I'll share my experience overcoming CP949 encoding issues and building an automatic model recommendation system to enhance local model research and management capabilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;Initially, I wanted to build a simple feature in the admin page to switch and benchmark local models. I also prepared a more diverse set of benchmark questions in Korean.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="c1"&gt;// riel_agent/src/app/admin/tabs/LocalModelLabTab.tsx (excerpt)&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Button&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Select&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Input&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@mantine/core&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useEffect&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;getLocalModels&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;switchLocalModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;runBenchmark&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;getBenchmarkResults&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;../../api/admin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Actual API call functions&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;LocalModelLabTab&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setModels&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;([]);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;selectedModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setSelectedModel&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;benchmarkQuestions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setBenchmarkQuestions&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;([]);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;benchmarkResults&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setBenchmarkResults&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Load local model list&lt;/span&gt;
    &lt;span class="nf"&gt;getLocalModels&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;setModels&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Load Korean benchmark questions (expanded to 25)&lt;/span&gt;
    &lt;span class="c1"&gt;// ...&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handleModelChange&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;modelName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;switchLocalModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;modelName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Actual model switching API&lt;/span&gt;
    &lt;span class="nf"&gt;setSelectedModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;modelName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handleRunBenchmark&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runBenchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;selectedModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;benchmarkQuestions&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Actual benchmark execution API&lt;/span&gt;
    &lt;span class="nf"&gt;setBenchmarkResults&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// ... UI rendering ...&lt;/span&gt;

  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Select&lt;/span&gt;
        &lt;span class="na"&gt;label&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"Select Local Model"&lt;/span&gt;
        &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;models&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;selectedModel&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
        &lt;span class="na"&gt;onChange&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;handleModelChange&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Button&lt;/span&gt; &lt;span class="na"&gt;onClick&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;handleRunBenchmark&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;Run Benchmark&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Button&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* Results display section */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;default&lt;/span&gt; &lt;span class="nx"&gt;LocalModelLabTab&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Switching models and expanding questions were relatively straightforward. The problem arose when running benchmarks, especially with Korean data, where I frequently encountered &lt;code&gt;CP949&lt;/code&gt; encoding errors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;UnicodeEncodeError: 'cp949' codec can't encode characters in position 1-3: illegal multibyte sequence
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Seeing this error message, I initially thought it was just a Korean string processing issue. So, I tried changing the encoding settings in Python files or explicitly encoding/decoding strings to &lt;code&gt;utf-8&lt;/code&gt;. However, after hours of struggling, the problem persisted.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# riel_backend/api/local_llm.py (part of initial attempts)
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_text_with_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# ... Model call logic ...
&lt;/span&gt;    &lt;span class="c1"&gt;# CP949 error occurred here
&lt;/span&gt;    &lt;span class="c1"&gt;# text = text.encode('utf-8').decode('cp949', errors='ignore') # Attempts like this
&lt;/span&gt;    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="k"&gt;pass&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Cause
&lt;/h2&gt;

&lt;p&gt;After hours of debugging, I finally pinpointed the root cause. It wasn't just an encoding issue with the Python script itself. The local LLM worker was writing model responses to disk as &lt;code&gt;CP949&lt;/code&gt;, the default text encoding on Korean-locale Windows, so every response had to be converted to that encoding when it was saved.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tools/local_llm_worker/worker.py (suspected point of failure)
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;cp949&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;-- Problem occurred here
&lt;/span&gt;        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ensure_ascii&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With &lt;code&gt;ensure_ascii=False&lt;/code&gt;, &lt;code&gt;json.dump&lt;/code&gt; writes non-ASCII characters as they are instead of escaping them as &lt;code&gt;\uXXXX&lt;/code&gt;. Because the file was opened with &lt;code&gt;encoding='cp949'&lt;/code&gt;, each of those characters then had to be encoded as CP949, and any character CP949 cannot represent raised &lt;code&gt;UnicodeEncodeError&lt;/code&gt;.&lt;/p&gt;
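&lt;p&gt;A minimal sketch of the failure, under the assumption that the model output contained a character outside CP949's repertoire. The sample text and the &lt;code&gt;can_encode&lt;/code&gt; helper are made up for illustration and are not taken from the worker. As far as I can tell, Hangul on its own encodes fine in CP949, so the error only shows up once something like an emoji appears in a response:&lt;/p&gt;

```python
import json

# Hypothetical sample response: Korean text followed by an emoji.
sample = {"answer": "안녕하세요 \U0001F600"}

# ensure_ascii=False keeps non-ASCII characters as-is in the JSON text.
raw_text = json.dumps(sample, ensure_ascii=False)


def can_encode(text, encoding):
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


cp949_ok = can_encode(raw_text, "cp949")      # the emoji has no CP949 mapping
utf8_ok = can_encode(raw_text, "utf-8")       # UTF-8 covers all of Unicode
hangul_ok = can_encode("안녕하세요", "cp949")  # Hangul alone is representable

# With ensure_ascii=True everything is escaped to \uXXXX, so the text is
# pure ASCII and encodes anywhere, at the cost of an unreadable file.
escaped_ok = can_encode(json.dumps(sample, ensure_ascii=True), "cp949")

print(cp949_ok, utf8_ok, hangul_ok, escaped_ok)
```

&lt;p&gt;Only the first check fails, which matches the symptom: the same benchmark can pass for some prompts and crash for others, depending on what the model happens to emit.&lt;/p&gt;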

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;The fix was simple: modify the local LLM worker to explicitly use &lt;code&gt;utf-8&lt;/code&gt; encoding when saving files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tools/local_llm_worker/worker.py (after modification)
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_output&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# ...
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;w&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="c1"&gt;# &amp;lt;-- Changed to utf-8
&lt;/span&gt;        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ensure_ascii&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;indent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Added indent for better readability
&lt;/span&gt;    &lt;span class="c1"&gt;# ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
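&lt;p&gt;To check that the change holds up, a round trip through a temporary file is enough. This is a sketch, not the worker's code: &lt;code&gt;save_json_utf8&lt;/code&gt;, &lt;code&gt;load_json_utf8&lt;/code&gt;, and the sample payload are hypothetical names introduced here. It also prints the platform's preferred encoding, which is roughly what &lt;code&gt;open()&lt;/code&gt; falls back to when no &lt;code&gt;encoding&lt;/code&gt; is passed:&lt;/p&gt;

```python
import json
import locale
import os
import tempfile

# Roughly the encoding open() uses when none is given (outside UTF-8 mode).
default_encoding = locale.getpreferredencoding(False)
print("Default text encoding:", default_encoding)


def save_json_utf8(data, path):
    """Write `data` as UTF-8 JSON regardless of the platform default."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f, ensure_ascii=False, indent=4)


def load_json_utf8(path):
    """Read a JSON file that was written as UTF-8."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Hypothetical payload: Korean text plus an emoji that CP949 cannot encode.
payload = {"answer": "안녕하세요 \U0001F600"}

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, "output.json")
    save_json_utf8(payload, out_path)
    round_tripped = load_json_utf8(out_path)

print(round_tripped == payload)
```

&lt;p&gt;Reading with the same explicit encoding matters as much as writing with it; otherwise the reader falls back to the platform default and can garble or reject the file.&lt;/p&gt;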



&lt;p&gt;Along with this, I built a system to automatically download models, benchmark them, and recommend better ones.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# tools/local_llm_bench/auto_bench.py (automatic benchmark loop)
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;

&lt;span class="c1"&gt;# Import necessary functions (e.g., download_model, run_single_benchmark, get_best_model)
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.utils&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;download_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;run_single_benchmark&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;get_best_model&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;..local_llm_worker.worker&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;process_prompt&lt;/span&gt; &lt;span class="c1"&gt;# Import prompt processing function from worker module
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;auto_benchmark_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;benchmark_prompts_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_iterations&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;current_best_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="n"&gt;candidate_models&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_a&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_c&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# Actual model list would be fetched dynamically
&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_iterations&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Iteration &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;num_iterations&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# 1. Download candidate models (if they don't exist yet)
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidate_models&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Downloading &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="nf"&gt;download_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Actual download function
&lt;/span&gt;
        &lt;span class="c1"&gt;# 2. Benchmark current best model
&lt;/span&gt;        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;current_best_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Benchmarking current best model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;current_best_model&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_single_benchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;current_best_model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;benchmark_prompts_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c1"&gt;# Analyze and save results
&lt;/span&gt;            &lt;span class="c1"&gt;# ...
&lt;/span&gt;
        &lt;span class="c1"&gt;# 3. Benchmark all candidate models
&lt;/span&gt;        &lt;span class="n"&gt;all_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;model_name&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidate_models&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Benchmarking candidate model: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_single_benchmark&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;benchmark_prompts_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;all_results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model_name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;scores&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="c1"&gt;# Example: list of scores
&lt;/span&gt;
        &lt;span class="c1"&gt;# 4. Select best model based on latest results
&lt;/span&gt;        &lt;span class="n"&gt;new_best_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_best_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;all_results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Actual best model selection logic
&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;new_best_model&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;current_best_model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;New best model found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;new_best_model&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Updating...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;current_best_model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;new_best_model&lt;/span&gt;
            &lt;span class="c1"&gt;# Notify the system about the best model via admin API, etc.
&lt;/span&gt;            &lt;span class="c1"&gt;# switchLocalModel(current_best_model) # Example
&lt;/span&gt;        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Current best model remains the best.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Wait before the next iteration
&lt;/span&gt;
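&lt;span class="c1"&gt;# Run as a module from the repo root (python -m tools.local_llm_bench.auto_bench) so the relative imports above resolve.
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;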
    &lt;span class="n"&gt;MODEL_DIRECTORY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/path/to/local/models&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# Actual path
&lt;/span&gt;    &lt;span class="n"&gt;PROMPTS_FILE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tools/local_llm_bench/prompts.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="nf"&gt;auto_benchmark_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;MODEL_DIRECTORY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PROMPTS_FILE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_iterations&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
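&lt;p&gt;The loop relies on &lt;code&gt;get_best_model&lt;/code&gt;, whose body is not shown here. One plausible stand-in, assuming each model maps to a list of numeric scores where higher is better, is to pick the model with the highest mean. The selection rule and the example scores below are assumptions for illustration, not the actual selection logic:&lt;/p&gt;

```python
from statistics import mean
from typing import Dict, List


def get_best_model(all_results: Dict[str, List[float]]):
    """Return the model name with the highest mean score, or None if no model has scores."""
    # Skip models whose benchmark run produced no scores.
    averages = {
        name: mean(scores) for name, scores in all_results.items() if scores
    }
    if not averages:
        return None
    # max() keeps the first key with the top average, so ties go to the
    # model that was benchmarked first.
    return max(averages, key=averages.get)


# Made-up scores, only to exercise the function.
example = {
    "model_a": [0.61, 0.70, 0.58],
    "model_b": [0.82, 0.79, 0.85],
    "model_c": [],
}
best = get_best_model(example)
print(best)
```

&lt;p&gt;A plain mean is easy to reason about but sensitive to outliers; a median, or a minimum-score threshold, would be a reasonable alternative if a single bad prompt should not decide the ranking.&lt;/p&gt;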



&lt;p&gt;During this process, I discovered that the Gemma2:2b model performed significantly better than the EXAONE model I was using previously. I documented and shared this finding.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Gemma2:2b Model Performance Analysis (As of June 15, 2026)&lt;/span&gt;

Recently, I've been analyzing the performance of various models using my automated local model benchmarking system. In particular, I've confirmed that the &lt;span class="gs"&gt;**Gemma2:2b**&lt;/span&gt; model shows a significant advantage over the &lt;span class="gs"&gt;**EXAONE**&lt;/span&gt; model, which I was using previously, in terms of Korean language processing and overall response quality.

&lt;span class="gs"&gt;**Key Observations:**&lt;/span&gt;
&lt;span class="p"&gt;
*&lt;/span&gt;   &lt;span class="gs"&gt;**Response Speed:**&lt;/span&gt; Gemma2:2b maintained a similar response speed to EXAONE while generating higher quality results.
&lt;span class="p"&gt;*&lt;/span&gt;   &lt;span class="gs"&gt;**Korean Comprehension:**&lt;/span&gt; Gemma2:2b provided much more accurate and natural answers to complex and nuanced Korean questions.
&lt;span class="p"&gt;*&lt;/span&gt;   &lt;span class="gs"&gt;**Creative Generation:**&lt;/span&gt; Gemma2:2b also scored higher in its ability to generate creative responses to given prompts.

These findings suggest that Gemma2:2b should be prioritized when building local LLM systems in the future.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Research, management, and benchmarking capabilities for local models have been significantly enhanced.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;CP949&lt;/code&gt; encoding errors encountered during benchmark execution have been completely resolved, improving system stability.&lt;/li&gt;
&lt;li&gt;It was objectively confirmed and documented that the Gemma2:2b model outperforms EXAONE.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary — To Avoid the Same Pitfalls
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] When performing file I/O in a local environment, do not rely on the operating system's default encoding (&lt;code&gt;CP949&lt;/code&gt; on Korean-locale Windows); always explicitly use &lt;code&gt;utf-8&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;[ ] When writing JSON with Python's &lt;code&gt;json.dump&lt;/code&gt;, pass &lt;code&gt;encoding='utf-8'&lt;/code&gt; to &lt;code&gt;open()&lt;/code&gt; (&lt;code&gt;json.dump&lt;/code&gt; itself takes no encoding argument) and set &lt;code&gt;ensure_ascii=False&lt;/code&gt;, so Korean text stays readable in the file and is written without encoding errors.&lt;/li&gt;
&lt;li&gt;[ ] Build automated scripts for local LLM model management and benchmarking to improve model performance and ensure efficient operation.&lt;/li&gt;
&lt;li&gt;[ ] Regularly benchmark various models, and when you discover a high-performing model, immediately document it and incorporate it into your system.&lt;/li&gt;
&lt;li&gt;[ ] When encountering errors like &lt;code&gt;UnicodeEncodeError: 'cp949' codec can't encode characters...&lt;/code&gt;, investigate not only the encoding issues of the code itself but also the entire system environment and file I/O logic.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>llm</category>
      <category>cp949</category>
      <category>ai</category>
    </item>
    <item>
      <title>Node.js Backend: Visualizing the Observer Pattern and Improving Data Processing Performance</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Sun, 14 Jun 2026 16:00:00 +0000</pubDate>
      <link>https://dev.to/junhee916/nodejs-backend-visualizing-the-observer-pattern-and-improving-data-processing-performance-3c0p</link>
      <guid>https://dev.to/junhee916/nodejs-backend-visualizing-the-observer-pattern-and-improving-data-processing-performance-3c0p</guid>
      <description>&lt;p&gt;Improving Node.js Backend: Visualizing Observer Pattern and Enhancing Data Processing Performance&lt;/p&gt;

&lt;p&gt;The observer feature's visualization and data handling were falling short, both in the user interface and in the backend logic. I tried a few approaches to fix this, and I'd like to share the process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;Initially, I focused on visualizing nationwide spread phenomena. The idea was to show the spread process by adjusting the activity time for each province using a slider. However, I realized this approach made it difficult to properly represent complex interactions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Attempt 1: Visualizing Spread (Conceptual Code)&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;visualizeSpread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;simulationData&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;timeSliderValue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;currentTimeData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;simulationData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;time&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="nx"&gt;timeSliderValue&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// Logic to visualize spread on the map based on currentTimeData&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Visualizing spread at time: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;timeSliderValue&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// ... actual visualization code ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next, I tried to implement a "conflicting intertwined chains" feature to visualize the self-reinforcing loops between the government and citizens on the ground. The idea was interesting, but I was stumped on how to structure and process the data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Attempt 2: Conflicting Intertwined Chains (Conceptual Code)&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createConflictingChains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;governmentActions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;citizenReactions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chains&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="c1"&gt;// Analyze interactions between governmentActions and citizenReactions to create chains&lt;/span&gt;
  &lt;span class="c1"&gt;// Example: Government Policy A -&amp;gt; Citizen Reaction B -&amp;gt; Government Policy C (amplified by Reaction B)&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Attempting to create conflicting chains...&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// ... actual logic ...&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;chains&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Critically, when I tried to add functionality to retroactively extract these conflicting chains and separate mega-calls, the data processing volume became unmanageable. I wasted a significant amount of time dealing with unexpected performance degradation and increased complexity. After 3 hours of struggling, I realized that simple visualization couldn't adequately capture a complex system.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cause
&lt;/h2&gt;

&lt;p&gt;Ultimately, the problem lay in the data processing and visualization methods between the user interface and the backend logic. The existing approach didn't sufficiently reflect the complexity of spread phenomena or interactions, and data processing efficiency was low. In particular, there was a lack of mechanisms needed to effectively model and visualize dynamic interactions like self-reinforcing loops.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;I reworked the user interface and backend logic to improve how the 'observer' feature visualizes and processes data. I kept the nationwide spread visualization with its provincial activity time slider and added the 'conflicting intertwined chains' feature to represent the self-reinforcing loops between the government and citizens.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Solution: Improved Data Processing and Visualization Logic (Conceptual Code)&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ObserverVisualizer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;backendService&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;backendService&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;backendService&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;visualizeSpreadOverTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;simulationId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;spreadData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;backendService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getSpreadData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;simulationId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Visualize with the provincial activity time slider using spreadData&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Visualizing spread with improved logic.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// ... actual visualization implementation ...&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;visualizeConflictingChains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;interactionData&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;processedChains&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;backendService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;processAndExtractChains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;interactionData&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// Visualize processedChains as 'conflicting intertwined chains'&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Visualizing conflicting chains and mega calls.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// ... actual visualization implementation ...&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Example of calling the actual backend service&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;backend&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BackendService&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt; &lt;span class="c1"&gt;// Actual backend service instance&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;visualizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ObserverVisualizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;backend&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Visualize nationwide spread phenomena&lt;/span&gt;
&lt;span class="nx"&gt;visualizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;visualizeSpreadOverTime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;some-simulation-id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Visualize government-citizen interactions&lt;/span&gt;
&lt;span class="nx"&gt;visualizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;visualizeConflictingChains&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;collectedInteractionData&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Furthermore, I enhanced data processing efficiency by adding functionality to retroactively extract these conflicting chains and separate mega-calls. This allowed for a clearer understanding of the dynamic interactions within complex systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Effectively visualized nationwide spread phenomena through a provincial activity time slider.&lt;/li&gt;
&lt;li&gt;Successfully implemented a visualization feature for 'conflicting intertwined chains' representing self-reinforcing loops between the government and citizens.&lt;/li&gt;
&lt;li&gt;Increased data processing efficiency by adding retroactive extraction of conflicting chains and mega-call separation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Takeaways — To Avoid the Same Pitfalls
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] When visualizing complex interactions, go beyond simple data representation and adopt modeling that can reflect the dynamic characteristics of the system.&lt;/li&gt;
&lt;li&gt;[ ] When implementing feedback mechanisms like self-reinforcing loops, thorough consideration of data structure design and processing logic must come first.&lt;/li&gt;
&lt;li&gt;[ ] For large-scale data processing, it's crucial to identify potential performance bottlenecks in advance and apply efficient algorithms and data structures.&lt;/li&gt;
&lt;li&gt;[ ] The integration between the user interface and backend logic should be achieved through clear API design and consistent data flow.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>node</category>
    </item>
    <item>
      <title>Vertex AI 'Resource exhausted' (429) API Rate Limit on a Single VM</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Sat, 13 Jun 2026 09:30:00 +0000</pubDate>
      <link>https://dev.to/junhee916/vertex-ai-resource-exhausted-429-api-rate-limit-on-a-single-vm-4pek</link>
      <guid>https://dev.to/junhee916/vertex-ai-resource-exhausted-429-api-rate-limit-on-a-single-vm-4pek</guid>
      <description>&lt;h2&gt;
  
  
  Vertex AI 'Resource exhausted' (429) API Rate Limit on a Single VM
&lt;/h2&gt;

&lt;p&gt;Building and running a full-fledged AI product, aicoreutility.com, as a solo developer on a single, modest virtual machine presents a unique set of challenges. It's a constant dance between functionality, cost, and the sheer limitations of the infrastructure. Today, I want to share a scar from this journey: a persistent 429 'Resource exhausted' error from Google Cloud's Vertex AI API that brought a critical part of my service to a halt.&lt;/p&gt;

&lt;p&gt;The symptom was simple, yet infuriating: API calls to Vertex AI were intermittently failing, returning a &lt;code&gt;429 RESOURCE_EXHAUSTED&lt;/code&gt; error. The accompanying message was equally unhelpful for a solo dev on a budget: &lt;code&gt;'Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/docs for more information.'&lt;/code&gt;. This wasn't a constant failure, which made it even harder to pin down. It would work for a while, then suddenly start failing, only to recover later. This erratic behavior suggested a rate-limiting issue, but the context of my setup made it perplexing.&lt;/p&gt;

&lt;p&gt;My initial thought process was a bit scattered. Was it a bug in my application code? Was I making too many requests in a short period? Was there a sudden surge in global traffic to Vertex AI that was impacting shared resources? Given I'm running on a single small VM, I don't have the luxury of massive parallel processing or distributed systems that might inadvertently hammer an API. My request volume, while growing, felt modest.&lt;/p&gt;

&lt;p&gt;I started by scrutinizing my own code. I checked the API client implementation, ensuring I wasn't inadvertently creating infinite loops or making redundant calls. I reviewed the logic for how I was interacting with the Vertex AI models. I added more detailed logging around every API call, capturing request payloads, response status codes, and timings. This helped confirm that the errors were indeed originating from Vertex AI itself, and the &lt;code&gt;429&lt;/code&gt; status code was consistent.&lt;/p&gt;

&lt;p&gt;The next step was to investigate the rate limits. Google Cloud documentation is extensive, but pinpointing the exact limit for my specific use case on Vertex AI, especially when running from a single VM without a dedicated, high-volume tier, was challenging. The documentation often speaks in terms of project-level quotas or per-user quotas, which felt too broad for my situation. I was operating on a very lean setup, and the idea that I was somehow exceeding limits designed for much larger applications seemed unlikely, yet the error message was undeniable.&lt;/p&gt;

&lt;p&gt;The breakthrough came when I started looking at the &lt;em&gt;timing&lt;/em&gt; and &lt;em&gt;pattern&lt;/em&gt; of the failures more closely, correlating them with my application's internal operations. I realized that the failures often occurred not during peak user activity, but during background tasks or internal processing jobs that ran on the same VM. These tasks, while not directly user-facing, were still making calls to Vertex AI.&lt;/p&gt;

&lt;p&gt;The root cause, as it turned out, was a combination of factors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Shared Resource Contention:&lt;/strong&gt; My single VM was running both the web application serving users and background AI processing tasks. Both were sharing the same outbound IP address and the same API client configuration, so from Vertex AI's side they looked like a single caller drawing on the same quota.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Quota Granularity:&lt;/strong&gt; Vertex AI's default quotas, while generous for many use cases, are still finite. Without explicit configuration for higher limits or a more robust quota management strategy, even a moderate number of concurrent requests from a single source could trigger the &lt;code&gt;429&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lack of Backoff and Retry Logic:&lt;/strong&gt; While I had some basic retry mechanisms, they weren't sophisticated enough to handle sustained rate limiting. They would retry too quickly, hitting the API again before the rate limit window had fully passed, thus perpetuating the problem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The specific incident that forced me to address this was a critical background job for processing user-uploaded documents failing repeatedly. This job was essential for providing one of the core AI features of aicoreutility.com. Seeing it fail due to an external API's rate limit, especially when I felt my usage was reasonable, was frustrating.&lt;/p&gt;

&lt;p&gt;The fix involved a multi-pronged approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Implementing Exponential Backoff with Jitter:&lt;/strong&gt; I enhanced my API client to use a more robust exponential backoff strategy. When a &lt;code&gt;429&lt;/code&gt; error is received, instead of retrying immediately, the client now waits an increasing amount of time before retrying, with a small random jitter added to prevent multiple instances from retrying at the exact same moment. This is crucial for respecting rate limits and allowing the API service to recover.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Request Throttling for Background Tasks:&lt;/strong&gt; I introduced a separate, more conservative rate limiter specifically for my background processing jobs. This ensures that these non-critical, albeit important, tasks do not consume API resources in a way that impacts real-time user requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring and Alerting:&lt;/strong&gt; I set up more granular monitoring for Vertex AI API error rates. If the &lt;code&gt;429&lt;/code&gt; errors exceed a certain threshold within a given time window, I'm now alerted. This allows me to investigate proactively rather than discovering a service outage through user complaints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exploring Quota Adjustments:&lt;/strong&gt; While not immediately implemented due to cost considerations on a small VM, I've bookmarked the process for requesting quota increases for Vertex AI if my usage continues to grow and these measures prove insufficient.&lt;/li&gt;
&lt;/ol&gt;
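
&lt;p&gt;As a rough illustration of step 1, here is a minimal TypeScript sketch of exponential backoff with a small random jitter. It is not the actual client code from aicoreutility.com: the names &lt;code&gt;backoffDelayMs&lt;/code&gt; and &lt;code&gt;withBackoff&lt;/code&gt;, the default delays, the 25% jitter ratio, and the assumption that the client library exposes the HTTP status as a numeric &lt;code&gt;code&lt;/code&gt; property on the thrown error are all hypothetical.&lt;/p&gt;

```typescript
// Hypothetical error shape: assumes the client surfaces the HTTP status as `code`.
interface ApiError {
  code?: number;
}

// Exponential delay (baseMs * 2^attempt, capped at capMs) plus up to 25% random jitter.
// The jitter is added on top of the capped delay.
function backoffDelayMs(
  attempt: number,
  baseMs: number,
  capMs: number,
  random: () => number = Math.random,
): number {
  const base = Math.min(capMs, baseMs * 2 ** attempt);
  const jitter = Math.floor(random() * base * 0.25);
  return base + jitter;
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Retries `call` only when the error looks like a 429; anything else is rethrown at once.
async function withBackoff(
  call: () => unknown,
  maxRetries = 5,
  baseMs = 500,
  capMs = 30000,
) {
  let attempt = 0;
  while (true) {
    try {
      return await call();
    } catch (err) {
      const status = (err as ApiError | null)?.code;
      if (status !== 429 || attempt >= maxRetries) throw err;
      await sleep(backoffDelayMs(attempt, baseMs, capMs));
      attempt += 1;
    }
  }
}
```

&lt;p&gt;Passing the random source as a parameter keeps the delay calculation testable, and capping the exponential term stops a long outage from producing multi-minute sleeps.&lt;/p&gt;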

&lt;p&gt;After implementing these changes, the &lt;code&gt;429 RESOURCE_EXHAUSTED&lt;/code&gt; errors significantly decreased. The background jobs now run reliably, and the core AI features remain available to users. It's a stark reminder that even with seemingly low usage, understanding and respecting external API rate limits is paramount, especially when operating on constrained infrastructure.&lt;/p&gt;
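
&lt;p&gt;The separate, more conservative limiter for background jobs could be sketched along these lines. This is an assumption-laden example rather than the production code: &lt;code&gt;MinIntervalLimiter&lt;/code&gt;, &lt;code&gt;runBackgroundCall&lt;/code&gt;, and the 2-second spacing are hypothetical, and a real budget would depend on the project's actual quota.&lt;/p&gt;

```typescript
// Hypothetical helper: spaces out background calls so user-facing requests keep headroom.
class MinIntervalLimiter {
  private nextSlotMs = 0;
  private readonly intervalMs: number;
  private readonly now: () => number;

  constructor(intervalMs: number, now: () => number = Date.now) {
    this.intervalMs = intervalMs;
    this.now = now;
  }

  // Reserves the next free slot and returns how many ms the caller should wait first.
  reserve(): number {
    const current = this.now();
    const slot = Math.max(current, this.nextSlotMs);
    this.nextSlotMs = slot + this.intervalMs;
    return slot - current;
  }
}

// Assumed budget: at most one background Vertex AI call every 2 seconds.
const backgroundLimiter = new MinIntervalLimiter(2000);

// Waits for the reserved slot, then runs the (hypothetical) background call.
async function runBackgroundCall(call: () => unknown) {
  const waitMs = backgroundLimiter.reserve();
  if (waitMs > 0) {
    await new Promise((resolve) => setTimeout(resolve, waitMs));
  }
  return call();
}
```

&lt;p&gt;Because only background jobs go through this limiter, user-facing requests are never queued behind document processing.&lt;/p&gt;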




&lt;p&gt;&lt;em&gt;...building aicoreutility.com in the open...&lt;/em&gt; &lt;a href="https://aicoreutility.com" rel="noopener noreferrer"&gt;aicoreutility.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>vertexai</category>
      <category>ratelimiting</category>
      <category>gcp</category>
      <category>aiinfra</category>
    </item>
    <item>
      <title>TypeScript TS2802 Error: Resolving Observer Pattern 'Set' Spread with Array.from Conversion</title>
      <dc:creator>박준희</dc:creator>
      <pubDate>Fri, 12 Jun 2026 16:00:01 +0000</pubDate>
      <link>https://dev.to/junhee916/typescript-ts2802-error-resolving-observer-pattern-set-spread-with-arrayfrom-conversion-2ibd</link>
      <guid>https://dev.to/junhee916/typescript-ts2802-error-resolving-observer-pattern-set-spread-with-arrayfrom-conversion-2ibd</guid>
      <description>&lt;p&gt;TypeScript Compile Error TS2802: Fixing a Set Spread in the Observer Pattern with Array.from&lt;/p&gt;

&lt;p&gt;If you're stuck implementing the observer pattern due to TypeScript compile error TS2802, this post might help. I resolved the issue with a simple conversion: changing Set spread to &lt;code&gt;Array.from()&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Attempts and Pitfalls
&lt;/h2&gt;

&lt;p&gt;While implementing the observer pattern, I encountered TypeScript compile error TS2802 when trying to spread a Set. Initially, I suspected the Set's type might be the problem, so I tried various approaches.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Observer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;subscribers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;notify&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// TS2802 error occurs here&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;callback&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I spread the Set into an array with &lt;code&gt;[...this.subscribers]&lt;/code&gt;, as shown above, the compiler rejected the line with an error similar to &lt;code&gt;TS2802: Type 'Set&amp;lt;() =&amp;gt; void&amp;gt;' can only be iterated through when using the '--downlevelIteration' flag or with a '--target' of 'es2015' or higher.&lt;/code&gt; At first, I thought it was a library configuration issue and lost a considerable amount of time chasing that.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cause
&lt;/h2&gt;

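&lt;p&gt;In the end, the problem lay with how the Set spread gets compiled, not with the Set's type. A Set is iterated through the ES2015 iterator protocol. When the compile &lt;code&gt;target&lt;/code&gt; is older than ES2015 (for example ES5) and &lt;code&gt;downlevelIteration&lt;/code&gt; is off, TypeScript can down-level &lt;code&gt;...&lt;/code&gt; and &lt;code&gt;for...of&lt;/code&gt; on arrays but not on a Set, so it reports TS2802. That is why the error appears only with certain compiler settings or environments.&lt;/p&gt;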

&lt;h2&gt;
  
  
  The Solution
&lt;/h2&gt;

&lt;p&gt;To resolve this, I replaced the Set spread with an explicit conversion to an array using &lt;code&gt;Array.from()&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Observer&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="nx"&gt;subscribers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="nf"&gt;subscribe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;notify&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Resolved by converting with Array.from&lt;/span&gt;
    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;callback&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;subscribers&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



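&lt;p&gt;&lt;code&gt;Array.from(this.subscribers)&lt;/code&gt; is an ordinary function call that returns a real array, so the &lt;code&gt;for...of&lt;/code&gt; loop iterates over an array and compiles at any target. Raising &lt;code&gt;target&lt;/code&gt; to ES2015 or later, or enabling &lt;code&gt;downlevelIteration&lt;/code&gt; in &lt;code&gt;tsconfig.json&lt;/code&gt;, would also remove the error, but &lt;code&gt;Array.from()&lt;/code&gt; is the smaller change because it leaves the build configuration alone.&lt;/p&gt;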
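
&lt;p&gt;For comparison, here is a slightly fuller sketch of the same idea. The &lt;code&gt;Listener&lt;/code&gt; alias, the unsubscribe function returned by &lt;code&gt;subscribe&lt;/code&gt;, and the &lt;code&gt;notifyLive&lt;/code&gt; method are hypothetical additions for illustration, not part of the original class. It shows two ways to walk a Set without spreading it: an &lt;code&gt;Array.from()&lt;/code&gt; snapshot and the Set's own &lt;code&gt;forEach&lt;/code&gt;.&lt;/p&gt;

```typescript
type Listener = () => void;

class Observer {
  // The element type is taken from the typed empty array passed to the constructor.
  private subscribers = new Set([] as Listener[]);

  // Returns an unsubscribe function so callers can detach themselves later.
  subscribe(callback: Listener): () => void {
    this.subscribers.add(callback);
    return () => {
      this.subscribers.delete(callback);
    };
  }

  // Option A: take a snapshot first, so callbacks added or removed
  // during this pass do not change which callbacks run in it.
  notify(): void {
    for (const callback of Array.from(this.subscribers)) {
      callback();
    }
  }

  // Option B: Set.prototype.forEach is a plain method call,
  // with no spread or for...of applied to the Set itself.
  notifyLive(): void {
    this.subscribers.forEach((callback) => callback());
  }
}
```

&lt;p&gt;The snapshot in &lt;code&gt;notify&lt;/code&gt; is the safer default when a callback might unsubscribe itself or others while notifications are running.&lt;/p&gt;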

&lt;h2&gt;
  
  
  The Outcome
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The TypeScript compile error TS2802 was cleanly resolved.&lt;/li&gt;
&lt;li&gt;The observer pattern's &lt;code&gt;notify&lt;/code&gt; method now functions as intended.&lt;/li&gt;
&lt;li&gt;I no longer have to waste time on unnecessary type-related debugging.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Summary — To Avoid the Same Pitfall
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] If you encounter TS2802 when spreading a Set in TypeScript, convert it with &lt;code&gt;Array.from()&lt;/code&gt;, or raise &lt;code&gt;target&lt;/code&gt; to ES2015 or later, or enable &lt;code&gt;downlevelIteration&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;[ ] Read the full error text and tie it to the exact expression it flags (here, the Set spread) instead of guessing at unrelated causes.&lt;/li&gt;
&lt;li&gt;[ ] Before digging into library configurations or type definitions, check whether a small, explicit change to the code itself makes the problem go away.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>typescript</category>
      <category>ts2802</category>
      <category>set</category>
      <category>arrayfrom</category>
    </item>
  </channel>
</rss>
