<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: NEXADiag Nexa</title>
    <description>The latest articles on DEV Community by NEXADiag Nexa (@nexadiag_nexa_312a4b5f603).</description>
    <link>https://dev.to/nexadiag_nexa_312a4b5f603</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3876562%2F2ff991c0-67fb-4f37-8301-458ceffbd8a9.png</url>
      <title>DEV Community: NEXADiag Nexa</title>
      <link>https://dev.to/nexadiag_nexa_312a4b5f603</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/nexadiag_nexa_312a4b5f603"/>
    <language>en</language>
    <item>
      <title>Why JSON.parse() Fails Silently on Truncated LLM Responses (And What I Did About It)</title>
      <dc:creator>NEXADiag Nexa</dc:creator>
      <pubDate>Wed, 13 May 2026 11:41:17 +0000</pubDate>
      <link>https://dev.to/nexadiag_nexa_312a4b5f603/why-jsonparse-fails-silently-on-truncated-llm-responses-and-what-i-did-about-it-3681</link>
      <guid>https://dev.to/nexadiag_nexa_312a4b5f603/why-jsonparse-fails-silently-on-truncated-llm-responses-and-what-i-did-about-it-3681</guid>
      <description>&lt;h1&gt;
  
  
  Why JSON.parse() Fails Silently on Truncated LLM Responses (And What I Did About It)
&lt;/h1&gt;

&lt;p&gt;If you've shipped anything that asks an LLM to return JSON, you've already hit this bug. You just may not have noticed.&lt;/p&gt;

&lt;p&gt;The LLM returns a response. Your code parses it. Most of the time it works. Sometimes you get back &lt;code&gt;[]&lt;/code&gt; and assume the LLM didn't find anything. The reality is darker: the JSON was truncated mid-object, the parser raised, a broad exception handler swallowed the error, and your downstream code is now operating on an empty list instead of the partial result the LLM actually produced.&lt;/p&gt;

&lt;p&gt;I lost six weeks to this bug. Here's what I learned.&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;I run code review with multiple LLMs in parallel. Each one returns a JSON array of issues found:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;[
  {"file": "main.py", "line": 47, "type": "security", "severity": "high", "description": "..."},
  {"file": "main.py", "line": 89, "type": "smell", "severity": "low", "description": "..."}
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;When the LLM hits its &lt;code&gt;max_tokens&lt;/code&gt; limit mid-response, the response gets cut off. You receive something like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;[
  {"file": "main.py", "line": 47, "type": "security", "severity": "high", "description": "..."},
  {"file": "main.py", "line": 89, "type": "smell", "seve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;json.loads()&lt;/code&gt; raises &lt;code&gt;JSONDecodeError&lt;/code&gt;. Most code catches the exception and returns &lt;code&gt;[]&lt;/code&gt;. The issues that WERE successfully emitted before the truncation are lost.&lt;/p&gt;

&lt;h2&gt;
  
  
  The dumb solution that actually works
&lt;/h2&gt;

&lt;p&gt;You don't need a streaming JSON parser. You need a bracket-counting repair function:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import json


def guard_truncation(text: str, provider_id: str, file_path: str) -&amp;gt; str:
    # provider_id and file_path are not used in this excerpt
    stripped = text.strip()
    if not stripped.startswith("["):
        return text  # not a JSON array, leave it untouched

    try:
        json.loads(stripped)
        return text  # already valid
    except json.JSONDecodeError:
        pass

    # find the last closing brace, i.e. the end of the last complete object
    last_close = stripped.rfind("}")
    if last_close == -1:
        return "[]"  # no complete object survived

    # drop everything after that brace and close the array
    repaired = stripped[: last_close + 1] + "\n]"
    try:
        json.loads(repaired)
        return repaired
    except json.JSONDecodeError:
        return "[]"  # repair failed, fall back to an empty array
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It's not elegant. It works. You recover 80-90% of the partial result instead of 0%.&lt;/p&gt;

&lt;h2&gt;
  
  
  The second bug that this revealed
&lt;/h2&gt;

&lt;p&gt;Here's where it gets worse.&lt;/p&gt;

&lt;p&gt;My downstream code assumed every entry in the parsed list was a dictionary. Most of the time it was. But occasionally an LLM would return a string entry in the middle of the array:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;[
  {"file": "main.py", "line": 47, ...},
  "I noticed there might be an issue here but I'm not sure",
  {"file": "main.py", "line": 89, ...}
]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;My code called &lt;code&gt;entry.get("file")&lt;/code&gt; on every entry. When it hit the string, it raised &lt;code&gt;AttributeError: 'str' object has no attribute 'get'&lt;/code&gt;. The exception was caught by a try/except too wide to be useful. The entire scan silently produced empty results for that file.&lt;/p&gt;

&lt;p&gt;Six weeks. No error log. The only signal was "the report has fewer issues than usual for this codebase".&lt;/p&gt;

&lt;p&gt;The fix:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;for entry in raw_issues:
    if not isinstance(entry, dict):
        continue
    # safe to call entry.get(...) here
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Three lines. That's it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bigger lesson
&lt;/h2&gt;

&lt;p&gt;I don't think LLM output should ever be trusted to match a schema. Even when you tell it "return valid JSON only", you'll get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Truncated JSON when you hit token limits&lt;/li&gt;
&lt;li&gt;Strings injected mid-array as informal commentary&lt;/li&gt;
&lt;li&gt;Wrong types in correct keys (&lt;code&gt;line: "approximately 50"&lt;/code&gt; instead of &lt;code&gt;line: 50&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Extra keys not in your schema&lt;/li&gt;
&lt;li&gt;Missing required keys&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The temptation is to use Pydantic or a JSON schema validator and reject malformed responses entirely. That's the worst possible choice: you lose all the partial work the LLM did. The better choice is to repair what you can, type-check defensively at every step, and log what you couldn't recover so you can iterate.&lt;/p&gt;

&lt;p&gt;Three patterns that have saved me from similar bugs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Always &lt;code&gt;isinstance(x, dict)&lt;/code&gt; before &lt;code&gt;.get()&lt;/code&gt; on LLM-derived data. Always.&lt;/li&gt;
&lt;li&gt;Bracket-repair truncated JSON before declaring failure. 80% recovery beats 0%.&lt;/li&gt;
&lt;li&gt;Log what you discarded. If you silently filter bad entries, you'll never know how often it happens. I now log every malformed entry with the provider name and file path.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why this matters in 2026
&lt;/h2&gt;

&lt;p&gt;Most teams treat LLM output as "either it works or it doesn't". The reality is closer to "it partially works most of the time, and the partial-failure modes are silent". Production code that consumes LLM output needs to be more paranoid than production code that talks to a normal API, because LLMs don't have HTTP status codes. They have a single channel that mixes intent, format, and content.&lt;/p&gt;

&lt;p&gt;I built my entire scanning workflow around the assumption that any single LLM response will be 5-10% broken. That assumption has been a better friend than any prompt engineering trick.&lt;/p&gt;

&lt;p&gt;What's your experience? Anyone else burned by silent truncation, or am I the last one to notice?&lt;/p&gt;
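
&lt;p&gt;For anyone who wants a starting point, the defensive steps described above (type-check each entry, coerce what can be coerced, log what gets dropped) could look roughly like the sketch below. It is not the code from my scanner: &lt;code&gt;normalize_issues&lt;/code&gt;, &lt;code&gt;REQUIRED_KEYS&lt;/code&gt; and the &lt;code&gt;llm_guard&lt;/code&gt; logger name are hypothetical, and the required keys are simply assumed from the sample payload.&lt;/p&gt;

```python
import logging

logger = logging.getLogger("llm_guard")  # hypothetical logger name

# Assumed from the sample payload; adjust to your own schema.
REQUIRED_KEYS = ("file", "line", "type", "severity")


def normalize_issues(raw_issues, provider_id, file_path):
    """Keep usable issue dicts; log and drop everything else."""
    kept = []
    if not isinstance(raw_issues, list):
        logger.warning("%s %s: expected a list, got %s",
                       provider_id, file_path, type(raw_issues).__name__)
        return kept

    for entry in raw_issues:
        # Commentary strings and other non-dict entries are logged, then skipped.
        if not isinstance(entry, dict):
            logger.warning("%s %s: dropped non-dict entry %r",
                           provider_id, file_path, entry)
            continue

        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            logger.warning("%s %s: dropped entry missing %s",
                           provider_id, file_path, missing)
            continue

        # Coerce "47" to 47; reject values like "approximately 50".
        try:
            line = int(entry["line"])
        except (TypeError, ValueError):
            logger.warning("%s %s: dropped entry with bad line %r",
                           provider_id, file_path, entry["line"])
            continue

        # Extra keys are kept as-is; only "line" is rewritten.
        kept.append({**entry, "line": line})
    return kept
```

&lt;p&gt;Each bad entry costs one log line and one lost issue, not the whole response, which is the trade-off the article argues for.&lt;/p&gt;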

</description>
      <category>ai</category>
      <category>llm</category>
      <category>programming</category>
      <category>python</category>
    </item>
    <item>
      <title>Stop copy-pasting AI code: The 6-step validation checklist for devs.</title>
      <dc:creator>NEXADiag Nexa</dc:creator>
      <pubDate>Wed, 15 Apr 2026 14:42:58 +0000</pubDate>
      <link>https://dev.to/nexadiag_nexa_312a4b5f603/stop-copy-pasting-ai-code-the-6-step-validation-checklist-for-devs-5g3l</link>
      <guid>https://dev.to/nexadiag_nexa_312a4b5f603/stop-copy-pasting-ai-code-the-6-step-validation-checklist-for-devs-5g3l</guid>
      <description>&lt;p&gt;It is impossible to be 100% certain that a tool or code generated by an LLM (like ChatGPT, Claude, etc.) is bug-free. LLMs are text predictors: they generate code that looks correct, but they do not "compile" or execute the code internally. Consequently, they can invent functions that do not exist (hallucinations) or make subtle logic errors.&lt;/p&gt;

&lt;p&gt;However, you can achieve a very high level of confidence by following a rigorous validation method. Here are the essential steps:&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Code Review (Never just copy-paste)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Have the code explained:&lt;/strong&gt; Ask the LLM: "Explain this function to me line by line." If the explanation is logically sound, that is a good sign.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check the business logic:&lt;/strong&gt; Does the tool do exactly what you want, or did it simplify the problem to provide a faster answer?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch for LLM "habits":&lt;/strong&gt; LLMs tend to use popular libraries even if they aren't the best fit, or they might ignore error handling (try/catch).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  2. Edge Case Testing
&lt;/h2&gt;

&lt;p&gt;This is where LLMs fail most often. A tool might work perfectly with normal data but crash with unusual data. Test for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Empty inputs:&lt;/strong&gt; What happens if you provide nothing?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extreme values:&lt;/strong&gt; A negative number where it should be positive? A text string of 10,000 characters?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Special characters:&lt;/strong&gt; Accents, emojis, or HTML tags (&lt;code&gt;&amp;lt;script&amp;gt;&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wrong format:&lt;/strong&gt; If the tool expects a date (DD/MM/YYYY), what happens if you type "Monday"?&lt;/li&gt;
&lt;/ul&gt;
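
&lt;p&gt;As a rough illustration of the cases listed above, here is a minimal sketch that feeds edge cases to a date parser. &lt;code&gt;parse_date&lt;/code&gt; and &lt;code&gt;rejects&lt;/code&gt; are hypothetical stand-ins for whatever function the LLM generated for you; the point is the list of inputs, not the parser itself.&lt;/p&gt;

```python
from datetime import date, datetime


def parse_date(text):
    """Hypothetical function under test: parse DD/MM/YYYY or raise ValueError."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("empty input")
    # strptime raises ValueError for anything that is not a real DD/MM/YYYY date
    return datetime.strptime(text.strip(), "%d/%m/%Y").date()


def rejects(value):
    """Return True if parse_date refuses the value with a ValueError."""
    try:
        parse_date(value)
    except ValueError:
        return True
    return False


# Empty input, wrong format, impossible date, oversized string, accented character.
EDGE_CASES = ["", "   ", "Monday", "31/02/2024", "x" * 10_000, "12/é/2024"]

if __name__ == "__main__":
    for case in EDGE_CASES:
        print(repr(case[:20]), "rejected" if rejects(case) else "ACCEPTED")
```

&lt;p&gt;If any of these inputs crashes with something other than the error you expect, or is quietly accepted, you have found the kind of bug this step is meant to catch.&lt;/p&gt;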

&lt;h2&gt;
  
  
  3. Dependency Validation
&lt;/h2&gt;

&lt;p&gt;LLMs sometimes invent package names or use obsolete functions.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Verify that every &lt;code&gt;import&lt;/code&gt; (Python), &lt;code&gt;require&lt;/code&gt; (Node.js), or &lt;code&gt;using&lt;/code&gt; (C#) corresponds to an actual, existing library.&lt;/li&gt;
&lt;li&gt;Check that the library version is compatible with your environment.&lt;/li&gt;
&lt;/ul&gt;
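
&lt;p&gt;For Python, a first pass at this check can be scripted. The sketch below uses the standard library (&lt;code&gt;ast&lt;/code&gt; and &lt;code&gt;importlib.util.find_spec&lt;/code&gt;) to list imported top-level modules that are not installed in the current environment. &lt;code&gt;missing_imports&lt;/code&gt; is a hypothetical helper name, and the check only tells you a module is absent locally; it cannot tell you whether a package of that name on a public index is the legitimate one.&lt;/p&gt;

```python
import ast
import importlib.util


def missing_imports(source):
    """Return top-level modules imported by `source` that are not installed locally."""
    names = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped
            names.add(node.module.split(".")[0])
    # find_spec returns None when a top-level module cannot be found
    return sorted(name for name in names if importlib.util.find_spec(name) is None)


if __name__ == "__main__":
    generated = "import json\nimport totally_made_up_pkg_xyz\nfrom os import path\n"
    print(missing_imports(generated))
```

&lt;p&gt;Anything this reports is worth looking up by hand before you install it.&lt;/p&gt;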

&lt;h2&gt;
  
  
  4. Use Automated Tools (Don't do everything manually)
&lt;/h2&gt;

&lt;p&gt;Run the LLM's code through real development tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Linters:&lt;/strong&gt; Tools like ESLint (JavaScript), Pylint (Python), or Ruff detect syntax errors and poor practices.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Type Checkers:&lt;/strong&gt; If you use TypeScript, the compiler will catch many silent errors (e.g., passing a string to a function expecting a number); for Python with type hints, a type checker such as mypy or Pyright does the same job.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ask the LLM to write unit tests:&lt;/strong&gt; Ask: "Write unit tests (using Jest, pytest, etc.) for this code including nominal and edge cases," then execute those tests.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  5. Security Check (Crucial)
&lt;/h2&gt;

&lt;p&gt;Never trust an LLM with security.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Check for hardcoded passwords or API keys in the script.&lt;/li&gt;
&lt;li&gt;If the tool interacts with a database, ensure it is protected against SQL injection by using parameterized queries.&lt;/li&gt;
&lt;li&gt;If the tool takes user input, ensure the data is sanitized before being displayed or processed.&lt;/li&gt;
&lt;/ul&gt;
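
&lt;p&gt;The last two checks above can be illustrated with the Python standard library. This is a minimal sketch, assuming an in-memory SQLite database and a made-up &lt;code&gt;users&lt;/code&gt; table; &lt;code&gt;find_user&lt;/code&gt; and &lt;code&gt;render_greeting&lt;/code&gt; are hypothetical names. What to look for in LLM-generated code is the &lt;code&gt;?&lt;/code&gt; placeholder in place of string concatenation, and an escaping call before user input reaches HTML.&lt;/p&gt;

```python
import html
import sqlite3

# Assumed setup for the example: an in-memory database with one table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))


def find_user(conn, name):
    # The ? placeholder makes the driver treat `name` as data, never as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()


def render_greeting(name):
    # html.escape neutralises markup characters before the value reaches a page.
    return "Hello, " + html.escape(name)


if __name__ == "__main__":
    print(find_user(conn, "alice"))
    print(find_user(conn, "alice' OR '1'='1"))  # injection attempt matches nothing
```

&lt;p&gt;If the generated code builds its query with an f-string or &lt;code&gt;+&lt;/code&gt; instead, treat that as a bug to fix before anything else.&lt;/p&gt;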

&lt;h2&gt;
  
  
  6. Cross-Checking Technique (Pitting LLMs against each other)
&lt;/h2&gt;

&lt;p&gt;If you have doubts about a complex piece of code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Take the code generated by ChatGPT.&lt;/li&gt;
&lt;li&gt;Open Claude or Gemini and ask: "Here is code generated by an AI. Find the bugs, security flaws, or performance issues."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;LLMs have different biases. An error that goes unnoticed by one is often caught by another.&lt;/p&gt;




&lt;p&gt;A note: this checklist is partly automated in &lt;a href="https://nexaverify.netlify.app" rel="noopener noreferrer"&gt;NexaVerify&lt;/a&gt;, the multi-LLM consensus scanner I'm building. Step 6 (LLM cross-checking) is its core mechanic. Free tier on Gumroad if you want to try it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>devops</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
