LangChain JsonOutputParser: Fix Malformed JSON from LLMs

#python #langchain #llm #json

LangChain's JsonOutputParser is one of the most useful tools in the LangChain ecosystem, until your LLM returns malformed JSON and the chain fails with a parsing exception.

This article was originally published at AI JSONMedic.

The Problem

LLMs are probabilistic. Even with a structured prompt, models sometimes return:

Single-quoted JSON: {'key': 'value'} instead of {"key": "value"}
Trailing commas: {"key": "value",}
Markdown code fences wrapping the JSON: ```json\n{...}\n```
Truncated responses that cut off mid-object

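JsonOutputParser raises OutputParserException whenever it cannot recover valid JSON from the output. Depending on your langchain-core version it may already tolerate some of these cases, such as code fences, but single quotes and trailing commas are not valid JSON and should be expected to fail.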
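To make the failure modes concrete, here is a minimal sketch that feeds each one to Python's strict json.loads. The sample strings and the is_valid_json helper are made up for illustration, and this shows only the standard-library parser's behavior, not JsonOutputParser's internals.

```python
import json

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Hypothetical samples mirroring the failure modes listed above.
samples = {
    "single quotes": "{'key': 'value'}",
    "trailing comma": '{"key": "value",}',
    "code fence": FENCE + 'json\n{"key": "value"}\n' + FENCE,
    "truncated": '{"key": "val',
}

def is_valid_json(text: str) -> bool:
    """Return True if the standard-library parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

# Map each failure mode to whether strict parsing accepted it.
results = {name: is_valid_json(text) for name, text in samples.items()}
for name, ok in results.items():
    print(f"{name}: {'ok' if ok else 'rejected'}")
```

Every sample is rejected by the strict parser, which is why some form of cleanup or repair is needed before parsing.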

Basic Setup

from langchain_core.output_parsers import JsonOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

class MovieReview(BaseModel):
    title: str = Field(description="Movie title")
    rating: float = Field(description="Rating from 0 to 10")
    summary: str = Field(description="One sentence summary")

parser = JsonOutputParser(pydantic_object=MovieReview)

prompt = PromptTemplate(
    template="Review this movie: {movie}\n{format_instructions}",
    input_variables=["movie"],
    partial_variables={"format_instructions": parser.get_format_instructions()},
)

model = ChatOpenAI(temperature=0)
chain = prompt | model | parser

result = chain.invoke({"movie": "Inception"})
print(result)

Making It Robust — Auto-Repair Strategy

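A cheap and reliable approach is to take the model's raw string output and run it through a repair step in place of the parser. The example uses the json-repair package (pip install json-repair):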

from json_repair import repair_json
from langchain_core.output_parsers import StrOutputParser
import json

def robust_json_parse(text: str) -> dict:
    """Parse JSON from LLM output, with auto-repair for common errors."""
    # Strip markdown code fences if present
    if text.strip().startswith("```"):
        lines = text.strip().split("\n")
        text = "\n".join(lines[1:-1] if lines[-1] == "```" else lines[1:])

    # Try direct parse first
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Auto-repair (handles single quotes, trailing commas, etc.)
    repaired = repair_json(text)
    return json.loads(repaired)

# Use with a raw string output parser + manual repair
chain = prompt | model | StrOutputParser() | robust_json_parse
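The fence-stripping step above assumes the fence is the very first thing in the output. Models sometimes add a sentence before or after the block, so a slightly more tolerant variant can help. The following is a minimal sketch using only the standard library; extract_fenced_json is a hypothetical helper name, and the regular expression is one reasonable choice rather than the only one.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Opening fence, optional language tag, newline, then the body up to the
# closing fence (or the end of the text if the closing fence is missing).
FENCE_RE = re.compile(
    FENCE + r"[a-zA-Z]*[ \t]*\n(.*?)(?:" + FENCE + r"|\Z)",
    re.DOTALL,
)

def extract_fenced_json(text: str) -> str:
    """Return the body of the first fenced block, or the stripped text if none."""
    match = FENCE_RE.search(text)
    return match.group(1).strip() if match else text.strip()

# Hypothetical model output with prose around the fenced JSON.
reply = "Here is the review:\n" + FENCE + 'json\n{"rating": 8.8}\n' + FENCE + "\nEnjoy!"
print(json.loads(extract_fenced_json(reply)))
```

The extracted string can then go through the same direct-parse and repair steps as before.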

Using OutputFixingParser

LangChain also has a built-in OutputFixingParser that uses a second LLM call to fix bad output:

from langchain.output_parsers import OutputFixingParser

fix_parser = OutputFixingParser.from_llm(
    parser=JsonOutputParser(pydantic_object=MovieReview),
    llm=ChatOpenAI(temperature=0)
)

chain = prompt | model | fix_parser

This works but uses an extra API call per failure. Use it when accuracy is critical and cost is acceptable.
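The cost profile follows from the control flow: the fixing call only happens after a parse has failed. The sketch below imitates that flow with the standard library. parse_with_fixer and fake_fixer are hypothetical names, the stand-in fixer replaces what would be a second LLM call, and none of this is LangChain's actual implementation.

```python
import json
from typing import Any, Callable

def parse_with_fixer(text: str, fixer: Callable[[str, str], str]) -> Any:
    """Parse JSON; on failure, ask `fixer` for a corrected string and parse once more."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # Only reached on bad output, so the extra call is paid per failure.
        fixed = fixer(text, str(err))
        return json.loads(fixed)

calls = []  # records every input that needed fixing

def fake_fixer(bad_text: str, error: str) -> str:
    """Stand-in for an LLM call; here it only swaps single quotes for double."""
    calls.append(bad_text)
    return bad_text.replace("'", '"')

good = parse_with_fixer('{"title": "Inception"}', fake_fixer)
bad = parse_with_fixer("{'title': 'Inception'}", fake_fixer)
print(good, bad, len(calls))
```

Valid output never touches the fixer, so the added latency and cost scale with the failure rate rather than with total traffic.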

Using RetryOutputParser

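RetryOutputParser re-sends the original prompt together with the failed completion and asks the model to try again. Because it needs the prompt, call parse_with_prompt rather than piping it into the chain:

from langchain.output_parsers import RetryOutputParser

retry_parser = RetryOutputParser.from_llm(
    parser=JsonOutputParser(pydantic_object=MovieReview),
    llm=model,
    max_retries=3,
)

# Build the prompt value once so it can be passed to both the model and the parser
prompt_value = prompt.format_prompt(movie="Inception")
completion = model.invoke(prompt_value).content
result = retry_parser.parse_with_prompt(completion, prompt_value)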

Summary

| Strategy | When to use |
| --- | --- |
| `json-repair` library | Fast, no extra API calls, handles most cases |
| `OutputFixingParser` | Critical accuracy, cost not a concern |
| `RetryOutputParser` | When the original prompt just needs re-running |
| Manual strip + parse | Code fences only, no other errors |

For quick one-off JSON repair from LLM output, AI JSONMedic handles single quotes, trailing commas, Python booleans, and truncated JSON in the browser.