Utkarsh Rastogi

Day 4: Structured Outputs with LangChain Output Parsers

AI models love to chat, but apps need structured data. Today we're learning Output Parsers - your solution for getting clean JSON, lists, and custom formats from AI responses.

What You'll Learn:

  • JSON, List, and Custom Schema parsers
  • Converting AI text to structured data
  • Practical implementation examples

The Problem

AI gives you essays when what your app actually needs is structured data like this:

{
  "service": "AWS Lambda",
  "purpose": "Serverless compute service",
  "analogy": "Like hiring a contractor - you pay only when they work"
}

Output Parsers solve this.


What Are Output Parsers?

Output parsers act as translators between the AI's free-form text and your app's data structures:

  • JSON Parser → Clean objects
  • List Parser → Simple lists
  • Custom Schema → Exact templates
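
To see that translator role in isolation, here's a minimal sketch that skips the LLM entirely and feeds an already-generated string (assumed here just for illustration) straight into a parser:

from langchain.output_parsers import CommaSeparatedListOutputParser

# Pretend this is the raw text a model returned
raw_response = "EC2, S3, Lambda, DynamoDB"

parser = CommaSeparatedListOutputParser()
print(parser.parse(raw_response))  # ['EC2', 'S3', 'Lambda', 'DynamoDB']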

Key Concepts

PydanticOutputParser - Enforces data structure
BaseModel - Your data template
Field - Descriptions for AI guidance
get_format_instructions() - Auto-generates AI instructions
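
If you're wondering what get_format_instructions() actually produces, here's a quick sketch (the QuickExample model is just an illustration): it prints the schema-based instructions that the examples below inject into their prompts.

from pydantic import BaseModel, Field
from langchain.output_parsers import PydanticOutputParser

# A tiny throwaway model just to inspect the generated instructions
class QuickExample(BaseModel):
    service: str = Field(description="AWS service name")
    purpose: str = Field(description="Main purpose in one sentence")

parser = PydanticOutputParser(pydantic_object=QuickExample)

# Prints a block of text describing the expected JSON schema; this is what
# gets substituted into {format_instructions} in the prompts below
print(parser.get_format_instructions())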


Setup First!

Check Day 1 for environment setup.

Basic JSON Output Parser

Let's start with the most common use case - getting JSON from AI responses:

import boto3
from langchain.prompts import PromptTemplate
from langchain_aws import ChatBedrock
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

# Initialize Bedrock client
bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1')

llm = ChatBedrock(
    client=bedrock_client, 
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    model_kwargs={
        "max_tokens": 300,
        "temperature": 0.3  # Lower temperature for consistent structure
    }
)

# Define the structure we want
class ServiceInfo(BaseModel):
    service: str = Field(description="AWS service name")
    purpose: str = Field(description="Main purpose in one sentence")
    analogy: str = Field(description="Simple analogy to explain the service")

# Create parser
parser = PydanticOutputParser(pydantic_object=ServiceInfo)

# Create prompt with format instructions
prompt = PromptTemplate(
    template="""Extract information about the AWS service mentioned in this text.

{format_instructions}

Text: {text}

Return only the JSON, no additional text.""",
    input_variables=["text"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

# Build the chain
chain = prompt | llm | parser

# Test with AWS Bedrock description
bedrock_text = """
AWS Bedrock is a fully managed service that offers a choice of high-performing 
foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, 
Cohere, Meta, Stability AI, and Amazon via a single API. It provides the 
easiest way to build and scale generative AI applications with security, 
privacy, and responsible AI.
"""

print("Processing AWS Bedrock description...")
result = chain.invoke({"text": bedrock_text})

print("Structured Output:")
print(f"Service: {result.service}")
print(f"Purpose: {result.purpose}")
print(f"Analogy: {result.analogy}")
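
A quick robustness note before moving on: if the model's reply doesn't match the schema, the parser raises an exception, so in a real app it's worth catching it. A minimal sketch, reusing the chain and bedrock_text from above:

from langchain.schema import OutputParserException

try:
    result = chain.invoke({"text": bedrock_text})
    print(result)
except OutputParserException as e:
    # Malformed output from the model - log it, retry, or fall back
    print(f"Parsing failed: {e}")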



List Output Parser

Sometimes you need lists instead of objects. Here's how to extract structured lists:

from langchain.output_parsers import CommaSeparatedListOutputParser

llm = ChatBedrock(
    client=bedrock_client,
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    model_kwargs={
        "max_tokens": 300,
        "temperature": 0.3  # Lower temperature for consistent structure
    }
)

# Create list parser
list_parser = CommaSeparatedListOutputParser()

# Create prompt for extracting use cases
list_prompt = PromptTemplate(
    template="""List the main use cases for this AWS service. 

{format_instructions}

Service description: {text}

Provide exactly 4 use cases.""",
    input_variables=["text"],
    partial_variables={"format_instructions": list_parser.get_format_instructions()}
)

# Build list extraction chain
list_chain = list_prompt | llm | list_parser

bedrock_text = """
AWS Bedrock is a fully managed service that offers a choice of high-performing 
foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, 
Cohere, Meta, Stability AI, and Amazon via a single API. It provides the 
easiest way to build and scale generative AI applications with security, 
privacy, and responsible AI.
"""

print("\nExtracting use cases...")
use_cases = list_chain.invoke({"text": bedrock_text})

print("Use Cases List:")
for i, use_case in enumerate(use_cases, 1):
    print(f"{i}. {use_case.strip()}")



Custom Schema Parser

For complex applications, define exact data structures:

from typing import List
from pydantic import BaseModel, Field
import boto3
from langchain_aws import ChatBedrock
from langchain.output_parsers import PydanticOutputParser

# Initialize Bedrock client
bedrock_client = boto3.client('bedrock-runtime', region_name='us-east-1')

llm = ChatBedrock(
    client=bedrock_client,
    model_id="anthropic.claude-3-sonnet-20240229-v1:0",
    model_kwargs={
        "max_tokens": 300,
        "temperature": 0.3
    }
)

# Define comprehensive service schema
class AWSServiceAnalysis(BaseModel):
    service_name: str = Field(description="Official AWS service name")
    category: str = Field(description="Service category (compute, storage, database, etc.)")
    pricing_model: str = Field(description="How the service is priced")
    key_features: List[str] = Field(description="List of 3 main features")
    best_for: str = Field(description="What type of applications this service is best for")
    complexity_level: str = Field(description="Beginner, Intermediate, or Advanced")

# Create comprehensive parser
comprehensive_parser = PydanticOutputParser(pydantic_object=AWSServiceAnalysis)

# Create detailed analysis prompt
analysis_prompt = PromptTemplate(
    template="""Analyze this AWS service comprehensively.

{format_instructions}

Service description: {text}

Provide detailed analysis following the exact schema.""",
    input_variables=["text"],
    partial_variables={"format_instructions": comprehensive_parser.get_format_instructions()}
)

# Build comprehensive analysis chain
analysis_chain = analysis_prompt | llm | comprehensive_parser

# Test with AWS Bedrock description
bedrock_text = """
AWS Bedrock is a fully managed service that offers a choice of high-performing 
foundation models (FMs) from leading AI companies like AI21 Labs, Anthropic, 
Cohere, Meta, Stability AI, and Amazon via a single API. It provides the 
easiest way to build and scale generative AI applications with security, 
privacy, and responsible AI.
"""
print("\nPerforming comprehensive analysis...")
analysis = analysis_chain.invoke({"text": bedrock_text})

print("Comprehensive Analysis:")
print(f"Service: {analysis.service_name}")
print(f"Category: {analysis.category}")
print(f"Pricing: {analysis.pricing_model}")
print(f"Key Features:")
for feature in analysis.key_features:
    print(f"{feature}")
print(f"Best For: {analysis.best_for}")
print(f"Complexity: {analysis.complexity_level}")



Parser Comparison

PydanticOutputParser → Structured objects with validation
CommaSeparatedListOutputParser → Simple lists
OutputFixingParser → Error recovery
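
OutputFixingParser doesn't appear in the examples above, so here's a hedged sketch of the usual wiring: it wraps another parser and, if parsing fails, asks an LLM to repair the malformed output. This reuses the llm, prompt, and ServiceInfo model from the JSON example:

from langchain.output_parsers import OutputFixingParser, PydanticOutputParser

# Wrap the Pydantic parser; on a parse failure the LLM is asked to rewrite
# the bad output so it matches the ServiceInfo schema
base_parser = PydanticOutputParser(pydantic_object=ServiceInfo)
fixing_parser = OutputFixingParser.from_llm(parser=base_parser, llm=llm)

# Drop it in wherever the original parser was used
fixing_chain = prompt | llm | fixing_parser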

Key Takeaways

  • Lower temperature (0.1-0.3) for consistent outputs
  • Use Field descriptions to guide AI
  • Test with various inputs
  • Start simple, add complexity as needed
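
On the Field-descriptions takeaway, here's a small sketch of what "guiding the AI" can look like in practice (the model name and wording are just an illustration): tightening the descriptions, for example by constraining allowed values, often helps the model stay consistent without extra prompt text.

from pydantic import BaseModel, Field

class ServiceInfoStrict(BaseModel):
    service: str = Field(description="Official AWS service name, e.g. 'AWS Lambda'")
    complexity_level: str = Field(description="Exactly one of: Beginner, Intermediate, Advanced")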

About Me

Hi! I'm Utkarsh, a Cloud Specialist & AWS Community Builder who loves turning complex AWS topics into fun chai-time stories.

👉 Explore more


This is part of my "LangChain with AWS Bedrock: A Developer's Journey" series. Follow along as I document everything I learn, including the mistakes and the victories.
