Leo Han

Posted on Jun 12

LangChain-Core-Components-Guide

LangChain Architect's Guide: Building LLM Applications from First Principles

Author's Note: This guide is based on a 9-episode LangChain tutorial series (~5 hours total). Every slide, code demo, and architecture diagram from the videos has been analyzed frame by frame, validated against the latest LangChain API, and rewritten as a comprehensive technical reference.

Introduction: Why LangChain Matters
Episode 1: LangChain Overview & the LLM Landscape
Episode 2: Hello World & ConversationChain
Episode 3: Model I/O — Prompt Engineering at Scale
Episode 4: Data Connection — Teaching LLMs to Read Your Data
Episode 5: Chains — The Art of Orchestration
Episode 6: Agents — Autonomous LLM Reasoning
Episode 7: Hands-on PDF Q&A System
Episode 8: Hands-on Advanced Search Agent
Episode 9: Retrospective & Best Practices
Appendix: API Migration Guide & Troubleshooting

1. Introduction: Why LangChain Matters

1.1 The LLM Revolution and Its Bottlenecks

In November 2022, OpenAI released text-davinci-003, a GPT-3.5-series model available through its API, and everything changed. Within months, GPT-4 arrived (widely reported, though never confirmed by OpenAI, to use a Mixture of Experts architecture), Meta released LLaMA, and Zhipu AI launched ChatGLM. LLMs were no longer just research papers; they were programmable infrastructure.

But building applications directly on raw API calls hits four walls immediately:

Wall 1: Context Constraints. The original GPT-3.5 models max out at about 4,096 tokens, shared between the prompt and the completion. A 20-page PDF simply doesn't fit.

Wall 2: Capability Boundaries. An LLM is a text predictor, not an agent. It can't search the web, execute code, read files, or call external APIs.

Wall 3: Amnesia by Design. Every API call is a blank slate. State management must be built from scratch.

Wall 4: Prompt Sprawl. Prompts get scattered across dozens of files as raw strings. There's no templating, versioning, or testing.

LangChain was built specifically to tear down these four walls.

1.2 What LangChain Actually Is

LangChain is not a new LLM. It's an orchestration framework that provides standardized abstractions:

User Input → [Prompt Template] → [LLM Call] → [Output Parser] → Result
                      ↑                ↑
                  [Memory]        [Tools / APIs]

The six-layer architecture:

Layer	Module	Problem It Solves
L1	Model I/O	Unified interface across LLM providers
L2	Data Connection	Reading external documents
L3	Chains	Composing multiple LLM calls
L4	Memory	Retaining conversation state
L5	Agents	LLM autonomously decides which tools to use
L6	Callbacks	Monitoring, logging, debugging

1.3 The LLM Landscape

Model	Provider	Architecture	Best For
GPT-4	OpenAI	Reportedly MoE, 8×220B experts (unconfirmed by OpenAI)	Complex reasoning
GPT-3.5	OpenAI	Dense; size undisclosed (GPT-3 was 175B)	Price-performance
LLaMA 2	Meta	7B/13B/70B open-source	Local deployment
ChatGLM	Zhipu AI	Chinese-English bilingual	Chinese scenarios

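What is MoE? A Mixture of Experts model contains several "expert" sub-networks plus a router. For each token, only a subset of experts activates, like a dispatch system routing each question to the best-qualified specialists. GPT-4 is widely reported to work this way, though OpenAI has not confirmed it.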

2. Episode 1: LangChain Overview & the LLM Landscape

Source: 1.mp4 (~30 min)

2.1 Key Questions

What are LLMs? — From GPT-3 (June 2020) through GPT-3.5 API (November 2022), to GPT-4 and open-source.
What is LangChain? A framework (Python first, with a JavaScript/TypeScript port) for composing LLM calls like building blocks.
Why use LangChain? — Raw API calls work for demos; products require engineering.

2.2 The Raw API Problem

const response = await createCompletion({
  model: "text-davinci-003",
  prompt: "Who are you?",
  temperature: 0.8,
  max_tokens: 100,
});

Missing: prompt management, context injection, structured output, reliability, observability.

2.3 LangChain's Answer

LangChain is a framework for developing applications powered by language models.

It provides standardized abstractions above the LLM layer so you focus on business logic, not plumbing.

3. Episode 2: Hello World & ConversationChain

Source: 2.mp4 (~48 min)

3.1 Environment Setup

pip install langchain langchain-openai langchain-community python-dotenv

3.2 First LangChain Call

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

load_dotenv()

llm = ChatOpenAI(
    model="deepseek-chat",
    temperature=0.7,
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com/v1"
)

result = llm.invoke([HumanMessage(content="Explain LangChain in one sentence.")])
print(result.content)

Key insight: base_url enables hot-swappable model infrastructure.

3.3 Temperature: The Creativity Knob

temperature = 0.0  →  Deterministic: math, code, factual Q&A
temperature = 0.7  →  Balanced: conversation, summarization
temperature = 1.0  →  Creative: storytelling, brainstorming

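Under the hood: the model's logits are divided by the temperature before the softmax, so low values sharpen the probability distribution over the vocabulary and high values flatten it.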
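To make the effect concrete, here is a minimal sketch of temperature scaling in plain Python. The three logits are made-up scores for three candidate tokens, and the function name is illustrative rather than anything from LangChain or a provider SDK.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be > 0; treat 0 as greedy argmax instead")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.2)  # mass concentrates on the top token
warm = softmax_with_temperature(logits, 1.0)  # mass is spread more evenly
print(cold, warm)
```

At the lower temperature almost all probability lands on the highest-scoring token, which is why low settings suit math, code, and factual Q&A.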

3.4 Jupyter Notebook for LLM Development

The Cell mechanism is uniquely suited to LLM development — iteratively build prompts, observe outputs, tune parameters.

3.5 ConversationChain: Giving LLMs Memory

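# Note: ConversationChain and ConversationBufferMemory are deprecated in recent
# LangChain releases (the docs point to RunnableWithMessageHistory); they still show the idea.
from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain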

conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory(),
    verbose=True
)

conversation.predict(input="My name is Alice, I'm 25")
conversation.predict(input="What's my name and age?")
# AI: Your name is Alice and you're 25 years old.
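What the memory object does can be sketched without any framework: store each turn, then prepend the stored history to the next prompt. The class and function names below are hypothetical stand-ins, and the "model" is a placeholder that never calls an API.

```python
class BufferMemory:
    """Hypothetical stand-in for a conversation buffer: stores every turn verbatim."""

    def __init__(self):
        self.turns = []  # list of (speaker, text) tuples

    def render(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self.turns)

    def save(self, user_text, ai_text):
        self.turns.append(("Human", user_text))
        self.turns.append(("AI", ai_text))


def build_prompt(memory, user_text):
    """Prepend the stored history so a stateless model sees earlier turns."""
    history = memory.render()
    prefix = f"{history}\n" if history else ""
    return f"{prefix}Human: {user_text}\nAI:"


def fake_llm(prompt):
    """Placeholder model: reports the prompt length instead of calling an API."""
    return f"(saw {len(prompt)} characters of context)"


memory = BufferMemory()
first_prompt = build_prompt(memory, "My name is Alice, I'm 25")
memory.save("My name is Alice, I'm 25", fake_llm(first_prompt))
second_prompt = build_prompt(memory, "What's my name and age?")
print(second_prompt)
```

The second prompt contains the first turn, which is the only reason a stateless model can answer the follow-up question. It also shows the cost: the prompt grows with every turn.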

3.6 How Prompts Pass Information to LLMs

System Message: Sets behavioral boundaries ("You are a professional Python developer")
Human Message: User's direct input
AI Message: LLM's response — appended as history in multi-turn conversations

4. Episode 3: Model I/O — Prompt Engineering at Scale

Source: 3.mp4 (~31 min)

4.1 The Anti-Pattern: Raw String Concatenation

# Anti-pattern
prompt = "Translate: " + text

4.2 PromptTemplate

from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "You are a {role}. Translate to {target_lang}:\n{text}"
)
prompt_str = template.format(role="translator", target_lang="English", text="Hello")

4.3 Few-Shot Prompting

from langchain_core.prompts import FewShotPromptTemplate

examples = [
    {"input": "happy", "output": "Positive"},
    {"input": "sad",   "output": "Negative"},
]

few_shot = FewShotPromptTemplate(
    examples=examples,
    example_prompt=PromptTemplate.from_template("Input: {input}\nSentiment: {output}"),
    prefix="Classify sentiment:",
    suffix="Input: {input}\nSentiment:",
    input_variables=["input"],
)
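For intuition, the string such a template produces can be approximated in plain Python. This is a sketch, not LangChain's implementation; the blank-line separator between blocks is an assumption about the default, and `render_few_shot` is a hypothetical helper.

```python
examples = [
    {"input": "happy", "output": "Positive"},
    {"input": "sad", "output": "Negative"},
]

def render_few_shot(examples, user_input, separator="\n\n"):
    """Join prefix, formatted examples, and suffix into one prompt string."""
    prefix = "Classify sentiment:"
    shots = [f"Input: {ex['input']}\nSentiment: {ex['output']}" for ex in examples]
    suffix = f"Input: {user_input}\nSentiment:"
    return separator.join([prefix, *shots, suffix])

prompt_text = render_few_shot(examples, "thrilled")
print(prompt_text)
```

The prompt ends on an open "Sentiment:" line, so the model's most natural continuation is a label in the same format as the examples.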

Few-Shot vs Fine-Tuning:

Dimension	Few-Shot	Fine-Tuning
Cost	Zero training cost	Training data + GPU
Flexibility	Change instantly	Requires retraining
Effectiveness	Format control	Domain knowledge

Rule of thumb: Start with Few-Shot, and fine-tune only once the task and prompt format have stabilized.

4.4 Example Selector

from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

selector = SemanticSimilarityExampleSelector.from_examples(
    examples=all_examples,
    embeddings=OpenAIEmbeddings(),
    vectorstore_cls=Chroma,
    k=3,
)

How it works: 1) Embed all examples → 2) Embed user input → 3) Cosine similarity → 4) Inject top-K

4.5 Output Parsers

CommaSeparatedListOutputParser: CSV-style output
StructuredOutputParser: JSON output matching a list of named fields (ResponseSchema)
PydanticOutputParser: Direct Pydantic model parsing (most powerful)
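The job of a parser, turning free text into typed data and failing loudly when the model drifts, can be sketched with the standard library. The `Review` schema and both function names are hypothetical; the real parsers also generate format instructions for the prompt, which this sketch leaves out.

```python
import json
from dataclasses import dataclass

def parse_comma_list(text):
    """Split 'a, b, c' into ['a', 'b', 'c'], dropping empty items."""
    return [item.strip() for item in text.split(",") if item.strip()]

@dataclass
class Review:  # hypothetical target schema
    sentiment: str
    score: int

def parse_review(text):
    """Parse a JSON reply and fail loudly when fields are missing or mistyped."""
    data = json.loads(text)
    review = Review(sentiment=data["sentiment"], score=data["score"])
    if not isinstance(review.sentiment, str) or not isinstance(review.score, int):
        raise ValueError(f"unexpected field types in {data!r}")
    return review

colors = parse_comma_list("red, green , blue,")
review = parse_review('{"sentiment": "Positive", "score": 9}')
print(colors, review)
```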

5. Episode 4: Data Connection — Teaching LLMs to Read Your Data

Source: 4.mp4 (~35 min)

This is LangChain's most important module — complete RAG infrastructure.

5.1 The Core Problem

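LLMs have a training cutoff, and your proprietary documents were never in the training data. RAG (Retrieval-Augmented Generation) bridges this gap by indexing your documents once, then retrieving the relevant pieces at question time:

Step 1: Load → Step 2: Split → Step 3: Embed → Step 4: Store → Step 5: Retrieve → Step 6: Generate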

5.2 Document Loaders

from langchain_community.document_loaders import (
    PyPDFLoader, WebBaseLoader, YoutubeLoader,
    UnstructuredPowerPointLoader, TextLoader, CSVLoader,
)

loader = PyPDFLoader("report.pdf")
pages = loader.load()
# pages[0].page_content → text
# pages[0].metadata → {"source": "...", "page": 0}  (page numbers are 0-indexed)

Every loader returns Document with page_content + metadata.

5.3 Text Splitters

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    separators=["\n\n", "\n", ".", " ", ""]
)
splits = splitter.split_documents(pages)

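Why overlap? A splitter will sometimes cut mid-thought. Repeating the last chunk_overlap characters at the start of the next chunk means text near a boundary appears with its surrounding context in at least one chunk.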
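The overlap mechanic is easiest to see with a fixed-width sliding window. This sketch is deliberately simpler than RecursiveCharacterTextSplitter: it ignores separators and cuts purely by character count, and `split_with_overlap` is a hypothetical helper.

```python
def split_with_overlap(text, chunk_size, chunk_overlap):
    """Fixed-width sliding window; each chunk repeats the tail of the previous one."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks

text = "LangChain splits long documents into overlapping chunks."
chunks = split_with_overlap(text, chunk_size=20, chunk_overlap=5)
for chunk in chunks:
    print(repr(chunk))
```

Each chunk begins with the last five characters of the one before it, so dropping those repeated prefixes and concatenating reproduces the original text.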

5.4 Word Embeddings: The Math of Meaning

"cat" → [0.12, -0.34, 0.56, ...]
"dog" → [0.14, -0.31, 0.58, ...]  ← close to "cat"
"car" → [-0.78, 0.45, -0.12, ...]  ← far from both

Cosine similarity: cos(θ) = (A·B) / (|A|×|B|) — range [-1, 1]
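The formula translates directly into a few lines of Python. The vectors below use only the three components shown above; real embeddings have hundreds or thousands of dimensions, so treat the numbers as illustrative.

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (A . B) / (|A| * |B|)"""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Only the first three components from the illustration above
cat = [0.12, -0.34, 0.56]
dog = [0.14, -0.31, 0.58]
car = [-0.78, 0.45, -0.12]

cat_dog = cosine_similarity(cat, dog)
cat_car = cosine_similarity(cat, car)
print(f"cat vs dog: {cat_dog:.3f}, cat vs car: {cat_car:.3f}")
```

With these toy values, "cat" and "dog" score close to 1, while "cat" and "car" score below 0.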

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vec = embeddings.embed_query("What is RAG?")

Model guide: text-embedding-3-small (1536d, best value), text-embedding-3-large (3072d)

5.5 Vector Stores

from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings())
results = vectorstore.similarity_search("MoE architecture pros and cons?", k=4)

DB guide: FAISS (dev), Chroma (small projects), Pinecone (production), Weaviate/Qdrant (enterprise)

6. Episode 5: Chains — The Art of Orchestration

Source: 5.mp4 (~25 min)

6.1 What Is a Chain?

# Without a chain:
prompt_str = template.format(input=x)
response = llm.invoke(prompt_str)
parsed = parser.parse(response.content)  # chat models return a message, not a string

# With LCEL:
chain = template | llm | parser
result = chain.invoke({"input": x})
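The pipe syntax is ordinary operator overloading. A minimal sketch of the idea, assuming a hypothetical `Step` class (LangChain's real Runnable interface does much more, including batching and streaming):

```python
class Step:
    """Hypothetical minimal runnable: wraps a function and supports `|` chaining."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # a | b builds a new Step that feeds a's output into b
        return Step(lambda value: other.invoke(self.invoke(value)))


prompt = Step(lambda inputs: f"Translate to English: {inputs['input']}")
fake_llm = Step(lambda text: f"ECHO<{text}>")      # placeholder for a model call
parser = Step(lambda text: text.strip().upper())   # placeholder for an output parser

chain = prompt | fake_llm | parser
result = chain.invoke({"input": "bonjour"})
print(result)
```

Because every step exposes the same `invoke` method, any step can be swapped out without touching the others, which is the practical payoff of the composition style.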

6.2 Chain Types

LLMChain: Atomic unit, Prompt + LLM. Deprecated in current releases in favor of the LCEL form prompt | llm.

RouterChain: Auto-dispatches to specialized handlers:

Input → Router → Math? → MathChain
               → Code? → CodeChain  
               → General? → GeneralChain

SequentialChain: Pipeline processing:

Generate Outline → Expand → Polish → Output

TransformChain: Runs an arbitrary Python function on a chain's inputs or outputs (clean, translate, filter).

6.3 Document Chains (RAG Core)

Four strategies for feeding retrieved chunks to LLM:

Stuff: All chunks in one prompt. Simple, context-limited.
Map-Reduce: Process each chunk independently, then aggregate. Parallel, scalable.
Refine: Iteratively improve answer with each chunk. High quality, sequential.
Map-Rerank: Score each chunk's answer, pick best. For relevance ranking.
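The strategies differ mainly in how many model calls they make and what each call sees. A rough sketch of three of them, with a placeholder in place of a real model; the prompts and function names are illustrative, not LangChain's internals.

```python
calls = []

def fake_llm(prompt):
    """Placeholder model: records the prompt and returns a short tag."""
    calls.append(prompt)
    return f"summary#{len(calls)}"

def stuff(chunks, question):
    """One call that sees every chunk at once."""
    context = "\n".join(chunks)
    return fake_llm(f"{question}\n\n{context}")

def map_reduce(chunks, question):
    """One call per chunk (map), then one call to combine the partial answers (reduce)."""
    partials = [fake_llm(f"{question}\n\n{chunk}") for chunk in chunks]
    return fake_llm(f"{question}\n\nCombine:\n" + "\n".join(partials))

def refine(chunks, question):
    """Sequential calls; each one revises the running answer with a new chunk."""
    answer = fake_llm(f"{question}\n\n{chunks[0]}")
    for chunk in chunks[1:]:
        answer = fake_llm(f"{question}\n\nExisting answer: {answer}\nNew context: {chunk}")
    return answer

chunks = ["chunk A", "chunk B", "chunk C"]

stuff(chunks, "Key findings?")
stuff_calls = len(calls)
calls.clear()

map_reduce(chunks, "Key findings?")
map_reduce_calls = len(calls)
calls.clear()

refine(chunks, "Key findings?")
refine_calls = len(calls)

print(stuff_calls, map_reduce_calls, refine_calls)
```

For three chunks this makes 1, 4, and 3 calls respectively. The map calls are independent and can run in parallel, while refine's calls must run in order.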

7. Episode 6: Agents — Autonomous LLM Reasoning

Source: 6.mp4 (~25 min)

7.1 Chains vs Agents

Chain = passive (defined workflow). Agent = active (chooses tools autonomously).

7.2 ReAct Pattern

Q: Who is Leo DiCaprio's girlfriend? Her age ^ 0.43?

Thought: Find girlfriend first
Action: Search("Leo DiCaprio girlfriend")
Observation: Vittoria Ceretti

Thought: Need her age
Action: Search("Vittoria Ceretti age")  
Observation: 26 years old

Thought: Calculate 26^0.43
Action: Calculator("26^0.43")
Observation: ~4.06

Final Answer: Vittoria Ceretti, 26. 26^0.43 ≈ 4.06
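The loop behind this trace can be sketched in a few lines. Everything here is a stand-in: the "LLM" is a scripted policy, the search tool is a lookup table holding the values from the trace above, and the calculator only understands base^exponent.

```python
def search(query):
    """Hypothetical search tool backed by a fixed lookup table."""
    canned = {
        "Leo DiCaprio girlfriend": "Vittoria Ceretti",
        "Vittoria Ceretti age": "26",
    }
    return canned.get(query, "no result")

def calculator(expression):
    """Hypothetical calculator tool that only understands 'base^exponent'."""
    base, exponent = expression.split("^")
    return str(round(float(base) ** float(exponent), 2))

TOOLS = {"Search": search, "Calculator": calculator}

def scripted_policy(observations):
    """Stand-in for the LLM: picks the next action from what it has seen so far."""
    if len(observations) == 0:
        return ("Search", "Leo DiCaprio girlfriend")
    if len(observations) == 1:
        return ("Search", f"{observations[0]} age")
    if len(observations) == 2:
        return ("Calculator", f"{observations[1]}^0.43")
    return ("Final Answer", f"{observations[0]}, {observations[1]}. Result: {observations[2]}")

def run_agent(max_iterations=5):
    """Thought/Action/Observation loop with a hard cap on iterations."""
    observations = []
    for _ in range(max_iterations):
        action, action_input = scripted_policy(observations)
        if action == "Final Answer":
            return action_input
        observations.append(TOOLS[action](action_input))
    return "Stopped: iteration limit reached"

answer = run_agent()
print(answer)
```

A real agent replaces `scripted_policy` with a model call whose text output is parsed into an action, which is where malformed output and runaway loops come from. The iteration cap is the safety net.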

7.3 Agent Implementation

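import os
from langchain.agents import initialize_agent, AgentType
from langchain_community.agent_toolkits.load_tools import load_tools
from langchain_openai import ChatOpenAI

# DeepSeek is OpenAI-compatible, so ChatOpenAI needs the same api_key/base_url as in 3.2
llm = ChatOpenAI(
    model="deepseek-chat", temperature=0,
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com/v1",
)
# "serpapi" expects SERPAPI_API_KEY in the environment
tools = load_tools(["serpapi", "llm-math"], llm=llm)
# initialize_agent is a legacy helper (deprecated in recent releases) but still shows the idea
agent = initialize_agent(
    tools, llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True, max_iterations=5,
)
result = agent.invoke({"input": "Who is Leo DiCaprio's girlfriend? Her age ^ 0.43?"})
print(result["output"])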

7.4 Agent Types

Type	Pattern	Best For
Zero-shot ReAct	Decide on the fly	Simple tasks
Structured Chat	Multi-parameter tools	Complex tools
OpenAI Functions	Function Calling API	GPT models
Plan-and-Execute	Plan then execute	Multi-step tasks

7.5 Tuning

max_iterations: Prevent infinite loops
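handle_parsing_errors: Feed malformed LLM output back as an observation so the agent can retry instead of raising
early_stopping_method: "force" returns a fixed stop message at the limit; "generate" asks the LLM for one final best-guess answer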

8. Episode 7: Hands-on PDF Q&A System

Source: 7.mp4 (~38 min)

import os
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA

loader = PyPDFLoader("report.pdf")
splits = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100).split_documents(loader.load())
# OpenAIEmbeddings reads OPENAI_API_KEY; the chat model below uses the DeepSeek key instead
vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings(model="text-embedding-3-small"))
llm = ChatOpenAI(
    model="deepseek-chat", temperature=0,
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com/v1",
)
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),
    return_source_documents=True,
)
result = qa.invoke({"query": "What are the key findings?"})
print(result["result"])
# result["source_documents"] holds the retrieved chunks

chain_type: stuff (a few chunks that fit in one context window), map_reduce (many chunks), refine (sequential, quality over speed), map_rerank (score each answer, keep the best).

9. Episode 8: Hands-on Advanced Search Agent

Source: 8.mp4 (~53 min)

from langchain.agents import initialize_agent, AgentType
from langchain_core.tools import tool

@tool
def get_stock_price(symbol: str) -> str:
    """Get current stock price. Input: ticker like AAPL, TSLA."""
    prices = {"AAPL": "189.30", "TSLA": "242.84"}  # hard-coded demo data
    return prices.get(symbol.upper(), f"Symbol {symbol} not found")

# llm is the ChatOpenAI instance configured in section 7.3
agent = initialize_agent([get_stock_price], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.invoke({"input": "What's Apple's stock price?"})

Key: the tool's docstring is what the agent reads to decide when to use it, so write it as an instruction to the model, not a note to yourself.
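To see why the docstring matters, consider how a tool description might be assembled for the agent's prompt. This is a sketch with a hypothetical `describe_tool` helper built on the standard `inspect` module; the exact text LangChain generates may differ.

```python
import inspect

def describe_tool(func):
    """Build the one-line description an agent prompt could list for a tool."""
    signature = inspect.signature(func)
    doc = inspect.getdoc(func) or "No description provided."
    return f"{func.__name__}{signature} - {doc}"

def get_stock_price(symbol: str) -> str:
    """Get current stock price. Input: ticker like AAPL, TSLA."""
    prices = {"AAPL": "189.30", "TSLA": "242.84"}  # hard-coded demo data
    return prices.get(symbol.upper(), f"Symbol {symbol} not found")

description = describe_tool(get_stock_price)
print(description)
```

The model never sees the function body, only a line like this one. A vague or missing docstring leaves it guessing when the tool applies and what input to pass.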

10. Episode 9: Retrospective & Best Practices

Source: 9.mp4 (~20 min)

Production Checklist

Security: Never expose keys in prompts. Sandbox agent execution.
Performance: Cache embeddings. Persistent vector stores (Chroma/Pinecone).
Cost: Cheap models first. Set max_iterations. Cache deterministic responses.
Observability: LangSmith tracing. Log token consumption.

API Migration (2 Years)

Old	New
`langchain.llms.OpenAI`	`langchain_openai.OpenAI` (completion models) or `langchain_openai.ChatOpenAI` (chat models)
`llm.predict("text")`	`llm.invoke([HumanMessage("text")])`
`Chain.run(input)`	`Chain.invoke({"key": value})`

11. Appendix: Troubleshooting

Error	Solution
`ModuleNotFoundError: langchain.llms`	Install `langchain-openai` and import from `langchain_openai`
`jupyter-lab` not found	Add Scripts to PATH
DeepSeek 401	Check `.env` API key
Agent infinite loop	Improve tool docstrings, set max_iterations
FAISS OOM	Switch to Chroma or Pinecone

Learning Path

Week 1: Hello World → Week 2: Prompts → Week 3: RAG → Week 4: Agent + Tools → Ongoing: Docs

Epilogue

LangChain's core philosophy — modular, composable, engineered — endures. Package names change; architectural patterns don't.

The most valuable thing LangChain offers isn't its code — it's the paradigm of composing LLM applications like building blocks.

Disclaimer: Code adapted for LangChain latest API (2025-2026). Images from original tutorial screenshots for educational reference only.