DEV Community

Leo Han
Leo Han

Posted on

LangChain-Core-Components-Guide

LangChain Architect's Guide: Building LLM Applications from First Principles

Author's Note: This guide is based on a 9-episode LangChain tutorial series (~5 hours total). Every slide, code demo, and architecture diagram from the videos has been analyzed frame by frame, validated against the latest LangChain API, and rewritten as a comprehensive technical reference.


Table of Contents

  1. Introduction: Why LangChain Matters
  2. Episode 1: LangChain Overview & the LLM Landscape
  3. Episode 2: Hello World & ConversationChain
  4. Episode 3: Model I/O — Prompt Engineering at Scale
  5. Episode 4: Data Connection — Teaching LLMs to Read Your Data
  6. Episode 5: Chains — The Art of Orchestration
  7. Episode 6: Agents — Autonomous LLM Reasoning
  8. Episode 7: Hands-on PDF Q&A System
  9. Episode 8: Hands-on Advanced Search Agent
  10. Episode 9: Retrospective & Best Practices
  11. Appendix: API Migration Guide & Troubleshooting

1. Introduction: Why LangChain Matters

1.1 The LLM Revolution and Its Bottlenecks

In November 2022, OpenAI released the GPT-3.5 API (text-davinci-003), and everything changed. Within months, GPT-4 arrived with its MoE (Mixture of Experts) architecture, Meta open-sourced LLaMA, and Zhipu AI launched ChatGLM. LLMs were no longer just research papers — they were programmable infrastructure.

But building applications directly on raw API calls hits four walls immediately:

Wall 1: Context Constraints. GPT-3.5 maxes out at 4,096 tokens. A 20-page PDF simply doesn't fit.

Wall 2: Capability Boundaries. An LLM is a text predictor, not an agent. It can't search the web, execute code, read files, or call external APIs.

Wall 3: Amnesia by Design. Every API call is a blank slate. State management must be built from scratch.

Wall 4: Prompt Sprawl. Prompts get scattered across dozens of files as raw strings. There's no templating, versioning, or testing.

LangChain was built specifically to tear down these four walls.

1.2 What LangChain Actually Is

LangChain is not a new LLM. It's an orchestration framework that provides standardized abstractions:

User Input → [Prompt Template] → [LLM Call] → [Output Parser] → Result
                      ↑                ↑
                  [Memory]        [Tools / APIs]
Enter fullscreen mode Exit fullscreen mode

The six-layer architecture:

Layer Module Problem It Solves
L1 Model I/O Unified interface across LLM providers
L2 Data Connection Reading external documents
L3 Chains Composing multiple LLM calls
L4 Memory Retaining conversation state
L5 Agents LLM autonomously decides which tools to use
L6 Callbacks Monitoring, logging, debugging

LangChain Architecture

1.3 The LLM Landscape

Model Provider Architecture Best For
GPT-4 OpenAI MoE, 8×220B experts Complex reasoning
GPT-3.5 OpenAI 175B Dense Price-performance
LLaMA 2 Meta 7B/13B/70B open-source Local deployment
ChatGLM Zhipu AI Chinese-English bilingual Chinese scenarios

What is MoE? GPT-4 is a collection of "expert" sub-models. For each inference, only a subset activates — like a dispatch system routing each question to the best-qualified specialists.


2. Episode 1: LangChain Overview & the LLM Landscape

Source: 1.mp4 (~30 min)

2.1 Key Questions

  1. What are LLMs? — From GPT-3 (June 2020) through GPT-3.5 API (November 2022), to GPT-4 and open-source.
  2. What is LangChain? — A Python framework for composing LLM calls like building blocks.
  3. Why use LangChain? — Raw API calls work for demos; products require engineering.

2.2 The Raw API Problem

const response = await createCompletion({
  model: "text-davinci-003",
  prompt: "Who are you?",
  temperature: 0.8,
  max_tokens: 100,
});
Enter fullscreen mode Exit fullscreen mode

Missing: prompt management, context injection, structured output, reliability, observability.

2.3 LangChain's Answer

LangChain is a framework for developing applications powered by language models.

It provides standardized abstractions above the LLM layer so you focus on business logic, not plumbing.

LLM Ecosystem Overview


3. Episode 2: Hello World & ConversationChain

Source: 2.mp4 (~48 min)

3.1 Environment Setup

pip install langchain langchain-openai langchain-community python-dotenv
Enter fullscreen mode Exit fullscreen mode

3.2 First LangChain Call

import os
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

load_dotenv()

llm = ChatOpenAI(
    model="deepseek-chat",
    temperature=0.7,
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com/v1"
)

result = llm.invoke([HumanMessage(content="Explain LangChain in one sentence.")])
print(result.content)
Enter fullscreen mode Exit fullscreen mode

Key insight: base_url enables hot-swappable model infrastructure.

3.3 Temperature: The Creativity Knob

temperature = 0.0  →  Deterministic: math, code, factual Q&A
temperature = 0.7  →  Balanced: conversation, summarization
temperature = 1.0  →  Creative: storytelling, brainstorming
Enter fullscreen mode Exit fullscreen mode

Under the hood: temperature modulates the probability distribution over vocabulary.

3.4 Jupyter Notebook for LLM Development

The Cell mechanism is uniquely suited to LLM development — iteratively build prompts, observe outputs, tune parameters.

Jupyter Example

3.5 ConversationChain: Giving LLMs Memory

from langchain.memory import ConversationBufferMemory
from langchain.chains import ConversationChain

conversation = ConversationChain(
    llm=llm,
    memory=ConversationBufferMemory(),
    verbose=True
)

conversation.predict(input="My name is Alice, I'm 25")
conversation.predict(input="What's my name and age?")
# AI: Your name is Alice and you're 25 years old.
Enter fullscreen mode Exit fullscreen mode

3.6 How Prompts Pass Information to LLMs

  • System Message: Sets behavioral boundaries ("You are a professional Python developer")
  • Human Message: User's direct input
  • AI Message: LLM's response — appended as history in multi-turn conversations

4. Episode 3: Model I/O — Prompt Engineering at Scale

Source: 3.mp4 (~31 min)

4.1 The Anti-Pattern: Raw String Concatenation

# Anti-pattern
prompt = "Translate: " + text
Enter fullscreen mode Exit fullscreen mode

4.2 PromptTemplate

from langchain_core.prompts import PromptTemplate

template = PromptTemplate.from_template(
    "You are a {role}. Translate to {target_lang}:\n{text}"
)
prompt_str = template.format(role="translator", target_lang="English", text="Hello")
Enter fullscreen mode Exit fullscreen mode

4.3 Few-Shot Prompting

from langchain_core.prompts import FewShotPromptTemplate

examples = [
    {"input": "happy", "output": "Positive"},
    {"input": "sad",   "output": "Negative"},
]

few_shot = FewShotPromptTemplate(
    examples=examples,
    example_prompt=PromptTemplate.from_template("Input: {input}\nSentiment: {output}"),
    prefix="Classify sentiment:",
    suffix="Input: {input}\nSentiment:",
    input_variables=["input"],
)
Enter fullscreen mode Exit fullscreen mode

Few-Shot vs Fine-Tuning:

Dimension Few-Shot Fine-Tuning
Cost Zero training cost Training data + GPU
Flexibility Change instantly Requires retraining
Effectiveness Format control Domain knowledge

Rule of thumb: Start with Few-Shot, fine-tune only when stable.

Prompt Templates

4.4 Example Selector

from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

selector = SemanticSimilarityExampleSelector.from_examples(
    examples=all_examples,
    embeddings=OpenAIEmbeddings(),
    vectorstore_cls=Chroma,
    k=3,
)
Enter fullscreen mode Exit fullscreen mode

How it works: 1) Embed all examples → 2) Embed user input → 3) Cosine similarity → 4) Inject top-K

4.5 Output Parsers

  • CommaSeparatedListOutputParser: CSV-style output
  • StructuredOutputParser: JSON Schema compliance
  • PydanticOutputParser: Direct Pydantic model parsing (most powerful)

5. Episode 4: Data Connection — Teaching LLMs to Read Your Data

Source: 4.mp4 (~35 min)

This is LangChain's most important module — complete RAG infrastructure.

5.1 The Core Problem

LLMs have a training cutoff. Your proprietary documents are invisible to them. RAG bridges this gap:

Step 1: Load → Step 2: Split → Step 3: Embed → Step 4: Store
Enter fullscreen mode Exit fullscreen mode

5.2 Document Loaders

from langchain_community.document_loaders import (
    PyPDFLoader, WebBaseLoader, YoutubeLoader,
    UnstructuredPowerPointLoader, TextLoader, CSVLoader,
)

loader = PyPDFLoader("report.pdf")
pages = loader.load()
# pages[0].page_content → text
# pages[0].metadata → {"source": "...", "page": 1}
Enter fullscreen mode Exit fullscreen mode

Every loader returns Document with page_content + metadata.

5.3 Text Splitters

from langchain_text_splitters import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,
    chunk_overlap=50,
    separators=["\n\n", "\n", ".", " ", ""]
)
splits = splitter.split_documents(pages)
Enter fullscreen mode Exit fullscreen mode

Why overlap? Prevents cutting sentences at chunk boundaries.

5.4 Word Embeddings: The Math of Meaning

"cat" → [0.12, -0.34, 0.56, ...]
"dog" → [0.14, -0.31, 0.58, ...]  ← close to "cat"
"car" → [-0.78, 0.45, -0.12, ...]  ← far from both
Enter fullscreen mode Exit fullscreen mode

Cosine similarity: cos(θ) = (A·B) / (|A|×|B|) — range [-1, 1]

from langchain_openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vec = embeddings.embed_query("What is RAG?")
Enter fullscreen mode Exit fullscreen mode

Model guide: text-embedding-3-small (1536d, best value), text-embedding-3-large (3072d)

5.5 Vector Stores

from langchain_community.vectorstores import FAISS

vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings())
results = vectorstore.similarity_search("MoE architecture pros and cons?", k=4)
Enter fullscreen mode Exit fullscreen mode

DB guide: FAISS (dev), Chroma (small projects), Pinecone (production), Weaviate/Qdrant (enterprise)


6. Episode 5: Chains — The Art of Orchestration

Source: 5.mp4 (~25 min)

6.1 What Is a Chain?

# Without Chain:
prompt = template.format(input=x)
response = llm.invoke(prompt)
parsed = parser.parse(response)

# With LCEL:
chain = prompt | llm | parser
result = chain.invoke({"input": x})
Enter fullscreen mode Exit fullscreen mode

6.2 Chain Types

LLMChain: Atomic unit — Prompt + LLM.

RouterChain: Auto-dispatches to specialized handlers:

Input → Router → Math? → MathChain
               → Code? → CodeChain  
               → General? → GeneralChain
Enter fullscreen mode Exit fullscreen mode

SequentialChain: Pipeline processing:

Generate Outline → Expand → Polish → Output
Enter fullscreen mode Exit fullscreen mode

TransformationChain: Post-process output (clean, translate, filter).

6.3 Document Chains (RAG Core)

Four strategies for feeding retrieved chunks to LLM:

Stuff: All chunks in one prompt. Simple, context-limited.
Map-Reduce: Process each chunk independently, then aggregate. Parallel, scalable.
Refine: Iteratively improve answer with each chunk. High quality, sequential.
Map-Rerank: Score each chunk's answer, pick best. For relevance ranking.


7. Episode 6: Agents — Autonomous LLM Reasoning

Source: 6.mp4 (~25 min)

7.1 Chains vs Agents

Chain = passive (defined workflow). Agent = active (chooses tools autonomously).

7.2 ReAct Pattern

Q: Who is Leo DiCaprio's girlfriend? Her age ^ 0.43?

Thought: Find girlfriend first
Action: Search("Leo DiCaprio girlfriend")
Observation: Vittoria Ceretti

Thought: Need her age
Action: Search("Vittoria Ceretti age")  
Observation: 26 years old

Thought: Calculate 26^0.43
Action: Calculator("26^0.43")
Observation: ~4.06

Final Answer: Vittoria Ceretti, 26. 26^0.43 ≈ 4.06
Enter fullscreen mode Exit fullscreen mode

7.3 Agent Implementation

from langchain.agents import load_tools, initialize_agent, AgentType
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="deepseek-chat", temperature=0)
tools = load_tools(["serpapi", "llm-math"], llm=llm)
agent = initialize_agent(
    tools, llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True, max_iterations=5,
)
agent.run("Who is Leo DiCaprio's girlfriend? Her age ^ 0.43?")
Enter fullscreen mode Exit fullscreen mode

7.4 Agent Types

Type Pattern Best For
Zero-shot ReAct Decide on the fly Simple tasks
Structured Chat Multi-parameter tools Complex tools
OpenAI Functions Function Calling API GPT models
Plan-and-Execute Plan then execute Multi-step tasks

7.5 Tuning

  • max_iterations: Prevent infinite loops
  • handle_parsing_errors: Retry on malformed output
  • early_stopping_method: "force" or "generate" best guess

8. Episode 7: Hands-on PDF Q&A System

Source: 7.mp4 (~38 min)

from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

loader = PyPDFLoader("report.pdf")
splits = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=100).split_documents(loader.load())
vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings(model="text-embedding-3-small"))
qa = RetrievalQA.from_chain_type(
    llm=ChatOpenAI(model="deepseek-chat", temperature=0),
    chain_type="stuff",
    retriever=vectorstore.as_retriever(search_kwargs={"k": 4}),
    return_source_documents=True,
)
result = qa.invoke({"query": "What are the key findings?"})
print(result["result"])
Enter fullscreen mode Exit fullscreen mode

chain_type: stuff (<4 chunks), map_reduce (many), refine (sequential), map_rerank (score).


9. Episode 8: Hands-on Advanced Search Agent

Source: 8.mp4 (~53 min)

from langchain.agents import tool, initialize_agent, AgentType

@tool
def get_stock_price(symbol: str) -> str:
    """Get current stock price. Input: ticker like AAPL, TSLA."""
    prices = {"AAPL": "189.30", "TSLA": "242.84"}
    return prices.get(symbol.upper(), f"Symbol {symbol} not found")

agent = initialize_agent([get_stock_price], llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True)
agent.run("What's Apple's stock price?")
Enter fullscreen mode Exit fullscreen mode

Key: tool's docstring IS what the Agent reads to decide when to use it.


10. Episode 9: Retrospective & Best Practices

Source: 9.mp4 (~20 min)

Production Checklist

Security: Never expose keys in prompts. Sandbox agent execution.
Performance: Cache embeddings. Persistent vector stores (Chroma/Pinecone).
Cost: Cheap models first. Set max_iterations. Cache deterministic responses.
Observability: LangSmith tracing. Log token consumption.

API Migration (2 Years)

Old New
langchain.llms.OpenAI langchain_openai.ChatOpenAI
llm.predict("text") llm.invoke([HumanMessage("text")])
Chain.run(input) Chain.invoke({"key": value})

11. Appendix: Troubleshooting

Error Solution
ModuleNotFoundError: langchain.llms Use langchain-openai
jupyter-lab not found Add Scripts to PATH
DeepSeek 401 Check .env API key
Agent infinite loop Improve tool docstrings, set max_iterations
FAISS OOM Switch to Chroma or Pinecone

Learning Path

Week 1: Hello World → Week 2: Prompts → Week 3: RAG → Week 4: Agent + Tools → Ongoing: Docs


Epilogue

LangChain's core philosophy — modular, composable, engineered — endures. Package names change; architectural patterns don't.

The most valuable thing LangChain offers isn't its code — it's the paradigm of composing LLM applications like building blocks.


Disclaimer: Code adapted for LangChain latest API (2025-2026). Images from original tutorial screenshots for educational reference only.

Top comments (0)