Every time you start a RAG project in Python, three names come up almost immediately: LlamaIndex, LangChain, and Haystack. They're all described as "RAG frameworks," but once you actually install them, the design philosophies diverge significantly.
This week I ran all three through a temporary sandbox under identical conditions — pip install, InMemory retrieval, code complexity measurements. I also hit an unexpected deprecation warning that anyone starting a new LangChain project needs to know about.
Test environment and installation
# Versions tested
llama-index-core: 0.14.22
langchain: 1.3.4
langchain-core: 1.4.1
haystack-ai: 2.30.0
# Environment
Python 3.12.8 / venv / macOS
A few things came up during installation. After installing llama-index-core, importing it immediately warns you that llama-index-llms-openai is missing. Running without an LLM requires using MockEmbedding or similar stub implementations.
LangChain threw a deprecation warning the moment I installed langchain-community. I'll cover that in detail below. Haystack installed cleanly with no warnings.
No API keys required — I used InMemory stores throughout. BM25 retrieval (Haystack), MockEmbedding (LlamaIndex), and FakeEmbeddings (LangChain) let you validate the full pipeline structure without spending on model calls.
LlamaIndex 0.14 — abstraction-first design
LlamaIndex needs the fewest lines for a working RAG pipeline. VectorStoreIndex.from_documents() handles splitting, embedding, and indexing in one call.
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
Settings.llm = OpenAI(model="gpt-4o-mini")
Settings.embed_model = OpenAIEmbedding(model="text-embedding-3-small")
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("What is the main topic?")
print(response)
That's 10 meaningful lines. My code complexity measurement landed at exactly 10.
The Settings global object is the centerpiece. Set the LLM and embedding model once, and every component that follows picks them up automatically. Convenient — but if you have multiple pipelines needing different models, that global state becomes a problem.
Testing without an API key
from llama_index.core.embeddings.mock_embed_model import MockEmbedding
from llama_index.core.node_parser import SentenceSplitter
Settings.embed_model = MockEmbedding(embed_dim=128)
Settings.llm = None # No LLM needed for indexing
splitter = SentenceSplitter(chunk_size=512, chunk_overlap=50)
index = VectorStoreIndex.from_documents(docs, transformations=[splitter])
retriever = index.as_retriever(similarity_top_k=2)
results = retriever.retrieve("What is VectorStoreIndex?")
Persistence creates five separate files: docstore.json, default__vector_store.json, index_store.json, image__vector_store.json, graph_store.json. More files than the other two frameworks, but each has a clear purpose.
93 top-level modules live in llama_index.core. When you're not sure what you need, it's in there somewhere. SubQuestionQueryEngine, RouterQueryEngine, RecursiveRetriever — advanced retrieval patterns are implemented and ready to use.
Strengths and weaknesses
Strengths:
- Lowest code count: 10 lines for complete RAG
- Rich document loaders: 80+ readers (PDF, DOCX, HTML, images, video)
- Advanced retrieval patterns: SubQuestion, HyDE, RAG Fusion built in
- Dense documentation and examples for every use case
Weaknesses:
-
Settingsglobal state can conflict across multiple pipelines in the same process - Deep abstraction makes debugging non-obvious failures harder
- 28 core dependencies pull in packages you may never use
LangChain 1.3 — LCEL and a deprecation you need to know about
LangChain builds chains using LCEL (LangChain Expression Language) — a pipe-operator syntax that connects components together.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_chroma import Chroma
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
llm = ChatOpenAI(model="gpt-4o-mini")
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
vectorstore = Chroma.from_documents(documents, embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})
prompt = ChatPromptTemplate.from_template(
"Context: {context}\n\nQuestion: {question}"
)
chain = (
{"context": retriever, "question": RunnablePassthrough()}
| prompt
| llm
| StrOutputParser()
)
result = chain.invoke("What is the main topic?")
The | pipe operator is genuinely Pythonic. Every component implements the Runnable interface, so .invoke(), .stream(), and .batch() work identically across the chain. That uniformity is a real practical benefit.
The langchain-community deprecation
Then I ran into this during testing:
DeprecationWarning: `langchain-community` is being sunset and is
no longer actively maintained. See
https://github.com/langchain-ai/langchain-community/issues/674
for details and migration guidance toward standalone integration packages.
langchain-community is being deprecated. Not just a soft warning — the package that handled hundreds of integrations is being sunset in favor of standalone packages.
What was:
from langchain_community.vectorstores import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings
Is now:
pip install langchain-chroma
pip install langchain-huggingface
from langchain_chroma import Chroma
from langchain_huggingface import HuggingFaceEmbeddings
For existing codebases, everything still runs. For new projects, skip langchain-community entirely and install integration packages individually. If your codebase is heavy on from langchain_community import ..., now is a good time to plan a migration.
As of 1.3.4, the package breakdown:
-
langchain-core1.4.1 — Runnable interface, base abstractions (actively maintained) -
langchain1.3.4 — chain implementations (actively maintained) -
langchain-community0.4.2 — deprecated, migrating to standalone packages
I covered this ecosystem in Python AI agent library comparison 2026 — LangChain's integration breadth is still unmatched. The community sunset is turbulence, not a death sentence.
InMemory test
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_core.embeddings import FakeEmbeddings
from langchain_core.runnables import RunnableLambda
fake_embeddings = FakeEmbeddings(size=128)
vectorstore = InMemoryVectorStore.from_documents(splits, fake_embeddings)
retriever = vectorstore.as_retriever(search_kwargs={"k": 2})
results = retriever.invoke("LCEL pipe operator")
# Result: LangChain 1.3 uses LCEL (LangChain Expression Language) with pipe operator.
InMemoryVectorStore lives in langchain-core — no extra packages needed for basic testing.
Strengths and weaknesses
Strengths:
- LCEL pipe syntax is intuitive and Pythonic
- Largest ecosystem: most integrations, community resources, Stack Overflow answers
- LangSmith, LangServe ecosystem for tracing and serving
- Lightest core dependency footprint (9 packages)
Weaknesses:
- langchain-community deprecated — new projects need to avoid it
- Rapid API churn between versions causes compatibility surprises
- Deep stack traces make error messages hard to parse
Haystack 2.30 — explicit graphs and YAML serialization
Haystack is the most explicit of the three. Every component gets add_component(), every connection gets connect().
from haystack import Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.preprocessors import DocumentCleaner, DocumentSplitter
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever
from haystack.components.builders import PromptBuilder
from haystack.components.generators import OpenAIGenerator
template = """
Context:
{% for doc in documents %}{{ doc.content }}{% endfor %}
Question: {{ question }}
Answer:
"""
document_store = InMemoryDocumentStore()
p = Pipeline()
p.add_component("cleaner", DocumentCleaner())
p.add_component("splitter", DocumentSplitter(split_by="word", split_length=200))
p.add_component("retriever", InMemoryBM25Retriever(document_store=document_store))
p.add_component("prompt_builder", PromptBuilder(template=template))
p.add_component("llm", OpenAIGenerator(model="gpt-4o-mini"))
p.connect("cleaner.documents", "splitter.documents")
p.connect("retriever", "prompt_builder.documents")
p.connect("prompt_builder", "llm.prompt")
More lines, but the data flow is readable. connect("retriever", "prompt_builder.documents") tells you exactly how data moves through the system.
Live BM25 retrieval test
I ran BM25 retrieval without any API key:
store = InMemoryDocumentStore()
store.write_documents([
Document(content="LlamaIndex excels at document-centric RAG with rich query abstractions."),
Document(content="LangChain LCEL is best for chaining multiple AI components with pipe syntax."),
Document(content="Haystack provides production-grade type-safe pipelines with YAML serialization."),
Document(content="For beginners, LlamaIndex has the lowest learning curve. For teams, Haystack wins."),
])
p = Pipeline()
p.add_component("retriever", InMemoryBM25Retriever(document_store=store, top_k=2))
result = p.run({"retriever": {"query": "which framework is best for teams"}})
# Score: 5.307 | For beginners, LlamaIndex has the lowest learning curve. For teams, Haystack wins.
# Score: 4.755 | Haystack provides production-grade type-safe pipelines with YAML serialization.
BM25 picked up "teams" correctly and ranked the relevant document first.
YAML serialization in practice
yaml_str = p.dumps()
with open("rag_pipeline.yaml", "w") as f:
f.write(yaml_str)
# Later, load and run
from haystack import Pipeline
loaded_p = Pipeline.loads(yaml_str)
The output looks like this:
components:
cleaner:
init_parameters:
ascii_only: false
remove_empty_lines: true
remove_extra_whitespaces: true
type: haystack.components.preprocessors.document_cleaner.DocumentCleaner
connections:
- receiver: splitter.documents
sender: cleaner.documents
A YAML-serialized pipeline means you can version configuration separately from code, let infra teams modify pipeline parameters without Python, and deploy config changes without code reviews. That's not a feature you care about until you really need it — then you'll wish you had it from day one.
Type safety actually works
# Try to connect incompatible types
p.connect("retriever", "prompt_builder.wrong_input")
# PipelineConnectError:
# Cannot connect 'retriever.documents' (list[Document]) to
# 'prompt_builder.wrong_input' — no such input
Incorrect connections fail at build time, not execution time. This catches integration bugs before they become production incidents.
Strengths and weaknesses
Strengths:
- Pipeline YAML serialization for production config management
- Type safety: connection errors caught before execution
- Explicit graph makes data flow readable directly from code
- deepset's enterprise support, Haystack Cookbook, Hayhooks
Weaknesses:
- More boilerplate: same functionality requires roughly 2x LlamaIndex's code
- Steeper initial setup and learning curve
- Smaller ecosystem than LangChain
Quantitative comparison
Numbers I measured directly:
| Metric | LlamaIndex 0.14 | LangChain 1.3 | Haystack 2.30 |
|---|---|---|---|
| RAG pipeline code (meaningful lines) | 10 | 18 | 18 |
| Core dependencies | 28 | 9 | 19 |
| Top-level public modules | 93 | — | — |
| Pipeline serialization | 5 JSON files | None (default) | Single YAML |
| InMemory testing | MockEmbedding | FakeEmbeddings | BM25 live run |
| Type safety | Weak | Medium | Strong |
| Notable warnings | Missing llms-openai | community deprecated | None |
Vector store integration status
All three support the major vector stores, but the integration paths differ:
LlamaIndex: Separate packages per store:
pip install llama-index-vector-stores-qdrant
pip install llama-index-vector-stores-chroma
LangChain: Moving from langchain-community to standalone packages:
pip install langchain-chroma # ✅ standalone (recommended)
pip install langchain-qdrant # ✅ standalone (recommended)
# from langchain_community.vectorstores import Chroma # ⚠️ deprecated
Haystack: Official integrations via haystack-integrations:
pip install chroma-haystack
pip install qdrant-haystack
I benchmarked Qdrant, ChromaDB, and pgvector directly in the Vector DB comparison 2026. Whichever framework you choose, the vector store decision is a separate concern that matters a lot.
Decision guide
No single right answer, but here's my decision framework.
Choose LlamaIndex when:
- You need a prototype running today
- Document-centric RAG (PDF, DOCX, HTML parsing) is the core use case
- Advanced retrieval patterns like sub-queries and recursive retrieval are needed
- Small team or solo developer where speed-to-prototype matters most
Choose LangChain when:
- You already have a LangChain codebase (migration cost is real)
- LCEL's functional chaining style fits your team's patterns
- You're invested in LangSmith or other LangChain ecosystem tooling
- If you have
langchain-communitydependencies, audit them and migrate now
Choose Haystack when:
# Pipeline config lives in YAML, deployed separately from code
pipeline = Pipeline.load("production_rag.yaml")
result = pipeline.run({"retriever": {"query": query}})
- Pipeline config needs to be managed as production infrastructure
- Type safety and explicit data flow matter to your team
- Non-engineers need to modify pipeline behavior via config files
- Medium to large teams planning for long-term maintenance
Community and ecosystem
Rough GitHub star counts (June 2026):
- LangChain: 90,000+
- LlamaIndex: 35,000+
- Haystack: 18,000+
LangChain dominates on ecosystem size. But numbers aren't everything. Haystack is backed by deepset as a full-time engineering team, which means consistent maintenance and enterprise support. LlamaIndex grew rapidly through 2024 and has an active community. All three are shipping updates regularly.
My take
If I were choosing from scratch today, I'd go with Haystack. As pipelines grow more complex, explicit structure makes maintenance easier. The connect() declarations feel verbose initially, but six months in, when you're debugging a production issue at 2am, you'll appreciate being able to read the data flow directly from the code. The YAML serialization feels like overkill early on — then, when your infra and code change cycles diverge, you'll wish you had it from the start.
LlamaIndex is genuinely fast for prototypes. 10 lines to a working RAG is a real advantage. But without understanding what's happening under the abstraction, debugging gets hard when things break in unexpected ways.
LangChain's langchain-community situation makes me cautious about starting new projects with it right now. Existing codebases are fine to maintain. For greenfield projects, I'd want to see how the community-to-standalone migration lands before making a long-term bet.
If you're planning to extend beyond RAG into agent orchestration, the LangGraph, CrewAI, Dapr agent framework comparison gives the context for the next connected decision.
All three frameworks update frequently. These version numbers will be outdated before long. That's why checking each framework's CHANGELOG regularly matters more than any comparison article.
Test environment: Python 3.12.8, llama-index-core 0.14.22, langchain 1.3.4, haystack-ai 2.30.0. As of 2026-06-06. All retrieval tested without API keys using InMemory stores.

Top comments (0)