Build LangChain chains once with lazy initialization
Build LangChain chains once, on demand. Guard with a None check, initialize related singletons together on the first request, and let bad config fail as a clear runtime error instead of a startup crash.
Why this matters
Building a LangChain chain is not free. Constructing a SQLDatabase opens a database connection, inspects schema, and may sample rows. Instantiating ChatOpenAI validates configuration and prepares client state. Calling create_sql_query_chain wires prompts, models, and parsers into an executable graph.
In our NL2SQL agent, initialization added several hundred milliseconds.
If you do that work at import time, a missing DB URL or API key can kill the process before health checks or structured logs help you diagnose it. If you rebuild everything per request, you pay the same setup cost on every query before the model processes a single token.
Lazy initialization avoids both problems.
The problem
Without lazy initialization, you usually end up with two bad options:
- Eager import-time loading: startup fails immediately if config is wrong.
- Per-request initialization: identical objects are rebuilt on every call, adding avoidable setup latency.
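The difference can be sketched with a config lookup standing in for chain construction (`DATABASE_URL` and `get_db_url` are illustrative names, not from the agent):

```python
import os

# Eager (import-time) version, shown commented out because it would
# raise KeyError at import if DATABASE_URL is unset, killing the
# process before logging is up:
#
#   DB_URL = os.environ["DATABASE_URL"]

# Lazy version: nothing runs at import; a missing variable surfaces on
# the first call, inside a request where it can be caught and logged.
_db_url = None

def get_db_url():
    global _db_url
    if _db_url is None:
        _db_url = os.environ["DATABASE_URL"]
    return _db_url
```

The lazy version also pays the lookup cost only once; later calls hit the cached value.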
Naive approach vs production approach
| Naive: eager import-time loading | Production: lazy singleton |
|---|---|
| ❌ Chains built when module loads | ✅ `None` placeholders declared at module level |
| ❌ Crashes on missing DB or API key | ✅ Bad config fails with a clear runtime error |
| ❌ No startup without full environment | ✅ Related singletons initialized together on the first request |
| ❌ Hard to unit-test without live DB | ✅ Modules import without a live environment |
| ❌ Config read before app logs start | ✅ Hot-path guard is effectively negligible |
How I implemented it
The NL2SQL agent keeps module-level placeholders for the database, LLM clients, and chains. _get_chain() is the single initialization point: the first call builds everything, and later calls return the cached objects.
```python
# Pseudocode. `settings`, the prompts, and the builder helpers come from
# elsewhere in the agent. Assumed imports (LangChain paths vary by version):
#   from langchain_community.utilities import SQLDatabase
#   from langchain_community.tools.sql_database.tool import QuerySQLDataBaseTool
#   from langchain.chains import create_sql_query_chain
#   from langchain_core.output_parsers import StrOutputParser

# Initialized on first request so missing DB config or bad API keys
# fail at runtime with context, not during module import.
_db = None
_llm = None
_fast_llm = None
_generate_query = None
_execute_query = None
_rephrase_answer = None
_select_table = None
_rewrite_query = None
_llm_semaphore = None


def _get_chain():
    """Return all chains, initializing them once on first use."""
    global _db, _llm, _fast_llm
    global _generate_query, _execute_query, _rephrase_answer
    global _select_table, _rewrite_query, _llm_semaphore

    # Guard: if the last-built chain exists, the rest do too.
    if _generate_query is not None:
        return (
            _generate_query,
            _execute_query,
            _rephrase_answer,
            _select_table,
            _rewrite_query,
        )

    _init_redis()  # Redis or in-memory fallback
    if _llm_semaphore is None:
        _llm_semaphore = Semaphore(settings.llm_max_concurrency)

    _db = SQLDatabase.from_uri(
        settings.database_url,
        sample_rows_in_table_info=3,
    )
    _llm = build_llm_with_fallbacks()
    _fast_llm = build_fast_llm_with_fallbacks()

    _execute_query = QuerySQLDataBaseTool(db=_db)
    _rephrase_answer = answer_prompt | _fast_llm | StrOutputParser()
    _select_table = table_prompt | _fast_llm | StrOutputParser() | split_fn
    _rewrite_query = rewrite_prompt | _fast_llm | StrOutputParser()
    # Built last on purpose: this is the guard's sentinel, so its
    # existence implies everything above it exists too.
    _generate_query = create_sql_query_chain(_llm, _db, prompt=build_prompt())

    return (
        _generate_query,
        _execute_query,
        _rephrase_answer,
        _select_table,
        _rewrite_query,
    )
```
Why check _generate_query?
_generate_query is the last object created in the initialization sequence. If it exists, the earlier objects should exist too.
That makes it a safer sentinel than _db. If initialization fails midway, _db might already be set while one or more chains are still missing. Guarding on the last-built object reduces the chance of returning a partially initialized state.
_init_redis() uses the same pattern internally: if the client already exists, return immediately. Both guards are idempotent; only the first successful call performs real work.
Concurrency note
The pattern as written is not thread-safe: if two requests enter the initialization path at the same time, they can build duplicate objects or observe partially initialized state. In multi-worker or highly concurrent environments, protect first-time initialization with a lock.
Bug story: lazy singleton without a TTL
This pattern worked well for chains, but failed when I used it for mutable data.
The entity resolver cached the players table behind a simple `None` guard and kept it for the lifetime of the process. That was fine until a new player was added mid-season after an IPL auction. The resolver kept serving the stale mapping, and lookups for the new player failed until the backend restarted.
The lesson is simple: lazy singletons are a good fit for resources that are expensive to build and effectively static for the lifetime of the process. They are a poor fit for data that changes over time.
If the underlying data can change, add a TTL or explicit invalidation. A bare None guard will cache forever.
General pattern
The same idea shows up in many languages under different names: lazy initialization, initialization-on-demand, deferred construction, or memoized setup.
The pattern is always the same:
- Declare the resource as None at the shared scope.
- Add a guard that returns the cached object if it already exists.
- Initialize once, store the result, and reuse it on later calls.
After the first successful call, the setup cost disappears from the hot path.
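In Python, the three steps can even be collapsed into a memoized function; a sketch using `functools.cache`, with the dict standing in for an expensive resource:

```python
from functools import cache

@cache
def get_resource():
    # Steps 2 and 3 are handled by the decorator: the first call builds
    # and stores the result, later calls return it from the hot path.
    return {"ready": True}  # stand-in for expensive setup
```

The trade-off is less control: `cache` has no built-in TTL or invalidation hook beyond `cache_clear()`, so the same mutable-data caveats apply.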
When to use it
Use lazy singletons for expensive, effectively immutable resources such as:
- database connections or pools
- HTTP clients
- LLM clients
- LangChain chains
- semaphores
- model weights loaded once per process
When not to use it
Do not use a bare lazy singleton for mutable data such as:
- lookup tables that can change
- feature flags
- config that may be reloaded
- caches backed by changing database rows
Those cases need TTL-based refresh, invalidation, or a different caching strategy.
Common mistakes
- Checking the wrong sentinel. Guarding on `_db` instead of the last-built chain can expose partially initialized state after a mid-init failure.
- Forgetting `global`. In Python, assigning to a module-level variable inside a function creates a local unless declared `global`. The singleton never persists, and initialization repeats on every call.
- Splitting related initialization across multiple guards. If DB, LLM, and chains are initialized in separate paths, concurrent startup can leave them out of sync. Initialize related objects together behind one guard.
- Using the pattern for mutable data. The pattern is fine for process-lifetime resources, not for data that needs refresh or invalidation.

