The "Happy Path" Problem
If you look at the documentation for most AI agent frameworks (LangChain, AutoGPT, CrewAI), they all share a dangerous assumption: abundant connectivity.
They assume your API calls to OpenAI will always succeed. They assume your WebSocket will never drop. They assume your user has stable 5G.
But I build software for Lagos, Nigeria. Here, power flickers, fiber cuts happen, and latency is a physical constraint, not an edge case. When I tried deploying standard agentic workflows here, they didn't just fail; they failed catastrophically. Users lost data, workflows hallucinated, and API credits were wasted on timeouts.
I call this the "Agentic Gap": the massive divide between how AI works in a demo video in San Francisco and how it works in a resource-constrained environment.
We Need "Contextual Engineering"
I spent the last year re-architecting how we build these systems. I call the approach Contextual Engineering. It's not about making models smarter; it's about making the system around them resilient.
Here are two architectural patterns I built to fix this, which you can use in your own Python projects today.
Pattern 1: The "Sync-Later" Queue
Most agents use a synchronous User -> LLM -> Response loop. If the network dies in the middle, the context is lost.
Instead, we treat every user intent as a Transaction.
- Serialize the Intent: When a user prompts the agent, we don't hit the API immediately. We serialize the request and store it in a local SQLite queue.
- Cryptographic Signing: We sign the request to ensure integrity (a minimal signing sketch follows this list).
- Opportunistic Sync: A background worker checks for connectivity (ping/heartbeat). Only when $N(t) = 1$ (network is available) do we flush the queue (a worker sketch appears at the end of this pattern).
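The framework's exact signing scheme isn't shown in this post, but a minimal sketch using an HMAC over the serialized payload might look like this (the key handling and helper names are illustrative assumptions):

```python
import hashlib
import hmac
import json

# Illustrative only: in production, load the key from secure storage,
# not a module-level constant.
SECRET_KEY = b"replace-with-a-device-specific-secret"

def sign_intent(payload: dict) -> str:
    """HMAC-SHA256 over the canonical JSON form of the payload."""
    canonical = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(SECRET_KEY, canonical, hashlib.sha256).hexdigest()

def verify_intent(payload: dict, signature: str) -> bool:
    """Constant-time check that a queued action wasn't tampered with on disk."""
    return hmac.compare_digest(sign_intent(payload), signature)
```

Verifying the signature right before the flush means a corrupted or tampered row never makes it to the API.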
The Python Implementation
Instead of a direct `requests.post`, we use a local buffer. Here is the logic from the open-source framework:
```python
import sqlite3
import uuid

def queue_action(user_input, intent_type):
    # 1. Create a transaction ID
    tx_id = str(uuid.uuid4())

    # 2. Store locally first (Offline-First)
    conn = sqlite3.connect('agent_state.db')
    cursor = conn.cursor()
    cursor.execute(
        "CREATE TABLE IF NOT EXISTS pending_actions "
        "(id TEXT PRIMARY KEY, input TEXT, intent TEXT, status TEXT)"
    )
    cursor.execute(
        "INSERT INTO pending_actions (id, input, intent, status) "
        "VALUES (?, ?, ?, 'PENDING')",
        (tx_id, user_input, intent_type)
    )
    conn.commit()
    conn.close()

    # 3. Try to sync (if online). check_connectivity() and sync_manager
    # are provided by the framework's networking layer.
    if check_connectivity():
        sync_manager.flush()
    else:
        print(f"Network down. Action {tx_id} queued for later.")
    return tx_id
```
This ensures Zero Data Loss. The user can keep working, and the agent "catches up" when the internet comes back.
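For completeness, here is a minimal sketch of what that background worker could look like. The heartbeat URL, polling interval, and `process_action` callback are illustrative assumptions, not the framework's exact API:

```python
import sqlite3
import time

import requests

HEARTBEAT_URL = "https://example.com/ping"  # illustrative endpoint

def check_connectivity(timeout: float = 2.0) -> bool:
    """Cheap heartbeat: can we reach the outside world right now?"""
    try:
        requests.head(HEARTBEAT_URL, timeout=timeout)
        return True
    except requests.RequestException:
        return False

def flush_pending(process_action, poll_interval: float = 15.0):
    """Background loop: when N(t) = 1, replay queued actions oldest-first."""
    while True:
        if check_connectivity():
            conn = sqlite3.connect('agent_state.db')
            rows = conn.execute(
                "SELECT id, input FROM pending_actions WHERE status = 'PENDING'"
            ).fetchall()
            for tx_id, user_input in rows:
                process_action(tx_id, user_input)  # e.g. the real LLM call
                conn.execute(
                    "UPDATE pending_actions SET status = 'DONE' WHERE id = ?",
                    (tx_id,)
                )
                conn.commit()
            conn.close()
        time.sleep(poll_interval)
```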
Pattern 2: The Hybrid Inference Router
Why route a simple "Hello" or "Summarize this text" to GPT-4? It's slow, expensive, and depends on a stable internet connection.
I implemented a Router Logic Gate that inspects the prompt before it leaves the device.
- Low Complexity? → Route to a local SLM (like Llama-3-8B or Phi-2) running on-device. (Cost: $0, Latency: Low).
- High Complexity? → Route to the Cloud (GPT-4o).
The decision function looks like this:
```python
# The Routing Logic
if network_is_down() or complexity < threshold:
    model = "Local Llama-3 (8B)"  # Free, Fast, Offline
else:
    model = "GPT-4o"  # Smart, Costly, Online
```
This simple check saved us about 40-60% on API costs and made the application feel "instant" for basic tasks, even on 3G networks.
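How you score complexity is up to you. A crude but serviceable heuristic, sketched below, weighs prompt length against reasoning keywords; the weights, keyword list, and function names are my own illustrative assumptions, not the framework's actual scorer:

```python
# Illustrative heuristic; the open-source framework may score complexity differently.
REASONING_KEYWORDS = ("analyze", "compare", "plan", "refactor", "prove", "multi-step")

def estimate_complexity(prompt: str) -> float:
    """Return a 0.0-1.0 score: longer prompts and reasoning verbs push it up."""
    length_score = min(len(prompt.split()) / 200.0, 1.0)  # saturates at ~200 words
    keyword_hits = sum(kw in prompt.lower() for kw in REASONING_KEYWORDS)
    keyword_score = min(keyword_hits / 3.0, 1.0)
    return 0.6 * length_score + 0.4 * keyword_score

def route(prompt: str, online: bool, threshold: float = 0.5) -> str:
    """Pick a model tier: fall back to the local SLM when offline or the task is simple."""
    if not online or estimate_complexity(prompt) < threshold:
        return "llama-3-8b-local"  # free, fast, offline
    return "gpt-4o"                # smart, costly, online
```

In practice, `route("Hello!", online=True)` stays on-device, while a long multi-step planning prompt crosses the threshold and goes to the cloud.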
The "Contextual Engineering" Framework
These patterns aren't just hacks; they are part of a broader discipline I'm trying to formalize called Contextual Engineering. It's about building AI that respects the Contextual Tuple $C = \{I, K, R\}$: Infrastructure, Knowledge (Culture), and Regulation.
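To make the tuple concrete, here is one hypothetical way to carry it through a codebase (the field choices and example values are my own illustration, not the framework's types):

```python
from dataclasses import dataclass

@dataclass
class Context:
    """The Contextual Tuple C = {I, K, R} as plain data."""
    infrastructure: dict  # I: connectivity, power reliability, device class
    knowledge: dict       # K: languages, cultural norms, local knowledge
    regulation: dict      # R: data-residency and compliance constraints

# Example: a deployment profile for a user on intermittent Lagos 3G.
lagos_profile = Context(
    infrastructure={"network": "3g-intermittent", "power": "unstable"},
    knowledge={"languages": ["en", "yo", "pcm"]},
    regulation={"data_residency": "NG"},
)
```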
I’ve open-sourced the entire reference architecture. It includes the routing logic, the SQLite queue wrappers, and the "Constitutional Sentinel" for safety.
Where to find the code
I want to see more engineers building specifically for the Global South. You can find the full Python implementation here:
The Deep Dive (Free Book)
For those who want the math and the full architectural theory, I also wrote a 90-page reference manuscript titled "Contextual Engineering: Architectural Patterns for Resilient AI." It covers the full "Agentic Gap" theory and detailed diagrams.
📖 Download the PDF (Open Access)
Let me know in the comments: How do you handle network flakes in your LLM apps?