<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Tobi Lekan Adeosun</title>
    <description>The latest articles on DEV Community by Tobi Lekan Adeosun (@tflux2011).</description>
    <link>https://dev.to/tflux2011</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3736286%2F955e019a-972d-4ddd-94e0-731d9bdd9326.jpeg</url>
      <title>DEV Community: Tobi Lekan Adeosun</title>
      <link>https://dev.to/tflux2011</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tflux2011"/>
    <language>en</language>
    <item>
      <title>Why Merging AI Models Fails (And How a 'Gossip Handshake' Fixed It)</title>
      <dc:creator>Tobi Lekan Adeosun</dc:creator>
      <pubDate>Sat, 07 Mar 2026 06:54:38 +0000</pubDate>
      <link>https://dev.to/tflux2011/why-merging-ai-models-fails-and-how-a-gossip-handshake-fixed-it-3gef</link>
      <guid>https://dev.to/tflux2011/why-merging-ai-models-fails-and-how-a-gossip-handshake-fixed-it-3gef</guid>
      <description>&lt;p&gt;The Problem: AI is Too Centralized&lt;br&gt;
Right now, the "AI Arms Race" is happening in giant data centers. But what happens in a rural village in Africa, or a high-security office with no internet? These communities need to share knowledge between their local AI models without a central server.&lt;/p&gt;

&lt;p&gt;I spent the last few months researching Decentralized Knowledge Sharing. The goal: could two different AI "experts", say an Agronomy Expert and a Veterinary Expert, combine their knowledge into a single system?&lt;/p&gt;

&lt;p&gt;The "Common Sense" Failure: Weight-Space Merging&lt;br&gt;
The current trend in AI is called Weight-Space Merging (like TIES-Merging). It basically tries to "average" the math of two models to create a single super-model.&lt;/p&gt;

&lt;p&gt;I tested this, and the results were catastrophic.&lt;/p&gt;

&lt;p&gt;When I merged a model that knew how to fix tractors with a model that knew how to treat cattle, the resulting "merged" model scored below random chance. It didn't just forget; it got confused. It tried to apply tractor repair logic to sick cows.&lt;/p&gt;

&lt;p&gt;I call this the Specialization Paradox: The smarter your individual AI models get, the harder they are to merge.&lt;/p&gt;

&lt;p&gt;The Solution: The Gossip Handshake Protocol&lt;br&gt;
Instead of trying to smash two brains together, I built the Gossip Handshake.&lt;/p&gt;

&lt;p&gt;Instead of merging weights, we:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Gossip: Devices discover each other via Bluetooth (BLE) and swap tiny ~50MB "LoRA adapters" (knowledge packets).&lt;/li&gt;
&lt;li&gt;Handshake: The device stores these adapters in a local library.&lt;/li&gt;
&lt;li&gt;Route: When you ask a question, a lightweight Semantic Router picks the right expert for the job.&lt;/li&gt;
&lt;/ol&gt;
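&lt;p&gt;The routing step is the heart of the protocol. As a rough illustration of the idea (not the repository's actual code), here is a minimal keyword-overlap router; a real Semantic Router would compare embeddings instead, and the expert names and vocabularies below are hypothetical:&lt;/p&gt;

```python
# Hypothetical expert "library": adapter name -> representative vocabulary.
# A production Semantic Router would use embedding similarity instead.
EXPERTS = {
    "agronomy_adapter": {"soil", "crop", "maize", "fertilizer", "tractor", "harvest"},
    "veterinary_adapter": {"cattle", "cow", "vaccine", "symptom", "livestock", "calf"},
}

def route(question: str) -> str:
    """Pick the adapter whose vocabulary best overlaps the question."""
    words = set(question.lower().split())
    scores = {name: len(words.intersection(vocab)) for name, vocab in EXPERTS.items()}
    best = max(scores, key=scores.get)
    # Fall back to the base model if no expert matches at all
    return best if scores[best] > 0 else "base_model"

print(route("My cow has a strange symptom after calving"))  # veterinary_adapter
print(route("When should I apply fertilizer to my crop?"))  # agronomy_adapter
```

Because the routing decision is just a set intersection (or a single embedding lookup), switching between experts stays cheap even as the local library grows.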

&lt;p&gt;The Results: 13x Better Performance&lt;br&gt;
I ran this on Apple Silicon (M-series) using the Qwen2.5 model family (0.5B and 1.5B parameters).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Configuration&lt;/th&gt;
&lt;th&gt;Agronomy&lt;/th&gt;
&lt;th&gt;Veterinary&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Overall Score&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Baseline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Standalone Expert&lt;/td&gt;
&lt;td&gt;68.0%&lt;/td&gt;
&lt;td&gt;92.0%&lt;/td&gt;
&lt;td&gt;80.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Standard Merge&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;TIES-Merging (d=0.5)&lt;/td&gt;
&lt;td&gt;20.0%&lt;/td&gt;
&lt;td&gt;8.0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;14.0%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Our Approach&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Gossip Handshake&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;64.0%&lt;/td&gt;
&lt;td&gt;92.0%&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;78.0%&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The gap is massive. By simply switching instead of merging, we achieved a 5.6x to 13x leap in performance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why This Matters for Digital Sovereignty&lt;/strong&gt;&lt;br&gt;
This isn't just about better scores; it's about Sovereignty.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Zero Internet: This protocol works in "Zero-G" (zero-connectivity) zones.&lt;/li&gt;
&lt;li&gt;Privacy: Your raw data never leaves your device. Only the "math" (the adapter) is shared.&lt;/li&gt;
&lt;li&gt;Scalable: You can add 100 experts to a single phone, and it only takes milliseconds to switch between them.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Try it Yourself (Open Source)&lt;br&gt;
I've open-sourced the entire pipeline. You can generate the synthetic data, train the adapters, and run the Gossip Protocol on your own laptop.&lt;/p&gt;

&lt;p&gt;👉 GitHub Repository: &lt;a href="https://github.com/tflux2011/gossip-handshake" rel="noopener noreferrer"&gt;https://github.com/tflux2011/gossip-handshake&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Final Thoughts&lt;br&gt;
We need to stop trying to force AI into a "one size fits all" box. The future of AI is Modular, Decentralized, and Local.&lt;/p&gt;

&lt;p&gt;I’d love to hear from you: Have you tried merging LoRA adapters? What were your results? Let’s discuss in the comments!&lt;/p&gt;

</description>
      <category>machinelearning</category>
      <category>ai</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Stop Trusting Your AI Agents: How to Build a "Constitutional Sentinel"</title>
      <dc:creator>Tobi Lekan Adeosun</dc:creator>
      <pubDate>Sat, 21 Feb 2026 19:37:45 +0000</pubDate>
      <link>https://dev.to/tflux2011/stop-trusting-your-ai-agents-how-to-build-a-constitutional-sentinel-1kcg</link>
      <guid>https://dev.to/tflux2011/stop-trusting-your-ai-agents-how-to-build-a-constitutional-sentinel-1kcg</guid>
      <description>&lt;p&gt;In my last post, I wrote about why "Always-Online" AI agents fail in the real world and how to build an offline-first architecture.&lt;/p&gt;

&lt;p&gt;But solving the connectivity problem introduces a much scarier one: Autonomous Risk. When an AI agent operates offline or at the edge, it makes decisions without immediate human oversight. LLMs are notoriously "confident idiots": they will happily generate code that grants isAdmin=true to a guest user, or confidently drop a database table because they misunderstood a prompt.&lt;/p&gt;

&lt;p&gt;If you are building Agentic workflows, you cannot just hook an LLM directly to your execution environment. You need a middleman.&lt;/p&gt;

&lt;p&gt;In my Contextual Engineering framework, we call this the Constitutional Sentinel.&lt;/p&gt;

&lt;p&gt;What is a Constitutional Sentinel?&lt;br&gt;
A Sentinel is a deterministic safety layer (hardcoded logic) that wraps around your probabilistic AI agent. Before the agent is allowed to execute any tool_call or API request, the Sentinel intercepts the payload, evaluates it against a set of hard constraints (the "Constitution"), and decides whether to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Allow the execution.&lt;/li&gt;
&lt;li&gt;Block the execution and return an error to the agent so it can try again.&lt;/li&gt;
&lt;li&gt;Escalate to a Human-in-the-Loop (HITL).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Implementation (Python)&lt;br&gt;
Here is a simplified look at how to implement a Sentinel pattern to catch dangerous agent actions before they execute.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class ConstitutionalSentinel:
    def __init__(self):
        # Hardcoded constraints the AI is NEVER allowed to break
        self.banned_actions = ["drop_table", "delete_user", "grant_admin"]
        self.max_spending_limit = 50.00

    def evaluate_action(self, agent_proposed_action, payload):
        """
        Intercepts the agent's decision BEFORE execution.
        """
        print(f"🔍 Sentinel Intercept: Evaluating '{agent_proposed_action}'...")

        # 1. Check for universally banned actions
        if agent_proposed_action in self.banned_actions:
            return self._block(f"Action '{agent_proposed_action}' violates core safety constitution.")

        # 2. Check context-specific constraints (e.g., financial limits)
        if agent_proposed_action == "issue_refund":
            amount = payload.get("amount", 0)
            if amount &amp;gt; self.max_spending_limit:
                return self._escalate_to_human(agent_proposed_action, amount)

        # 3. If it passes all checks, allow execution
        return self._allow()

    def _block(self, reason):
        print(f"❌ BLOCKED: {reason}")
        # Return context back to the LLM so it can correct its mistake
        return {"status": "blocked", "feedback": reason}

    def _escalate_to_human(self, action, context):
        print(f"⚠️ ESCALATED: Human approval required for {action} ({context})")
        return {"status": "pending_human_review"}

    def _allow(self):
        print("✅ ALLOWED: Action passed constitutional checks.")
        return {"status": "approved"}


# --- Example Usage in your Agent Loop ---
sentinel = ConstitutionalSentinel()

# The AI Agent decides it wants to grant admin access based on a user prompt
proposed_action = "grant_admin"
payload = {"user_id": "9942"}

# The Sentinel intercepts it
decision = sentinel.evaluate_action(proposed_action, payload)

if decision["status"] == "approved":
    execute_tool(proposed_action, payload)  # your framework's tool dispatcher
else:
    print("Execution halted. Agent must rethink or wait for human.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why "Green Checkmarks" Are Dangerous&lt;br&gt;
Without a Sentinel, your tests might pass because the AI successfully generated the correct JSON structure for the API call. But structurally correct doesn't mean logically safe.&lt;/p&gt;

&lt;p&gt;The Sentinel shifts your architecture from "Assuming the AI is right" to "Assuming the AI is a liability." It forces the system to prove its safety deterministically.&lt;/p&gt;
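&lt;p&gt;One way to make that proof concrete is to unit-test the constitution itself. The sketch below re-declares a trimmed-down Sentinel so it runs standalone (the full class above has more detail) and asserts all three outcomes deterministically:&lt;/p&gt;

```python
# Trimmed-down Sentinel for a standalone test; it mirrors the pattern
# above, not the exact class from any particular repository.
class MiniSentinel:
    BANNED = {"drop_table", "delete_user", "grant_admin"}
    MAX_REFUND = 50.00

    def evaluate(self, action, payload):
        if action in self.BANNED:
            return {"status": "blocked"}
        if action == "issue_refund" and payload.get("amount", 0) > self.MAX_REFUND:
            return {"status": "pending_human_review"}
        return {"status": "approved"}

# Deterministic checks: these pass or fail identically on every run,
# unlike probabilistic "the LLM usually behaves" testing.
s = MiniSentinel()
assert s.evaluate("grant_admin", {})["status"] == "blocked"
assert s.evaluate("issue_refund", {"amount": 500})["status"] == "pending_human_review"
assert s.evaluate("issue_refund", {"amount": 10})["status"] == "approved"
print("constitution holds")
```

Because the constraints are hardcoded, this test suite is a real safety proof for the rules it covers, not a green checkmark on JSON structure.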

&lt;p&gt;The Full Blueprint&lt;br&gt;
The Constitutional Sentinel is just one piece of the Contextual Engineering architecture.&lt;/p&gt;

&lt;p&gt;If you want to see how this Sentinel integrates with the Sync-Later Queue and the Hybrid Router to build resilient, offline-first AI for low-resource environments, I’ve open-sourced the complete reference manuscript.&lt;/p&gt;

&lt;p&gt;You can download the full PDF on Zenodo for free (it recently crossed 200 downloads by other builders!):&lt;br&gt;
👉 &lt;a href="https://zenodo.org/records/18005435" rel="noopener noreferrer"&gt;https://zenodo.org/records/18005435&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s stop building agents that just "work," and start building agents we can actually trust.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>architecture</category>
      <category>security</category>
    </item>
    <item>
      <title>Building Offline-First AI Agents: Why "Always-Online" Architectures Fail in the Real World</title>
      <dc:creator>Tobi Lekan Adeosun</dc:creator>
      <pubDate>Wed, 28 Jan 2026 03:21:18 +0000</pubDate>
      <link>https://dev.to/tflux2011/building-offline-first-ai-agents-why-always-online-architectures-fail-in-the-real-world-2o87</link>
      <guid>https://dev.to/tflux2011/building-offline-first-ai-agents-why-always-online-architectures-fail-in-the-real-world-2o87</guid>
      <description>&lt;p&gt;The "Happy Path" Problem&lt;br&gt;
If you look at the documentation for most AI agent frameworks (LangChain, AutoGPT, CrewAI), they all share a dangerous assumption: Abundant Connectivity.&lt;/p&gt;

&lt;p&gt;They assume your API calls to OpenAI will always succeed. They assume your websocket will never drop. They assume your user has stable 5G.&lt;/p&gt;

&lt;p&gt;But I build software for Lagos, Nigeria. Here, power flickers, fiber cuts happen, and latency is a physical constraint, not an edge case. When I tried deploying standard agentic workflows here, they didn't just fail; they failed catastrophically. Users lost data, workflows hallucinated, and API credits were wasted on timeouts.&lt;/p&gt;

&lt;p&gt;I call this the "Agentic Gap": the massive divide between how AI works in a demo video in San Francisco and how it works in a resource-constrained environment.&lt;/p&gt;

&lt;p&gt;We Need "Contextual Engineering"I spent the last year re-architecting how we build these systems. I call the approach Contextual Engineering. It’s not about making models smarter; it’s about making the system around them resilient.&lt;/p&gt;

&lt;p&gt;Here are two architectural patterns I built to fix this, which you can use in your own Python projects today.&lt;/p&gt;

&lt;p&gt;Pattern 1: The "Sync-Later" Queue&lt;br&gt;
Most agents use a synchronous User -&amp;gt; LLM -&amp;gt; Response loop. If the network dies in the middle, the context is lost.&lt;/p&gt;

&lt;p&gt;Instead, we treat every user intent as a Transaction.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Serialize the Intent: When a user prompts the agent, we don't hit the API immediately. We serialize the request and store it in a local SQLite queue.&lt;/li&gt;
&lt;li&gt;Cryptographic Signing: We sign the request to ensure integrity.&lt;/li&gt;
&lt;li&gt;Opportunistic Sync: A background worker checks for connectivity (Ping/Heartbeat). Only when $N(t) = 1$ (network is available) do we flush the queue.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The Python Implementation&lt;br&gt;
Instead of a direct requests.post, we use a local buffer. Here is the logic from the open-source framework:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import sqlite3
import uuid

def queue_action(user_input, intent_type):
    # 1. Create a transaction ID
    tx_id = str(uuid.uuid4())

    # 2. Store locally first (Offline-First)
    conn = sqlite3.connect('agent_state.db')
    cursor = conn.cursor()
    cursor.execute(
        "INSERT INTO pending_actions (id, input, status) VALUES (?, ?, 'PENDING')",
        (tx_id, user_input)
    )
    conn.commit()

    # 3. Try to sync (if online)
    if check_connectivity():
        sync_manager.flush()
    else:
        print(f"Network down. Action {tx_id} queued for later.")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures Zero Data Loss. The user can keep working, and the agent "catches up" when the internet comes back.&lt;/p&gt;

&lt;p&gt;Pattern 2: The Hybrid Inference Router&lt;br&gt;
Why route a simple "Hello" or "Summarize this text" to GPT-4? It’s slow, expensive, and requires a heavy internet connection.&lt;/p&gt;

&lt;p&gt;I implemented a Router Logic Gate that inspects the prompt before it leaves the device.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Low Complexity? → Route to a local SLM (like Llama-3-8B or Phi-2) running on-device. (Cost: $0, Latency: Low).&lt;/li&gt;
&lt;li&gt;High Complexity? → Route to the Cloud (GPT-4o).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The decision function looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The Routing Logic
if network_is_down() or complexity &amp;lt; threshold:
    model = "Local Llama-3 (8B)" # Free, Fast, Offline
else:
    model = "GPT-4o"             # Smart, Costly, Online
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
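&lt;p&gt;To make the gate concrete, here is a minimal, self-contained sketch of that decision function. The complexity heuristic (prompt length plus a keyword check), the threshold, and the model labels are illustrative assumptions, not the framework's actual scoring:&lt;/p&gt;

```python
# Illustrative complexity heuristic: long prompts or "hard" keywords
# get routed to the cloud; everything else stays on-device.
HARD_KEYWORDS = {"analyze", "prove", "debug", "architect", "legal"}

def complexity_score(prompt: str) -> float:
    words = prompt.lower().split()
    keyword_hits = sum(1 for w in words if w in HARD_KEYWORDS)
    return len(words) / 100 + keyword_hits

def route_model(prompt: str, network_up: bool, threshold: float = 1.0) -> str:
    # The Routing Logic Gate: cloud only when online AND the task is complex
    if network_up and complexity_score(prompt) >= threshold:
        return "cloud-llm"   # e.g. GPT-4o: smart, costly, online
    return "local-slm"       # e.g. an on-device SLM: free, fast, offline

print(route_model("Hello, how are you?", network_up=True))           # local-slm
print(route_model("Analyze this contract for legal risk", True))     # cloud-llm
print(route_model("Analyze this contract for legal risk", False))    # local-slm
```

Note the third call: when the network is down, even a "hard" prompt falls back to the local model rather than failing, which is exactly the offline-first behavior the pattern is after.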



&lt;p&gt;This simple check saved us about 40-60% on API costs and made the application feel "instant" for basic tasks, even on 3G networks.&lt;/p&gt;

&lt;p&gt;The "Contextual Engineering" Framework&lt;br&gt;
These patterns aren't just hacks; they are part of a broader discipline I’m trying to formalize called Contextual Engineering. It’s about building AI that respects the Contextual Tuple (C = {I, K, R}): Infrastructure, Knowledge (Culture), and Regulation.&lt;/p&gt;

&lt;p&gt;I’ve open-sourced the entire reference architecture. It includes the routing logic, the SQLite queue wrappers, and the "Constitutional Sentinel" for safety.&lt;/p&gt;

&lt;p&gt;Where to find the code&lt;br&gt;
I want to see more engineers building specifically for the Global South. You can find the full Python implementation here:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/tflux2011/contextual-engineering-patterns" rel="noopener noreferrer"&gt;Star the GitHub Repository&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The Deep Dive (Free Book)&lt;br&gt;
For those who want the math and the full architectural theory, I also wrote a 90-page reference manuscript titled "Contextual Engineering: Architectural Patterns for Resilient AI." It covers the full "Agentic Gap" theory and detailed diagrams.&lt;/p&gt;

&lt;p&gt;📖 &lt;a href="https://zenodo.org/records/18005435" rel="noopener noreferrer"&gt;Download the PDF (Open Access)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let me know in the comments: How do you handle network flakes in your LLM apps?&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
