AgentVault: Distributed Persistence for Local AI Agents

#ai #programming #python #dailybuild2026

When we build AI agents today, we face a classic trade-off: privacy and performance (Local LLMs) versus persistence and portability (Cloud LLMs).

If you're running agents locally using Ollama, you get incredible speed and data privacy. But the moment you move to a new machine, your agent’s context and memory are gone. Your agent has "dementia."

Today, we’re bridging that gap with AgentVault.

The Problem: The "Stateless Local Agent"

Local agents usually store their conversation history in a local SQLite database. This works fine on a single laptop, but it’s a silo. You can’t easily share that state with other developers, and you lose it if you switch devices.

The Solution: A Syncable distributed Brain

AgentVault is a Python SDK that lets you run agents locally while automatically synchronizing their state to a Hugging Face Bucket.

How it Works:

Sync Pull: Before the agent starts, it pulls the latest .db from the remote bucket. ### 2. Universal Persistence While we default to local LLMs with Ollama, AgentVault is provider-agnostic. You can switch between OpenAI, Claude, and OpenRouter while keeping the exact same memory vault.

# Switch models but keep the same distributed brain
agent = VaultAgent(
    name="Researcher",
    bucket="my-vault",
    model=OpenAIChat(id="gpt-4o")
)

Sync Push: After the task is complete, the updated .db is pushed back to the bucket using Xet-deduplication.

Xet-deduplication is the "secret sauce" here. It performs content-defined chunking, meaning if you have a 1GB database and only 1MB of it changed, Hugging Face only syncs that 1MB. It's fast, efficient, and optimized for global distribution via CDN.

🏗️ Architecture Deep Dive

The architecture of AgentVault is built on three main pillars:

1. `VaultStorage` (The Synchronizable Memory)

Extending the standard Agno SqliteAgentStorage, we’ve added hooks for the hf CLI.

class VaultStorage(SqliteAgentStorage):
    def sync_pull(self):
        # Calls 'hf sync remote local --include "*.db"'
        self.bucket_manager.sync_pull(self.remote_path, self.local_dir)

    def sync_push(self):
        # Calls 'hf sync local remote --include "*.db"'
        self.bucket_manager.sync_push(self.local_dir, self.remote_path)

2. `VaultAgent` (The Lifecycle Wrapper)

The VaultAgent acts as an orchestrator, ensuring that every interaction follows a strict Pull -> Execute -> Push cycle.

def print_response(self, text: str, **kwargs):
    self.storage.sync_pull()
    try:
        return self.agent.print_response(text, **kwargs)
    finally:
        self.storage.sync_push()

3. The "Safety Lock"

Distributed systems are prone to failure. If the sync fails (due to network issues or authentication), AgentVault triggers a Safety Lock. This allows the agent to continue running with local memory, ensuring the agent is always available, even if the cloud isn't.

🛠️ Built for Developers

We’ve focused on a high-quality Developer Experience (DX). With the built-in CLI, you can initialize a project and check its status in seconds:

agentvault init username/my-vault
agentvault status

Moreover, we’ve included a "Skills" system. This allows your AI assistant (like Gemini or Claude) to automatically understand how to use AgentVault when it enters your repository.

Why Hugging Face Buckets?

Hugging Face Buckets are S3-compatible, but they are uniquely optimized for machine learning workflows. They leverage a Global CDN with edge locations around the world, making state retrieval incredibly fast regardless of where you're working.

AgentVault gives you the best of both worlds. You get the privacy of local LLMs and the distributed persistence of the cloud. It’s the foundation for building multi-device, team-collaborative local AI agents.

Check it out on GitHub! https://github.com/harishkotra/hf-agentvault

Built with Agno, Ollama, and Hugging Face.