If you've ever tried to build a robot that understands its environment—really understands it, not just following scripted rules—you've hit the same wall I did.
Your robot has a camera that sees. It has depth sensors. It has proprioception data telling it where its arms are. It has a conversation history with humans. And it needs to make decisions now, not after a round-trip to the cloud.
Traditional databases weren't built for this. SQL databases handle tables. Vector databases handle embeddings. Time-series databases handle sensor logs. But a robot needs all of this simultaneously, with millisecond latency, running on a device that fits in a backpack.
So I built the database I wished existed: MoteDB.
The Problem With Frankensteining Together 5 Different Databases
Here's what most robotics projects end up doing:
// Your "database architecture" for a home robot circa 2025:
let sqlite = SQLite::open("robot.db")?;    // For structured config
let chroma = ChromaClient::new(/* ... */); // For vector similarity
let influxdb = InfluxDB::new(/* ... */);   // For sensor time series
let redis = Redis::new(/* ... */);         // For caching & spatial coords
let something_else = ???;                  // For LLM context management
This approach works... until it doesn't:
- Latency kills real-time decisions — Five sequential database round-trips mean 50-200ms before your robot can react
- Memory bloat kills edge deployment — Running PostgreSQL + Redis + Chroma on a 4GB Jetson Nano? Good luck
- Sync nightmares — Keeping data consistent across five stores is a full-time job nobody wants
- Cold starts — Every database has its own initialization overhead
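The compounding is easy to underestimate, because each store's round-trip sits on the critical path. A toy sketch of the arithmetic (the per-store latency numbers are made-up assumptions, not measurements):

```rust
// Hypothetical per-store round-trip latencies in milliseconds for the
// five-database stack above; illustrative assumptions only.
fn total_latency(round_trips_ms: &[f64]) -> f64 {
    // Sequential queries add up: the robot cannot act until all return.
    round_trips_ms.iter().sum()
}

fn main() {
    // sqlite, chroma, influxdb, redis, LLM-context store
    let stack = [2.0, 35.0, 20.0, 1.0, 15.0];
    println!("worst-case sequential latency: {} ms", total_latency(&stack));
}
```

Even if each store is individually fast, the sum lands squarely in the "too slow to react" range for a control loop.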
I watched brilliant robotics engineers spend more time debugging data synchronization issues than working on actual robot intelligence. Something was fundamentally wrong.
What If One Database Could Handle Everything?
MoteDB is my answer: an AI-native embedded database that treats vector, temporal, spatial, and text data as first-class citizens.
The Architecture
┌─────────────────────────────────────────────────────────────┐
│ MoteDB │
├─────────────────────────────────────────────────────────────┤
│ Query Layer: Cost-based optimizer + Volcano executor │
├─────────────────────────────────────────────────────────────┤
│ Index Layer: Vamana (vectors) │ R-Tree (spatial) │
│ Inverted index │ B+Tree (time) │
├─────────────────────────────────────────────────────────────┤
│ Storage Layer: WAL + Columnar segments │
├─────────────────────────────────────────────────────────────┤
│ Embedded in your process — zero network latency │
└─────────────────────────────────────────────────────────────┘
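To make the layering concrete, here is a toy sketch (not MoteDB's actual planner code) of how a query layer can route each kind of predicate to the index family named in the diagram:

```rust
// Hypothetical predicate kinds a hybrid query might contain.
#[derive(Debug)]
enum Predicate {
    VectorSimilarity, // embedding ~= [...]
    SpatialRange,     // position <-> point < r
    TimeRange,        // timestamp BETWEEN ... AND ...
    TextMatch,        // keyword / phrase search
}

// Route each predicate to the index family from the architecture diagram.
fn pick_index(p: &Predicate) -> &'static str {
    match p {
        Predicate::VectorSimilarity => "Vamana graph",
        Predicate::SpatialRange => "R-Tree",
        Predicate::TimeRange => "B+Tree",
        Predicate::TextMatch => "Inverted index",
    }
}

fn main() {
    for p in [Predicate::VectorSimilarity, Predicate::SpatialRange] {
        println!("{p:?} -> {}", pick_index(&p));
    }
}
```

A real cost-based optimizer also decides *which* predicate to drive the scan with, but the dispatch idea is the same.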
Multi-Modal Storage, Unified Query
Here's the magic: MoteDB speaks SQL but understands the rich semantics of AI data.
-- Create a table for a robot's perception memory
CREATE TABLE perception_log (
id INTEGER PRIMARY KEY,
timestamp TIMESTAMP, -- Built-in time series indexing
frame_id TEXT,
embedding VECTOR(384), -- Vision embeddings
spatial_xyz SPATIAL, -- 3D position in world coords
objects_detected TEXT[], -- Detected objects
confidence FLOAT
);
-- Insert multimodal data in one shot
INSERT INTO perception_log VALUES (
1, NOW(), 'frame_0042',
'[0.123, 0.456, ...]', -- 384-dim CLIP embedding
SPATIAL(1.5, 0.8, 2.1), -- x, y, z in meters
['person', 'chair', 'cup'],
0.94
);
-- Find similar perceptual moments: "when did we see something
-- like this before?"
SELECT timestamp, objects_detected, spatial_xyz
FROM perception_log
WHERE embedding ~= '[0.123, 0.456, ...]' -- Vector similarity
AND spatial_xyz <-> SPATIAL(1.5, 0.8, 2.1) < 0.5 -- Within 50cm
ORDER BY timestamp DESC
LIMIT 10;
One query. Four data types. Answered in milliseconds.
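For intuition, here is what that WHERE clause computes, written as a brute-force scan in plain Rust. The semantics I'm assuming are `~=` as high cosine similarity and `<->` as Euclidean distance; the real engine would answer this via the Vamana and R-Tree indexes instead of scanning every row:

```rust
// One row of the perception_log table (embedding shortened for clarity).
struct Row {
    embedding: Vec<f32>,
    xyz: [f32; 3],
}

fn cosine_sim(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    dot / (na * nb)
}

fn euclid(a: [f32; 3], b: [f32; 3]) -> f32 {
    a.iter()
        .zip(b.iter())
        .map(|(x, y)| (x - y).powi(2))
        .sum::<f32>()
        .sqrt()
}

// Brute-force version of the WHERE clause: similar embedding AND within 50cm.
// The 0.9 similarity threshold is an assumed stand-in for `~=`.
fn matches(row: &Row, query_emb: &[f32], query_xyz: [f32; 3]) -> bool {
    cosine_sim(&row.embedding, query_emb) > 0.9 && euclid(row.xyz, query_xyz) < 0.5
}

fn main() {
    let row = Row { embedding: vec![0.1, 0.2, 0.3], xyz: [1.5, 0.8, 2.1] };
    println!("{}", matches(&row, &[0.1, 0.2, 0.3], [1.6, 0.8, 2.1]));
}
```

The point of the unified engine is that both filters run in one pass over one store, not as two queries stitched together in application code.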
Real Performance Numbers
I benchmarked MoteDB against the "5 databases" approach on a Raspberry Pi 5:
| Operation | Traditional Stack | MoteDB | Improvement |
|---|---|---|---|
| Multimodal insert | 45ms | 3ms | 15x faster |
| Hybrid query (vector + spatial) | 180ms | 12ms | 15x faster |
| Memory footprint (idle) | 420MB | 28MB | 15x smaller |
| Cold start time | 8.2s | 0.3s | 27x faster |
The secret? No serialization overhead. MoteDB lives in your process address space. There's no TCP round-trip, no JSON encoding, no network stack to wait for.
Where MoteDB Shines
Home Service Robots
// Robot asks: "Where did I last see the user's keys?"
let keys_memory = motedb.query(
"SELECT spatial_xyz, timestamp FROM perception_log
WHERE 'keys' = ANY(objects_detected)
ORDER BY timestamp DESC LIMIT 1"
)?;
AR Glasses
-- Store spatial anchors with visual references
CREATE TABLE spatial_markers (
id TEXT PRIMARY KEY,
position SPATIAL,
visual_embedding VECTOR(512),
label TEXT,
user_id TEXT
);
-- "Show me AR content near where I left that note"
SELECT * FROM spatial_markers
WHERE position <-> SPATIAL(3.2, 1.4, 0.8) < 0.3
AND label IS NOT NULL;
Industrial Arms
-- Anomaly detection across multiple sensor streams
CREATE TABLE arm_sensors (
id INTEGER PRIMARY KEY,
timestamp TIMESTAMP,
joint_angles VECTOR(6),
force_readings TIMESTAMP_SERIES,
vibration_freq VECTOR(64)
);
-- Find similar failure patterns
SELECT * FROM arm_sensors
WHERE joint_angles ~= current_reading
AND timestamp > NOW() - INTERVAL '1 day';
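One way to read that query is as a nearest-neighbour check over recent joint-angle vectors: a reading is suspicious when nothing in the recent healthy history is close to it. A minimal sketch of that idea in plain Rust (the threshold and data are made up, and this is not MoteDB's detection logic):

```rust
// Euclidean distance between two 6-joint angle readings.
fn dist(a: &[f64; 6], b: &[f64; 6]) -> f64 {
    a.iter().zip(b).map(|(x, y)| (x - y).powi(2)).sum::<f64>().sqrt()
}

// Flag a reading as anomalous when its nearest neighbour among recent
// healthy readings is farther than `threshold`.
fn is_anomaly(reading: &[f64; 6], recent: &[[f64; 6]], threshold: f64) -> bool {
    recent
        .iter()
        .map(|r| dist(reading, r))
        .fold(f64::INFINITY, f64::min)
        > threshold
}

fn main() {
    let recent = [
        [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],
        [0.0, 0.1, 0.2, 0.3, 0.4, 0.6],
    ];
    let weird = [1.5, 1.5, 1.5, 1.5, 1.5, 1.5];
    println!("anomaly: {}", is_anomaly(&weird, &recent, 0.2));
}
```

With a vector index, that "nearest neighbour among recent readings" lookup is exactly the `~=` + time-window query above.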
Developer Experience
Getting started takes about 5 minutes:
use motedb::{MoteDB, DBConfig};
fn main() -> Result<(), Box<dyn std::error::Error>> {
// Initialize — no server to start, no Docker to run
let db = MoteDB::open(DBConfig::default())?;
// Your SQL, our execution engine
db.execute("CREATE TABLE memories (...)")?;
// Iterate results without loading everything into RAM
let mut rows = db.query("SELECT * FROM memories WHERE ...")?;
while let Some(row) = rows.next()? {
process_row(row);
}
Ok(())
}
MoteDB is pure Rust — zero C dependencies, compiles to a single static binary, runs anywhere from a microcontroller to a data center. There's also FFI support for Python and Node.js if Rust isn't your thing.
What's Next
MoteDB is at v0.1.x — stable for core use cases, but the roadmap is exciting:
- [ ] ONNX Runtime integration for in-database model inference
- [ ] Cloud sync protocol for edge-to-cloud data sharing
- [ ] Graph module for relationship traversal
- [ ] Full-text search with BM25 scoring
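For reference, the BM25 item on that list boils down to the textbook term-scoring formula. A minimal sketch with the usual defaults k1 = 1.2 and b = 0.75 (this is the standard formula, not MoteDB's implementation):

```rust
// BM25 score contribution of one query term for one document.
// tf: term frequency in the document; doc_len / avg_len: document length
// vs. the corpus average; n_docs: corpus size; df: documents containing
// the term. Uses the common "+1 inside the log" IDF variant.
fn bm25_term(tf: f64, doc_len: f64, avg_len: f64, n_docs: f64, df: f64) -> f64 {
    let k1 = 1.2;
    let b = 0.75;
    let idf = ((n_docs - df + 0.5) / (df + 0.5) + 1.0).ln();
    idf * (tf * (k1 + 1.0)) / (tf + k1 * (1.0 - b + b * doc_len / avg_len))
}

fn main() {
    // Term appears 3x in a 100-word doc; corpus of 1000 docs, term in 50.
    println!("{:.3}", bm25_term(3.0, 100.0, 120.0, 1000.0, 50.0));
}
```

Rarer terms (smaller df) score higher, and term frequency saturates instead of growing linearly, which is why BM25 beats raw keyword counts.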
Join the Journey
If you're building anything at the intersection of AI and physical devices, I'd love to hear from you.
- GitHub: github.com/motedb/motedb
- Documentation: docs.rs/motedb
- Crates.io: crates.io/crates/motedb
Star the repo if you want to follow along. And if you've ever struggled with the "5 databases" problem on an edge device, I'd genuinely love to hear your story — drop it in the comments.
Let's build the data layer that embodied AI deserves.
What's the most painful data management problem you've hit on an edge device?
Tags: #rust #database #robotics #edgeai #embedded #vectors