NeuroBase is an intelligent, self-learning conversational database system that turns PostgreSQL into a cognitive engine. It features two powerful modes:
🗣️ Interactive Natural Language Mode
Talk to your database - no SQL required. NeuroBase understands your questions, generates optimized SQL, learns from corrections, and gets smarter with every interaction.
NeuroBase> Show me users who signed up today
🧠 Analyzing query...
📝 Generated SQL:
SELECT * FROM users WHERE created_at::date = CURRENT_DATE;
⚡ Execution time: 23ms
💡 Learned: "users who signed up today" → created_at::date filter
🤖 Multi-Agent Orchestration Mode
Autonomous AI agents work in parallel on isolated database forks to handle schema evolution, query validation, learning aggregation, and A/B testing - all without touching production data.
What inspired me? A friend sent me this challenge and dared me to take it on. I've always wanted to see how easy (or hard!) it would be to integrate AI directly into databases - not just on top of them, but inside them. This challenge was the perfect opportunity. Turns out, with Tiger Data's agentic features, making databases truly intelligent is surprisingly elegant!
Core Features
Natural Language Interface:
- SQL-free queries in plain English
- Context-aware automatic SQL generation
- Continuous learning from interactions
- Conversation history and context retention
- Multi-LLM support (OpenAI, Claude, Ollama)
Multi-Agent System:
- Specialized agents (Schema Evolution, Query Validator, Learning Aggregator, A/B Testing)
- Intelligent fork management (isolated environments per agent)
- Real-time dashboard with live metrics and charts
- Asynchronous task processing with automatic execution
- Inter-agent data synchronization with conflict resolution
- Free plan friendly with shared database mode
Demo
🔗 GitHub Repository: github.com/4n0nn43x/neurobase
Quick Start - Interactive Mode
git clone https://github.com/4n0nn43x/neurobase
cd neurobase
npm install
cp .env.example .env # Configure your database and LLM provider
npm start # Start interactive CLI
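The .env typically needs a database connection string and an LLM provider key. A rough sketch of what that might look like (these variable names are my guess - the authoritative list is in .env.example):

```shell
# Illustrative .env sketch - check .env.example for the actual keys
DATABASE_URL=postgres://user:pass@host:5432/neurobase
LLM_PROVIDER=openai          # or: claude, ollama
OPENAI_API_KEY=sk-...
```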
How I Used Agentic Postgres
I leveraged Tiger Data's agentic features to create a truly intelligent system:
🔀 Tiger CLI for Dynamic Fork Management
Each agent can have its own isolated database fork for safe experimentation:
// Create agent with dedicated fork
const agent = await orchestrator.registerAgent({
  name: 'Schema Evolution Agent',
  type: 'schema-evolution',
  forkStrategy: 'now', // Instant zero-copy fork
  cpu: '0.5',
  memory: '2Gi',
  enabled: true
});
// Fork strategies supported:
// - 'now': Current state
// - 'last-snapshot': Previous snapshot
// - 'to-timestamp': Specific point in time
// - 'shared': No fork (free plan friendly)
Why this matters: Agents can test schema changes, validate queries, and run experiments without any risk to production data. Tiger's copy-on-write forks make this instant and lightweight.
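The four strategies above suggest a simple dispatch inside the orchestrator. A minimal sketch of what that might look like (resolveForkConfig is a hypothetical helper, not part of the Tiger CLI or the NeuroBase API):

```javascript
// Map an agent's forkStrategy to a fork request (illustrative only).
function resolveForkConfig(agent) {
  switch (agent.forkStrategy) {
    case 'now':
      return { fork: true, source: 'current' };       // current state
    case 'last-snapshot':
      return { fork: true, source: 'snapshot' };      // previous snapshot
    case 'to-timestamp':
      if (!agent.timestamp) throw new Error('to-timestamp requires a timestamp');
      return { fork: true, source: 'timestamp', at: agent.timestamp };
    case 'shared':
      return { fork: false };                         // no fork, free plan friendly
    default:
      throw new Error(`Unknown fork strategy: ${agent.forkStrategy}`);
  }
}
```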
🔍 pg_tsvector for Semantic Search
The Learning Aggregator agent uses PostgreSQL full-text search to find patterns:
// Find relevant learnings using semantic search
const insights = await pool.query(`
  SELECT *,
         ts_rank(search_vector, plainto_tsquery('query optimization')) AS rank
  FROM neurobase_learnings
  WHERE search_vector @@ plainto_tsquery('query optimization')
  ORDER BY rank DESC, confidence DESC
  LIMIT 10
`);
The interactive mode also uses tsvector to remember past queries and improve translation accuracy:
-- Store learned patterns with semantic indexing
CREATE TABLE neurobase_learnings (
  id SERIAL PRIMARY KEY,
  natural_language TEXT,
  generated_sql TEXT,
  search_vector tsvector GENERATED ALWAYS AS (
    to_tsvector('english', natural_language || ' ' || COALESCE(generated_sql, ''))
  ) STORED,
  confidence NUMERIC DEFAULT 1.0
);
CREATE INDEX idx_learnings_search ON neurobase_learnings USING GIN(search_vector);
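The confidence column above can be nudged as user feedback arrives, so confirmed translations rank higher than corrected ones. A minimal sketch of one way to do this (the scoring rule is my assumption, not NeuroBase's actual logic):

```javascript
// Nudge a learned pattern's confidence toward 1.0 on confirmation
// and toward 0.0 on correction, via an exponential moving average.
function updateConfidence(current, wasCorrect, rate = 0.2) {
  const target = wasCorrect ? 1.0 : 0.0;
  const next = current + rate * (target - current);
  return Math.round(next * 1000) / 1000; // keep 3 decimals for storage
}
```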
💾 Fast Forks for A/B Testing
The A/B Testing agent creates parallel forks to compare strategies:
// Test multiple indexing strategies simultaneously
const experiment = await abTesting.createExperiment(
  "Index Strategy Comparison",
  "Which index type performs better?",
  [
    { name: 'btree-strategy', sql: 'CREATE INDEX USING btree...' },
    { name: 'gin-strategy', sql: 'CREATE INDEX USING gin...' },
    { name: 'brin-strategy', sql: 'CREATE INDEX USING brin...' }
  ]
);

// Each strategy runs on its own fork
await abTesting.startExperiment(experiment.id);
const results = await abTesting.analyzeResults(experiment.id);
console.log(`Winner: ${results.winner.name} with ${results.winner.speedup}x improvement`);
Tiger's fast forks (2-3 seconds) make this practical - you can test dozens of strategies in minutes, not hours.
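The winner-picking step can be sketched as a pure function: take each strategy's latency samples from its fork, rank by mean, and report speedup over the unindexed baseline. Function and field names here are illustrative, not NeuroBase's real analyzeResults:

```javascript
// Rank strategies by mean latency and compute speedup vs. baseline.
function pickWinner(results, baselineMs) {
  const mean = xs => xs.reduce((a, b) => a + b, 0) / xs.length;
  const ranked = results
    .map(r => ({ name: r.name, meanMs: mean(r.latenciesMs) }))
    .sort((a, b) => a.meanMs - b.meanMs);
  const winner = ranked[0];
  return { ...winner, speedup: +(baselineMs / winner.meanMs).toFixed(2) };
}
```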
🔄 Fork Synchronization for Knowledge Sharing
Agents share discoveries through selective synchronization:
// Sync insights from agent fork back to main database
const syncJob = await synchronizer.createSyncJob({
  source: agentFork.connectionString,
  target: mainDatabase.connectionString,
  tables: ['neurobase_learnings', 'neurobase_optimizations'],
  mode: 'incremental', // Only new records
  conflictResolution: 'source-wins',
  batchSize: 100
});
await synchronizer.executeSync(syncJob.id);
This ensures all agents benefit from each other's learnings.
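At its core, 'source-wins' resolution just means that when the same row id exists on both sides, the agent-fork copy replaces the main-database copy. A minimal sketch of that merge (hypothetical helper; the real synchronizer works in batches against live connections):

```javascript
// Merge rows keyed by id; on conflict, the source (agent fork) wins.
function mergeSourceWins(targetRows, sourceRows) {
  const byId = new Map(targetRows.map(r => [r.id, r]));
  for (const row of sourceRows) byId.set(row.id, row); // source overwrites
  return [...byId.values()];
}
```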
🆓 Graceful Degradation for Free Plan
Hit the service limit? No problem - shared database mode:
// Free plan friendly: multiple agents, one database
const validator = await orchestrator.registerAgent({
  name: 'Query Validator',
  type: 'query-validator',
  useFork: false, // Uses shared mainPool
  forkStrategy: 'shared', // No fork creation
  enabled: true
});
This was crucial! I hit the free plan's service limit early during development. Instead of being blocked, I implemented shared database mode where agents work together on the main database. Perfect for testing and small-scale use.
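The fallback logic behind shared database mode can be sketched as a small pool-selection step (selectPool is an illustrative helper, not the project's actual API):

```javascript
// Decide which connection pool an agent should use: its own fork when
// forking is enabled and the fork exists, otherwise the shared main pool.
function selectPool(agent, mainPool, forkPool) {
  if (agent.useFork === false || agent.forkStrategy === 'shared') {
    return mainPool; // free plan: everyone shares one database
  }
  return forkPool ?? mainPool; // degrade gracefully if no fork was created
}
```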
🧠 Natural Language Understanding with Schema Awareness
The interactive CLI uses Tiger's connection to introspect schema and generate accurate SQL:
// Load schema with fast information_schema queries
const tables = await pool.query(`
  SELECT table_name, column_name, data_type
  FROM information_schema.columns
  WHERE table_schema = 'public'
  ORDER BY table_name, ordinal_position
`);

// Feed to LLM for context-aware translation
const prompt = `
Given these tables: ${JSON.stringify(tables.rows)}
Translate to SQL: "${userQuery}"
`;
Overall Experience
🚀 What Worked Exceptionally Well
Tiger Data's fork speed is mind-blowing! Creating a complete database copy in 2-3 seconds (not minutes or hours!) made the entire multi-agent architecture practical. Without this, agents would spend more time waiting for forks than doing actual work.
Copy-on-write is brilliant - I expected running 4-5 agents with separate forks to be resource-heavy, but Tiger's implementation is surprisingly lightweight. Memory usage stayed reasonable even with multiple active agents doing parallel work.
The Tiger CLI is beautifully simple - commands like tiger service fork and tiger service list just work. No complex configuration files, no wrestling with parameters. The UX is on point.
pg_tsvector search is blazingly fast - full-text search across thousands of learned patterns returns results in single-digit milliseconds. No need for external search infrastructure like Elasticsearch.
Developer experience - Going from idea to working multi-agent system took less time than expected, largely because Tiger's features are well-designed and composable.
🔮 What Surprised Me
The free plan limitation became a feature! When I hit the service limit, I initially thought "well, that's it for testing." Instead, it forced me to implement shared database mode, which actually made the system more flexible. Now users can choose:
- Free/Development: Multiple agents on one database (no forks)
- Production: Each agent gets isolated fork (better safety)
Natural language to SQL is harder than multi-agent orchestration! I thought the multi-agent system would be the complex part, but it turns out that getting LLMs to consistently generate correct SQL with proper schema awareness is the real challenge. Context management and prompt engineering took significant iteration.
Inter-agent synchronization patterns are fascinating - Watching agents discover insights independently, then sync and build on each other's findings feels like watching distributed intelligence emerge. It's closer to biological learning than traditional programming.
pg_tsvector semantic search punches way above its weight - I expected to need pgvector for semantic capabilities, but PostgreSQL's built-in full-text search with tsvector handles most use cases remarkably well. Only truly complex semantic reasoning needs vector embeddings.
🔮 What's Next
Near-term
- Real LLM integration in agents (currently mock data)
- Vector embeddings with pgvector for advanced semantic search
- Automated fork cleanup scheduler
- Agent marketplace - share agent configs
Long-term
- Self-healing database - agents detect and auto-fix issues
- Predictive optimization - anticipate performance problems
- Natural language migrations - "Add user preferences table"
- Cross-database agent federation - agents working across multiple databases