*This is a submission for the Agentic Postgres Challenge with Tiger Data*
## What I Built

### 🌸 Magnolia - Perplexity for Your Mind
Magnolia is an AI-powered second brain that transforms scattered ChatGPT conversations into an organized, searchable knowledge base. Instead of losing valuable insights in thousands of chat messages, Magnolia turns them into a living, queryable memory system.
**The Problem:** We have amazing conversations with AI, but they vanish into the void. Want to recall that brilliant idea from 3 months ago? Good luck scrolling through hundreds of chats.

**The Solution:** Import your ChatGPT exports and Magnolia automatically:
- 🔍 Creates semantic vector embeddings for every message
- 🧠 Builds a knowledge graph of entities, facts, and relationships
- 🏷️ Auto-categorizes conversations into topics using AI
- 💬 Lets you chat with your entire conversation history using RAG
- 📚 Provides hybrid search combining vector similarity + relational data
**Key Features:**

- **Semantic Search:** Find ideas using natural language, not keywords
- **AI Chat with Memory:** Ask questions and get answers grounded in your personal knowledge
- **Knowledge Graph:** Entities and observations that evolve over time
- **Auto-Organization:** AI automatically extracts topics and relationships
- **Beautiful Dark UI:** Vercel/Apple-inspired glassmorphism interface
## Demo

**🖼️ Screenshots:**
- The dashboard with import stats
- Semantic search results with similarity scores
- Chat interface with RAG responses
- Knowledge base with topics and entities
- Vector embeddings being created
- Artifact generation based on user preferences
**🚀 Live Demo:** https://magnolia-gules.vercel.app/

To try it, register with a username, generate the demo data (it takes some time), and explore the 3D memory graph. You can also chat about anything, and the agent will remember. :)
## How I Used Agentic Postgres
Magnolia showcases Postgres as the brain of an AI agent system, not just a database. Here's how I leveraged Agentic Postgres features creatively:
### 1. pgvector for Semantic Memory 🧠
I use pgvector to store 768-dimensional embeddings from Google's Gemini model, enabling semantic search across all conversation history:
```sql
-- Vector embeddings table with IVFFlat index
CREATE TABLE embeddings (
  id SERIAL PRIMARY KEY,
  message_id INTEGER REFERENCES messages(id),
  conversation_id INTEGER REFERENCES conversations(id),
  embedding vector(768),
  content TEXT NOT NULL
);

-- Fast approximate nearest neighbor search
CREATE INDEX embeddings_vector_idx
  ON embeddings USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);
```
The semantic search uses cosine similarity to find relevant messages:
```sql
SELECT
  e.content,
  c.title AS conversation_title,
  1 - (e.embedding <=> $queryVector::vector) AS similarity
FROM embeddings e
JOIN conversations c ON e.conversation_id = c.id
WHERE 1 - (e.embedding <=> $queryVector::vector) > 0.7
ORDER BY e.embedding <=> $queryVector::vector
LIMIT 10;
```
### 2. Knowledge Graph in Postgres 🕸️

I built a living knowledge graph out of plain relational tables, one that evolves as the AI learns:
```sql
-- Entities: users, topics, concepts, preferences
CREATE TABLE entities (
  id SERIAL PRIMARY KEY,
  name TEXT UNIQUE NOT NULL,
  entity_type TEXT NOT NULL,
  metadata JSONB DEFAULT '{}'
);

-- Observations: facts about entities (these also get vector embeddings!)
CREATE TABLE observations (
  id SERIAL PRIMARY KEY,
  entity_id INTEGER REFERENCES entities(id),
  content TEXT NOT NULL,
  embedding vector(768), -- semantic search within memories
  observed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Relations: how entities connect (extensible graph)
CREATE TABLE relations (
  id SERIAL PRIMARY KEY,
  from_entity_id INTEGER REFERENCES entities(id),
  to_entity_id INTEGER REFERENCES entities(id),
  relation_type TEXT NOT NULL -- 'updates', 'relates_to', 'extends'
);
```
### 3. Hybrid RAG Architecture ⚡

I combined vector search with traditional relational queries for a powerful hybrid RAG pipeline (sketched below; `getRelatedEntities` stands in for the graph-traversal step):
```typescript
// Step 1: Vector search for relevant conversations
const ragContext = await searchSimilar(queryEmbedding, 10, 0.7);

// Step 2: Get structured memory from the knowledge graph
const userPreferences = await getUserPreferences();
const conversationMemory = await getRelatedEntities(userPreferences); // graph traversal
const semanticMemory = await searchObservations(queryEmbedding, 5, 0.3);

// Step 3: Join everything for context-aware AI responses
const answer = await answerWithContextAndMemory(
  question,
  ragContext,         // vector search results
  userPreferences,    // relational query of the user entity
  conversationMemory, // graph traversal of related entities
  semanticMemory,     // vector search within observations
);
```
### 4. JSONB for Flexible Metadata 📦
Leveraged JSONB columns to store dynamic metadata without schema changes:
```sql
-- Store rich metadata without rigid schemas
metadata JSONB DEFAULT '{}'

-- Example: query conversations by metadata
SELECT * FROM conversations
WHERE metadata @> '{"source": "chatgpt", "has_code": true}';

-- Create a GIN index for fast JSONB queries
CREATE INDEX idx_metadata ON conversations USING gin (metadata);
```
### 5. Advanced Indexing Strategy 🚀
Optimized for AI workloads with strategic indexes:
```sql
-- Vector indexes for similarity search
CREATE INDEX embeddings_vector_idx ON embeddings
  USING ivfflat (embedding vector_cosine_ops);
CREATE INDEX observations_embedding_idx ON observations
  USING ivfflat (embedding vector_cosine_ops);

-- B-tree indexes for relational joins
CREATE INDEX idx_messages_conversation ON messages(conversation_id);
CREATE INDEX idx_embeddings_conversation ON embeddings(conversation_id);

-- BRIN index for time-series observation queries
CREATE INDEX idx_observations_time ON observations
  USING brin (observed_at);

-- Graph traversal indexes
CREATE INDEX idx_relations_from ON relations(from_entity_id);
CREATE INDEX idx_relations_to ON relations(to_entity_id);
```
### 6. JSONB Aggregation for Memory Recall 🎯
Used Postgres's powerful JSON aggregation to build complex memory structures:
```sql
SELECT
  e.name AS entity_name,
  COALESCE(
    json_agg(
      json_build_object(
        'content', o.content,
        'observed_at', o.observed_at,
        'similarity', 1 - (o.embedding <=> $query::vector)
      ) ORDER BY o.observed_at DESC
    ) FILTER (WHERE o.id IS NOT NULL),
    '[]'::json
  ) AS observations
FROM entities e
LEFT JOIN observations o ON o.entity_id = e.id
WHERE e.name = 'user_preferences'
GROUP BY e.id;
```
### 7. Automatic Schema Initialization 🛠️
Built a self-initializing database that sets up the entire AI infrastructure on first run:
```typescript
import postgres from 'postgres';

const sql = postgres(process.env.DATABASE_URL!);

export async function initDatabase() {
  // Enable the pgvector extension
  await sql`CREATE EXTENSION IF NOT EXISTS vector`;
  // ...then create all tables, indexes, and relations shown above.
  // Perfect for agent-first applications that need to bootstrap quickly.
}
```
## Overall Experience
Building Magnolia with TigerData and Agentic Postgres was a revelation. Here's what stood out:
### What Worked Brilliantly ✨

**1. Postgres as a Vector Database**
I expected to need a specialized vector database like Pinecone or Weaviate, but pgvector delivered incredible performance. Storing 768-dimensional vectors alongside relational data in the same database eliminated data synchronization headaches. The IVFFlat indexes made similarity search blazingly fast.
**2. Unified Data Layer**
Having vectors, relational data, JSON, and graph data in one system was game-changing. I could do things like:
- Vector search → Join with conversations → Aggregate metadata → Return structured results
- All in a single query, no microservices, no data pipeline (see the sketch below)
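A minimal sketch of that query shape, reusing the schemas above (the `$query` bind parameter and the 0.7 threshold are illustrative):

```sql
-- One round trip: vector search, relational join, JSONB filter, aggregation
SELECT
  c.title,
  c.metadata->>'source' AS source,
  COUNT(e.id) AS matching_messages,
  MAX(1 - (e.embedding <=> $query::vector)) AS best_similarity
FROM embeddings e
JOIN conversations c ON c.id = e.conversation_id
WHERE 1 - (e.embedding <=> $query::vector) > 0.7
  AND c.metadata @> '{"source": "chatgpt"}'
GROUP BY c.id
ORDER BY best_similarity DESC
LIMIT 5;
```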
**3. JSONB is Underrated**
JSONB columns gave me schema flexibility while maintaining query performance. The GIN indexes on JSONB made metadata queries as fast as traditional columns. Perfect for AI systems where data structures evolve.
**4. Knowledge Graph in SQL**

I initially considered Neo4j, but modeling the knowledge graph in Postgres worked beautifully: recursive CTEs for graph traversal, JSON aggregation for building nested structures, and the ability to combine graph queries with vector search, all in one system.
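As a sketch of the traversal piece, reusing the `relations` and `entities` tables above (the hop limit and starting entity are illustrative):

```sql
-- Walk the relations graph outward from one entity, up to 3 hops
WITH RECURSIVE graph AS (
  SELECT r.to_entity_id, 1 AS depth
  FROM relations r
  JOIN entities e ON e.id = r.from_entity_id
  WHERE e.name = 'user_preferences'
  UNION
  SELECT r.to_entity_id, g.depth + 1
  FROM relations r
  JOIN graph g ON r.from_entity_id = g.to_entity_id
  WHERE g.depth < 3
)
SELECT DISTINCT e.name, e.entity_type
FROM graph g
JOIN entities e ON e.id = g.to_entity_id;
```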
### What Surprised Me 🤯

**1. Vector Search Performance**
I was skeptical about using Postgres for vector search at scale, but the performance was impressive. Cosine similarity searches across thousands of embeddings returned in milliseconds. The <=> operator is elegant.
**2. Hybrid Queries**
The ability to mix vector similarity with traditional WHERE clauses blew my mind:
```sql
WHERE 1 - (embedding <=> $query::vector) > 0.7  -- vector similarity
  AND metadata @> '{"topic": "AI"}'             -- JSONB containment
  AND created_at > NOW() - INTERVAL '30 days'   -- time range
```
This level of composability is hard to match with a specialized vector database.
**3. Real-time Learning**
The knowledge graph updating in real-time as users chat was magical. Observations accumulate, entities evolve, and the system gets smarter without complex ETL pipelines. Postgres handles it all transactionally.
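Conceptually, each memory write is one atomic unit. A minimal sketch, with illustrative bind parameters:

```sql
-- Record a new fact and link it to a related entity, atomically
BEGIN;
INSERT INTO observations (entity_id, content, embedding)
VALUES ($entity_id, $content, $embedding::vector);
INSERT INTO relations (from_entity_id, to_entity_id, relation_type)
VALUES ($entity_id, $related_id, 'updates');
COMMIT;
```

If either insert fails, the whole memory update rolls back, so the graph never ends up half-written.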
### Challenges and Learnings 📚

**1. Embedding Dimensionality**
Initially used 3072-dimensional embeddings (Gemini's default) and hit performance issues. Switching to 768 dimensions using MRL (Matryoshka Representation Learning) improved speed dramatically while maintaining accuracy. TigerData's Postgres handled it flawlessly.
**2. Index Selection**

Learned the hard way that IVFFlat indexes need tuning. The `lists` parameter affects both index build time and query recall. Found the sweet spot at `lists = 100` for my dataset size.
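For reference, a sketch of the tuning knobs (pgvector's docs suggest `lists` of roughly rows/1000 for datasets up to about a million rows; `ivfflat.probes` trades speed against recall at query time):

```sql
-- Rebuild the index with a tuned lists value
DROP INDEX IF EXISTS embeddings_vector_idx;
CREATE INDEX embeddings_vector_idx
  ON embeddings USING ivfflat (embedding vector_cosine_ops)
  WITH (lists = 100);

-- At query time, probe more lists for better recall (slower),
-- fewer for lower latency
SET ivfflat.probes = 10;
```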
**3. JSON vs. Relational Tradeoffs**
Initially over-used JSONB for everything. Learned to use JSONB for truly dynamic data (metadata, API responses) and proper columns for data I frequently query. Best of both worlds.
**4. Connection Pooling**

Node.js Postgres connections need careful management. Using the `postgres` library with proper connection limits was crucial for serverless deployments.
## Why Agentic Postgres? 🐘
This challenge made me rethink Postgres entirely. It's not just a database—it's an AI-native data platform:
- **Vectors + Relations** = more powerful than pure vector DBs
- **JSONB + Graph** = more flexible than rigid schemas
- **ACID Transactions** = critical for agent systems that modify their own memory
- **Mature Ecosystem** = 30+ years of tooling, backups, and scaling knowledge
**The killer feature:** Postgres lets you build AI agents that reason over structured data, semantic embeddings, and graph relationships in a single query. That's the future of agentic systems.
Building Magnolia taught me that Postgres is the perfect brain for AI agents. It's not about choosing between vector databases, graph databases, or relational databases—Postgres is all three, battle-tested, and ready for agentic workloads.
Thanks to the Tiger Data team for the free tier that made this possible! 🐯
**Tech Stack:** Next.js 15, TypeScript, Mastra AI, Google Gemini 2.5 Flash, TigerData (Agentic Postgres), pgvector, Tailwind CSS, shadcn/ui
Made with 🌸 by the Magnolia team