How I built Connection Board — an AI-powered due diligence investigation platform — using Cognee for persistent memory and Graph RAG.
TL;DR: We built Connection Board, an interactive "red-string" evidence board that acts as an AI detective. By utilizing Cognee's Graph RAG memory layer, our platform ingests documents, images, and videos, extracts key entities, and persists them into a graph database. This solves the classic LLM amnesia problem, allowing investigators to query complex networks of connections across massive datasets without the AI ever forgetting context.
If you've ever built anything with Large Language Models (LLMs), you know the struggle: the dreaded context window limit. You start a conversation, the AI is brilliant, but as time goes on, it starts to lose the plot. Suddenly, it forgets who it's talking about and what happened earlier, and you're left pasting the same context in over and over again.
This is the exact problem we tackled during the "The Hangover Part AI: Where's My Context?" hackathon, hosted by WeMakeDevs and powered by Cognee (running from June 29 to July 5, 2026). The challenge was simple yet incredibly profound: Build an AI that doesn't forget.
In this post, I want to share my experience building our project, Connection Board, the tech stack we used, the UI/UX decisions we made, and how Cognee's persistent memory layer changed how we think about AI state.
The Hackathon: Curing AI Amnesia
WeMakeDevs put together an incredible event focused on solving LLM amnesia. The premise was funny but accurate: Your AI woke up in Vegas with no memory of last night. Give it a memory.
To do this, we had to heavily utilize Cognee, an open-source, self-hosted, hybrid graph-vector memory layer. Cognee's API consists of four elegant operations:
- remember(): Ingest and permanently structure text, files, and URLs into a knowledge graph.
- recall(): Query memory, with automatic routing between semantic similarity and deep graph traversals.
- improve(): Enrich memory and adapt weights based on feedback.
- forget(): Surgically prune datasets when no longer needed.
What makes Cognee stand apart from naive RAG is the Graph RAG approach. Instead of just chunking text into vector embeddings, Cognee builds a knowledge graph — entities become nodes, relationships become edges, and context is preserved structurally. When you recall(), it doesn't just do cosine similarity; it traverses the graph to find meaningful connection paths.
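To make the "entities become nodes, relationships become edges" idea concrete, here is a toy sketch in plain Python. It is not Cognee's internal representation: the triples, `build_graph`, and `neighbors` are hypothetical names made up for illustration, and "Shell Holdings" and "Jane Roe" are invented sample entities.

```python
from collections import defaultdict

# Hypothetical (subject, relation, object) triples, as an extractor might emit them.
TRIPLES = [
    ("John Doe", "director_of", "Shell Holdings"),
    ("Shell Holdings", "subsidiary_of", "Acme Corp"),
    ("Jane Roe", "employee_of", "Acme Corp"),
]


def build_graph(triples):
    """Store each triple as a labeled edge, indexed from both endpoints."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
        # Keep a reverse entry so traversal can walk edges in either direction.
        graph[obj].append((f"inverse:{relation}", subject))
    return graph


def neighbors(graph, entity):
    """Return the entities one hop away from `entity`, sorted for stable output."""
    return sorted(target for _, target in graph.get(entity, []))


graph = build_graph(TRIPLES)
print(neighbors(graph, "Acme Corp"))  # ['Jane Roe', 'Shell Holdings']
```

A similarity search over text chunks has no direct equivalent of `neighbors`: the link between two entities only exists if some chunk happens to mention both.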
Armed with these tools, we brainstormed what kind of AI would benefit the most from absolute, unshakeable memory. The answer? An investigator.
Enter 'Connection Board'
Connection Board is a premium, AI-powered due diligence investigation platform.
In real-world due diligence, intelligence operations, or investigative journalism, professionals must sift through mountains of unstructured data — transcripts, financial filings, PDFs, images, video footage, and web articles. They are looking for key actors, hidden networks, financial links, and conflicts of interest.
Connection Board automates this. It ingests files, videos, and URLs, extracts entities and relationships using LLMs and Vision models, and visualizes them on an interactive "detective corkboard" using a red-string evidence graph.
Because we used Cognee, investigators can query the system in natural language to discover paths of connection — for example:
"How is John Doe connected to Acme Corp?"
The system traverses its semantic graph, finds the shortest path through intermediate entities, and highlights the connection chain visually on the board. This makes it an indispensable tool for uncovering hidden networks.
The Demo: Finding Doug from The Hangover
For our hackathon demo, we fed it the ultimate chaotic investigation: Finding Doug, the missing groom from The Hangover. By feeding in character bios, receipts, police reports, and voicemails, the agent was able to map out exactly what happened and where Doug was!
The demo showcased the full pipeline: uploading disparate source documents, extracting entities like "Phil," "Stu," "Alan," "Cindy," and "Mr. Chow," identifying relationships like "married to," "kidnapped by," and "fought at," and then querying the graph to trace Doug's location through the network of connections.
Designing the Detective Corkboard
We wanted the UI to feel like an immersive, premium investigative tool. We leaned heavily into the modern, digitized "red-string evidence board" aesthetic — think conspiracy theory wall meets sleek dark-mode SaaS.
Visual Design Decisions
- The Canvas: Deep neutral darks with subtle noise textures to emulate depth, creating a sleek dark mode corkboard. The dark background makes colored nodes and red strings pop visually.
- The Red String: We used vibrant crimson (#D32F2F) for primary actions and confirmed relationship edges. Every connection between entities is rendered as a physical red string, reinforcing the investigative metaphor.
- Nodes & Entities: We color-coded node elements by entity type for quick forensic scanning:
  - Electric Blue: People
  - Emerald Green: Companies / Organizations
  - Crimson Red: Locations
  - Pale Yellow: Manual sticky notes (theories/hypotheses)
  - White Polaroid: Document nodes containing images
- Micro-Animations: Glassmorphic hover states on cards and smooth fade-ins for floating overlays like the Query Console and Node Details slide-out panels.
Interactive Features
The board isn't just a static visualization — it's a fully interactive workspace:
- Pan & Zoom: Navigate large graphs fluidly with Cytoscape.js
- Node Selection: Click any node to see its properties, linked entities, and source documents
- Path Highlighting: When a query finds a connection, the path glows yellow while unrelated nodes fade out
- Drag & Drop Upload: Drag files directly onto the board to ingest evidence
- Filtering: Toggle entity types and adjust confidence thresholds to focus the view
- Drawing Tools: Toolbar with 6 tools — pointer, arrow (manual edge creation), sticky note, text box, rectangle, and circle. Double-tap any shape to edit inline on the canvas
- Resizable Nodes: Drag handles on shapes and documents to resize them with zoom-aware coordinate math
- Position Saving: Dragged node positions persist to the backend automatically
- Chronological Timeline: A panel tracking the history of graph updates, featuring an "Explain Timeline" button that generates an AI-narrated summary of how the investigation unfolded.
The Tech Stack
Building a real-time, graph-based tool required a modern stack. Here's the full breakdown:
1. Frontend: React, Vite, & Cytoscape.js
We used React (bootstrapped with Vite) for the UI, with Zustand for state management and Tailwind CSS for styling.
For the actual graph visualization, we relied on Cytoscape.js with the fcose (force-directed) layout algorithm. It handled graph panning, zooming, and complex layout calculations. We even styled document nodes containing images as physical polaroid photos — complete with a slight rotation and drop shadow to sell the corkboard aesthetic.
Key frontend decisions:
- Axios interceptors for automatic JWT token attachment and 401 handling
- WebSocket connection per dataset for real-time graph synchronization
- Component architecture: Clean separation between EvidenceBoard, QueryConsole, NodeDetails, DocumentUpload, FilterBar, TimelinePanel, and LegendPanel (collapsible entity type legend)
2. Backend: FastAPI & Python
Our backend was powered by FastAPI with Python's async capabilities. It handled:
- Authentication: JWT-based auth with a demo "investigator" user for quick onboarding
- Data Ingestion: PDF parsing (pypdf), URL scraping (BeautifulSoup), and image OCR (GPT-4o-mini via OpenRouter)
- Entity Extraction: LLM-powered entity and relationship extraction with a deterministic regex fallback
- Real-time Sync: WebSocket manager to broadcast graph updates — when an investigator adds evidence, everyone looking at the board sees the new nodes and red strings appear instantly
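The real-time sync item can be sketched as a small per-dataset connection manager. This is a stdlib-only illustration, not the project's actual code: `BoardConnections` and `FakeSocket` are hypothetical names, and the only thing assumed about a socket is an async `send_json` method, which mirrors the method of that name on Starlette/FastAPI WebSocket objects.

```python
import asyncio
from collections import defaultdict


class BoardConnections:
    """Tracks open sockets per dataset and fans out graph updates to them."""

    def __init__(self):
        self._sockets = defaultdict(set)  # dataset_id -> set of socket objects

    def connect(self, dataset_id, socket):
        self._sockets[dataset_id].add(socket)

    def disconnect(self, dataset_id, socket):
        self._sockets[dataset_id].discard(socket)

    async def broadcast(self, dataset_id, message):
        """Send `message` to every socket on the dataset; drop sockets that fail."""
        delivered = 0
        for socket in list(self._sockets[dataset_id]):
            try:
                await socket.send_json(message)
                delivered += 1
            except Exception:
                # A broken connection should not block the other viewers.
                self.disconnect(dataset_id, socket)
        return delivered


class FakeSocket:
    """Stand-in for a real WebSocket so the sketch runs without a server."""

    def __init__(self, broken=False):
        self.broken = broken
        self.received = []

    async def send_json(self, message):
        if self.broken:
            raise ConnectionError("client went away")
        self.received.append(message)


async def demo():
    manager = BoardConnections()
    alice, bob, ghost = FakeSocket(), FakeSocket(), FakeSocket(broken=True)
    for socket in (alice, bob, ghost):
        manager.connect("case-1", socket)
    update = {"type": "graph_update", "nodes_added": 2}
    return await manager.broadcast("case-1", update), alice.received, bob.received


delivered, alice_inbox, bob_inbox = asyncio.run(demo())
print(delivered)  # 2
```

Keying connections by dataset means an update to one case is only pushed to the people viewing that case.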
3. The Memory Layer: PostgreSQL + Cognee
This is where the magic happened. We utilized a dual-store hybrid database layer:
- PostgreSQL: Used for key-value persistence, storing our active datasets and user lists as JSONB. Simple, reliable, and enough for structured state.
- Cognee: Our semantic brain. Whenever text was extracted from a PDF, URL, or image (using gpt-4o-mini for OCR via OpenRouter), we fed it into Cognee via cognee.remember(). Cognee indexed the raw texts as semantic graphs/vectors.
When an investigator queries the system via the Query Console, we hit cognee.recall(). If Cognee couldn't find a direct semantic answer, our backend gracefully fell back to a local Breadth-First Search (BFS) graph traversal to find the shortest path between entities.
This dual approach means the Query Console can still return a connection path whenever one exists in the stored graph, even if the semantic layer is unavailable or returns empty results.
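The fallback logic can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's code: `shortest_path`, `answer_query`, and the `BOARD` adjacency map are hypothetical, and the semantic answer is passed in as a plain string rather than coming from a real cognee.recall() call.

```python
from collections import deque


def shortest_path(adjacency, start, goal):
    """Return the shortest chain of entities from start to goal, or None."""
    if start not in adjacency or goal not in adjacency:
        return None
    previous = {start: None}  # back-pointers; also serves as the visited set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            # Walk the back-pointers to rebuild the path.
            path = []
            while current is not None:
                path.append(current)
                current = previous[current]
            return path[::-1]
        for neighbor in adjacency.get(current, []):
            if neighbor not in previous:
                previous[neighbor] = current
                queue.append(neighbor)
    return None


def answer_query(semantic_answer, adjacency, start, goal):
    """Prefer the semantic answer; fall back to BFS when it is empty."""
    if semantic_answer:
        return {"source": "semantic", "answer": semantic_answer}
    return {"source": "bfs", "path": shortest_path(adjacency, start, goal)}


# Hypothetical undirected graph, stored as entity -> neighbors.
BOARD = {
    "John Doe": ["Shell Holdings"],
    "Shell Holdings": ["John Doe", "Acme Corp"],
    "Acme Corp": ["Shell Holdings", "Jane Roe"],
    "Jane Roe": ["Acme Corp"],
    "Unrelated Ltd": [],
}

result = answer_query("", BOARD, "John Doe", "Acme Corp")
print(result["path"])  # ['John Doe', 'Shell Holdings', 'Acme Corp']
```

BFS visits entities in order of hop count, so the first time it reaches the goal it has found a path with the fewest intermediate entities. If the two entities are not connected, the path is `None`.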
The Ingestion & Feedback Pipeline
Our ingestion pipeline is designed to handle diverse evidence sources seamlessly.
Ingestion Flow
An investigator can upload a PDF, paste a URL, drop an image, or upload a video file:
- Text Extraction: Our backend extracts raw text using pypdf for PDFs, BeautifulSoup for URLs, the GPT-4o-mini Vision API for images (OCR), and a custom FFmpeg-based Media Extractor pipeline that extracts subtitles and key scene screenshots from video files.
- Entity & Relationship Extraction: The text is passed to an LLM via OpenRouter to extract entities (people, companies, locations, money) and their relationships. A regex-based fallback handles cases where the LLM is unavailable.
- Deduplication: Entities are deduplicated using slug-based keys to avoid duplicate nodes.
- Persistence: Entities and relationships are saved to PostgreSQL as structured graph data.
- Semantic Indexing: The raw text is fed into Cognee via cognee.remember() for deep semantic indexing.
- Real-time Push: The updated graph state is pushed via WebSockets to the frontend, triggering a live board refresh.
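The regex fallback and the slug-based deduplication from the flow above can be sketched together. The heuristics here are assumptions chosen for illustration (runs of capitalized words as names, a corporate suffix as the company signal), and `slugify` and `extract_entities` are hypothetical names; the project's real patterns are not shown in this post.

```python
import re

# Assumed heuristic: runs of two or more capitalized words look like names.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")
# Assumed heuristic: a trailing corporate suffix marks a company.
COMPANY_SUFFIXES = ("Corp", "Inc", "Ltd", "LLC")


def slugify(name):
    """Lowercase the name and collapse non-alphanumerics into single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def extract_entities(text):
    """Regex-only extraction, deduplicated by slug (first spelling wins)."""
    entities = {}
    for match in NAME_PATTERN.finditer(text):
        name = " ".join(match.group().split())  # normalize internal whitespace
        kind = "company" if name.endswith(COMPANY_SUFFIXES) else "person"
        entities.setdefault(slugify(name), {"name": name, "type": kind})
    return entities


text = "John Doe met Jane Roe at Acme Corp. Later, John  Doe wired funds."
found = extract_entities(text)
print(sorted(found))  # ['acme-corp', 'jane-roe', 'john-doe']
```

The second mention of John Doe has different spacing, but it produces the same slug and so maps onto the existing node instead of creating a duplicate.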
Feedback Loop
We also built a feedback loop that makes the system smarter over time:
- Investigators can "Verify", "Reject", or mark connections as neutral
- Verification increases edge confidence by 0.1 (up to 1.0)
- Rejection decreases edge confidence by 0.2 (down to 0.0)
- Neutral marks add 0.02 (a subtle signal of awareness without commitment)
- These feedback signals trigger cognee.improve() to dynamically adjust connection weights
- Low-confidence edges can be filtered out using the confidence slider in the UI
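The confidence arithmetic is simple enough to show directly. In this sketch the deltas and the 0.0 to 1.0 bounds come from the rules above; the verdict names, `apply_feedback`, and `visible_edges` are hypothetical, and the call to cognee.improve() is left out.

```python
# Deltas taken from the feedback rules; the verdict names are illustrative.
FEEDBACK_DELTAS = {"verify": 0.1, "reject": -0.2, "neutral": 0.02}


def apply_feedback(confidence, verdict):
    """Return the new edge confidence, clamped to the range [0.0, 1.0]."""
    updated = confidence + FEEDBACK_DELTAS[verdict]
    return min(1.0, max(0.0, updated))


def visible_edges(edges, threshold):
    """Keep edges whose confidence meets the slider threshold."""
    return [edge for edge in edges if edge["confidence"] >= threshold]


edges = [
    {"id": "doe-acme", "confidence": apply_feedback(0.95, "verify")},  # capped at 1.0
    {"id": "roe-acme", "confidence": apply_feedback(0.10, "reject")},  # floored at 0.0
]
print([edge["id"] for edge in visible_edges(edges, 0.5)])  # ['doe-acme']
```

Because a rejection costs twice what a verification earns, a disputed edge drops below the slider threshold faster than it climbs back.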
Challenges Faced: The Semantic Indexing Gap
No hackathon project is complete without a few bumps in the road. One of our biggest challenges was the Cognee Semantic Indexing Gap during case imports.
The Problem
We built import/export functionality to transfer active workspaces using ZIP archives. When importing, the system successfully rebuilt the PostgreSQL DocumentRecord, EntityRecord, and EdgeRecord records, and broadcast the UI update over WebSockets perfectly.
However, we initially forgot to invoke cognee.remember() for the imported documents.
The Impact
Because Cognee's database had no records of the imported dataset, natural language queries run against the imported case returned empty answers. The Query Console would show nothing: no entities, no paths, no confidence scores.
Thankfully, our local BFS fallback kicked in, keeping the visual Cytoscape paths functional. But the rich, context-aware semantic responses were completely offline for imported workspaces.
The Fix
We realized we needed to run Cognee's indexing as a FastAPI background task during import to avoid blocking the HTTP response:
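from fastapi import BackgroundTasks  # lets the import endpoint schedule this after responding

async def index_imported_documents(dataset_id: str, documents: list) -> None:
    # Index each imported document in Cognee, one at a time.
    for doc in documents:
        # _best_effort_cognee_remember is our project helper around cognee.remember()
        await _best_effort_cognee_remember(dataset_id, doc.text)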
This runs the indexing asynchronously — the import completes quickly and returns the graph to the UI immediately, while Cognee processes the documents in the background. After a few seconds, semantic queries start working on the imported case.
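The "best effort" part matters as much as the background part: one unreadable document should not stop the rest of the case from being indexed. The helper's body is not shown in this post, so the following is only a guess at its shape. `best_effort_remember`, `index_documents`, and `fake_remember` are hypothetical, and the memory call is injected as a parameter so the sketch runs without Cognee installed.

```python
import asyncio
import logging

logger = logging.getLogger("import_indexing")


async def best_effort_remember(remember, dataset_id, text):
    """Call `remember`, returning True on success and False on any failure."""
    try:
        await remember(dataset_id, text)
        return True
    except Exception:
        # Log and continue: one bad document should not abort the whole import.
        logger.exception("Indexing failed for dataset %s", dataset_id)
        return False


async def index_documents(remember, dataset_id, texts):
    """Index every text in order and report how many succeeded."""
    results = [await best_effort_remember(remember, dataset_id, t) for t in texts]
    return sum(results)


async def fake_remember(dataset_id, text):
    """Stand-in for the real memory call; rejects empty documents."""
    if not text.strip():
        raise ValueError("empty document")


indexed = asyncio.run(index_documents(fake_remember, "case-1", ["bio", "  ", "receipt"]))
print(indexed)  # 2
```

In a FastAPI endpoint, a function like this would typically be scheduled with `background_tasks.add_task(...)` on an injected `BackgroundTasks` parameter, so the response is sent before indexing starts.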
Lessons Learned
What Worked Well
- Cognee's API simplicity: The four operations (remember, recall, improve, forget) map perfectly to CRUD-like patterns. It was easy to integrate and reason about.
- Graph RAG over naive RAG: Having structured relationships between entities, not just flat text chunks, made the query results dramatically more useful. "How is X connected to Y?" is a graph traversal question, not a similarity search question.
- The BFS fallback: Having a deterministic fallback for when the semantic layer is unavailable meant the system was always functional, even under degraded conditions.
- Real-time WebSockets: The live graph updates made the tool feel responsive and collaborative — crucial for an investigation workflow.
What I'd Do Differently
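- Per-dataset PostgreSQL storage: Storing all datasets as a single JSONB blob invites race conditions, because two concurrent writes each rewrite the whole blob and the later one can overwrite the earlier one's changes. I'd move to per-dataset rows in a proper relational schema.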
- Background indexing from the start: Cognee indexing should always run as a background task. Running it synchronously adds latency to the ingestion endpoint.
- Typed entity model: The current entity extraction relies on LLM output parsing. A more structured extraction pipeline (with validated schemas) would reduce noise.
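As a rough idea of what a typed entity model could look like, here is a stdlib-only sketch that validates LLM output before it reaches the graph. The `Entity` dataclass and `parse_entities` are hypothetical, and the JSON shape (a list of objects with `name` and `type`) is an assumption; the allowed types are the four categories the extraction step targets.

```python
import json
from dataclasses import dataclass

ALLOWED_TYPES = {"person", "company", "location", "money"}


@dataclass(frozen=True)
class Entity:
    name: str
    type: str

    def __post_init__(self):
        if not isinstance(self.name, str) or not self.name.strip():
            raise ValueError("entity name must be a non-empty string")
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unknown entity type: {self.type!r}")


def parse_entities(raw_llm_output):
    """Parse LLM JSON into validated entities, collecting rejects separately."""
    try:
        items = json.loads(raw_llm_output)
    except json.JSONDecodeError:
        return [], [raw_llm_output]
    if not isinstance(items, list):
        return [], [items]
    valid, rejected = [], []
    for item in items:
        try:
            valid.append(Entity(name=item["name"], type=item["type"]))
        except (KeyError, TypeError, ValueError):
            rejected.append(item)
    return valid, rejected


raw = '[{"name": "Phil", "type": "person"}, {"name": "Vegas", "type": "vibe"}, {"name": ""}]'
valid, rejected = parse_entities(raw)
print(len(valid), len(rejected))  # 1 2
```

Keeping the rejects instead of silently dropping them makes it possible to see how often the model drifts from the expected schema.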
Looking Back
The WeMakeDevs Cognee hackathon was a phenomenal experience. It pushed us to think beyond standard RAG (Retrieval-Augmented Generation) and explore the bleeding edge of Graph RAG and persistent memory.
Building Connection Board proved that when you give an LLM a structured, graph-based memory layer like Cognee, it stops being just a conversational bot and becomes a powerful, stateful reasoning engine. We focused heavily on the hackathon's judging criteria—specifically the Best Use of Cognee and User Experience—by deeply integrating the memory lifecycle APIs into a highly polished, interactive UI.
The difference between a chatbot and an investigator is memory. Cognee gives you that memory.
If you are building AI applications today, I highly recommend checking out Cognee. Stop letting your AI wake up with a hangover. Give it a memory.