This is a submission for the DEV's Worldwide Show and Tell Challenge Presented by Mux
Modern enterprises generate an overwhelming amount of unstructured information—emails, documents, meeting transcripts, dashboards, and chats. While this data holds immense value, most of it remains siloed, unsearchable, and disconnected.
KnowledgeForge is an AI-powered knowledge management platform designed to solve this exact problem. It transforms unstructured enterprise content into a queryable knowledge graph, enabling teams to explore their organizational knowledge using natural language.
Instead of searching through folders, inboxes, or dashboards, you can simply ask questions like:
- “What did Rajesh Kumar discuss last quarter?”
- “Which topics are most associated with declining KPIs?”
KnowledgeForge handles the rest.
The Core Idea
At its heart, KnowledgeForge combines three powerful concepts:
- LLM-driven understanding of unstructured content
- Graph-based modeling of people, topics, and metrics
- Natural language access to both unstructured and structured data
By blending vector search, entity graphs, and virtualized data access, KnowledgeForge becomes a single intelligence layer over your organization’s knowledge.
Key Capabilities
Intelligent Entity Extraction
Every piece of content—whether it’s an email, document, or meeting transcript—is analyzed using large language models. The system automatically extracts:
- People mentioned
- Topics and themes
- Business metrics and KPIs
- Temporal references
This turns raw text into structured, reusable knowledge.
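As a rough illustration of what "structured, reusable knowledge" could look like in code, here is a minimal sketch that assumes the LLM is prompted to reply with JSON. The `ExtractedEntities` schema, its field names, and the `parse_extraction` helper are hypothetical and not taken from the KnowledgeForge repository; the sample reply stands in for a real model call.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ExtractedEntities:
    """Structured result of one extraction pass (hypothetical schema)."""
    people: list[str] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)
    metrics: list[str] = field(default_factory=list)
    dates: list[str] = field(default_factory=list)


def parse_extraction(raw_json: str) -> ExtractedEntities:
    """Parse an LLM's JSON reply, tolerating missing keys and stray whitespace."""
    data = json.loads(raw_json)

    def clean(key: str) -> list[str]:
        # Keep non-empty strings only, trimmed, in first-seen order, no duplicates.
        seen: dict[str, None] = {}
        for value in data.get(key, []):
            if isinstance(value, str) and value.strip():
                seen.setdefault(value.strip(), None)
        return list(seen)

    return ExtractedEntities(
        people=clean("people"),
        topics=clean("topics"),
        metrics=clean("metrics"),
        dates=clean("dates"),
    )


# Stand-in for a model reply (assumption: the prompt asks for these JSON keys).
sample_reply = (
    '{"people": ["Rajesh Kumar", " Rajesh Kumar "], '
    '"topics": ["Q3 churn"], "metrics": ["churn rate"]}'
)
entities = parse_extraction(sample_reply)
```

Validating the reply into a fixed shape before storing it keeps malformed or partial model output from leaking into later stages.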
A Living Knowledge Graph
Extracted entities don’t live in isolation. KnowledgeForge builds relationships between:
- Knowledge items (documents, meetings, emails)
- People
- Topics
- Metrics
- Time dimensions
Over time, this forms a rich graph that reflects how information, decisions, and outcomes are connected across the organization.
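One simple way to picture this graph is an undirected adjacency map over typed nodes. The sketch below is an in-memory illustration only; the `KnowledgeGraph` class, the `(kind, name)` node convention, and the sample identifiers are assumptions, and the real project may store relationships quite differently (for example in relational tables).

```python
from collections import defaultdict
from typing import Optional

Node = tuple[str, str]  # (kind, name), e.g. ("person", "Rajesh Kumar")


class KnowledgeGraph:
    """Undirected graph linking knowledge items to the entities they mention."""

    def __init__(self) -> None:
        self._edges: dict[Node, set[Node]] = defaultdict(set)

    def link(self, a: Node, b: Node) -> None:
        # Store the edge in both directions so traversal works from either end.
        self._edges[a].add(b)
        self._edges[b].add(a)

    def neighbors(self, node: Node, kind: Optional[str] = None) -> set[Node]:
        # Optionally filter neighbours by node kind ("person", "topic", ...).
        found = self._edges.get(node, set())
        return {n for n in found if kind is None or n[0] == kind}


graph = KnowledgeGraph()
doc = ("item", "q3-review-notes")  # hypothetical knowledge item
graph.link(doc, ("person", "Rajesh Kumar"))
graph.link(doc, ("topic", "churn"))
graph.link(doc, ("metric", "churn rate"))

people = graph.neighbors(doc, kind="person")
items_for_rajesh = graph.neighbors(("person", "Rajesh Kumar"), kind="item")
```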
Semantic Search That Understands Meaning
Traditional keyword search fails when phrasing changes. KnowledgeForge uses vector embeddings to enable semantic search, meaning you can retrieve relevant information even if the wording doesn’t match exactly.
This is especially powerful for exploratory questions and research-style queries.
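The idea behind embedding-based retrieval can be sketched with cosine similarity over a tiny in-memory index. The three-dimensional vectors and chunk names below are made up for illustration; a real deployment would use embeddings from a model and a vector database rather than this hand-rolled loop.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float], index: dict[str, list[float]], k: int = 2) -> list[str]:
    """Return the ids of the k chunks whose vectors are closest to the query."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in index.items()]
    return [chunk_id for _, chunk_id in sorted(scored, reverse=True)[:k]]


# Toy index: the first two chunks are "about" the same thing in different words.
index = {
    "revenue-dip-memo": [0.9, 0.1, 0.0],
    "sales-decline-call": [0.8, 0.2, 0.1],
    "office-party-invite": [0.0, 0.1, 0.9],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
```

Because ranking depends on vector direction rather than shared keywords, the two differently worded chunks about falling revenue both outrank the unrelated one.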
KB Genie: Conversational Knowledge Access
On top of the knowledge graph sits KB Genie, a conversational AI assistant.
KB Genie:
- Detects entities in user questions
- Traverses the knowledge graph
- Retrieves relevant document chunks using vector search
- Queries structured data when needed
- Generates clear, contextual answers with supporting evidence
The result feels less like “search” and more like talking to your organization’s collective memory.
Bridging Unstructured and Structured Data with Denodo
One of KnowledgeForge’s differentiators is its integration with the Denodo AI SDK.
While vector databases and LLMs excel at unstructured data, enterprises still rely heavily on structured systems like data warehouses. KnowledgeForge uses Denodo to:
- Translate natural language questions into SQL
- Query virtualized views across multiple data sources
- Combine structured query results with unstructured knowledge
This creates a unified experience where documents, metrics, and dashboards all answer the same question.
How the System Works
Ingestion Pipeline
When new content is ingested, it flows through a multi-step pipeline:
- Content is received (email, document, or transcript)
- LLMs extract entities and relationships
- Entities are resolved and deduplicated
- Content is chunked into semantically meaningful sections
- Embeddings are generated for each chunk
- Data is stored across relational and vector databases
Each step adds structure and context without losing the original content.
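The steps above can be sketched as a small function that wires the stages together. This is an illustration under stated assumptions, not the project's actual pipeline: `resolve`, `chunk_paragraphs`, `ingest`, and `StoredChunk` are hypothetical names, entity resolution is reduced to case and whitespace normalization, chunking is a simple paragraph packer, and the LLM extractor and embedding model are passed in as plain callables (stubbed here).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class StoredChunk:
    item_id: str
    text: str
    embedding: list[float]


def resolve(names: list[str]) -> list[str]:
    """Deduplicate entity names by a case- and whitespace-insensitive key."""
    canonical: dict[str, str] = {}
    for name in names:
        key = " ".join(name.lower().split())
        canonical.setdefault(key, " ".join(name.split()))
    return list(canonical.values())


def chunk_paragraphs(text: str, max_chars: int = 200) -> list[str]:
    """Pack whole paragraphs into chunks, starting a new chunk when max_chars would be exceeded."""
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        if current and len(current) + len(para) + 1 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def ingest(
    item_id: str,
    text: str,
    extract: Callable[[str], list[str]],
    embed: Callable[[str], list[float]],
) -> tuple[list[str], list[StoredChunk]]:
    """Extract and resolve entities, then chunk and embed the content."""
    entities = resolve(extract(text))
    chunks = [StoredChunk(item_id, c, embed(c)) for c in chunk_paragraphs(text)]
    return entities, chunks  # a real system would persist both at this point


def fake_extract(text: str) -> list[str]:
    return ["Rajesh Kumar", "rajesh  kumar", "Churn"]  # stub for the LLM call


def fake_embed(chunk: str) -> list[float]:
    return [float(len(chunk))]  # stub for an embedding model


entities, chunks = ingest(
    "email-1", "First paragraph.\n\nSecond paragraph.", fake_extract, fake_embed
)
```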
Query Flow
When a user asks a question:
- The system identifies entities in the query
- The knowledge graph is traversed to find related information
- Relevant document chunks are retrieved using semantic search
- Structured data is queried via Denodo when needed
- A language model synthesizes a final answer
This hybrid approach grounds each answer in graph relationships, retrieved text, and, where relevant, structured query results, rather than relying on the language model alone.
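A minimal sketch of that flow, assuming each stage is an injectable callable: the `QueryContext` class and the `answer` and `detect_entities` functions are hypothetical, entity detection is reduced to substring matching, and the graph lookup, vector search, structured query, and LLM synthesis are stubbed with lambdas rather than real services.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class QueryContext:
    """Everything gathered for one question before the final LLM call."""
    question: str
    entities: list[str] = field(default_factory=list)
    related_items: list[str] = field(default_factory=list)
    chunks: list[str] = field(default_factory=list)
    rows: list[dict] = field(default_factory=list)


def detect_entities(question: str, known: list[str]) -> list[str]:
    """Naive detection: case-insensitive substring match against known names."""
    lowered = question.lower()
    return [name for name in known if name.lower() in lowered]


def answer(
    question: str,
    known: list[str],
    graph_lookup: Callable[[str], list[str]],
    search: Callable[[str], list[str]],
    structured_query: Optional[Callable[[str], list[dict]]],
    synthesize: Callable[[QueryContext], str],
) -> str:
    ctx = QueryContext(question)
    ctx.entities = detect_entities(question, known)
    for entity in ctx.entities:
        ctx.related_items.extend(graph_lookup(entity))  # graph traversal
    ctx.chunks = search(question)                       # semantic retrieval
    if structured_query is not None:                    # only "when needed"
        ctx.rows = structured_query(question)
    return synthesize(ctx)                              # final LLM step


reply = answer(
    "What did Rajesh Kumar discuss last quarter?",
    known=["Rajesh Kumar", "churn"],
    graph_lookup=lambda name: ["q3-review-notes"],
    search=lambda q: ["Rajesh raised churn concerns in the Q3 review."],
    structured_query=None,
    synthesize=lambda ctx: (
        f"{len(ctx.related_items)} item(s), "
        f"{len(ctx.chunks)} chunk(s), {len(ctx.rows)} row(s)"
    ),
)
```

Collecting everything into one context object before synthesis makes it straightforward to show the supporting evidence alongside the generated answer.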
The Data Model
KnowledgeForge is built around a central concept: the knowledge item.
Every document, email, or meeting acts as an anchor node, connected to:
- People
- Topics
- Metrics
- Time
This design makes it easy to ask cross-cutting questions like:
- “Which people are most associated with this topic?”
- “What metrics tend to appear in discussions about delays?”
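To make the first of these questions concrete, here is a small sketch that counts how often each person co-occurs with a topic across knowledge items. The `KnowledgeItem` dataclass and `people_for_topic` function are hypothetical, and the sample items and the second person's name are invented purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class KnowledgeItem:
    """Anchor node: one document, email, or meeting and its linked entities."""
    item_id: str
    people: set[str] = field(default_factory=set)
    topics: set[str] = field(default_factory=set)
    metrics: set[str] = field(default_factory=set)


def people_for_topic(items: list[KnowledgeItem], topic: str) -> list[tuple[str, int]]:
    """Rank people by the number of items that mention both them and the topic."""
    counts: Counter[str] = Counter()
    for item in items:
        if topic in item.topics:
            counts.update(item.people)
    return counts.most_common()


items = [
    KnowledgeItem("meeting-1", people={"Rajesh Kumar", "A. Example"}, topics={"delays"}),
    KnowledgeItem("email-1", people={"Rajesh Kumar"}, topics={"delays"}),
    KnowledgeItem("doc-1", people={"A. Example"}, topics={"hiring"}),
]
ranking = people_for_topic(items, "delays")
```

The same pattern, swapping `people` for `metrics`, would answer the second question about which metrics appear in discussions of delays.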
Built for Real-World Engineering Teams
The platform is designed with modern stacks and scalability in mind:
- A FastAPI backend exposes ingestion, search, and agent endpoints
- A React-based frontend provides dashboards, exploration tools, and chat
- Vector storage enables fast semantic retrieval
- Relational storage ensures consistency and traceability
- Virtualized data access avoids duplicating enterprise data
Everything is modular, making it easy to extend or adapt to different enterprise environments.
Why KnowledgeForge Matters
Enterprises don’t suffer from a lack of data—they suffer from a lack of understanding.
KnowledgeForge turns scattered information into:
- Connected knowledge
- Searchable insights
- Conversational answers
It doesn’t replace your existing systems. It connects them, learns from them, and makes them usable by anyone who can ask a question.
GitHub: https://github.com/Shaik-mohd-huzaifa/knowledge-forge