DEV Community

Shaik Mohammed Huzaifa

KnowledgeForge: Turning Enterprise Chaos into a Living Knowledge Graph 🧠

This is a submission for the DEV's Worldwide Show and Tell Challenge Presented by Mux

Modern enterprises generate an overwhelming amount of unstructured information—emails, documents, meeting transcripts, dashboards, and chats. While this data holds immense value, most of it remains siloed, unsearchable, and disconnected.

KnowledgeForge is an AI-powered knowledge management platform designed to solve this exact problem. It transforms unstructured enterprise content into a queryable knowledge graph, enabling teams to explore their organizational knowledge using natural language.

Instead of searching through folders, inboxes, or dashboards, you can simply ask questions like:

“What did Rajesh Kumar discuss last quarter?”
“Which topics are most associated with declining KPIs?”

KnowledgeForge handles the rest.


The Core Idea

At its heart, KnowledgeForge combines three powerful concepts:

  1. LLM-driven understanding of unstructured content
  2. Graph-based modeling of people, topics, and metrics
  3. Natural language access to both unstructured and structured data

By blending vector search, entity graphs, and virtualized data access, KnowledgeForge becomes a single intelligence layer over your organization’s knowledge.


Key Capabilities

Intelligent Entity Extraction

Every piece of content—whether it’s an email, document, or meeting transcript—is analyzed using large language models. The system automatically extracts:

  • People mentioned
  • Topics and themes
  • Business metrics and KPIs
  • Temporal references

This turns raw text into structured, reusable knowledge.
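
To make "structured, reusable knowledge" a bit more concrete, here is a minimal sketch of how a model's reply could be turned into a typed record. It assumes the LLM has been prompted to answer with a JSON object; the keys (`people`, `topics`, `metrics`, `dates`) and the names `ExtractedEntities` and `parse_extraction` are made up for this illustration and are not taken from the KnowledgeForge repository.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ExtractedEntities:
    """Hypothetical container for what the LLM pulled out of one item."""
    people: list[str] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)
    metrics: list[str] = field(default_factory=list)
    dates: list[str] = field(default_factory=list)


def _clean(values) -> list[str]:
    """Trim whitespace, drop empties and non-strings, keep first-seen order."""
    if not isinstance(values, list):
        return []
    seen, out = set(), []
    for value in values:
        if not isinstance(value, str):
            continue
        value = value.strip()
        if value and value.lower() not in seen:
            seen.add(value.lower())
            out.append(value)
    return out


def parse_extraction(raw: str) -> ExtractedEntities:
    """Parse the model's JSON reply; return an empty record on bad input."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ExtractedEntities()
    if not isinstance(data, dict):
        return ExtractedEntities()
    return ExtractedEntities(
        people=_clean(data.get("people")),
        topics=_clean(data.get("topics")),
        metrics=_clean(data.get("metrics")),
        dates=_clean(data.get("dates")),
    )
```

Validating the reply before it reaches storage matters because model output is not guaranteed to be well-formed; here a malformed reply simply yields an empty record instead of raising.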


A Living Knowledge Graph

Extracted entities don’t live in isolation. KnowledgeForge builds relationships between:

  • Knowledge items (documents, meetings, emails)
  • People
  • Topics
  • Metrics
  • Time dimensions

Over time, this forms a rich graph that reflects how information, decisions, and outcomes are connected across the organization.
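
The relationships described above can be pictured with a tiny in-memory graph. This is only a sketch: the `KnowledgeGraph` class, the `(kind, name)` node shape, and the kind labels (`"item"`, `"person"`, `"topic"`, `"metric"`) are assumptions for illustration, and the actual project may store its graph quite differently.

```python
from collections import defaultdict
from typing import Optional

Node = tuple[str, str]  # (kind, name), e.g. ("person", "Rajesh Kumar")


class KnowledgeGraph:
    """Tiny undirected graph; a stand-in for whatever store the platform uses."""

    def __init__(self) -> None:
        self._edges: dict[Node, set[Node]] = defaultdict(set)

    def link(self, a: Node, b: Node) -> None:
        """Connect two nodes in both directions."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def neighbors(self, node: Node, kind: Optional[str] = None) -> set[Node]:
        """Direct neighbors of `node`, optionally filtered by kind."""
        found = self._edges.get(node, set())
        return {n for n in found if kind is None or n[0] == kind}

    def related(self, node: Node, kind: str) -> set[Node]:
        """Entities of `kind` that share at least one knowledge item with `node`."""
        result: set[Node] = set()
        for item in self.neighbors(node, "item"):
            result |= self.neighbors(item, kind)
        result.discard(node)
        return result


graph = KnowledgeGraph()
meeting = ("item", "Q3 review meeting")
graph.link(meeting, ("person", "Rajesh Kumar"))
graph.link(meeting, ("topic", "delivery delays"))
graph.link(meeting, ("metric", "on-time rate"))

# Topics that co-occur with a person, reached through shared knowledge items.
topics = graph.related(("person", "Rajesh Kumar"), "topic")
```

Routing every hop through a knowledge item keeps each relationship traceable to the document, email, or meeting that produced it.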


Semantic Search That Understands Meaning

Traditional keyword search fails when phrasing changes. KnowledgeForge uses vector embeddings to enable semantic search, meaning you can retrieve relevant information even if the wording doesn’t match exactly.

This is especially powerful for exploratory questions and research-style queries.
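
As a rough sketch of the idea, semantic search boils down to ranking stored chunks by how close their embedding vectors are to the query's vector. The example below uses cosine similarity with hand-written three-dimensional toy vectors; real embeddings come from a model and have hundreds of dimensions, and `cosine`, `top_k`, and the sample data are assumptions for illustration only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec: list[float],
          chunks: list[tuple[str, list[float]]],
          k: int = 3) -> list[str]:
    """Return the texts of the k chunks most similar to the query, best first."""
    scored = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in scored[:k]]


# Toy vectors, written by hand so that related sentences point the same way.
CHUNKS = [
    ("Revenue fell in Q3", [0.9, 0.1, 0.0]),
    ("Office party schedule", [0.0, 0.2, 0.9]),
    ("Sales declined last quarter", [0.8, 0.3, 0.1]),
]
QUERY_VEC = [1.0, 0.2, 0.0]  # stands in for the embedding of "drop in income"

best = top_k(QUERY_VEC, CHUNKS, k=2)
```

With these toy values, the two finance sentences rank above the party schedule even though none of them shares a keyword with the query phrase.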


KB Genie: Conversational Knowledge Access

On top of the knowledge graph sits KB Genie, a conversational AI assistant.

KB Genie:

  • Detects entities in user questions
  • Traverses the knowledge graph
  • Retrieves relevant document chunks using vector search
  • Queries structured data when needed
  • Generates clear, contextual answers with supporting evidence

The result feels less like “search” and more like talking to your organization’s collective memory.


Bridging Unstructured and Structured Data with Denodo

One of KnowledgeForge’s differentiators is its integration with the Denodo AI SDK.

While vector databases and LLMs excel at unstructured data, enterprises still rely heavily on structured systems like data warehouses. KnowledgeForge uses Denodo to:

  • Translate natural language questions into SQL
  • Query virtualized views across multiple data sources
  • Combine structured query results with unstructured knowledge

This creates a unified experience where documents, metrics, and dashboards all answer the same question.


How the System Works

Ingestion Pipeline

When new content is ingested, it flows through a multi-step pipeline:

  1. Content is received (email, document, or transcript)
  2. LLMs extract entities and relationships
  3. Entities are resolved and deduplicated
  4. Content is chunked into semantically meaningful sections
  5. Embeddings are generated for each chunk
  6. Data is stored across relational and vector databases

Each step adds structure and context without losing the original content.
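
Steps 3 and 4 are easy to picture in code. The sketch below is deliberately simplified: entity resolution is reduced to case and whitespace folding, and "semantically meaningful" chunking is approximated by packing whole paragraphs up to a size limit. The function names `resolve_entities` and `chunk_text` and the `max_chars` parameter are assumptions for this example, not the project's actual implementation.

```python
import re


def resolve_entities(names: list[str]) -> dict[str, str]:
    """Map each raw mention to a canonical name (first spelling seen wins).

    Only case and whitespace are normalized here; real entity resolution
    would also need aliases, nicknames, and fuzzy matching.
    """
    canonical: dict[str, str] = {}
    mapping: dict[str, str] = {}
    for raw in names:
        key = " ".join(raw.lower().split())
        canonical.setdefault(key, " ".join(raw.split()))
        mapping[raw] = canonical[key]
    return mapping


def chunk_text(text: str, max_chars: int = 500) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_chars where possible.

    A single paragraph longer than max_chars is kept whole rather than split.
    """
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries instead of fixed character offsets keeps each chunk readable on its own, which tends to help both the embedding step and the evidence shown alongside answers.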


Query Flow

When a user asks a question:

  1. The system identifies entities in the query
  2. The knowledge graph is traversed to find related information
  3. Relevant document chunks are retrieved using semantic search
  4. Structured data is queried via Denodo when needed
  5. A language model synthesizes a final answer

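This hybrid approach grounds each answer in three sources: relationships from the graph, passages retrieved by semantic search, and, when the question calls for it, live structured data.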
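
The five steps can be sketched as a single orchestration function. Everything here is hypothetical: the function `answer` and its hook parameters (`detect_entities`, `traverse_graph`, `search_chunks`, `query_structured`, `needs_structured`, `synthesize`) are placeholders invented for this example. In particular, `query_structured` merely stands in for the Denodo call; it makes no claim about the Denodo AI SDK's real interface.

```python
from typing import Callable, Optional


def answer(
    question: str,
    *,
    detect_entities: Callable[[str], list[str]],
    traverse_graph: Callable[[list[str]], list[str]],
    search_chunks: Callable[[str], list[str]],
    synthesize: Callable[[str, list[str]], str],
    query_structured: Optional[Callable[[str], list[str]]] = None,
    needs_structured: Callable[[str], bool] = lambda q: False,
) -> str:
    """Run the query flow; each hook is a pluggable, hypothetical component."""
    entities = detect_entities(question)            # 1. find entities
    evidence = list(traverse_graph(entities))       # 2. walk the graph
    evidence += search_chunks(question)             # 3. semantic retrieval
    if query_structured is not None and needs_structured(question):
        evidence += query_structured(question)      # 4. structured data, if needed
    return synthesize(question, evidence)           # 5. final answer


# Stub hooks so the flow can be exercised without any external service.
reply = answer(
    "Which topics are most associated with declining KPIs?",
    detect_entities=lambda q: ["KPIs"],
    traverse_graph=lambda ents: [f"graph: items linked to {e}" for e in ents],
    search_chunks=lambda q: ["chunk: notes from a KPI review"],
    synthesize=lambda q, ev: f"{len(ev)} pieces of evidence used",
    query_structured=lambda q: ["rows: KPI values by quarter"],
    needs_structured=lambda q: "KPI" in q,
)
```

Passing the components in as callables keeps the flow testable in isolation and makes the structured-data step optional, which matches the "when needed" wording of step 4.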


The Data Model

KnowledgeForge is built around a central concept: the knowledge item.

Every document, email, or meeting acts as an anchor node, connected to:

  • People
  • Topics
  • Metrics
  • Time

This design makes it easy to ask cross-cutting questions like:

  • “Which people are most associated with this topic?”
  • “What metrics tend to appear in discussions about delays?”
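
One way to picture this model is as a small relational schema with the knowledge item in the middle and link tables around it. The sketch below uses SQLite from the Python standard library and covers only people and topics; the table names, column names, and sample rows are all assumptions for illustration and are not taken from the project's actual schema.

```python
import sqlite3

# Hypothetical schema: knowledge items anchor many-to-many links to entities.
SCHEMA = """
CREATE TABLE knowledge_item (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE topic (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE item_person (
    item_id INTEGER REFERENCES knowledge_item(id),
    person_id INTEGER REFERENCES person(id),
    PRIMARY KEY (item_id, person_id)
);
CREATE TABLE item_topic (
    item_id INTEGER REFERENCES knowledge_item(id),
    topic_id INTEGER REFERENCES topic(id),
    PRIMARY KEY (item_id, topic_id)
);
"""

# "Which people are most associated with this topic?"
PEOPLE_FOR_TOPIC = """
SELECT p.name, COUNT(*) AS shared_items
FROM topic t
JOIN item_topic it ON it.topic_id = t.id
JOIN item_person ip ON ip.item_id = it.item_id
JOIN person p ON p.id = ip.person_id
WHERE t.name = ?
GROUP BY p.id, p.name
ORDER BY shared_items DESC, p.name
"""


def people_for_topic(conn: sqlite3.Connection, topic: str) -> list[tuple[str, int]]:
    """People ranked by how many knowledge items they share with the topic."""
    return conn.execute(PEOPLE_FOR_TOPIC, (topic,)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """In-memory database filled with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO knowledge_item VALUES (?, ?)",
                     [(1, "Q3 review"), (2, "Ops sync"), (3, "Hiring plan")])
    conn.executemany("INSERT INTO person VALUES (?, ?)",
                     [(1, "Rajesh Kumar"), (2, "Priya Nair")])
    conn.executemany("INSERT INTO topic VALUES (?, ?)",
                     [(1, "delays"), (2, "hiring")])
    conn.executemany("INSERT INTO item_person VALUES (?, ?)",
                     [(1, 1), (2, 1), (1, 2), (3, 2)])
    conn.executemany("INSERT INTO item_topic VALUES (?, ?)",
                     [(1, 1), (2, 1), (3, 2)])
    return conn
```

Metrics and time dimensions would follow the same pattern with their own entity and link tables, so the second question above becomes the same kind of join through `knowledge_item`.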


Built for Real-World Engineering Teams

The platform is designed with modern stacks and scalability in mind:

  • A FastAPI backend exposes ingestion, search, and agent endpoints
  • A React-based frontend provides dashboards, exploration tools, and chat
  • Vector storage enables fast semantic retrieval
  • Relational storage ensures consistency and traceability
  • Virtualized data access avoids duplicating enterprise data

Everything is modular, making it easy to extend or adapt to different enterprise environments.


Why KnowledgeForge Matters

Enterprises don’t suffer from a lack of data—they suffer from a lack of understanding.

KnowledgeForge turns scattered information into:

  • Connected knowledge
  • Searchable insights
  • Conversational answers

It doesn’t replace your existing systems. It connects them, learns from them, and makes them usable by anyone who can ask a question.

GitHub: https://github.com/Shaik-mohd-huzaifa/knowledge-forge
