As AI technology booms, RAG (Retrieval-Augmented Generation) has become a game changer, quickly turning into a partner for problem-solving and domain-specific applications, and that is what makes RAG unique.
However, RAG has problems: there is a large gap between vector similarity and the relevance needed for knowledge reasoning, and it is insensitive to knowledge logic (such as numerical values, time relationships, and expert rules), which hinders the delivery of professional knowledge services.
Can you imagine? You have a chatbot that needs to reason over specific relationships between knowledge fragments in order to collect the information required to answer a question. However, RAG usually relies on text or vector similarity to retrieve reference information, which can lead to incomplete and repetitive search results.
That’s where KAG comes in. Knowledge Augmented Generation aims to fully utilize the advantages of knowledge graphs and vector retrieval, and to bidirectionally enhance large language models and knowledge graphs to solve these problems.
Thanks to knowledge alignment based on semantic reasoning, KAG significantly outperforms methods such as NaiveRAG and HippoRAG on multi-hop question-answering tasks, with a relative improvement of 19.6% in F1 score on HotpotQA and 33.5% on 2WikiMultiHopQA.
These performance leaps are mainly attributed to more efficient index construction, knowledge alignment, and the hybrid problem-solving libraries in the framework.
So, let me give you a quick demo of a live chatbot to show you what I mean.
Check out the video:
We have two separate parts. First, we start with knowledge management, where I will upload a PDF that includes charts, tables, and images. Let me show you how KAG extracts the data. KAG employs a knowledge representation model to organize information into a structured format, making it compatible with both structured and unstructured data.
Next, KAG uses a mutual-indexing mechanism that links the knowledge graph with the original text chunks. This index efficiently retrieves relevant information based on user queries and connects structured knowledge with unstructured data.
Please stay tuned until the end because I will show you something cool: how to use SQL and Neo4j to extract your data.
Then, we move to the Knowledge Base Question and Answer section. I will ask it a straightforward question: “What is the net income in 2024?” When I pose the question, KAG processes the query to understand its intent and context. This involves identifying key entities, relationships, and the overall structure of the question. Afterwards, KAG generates a logical form based on my query. It then retrieves relevant information from the knowledge graph (KG), including entities, relationships, triples, and data aggregations, to generate a clear and understandable answer for the user.
In this video, I’ll quickly go over the documentation so you are 100% up to speed on what KAG is, what features it offers, how it works, and what the difference is between GraphRAG and KAG. We’ll even install the application on-screen so you can copy, paste, and adapt the steps for your own use.
What is KAG :
The Knowledge Augmented Generation (KAG) framework is open-source and fully utilizes the complementary advantages of knowledge graphs and RAG technology. It not only integrates the graph structure into the knowledge base, but also brings the knowledge graph’s semantic types, relations, and knowledge graph question answering (KGQA) capabilities into KAG.
Features
The Knowledge Augmented Generation (KAG) framework has several important features that make it better at answering questions in professional settings. These features are:
- LLM-Friendly Knowledge Representation:
KAG uses a system (LLMFriSPG) that works well with large language models (LLMs). It helps LLMs understand data, information, and knowledge, making it easier to use knowledge graphs.
- Mutual Indexing:
The framework connects knowledge graphs with original text through mutual indexing. It helps find and organize information more easily, linking structured knowledge with unstructured text.
- Logical-Form-Guided Hybrid Reasoning Engine:
KAG has a reasoning engine that combines different types of reasoning, like planning, retrieval, and math. It allows KAG to turn natural language questions into structured steps for solving problems, making it better at handling complex questions.
- Knowledge Alignment with Semantic Reasoning:
KAG uses semantic reasoning to match the knowledge with the user’s question. It improves the accuracy of the answers by ensuring the information fits the context and is aligned with the user’s needs.
- Improved Natural Language Processing:
The framework improves basic tasks like understanding, reasoning, and generating language. These improvements help KAG better understand questions, think through them, and generate clear answers.
How It Works:
As shown in the figure above, the KAG architecture consists of three core components: KAG-Builder, KAG-Solver and KAG-Model.
KAG-Builder is responsible for building offline indexes. This module proposes a knowledge representation framework compatible with large language models and implements a mutual indexing mechanism between knowledge structures and text fragments.
Path to Dir: [link]
KAG-Solver introduces a hybrid reasoning engine guided by logical forms, integrating large language model reasoning, knowledge reasoning, and mathematical logic reasoning. Semantic reasoning is used for knowledge alignment to enhance the accuracy of KAG-Builder and KAG-Solver in knowledge representation and retrieval.
Path to Dir: [link]
KAG-Model is based on a general language model and optimizes the specific capabilities required by each module, thereby improving overall module performance.
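If you want to browse the KAG-Builder and KAG-Solver code referenced above, you can clone the open-source repository (OpenSPG/KAG on GitHub) and explore the module folders yourself; note that the exact directory layout may change between releases.
git clone https://github.com/OpenSPG/KAG.git
ls KAG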
KAG vs GraphRAG
KAG and GraphRAG differ primarily in their integration and reasoning capabilities. KAG fully leverages knowledge graphs (KGs) by incorporating semantic relationships and employing a hybrid reasoning engine for logical, retrieval, and numerical tasks, enabling structured and complex problem-solving.
It enhances general LLM capabilities in professional domains with improved semantic alignment and tailored Natural Language Understanding, Natural Language Inference, and Natural Language Generation.
In contrast, GraphRAG focuses more on retrieval and generation, with less emphasis on semantic reasoning, logical planning, and domain-specific performance, potentially limiting its effectiveness for complex queries and professional applications.
Step-by-Step Process :
The KAG graph backend service is based on the OpenSPG knowledge graph construction framework, which we discussed above. First, build the graph server using the official OpenSPG-Server documentation.
Let’s go to the Docker website and download Docker Desktop for Windows.
Once Docker Desktop is installed, open the terminal and run the following command:
curl -sSL https://raw.githubusercontent.com/OpenSPG/openspg/refs/heads/master/dev/release/docker-compose.yml -o docker-compose.yml
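With the compose file downloaded, bring the services up in the background. I’m assuming here that you have the Docker Compose v2 plugin that ships with Docker Desktop:
docker compose -f docker-compose.yml up -d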
Then we check that our services are up and running with this command:
docker ps
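If the plain docker ps output is too noisy, an optional filter narrows it down to just the OpenSPG containers by name (the release-openspg prefix matches the container names used later in this guide):
docker ps --filter "name=release-openspg" --format "table {{.Names}}\t{{.Status}}"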
To ensure everything is working correctly, check the logs of the main service with this command:
docker logs -f release-openspg-server
Copy the address below and paste it into your browser to access the openspg-kag product interface:
http://127.0.0.1:8887/
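If you prefer to check from the terminal first, a quick curl against the same address should print an HTTP status code once the web server is ready. This is just a generic reachability check, not an official health endpoint:
curl -s -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8887/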
Create a knowledge base
Then we click Create Knowledge Base. First, we choose a Chinese name for the knowledge base. Next, you’ll need an English name; remember, it must start with a capital letter, contain at least three characters, and include only letters and numbers. In this video, I will name it something like ‘KAGDemo’.
After that, we’ll set up the graph storage configuration. We copy a simple JSON setup; by default, you can use the local Neo4j database.
{
  "uri": "neo4j://release-openspg-neo4j:7687",
  "user": "neo4j"
}
Now, let’s move to the model configuration. Choose a model such as ChatGPT or DeepSeek, then add your API key and other details in JSON format.
{
  "client_type": "maas",
  "model": "gpt-4o",
  "base_url": "https://api.openai.com/v1",
  "api_key": "Your_API_Key"
}
For embedding, I will use OpenAI Embedding. You can also use Ollama for embedding, as it has some cool embedding models.
{
  "vectorizer": "kag.common.vectorizer.OpenAIVectorizer",
  "model": "text-embedding-3-small",
  "base_url": "https://api.openai.com/v1",
  "api_key": "Your_API_Key"
}
Lastly, you’ll need to set the language for your Knowledge Base, either Chinese (zh) or English (en). I will keep it as the default.
Keep in mind that you can set up the chatbot fully locally using Ollama. If you want to know how to do it, please check this document.
{
  "biz_scene": "default",
  "language": "en"
}
Once you save the configuration successfully, you will see a small panel that includes Knowledge Management and Question and Answer, as shown below.
If the configuration does not save successfully, you may run into the same problem I faced when installing Neo4j: an unknown error.
The way to solve this problem is simple: just check whether release-openspg-neo4j started successfully, then rerun the container.
docker logs release-openspg-neo4j
docker start release-openspg-neo4j
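If you want to script the wait instead of re-checking by hand, a small loop like the one below polls the Neo4j logs until the usual “Started.” readiness line shows up. I’m assuming Neo4j’s standard startup message here, so adjust the pattern if your logs look different:
# Poll the Neo4j container logs until the standard "Started." line appears
until docker logs release-openspg-neo4j 2>&1 | grep -q "Started."; do
  echo "Waiting for Neo4j to finish starting..."
  sleep 5
done
echo "Neo4j is up."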
Chatbot Demo :
Let’s click on Knowledge Management. Once you click it, create a task, give your knowledge task a name, and choose the local files to upload; files with any of the supported suffixes can be uploaded through the knowledge base management page to carry out the knowledge base construction process.
We click Next Step, keep the Max Segment Length at its default, and hit Next Step again.
Once you see this screen, keep the defaults and hit the Finish button. Once you finish, you can create as many tasks as you like; the more knowledge you add, the better your chatbot becomes.
Note: You need to wait until the icon turns green, which means the task status is completed. To check the progress, click the log view icon and make sure all content has been extracted successfully, as shown below.
One of the many features I like about OpenSPG is that we can use the Neo4j browser to extract knowledge and check the knowledge extraction results. This feature is really helpful for anyone who wants to inspect the data with a Cypher query and ensure that the chatbot generates accurate responses.
http://127.0.0.1:7474/browser/
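Besides the browser, you can also run a quick Cypher sanity check straight from the terminal with cypher-shell inside the Neo4j container. Replace the password placeholder with the password defined in your docker-compose.yml; the query itself just counts nodes per label, which is a handy way to confirm the extraction produced a graph:
# Count nodes per label to confirm the knowledge extraction produced data
docker exec -it release-openspg-neo4j \
  cypher-shell -u neo4j -p '<your-neo4j-password>' \
  "MATCH (n) RETURN labels(n) AS labels, count(*) AS nodes ORDER BY nodes DESC LIMIT 10;"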
So, let’s move to the Question and Answer section to interact with the chatbot and test it out. I will ask the chatbot a complex question from the KAG paper:
What are the names of all the models that are fine-tuning-free?
If you take a look, you’ll see that when I asked the question, the chatbot used logical reasoning to generate the output. The answer is accurate, well-structured, easy for non-technical users to understand, and more precise, without any unrelated information.
Conclusion :
The KAG framework is still in the early stages of development, so there’s room for changes and improvements. With new features like custom schemas and visual queries, KAG not only enhances the accuracy and efficiency of knowledge extraction and question-answering but also strengthens its foundation. These updates pave the way for developing more robust and reliable professional knowledge services.
Plus, the abstract generation classes have been optimized. If we try using different scale models at various stages, KAG’s performance could get even better. And hey, since KAG is open-source, we should take advantage of the code and see how it can help us create custom solutions for whatever we need!
If you think this article might be helpful to your friends, please forward it to them.
Reference :
🧙♂️ I am a Generative AI expert! If you want to collaborate on a project, drop an inquiry here or book a 1-on-1 consulting call with me.
I would highly appreciate it if you:
❣ Join my Patreon: https://www.patreon.com/GaoDalie_AI
Book an appointment with me: https://topmate.io/gaodalie_ai
Support the content (every dollar goes back into the video): https://buymeacoffee.com/gaodalie98d
Subscribe to the newsletter for free: https://substack.com/@gaodalie