In my previous blog on Building 100% local AI memory with cognee, we explored setting up a completely local AI memory system. But one of Cognee's most powerful features is its flexibility—you're not locked into a single provider. You can mix and match LLM providers, embedding models, and databases to fit your specific needs.
Cognee is an open-source memory framework that replaces traditional RAG pipelines with a structured knowledge graph approach; its published benchmarks report 92.5% accuracy compared to around 60% for plain RAG. Whether you're building locally or in the cloud, Cognee adapts to your infrastructure.
This blog demonstrates Cognee's cross-platform capabilities by using OpenAI's GPT models for high-quality entity extraction while keeping embeddings local with Ollama. More importantly, we'll dive deep into the different search types Cognee offers—from simple vector similarity to advanced graph traversal—each optimized for specific use cases. Understanding these search modes is crucial for getting accurate, relevant results from your knowledge graphs.
Primary Search Types (User-Facing)
1. SUMMARIES
Returns pre-generated hierarchical summaries that were created during the cognify process. When you query with this type, it performs a vector similarity search against stored summary nodes and returns the most relevant pre-computed summaries without requiring LLM processing at query time. This makes it extremely fast for getting quick overviews of content, as the summarization work was already done during data processing.
2. INSIGHTS
Retrieves structured entity relationships and semantic connections directly from the knowledge graph. It performs vector search to find relevant entities, then traverses the graph to extract their relationships (edges) and returns them in a human-readable format showing how concepts connect to each other. This is ideal for understanding the structure of knowledge without needing natural language generation - you get raw relationship data like "Entity A --[relationship_type]--> Entity B".
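To make that edge format concrete, here is a minimal illustrative sketch (my own, not Cognee's internals) that renders graph edges in the "Entity A --[relationship_type]--> Entity B" style INSIGHTS returns:

```python
# Illustrative only: how graph edges map to the human-readable
# INSIGHTS output format. Not Cognee's internal code.
def format_edge(source: str, relationship: str, target: str) -> str:
    """Render a graph edge in a human-readable form."""
    return f"{source} --[{relationship}]--> {target}"

edges = [
    ("Machine learning", "is_subset_of", "Artificial Intelligence"),
    ("Deep learning", "uses", "neural networks"),
]
for src, rel, dst in edges:
    print(format_edge(src, rel, dst))
```

The actual relationship names depend on what the cognify step extracted from your data.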
3. CHUNKS
Performs pure vector similarity search to find and return raw text segments that semantically match your query. It searches the vector database for document chunks with embeddings similar to your query embedding, then returns the actual text content of those chunks along with their metadata (source document, position, etc.). This is the fastest search type because it bypasses both graph traversal and LLM processing, making it perfect for finding specific passages or citations.
4. RAG_COMPLETION
Traditional Retrieval-Augmented Generation that retrieves relevant document chunks via vector search, then feeds them as context to an LLM to generate a natural language answer. Unlike graph-based approaches, this treats documents as flat collections of chunks without leveraging the knowledge graph structure. It's useful when you want LLM-generated answers but don't need the deeper semantic understanding that comes from graph relationships.
5. GRAPH_COMPLETION (Default)
The most sophisticated search type that combines vector search, graph traversal, and LLM reasoning. It first finds relevant entities through vector search, then traverses the knowledge graph to gather connected information (relationships, related entities), and finally uses an LLM to synthesize this graph-structured context into a coherent natural language answer. This provides the most intelligent and contextually-aware responses because it understands how concepts relate to each other through the graph structure.
6. CODE
Specialized search for code repositories that understands programming language syntax and semantics. It searches through code-specific knowledge graphs built by the codify process, returning structured information about functions, classes, methods, and their relationships. The results include code context, implementation details, and how different code elements connect, making it ideal for understanding codebases.
7. FEELING_LUCKY
An intelligent meta-search type that automatically selects the most appropriate search type for your query. It uses an LLM to analyze your query and determine whether it's best answered by graph completion, insights, chunks, or another search type, then executes that search. This is perfect when you're unsure which search type to use or want the system to make the best choice automatically.
Advanced/Specialized Search Types
8. GRAPH_SUMMARY_COMPLETION
An enhanced version of GRAPH_COMPLETION that adds an intermediate summarization step. After retrieving graph edges through vector search and traversal, it summarizes the retrieved context before passing it to the LLM for final answer generation. This reduces redundancy in the context and can improve answer quality when dealing with large amounts of retrieved information.
9. GRAPH_COMPLETION_COT (Chain-of-Thought)
Implements iterative reasoning by generating follow-up questions and refining answers through multiple rounds. After an initial graph completion, it validates the answer, generates follow-up questions based on reasoning gaps, retrieves additional context for those questions, and produces a refined final answer. This mimics human-like reasoning by breaking down complex questions into steps.
10. GRAPH_COMPLETION_CONTEXT_EXTENSION
Iteratively expands the context by retrieving related graph triplets over multiple rounds. It starts with initial context, generates a completion, uses that completion to query for more related triplets, adds them to the context, and repeats until no new information is found or a maximum number of rounds is reached. This ensures comprehensive context gathering by following chains of relationships in the graph.
11. FEEDBACK
Not actually a search type but a mechanism to save user feedback on search results. It records whether an answer was helpful or not, connecting this feedback to the query, answer, and graph triplets that were used. This enables learning from user interactions to improve future search quality.
12. TEMPORAL
Time-aware search that extracts temporal constraints from queries and filters results accordingly. It parses time expressions in your query (like "last week" or "in 2023"), converts them to time intervals, and filters graph nodes/events that fall within those time ranges. This is essential for queries about when things happened or finding information from specific time periods.
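As an illustration of that interval idea (my own sketch, not Cognee's actual parser), a year mention like "in 2023" can be mapped to a [start, end) interval and used to filter timestamped events:

```python
from datetime import datetime

# Illustrative sketch of temporal filtering -- not Cognee's parser.
def year_interval(year: int):
    """Map a year mention like 'in 2023' to a [start, end) interval."""
    return datetime(year, 1, 1), datetime(year + 1, 1, 1)

# Hypothetical timestamped events
events = [
    ("GPT-4 released", datetime(2023, 3, 14)),
    ("First DevDay", datetime(2023, 11, 6)),
    ("Older event", datetime(2021, 6, 1)),
]
start, end = year_interval(2023)
matches = [name for name, ts in events if start <= ts < end]
print(matches)  # only the events that fall inside 2023
```

Cognee applies the same kind of constraint to graph nodes and events after extracting time expressions from the query.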
Setting Up the Hybrid Environment
This setup demonstrates Cognee's flexibility by combining OpenAI's GPT models with local Ollama embeddings—getting the best of both worlds.
Installing Ollama
For Windows and macOS: Visit ollama.com and download the installer. Run it and follow the prompts; the server starts automatically on http://localhost:11434.
For Linux: Run this command in your terminal:
curl -fsSL https://ollama.com/install.sh | sh
Verify the installation:
ollama --version
Pulling the Embedding Model
We'll use OpenAI for LLM tasks but keep embeddings local for cost efficiency. Pull the embedding model:
ollama pull avr/sfr-embedding-mistral:latest
Verify it's ready:
ollama list
You should see avr/sfr-embedding-mistral:latest in the output.
Installing Cognee
Install Cognee with Ollama support:
pip install "cognee[ollama]"
Configuring the Hybrid Setup
Create a .env file in your project directory with this configuration:
# LLM Configuration - OpenAI
LLM_API_KEY="your_openai_api_key"
LLM_MODEL="gpt-4o-mini"
LLM_PROVIDER="openai"
# Embedding Configuration - Local Ollama
EMBEDDING_PROVIDER="ollama"
EMBEDDING_MODEL="avr/sfr-embedding-mistral:latest"
EMBEDDING_ENDPOINT="http://localhost:11434/api/embeddings"
EMBEDDING_API_VERSION=""
EMBEDDING_DIMENSIONS=4096
HUGGINGFACE_TOKENIZER="Salesforce/SFR-Embedding-Mistral"
# Database Settings (defaults)
DB_PROVIDER="sqlite"
DB_NAME="cognee_db"
VECTOR_DB_PROVIDER="lancedb"
GRAPH_DATABASE_PROVIDER="kuzu"
Key Configuration Notes:
- Replace "your_openai_api_key" with your actual OpenAI API key from platform.openai.com
- LLM_PROVIDER is set to "openai" for entity extraction and reasoning
- EMBEDDING_PROVIDER is set to "ollama" for local vector generation
- This hybrid approach uses OpenAI's intelligence while keeping embeddings private and cost-effective
Cognee automatically loads this configuration when imported. You're now ready to build knowledge graphs with the best of both worlds!
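Before running the pipeline, it can be worth a quick smoke test that the local embedding endpoint answers. This is a standalone sketch against Ollama's /api/embeddings endpoint (the model name matches the .env above); it is not part of Cognee:

```python
import json
from urllib import request

# Standalone smoke test for the local Ollama embedding endpoint.
# Not part of Cognee -- just a sanity check of the .env settings.
ENDPOINT = "http://localhost:11434/api/embeddings"
MODEL = "avr/sfr-embedding-mistral:latest"

def build_request(text: str) -> request.Request:
    """Build the POST request Ollama expects: {"model": ..., "prompt": ...}."""
    payload = json.dumps({"model": MODEL, "prompt": text}).encode()
    return request.Request(
        ENDPOINT, data=payload, headers={"Content-Type": "application/json"}
    )

def check_endpoint():
    """Return the embedding dimension, or None if Ollama is not running."""
    try:
        with request.urlopen(build_request("hello world"), timeout=10) as resp:
            return len(json.loads(resp.read())["embedding"])
    except OSError:
        return None

print(check_endpoint())  # should match EMBEDDING_DIMENSIONS (4096) when the model is pulled
```

If this prints None, start Ollama and re-run `ollama pull avr/sfr-embedding-mistral:latest` before continuing.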
1. SUMMARIES Example
import cognee
from cognee import SearchType
# Sample data: Article about artificial intelligence
data = """
Artificial Intelligence (AI) has revolutionized modern technology. Machine learning,
a subset of AI, enables computers to learn from data without explicit programming.
Deep learning, using neural networks, has achieved breakthroughs in image recognition,
natural language processing, and autonomous vehicles. Companies worldwide are investing
billions in AI research to stay competitive in the digital age.
"""
# Add data
await cognee.add(data, "ai_articles")
# Process into knowledge graph
await cognee.cognify(["ai_articles"])
# Search for summaries
search_results = await cognee.search(
    query_type=SearchType.SUMMARIES,
    query_text="What are the main topics about AI?",
)
# Display results
for result in search_results:
    print(result)
2. INSIGHTS Example
import cognee
from cognee import SearchType
# Sample data: Scientific explanation
data = """
Quantum computing leverages quantum mechanics principles like superposition and entanglement.
Unlike classical bits, qubits can exist in multiple states simultaneously. This enables
quantum computers to solve certain problems exponentially faster than classical computers.
Major tech companies like IBM, Google, and Microsoft are racing to build practical quantum
computers for cryptography, drug discovery, and optimization problems.
"""
# Add and process data
await cognee.add(data, "quantum_dataset")
await cognee.cognify(["quantum_dataset"])
# Search for entity relationships
search_results = await cognee.search(
    query_type=SearchType.INSIGHTS,
    query_text="quantum computing"
)
# Results show entity relationships
for result in search_results:
    print(f"{result}\n")
3. CHUNKS Example
import cognee
from cognee import SearchType
# Sample data: Historical text
data = """
The Renaissance was a period of cultural rebirth in Europe from the 14th to 17th century.
It began in Italy and spread throughout Europe, marking the transition from medieval to
modern times. Key figures included Leonardo da Vinci, Michelangelo, and Galileo. The
Renaissance saw advances in art, science, literature, and philosophy. The printing press,
invented by Gutenberg, revolutionized information dissemination.
"""
# Add and process
await cognee.add(data, "history_dataset")
await cognee.cognify(["history_dataset"])
# Search for specific text chunks
search_results = await cognee.search(
    query_type=SearchType.CHUNKS,
    query_text="Renaissance art and science",
    datasets=["history_dataset"]
)
# Display chunks
for result in search_results:
    print(f"{result}\n")
4. RAG_COMPLETION Example
import cognee
from cognee import SearchType
# Sample data: Business report
data = """
The global e-commerce market reached $5.7 trillion in 2023, growing 15% year-over-year.
Mobile commerce accounts for 72% of all e-commerce sales. Amazon dominates with 38%
market share, followed by Alibaba and JD.com. Key trends include AI-powered personalization,
voice commerce, and sustainable packaging. Experts predict the market will exceed $8 trillion
by 2027, driven by emerging markets and technological innovation.
"""
# Add and process
await cognee.add(data, "business_reports")
await cognee.cognify(["business_reports"])
# RAG completion with LLM
search_results = await cognee.search(
    query_type=SearchType.RAG_COMPLETION,
    query_text="What are the key trends in e-commerce?",
)
# Returns LLM-generated answer
print(search_results[0])
5. GRAPH_COMPLETION Example
import cognee
from cognee import SearchType
# Sample data: Geographic information
data = """
Germany is located in Central Europe, bordered by nine countries. To the north lies Denmark,
to the east are Poland and Czech Republic, to the south are Austria and Switzerland, and to
the west are France, Luxembourg, Belgium, and the Netherlands. Berlin is the capital and
largest city. Germany is the most populous country in the European Union with over 83 million
inhabitants. It has the largest economy in Europe and is a founding member of the EU.
"""
# Add and process
await cognee.add(data, "geography_dataset")
await cognee.cognify(["geography_dataset"])
# Most sophisticated search with graph context
graph_completion = await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION,
    query_text="Which countries border Germany?",
    datasets=["geography_dataset"],
    save_interaction=True
)
print("Completion result:")
print(graph_completion)
6. CODE Example
import cognee
from cognee import SearchType
from cognee.api.v1.cognify.code_graph_pipeline import run_code_graph_pipeline
# Sample code repository structure
# Create a test Python file with sample code
sample_code = """
class UserAuthentication:
'''Handles user authentication and session management.'''
def __init__(self, database):
self.db = database
self.session_timeout = 3600
def login(self, username, password):
'''Authenticate user credentials.'''
user = self.db.find_user(username)
if user and self.verify_password(password, user.password_hash):
return self.create_session(user)
return None
def verify_password(self, password, hash):
'''Verify password against stored hash.'''
return bcrypt.checkpw(password.encode(), hash)
"""
# First, process code repository
async for run_status in run_code_graph_pipeline("/path/to/your/repo"):
print(f"{run_status.pipeline_run_id}: {run_status.status}")
# Search code
search_results = await cognee.search(
query_type=SearchType.CODE,
query_text="authentication functions",
)
# Display code results
for file in search_results:
print(file["name"])
7. FEELING_LUCKY Example
import cognee
from cognee import SearchType
# Sample data: Technology news
data = """
Apple announced its new Vision Pro mixed reality headset at WWDC 2023. The device features
dual 4K displays, spatial audio, and hand tracking. Priced at $3,499, it targets professional
users and early adopters. The Vision Pro runs visionOS, a new operating system designed for
spatial computing. Analysts predict it will create a new product category, though mainstream
adoption may take years due to the high price point.
"""
# Add and process
await cognee.add(data, "tech_news")
await cognee.cognify(["tech_news"])
# Let the system choose the best search type
search_results = await cognee.search(
    query_type=SearchType.FEELING_LUCKY,
    query_text="What is Apple's new product?",
)
# Returns results in format of auto-selected type
print(search_results)
8. GRAPH_SUMMARY_COMPLETION Example
import cognee
from cognee import SearchType
# Sample data: Medical information
data = """
Type 2 diabetes is a chronic condition affecting how the body processes blood sugar.
Risk factors include obesity, physical inactivity, family history, and age over 45.
Symptoms include increased thirst, frequent urination, fatigue, and blurred vision.
Treatment involves lifestyle changes, medication like metformin, and blood sugar monitoring.
Complications can include heart disease, kidney damage, and nerve problems. Prevention
focuses on maintaining healthy weight, regular exercise, and balanced diet.
"""
# Add and process
await cognee.add(data, "medical_info")
await cognee.cognify(["medical_info"])
# Graph completion with summarization
completion_sum = await cognee.search(
    query_type=SearchType.GRAPH_SUMMARY_COMPLETION,
    query_text="What are the risk factors for Type 2 diabetes?",
    save_interaction=True,
)
# Returns summarized LLM response
print(completion_sum[0])
9. GRAPH_COMPLETION_COT Example
import cognee
from cognee import SearchType
# Sample data: Complex problem
data = """
Climate change is caused by greenhouse gas emissions from human activities. The primary
source is burning fossil fuels for energy, transportation, and industry. Deforestation
reduces CO2 absorption capacity. Rising temperatures cause ice cap melting, sea level rise,
and extreme weather events. Solutions include renewable energy adoption, carbon capture
technology, reforestation, and international cooperation through agreements like the Paris
Climate Accord. Individual actions like reducing consumption and using public transport also help.
"""
# Add and process
await cognee.add(data, "climate_dataset")
await cognee.cognify(["climate_dataset"])
# Chain-of-thought reasoning
completion_cot = await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION_COT,
    query_text="How can we address climate change?",
    save_interaction=True,
)
# Returns refined answer after iterative reasoning
print(completion_cot[0])
10. GRAPH_COMPLETION_CONTEXT_EXTENSION Example
import cognee
from cognee import SearchType
# Sample data: Historical events
data = """
World War II began in 1939 when Germany invaded Poland. The war involved most of the world's
nations, divided into Allies and Axis powers. Major battles included Stalingrad, D-Day, and
Midway. The Holocaust resulted in the genocide of six million Jews. The war ended in 1945
with Germany's surrender in May and Japan's surrender in August after atomic bombs were
dropped on Hiroshima and Nagasaki. The war reshaped global politics, leading to the United
Nations and the Cold War.
"""
# Add and process
await cognee.add(data, "history_ww2")
await cognee.cognify(["history_ww2"])
# Context extension through multiple rounds
completion_ext = await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION_CONTEXT_EXTENSION,
    query_text="What were the major events of World War II?",
    save_interaction=True,
)
# Returns answer with expanded context
print(completion_ext[0])
11. FEEDBACK Example
import cognee
from cognee import SearchType
# Sample data: Product information
data = """
The iPhone 15 Pro features a titanium design, A17 Pro chip, and improved camera system.
The main camera is 48MP with advanced computational photography. Battery life is up to
29 hours video playback. It supports USB-C charging and has an Action button replacing
the mute switch. Available in four colors: Natural Titanium, Blue Titanium, White Titanium,
and Black Titanium. Storage options range from 128GB to 1TB.
"""
# Add and process
await cognee.add(data, "products")
await cognee.cognify(["products"])
# First, perform a search with save_interaction=True
await cognee.search(
    query_type=SearchType.GRAPH_COMPLETION,
    query_text="What are the features of iPhone 15 Pro?",
    save_interaction=True,
)
# Then provide feedback on the last interaction
await cognee.search(
    query_type=SearchType.FEEDBACK,
    query_text="This answer was very helpful and accurate",
    last_k=1,
)
12. TEMPORAL Example
import cognee
from cognee import SearchType
# Sample data: Timeline of events
data = """
In January 2023, ChatGPT reached 100 million users. In March 2023, GPT-4 was released.
In May 2023, Google announced Bard AI. In July 2023, Meta released Llama 2 as open source.
In September 2023, Amazon announced AI coding assistant CodeWhisperer. In November 2023,
OpenAI held its first DevDay conference. In December 2023, Google released Gemini, its
most capable AI model. These events marked 2023 as a breakthrough year for AI technology.
"""
# Add and process
await cognee.add(data, "ai_timeline")
await cognee.cognify(["ai_timeline"])
# Time-aware search
search_results = await cognee.search(
    query_type=SearchType.TEMPORAL,
    query_text="What AI events happened in 2023?",
)
# Returns time-filtered results
for result in search_results:
    print(result)
To find out more about Cognee, visit the Cognee documentation.