DEV Community

Armel BOBDA

Building a CLI Tool with Cognee: Lessons from 5 Epics

I just finished building Sentinel, a CLI tool that uses Cognee to detect energy conflicts in personal schedules. Five development epics. 860+ tests. Four critical bugs found and fixed.

Along the way, I learned a lot about working with Cognee that isn't in the documentation. This article shares those lessons so you can avoid my mistakes.


What I Built

Sentinel analyses schedule text and builds a knowledge graph to find "energy collisions"—situations where a draining activity (dinner with a difficult relative) precedes a demanding one (important presentation).

$ sentinel paste < schedule.txt
✓ Extracted 7 entities
Found 6 relationships.
✓ Graph saved to ~/.local/share/sentinel/graph.db

$ sentinel check
⚠️  COLLISION DETECTED                    Confidence: 85%

[Aunt Susan] --DRAINS--> (drained)
                            |
                     CONFLICTS_WITH
                            |
                      (focused) <--REQUIRES-- [Strategy Presentation]

HTML export with collision highlighting:


The tool uses Cognee for entity extraction and relationship building, then applies custom collision detection logic on top.

Here's what I learned.


Lesson 1: Use CYPHER, Not GRAPH_COMPLETION

This one cost me hours of debugging.

The mistake:

# DON'T DO THIS for graph extraction
results = await cognee.search(
    SearchType.GRAPH_COMPLETION,
    query_text="*"
)

My unit tests (with mocked Cognee) passed. Production extracted zero entities.

The problem: GRAPH_COMPLETION returns LLM-generated prose, not structured graph data:

"The schedule contains a dinner event with Aunt Susan on Sunday,
which is described as emotionally draining..."

Useful for chat interfaces. Useless for graph algorithms.

The fix:

from cognee.api.v1.search import SearchType

# Get nodes
node_results = await cognee.search(
    query_text="MATCH (n) RETURN n",
    query_type=SearchType.CYPHER,
)

# Get edges
edge_results = await cognee.search(
    query_text="MATCH (a)-[r]->(b) RETURN a, r, b",
    query_type=SearchType.CYPHER,
)

Takeaway: If you need structured graph data for programmatic use, always use SearchType.CYPHER with explicit Cypher queries.


Lesson 2: Cognee Results Are Deeply Nested

When you get Cypher results back, don't expect a flat list of nodes.

Actual structure:

results = [
    {
        'search_result': [
            [
                [node1_data],  # <-- Your actual node is here
                [node2_data],
                ...
            ]
        ]
    }
]

Access pattern:

def extract_nodes(results):
    if not results:
        return []

    nodes = []
    search_result = results[0].get('search_result', [])

    if search_result:
        node_list = search_result[0]  # First level unwrap
        for node_wrapper in node_list:
            if isinstance(node_wrapper, list) and node_wrapper:
                node = node_wrapper[0]  # Second level unwrap
                nodes.append(node)

    return nodes

Takeaway: Write robust extraction helpers and test them against real Cognee output, not mocks.
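To make that concrete, here is the helper pinned against a fixture mirroring the nested shape above (a minimal sketch; the fixture data is invented for illustration):

```python
def extract_nodes(results):
    """Unwrap the doubly nested node lists that Cypher search returns."""
    if not results:
        return []
    nodes = []
    search_result = results[0].get('search_result', [])
    if search_result:
        for node_wrapper in search_result[0]:
            if isinstance(node_wrapper, list) and node_wrapper:
                nodes.append(node_wrapper[0])
    return nodes

# Fixture mirroring the real nested shape, plus the empty edge cases
nested = [{'search_result': [[[{'id': 'n1', 'name': 'Aunt Susan'}],
                              [{'id': 'n2', 'name': 'Dinner'}]]]}]
assert [n['id'] for n in extract_nodes(nested)] == ['n1', 'n2']
assert extract_nodes([]) == []
assert extract_nodes([{'search_result': []}]) == []
```

If a future Cognee release changes the wrapping depth, a fixture test like this fails loudly instead of silently returning zero nodes.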


Lesson 3: Filter to Entity Nodes Only

Cognee's graph contains multiple node types. Not all of them are what you want.

| Node Type | What It Is | Keep? |
|---|---|---|
| Entity | Actual entities from your text | ✅ Yes |
| DocumentChunk | Text segments | ❌ No |
| EntityType | Category definitions | ❌ No |
| TextDocument | Source document metadata | ❌ No |
| TextSummary | LLM-generated summaries | ❌ No |

Filter pattern:

def extract_entities(nodes):
    return [
        node for node in nodes
        if node.get('type') == 'Entity'
    ]

Without this filter, your graph will be cluttered with infrastructure nodes that aren't useful for domain logic.


Lesson 4: Properties Are JSON Strings

Cognee returns node properties as JSON strings, not Python dicts:

# What you get
node = {
    'id': 'abc-123',
    'name': 'Aunt Susan',
    'type': 'Entity',
    'properties': '{"description": "Family member", "entity_type": "PERSON"}'
}

Parse them:

import json

def parse_properties(node):
    props = node.get('properties', '{}')
    if isinstance(props, str):
        try:
            return json.loads(props)
        except json.JSONDecodeError:
            return {}
    return props if isinstance(props, dict) else {}
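A quick behaviour check of the parser, exercising the three cases it handles (the node data is invented for illustration):

```python
import json

def parse_properties(node):
    """Return node properties as a dict, whether stored as JSON string or dict."""
    props = node.get('properties', '{}')
    if isinstance(props, str):
        try:
            return json.loads(props)
        except json.JSONDecodeError:
            return {}
    return props if isinstance(props, dict) else {}

node = {'name': 'Aunt Susan',
        'properties': '{"description": "Family member", "entity_type": "PERSON"}'}
assert parse_properties(node)['entity_type'] == 'PERSON'   # JSON string
assert parse_properties({'properties': 'not json'}) == {}  # malformed string
assert parse_properties({}) == {}                          # missing key
```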

Lesson 5: The LLM Will Generate Unexpected Relation Types

This was my biggest surprise. I expected Cognee to use consistent relation type names. Instead:

What I expected:

DRAINS, REQUIRES, INVOLVES, SCHEDULED_AT

What I got (sampling from multiple runs):

drains, depletes, exhausts, causes_fatigue,
emotionally_draining, negatively_impacts,
is_emotionally_draining, energy_draining,
leads_to_exhaustion, causes_exhaustion...

Eleven variations for one concept. Per run.

Why this happens: Cognee's LLM extraction has no ontology constraints. The model generates semantically correct but lexically variable relation names.

The fix: Build a normalisation layer. I wrote a 3-tier matching system:

# Tier 1: Exact match dictionary (85+ entries)
RELATION_MAP = {
    "drains": "DRAINS",
    "depletes": "DRAINS",
    "exhausts": "DRAINS",
    # ...
}

# Tier 2: Keyword matching (stems)
KEYWORDS = {
    "DRAINS": ["drain", "exhaust", "deplet", "fatigue"],
    # ...
}

# Tier 3: Fuzzy matching (RapidFuzz)
from rapidfuzz import fuzz, process
# Match against candidate phrases

Takeaway: Don't assume LLM output will be consistent. Build robust normalisation for any categorical data coming from Cognee.
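Stitched together, the three tiers become one fallthrough function. A sketch with RELATION_MAP, KEYWORDS, and the candidate phrases abbreviated; tier 3 uses the stdlib's difflib here so the example is dependency-free, where Sentinel uses RapidFuzz:

```python
import difflib

RELATION_MAP = {"drains": "DRAINS", "depletes": "DRAINS", "exhausts": "DRAINS"}
KEYWORDS = {"DRAINS": ["drain", "exhaust", "deplet", "fatigue"]}
FUZZY_CANDIDATES = {"emotionally draining": "DRAINS", "causes exhaustion": "DRAINS"}

def normalize_relation(raw, cutoff=0.6):
    key = raw.strip().lower().replace("-", "_")
    # Tier 1: exact dictionary lookup
    if key in RELATION_MAP:
        return RELATION_MAP[key]
    # Tier 2: keyword/stem containment
    for canonical, stems in KEYWORDS.items():
        if any(stem in key for stem in stems):
            return canonical
    # Tier 3: fuzzy match against known phrases (RapidFuzz in the real tool)
    close = difflib.get_close_matches(key.replace("_", " "),
                                      FUZZY_CANDIDATES, n=1, cutoff=cutoff)
    if close:
        return FUZZY_CANDIDATES[close[0]]
    return "RELATED_TO"  # safe fallback rather than dropping the edge

assert normalize_relation("depletes") == "DRAINS"             # tier 1
assert normalize_relation("leads_to_exhaustion") == "DRAINS"  # tier 2
assert normalize_relation("mentions") == "RELATED_TO"         # fallback
```

To match the article's actual dependency, swap tier 3 for rapidfuzz's process.extractOne with a WRatio scorer.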

I wrote a full deep-dive on this pattern: Taming LLM Output Chaos: A 3-Tier Normalisation Pattern


Lesson 6: Custom Prompts Change Everything

Cognee's cognify() function accepts a custom_prompt parameter. This was the key to getting domain-specific relationships.

Default behavior:

  • Generic entity extraction
  • Relations like involves, about, scheduled_at
  • No energy-domain relationships (DRAINS, REQUIRES)

With custom prompt:

EXTRACTION_PROMPT = """
You are extracting a PERSONAL ENERGY knowledge graph.

**REQUIRED RELATIONSHIP TYPES** (use ONLY these):
- DRAINS: Activity depletes energy/focus
- REQUIRES: Activity needs energy/focus
- CONFLICTS_WITH: Energy state conflicts with requirement
- SCHEDULED_AT: Activity occurs at time
- INVOLVES: Activity includes person/thing

**COLLISION PATTERN** (create when applicable):
[draining_activity] --DRAINS--> (energy_state) --CONFLICTS_WITH-->
[requiring_activity] --REQUIRES--> (resource)

**EXAMPLE**:
Input: "Sunday: Draining dinner. Monday: Important presentation."
Graph:
- [dinner] --DRAINS--> (emotional_energy)
- (emotional_energy) --CONFLICTS_WITH--> [presentation]
- [presentation] --REQUIRES--> (sharp_focus)
"""

await cognee.cognify(custom_prompt=EXTRACTION_PROMPT)

Results:

  • Before custom prompt: ~20% collision detection rate
  • After custom prompt: ~70% edge type accuracy (still needed normalisation)
  • After prompt + normalisation: 100% collision detection

Takeaway: Don't fight Cognee's defaults. Guide them with domain-specific prompts that include examples and explicit relationship ontologies.


Lesson 7: Node IDs Vary Too (Semantic Consolidation)

Even with good prompts and relation normalisation, I had one more problem:

Run 1: [dinner] --DRAINS--> (emotional_exhaustion)
Run 2: [dinner] --DRAINS--> (low_energy)
Run 3: [dinner] --DRAINS--> (drained_state)

Same concept, different node labels. My BFS collision detection couldn't find paths because it was doing exact string matching on node IDs.

The fix: Semantic node consolidation using RapidFuzz:

from rapidfuzz import fuzz

def group_similar_nodes(nodes, threshold=70):
    groups = []
    for node in nodes:
        merged = False
        for group in groups:
            if fuzz.WRatio(node.label, group[0].label) >= threshold:
                group.append(node)
                merged = True
                break
        if not merged:
            groups.append([node])
    return groups

def consolidate(graph):
    groups = group_similar_nodes(graph.nodes)
    # Pick canonical representative, rewrite edge references
    # ...

Takeaway: LLM variability affects both relation types AND node identity. Handle both.
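For completeness, here is one way the elided consolidation step can look. A sketch under the assumption that nodes are dicts with an 'id' key, edges are (source_id, relation, target_id) tuples, and grouping has already been done by group_similar_nodes:

```python
def consolidate(nodes, edges, groups):
    """Merge each group into its first member and rewrite edges to match."""
    canonical = {}  # original node id -> canonical node id
    for group in groups:
        rep = group[0]  # simplest canonical choice; could prefer shortest label
        for node in group:
            canonical[node['id']] = rep['id']
    kept = [n for n in nodes if canonical[n['id']] == n['id']]
    merged = {(canonical[s], rel, canonical[t]) for s, rel, t in edges}
    merged = {(s, rel, t) for s, rel, t in merged if s != t}  # drop self-loops
    return kept, sorted(merged)

nodes = [{'id': 'a', 'label': 'emotional_exhaustion'},
         {'id': 'b', 'label': 'low_energy'},
         {'id': 'c', 'label': 'dinner'}]
groups = [[nodes[0], nodes[1]], [nodes[2]]]   # a and b judged similar
edges = [('c', 'DRAINS', 'a'), ('c', 'DRAINS', 'b')]

kept, merged = consolidate(nodes, edges, groups)
assert [n['id'] for n in kept] == ['a', 'c']
assert merged == [('c', 'DRAINS', 'a')]       # duplicate edges collapse
```

Deduplicating merged edges as a set matters: once two energy-state nodes collapse into one, their parallel edges would otherwise double-count in BFS.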


Lesson 8: Mocked Tests Will Lie to You

I had 178 tests passing. All green. Two critical bugs in production.

Bug 1: SearchType.GRAPH_COMPLETION returned prose instead of graph data. My mock returned what I expected Cognee to return, not what it actually returns.

Bug 2: Rich console interpreted [node labels] as style markup. My tests didn't render through the actual Rich console.

The fix: Live API tests.

@pytest.mark.live
async def test_real_entity_extraction():
    """Verify actual Cognee behavior."""
    engine = CogneeEngine()
    graph = await engine.ingest("Dinner with Aunt Susan on Sunday")

    assert len(graph.nodes) > 0, "No entities extracted"
    labels = {n.label.lower() for n in graph.nodes}
    assert any("susan" in l for l in labels)

Run them manually before marking stories "done":

# Requires API key
uv run pytest tests/live/ -m live -v

# Skip in CI
uv run pytest -m "not live"

Takeaway: For LLM integrations, unit tests with mocks are necessary but not sufficient. Add live API tests for critical paths.
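To make the skip automatic rather than relying on remembering -m "not live", a conftest.py hook can gate the marker on the presence of a key (a sketch; LLM_API_KEY is an assumed variable name, adjust to your provider):

```python
# conftest.py
import os

import pytest

def live_tests_enabled(env=None):
    """Live tests run only when an API key is configured."""
    env = os.environ if env is None else env
    return bool(env.get("LLM_API_KEY"))

def pytest_collection_modifyitems(config, items):
    """Auto-skip 'live'-marked tests when no key is present."""
    if live_tests_enabled():
        return
    skip_live = pytest.mark.skip(reason="LLM_API_KEY not set; skipping live tests")
    for item in items:
        if "live" in item.keywords:
            item.add_marker(skip_live)
```

With this in place, CI without a key skips (rather than fails) the live suite, and local runs with the key exported exercise the real API.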


Lesson 9: Suppress Cognee's Logging (But Keep a Debug Mode)

Cognee produces verbose output during normal operation. Great for debugging, annoying for users.

Solution: Lazy import with suppression:

from contextlib import redirect_stdout, redirect_stderr
from io import StringIO

def get_engine():
    with redirect_stdout(StringIO()), redirect_stderr(StringIO()):
        import warnings
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")
            from sentinel.core.engine import CogneeEngine
    return CogneeEngine()

But keep a debug flag:

import click

@click.command()
@click.option('--debug', '-d', is_flag=True)
def main(debug):
    if debug:
        engine = CogneeEngine()  # Normal import, verbose
    else:
        engine = get_engine()  # Suppressed

The Journey: 5 Epics in Numbers

| Metric | Value |
|---|---|
| Development epics | 5 |
| Stories completed | 37 |
| Tests written | 860+ |
| Critical bugs found | 4 |
| Relation type mappings | 85+ |
| Collision detection rate | 15% → 100% |

The architecture that worked:

User Input
    ↓
Cognee Extraction (custom prompt)
    ↓
3-Tier Relation Mapping (exact → keyword → fuzzy)
    ↓
Semantic Node Consolidation (RapidFuzz grouping)
    ↓
BFS Collision Detection
    ↓
Rich Terminal Output

Each layer handles a different source of LLM variability.


Key Takeaways for Cognee Users

  1. Use SearchType.CYPHER for structured graph data, not GRAPH_COMPLETION.

  2. Expect nested results. Write robust extraction helpers.

  3. Filter to Entity nodes. Cognee returns infrastructure nodes too.

  4. Parse JSON properties. They come as strings.

  5. Normalise relation types. The LLM will surprise you.

  6. Use custom prompts. Domain-specific ontologies need explicit guidance.

  7. Consolidate semantically equivalent nodes. IDs vary like relation types.

  8. Add live API tests. Mocks don't catch integration bugs.

  9. Suppress verbose logging. But keep a debug mode.


Built for the Cognee Mini Challenge 2026 - January Edition. Happy building!
