# Stop Copy-Pasting Files Into ChatGPT
Every developer using AI coding tools hits the same wall:
- Which files do I paste into Claude?
- Token limit exceeded... again
- Bad answer because I included the wrong context
- 10 minutes wasted before even asking the question
I got frustrated with this daily and built ctxeng — a Python library that solves it automatically.
## What Is Context Engineering?
Context engineering is the practice of carefully constructing what goes into an LLM's context window. It's more impactful than prompt engineering because:
> The quality of your LLM's output depends almost entirely on what you put in the context window — not how you phrase the question.
The problem is that doing it manually is painful. ctxeng automates it.
## How ctxeng Works
```python
from ctxeng import ContextEngine

engine = ContextEngine(root=".", model="claude-sonnet-4")
ctx = engine.build("Fix the authentication bug in the login flow")

print(ctx.summary())
# Context summary (12,340 tokens / 197,440 budget):
#   Included : 8 files
#   Skipped  : 23 files (over budget)
#   [████████  ] 0.84  src/auth/login.py
#   [███████   ] 0.71  src/auth/middleware.py
#   [█████     ] 0.53  src/models/user.py
#   [████      ] 0.41  tests/test_auth.py

print(ctx.to_string())  # paste this directly into Claude
```
It scans your entire codebase, scores every file for relevance, and fits the best ones into your model's token window.
## The Scoring Algorithm
Every file gets a relevance score from 0 → 1 using four signals:
### 1. Keyword Overlap
How many query terms appear in the file content. Simple but surprisingly effective.
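As a rough sketch (ctxeng's internal implementation may differ, and `keyword_overlap` is a hypothetical name, not part of its API), this signal can be computed as the fraction of query terms found in the file:

```python
import re

def keyword_overlap(query: str, file_text: str) -> float:
    """Fraction of query terms that also appear in the file (0 -> 1)."""
    terms = set(re.findall(r"\w+", query.lower()))
    words = set(re.findall(r"\w+", file_text.lower()))
    if not terms:
        return 0.0
    return len(terms & words) / len(terms)

# 3 of the 4 query terms appear in the file text.
keyword_overlap("fix the authentication bug",
                "authentication bug in the login flow")  # → 0.75
```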
### 2. Python AST Analysis
ctxeng actually parses your Python files and extracts class names, function names, and import names — then matches them against your query. So if you ask about "authentication", files containing `class AuthService` or `def authenticate()` score higher.
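A minimal version of this idea, using only the standard-library `ast` module (`python_symbols` is an illustrative helper, not ctxeng's API):

```python
import ast

def python_symbols(source: str) -> set[str]:
    """Collect class names, function names, and imported names from Python source."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            names.add(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            names.update(alias.name for alias in node.names)
    return names

code = "import jwt\n\nclass AuthService:\n    def authenticate(self):\n        pass\n"
print(python_symbols(code))  # {'jwt', 'AuthService', 'authenticate'} (order varies)
```

These symbols can then be matched against query tokens the same way as plain keywords, but with far less noise than scanning raw file text.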
### 3. Git Recency
Files touched in recent commits score higher. The logic: if you're asking about a bug, the files you recently changed are probably related.
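One way this signal could be derived (again, a sketch rather than ctxeng's actual code) is by ranking the file lines in `git log --name-only --pretty=format:` output, which lists the newest commits first:

```python
def recency_scores(log_output: str) -> dict[str, float]:
    """Score files from `git log --name-only --pretty=format:` output.

    Newest commits are listed first, so earlier lines score closer to 1.0.
    """
    files = [line for line in log_output.splitlines() if line.strip()]
    scores: dict[str, float] = {}
    for rank, path in enumerate(files):
        # A file may appear in several commits; keep its best (most recent) rank.
        scores.setdefault(path, 1.0 - rank / max(len(files), 1))
    return scores

# Simulated log output: login.py was touched in the newest and oldest commits.
log = "src/auth/login.py\n\nsrc/models/user.py\nsrc/auth/login.py\n"
print(recency_scores(log))
```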
### 4. File Path Matching
Filename and directory names matched against query tokens. src/auth/login.py scores higher for auth-related queries.
## Fluent Builder API
For more control, use the chainable builder:
```python
from ctxeng import ContextBuilder

ctx = (
    ContextBuilder(root=".")
    .for_model("gpt-4o")
    .only("**/*.py")
    .exclude("tests/**", "migrations/**")
    .from_git_diff()  # only changed files
    .with_system("You are a senior Python engineer.")
    .build("Refactor the payment module to use async/await")
)
print(ctx.to_string("markdown"))
```
## CLI
```bash
# Build context for a query
ctxeng build "Fix the auth bug"

# Only git-changed files
ctxeng build "Review my changes" --git-diff

# Save to file
ctxeng build "Explain the payment flow" --output context.md

# Project stats
ctxeng info
```
## Smart Token Truncation
When a file doesn't fully fit the budget, ctxeng doesn't just cut it off. It keeps:
- The head (imports, class definitions, docstrings)
- The tail (most recent changes)
- Drops the middle
Both ends are high-signal. The middle is usually implementation details.
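The heuristic above can be sketched in a few lines. This version counts lines rather than tokens to stay self-contained (a real implementation budgets tokens), and `truncate_middle` is an illustrative name, not ctxeng's API:

```python
def truncate_middle(text: str, max_lines: int, head_frac: float = 0.6) -> str:
    """Keep a file's head and tail, dropping the middle when over budget."""
    lines = text.splitlines()
    if len(lines) <= max_lines:
        return text  # fits as-is
    head = max(1, int(max_lines * head_frac))   # imports, classes, docstrings
    tail = max(1, max_lines - head)             # most recent changes
    return "\n".join(lines[:head] + ["# ... middle truncated ..."] + lines[-tail:])

sample = "\n".join(f"line {i}" for i in range(1, 101))
print(truncate_middle(sample, max_lines=6))  # lines 1-3, a marker, lines 98-100
```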
## One-Line LLM Integration
```python
from ctxeng import ContextEngine
from ctxeng.integrations import ask_claude

engine = ContextEngine(".", model="claude-sonnet-4")
ctx = engine.build("Why is the test_login test failing?")

response = ask_claude(ctx)
print(response)
```
## Works With Every Model
| Model | Window | Auto-detected |
|---|---|---|
| claude-sonnet-4 | 200K | ✓ |
| gpt-4o | 128K | ✓ |
| gemini-1.5-pro | 1M | ✓ |
| Local models | custom | set manually |
For local models, just set the budget manually:
```python
ctx = (
    ContextBuilder(".")
    .with_budget(total=32_768)
    .build("Refactor this module")
)
```
## Installation
```bash
pip install ctxeng

# With accurate token counting
pip install "ctxeng[tiktoken]"

# With Claude integration
pip install "ctxeng[anthropic]"

# Everything
pip install "ctxeng[all]"
```
Zero required dependencies by default. MIT license.
## What's Next
- Semantic similarity scoring (embedding model)
- `ctxeng watch` — auto-rebuild on file changes
- VSCode extension
- Import graph analysis
## Try It
⭐ GitHub: https://github.com/Sayeem3051/python-context-engineer
Would love feedback — especially on the scoring algorithm. What signals would make this more useful for your workflow?