You open a seemingly simple file in your codebase:
// src/api/user-profile.ts (52 lines)
import { validateUser } from './validators';
import { formatResponse } from './formatters';
import { logRequest } from './logger';
export async function getUserProfile(userId: string) {
validateUser(userId);
const user = await fetchUser(userId);
logRequest('getUserProfile', userId);
return formatResponse(user);
}
Looks clean, right? Just 52 lines, three imports, straightforward logic. But when your AI assistant tries to understand this file, here's what actually gets loaded into its context window:
src/api/user-profile.ts 52 lines 1,245 tokens
└─ validators.ts 89 lines 2,134 tokens
└─ validation-rules.ts 156 lines 3,721 tokens
└─ error-types.ts 41 lines 982 tokens
└─ formatters.ts 103 lines 2,456 tokens
└─ format-utils.ts 78 lines 1,867 tokens
└─ logger.ts 67 lines 1,603 tokens
└─ log-transport.ts 124 lines 2,967 tokens
└─ log-formatter.ts 91 lines 2,178 tokens
Total: 801 lines, 19,153 tokens
Your 52-line file just became a 19,153-token context load. That's 366x more expensive than it appears. And your AI assistant has to load all of this to understand your simple function.
This is the hidden cost of import chains—and it's one of the biggest reasons AI struggles with your codebase.
The Context Window Crisis
Every import creates a cascading context cost:
- Direct dependencies: Files you import
- Transitive dependencies: Files your imports import
- Type dependencies: Interfaces and types needed for understanding
- Implementation depth: How deep the chain goes
Modern AI models have context windows of 128K-1M tokens. Sounds like a lot, right? But in a real codebase:
- Average file: 200-300 lines = 4,800-7,200 tokens
- With direct imports: 800-1,200 lines = 19,200-28,800 tokens
- With deep chains: 2,000+ lines = 48,000+ tokens
- Multiple related files: Context exhaustion
Suddenly that 128K context window doesn't feel so spacious. Add a few related files to analyze a feature, and your AI is already hitting limits—or worse, truncating critical context.
Real-World Impact: The receiptclaimer Analysis
When I ran @aiready/context-analyzer on receiptclaimer's codebase, I discovered patterns that shocked me:
Before Refactoring:
Average context budget per file: 12,450 tokens
Maximum depth: 7 levels
Fragmented domains: 4 (User, Receipt, Auth, Payment)
Low cohesion files: 23 (43% of analyzed files)
Top offenders:
- src/api/receipt-processor.ts: 47,821 tokens (cascade depth: 7)
- src/services/user-service.ts: 38,945 tokens (cascade depth: 6)
- src/api/payment-handler.ts: 35,102 tokens (cascade depth: 6)
After Refactoring:
Average context budget per file: 4,780 tokens (-62%)
Maximum depth: 4 levels
Fragmented domains: 2 (consolidated User+Auth, Receipt+Payment)
Low cohesion files: 5 (9% of analyzed files)
Top files (now optimized):
- src/api/receipt-processor.ts: 8,234 tokens (depth: 3)
- src/services/user-service.ts: 6,891 tokens (depth: 3)
- src/api/payment-handler.ts: 7,445 tokens (depth: 4)
Impact on AI Performance:
- Response time: Avg 8.2s → 3.1s (62% faster)
- Context truncation errors: 34 → 2 (94% reduction)
- Suggestions quality: Subjectively much better, AI now references correct patterns
- Developer satisfaction: "AI finally gets what I'm trying to do"
The Four Dimensions of Context Cost
@aiready/context-analyzer measures four key metrics:
1. Import Depth (Cascade Levels)
How many layers deep your dependencies go:
// Depth 0: No imports
export function add(a: number, b: number) {
return a + b;
}
// Depth 1: Direct imports only
import { add } from './math';
export function calculate(x: number) {
return add(x, 10);
}
// Depth 3+: Deep chain (EXPENSIVE)
import { processUser } from './user-processor'; // imports 5 files
// └─ which imports './validators' // imports 3 files
// └─ which imports './validation-rules' // imports 2 files
Rule of thumb:
- Depth 0-2: ✅ Excellent (< 5,000 tokens)
- Depth 3-4: ⚠️ Acceptable (5,000-15,000 tokens)
- Depth 5+: ❌ Expensive (15,000+ tokens)
2. Context Budget (Total Tokens)
The total number of tokens AI needs to understand your file:
// Small budget (< 3,000 tokens)
// File: 120 lines, 1 import, shallow dependency
import { API_URL } from './config';
export function fetchUser(id: string) {
return fetch(`${API_URL}/users/${id}`);
}
// Large budget (> 20,000 tokens)
// File: 200 lines, 8 imports, deep dependencies
import { validateInput } from './validators'; // +4,500 tokens
import { transformData } from './transformers'; // +6,200 tokens
import { enrichUser } from './enrichment'; // +8,100 tokens
import { formatResponse } from './formatters'; // +3,800 tokens
// ... more imports ...
Target zones:
- < 5,000 tokens: ✅ AI-friendly
- 5,000-15,000 tokens: ⚠️ Monitor
- 15,000+ tokens: ❌ Refactor needed
3. Domain Fragmentation
How scattered your related logic is across files:
// FRAGMENTED (user logic in 8 files)
src/api/user-login.ts // Authentication
src/api/user-profile.ts // Profile management
src/services/user-validator.ts // Validation
src/utils/user-formatter.ts // Formatting
src/models/user-types.ts // Types
src/db/user-repository.ts // Data access
src/middleware/user-auth.ts // Auth middleware
src/helpers/user-utils.ts // Utilities
// CONSOLIDATED (user logic in 3 files)
src/domain/user/
├─ user.service.ts // Core business logic
├─ user.repository.ts // Data access
└─ user.types.ts // Types and interfaces
Why fragmentation matters:
When AI tries to understand user-related features, it must:
- Load 8 separate files (fragmented) vs 3 files (consolidated)
- Parse 3,200+ lines vs 800 lines
- Navigate 24+ imports vs 6 imports
- Build mental model across scattered context vs cohesive modules
4. Cohesion Score
How well a file focuses on one responsibility:
// LOW COHESION (mixed concerns)
// user-service.ts
export class UserService {
validateEmail() {
/* validation logic */
}
sendEmail() {
/* email sending logic */
}
formatUserName() {
/* formatting logic */
}
logUserAction() {
/* logging logic */
}
encryptPassword() {
/* crypto logic */
}
renderUserProfile() {
/* rendering logic */
}
}
// HIGH COHESION (single responsibility)
// user-service.ts
export class UserService {
createUser() {
/* user creation */
}
updateUser() {
/* user updates */
}
deleteUser() {
/* user deletion */
}
getUserById() {
/* user retrieval */
}
}
Cohesion calculation:
The tool analyzes:
- Method names and their similarity
- Import types (business logic vs utilities vs external)
- File path and naming conventions
- Return types and parameter types
Scores:
- 80-100%: ✅ Highly cohesive (focused responsibility)
- 60-79%: ⚠️ Moderate cohesion (some mixing)
- < 60%: ❌ Low cohesion (refactor into separate modules)
Technical Deep Dive: How Context-Analyzer Works
Step 1: Build Dependency Graph
// Pseudo-code of the analysis
function analyzeDependencies(entryFile: string) {
const graph = new DependencyGraph();
function traverse(file: string, depth: number = 0) {
const ast = parseFile(file);
const imports = extractImports(ast);
for (const imp of imports) {
const resolvedPath = resolveImport(imp, file);
graph.addEdge(file, resolvedPath, depth + 1);
if (depth < MAX_DEPTH) {
traverse(resolvedPath, depth + 1);
}
}
}
traverse(entryFile);
return graph;
}
Step 2: Calculate Token Costs
function calculateContextBudget(file: string, graph: DependencyGraph) {
let totalTokens = 0;
const visited = new Set();
function countTokens(currentFile: string) {
if (visited.has(currentFile)) return;
visited.add(currentFile);
const content = readFile(currentFile);
const tokens = estimateTokens(content); // ~24 tokens per 100 chars
totalTokens += tokens;
// Recursively count dependencies
for (const dep of graph.getDependencies(currentFile)) {
countTokens(dep);
}
}
countTokens(file);
return totalTokens;
}
Step 3: Detect Fragmentation
function detectFragmentation(files: string[]) {
const domains = new Map();
for (const file of files) {
const domain = extractDomain(file); // e.g., "user", "receipt"
if (!domains.has(domain)) {
domains.set(domain, []);
}
domains.get(domain).push(file);
}
// Flag domains split across many files
return [...domains.entries()]
.filter(([_, files]) => files.length > 5)
.map(([domain, files]) => ({
domain,
fileCount: files.length,
fragmentationScore: calculateFragmentation(files),
}));
}
Step 4: Measure Cohesion
function analyzeCohesion(file: string) {
const ast = parseFile(file);
const exports = extractExports(ast);
const imports = extractImports(ast);
// Analyze semantic similarity of exports
const similarities = [];
for (let i = 0; i < exports.length - 1; i++) {
for (let j = i + 1; j < exports.length; j++) {
const sim = calculateSimilarity(exports[i], exports[j]);
similarities.push(sim);
}
}
// High average similarity = high cohesion
const avgSimilarity =
similarities.reduce((a, b) => a + b, 0) / similarities.length;
// Penalty for mixed import types
const importTypes = categorizeImports(imports);
const mixedPenalty = Object.keys(importTypes).length > 3 ? 0.2 : 0;
return Math.max(0, avgSimilarity - mixedPenalty);
}
Example: Refactoring receiptclaimer's Receipt Processing
Let me show you a real refactoring that reduced context budget by 82%.
Before: Deep Import Chain (47,821 tokens)
// src/api/receipt-processor.ts
import { validateReceipt } from '../validators/receipt-validator';
import { parseReceiptImage } from '../services/ocr-service';
import { extractLineItems } from '../parsers/line-item-parser';
import { calculateTotals } from '../calculators/total-calculator';
import { enrichMerchantData } from '../enrichment/merchant-enricher';
import { formatReceiptResponse } from '../formatters/receipt-formatter';
import { logProcessing } from '../logging/process-logger';
import { notifyUser } from '../notifications/user-notifier';
export async function processReceipt(imageUrl: string, userId: string) {
logProcessing('start', userId);
const validation = validateReceipt(imageUrl);
if (!validation.valid) throw new Error('Invalid receipt');
const ocrResult = await parseReceiptImage(imageUrl);
const lineItems = extractLineItems(ocrResult);
const totals = calculateTotals(lineItems);
const enriched = await enrichMerchantData(ocrResult.merchant, lineItems);
await notifyUser(userId, 'Receipt processed');
return formatReceiptResponse({ lineItems, totals, merchant: enriched });
}
Dependency tree:
receipt-processor.ts (180 lines, 4,302 tokens)
├─ receipt-validator.ts (94 lines, 2,247 tokens)
│ ├─ validation-rules.ts (156 lines, 3,721 tokens)
│ └─ error-types.ts (41 lines, 982 tokens)
├─ ocr-service.ts (203 lines, 4,847 tokens)
│ ├─ image-preprocessor.ts (145 lines, 3,461 tokens)
│ ├─ ocr-client.ts (89 lines, 2,125 tokens)
│ └─ text-extractor.ts (178 lines, 4,249 tokens)
├─ line-item-parser.ts (167 lines, 3,987 tokens)
├─ total-calculator.ts (78 lines, 1,862 tokens)
├─ merchant-enricher.ts (134 lines, 3,201 tokens)
│ └─ merchant-api-client.ts (98 lines, 2,340 tokens)
├─ receipt-formatter.ts (103 lines, 2,458 tokens)
├─ process-logger.ts (67 lines, 1,601 tokens)
│ └─ log-transport.ts (124 lines, 2,967 tokens)
└─ user-notifier.ts (89 lines, 2,125 tokens)
└─ notification-service.ts (156 lines, 3,724 tokens)
Total: 1,902 lines, 47,821 tokens
Depth: 7 levels
After: Consolidated Module (8,234 tokens)
// src/domain/receipt/receipt.service.ts
import { ReceiptRepository } from './receipt.repository';
import { OCRProvider } from './ocr.provider';
import { ReceiptTypes } from './receipt.types';
export class ReceiptService {
constructor(
private repository: ReceiptRepository,
private ocrProvider: OCRProvider
) {}
async processReceipt(
imageUrl: string,
userId: string
): Promise<ReceiptTypes.ProcessedReceipt> {
// Validation (inline, simple)
if (!this.isValidImageUrl(imageUrl)) {
throw new ReceiptTypes.ValidationError('Invalid image URL');
}
// OCR processing (delegated to focused provider)
const ocrResult = await this.ocrProvider.parseImage(imageUrl);
// Business logic (co-located)
const receipt = this.buildReceipt(ocrResult);
const lineItems = this.parseLineItems(ocrResult.text);
const totals = this.calculateTotals(lineItems);
// Enrichment (co-located)
const merchant = await this.enrichMerchant(ocrResult.merchantName);
// Persistence
const saved = await this.repository.save({
...receipt,
lineItems,
totals,
merchant,
userId,
});
return saved;
}
private isValidImageUrl(url: string): boolean {
return url.startsWith('http') && /\.(jpg|jpeg|png|pdf)$/i.test(url);
}
private parseLineItems(text: string): ReceiptTypes.LineItem[] {
// Inline parsing logic (previously in separate file)
// ~30 lines of focused parsing
}
private calculateTotals(items: ReceiptTypes.LineItem[]): ReceiptTypes.Totals {
// Inline calculation (previously in separate file)
// ~15 lines of calculation
}
private async enrichMerchant(name: string): Promise<ReceiptTypes.Merchant> {
// Inline enrichment (previously in separate file + client)
// ~20 lines of enrichment logic
}
private buildReceipt(ocrResult: OCRResult): Partial<ReceiptTypes.Receipt> {
// Mapping logic
}
}
New dependency tree:
receipt.service.ts (245 lines, 5,856 tokens)
├─ receipt.repository.ts (87 lines, 2,078 tokens)
├─ ocr.provider.ts (45 lines, 1,072 tokens) [thin wrapper]
└─ receipt.types.ts (38 lines, 908 tokens)
Total: 415 lines, 8,234 tokens
Depth: 3 levels
Reduction: 47,821 → 8,234 tokens (82.8% decrease)
What Changed?
1. Consolidated scattered logic:
- 8 separate files → 1 service file
- Related functions co-located
- Clear domain boundary
2. Inlined simple utilities:
-
validateReceipt: 94 lines → 3 lines (simple inline check) -
calculateTotals: 78 lines → 15 lines (removed abstraction overhead) -
parseLineItems: 167 lines → 30 lines (removed generic parsers)
3. Removed unnecessary abstractions:
- Separate formatter → methods on service
- Separate logger → focused logging where needed
- Notification → moved to message queue trigger
4. Created thin wrappers:
- OCR client: Fat client (203 lines) → thin provider (45 lines)
- Repository: Focused data access only
Migration Strategy: How to Refactor Without Breaking Everything
Refactoring deep import chains is scary. Here's how to do it safely:
Step 1: Measure Current State
# Generate baseline report
npx @aiready/context-analyzer ./src --output baseline.json
# Identify top offenders
npx @aiready/context-analyzer ./src --sort-by budget --limit 10
Step 2: Prioritize Refactoring
Focus on:
- High-traffic files: API handlers, services, core business logic
- High-budget files: > 15,000 tokens
- Deep chains: Depth > 5
- Low cohesion: Score < 60%
Step 3: Create Domain Boundaries
Before (scattered):
src/
├─ api/
├─ services/
├─ utils/
├─ formatters/
├─ validators/
└─ helpers/
After (domain-driven):
src/
├─ domain/
│ ├─ user/
│ │ ├─ user.service.ts
│ │ ├─ user.repository.ts
│ │ └─ user.types.ts
│ ├─ receipt/
│ └─ payment/
└─ infrastructure/
├─ api/
└─ database/
Step 4: Refactor Incrementally
Week 1: Consolidate one domain (e.g., User)
Week 2: Consolidate another domain (e.g., Receipt)
Week 3: Update imports across codebase
Week 4: Remove old files, update tests
Step 5: Verify Improvements
# Generate new report
npx @aiready/context-analyzer ./src --output after.json
# Compare
npx @aiready/cli compare baseline.json after.json
Best Practices
✅ Do:
- Co-locate related logic: Keep domain logic together
- Inline simple utilities: < 20 lines, used in one place
- Use dependency injection: Makes testing easier, reduces coupling
- Create thin adapters: For external services, databases
- Measure regularly: Track context budget over time
❌ Don't:
- Over-abstract: Not everything needs a separate file
- Create deep hierarchies: Flat is better than nested
- Split prematurely: Extract only when reused 3+ times
- Ignore cohesion: Low cohesion = mixed concerns = high context cost
- Refactor blindly: Understand dependencies before moving code
Integration with CI/CD
GitHub Actions: Context Budget Check
name: Context Budget Check
on: [pull_request]
jobs:
context-analysis:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- uses: actions/setup-node@v3
- name: Analyze context budget
run: npx @aiready/context-analyzer ./src --threshold 15000
- name: Check for regressions
run: |
npx @aiready/context-analyzer ./src --output current.json
npx @aiready/cli compare baseline.json current.json --fail-on-regression
Pre-commit Hook: Prevent Deep Chains
#!/bin/sh
# .git/hooks/pre-commit
echo "Checking import depth..."
npx @aiready/context-analyzer ./src --max-depth 4 --quiet
if [ $? -ne 0 ]; then
echo "❌ Import chains too deep. Refactor before committing."
exit 1
fi
The Bottom Line
Import chains are invisible expensive. Every import adds context cost that:
- Slows down AI responses
- Increases token usage (costs money on paid APIs)
- Causes context truncation errors
- Makes AI suggestions less accurate
But unlike many optimization problems, this one has clear metrics and actionable fixes:
- Measure: Run context-analyzer to see your current state
- Prioritize: Focus on high-budget, deep-chain, low-cohesion files
- Refactor: Consolidate domains, inline utilities, remove unnecessary abstractions
- Verify: Measure again, track improvements over time
Try It Yourself
# Analyze your codebase
npx @aiready/context-analyzer ./src
# Check specific file
npx @aiready/context-analyzer ./src/api/handler.ts --detailed
# Find files over budget
npx @aiready/context-analyzer ./src --threshold 15000
# Export report
npx @aiready/context-analyzer ./src --output report.json
# Unified CLI with all metrics
npx @aiready/cli scan --score
Before you refactor:
- Measure your current context budget
- Identify top offenders (top 10 files by token cost)
- Pick one domain to consolidate
After you refactor:
- Measure again
- Calculate percentage improvement
- Share your results!
Resources:
- GitHub: github.com/getaiready/aiready-cli
- Docs: getaiready.dev
- Report issues: github.com/getaiready/aiready-cli/issues
What's your biggest context budget file? Run the analyzer and share your findings—I'd love to see what you discover.
Peng Cao is the founder of receiptclaimer and creator of aiready, an open-source suite for measuring and optimizing codebases for AI adoption.
Top comments (0)