DEV Community: Ihor Haiduk

Stop Wasting AI Tokens: How I Built a Smart MCP Server to Feed Perfect Code Context

Ihor Haiduk — Thu, 28 May 2026 08:53:47 +0000

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

I built smarter-faster-better-mcp — an MCP (Model Context Protocol) server designed to stop developers from wasting expensive AI tokens on irrelevant code.

Right now, when you ask an AI like Claude to fix a bug, you either let it blindly search your codebase (burning tokens and time) or you manually copy-paste multiple files into the context window. Both approaches lead to the same problem: the model gets confused by the noise and starts hallucinating.

My MCP acts as an intelligent middleman. Before the heavy AI model even touches the request, this server analyzes your codebase using real AST (Abstract Syntax Tree) parsing. For the JS/TS ecosystem, it uses a blazing-fast Rust-based parser (OXC), and for other languages, it uses Web Tree-Sitter. It walks the syntax tree, finds the exact component needed for the task, resolves complex paths (respecting your tsconfig.json paths), and maps out dependencies down to the specific line numbers.
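To make the path-resolution step more concrete, here is a minimal sketch of how a tsconfig.json-style `paths` alias could be mapped back to a file path. It is not the server's actual code: `resolveAlias`, `joinPath`, the `PathsConfig` type, and the sample aliases are all hypothetical, and the sketch only returns the first target without touching the file system.

```ts
// Hypothetical subset of tsconfig.json "compilerOptions" used by this sketch.
type PathsConfig = {
  baseUrl: string
  paths: Record<string, string[]>
}

// Join two path segments with exactly one "/" between them.
function joinPath(base: string, rest: string): string {
  const left = base.replace(/\/+$/, "")
  const right = rest.replace(/^\.?\/+/, "")
  return left === "" || left === "." ? right : `${left}/${right}`
}

// Resolve an import specifier against path aliases.
// Returns a candidate path, or null when no alias matches.
// Simplification: only the first target of each alias is considered.
function resolveAlias(specifier: string, config: PathsConfig): string | null {
  let bestPrefixLength = -1
  let bestResult: string | null = null

  for (const [pattern, targets] of Object.entries(config.paths)) {
    const first = targets[0]
    if (first === undefined) continue
    const star = pattern.indexOf("*")

    if (star === -1) {
      // Exact alias without a wildcard, e.g. "config".
      if (pattern === specifier) return joinPath(config.baseUrl, first)
      continue
    }

    const prefix = pattern.slice(0, star)
    const suffix = pattern.slice(star + 1)
    const matches =
      specifier.length >= prefix.length + suffix.length &&
      specifier.startsWith(prefix) &&
      specifier.endsWith(suffix)
    if (!matches) continue

    // Prefer the most specific pattern: the one with the longest literal prefix.
    if (prefix.length > bestPrefixLength) {
      const captured = specifier.slice(prefix.length, specifier.length - suffix.length)
      bestPrefixLength = prefix.length
      bestResult = joinPath(config.baseUrl, first.split("*").join(captured))
    }
  }

  return bestResult
}

// Sample aliases, invented for illustration.
const sampleConfig: PathsConfig = {
  baseUrl: ".",
  paths: {
    "@/*": ["src/*"],
    "@/shared/*": ["packages/shared/src/*"],
    "config": ["src/config/index.ts"],
  },
}
```

With this sample config, `resolveAlias("@/filter", sampleConfig)` yields `src/filter`. A real resolver would still need to try file extensions and fall back to the remaining targets.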

The AI receives a tightly packaged, minimal context payload. With far less to read, it responds faster, costs pennies instead of dollars, and is much less likely to hallucinate because it is not distracted by noise.

Demo

🔗 Project Repository: https://github.com/iHaiduk/smarter-faster-better-mcp

Result for AI:

{
  "markdown": "[Scout: FOUND]\n---\n## filterMap (src/filter.ts:L9-14)\n```ts\nexport const filterMap = (map: ProjectMap, query: string): SymbolEntry[] => {\n  const keywords = extractKeywords(query)\n  return map.symbols.filter((sym) => \n    keywords.some((kw) => sym.name.toLowerCase().includes(kw))\n  )\n}\n```\nUsed: src/__tests__/filter.test.ts, src/bin/test-scout.ts",
  "structuredContent": {
    "symbols": "file|symbol|lines|tier|status\nsrc/filter.ts|filterMap|9-14|mustRead|ok",
    "deps": "src/types.ts[37],src/shared/prompts/stop-words.ts[4]",
    "confidence": 1.0,
    "reason": "deterministic",
    "queries": "trace_symbol:ProjectMap,trace_symbol:extractKeywords"
  }
}
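For readers who want to consume this payload programmatically, the sketch below shows one way the pipe-delimited `symbols` table could be turned into typed rows. The `SymbolRow` type and `parseSymbolTable` function are my own illustration, not part of the project, and they assume the first line is always the header and that `lines` holds either a single number or a `start-end` range.

```ts
// Hypothetical row shape, named after the header columns in the payload above.
type SymbolRow = {
  file: string
  symbol: string
  startLine: number
  endLine: number
  tier: string
  status: string
}

// Parse a pipe-delimited table whose first line is the header.
function parseSymbolTable(table: string): SymbolRow[] {
  const lines = table.split("\n").filter((line) => line.trim() !== "")
  const header = lines[0]
  if (header === undefined) return []
  const columns = header.split("|")

  return lines.slice(1).map((line) => {
    const cells = line.split("|")
    // Look cells up by header name so column order does not matter.
    const cell = (name: string): string => cells[columns.indexOf(name)] ?? ""

    // "9-14" becomes 9 and 14; a single number is treated as a one-line range.
    const [startText, endText] = cell("lines").split("-")
    const startLine = Number(startText)
    const endLine = endText === undefined ? startLine : Number(endText)

    return {
      file: cell("file"),
      symbol: cell("symbol"),
      startLine,
      endLine,
      tier: cell("tier"),
      status: cell("status"),
    }
  })
}

// The exact string from the "symbols" field above.
const sampleTable =
  "file|symbol|lines|tier|status\nsrc/filter.ts|filterMap|9-14|mustRead|ok"
const parsedRows = parseSymbolTable(sampleTable)
```

Looking cells up by column name rather than by position keeps the parser working if columns are ever reordered, at the cost of a small lookup per cell.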

In the output above, you can see how the MCP server acts as a smart proxy. Instead of dumping whole files into the prompt, it identifies the exact target function and automatically extracts the precise lines of code from dependency files. The result is a clean, structured package sent directly to the LLM.

The Comeback Story

This project was born out of pure frustration with API costs, but for a long time, it was stuck in a "fragile proof-of-concept" state. It worked in ideal conditions but would break instantly in the real world.

Before the Finish-Up-A-Thon, the project had critical flaws:

Destructive Cleanup: The cache cleanup procedure was way too aggressive. It could accidentally wipe out dist or build folders, and because it ran in the background on startup, it caused silent system failures.
Root Directory Crashes: Running the server in a root directory (/ or ~) would immediately crash it due to permission errors when trying to index hidden system files.
Broken MCP Connection: This was a nightmare. When starting via bunx or npx, the dotenv library would output debug logs (like ◇ injected env...) straight to stdout. Since MCP strictly communicates via JSON over stdout, these text strings corrupted the JSON packets and caused instant disconnects in Claude Desktop or Cursor.
Memory Leaks: If it accidentally hit a node_modules folder, the indexing process would consume gigabytes of RAM and freeze.
Hard Setup: Users had to manually configure environment variables in their IDE configs just to get it running.
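The broken-connection flaw in the list above is easy to reproduce in miniature. The sketch below checks a captured stdout stream line by line and reports anything that is not a JSON-RPC 2.0 message, then shows one common way to keep logs off stdout. The function names are hypothetical, batches are ignored for simplicity, and the redirect is a generic technique rather than a description of what the project actually does.

```ts
// True when a single line parses as a JSON object tagged as JSON-RPC 2.0.
function isJsonRpcMessage(line: string): boolean {
  try {
    const parsed: unknown = JSON.parse(line)
    return (
      typeof parsed === "object" &&
      parsed !== null &&
      (parsed as { jsonrpc?: unknown }).jsonrpc === "2.0"
    )
  } catch {
    return false
  }
}

// Return every non-empty stdout line that would confuse an MCP client.
function findNonProtocolLines(stdoutText: string): string[] {
  return stdoutText
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line !== "" && !isJsonRpcMessage(line))
}

// One possible mitigation: route console.log to stderr so that stdout
// carries protocol messages only.
function redirectConsoleLogToStderr(): void {
  console.log = (...args: unknown[]) => console.error(...args)
}

// A stream polluted by a debug line, followed by a valid response.
const pollutedStdout =
  '◇ injected env...\n{"jsonrpc":"2.0","id":1,"result":{}}\n'
const offendingLines = findNonProtocolLines(pollutedStdout)
```

A check like `findNonProtocolLines` fits naturally into a startup smoke test: launch the server, capture stdout, and fail the build if the returned list is not empty.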

I completely overhauled the architecture to transform it from a fragile script into a robust, production-ready tool. Here is how the project evolved:

Smarter AI Interactions: Instead of letting LLMs lazily rely on slow, blind text searches, I engineered the system to strictly guide them toward structured, AST-driven analysis. This ensures the AI actually "understands" the code structure rather than just matching strings.
Lightning-Fast Indexing: I completely reworked the file traversal engine to aggressively filter out unnecessary noise right from the start. This transformed the indexing process from a potential memory hog into a near-instantaneous operation.
Bulletproof Filesystem Handling: I overhauled how the tool interacts with your local environment. It now operates with strict boundaries and smart safeguards, ensuring that background processes never interfere with your source code, build artifacts, or critical system directories.
Flawless Protocol Communication: A massive amount of work went into sanitizing the data streams. By completely isolating system logs from the strict JSON-based MCP communication channel, the connection is now rock-solid and unbreakable across all major AI clients.
Production-Grade Developer Experience: I modernized the entire build and delivery pipeline for maximum efficiency and a minimal footprint. Coupled with automated CI/CD pipelines and secure publishing, the tool is now as easy to install as it is powerful to use.
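Two of the safeguards described above, pruning noisy directories early and refusing to index the filesystem root or home directory, can be sketched in a few lines. This is an illustration over an in-memory tree, not the project's traversal engine: `TreeNode`, `SKIPPED_DIRS`, `collectSourceFiles`, and `isUnsafeRoot` are hypothetical names, and the skip list is my assumption.

```ts
// In-memory stand-in for a directory tree: a node with children is a directory.
type TreeNode = { name: string; children?: TreeNode[] }

// Assumed skip list; a real tool would likely make this configurable.
const SKIPPED_DIRS = new Set(["node_modules", ".git", "dist", "build"])

// Collect file paths, pruning skipped directories before descending into them.
function collectSourceFiles(node: TreeNode, prefix = ""): string[] {
  const path = prefix === "" ? node.name : `${prefix}/${node.name}`
  if (node.children === undefined) return [path]
  if (SKIPPED_DIRS.has(node.name)) return []

  const files: string[] = []
  for (const child of node.children) {
    for (const file of collectSourceFiles(child, path)) files.push(file)
  }
  return files
}

// Refuse to index "/", "~", or the home directory itself.
function isUnsafeRoot(dir: string, homeDir: string): boolean {
  const normalized = dir.replace(/\/+$/, "")
  const normalizedHome = homeDir.replace(/\/+$/, "")
  return normalized === "" || normalized === "~" || normalized === normalizedHome
}

const sampleTree: TreeNode = {
  name: "repo",
  children: [
    { name: "src", children: [{ name: "filter.ts" }] },
    { name: "node_modules", children: [{ name: "lib", children: [{ name: "index.js" }] }] },
    { name: "README.md" },
  ],
}
const indexedFiles = collectSourceFiles(sampleTree)
```

The important detail is that the skip check happens before recursion, so the contents of `node_modules` are never visited at all instead of being visited and then discarded.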

Taking it from a "hacky script that might delete your build folder" to a stable, production-ready developer tool feels amazing.

My Experience with GitHub Copilot

GitHub Copilot was my co-pilot throughout this entire stabilization phase. I didn't just use it for basic autocomplete; I relied on it heavily for code quality control and finding bottlenecks.

When I was rewriting the AST traversal logic or fixing the complex path resolution with tsconfig.json, Copilot helped me spot edge cases I would have missed. It was instrumental in debugging the weird protocol issues and optimizing the TypeScript code to ensure the server remains lightweight and fast. It acted as a true pair programmer, allowing me to focus on the architecture while it handled the heavy lifting of code refinement.

From Monolith to Symphony — Why the Future of AI Fits in Your Pocket

Ihor Haiduk — Thu, 28 May 2026 06:48:36 +0000

What is the most enduring cognitive illusion regarding the phenomenon of intelligence? It is the notion that a true intellect must know everything. For centuries, the epitome of genius was the erudite—a walking encyclopedia capable of effortlessly operating across the realms of physics, philosophy, and art. However, the exponential growth of information has exposed a harsh epistemological truth: omniscience is not merely impossible—it is devoid of evolutionary meaning. Knowing everything means having no focal point for deep insight.

The abandonment of the illusion of universalism in favor of narrow specialization became the metabolic fuel of our progress. And today, observing the collapse of yet another technological paradigm in the artificial intelligence industry, we are experiencing a logical déjà vu. AI is inevitably forced to undergo the exact same evolutionary path—from doomed attempts to build an omnipotent monolith to a distributed network of narrow minds.

The Anatomy of Progress: Why Generalists Are Doomed

The transition from the paradigm of "I can do everything" to the principle of "I can do one thing, but absolutely" is not merely an economic trend; it is a fundamental engine of civilization. In 1776, Adam Smith illustrated this with a pin factory: a single worker performing every step alone could make at most 20 pins a day. Once the process was split into about 18 narrow operations, ten specialists together produced 48,000 pins a day, or 4,800 each. Output per worker rose at least 240-fold.

More than a century later, Émile Durkheim extended this principle to social evolution. Archaic societies consisted of generalists: each individual procured their own food and built their own shelter. They were completely independent, yet entirely primitive. Advanced civilization, conversely, relies on the critical interdependence of narrow specialists. Synergy is born not in the mind of a single polymath, but at the intersection of diverse kinds of expertise.

The Integration Problem: The Conductor in a World of Virtuosos

However, narrow specialization generates a systemic problem—fragmentation. If every expert is focused exclusively on their own niche, who assembles the scattered puzzle pieces into a cohesive picture?

In a world of narrow profiles, an integrator is critically essential. A manager does not need to know how to write C++ code to direct a developer, nor how to interpret an MRI to manage a clinic. Their function is to see the holistic system, connect the elements, and ensure synchronization. As Peter Drucker argued, the knowledge of specialists is dead on its own; it must be integrated into a unified result.

The Cognitive Dead End of Monolithic AI

And this is precisely where the modern AI industry makes a fatal logical error. The developers of leading LLMs (Large Language Models) ignore centuries of economic and social evolutionary experience. They are attempting to create a monolithic "Superbrain"—a single neural network that knows and can do everything.

This is akin to trying to combine the qualifications of a neurosurgeon, a poet, a mechanic, and a lawyer within a single consciousness. When a critical mass of parameters is reached, the knowledge within the model begins to intersect and interfere. The attempt to compress the universe into a single architecture leads to a dilution of focus: the model generates confident but factually false constructs, the so-called "hallucinations." The generalist becomes inefficient once again.

But the problem lies not only in the realm of cognitive distortions. It lies in the economics of computation.

The Economic Trap: A CEO for $500 an Hour

Training monolithic models costs hundreds of millions of dollars, and their operation requires colossal data centers, whose energy consumption is projected to double by 2030. To recoup these investments, corporations monetize access to the "omnipotent" mind through premium subscriptions.

The situation becomes even more absurd with the advent of AI agents. Modern frameworks use that same ultra-expensive model as the manager. Imagine: you ask the system to find a local file and summarize it. The expensive model itself decomposes the task and calls itself to perform the simplest routine actions. It is like hiring a CEO for $500 an hour to decide who takes out the trash—and then taking it out themselves. We are paying for high-level intelligence where primitive deterministic mechanics are required.

The Architecture of the Future: Division of Cognitive Labor

The way out of this dead end was theoretically justified as early as 1986 by Marvin Minsky in his concept of "The Society of Mind": the computational power of intellect arises from the interaction of a multitude of small, specialized processes, each of which is devoid of intelligence on its own.

The true architecture of future AI is a multi-agent ecosystem with a radically different principle of orchestration. Instead of one expensive, error-prone monolith, we are creating a swarm of highly specialized agents. But the key insight lies in who manages them. The function of the conductor must be performed not by an expensive cloud model, but by a deterministic algorithm or a local lightweight model.

The principle is crystal clear: the separation of grunt work and deep synthesis.

Cheap local models act as scanners and filters. They do not "think"; they mechanically extract relevant fragments from massive data arrays, cutting off informational noise.
The purified, super-concentrated context is passed to the large model, but exclusively for final analysis and synthesis.
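The two steps above can be sketched as a small deterministic pipeline. The `filterMap` function follows the one shown in the smarter-faster-better-mcp demo output; everything else here is an assumption made for illustration, including the `SymbolEntry` and `ProjectMap` shapes, the body of `extractKeywords`, the stop-word list, and the `buildContext` helper.

```ts
// Hypothetical shapes; the real project types are not shown in the article.
type SymbolEntry = { name: string; file: string; source: string }
type ProjectMap = { symbols: SymbolEntry[] }

// Assumed stop-word list for this sketch.
const STOP_WORDS = new Set(["the", "and", "fix", "bug", "for", "with"])

// Cheap, deterministic keyword extraction: no model call involved.
function extractKeywords(query: string): string[] {
  return query
    .toLowerCase()
    .split(/[^a-z0-9_]+/)
    .filter((word) => word.length > 2 && !STOP_WORDS.has(word))
}

// Step 1: the "scanner" keeps only symbols whose names match a keyword.
function filterMap(map: ProjectMap, query: string): SymbolEntry[] {
  const keywords = extractKeywords(query)
  return map.symbols.filter((sym) =>
    keywords.some((kw) => sym.name.toLowerCase().includes(kw))
  )
}

// Step 2: package the survivors as a compact context for the large model.
function buildContext(map: ProjectMap, query: string): string {
  return filterMap(map, query)
    .map((sym) => `## ${sym.name} (${sym.file})\n${sym.source}`)
    .join("\n\n")
}

const sampleMap: ProjectMap = {
  symbols: [
    { name: "filterMap", file: "src/filter.ts", source: "export const filterMap = ..." },
    { name: "parseConfig", file: "src/config.ts", source: "export const parseConfig = ..." },
    { name: "renderChart", file: "src/chart.ts", source: "export const renderChart = ..." },
  ],
}
const context = buildContext(sampleMap, "Fix the bug in filterMap")
```

Only the string returned by `buildContext` would be sent to the expensive model. The scanning step is plain string matching, which is exactly why it is cheap and repeatable.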

Such architecture is already ceasing to be a mere theoretical construct. For instance, solutions like smarter-faster-better-mcp implement exactly this principle: lightweight local agents take on the grunt work of scanning files and filtering noise, passing an already purified "broth" of context to the large model for final synthesis. This is just one example of how the division of cognitive labor is being realized in practice, proving its viability.

The result of this approach is clear: the expensive model receives not raw data, but a ready-made concentrate. It does not waste computational resources on search, so answer quality improves, hallucinations become far less likely, and the cost of the query drops sharply. We take the grunt work away from expensive intelligence and hand it to those who do it cheaper, faster, and more reliably.

Compute Sovereignty: AI in Your Pocket

This architecture breaks the primary technological barrier of our time: dependence on mega-servers. Narrowly targeted agents can get by with far fewer parameters. This opens the door to Edge AI, that is, running models directly on user devices. No internet. No transferring personal data to corporate servers. Data stays on the device, and there is no network latency.

The industry is already moving in this direction: protocols are standardizing agent interaction, and the share of enterprise applications utilizing multi-agent architectures is growing rapidly.

Conclusion

We stand on the threshold of the final rejection of the omniscience illusion. The strength of artificial intelligence, like the strength of human civilization, lies not in universality, but in focus. Not in the scale of the monolith, but in flawless coordination.

The future belongs not to the AI that requires an entire data center to answer a simple question, nor to the corporation renting out that data center. The future belongs to you—the conductor directing an orchestra of simple, cheap, and incredibly effective narrow specialists. And this entire orchestra already fits in your pocket.