neitherGalax

Posted on May 30

Tokyo Transit: How MCP Helped Me Fix a Broken Multi-Agent System

#devchallenge #githubchallenge #devjournal #opensource

GitHub “Finish-Up-A-Thon” Challenge Submission

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

Tokyo’s train system is incredible—but it can also be overwhelming. Multiple rail operators, dense transfer stations, and complicated route decisions make even simple trips confusing, especially for visitors.

There are already transit apps out there, but most still feel static. They return information, but they don’t reason through the trip as a whole. Some even lock basic features behind paywalls.

That pushed me to build something different.

I built Tokyo Transportation MCP, an AI-powered transit assistant that uses MCP (Model Context Protocol) and skill-based agent workflows to retrieve and organize Tokyo transit information in a more natural, conversational way.

Instead of jumping between maps and apps, users can ask questions like “What’s the best route from Shinjuku to Maihama?” and receive clear, practical guidance with transfers, timing, and routing context.
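To make "guidance with transfers, timing, and routing context" a bit more concrete, here is a minimal sketch of how a route answer could be represented and rendered. It is not taken from the project's code: the `Leg` and `Route` types, the `describe` helper, and the minute values are all hypothetical, and the example itinerary is illustrative rather than real timetable data.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Leg:
    """One continuous ride on a single line (hypothetical structure)."""
    line: str
    origin: str
    destination: str
    minutes: int


@dataclass(frozen=True)
class Route:
    """An ordered list of legs; each boundary between legs is a transfer."""
    legs: List[Leg]

    @property
    def transfers(self) -> int:
        return max(len(self.legs) - 1, 0)

    @property
    def total_minutes(self) -> int:
        # Riding time only; walking and waiting at transfers are not modeled.
        return sum(leg.minutes for leg in self.legs)


def describe(route: Route) -> str:
    """Render a route as numbered steps followed by a one-line summary."""
    steps = [
        f"{i}. Take the {leg.line} from {leg.origin} to {leg.destination} "
        f"({leg.minutes} min)"
        for i, leg in enumerate(route.legs, start=1)
    ]
    steps.append(
        f"Total: {route.total_minutes} min, {route.transfers} transfer(s)"
    )
    return "\n".join(steps)


# Illustrative placeholder values, not real timetable data.
example = Route(
    legs=[
        Leg("JR Chuo Line", "Shinjuku", "Tokyo", 15),
        Leg("JR Keiyo Line", "Tokyo", "Maihama", 16),
    ]
)

print(describe(example))
```

Keeping the answer as structured data first and formatting it second is one way to give the agent the same fields to talk about on every request.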

More than just route calculation, the project is about making one of the world’s most complex transit systems feel easier to understand through AI agents.

Demo

🔗Repo: Link

Baseline Prompt

Transfer-Heavy Prompt

Ambiguous Prompt

Time-Sensitive Prompt

Alias Prompt

The Comeback Story

This project actually started much earlier as a simple Tokyo transportation agent.

At the time, I was experimenting with early multi-agent workflows and system prompting in the Hugging Face ecosystem. But honestly, the tooling and orchestration patterns were still under development, and I struggled to make the agents behave reliably. The system worked sometimes, but not consistently enough to feel usable.

Eventually, I shelved the project.

Later on, after learning more about MCP architecture and Skill-based systems, I realized I finally had a cleaner way to structure the idea. Instead of forcing complex orchestration too early, I rebuilt the project around modular skills, clearer agent behavior, and more reliable transit retrieval workflows.

What started as an abandoned experiment slowly turned into a much more practical and stable AI transit assistant.

The project is still under active development, and there are plenty of improvements and edge cases left to tackle. But reaching the point where it functions as a stable MCP-powered transit tool feels like a meaningful milestone.

My Experience with GitHub Copilot

GitHub Copilot helped speed up the rebuild, especially when I moved toward MCP and Skill-based design.

A big part of this project wasn’t just coding — it was designing Skill.md, the instruction layer that defines how the agent should behave and interpret transit tasks. Getting that right required a lot of iteration.

Copilot was useful for scaffolding the MCP tool code and exploring different implementation ideas. It made early prototyping much faster.
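For readers who have not seen tool code of this kind, the general shape is a named function with typed inputs and a structured, serializable output. The sketch below is a stdlib-only stand-in and does not use the MCP SDK or reflect the project's real implementation: the `tool` decorator, `call_tool` dispatcher, `plan_route` handler, and the alias table are all hypothetical names chosen for illustration.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical in-process registry standing in for a real MCP server.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {}


def tool(name: str):
    """Register a function under a tool name (illustrative, not an SDK API)."""
    def register(fn: Callable[..., Dict[str, Any]]):
        TOOLS[name] = fn
        return fn
    return register


# Hypothetical alias table: landmark names mapped to a station name.
STATION_ALIASES = {
    "disneyland": "Maihama",
    "tokyo disney resort": "Maihama",
}


def normalize_station(raw: str) -> str:
    """Trim input and resolve known aliases; otherwise title-case it."""
    cleaned = raw.strip()
    return STATION_ALIASES.get(cleaned.lower(), cleaned.title())


@tool("plan_route")
def plan_route(origin: str, destination: str) -> Dict[str, Any]:
    """Return a fixed-shape result; the actual transit lookup is omitted."""
    return {
        "origin": normalize_station(origin),
        "destination": normalize_station(destination),
        "status": "lookup_not_implemented",
    }


def call_tool(name: str, arguments: Dict[str, Any]) -> str:
    """Dispatch a tool call by name and always answer with a JSON string."""
    if name not in TOOLS:
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        result = TOOLS[name](**arguments)
    except TypeError as exc:  # missing or unexpected arguments
        return json.dumps({"error": f"bad arguments: {exc}"})
    return json.dumps(result)


print(call_tool("plan_route", {"origin": " shinjuku ", "destination": "Disneyland"}))
```

The point of the sketch is that every call, including a failed one, comes back in a predictable JSON shape, which leaves less for the instruction layer to compensate for.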

From there, the focus shifted to refinement: tightening instructions, adjusting structure, and iterating until the agent behaved consistently in real scenarios.

Top comments (2)

Harjot Singh • May 31

Nice concrete case - "MCP fixed a broken multi-agent system" is more convincing than any abstract MCP pitch because it shows the actual value: the system was probably broken on bespoke, brittle integrations between agents/tools, and standardizing the tool interface removed a whole class of glue-code failures. The win usually isn't "MCP is magic," it's "MCP let me stop hand-wiring every integration and the failures that came with them."

The transit angle is a great stress test too - multi-modal routing (trains/buses/transfers) is genuinely hard state to coordinate across agents, so if MCP cleaned that up, that's a strong signal. The lesson I take: standardized interfaces between agents matter more than smarter agents, because brittle handoffs are where multi-agent systems actually break (it's the thing I gate hardest in Moonshift, a multi-agent pipeline shipping a prompt to a real SaaS). Cool writeup - what specifically was breaking before MCP: context loss between agents, or tool-call inconsistency? That detail is the useful part for others hitting the same wall.

neitherGalax • May 31

Thanks for the thoughtful comment and great points.

Reflecting on it, I don't think the biggest issue in my original project was context loss between agents. It was more about prompt alignment, instruction following, and maintaining a consistent output structure. The project was built during my early experiments with SmolAgents, and I spent a lot of time trying to manage increasingly complex behavior through a large system prompt.

I think what MCP and later Skill-based design gave me wasn't just a standardized interface—it gave me a cleaner way to organize capabilities, instructions, and tool interactions. Instead of adding more prompt engineering, which is disappearing anyway, I could rely on better architecture.

Looking back, I was probably trying to solve an architecture problem with prompts. Really appreciate the perspective—it helped me better understand what actually changed between the first version and the current one.