<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alpic</title>
    <description>The latest articles on DEV Community by Alpic (@alpic).</description>
    <link>https://dev.to/alpic</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F11088%2Fd212e0c5-c450-44b8-b527-2dd28112a60f.png</url>
      <title>DEV Community: Alpic</title>
      <link>https://dev.to/alpic</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alpic"/>
    <language>en</language>
    <item>
      <title>Designing a CLI for Both Humans and Agents</title>
      <dc:creator>Julien Vallini</dc:creator>
      <pubDate>Wed, 15 Apr 2026 10:03:17 +0000</pubDate>
      <link>https://dev.to/alpic/designing-a-cli-for-both-humans-and-agents-4069</link>
      <guid>https://dev.to/alpic/designing-a-cli-for-both-humans-and-agents-4069</guid>
      <description>&lt;p&gt;We recently released the Alpic MCP and CLI, giving users two new interfaces with which they can interact. Designing the Alpic CLI for both humans and agents surfaced a set of challenges and tradeoffs worth writing down!&lt;/p&gt;

&lt;h2&gt;
  
  
  Why does building for agents matter?
&lt;/h2&gt;

&lt;p&gt;Interfaces and layouts have traditionally been designed for humans: easy to understand, with actions that are easy to perform. At Alpic we believe that agents are becoming the new interface: instead of interacting with a system directly, humans interact with an agent that interacts with the system. The human-agent side has largely been solved by LLMs: since agents have been trained mostly on human content, they are very good at understanding humans.&lt;/p&gt;

&lt;p&gt;With the Alpic engineering team, we're committed to solving the remaining challenge: the agent-system interface. In other words, how to give agents the ability to perform the same actions humans do. Besides MCP, which was designed exactly for this purpose, CLIs happen to be a surprisingly good connector. They were designed in the first place for humans to interact with machines textually, so they naturally work well for agents too, which are heavily text-driven. On top of that, CLIs are composable and well represented in training corpora, with plenty of examples of how they should be used.&lt;/p&gt;

&lt;p&gt;But designing a single system (here, a CLI) for both humans and agents means reconciling different requirements. This blog post explores them.&lt;/p&gt;

&lt;h2&gt;
  
  
  How are agents and humans different?
&lt;/h2&gt;

&lt;p&gt;When it comes to CLIs, humans and agents behave surprisingly similarly: both will try calling a command with the &lt;code&gt;--help&lt;/code&gt; flag to be guided toward the right usage.&lt;/p&gt;

&lt;p&gt;The first difference is the &lt;strong&gt;context window&lt;/strong&gt;: as an agent executes subsequent commands, it fills its context window, meaning that every additional call adds to token costs. This means verbose output is expensive — a command that dumps 200 lines of logs costs real money in an agentic loop.&lt;/p&gt;

&lt;p&gt;Agents are also quite &lt;strong&gt;bad at polling&lt;/strong&gt;. A human starting a deployment with a CLI will intuitively wait a minute or two before checking the status, expecting a final state (either deployed or failed). Agents won't wait around doing nothing; unless the command they executed blocks until completion, they'll immediately poll again and again.&lt;/p&gt;

&lt;p&gt;Another difference is the &lt;strong&gt;inability for agents to handle interactive CLIs&lt;/strong&gt;. This may improve in the future, but at the moment, agents are far more efficient sending non-interactive one-shot commands and getting the result as parsable JSON.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Alpic CLI secret sauce
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;--non-interactive&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;All our commands implement a &lt;code&gt;--non-interactive&lt;/code&gt; flag, which allows users to automatically accept confirmation prompts such as "Are you sure you want to…". The goal is to reduce context usage and prevent agents from being blocked by interactive prompts.&lt;/p&gt;
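&lt;p&gt;A minimal sketch of what this gating can look like (the &lt;code&gt;askUser&lt;/code&gt; helper is a made-up stand-in for a real interactive prompt, not the actual Alpic implementation):&lt;/p&gt;

```typescript
// Illustrative sketch: auto-accept confirmation prompts when --non-interactive is set.
const argv = process.argv.slice(2);
const nonInteractive = argv.includes("--non-interactive");

// Stand-in for a real interactive prompt (readline, inquirer, ...).
function askUser(question) {
  return Promise.resolve(false); // a real CLI would wait for y/n input here
}

function confirm(question) {
  if (nonInteractive) {
    // Skip the prompt entirely: the agent is never blocked waiting for input,
    // and no prompt text ends up in its context window.
    return Promise.resolve(true);
  }
  return askUser(question);
}
```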

&lt;p&gt;We also chose not to provide a JSON output format for now. Our tests show that agents are fully capable of understanding output intended for human users, and since JSON is a relatively verbose format, it adds unnecessary overhead to context usage. Additionally, dynamic console artifacts (such as loading spinners) tend to fill the agent's context with noise and should be avoided.&lt;/p&gt;

&lt;p&gt;That said, this space is evolving quickly, and our perspective is still forming. We'd love to hear how others are approaching these tradeoffs. Feel free to share your experiences on our Discord!&lt;/p&gt;

&lt;h3&gt;
  
  
  Use only named parameters
&lt;/h3&gt;

&lt;p&gt;We noticed that agents (and humans too!) struggle with positional parameters in commands. By requiring all parameters to be named, we greatly reduce the risk of confusion, and we avoid a round trip to the documentation or to the &lt;code&gt;--help&lt;/code&gt; flag.&lt;/p&gt;
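&lt;p&gt;A toy parser illustrating the named-only rule (this is not the actual Alpic argument parser, just a sketch of the policy):&lt;/p&gt;

```typescript
// Illustrative sketch: accept only named --key value flags, never positionals.
function parseNamedArgs(argv) {
  const args = {};
  let i = 0;
  while (argv.length > i) {
    const token = argv[i];
    if (!token.startsWith("--")) {
      // Reject positionals outright so neither humans nor agents can misplace them.
      throw new Error("Unexpected positional argument: " + token + ". Use --name value instead.");
    }
    const value = argv[i + 1];
    if (value === undefined) {
      throw new Error("Missing value for " + token);
    }
    args[token.slice(2)] = value;
    i = i + 2;
  }
  return args;
}
```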

&lt;h3&gt;
  
  
  No &lt;code&gt;--cwd&lt;/code&gt; flag to avoid working directory confusion
&lt;/h3&gt;

&lt;p&gt;We decided not to provide a way to choose the working directory of a command. For example, when creating a project with a relative &lt;code&gt;root-dir&lt;/code&gt;: is it relative to the current working directory or to something else? Agents are good at navigating between folders, but bad at checking in which folder they execute a command, so removing &lt;code&gt;--cwd&lt;/code&gt; reduces ambiguity.&lt;/p&gt;

&lt;p&gt;We also added checks on deployment, for example to fail early if the directory the deploy command was executed from is obviously wrong (e.g. an empty directory).&lt;/p&gt;

&lt;h3&gt;
  
  
  Commands should wait rather than return early
&lt;/h3&gt;

&lt;p&gt;Humans are fine with retrying a command every few seconds. Agents are not: they'll either poll aggressively (wasting tokens) or miss the final state entirely. Making long-running commands block until completion is a much better fit for agentic workflows. When an operation genuinely finishes quickly, returning immediately is still best, as it hands control back to the agent to decide what to do next.&lt;/p&gt;

&lt;p&gt;In practice, this means our &lt;code&gt;alpic deploy&lt;/code&gt; command doesn't return until the deployment has either succeeded or failed. And if something stalls, the CLI gives up and returns an explicit error rather than leaving the agent waiting forever.&lt;/p&gt;
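&lt;p&gt;The blocking behaviour can be sketched as a poll-until-terminal loop with a hard deadline (&lt;code&gt;fetchStatus&lt;/code&gt; is a hypothetical API call, and the timeout values are illustrative):&lt;/p&gt;

```typescript
// Illustrative sketch: block until the deployment reaches a terminal state,
// with a hard timeout so the agent is never left waiting forever.
// fetchStatus is assumed to return "building", "deployed", or "failed".
async function waitForDeployment(fetchStatus, timeoutMs = 600000, intervalMs = 5000) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await fetchStatus();
    if (status === "deployed" || status === "failed") {
      return status; // terminal state: hand control back with a definitive answer
    }
    if (Date.now() > deadline) {
      throw new Error("Deployment did not finish in time; check the dashboard.");
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```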

&lt;h3&gt;
  
  
  Explicit command names, not abbreviated
&lt;/h3&gt;

&lt;p&gt;Unlike humans, agents don't mind typing a few extra keystrokes. To reduce ambiguity, all of our commands and parameters use full words. For example, we named our command &lt;code&gt;alpic environment-variable&lt;/code&gt; instead of &lt;code&gt;alpic env&lt;/code&gt;, which could be mistaken for a command that manages environments rather than environment variables. This added verbosity slightly increases token usage, but our tests show it's a worthwhile tradeoff: clarity wins over minimal token usage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Being stateless (not relying on implicit server state)
&lt;/h3&gt;

&lt;p&gt;Humans know their own context: they may know, for example, whether they've already deployed their project successfully. An agent doesn't carry that context between sessions. Calling a command such as &lt;code&gt;alpic deployments inspect&lt;/code&gt; without any parameter, expecting it to return the latest deployment, requires implicit knowledge of what's on the server that agents can't reliably track.&lt;/p&gt;

&lt;p&gt;For example, inspecting a deployment requires explicitly passing either a &lt;code&gt;--deployment-id&lt;/code&gt; or an &lt;code&gt;--environment-id&lt;/code&gt;, and these are mutually exclusive. Even retrieving the "latest deployment" must be scoped to a specific environment via &lt;code&gt;--environment-id&lt;/code&gt;, rather than relying on hidden defaults.&lt;/p&gt;

&lt;p&gt;We thus chose to always require explicit parameters, ensuring that every command is deterministic and does not depend on implicit or session-specific context.&lt;/p&gt;
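&lt;p&gt;A sketch of how such mutual exclusivity can be enforced (the flag names come from the example above; the helper itself is illustrative):&lt;/p&gt;

```typescript
// Illustrative sketch: require exactly one of two mutually exclusive identifiers.
function resolveInspectTarget(flags) {
  const hasDeployment = typeof flags.deploymentId === "string";
  const hasEnvironment = typeof flags.environmentId === "string";
  if (hasDeployment === hasEnvironment) {
    // Both set, or neither set: either way the request is ambiguous.
    throw new Error("Pass exactly one of --deployment-id or --environment-id.");
  }
  return hasDeployment
    ? { kind: "deployment", id: flags.deploymentId }
    : { kind: "environment", id: flags.environmentId };
}
```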

&lt;p&gt;We also state explicitly in the console when a flag has been deduced from a linked project. This helps agents understand what happened, and it's actually better for humans too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Most of our optimisations ultimately aim to reduce ambiguity and to ensure that CLI output and parameters make as few assumptions as possible. Humans may need to type a few more keystrokes, but that's a fair price for the self-documenting effect of clear, full-word, named parameters and commands.&lt;/p&gt;

&lt;p&gt;Our goal — which is also how we measure our CLI's agent-readiness — is to make sure we can develop, deploy, and monitor apps while only interacting with an agent. If the agent is the only interface our users need to interact with Alpic and the experience is smooth, we consider our CLI successful.&lt;/p&gt;

&lt;p&gt;While our understanding of agent usage is still evolving and our experiments may shift our perspective over time, we're committed to building our CLI as a reference system where both agents and humans are treated as first-class citizens. Models are improving, and new agentic frameworks and systems are appearing every day — we look forward to keeping our CLI at the forefront and in step with these emerging use cases.&lt;/p&gt;

&lt;p&gt;Head to our documentation to give the Alpic CLI a try.&lt;/p&gt;

</description>
      <category>cli</category>
      <category>ai</category>
      <category>devtools</category>
      <category>opensource</category>
    </item>
    <item>
      <title>15 Lessons Learned Building ChatGPT Apps</title>
      <dc:creator>Nikolay Rodionov</dc:creator>
      <pubDate>Mon, 16 Feb 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/alpic/15-lessons-learned-building-chatgpt-apps-2i89</link>
      <guid>https://dev.to/alpic/15-lessons-learned-building-chatgpt-apps-2i89</guid>
      <description>&lt;p&gt;At &lt;a href="https://alpic.ai" rel="noopener noreferrer"&gt;Alpic&lt;/a&gt;, we believe the next generation of products and services will be built around &lt;strong&gt;AI-first experiences&lt;/strong&gt;, interfaces where users collaborate with models instead of navigating traditional, predetermined UI workflows.&lt;/p&gt;

&lt;p&gt;When OpenAI released the &lt;strong&gt;Apps SDK&lt;/strong&gt;, we immediately started building with it. Over the course of three months, we developed two dozen ChatGPT Apps for both internal use and for our customers across B2B and B2C spaces such as &lt;strong&gt;travel, retail, and SaaS&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What we discovered early on is that &lt;strong&gt;building ChatGPT Apps is fundamentally different from building traditional web or mobile applications&lt;/strong&gt;. Patterns that work well on the web (just-in-time data fetching, UI-driven state, explicit user configuration, etc.) often break down or actively harm the experience in an agentic environment.&lt;/p&gt;

&lt;p&gt;This post is a distilled set of the &lt;strong&gt;15 most important lessons&lt;/strong&gt; we learned while building real-world ChatGPT Apps, followed by how we incorporated those lessons into an open-source framework for the community, &lt;a href="https://github.com/alpic-ai/skybridge" rel="noopener noreferrer"&gt;&lt;strong&gt;Skybridge&lt;/strong&gt;&lt;/a&gt;, and a &lt;a href="https://github.com/alpic-ai/skybridge/tree/main/skills/chatgpt-app-builder" rel="noopener noreferrer"&gt;&lt;strong&gt;Codex Skill&lt;/strong&gt;&lt;/a&gt; to help developers ideate, build, test, and ship Apps significantly faster.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three-body problem
&lt;/h2&gt;

&lt;p&gt;With traditional web apps, things were simple: you only had a &lt;strong&gt;user&lt;/strong&gt; and a &lt;strong&gt;UI&lt;/strong&gt;. In a ChatGPT app, a third body enters the system: the &lt;strong&gt;model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;One of the hardest parts of building for ChatGPT is managing how information flows between this trio. If a user clicks a “Select” button in your widget, the UI updates visually, but the model, the brain of the conversation, remains unaware unless you explicitly surface that context. If the user then asks, &lt;em&gt;“Give me more details about this product,”&lt;/em&gt; the model has no idea what the user is actually looking at.&lt;/p&gt;

&lt;p&gt;We call this &lt;strong&gt;context asymmetry&lt;/strong&gt;: each body has partial knowledge of the system, and no single one has the full picture. Building good ChatGPT Apps isn’t about keeping everything in sync, but about deciding &lt;em&gt;what&lt;/em&gt; information should be shared, &lt;em&gt;when&lt;/em&gt; it should be shared, and &lt;em&gt;who&lt;/em&gt; needs visibility into it. Getting this right is the difference between a clunky app and a seamless agentic experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Not all context should be shared
&lt;/h3&gt;

&lt;p&gt;Our initial instinct was to “just share everything everywhere.” That turned out to be one of our earliest mistakes.&lt;/p&gt;

&lt;p&gt;In practice, different parts of a ChatGPT App often need &lt;em&gt;intentionally different&lt;/em&gt; views of the same state. Why?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For performance:&lt;/strong&gt; UI widgets often need far more data than the model should ever see: in a travel booking app, for example, images, pricing variants, and preloaded options. Sending all of this to the model would increase token usage, latency, and cognitive noise.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For logic:&lt;/strong&gt; some information must remain asymmetric by design. In one of our earliest apps, a &lt;em&gt;Murder in the Valleys&lt;/em&gt; mystery game, the model needs to know who the killer is to roleplay correctly, while the UI and user must not. In a &lt;em&gt;Time’s Up&lt;/em&gt;-style game, the situation is reversed: the UI shows the secret word to the user, while the model must remain unaware.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The lesson wasn’t “always sync everything,” but rather: &lt;strong&gt;decide explicitly who needs to know what&lt;/strong&gt;. We formalized this using different &lt;em&gt;tool output&lt;/em&gt; fields:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Visible to&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;structuredContent&lt;/td&gt;
&lt;td&gt;Typed data for the widget and model&lt;/td&gt;
&lt;td&gt;Both widget and model (via toolOutput and callTool functions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;_meta&lt;/td&gt;
&lt;td&gt;Response metadata&lt;/td&gt;
&lt;td&gt;Widget only, hidden from the model&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For example, in the Time’s Up game, we pass the secret word only to the widget, via the &lt;code&gt;_meta&lt;/code&gt; field, letting the model guess the word from the user’s hints.&lt;/p&gt;
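&lt;p&gt;As a sketch, a tool result following this split might look like the following (the field names match the table above; the game payload itself is invented for the example):&lt;/p&gt;

```typescript
// Illustrative sketch of a tool result that splits state between model and widget.
function buildTimesUpToolResult(secretWord) {
  return {
    // structuredContent: visible to both the widget and the model.
    structuredContent: { round: 1, phase: "guessing" },
    // _meta: visible to the widget only, so the model has to guess from hints.
    _meta: { secretWord: secretWord },
  };
}
```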

&lt;h3&gt;
  
  
  2. Lazy-loading doesn’t translate well to AI apps
&lt;/h3&gt;

&lt;p&gt;Coming from web development, we defaulted to lazy-loading: fetching data when the user clicks; loading details on demand; optimizing for minimal upfront payloads.&lt;/p&gt;

&lt;p&gt;In ChatGPT, the paradigm is reversed: tool calls imply delays, often taking several seconds due to security sandboxing and model reasoning.&lt;/p&gt;

&lt;p&gt;In practice, we learned to front-load aggressively: sending as much data as possible into the initial tool response, and hydrating the widget via &lt;em&gt;window.openai.toolOutput&lt;/em&gt;. This almost always resulted in a faster and more responsive experience.&lt;/p&gt;
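&lt;p&gt;A sketch of widget-side hydration under this approach (&lt;code&gt;window.openai.toolOutput&lt;/code&gt; is the host-provided field mentioned above; the product shape is invented for the example):&lt;/p&gt;

```typescript
// Illustrative sketch: hydrate the widget from the front-loaded tool output
// instead of lazy-fetching on user interaction.
function hydrateWidget(host) {
  const output = host.openai ? host.openai.toolOutput : undefined;
  if (!output) {
    // Render a skeleton until the host injects the data.
    return { products: [], ready: false };
  }
  // Everything needed for first paint was sent in the initial tool response.
  return { products: output.products, ready: true };
}
```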

&lt;p&gt;Of course, if the widget can safely fetch data from a public API endpoint, and doesn’t need to share information with the model, it’s always possible to use classic XHR calls inside your widget, but most of the time you want the model to be able to call tools autonomously to keep the experience conversational.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The model needs visibility
&lt;/h3&gt;

&lt;p&gt;A subtle but critical problem arises when the user interacts with a widget (e.g., selecting a specific product in a list) and then asks a question in the chat. If the model doesn’t know what part of the UI the user is referring to, it won’t be able to answer correctly.&lt;/p&gt;

&lt;p&gt;For this we used &lt;code&gt;window.openai.setWidgetState(state)&lt;/code&gt;, which allows you to store specific state data that is added to the model’s context on the next user-model interaction.&lt;/p&gt;
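&lt;p&gt;In its imperative form, that pattern looks roughly like this (the product object and summary text are made up for the example):&lt;/p&gt;

```typescript
// Illustrative sketch: push UI context to the model on a meaningful interaction.
function onProductSelected(host, product) {
  // The host persists this state and adds it to the model's context
  // on the next user-model interaction.
  host.openai.setWidgetState({
    selectedProductId: product.id,
    summaryForModel: "User selected " + product.name,
  });
}
```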

&lt;p&gt;With apps growing in complexity, we saw that we were adding &lt;code&gt;setWidgetState&lt;/code&gt; in a lot of places for the model to keep track of the navigation. So we decided to introduce a declarative way to describe UI context. Instead of updating the model imperatively on every interaction, we attach a &lt;code&gt;data-llm&lt;/code&gt; attribute directly to components:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;div
  data-llm={
    selectedTab === "details"
      ? "User is viewing product details"
      : "User is viewing reviews"
  }
&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For this to work behind the scenes, we built a Vite plugin that scrapes these attributes and automatically updates the widgetState. From the model’s perspective, it simply receives the relevant UI context at the right time, without developers having to manually synchronize every interaction.&lt;/p&gt;

&lt;p&gt;You can find this Vite plugin (and many other tips we share in this article) in the &lt;a href="https://github.com/alpic-ai/skybridge" rel="noopener noreferrer"&gt;open-source framework&lt;/a&gt; we created to share our learnings with the community.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Different interactions require different APIs
&lt;/h3&gt;

&lt;p&gt;ChatGPT Apps involve multiple interaction paths between the widget, the server, and the model. These paths are not interchangeable: each exists to support a different kind of interaction.&lt;/p&gt;

&lt;p&gt;One of the key lessons in building ChatGPT Apps is making these communication paths explicit, and being intentional about which mechanism is responsible for which part of the experience.&lt;/p&gt;

&lt;p&gt;Mapping out that path looks something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fry2sts21ogebwek1kd4c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fry2sts21ogebwek1kd4c.jpg" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These lessons establish the foundations of a ChatGPT App: how context is shared, how the model gains visibility, and how different interactions propagate through the system. The next section builds on this foundation and focuses on the implications for UI design.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reinventing UI for AI
&lt;/h2&gt;

&lt;p&gt;ChatGPT Apps are a completely new environment, so we quickly learned to set aside our preconceived notions about UI and use the new capabilities fully. This section covers interface design assumptions that we needed to learn (and unlearn) to create effective apps.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. UI must adapt to multiple display modes, and their constraints
&lt;/h3&gt;

&lt;p&gt;ChatGPT Apps don’t live in a single layout. Depending on how and when they’re invoked, the same widget can be rendered in three different display modes.&lt;/p&gt;

&lt;p&gt;Apps can appear &lt;strong&gt;inline&lt;/strong&gt; in the conversation, in &lt;strong&gt;picture-in-picture (PiP)&lt;/strong&gt; on top of it, or in &lt;strong&gt;fullscreen&lt;/strong&gt; when more space is needed. While PiP and fullscreen enable richer interfaces, they also introduce UI overlays that the widget doesn’t control. Accounting for device-specific safe zones, such as the persistent close button on mobile, is essential to avoid clipped content and to optimize interactions.&lt;/p&gt;

&lt;p&gt;Over time, we identified patterns around display modes and when to use them:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Display mode&lt;/th&gt;
&lt;th&gt;What it looks like&lt;/th&gt;
&lt;th&gt;When to use it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Default display mode. The widget stays in the conversation history.&lt;/td&gt;
&lt;td&gt;For quick interactions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fullscreen&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Widget takes up the entire screen, with the chat bar at the bottom.&lt;/td&gt;
&lt;td&gt;If your widget is complex and needs a lot of space (e.g., maps).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Picture-in-Picture&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same size as inline, but the widget stays on top of the conversation.&lt;/td&gt;
&lt;td&gt;If your widget remains relevant during conversational follow-ups after generation.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  6. UI consistency matters in an embedded environment
&lt;/h3&gt;

&lt;p&gt;Early on, one uncertainty we ran into was how much visual freedom a ChatGPT App should take. As a new interface for users, it needed to feel familiar and consistent, both within our own apps and with the surrounding ChatGPT ecosystem. Unlike a standalone product, a widget lives inside an existing interface, where visual inconsistencies stand out immediately.&lt;/p&gt;

&lt;p&gt;Fortunately, the &lt;a href="https://github.com/openai/apps-sdk-ui" rel="noopener noreferrer"&gt;OpenAI Apps SDK UI Kit&lt;/a&gt; gave us a clear baseline.&lt;/p&gt;

&lt;p&gt;Built on Tailwind CSS, it provides ready-to-use components, icons, and design tokens that align with ChatGPT’s design system. Using it allowed us to move quickly while ensuring our widgets felt native and visually consistent with the surrounding interface, even when building custom components (for example, for our Mapbox integration).&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Language-first filtering
&lt;/h3&gt;

&lt;p&gt;Traditional dashboards are built on sidebars full of checkboxes and range sliders. In agentic UI, this is often a regression. When users can express intent directly in natural language, for example, “Sunny destinations in Europe for under $200,” forcing them through multiple UI controls adds friction. They should be able to just say it.&lt;/p&gt;

&lt;p&gt;We therefore decided to go the way of “no filters” for most of our apps. Instead of a sidebar with options to filter and sort, we provide the model with a &lt;strong&gt;List of Values (LOV)&lt;/strong&gt; for our tool parameters.&lt;/p&gt;

&lt;p&gt;This allows the model to take the user’s message as input directly, preventing it from “guessing” what options are available. In other words, it can map natural language directly onto our backend’s API requirements. If a user says “sunny,” the model knows to call the tool with &lt;code&gt;weather="sunny"&lt;/code&gt;.&lt;/p&gt;
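&lt;p&gt;Concretely, the List of Values can live in the tool’s input schema as &lt;code&gt;enum&lt;/code&gt; values; the parameter names below are invented for illustration:&lt;/p&gt;

```typescript
// Illustrative sketch: a JSON-Schema-style tool input with a List of Values,
// so the model maps natural language straight onto allowed API values.
const searchDestinationsInputSchema = {
  type: "object",
  properties: {
    // "Sunny destinations" maps directly to weather: "sunny".
    weather: { type: "string", enum: ["sunny", "mild", "snowy"] },
    maxPriceUsd: { type: "number" },
    region: { type: "string", enum: ["europe", "asia", "americas"] },
  },
  required: ["region"],
};
```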

&lt;h3&gt;
  
  
  8. Files can unlock richer interactions
&lt;/h3&gt;

&lt;p&gt;One lesson that emerged as we built more complex apps is that files shouldn’t be treated as secondary inputs. In ChatGPT Apps, files can unlock new interactions. Instead of starting from forms or filters, experiences can start from something the user already has.&lt;/p&gt;

&lt;p&gt;For example, in an ecommerce app, a user can upload a photo of a product in the chat, have the model identify it, and then continue into product matching or discovery directly in the widget.&lt;/p&gt;

&lt;p&gt;This is made possible by letting files flow through both sides of the system. On the model side, tools can directly consume files uploaded in the chat via &lt;code&gt;openai/fileParams&lt;/code&gt;, allowing the model to reason over images or other user-provided assets. On the UI side, widgets can also work with files directly using &lt;code&gt;window.openai.uploadFile&lt;/code&gt; and &lt;code&gt;window.openai.getFileDownloadUrl&lt;/code&gt;, making it possible to request uploads as part of the UI flow or generate files users can download and reuse.&lt;/p&gt;

&lt;h2&gt;
  
  
  Going to production
&lt;/h2&gt;

&lt;p&gt;Next, as apps move beyond local development, a different set of considerations comes into play around security, configuration, and tooling. That’s what this third set of lessons covers.&lt;/p&gt;

&lt;h3&gt;
  
  
  9. CSPs are the new CORS
&lt;/h3&gt;

&lt;p&gt;For security reasons, OpenAI renders Apps inside a double-nested iframe. Content Security Policies (CSPs) are a native mechanism of iframe isolation, and this setup enforces them strictly, often surfacing as the classic “it works locally but breaks in production” syndrome.&lt;/p&gt;

&lt;p&gt;Unlike traditional web dev where you might get away with a loose policy, the Apps SDK requires you to be surgical.&lt;/p&gt;

&lt;p&gt;In the app manifest, this means carefully declaring which domains are allowed for each type of interaction:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;th&gt;Common mistakes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;connectDomains&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;API &amp;amp; XHR requests&lt;/td&gt;
&lt;td&gt;&lt;a href="https://api.weather.com" rel="noopener noreferrer"&gt;https://api.weather.com&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Forgetting the staging API vs. production.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;resourceDomains&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Images, fonts, scripts&lt;/td&gt;
&lt;td&gt;&lt;a href="https://cdn.jsdelivr.net" rel="noopener noreferrer"&gt;https://cdn.jsdelivr.net&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Using a generic CDN like jsdelivr.net without whitelisting it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;frameDomains&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Embedding iframes&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.youtube.com" rel="noopener noreferrer"&gt;https://www.youtube.com&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Embedding a YouTube video or Mapbox instance without whitelisting it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;redirectDomains&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;External links opened without warnings&lt;/td&gt;
&lt;td&gt;&lt;a href="https://app.alpic.ai" rel="noopener noreferrer"&gt;https://app.alpic.ai&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Forgetting the checkout or OAuth callback domain.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
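&lt;p&gt;Put together, an illustrative declaration of these fields might look like this (the field names come from the table above; the domains are placeholders, and remember to list every environment you call, staging included):&lt;/p&gt;

```typescript
// Illustrative sketch of the four CSP-related manifest fields from the table above.
const cspConfig = {
  // API and XHR requests the widget is allowed to make.
  connectDomains: ["https://api.weather.com", "https://api-staging.weather.com"],
  // Images, fonts, and scripts the widget may load.
  resourceDomains: ["https://cdn.jsdelivr.net"],
  // Domains the widget may embed as iframes.
  frameDomains: ["https://www.youtube.com"],
  // External links that open without an interstitial warning.
  redirectDomains: ["https://app.alpic.ai"],
};
```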

&lt;p&gt;Treating CSP configuration as a first-class concern early on saved us a significant amount of production debugging later.&lt;/p&gt;

&lt;h3&gt;
  
  
  10. Small widget flags have outsized impact
&lt;/h3&gt;

&lt;p&gt;Beyond CSPs, a small set of widget-level settings determines how control is shared between the widget, the model, and the host environment. These flags are easy to overlook, but they define critical boundaries for navigation, tool access, and publishing.&lt;/p&gt;

&lt;h4&gt;
  
  
  Host and navigation boundaries
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;widgetDomain&lt;/code&gt;&lt;/strong&gt; is required for submission. It defines the default location where the “Open in ” button points in fullscreen mode and participates in origin whitelisting, since widgets are rendered under &lt;code&gt;&amp;lt;widgetDomain&amp;gt;.web-sandbox.oaiusercontent.com&lt;/code&gt;. We used &lt;code&gt;setOpenInAppUrl&lt;/code&gt; to route users to the appropriate path based on context.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Model and tool boundaries
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool annotations&lt;/strong&gt; must follow publishing guidelines. Flags like &lt;code&gt;readOnly&lt;/code&gt;, &lt;code&gt;destructiveHint&lt;/code&gt;, and &lt;code&gt;openWorldHint&lt;/code&gt; are required and validated during submission.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool visibility&lt;/strong&gt; matters: tools that should not be callable by the model must be explicitly marked as private.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Widget execution boundaries
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;widgetAccessible&lt;/code&gt;&lt;/strong&gt; controls whether the widget can call tools on its own using &lt;code&gt;callTool&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Individually these settings are small, but together they determine whether an app behaves correctly once published.&lt;/p&gt;

&lt;h2&gt;
  
  
  Optimizing for fast iteration
&lt;/h2&gt;

&lt;p&gt;The Apps SDK is evolving rapidly, and we’ve been excited to build alongside it. To support a smooth and efficient development workflow, we built our own open-source framework and shared it with the community. The following lessons address the developer-experience issues we ran into early on.&lt;/p&gt;

&lt;h3&gt;
  
  
  11. Fast iteration requires hot reload
&lt;/h3&gt;

&lt;p&gt;One of the first things we tackled was iteration speed. The combination of long-TTL resource caching and the use of JSON-RPC to forward the resources makes standard hot module reload (as found in Vite or Next.js) incompatible with ChatGPT Apps out of the box.&lt;/p&gt;

&lt;p&gt;After spending considerable time understanding Vite’s internals, we built a Vite plugin that enables live reload of widgets directly inside ChatGPT. The plugin intercepts resource requests to the MCP server and injects real-time updates into the ChatGPT iframe. Seeing a change in the IDE immediately reflected inside ChatGPT dramatically shortened our feedback loop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnl4y1ocjup313e2mmw60.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnl4y1ocjup313e2mmw60.gif" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  12. Not every test belongs in ChatGPT
&lt;/h3&gt;

&lt;p&gt;Testing on ChatGPT is the gold standard, but for the first iterations, a local emulator can help you move more quickly, especially when you are working on tool definitions that require app reloads in Developer Mode.&lt;/p&gt;

&lt;p&gt;To speed up early iterations, we built a lightweight local emulator that mocks the ChatGPT host environment, complete with debugging tools and apps-specific logs. This allowed us to iterate on React state and layout in milliseconds, reserving real ChatGPT tests for validating model interactions and edge cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  13. Mobile testing requires explicit support
&lt;/h3&gt;

&lt;p&gt;Mobile testing introduced a separate challenge: while tunnelling your local server is necessary for testing in ChatGPT, Vite’s default use of localhost makes the same URL inaccessible from other devices.&lt;/p&gt;

&lt;p&gt;We addressed this by extending our Vite plugin to support domain forwarding on tunnelled ports, which unblocked testing on both iOS and Android devices and made mobile validation part of our regular workflow.&lt;/p&gt;
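&lt;p&gt;For reference, the Vite settings involved look roughly like this (a sketch; &lt;code&gt;allowedHosts&lt;/code&gt; only exists in recent Vite versions, so check the docs for yours):&lt;/p&gt;

```javascript
// vite.config.js sketch: the tunnel domain shown is an example value.
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    host: true, // listen on all interfaces, not just localhost
    // Accept requests whose Host header is the tunnel domain:
    allowedHosts: [".ngrok-free.app"],
  },
});
```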

&lt;h3&gt;
  
  
  14. Familiar abstractions (like React hooks) speed up frontend work
&lt;/h3&gt;

&lt;p&gt;The Apps SDK exposes powerful capabilities, but largely through low-level JavaScript APIs. As longtime React users, we wanted to get closer to concepts we already mastered.&lt;/p&gt;

&lt;p&gt;So we introduced some React-friendly abstractions—hooks like &lt;code&gt;useCallTool&lt;/code&gt;, &lt;code&gt;useWidgetState&lt;/code&gt;, and &lt;code&gt;useLocale&lt;/code&gt;, as well as more advanced state management like &lt;code&gt;createStore&lt;/code&gt; built on Zustand for complex data flows. Reintroducing familiar frontend patterns reduced boilerplate and made widget development feel closer to modern web workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Turning lessons into a Codex Skill
&lt;/h2&gt;

&lt;h3&gt;
  
  
  15. Turn lessons into reusable tooling
&lt;/h3&gt;

&lt;p&gt;As these patterns emerged across multiple apps, it became clear that repeatedly rediscovering them was slowing us down. To make ChatGPT App development faster and more predictable, we decided to encode these lessons directly into our tooling, and not just for ourselves but for the community.&lt;/p&gt;

&lt;p&gt;This led to two complementary efforts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The &lt;a href="https://github.com/alpic-ai/skybridge" rel="noopener noreferrer"&gt;Skybridge Framework&lt;/a&gt;:&lt;/strong&gt; an open-source React framework that packages many of the patterns described in this post into reusable building blocks, including our hooks (&lt;code&gt;useCallTool&lt;/code&gt;, &lt;code&gt;useToolInfo&lt;/code&gt;), the dev tools (HMR and local emulator), and the &lt;code&gt;data-llm&lt;/code&gt; attribute.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The chatgpt-apps-builder &lt;a href="https://github.com/alpic-ai/skybridge/tree/main/skills/chatgpt-app-builder" rel="noopener noreferrer"&gt;Codex Skill&lt;/a&gt;:&lt;/strong&gt; on top of the framework, we built a dedicated Codex Skill to support the full app lifecycle:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ideation:&lt;/strong&gt; brainstorming how to make an app “agentic” rather than just a web port.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code generation:&lt;/strong&gt; writing both the React frontend and the MCP server backend simultaneously, pre-configured with all the right UX and UI patterns.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local testing:&lt;/strong&gt; starting dev servers and connecting local apps to ChatGPT for real-time iteration via hot reload.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;QA and publishing:&lt;/strong&gt; running structured checks against OpenAI’s submission guidelines, including CSP validation, safe-zone considerations, and production testing.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment of the app:&lt;/strong&gt; assisting with the final steps required to ship and iterate on an app.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To install and use the Skill, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npx skills add alpic-ai/skybridge
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/jjdUUdpYO5k"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building ChatGPT Apps requires rethinking how context flows, how interfaces behave, and how users and models collaborate. Many of the lessons in this post came from gaps between familiar web patterns and the realities of agentic systems.&lt;/p&gt;

&lt;p&gt;By sharing these lessons, and by encoding them into our open-source framework and Codex skill, we hope to help teams spend less time rediscovering the same issues and more time exploring what this new interaction model makes possible. The most compelling ChatGPT Apps won’t be simple ports of existing products, but experiences deliberately designed around this new AI-first interaction model.&lt;/p&gt;

</description>
      <category>chatgpt</category>
      <category>mcp</category>
      <category>ui</category>
    </item>
    <item>
      <title>[Template] ChatGPT Apps starter kit (Vite + React + HMR)</title>
      <dc:creator>Erica Beavers</dc:creator>
      <pubDate>Tue, 28 Oct 2025 13:02:24 +0000</pubDate>
      <link>https://dev.to/alpic/template-chatgpt-apps-starter-kit-vite-react-hmr-3lpj</link>
      <guid>https://dev.to/alpic/template-chatgpt-apps-starter-kit-vite-react-hmr-3lpj</guid>
      <description>&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/alpic-ai/apps-sdk-template" rel="noopener noreferrer"&gt;https://github.com/alpic-ai/apps-sdk-template&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OpenAI's ChatGPT Apps SDK lets you build interactive widgets that render inside ChatGPT using MCP. The initial template released a few weeks ago worked more or less, but the dev experience was rough. Every widget change required rebuilding the entire pipeline to get fresh assets.&lt;/p&gt;

&lt;p&gt;That’s why we built a starter template with HMR and the Skybridge framework to simplify the entire workflow:&lt;/p&gt;

&lt;p&gt;What it includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vite dev server&lt;/strong&gt; with HMR running alongside your MCP Express server (one process, instant widget reload in ChatGPT)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skybridge framework&lt;/strong&gt; with file-based conventions that automatically map MCP widget endpoints to React components (name your endpoint &lt;code&gt;pokemon-card&lt;/code&gt;, create &lt;code&gt;web/src/widgets/pokemon-card.tsx&lt;/code&gt;, done)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-click deploy to &lt;a href="//www.alpic.ai"&gt;Alpic&lt;/a&gt;&lt;/strong&gt; with bundling, hosting, and MCP analytics included, or to the platform of your choice&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No lock-in&lt;/strong&gt;: built on the official &lt;code&gt;@modelcontextprotocol/sdk&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Quick start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/alpic-ai/apps-sdk-template
cd apps-sdk-template
pnpm install &amp;amp;&amp;amp; pnpm dev
ngrok http 3000

# Add https://your-url.ngrok-free.app/mcp to ChatGPT Settings → Connectors
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Edit React components in &lt;code&gt;web/src/widgets/&lt;/code&gt; and see changes instantly in ChatGPT. No reconnecting, no rebuilding: the naming convention handles all the wiring automatically.&lt;/p&gt;

&lt;p&gt;A few more details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Skybridge's file-based convention: endpoint name must match widget filename (&lt;code&gt;pokemon-card&lt;/code&gt; endpoint → &lt;code&gt;pokemon-card.tsx&lt;/code&gt; component)&lt;/li&gt;
&lt;li&gt;HMR updates widgets in real-time while MCP server keeps running&lt;/li&gt;
&lt;li&gt;Production build compiles everything and deploys to Alpic in ~30 seconds&lt;/li&gt;
&lt;/ul&gt;
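&lt;p&gt;Conceptually, the convention is nothing more than a deterministic mapping from endpoint name to module path (our illustration below, not Skybridge's actual resolver):&lt;/p&gt;

```javascript
// Illustration of the file-based convention, not the real resolver:
// an endpoint name maps directly to a widget module path.
function widgetPathFor(endpointName) {
  // "pokemon-card" → "web/src/widgets/pokemon-card.tsx"
  return `web/src/widgets/${endpointName}.tsx`;
}

module.exports = widgetPathFor;
```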

&lt;p&gt;Plus, the sample app is Pokemon for a little nostalgia while you’re developing (you can thank us by starring the repo!)&lt;/p&gt;

&lt;p&gt;Happy to answer questions about the implementation or MCP integration patterns! And stay tuned for our next article, which will share insight into how we built the framework ;) &lt;/p&gt;

&lt;p&gt;And a quick demo for the road: &lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwvu2502qzw80a3yyxcq.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwvu2502qzw80a3yyxcq.gif" alt="Demo of developing with chatGPT app starter kit" width="760" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>chatgpt</category>
      <category>react</category>
    </item>
    <item>
      <title>Behind the Kiwi.com MCP server: building an agentic flight booking service</title>
      <dc:creator>Erica Beavers</dc:creator>
      <pubDate>Wed, 27 Aug 2025 14:11:19 +0000</pubDate>
      <link>https://dev.to/alpic/behind-the-kiwicom-mcp-server-building-an-agentic-flight-booking-service-2pdd</link>
      <guid>https://dev.to/alpic/behind-the-kiwicom-mcp-server-building-an-agentic-flight-booking-service-2pdd</guid>
      <description>&lt;p&gt;When Kiwi.com released their MCP server earlier this month, it became one of the first examples of agentic travel booking. This post covers how we helped them build it, what we optimized, and where we think there’s still room to improve.&lt;/p&gt;

&lt;p&gt;If you'd like to try it out, the install guide is here: &lt;a href="https://mcp-install-instructions.alpic.cloud/servers/kiwi-com-flight-search" rel="noopener noreferrer"&gt;https://mcp-install-instructions.alpic.cloud/servers/kiwi-com-flight-search&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why MCP instead of APIs or scraping?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scraping the UI&lt;/strong&gt;: brittle, slow, and expensive (cookie banners, custom date pickers, JavaScript quirks).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direct APIs&lt;/strong&gt;: better, but not designed for LLMs. Hundreds of endpoints return too much irrelevant data, flooding context windows.&lt;/p&gt;

&lt;p&gt;MCP provides a middle ground. It lets developers expose exactly the right tools, with guardrails, so the model can handle the task effectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kiwi.com’s MCP server: first version
&lt;/h2&gt;

&lt;p&gt;The current server exposes a single search-flight tool with the following parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trip type (one-way, round-trip)&lt;/li&gt;
&lt;li&gt;Origin and destination (city or airport)&lt;/li&gt;
&lt;li&gt;Dates with ±3 day flexibility&lt;/li&gt;
&lt;li&gt;Passenger mix (adult, child, infant)&lt;/li&gt;
&lt;li&gt;Cabin class&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each result comes with a direct booking link.&lt;/p&gt;
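&lt;p&gt;Spelled out as JSON Schema, the tool's input surface could look like the sketch below. The field names are our paraphrase of the parameters listed above, not Kiwi.com’s actual schema:&lt;/p&gt;

```javascript
// Hypothetical JSON Schema for the search-flight tool input.
// All field names and enum values are illustrative assumptions.
const searchFlightInputSchema = {
  type: "object",
  required: ["tripType", "origin", "destination", "departureDate"],
  properties: {
    tripType: { type: "string", enum: ["one-way", "round-trip"] },
    origin: { type: "string", description: "City or airport" },
    destination: { type: "string", description: "City or airport" },
    departureDate: { type: "string", format: "date" },
    returnDate: { type: "string", format: "date" },
    // The ±3 day flexibility mentioned above:
    dateFlexibilityDays: { type: "integer", minimum: 0, maximum: 3 },
    passengers: {
      type: "object",
      properties: {
        adults: { type: "integer", minimum: 1 },
        children: { type: "integer", minimum: 0 },
        infants: { type: "integer", minimum: 0 },
      },
    },
    cabinClass: {
      type: "string",
      enum: ["economy", "premium_economy", "business", "first"],
    },
  },
};

module.exports = searchFlightInputSchema;
```

&lt;p&gt;Keeping the surface this small is deliberate: a single well-described tool is far easier for a model to use correctly than hundreds of endpoints.&lt;/p&gt;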

&lt;h2&gt;
  
  
  What we optimized with Alpic
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. One-click deployment &amp;amp; hosting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The server is deployed from a Git repo with standard build commands. Once pushed, it’s live on a secure HTTPS endpoint with a custom domain. Behind the scenes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TLS termination and request parsing handled automatically&lt;/li&gt;
&lt;li&gt;Tool execution in stateless, isolated environments&lt;/li&gt;
&lt;li&gt;Built-in DDoS protection and rate limiting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduced Kiwi.com’s operational overhead so they could iterate quickly and expose the service to real users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Server design choices&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shortened booking links&lt;/strong&gt;: Long URLs eat context and risk breaking. We introduced shortened booking links to keep token usage small.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structured responses&lt;/strong&gt;: Instead of a free-form blob of text, the MCP server instructs the LLM to return results in a table format, making comparison easier for users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fewer, curated results&lt;/strong&gt;: Kiwi.com’s API can return thousands of flights, but the MCP server only sends a few dozen “best” options. This leverages Kiwi’s business logic while preventing the model from making poor tradeoffs (like suggesting 15-hour layovers for €30 savings).&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s missing and what’s next
&lt;/h2&gt;

&lt;p&gt;Right now, the server only handles simple round-trip and one-way searches. Features not yet handled include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-destination itineraries&lt;/li&gt;
&lt;li&gt;Checked bags&lt;/li&gt;
&lt;li&gt;Max duration filtering&lt;/li&gt;
&lt;li&gt;Account login and loyalty integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On the protocol side, new MCP features could open up more options:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Elicitations&lt;/strong&gt;: proactively asking the user to clarify (e.g. “Do you prefer Orly or CDG?” when searching Paris flights).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User preferences&lt;/strong&gt;: storing seat choices, airlines, or price vs. comfort tradeoffs in reusable context.&lt;/p&gt;

&lt;p&gt;Finally, better client-side capabilities will make adoption smoother: server registries (e.g. Claude Directory), less installation friction, and eventually end-to-end booking inside assistants without link redirects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;MCP servers are not just API wrappers, they require deliberate design for LLM usability.&lt;/p&gt;

&lt;p&gt;Optimizing context usage (short links, curated data, structured responses) is critical.&lt;/p&gt;

&lt;p&gt;Kiwi.com’s MCP server is an early step, but it shows how travel booking could work in an agent-native world. We'd love to have your feedback and ideas for future iterations.&lt;/p&gt;

&lt;p&gt;Also, if you’re building with MCP and want one-step deployment, hosting, and AI-specific analytics, check out &lt;a href="https://alpic.ai/" rel="noopener noreferrer"&gt;Alpic.ai&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>aiops</category>
    </item>
    <item>
      <title>Better MCP tools/call Error Responses: Help Your AI Recover Gracefully</title>
      <dc:creator>Frédéric Barthelet</dc:creator>
      <pubDate>Mon, 28 Jul 2025 07:41:02 +0000</pubDate>
      <link>https://dev.to/alpic/better-mcp-toolscall-error-responses-help-your-ai-recover-gracefully-15c7</link>
      <guid>https://dev.to/alpic/better-mcp-toolscall-error-responses-help-your-ai-recover-gracefully-15c7</guid>
      <description>&lt;p&gt;When building MCP servers, we often focus on the happy path: what happens when tools execute successfully. But what about when things go wrong? The quality of your error responses can make the difference between a frustrated user and an AI that recovers gracefully on its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding MCP Error Types: Protocol vs Tool Errors
&lt;/h2&gt;

&lt;p&gt;Before diving into error response strategies, it's crucial to understand the distinction between two types of errors in MCP:&lt;/p&gt;

&lt;h3&gt;
  
  
  MCP Protocol-Level Errors
&lt;/h3&gt;

&lt;p&gt;These are errors in the MCP communication itself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connection closed or request timeout&lt;/li&gt;
&lt;li&gt;Tool not found&lt;/li&gt;
&lt;li&gt;Malformed requests or protocol violations&lt;/li&gt;
&lt;li&gt;Internal server errors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These errors trigger standard JSON-RPC error responses and typically indicate something is fundamentally broken with the request or the server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;-32001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Request Timeout"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tools/call Errors (The Focus of This Article)
&lt;/h3&gt;

&lt;p&gt;These are errors that occur during tool execution. The tool was found and called, but something went wrong during the processing. These should &lt;strong&gt;not&lt;/strong&gt; be returned as MCP protocol errors, but as successful MCP JSON-RPC responses with &lt;code&gt;isError: true&lt;/code&gt; in the result payload.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"An error occurred."&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"isError"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Tools/call Error Responses Are Context, Not Dead Ends
&lt;/h2&gt;

&lt;p&gt;Why bother sharing so many details about the difference between these two error formats? They're both still errors, right? Nothing that needs much attention.&lt;/p&gt;

&lt;p&gt;Wrong! MCP protocol-level errors are captured by the MCP client, eventually surfaced in the UI (like a notification in Claude), and discarded. On the other hand, &lt;strong&gt;tools/call errors are injected back into the LLM context window, just like successful responses&lt;/strong&gt;. Smart error messages can be leveraged by the model as much as any other prompt, giving it a chance to recover from the error without human intervention.&lt;/p&gt;
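&lt;p&gt;Mechanically, the pattern is small: catch failures inside the tool handler and return them as an ordinary result with &lt;code&gt;isError: true&lt;/code&gt;. Here is a sketch (not the MCP SDK’s API; &lt;code&gt;recoveryHint&lt;/code&gt; is a hypothetical field we use to carry an actionable message):&lt;/p&gt;

```javascript
// Sketch of the tools/call error pattern. The wrapper and the
// recoveryHint field are illustrative, not part of the MCP SDK.
function toToolError(message) {
  return {
    content: [{ type: "text", text: message }],
    isError: true,
  };
}

async function safeToolCall(handler, args) {
  try {
    return await handler(args);
  } catch (err) {
    // The message becomes context the model can act on, instead of a
    // protocol error the client discards.
    return toToolError(
      err.recoveryHint ?? "An unknown error happened. Try again immediately."
    );
  }
}

module.exports = { toToolError, safeToolCall };
```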

&lt;p&gt;Most open-source MCP implementations I've seen return generic tool error messages that leave the AI (and users) in the dark. Let's look at what it takes to rework error messages and increase your server's overall quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  3 Use-Cases of Better Error Responses
&lt;/h2&gt;

&lt;p&gt;Here are examples of elevated error messages that improve model task completion rate (the north star metric used to evaluate MCP server quality).&lt;/p&gt;

&lt;h3&gt;
  
  
  Tool Ordering Guidance
&lt;/h3&gt;

&lt;p&gt;If the application's state prevents the model from using a tool for a given resource, provide instructions on how to update that state to make the tool usable. For example, if you're a famous three-letter infrastructure company exposing a tool to terminate an instance, but this tool can only be called when the instance is in a stopped state, say so in the error message.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You can't terminate an instance in the running state. Use the stop_instance tool first on this instance."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isError"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Refined Validation Messages
&lt;/h3&gt;

&lt;p&gt;When tool input validation criteria aren't fully representable in JSON schema, use tool error messages to give the model additional context. If you're a travel company exposing a booking tool on your MCP server and the model accidentally misinterprets the current year for your booking request, you can correct it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The requested travel date cannot be set in the past. You requested travel on July 31st, 2024, but the current date is July 25th, 2025. Did you mean to plan for travel on July 31st, 2025 instead?"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isError"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Smart Unknown Error Handling
&lt;/h3&gt;

&lt;p&gt;Even when you can't provide precise details about an error, give the model instructions on retry strategy and fallback actions to direct the user to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"An unknown error happened. Try again immediately. If it's the 3rd time you're encountering this issue, provide the user with a link to https://mydashboard.example.com/manual-task to perform the task manually."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isError"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Error handling in MCP isn't just about graceful failures—it's about creating collaborative experiences where AI can self-correct and recover. By treating error responses as contextual guidance rather than terminal states, you transform frustrating dead ends into stepping stones toward success.&lt;/p&gt;

&lt;p&gt;Remember: every error response is an opportunity to teach the AI how to do better next time.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What patterns have you found effective for MCP error handling? Share your experiences in the comments below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
    </item>
  </channel>
</rss>
