<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alpic</title>
    <description>The latest articles on DEV Community by Alpic (@alpic).</description>
    <link>https://dev.to/alpic</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F11088%2Fd212e0c5-c450-44b8-b527-2dd28112a60f.png</url>
      <title>DEV Community: Alpic</title>
      <link>https://dev.to/alpic</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alpic"/>
    <language>en</language>
    <item>
      <title>Designing a CLI for Both Humans and Agents</title>
      <dc:creator>Julien Vallini</dc:creator>
      <pubDate>Wed, 15 Apr 2026 10:03:17 +0000</pubDate>
      <link>https://dev.to/alpic/designing-a-cli-for-both-humans-and-agents-4069</link>
      <guid>https://dev.to/alpic/designing-a-cli-for-both-humans-and-agents-4069</guid>
      <description>&lt;p&gt;We recently released the Alpic MCP and CLI, giving users two new interfaces with which they can interact. Designing the Alpic CLI for both humans and agents surfaced a set of challenges and tradeoffs worth writing down!&lt;/p&gt;

&lt;h2&gt;
  
  
  Why does building for agents matter?
&lt;/h2&gt;

&lt;p&gt;Interfaces and layouts have traditionally been designed for humans: easy to understand, with actions that are easy to perform. At Alpic we believe that agents are becoming the new interface: instead of interacting with a system directly, humans interact with an agent that interacts with the system. The human-agent side has largely been solved by LLMs: since agents have been trained mostly on human content, they are very good at understanding humans.&lt;/p&gt;

&lt;p&gt;With the Alpic engineering team, we're committed to solving the remaining challenge: the agent-system interface. In other words, how to give agents the ability to perform the same actions humans do. Besides MCP, which was designed exactly for this purpose, CLIs happen to be a surprisingly good connector. They were designed in the first place for humans to interact with machines textually, so they naturally work well for agents too, which are heavily text-driven. On top of that, CLIs are composable and well represented in training corpora, with plenty of examples of how they should be used.&lt;/p&gt;

&lt;p&gt;But designing a single system (here, a CLI) for both humans and agents means reconciling different requirements. This blog post explores them.&lt;/p&gt;

&lt;h2&gt;
  
  
  How are agents and humans different?
&lt;/h2&gt;

&lt;p&gt;When it comes to CLIs, humans and agents behave surprisingly similarly: both will try calling a command with the &lt;code&gt;--help&lt;/code&gt; flag to be guided toward the right usage.&lt;/p&gt;

&lt;p&gt;The first difference is the &lt;strong&gt;context window&lt;/strong&gt;: as an agent executes subsequent commands, it fills its context window, meaning that every additional call adds to token costs. This means verbose output is expensive — a command that dumps 200 lines of logs costs real money in an agentic loop.&lt;/p&gt;

&lt;p&gt;Agents are also quite &lt;strong&gt;bad at polling&lt;/strong&gt;. A human starting a deployment with a CLI will intuitively wait a minute or two before checking the status, expecting a final state (either deployed or failed). Agents won't wait around doing nothing; unless the command they executed blocks until completion, they'll immediately poll again and again.&lt;/p&gt;

&lt;p&gt;Another difference is the &lt;strong&gt;inability for agents to handle interactive CLIs&lt;/strong&gt;. This may improve in the future, but at the moment, agents are far more efficient sending non-interactive one-shot commands and getting the result as parsable JSON.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Alpic CLI secret sauce
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;--non-interactive&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;All our commands implement a &lt;code&gt;--non-interactive&lt;/code&gt; flag, which allows users to automatically accept confirmation prompts such as "Are you sure you want to…". The goal is to reduce context usage and prevent agents from being blocked by interactive prompts.&lt;/p&gt;
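&lt;p&gt;A minimal sketch of what this gating can look like (the &lt;code&gt;askUser&lt;/code&gt; helper is a made-up stand-in for a real interactive prompt, not the actual Alpic implementation):&lt;/p&gt;

```typescript
// Illustrative sketch: auto-accept confirmation prompts when --non-interactive is set.
const argv = process.argv.slice(2);
const nonInteractive = argv.includes("--non-interactive");

// Stand-in for a real interactive prompt (readline, inquirer, ...).
function askUser(question) {
  return Promise.resolve(false); // a real CLI would wait for y/n input here
}

function confirm(question) {
  if (nonInteractive) {
    // Skip the prompt entirely: the agent is never blocked waiting for input,
    // and no prompt text ends up in its context window.
    return Promise.resolve(true);
  }
  return askUser(question);
}
```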

&lt;p&gt;We also chose not to provide a JSON output format for now. Our tests show that agents are fully capable of understanding output intended for human users, and since JSON is a relatively verbose format, it adds unnecessary overhead to context usage. Additionally, dynamic console artifacts (such as loading spinners) tend to fill the agent's context with noise and should be avoided.&lt;/p&gt;

&lt;p&gt;That said, this space is evolving quickly, and our perspective is still forming. We'd love to hear how others are approaching these tradeoffs. Feel free to share your experiences on our Discord!&lt;/p&gt;

&lt;h3&gt;
  
  
  Use only named parameters
&lt;/h3&gt;

&lt;p&gt;We noticed that agents (and humans too!) struggle with positional parameters in commands. By requiring all parameters to be named, we greatly reduce the risk of confusion, and we avoid a round trip to the documentation or to the &lt;code&gt;--help&lt;/code&gt; flag.&lt;/p&gt;
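&lt;p&gt;A toy parser illustrating the named-only rule (this is not the actual Alpic argument parser, just a sketch of the policy):&lt;/p&gt;

```typescript
// Illustrative sketch: accept only named --key value flags, never positionals.
function parseNamedArgs(argv) {
  const args = {};
  let i = 0;
  while (argv.length > i) {
    const token = argv[i];
    if (!token.startsWith("--")) {
      // Reject positionals outright so neither humans nor agents can misplace them.
      throw new Error("Unexpected positional argument: " + token + ". Use --name value instead.");
    }
    const value = argv[i + 1];
    if (value === undefined) {
      throw new Error("Missing value for " + token);
    }
    args[token.slice(2)] = value;
    i = i + 2;
  }
  return args;
}
```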

&lt;h3&gt;
  
  
  No &lt;code&gt;--cwd&lt;/code&gt; flag to avoid working directory confusion
&lt;/h3&gt;

&lt;p&gt;We decided not to provide a way to choose the working directory of a command. For example, when creating a project with a relative &lt;code&gt;root-dir&lt;/code&gt;: is it relative to the current working directory or to something else? Agents are good at navigating between folders, but bad at checking in which folder they execute a command, so removing &lt;code&gt;--cwd&lt;/code&gt; reduces ambiguity.&lt;/p&gt;

&lt;p&gt;We also added checks on deployment, for example to fail early if the directory the deploy command was executed from is obviously wrong (e.g. an empty directory).&lt;/p&gt;

&lt;h3&gt;
  
  
  Commands should wait rather than return early
&lt;/h3&gt;

&lt;p&gt;Humans are fine with retrying a command every few seconds. Agents are not: they'll either poll aggressively (wasting tokens) or miss the final state entirely. Making long-running commands block until completion is a much better fit for agentic workflows. When an operation genuinely finishes quickly, returning immediately is still best, as it hands control back to the agent to decide what to do next.&lt;/p&gt;

&lt;p&gt;In practice, this means our &lt;code&gt;alpic deploy&lt;/code&gt; command doesn't return until the deployment has either succeeded or failed. And if something stalls, the CLI gives up and returns an explicit error rather than leaving the agent waiting forever.&lt;/p&gt;
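&lt;p&gt;The blocking behaviour can be sketched as a poll-until-terminal loop with a hard deadline (&lt;code&gt;fetchStatus&lt;/code&gt; is a hypothetical API call, and the timeout values are illustrative):&lt;/p&gt;

```typescript
// Illustrative sketch: block until the deployment reaches a terminal state,
// with a hard timeout so the agent is never left waiting forever.
// fetchStatus is assumed to return "building", "deployed", or "failed".
async function waitForDeployment(fetchStatus, timeoutMs = 600000, intervalMs = 5000) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await fetchStatus();
    if (status === "deployed" || status === "failed") {
      return status; // terminal state: hand control back with a definitive answer
    }
    if (Date.now() > deadline) {
      throw new Error("Deployment did not finish in time; check the dashboard.");
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```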

&lt;h3&gt;
  
  
  Explicit command names, not abbreviated
&lt;/h3&gt;

&lt;p&gt;Unlike humans, agents don't mind typing a few extra keystrokes. To reduce ambiguity, all of our commands and parameters use full words. For example, we named our command &lt;code&gt;alpic environment-variable&lt;/code&gt; instead of &lt;code&gt;alpic env&lt;/code&gt;, which could be mistaken for a command that manages environments rather than environment variables. This added verbosity slightly increases token usage, but our tests show it's a worthwhile tradeoff: clarity wins over minimal token usage.&lt;/p&gt;

&lt;h3&gt;
  
  
  Being stateless (not relying on implicit server state)
&lt;/h3&gt;

&lt;p&gt;Humans know their own context: they may know, for example, whether they've already deployed their project successfully. An agent doesn't carry that context between sessions. Calling a command such as &lt;code&gt;alpic deployments inspect&lt;/code&gt; without any parameter, expecting it to return the latest deployment, requires implicit knowledge of what's on the server that agents can't reliably track.&lt;/p&gt;

&lt;p&gt;For example, inspecting a deployment requires explicitly passing either a &lt;code&gt;--deployment-id&lt;/code&gt; or an &lt;code&gt;--environment-id&lt;/code&gt;, and these are mutually exclusive. Even retrieving the "latest deployment" must be scoped to a specific environment via &lt;code&gt;--environment-id&lt;/code&gt;, rather than relying on hidden defaults.&lt;/p&gt;

&lt;p&gt;We thus chose to always require explicit parameters, ensuring that every command is deterministic and does not depend on implicit or session-specific context.&lt;/p&gt;
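&lt;p&gt;A sketch of how such mutual exclusivity can be enforced (the flag names come from the example above; the helper itself is illustrative):&lt;/p&gt;

```typescript
// Illustrative sketch: require exactly one of two mutually exclusive identifiers.
function resolveInspectTarget(flags) {
  const hasDeployment = typeof flags.deploymentId === "string";
  const hasEnvironment = typeof flags.environmentId === "string";
  if (hasDeployment === hasEnvironment) {
    // Both set, or neither set: either way the request is ambiguous.
    throw new Error("Pass exactly one of --deployment-id or --environment-id.");
  }
  return hasDeployment
    ? { kind: "deployment", id: flags.deploymentId }
    : { kind: "environment", id: flags.environmentId };
}
```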

&lt;p&gt;We also state explicitly in the console when a flag has been deduced from a linked project. This helps agents understand what happened, and it's actually better for humans too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Most of our optimisations ultimately aim to reduce ambiguity and to ensure that CLI output and parameters make as few assumptions as possible. Humans may need to type a few more keystrokes, but that's a fair price for the self-documenting effect of clear, full-word, named parameters and commands.&lt;/p&gt;

&lt;p&gt;Our goal — which is also how we measure our CLI's agent-readiness — is to make sure we can develop, deploy, and monitor apps while only interacting with an agent. If the agent is the only interface our users need to interact with Alpic and the experience is smooth, we consider our CLI successful.&lt;/p&gt;

&lt;p&gt;While our understanding of agent usage is still evolving and our experiments may shift our perspective over time, we're committed to building our CLI as a reference system where both agents and humans are treated as first-class citizens. Models are improving, and new agentic frameworks and systems are appearing every day — we look forward to keeping our CLI at the forefront and in step with these emerging use cases.&lt;/p&gt;

&lt;p&gt;Head to our documentation to give the Alpic CLI a try.&lt;/p&gt;

</description>
      <category>cli</category>
      <category>ai</category>
      <category>devtools</category>
      <category>opensource</category>
    </item>
    <item>
      <title>15 Lessons Learned Building ChatGPT Apps</title>
      <dc:creator>Nikolay Rodionov</dc:creator>
      <pubDate>Mon, 16 Feb 2026 14:00:00 +0000</pubDate>
      <link>https://dev.to/alpic/15-lessons-learned-building-chatgpt-apps-2i89</link>
      <guid>https://dev.to/alpic/15-lessons-learned-building-chatgpt-apps-2i89</guid>
      <description>&lt;p&gt;At &lt;a href="https://alpic.ai" rel="noopener noreferrer"&gt;Alpic&lt;/a&gt;, we believe the next generation of products and services will be built around &lt;strong&gt;AI-first experiences&lt;/strong&gt;, interfaces where users collaborate with models instead of navigating traditional, predetermined UI workflows.&lt;/p&gt;

&lt;p&gt;When OpenAI released the &lt;strong&gt;Apps SDK&lt;/strong&gt;, we immediately started building with it. Over the course of three months, we developed two dozen ChatGPT Apps for both internal use and for our customers across B2B and B2C spaces such as &lt;strong&gt;travel, retail, and SaaS&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;What we discovered early on is that &lt;strong&gt;building ChatGPT Apps is fundamentally different from building traditional web or mobile applications&lt;/strong&gt;. Patterns that work well on the web (just-in-time data fetching, UI-driven state, explicit user configuration, etc.) often break down or actively harm the experience in an agentic environment.&lt;/p&gt;

&lt;p&gt;This post is a distilled set of the &lt;strong&gt;15 most important lessons&lt;/strong&gt; we learned while building real-world ChatGPT Apps, followed by how we incorporated those lessons into an open-source framework for the community, &lt;a href="https://github.com/alpic-ai/skybridge" rel="noopener noreferrer"&gt;&lt;strong&gt;Skybridge&lt;/strong&gt;&lt;/a&gt;, and a &lt;a href="https://github.com/alpic-ai/skybridge/tree/main/skills/chatgpt-app-builder" rel="noopener noreferrer"&gt;&lt;strong&gt;Codex Skill&lt;/strong&gt;&lt;/a&gt; to help developers ideate, build, test, and ship Apps significantly faster.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three-body problem
&lt;/h2&gt;

&lt;p&gt;With traditional web apps, things were simple: you only had a &lt;strong&gt;user&lt;/strong&gt; and a &lt;strong&gt;UI&lt;/strong&gt;. In a ChatGPT app, a third body enters the system: the &lt;strong&gt;model&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;One of the hardest parts of building for ChatGPT is managing how information flows between this trio. If a user clicks a “Select” button in your widget, the UI updates visually, but the model, the brain of the conversation, remains unaware unless you explicitly surface that context. If the user then asks, &lt;em&gt;“Give me more details about this product,”&lt;/em&gt; the model has no idea what the user is actually looking at.&lt;/p&gt;

&lt;p&gt;We call this &lt;strong&gt;context asymmetry&lt;/strong&gt;: each body has partial knowledge of the system, and no single one has the full picture. Building good ChatGPT Apps isn’t about keeping everything in sync, but about deciding &lt;em&gt;what&lt;/em&gt; information should be shared, &lt;em&gt;when&lt;/em&gt; it should be shared, and &lt;em&gt;who&lt;/em&gt; needs visibility into it. Getting this right is the difference between a clunky app and a seamless agentic experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Not all context should be shared
&lt;/h3&gt;

&lt;p&gt;Our initial instinct was to “just share everything everywhere.” That turned out to be one of our earliest mistakes.&lt;/p&gt;

&lt;p&gt;In practice, different parts of a ChatGPT App often need &lt;em&gt;intentionally different&lt;/em&gt; views of the same state. Why?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;For performance:&lt;/strong&gt; UI widgets often need far more data than the model should ever see: in a travel booking app, for example, images, pricing variants, and preloaded options. Sending all of this to the model would increase token usage, latency, and cognitive noise.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;For logic:&lt;/strong&gt; some information must remain asymmetric by design. In one of our earliest apps, a &lt;em&gt;Murder in the Valleys&lt;/em&gt; mystery game, the model needs to know who the killer is to roleplay correctly, while the UI and user must not. In a &lt;em&gt;Time’s Up&lt;/em&gt;-style game, the situation is reversed: the UI shows the secret word to the user, while the model must remain unaware.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The lesson wasn’t “always sync everything,” but rather: &lt;strong&gt;decide explicitly who needs to know what&lt;/strong&gt;. We formalized this using different &lt;em&gt;tool output&lt;/em&gt; fields:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Visible to&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;structuredContent&lt;/td&gt;
&lt;td&gt;Typed data for the widget and model&lt;/td&gt;
&lt;td&gt;Both widget and model (via toolOutput and callTool functions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;_meta&lt;/td&gt;
&lt;td&gt;Response metadata&lt;/td&gt;
&lt;td&gt;Widget only, hidden from the model&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For example, in the Time’s Up game, we pass the secret word only to the widget, via the &lt;code&gt;_meta&lt;/code&gt; field, letting the model guess the word from the user’s hints.&lt;/p&gt;
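&lt;p&gt;As a sketch, a tool result following this split might look like the following (the field names match the table above; the game payload itself is invented for the example):&lt;/p&gt;

```typescript
// Illustrative sketch of a tool result that splits state between model and widget.
function buildTimesUpToolResult(secretWord) {
  return {
    // structuredContent: visible to both the widget and the model.
    structuredContent: { round: 1, phase: "guessing" },
    // _meta: visible to the widget only, so the model has to guess from hints.
    _meta: { secretWord: secretWord },
  };
}
```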

&lt;h3&gt;
  
  
  2. Lazy-loading doesn’t translate well to AI apps
&lt;/h3&gt;

&lt;p&gt;Coming from web development, we defaulted to lazy-loading: fetching data when the user clicks; loading details on demand; optimizing for minimal upfront payloads.&lt;/p&gt;

&lt;p&gt;In ChatGPT, the paradigm is reversed: tool calls imply delays, often taking several seconds due to security sandboxing and model reasoning.&lt;/p&gt;

&lt;p&gt;In practice, we learned to front-load aggressively: sending as much data as possible into the initial tool response, and hydrating the widget via &lt;em&gt;window.openai.toolOutput&lt;/em&gt;. This almost always resulted in a faster and more responsive experience.&lt;/p&gt;
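&lt;p&gt;A sketch of widget-side hydration under this approach (&lt;code&gt;window.openai.toolOutput&lt;/code&gt; is the host-provided field mentioned above; the product shape is invented for the example):&lt;/p&gt;

```typescript
// Illustrative sketch: hydrate the widget from the front-loaded tool output
// instead of lazy-fetching on user interaction.
function hydrateWidget(host) {
  const output = host.openai ? host.openai.toolOutput : undefined;
  if (!output) {
    // Render a skeleton until the host injects the data.
    return { products: [], ready: false };
  }
  // Everything needed for first paint was sent in the initial tool response.
  return { products: output.products, ready: true };
}
```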

&lt;p&gt;Of course, if the widget can safely fetch data from a public API endpoint, and doesn’t need to share information with the model, it’s always possible to use classic XHR calls inside your widget, but most of the time you want the model to be able to call tools autonomously to keep the experience conversational.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. The model needs visibility
&lt;/h3&gt;

&lt;p&gt;A subtle but critical problem arises when the user interacts with a widget (e.g., selecting a specific product in a list) and then asks a question in the chat. If the model doesn’t know what part of the UI the user is referring to, it won’t be able to answer correctly.&lt;/p&gt;

&lt;p&gt;For this we used &lt;code&gt;window.openai.setWidgetState(state)&lt;/code&gt;, which allows you to store specific state data that is added to the model’s context on the next user-model interaction.&lt;/p&gt;
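&lt;p&gt;In its imperative form, that pattern looks roughly like this (the product object and summary text are made up for the example):&lt;/p&gt;

```typescript
// Illustrative sketch: push UI context to the model on a meaningful interaction.
function onProductSelected(host, product) {
  // The host persists this state and adds it to the model's context
  // on the next user-model interaction.
  host.openai.setWidgetState({
    selectedProductId: product.id,
    summaryForModel: "User selected " + product.name,
  });
}
```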

&lt;p&gt;With apps growing in complexity, we saw that we were adding &lt;code&gt;setWidgetState&lt;/code&gt; in a lot of places for the model to keep track of the navigation. So we decided to introduce a declarative way to describe UI context. Instead of updating the model imperatively on every interaction, we attach a &lt;code&gt;data-llm&lt;/code&gt; attribute directly to components:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;div
  data-llm={
    selectedTab === "details"
      ? "User is viewing product details"
      : "User is viewing reviews"
  }
&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For this to work behind the scenes, we built a Vite plugin that scrapes these attributes and automatically updates the widgetState. From the model’s perspective, it simply receives the relevant UI context at the right time, without developers having to manually synchronize every interaction.&lt;/p&gt;

&lt;p&gt;You can find this Vite plugin (and many other tips we share in this article) in the &lt;a href="https://github.com/alpic-ai/skybridge" rel="noopener noreferrer"&gt;open-source framework&lt;/a&gt; we created to share our learnings with the community.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Different interactions require different APIs
&lt;/h3&gt;

&lt;p&gt;ChatGPT Apps involve multiple interaction paths between the widget, the server, and the model. These paths are not interchangeable: each exists to support a different kind of interaction.&lt;/p&gt;

&lt;p&gt;One of the key lessons in building ChatGPT Apps is making these communication paths explicit, and being intentional about which mechanism is responsible for which part of the experience.&lt;/p&gt;

&lt;p&gt;Mapping out that path looks something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fry2sts21ogebwek1kd4c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fry2sts21ogebwek1kd4c.jpg" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;These lessons establish the foundations of a ChatGPT App: how context is shared, how the model gains visibility, and how different interactions propagate through the system. The next section builds on this foundation and focuses on the implications for UI design.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reinventing UI for AI
&lt;/h2&gt;

&lt;p&gt;ChatGPT Apps are a completely new environment, so we quickly learned to set aside our preconceived notions about UI and use the new capabilities fully. This section covers interface design assumptions that we needed to learn (and unlearn) to create effective apps.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. UI must adapt to multiple display modes, and their constraints
&lt;/h3&gt;

&lt;p&gt;ChatGPT Apps don’t live in a single layout. Depending on how and when they’re invoked, the same widget can be rendered in three different display modes.&lt;/p&gt;

&lt;p&gt;Apps can appear &lt;strong&gt;inline&lt;/strong&gt; in the conversation, in &lt;strong&gt;picture-in-picture (PiP)&lt;/strong&gt; on top of it, or in &lt;strong&gt;fullscreen&lt;/strong&gt; when more space is needed. While PiP and fullscreen enable richer interfaces, they also introduce UI overlays that the widget doesn’t control. Accounting for device-specific safe zones, such as the persistent close button on mobile, is essential to avoid clipped content and to optimize interactions.&lt;/p&gt;

&lt;p&gt;Over time, we identified patterns around display modes and when to use them:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Display mode&lt;/th&gt;
&lt;th&gt;What it looks like&lt;/th&gt;
&lt;th&gt;When to use it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inline&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Default display mode. The widget stays in the conversation history.&lt;/td&gt;
&lt;td&gt;For quick interactions.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fullscreen&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Widget takes up the entire screen, with the chat bar at the bottom.&lt;/td&gt;
&lt;td&gt;If your widget is complex and needs a lot of space (e.g., maps).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Picture-in-Picture&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Same size as inline, but the widget stays on top of the conversation.&lt;/td&gt;
&lt;td&gt;If your widget remains relevant during conversational follow-ups after generation.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  6. UI consistency matters in an embedded environment
&lt;/h3&gt;

&lt;p&gt;Early on, one uncertainty we ran into was how much visual freedom a ChatGPT App should take. As a new interface for users, it needed to feel familiar and consistent, both within our own apps and with the surrounding ChatGPT ecosystem. Unlike a standalone product, a widget lives inside an existing interface, where visual inconsistencies stand out immediately.&lt;/p&gt;

&lt;p&gt;Fortunately, the &lt;a href="https://github.com/openai/apps-sdk-ui" rel="noopener noreferrer"&gt;OpenAI Apps SDK UI Kit&lt;/a&gt; gave us a clear baseline.&lt;/p&gt;

&lt;p&gt;Built on Tailwind CSS, it provides ready-to-use components, icons, and design tokens that align with ChatGPT’s design system. Using it allowed us to move quickly while ensuring our widgets felt native and visually consistent with the surrounding interface, even when building custom components (for example, for our Mapbox integration).&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Language-first filtering
&lt;/h3&gt;

&lt;p&gt;Traditional dashboards are built on sidebars full of checkboxes and range sliders. In agentic UI, this is often a regression. When users can express intent directly in natural language, for example, “Sunny destinations in Europe for under $200,” forcing them through multiple UI controls adds friction. They should be able to just say it.&lt;/p&gt;

&lt;p&gt;We therefore decided to go the way of “no filters” for most of our apps. Instead of a sidebar with options to filter and sort, we provide the model with a &lt;strong&gt;List of Values (LOV)&lt;/strong&gt; for our tool parameters.&lt;/p&gt;

&lt;p&gt;This allows the model to take the user’s message as input directly, preventing it from “guessing” what options are available. In other words, it can map natural language directly onto our backend’s API requirements. If a user says “sunny,” the model knows to call the tool with &lt;code&gt;weather="sunny"&lt;/code&gt;.&lt;/p&gt;
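&lt;p&gt;Concretely, the List of Values can live in the tool’s input schema as &lt;code&gt;enum&lt;/code&gt; values; the parameter names below are invented for illustration:&lt;/p&gt;

```typescript
// Illustrative sketch: a JSON-Schema-style tool input with a List of Values,
// so the model maps natural language straight onto allowed API values.
const searchDestinationsInputSchema = {
  type: "object",
  properties: {
    // "Sunny destinations" maps directly to weather: "sunny".
    weather: { type: "string", enum: ["sunny", "mild", "snowy"] },
    maxPriceUsd: { type: "number" },
    region: { type: "string", enum: ["europe", "asia", "americas"] },
  },
  required: ["region"],
};
```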

&lt;h3&gt;
  
  
  8. Files can unlock richer interactions
&lt;/h3&gt;

&lt;p&gt;One lesson that emerged as we built more complex apps is that files shouldn’t be treated as secondary inputs. In ChatGPT Apps, files can unlock new interactions. Instead of starting from forms or filters, experiences can start from something the user already has.&lt;/p&gt;

&lt;p&gt;For example, in an ecommerce app, a user can upload a photo of a product in the chat, have the model identify it, and then continue into product matching or discovery directly in the widget.&lt;/p&gt;

&lt;p&gt;This is made possible by letting files flow through both sides of the system. On the model side, tools can directly consume files uploaded in the chat via &lt;code&gt;openai/fileParams&lt;/code&gt;, allowing the model to reason over images or other user-provided assets. On the UI side, widgets can also work with files directly using &lt;code&gt;window.openai.uploadFile&lt;/code&gt; and &lt;code&gt;window.openai.getFileDownloadUrl&lt;/code&gt;, making it possible to request uploads as part of the UI flow or generate files users can download and reuse.&lt;/p&gt;

&lt;h2&gt;
  
  
  Going to production
&lt;/h2&gt;

&lt;p&gt;Next, as apps move beyond local development, a different set of considerations comes into play around security, configuration, and tooling. That’s what this third set of lessons covers.&lt;/p&gt;

&lt;h3&gt;
  
  
  9. CSPs are the new CORS
&lt;/h3&gt;

&lt;p&gt;For security reasons, OpenAI renders Apps inside a double-nested iframe. Content Security Policies (CSPs) are a native mechanism of iframe isolation, and this setup enforces them strictly, often surfacing as the classic “it works locally but breaks in production” syndrome.&lt;/p&gt;

&lt;p&gt;Unlike traditional web dev where you might get away with a loose policy, the Apps SDK requires you to be surgical.&lt;/p&gt;

&lt;p&gt;In the app manifest, this means carefully declaring which domains are allowed for each type of interaction:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Field&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;th&gt;Common mistakes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;connectDomains&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;API &amp;amp; XHR requests&lt;/td&gt;
&lt;td&gt;&lt;a href="https://api.weather.com" rel="noopener noreferrer"&gt;https://api.weather.com&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Forgetting the staging API vs. production.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;resourceDomains&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Images, fonts, scripts&lt;/td&gt;
&lt;td&gt;&lt;a href="https://cdn.jsdelivr.net" rel="noopener noreferrer"&gt;https://cdn.jsdelivr.net&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Using a generic CDN like jsdelivr.net without whitelisting it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;frameDomains&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Embedding iframes&lt;/td&gt;
&lt;td&gt;&lt;a href="https://www.youtube.com" rel="noopener noreferrer"&gt;https://www.youtube.com&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Embedding a YouTube video or Mapbox instance without whitelisting it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;redirectDomains&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;External links opened without warnings&lt;/td&gt;
&lt;td&gt;&lt;a href="https://app.alpic.ai" rel="noopener noreferrer"&gt;https://app.alpic.ai&lt;/a&gt;&lt;/td&gt;
&lt;td&gt;Forgetting the checkout or OAuth callback domain.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
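&lt;p&gt;Put together, an illustrative declaration of these fields might look like this (the field names come from the table above; the domains are placeholders, and remember to list every environment you call, staging included):&lt;/p&gt;

```typescript
// Illustrative sketch of the four CSP-related manifest fields from the table above.
const cspConfig = {
  // API and XHR requests the widget is allowed to make.
  connectDomains: ["https://api.weather.com", "https://api-staging.weather.com"],
  // Images, fonts, and scripts the widget may load.
  resourceDomains: ["https://cdn.jsdelivr.net"],
  // Domains the widget may embed as iframes.
  frameDomains: ["https://www.youtube.com"],
  // External links that open without an interstitial warning.
  redirectDomains: ["https://app.alpic.ai"],
};
```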

&lt;p&gt;Treating CSP configuration as a first-class concern early on saved us a significant amount of production debugging later.&lt;/p&gt;

&lt;h3&gt;
  
  
  10. Small widget flags have outsized impact
&lt;/h3&gt;

&lt;p&gt;Beyond CSPs, a small set of widget-level settings determines how control is shared between the widget, the model, and the host environment. These flags are easy to overlook, but they define critical boundaries for navigation, tool access, and publishing.&lt;/p&gt;

&lt;h4&gt;
  
  
  Host and navigation boundaries
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;widgetDomain&lt;/code&gt;&lt;/strong&gt; is required for submission. It defines the default location where the “Open in ” button points in fullscreen mode and participates in origin whitelisting, since widgets are rendered under &lt;code&gt;&amp;lt;widgetDomain&amp;gt;.web-sandbox.oaiusercontent.com&lt;/code&gt;. We used &lt;code&gt;setOpenInAppUrl&lt;/code&gt; to route users to the appropriate path based on context.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Model and tool boundaries
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool annotations&lt;/strong&gt; must follow publishing guidelines. Flags like &lt;code&gt;readOnly&lt;/code&gt;, &lt;code&gt;destructiveHint&lt;/code&gt;, and &lt;code&gt;openWorldHint&lt;/code&gt; are required and validated during submission.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool visibility&lt;/strong&gt; matters: tools that should not be callable by the model must be explicitly marked as private.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Widget execution boundaries
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;widgetAccessible&lt;/code&gt;&lt;/strong&gt; controls whether the widget can call tools on its own using &lt;code&gt;callTool&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Individually these settings are small, but together they determine whether an app behaves correctly once published.&lt;/p&gt;

&lt;h2&gt;
  
  
  Optimizing for fast iteration
&lt;/h2&gt;

&lt;p&gt;The Apps SDK is evolving rapidly, and we’ve been excited to build alongside it. To support a smooth and efficient development workflow, we built our own open-source framework and shared it with the community. The following lessons address the developer-experience issues we ran into early on.&lt;/p&gt;

&lt;h3&gt;
  
  
  11. Fast iteration requires hot reload
&lt;/h3&gt;

&lt;p&gt;One of the first things we tackled was iteration speed. The combination of long-TTL resource caching and the use of JSON-RPC to forward the resources makes standard hot module reload (as found in Vite or Next.js) incompatible with ChatGPT Apps out of the box.&lt;/p&gt;

&lt;p&gt;After spending considerable time understanding Vite’s internals, we built a Vite plugin that enables live reload of widgets directly inside ChatGPT. The plugin intercepts resource requests to the MCP server and injects real-time updates into the ChatGPT iframe. Seeing a change in the IDE immediately reflected inside ChatGPT dramatically shortened our feedback loop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnl4y1ocjup313e2mmw60.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnl4y1ocjup313e2mmw60.gif" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  12. Not every test belongs in ChatGPT
&lt;/h3&gt;

&lt;p&gt;Testing on ChatGPT is the gold standard, but for the first iterations, a local emulator can help you move more quickly, especially when you are working on tool definitions that require app reloads in Developer Mode.&lt;/p&gt;

&lt;p&gt;To speed up early iterations, we built a lightweight local emulator that mocks the ChatGPT host environment, complete with debugging tools and apps-specific logs. This allowed us to iterate on React state and layout in milliseconds, reserving real ChatGPT tests for validating model interactions and edge cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  13. Mobile testing requires explicit support
&lt;/h3&gt;

&lt;p&gt;Mobile testing introduced a separate challenge: while tunnelling your local server is necessary for testing in ChatGPT, Vite’s default use of localhost makes the same URL inaccessible from other devices.&lt;/p&gt;

&lt;p&gt;We addressed this by extending our Vite plugin to support domain forwarding on tunnelled ports, which unblocked testing on both iOS and Android devices and made mobile validation part of our regular workflow.&lt;/p&gt;
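&lt;p&gt;For reference, the Vite settings involved look roughly like this (a sketch; &lt;code&gt;allowedHosts&lt;/code&gt; only exists in recent Vite versions, so check the docs for yours):&lt;/p&gt;

```javascript
// vite.config.js sketch: the tunnel domain shown is an example value.
import { defineConfig } from "vite";

export default defineConfig({
  server: {
    host: true, // listen on all interfaces, not just localhost
    // Accept requests whose Host header is the tunnel domain:
    allowedHosts: [".ngrok-free.app"],
  },
});
```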

&lt;h3&gt;
  
  
  14. Familiar abstractions (like React hooks) speed up frontend work
&lt;/h3&gt;

&lt;p&gt;The Apps SDK exposes powerful capabilities, but largely through low-level JavaScript APIs. As longtime React users, we wanted to get closer to concepts we already mastered.&lt;/p&gt;

&lt;p&gt;So we introduced some React-friendly abstractions—hooks like &lt;code&gt;useCallTool&lt;/code&gt;, &lt;code&gt;useWidgetState&lt;/code&gt;, and &lt;code&gt;useLocale&lt;/code&gt;, as well as more advanced state management like &lt;code&gt;createStore&lt;/code&gt; built on Zustand for complex data flows. Reintroducing familiar frontend patterns reduced boilerplate and made widget development feel closer to modern web workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Turning lessons into a Codex Skill
&lt;/h2&gt;

&lt;h3&gt;
  
  
  15. Turn lessons into reusable tooling
&lt;/h3&gt;

&lt;p&gt;As these patterns emerged across multiple apps, it became clear that repeatedly rediscovering them was slowing us down. To make ChatGPT App development faster and more predictable, we decided to encode these lessons directly into our tooling, and not just for ourselves but for the community.&lt;/p&gt;

&lt;p&gt;This led to two complementary efforts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The &lt;a href="https://github.com/alpic-ai/skybridge" rel="noopener noreferrer"&gt;Skybridge Framework&lt;/a&gt;:&lt;/strong&gt; an open-source React framework that packages many of the patterns described in this post into reusable building blocks, including our hooks (&lt;code&gt;useCallTool&lt;/code&gt;, &lt;code&gt;useToolInfo&lt;/code&gt;), the dev tools (HMR and local emulator), and the &lt;code&gt;data-llm&lt;/code&gt; attribute.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The chatgpt-apps-builder &lt;a href="https://github.com/alpic-ai/skybridge/tree/main/skills/chatgpt-app-builder" rel="noopener noreferrer"&gt;Codex Skill&lt;/a&gt;:&lt;/strong&gt; on top of the framework, we built a dedicated Codex Skill to support the full app lifecycle:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ideation:&lt;/strong&gt; brainstorming how to make an app “agentic” rather than just a web port.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Code generation:&lt;/strong&gt; writing both the React frontend and the MCP server backend simultaneously, pre-configured with all the right UX and UI patterns.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local testing:&lt;/strong&gt; starting dev servers and connecting local apps to ChatGPT for real-time iteration via hot reload.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;QA and publishing:&lt;/strong&gt; running structured checks against OpenAI’s submission guidelines, including CSP validation, safe-zone considerations, and production testing.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deployment of the app:&lt;/strong&gt; assisting with the final steps required to ship and iterate on an app.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;To install and use the Skill, run the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;npx skills add alpic-ai/skybridge
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;

  &lt;iframe src="https://www.youtube.com/embed/jjdUUdpYO5k"&gt;
  &lt;/iframe&gt;


&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building ChatGPT Apps requires rethinking how context flows, how interfaces behave, and how users and models collaborate. Many of the lessons in this post came from gaps between familiar web patterns and the realities of agentic systems.&lt;/p&gt;

&lt;p&gt;By sharing these lessons, and by encoding them into our open-source framework and Codex skill, we hope to help teams spend less time rediscovering the same issues and more time exploring what this new interaction model makes possible. The most compelling ChatGPT Apps won’t be simple ports of existing products, but experiences deliberately designed around this new AI-first interaction model.&lt;/p&gt;

</description>
      <category>chatgpt</category>
      <category>mcp</category>
      <category>ui</category>
    </item>
    <item>
      <title>[Template] ChatGPT Apps starter kit (Vite + React + HMR)</title>
      <dc:creator>Erica Beavers</dc:creator>
      <pubDate>Tue, 28 Oct 2025 13:02:24 +0000</pubDate>
      <link>https://dev.to/alpic/template-chatgpt-apps-starter-kit-vite-react-hmr-3lpj</link>
      <guid>https://dev.to/alpic/template-chatgpt-apps-starter-kit-vite-react-hmr-3lpj</guid>
      <description>&lt;p&gt;&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/alpic-ai/apps-sdk-template" rel="noopener noreferrer"&gt;https://github.com/alpic-ai/apps-sdk-template&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OpenAI's ChatGPT Apps SDK lets you build interactive widgets that render inside ChatGPT using MCP. The initial template released a few weeks ago worked more or less, but the dev experience was rough. Every widget change required rebuilding the entire pipeline to get fresh assets.&lt;/p&gt;

&lt;p&gt;That’s why we built a starter template with HMR and the Skybridge framework to simplify the entire workflow:&lt;/p&gt;

&lt;p&gt;What it includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vite dev server&lt;/strong&gt; with HMR running alongside your MCP Express server (one process, instant widget reload in ChatGPT)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Skybridge framework&lt;/strong&gt; with file-based conventions that automatically map MCP widget endpoints to React components (name your endpoint &lt;code&gt;pokemon-card&lt;/code&gt;, create &lt;code&gt;web/src/widgets/pokemon-card.tsx&lt;/code&gt;, done)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-click deploy to &lt;a href="//www.alpic.ai"&gt;Alpic&lt;/a&gt;&lt;/strong&gt; with bundling, hosting, and MCP analytics included, or to the platform of your choice&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No lock-in&lt;/strong&gt;: built on the official &lt;code&gt;@modelcontextprotocol/sdk&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Quick start:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/alpic-ai/apps-sdk-template
cd apps-sdk-template
pnpm install &amp;amp;&amp;amp; pnpm dev
ngrok http 3000

# Add https://your-url.ngrok-free.app/mcp to ChatGPT Settings → Connectors
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Edit React components in &lt;code&gt;web/src/widgets/&lt;/code&gt; and see changes instantly in ChatGPT. No reconnecting, no rebuilding: the naming convention handles all the wiring automatically.&lt;/p&gt;

&lt;p&gt;A few more details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Skybridge's file-based convention: endpoint name must match widget filename (&lt;code&gt;pokemon-card&lt;/code&gt; endpoint → &lt;code&gt;pokemon-card.tsx&lt;/code&gt; component)&lt;/li&gt;
&lt;li&gt;HMR updates widgets in real-time while MCP server keeps running&lt;/li&gt;
&lt;li&gt;Production build compiles everything and deploys to Alpic in ~30 seconds&lt;/li&gt;
&lt;/ul&gt;
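&lt;p&gt;Conceptually, the convention is nothing more than a deterministic mapping from endpoint name to module path (our illustration below, not Skybridge's actual resolver):&lt;/p&gt;

```javascript
// Illustration of the file-based convention, not the real resolver:
// an endpoint name maps directly to a widget module path.
function widgetPathFor(endpointName) {
  // "pokemon-card" → "web/src/widgets/pokemon-card.tsx"
  return `web/src/widgets/${endpointName}.tsx`;
}

module.exports = widgetPathFor;
```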

&lt;p&gt;Plus, the sample app is Pokemon for a little nostalgia while you’re developing (you can thank us by starring the repo!)&lt;/p&gt;

&lt;p&gt;Happy to answer questions about the implementation or MCP integration patterns! And stay tuned for our next article, which will share insight into how we built the framework ;) &lt;/p&gt;

&lt;p&gt;And a quick demo for the road: &lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwvu2502qzw80a3yyxcq.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwvu2502qzw80a3yyxcq.gif" alt="Demo of developing with chatGPT app starter kit" width="760" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>chatgpt</category>
      <category>react</category>
    </item>
    <item>
      <title>Behind the Kiwi.com MCP server: building an agentic flight booking service</title>
      <dc:creator>Erica Beavers</dc:creator>
      <pubDate>Wed, 27 Aug 2025 14:11:19 +0000</pubDate>
      <link>https://dev.to/alpic/behind-the-kiwicom-mcp-server-building-an-agentic-flight-booking-service-2pdd</link>
      <guid>https://dev.to/alpic/behind-the-kiwicom-mcp-server-building-an-agentic-flight-booking-service-2pdd</guid>
      <description>&lt;p&gt;When Kiwi.com released their MCP server earlier this month, it became one of the first examples of agentic travel booking. This post covers how we helped them build it, what we optimized, and where we think there’s still room to improve.&lt;/p&gt;

&lt;p&gt;If you'd like to try it out, the install guide is here: &lt;a href="https://mcp-install-instructions.alpic.cloud/servers/kiwi-com-flight-search" rel="noopener noreferrer"&gt;https://mcp-install-instructions.alpic.cloud/servers/kiwi-com-flight-search&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why MCP instead of APIs or scraping?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scraping the UI&lt;/strong&gt;: brittle, slow, and expensive (cookie banners, custom date pickers, JavaScript quirks).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direct APIs&lt;/strong&gt;: better, but not designed for LLMs. Hundreds of endpoints return too much irrelevant data, flooding context windows.&lt;/p&gt;

&lt;p&gt;MCP provides a middle ground. It lets developers expose exactly the right tools, with guardrails, so the model can handle the task effectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  Kiwi.com’s MCP server: first version
&lt;/h2&gt;

&lt;p&gt;The current server exposes a single search-flight tool with the following parameters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Trip type (one-way, round-trip)&lt;/li&gt;
&lt;li&gt;Origin and destination (city or airport)&lt;/li&gt;
&lt;li&gt;Dates with ±3 day flexibility&lt;/li&gt;
&lt;li&gt;Passenger mix (adult, child, infant)&lt;/li&gt;
&lt;li&gt;Cabin class&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each result comes with a direct booking link.&lt;/p&gt;
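&lt;p&gt;Spelled out as JSON Schema, the tool's input surface could look like the sketch below. The field names are our paraphrase of the parameters listed above, not Kiwi.com’s actual schema:&lt;/p&gt;

```javascript
// Hypothetical JSON Schema for the search-flight tool input.
// All field names and enum values are illustrative assumptions.
const searchFlightInputSchema = {
  type: "object",
  required: ["tripType", "origin", "destination", "departureDate"],
  properties: {
    tripType: { type: "string", enum: ["one-way", "round-trip"] },
    origin: { type: "string", description: "City or airport" },
    destination: { type: "string", description: "City or airport" },
    departureDate: { type: "string", format: "date" },
    returnDate: { type: "string", format: "date" },
    // The ±3 day flexibility mentioned above:
    dateFlexibilityDays: { type: "integer", minimum: 0, maximum: 3 },
    passengers: {
      type: "object",
      properties: {
        adults: { type: "integer", minimum: 1 },
        children: { type: "integer", minimum: 0 },
        infants: { type: "integer", minimum: 0 },
      },
    },
    cabinClass: {
      type: "string",
      enum: ["economy", "premium_economy", "business", "first"],
    },
  },
};

module.exports = searchFlightInputSchema;
```

&lt;p&gt;Keeping the surface this small is deliberate: a single well-described tool is far easier for a model to use correctly than hundreds of endpoints.&lt;/p&gt;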

&lt;h2&gt;
  
  
  What we optimized with Alpic
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. One-click deployment &amp;amp; hosting&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The server is deployed from a Git repo with standard build commands. Once pushed, it’s live on a secure HTTPS endpoint with a custom domain. Behind the scenes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;TLS termination and request parsing handled automatically&lt;/li&gt;
&lt;li&gt;Tool execution in stateless, isolated environments&lt;/li&gt;
&lt;li&gt;Built-in DDoS protection and rate limiting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduced Kiwi.com’s operational overhead so they could iterate quickly and expose the service to real users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Server design choices&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shortened booking links&lt;/strong&gt;: Long URLs eat context and risk breaking. We introduced shortened booking links to keep token usage small.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structured responses&lt;/strong&gt;: Instead of a free-form blob of text, the MCP server instructs the LLM to return results in a table format, making comparison easier for users.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fewer, curated results&lt;/strong&gt;: Kiwi.com’s API can return thousands of flights, but the MCP server only sends a few dozen “best” options. This leverages Kiwi’s business logic while preventing the model from making poor tradeoffs (like suggesting 15-hour layovers for €30 savings).&lt;/p&gt;

&lt;h2&gt;
  
  
  What’s missing and what’s next
&lt;/h2&gt;

&lt;p&gt;Right now, the server only handles simple round-trip and one-way searches. Features not yet handled include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multi-destination itineraries&lt;/li&gt;
&lt;li&gt;Checked bags&lt;/li&gt;
&lt;li&gt;Max duration filtering&lt;/li&gt;
&lt;li&gt;Account login and loyalty integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On the protocol side, new MCP features could open up more options:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Elicitations&lt;/strong&gt;: proactively asking the user to clarify (e.g. “Do you prefer Orly or CDG?” when searching Paris flights).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User preferences&lt;/strong&gt;: storing seat choices, airlines, or price vs. comfort tradeoffs in reusable context.&lt;/p&gt;

&lt;p&gt;Finally, better client-side capabilities will make adoption smoother: server registries (e.g. Claude Directory), less installation friction, and eventually end-to-end booking inside assistants without link redirects.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;p&gt;MCP servers are not just API wrappers, they require deliberate design for LLM usability.&lt;/p&gt;

&lt;p&gt;Optimizing context usage (short links, curated data, structured responses) is critical.&lt;/p&gt;

&lt;p&gt;Kiwi.com’s MCP server is an early step, but it shows how travel booking could work in an agent-native world. We'd love to have your feedback and ideas for future iterations.&lt;/p&gt;

&lt;p&gt;Also, if you’re building with MCP and want one-step deployment, hosting, and AI-specific analytics, check out &lt;a href="https://alpic.ai/" rel="noopener noreferrer"&gt;Alpic.ai&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>aiops</category>
    </item>
    <item>
      <title>Better MCP tools/call Error Responses: Help Your AI Recover Gracefully</title>
      <dc:creator>Frédéric Barthelet</dc:creator>
      <pubDate>Mon, 28 Jul 2025 07:41:02 +0000</pubDate>
      <link>https://dev.to/alpic/better-mcp-toolscall-error-responses-help-your-ai-recover-gracefully-15c7</link>
      <guid>https://dev.to/alpic/better-mcp-toolscall-error-responses-help-your-ai-recover-gracefully-15c7</guid>
      <description>&lt;p&gt;When building MCP servers, we often focus on the happy path: what happens when tools execute successfully. But what about when things go wrong? The quality of your error responses can make the difference between a frustrated user and an AI that recovers gracefully on its own.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding MCP Error Types: Protocol vs Tool Errors
&lt;/h2&gt;

&lt;p&gt;Before diving into error response strategies, it's crucial to understand the distinction between two types of errors in MCP:&lt;/p&gt;

&lt;h3&gt;
  
  
  MCP Protocol-Level Errors
&lt;/h3&gt;

&lt;p&gt;These are errors in the MCP communication itself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Connection closed or request timeout&lt;/li&gt;
&lt;li&gt;Tool not found&lt;/li&gt;
&lt;li&gt;Malformed requests or protocol violations&lt;/li&gt;
&lt;li&gt;Internal server errors&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These errors trigger standard JSON-RPC error responses and typically indicate something is fundamentally broken with the request or the server.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;-32001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Request Timeout"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tools/call Errors (The Focus of This Article)
&lt;/h3&gt;

&lt;p&gt;These are errors that occur during tool execution. The tool was found and called, but something went wrong during the processing. These should &lt;strong&gt;not&lt;/strong&gt; be returned as MCP protocol errors, but as successful MCP JSON-RPC responses with &lt;code&gt;isError: true&lt;/code&gt; in the result payload.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"jsonrpc"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2.0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"An error occurred."&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"isError"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Tools/call Error Responses Are Context, Not Dead Ends
&lt;/h2&gt;

&lt;p&gt;Why bother sharing so many details about the difference between these two error formats? They're both still errors, right? Nothing that needs much attention.&lt;/p&gt;

&lt;p&gt;Wrong! MCP protocol-level errors are captured by the MCP client, eventually surfaced in the UI (like a notification in Claude), and discarded. On the other hand, &lt;strong&gt;tools/call errors are injected back into the LLM context window, just like successful responses&lt;/strong&gt;. Smart error messages can be leveraged by the model as much as any other prompt, giving it a chance to recover from the error without human intervention.&lt;/p&gt;
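&lt;p&gt;Mechanically, the pattern is small: catch failures inside the tool handler and return them as an ordinary result with &lt;code&gt;isError: true&lt;/code&gt;. Here is a sketch (not the MCP SDK’s API; &lt;code&gt;recoveryHint&lt;/code&gt; is a hypothetical field we use to carry an actionable message):&lt;/p&gt;

```javascript
// Sketch of the tools/call error pattern. The wrapper and the
// recoveryHint field are illustrative, not part of the MCP SDK.
function toToolError(message) {
  return {
    content: [{ type: "text", text: message }],
    isError: true,
  };
}

async function safeToolCall(handler, args) {
  try {
    return await handler(args);
  } catch (err) {
    // The message becomes context the model can act on, instead of a
    // protocol error the client discards.
    return toToolError(
      err.recoveryHint ?? "An unknown error happened. Try again immediately."
    );
  }
}

module.exports = { toToolError, safeToolCall };
```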

&lt;p&gt;Most open-source MCP implementations I've seen return generic tool error messages that leave the AI (and users) in the dark. Let's look at what it takes to rework error messages and increase your server's overall quality.&lt;/p&gt;

&lt;h2&gt;
  
  
  3 Use-Cases of Better Error Responses
&lt;/h2&gt;

&lt;p&gt;Here are examples of elevated error messages that improve model task completion rate (the north star metric used to evaluate MCP server quality).&lt;/p&gt;

&lt;h3&gt;
  
  
  Tool Ordering Guidance
&lt;/h3&gt;

&lt;p&gt;If the application's state prevents the model from using a tool for a given resource, provide instructions on how to update that state to make the tool usable. For example, if you're a famous three-letter infrastructure company exposing a tool to terminate an instance, but this tool can only be called when the instance is in a stopped state, say so in the error message.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"You can't terminate an instance in the running state. Use the stop_instance tool first on this instance."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isError"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Refined Validation Messages
&lt;/h3&gt;

&lt;p&gt;When tool input validation criteria aren't fully representable in JSON schema, use tool error messages to give the model additional context. If you're a travel company exposing a booking tool on your MCP server and the model accidentally misinterprets the current year for your booking request, you can correct it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The requested travel date cannot be set in the past. You requested travel on July 31st, 2024, but the current date is July 25th, 2025. Did you mean to plan for travel on July 31st, 2025 instead?"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isError"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Smart Unknown Error Handling
&lt;/h3&gt;

&lt;p&gt;Even when you can't provide precise details about an error, give the model instructions on retry strategy and fallback actions to direct the user to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"An unknown error happened. Try again immediately. If it's the 3rd time you're encountering this issue, provide the user with a link to https://mydashboard.example.com/manual-task to perform the task manually."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"isError"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Error handling in MCP isn't just about graceful failures—it's about creating collaborative experiences where AI can self-correct and recover. By treating error responses as contextual guidance rather than terminal states, you transform frustrating dead ends into stepping stones toward success.&lt;/p&gt;

&lt;p&gt;Remember: every error response is an opportunity to teach the AI how to do better next time.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;What patterns have you found effective for MCP error handling? Share your experiences in the comments below.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
    </item>
  </channel>
</rss>
