<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Malik Chohra</title>
    <description>The latest articles on DEV Community by Malik Chohra (@malik_chohra).</description>
    <link>https://dev.to/malik_chohra</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1343661%2F5a5ed7f3-f672-4bd4-b592-b57036da4c95.jpg</url>
      <title>DEV Community: Malik Chohra</title>
      <link>https://dev.to/malik_chohra</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/malik_chohra"/>
    <language>en</language>
    <item>
      <title>Building a generative-UI SDK for React Native: registry, Zod, Hermes-safe streaming</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Mon, 22 Jun 2026 19:25:49 +0000</pubDate>
      <link>https://dev.to/malik_chohra/building-a-generative-ui-sdk-for-react-native-registry-zod-hermes-safe-streaming-47nm</link>
      <guid>https://dev.to/malik_chohra/building-a-generative-ui-sdk-for-react-native-registry-zod-hermes-safe-streaming-47nm</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generative UI lets an AI model assemble the interface at runtime instead of hard-coding every screen.&lt;/li&gt;
&lt;li&gt;The web already ships it (Vercel AI SDK, Tambo, Google's A2UI). Mobile has almost nothing native.&lt;/li&gt;
&lt;li&gt;React Native blocks it three ways: broken streaming, costly nested trees, no native agent renderers.&lt;/li&gt;
&lt;li&gt;So I'm building Wire RN, an open-source generative UI SDK for iOS and Android.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Short version up front: the web already has good tools for generative UI, mobile has almost none, and I got tired of waiting. So I'm building one. The longer version is the more interesting part, and it starts with how interfaces have always changed.&lt;/p&gt;

&lt;p&gt;Every era of the interface moved in the same direction. You just have to squint to see it.&lt;/p&gt;

&lt;p&gt;Command lines made you learn the machine. Exact syntax, no forgiveness. The graphical UI flipped some of that: windows, a mouse, things you could see instead of memorize. Touch went further: the screen became the thing you manipulated directly. Then chat arrived and you could just type what you wanted in plain language. Each step moved a little more of the burden off the user and onto the software.&lt;/p&gt;

&lt;p&gt;Generative UI is the next step on that same line. And it's the biggest one yet.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1vjwbljsdumvh8lhgie1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F1vjwbljsdumvh8lhgie1.png" alt="Infographic: the evolution of the interface (command line to graphical UI to touch to conversational to generative)" width="800" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'm a React Native engineer, nine years in. I've spent the last six months building with this pattern daily. I want to show you the SDK I'm building and why. A sneak peek.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fkoybdjz94nhmb2kqhc4y.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fkoybdjz94nhmb2kqhc4y.gif" alt="GIF — onboarding screen changing component type turn to turn, rendered live by the model" width="360" height="781"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A flow in my own app, built on the pattern this whole piece is about. The screen changes shape between turns because the model is choosing what to render next.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What generative UI actually is
&lt;/h2&gt;

&lt;p&gt;Generative UI is when a model decides the interface at runtime, instead of a developer hard-coding every screen in advance. The model reads context (your last answer, your history, the task) and emits structured data that says "render this component, with these props." Your app maps that to a real component. The user never sees the structured data. They see a screen that happens to be different from the one their neighbor got.&lt;/p&gt;

&lt;p&gt;The distinction that matters: this is not a chatbot. A chatbot returns text and you read it. Generative UI returns interface. Things you tap and type into. The model is behind the screen.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://cloud.google.com/discover/generative-ui" rel="noopener noreferrer"&gt;Google now defines it&lt;/a&gt; in almost exactly these terms: a front-end architecture where the interface is built by AI in real time rather than hard-coded by developers. They frame the old way as the "wall of text" problem: models could reason and plan, but then collapsed all of it into a paragraph of markdown. Generative UI fixes that by letting the natural output of a capable model be an actual interface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why big tech suddenly cares
&lt;/h2&gt;

&lt;p&gt;Because the data backs it up. &lt;a href="https://research.google/blog/generative-ui-a-rich-custom-visual-interactive-user-experience-for-any-prompt/" rel="noopener noreferrer"&gt;Google's own evaluations&lt;/a&gt; show people strongly prefer generated interactive experiences over plain text answers. And Google is openly calling this work the first step toward fully AI-generated user experiences, where interfaces get tailored to the user instead of pulled from a fixed catalog of apps.&lt;/p&gt;

&lt;p&gt;Read that last part again. The endgame they're describing isn't "nicer chat answers." It's interfaces assembling themselves per user, per moment, personalized by user need, the fixed app catalog dissolving into something generated on demand. When the company that owns Android and Chrome writes that down as a direction, mobile teams should pay attention.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://tambo.co/blog/posts/what-is-generative-ui" rel="noopener noreferrer"&gt;Tambo&lt;/a&gt;, one of the web libraries leading here, puts the same idea in plainer language: we used to adapt to software, now software adapts to us.&lt;/p&gt;

&lt;h2&gt;
  
  
  What web companies already ship
&lt;/h2&gt;

&lt;p&gt;This is the uncomfortable part for mobile people. On the web, generative UI is past theory. It's npm-installable. The players worth knowing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://ai-sdk.dev/docs/ai-sdk-ui/generative-user-interfaces" rel="noopener noreferrer"&gt;Vercel's AI SDK&lt;/a&gt;&lt;/strong&gt; wires a model's tool calls straight to React components. The model calls a tool, the tool returns data, and that result connects to a component instead of a string of text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://tambo.co" rel="noopener noreferrer"&gt;Tambo&lt;/a&gt;&lt;/strong&gt; is the clearest template for the pattern. You register your React components with Zod schemas, and the agent picks which one to render from natural language. Zod validates the props at runtime, so a malformed output gets caught before it ever reaches render. No "undefined is not a function" in production. The catch: it's React-only, and the frameworks it leaves out include the one your phone runs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://www.copilotkit.ai/generative-ui" rel="noopener noreferrer"&gt;CopilotKit's AG-UI&lt;/a&gt;&lt;/strong&gt; pushes toward a declarative middle ground, where agents emit a structured spec of cards, lists, forms, and widgets rather than free-form code, so one spec can render across React, mobile, and desktop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://developers.googleblog.com/introducing-a2ui-an-open-project-for-agent-driven-interfaces/" rel="noopener noreferrer"&gt;Google's A2UI&lt;/a&gt;&lt;/strong&gt; (&lt;a href="https://a2ui.org/" rel="noopener noreferrer"&gt;a2ui.org&lt;/a&gt;) is the open protocol version of that idea: agents send declarative component descriptions, the client renders them with its own native widgets. The reference renderers shipped so far are Angular, Flutter, Lit, and web components. React Native is not on the list.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The registry-plus-schema pattern is the through-line in all of them, and it's the thing that makes this safe to ship. The model isn't writing UI code at runtime. It's filling out forms your components already defined. Creative freedom over the flow, zero freedom over what a component is.&lt;/p&gt;
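
&lt;p&gt;A minimal sketch of that gate, under stated assumptions: the component names (&lt;code&gt;TextField&lt;/code&gt;, &lt;code&gt;ChoiceChips&lt;/code&gt;), the &lt;code&gt;resolve&lt;/code&gt; helper, and the JSON shape are all hypothetical, and the hand-written checks stand in for the Zod schemas a real registry would use. It is not the API of Wire RN, Tambo, or any other library named here.&lt;/p&gt;

```typescript
// Hypothetical registry: every name here is illustrative, not a real SDK API.
type Props = { [key: string]: unknown };
type Validator = (props: Props) => string[]; // returns a list of problems

// Reject props the component never declared, e.g. a prop name the model invented.
function unknownKeys(props: Props, allowed: string[]): string[] {
  return Object.keys(props)
    .filter((k) => allowed.indexOf(k) === -1)
    .map((k) => "unknown prop: " + k);
}

// Each entry says how to check the props for one component the app already ships.
const registry: { [name: string]: Validator } = {
  TextField: (p) => {
    const problems: string[] = [];
    if (typeof p.label !== "string") problems.push("label must be a string");
    return problems.concat(unknownKeys(p, ["label", "placeholder"]));
  },
  ChoiceChips: (p) => {
    const problems: string[] = [];
    if (typeof p.question !== "string") problems.push("question must be a string");
    const options = p.options;
    const ok = Array.isArray(options) ? options.every((o) => typeof o === "string") : false;
    if (!ok) problems.push("options must be an array of strings");
    return problems.concat(unknownKeys(p, ["question", "options"]));
  },
};

type Resolved =
  | { kind: "render"; component: string; props: Props }
  | { kind: "fallback"; reason: string };

// The schema gate: model output maps to a registered component or falls back.
function resolve(modelOutput: string): Resolved {
  let parsed: unknown;
  try {
    parsed = JSON.parse(modelOutput);
  } catch {
    return { kind: "fallback", reason: "not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null) {
    return { kind: "fallback", reason: "not an object" };
  }
  const candidate = parsed as { component?: unknown; props?: unknown };
  const name = candidate.component;
  if (typeof name !== "string" || !Object.prototype.hasOwnProperty.call(registry, name)) {
    return { kind: "fallback", reason: "unknown component" };
  }
  const rawProps = candidate.props;
  if (typeof rawProps !== "object" || rawProps === null || Array.isArray(rawProps)) {
    return { kind: "fallback", reason: "props must be an object" };
  }
  const props = rawProps as Props;
  const problems = registry[name](props);
  if (problems.length > 0) {
    return { kind: "fallback", reason: problems.join("; ") };
  }
  return { kind: "render", component: name, props };
}

// A well-formed turn resolves to the render branch.
const next = resolve('{"component":"TextField","props":{"label":"Your name"}}');
```

&lt;p&gt;The design choice worth noting is that the gate never throws. An unregistered component, a missing prop, or an invented prop all end in the same fallback branch, so the render layer only ever sees output that passed the checks.&lt;/p&gt;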

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fsuieepri7y9eguwg10yk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fsuieepri7y9eguwg10yk.png" alt="Infographic: how generative UI stays safe (user context to model to schema gate to native render, with a fallback branch)" width="800" height="347"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Gemini moment
&lt;/h2&gt;

&lt;p&gt;Then there's the example everyone's actually seen now.&lt;/p&gt;

&lt;p&gt;Gemini &lt;a href="https://research.google/blog/generative-ui-a-rich-custom-visual-interactive-user-experience-for-any-prompt/" rel="noopener noreferrer"&gt;shipped generative UI&lt;/a&gt; into its own app as two experiments. Dynamic View uses agentic coding to design and code a fully custom interactive response per prompt. Visual Layout generates magazine-style multimodal responses with photos and interactive modules. Ask it to plan a three-day trip to Rome and you get a visual itinerary you can actually explore and adjust across multiple turns. Not a wall of text. A built thing.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Gemini generating an interface, not an answer. The image and the interactive steps get built into the chat itself. This is the version most people will meet first, and it's worth knowing exactly how it works.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Here's the part that matters for us. Even Gemini's mobile experience is the model writing HTML, CSS, and JavaScript and rendering it in an app shell. It generates web code in real time and shows it inside the app. Impressive. But it's generated web, displayed in an app. It is not native components, your design system, or your offline behavior. Which is exactly where mobile's real problem lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  So why is mobile still behind?
&lt;/h2&gt;

&lt;p&gt;Because every generative UI library that works is web-shaped, and mobile punishes web-shaped assumptions. Three walls, in the order you hit them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Streaming doesn't work.&lt;/strong&gt; React Native's built-in &lt;code&gt;fetch&lt;/code&gt;, running on the Hermes engine, doesn't expose the response body as a &lt;code&gt;ReadableStream&lt;/code&gt;. Every LLM SDK that streams tokens through &lt;code&gt;response.body.getReader()&lt;/code&gt; breaks on a real device. OpenAI's, Anthropic's, Google's, all of them. The first thing every mobile AI developer learns is that the model provider's own quickstart doesn't run on their phone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recursive component trees are a tax.&lt;/strong&gt; Web generative UI emits nested trees: a Card holding a Row holding Buttons. On mobile, that recursion multiplies validation work, hammers the JS thread mid-stream, and hands the model more places to invent a prop. Models are measurably worse at deep nested structures than flat ones. Token cost climbs, malformed output climbs with it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No agent rails.&lt;/strong&gt; The agent protocols (A2A, AG-UI, &lt;a href="https://a2ui.org/" rel="noopener noreferrer"&gt;Google's A2UI&lt;/a&gt;) were spec'd web-first. A2UI ships native renderers for Angular, Flutter, Lit, and web. If you want an agent to drive native React Native screens today, you're writing the adapter yourself.&lt;/li&gt;
&lt;/ul&gt;
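
&lt;p&gt;To make the first wall concrete, here is one possible workaround, sketched under assumptions: instead of reading from &lt;code&gt;response.body&lt;/code&gt;, the app re-reads the text received so far each time more arrives (for example an &lt;code&gt;XMLHttpRequest&lt;/code&gt;'s &lt;code&gt;responseText&lt;/code&gt; on progress events) and parses only the new part. The &lt;code&gt;createSseReader&lt;/code&gt; helper is hypothetical, it treats every &lt;code&gt;data:&lt;/code&gt; line as one payload, which simplifies the server-sent events format, and it is not necessarily how Wire RN handles streaming.&lt;/p&gt;

```typescript
// Hypothetical helper: turns a growing response text into complete "data:" payloads.
function createSseReader() {
  let consumed = 0; // characters of the cumulative text that were already parsed
  let pending = ""; // trailing partial line carried over between calls

  // Call with the full text received so far; returns only newly completed payloads.
  return function read(cumulativeText: string): string[] {
    const fresh = cumulativeText.slice(consumed);
    consumed = cumulativeText.length;

    const lines = (pending + fresh).split("\n");
    pending = lines.pop() ?? ""; // the last piece may be an incomplete line

    const payloads: string[] = [];
    lines.forEach((line) => {
      const trimmed = line.replace(/\r$/, "");
      if (trimmed.startsWith("data:")) {
        payloads.push(trimmed.slice(5).trim());
      }
    });
    return payloads;
  };
}

// Simulated progress events: each call passes everything received so far.
const read = createSseReader();
const firstBatch = read('data: {"t":"Hel');
const secondBatch = read('data: {"t":"Hel"}\ndata: {"t":"lo"}\n');
```

&lt;p&gt;Here &lt;code&gt;firstBatch&lt;/code&gt; is empty because the only line is still incomplete, and &lt;code&gt;secondBatch&lt;/code&gt; holds both payloads once their newlines have arrived. Holding back the partial line is what keeps half a JSON object from reaching the parser mid-stream.&lt;/p&gt;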

&lt;p&gt;And look at what mobile actually has in 2026. The "best React Native UI libraries" lists are all static kits: Tamagui, NativeWind, gluestack, Restyle. Excellent component libraries. Not one of them is generative. The category just doesn't exist on mobile yet the way it does on web.&lt;/p&gt;

&lt;p&gt;So mobile teams do what teams always do. They patch a polyfill, give up on streaming, hand-roll a JSON-to-component mapper, and accidentally write 1,200 lines of glue code. I know the number because I wrote that glue for a client in 2024, and it broke every time the model invented a prop name.&lt;/p&gt;

&lt;h2&gt;
  
  
  How far does this go?
&lt;/h2&gt;

&lt;p&gt;Worth zooming out before I get to the building part. The designer Andy Budd recently sketched a ladder of "adaptive software," borrowing the metaphor from the autonomy levels we use for self-driving cars. At the bottom, humans author every screen. Near the top, a product stops shipping one best version and starts assembling a different experience per cluster of users, then eventually per individual, per session, per task. The reframe that stuck with me is his: the question stops being "what's the best version of this flow?" and becomes "best for whom?" At the far end, each interaction becomes a design problem of its own.&lt;/p&gt;

&lt;p&gt;He names the catch too, and it's a real one. A product that can adapt to your needs can adapt to your weaknesses just as easily. Personalization and manipulation run on the same engine. Anything that assembles interfaces per user needs rules about what it's allowed to optimize for, and in whose interest. That isn't a mobile problem or a web problem. It's a "we can suddenly do this" problem, and it shows up whether or not anyone writes the rules first.&lt;/p&gt;

&lt;p&gt;Here's the bridge back to my corner of it. You don't reach the top of that ladder with hard-coded screens. Per-user, per-session interfaces need a system that builds UI at runtime from validated components, and that mechanism is what generative UI actually is. The web has it. Mobile, as we've covered, mostly doesn't. Which is the gap I'm building into.&lt;/p&gt;

&lt;h2&gt;
  
  
  So I'm building it
&lt;/h2&gt;

&lt;p&gt;I'm building the thing I kept needing: an open-source generative UI SDK for iOS and Android, built on React Native. Wire RN.&lt;/p&gt;

&lt;p&gt;Same core pattern the web libraries proved. A fixed registry of components, strict schema validation between the model and the screen, the model choosing the flow and never inventing a component. But built for the walls above instead of around them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Streaming that survives Hermes&lt;/strong&gt;, so token streaming actually works on a real device.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flat components instead of recursive trees&lt;/strong&gt;, because mobile screens are sequential anyway, and flat is where models make fewer mistakes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The registry stays yours&lt;/strong&gt;, so your brand, design system, and accessibility never leave your control.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I rebuilt my own app's onboarding on it first.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Three turns of the same onboarding flow, left to right: the model rendered a text field, then a "reading your answer" beat, then a different choice-chips component. Same code, a different screen per user, no release cycle to change it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The first version confidently rendered a date picker when it should have asked about sleep habits. So no, the robots are not running the show yet. Validation matters. And a rule of thumb on where to even use this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Good fits:&lt;/strong&gt; onboarding, check-ins, recommendations, anything high-variance where the right next screen depends on the last answer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bad fits:&lt;/strong&gt; settings, billing, anything you want stable and predictable. Please do not make your settings screen generative.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So that's the why. The interface has always drifted toward adapting to the user instead of the other way around. The web got the generative version first. Mobile is late, but it's the surface where most people actually live, and being late means whoever picks it up now gets a personalization lever their competitors can only iterate on through App Store review cycles. I'd rather build the tool than keep writing the glue.&lt;/p&gt;

&lt;p&gt;I'm shipping Wire RN open-source this month.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Is generative UI the same as a chatbot?
&lt;/h3&gt;

&lt;p&gt;No. A chatbot returns text that you read. Generative UI returns interface: real native components you tap, type into, and swipe. The model decides which component renders next from your context, and your code decides what components are allowed to exist. The conversation happens through the UI, not in a chat bubble.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does the AI write code at runtime?
&lt;/h3&gt;

&lt;p&gt;No, and this is the part most people get wrong. The model emits structured data that maps to pre-built, schema-validated components from a registry you control. It never writes React Native at runtime. If it emits something malformed, your validation layer rejects it before render and the user sees a fallback, not a crash.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why can't I just use Tambo or the Vercel AI SDK in React Native?
&lt;/h3&gt;

&lt;p&gt;They're built on web assumptions. They expect browser streaming APIs, such as a &lt;code&gt;ReadableStream&lt;/code&gt; response body, that React Native on Hermes doesn't provide, and they emit recursive component trees that perform badly on mobile and give the model more room to hallucinate props. They're excellent on the web. Mobile needs a runtime shaped for mobile.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is generative UI safe for production apps?
&lt;/h3&gt;

&lt;p&gt;Yes, if it's validated. The registry-plus-schema pattern means the model can only ever pick from components you shipped, with props checked against a schema before anything renders. The risk profile is closer to "remote config with opinions" than to "AI writes my app." The model owns the flow, never the component definitions.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where does generative UI make sense first?
&lt;/h3&gt;

&lt;p&gt;Onboarding, check-ins, recommendation flows, coaching: anywhere the ideal next screen depends on what the user just did. These are high-variance and conversational, so personalization pays off. Static utility screens like settings and billing should stay hard-coded. Please do not make your settings screen generative.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I write &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;Code Meet AI&lt;/a&gt;, a weekly newsletter on AI-native mobile engineering. Wire RN goes open-source this month — the build, the bugs, and the repo are the next issue.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>ai</category>
      <category>typescript</category>
      <category>mobile</category>
    </item>
    <item>
      <title>Harness Engineering 101: Prompt Engineering wasn't enough. Neither was context. The harness was.</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Thu, 18 Jun 2026 09:17:46 +0000</pubDate>
      <link>https://dev.to/malik_chohra/harness-engineering-101-prompt-engineering-wasnt-enough-neither-was-context-the-harness-was-f3a</link>
      <guid>https://dev.to/malik_chohra/harness-engineering-101-prompt-engineering-wasnt-enough-neither-was-context-the-harness-was-f3a</guid>
      <description>&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prompt engineering and context engineering both still left me as the bottleneck. I re-explained myself every single session.&lt;/li&gt;
&lt;li&gt;The fix was structural, not verbal. A harness: standing context, memory files, real account access, delegation, and skills, so the model starts every morning already knowing my work.&lt;/li&gt;
&lt;li&gt;The term got named in 2026 (&lt;code&gt;Agent = Model + Harness&lt;/code&gt;). Two camps now argue about it and miss that they agree: the edge moved out of the model and into the structure around it.&lt;/li&gt;
&lt;li&gt;A harness is not free. It rots. Maintaining it is the actual job. And the version nobody is building yet is a harness inside the product you ship, not just around your desk.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A harness is everything you build around a model so it can do real work in your world: your files, your accounts, your standards, your history. The model is the swappable part. The harness is the part that makes the model useful, and it is also the part nobody screenshots.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompt engineering, then context engineering, then the wall
&lt;/h2&gt;

&lt;p&gt;About a year and a half ago, my whole relationship with these models was prompt engineering. I collected phrasings that worked. "Act as a senior React Native engineer." "Think step by step." "Return only the diff." I had a notes file of magic openers. When an output was bad, my first instinct was that I had said it wrong.&lt;/p&gt;

&lt;p&gt;If you remember the wave of new AI influencers back then ("steal these prompts," "the prompt that killed marketing," and so on), the whole premise was that a better prompt was the fix.&lt;/p&gt;

&lt;p&gt;That works until it doesn't. The problem with prompt engineering is that the model still knows nothing about you. A perfect prompt produces a good answer to a generic question. I was not asking generic questions. I was asking about my codebase, my locked decisions, my half-built product. The prompt was clean and the answer was still confidently wrong, because the model had no idea what Morrow Self was or that the accent color was already decided.&lt;/p&gt;

&lt;p&gt;So I moved to context engineering, which is the obvious next step. Stop tuning the words, start assembling the right context window. Paste the relevant file. Paste the conventions. Paste yesterday's decision. (I wrote a piece a while back on context engineering.)&lt;/p&gt;

&lt;p&gt;The answers got dramatically better. Then I hit the wall, and the wall was me.&lt;/p&gt;

&lt;p&gt;I was the context. Every morning I sat there hand-assembling the same window: who I am, what I am building, what is locked, what shipped yesterday. I was a human glue layer copying my own life into a text box, and the moment the session ended, all of it evaporated. Context engineering made the model smarter per session and did nothing about the fact that every session started cold.&lt;/p&gt;

&lt;p&gt;That cold start is the actual problem. Not the wording. Not even the context itself. The fact that none of it persisted, so I rebuilt it by hand every day.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built instead: a second brain that starts warm
&lt;/h2&gt;

&lt;p&gt;One of the first AI use cases that pulled me in was the second brain approach. I started early, and I will say one thing: it is &lt;em&gt;amazing&lt;/em&gt;. For me personally, no other AI use case comes close, and I would recommend a second brain to anyone. I have a whole guide to help you get started: &lt;a href="https://gumroad.com/products/nhgsxf" rel="noopener noreferrer"&gt;the second brain starter guide&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I did not solve the cold-start problem on purpose. I solved it one annoyance at a time, and only later found out the pile of fixes had a shape.&lt;/p&gt;

&lt;p&gt;It started with a single file. A root &lt;code&gt;CLAUDE.md&lt;/code&gt; that tells the model who I am: nine years in React Native, how I write, what I am launching, which decisions are locked and not up for debate. Then a &lt;code&gt;CLAUDE.md&lt;/code&gt; per project, so inside the Wire RN repo it knows that codebase's rules, and inside my vault it knows the content rules. The model stopped starting cold. It started as someone who had worked with me for months.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51cgr3ntsxjp5n7jg3iw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F51cgr3ntsxjp5n7jg3iw.png" alt="CLAUDE.md for my Morrow Self app" width="800" height="947"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then memory. Nearly a hundred markdown files now (97 the morning I counted), one fact each, with an index file the model reads at the top of every session. For example:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Morrow Self is the locked app name."&lt;/li&gt;
&lt;li&gt;"The accent is teal, not violet. Violet was retired on June 11."&lt;/li&gt;
&lt;li&gt;"My ICP is B2C mobile app founders."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The index is now big enough that it trips its own size limit, which tells you something honest about how this accretes. I do not re-explain my own business every morning anymore. It remembers, and when it is wrong, I fix one file instead of repeating myself for the hundredth time.&lt;/p&gt;
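
&lt;p&gt;As a rough illustration of the one-fact-per-file idea, the sketch below builds an index from a list of memory entries and flags when it outgrows a character budget. The &lt;code&gt;MemoryFile&lt;/code&gt; shape, the file names, the &lt;code&gt;buildIndex&lt;/code&gt; helper, and the budget of 120 characters are all assumptions made for the example, not my actual tooling or any real size limit.&lt;/p&gt;

```typescript
// Hypothetical shape for a one-fact memory file; the field names are illustrative.
type MemoryFile = { path: string; fact: string };

// Build the index read at the start of a session: one line per fact, sorted by path.
// overBudget flags an index that has grown past the given character budget.
function buildIndex(files: MemoryFile[], maxChars: number): { text: string; overBudget: boolean } {
  const text = files
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.path.localeCompare(b.path))
    .map((f) => "- " + f.path + ": " + f.fact)
    .join("\n");
  return { text, overBudget: text.length > maxChars };
}

// Example entries using the facts quoted above; the paths are made up.
const memoryIndex = buildIndex(
  [
    { path: "icp.md", fact: "My ICP is B2C mobile app founders." },
    { path: "app-name.md", fact: "Morrow Self is the locked app name." },
    { path: "accent.md", fact: "The accent is teal, not violet." },
  ],
  120
);
```

&lt;p&gt;Keeping one fact per file is what makes the "fix one file" workflow possible: a wrong line in the index points back to exactly one place to edit.&lt;/p&gt;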

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiyhyrc828a15ktpshs3a.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fiyhyrc828a15ktpshs3a.png" alt="My skills folder" width="800" height="730"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then access. MCP connectors into Gmail, Calendar, Drive. The model reads my actual schedule and my actual inbox, not a sentence describing them. Context engineering was me narrating my calendar. This is the model just having the calendar.&lt;/p&gt;

&lt;p&gt;Then delegation, which is where my one real rule lives. When I need ten files grepped or a codebase mapped, that runs in a separate context and hands back the conclusion. This is the same principle I run on every build: the newest, most capable model plans, a cheaper and simpler one executes. The expensive brain decides what to do. The cheap one does the grunt work in its own window and never pollutes mine.&lt;/p&gt;

&lt;p&gt;On top of all of it sit the skills. Sixty-some of them: write a LinkedIn post in my voice, draft a newsletter, run the daily plan, plus scheduled jobs that fire without me sitting there.&lt;/p&gt;

&lt;p&gt;None of that was clever. Every piece exists because I got tired of repeating myself. That is the whole thing, and it is much more boring than the posts about it sound.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffxvclatjpc2d36v0401n.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffxvclatjpc2d36v0401n.png" alt="My Second Brain" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  So this has a name now: harness engineering
&lt;/h2&gt;

&lt;p&gt;Three months ago, someone replied to one of my posts to explain harness engineering to me. Kindly. Like I had never heard of it. He linked a newsletter, told me an agent is only as good as the scaffolding around it, and signed off with "wild stuff, right?"&lt;/p&gt;

&lt;p&gt;It was wild. I had been doing it since December. I just did not have the word.&lt;/p&gt;

&lt;p&gt;Here is the word. Mitchell Hashimoto coined "harness engineering" in early 2026: &lt;code&gt;Agent = Model + Harness&lt;/code&gt;. The model is the brain. The harness is everything around it that lets the brain act in your world. People break the harness into roughly five parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;personalisation&lt;/li&gt;
&lt;li&gt;context&lt;/li&gt;
&lt;li&gt;action&lt;/li&gt;
&lt;li&gt;memory&lt;/li&gt;
&lt;li&gt;delegation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Skills and scheduled jobs sit on top as multipliers.&lt;/p&gt;

&lt;p&gt;I mapped my own setup against that list expecting gaps and found I had quietly built all five.&lt;/p&gt;

&lt;p&gt;The number everyone repeats comes from a teardown of Claude Code, where the claim is that something like 98% of the system is harness and under 2% is the model. I will be honest: I have not verified what that figure actually counts, and most people reposting it have not either, so hold the exact number loosely. Directionally it matches what I see every day. The model is the small, swappable part. The scaffolding is where the work lives. (&lt;a href="https://martinfowler.com/articles/exploring-gen-ai/harness-engineering-memo.html" rel="noopener noreferrer"&gt;Martin Fowler's notes&lt;/a&gt; and &lt;a href="https://www.humanlayer.dev/blog/skill-issue-harness-engineering-for-coding-agents" rel="noopener noreferrer"&gt;HumanLayer's practitioner write-up&lt;/a&gt; are the two least hyped explainers I have read if you want the real version.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Why it blew up, and the fight that misses the point
&lt;/h2&gt;

&lt;p&gt;While Hashimoto was naming the harness, another builder, Jake Van Clief, went the opposite direction and grew a community of tens of thousands in about six weeks, telling everyone to stop using agentic frameworks entirely. His pitch: delete LangChain, delete the orchestration libraries, replace all of it with numbered folders and markdown files. A folder and a model, he argues, beats a custom agent.&lt;/p&gt;

&lt;p&gt;Big shoutout to Jake. I love the guy, I follow him, and the advice and content are genuinely good. Highly recommend you follow him too: &lt;a href="https://www.youtube.com/@JEVanClief" rel="noopener noreferrer"&gt;youtube.com/@JEVanClief&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;So one camp says build more scaffolding and the other says tear the framework out and use the filesystem. They sound like enemies. They are saying the same thing.&lt;/p&gt;

&lt;p&gt;Both are telling you the model is not the point. The architecture around the model is the point. Whether your architecture is a LangChain graph or a folder named &lt;code&gt;02-draft&lt;/code&gt;, the bet is identical: the edge moved out of the model and into the structure you wrap around it.&lt;/p&gt;

&lt;p&gt;That is the thing I had been saying for six months before I had either of their vocabularies. I wrote a piece called &lt;a href="https://aimobilelauncher.com/blog/six-months-architecture-two-hours-redesign-ai-thesis" rel="noopener noreferrer"&gt;"I spent 6 months on architecture, then redesigned everything in 2 hours"&lt;/a&gt;. The redesign was fast because the harness was already there. The harness debate is the same argument in a newer hoodie. It blew up because two people gave a clean name to something a lot of us had already half-built and could suddenly point at.&lt;/p&gt;

&lt;h2&gt;
  
  
  The harness rots. Maintaining it is the job.
&lt;/h2&gt;

&lt;p&gt;Here is the part the harness posts leave out. A harness is not a one-time build. It is maintenance, and the maintenance is the actual job.&lt;/p&gt;

&lt;p&gt;Memory files rot. Mine contradict each other if I do not prune them. A good chunk of my files were one launch date out of sync within a month of being written. A stale memory is worse than no memory, because the model trusts it and so do you. People who run bigger memory systems than mine clear them out on a schedule, quarterly, and I now understand exactly why.&lt;/p&gt;

&lt;p&gt;Skills rot the same way. I have sixty installed. In a normal week maybe twelve fire. The other forty-eight are clutter I keep meaning to audit. A harness left untended does not stay neutral. It quietly fills with confident lies about your own life.&lt;/p&gt;

&lt;p&gt;So when someone tells you the harness is the new moat, the honest version is that the harness is the new gym membership. Owning it does nothing. Showing up to maintain it is the entire return.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to start one without a course
&lt;/h2&gt;

&lt;p&gt;If you are starting from zero, you do not need a framework, a course, or a community of thirty thousand people. You need three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One &lt;code&gt;CLAUDE.md&lt;/code&gt; (or its equivalent) that tells the model who you are and what is locked.&lt;/li&gt;
&lt;li&gt;A handful of memory files, one fact each, that the model reads at the start of a session.&lt;/li&gt;
&lt;li&gt;The discipline to add a new file every time you catch yourself explaining the same thing twice.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is a harness. Everything past that is refinement, not foundation.&lt;/p&gt;

&lt;p&gt;If you already have one, do not build more. Audit. Open your own memory files and count how many are still true. Count how many of your skills actually fired this week. The number will be humbling, and the prune will make the whole thing run better than any new addition would.&lt;/p&gt;
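
&lt;p&gt;The "count how many are still true" part needs a human, but finding the candidates does not. Here is a minimal Python sketch, assuming your memory files are Markdown files in one folder and that "not edited in 90 days" is a fair proxy for "needs a look". The function name, the folder layout, and the 90-day cutoff are my assumptions, not a standard.&lt;/p&gt;

```python
import time
from pathlib import Path


def stale_memory_files(memory_dir, max_age_days=90, now=None):
    """Return names of .md files not modified within max_age_days.

    Age is only a proxy for staleness: the result is a review list,
    not a delete list.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds in a day
    stale = []
    for path in sorted(Path(memory_dir).glob("*.md")):
        # A file is a candidate when its last edit predates the cutoff.
        if cutoff > path.stat().st_mtime:
            stale.append(path.name)
    return stale
```

&lt;p&gt;Point it at your own folder, read what comes back, and decide file by file. A recently edited file can still be wrong, so this narrows the audit rather than replacing it.&lt;/p&gt;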

&lt;h2&gt;
  
  
  Where this goes next: a harness inside the app, not around the dev
&lt;/h2&gt;

&lt;p&gt;Here is the gap I keep staring at. Everyone writing about this is pointing it at their own desk. Coding agents. Research assistants. Second brains like mine. Dev tools and knowledge work.&lt;/p&gt;

&lt;p&gt;Nobody is building a harness for shipping a consumer mobile app.&lt;/p&gt;

&lt;p&gt;That is the unclaimed corner, and it is the one I am standing in. The same idea, a model wrapped in structure it can trust, is what lets a mobile app render a different onboarding flow per user instead of the same six hard-coded questions for everyone.&lt;/p&gt;

&lt;p&gt;The harness for a product is not a &lt;code&gt;CLAUDE.md&lt;/code&gt;. It is a validated component registry, a streaming runtime that survives on a real phone, and an agent that drives screens instead of chat. That is what I have spent six months building into &lt;a href="https://getwireai.com" rel="noopener noreferrer"&gt;Wire RN&lt;/a&gt;, and it is the same lesson as the second brain, pointed at a different surface.&lt;/p&gt;

&lt;p&gt;That is also what I am shipping this week. Wire RN hits Product Hunt in a few days, and the next issue takes this exact harness idea and points it at an actual app, with the runtime and the component registry on screen instead of in theory.&lt;/p&gt;

&lt;p&gt;The people naming the harness are right. They are just looking at their own desk. The more interesting move is what happens when you put the harness inside the thing you ship.&lt;/p&gt;

&lt;p&gt;There is now a shift toward loop engineering. I have already started playing with it, but as always, I want to test it properly before I write about it, instead of publishing a generic, AI-generated article about a new concept.&lt;/p&gt;

&lt;p&gt;I write a weekly issue on building AI-native software, mostly on mobile, mostly with receipts like these. If the cold-start problem in this piece sounds familiar, the next one shows the harness running inside a real app. &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;codemeetai.substack.com&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>productivity</category>
      <category>reactnative</category>
    </item>
    <item>
      <title>Fable 5 Crashed My Subagents. The Fix Was the Cheaper Setup.</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Mon, 15 Jun 2026 12:02:00 +0000</pubDate>
      <link>https://dev.to/malik_chohra/fable-5-crashed-my-subagents-the-fix-was-the-cheaper-setup-16ij</link>
      <guid>https://dev.to/malik_chohra/fable-5-crashed-my-subagents-the-fix-was-the-cheaper-setup-16ij</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR for builders
&lt;/h2&gt;

&lt;p&gt;I ran Claude Fable 5 over a free-window weekend to rebrand six live sites on one design system. The lesson is a routing one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fable 5 (about 2x Opus 4.8 per token) earns its price on judgment: the design system, the token structure, the calls the rest of the build inherits.&lt;/li&gt;
&lt;li&gt;Route the routine propagation to a cheaper model. I pinned subagents to Opus 4.8.&lt;/li&gt;
&lt;li&gt;Running Fable as the session model crashed my parallel subagent fan-outs. Pinning the subagents fixed the crash and happened to be the cheaper setup.&lt;/li&gt;
&lt;li&gt;None of it works on a codebase an agent can't read. The indexed repo did more for output quality than the model choice.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Full story below.&lt;/p&gt;

&lt;p&gt;Fable 5 dropped on a Monday or a Tuesday, I forget which. What I didn't forget: my weekly session resets Wednesday at 10am European time, and I'd just come back from holiday sitting on most of my limit with two days to burn it. After all the noise before the Mythos release, I wanted this model on a real job, not a demo.&lt;/p&gt;

&lt;p&gt;So Tuesday night I set two alarms. 2am, then 7am. The plan was stupid and simple. Wake up, run Fable until the session cap cut me off, sleep, wake up, finish. The cap came fast. Thirty minutes of real work, then a four-hour wait. I'd landed home with a backlog anyway, so the waits filled themselves: catch up in the morning, nap at night, run Fable in the windows between.&lt;/p&gt;

&lt;p&gt;Anthropic had Fable 5 on a free window for a couple of weeks, so I wasn't paying the 2x in euros. I was paying it in session caps. Same lesson, different currency.&lt;/p&gt;

&lt;p&gt;Most people one-shot a landing page with a model like this, watch it burn faster than Opus, and decide it's a tax. I did the opposite. I pointed it at a job I'd been avoiding: fix my branding across every site I own. I'd finally admitted the obvious. My palette was mostly violet, which is what every vibe-coded site on the internet looks like right now.&lt;/p&gt;

&lt;p&gt;I asked Claude Design for four color directions and picked between them. I pulled references from &lt;a href="https://ui.aceternity.com" rel="noopener noreferrer"&gt;Aceternity UI&lt;/a&gt;, &lt;a href="https://magicui.design" rel="noopener noreferrer"&gt;Magic UI&lt;/a&gt;, and &lt;a href="https://refero.design" rel="noopener noreferrer"&gt;Refero&lt;/a&gt;. The full "how I build a site I actually like" workflow is its own piece, coming soon. If there's an AI topic you want me to break down, reply to the email and tell me.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkfk1fvfbspa5ylt7hkmj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkfk1fvfbspa5ylt7hkmj.png" alt="Claude Design exploring four colorways on the same hero." width="800" height="489"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fux5739m54qg5oyuri6ag.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fux5739m54qg5oyuri6ag.png" alt="My choice for the new branding" width="800" height="437"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The one I shipped is the teal. Here is the before and after as a system, not a vibe.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92yp1dg8rfy61spbzzjv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F92yp1dg8rfy61spbzzjv.png" alt="The retired black-violet-cyan palette next to Colorway C: one ink, one paper, one teal accent" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It started with one site to replace the Linktree in my bio. Then I remembered I already had four more sites built. So I stopped and changed the goal. Instead of hand-fixing each one, I would build a reusable system that launches a site with AI, scales, and ships with its own rules and skills. That was the weekend.&lt;/p&gt;

&lt;p&gt;The honest catch: my old structure wasn't built for a model that bills like Fable 5. Before I could finish, I had to learn how to spend it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is Claude Fable 5, and is the 2x worth it?
&lt;/h2&gt;

&lt;p&gt;Fable 5 is Anthropic's most capable public model right now. It sits above Opus 4.8 and costs about twice as much per token. That 2x is the whole conversation with this model, and most people get it backwards.&lt;/p&gt;

&lt;p&gt;It's not a tax. It's a routing decision.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The newest, most capable model plans. A cheaper, simpler one executes. That is the rule I run on every build.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;On any real build there are two kinds of work. There's the work where one wrong call costs you an hour of cleanup later: the design system, the token structure, deciding what "one accent line, never a fill" actually means in CSS across five sites. And there's the routine work: applying that decision to the hundredth component. Fable 5 earns its 2x on the first kind. The second kind, a cheaper model handles fine and you never notice the difference.&lt;/p&gt;

&lt;p&gt;So the question isn't "is Fable 5 worth 2x." It's "what am I asking it to do." Pay the premium where judgment lives. Route everything else down. I treated the model like a senior engineer I was renting by the token. I didn't have it rename variables. I had it make the calls the rest of the weekend would inherit.&lt;/p&gt;

&lt;p&gt;The memes about the bill are funny because they're half right. It is expensive if you point it at the wrong work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vqjf4o75fgfugrrocdt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F2vqjf4o75fgfugrrocdt.png" alt="The internet's read on the Fable 5 bill" width="800" height="813"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What did I build with it?
&lt;/h2&gt;

&lt;p&gt;Six live sites, rebranded onto one editorial system, in a weekend. AI Web Launcher, AI Mobile Launcher, Wire RN's site, my personal site, my agency site, and one consumer app I left on its own palette on purpose. They all run on one shared boilerplate, so a rebrand is "change the design system once, let it propagate."&lt;/p&gt;

&lt;p&gt;Fable 5's job was the thinking, not the typing. It set the token structure, made the calls about what stayed consistent across sites and what was allowed to differ, and held the line on the rule that made the whole thing portable. The propagation, the hundred small edits behind each decision, I routed to a cheaper model. That split is the only reason a six-site rebrand fit into two days instead of two weeks.&lt;/p&gt;

&lt;p&gt;The output wasn't rough. Here are three of the six on Lighthouse, desktop, the morning after.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy13y7e9buoqgc0rhj7y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy13y7e9buoqgc0rhj7y.png" alt="Lighthouse desktop scores: aiweblauncher.com" width="800" height="882"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0dmxc2ddjsww9uf7xh7m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0dmxc2ddjsww9uf7xh7m.png" alt="Lighthouse desktop scores: aimobilelauncher.com" width="800" height="803"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ci0z5kj7v25uutsjfmg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9ci0z5kj7v25uutsjfmg.png" alt="Lighthouse desktop scores: casainnov.com" width="800" height="861"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I also filmed the whole weekend, from the first color pick to the last deploy. The long version is going up on YouTube: &lt;a href="https://youtu.be/51SS-kl-llo?si=JP5ClRDp47n3DtCn" rel="noopener noreferrer"&gt;https://youtu.be/51SS-kl-llo?si=JP5ClRDp47n3DtCn&lt;/a&gt;. If you want the build narrated end to end, that's where it'll be.&lt;/p&gt;

&lt;h3&gt;
  
  
  The thing the weekend produced: AI Web Launcher
&lt;/h3&gt;

&lt;p&gt;The system I built to do all this is now a product. AI Web Launcher is a production-ready Next.js 15 boilerplate plus the workflow that takes a site from idea to deployed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Idea, brief, spec&lt;/li&gt;
&lt;li&gt;Copy and design&lt;/li&gt;
&lt;li&gt;Memory and architecture&lt;/li&gt;
&lt;li&gt;Build and deploy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The part that matters isn't the starter screens. It's that the codebase ships documented and indexed, so an AI agent understands what already exists and edits that instead of guessing. That one property is the difference between "Fable made a clean call" and "Fable invented three files I didn't ask for." You get the full chain and the guardrails, not a blank repo and good luck.&lt;/p&gt;

&lt;p&gt;It's 99 euros. The first 10 people who apply get 50% off. I approve those by hand and send the discounted link, because I want feedback from the first ten more than I want the money. Apply at &lt;a href="https://aiweblauncher.com" rel="noopener noreferrer"&gt;aiweblauncher.com&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where did it bite me?
&lt;/h2&gt;

&lt;p&gt;Day one, I ran Fable 5 as the session model and it crashed my subagent fan-outs. I parallelize work across subagents constantly, and Fable as the orchestrator kept falling over on the parallel runs. The exact moment I wanted momentum, I got a stall.&lt;/p&gt;

&lt;p&gt;The fix was to stop using one model for everything. I pinned the subagents to Opus 4.8 for the routine fan-out and kept Fable for the decisions that mattered. The crash forced the exact setup the bill wanted anyway: expensive model for judgment, cheaper model for volume. I'd have gotten there from the cost side eventually. The bug just got me there by lunch.&lt;/p&gt;

&lt;p&gt;That's the honest version. This was not a frictionless "AI did my work" weekend. It was a model that's worth its price if you route it right and a waste if you don't, plus one setup bug I had to eat before it ran clean.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Fable 5 session setup
&lt;/h2&gt;

&lt;p&gt;The thing that made it work isn't a prompt. It's a routing config:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Which model runs the session&lt;/li&gt;
&lt;li&gt;Which model the subagents are pinned to&lt;/li&gt;
&lt;li&gt;The rules that tell the agent to stop and ask before it spends Fable tokens on grunt work&lt;/li&gt;
&lt;/ul&gt;
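
&lt;p&gt;As a shape, that config is small. The sketch below is plain Python, not the Claude Code settings format. The model ID strings, the task labels, and the &lt;code&gt;route&lt;/code&gt; helper are placeholders I made up to show the rule, assuming every task gets tagged before it is dispatched.&lt;/p&gt;

```python
from dataclasses import dataclass

# Placeholder IDs for illustration, not real API model names.
JUDGMENT_MODEL = "fable-5"
ROUTINE_MODEL = "opus-4.8"


@dataclass(frozen=True)
class RoutingConfig:
    session_model: str = JUDGMENT_MODEL  # runs the main session, makes the calls
    subagent_model: str = ROUTINE_MODEL  # pinned model for parallel fan-outs
    # Hypothetical task kinds that should not spend the expensive model unasked.
    ask_before_premium: tuple = ("rename", "propagate", "format")


def route(task_kind, is_subagent, config=RoutingConfig()):
    """Return (model, needs_confirmation) for one unit of work."""
    if is_subagent:
        # Subagents always use the pinned cheaper model.
        return config.subagent_model, False
    if task_kind in config.ask_before_premium:
        # Grunt work in the main session: stop and ask before spending.
        return config.session_model, True
    return config.session_model, False
```

&lt;p&gt;With these placeholders, &lt;code&gt;route("design-system", False)&lt;/code&gt; stays on the session model without asking, while any subagent call lands on the pinned model regardless of the task.&lt;/p&gt;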

&lt;p&gt;This is also where the architecture earns its keep. Fable only made good calls because the codebase shipped with a map it could read, so "move everything to the new accent token" meant something specific instead of a guess. That map is part of a system I run called &lt;a href="https://aimobilelauncher.com/blog/u-amos-claude-code-skill-react-native-workflow" rel="noopener noreferrer"&gt;UAMOS&lt;/a&gt;, and I'm breaking the full thing down on the newsletter soon.&lt;/p&gt;

&lt;p&gt;I packaged the setup so you don't have to find the subagent crash yourself: the session config, the model-routing rules, the subagent pinning, and the short ruleset I ran. Reply to the newsletter with FABLE and I'll send it back.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this breaks
&lt;/h2&gt;

&lt;p&gt;Fable 5 is not a default. Two ways it stops paying off:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your work is mostly routine. CRUD screens, copy tweaks, predictable refactors. You'll pay 2x for output a cheaper model would have nailed, and you'll feel like the people who call it a tax. They're not wrong for their workload. They're wrong to generalize from it.&lt;/li&gt;
&lt;li&gt;Your codebase is one an agent can't read. I got good judgment out of Fable because the repo was indexed for it. Point any model, cheap or expensive, at an unmapped codebase and it guesses at the structure and breaks things in a new way each session. The model is the smaller half of the result. The map is the bigger half.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And the free window is closing. After that, the routing discipline isn't optional. It's the only way the bill stays sane.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Is Claude Fable 5 worth the cost?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For judgment-heavy work, yes. It costs about 2x Opus 4.8 per token, and it earns that on calls where one wrong decision costs you an hour later: architecture, design systems, token structure. For routine, high-volume work, route to a cheaper model. The skill is knowing which is which.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fable 5 vs Opus 4.8, which should I use?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Both, on the same job. I ran Fable 5 for the decisions and pinned subagents to Opus 4.8 for the routine propagation. Using one model for everything is how you either overpay or underperform.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why did Fable 5 crash my subagents?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In my case, running Fable as the session model fell over on parallel subagent fan-outs. Pinning the subagents to Opus 4.8 and keeping Fable for the orchestrating decisions fixed it. It also happens to be the cheaper, correct setup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can a better model fix a messy codebase?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. A capable model on an unindexed codebase still guesses at the structure. The reliable speedup comes from indexing the repo so the agent edits what exists. Model quality is secondary to that.&lt;/p&gt;




&lt;p&gt;If you want my actual Fable 5 session setup, the routing config, subagent pinning, and the rules I ran, reply to the Code Meet AI newsletter with FABLE and I'll send it. One issue a week on AI-first mobile and web development: &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;codemeetai.substack.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The boilerplate this ran on, AI Web Launcher, is for sale now. It's 99 euros, and the first 10 people who apply get 50% off. Apply at &lt;a href="https://aiweblauncher.com" rel="noopener noreferrer"&gt;aiweblauncher.com&lt;/a&gt;. I approve the first 10 by hand and send the discounted link.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm Malik. I've built mobile products in health, wellness, and coaching for 9 years, and I'm building the AI-native infrastructure the next wave of those products will run on: &lt;a href="https://getwireai.com" rel="noopener noreferrer"&gt;Wire RN&lt;/a&gt;, open-source generative UI for React Native, and the launcher boilerplates. I write weekly at &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;Code Meet AI&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>nextjs</category>
      <category>webdev</category>
      <category>productivity</category>
    </item>
    <item>
      <title>WWDC 2026: App Intents, Foundation Models, and what RN devs should ship</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Tue, 09 Jun 2026 14:51:27 +0000</pubDate>
      <link>https://dev.to/malik_chohra/wwdc-2026-app-intents-foundation-models-and-what-rn-devs-should-ship-3o8a</link>
      <guid>https://dev.to/malik_chohra/wwdc-2026-app-intents-foundation-models-and-what-rn-devs-should-ship-3o8a</guid>
      <description>&lt;p&gt;Apple made AI visibility mandatory and AI integration provider-agnostic. What changed, why it rhymes with App Tracking Transparency, and what to ship before iOS 27 lands in September.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;App Intents is now the only way Siri reaches your app. No App Intents means your app is invisible to the new Siri.&lt;/li&gt;
&lt;li&gt;Foundation Models added a model abstraction layer. Swap Apple's on-device model, Gemini, or Claude with one line of code.&lt;/li&gt;
&lt;li&gt;The framework now ships a Python SDK, runs on Linux, and accepts image input from third-party apps.&lt;/li&gt;
&lt;li&gt;The EU and China get the developer APIs but not the consumer Siri AI features at launch.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At WWDC 2026, Apple turned two things into platform defaults for mobile builders. First, AI visibility: App Intents is now the only way Siri can reach into your app, and SiriKit is on a deprecation clock. Skip it, and your app disappears from the assistant on iOS 27. Second, AI integration: the Foundation Models framework added a model abstraction layer, a Python SDK, Linux support, and image input, so adding on-device or multi-provider AI is a small, portable change now instead of a bet on Apple's roadmap.&lt;/p&gt;

&lt;h2&gt;
  
  
  The habit that made this keynote boring (in a good way)
&lt;/h2&gt;

&lt;p&gt;Years ago I kept a browser tab pinned to the Apple developer release notes. When Apple shipped App Tracking Transparency and made the data-sharing prompt mandatory, it broke analytics and attribution across half the apps I knew. You could not measure anything without explicit user consent, and teams scrambled for months to catch up. I had the consent flow wired into my React Native apps the week it went mandatory, because I had been watching it coming for a long time before that.&lt;/p&gt;

&lt;p&gt;That habit, watching what Apple makes mandatory and shipping it early, is most of the game on this platform. It is also why yesterday's keynote did not surprise me. The mandatory shift this year is not a tracking prompt. It is whether AI can see your app, and whether your app can use AI without betting the whole architecture on one vendor.&lt;/p&gt;

&lt;p&gt;Here is what mattered, filtered for people building mobile products. The React Native specifics are at the bottom.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why your app is invisible to Siri without App Intents
&lt;/h2&gt;

&lt;p&gt;I had the question before yesterday: how do you make your app visible to the AI tools running inside the phone? I half expected the answer to be an MCP server exposing your backend data, or something equally involved. App Intents was the other answer I had in mind. I just did not expect Apple to make it this direct.&lt;/p&gt;

&lt;p&gt;Siri AI runs on App Intents. SiriKit is deprecated as of WWDC 2026 with compile-time warnings, and App Intents is the only supported way Siri can call into a third-party app. If your app does not expose its core actions as intents, the new Siri cannot see or trigger any of them.&lt;/p&gt;

&lt;p&gt;This is the App Tracking Transparency moment repeating. Back then the mandatory thing was a consent prompt. This time, it is a machine-readable description of what your app does. The assistant became the front door on iOS 27, and App Intents is the only key that fits the lock.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F44ggjgz26ny31zfr5472.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F44ggjgz26ny31zfr5472.png" alt="AI visibility diagram: without App Intents the assistant hits a wall; with App Intents your actions become callable " width="720" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;App Intents 2.0 added richer entity types, streaming responses for long-running actions, and multi-turn conversational follow-ups. If you ship a consumer app and your users touch Siri, this is the single highest-priority addition to your project before September. I am treating it the way I treated the tracking consent flow in 2021: not optional, and better done early.&lt;/p&gt;

&lt;p&gt;This is the work I am doing right now in Morrow Self, the habit app inside my AI Mobile Launcher. Morrow Self sits at the center of someone's daily routine, so making it reachable through Siri is exactly the kind of integration that earns its place: add a habit, check today's progress, log a streak, all by voice, without opening the app. I am wiring its core actions up as App Intents ahead of iOS 27 so the assistant can drive it directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why App Intents need generative UI, and why that part is not solved yet
&lt;/h2&gt;

&lt;p&gt;App Intents answer one half of the question: the assistant can now call your app's actions. They do not answer the other half, which is what the user sees when it does. An intent can hand back a small SwiftUI snippet, but the moment the response is richer than a confirmation (a chart, a multi-step form, a list the user can act on), you are back to hand-building a fixed screen for every possible response.&lt;/p&gt;

&lt;p&gt;That is the generative UI gap. On the web the pattern is settled: the model picks from a set of registered components and fills them in, so the interface adapts to the request instead of being hard-coded for it. On mobile, that pattern barely exists in any packaged form. App Intents gives you a typed surface of actions the assistant can choose between. Generative UI would give you a typed surface of components it can compose into a response. Apple shipped the first half this year. The second half is still an open problem, and nobody has a clean answer for it yet.&lt;/p&gt;
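
&lt;p&gt;To make "a typed surface of components" concrete, here is a minimal sketch of the idea in Python. It is not Wire RN's API or any web SDK's. The registry contents, the spec shape, and the &lt;code&gt;validate_spec&lt;/code&gt; helper are assumptions for illustration. The point is only that the model's output gets checked against registered components before anything renders.&lt;/p&gt;

```python
# Hypothetical registry: component name -> required props and their types.
REGISTRY = {
    "ProgressChart": {"title": str, "values": list},
    "HabitList": {"items": list},
    "Confirmation": {"message": str},
}


def validate_spec(spec, registry=REGISTRY):
    """Check a model-produced spec like {"component": ..., "props": {...}}.

    Returns a list of error strings; an empty list means it is safe to render.
    """
    name = spec.get("component")
    if name not in registry:
        # The model may only pick from registered components.
        return [f"unknown component: {name!r}"]
    errors = []
    props = spec.get("props", {})
    for prop, expected in registry[name].items():
        if prop not in props:
            errors.append(f"missing prop: {prop}")
        elif not isinstance(props[prop], expected):
            errors.append(f"wrong type for {prop}: expected {expected.__name__}")
    for prop in props:
        if prop not in registry[name]:
            # Reject props the component never declared.
            errors.append(f"unexpected prop: {prop}")
    return errors
```

&lt;p&gt;A spec that names an unregistered component, or fills a registered one with the wrong props, comes back with errors instead of reaching the screen. That is the same contract App Intents gives you for actions, applied to what gets drawn.&lt;/p&gt;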

&lt;p&gt;I care about this because it is the real product question hiding behind "make your app usable by AI." Being callable is table stakes. Being able to answer with the right interface, generated for that moment instead of pre-built for it, is where the experience is won. It is also why I am building Wire RN: a way to register React Native components an assistant can pick between, the same way App Intents lets it pick between actions. The intent layer is standardized now. The UI layer is the part still up for grabs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Apple just made AI providers swappable in one line
&lt;/h2&gt;

&lt;p&gt;Foundation Models in iOS 27 introduced a &lt;code&gt;LanguageModel&lt;/code&gt; protocol (&lt;a href="https://developer.apple.com/documentation/foundationmodels/systemlanguagemodel" rel="noopener noreferrer"&gt;docs here&lt;/a&gt;). Apple's on-device model, Google Gemini, and Anthropic Claude all implement it. You write your session logic once, and switching providers is a one-line change.&lt;/p&gt;

&lt;p&gt;Until yesterday, building on Foundation Models meant committing your AI architecture to Apple's model and Apple's release schedule. Most teams I talk to refused that bet and built on the OpenAI API instead, eating the cost and the privacy trade-offs. The abstraction layer removes the reason to refuse.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe91p3mly1w2hwbdxks20.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe91p3mly1w2hwbdxks20.png" alt="model abstraction diagram: app code to one LanguageModel protocol routing to Apple on-device, Gemini, or Claude" width="720" height="440"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The practical shape is hybrid routing. Short, private, or simple calls go to the on-device model, which is free and works offline. Complex reasoning routes to Claude or Gemini through the same API. You pay only for the calls that need frontier quality. Production teams have built this by hand for two years, and Apple just made it a one-day job.&lt;/p&gt;
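
&lt;p&gt;The routing rule itself fits in a few lines. This is a sketch in Python with stand-in classes, not the Foundation Models SDK: &lt;code&gt;TextModel&lt;/code&gt;, &lt;code&gt;OnDeviceStub&lt;/code&gt;, &lt;code&gt;CloudStub&lt;/code&gt;, and the 400-character threshold are all hypothetical names and numbers, assuming you can tag each request as private or as needing deeper reasoning.&lt;/p&gt;

```python
from typing import Protocol


class TextModel(Protocol):
    """Minimal stand-in for a shared model interface (hypothetical)."""

    name: str

    def respond(self, prompt: str) -> str: ...


class OnDeviceStub:
    name = "on-device"

    def respond(self, prompt: str) -> str:
        return f"[{self.name}] {prompt[:20]}"


class CloudStub:
    name = "cloud"

    def respond(self, prompt: str) -> str:
        return f"[{self.name}] {prompt[:20]}"


def pick_model(prompt, private, needs_reasoning, local, cloud, max_local_chars=400):
    """Private calls stay local; otherwise long or complex ones go to the cloud."""
    if private:
        return local
    if needs_reasoning or len(prompt) > max_local_chars:
        return cloud
    return local
```

&lt;p&gt;Because both stubs expose the same &lt;code&gt;respond&lt;/code&gt; method, the calling code never changes when the provider does. Keeping private prompts local even when they are complex is my choice in this sketch, not a rule from Apple.&lt;/p&gt;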

&lt;h2&gt;
  
  
  Foundation Models now runs on Linux and speaks Python
&lt;/h2&gt;

&lt;p&gt;Apple shipped a Python SDK for Foundation Models and made the framework run on Linux. An Apple Intelligence-adjacent runtime is now callable from a Python script on a Linux server.&lt;/p&gt;

&lt;p&gt;That sentence would have been absurd a year ago. It means you can prototype mobile AI features without an iOS device, run the same evaluation suite on CI and on-device, and target Apple's model from a Python agent framework next to Claude and Gemini. The cost of building for the Apple stack just fell to roughly what it costs to build for the OpenAI API.&lt;/p&gt;

&lt;p&gt;The on-device model also accepts image input from third-party apps now. Any app doing photo analysis, OCR, or scene understanding can drop its bundled 1 to 2 GB vision model and call the system API instead. Cal AI, the photo-to-calorie app that hit 40 million dollars in revenue before MyFitnessPal acquired it, is built on exactly this kind of vision call. After yesterday, that capability is free and on-device for everyone.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Apple and Google deal admits
&lt;/h2&gt;

&lt;p&gt;Apple Foundation Models on Cloud is the frontier-quality tier, and it runs on Google's infrastructure. AFM Cloud Pro runs on Nvidia Blackwell B200 chips inside Google Cloud, under a partnership that reportedly costs Apple around 1 billion dollars a year for a custom Gemini model. Apple's contract bars Google from training future models on Siri queries.&lt;/p&gt;

&lt;p&gt;Apple framed this as a privacy decision. The simpler read is that Apple cannot build frontier LLMs alone yet, so it rented Google's. For builders, the upside is that the cloud tier gives you Gemini-comparable quality through Apple's API surface, wrapped in Apple's privacy contract. If you need frontier reasoning in a regulated context, that is a real option now.&lt;/p&gt;

&lt;h2&gt;
  
  
  The EU and China caveat, which is personal for me
&lt;/h2&gt;

&lt;p&gt;Apple is not launching the Siri AI consumer features in the EU or China at WWDC 2026, citing regulatory constraints. If you ship consumer apps in those markets, your users do not get the new Siri at launch.&lt;/p&gt;

&lt;p&gt;The nuance that matters: the Foundation Models developer API works everywhere. Your apps can call the on-device model, the cloud model, and the multi-provider abstraction in every region. The gap is on the consumer side, not the developer side. I am in Berlin, so the people around me will not see Siri AI on day one. The apps I build can still use the full framework today, which is the part worth acting on.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do before iOS 27 ships
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Audit your iOS app for App Intents coverage. If you have none, your app is invisible to the new Siri. Plan the migration off SiriKit now, not in August.&lt;/li&gt;
&lt;li&gt;Wire the on-device Foundation Models into one real feature, behind an adapter, so you can swap providers later without a rewrite. Start with structured extraction or classification, not chat.&lt;/li&gt;
&lt;li&gt;If you ship native iOS, connect Claude Code or OpenAI Codex to your Xcode MCP server. Xcode 26.3 already exposes build, test, and diagnostics over MCP. This is the workflow change with the highest payoff from the keynote.&lt;/li&gt;
&lt;li&gt;If you build an AI product, apply for the Extensions developer beta. Apple opened Siri Extensions so users can pick Claude, ChatGPT, or Gemini as their assistant provider. That is a distribution channel on 2 billion devices, and the public launch is about three months out.&lt;/li&gt;
&lt;/ol&gt;
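
&lt;p&gt;One way to read "behind an adapter" in step 2, sketched in Python so it runs anywhere. &lt;code&gt;ModelAdapter&lt;/code&gt; and the two stand-in providers are hypothetical names for illustration. In a real app the provider functions would wrap the on-device model or a cloud SDK, and the adapter would live in Swift or TypeScript.&lt;/p&gt;

```python
from typing import Callable, Dict

# In this sketch a provider is just a function from input text to output text.
Provider = Callable[[str], str]


class ModelAdapter:
    """App code talks to this class only, never to a provider directly."""

    def __init__(self) -> None:
        self._providers: Dict[str, Provider] = {}
        self._active = ""

    def register(self, name: str, provider: Provider) -> None:
        self._providers[name] = provider
        # The first registered provider becomes the default.
        if not self._active:
            self._active = name

    def use(self, name: str) -> None:
        """Swap the active provider; callers of classify() do not change."""
        if name not in self._providers:
            raise KeyError(f"unknown provider: {name}")
        self._active = name

    def classify(self, text: str) -> str:
        if not self._active:
            raise RuntimeError("no provider registered")
        return self._providers[self._active](text)


# Stand-in providers so the sketch runs without any SDK installed.
def fake_on_device(text: str) -> str:
    return "local:" + text.lower()


def fake_cloud(text: str) -> str:
    return "cloud:" + text.lower()


adapter = ModelAdapter()
adapter.register("on-device", fake_on_device)
adapter.register("cloud", fake_cloud)
```

&lt;p&gt;Swapping providers is then a single &lt;code&gt;adapter.use("cloud")&lt;/code&gt; call, and the feature code that calls &lt;code&gt;classify&lt;/code&gt; stays untouched.&lt;/p&gt;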

&lt;h2&gt;
  
  
  For React Native developers
&lt;/h2&gt;

&lt;p&gt;Most of this translates to React Native, with one caveat and one opportunity.&lt;/p&gt;

&lt;p&gt;The caveat: Foundation Models is a Swift framework, so calling it from RN needs a bridge. The cleanest path today is &lt;code&gt;@react-native-ai/apple&lt;/code&gt; from Callstack, which already wraps the framework in a Vercel AI SDK-compatible interface. It was built against the iOS 26 version, so the iOS 27 additions (model abstraction, multimodal, fine-tuning) will need bridge updates before you can use them from RN. Watch the repo.&lt;/p&gt;

&lt;p&gt;The opportunity: the multi-provider story Apple shipped on native is the same one &lt;code&gt;@react-native-ai&lt;/code&gt; was already building toward. Register providers, swap with one line, same API. The two are now describing the same architecture from opposite ends, so the RN side should catch up fast.&lt;/p&gt;

&lt;p&gt;For App Intents, you expose them through Swift in the native side of the project, and there are community packages that reduce the boilerplate. For MCP, you do not need anything RN-specific: the Xcode integration works at the Xcode level, so point your agent at the workspace and it works. And Wire RN, the open-source generative UI SDK I am building at &lt;a href="https://getwireai.com" rel="noopener noreferrer"&gt;getwireai.com&lt;/a&gt;, will track the Foundation Models multi-provider API as it stabilizes.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Do I have to rewrite my app to support App Intents?
&lt;/h3&gt;

&lt;p&gt;No. App Intents are added alongside your existing UI, not instead of it. You declare your app's core actions as intents in Swift, and Siri, Spotlight, Shortcuts, and the Action button can all call them. Your existing screens keep working exactly as they do now.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is the on-device model good enough to replace cloud API calls?
&lt;/h3&gt;

&lt;p&gt;For some tasks, yes. The on-device model handles classification, structured extraction, summarization, and simple tool calling well, all for free and offline. For complex reasoning, long context, or frontier-quality generation, you still route to the cloud. The point of the new abstraction layer is that you can do both through one API.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can React Native apps use Apple Foundation Models?
&lt;/h3&gt;

&lt;p&gt;Yes, through a bridge. The &lt;code&gt;@react-native-ai/apple&lt;/code&gt; package from Callstack wraps the framework in a Vercel AI SDK-compatible interface. It currently targets the iOS 26 version of Foundation Models, so the new iOS 27 features will need bridge updates before they are available from React Native.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does Siri AI work in the EU?
&lt;/h3&gt;

&lt;p&gt;The consumer Siri AI features are not available in the EU or China at launch, due to regulatory constraints. The Foundation Models developer API, however, works in every region. Your app can use on-device and cloud models in the EU even though end users will not get the new consumer Siri yet.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is App Intents only useful for Siri?
&lt;/h3&gt;

&lt;p&gt;No. App Intents power Siri, Spotlight search, the Shortcuts app, widgets, and the Action button. Adopting them makes your app reachable across all of those surfaces, not just the assistant. That breadth is why it is worth doing even before iOS 27 ships.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'm shipping this week
&lt;/h2&gt;

&lt;p&gt;I am going back through my own apps with the same checklist I just gave you, starting with App Intents coverage in Morrow Self, because I would rather migrate off SiriKit now than during the iOS 27 beta crunch. On the Wire RN side, the model abstraction announcement lines up almost exactly with the provider-swap design I was already building, so I am revisiting that interface to match the &lt;code&gt;LanguageModel&lt;/code&gt; shape Apple just standardized.&lt;/p&gt;

&lt;p&gt;If you want help shipping any of this on a React Native or native iOS stack, including App Intents migration or a multi-provider AI setup, that is what &lt;a href="https://casainnov.com" rel="noopener noreferrer"&gt;CasaInnov&lt;/a&gt; does.&lt;/p&gt;

&lt;p&gt;I write &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;Code Meet AI&lt;/a&gt; on AI-first mobile development. Next issue goes deeper on the Future of Mobile, the foundations worth building on once on-device AI and generative UI become table stakes.&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>ios</category>
      <category>ai</category>
      <category>mobile</category>
    </item>
    <item>
      <title>How to write a Claude Code skill (and the gotchas the docs skip)</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Tue, 02 Jun 2026 08:23:44 +0000</pubDate>
      <link>https://dev.to/malik_chohra/how-to-write-a-claude-code-skill-and-the-gotchas-the-docs-skip-3gn5</link>
      <guid>https://dev.to/malik_chohra/how-to-write-a-claude-code-skill-and-the-gotchas-the-docs-skip-3gn5</guid>
      <description>&lt;p&gt;I put off Claude Code skills for six months because the docs made them sound like a framework. Then I opened one and it was a markdown file. One file, a few lines of YAML, done.&lt;/p&gt;

&lt;p&gt;If you already keep a &lt;code&gt;CLAUDE.md&lt;/code&gt; and a memory folder, you're most of the way there. This is the from-scratch guide: what a skill actually is, how to write one, the gotchas that aren't in the docs, and a real, non-trivial example (a memory system I packaged as a skill in an afternoon).&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A Claude Code skill is a folder with one &lt;code&gt;SKILL.md&lt;/code&gt; file inside it. That's the whole mechanic.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;description&lt;/code&gt; field is the trigger. Spend most of your writing time there, not on the body.&lt;/li&gt;
&lt;li&gt;The minimum viable skill is about 30 lines. References, templates, and scripts are optional add-ons.&lt;/li&gt;
&lt;li&gt;I packaged my memory system, UAMOS, as a skill: 4 layers, 5 modes, version-controlled. The lessons are at the end.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What a skill is, mechanically
&lt;/h2&gt;

&lt;p&gt;Four core parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A folder at &lt;code&gt;~/.claude/skills/&amp;lt;name&amp;gt;/&lt;/code&gt; for a global skill, or &lt;code&gt;&amp;lt;project&amp;gt;/.claude/skills/&amp;lt;name&amp;gt;/&lt;/code&gt; for a project-local one.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;SKILL.md&lt;/code&gt; file inside it. Required.&lt;/li&gt;
&lt;li&gt;YAML frontmatter at the top with &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt;. Required.&lt;/li&gt;
&lt;li&gt;Markdown instructions below the frontmatter that tell Claude what to do once the skill fires.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Optionally, you add one or more of these. None are required for a working skill:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;references/&lt;/code&gt;: docs Claude reads when it needs depth, but never copies.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;templates/&lt;/code&gt;: files Claude copies into a project, usually with placeholders.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;scripts/&lt;/code&gt;: code Claude runs when a prompt isn't enough.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The mental model that helped me: a skill is to Claude what a cron job is to a server. It sits as files on disk and fires automatically when its trigger condition is met. You don't call it explicitly. You talk to Claude normally and it picks the right skill.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to create one, step by step
&lt;/h2&gt;

&lt;p&gt;Here is a complete, working skill in under 30 lines:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;my-skill&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Run when the user says "do X", "perform Y", or asks for a Z report. Used for ABC purpose.&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# My Skill&lt;/span&gt;

You are a specialist in [whatever]. Your job: [one sentence].

&lt;span class="gu"&gt;## Steps&lt;/span&gt;
&lt;span class="p"&gt;
1.&lt;/span&gt; Read the file at [path].
&lt;span class="p"&gt;2.&lt;/span&gt; Do [thing].
&lt;span class="p"&gt;3.&lt;/span&gt; Output [format].

&lt;span class="gu"&gt;## Hard rules&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Never [the thing that would break trust].
&lt;span class="p"&gt;-&lt;/span&gt; Always [the thing that compounds].
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Save that to &lt;code&gt;~/.claude/skills/my-skill/SKILL.md&lt;/code&gt;, restart Claude Code, and ask for "do X." It fires. That's the entire create-a-skill loop. The hard part isn't the syntax. It's writing a description that actually triggers, which is the first and most important best practice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best practices nobody put in the docs
&lt;/h2&gt;

&lt;p&gt;I burned a few hours on each of these.&lt;/p&gt;

&lt;h3&gt;
  
  
  The description is the trigger, so write it for the matcher
&lt;/h3&gt;

&lt;p&gt;When Claude starts a session, it reads the &lt;code&gt;description&lt;/code&gt; of every installed skill and matches your request against them. The description does two jobs at once:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It tells you, the human, what the skill does.&lt;/li&gt;
&lt;li&gt;It tells Claude, the matcher, when to fire it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My first version of a skill had the description "Manages my project operations." Claude never fired it. Not once. I rewrote it to list the literal phrases I'd type ("set up the project here", "audit this", "rebuild the index") and it fired on the first matching request. Abstract summaries don't match concrete requests. Spend 80% of your effort here. The body is for Claude to follow after the skill is already firing; the description decides whether it fires at all.&lt;/p&gt;
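
&lt;p&gt;Claude's matching is done by the model, so it cannot be unit-tested directly. What can be checked mechanically is whether a description lists literal phrases at all. Below is a hypothetical pre-flight lint, assuming flat &lt;code&gt;key: value&lt;/code&gt; frontmatter and double-quoted trigger phrases. The function names are illustrative and not part of Claude Code.&lt;/p&gt;

```python
import re


def read_frontmatter(skill_md: str) -> dict:
    """Parse flat 'key: value' frontmatter between the first two '---' lines."""
    lines = skill_md.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with '---'")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        # Split on the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("frontmatter is never closed")


def quoted_phrases(description: str) -> list:
    """Return the double-quoted phrases found in a description."""
    return re.findall(r'"([^"]+)"', description)


def lint_description(skill_md: str, min_phrases: int = 2) -> list:
    """Return a list of problems; an empty list means the check passed."""
    fields = read_frontmatter(skill_md)
    problems = []
    for required in ("name", "description"):
        if not fields.get(required):
            problems.append("missing field: " + required)
    phrases = quoted_phrases(fields.get("description", ""))
    if min_phrases > len(phrases):
        problems.append("description lists too few literal trigger phrases")
    return problems
```

&lt;p&gt;Passing this lint does not prove the skill will fire. It only catches the abstract-summary mistake before you restart a session to find out.&lt;/p&gt;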

&lt;h3&gt;
  
  
  One skill, one purpose
&lt;/h3&gt;

&lt;p&gt;Your first skill will be tempted to do five things. Resist. If you find yourself writing three workflows that don't share state, those are three skills, not one. A bloated description has to cover too much, so it matches inconsistently, and editing one part risks breaking the others. When several skills need the same knowledge, put it in one shared skill they all read, rather than copying it into each.&lt;/p&gt;

&lt;h3&gt;
  
  
  Know the difference between references and templates
&lt;/h3&gt;

&lt;p&gt;This one bites people. Templates are files Claude copies into your project, usually with placeholders to fill in. References are files Claude reads for context but never copies. Folder naming carries the meaning: put scaffolds in &lt;code&gt;templates/&lt;/code&gt;, put background docs in &lt;code&gt;references/&lt;/code&gt;. Get it backwards and Claude will either write a reference doc into your project (wrong) or treat a scaffold as read-only and never produce the output (also wrong).&lt;/p&gt;

&lt;h3&gt;
  
  
  Skills don't auto-reload
&lt;/h3&gt;

&lt;p&gt;If you edit a &lt;code&gt;SKILL.md&lt;/code&gt; mid-session, the running Claude Code instance is still using the version it loaded at startup. Restart the session to pick up changes. I lost 20 minutes debugging "why isn't this rule firing" before I realized this. The same applies to brand-new skills: they're discovered at session start.&lt;/p&gt;

&lt;h3&gt;
  
  
  Version-control your skills folder
&lt;/h3&gt;

&lt;p&gt;A skill is codified workflow. If &lt;code&gt;~/.claude/skills/&lt;/code&gt; isn't version-controlled, every machine you work on runs a slightly different version of it, and you stop trusting them. I keep mine in my notes vault and symlink. Committing the folder to a personal repo works just as well. Doing neither is how skills quietly drift apart across machines.&lt;/p&gt;
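
&lt;p&gt;The symlink route can be scripted. This is a sketch that assumes a filesystem with symlink support and a skills folder that already lives in a version-controlled location. &lt;code&gt;link_skills&lt;/code&gt; is an illustrative helper, not a Claude Code command.&lt;/p&gt;

```python
from pathlib import Path


def link_skills(vault_skills: Path, claude_skills: Path) -> Path:
    """Make claude_skills a symlink to the version-controlled vault_skills."""
    if claude_skills.is_symlink():
        # Replace a stale link left over from an earlier setup.
        claude_skills.unlink()
    elif claude_skills.exists():
        # Refuse to clobber a real folder that may hold unsaved skills.
        raise FileExistsError(
            f"{claude_skills} is a real folder; move it into the vault first"
        )
    claude_skills.parent.mkdir(parents=True, exist_ok=True)
    claude_skills.symlink_to(vault_skills, target_is_directory=True)
    return claude_skills
```

&lt;p&gt;On a new machine you would call it once, for example &lt;code&gt;link_skills(Path.home() / "vault" / "skills", Path.home() / ".claude" / "skills")&lt;/code&gt;, where the vault path is a placeholder for wherever your repo is cloned.&lt;/p&gt;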

&lt;h2&gt;
  
  
  The worked example: building UAMOS as a skill
&lt;/h2&gt;

&lt;p&gt;Most of my skills are a single file. UAMOS is the one that needed all the optional parts, so it's the best example of how the pieces fit.&lt;/p&gt;

&lt;p&gt;UAMOS (Universal AI Memory Operating System) is the layer that gives a project memory: it loads consistent context, rules, and an index before any AI session writes code, so the agent stops reinventing things that already exist. It's stack-agnostic. I run it on a React Native codebase, but the same skill installs on Node or Python, because the structure is the same and only a couple of stack-specific rules get swapped at install time.&lt;/p&gt;

&lt;p&gt;It's four layers, and each one is a precondition for the next:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqm1ygj28z3sofd9o1ncs.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqm1ygj28z3sofd9o1ncs.png" alt="UAMOS four-layer architecture diagram (indexing, memory bank, rules, agents, each feeding the next)"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Read top to bottom:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Indexing&lt;/strong&gt; tells the agent where everything lives, so it stops running blind grep and glob.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory bank&lt;/strong&gt; keeps state across sessions in 9 tiered files (hot, warm, cold).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rules&lt;/strong&gt; constrain what the agent is allowed to write, in three tiers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agents&lt;/strong&gt; split the work across 5 specialists, each loading only the layer it needs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The relation is the part that matters: indexing prevents the agent from hallucinating files that don't exist, memory keeps state across sessions, rules constrain the diff, agents specialize the workflow. Pull one layer out and the others degrade.&lt;/p&gt;

&lt;h3&gt;
  
  
  The file tree
&lt;/h3&gt;

&lt;p&gt;UAMOS uses every optional part of the skill spec:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~/.claude/skills/uamos/
├── SKILL.md                    # the brain: modes + hard rules
├── references/                 # docs Claude READS, never copies
│   ├── 4-layer-architecture.md
│   ├── 7-point-checklist.md
│   └── memory-tiering.md
└── templates/                  # scaffolds Claude COPIES into a project
    ├── CLAUDE.md
    ├── globalRules.md
    ├── context_map.md
    ├── memory/   (9 tiered starter files)
    └── rules/    (3 critical rule files)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;SKILL.md&lt;/code&gt; holds the modes and the hard rules. The &lt;code&gt;references/&lt;/code&gt; files explain the system in depth so the &lt;code&gt;SKILL.md&lt;/code&gt; can stay short and point at them when a question needs detail. The &lt;code&gt;templates/&lt;/code&gt; files are working scaffolds with &lt;code&gt;{{PROJECT_NAME}}&lt;/code&gt; placeholders that get filled in during install. Keeping depth in references means the skill loads light every session and only pulls the heavy explanation when a mode actually needs it.&lt;/p&gt;
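
&lt;p&gt;Mechanically, filling a placeholder is a string substitution. The sketch below shows what that step amounts to, assuming placeholders are upper-case names in double braces. UAMOS may well let Claude do this itself rather than run a script, and &lt;code&gt;fill_template&lt;/code&gt; is a hypothetical helper of the kind that could live in &lt;code&gt;scripts/&lt;/code&gt;.&lt;/p&gt;

```python
import re

# Matches placeholders such as {{PROJECT_NAME}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z_]+)\}\}")


def fill_template(text: str, values: dict) -> str:
    """Replace {{NAME}} placeholders; fail loudly on any missing value."""

    def substitute(match):
        key = match.group(1)
        if key not in values:
            # An unfilled placeholder in a rules file is worse than an error.
            raise KeyError("no value for placeholder " + key)
        return values[key]

    return PLACEHOLDER.sub(substitute, text)
```

&lt;p&gt;Raising on a missing value is deliberate: a scaffold that silently ships with a literal &lt;code&gt;{{PROJECT_NAME}}&lt;/code&gt; in it is harder to notice than a failed install.&lt;/p&gt;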

&lt;h3&gt;
  
  
  The modes and the commands that trigger them
&lt;/h3&gt;

&lt;p&gt;A single skill can package several related workflows as modes, each with its own trigger phrases. UAMOS has five, and I drive them in plain language:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;init&lt;/code&gt;&lt;/strong&gt; ("set up UAMOS here") interviews me with five questions about the project, then scaffolds the full folder structure and inventories the existing code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;audit&lt;/code&gt;&lt;/strong&gt; ("audit my memory bank") walks the current setup and reports staleness with a status table, flagging a Hot tier that hasn't been touched this week or an inventory that's drifted from the real file count.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;reindex&lt;/code&gt;&lt;/strong&gt; ("reindex this codebase") rebuilds the inventories after a batch of new code ships.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;progress&lt;/code&gt; / &lt;code&gt;decide&lt;/code&gt; / &lt;code&gt;learn&lt;/code&gt;&lt;/strong&gt; ("append progress", "capture a decision", "capture a lesson") write dated, append-only entries to the right memory file. Memory is sacred here: nothing overwrites, and the skill never writes the progress log unless I trigger it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;migrate&lt;/code&gt;&lt;/strong&gt; ("migrate this project to UAMOS") is &lt;code&gt;init&lt;/code&gt; for a codebase that already has code: it indexes first, preserves any existing AI rules, and fills in the memory bank from the real structure.&lt;/li&gt;
&lt;/ul&gt;
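
&lt;p&gt;The append-only rule behind &lt;code&gt;progress&lt;/code&gt;, &lt;code&gt;decide&lt;/code&gt;, and &lt;code&gt;learn&lt;/code&gt; is simple to state in code. A minimal sketch, assuming each entry is a dated markdown heading followed by the text. The entry format and the &lt;code&gt;append_entry&lt;/code&gt; name are assumptions, not the actual UAMOS layout.&lt;/p&gt;

```python
from datetime import date
from pathlib import Path


def append_entry(memory_file: Path, text: str, day: date) -> None:
    """Append one dated entry; existing content is never rewritten."""
    memory_file.parent.mkdir(parents=True, exist_ok=True)
    # Mode "a" only ever adds to the end of the file.
    with memory_file.open("a", encoding="utf-8") as handle:
        handle.write(f"\n## {day.isoformat()}\n\n{text.strip()}\n")
```

&lt;p&gt;Because the file is opened in append mode, there is no code path that can overwrite an earlier entry, which is the property the "nothing overwrites" rule depends on.&lt;/p&gt;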

&lt;p&gt;Modes are how you package related-but-distinct workflows in one skill without splitting it into five skills with overlapping descriptions. UAMOS is honestly borderline on the one-skill-one-purpose rule, but the modes share enough underlying knowledge of the same file structure that splitting them felt worse than keeping them together. Make that call on purpose, not by accident.&lt;/p&gt;

&lt;h3&gt;
  
  
  What it cost and what it returned
&lt;/h3&gt;

&lt;p&gt;The skill took an afternoon to build, most of it on the templates, not the &lt;code&gt;SKILL.md&lt;/code&gt;. On the project I've run it on longest, the numbers after a month:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Context spend:&lt;/strong&gt; down roughly 91%, because the agent stops searching blindly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hallucinated edits:&lt;/strong&gt; down roughly 93%, because it stops inventing things that already exist.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Setup cost:&lt;/strong&gt; about a day for the full install, an hour or two for a minimal one.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The part that compounds is the feedback loop. A recurring problem in the progress log becomes a new rule. A non-obvious pattern I had to work out gets logged so I don't relitigate it next month. Each session makes the next one cheaper, and that only works because skills are plain files reading and writing other plain files. No database, no orchestration engine.&lt;/p&gt;

&lt;h2&gt;
  
  
  When a skill is overkill
&lt;/h2&gt;

&lt;p&gt;Skills are for workflows you run by hand more than three times. A heavyweight skill like UAMOS is overkill in some cases and worth it in others:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Overkill for:&lt;/strong&gt; a throwaway script, a solo prototype you'll abandon in a week, or any codebase you won't reopen.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worth it for:&lt;/strong&gt; long-lived projects, code that more than one AI tool touches, and anywhere consistency across sessions matters more than raw speed in a single one.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two honest caveats beyond that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Auto-trigger isn't perfect. Even with a sharp description, Claude occasionally misses the match, and the fallback is to invoke the skill by name.&lt;/li&gt;
&lt;li&gt;Skills rot. An inventory drifts, a rule stops being true when the stack changes. Plan on a periodic audit, or you'll stop trusting the system, which defeats the point.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where to start
&lt;/h2&gt;

&lt;p&gt;Don't build something like UAMOS first. Start small.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pick one workflow you run by hand more than three times a month. One specific thing, not "everything I do with Claude."&lt;/li&gt;
&lt;li&gt;Create &lt;code&gt;~/.claude/skills/&amp;lt;name&amp;gt;/SKILL.md&lt;/code&gt; and write the 30-line skeleton from earlier.&lt;/li&gt;
&lt;li&gt;Spend most of your effort on the &lt;code&gt;description&lt;/code&gt;. List the literal phrases you'd type to trigger it.&lt;/li&gt;
&lt;li&gt;Restart Claude Code and test by saying one of those phrases word for word.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If it fires, you have a skill. Build the next one when you catch yourself doing the same thing by hand a fourth time. The memory layer can wait until you have a project worth remembering across sessions.&lt;/p&gt;

&lt;p&gt;I write &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;Code Meet AI&lt;/a&gt;, one issue per week on AI-native mobile development. The UAMOS skeleton bundle (the &lt;code&gt;SKILL.md&lt;/code&gt;, the 9-file memory bank, and the 3 critical rule templates) goes out to subscribers.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is a Claude Code skill in one sentence?
&lt;/h3&gt;

&lt;p&gt;A folder at &lt;code&gt;~/.claude/skills/&amp;lt;name&amp;gt;/&lt;/code&gt; containing a &lt;code&gt;SKILL.md&lt;/code&gt; file with YAML frontmatter (&lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt;) and markdown instructions, which Claude loads at session start and fires when your request matches the description.&lt;/p&gt;

&lt;h3&gt;
  
  
  How do I create a Claude Code skill from scratch?
&lt;/h3&gt;

&lt;p&gt;Create the folder, add a &lt;code&gt;SKILL.md&lt;/code&gt;, write YAML frontmatter where the description lists the actual phrases you'd say to trigger it, write your instructions below, and restart your Claude Code session. The minimum viable skill is about 30 lines.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why isn't my skill firing?
&lt;/h3&gt;

&lt;p&gt;Almost always the description. Claude matches your request against it to decide what to fire, so an abstract description ("manages my project") won't match a concrete request ("set up the project here"). Rewrite it with the literal phrases you'd type, then test by saying one of them verbatim. Also remember skills don't reload mid-session, so restart after editing.&lt;/p&gt;

&lt;h3&gt;
  
  
  What's the difference between references and templates in a skill?
&lt;/h3&gt;

&lt;p&gt;References are files Claude reads for context but never copies. Templates are files Claude copies into your project, usually with placeholders. Put background docs in &lt;code&gt;references/&lt;/code&gt; and scaffolds in &lt;code&gt;templates/&lt;/code&gt;. Confusing the two is one of the most common skill-authoring mistakes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can one skill do more than one thing?
&lt;/h3&gt;

&lt;p&gt;Yes, through modes: distinct workflows packaged in one skill, each with its own trigger phrases. UAMOS has five (init, audit, reindex, append-to-memory, migrate). Use modes when the workflows share underlying knowledge. When they don't, make them separate skills.&lt;/p&gt;

</description>
      <category>claude</category>
      <category>ai</category>
      <category>programming</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to build a second brain with Obsidian and Claude Code (step by step)</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Sat, 30 May 2026 16:18:48 +0000</pubDate>
      <link>https://dev.to/malik_chohra/how-to-build-a-second-brain-with-obsidian-and-claude-code-step-by-step-5gol</link>
      <guid>https://dev.to/malik_chohra/how-to-build-a-second-brain-with-obsidian-and-claude-code-step-by-step-5gol</guid>
      <description>&lt;p&gt;&lt;em&gt;Six folders, one context file, a memory directory, and a handful of slash commands. The exact setup, in build order.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A second brain fails when notes pile up and nobody reads them again. The fix is a layer underneath the notes that an LLM reads for you.&lt;/li&gt;
&lt;li&gt;This is the setup I built in about a day and have run for two weeks: PARA folders, a &lt;code&gt;CLAUDE.md&lt;/code&gt; context file, a memory directory, and slash commands.&lt;/li&gt;
&lt;li&gt;Obsidian holds the markdown. Claude Code reads it, writes to it, and runs commands against it.&lt;/li&gt;
&lt;li&gt;Steps 1 to 4 are the structure. Steps 5 and 6 are the part that makes it stick.&lt;/li&gt;
&lt;li&gt;It is plain markdown in a folder. Nothing is locked in. If Claude disappears tomorrow, you still have your notes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full guide, with a detailed step-by-step walkthrough and the prompt you can use directly to generate your second brain, is &lt;a href="https://choumed.gumroad.com/l/nhgsxf" rel="noopener noreferrer"&gt;here&lt;/a&gt;: &lt;a href="https://choumed.gumroad.com/l/nhgsxf" rel="noopener noreferrer"&gt;https://choumed.gumroad.com/l/nhgsxf&lt;/a&gt;. You can grab it for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is a second brain?
&lt;/h2&gt;

&lt;p&gt;A second brain is a personal knowledge system that lives outside your head. It holds your ideas, decisions, project state, and references in a place you can return to and build on. The term comes from Tiago Forte's book &lt;em&gt;Building a Second Brain&lt;/em&gt;. His original framing assumed a human would read the notes back. Mine assumes an LLM will.&lt;/p&gt;

&lt;p&gt;Two tools do the work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://obsidian.md" rel="noopener noreferrer"&gt;Obsidian&lt;/a&gt; is the storage. A desktop app that opens any folder of markdown files and adds links, search, and a graph view on top. Your files stay on your disk. No cloud unless you turn it on, no proprietary database, no export step. If you want your notes back, you already have them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.anthropic.com/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; is the operator. A tool from Anthropic that runs in your terminal, reads files in a folder you point it at, and runs saved prompts against them. You tell it to read your vault and act. It does.&lt;/p&gt;

&lt;p&gt;The pair is the whole idea. Obsidian makes the notes legible to a human. Claude Code makes them executable by a machine. Neither one alone gets you a second brain. Together, they do.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why most second brains die, and the one change that fixes it
&lt;/h2&gt;

&lt;p&gt;I have started a second brain six times. Roam, Notion, Tana, Logseq, Notion again, a short flirtation with Reflect. All six died the same way. I built the structure on a Sunday, filed notes through Wednesday, and by the next weekend the vault was just another inbox to clean.&lt;/p&gt;

&lt;p&gt;The standard diagnosis is "capture is easy, retrieval is hard." That is correct. You write 200 notes and six months later cannot find the thinking behind the decision you made in March. The graph view looks great in screenshots. It does not answer questions.&lt;/p&gt;

&lt;p&gt;But that diagnosis blames the tool. The real problem is that you were the only retrieval engine. Asking a human to read 500 markdown files back every week is asking them to be a database. They will not do it. The vault rots.&lt;/p&gt;

&lt;p&gt;The fix is to put a reader underneath the notes. Not a human. An LLM that treats your vault as required reading. That is the whole trick. Everything below is how to set it up.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you are actually building
&lt;/h2&gt;

&lt;p&gt;Three parts, and the order matters.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The vault&lt;/strong&gt; is your brain. Plain markdown files in folders. This is Obsidian's job.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code&lt;/strong&gt; is the operator. It reads the vault, writes to it, and runs commands against it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slash commands&lt;/strong&gt; are the interface. They turn the folder from a place you file things into a place you work from.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A vault without the operator is a filing cabinet. The operator without commands is a chat window. You need all three.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: Build the PARA skeleton
&lt;/h2&gt;

&lt;p&gt;Install &lt;a href="https://obsidian.md" rel="noopener noreferrer"&gt;Obsidian&lt;/a&gt; and create a vault. A vault is just a folder. Inside it, create six folders at the top level:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;00-Meta/        &lt;span class="c"&gt;# the operating layer (read first by everything)&lt;/span&gt;
01-Projects/    &lt;span class="c"&gt;# active work with a deadline and an outcome&lt;/span&gt;
02-Areas/       &lt;span class="c"&gt;# ongoing responsibilities, no deadline&lt;/span&gt;
03-Resources/   &lt;span class="c"&gt;# reference material and templates&lt;/span&gt;
04-Archives/    &lt;span class="c"&gt;# done, paused, dead&lt;/span&gt;
05-Daily/       &lt;span class="c"&gt;# one note per day, the journal&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The middle four are Tiago Forte's &lt;a href="https://fortelabs.com/blog/para/" rel="noopener noreferrer"&gt;PARA method&lt;/a&gt;. The first and last folders are additions that a developer's vault needs and PARA does not specify: &lt;code&gt;00-Meta&lt;/code&gt; for the files that run the system, and &lt;code&gt;05-Daily&lt;/code&gt; for the journal.&lt;/p&gt;

&lt;p&gt;Treat &lt;code&gt;01-Projects&lt;/code&gt; the way you treat a codebase. One folder per project. Each gets a &lt;code&gt;progress.md&lt;/code&gt; (a dated log of what shipped) and a &lt;code&gt;roadmap.md&lt;/code&gt; (what is next). Feature first, same as your repo.&lt;/p&gt;
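
&lt;p&gt;If you would rather script the skeleton than click through Obsidian, the layout above fits in a few lines. A sketch only: the folder names come from the listing, while &lt;code&gt;create_vault&lt;/code&gt;, &lt;code&gt;add_project&lt;/code&gt;, and the starter heading written into each file are illustrative choices.&lt;/p&gt;

```python
from pathlib import Path

# The six top-level folders from the listing above.
FOLDERS = [
    "00-Meta",
    "01-Projects",
    "02-Areas",
    "03-Resources",
    "04-Archives",
    "05-Daily",
]


def create_vault(root: Path) -> list:
    """Create the top-level folders; safe to run on an existing vault."""
    created = []
    for name in FOLDERS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created


def add_project(root: Path, name: str) -> Path:
    """Add one project folder with its progress.md and roadmap.md."""
    project = root / "01-Projects" / name
    project.mkdir(parents=True, exist_ok=True)
    for filename in ("progress.md", "roadmap.md"):
        path = project / filename
        # Never overwrite a log that already has entries.
        if not path.exists():
            title = filename[:-3]  # strip the ".md" suffix
            path.write_text(f"# {name}: {title}\n", encoding="utf-8")
    return project
```

&lt;p&gt;Both functions are idempotent, so rerunning them on a vault you already use creates what is missing and leaves existing notes alone.&lt;/p&gt;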

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F41yh7vr1qwpzhmuzo68u.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F41yh7vr1qwpzhmuzo68u.png" alt="second brain" width="800" height="478"&gt;&lt;/a&gt;&lt;br&gt;
The folders took me 30 minutes. Do not spend a Sunday on this. The structure is not the hard part, and it is not where the value is.&lt;/p&gt;
&lt;h2&gt;
  
  
  Step 2: Write CLAUDE.md, the file Claude reads first
&lt;/h2&gt;

&lt;p&gt;This is the step that carries the weight. Create &lt;code&gt;00-Meta/CLAUDE.md&lt;/code&gt;. It is the file Claude Code reads before doing anything else, every session.&lt;/p&gt;

&lt;p&gt;Keep it to 200 to 300 lines. Mine covers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Who I am and what I am working toward right now&lt;/li&gt;
&lt;li&gt;My current projects and how they relate&lt;/li&gt;
&lt;li&gt;How I want Claude to work with me (direct, opinionated, no hedging)&lt;/li&gt;
&lt;li&gt;Voice rules for anything it writes&lt;/li&gt;
&lt;li&gt;A list of canonical source files, in priority order&lt;/li&gt;
&lt;li&gt;Locked decisions that should not be reopened&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here is the shape of it, stripped to a skeleton:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# CLAUDE.md: context for this vault&lt;/span&gt;

&lt;span class="gu"&gt;## Who I am&lt;/span&gt;
[Role, what you are building, what you are optimizing for.]

&lt;span class="gu"&gt;## Current priorities&lt;/span&gt;
[The 3 to 5 things that matter this quarter.]

&lt;span class="gu"&gt;## How to work with me&lt;/span&gt;
[Tone, format, what to push back on.]

&lt;span class="gu"&gt;## Canonical sources (trust these first)&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; CLAUDE.md (this file)
&lt;span class="p"&gt;2.&lt;/span&gt; [Your day plan file]
&lt;span class="p"&gt;3.&lt;/span&gt; [Your active-work file]

&lt;span class="gu"&gt;## Decisions locked (do not reopen)&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [Decision, with date.]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The difference this makes is large. Without it, every session starts cold and you get generic productivity advice. With it, you ask "plan my day" and get a briefing that already knows your deadlines, your constraints, and the decision you locked last week. You stop repeating yourself. That alone is worth the setup.&lt;/p&gt;

&lt;p&gt;Edit this file when decisions change. Twice a week is normal.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: Add the memory layer
&lt;/h2&gt;

&lt;p&gt;CLAUDE.md is static context. It does not change much. But the state of your work changes daily: what shipped, what got renamed, which decision got made in last night's notes.&lt;/p&gt;

&lt;p&gt;That state needs its own home. Claude Code keeps a memory directory per project. Drop one file per fact in there, plus a &lt;code&gt;MEMORY.md&lt;/code&gt; index that lists them all.&lt;/p&gt;

&lt;p&gt;Naming is boring on purpose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;decision_pricing_locked.md
project_app_launch_timeline.md
feedback_always_run_tests_first.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One file, one fact. When you lock a decision in a chat session, write it to memory before the session ends. The next session reads the index, pulls what is relevant, and never asks "wait, what did we decide about pricing?" This is the part that gives the vault continuity. Obsidian gives you the spatial layout. The memory directory is what makes next week start from a richer state than last week.&lt;/p&gt;
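
&lt;p&gt;Keeping the index in sync by hand is the part I would forget, so here is a rough sketch that regenerates it. The function name &lt;code&gt;build_memory_index&lt;/code&gt; and the index layout are assumptions of mine, and the memory directory path is whatever your setup uses. Nothing here is a Claude Code API. It prints one bullet per fact file and skips the index itself.&lt;/p&gt;

```shell
# Print a MEMORY.md index: one bullet per fact file in the given directory.
# The heading and bullet layout are illustrative, not a required format.
build_memory_index() {
  memory_dir="$1"
  printf '%s\n\n' '# MEMORY'
  for file in "$memory_dir"/*.md; do
    [ -e "$file" ] || continue              # directory has no fact files yet
    name="$(basename "$file")"
    [ "$name" = "MEMORY.md" ] || printf '%s\n' "- $name"
  done
}

# Usage: build_memory_index "path/to/memory" | tee "path/to/memory/MEMORY.md"
```

&lt;p&gt;Because the naming convention puts the type first (&lt;code&gt;decision_&lt;/code&gt;, &lt;code&gt;project_&lt;/code&gt;, &lt;code&gt;feedback_&lt;/code&gt;), the alphabetical listing groups facts by type for free.&lt;/p&gt;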

&lt;h2&gt;
  
  
  Step 4: Link notes with wikilinks
&lt;/h2&gt;

&lt;p&gt;Obsidian links files with this syntax: &lt;code&gt;[[Project-Name]]&lt;/code&gt;. Use it everywhere. The rule I follow: when I create a file, I add links to the related project or area before I save it.&lt;/p&gt;

&lt;p&gt;This builds the graph. Obsidian's graph view turns the vault into a visual map of how everything connects.&lt;/p&gt;

&lt;p&gt;The graph looks impressive, and that is the trap. The pretty picture is not the point. The point is that wikilinks let Claude Code traverse relationships. When it reads a project file and sees a link to a decision note, it can follow it without you telling it where to look. The graph is for the machine, not for the screenshot.&lt;/p&gt;
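
&lt;p&gt;To see what "traversable" means in practice, here is a small sketch that lists the notes a file links to. The function name &lt;code&gt;list_wikilinks&lt;/code&gt; is hypothetical. This is plain &lt;code&gt;grep&lt;/code&gt; and &lt;code&gt;sed&lt;/code&gt; over the markdown, not how Claude Code follows links internally.&lt;/p&gt;

```shell
# Print every wikilink target found in a note, one per line, without duplicates.
# Aliases ([[Note|label]]) and heading links ([[Note#Section]]) are reduced
# to the bare note name.
list_wikilinks() {
  grep -o '\[\[[^]]*]]' "$1" |
    sed -e 's/^\[\[//' -e 's/]]$//' -e 's/[|#].*$//' |
    sort -u
}

# Usage: list_wikilinks "01-Projects/example-app/roadmap.md"
```

&lt;p&gt;If that list comes back empty for a project file, the note is an island: nothing reading it can find its way to the decisions behind it.&lt;/p&gt;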

&lt;h2&gt;
  
  
  Step 5: Write the slash commands
&lt;/h2&gt;

&lt;p&gt;This is the step that converts the vault from storage into a system. A slash command in Claude Code is a markdown file in &lt;code&gt;.claude/commands/&lt;/code&gt;. It is a saved prompt. That is all. &lt;code&gt;today.md&lt;/code&gt; becomes &lt;code&gt;/today&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;I run 13. The ones that matter daily:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/context&lt;/code&gt; loads the full vault state at the start of a session. It reads CLAUDE.md, the memory index, and the active project files, then prints a situation report.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/today&lt;/code&gt; produces the day's briefing. It reads the day plan and the active work, then outputs a top priority with the steps under it.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/log&lt;/code&gt; structures the evening journal into the daily note.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/sunday&lt;/code&gt; runs the weekly review.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The rest are thinking tools: &lt;code&gt;/trace&lt;/code&gt; to see how a decision evolved, &lt;code&gt;/challenge&lt;/code&gt; to poke holes in a plan, &lt;code&gt;/drift&lt;/code&gt; to catch where I am slipping from my goals. Start with the four above. Add the others when you feel the need, not before.&lt;/p&gt;

&lt;p&gt;Writing a command is not hard. Open a markdown file, describe what you want Claude to read and what you want it to output, and save it in &lt;code&gt;.claude/commands/&lt;/code&gt;. The first version of mine took an afternoon.&lt;/p&gt;
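
&lt;p&gt;As a starting point, this sketch writes a bare-bones &lt;code&gt;/today&lt;/code&gt; command into the vault. The prompt wording is illustrative, not my actual command, and &lt;code&gt;day-plan.md&lt;/code&gt; and &lt;code&gt;active-work.md&lt;/code&gt; are placeholder file names you should swap for your own canonical sources.&lt;/p&gt;

```shell
# Write a minimal /today command file and echo what was written.
# The prompt text and the two file names it mentions are placeholders.
write_today_command() {
  vault="$1"
  mkdir -p "$vault/.claude/commands"
  printf '%s\n' \
    'Read 00-Meta/CLAUDE.md, then day-plan.md and active-work.md.' \
    'Output one top priority for today with the concrete steps under it.' \
    'Keep it short. Flag anything that conflicts with a locked decision.' |
    tee "$vault/.claude/commands/today.md"
}

# Usage: write_today_command "$HOME/vault"
```

&lt;p&gt;Three lines is enough for a first version. Tighten the prompt after you have read a week of its output, not before.&lt;/p&gt;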

&lt;h2&gt;
  
  
  Step 6: Run the loops
&lt;/h2&gt;

&lt;p&gt;Structure is dead weight without a rhythm. Two loops keep the vault alive.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The daily loop.&lt;/strong&gt; Morning: open Claude Code in the vault, run &lt;code&gt;/context&lt;/code&gt; then &lt;code&gt;/today&lt;/code&gt;. Evening: record a short voice memo about what worked and what broke, paste the transcript into Claude, run &lt;code&gt;/log&lt;/code&gt;. The command writes a structured note into &lt;code&gt;05-Daily/&lt;/code&gt;. A bare daily note looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# 2026-05-22&lt;/span&gt;

&lt;span class="gu"&gt;## Top 3 today&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt;
&lt;span class="p"&gt;2.&lt;/span&gt;
&lt;span class="p"&gt;3.&lt;/span&gt;

&lt;span class="gu"&gt;## Shipped&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; What:

&lt;span class="gu"&gt;## Wins&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt;

&lt;span class="gu"&gt;## Friction&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt;

&lt;span class="gu"&gt;## Notes and ideas&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt;

&lt;span class="gu"&gt;## End of day reflection&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; One thing I would do differently tomorrow?&lt;/span&gt;
&lt;span class="gt"&gt;&amp;gt; What should move into a project file or the inbox?&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No elaborate template. Six headings. The structure exists so &lt;code&gt;/sunday&lt;/code&gt; can read across the week and find patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The weekly loop.&lt;/strong&gt; Sunday: run &lt;code&gt;/sunday&lt;/code&gt;. It reads the week's daily notes, surfaces patterns you missed, and outputs one win, one friction, one thing to change. That output becomes next week's starting context.&lt;/p&gt;
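
&lt;p&gt;The weekly review only works if "the week's daily notes" is easy to resolve. A minimal sketch, assuming each note is named after its date as &lt;code&gt;YYYY-MM-DD.md&lt;/code&gt; (my assumption, based on the note heading above); the function name &lt;code&gt;recent_daily_notes&lt;/code&gt; is hypothetical:&lt;/p&gt;

```shell
# Print the paths of the seven most recent daily notes, oldest first.
# Relies on notes being named YYYY-MM-DD.md, so name order is date order.
recent_daily_notes() {
  for note in "$1"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].md; do
    [ -e "$note" ] || continue              # no daily notes yet
    printf '%s\n' "$note"
  done | sort | tail -n 7
}

# Usage: recent_daily_notes "$HOME/vault/05-Daily"
```

&lt;p&gt;ISO dates are the design choice doing the work here: sorting by file name is the same as sorting by day, so no date parsing is needed.&lt;/p&gt;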

&lt;p&gt;Capture flows in through &lt;code&gt;/log&lt;/code&gt;. Context flows out through &lt;code&gt;/context&lt;/code&gt;. Decisions get locked into memory. That loop is the second brain. The folders were never the hard part.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I built mine
&lt;/h2&gt;

&lt;p&gt;I scaffolded the vault with Claude in an afternoon. Folders, &lt;code&gt;CLAUDE.md&lt;/code&gt;, the first four slash commands. Then I did the harder part. I took my phone to a park, sat in the sun, and opened the Claude mobile app. For about three hours I talked to it like a partner. What I am working on, what is stuck, what I have been avoiding. It asked clarifying questions. I answered. When I got home, I opened the Claude desktop app, pointed it at the vault, and asked it to sync the conversation into the right files. Et voilà. The skeleton I had scaffolded was filled in with my actual life by the evening.&lt;/p&gt;

&lt;p&gt;Structure first, content second. Use the mobile app for the talking, the desktop for the filing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this breaks
&lt;/h2&gt;

&lt;p&gt;The honest section, because tutorials never have one.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CLAUDE.md drifts from reality. Rename a file, forget to update CLAUDE.md, and &lt;code&gt;/context&lt;/code&gt; produces confidently wrong output. Keep a changelog of structural changes. I still forget to use mine about a third of the time.&lt;/li&gt;
&lt;li&gt;The memory directory will outgrow a flat folder. Thirty files is fine. Three hundred will need search or embeddings. The naming convention buys you a long runway, not forever.&lt;/li&gt;
&lt;li&gt;It depends on Claude Code as the reader. If pricing changes sharply or the CLI gets killed, the executable layer evaporates. The mitigation is that your notes are plain markdown and the commands are short prompts. You lose the operator, not the brain.&lt;/li&gt;
&lt;/ul&gt;
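
&lt;p&gt;The first failure is the easiest one to catch mechanically. A rough sketch, assuming your &lt;code&gt;CLAUDE.md&lt;/code&gt; writes file paths in backticks, relative to the vault root (that convention and the function name &lt;code&gt;check_claude_md&lt;/code&gt; are mine, not something Claude Code requires):&lt;/p&gt;

```shell
# Report markdown paths that CLAUDE.md mentions in backticks but that no
# longer exist in the vault. Paths are resolved relative to the vault root.
check_claude_md() {
  vault="$1"
  grep -o '`[^`]*\.md`' "$vault/00-Meta/CLAUDE.md" | tr -d '`' | sort -u |
    while IFS= read -r path; do
      [ -e "$vault/$path" ] || printf 'missing: %s\n' "$path"
    done
}

# Usage: check_claude_md "$HOME/vault"
```

&lt;p&gt;Silence means every referenced file is still there. It will not catch a file whose content has drifted, only one that was renamed or deleted.&lt;/p&gt;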

&lt;h2&gt;
  
  
  Get the prompt that builds this for you
&lt;/h2&gt;

&lt;p&gt;If you want to skip starting from a blank page, paste the Vault Architect prompt into Claude, answer four short rounds of questions, and it builds the whole vault customized to you. Folders, your CLAUDE.md, the four slash commands, the daily template.&lt;/p&gt;

&lt;p&gt;The whole thing is a free kit on Gumroad: this guide plus that prompt. &lt;strong&gt;&lt;a href="https://choumed.gumroad.com/l/nhgsxf" rel="noopener noreferrer"&gt;Get the Second Brain Vault Kit, free.&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I also write &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;Code Meet AI&lt;/a&gt;, a weekly newsletter on AI-native developer workflows. One issue a week, tactical, no fluff.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How do I connect Obsidian to Claude?
&lt;/h3&gt;

&lt;p&gt;Obsidian stores your vault as plain markdown files in a folder. Open Claude Code with that folder as the working directory and it reads the files directly. There is no plugin or API to wire up. The connection is just the shared folder.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do I need Claude Code, or does ChatGPT work?
&lt;/h3&gt;

&lt;p&gt;The folder structure and the CLAUDE.md context file work with any LLM that reads context, including ChatGPT custom instructions and Cursor rules. The slash commands are specific to Claude Code, but they are short markdown prompts you can port to any CLI agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  What folders should a developer second brain have?
&lt;/h3&gt;

&lt;p&gt;Start with PARA: Projects, Areas, Resources, Archives. Add two more that PARA does not specify but a developer needs: a meta folder for the operating files like CLAUDE.md, and a daily folder for the journal. Six folders total. Resist adding more until something genuinely does not fit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Is Obsidian or Notion better for an AI second brain?
&lt;/h3&gt;

&lt;p&gt;Obsidian, for one reason: the vault is plain markdown in a folder you control, so an LLM can read it with no export step. Notion's data lives behind a database and an API, which adds friction. If you are starting fresh, Obsidian plus a local folder is the cleaner path.&lt;/p&gt;

&lt;h3&gt;
  
  
  How long does it take to build?
&lt;/h3&gt;

&lt;p&gt;The folders take 30 minutes. The first CLAUDE.md took me about two hours. The slash commands were an afternoon. Total upfront cost is under a day. The memory directory then fills in on its own as you work.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>obsidian</category>
      <category>claude</category>
    </item>
    <item>
      <title>Firebase Hybrid Inference + Gemini Nano: what changed for React Native at I/O 2026</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Fri, 29 May 2026 14:27:26 +0000</pubDate>
      <link>https://dev.to/malik_chohra/firebase-hybrid-inference-gemini-nano-what-changed-for-react-native-at-io-2026-1935</link>
      <guid>https://dev.to/malik_chohra/firebase-hybrid-inference-gemini-nano-what-changed-for-react-native-at-io-2026-1935</guid>
      <description>&lt;p&gt;Google I/O 2026 was the first keynote in three years where I came out with a different product roadmap than the one I brought in.&lt;/p&gt;

&lt;p&gt;Not because the demos were impressive. Because three announcements have direct implications for product decisions I've been putting off — including specific decisions about my React Native stack. Firebase Hybrid Inference. Gemini Nano in ML Kit. Gemini Spark as a consumer agent. These change what mobile apps (including RN apps) need to do to stay competitive in the next 12 to 18 months.&lt;/p&gt;

&lt;p&gt;Here is what mattered, filtered for mobile builders. There's a React Native-specific section at the bottom with concrete package paths.&lt;/p&gt;

&lt;h2&gt;
  
  
  Is your app's AI running in the right place?
&lt;/h2&gt;

&lt;p&gt;Firebase AI Logic now supports the full Gemini 3.x family. The more important announcement: Hybrid Inference for Android and iOS. Your app decides at runtime whether a given AI task runs locally on the device via Gemini Nano or falls back to the cloud, based on network conditions, device capability, and cost.&lt;/p&gt;

&lt;p&gt;The product implication is real. On-device AI is faster (no round-trip latency), cheaper (no API call), and private (data never leaves the device). Cloud AI handles complex reasoning that changes frequently. Most apps today make this choice once, at architecture time, and stick with it. Hybrid Inference makes the routing dynamic.&lt;/p&gt;

&lt;p&gt;I saw this coming. Small models keep getting more capable while shrinking in size and running faster. When I started working on my AI boilerplate for React Native apps (&lt;a href="https://aimobilelauncher.com" rel="noopener noreferrer"&gt;aimobilelauncher.com&lt;/a&gt;), I wrote an article predicting this hybrid approach: &lt;a href="https://medium.com/stackademic/the-future-of-ai-in-mobile-app-beyond-chatgpt-wrappers-0edb5d656c3a" rel="noopener noreferrer"&gt;The Future of AI in Mobile Apps Beyond ChatGPT Wrappers&lt;/a&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Side note:&lt;/strong&gt; I've opened a first cohort of 20 users — with or without a technical background — who want to launch their mobile apps. Contact me at &lt;a href="mailto:malik@aimobilelauncher.com"&gt;malik@aimobilelauncher.com&lt;/a&gt; for access. You'll get 50% off, and we do the onboarding manually, from first run to shipped app, together.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Users who interact with on-device AI don't wait for a spinner. They get a result in under a second. The apps that feel fast and smart in 2026 will have figured out which tasks belong on-device. The apps that haven't will feel slow by comparison, and users won't know why. They'll just open your competitor's app instead.&lt;/p&gt;

&lt;p&gt;Gemini Nano is already on modern Android devices. This is not a capability you are waiting for. It is available now through Firebase AI Logic, and Gemini Nano's latest version handles audio and image processing on-device too, not just text.&lt;/p&gt;

&lt;h2&gt;
  
  
  What agentic development changes for your team
&lt;/h2&gt;

&lt;p&gt;Google announced Antigravity 2.0: a standalone desktop app and CLI that lets developers orchestrate AI subagents across their workflow. Scaffold a backend, write tests, and manage deployments simultaneously, in a sandboxed environment with credential masking and hardened Git policies.&lt;/p&gt;

&lt;p&gt;If you follow AI development tools, this is Google's answer to Claude Code. The architecture is nearly identical: agents that take on complex multi-step tasks, not just autocomplete. Two major AI companies independently converging on the same design tells you something. This is not a product experiment. This is where software development is going.&lt;/p&gt;

&lt;p&gt;Android Studio went further. It added Agent Skills: modular instruction sets that ground the AI in your specific stack and architecture. Parallel conversation threads, so one agent writes documentation while another debugs test failures. And a Migration Agent that autonomously analyzes React Native or iOS codebases and does the heavy lifting to migrate them to native Kotlin.&lt;/p&gt;

&lt;p&gt;For a technical founder running a small team, the takeaway is simple: teams that adopt agentic workflows will ship faster and with fewer context switches. The developer who spends four hours on scaffolding before writing any real logic is at a structural disadvantage against a team running orchestrated agents. That gap will widen as the tooling matures.&lt;/p&gt;

&lt;p&gt;Since Cursor started gaining momentum, my job has shifted from software engineer to review engineer.&lt;/p&gt;

&lt;h2&gt;
  
  
  Generative UI changes the product iteration speed
&lt;/h2&gt;

&lt;p&gt;Google AI Studio now lets you describe an app idea, generate production Jetpack Compose code, run it in an in-browser Android emulator, push to a physical device via ADB, and deploy to Google Play's internal test track in one flow. They also teased a mobile app version for prototyping on the go.&lt;/p&gt;

&lt;p&gt;The competitive implication is not about you using this tool. It's that your competitors will. The cost of generating a functional-looking UI prototype just dropped to a text prompt. The time between "should we test this product idea" and "we have something running on a device" is now hours, not days.&lt;/p&gt;

&lt;p&gt;Your competitive moat is no longer in the ability to build quickly. It is in the judgment to build the right thing. The founders who use faster prototyping loops to run more product experiments per month will learn faster. The ones who don't will make the same number of bets at a higher cost.&lt;/p&gt;

&lt;p&gt;Nothing will beat Generative UI in mobile apps. Our mobile apps need AI not bolted on top, but as a primary mode of interaction. I started working on a React Native library for that, and the interest and traction so far support the bet. Check it out: &lt;a href="https://getwireai.com/" rel="noopener noreferrer"&gt;getwireai.com&lt;/a&gt;. An example of how I use it: &lt;a href="https://getwireai.com/onboarding/food-recommendation" rel="noopener noreferrer"&gt;food recommendation onboarding&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What your users are about to start expecting
&lt;/h2&gt;

&lt;p&gt;Google announced Gemini Spark: an always-on AI agent that breaks a user's biggest goals into actionable steps across their apps. Daily Brief: an agentic digest that pulls from Gmail, Calendar, and Drive into a single prioritized view. Gemini Omni: video creation and remixing on mobile, directly from a prompt.&lt;/p&gt;

&lt;p&gt;These are consumer features, not developer tools. But they set the expectation floor for what a smart app does. A user with Gemini Spark helping them organize their week will notice, at some level, when your app doesn't do anything proactive for them. Not because they'll articulate it. Because your app will feel passive and static.&lt;/p&gt;

&lt;p&gt;The pattern has a clear history. Apps that felt sophisticated in 2022 had smart push notifications. Apps that felt sophisticated in 2024 had AI chat. The 2026 pattern is agentic: apps that act on behalf of users instead of waiting for taps. You don't need to ship a full agent runtime today. But you need to identify at least one place in your app where proactive AI would replace friction, and plan for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest limitations
&lt;/h2&gt;

&lt;p&gt;Hybrid Inference is Firebase-native. If your stack doesn't include Firebase, you get the pattern but build the routing logic yourself. It's doable. It's not zero work.&lt;/p&gt;

&lt;p&gt;The generative UI tooling in AI Studio generates Jetpack Compose. There is no cross-platform output. Flutter and React Native developers are not the target for that specific feature. The concept travels; the tooling doesn't.&lt;/p&gt;

&lt;p&gt;Gemini Nano on-device is an Android story for now. iOS developers are watching WWDC (early June) to see what Apple does with on-device AI APIs at the OS level. The Android-iOS capability gap on AI features has narrowed over the last 18 months, but it still exists.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to do this week
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Map your app's AI features against the on-device/cloud split.&lt;/strong&gt; Summarization, input validation, short text generation: strong on-device candidates. Complex reasoning over a large context: still cloud. Hybrid Inference is the pattern whether or not you use Firebase.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;If your team isn't running agentic development tools, spend one week on a real task with one.&lt;/strong&gt; The goal isn't to evaluate the tool. It's to learn what changes about your workflow when the AI can orchestrate tasks instead of answering single questions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Find one screen in your app where a proactive AI action would replace a user decision.&lt;/strong&gt; That's your first agent feature candidate.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  If you're building in React Native
&lt;/h2&gt;

&lt;p&gt;The Firebase Hybrid Inference pattern is accessible via &lt;code&gt;@react-native-firebase&lt;/code&gt;. If you want the on-device/cloud routing without pulling in Firebase, &lt;code&gt;react-native-litert-lm&lt;/code&gt; via Nitro Modules handles the on-device leg (Phi-3 Mini, Moondream2) and any cloud API covers the fallback. The routing logic is around 40 lines of TypeScript and doesn't require a Firebase dependency.&lt;/p&gt;

&lt;p&gt;Gemini Nano via ML Kit GenAI APIs will reach React Native through the &lt;code&gt;@react-native-ml-kit&lt;/code&gt; binding path. I could not confirm an official timeline for Gemini Nano GenAI API support in that binding; check Callstack or the ml-kit-rn repo for the current status. Today, &lt;code&gt;react-native-litert-lm&lt;/code&gt; covers the same on-device capability.&lt;/p&gt;

&lt;p&gt;Antigravity 2.0 is worth watching as a comparison point to Claude Code, but it doesn't replace Claude Code for RN development. The Claude Code + UAMOS workflow already gives you subagent orchestration, memory banking across hot/warm/cold tiers, and sandboxed execution. If you're running that workflow, I/O 2026 confirmed you're on the right architecture.&lt;/p&gt;

&lt;p&gt;Agent Skills in Android Studio map to the same pattern as Claude Code skills and the UAMOS memory bank: domain-specific instruction sets that ground the model in your specific codebase. If you haven't set this up for your RN project yet, that's the highest-leverage AI tooling investment you can make right now.&lt;/p&gt;

&lt;p&gt;The generative UI announcement (Jetpack Compose generation in AI Studio) is Android-specific. For React Native, Wire RN is the equivalent component model: LLM outputs structured JSON, Wire RN renders native components. MIT licensed, 15-minute quickstart at &lt;a href="https://getwireai.com" rel="noopener noreferrer"&gt;getwireai.com&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What did Google announce at I/O 2026 that matters for mobile app founders?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The highest-impact announcements for product decisions: Firebase Hybrid Inference (on-device Gemini Nano plus cloud fallback routing for Android and iOS), Gemini Nano in ML Kit GenAI APIs for on-device multimodal processing, and Gemini Spark as a consumer always-on AI agent. On the development side: Antigravity 2.0 for agentic coding workflows and Agent Skills in Android Studio.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Firebase Hybrid Inference and how does it work?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It routes AI tasks between on-device Gemini Nano and cloud processing at runtime, deciding based on network conditions, device capability, and cost. Available through Firebase AI Logic for Android and iOS apps. If your stack doesn't include Firebase, the routing pattern is replicable with any on-device model package and a cloud API.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Gemini Spark and what does it mean for my app?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Gemini Spark is an always-on AI agent that breaks user goals into actionable steps across apps. It represents a shift in user expectations: apps that proactively act on behalf of users rather than waiting for interaction. Not every app needs a full agent runtime, but every mobile product should now have a clear answer to where it will add proactive AI value.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is Google Antigravity 2.0?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Google's standalone agent harness for development, co-optimized for Gemini 3.5 Flash. Developers orchestrate subagents to handle complex workflows simultaneously, in a sandboxed environment with credential masking and Git policy enforcement. It's structurally the same model as Claude Code's agentic development workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Should I migrate my React Native app to Kotlin after the Android Studio Migration Agent?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Probably not as a primary initiative. The Migration Agent will get clean codebases a significant percentage of the way, but production apps with years of history still require substantial manual work after the automated pass. More relevant question: is your React Native app using the on-device AI capabilities that are available now?&lt;/p&gt;




&lt;p&gt;I write &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;Code Meet AI&lt;/a&gt; weekly on AI-first mobile development, with a focus on where AI and mobile products actually intersect. If you want the local-vs-cloud LLM decision framework I use for routing between on-device and cloud AI calls, subscribe and reply to the newsletter and I'll send it.&lt;/p&gt;

&lt;p&gt;If you want to think out loud about your AI mobile stack, I run a &lt;a href="https://www.casainnov.com/services/vibe-coding" rel="noopener noreferrer"&gt;Vibe Coding service at CasaInnov&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>gemini</category>
      <category>mobile</category>
      <category>googleio</category>
    </item>
    <item>
      <title>qwen2.5-coder is too slow for Claude Code on a Mac. Here's the fix.</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Sat, 23 May 2026 14:55:49 +0000</pubDate>
      <link>https://dev.to/malik_chohra/qwen25-coder-is-too-slow-for-claude-code-on-a-mac-heres-the-fix-5c99</link>
      <guid>https://dev.to/malik_chohra/qwen25-coder-is-too-slow-for-claude-code-on-a-mac-heres-the-fix-5c99</guid>
      <description>&lt;p&gt;&lt;em&gt;Claude Code does not care where the model lives. Point it at a local model and it works with no network. I tested that at 35,000 feet, picked the wrong model first, and swapped mid-flight.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Claude Code reads two environment variables to decide where its model lives. Point them at Ollama and it runs fully offline.&lt;/li&gt;
&lt;li&gt;I tested this on a real flight. Berlin, May 13, wifi off, cabin door closed.&lt;/li&gt;
&lt;li&gt;I started on &lt;code&gt;qwen2.5-coder:14b&lt;/code&gt;. It was too slow for anything agentic. One tool call sat for 25 seconds, the next for 52.&lt;/li&gt;
&lt;li&gt;I switched to &lt;code&gt;gemma4:26b&lt;/code&gt;. That one carried the session.&lt;/li&gt;
&lt;li&gt;Local is for offline work, privacy-sensitive code, and cheap drafting. Cloud is still better for heavy reasoning and large-context tasks.&lt;/li&gt;
&lt;li&gt;The install takes 20 minutes once. After that, switching models is one command.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The setup, in one paragraph
&lt;/h2&gt;

&lt;p&gt;Ollama runs an open-weights model on your laptop. Claude Code points at Ollama instead of Anthropic's servers. No network call leaves the machine. The cloud account is irrelevant for that session. The only real decision is which local model you run, and that decision is where I got it wrong the first time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why offline beats "just use a smaller cloud model"
&lt;/h2&gt;

&lt;p&gt;Before the setup, the three objections I get every time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"Just don't code on a plane."&lt;/strong&gt; A flight is six uninterrupted hours. No social media, no notifications, nothing that pulls focus. That is rare now. Throwing it away because your LLM needs wifi, when the wifi problem is fixable, is a planning failure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Just use Copilot offline."&lt;/strong&gt; Copilot's local mode does completions. Anything context-heavy still hits the network. The moment you ask for the work that justifies an AI assistant, you are back online.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;"Just use a smaller cloud model."&lt;/strong&gt; Haiku and GPT-4o-mini still live in the cloud. Smaller is not local. No network, no inference. Same failure, smaller bill.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Local is the only setup that runs at 35,000 feet. It also runs on a train through a tunnel, in a cafe with broken wifi, and on the morning the OpenAI status page goes red. The flight is just the stress test.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you need
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A Mac on Apple Silicon (M1 or newer). Linux and Windows via WSL2 work with minor changes.&lt;/li&gt;
&lt;li&gt;Claude Code installed and already authenticated against your cloud account.&lt;/li&gt;
&lt;li&gt;About 16 GB of unified memory. 32 GB if you want the larger models to run comfortably.&lt;/li&gt;
&lt;li&gt;Homebrew, for the Ollama install.&lt;/li&gt;
&lt;li&gt;20 minutes the first time. Roughly 90 seconds every time after.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1 — Pull the model before you fly
&lt;/h2&gt;

&lt;p&gt;Install Ollama and pull a model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew &lt;span class="nb"&gt;install &lt;/span&gt;ollama
ollama pull qwen2.5-coder:14b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do this on home wifi the night before. The pull is around 9 GB. Airport wifi and hotspots will not cooperate, and finding that out at the gate is its own small tragedy.&lt;/p&gt;

&lt;p&gt;Confirm it landed:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This was my mistake, so I will be blunt about it: I prepped &lt;code&gt;qwen2.5-coder:14b&lt;/code&gt; because it is the model every "local LLM for coding" post recommends. Pull more than one. You will see why in Step 4.&lt;/p&gt;
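
&lt;p&gt;A minimal sketch of pulling a backup model alongside the first one, assuming the &lt;code&gt;gemma4:26b&lt;/code&gt; tag used later in this post and enough free disk for both:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Backup model for agentic work (about 17 GB on disk, per Step 5)
ollama pull gemma4:26b

# Both tags should now be listed
ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;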

&lt;h2&gt;
  
  
  Step 2 — Point Claude Code at Ollama
&lt;/h2&gt;

&lt;p&gt;Start the Ollama server in one terminal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then in a new terminal, launch Claude Code against your local model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama launch claude &lt;span class="nt"&gt;--model&lt;/span&gt; qwen2.5-coder:14b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



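&lt;p&gt;Wrap that in two shell aliases so the rest of your workflow has named modes. The local alias below uses &lt;code&gt;gemma4:26b&lt;/code&gt;, the model I ended up on in Step 5; substitute whichever model you pulled. Add these to &lt;code&gt;~/.zshrc&lt;/code&gt;:&lt;br&gt;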
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;alias &lt;/span&gt;claude-local&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'ollama launch claude --model gemma4:26b'&lt;/span&gt;
&lt;span class="nb"&gt;alias &lt;/span&gt;claude-cloud&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'claude'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then &lt;code&gt;source ~/.zshrc&lt;/code&gt;. That is the entire switching layer.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;claude-local&lt;/code&gt; runs offline against Ollama. &lt;code&gt;claude-cloud&lt;/code&gt; runs against the real Anthropic API. Two commands, one decision per session.&lt;/p&gt;
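
&lt;p&gt;If &lt;code&gt;ollama launch&lt;/code&gt; is not available in your Ollama version, the same wiring can probably be done by hand. Treat this as a sketch, not the launcher's documented internals: it assumes Ollama serves an Anthropic-compatible API on its default port 11434, and that Claude Code honours the &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; and &lt;code&gt;ANTHROPIC_AUTH_TOKEN&lt;/code&gt; environment variables. The token value is a placeholder.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Assumption: Ollama is serving on its default port, 11434.
# The variables apply to this one command only, so claude-cloud is unaffected.
ANTHROPIC_BASE_URL=http://localhost:11434 \
ANTHROPIC_AUTH_TOKEN=ollama \
claude --model gemma4:26b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Setting the variables inline rather than exporting them keeps the cloud configuration of the same shell untouched.&lt;/p&gt;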

&lt;h2&gt;
  
  
  Step 3 — Verify on the ground
&lt;/h2&gt;

&lt;p&gt;Prove the setup works in airplane mode before you board anything. This is non-negotiable. Discovering a missing step at altitude is bad theater with no exits.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Make sure &lt;code&gt;ollama serve&lt;/code&gt; is running.&lt;/li&gt;
&lt;li&gt;Turn wifi off. Actually off, not "disconnected from this network."&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;claude-local&lt;/code&gt; and point it at a real file.&lt;/li&gt;
&lt;li&gt;Confirm a real answer comes back.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kn0chd0mlxvhqvx1y1z.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7kn0chd0mlxvhqvx1y1z.png" alt="Terminal screenshot of `claude-local` successfully answering with wifi off" width="800" height="569"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If it loads your project and answers with wifi off, it will work on the plane.&lt;/p&gt;
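
&lt;p&gt;The first check in the list above can be scripted. A minimal sketch for &lt;code&gt;~/.zshrc&lt;/code&gt;, assuming Ollama's default port 11434 and that its root URL answers with the text "Ollama is running". &lt;code&gt;claude_preflight&lt;/code&gt; is a hypothetical name, not part of Ollama or Claude Code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: run it with wifi already off.
claude_preflight() {
  # Assumption: default Ollama port. Adjust if you changed OLLAMA_HOST.
  if curl -s --max-time 2 http://localhost:11434/ | grep -q "Ollama is running"; then
    echo "ollama server: up"
  else
    echo "ollama server: down, start it with: ollama serve"
    return 1
  fi
  # List local models so a missing pull shows up before boarding
  ollama list
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;It only covers the server and the model list. The real test is still steps 3 and 4: an actual answer on an actual file.&lt;/p&gt;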

&lt;h2&gt;
  
  
  Step 4 — The flight: qwen2.5-coder was too slow
&lt;/h2&gt;

&lt;p&gt;The best move I made was running the model without wifi on the ground first and measuring real performance. Every forum I read pointed at &lt;code&gt;qwen2.5-coder&lt;/code&gt;. I trusted them. They were wrong for this job.&lt;/p&gt;

&lt;p&gt;File reads were fine. Short explanations were fine. Then the model tried anything agentic, and the wait times stopped being a rounding error.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ehzlobvo5nshn0le84i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ehzlobvo5nshn0le84i.png" alt="Terminal screenshot showing slow tool-call wait times on qwen2.5-coder during the flight" width="800" height="591"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;One tool call crunched for 25 seconds. An earlier step had sat at 52. For a single step in a loop that needs five or six of them, that is not a workflow. That is staring at a terminal while the person next to you finishes a movie.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;qwen2.5-coder:14b&lt;/code&gt; is a fine model for single-shot edits. For the multi-step tool loop that Claude Code actually runs, on this hardware, it could not keep up. The model every post recommends was the wrong call for the job I had.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5 — The swap: gemma4:26b carried the session
&lt;/h2&gt;

&lt;p&gt;I had pulled a second model before the flight, exactly because I did not fully trust the first one. So I switched to &lt;code&gt;gemma4:26b&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkuy964sho11tpx5eu2ue.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkuy964sho11tpx5eu2ue.png" alt="Terminal screenshot of `ollama pull gemma4:26b`" width="572" height="111"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Bigger model, 17 GB on disk, and on this MacBook it was the difference between a demo and a tool. The tool loop ran at a speed I would actually choose. The gap analysis completed. Multi-step reasoning held together instead of stalling halfway.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgcehsv89pa0f6hv8guyn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgcehsv89pa0f6hv8guyn.png" alt="Terminal screenshot of gemma4:26b running a real Claude Code gap-analysis task" width="800" height="564"&gt;&lt;/a&gt;&lt;/p&gt;
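
&lt;p&gt;Swapping models mid-session is easier if the model tag is a parameter instead of being baked into the alias. A small sketch for &lt;code&gt;~/.zshrc&lt;/code&gt;; &lt;code&gt;claude_local_with&lt;/code&gt; is a hypothetical name, and the default tag is simply the one that carried this session:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Hypothetical helper: claude_local_with [model-tag]
claude_local_with() {
  # Fall back to gemma4:26b when no tag is given
  ollama launch claude --model "${1:-gemma4:26b}"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With that in place, &lt;code&gt;claude_local_with qwen2.5-coder:14b&lt;/code&gt; tries the smaller model, and a bare &lt;code&gt;claude_local_with&lt;/code&gt; goes back to the default.&lt;/p&gt;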

&lt;p&gt;Honest scorecard for the flight: roughly 70 percent of my normal Claude Code workflow worked on &lt;code&gt;gemma4:26b&lt;/code&gt;. The 30 percent that did not was the heavy "go reason across the whole repo" pattern, which is cloud territory anyway. For six hours of focus on a known task, it was a real working setup, not a downgrade.&lt;/p&gt;

&lt;p&gt;Because I already had a tight context-engineering setup with optimised token consumption, it ran smoothly. The Mac started lagging briefly when I had Xcode and Antigravity open alongside, but closing those and cleaning up Chrome tabs sorted it. If you want the context-engineering side, the U-AMOS write-up is here: &lt;a href="https://codemeetai.substack.com/p/i-spent-6-months-losing-fights-with-ai" rel="noopener noreferrer"&gt;I spent 6 months losing fights with AI in React Native. Then I built U-AMOS.&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Practical tip: install the OneTab Chrome extension. Collapse open tabs into a list when you start a focus session. RAM frees up immediately and so does your attention. &lt;a href="https://chromewebstore.google.com/detail/onetab/chphlpgkkbolifaimnlloiipkdnihall" rel="noopener noreferrer"&gt;OneTab on the Chrome Web Store&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Which local model should you actually run?
&lt;/h2&gt;

&lt;p&gt;The lesson from the flight changed my default. Here is the short list I keep now:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8yymbjy3gtjdli6zi4wx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8yymbjy3gtjdli6zi4wx.png" alt="Model picker comparison card — Devstral / Qwen3-Coder / Gemma 4 / Llama 3.3 " width="800" height="591"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Devstral Small (24B)&lt;/strong&gt; — built for agentic coding, multi-file edits, tool use. Currently the strongest open-source option on SWE-bench.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qwen3-Coder (30B)&lt;/strong&gt; — RL-trained on SWE-bench, native tool calling, large context. The successor to the model that failed me, and it is a real upgrade.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Gemma 4 (4B to 31B)&lt;/strong&gt; — the best size-to-capability ratio. The 26b variant is what saved my flight.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Llama 3.3 (70B)&lt;/strong&gt; — solid general coding and stable tool calling if your machine can carry it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Notice what is not on that list: &lt;code&gt;qwen2.5-coder&lt;/code&gt;. That is not an accident. Pick a model that is RL-trained for tool use, not just code completion. Claude Code lives or dies on the tool loop.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to use local vs cloud
&lt;/h2&gt;

&lt;p&gt;After running both for weeks, the rule is simple.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reach for &lt;code&gt;claude-local&lt;/code&gt; when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;There is no network. Planes, trains, dead cafes, conference wifi.&lt;/li&gt;
&lt;li&gt;The code is privacy-sensitive. Client work under NDA, anything you do not want crossing a vendor boundary.&lt;/li&gt;
&lt;li&gt;You are drafting and iterating prompts before spending cloud tokens on the real run. Local cost stays at zero.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reach for &lt;code&gt;claude-cloud&lt;/code&gt; when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The work is multi-tool and agentic. Subagents, MCP calls, parallel reads.&lt;/li&gt;
&lt;li&gt;The task needs large context. Whole-repo refactors, "explain this project."&lt;/li&gt;
&lt;li&gt;The output ships to production. The polish gap between a local model and cloud Claude is real.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You do not pick once and live there. The aliases exist so you can switch inside a single session. Draft offline, land, run &lt;code&gt;claude-cloud&lt;/code&gt; for the high-stakes execution.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this breaks
&lt;/h2&gt;

&lt;p&gt;The honest section, because AI-generated tutorials never have one.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool use is the weak point.&lt;/strong&gt; Even good local models are less reliable than cloud Claude at chaining many tool calls. Expect rough edges if your workflow leans hard on subagents and MCP servers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context windows are smaller.&lt;/strong&gt; Sessions that try to load the entire repo will choke. Scope to the files in play, not the whole tree.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Battery drains faster.&lt;/strong&gt; Running a 26B model while your editor and browser are open will eat the battery noticeably quicker than cloud Claude. Plan for it on a long offline session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The endpoint shape is a soft contract.&lt;/strong&gt; Ollama's responses are close to Anthropic's, not identical. Most coding requests work. If you hit a strange parsing error mid-stream, that mismatch is usually why, and &lt;code&gt;claude-cloud&lt;/code&gt; is the fix in the moment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model versioning is your job now.&lt;/strong&gt; Ollama makes pulling easy, but you decide when to upgrade and which variant. Keep a note of what you run and why.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where to go next
&lt;/h2&gt;

&lt;p&gt;This offline setup is one of three layers in a full AI-coding stack: cloud LLMs for heavy reasoning, local LLMs for offline and private work, and on-device LLMs for the mobile apps you ship to users. The on-device side for React Native is its own problem, covered in the Phi-3 Mini integration walkthrough. All three ship pre-wired in the &lt;a href="https://aimobilelauncher.com" rel="noopener noreferrer"&gt;AI Mobile Launcher&lt;/a&gt; AI Pro tier, so you are not assembling this from scratch.&lt;/p&gt;

&lt;p&gt;I packaged the rest of this into the &lt;strong&gt;Local LLM with Claude Code bundle&lt;/strong&gt;: the paste-ready zshrc aliases plus a &lt;code&gt;claude-status&lt;/code&gt; helper, the Ollama config tuned for Apple Silicon, the model-picker matrix, and a pre-flight checklist so the setup is never a surprise at altitude. Reply to the &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;Code Meet AI newsletter&lt;/a&gt; and I will send it.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Can I run Claude itself locally?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. Claude is closed-weight, so there is no local-runnable Claude. This setup uses Claude Code, the CLI, with an open-weights model like Gemma 4 or Devstral serving the inference. The CLI is the interface, the model is whatever endpoint you point it at.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is the best local LLM for coding with Claude Code?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For the agentic tool loop Claude Code runs, pick a model RL-trained for tool use: Devstral Small, Qwen3-Coder, or Gemma 4. Avoid older completion-tuned models like &lt;code&gt;qwen2.5-coder&lt;/code&gt;. They handle single edits fine but stall on multi-step work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does Claude Code airplane mode actually work with no signal?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Yes. With Claude Code pointed at local Ollama, no request leaves your laptop. I ran a full session at 35,000 feet with wifi off. The only requirement is pulling the model in advance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Ollama and not LM Studio or llama.cpp?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Ollama wraps llama.cpp with a clean HTTP API on a known port (11434 by default). LM Studio works too but is GUI-first. Direct llama.cpp gives more control and more setup pain. Ollama is the path of least resistance for getting this running in under 30 minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Will I get the same code quality as cloud Claude?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. A good local model is excellent for syntax-level work: refactors, cleanup, rewriting a hook. For plan-heavy or reasoning-heavy tasks the gap is large. Use cloud for design, local for execution, or use local to draft and cloud to polish.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Malik Chohra — 9 yrs software, 7 in React Native. Building &lt;a href="https://getwireai.com" rel="noopener noreferrer"&gt;Wire RN&lt;/a&gt;, &lt;a href="https://aimobilelauncher.com" rel="noopener noreferrer"&gt;AI Mobile Launcher&lt;/a&gt;, and &lt;a href="https://codemeetai.substack.com" rel="noopener noreferrer"&gt;Code Meet AI&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>claudecode</category>
      <category>ollama</category>
      <category>ai</category>
      <category>llm</category>
    </item>
    <item>
      <title>AI didn't cause 2026's layoffs. History predicts more developers.</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Thu, 14 May 2026 12:25:49 +0000</pubDate>
      <link>https://dev.to/malik_chohra/ai-didnt-cause-2026s-layoffs-history-predicts-more-developers-4bem</link>
      <guid>https://dev.to/malik_chohra/ai-didnt-cause-2026s-layoffs-history-predicts-more-developers-4bem</guid>
      <description>&lt;p&gt;&lt;strong&gt;Andrew Ng is right: there is no AI jobpocalypse. The Jevons paradox, BLS projections, and CEO behavior all point the same direction.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Andrew Ng's call: software engineering hiring stays strong despite being the sector most affected by AI tools.&lt;/li&gt;
&lt;li&gt;US BLS projects 15% software developer growth from 2024 to 2034, vs. 3% for all occupations. AI is cited as a demand driver.&lt;/li&gt;
&lt;li&gt;Jevons paradox and Bessen's ATM-teller research show cheaper tools historically expand employment, not shrink it.&lt;/li&gt;
&lt;li&gt;For builders: learn AI tools to compound your skills, then build distribution before it commoditizes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Most 2026 tech layoffs framed as AI efficiency are not about AI replacing workers. They're a mix of post-COVID over-hire correction, slowing revenue growth, and the need to fund $700 billion in AI capital expenditure. Andrew Ng's argument that there is no AI jobpocalypse is supported by US Bureau of Labor Statistics projections of 15% growth in software developer employment through 2034. Historically, when a productive input gets cheaper, total consumption expands. That's the Jevons paradox, observed since 1865 and confirmed in ATMs, spreadsheets, and compilers. AI is making building cheaper. The lesson for developers: learn the new tools, then learn distribution.&lt;/p&gt;

&lt;p&gt;I keep getting DMs from senior devs panicking about the layoffs. The memos all say AI. The framing all says "efficiency." Most of them are reading the memo wrong and acting on the wrong lesson.&lt;/p&gt;

&lt;p&gt;A friend got laid off from a Series B last month. His memo cited "AI-driven productivity gains." He spent the next two weeks trying to ramp up on Claude Code at speed because he thought he was behind. The real reason his role got cut? His company missed its Q4 revenue target and the AI line read better in board decks than the slowdown line did.&lt;/p&gt;

&lt;p&gt;The 2026 layoff story is the cleanest example I've seen of a press release winning over a spreadsheet.&lt;/p&gt;

&lt;h2&gt;
  
  
  What companies are saying vs. what their CEOs admit
&lt;/h2&gt;

&lt;p&gt;Andy Jassy was unusually honest on Amazon's Q3 2025 earnings call. Asked about the largest layoff in Amazon's 31-year history, he said the cuts "were not really financially driven, and it's not even really AI-driven, not right now at least. It's culture."&lt;/p&gt;

&lt;p&gt;Three months later, Beth Galetti's formal layoff memo at Amazon talked about reducing layers, increasing ownership, and removing bureaucracy. AI was not mentioned once. By spring 2026, the continued cuts (now 30,000 corporate roles since October 2025) had been absorbed into the broader industry narrative of AI-driven efficiency.&lt;/p&gt;

&lt;p&gt;The CEO of the company doing the largest cuts said publicly it wasn't AI. The press and the market treated it as AI anyway.&lt;/p&gt;

&lt;p&gt;The script is consistent across announcements. Meta cut 8,000 in April 2026 to "offset the other investments we're making." Block cut 4,000 with Jack Dorsey citing intelligence tools paired with smaller and flatter teams. Snap cut 16% citing rapid advancements in AI. Salesforce cut customer support from 9,000 to 5,000 with Marc Benioff saying AI agents handle 50% of interactions. Microsoft offered buyouts to 8,750 US employees.&lt;/p&gt;

&lt;p&gt;Through April 2026, AI has been cited as a factor in 49,135 announced job cuts, per Challenger, Gray &amp;amp; Christmas. The narrative is dominant. The math doesn't support it.&lt;/p&gt;

&lt;p&gt;Andrew Ng named the mechanism directly in his May 2026 &lt;a href="https://www.deeplearning.ai/the-batch/issue-352" rel="noopener noreferrer"&gt;Batch letter&lt;/a&gt;: "Businesses have a strong incentive to talk about layoffs as if they were caused by AI. Talking about how they're using AI to be far more productive with fewer staff makes them look smart. This is a better message than admitting they overhired during the pandemic when capital was abundant due to low interest rates and a massive government financial stimulus."&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4bqdka5yk74cnnut4nn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4bqdka5yk74cnnut4nn.png" alt="Memes about Latest Layoffs" width="568" height="908"&gt;&lt;/a&gt;&lt;br&gt;
That sentence describes most of the layoffs of 2026.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three things driving 2026's layoffs
&lt;/h2&gt;

&lt;p&gt;Strip away the AI narrative and three things are happening at once across the companies announcing the largest cuts.&lt;/p&gt;

&lt;h3&gt;
  
  
  COVID over-hiring is still being unwound
&lt;/h3&gt;

&lt;p&gt;Amazon hired aggressively from 2019 to 2022, growing global headcount from 798,000 to 1.6 million. Meta doubled. Microsoft, Google, Salesforce all hired into pandemic-era demand assumptions that didn't survive 2023.&lt;/p&gt;

&lt;p&gt;Block expanded headcount aggressively through 2021 to 2023, building parallel teams across Square and Cash App. The 40% cut Dorsey announced in February 2026 is mostly Block returning to roughly its 2020 size. The "AI-native, flatter teams" language is the public-facing wrapper around what is, structurally, a duplicate-org cleanup.&lt;/p&gt;

&lt;p&gt;Marc Andreessen, hardly a layoff skeptic, attributed recent cuts to "higher interest rates and a complete loss of discipline in hiring during the pandemic. The hiring binge that companies went on in COVID was just wild." This is the same Marc Andreessen whose firm is one of the loudest voices on AI replacing work. Even he won't credit AI for the current wave.&lt;/p&gt;

&lt;p&gt;"We over-hired" is a flat story. "AI made us more efficient" is a forward-looking transformation story. Same cuts, different press release.&lt;/p&gt;

&lt;h3&gt;
  
  
  Revenue is slowing and Chinese competition is squeezing margins
&lt;/h3&gt;

&lt;p&gt;Some of the most aggressive layoffs aren't at hyperscalers building data centers. They're at companies losing market share or facing structural revenue pressure.&lt;/p&gt;

&lt;p&gt;PayPal's cuts followed slowing revenue growth, stalled active-user counts, and competition from Stripe, Apple, Visa, and Mastercard. Coinbase rode the 2021 crypto boom, cut during the 2022 winter, rehired into the next cycle, then framed 2026 cuts around "AI-native teams." The underlying driver is the volatility of crypto demand, not a productivity unlock.&lt;/p&gt;

&lt;p&gt;Chegg cut 45% of its workforce in October 2025 because students stopped using it. They use ChatGPT instead. That is a real AI-driven layoff, but in the inverse sense: AI killed the product, not the headcount of an "efficient" company.&lt;/p&gt;

&lt;p&gt;The macro backdrop matters. US GDP grew just 0.5% in Q4 2025 before rebounding to 2.0% in Q1 2026. The Conference Board's Leading Economic Index declined 0.6% in March 2026. The Challenger, Gray &amp;amp; Christmas tracker tells the cleanest version of the story: the most-cited reason for 2026 layoffs is "market and economic conditions" at 53,058 cuts, more than double the AI count of 21,490 in the same period.&lt;/p&gt;

&lt;p&gt;Then there's China. DeepSeek V4 Pro is priced at $1.74 / $3.48 per million input/output tokens. Claude Opus 4.7 sits at $5 / $25. GPT-5.5 at $5 / $30. RAND research puts Chinese model costs at one-sixth to one-fourth of comparable US systems. When the cost of your most strategic capability is being undercut by a factor of four to six by an open-source competitor, you cut headcount somewhere. You don't blame DeepSeek in your layoff memo. You say "AI efficiency."&lt;/p&gt;

&lt;h3&gt;
  
  
  AI capex is eating the room
&lt;/h3&gt;

&lt;p&gt;This is the story most companies don't want to tell directly: they need to fund $700 billion of capex, and headcount is the easiest line to cut.&lt;/p&gt;

&lt;p&gt;The four largest hyperscalers (Amazon, Microsoft, Alphabet, Meta) are projected to spend $725 billion on capex in 2026, up 77% year over year. Roughly 75% is AI-specific. Capital intensity at hyperscalers is now 45 to 57% of revenue, ratios that look like utility companies, not software companies.&lt;/p&gt;

&lt;p&gt;Meta is the cleanest case. The company is planning $125 to $145 billion in 2026 capex, per the January 29 earnings call. The 8,000 layoffs free roughly $2.4 billion in annual run-rate operating expense. That is about 1.7% of the capex bill at its $145 billion upper end. Even fully replacing the workforce with AI would save about $27 billion, a fraction of the $145 billion infrastructure spend.&lt;/p&gt;

&lt;p&gt;Meta's Q1 2026 still printed $56.3 billion in revenue (up 33% year over year), 41% operating margins, and $10.44 EPS. This is not a company in distress. The cuts aren't about AI productivity. They're about creating room on the income statement for a capex bill growing roughly twice as fast as revenue.&lt;/p&gt;

&lt;p&gt;Larry Page reportedly told colleagues: "I'm willing to go bankrupt rather than lose this race." That is the actual posture inside hyperscalers. The AI-framed layoffs are the public face of that posture.&lt;/p&gt;

&lt;p&gt;There's a fourth incentive Ng called out that's worth reading directly. AI companies anchor their pricing to salaries rather than SaaS norms. A SaaS tool charges $100 to $1,000 per user per year. If an AI tool can replace a $100,000 employee or make them 50% more productive, charging $10,000 looks reasonable. By anchoring to salaries, AI vendors capture much more revenue than traditional SaaS pricing would allow. That commercial logic depends on the layoff narrative being true. The incentive to keep the narrative alive is direct and financial.&lt;/p&gt;

&lt;h2&gt;
  
  
  Is software engineering finished? History says cheaper tools grow employment
&lt;/h2&gt;

&lt;p&gt;In 1865, British economist William Stanley Jevons noticed something counterintuitive about coal. As steam engines became more efficient, total coal consumption rose instead of falling. Cheaper coal per unit of output made coal-powered production viable in more industries, expanding total demand faster than efficiency reduced it. The effect is now known as the Jevons paradox. Microsoft CEO Satya Nadella invoked it explicitly when DeepSeek's pricing hit the markets in early 2025.&lt;/p&gt;

&lt;p&gt;The textbook case in employment economics is bank tellers and ATMs. From 1988 to 2004, ATMs cut the number of tellers needed per US bank branch from 20 to 13. The intuitive prediction was teller employment would collapse. It didn't. Cheaper branch operations let banks open 43% more branches in urban areas. Total teller employment rose. Economist James Bessen documented this for the IMF in 2015, and the pattern has become the standard reference for thinking about automation and jobs.&lt;/p&gt;

&lt;p&gt;The same pattern shows up with spreadsheets, where the prediction was that VisiCalc would kill accountants. The reality: financial analyst jobs grew because cheaper analysis made more analysis worth doing. It shows up with compilers, where assembly programmers were supposed to be displaced. The reality: total developer count grew by orders of magnitude because cheaper code made more code worth writing. And it shows up with electrification, where the same panic played out across factory work.&lt;/p&gt;

&lt;p&gt;The mechanism is consistent. When a productive input gets cheaper, the supply of work that input can support expands faster than the input becomes redundant. People build things that weren't worth building when the input cost was higher.&lt;/p&gt;

&lt;p&gt;The US Bureau of Labor Statistics, working from the assumption that AI will accelerate over the next decade, &lt;a href="https://www.bls.gov/ooh/computer-and-information-technology/software-developers.htm" rel="noopener noreferrer"&gt;projects software developer employment to grow 15%&lt;/a&gt; from 2024 to 2034, against 3% for all US occupations. Their report names AI explicitly as a demand driver: "Demand for software developers, software quality assurance analysts, and testers is projected to be strong due to the continued expansion of software development for artificial intelligence, Internet of Things, robotics, and other automation applications." About 129,200 openings are projected per year over the decade.&lt;/p&gt;

&lt;p&gt;There's a Jevons split inside the BLS data worth noticing. The narrow category "computer programmers" (repetitive coding work) is projected to decline 6%, with the explicit reason being "computer programming work continues to be automated." The broad category "software developers" (designing, integrating, shipping software) grows 15%. The narrow, repetitive task gets automated. The broader role expands. This is the same pattern Bessen documented for bank tellers from 1988 onward.&lt;/p&gt;

&lt;p&gt;Andrew Ng's Batch letter argues the same point at a higher level. Software engineering is the sector most affected by AI tools. Hiring remains strong. US unemployment is 4.3%. His prediction is what he calls an "AI jobapalooza": more good AI engineering jobs, in companies that aren't traditionally software employers, with skill mixes that look different from 2018.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this time could still be different
&lt;/h2&gt;

&lt;p&gt;This time might be different. There are three reasons to take that possibility seriously.&lt;/p&gt;

&lt;p&gt;First, speed. The ATM-to-teller transition played out over forty years. AI's transition into coding has taken about three years. Even if the long-run equilibrium is more developers, the transition is happening on a timeline that gives workers little room to retrain.&lt;/p&gt;

&lt;p&gt;Second, completeness of automation. Bessen's bank teller story has a sequel most quotations miss. Teller employment did eventually decline, not from ATMs, but from mobile banking after 2010. When automation went from partial (ATMs handled some tasks) to nearly complete (mobile banking handled them all), the Jevons effect stopped protecting jobs. The question is whether AI's coding capability will graduate from partial automation (helps engineers be faster) to complete automation (does the job end-to-end). Today it's clearly partial. The question is for how long.&lt;/p&gt;

&lt;p&gt;Third, the macro data isn't yet showing the productivity boom that would generate Jevons-style demand expansion. Torsten Slok, chief economist at Apollo, summarized this in a phrase that's now widely quoted: "AI is everywhere except in the incoming macroeconomic data." Stanford's "Canaries in the Coal Mine" study from November 2025 found employment declining for workers whose jobs may be affected by AI. The specific roles named were software developers and customer-service representatives. These are exactly the roles Jevons should be protecting.&lt;/p&gt;

&lt;p&gt;So the long-run pattern says Jevons holds. The short-run data is mixed. Builders should adapt now rather than wait to find out which way the next five years go.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this means for developers in 2026
&lt;/h2&gt;

&lt;p&gt;Here is where most takes on these layoffs go wrong.&lt;/p&gt;

&lt;p&gt;The instinct, especially for engineers, is to read the layoff memos at face value: "AI is taking our jobs. Better get good at AI fast." Half of that is wrong. Half is right.&lt;/p&gt;

&lt;p&gt;Half is wrong. AI is not currently replacing software engineers at scale. The cuts aren't happening because AI-assisted engineers write 10x more code. They're happening because companies over-hired, growth is slowing, and someone needs to absorb $700 billion in capex.&lt;/p&gt;

&lt;p&gt;Half is right. AI is making building cheaper. Not because it replaces engineers, but because it compresses the time from idea to working prototype. I can spin up a working React Native app with auth, theming, i18n, and Redux Toolkit using my own &lt;a href="https://github.com/chohra-med/expo_boilerplate" rel="noopener noreferrer"&gt;expo_boilerplate&lt;/a&gt; plus Claude Code in an afternoon. Two years ago that was a weekend. Five years ago it was a small team's first sprint.&lt;/p&gt;

&lt;p&gt;The Jevons reading: cheaper building means more total building. That expands demand for the work AI can't yet do well: system design, integration, debugging, judgment calls, shipping, and getting users.&lt;/p&gt;

&lt;p&gt;The bottleneck has moved. Building used to be the moat. The number of people who can ship a working product has exploded. The cost of building has collapsed. Distribution didn't get cheaper. Attention didn't get cheaper. Trust didn't get cheaper. The audience you spent five years building is still worth what it was worth in 2020. The newsletter with 10,000 engaged readers is still rare.&lt;/p&gt;

&lt;p&gt;This is the inversion that matters more for your career than any layoff announcement. The work didn't get harder. It shifted. The skill stack that paid in 2018 (deep technical specialization) pays less now. The skill stack that pays in 2026 is technical work plus distribution work.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to learn this week: AI tools and distribution
&lt;/h2&gt;

&lt;p&gt;If I were starting over today as a senior engineer reading the layoff news, this is what I'd learn first, in this order.&lt;/p&gt;

&lt;h3&gt;
  
  
  Use AI as a builder, not a topic to study
&lt;/h3&gt;

&lt;p&gt;Stop reading about AI and start using it. Pick one workflow you do every week and rebuild it with Claude Code or Cursor. Measure the time saved. Notice where it breaks. The point is not to become an "AI engineer." The point is to compound your existing skill stack with tools that make you 3 to 5 times faster on the boring parts.&lt;/p&gt;

&lt;h3&gt;
  
  
  Ship one real thing per month
&lt;/h3&gt;

&lt;p&gt;Not a tutorial project. A real thing you put online with your name on it. The boilerplate I open-sourced on GitHub was a forcing function for me: every time I built a CasaInnov client project, I extracted the reusable parts and pushed them back. Two years of that is now a credible authority signal anyone can clone.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pick one distribution channel and commit publicly
&lt;/h3&gt;

&lt;p&gt;Newsletter, LinkedIn, Twitter, YouTube, Reddit, GitHub. Pick one. Get good at it. The cost of building a 5,000-person audience in your niche is one consistent post per week for two years. That sounds boring. It is boring. It also outperforms 95% of what your peers are doing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Write what you wish someone had written for you
&lt;/h3&gt;

&lt;p&gt;I started Code Meet AI because I kept losing days to integration problems nobody had documented well. Hermes failing on cold start after Expo SDK upgrades. Claude Code hallucinating React Native imports that don't exist. Generative UI patterns that work on web but break on mobile. The writing is now its own moat.&lt;/p&gt;

&lt;p&gt;The combination of these four habits is, in my opinion, more resilient than any specific technical skill. Skills depreciate. A distribution channel and a track record of shipping compound.&lt;/p&gt;

&lt;p&gt;If you want a head start, my &lt;a href="https://github.com/chohra-med/expo_boilerplate" rel="noopener noreferrer"&gt;expo_boilerplate&lt;/a&gt; is MIT-licensed and built for exactly this: TypeScript, auth, theming, i18n, Redux Toolkit, feature-first architecture, Cursor and Claude rules already wired in. Clone it, change three things, ship something this weekend.&lt;/p&gt;

&lt;p&gt;The 2026 layoffs are not the signal most people are reading them as. The current cuts are about COVID over-hire correction, slowing growth, and AI capex pressure. The economics history says cheaper tools grow employment, not shrink it. Andrew Ng calls it an AI jobapalooza. The Bureau of Labor Statistics is projecting 15% growth.&lt;/p&gt;

&lt;p&gt;The work shifted. Adapt to where it went.&lt;/p&gt;

&lt;h2&gt;
  
  
  FAQ
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Will AI replace software engineers?
&lt;/h3&gt;

&lt;p&gt;Probably not net replace, based on Bureau of Labor Statistics data and historical precedent. BLS projects 15% growth in software developer employment from 2024 to 2034, with AI explicitly named as a demand driver. The Jevons paradox, cheaper inputs expand total consumption, has held for two centuries across electrification, spreadsheets, and ATMs. The transition may displace specific narrow roles (BLS projects "computer programmers" to decline 6%) while expanding the broader category of software work.&lt;/p&gt;

&lt;h3&gt;
  
  
  What did Andrew Ng say about AI and jobs?
&lt;/h3&gt;

&lt;p&gt;In his May 2026 Batch letter, Ng predicted there will be no AI jobpocalypse. His evidence: software engineering hiring remains strong despite being the sector most affected by AI tools, and US unemployment sits at 4.3%. He attributes the panic narrative to AI labs wanting to sound powerful, AI companies anchoring pricing to salaries rather than typical SaaS norms, and businesses preferring "AI efficiency" to admitting they over-hired during the pandemic stimulus era. He predicts an "AI jobapalooza" instead.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is the Jevons paradox and why does it matter for developers?
&lt;/h3&gt;

&lt;p&gt;Jevons paradox is the 1865 observation that when a resource gets used more efficiently, total consumption of that resource often rises rather than falls. Applied to software: AI making coding cheaper doesn't necessarily reduce demand for code. It expands what's worth building. The bank teller case is the canonical example. ATMs cut tellers per branch from 20 to 13, but banks opened 43% more branches, so total teller employment grew. The pattern holds until automation gets complete enough to handle the whole job, which AI hasn't yet for software.&lt;/p&gt;

&lt;h3&gt;
  
  
  Should I learn AI to avoid being laid off?
&lt;/h3&gt;

&lt;p&gt;Yes, but not because AI is taking your job. Because AI tools make you 3 to 5 times faster on boring parts of the work, which is now table stakes for senior engineers. The bigger lever is distribution. AI commoditized building. Distribution didn't get cheaper. A small audience and a track record of shipping are now more durable than any specific technical specialization.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where do I start this week?
&lt;/h3&gt;

&lt;p&gt;Three concrete moves. Audit your AI tooling (set up Claude Code or Cursor with project rules and verification gates). Pick one distribution channel and schedule three posts. Ship one open-source artifact with your name on it. The combination compounds faster than any individual technical certification.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I write &lt;a href="https://aimeetcode.substack.com" rel="noopener noreferrer"&gt;Code Meet AI&lt;/a&gt;, a weekly newsletter for engineers shipping AI features in production. No hype, no thought-leader cadence. Real builds, honest takes, what's working in mobile-AI right now.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The boilerplate is at &lt;/em&gt;&lt;/p&gt;&lt;div class="ltag-github-readme-tag"&gt;&lt;em&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;a href="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;&lt;/a&gt;
      &lt;a href="https://github.com/chohra-med" rel="noopener noreferrer"&gt;
        chohra-med
      &lt;/a&gt; / &lt;a href="https://github.com/chohra-med/expo_boilerplate" rel="noopener noreferrer"&gt;
        expo_boilerplate
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      AI-first React Native + Expo boilerplate. Feature-first architecture, TypeScript, auth, i18n, theming, Redux Toolkit — with Cursor/Claude rules included. Lite version of AI Mobile Launcher.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;MobileLauncher — React Native Boilerplate&lt;/h1&gt;
&lt;/div&gt;
&lt;p&gt;&lt;a href="https://github.com/chohra-med/expo_boilerplate/stargazers" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/03b27eb9f790e65bfbdcd81ac8a63c0d51d7aec10b7664a19f3794f064c9fb91/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f73746172732f63686f6872612d6d65642f6578706f5f626f696c6572706c6174653f7374796c653d666c61742d737175617265" alt="GitHub stars"&gt;&lt;/a&gt;
&lt;a href="https://github.com/chohra-med/expo_boilerplate/network" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/00250ebcd83a599f0c57702804266d93e93e80cd253c6286ffd324fe74e0b4e8/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f666f726b732f63686f6872612d6d65642f6578706f5f626f696c6572706c6174653f7374796c653d666c61742d737175617265" alt="GitHub forks"&gt;&lt;/a&gt;
&lt;a href="https://github.com/chohra-med/expo_boilerplate/LICENSE" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/a7e65aee57b11d28e4caff8b945729a66be0bb663f7f93bd24c5aa65699f148e/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f4c6963656e73652d4d49542d626c75652e7376673f7374796c653d666c61742d737175617265" alt="License: MIT"&gt;&lt;/a&gt;
&lt;a href="https://github.com/chohra-med/expo_boilerplate/commits" rel="noopener noreferrer"&gt;&lt;img src="https://camo.githubusercontent.com/b0970eb34e41443d073b42d46a5779a2f8bbf5057ac53ca2aaec6cb7271e3421/68747470733a2f2f696d672e736869656c64732e696f2f6769746875622f6c6173742d636f6d6d69742f63686f6872612d6d65642f6578706f5f626f696c6572706c6174653f7374796c653d666c61742d737175617265" alt="Last commit"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The React Native foundation I use on every production project — open-sourced.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Feature-first architecture, TypeScript strict, auth, i18n, theming, Redux Toolkit, and Expo SDK 54 with the New Architecture. Structured so Cursor, Claude Code, and Antigravity generate consistent code without hallucinating your patterns.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Want the full version?&lt;/strong&gt; RevenueCat, Firebase, U-AMOS 2.0 memory bank, and AI Pro features are in the paid tier.&lt;br&gt;
→ &lt;strong&gt;&lt;a href="https://www.aimobilelauncher.com/?utm_source=github&amp;amp;utm_medium=readme&amp;amp;utm_campaign=expo_boilerplate&amp;amp;utm_content=hero_cta" rel="nofollow noopener noreferrer"&gt;AI Mobile Launcher — aimobilelauncher.com&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Why this boilerplate?&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;Most React Native starters give you a blank canvas. That's fine for a side project — it's a liability on production work or when you're using AI coding tools.&lt;/p&gt;

&lt;p&gt;After 7 years of shipping React Native apps — enterprise clients, health tech, coaching platforms — I kept rebuilding the same foundation from scratch. Authentication, onboarding, theming, i18n, state management, folder structure, TypeScript config. Every time.&lt;/p&gt;

&lt;p&gt;This is that foundation, extracted and open-sourced.&lt;/p&gt;

&lt;p&gt;Three reasons…&lt;/p&gt;&lt;/div&gt;


&lt;/div&gt;
&lt;br&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/chohra-med/expo_boilerplate" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;br&gt;
&lt;/em&gt;&lt;/div&gt;&lt;em&gt;
&lt;br&gt;
&lt;/em&gt;&lt;p&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;MIT-licensed. Clone, fork, ship.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>career</category>
      <category>webdev</category>
      <category>discuss</category>
    </item>
    <item>
      <title>How I wire Claude into my React Native workflow (skills, projects, Cowork)</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Wed, 13 May 2026 08:01:35 +0000</pubDate>
      <link>https://dev.to/malik_chohra/how-i-wire-claude-into-my-react-native-workflow-skills-projects-cowork-4pk3</link>
      <guid>https://dev.to/malik_chohra/how-i-wire-claude-into-my-react-native-workflow-skills-projects-cowork-4pk3</guid>
      <description>&lt;p&gt;Claude isn't a chat app anymore. It's a runtime. The interface is still text, but the architecture underneath is execution: load context, pick tools, call APIs, write files, schedule work. Most people are still typing at it like ChatGPT in 2023 and wondering why their workflow hasn't changed.&lt;/p&gt;

&lt;p&gt;The shift happened quietly, across four primitives. Each one shipped without much fanfare. Together they're what "advanced" means in 2026: not a longer prompt, but a better-wired one.&lt;/p&gt;

&lt;p&gt;This piece is the primer. The four things to understand before you can use Claude well.&lt;/p&gt;

&lt;h2&gt;
  
  
  The mental model
&lt;/h2&gt;

&lt;p&gt;The outdated framing: &lt;em&gt;Claude is good at writing, explaining, coding.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The 2026 framing: &lt;em&gt;Claude is a runtime that loads skills, scopes memory in projects, calls external systems through MCP, and executes multi-step work in Cowork.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Same model, completely different surface. The question used to be "what can Claude do?" The question now is "what can I wire into Claude?"&lt;/p&gt;

&lt;p&gt;That reframe is the whole article. Everything below is the four primitives that make the reframe real.&lt;/p&gt;

&lt;h2&gt;
  
  
  1/ Skills (the tool layer)
&lt;/h2&gt;

&lt;p&gt;A skill is a folder with a &lt;code&gt;SKILL.md&lt;/code&gt; file. YAML frontmatter at the top with &lt;code&gt;name&lt;/code&gt; and &lt;code&gt;description&lt;/code&gt;. Markdown body underneath with the instructions Claude follows. That's the entire format.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6g6xzmdciesrrrkpi166.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6g6xzmdciesrrrkpi166.png" alt="SKILL.md folder structure diagram. Source: Anthropic documentation at docs.claude.com" width="800" height="584"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The mechanism is the part most people miss. The description is what Claude sees in its skill list before responding. The body only loads when the skill triggers. So you can have 50 skills sitting available and pay context cost on only the one that fires.&lt;/p&gt;
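
&lt;p&gt;To make the cost argument concrete, here is a toy model of that loading behavior. It is a sketch under loud assumptions, not Claude's implementation: the &lt;code&gt;Skill&lt;/code&gt; shape, the &lt;code&gt;keyword&lt;/code&gt; trigger, and the &lt;code&gt;bodiesLoaded&lt;/code&gt; counter are all hypothetical names invented for illustration. The only thing it borrows from the text above is the rule: descriptions are always in context, bodies load on trigger.&lt;/p&gt;

```typescript
// Toy model of "description always, body on demand".
// NOT Claude's implementation: every name below is hypothetical.
type Skill = {
  name: string;
  description: string;    // always visible in the skill list
  keyword: string;        // naive stand-in for however a skill gets triggered
  loadBody: () => string; // only called when the skill fires
};

// Counts how many skill bodies were actually pulled into context.
let bodiesLoaded = 0;

const skills: Skill[] = [
  {
    name: "rn-code-review",
    description: "Use when reviewing React Native pull requests.",
    keyword: "review",
    loadBody: () => { bodiesLoaded += 1; return "Check navigation types, then list re-render risks."; },
  },
  {
    name: "carousel-generator",
    description: "Use when turning an article into a carousel.",
    keyword: "carousel",
    loadBody: () => { bodiesLoaded += 1; return "Split the article into seven slides with one idea each."; },
  },
  {
    name: "release-notes",
    description: "Use when drafting release notes for an app build.",
    keyword: "release",
    loadBody: () => { bodiesLoaded += 1; return "Group commits by feature, then write user-facing notes."; },
  },
];

// Build the context for one request: every description, at most one body.
function buildContext(all: Skill[], request: string): string[] {
  const context = all.map((s) => s.name + ": " + s.description);
  const text = request.toLowerCase();
  const hit = all.filter((s) => text.indexOf(s.keyword) >= 0)[0];
  if (hit) context.push(hit.loadBody());
  return context;
}

const context = buildContext(skills, "Can you review this PR?");
console.log(context.length, bodiesLoaded);
```

&lt;p&gt;With three skills installed, the request above carries three short descriptions plus a single body. The other two bodies are never read, which is the same reason 50 installed skills stay cheap.&lt;/p&gt;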

&lt;p&gt;This changes what you'd put in a skill. A skill isn't a system prompt by another name. It's a tool you teach Claude once and reach for whenever the task fits.&lt;/p&gt;

&lt;p&gt;Things skills are good at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Domain procedures: how your team does code review, how your brand voice works, what your component library calls things&lt;/li&gt;
&lt;li&gt;Multi-step workflows: write article → format for Medium → cross-post to Dev.to → generate carousel&lt;/li&gt;
&lt;li&gt;Technical conventions: your API's auth quirks, your codebase's folder structure, your testing harness&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Two patterns I've seen work in production.&lt;/p&gt;

&lt;p&gt;A context skill holds your domain knowledge once. Other skills reference it. Don't repeat your brand voice in every generator. Keep it in a &lt;code&gt;*-context&lt;/code&gt; skill and have the generators read it.&lt;/p&gt;

&lt;p&gt;A generator skill does one job. It writes a thing, or transforms a thing, or validates a thing. Single-purpose, composable, chains cleanly.&lt;/p&gt;

&lt;p&gt;The mistake is making one giant skill that does everything. Anthropic's own open-source skills repo has separate &lt;code&gt;pdf&lt;/code&gt;, &lt;code&gt;docx&lt;/code&gt;, &lt;code&gt;xlsx&lt;/code&gt;, and &lt;code&gt;pptx&lt;/code&gt; skills, not one mega "documents" skill, for a reason. Generators that do too much fail in too many ways and get triggered by too many prompts.&lt;/p&gt;

&lt;p&gt;The other thing nobody tells you: the description &lt;em&gt;is&lt;/em&gt; the trigger. I spent two weeks getting one of my skills to fire when I asked for the right thing. The body was fine. The description was vague. Claude under-triggers skills by default, and Anthropic's own guidance is to be slightly &lt;em&gt;pushy&lt;/em&gt; in descriptions. Specific verbs, specific phrases, specific contexts.&lt;/p&gt;

&lt;p&gt;Custom skills are available on Pro, Max, Team, and Enterprise. You can create them directly in Claude.ai (Settings → Capabilities), via the API, or as folders in Claude Code.&lt;/p&gt;

&lt;h2&gt;
  
  
  2/ Projects (scoped memory)
&lt;/h2&gt;

&lt;p&gt;A Project is a workspace with its own files, instructions, and memory. Memory accumulated in one project doesn't bleed into another. Same Claude account, effectively different "instances" of context.&lt;/p&gt;

&lt;p&gt;Why it matters: chat memory was useful but contaminating. A single global memory pool meant Claude pulled context from a personal conversation into a work answer, or surfaced last week's product strategy when you asked about something unrelated. Project-scoped memory fixes that without forcing you to start cold every session.&lt;/p&gt;

&lt;p&gt;What to use it for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One project per product or work stream, to keep the contexts clean&lt;/li&gt;
&lt;li&gt;Long-running threads where context compounds (research projects, ongoing client engagements, multi-week investigations)&lt;/li&gt;
&lt;li&gt;Anywhere you want Claude to remember but not leak&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern: every project gets its own files (a PRD, a brand voice doc, a technical spec) and its own memory. The skills you've installed are still available across all projects, but the &lt;em&gt;context&lt;/em&gt; is scoped.&lt;/p&gt;

&lt;p&gt;A consequence worth noticing. If you're not using Projects, your default chat is becoming a leaky bucket. Memory accumulates. Some of it conflicts. After three months it's a soup. Projects are how you stop that.&lt;/p&gt;

&lt;h2&gt;
  
  
  3/ Connectors (the integration layer over MCP)
&lt;/h2&gt;

&lt;p&gt;Connectors are Model Context Protocol-based integrations that let Claude read from and write to external services. Google Drive, Gmail, Notion, GitHub, Slack, Linear, Asana, Jira, Stripe, Figma, Canva, HubSpot, Apple Health. 50+ in the directory as of early 2026, with new ones added weekly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3sbfiojeioixw7lgbona.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3sbfiojeioixw7lgbona.png" alt="MCP architecture diagram. Source: modelcontextprotocol.io" width="680" height="1206"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Why they matter: pasting screenshots and copy-pasting JSON is the manual work AI was supposed to remove. Connectors remove it. Instead of "here's the email I got," it's "the email from Sarah yesterday." Claude pulls it. Instead of pasting an issue body, it's "the bug filed in expo_boilerplate." Claude pulls it.&lt;/p&gt;

&lt;p&gt;When to use them:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tools already in your daily workflow. Connectors only earn their place if they're already part of how you work.&lt;/li&gt;
&lt;li&gt;Workflows that span tools (calendar + email + Slack = daily briefing)&lt;/li&gt;
&lt;li&gt;Anywhere you find yourself screenshot-pasting more than twice in one session&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The custom MCP escape hatch (Pro plan and above): if your tool isn't in the directory, you can add any MCP server URL via Settings → Connectors → Add custom connector. Notion's hosted MCP at &lt;code&gt;https://mcp.notion.com/mcp&lt;/code&gt; is the canonical example. Any published MCP server can be wired into Claude in 30 seconds.&lt;/p&gt;

&lt;p&gt;The trap is over-connecting. Each connector adds surface area for Claude to get confused. Multiple integrations claiming to handle "messages" or "tasks" leads to wrong-tool-picked failures. The honest take: pick three to five that match your real flow. Connect more only when you hit a specific gap.&lt;/p&gt;

&lt;h2&gt;
  
  
  4/ Cowork (the agentic execution layer)
&lt;/h2&gt;

&lt;p&gt;Cowork is the same agentic architecture as Claude Code, but for non-coding tasks, in the desktop app. (If you haven't installed Claude Code yet, the &lt;a href="https://aimeetcode.substack.com/p/claude-code-for-beginners-what-it" rel="noopener noreferrer"&gt;Claude Code for beginners guide&lt;/a&gt; on the Code Meet AI newsletter walks through the install and your first project.) It reads and writes local files, schedules recurring tasks, and uses connectors first, the browser second, and computer use (driving your screen) only as a last resort. Available on Pro, Max, Team, and Enterprise. Desktop only.&lt;/p&gt;

&lt;p&gt;This is where Claude shifts from assistant to colleague. You give it a goal, walk away, come back to a result. The desktop has to be awake while it runs. That's the catch.&lt;/p&gt;

&lt;p&gt;What Cowork is good at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Repetitive multi-step work like file organization, daily briefings, weekly reviews&lt;/li&gt;
&lt;li&gt;Tasks that span tools and need orchestration (calendar + email + Slack synthesis)&lt;/li&gt;
&lt;li&gt;Work that's too boring to do reliably but too important to skip&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What it's not for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Time-sensitive tasks. Your desktop has to be open and awake.&lt;/li&gt;
&lt;li&gt;Sensitive data (financial, health, anything regulated). Prompt injection risk is real, and Cowork activity isn't covered by zero data retention (ZDR).&lt;/li&gt;
&lt;li&gt;Work where you want to think alongside Claude. That's chat. Cowork is delegation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The realest test: if you'd skip the task because it's boring, Cowork is the right tool. If you'd want to watch Claude do it step by step, chat is.&lt;/p&gt;

&lt;h2&gt;
  
  
  The multiplier: Dispatch + Computer Use
&lt;/h2&gt;

&lt;p&gt;Two extensions on Cowork worth knowing about, because together they're what makes the rest worth setting up.&lt;/p&gt;

&lt;p&gt;Computer Use lets Claude drive your screen. Clicking, typing, navigating apps that don't have connectors. Slower than a connector. More fragile. But it works for the long tail of tools that haven't published an MCP server. Research preview on Pro and Max.&lt;/p&gt;

&lt;p&gt;Dispatch lets you assign tasks from your mobile app to your desktop. You're on the train; you tell Claude on your phone to summarize three articles you drafted this week. By the time you're at your desk, the answer is in chat.&lt;/p&gt;

&lt;p&gt;Both are research previews as of May 2026. Both work. Use sparingly until they harden, but understand they exist. They're the difference between Claude as a desktop tool and Claude as something you can hand work to from anywhere.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this means for builders
&lt;/h2&gt;

&lt;p&gt;For mobile builders specifically, the implications are sharper than they look from the outside. The web AI dev community has been on this trajectory longer (Cursor, Claude Code in CLI, MCP servers for every database, custom skills for every framework). Mobile dev has stayed a step behind partly because the canonical workflows assume backend or web context.&lt;/p&gt;

&lt;p&gt;Skills, Projects, and Connectors don't care what stack you ship to. The runtime is platform-agnostic. The gain compounds the moment you treat Claude as something to wire, not something to type at.&lt;/p&gt;

&lt;p&gt;The honest version: most "I'm not getting much out of AI" complaints I hear from devs in 2026 trace to one of three things. They're still on the chat surface. They haven't built a single skill. Or they're treating connectors like a novelty. None of those are model problems. They're setup problems.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to start
&lt;/h2&gt;

&lt;p&gt;Pick one primitive, build something, ship it. Add the next one when the simpler setup hits a wall.&lt;/p&gt;

&lt;p&gt;For most people, the order is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;One Project per major work stream. Stop polluting the default chat.&lt;/li&gt;
&lt;li&gt;One custom skill for your domain context. Brand voice, codebase conventions, whatever your work depends on.&lt;/li&gt;
&lt;li&gt;Three connectors. The ones already in your daily flow. Not ten.&lt;/li&gt;
&lt;li&gt;One Cowork recurring task. A morning briefing is a good first one.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Stop there for a month. Notice what's still manual. Build the next thing for that.&lt;/p&gt;

&lt;p&gt;The advanced version of Claude isn't a longer prompt. It's the four primitives, wired into how you really work. You're writing code now. It just happens to look like English.&lt;/p&gt;




&lt;h2&gt;
  
  
  If you're shipping mobile-AI
&lt;/h2&gt;

&lt;p&gt;The four primitives apply to every stack, but the wiring for React Native and Expo is its own problem. Web AI dev has a year head start on patterns; mobile is still figuring out what a Claude Code memory bank looks like for a Metro bundler, what skills make sense for an Expo build pipeline, and which connectors actually plug into a mobile workflow.&lt;/p&gt;

&lt;p&gt;That's the gap &lt;strong&gt;AI Mobile Launcher&lt;/strong&gt; fills. It ships the U-AMOS memory system, RN-specific Claude Code rules, and the Skills folder structure pre-configured for Expo and the mobile stack, so you're not figuring out the wiring on a Tuesday night with a build failure on the line.&lt;/p&gt;

&lt;p&gt;The Lite version is free on GitHub. The full system with the rule packs, generators, and U-AMOS 2.0 memory bank is in the Starter tier.&lt;/p&gt;

&lt;p&gt;→ &lt;a href="https://aimobilelauncher.com" rel="noopener noreferrer"&gt;Get AI Mobile Launcher&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Malik Chohra · AI-first mobile engineer · &lt;a href="https://getwireai.com" rel="noopener noreferrer"&gt;WireAI&lt;/a&gt; · &lt;a href="https://aimobilelauncher.com" rel="noopener noreferrer"&gt;AI Mobile Launcher&lt;/a&gt; · &lt;a href="https://aimeetcode.substack.com" rel="noopener noreferrer"&gt;Code Meet AI newsletter&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>ai</category>
      <category>claudecode</category>
      <category>mobile</category>
    </item>
    <item>
      <title>From React to React Native: what web devs get wrong on day one</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Thu, 07 May 2026 19:46:01 +0000</pubDate>
      <link>https://dev.to/malik_chohra/from-react-to-react-native-what-web-devs-get-wrong-on-day-one-46i5</link>
      <guid>https://dev.to/malik_chohra/from-react-to-react-native-what-web-devs-get-wrong-on-day-one-46i5</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwrdm53yq1s7uos140fcp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwrdm53yq1s7uos140fcp.png" alt="From React to React Native: what web devs get wrong on day one" width="800" height="336"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I built three React Native apps before I really understood it.&lt;/p&gt;

&lt;p&gt;The first took me three weeks to ship something that should have taken three days. The second, I shipped fast by ignoring half the platform constraints and paying for it later. The third was the boilerplate I wish I'd had on day one.&lt;/p&gt;

&lt;p&gt;This was back in 2019, when React Native was still new. I had always assumed that jumping from React Native to ReactJS for websites would be smooth. It was.&lt;/p&gt;

&lt;p&gt;Since then, I've watched many web developers jump the other way, into mobile apps, and it is not the same. There are far more dependencies and packages to manage, and performance is a whole topic of its own. And that's before we even get to native integration or writing native code from scratch. There are plenty of mistakes to make along the way, and I'll try to make life simpler for you.&lt;/p&gt;

&lt;p&gt;If you're a React developer planning to build a mobile app, this is the piece I'd hand you on day zero. What actually transfers from the web. What absolutely doesn't. Where Expo fits in. What to learn first. What to skip. And, because most "React to React Native" guides are written as if it were still 2022 for AI, what shipping AI features inside a mobile app actually looks like.&lt;/p&gt;

&lt;h2&gt;
  
  
  "It's just React, right?"
&lt;/h2&gt;

&lt;p&gt;Yes, it's JSX. Yes, it's hooks. Yes, your component model carries over. That's about a third of what makes up "shipping a working app."&lt;/p&gt;

&lt;p&gt;The rest (layout, navigation, storage, builds, debugging, and deployment) is different enough that pretending it isn't is the single biggest reason web devs give up on RN in week two.&lt;/p&gt;

&lt;p&gt;So let me split it cleanly.&lt;/p&gt;

&lt;h2&gt;
  
  
  What transfers from React (the good news)
&lt;/h2&gt;

&lt;p&gt;If you've shipped React on the web, these all carry over more or less untouched:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;JSX and the component model.&lt;/strong&gt; Same mental model. &lt;code&gt;&amp;lt;MyComponent prop={value} /&amp;gt;&lt;/code&gt; is &lt;code&gt;&amp;lt;MyComponent prop={value} /&amp;gt;&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hooks.&lt;/strong&gt; &lt;code&gt;useState&lt;/code&gt;, &lt;code&gt;useEffect&lt;/code&gt;, &lt;code&gt;useMemo&lt;/code&gt;, &lt;code&gt;useCallback&lt;/code&gt;, &lt;code&gt;useReducer&lt;/code&gt;, &lt;code&gt;useContext&lt;/code&gt;, custom hooks all work the same.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TypeScript.&lt;/strong&gt; Same setup, same &lt;code&gt;tsconfig.json&lt;/code&gt; (almost). Expo gives you a working TS template by default.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;State management.&lt;/strong&gt; Zustand, Redux, Jotai. All work in RN. TanStack Query works. (If you're choosing between them, I broke down the trade-offs in &lt;a href="https://medium.com/stackademic/redux-vs-zustand-vs-mobx-in-react-native-the-good-and-the-bad-for-each-one-ec4ce2341e0e" rel="noopener noreferrer"&gt;Redux vs Zustand vs MobX in React Native&lt;/a&gt;.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Most utility libraries.&lt;/strong&gt; &lt;code&gt;zod&lt;/code&gt;, &lt;code&gt;date-fns&lt;/code&gt;, &lt;code&gt;lodash&lt;/code&gt;, &lt;code&gt;dayjs&lt;/code&gt;, &lt;code&gt;uuid&lt;/code&gt;, all fine. Anything with no DOM dependency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Patterns.&lt;/strong&gt; Composition, lifting state up, container/presentational, render props if you're into that. All the same.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the part that lulls you into thinking "this is going to be smooth."&lt;/p&gt;

&lt;h2&gt;
  
  
  What doesn't transfer (the painful part)
&lt;/h2&gt;

&lt;p&gt;Here's where week two starts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No DOM elements.&lt;/strong&gt; &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; doesn't exist. Neither does &lt;code&gt;&amp;lt;span&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;button&amp;gt;&lt;/code&gt;, or &lt;code&gt;&amp;lt;input&amp;gt;&lt;/code&gt;. You get &lt;code&gt;&amp;lt;View&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;Text&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;Pressable&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;TextInput&amp;gt;&lt;/code&gt;. Every piece of text on the screen has to be inside a &lt;code&gt;&amp;lt;Text&amp;gt;&lt;/code&gt;: putting a string directly in a &lt;code&gt;&amp;lt;View&amp;gt;&lt;/code&gt; crashes the app at runtime. And if you lazy-load screens, open every one of them before you release. A screen you never checked can crash in production, and the release pipeline is very different from the web's. You can patch with an over-the-air (OTA) update, but…&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No CSS.&lt;/strong&gt; No stylesheets, no media queries, no cascading, no &lt;code&gt;:hover&lt;/code&gt; (there's no hover on mobile). You write style objects in JS, or you use a Tailwind equivalent like NativeWind, which I strongly recommend because it keeps you on familiar ground.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No React Router.&lt;/strong&gt; Use Expo Router. It's file-based, feels closest to Next.js's app router, and is now the default for new Expo projects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No browser APIs.&lt;/strong&gt; No &lt;code&gt;localStorage&lt;/code&gt;, no &lt;code&gt;window&lt;/code&gt;, no &lt;code&gt;document&lt;/code&gt;. You'll use &lt;code&gt;AsyncStorage&lt;/code&gt; (or &lt;code&gt;MMKV&lt;/code&gt; for performance), and &lt;code&gt;react-native-reanimated&lt;/code&gt; for anything animated. There's no &lt;code&gt;&amp;lt;a href&amp;gt;&lt;/code&gt;. Scroll events don't work the same way. There's no &lt;code&gt;getElementById&lt;/code&gt;.&lt;/p&gt;
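
&lt;p&gt;The part of the &lt;code&gt;localStorage&lt;/code&gt; to &lt;code&gt;AsyncStorage&lt;/code&gt; switch that tends to trip people up is that every call returns a promise and every value is a string. Here is a minimal sketch of a JSON wrapper, assuming a store with promise-based &lt;code&gt;getItem&lt;/code&gt; and &lt;code&gt;setItem&lt;/code&gt;. &lt;code&gt;memoryStore&lt;/code&gt; and &lt;code&gt;createJsonStore&lt;/code&gt; are hypothetical names, and the in-memory store only stands in for the real one so the snippet runs anywhere:&lt;/p&gt;

```typescript
// Sketch only: a tiny JSON wrapper over an AsyncStorage-style store.
// `memoryStore`, `createJsonStore`, and the keys are hypothetical names.
// In an app you would pass the default export of
// @react-native-async-storage/async-storage instead of memoryStore.

// In-memory stand-in exposing the same promise-based getItem/setItem shape.
const data: { [key: string]: string } = Object.create(null);

export const memoryStore = {
  async getItem(key: string) {
    return key in data ? data[key] : null;
  },
  async setItem(key: string, value: string) {
    data[key] = value;
  },
};

type AsyncStringStore = typeof memoryStore;

export function createJsonStore(store: AsyncStringStore) {
  return {
    // Resolves to the parsed value, or `fallback` when the key is missing
    // or the stored string is not valid JSON.
    async load(key: string, fallback: unknown) {
      const raw = await store.getItem(key);
      if (raw === null) return fallback;
      try {
        return JSON.parse(raw);
      } catch {
        return fallback;
      }
    },
    // Serializes the value to a string before writing it.
    async save(key: string, value: unknown) {
      await store.setItem(key, JSON.stringify(value));
    },
  };
}
```

&lt;p&gt;The habit to build is awaiting reads before the first render that depends on them, since there is no synchronous read like &lt;code&gt;localStorage.getItem&lt;/code&gt; to fall back on.&lt;/p&gt;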

&lt;p&gt;&lt;strong&gt;Forms work differently.&lt;/strong&gt; &lt;code&gt;&amp;lt;TextInput&amp;gt;&lt;/code&gt; doesn't auto-handle the keyboard. You manage focus, dismissal, keyboard-avoiding behavior, autocorrect, and autocapitalize. Keyboard handling on mobile is a small engineering problem in itself.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Images load asynchronously.&lt;/strong&gt; You don't &lt;code&gt;&amp;lt;img src=...&amp;gt;&lt;/code&gt; and forget. You think about caching, placeholders, error states, and image sizes. &lt;code&gt;expo-image&lt;/code&gt; handles most of this.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Animations are different.&lt;/strong&gt; CSS transitions don't exist. Reanimated is the standard library, and it runs animations on the UI thread (separate from your JS), which is actually better than the web, but you have to learn worklets. The idea is that React Native has a JS thread and a UI thread, and animations should run on the UI thread so a busy JS thread doesn't make them stutter. (For high-performance graphics specifically, &lt;a href="https://medium.com/stackademic/rn-skia-a-better-option-for-high-performance-graphics-fad03da5883e" rel="noopener noreferrer"&gt;RN Skia&lt;/a&gt; is worth knowing too.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build and deploy aren't Vercel.&lt;/strong&gt; No &lt;code&gt;git push&lt;/code&gt; and you're live. You use EAS Build for cloud builds, EAS Submit to push to the App Store and Play Store, and you wait for app review. EAS Update lets you ship JS-only patches over the air without going through review again. That's the closest thing to "deploy on push."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Debugging is different.&lt;/strong&gt; Flipper is deprecated. React Native DevTools is the new standard, and it's actually decent now. But native crashes (the kind that surface as a stack trace from Java or Objective-C) require different muscle memory.&lt;/p&gt;

&lt;p&gt;That's the surface area you didn't know you didn't know.&lt;/p&gt;

&lt;h2&gt;
  
  
  Expo vs bare React Native
&lt;/h2&gt;

&lt;p&gt;This is the first real fork in the road.&lt;/p&gt;

&lt;p&gt;Bare React Native means you have a native iOS project and a native Android project sitting next to your JS code. You can install any native module, customize anything, and you'll probably spend Saturday afternoons fighting CocoaPods, Gradle, and Xcode signing certificates.&lt;/p&gt;

&lt;p&gt;Expo (managed) means you write JS only, install Expo-compatible native modules, and let EAS handle native builds in the cloud. You get OTA updates, a working dev client, and you're shipping to TestFlight in a day instead of a week.&lt;/p&gt;

&lt;p&gt;If you're a web developer starting out: use Expo. Don't believe people who tell you it's "for prototypes." It's not 2020 anymore. Expo's ecosystem covers nearly everything you'll actually need (camera, notifications, biometrics, in-app purchases, deep linking, file system). The workflow management in Expo is the best for RN right now.&lt;/p&gt;

&lt;p&gt;I tried the bare RN route on app two. I lost a weekend to an Xcode signing error that turned out to be a typo in &lt;code&gt;app.json&lt;/code&gt;. I went back to Expo for app three and have not looked back.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to learn first (the priority list)
&lt;/h2&gt;

&lt;p&gt;If you have one week to come up to speed before starting, here's the order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Expo Router.&lt;/strong&gt; File-based routing, layouts, and dynamic routes. Read the &lt;a href="https://docs.expo.dev/router/introduction/" rel="noopener noreferrer"&gt;Expo Router docs&lt;/a&gt;; they're short.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NativeWind.&lt;/strong&gt; Tailwind for RN. Lets you keep your CSS muscle memory and skip writing &lt;code&gt;StyleSheet.create({ ... })&lt;/code&gt; for every component.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The core RN primitives.&lt;/strong&gt; &lt;code&gt;View&lt;/code&gt;, &lt;code&gt;Text&lt;/code&gt;, &lt;code&gt;Pressable&lt;/code&gt;, &lt;code&gt;ScrollView&lt;/code&gt;, &lt;code&gt;FlatList&lt;/code&gt;, &lt;code&gt;TextInput&lt;/code&gt;, &lt;code&gt;Image&lt;/code&gt;. Know what each is for and when to use which.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AsyncStorage (or MMKV).&lt;/strong&gt; Your localStorage replacement. MMKV is faster but adds native code; AsyncStorage is fine for most cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reanimated basics.&lt;/strong&gt; &lt;code&gt;useSharedValue&lt;/code&gt;, &lt;code&gt;useAnimatedStyle&lt;/code&gt;, &lt;code&gt;withTiming&lt;/code&gt;, &lt;code&gt;withSpring&lt;/code&gt;. You don't need to master worklets on day one, but you do need this for any real interaction.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EAS Build and EAS Update.&lt;/strong&gt; Your build and deploy story. Ten minutes of reading saves you hours.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's enough to ship a real app. Everything else, learn when you need it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to avoid (the trap list)
&lt;/h2&gt;

&lt;p&gt;These are the things that cost me time. Don't repeat them.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Don't try to make React Router work.&lt;/strong&gt; Expo Router exists. Use it. I still use React Navigation in my own projects because I'm more familiar with it, but Expo Router is king now.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't write &lt;code&gt;StyleSheet.create&lt;/code&gt; from scratch&lt;/strong&gt; when NativeWind solves it for you. You'll be slower, your code will read worse, and you'll resist refactoring. Adopting a design system library is another option: faster and easier.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't disable Hermes.&lt;/strong&gt; It's the default RN engine now: faster startup, smaller bundle, better debugging. You shouldn't need to touch this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't use &lt;code&gt;setInterval&lt;/code&gt; for animations.&lt;/strong&gt; Use Reanimated. The frame drops will tell you why.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't ignore the keyboard.&lt;/strong&gt; Test every screen with the keyboard open. &lt;code&gt;KeyboardAvoidingView&lt;/code&gt; is the minimum; &lt;code&gt;react-native-keyboard-controller&lt;/code&gt; is what I actually use in production now.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't ship without offline handling.&lt;/strong&gt; Phones lose signal in elevators, on planes, on the subway. Check &lt;code&gt;NetInfo&lt;/code&gt; and have a fallback. If you want a deeper pattern, I wrote about &lt;a href="https://medium.com/@malikchohra/offline-support-and-caching-in-expo-with-custom-queuing-70af28a18e44" rel="noopener noreferrer"&gt;offline support and caching in Expo with custom queuing&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't assume iOS and Android behave the same.&lt;/strong&gt; Safe-area insets, permissions UX, file system paths, system gestures, they diverge in places that matter. Test both.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't ship API keys in your app.&lt;/strong&gt; This is the single biggest mistake I see web devs make moving over. Your &lt;code&gt;.env&lt;/code&gt; ships in your bundle. Anyone can decompile it. You need a backend proxy for any third-party API call that requires a secret key. I wrote a longer piece on &lt;a href="https://medium.com/stackademic/secure-your-react-native-app-with-secure-storage-expo-edition-a899dbbe3b8f" rel="noopener noreferrer"&gt;secure storage patterns in Expo&lt;/a&gt; that covers what to do with the secrets you do need on-device.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last one matters more than ever now, because of the AI part.&lt;/p&gt;

&lt;h2&gt;
  
  
  The AI part nobody talks about
&lt;/h2&gt;

&lt;p&gt;If you're building a mobile app today and there's no AI feature on your roadmap, you're either underestimating where the market is or you have a very specific reason. Most web devs come into RN with an LLM feature already on the spec.&lt;/p&gt;

&lt;p&gt;Here's the honest version of what changes when you ship AI inside a mobile app.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Streaming LLM responses without dropping frames.&lt;/strong&gt; On the web, you stream tokens into a &lt;code&gt;&amp;lt;div&amp;gt;&lt;/code&gt; and let the browser paint. On mobile, your JS thread renders into a &lt;code&gt;&amp;lt;Text&amp;gt;&lt;/code&gt;, and if you re-render too aggressively you drop frames. The pattern is to batch tokens before pushing them into React state, not to call &lt;code&gt;setState&lt;/code&gt; for every chunk that comes off the stream.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;API key management.&lt;/strong&gt; I said it above and I'll say it again, because this is where most first-AI-mobile-app projects ship something insecure. You cannot put your OpenAI / Anthropic / whoever-else API key in your app. It will be extracted within minutes if anyone cares. You need a backend proxy, even a tiny one. A Cloudflare Worker or a Vercel function fronting the AI provider, with rate limiting per device, is the minimum.&lt;/p&gt;
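
&lt;p&gt;To make "rate limiting per device" concrete, here is a minimal sketch of a fixed-window limiter that a proxy could consult before forwarding a request. It is an illustration under stated assumptions, not a production implementation: &lt;code&gt;createRateLimiter&lt;/code&gt;, &lt;code&gt;allow&lt;/code&gt;, and the device ID string are hypothetical names, and the counters live in memory, so a multi-instance deployment would likely need shared storage instead.&lt;/p&gt;

```typescript
// Hypothetical sketch: fixed-window rate limiting keyed by device ID.
// Counters are kept in memory, which only works for a single instance.
type WindowState = { windowStart: number; count: number };

export function createRateLimiter(maxRequests: number, windowMs: number) {
  // Prototype-free object so unusual device IDs cannot hit inherited keys.
  const windows: { [deviceId: string]: WindowState } = Object.create(null);

  // Returns true when the request may proceed, false when it should be rejected.
  // `now` is injectable so the behaviour can be checked without waiting.
  return function allow(deviceId: string, now: number = Date.now()): boolean {
    const state = windows[deviceId];

    // First request from this device, or the previous window has expired:
    // start a fresh window and count this request.
    if (state === undefined || now - state.windowStart >= windowMs) {
      windows[deviceId] = { windowStart: now, count: 1 };
      return true;
    }

    // Window still open and the budget is used up.
    if (state.count >= maxRequests) return false;

    state.count += 1;
    return true;
  };
}
```

&lt;p&gt;With &lt;code&gt;createRateLimiter(20, 60000)&lt;/code&gt;, each device gets 20 requests a minute. When &lt;code&gt;allow(deviceId)&lt;/code&gt; returns &lt;code&gt;false&lt;/code&gt;, the proxy would answer with HTTP 429 instead of calling the AI provider.&lt;/p&gt;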

&lt;p&gt;&lt;strong&gt;Generative UI on mobile.&lt;/strong&gt; This is the gap. On the web, &lt;a href="https://tambo.co/" rel="noopener noreferrer"&gt;Tambo&lt;/a&gt; and Vercel AI SDK UI let you have an LLM render React components on the fly. On mobile, there's nothing equivalent that's stable yet. (Aside: I'm building one, open source. More on that another time.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;On-device inference is possible, barely.&lt;/strong&gt; &lt;code&gt;llama.rn&lt;/code&gt;, Core ML, and ML Kit can run small models locally for specific use cases (transcription, classification, simple chat). But for anything resembling Claude or GPT-4 quality, you're still calling an API. Plan for that.&lt;/p&gt;

&lt;p&gt;Here's a representative snippet, a streaming chat hook the way I'd write it for a mobile app, with the buffer pattern that keeps frames smooth:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// useStreamingChat.ts: pattern from the AI Mobile Launcher boilerplate&lt;/span&gt;
&lt;span class="c1"&gt;// Note: streaming fetch in RN needs expo/fetch (Expo SDK 52+) or a polyfill.&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useCallback&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;fetch&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;expoFetch&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;expo/fetch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;Message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;useStreamingChat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;apiUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;([]);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bufferRef&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useRef&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flushTimerRef&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useRef&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;NodeJS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Timeout&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;flush&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useCallback&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;bufferRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;bufferRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;bufferRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;last&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;last&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;last&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;last&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;chunk&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[]);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;send&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useCallback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;assistant&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;]);&lt;/span&gt;

      &lt;span class="c1"&gt;// Hit your backend proxy. Never the AI API directly from the device.&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;expoFetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;apiUrl&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/chat`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;input&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getReader&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextDecoder&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

      &lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;done&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;done&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
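        &lt;span class="c1"&gt;// stream: true keeps multi-byte characters intact when they split across chunks.&lt;/span&gt;
        &lt;span class="nx"&gt;bufferRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;decoder&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;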
        &lt;span class="c1"&gt;// Batch UI updates at ~60fps instead of every token.&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;flushTimerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;flushTimerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
            &lt;span class="nx"&gt;flushTimerRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="nf"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;apiUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;flush&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;send&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;That pattern alone, buffering tokens and flushing at ~60fps instead of on every chunk, fixes most of the dropped-frames issues new RN devs hit when they first try streaming.&lt;/p&gt;
&lt;h2&gt;
  
  
  The shortcut
&lt;/h2&gt;

&lt;p&gt;If you're at day zero with React Native and you want to ship an AI-powered mobile app, here's the boring truth: the first two weeks are setup. Routing, styling system, secure API proxy, streaming UI, auth, EAS pipeline, build configs, app icons, splash screens. You'll do all of this before you write your first feature.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://aimobilelauncher.com" rel="noopener noreferrer"&gt;AI Mobile Launcher&lt;/a&gt; because I'd done that two-week setup three times in a row. It's an Expo + React Native boilerplate with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;React Navigation with auth screens scaffolded&lt;/li&gt;
&lt;li&gt;Reanimated patterns ready&lt;/li&gt;
&lt;li&gt;Backend proxy for OpenAI / Anthropic / OpenRouter (deploy to Cloudflare Workers in one command)&lt;/li&gt;
&lt;li&gt;Streaming chat UI with the frame-safe pattern from the snippet above&lt;/li&gt;
&lt;li&gt;EAS Build and EAS Update preconfigured&lt;/li&gt;
&lt;li&gt;App icon, splash screen, and store metadata templated&lt;/li&gt;
&lt;li&gt;RevenueCat, authentication, onboarding skills, a design system built with React Native Restyle, and a scalable, high-performance architecture, with the U-AMOS system included. Read about it here:
&lt;div class="crayons-card c-embed text-styles text-styles--secondary"&gt;
    &lt;div class="c-embed__content"&gt;
        &lt;div class="c-embed__cover"&gt;
          &lt;a href="https://codemeetai.substack.com/p/i-spent-6-months-losing-fights-with" class="c-link align-middle" rel="noopener noreferrer"&gt;
            &lt;img alt="" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsubstackcdn.com%2Fimage%2Ffetch%2F%24s_%21Ca-X%21%2Cw_1200%2Ch_675%2Cc_fill%2Cf_jpg%2Cq_auto%3Agood%2Cfl_progressive%3Asteep%2Cg_auto%2Fhttps%253A%252F%252Fsubstack-post-media.s3.amazonaws.com%252Fpublic%252Fimages%252F8bb96400-bd56-4f7a-a511-337017c990ee_1456x816.png" height="450" class="m-0" width="800"&gt;
          &lt;/a&gt;
        &lt;/div&gt;
      &lt;div class="c-embed__body"&gt;
        &lt;h2 class="fs-xl lh-tight"&gt;
          &lt;a href="https://codemeetai.substack.com/p/i-spent-6-months-losing-fights-with" rel="noopener noreferrer" class="c-link"&gt;
            I spent 6 months losing fights with AI in React Native. Then I built U-AMOS.
          &lt;/a&gt;
        &lt;/h2&gt;
          &lt;p class="truncate-at-3"&gt;
            The memory system that cut hallucinations 93% and token costs 91% across my own projects — and why the broader ecosystem is converging on the same pattern.
          &lt;/p&gt;
        &lt;div class="color-secondary fs-s flex items-center"&gt;
            &lt;img alt="favicon" class="c-embed__favicon m-0 mr-2 radius-0" src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fsubstackcdn.com%2Fimage%2Ffetch%2F%24s_%21p5MU%21%2Cf_auto%2Cq_auto%3Agood%2Cfl_progressive%3Asteep%2Fhttps%253A%252F%252Fsubstack-post-media.s3.amazonaws.com%252Fpublic%252Fimages%252F00b0edef-2187-4de8-9be9-039365cff6dc%252Ffavicon.ico" width="64" height="64"&gt;
          codemeetai.substack.com
        &lt;/div&gt;
      &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;It's the boilerplate I'd hand my past self on day zero. If you're a web dev planning your first AI mobile app, it cuts the setup phase from two weeks to an afternoon.&lt;/p&gt;

&lt;p&gt;Use it, fork it, ignore it. The goal is to not lose two weeks to plumbing.&lt;/p&gt;

&lt;h2&gt;
  
  
  One last thing
&lt;/h2&gt;

&lt;p&gt;If you're a web dev planning to build a mobile app: stop reading the "React Native vs Flutter" arguments. The framework isn't your bottleneck. The surface area you don't know yet, keyboard handling, native builds, store submissions, AI key management, is.&lt;/p&gt;

&lt;p&gt;Pick Expo. Ship something small. Hit a wall. Read the docs for that wall. Repeat.&lt;/p&gt;

&lt;p&gt;That's the whole path.&lt;/p&gt;

</description>
      <category>reactnative</category>
      <category>expo</category>
      <category>ai</category>
      <category>mobile</category>
    </item>
    <item>
      <title>How I stopped Claude Code from hallucinating 42% of my React Code</title>
      <dc:creator>Malik Chohra</dc:creator>
      <pubDate>Wed, 06 May 2026 08:52:13 +0000</pubDate>
      <link>https://dev.to/malik_chohra/how-i-stopped-claude-code-from-hallucinating-42-of-my-react-code-41ki</link>
      <guid>https://dev.to/malik_chohra/how-i-stopped-claude-code-from-hallucinating-42-of-my-react-code-41ki</guid>
      <description>&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;I tracked 6 months of my own AI coding sessions in React Native. In my logs, 42% of AI-generated diffs contained at least one hallucinated import, fake API, or duplicate component.&lt;/li&gt;
&lt;li&gt;Token costs were the second tax. Re-loading project context every session cost roughly $135/month per developer at the model pricing I was using.&lt;/li&gt;
&lt;li&gt;Better prompts didn’t fix either problem. The AI didn’t need smarter instructions: it needed memory and a map.&lt;/li&gt;
&lt;li&gt;I built U-AMOS (Universal AI Memory Operating System): a 3-tier memory bank, a context map, a rule priority system that splits “what to do” from “how to do it,” a 7-point anti-hallucination checklist, and a plan/act workflow that runs before any code is generated.&lt;/li&gt;
&lt;li&gt;After deploying U-AMOS across my own projects over a 3-month tracking period: hallucinations dropped from 42% to 3%. Token costs dropped from $180/month to $18/month. Feature velocity increased roughly 5x. These are my internal numbers: I’ll note where external research reports similar magnitudes.&lt;/li&gt;
&lt;li&gt;The framework is open and documented. U-AMOS 2.0 also ships pre-configured inside AI Mobile Launcher for anyone who doesn’t want to build it from scratch.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  A note on the numbers
&lt;/h2&gt;

&lt;p&gt;Everything in this article that is quantified — the 42%, the $135/month, the 91% reduction — comes from 6 months of my own session logs across my React Native projects. I tracked hallucinations manually, counted tokens via API usage dashboards, and measured debugging time against my own estimates. These are not controlled experiments.&lt;/p&gt;

&lt;p&gt;What I can say is that the direction of the results matches what external research is starting to report. Memory-system papers are showing 40–60% accuracy improvements and 60–90% token reductions when you introduce structured memory into LLM workflows. Mem0’s Claude Code integration reports roughly 90% lower token usage with persistent memory vs full-context prompting. The order of magnitude is consistent. The exact numbers are mine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The moment I stopped pretending it was working
&lt;/h2&gt;

&lt;p&gt;It was a Tuesday in October. I was building a feature for my app and asked Claude Code to add Redux Toolkit to manage user accounts. It generated something that looked correct. I committed it.&lt;/p&gt;

&lt;p&gt;Twenty minutes later, the build failed.&lt;/p&gt;

&lt;p&gt;The AI had imported &lt;code&gt;useRouter&lt;/code&gt; from &lt;code&gt;next/router&lt;/code&gt;. In a React Native project. That hook doesn’t exist on mobile. It was a 30-second fix, but it wasn’t the first time. It was the fourth time that week.&lt;/p&gt;

&lt;p&gt;I started keeping a log. Every wrong thing the AI generated, I wrote down. After a month, I had the data from my own sessions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;42% of AI-generated diffs had at least one hallucinated import, function, or component&lt;/li&gt;
&lt;li&gt;25% of the components it created already existed in the codebase under a different name&lt;/li&gt;
&lt;li&gt;I was spending roughly 4 hours a week debugging things the AI had invented&lt;/li&gt;
&lt;li&gt;I was using Cursor much more than Claude at the time, and Cursor’s analytics dashboard confirmed some of my hypotheses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The frustrating part was that I knew the AI wasn’t getting worse. I was paying for the best models. The prompts were detailed. The context windows were huge.&lt;/p&gt;

&lt;p&gt;The problem wasn’t the model. The problem was that I was treating it like a senior developer when it was behaving like a junior with no memory of the project, and no map of the codebase.&lt;/p&gt;

&lt;p&gt;I had experimented before with adding rules and a memory bank, but the AI always struggled to grasp the whole context, and I had to keep reminding it far too often.&lt;/p&gt;

&lt;h2&gt;
  
  
  The token tax nobody talks about
&lt;/h2&gt;

&lt;p&gt;While I was tracking hallucinations, I also started tracking token usage. The numbers were uncomfortable.&lt;/p&gt;

&lt;p&gt;Every session, I was loading the same context: project structure, architecture decisions, naming conventions, what components already existed. The AI had no memory between sessions, so I kept re-explaining everything. Worse, when I didn’t re-explain, the AI would explore: running directory listings, opening files at random, building up its own picture of the codebase by trial and error.&lt;/p&gt;

&lt;p&gt;That exploration is where the worst of the token bleeding happens. Asking “where is the authentication logic?” can trigger 25,000 tokens of blind navigation through folders before the AI finds it.&lt;/p&gt;

&lt;p&gt;The math, at the model pricing I was using at the time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Session 1: Re-load + explore project structure → 50,000 tokens&lt;/li&gt;
&lt;li&gt;Session 2: Re-load + explore project structure → 50,000 tokens&lt;/li&gt;
&lt;li&gt;Session 3: Re-load + explore project structure → 50,000 tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Daily total:&lt;/strong&gt; 150,000 tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monthly cost:&lt;/strong&gt; ~$135/month per developer&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;(based on ~$30 per million tokens, prompt + completion)&lt;/em&gt;&lt;/p&gt;
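
&lt;p&gt;The arithmetic behind that figure can be sketched in a few lines. The function name is hypothetical, and the 30-day month with three sessions every day is an assumption I am making to reproduce the number:&lt;/p&gt;

```typescript
// Back-of-envelope check of the figures above.
// Assumption: a 30-day month with the same number of sessions every day.
function monthlyContextCostUsd(
  tokensPerSession: number,
  sessionsPerDay: number,
  pricePerMillionTokensUsd: number,
  daysPerMonth: number = 30,
): number {
  const monthlyTokens = tokensPerSession * sessionsPerDay * daysPerMonth;
  return (monthlyTokens / 1_000_000) * pricePerMillionTokensUsd;
}

// 50,000 tokens x 3 sessions x 30 days = 4.5M tokens, at ~$30 per million
console.log(monthlyContextCostUsd(50_000, 3, 30)); // 135
```

&lt;p&gt;Swap in your own session count and model pricing; the point is that the cost scales linearly with how often the same context is re-loaded.&lt;/p&gt;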

&lt;p&gt;That’s the invisible tax. Even when the AI was generating correct code, I was paying to give it the same context every time, plus paying for it to wander around the repo finding things it should already know about.&lt;/p&gt;

&lt;p&gt;I remember creating one file, &lt;code&gt;architecture.md&lt;/code&gt;, where I put the context I was giving the AI each time, and then &lt;code&gt;review_best_practices.md&lt;/code&gt;, which held rules for the mistakes it kept repeating.&lt;/p&gt;

&lt;p&gt;Then came Claude Code’s best practices. I tried the obvious approaches first. Longer CLAUDE.md files. More detailed system prompts. Better instructions on what to remember.&lt;/p&gt;

&lt;p&gt;None of it worked sustainably. The AI would hold context for a session or two, then drift. Because the problem wasn’t the prompt. It was the architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  The reframe that changed everything
&lt;/h2&gt;

&lt;p&gt;The shift came when I stopped thinking of AI as a developer and started thinking of it as a system that needed memory built for it, and a map handed to it. I remember watching an interview with Thomas Dohmke in which he said that one of the best practices is to treat it as a colleague, not a tool.&lt;/p&gt;

&lt;p&gt;A junior dev with no memory of your project would also generate hallucinated imports. Would also recreate components that already existed. Would also waste hours wandering through unfamiliar code looking for the right file. The AI wasn’t broken. The relationship was broken. I was asking it to behave like it had context it didn’t have.&lt;/p&gt;

&lt;p&gt;A lot of content I’ve seen treats this as a prompting problem. Write a better system prompt. Use a longer context window. Be more specific in your instructions.&lt;/p&gt;

&lt;p&gt;My experience, and increasingly what I see from teams who’ve shipped real production AI-assisted codebases, is that prompts plateau. Durable context compounds. The teams getting consistent AI output aren’t writing better prompts; they’re building memory systems that load the right context at the right time and update automatically when something changes.&lt;/p&gt;

&lt;p&gt;You can read my article on prompt engineering approaches here:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Essential guide of Prompt Engineering for Software Engineers&lt;/strong&gt;&lt;br&gt;
Malik CHOHRA · 17 November 2025&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That’s what I built. I called it U-AMOS.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc46qkgfqop62anikshkw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc46qkgfqop62anikshkw.png" alt="UAMOS schema"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What U-AMOS actually is
&lt;/h2&gt;

&lt;p&gt;U-AMOS (Universal AI Memory Operating System) is a framework for managing AI-assisted development. It has five components, each solving a specific failure mode I’d logged.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
                  ┌──────────────────────┐
                  │     Memory Bank      │
                  │ (Cold / Warm / Hot)  │
                  └─────────┬────────────┘
                            ↓
                  ┌──────────────────────┐
                  │     Context Map      │
                  │   (Index / Lookup)   │
                  └─────────┬────────────┘
                            ↓
                  ┌──────────────────────┐
                  │     Plan Mode        │
                  │  (before execution)  │
                  └─────────┬────────────┘
                            ↓
                  ┌──────────────────────┐
                  │ Validation Layer     │
                  │ (7-point checklist)  │
                  └─────────┬────────────┘
                            ↓
                  ┌──────────────────────┐
                  │   Code Generation    │
                  └─────────┬────────────┘
                            ↓
                  ┌──────────────────────┐
                  │  Progress Logging    │
                  │   (.memory updates)  │
                  └─────────┬────────────┘
                            ↓
                  └──────→ FEEDBACK LOOP ──────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  1. The Memory Bank — three tiers, loaded on demand
&lt;/h2&gt;

&lt;p&gt;Not all context is equally important for every task. So I tiered it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cold tier (project identity — loads rarely, ~10% of sessions):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;00-description.md&lt;/code&gt; — what we’re building, in 500 words&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;01-brief.md&lt;/code&gt; — non-negotiable constraints&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;10-product.md&lt;/code&gt; — feature specs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Warm tier (architecture — loads on demand, ~30% of sessions):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;20-system.md&lt;/code&gt; — how the system works&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;30-tech.md&lt;/code&gt; — stack and dependencies&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;60-decisions.md&lt;/code&gt; — why we chose what we chose&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;70-knowledge.md&lt;/code&gt; — lessons learned&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hot tier (current state — loads every session, 100%):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;40-active.md&lt;/code&gt; — what we’re working on right now (max 500 words)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;50-progress.md&lt;/code&gt; — what shipped recently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The hot tier is small (~2,000 tokens) and always loads. The warm tier loads when the task touches architecture (~5,000 tokens). The cold tier almost never loads during development — it’s the onboarding layer. A new developer (or a new AI agent starting a session) reads the cold tier once and understands the project without hunting through the entire repo.&lt;/p&gt;

&lt;p&gt;The result: 2,000–10,000 tokens per session instead of 50,000. That assumes you’re maintaining the files actively — see the hygiene section below.&lt;/p&gt;
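
&lt;p&gt;As a rough sketch, the tier selection can be expressed as a small function. The file names are the ones listed above; the function, the two session flags, and the &lt;code&gt;.memory/&lt;/code&gt; prefix on every file are assumptions for illustration, not the actual U-AMOS loader:&lt;/p&gt;

```typescript
// Hypothetical tier loader: file names come from the tiers above,
// the function and the two flags are illustrative assumptions.
type Tier = "cold" | "warm" | "hot";

const MEMORY_BANK: { [tier in Tier]: string[] } = {
  cold: ["00-description.md", "01-brief.md", "10-product.md"],
  warm: ["20-system.md", "30-tech.md", "60-decisions.md", "70-knowledge.md"],
  hot: ["40-active.md", "50-progress.md"],
};

interface SessionInfo {
  touchesArchitecture: boolean; // task changes system design, stack, or past decisions
  isOnboarding: boolean; // first session for a new developer or agent
}

function filesToLoad(session: SessionInfo): string[] {
  const files = [...MEMORY_BANK.hot]; // the hot tier loads every session
  if (session.touchesArchitecture) files.push(...MEMORY_BANK.warm);
  if (session.isOnboarding) files.push(...MEMORY_BANK.cold);
  return files.map((name) => `.memory/${name}`);
}

// A routine feature session only pulls the two hot-tier files.
console.log(filesToLoad({ touchesArchitecture: false, isOnboarding: false }));
```

&lt;p&gt;In practice the decision is made by the AI following the session protocol rather than by code, but the branching is the same: hot always, warm on architecture work, cold only for onboarding.&lt;/p&gt;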

&lt;h2&gt;
  
  
  2. The Context Map — the exploration killer
&lt;/h2&gt;

&lt;p&gt;This is the piece that does the most work for the lowest cost.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;context_map.md&lt;/code&gt; is a single 500-token lookup file at the root of the project. It indexes everything: every feature, every service, every core UI component, with the entry path next to each one.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Context Map&lt;/span&gt;
&lt;span class="gu"&gt;## Features (14)&lt;/span&gt;
| Feature        | Entry Point                      | Purpose            |
|----------------|----------------------------------|--------------------|
| auth           | src/features/auth/index.ts       | Authentication     |
| onboarding     | src/features/onboarding/index.ts | User onboarding    |
| todos          | src/features/todos/index.ts      | Todo management    |

&lt;span class="gu"&gt;## Services (15)&lt;/span&gt;
| Service        | Path                             | Responsibility     |
|----------------|----------------------------------|--------------------|
| logger         | src/services/logging/logger.ts   | Centralized logs   |
| analytics      | src/services/analytics/...       | Firebase analytics |

&lt;span class="gu"&gt;## UI Components (40+)&lt;/span&gt;
| Category       | Components                       |
|----------------|----------------------------------|
| Buttons        | Button, IconButton, FAB          |
| Forms          | Input, ControlledInput, Switch   |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the AI starts a session and needs to know “where does authentication live?”, it reads one 500-token file instead of running directory listings, opening five files to compare them, and burning 25,000 tokens building its own mental model of the repo.&lt;/p&gt;

&lt;p&gt;In my own logs, this single file removed roughly 60% of the per-session token consumption that wasn’t already covered by the memory bank. The math: 500 tokens replaces 25,000. That’s a 50x reduction on the most expensive part of every session: discovery.&lt;/p&gt;
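
&lt;p&gt;To show why a flat table is enough, here is a minimal sketch that turns the three-column tables above into a name-to-path lookup. The parser is hypothetical and only handles rows whose second column contains a path; the AI reads the file directly and needs no such code:&lt;/p&gt;

```typescript
// Hypothetical parser for the three-column table format shown above.
// Not part of U-AMOS; it only illustrates that the map is a plain lookup.
function parseContextMap(markdown: string): { [name: string]: string } {
  const lookup: { [name: string]: string } = {};
  for (const line of markdown.split("\n")) {
    if (!line.startsWith("|")) continue; // skip headings and blank lines
    const cells = line
      .split("|")
      .map((cell) => cell.trim())
      .filter((cell) => cell.length > 0);
    if (cells.length !== 3) continue; // only Name | Path | Purpose rows
    const [name, path] = cells;
    if (!path.includes("/")) continue; // drops header and separator rows
    lookup[name] = path;
  }
  return lookup;
}

const sampleMap = [
  "## Features (14)",
  "| Feature | Entry Point | Purpose |",
  "|---------|-------------|---------|",
  "| auth | src/features/auth/index.ts | Authentication |",
  "| todos | src/features/todos/index.ts | Todo management |",
].join("\n");

console.log(parseContextMap(sampleMap)["auth"]); // src/features/auth/index.ts
```

&lt;p&gt;One read of the file answers “where does authentication live?” without a single directory listing.&lt;/p&gt;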

&lt;h2&gt;
  
  
  3. The Rule Priority System — three tiers, with generators separate from rules
&lt;/h2&gt;

&lt;p&gt;The same logic applies to coding rules.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Critical rules (always load, ~4,000 tokens):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Meta-rules and session protocol&lt;/li&gt;
&lt;li&gt;Anti-hallucination checklist&lt;/li&gt;
&lt;li&gt;Common violations (no inline styles, no &lt;code&gt;console.log&lt;/code&gt;, no hardcoded strings, no API keys)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Important rules (task-specific, ~2,000 tokens each):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Design system patterns: loads if working on UI&lt;/li&gt;
&lt;li&gt;State management rules: loads if working on state&lt;/li&gt;
&lt;li&gt;i18n patterns: loads if adding translations&lt;/li&gt;
&lt;li&gt;Navigation patterns: loads if adding routes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Recommended rules (load if relevant):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Performance optimizations&lt;/li&gt;
&lt;li&gt;Testing patterns&lt;/li&gt;
&lt;li&gt;Security and platform-specific privacy rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The other architectural distinction that mattered: I separated generators from rules. They look similar but they solve different problems.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Generators answer what to do.&lt;/strong&gt; Step-by-step implementation guides for recurring tasks: “add a new language,” “add a new screen,” “add a paywall.” They’re workflow documents — copy this template, register here, run this script.&lt;br&gt;
I include these in my AI React Native boilerplate, &lt;a href="https://aimobilelauncher.com/" rel="noopener noreferrer"&gt;https://aimobilelauncher.com/&lt;/a&gt;, where I explain them and you can browse the code for the different generators.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rules answer how to do it well.&lt;/strong&gt; Code quality patterns and constraints: this is what good styling looks like; this is what the wrong import path looks like.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When you mix the two, when your “how to add a language” doc also tries to explain every i18n best practice, the AI gets overwhelmed and follows neither cleanly. Splitting them means the AI reads the generator to know the steps, then reads the matching rule pack to write the code correctly. Two clean reads. No drift.&lt;/p&gt;
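
&lt;p&gt;The priority tiers can be sketched as a filter over tagged rule packs. The pack names and the tag vocabulary below are hypothetical, chosen to mirror the lists above; the real packs are markdown files selected by the AI from the session protocol:&lt;/p&gt;

```typescript
// Hypothetical rule-pack selector; pack names and tags are illustrative only.
interface RulePack {
  name: string;
  priority: "critical" | "important" | "recommended";
  tags: string[]; // task areas the pack applies to; ignored for critical packs
}

const RULE_PACKS: RulePack[] = [
  { name: "anti-hallucination-checklist", priority: "critical", tags: [] },
  { name: "common-violations", priority: "critical", tags: [] },
  { name: "design-system", priority: "important", tags: ["ui"] },
  { name: "state-management", priority: "important", tags: ["state"] },
  { name: "i18n", priority: "important", tags: ["translations"] },
  { name: "navigation", priority: "important", tags: ["routes"] },
  { name: "testing", priority: "recommended", tags: ["tests"] },
];

function selectRulePacks(taskTags: string[]): string[] {
  return RULE_PACKS.filter(
    (pack) =>
      // critical packs always load; the rest load only when a tag matches
      pack.priority === "critical" || pack.tags.some((tag) => taskTags.includes(tag)),
  ).map((pack) => pack.name);
}

// A translation task loads the two critical packs plus "i18n".
console.log(selectRulePacks(["translations"]));
```

&lt;p&gt;A generator such as “add a new language” would then name the tags it needs, so the steps and the quality rules stay in separate documents.&lt;/p&gt;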

&lt;h2&gt;
  
  
  4. Concrete examples beat abstract rules
&lt;/h2&gt;

&lt;p&gt;This is a philosophical point but it’s the reason U-AMOS rules actually work.&lt;/p&gt;

&lt;p&gt;Most rule documents read like this: “Use proper styling conventions. Avoid inline styles where possible.”&lt;/p&gt;

&lt;p&gt;Rules in U-AMOS read like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Styling&lt;/span&gt;

&lt;span class="gu"&gt;### ❌ WRONG — inline styles&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;View&lt;/span&gt; &lt;span class="na"&gt;style=&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt; &lt;span class="na"&gt;marginTop:&lt;/span&gt; &lt;span class="err"&gt;20,&lt;/span&gt; &lt;span class="na"&gt;padding:&lt;/span&gt; &lt;span class="err"&gt;16&lt;/span&gt; &lt;span class="err"&gt;}}&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

&lt;span class="gu"&gt;### ✅ CORRECT — Restyle props&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;Box&lt;/span&gt; &lt;span class="na"&gt;marginTop=&lt;/span&gt;&lt;span class="s"&gt;"xl"&lt;/span&gt; &lt;span class="na"&gt;padding=&lt;/span&gt;&lt;span class="s"&gt;"lg"&lt;/span&gt;&lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;

&lt;span class="gu"&gt;### Exception: unsupported properties&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;Box&lt;/span&gt; &lt;span class="na"&gt;marginTop=&lt;/span&gt;&lt;span class="s"&gt;"xl"&lt;/span&gt; &lt;span class="na"&gt;style=&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt; &lt;span class="na"&gt;opacity:&lt;/span&gt; &lt;span class="err"&gt;0.5&lt;/span&gt; &lt;span class="err"&gt;}}&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
(opacity is not a Restyle prop, inline is acceptable here)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;LLMs don’t generalize abstract principles well. They pattern-match. If you show them what wrong looks like next to what right looks like, they reliably produce the right pattern. If you tell them to “follow good practices,” they produce whatever the training data nudged them toward last time.&lt;/p&gt;

&lt;p&gt;Every rule pack in U-AMOS is built this way. ❌ wrong → ✅ correct → exception (if any). No paragraphs of theory. No abstract guidelines. Just visual diffs. This is the single biggest determinant of whether a rule actually changes the AI’s output or gets ignored.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. The 7-Point Anti-Hallucination Checklist
&lt;/h2&gt;

&lt;p&gt;Before any code is generated, the AI verifies:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Does the file I’m editing exist?&lt;/li&gt;
&lt;li&gt;Did I check the component inventory before creating something new?&lt;/li&gt;
&lt;li&gt;Did I check the service registry?&lt;/li&gt;
&lt;li&gt;Is the import path correct?&lt;/li&gt;
&lt;li&gt;Does the function I’m calling actually exist in that file?&lt;/li&gt;
&lt;li&gt;Am I using the project’s i18n pattern, not hardcoded strings?&lt;/li&gt;
&lt;li&gt;Am I using the project’s logger, not &lt;code&gt;console.log&lt;/code&gt;?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If any answer is no, the AI stops and verifies before continuing.&lt;/p&gt;

&lt;p&gt;The first week I deployed this, my hallucination rate in my own sessions dropped from 42% to under 5%. Not because the model improved. Because I made verification mandatory before generation.&lt;/p&gt;

&lt;p&gt;Each of these rules is manually crafted.&lt;/p&gt;
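
&lt;p&gt;In U-AMOS the checklist is an instruction the AI follows, not a program. Still, points 4, 5, and 7 are mechanical enough to sketch as code. The inventory shape and the function below are assumptions made for illustration:&lt;/p&gt;

```typescript
// Hypothetical pre-generation check covering points 4, 5, and 7 of the checklist.
// The inventory shape and all names here are assumptions for illustration.
interface PlannedImport {
  path: string;
  symbol: string;
}

type ExportInventory = { [path: string]: string[] };

function verifyPlan(
  plannedImports: PlannedImport[],
  code: string,
  inventory: ExportInventory,
): string[] {
  const problems: string[] = [];
  for (const item of plannedImports) {
    const exportsAtPath = inventory[item.path];
    if (exportsAtPath === undefined) {
      problems.push(`Unknown import path: ${item.path}`); // point 4
    } else if (!exportsAtPath.includes(item.symbol)) {
      problems.push(`${item.symbol} is not exported from ${item.path}`); // point 5
    }
  }
  if (code.includes("console.log")) {
    problems.push("Use the project logger, not console.log"); // point 7
  }
  return problems; // an empty array means generation may proceed
}

const inventory: ExportInventory = {
  "src/services/logging/logger.ts": ["logger"],
};

// The hallucination from the opening story is caught before any code is written.
console.log(verifyPlan([{ path: "next/router", symbol: "useRouter" }], "", inventory));
```

&lt;p&gt;The same idea is why the component inventory and service registry exist: the AI has something concrete to check against instead of guessing.&lt;/p&gt;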

&lt;h2&gt;
  
  
  6. Plan/Act Mode — no code without a plan
&lt;/h2&gt;

&lt;p&gt;This is the piece I added after the initial U-AMOS deployment, and it might be the highest-leverage addition.&lt;/p&gt;

&lt;p&gt;Before touching more than one file, the AI must:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read &lt;code&gt;.memory/40-active.md&lt;/code&gt; (current focus)&lt;/li&gt;
&lt;li&gt;Draft an implementation plan in plain markdown&lt;/li&gt;
&lt;li&gt;Wait for my confirmation&lt;/li&gt;
&lt;li&gt;Execute only after approval&lt;/li&gt;
&lt;li&gt;Log what it actually shipped back into &lt;code&gt;.memory/50-progress.md&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This sounds slow. It’s actually faster because you catch architectural mistakes at the plan stage instead of the debugging stage. Tweag’s Agentic Coding Handbook and Lullabot’s memory bank guide both document the same pattern. It’s becoming standard practice in teams using agentic coding seriously.&lt;/p&gt;
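
&lt;p&gt;The gate itself is a tiny state machine. This is a minimal sketch under my own naming, not the tooling Cursor or Claude Code use; the array stands in for &lt;code&gt;.memory/50-progress.md&lt;/code&gt;:&lt;/p&gt;

```typescript
// Hypothetical plan/act gate modeled as a small state machine.
type Phase = "planning" | "awaiting_approval" | "executing" | "logged";

class PlanActSession {
  phase: Phase = "planning";
  plan = "";
  progressLog: string[] = []; // stands in for .memory/50-progress.md

  submitPlan(plan: string): void {
    if (this.phase !== "planning") throw new Error("A plan was already submitted");
    this.plan = plan;
    this.phase = "awaiting_approval";
  }

  approve(): void {
    if (this.phase !== "awaiting_approval") throw new Error("Nothing to approve");
    this.phase = "executing";
  }

  finish(summary: string): void {
    // Code can only ship from the executing phase, i.e. after approval.
    if (this.phase !== "executing") throw new Error("Cannot ship without an approved plan");
    this.progressLog.push(summary);
    this.phase = "logged";
  }
}

const session = new PlanActSession();
session.submitPlan("1. Add accountSlice 2. Wire it into the store");
session.approve();
session.finish("Shipped accountSlice");
console.log(session.progressLog.length); // 1
```

&lt;p&gt;Calling &lt;code&gt;finish&lt;/code&gt; before &lt;code&gt;approve&lt;/code&gt; throws, which is the whole point: there is no path to shipped code that skips the review.&lt;/p&gt;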

&lt;h2&gt;
  
  
  What changed after U-AMOS
&lt;/h2&gt;

&lt;p&gt;I tracked the same metrics for 3 months after deploying U-AMOS across my own projects.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hallucinations&lt;/strong&gt; (from my logs): 42% → 3% (93% reduction)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tokens per session&lt;/strong&gt; (average): 48,000 → 4,200 (91% reduction)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Token cost&lt;/strong&gt; (at my model tier): ~$180/month → ~$18/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time debugging AI errors:&lt;/strong&gt; 4 hours/week → 20 minutes/week&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Duplicate components created:&lt;/strong&gt; 23 in the 3 months before → 0 in the 3 months after&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Feature velocity:&lt;/strong&gt; roughly 5x faster on features I tracked end-to-end&lt;/li&gt;
&lt;/ul&gt;
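
&lt;p&gt;The percentages follow directly from the before and after values. A quick check, with a hypothetical helper and the debugging time converted to minutes:&lt;/p&gt;

```typescript
// Percentage reduction, rounded to the nearest whole percent.
function percentReduction(before: number, after: number): number {
  return Math.round(((before - after) / before) * 100);
}

console.log(percentReduction(42, 3)); // 93 (hallucination rate)
console.log(percentReduction(48_000, 4_200)); // 91 (tokens per session)
console.log(percentReduction(180, 18)); // 90 (monthly token cost)
console.log(percentReduction(240, 20)); // 92 (debugging minutes per week)
```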

&lt;p&gt;I also started tracking which rule packs loaded most often and which hallucination types were still slipping through. That observability layer is what tells you where the system needs a new rule file vs where the AI needs better examples.&lt;/p&gt;

&lt;h2&gt;
  
  
  Memory hygiene: pruning, plus living rules
&lt;/h2&gt;

&lt;p&gt;The mistake I see in most memory bank setups is treating the files as append-only. They’re not. They need pruning.&lt;/p&gt;

&lt;p&gt;My current hygiene routine:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;40-active.md&lt;/code&gt; updates at the start of every work session (what’s the actual focus today)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;50-progress.md&lt;/code&gt; gets a new entry after every shipped feature; old entries are archived monthly&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;70-knowledge.md&lt;/code&gt; gets pruned weekly: if a lesson is now in a rule file, it gets removed from the knowledge doc&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;20-system.md&lt;/code&gt; only updates when architecture actually changes&lt;/li&gt;
&lt;li&gt;If the AI proposes changes to any memory file, it does it as a plan diff I review; it never writes to memory silently&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There’s one more file that prevents documentation rot: &lt;code&gt;updated_rules.md&lt;/code&gt;. It’s a changelog for rule exceptions.&lt;/p&gt;

&lt;p&gt;When the team makes a real exception to a rule (for example, “we never use inline styles, EXCEPT for the opacity prop because Restyle doesn’t support it”), that exception goes in &lt;code&gt;updated_rules.md&lt;/code&gt; with a date and a reason. Not into the main rule file.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Updated Rules (Living Document)&lt;/span&gt;
&lt;span class="gu"&gt;## 2025-12-20 — Inline styles exception&lt;/span&gt;

&lt;span class="gs"&gt;**Original rule**&lt;/span&gt;: NO inline styles ever
&lt;span class="gs"&gt;**Updated rule**&lt;/span&gt;: NO inline styles EXCEPT for single properties not supported by Restyle (opacity)
&lt;span class="gs"&gt;**Why**&lt;/span&gt;: Restyle doesn’t support opacity prop
&lt;span class="gs"&gt;**Example**&lt;/span&gt;: ✅ &lt;span class="nt"&gt;&amp;lt;Box&lt;/span&gt; &lt;span class="na"&gt;marginTop=&lt;/span&gt;&lt;span class="s"&gt;"xl"&lt;/span&gt; &lt;span class="na"&gt;style=&lt;/span&gt;&lt;span class="s"&gt;{{&lt;/span&gt; &lt;span class="na"&gt;opacity:&lt;/span&gt; &lt;span class="err"&gt;0.5&lt;/span&gt; &lt;span class="err"&gt;}}&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Why this matters: rules become outdated quickly, and rewriting them every time creates drift. The living rules file lets the AI always check the latest guidance without losing the original logic. Exceptions are explicit and dated. Historical context is preserved. The main rule files stay clean.&lt;/p&gt;

&lt;p&gt;The 2,000–10,000 token figure holds only if you maintain all of this. If you let the files grow unchecked, you’ll hit 50,000 tokens again within two months. The context window isn’t the bottleneck; your maintenance habits are.&lt;/p&gt;
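
&lt;p&gt;One way to keep yourself honest is a small budget check you run before a session. This is a sketch only: the 500-word limits are the ones stated for &lt;code&gt;00-description.md&lt;/code&gt; and &lt;code&gt;40-active.md&lt;/code&gt; earlier, while the function names and the idea of passing file contents in as strings are my assumptions:&lt;/p&gt;

```typescript
// Hypothetical hygiene check. The 500-word limits come from the tier
// descriptions; everything else here is illustrative.
const WORD_BUDGETS: { [file: string]: number } = {
  "00-description.md": 500,
  "40-active.md": 500,
};

function countWords(text: string): number {
  return text.split(/\s+/).filter((word) => word.length > 0).length;
}

// Returns the names of files whose content exceeds its word budget.
// Files without a declared budget are never flagged.
function overBudget(files: { [file: string]: string }): string[] {
  return Object.keys(files).filter((file) => {
    const budget = WORD_BUDGETS[file];
    return budget !== undefined ? countWords(files[file]) > budget : false;
  });
}

const bloated = Array(501).fill("word").join(" ");
console.log(overBudget({ "40-active.md": bloated, "50-progress.md": bloated }));
// only "40-active.md" is flagged; "50-progress.md" has no declared budget
```

&lt;p&gt;Adding budgets for the remaining files is a judgment call per project; the check just makes the growth visible before it turns back into 50,000-token sessions.&lt;/p&gt;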

&lt;h2&gt;
  
  
  What still doesn’t work, and what’s on the roadmap
&lt;/h2&gt;

&lt;p&gt;This isn’t a finished system. Three things still fail or are incomplete:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long sessions.&lt;/strong&gt; Context degrades over multi-hour conversations. I re-attach memory bank files every 30–40 messages. A better solution is probably an MCP server that handles re-injection automatically, but I haven’t built it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance edge cases.&lt;/strong&gt; The AI generates working code that sometimes re-renders too aggressively. Architecture rules help, but don’t eliminate this. I’m addressing it by writing performance rules for Expo apps. I started from Expo’s official rules, but they aren’t enough on their own; for this project architecture they need a lot of fixes and improvements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-project memory.&lt;/strong&gt; U-AMOS handles per-project memory. The next layer — preferences and patterns that follow you across every project you touch — is what tools like Mem0’s MCP integration and Claude Code’s own auto-memory system are starting to solve. If you find yourself re-teaching the same conventions in every new repo, cross-project memory is the fix. I’m watching this space closely.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to set up U-AMOS yourself
&lt;/h2&gt;

&lt;p&gt;I have created an initialization prompt for the system. I tested it on some of my projects, and it worked. It doesn’t include many rules, but you can customize that part.&lt;/p&gt;

&lt;p&gt;You can check it here: &lt;a href="https://gist.github.com/chohra-med/129e6ced83805bc712e92978d4fe7f6d" rel="noopener noreferrer"&gt;link&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Related work worth reading
&lt;/h2&gt;

&lt;p&gt;U-AMOS didn’t emerge from a vacuum. These are the guides I’ve found most aligned with the same pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tweag’s Agentic Coding Handbook:&lt;/strong&gt; memory bank system and plan/act mode, well documented&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mem0’s Claude Code integration:&lt;/strong&gt; if you want cross-project memory on top of U-AMOS, this is the current best path&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Anthropic’s Claude Code best practices:&lt;/strong&gt; the official guidance on CLAUDE.md structure, memory, and tool use&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern is converging across all of these. Structured memory, tiered loading, mandatory verification before generation, plan-before-execute. U-AMOS is my implementation of that pattern for React Native specifically, with the anti-hallucination rules, the context map, and the mobile-specific constraints built in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Or, if you want it pre-configured
&lt;/h2&gt;

&lt;p&gt;I built AI Mobile Launcher as the productized version of U-AMOS for React Native.&lt;/p&gt;

&lt;p&gt;It ships with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The full 9-file memory bank is pre-structured for a new project&lt;/li&gt;
&lt;li&gt;A pre-built context map of every feature, service, and UI component&lt;/li&gt;
&lt;li&gt;All critical, important, and recommended rule packs — written as visual diffs, not paragraphs&lt;/li&gt;
&lt;li&gt;The split between generators (workflows) and rules (patterns) is already in place&lt;/li&gt;
&lt;li&gt;Pre-built component and service inventories&lt;/li&gt;
&lt;li&gt;Cursor and Claude Code entry points configured with plan/act mode&lt;/li&gt;
&lt;li&gt;Generators for common features (onboarding, paywalls, i18n, design system)&lt;/li&gt;
&lt;li&gt;The 7-point anti-hallucination checklist is embedded in every entry point&lt;/li&gt;
&lt;li&gt;A starter &lt;code&gt;updated_rules.md&lt;/code&gt; ready for your first exception&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Lite tier is free on GitHub. U-AMOS 2.0 ships fully configured in the Starter tier. If you’re starting a new React Native project and want the memory system running from day one without the setup work, that’s the fastest path. &lt;a href="https://aimobilelauncher.com/" rel="noopener noreferrer"&gt;aimobilelauncher.com&lt;/a&gt; &lt;/p&gt;

&lt;p&gt;If you’re adding U-AMOS to an existing project, the steps above are enough to get started. The framework isn’t magic — it’s the result of 6 months of failed sessions, logged and analyzed, until the AI stopped fighting me and started shipping with me.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I want you to take from this
&lt;/h2&gt;

&lt;p&gt;The content I see most often on AI coding frames this as a prompting problem. Use a better system prompt. Be more specific. Add more examples to your instructions.&lt;/p&gt;

&lt;p&gt;My experience over 6 months of tracking my own sessions is that prompts hit a ceiling. Once you’ve written a clear, specific prompt, the next 10 iterations give you marginal gains. Memory and structure compound differently: every lesson added to the memory bank improves every future session. Every entry in the context map saves another exploration loop. Every rule written as a visual diff prevents an entire category of hallucination permanently.&lt;/p&gt;

&lt;p&gt;The AI isn’t a developer you prompt. It’s a system you build context for. Build the memory. Hand it the map. Show it what wrong looks like next to what right looks like. Stop paying to re-explain the same architecture every day.&lt;/p&gt;

&lt;p&gt;U-AMOS is how I did it. The principles work without my specific files. The files work better with the principles. Either way: fix the memory and the map first, then build the product.&lt;/p&gt;

&lt;p&gt;I write Code Meet AI weekly — AI in mobile development, real tradeoffs, what’s actually working in production. Next issue: agent-first mobile architecture and why most “AI features” in apps are just bolted-on chatbots pretending to be product. → &lt;a href="https://codemeetai.substack.com/" rel="noopener noreferrer"&gt;https://codemeetai.substack.com/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>reactnative</category>
      <category>productivity</category>
      <category>llm</category>
    </item>
  </channel>
</rss>
