<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Khalit Hartmann</title>
    <description>The latest articles on DEV Community by Khalit Hartmann (@khalit_hartmann_17e573503).</description>
    <link>https://dev.to/khalit_hartmann_17e573503</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F4012301%2F08d1c54e-6d1a-4f46-b404-68b001b98ae9.png</url>
      <title>DEV Community: Khalit Hartmann</title>
      <link>https://dev.to/khalit_hartmann_17e573503</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/khalit_hartmann_17e573503"/>
    <language>en</language>
    <item>
      <title>Mobile App Engagement Strategy: How We Fixed a 0.6% Push Notification Open Rate</title>
      <dc:creator>Khalit Hartmann</dc:creator>
      <pubDate>Thu, 02 Jul 2026 16:19:58 +0000</pubDate>
      <link>https://dev.to/khalit_hartmann_17e573503/mobile-app-engagement-strategy-how-we-fixed-a-06-push-notification-open-rate-2iln</link>
      <guid>https://dev.to/khalit_hartmann_17e573503/mobile-app-engagement-strategy-how-we-fixed-a-06-push-notification-open-rate-2iln</guid>
      <description>&lt;p&gt;&lt;em&gt;Based on my talk at &lt;a href="https://www.meetup.com/flutter-berlin/" rel="noopener noreferrer"&gt;Flutter Berlin&lt;/a&gt; (Flutter OctoberFest). Assumes familiarity with mobile development. No ML or data science background needed.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;📺 &lt;a href="https://www.youtube.com/watch?v=XNFLwyOlZ6w" rel="noopener noreferrer"&gt;Watch the full talk on YouTube&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; Campaign push notifications at MediaMarktSaturn had a 0.6% open rate across 2.7 million sends. The problem wasn't the content. It was timing, relevance, and volume. We found a framework called Just-In-Time Adaptive Interventions (JITAI), originally from mobile health apps, and designed a system that decides &lt;em&gt;whether&lt;/em&gt;, &lt;em&gt;when&lt;/em&gt;, and &lt;em&gt;what&lt;/em&gt; to send each user. The system was still in discovery when I gave this talk, but the thinking behind it applies to any mobile app engagement strategy.&lt;/p&gt;




&lt;p&gt;Most mobile app engagement strategies treat push notifications as a broadcast channel: pick a segment, write copy, blast it out, hope for clicks. This post shows what happens when that stops working at scale, and how a research-backed framework gives you a better way to improve app engagement through push notifications. The examples come from a real e-commerce app with millions of users.&lt;/p&gt;

&lt;h2&gt;
  
  
  The numbers that started the conversation
&lt;/h2&gt;

&lt;p&gt;At MediaMarktSaturn, I worked on the consumer apps team, one of the larger Flutter teams in Germany. We maintained apps across 13 countries. The quarter before, we'd also shipped a push notification service.&lt;/p&gt;

&lt;p&gt;Here's what a bad campaign looked like:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Number&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Push notifications sent&lt;/td&gt;
&lt;td&gt;2,700,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opened&lt;/td&gt;
&lt;td&gt;~17,000 (0.6%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Converted (from opens)&lt;/td&gt;
&lt;td&gt;66 (0.36%)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's 66 purchases from 2.7 million interruptions. Not 66 thousand. Sixty-six.&lt;/p&gt;

&lt;p&gt;This wasn't even our worst campaign. But it made the question impossible to dodge: are we providing value to our users, or are we just making noise?&lt;/p&gt;

&lt;h2&gt;
  
  
  Why push notifications fail to drive app engagement
&lt;/h2&gt;

&lt;p&gt;We kept coming back to four problems:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Timing.&lt;/strong&gt; Send a notification while someone is in a meeting or on the train and it's just a disturbance. Every user has different windows of attention throughout the day.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Relevance.&lt;/strong&gt; A promotion for kitchen appliances is useful to someone furnishing a new apartment. To everyone else, it's clutter. Batch campaigns don't distinguish between the two.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Volume.&lt;/strong&gt; Send too many and users go numb. Or they revoke notification permissions entirely. Once that happens, the channel is dead.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Competition.&lt;/strong&gt; The average user has around 50 apps installed. Each one wants a slice of a limited attention budget. If your notification doesn't earn its place, it gets swiped away.&lt;/p&gt;

&lt;p&gt;All four come back to the same thing: &lt;strong&gt;receptivity&lt;/strong&gt;, the user's ability and willingness, right now, to receive and act on an interruption.&lt;/p&gt;

&lt;h2&gt;
  
  
  What receptivity actually means
&lt;/h2&gt;

&lt;p&gt;Receptivity isn't a toggle. It shifts throughout the day based on two kinds of signals:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Internal:&lt;/strong&gt; mood, cognitive load, what the user is focused on&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External:&lt;/strong&gt; location, time of day, whether they're driving, walking, or sitting still&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The idea is simple: if someone isn't in a position to care about your message, sending it does nothing at best and annoys them at worst. Push to a non-receptive user often enough and they'll opt out.&lt;/p&gt;

&lt;p&gt;Our bet: if we time notifications better and make them personal, users will see them as useful instead of annoying. Useful notifications get opened. Opened notifications convert.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is JITAI? A framework from health research
&lt;/h2&gt;

&lt;p&gt;Just-In-Time Adaptive Interventions (JITAI) comes from mobile health research, specifically addiction support and weight loss apps. Those systems monitor a user's internal and external state and only step in when two conditions are met:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The user is in a &lt;strong&gt;vulnerable state&lt;/strong&gt; (in a health app: at risk of relapsing; for us: likely to benefit from a product notification)&lt;/li&gt;
&lt;li&gt;The user is in a &lt;strong&gt;receptive state&lt;/strong&gt; (able and willing to process the message right now)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The framework has four parts:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fkhal.it%2Fblog%2Fjitai-framework-pipeline.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fkhal.it%2Fblog%2Fjitai-framework-pipeline.svg" alt="The JITAI framework pipeline: decision point, tailoring variables, intervention options, proximal outcomes, and distal outcome — with a decision maker orchestrating the flow" width="900" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Decision points
&lt;/h3&gt;

&lt;p&gt;The moment the system decides whether to show a notification. For server-sent push, this is when the server would normally fire. For on-device local notifications, you can set intervals and check every X minutes whether conditions are right.&lt;/p&gt;

&lt;p&gt;The important part: the decision isn't just "send." It's "send, delay, or don't send at all."&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Tailoring variables
&lt;/h3&gt;

&lt;p&gt;The data that feeds the decision. Two sources:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Passive&lt;/strong&gt; (no user action needed):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Device type (Android and iOS users often have different notification habits)&lt;/li&gt;
&lt;li&gt;Location (is the user near a physical store?)&lt;/li&gt;
&lt;li&gt;Time patterns (when does this user typically open the app?)&lt;/li&gt;
&lt;li&gt;Sensor data (accelerometer, ambient light, with consent)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Active&lt;/strong&gt; (user tells us directly):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Wishlist items&lt;/li&gt;
&lt;li&gt;Search history&lt;/li&gt;
&lt;li&gt;Category preferences&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The system has to work with whatever the user consents to share. Someone who only grants location access should still get a noticeably better experience than a batch campaign.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Intervention options
&lt;/h3&gt;

&lt;p&gt;What the system can actually do:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Information notification&lt;/td&gt;
&lt;td&gt;New arrivals, restocks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Promotional notification&lt;/td&gt;
&lt;td&gt;Sales, discounts&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Location-based notification&lt;/td&gt;
&lt;td&gt;Store-specific offers when nearby&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rich push&lt;/td&gt;
&lt;td&gt;Image-heavy, interactive notifications&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reschedule&lt;/td&gt;
&lt;td&gt;Delay the notification for a better time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No push&lt;/td&gt;
&lt;td&gt;Suppress the notification entirely&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last row matters the most. Sometimes the best notification is none.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. The decision maker
&lt;/h3&gt;

&lt;p&gt;This is the core. It takes the tailoring variables and intervention options and decides &lt;em&gt;which&lt;/em&gt; intervention to offer, to &lt;em&gt;whom&lt;/em&gt;, and &lt;em&gt;when&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Two approaches:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Algorithmic:&lt;/strong&gt; Decision trees and rules. Easier to build, test, and explain. Example: if the user is within 500m of a store AND has a wishlist item on sale at that store, send a location-based push.&lt;/p&gt;
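
&lt;p&gt;A minimal sketch of that rule in Python, to make the three-way outcome (send, delay, don't send) concrete. It is illustrative only: every name here (&lt;code&gt;UserContext&lt;/code&gt;, &lt;code&gt;decide&lt;/code&gt;, the &lt;code&gt;is_receptive&lt;/code&gt; flag) is a hypothetical stand-in, and only the 500m radius and the wishlist condition come from the example above.&lt;/p&gt;

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Action(Enum):
    SEND_LOCATION_PUSH = "send_location_push"
    DELAY = "delay"
    NO_PUSH = "no_push"


@dataclass(frozen=True)
class UserContext:
    # Hypothetical tailoring variables, not a real schema.
    distance_to_store_m: Optional[float]  # None when location is not shared
    wishlist_item_on_sale_at_store: bool
    is_receptive: bool  # assumed to be computed elsewhere, e.g. from time patterns


STORE_RADIUS_M = 500  # threshold taken from the rule in the text


def decide(ctx: UserContext) -> Action:
    """Return send, delay, or no push for a single decision point."""
    nearby = (
        ctx.distance_to_store_m is not None
        and STORE_RADIUS_M >= ctx.distance_to_store_m
    )
    if not (nearby and ctx.wishlist_item_on_sale_at_store):
        return Action.NO_PUSH  # nothing worth interrupting for
    if not ctx.is_receptive:
        return Action.DELAY  # relevant, but the wrong moment
    return Action.SEND_LOCATION_PUSH
```

&lt;p&gt;Because the rule is a pure function of its inputs, each branch can be unit-tested and explained to a product owner, which is the main argument for starting here.&lt;/p&gt;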

&lt;p&gt;&lt;strong&gt;Machine learning:&lt;/strong&gt; An on-device reinforcement learning model that learns from each user's interaction patterns. The model improves per-user over time without sharing data across devices. Harder to build and harder to test, but it can pick up patterns that rules miss.&lt;/p&gt;
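
&lt;p&gt;The text above doesn't pin down an algorithm, so the sketch below assumes one of the simplest reinforcement learning setups: an epsilon-greedy multi-armed bandit that keeps a running average reward per action. The reward definition (1 for an opened notification, 0 otherwise), the action names, and the class itself are assumptions for illustration. All state lives in one object, which is what would stay on the device.&lt;/p&gt;

```python
import random


class EpsilonGreedyBandit:
    """Per-user learner: mostly picks the best-known action, sometimes explores."""

    def __init__(self, actions, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.epsilon = epsilon  # probability of trying a random action
        self.counts = {a: 0 for a in self.actions}
        self.values = {a: 0.0 for a in self.actions}  # running mean reward
        self._rng = random.Random(seed)

    def choose(self):
        """Pick an action for the current decision point."""
        if self.epsilon > self._rng.random():
            return self._rng.choice(self.actions)  # explore
        return max(self.actions, key=lambda a: self.values[a])  # exploit

    def update(self, action, reward):
        """Fold the observed reward into the running mean for that action."""
        self.counts[action] += 1
        n = self.counts[action]
        self.values[action] += (reward - self.values[action]) / n


# Assumed reward: 1.0 if the user opened the notification, else 0.0.
bandit = EpsilonGreedyBandit(["send_now", "delay", "no_push"], epsilon=0.1, seed=42)
action = bandit.choose()
bandit.update(action, reward=1.0)
```

&lt;p&gt;A real system would also condition on context (time of day, location) rather than learning one global preference per user, which is where the testing effort grows.&lt;/p&gt;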

&lt;p&gt;In practice, you'd start with rules and add ML where the rules stop improving.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fkhal.it%2Fblog%2Fjitai-reinforcement-learning.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fkhal.it%2Fblog%2Fjitai-reinforcement-learning.svg" alt="Reinforcement learning decision maker: the on-device RL model receives decision points, chooses actions, and learns from user responses over time" width="640" height="300"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Push notification examples that improve app engagement
&lt;/h2&gt;

&lt;h3&gt;
  
  
  The store walk-by
&lt;/h3&gt;

&lt;p&gt;You're walking through Alexanderplatz. Your Saturn app knows your wishlist, and you've been eyeing a pair of headphones. In the background, the system checks: does the Saturn store at Alexanderplatz have those headphones in stock? Is there a promotion running?&lt;/p&gt;

&lt;p&gt;If yes, send a push. You already wanted the product. The store is 200 meters away. The promotion gives you a reason to walk in now.&lt;/p&gt;

&lt;p&gt;You find out about a deal on something you already want. The store gets a walk-in. One notification, no spam.&lt;/p&gt;

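&lt;p&gt;In JITAI terms, the distal outcome is the long-term goal, and the proximal outcome is the short-term effect the intervention is meant to have.&lt;br&gt;
&lt;strong&gt;Distal outcome:&lt;/strong&gt; purchase in store.&lt;br&gt;
&lt;strong&gt;Proximal outcome:&lt;/strong&gt; user becomes aware of the deal. Even if they don't tap the notification, they've absorbed the information.&lt;/p&gt;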
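
&lt;p&gt;The "is the store close enough?" part of this scenario boils down to a distance calculation. A sketch using the haversine formula, assuming the app already has the user's and the store's coordinates. The function names, the default 500m radius, and the stock and promotion flags are hypothetical; the coordinates in any call would come from the device and a store directory.&lt;/p&gt;

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def should_send_walk_by_push(user_pos, store_pos, in_stock, promo_running, radius_m=500):
    """True only if the user is nearby AND the wishlist item is in stock AND on promotion."""
    distance = haversine_m(*user_pos, *store_pos)
    return radius_m >= distance and in_stock and promo_running
```

&lt;p&gt;All three conditions have to hold. Dropping any one of them turns the walk-by push back into a generic promotion.&lt;/p&gt;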

&lt;h3&gt;
  
  
  The abandoned cart
&lt;/h3&gt;

&lt;p&gt;You're at home, browsing the MediaMarkt app. You add three items to your cart. Your partner walks in, you put the phone down, and the cart just sits there.&lt;/p&gt;

&lt;p&gt;The system picks up that the cart has been idle for a few hours. One of the items now has a 20% discount that expires soon. You probably forgot about the cart, and you definitely don't know about the price drop.&lt;/p&gt;

&lt;p&gt;Send a notification about the discount. You get a useful reminder. The retailer recovers a sale that would have quietly disappeared.&lt;/p&gt;
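
&lt;p&gt;Sketched as code, the cart check might look like this. The thresholds are assumptions: "a few hours" is set to three, and "expires soon" to 24 hours. &lt;code&gt;CartItem&lt;/code&gt; and &lt;code&gt;item_to_remind_about&lt;/code&gt; are hypothetical names, not part of any real service.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class CartItem:
    name: str
    discount_percent: int
    discount_ends_at: Optional[datetime]  # None when there is no discount


IDLE_THRESHOLD = timedelta(hours=3)   # assumed meaning of "a few hours"
EXPIRY_WINDOW = timedelta(hours=24)   # assumed meaning of "expires soon"


def item_to_remind_about(items, last_cart_activity, now):
    """Return the discounted item whose offer ends first, or None if no push is justified."""
    if IDLE_THRESHOLD > now - last_cart_activity:
        return None  # cart is still fresh, do not interrupt
    expiring = [
        item for item in items
        if item.discount_percent > 0
        and item.discount_ends_at is not None
        and item.discount_ends_at > now                  # offer still valid
        and now + EXPIRY_WINDOW >= item.discount_ends_at  # and ending soon
    ]
    if not expiring:
        return None  # an idle cart alone is not a good enough reason
    return min(expiring, key=lambda item: item.discount_ends_at)
```

&lt;p&gt;Note that an idle cart without a new discount returns &lt;code&gt;None&lt;/code&gt;. The reminder only fires when there is something the user doesn't already know.&lt;/p&gt;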

&lt;p&gt;Both examples follow the same logic: only interrupt someone when you're reasonably confident the notification is worth their attention.&lt;/p&gt;

&lt;h2&gt;
  
  
  The hard parts of this engagement strategy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Privacy and GDPR
&lt;/h3&gt;

&lt;p&gt;We're in Germany. &lt;a href="https://khal.it/blog/ich-bin-kein-dsgvo-experte" rel="noopener noreferrer"&gt;GDPR isn't a formality&lt;/a&gt;, it's the operating environment. The system touches sensitive data: location, purchase history, browsing behavior. The deal with users has to be clear: this data improves your notification experience, full stop. It's never sold or shared.&lt;/p&gt;

&lt;p&gt;If users don't trust the system, they won't grant the permissions it needs. And then there's nothing to work with.&lt;/p&gt;

&lt;h3&gt;
  
  
  Testing tailoring variables
&lt;/h3&gt;

&lt;p&gt;Which variables actually predict receptivity? Device type? Time of day? Distance to a store? Some combination? This needs serious A/B testing and careful measurement. The search space is large, and gut feelings about what matters tend to be wrong.&lt;/p&gt;

&lt;h3&gt;
  
  
  Incomplete data
&lt;/h3&gt;

&lt;p&gt;The system has to work when users only grant some permissions, or none at all. Someone who shares location, wishlist, and browsing history gives the decision maker plenty of signal. Someone who shares nothing gets the batch experience, same as before, no penalties.&lt;/p&gt;

&lt;p&gt;That gap in quality is inherent, and it's fine. But it needs to be measured honestly, not papered over in the dashboard.&lt;/p&gt;
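
&lt;p&gt;One way to keep that degradation explicit is to derive the delivery mode from the granted permissions instead of scattering checks through the code. A sketch, assuming three permissions and a made-up mapping from each permission to the signals it unlocks:&lt;/p&gt;

```python
from enum import Enum


class Permission(Enum):
    LOCATION = "location"
    WISHLIST = "wishlist"
    BROWSING_HISTORY = "browsing_history"


# Hypothetical mapping from a granted permission to the signals it unlocks.
SIGNALS_BY_PERMISSION = {
    Permission.LOCATION: ["distance_to_store"],
    Permission.WISHLIST: ["wishlist_items"],
    Permission.BROWSING_HISTORY: ["search_history", "category_preferences"],
}


def plan_for(granted):
    """Return the delivery mode and the signals the decision maker may use."""
    signals = sorted(s for p in granted for s in SIGNALS_BY_PERMISSION[p])
    mode = "tailored" if signals else "batch"  # no permissions: same batch experience as before
    return {"mode": mode, "signals": signals}
```

&lt;p&gt;Logging the returned mode alongside open and conversion events is what makes the quality gap between the groups measurable.&lt;/p&gt;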

&lt;h2&gt;
  
  
  Where we left off
&lt;/h2&gt;

&lt;p&gt;When I gave this talk at Flutter Berlin, the system was in the discovery phase. We had research, architecture, and a talk, but no production code yet. The academic literature on JITAI is solid, but applying it to e-commerce push at scale, across 13 countries and millions of users, raises problems the papers don't address.&lt;/p&gt;

&lt;p&gt;The thing I keep coming back to: the problem with push notifications isn't push notifications. It's sending the wrong message at the wrong time. And honestly, it's not having the discipline to send nothing when you don't have a good reason.&lt;/p&gt;

&lt;p&gt;Most mobile app engagement strategies still treat push as a broadcast channel. The bar for standing out is low: just stop yelling at people who didn't ask.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F3jchp6xjq5fy1g3xdsjk.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F3jchp6xjq5fy1g3xdsjk.webp" alt="Flutter Berlin OctoberFest group photo at MediaMarktSaturn" width="800" height="600"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thanks to &lt;a href="https://www.meetup.com/flutter-berlin/" rel="noopener noreferrer"&gt;Flutter Berlin&lt;/a&gt; and MediaMarktSaturn for hosting the event.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you're building a mobile app that needs smarter notifications — or rethinking the system you already have — &lt;a href="https://khal.it/leistungen/app-entwicklung" rel="noopener noreferrer"&gt;that's the kind of architecture work I do&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Push notification open rates are a symptom, not the disease.&lt;/strong&gt; The real problem is sending the wrong message at the wrong time to the wrong user.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The JITAI framework gives structure to personalization.&lt;/strong&gt; Decision points, tailoring variables, intervention options, and a decision maker — four components that turn "send smarter notifications" into an engineering problem.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Start with rules, add ML later.&lt;/strong&gt; Algorithmic decision trees are easier to build, test, and explain. Machine learning picks up patterns rules miss, but only after you understand what "better" looks like.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The best notification is sometimes no notification.&lt;/strong&gt; Suppressing a low-value push preserves the user's trust and keeps the channel alive for when it matters.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Related reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://khal.it/blog/website-in-app-einbetten-webview-komplexitaet" rel="noopener noreferrer"&gt;Embedding a Website in Your App — Why It Is More Complex Than You Think&lt;/a&gt; — the hidden complexity behind WebViews in mobile apps&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://khal.it/blog/webview-app-payment-flows-state-sync-teil-2" rel="noopener noreferrer"&gt;WebView App: Payment Flows, State Sync, and Platform Hacks (Part 2)&lt;/a&gt; — where WebView integration gets really messy&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://khal.it/blog/flutter-webview-tap-gestures-break-nestedscrollview-ios-fix" rel="noopener noreferrer"&gt;Fix: Flutter WebView Tap Gestures Stop Working After Scrolling in NestedScrollView (iOS)&lt;/a&gt; — a platform bug we had to solve in production&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://khal.it/blog/the-phoenix-pattern-in-flutter" rel="noopener noreferrer"&gt;The Phoenix Pattern in Flutter: How to Restart Your App Without Restarting Your App&lt;/a&gt; — a pattern for forced app restarts without killing the process&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://doi.org/10.1007/s12160-016-9830-8" rel="noopener noreferrer"&gt;Nahum-Shani et al. — "Just-in-Time Adaptive Interventions (JITAIs) in Mobile Health"&lt;/a&gt; — the foundational JITAI paper&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://doi.org/10.1093/her/cyy054" rel="noopener noreferrer"&gt;Hardeman et al. — "Developing and testing a digital intervention for health behavior change"&lt;/a&gt; — applying adaptive interventions in practice&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://firebase.google.com/docs/cloud-messaging" rel="noopener noreferrer"&gt;Firebase Cloud Messaging documentation&lt;/a&gt; — push notification infrastructure most Flutter apps use&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://gdpr.eu/cookies/" rel="noopener noreferrer"&gt;GDPR and push notifications&lt;/a&gt; — EU regulations on user data in notification targeting&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>flutter</category>
      <category>architecture</category>
      <category>mobile</category>
      <category>ux</category>
    </item>
    <item>
      <title>Building Apps with AI: What Actually Works (2026)</title>
      <dc:creator>Khalit Hartmann</dc:creator>
      <pubDate>Thu, 02 Jul 2026 16:19:57 +0000</pubDate>
      <link>https://dev.to/khalit_hartmann_17e573503/building-apps-with-ai-what-actually-works-2026-goh</link>
      <guid>https://dev.to/khalit_hartmann_17e573503/building-apps-with-ai-what-actually-works-2026-goh</guid>
      <description>&lt;p&gt;&lt;em&gt;For CTOs, founders, and developers who want to know how AI is actually changing app development. No buzzword bingo — experiences from real projects.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; "Build an app with AI" can mean two things: using AI tools to develop faster, or adding AI features to your app. I do both daily. The reality: AI coding tools like Claude Code speed up my work by 20–40%, but they don't replace architecture decisions or testing. No-code AI builders like Bolt or FlutterFlow produce decent prototypes, but not production-ready apps. And AI features in apps (chatbots, image recognition, recommendations) are now affordable — if you know which API to use for which problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  What "build an app with AI" actually means
&lt;/h2&gt;

&lt;p&gt;When clients mention "AI" in their initial inquiry, they usually mean one of two things.&lt;/p&gt;

&lt;p&gt;The first: using AI tools to make development itself faster. Claude Code implements entire features from a description, GitHub Copilot suggests inline completions, Cursor navigates the codebase. This affects me as a developer and changes how I work.&lt;/p&gt;

&lt;p&gt;The second: adding AI features to the app itself. A chatbot that answers customer questions. Image recognition that identifies products. Personalized recommendations based on user behavior. This affects the product and its features.&lt;/p&gt;

&lt;p&gt;Both meanings matter, but they're fundamentally different. I'll cover both.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI as a development tool: what works, what doesn't
&lt;/h2&gt;

&lt;p&gt;I've been using AI coding tools in my daily work since 2023. Currently it's exclusively Claude Code — as a coding partner that knows the entire codebase and can implement features end-to-end.&lt;/p&gt;

&lt;p&gt;What genuinely goes faster: boilerplate code, test cases, data models, regular expressions, and navigating unfamiliar codebases. When I need a new Flutter widget that follows an existing pattern in the app, Claude Code builds it in seconds instead of minutes. Unit tests for a method? Claude Code generates them in the context of the existing test suite and gets the approach right most of the time.&lt;/p&gt;

&lt;p&gt;On a recent project — an e-commerce app with a complex shopping cart — I estimate I spent 30% less time on implementation than I would have without AI tools. The gain isn't in single big moments but in hundreds of small time savings spread across the entire project.&lt;/p&gt;

&lt;p&gt;What doesn't work: architecture decisions. "Should we use Riverpod or Bloc for state management?" — AI will give you an answer, but whether it's right for your specific project requires someone who understands the context. (More on framework choices in my &lt;a href="https://khal.it/en/blog/flutter-vs-react-native-2026" rel="noopener noreferrer"&gt;Flutter vs React Native comparison&lt;/a&gt;.) Same goes for security decisions, performance optimization, and anything where context extends beyond a single file.&lt;/p&gt;

&lt;p&gt;One point that rarely gets mentioned: AI-generated code needs the same careful review as human-written code. I've accepted AI suggestions that compiled and passed tests but were logically wrong. That happens when you stop reading the output. AI tools make an experienced developer faster. They don't make an inexperienced developer experienced.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adding AI features to your app: the practical options
&lt;/h2&gt;

&lt;p&gt;The second meaning — putting AI into the product. A lot has changed here in the last two years.&lt;/p&gt;

&lt;h3&gt;
  
  
  Chatbots and text processing
&lt;/h3&gt;

&lt;p&gt;For chatbots and text processing, the large language models are the obvious choice. Claude API, OpenAI API, or Google Gemini as the backend. The integration is technically straightforward: API call to the provider, display the response. The challenge is in prompt design (what instructions does the model get?), cost control (API calls charge per token), and making sure the responses actually make sense for your context.&lt;/p&gt;
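
&lt;p&gt;For cost control, a back-of-the-envelope estimate goes a long way. The sketch below assumes the common pricing shape of separate input and output prices per million tokens. The function name and the numbers in the example call are placeholders, not any provider's actual rates.&lt;/p&gt;

```python
def estimate_monthly_cost(requests_per_month, input_tokens_per_request,
                          output_tokens_per_request,
                          input_price_per_million, output_price_per_million):
    """Estimate monthly API spend, in the currency the prices are quoted in."""
    # Price of one request, scaled by one million to keep the arithmetic simple.
    request_cost_x1m = (
        input_tokens_per_request * input_price_per_million
        + output_tokens_per_request * output_price_per_million
    )
    return requests_per_month * request_cost_x1m / 1_000_000


# Placeholder scenario: 10,000 chats a month, 500 tokens in and 200 tokens out each.
print(estimate_monthly_cost(10_000, 500, 200, 3.0, 15.0))
```

&lt;p&gt;Long system prompts count as input tokens on every call, so prompt length is usually the first lever to check when the bill grows.&lt;/p&gt;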

&lt;h3&gt;
  
  
  Image recognition and computer vision
&lt;/h3&gt;

&lt;p&gt;For image recognition and computer vision, there are on-device options: TensorFlow Lite, Core ML (iOS), and Google's ML Kit. The advantage: processing happens on the device, with no server costs and no network round trip. A concrete example: for a health app, I built text recognition using Google ML Kit that scans product codes from leaflet inserts. The user points their camera, and the app recognizes the code via on-device OCR and unlocks content.&lt;/p&gt;

&lt;h3&gt;
  
  
  Recommendation systems
&lt;/h3&gt;

&lt;p&gt;Recommendation systems ("products you might like") can now be built with services like Amazon Personalize or Google Recommendations AI without training your own ML model. Costs start low and scale with usage. In practice, I haven't needed to train a custom recommendation algorithm on any project so far; the cloud services cover most use cases.&lt;/p&gt;

&lt;h3&gt;
  
  
  Speech recognition
&lt;/h3&gt;

&lt;p&gt;Speech recognition runs on native platform APIs from Apple and Google — reliable and without server costs. Once you need intent detection or context across multiple sentences, you're back to the LLM APIs.&lt;/p&gt;

&lt;h3&gt;
  
  
  What each option costs
&lt;/h3&gt;

&lt;p&gt;Costs depend heavily on usage volume. For an MVP with a few hundred users, API costs run 10–50 €/month. With hundreds of thousands of users, they can quickly reach four figures a month. On-device ML has no ongoing costs, but the initial development is more involved.&lt;/p&gt;

&lt;h2&gt;
  
  
  No-code AI: building apps without programming
&lt;/h2&gt;

&lt;p&gt;Bolt, Lovable, FlutterFlow with its AI assistant: these tools promise to generate apps from text descriptions.&lt;/p&gt;

&lt;p&gt;I've tested several of them. My honest take:&lt;/p&gt;

&lt;p&gt;For prototypes and demos, they work surprisingly well. Bolt generates a functional web app from a prompt — with UI, navigation, and basic logic. FlutterFlow creates Flutter UIs from descriptions that actually look decent. For a pitch deck prototype or an internal demo, that saves days.&lt;/p&gt;

&lt;p&gt;For production-ready apps, they fall short. The codebase is hard to maintain, performance sits below what an experienced developer produces, and the moment you need something beyond standard patterns, you hit walls. Error handling, edge cases, accessibility, performance optimization — these are the things that separate a good app from a mediocre one, and no builder handles them well.&lt;/p&gt;

&lt;p&gt;My recommendation: use no-code AI for validation. Build a prototype, show it to potential users, collect feedback. If validation is positive, invest in a professional implementation. The prototype then provides valuable requirements for the developer. (More on this in my &lt;a href="https://khal.it/en/blog/app-programmieren-lernen" rel="noopener noreferrer"&gt;post for founders building their first app&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Want to know what an MVP with AI features actually costs? &lt;a href="https://khal.it/en/blog/was-kostet-eine-app" rel="noopener noreferrer"&gt;Here's my honest breakdown&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AI means for costs and timelines
&lt;/h2&gt;

&lt;p&gt;The honest answer: AI makes app development cheaper and faster, but not to the degree some people promise.&lt;/p&gt;

&lt;p&gt;My experience: AI tools save me 20–40% of development time, depending on the project type. A project that would have taken eight weeks now takes six to seven. With heavy custom UI and little boilerplate, the difference is smaller. With API-heavy backend integrations, it's larger.&lt;/p&gt;

&lt;p&gt;What doesn't change: the concept and design phase. Understanding what to build takes just as long as before. Same for testing and QA. AI accelerates implementation, not the entire product development process.&lt;/p&gt;

&lt;p&gt;For clients, that means concretely: an MVP that previously cost 20,000 € now comes in around 15,000–17,000 €. (More on this in my &lt;a href="https://khal.it/en/blog/was-kostet-eine-app" rel="noopener noreferrer"&gt;post about app costs&lt;/a&gt;.) Not a paradigm shift, but a noticeable improvement. The bigger change is that I can deliver more functionality in the same time — not that the same functionality costs dramatically less.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where AI hits its limits
&lt;/h2&gt;

&lt;p&gt;After two years of daily use, I'm clear on what AI can't do in app development — at least not yet.&lt;/p&gt;

&lt;p&gt;Business context is the biggest gap. AI can write code, but it can't judge whether the feature priority is right or whether the architecture will still scale in six months. Every decision that goes beyond a single function needs human judgment.&lt;/p&gt;

&lt;p&gt;Then there's the missing long-term memory. AI doesn't know why you decided against Redux last sprint, or which technical debt you consciously accepted. That context extends beyond a single session, and the tool doesn't have it.&lt;/p&gt;

&lt;p&gt;Hallucinations still happen. Less often than a year ago, but they do. Non-existent API methods, outdated syntax, packages that don't exist. If you can't verify the output, you won't notice. That's why AI is a tool for developers, not a replacement.&lt;/p&gt;

&lt;p&gt;And then the quality question. AI produces code that "works." But working code and good code aren't the same thing. Error handling, accessibility, performance under load, edge cases with poor network connectivity — these are the things that matter, and they don't appear in any AI-generated MVP.&lt;/p&gt;

&lt;p&gt;Still: AI is the best tool I've picked up in my career as a developer. Not because it replaces me, but because it handles the boring parts and leaves me more time for the interesting ones. If you're considering local AI alternatives to cloud APIs, here's &lt;a href="https://khal.it/blog/lokale-ki-fuer-entwickler" rel="noopener noreferrer"&gt;my experience report&lt;/a&gt;.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Planning an app with AI features, or want to know how AI can speed up your project? &lt;a href="https://calendly.com/development-khal/casual-coffee" rel="noopener noreferrer"&gt;Book a free intro call&lt;/a&gt; — I'll give you a realistic assessment of what's possible and what's worth it. More about my approach on the &lt;a href="https://khal.it/en/services/app-development" rel="noopener noreferrer"&gt;app development page&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mobile</category>
    </item>
    <item>
      <title>AI-Assisted Flutter Development: Claude Code in Production</title>
      <dc:creator>Khalit Hartmann</dc:creator>
      <pubDate>Thu, 02 Jul 2026 15:04:00 +0000</pubDate>
      <link>https://dev.to/khalit_hartmann_17e573503/ai-assisted-flutter-development-claude-code-in-production-1k1g</link>
      <guid>https://dev.to/khalit_hartmann_17e573503/ai-assisted-flutter-development-claude-code-in-production-1k1g</guid>
      <description>&lt;h1&gt;
  
  
  AI-Assisted Flutter Development: Claude Code in Production
&lt;/h1&gt;

&lt;p&gt;&lt;em&gt;For senior developers, tech leads, and CTOs wondering whether AI coding tools actually work in production -- or just generate impressive demos.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; I used Claude Code across 827 commits in a Flutter e-commerce app for a Swiss retailer. The difference between "vibe coding" and AI-assisted development is not the AI -- it is the infrastructure around it. A &lt;code&gt;CLAUDE.md&lt;/code&gt; file defines project context. Custom skills (slash commands) switch the AI between migration mode, legacy mode, and test mode. Architecture guardrails prevent the AI from "improving" code it should not touch. Result: ~30% faster migration timeline. Not because the AI wrote perfect code, but because it handled the mechanical parts while I made the decisions that required judgment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Vibe Coding vs. AI-Assisted Development
&lt;/h2&gt;

&lt;p&gt;These two terms get used interchangeably in 2026. They should not be.&lt;/p&gt;

&lt;p&gt;Vibe coding is prompting and committing. You describe what you want, the AI generates code, you paste it in. Maybe it works. Maybe it compiles. Maybe it follows the same pattern as the rest of your codebase. You don't know until something breaks, and by then three more features are built on top of it.&lt;/p&gt;

&lt;p&gt;AI-assisted development is structured collaboration. The AI knows your architecture. It knows which patterns to follow in which parts of the codebase. It knows when to use &lt;code&gt;TaskEither&lt;/code&gt; and when to use &lt;code&gt;try/catch&lt;/code&gt;. It knows because you told it -- explicitly, in configuration files that it reads before writing a single line.&lt;/p&gt;

&lt;p&gt;Both use the same underlying models. The difference is entirely in setup. One produces code that looks right in a PR diff. The other produces code that works in production six months later.&lt;/p&gt;

&lt;p&gt;I spent ten months on a project that started out vibe-coded and ended as something far more disciplined.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Project: 827 Commits, One Codebase
&lt;/h2&gt;

&lt;p&gt;The customer was a renowned Swiss retailer that had launched its mobile e-commerce app two months earlier. When I joined, the Flutter app had an 85% crash-free rate -- roughly one in seven sessions ended in a crash. API calls inside widgets, three different error handling styles, state management by tutorial roulette.&lt;/p&gt;

&lt;p&gt;Over 22 weeks, I migrated the app to Clean Architecture while shipping features. 827 commits. Crash-free rate past 97%. I have written about the &lt;a href="https://khal.it/blog/flutter-clean-architecture-migration" rel="noopener noreferrer"&gt;migration strategy itself&lt;/a&gt; separately. This post is about the AI tooling that made it possible to move that fast.&lt;/p&gt;

&lt;p&gt;My only AI coding tool was Claude Code -- an agentic coding CLI that reads your entire project and executes multi-step development tasks. Not Copilot, not Cursor. The distinction matters because this workflow depends on Claude Code features: &lt;code&gt;CLAUDE.md&lt;/code&gt; configuration files and custom skills.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Context Problem
&lt;/h2&gt;

&lt;p&gt;Here is what happens when you point an AI at a Flutter codebase without project context.&lt;/p&gt;

&lt;p&gt;You ask it to fetch product data. It writes a service class with &lt;code&gt;try/catch&lt;/code&gt;. You ask it to fetch user data. It writes a repository with &lt;code&gt;TaskEither&lt;/code&gt; from &lt;code&gt;fpdart&lt;/code&gt;. You ask it to handle cart state. It generates a &lt;code&gt;ChangeNotifier&lt;/code&gt;. You already use Riverpod.&lt;/p&gt;

&lt;p&gt;Three requests, three different patterns. Each defensible in isolation. Together, a maintenance nightmare.&lt;/p&gt;

&lt;p&gt;The AI has no way to know your project uses &lt;code&gt;TaskEither&lt;/code&gt; for error handling and &lt;code&gt;AsyncNotifier&lt;/code&gt; for state management. It draws from the sum of all Flutter code it has ever seen, and that sum includes every pattern and anti-pattern in existence.&lt;/p&gt;

&lt;p&gt;The fix is not a smarter model. The fix is project context.&lt;/p&gt;

&lt;h2&gt;
  
  
  CLAUDE.md: Your Project's AI Configuration
&lt;/h2&gt;

&lt;p&gt;Claude Code reads a &lt;code&gt;CLAUDE.md&lt;/code&gt; file from your project root at the start of every session. Architecture decisions, naming conventions, import rules -- everything the AI needs before it touches your code.&lt;/p&gt;

&lt;p&gt;Here is a simplified version from the retailer project:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Architecture&lt;/span&gt;

This project uses Clean Architecture with feature-first organization.

&lt;span class="gu"&gt;### Decision Matrix&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Touching existing unmigrated code? -&amp;gt; Follow legacy patterns
&lt;span class="p"&gt;-&lt;/span&gt; Writing a new feature? -&amp;gt; Use Clean Architecture
&lt;span class="p"&gt;-&lt;/span&gt; Migrating an existing feature? -&amp;gt; Follow migration checklist

&lt;span class="gu"&gt;### Import Rules&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; Domain layer: NO imports from data or presentation
&lt;span class="p"&gt;-&lt;/span&gt; Presentation layer: imports domain only
&lt;span class="p"&gt;-&lt;/span&gt; Data layer: implements domain interfaces

&lt;span class="gu"&gt;### Error Handling&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; New code: TaskEither&lt;span class="nt"&gt;&amp;lt;AppFailure&lt;/span&gt;&lt;span class="err"&gt;,&lt;/span&gt; &lt;span class="na"&gt;T&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt; from fpdart
&lt;span class="p"&gt;-&lt;/span&gt; Legacy code: existing try/catch (do NOT refactor)

&lt;span class="gu"&gt;### State Management&lt;/span&gt;
&lt;span class="p"&gt;
-&lt;/span&gt; New features: Riverpod with code generation
&lt;span class="p"&gt;-&lt;/span&gt; AsyncNotifier for async state
&lt;span class="p"&gt;-&lt;/span&gt; Do NOT use ChangeNotifier, StateNotifier, or Bloc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not documentation for humans. Humans have context from standups and PR reviews. &lt;code&gt;CLAUDE.md&lt;/code&gt; is documentation for the AI -- explicit, unambiguous, with decision trees instead of guidelines.&lt;/p&gt;

&lt;p&gt;The decision matrix is the most important part. Without it, the AI defaults to "write the best code possible." During a migration, "best code possible" is context-dependent. A bug fix in unmigrated code should follow legacy patterns. The same logic, written during a migration, should follow Clean Architecture. Same feature, two correct approaches, depending entirely on intent. No model training covers that distinction. It has to be configured per project.&lt;/p&gt;

&lt;h2&gt;
  
  
  Custom Skills: Different Tasks, Different AI Behavior
&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;CLAUDE.md&lt;/code&gt; sets the baseline context. But different development tasks need the AI to behave differently. That is where custom skills come in.&lt;/p&gt;

&lt;p&gt;Custom skills are slash commands in Claude Code. Each one loads a specific set of instructions that override or extend the baseline context. On the retailer project, I used five:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/migration&lt;/code&gt; enforces a strict sequence:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;When migrating a feature:
&lt;span class="p"&gt;
1.&lt;/span&gt; Create domain layer first (entities, repository interfaces, use cases)
&lt;span class="p"&gt;2.&lt;/span&gt; Create data layer (implementations, data sources, DTOs)
&lt;span class="p"&gt;3.&lt;/span&gt; Create presentation layer (providers, pages, widgets)
&lt;span class="p"&gt;4.&lt;/span&gt; Write tests for each layer
&lt;span class="p"&gt;5.&lt;/span&gt; Remove legacy code only after tests pass
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The AI does not jump ahead to the presentation layer because it is "easier." It starts with the domain, writes the interfaces, and builds outward. Every migrated feature has the same structure.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;/legacy-code&lt;/code&gt; is the opposite. "Follow existing patterns. Do not introduce Clean Architecture imports. Match the style of surrounding code." This was the hardest skill to get right. The AI's instinct is to improve. It sees a &lt;code&gt;try/catch&lt;/code&gt; and wants to refactor it into &lt;code&gt;TaskEither&lt;/code&gt;. That instinct is correct in migration mode and catastrophic in legacy mode. A "quick improvement" to a legacy service class can break five widgets that depend on its exact interface.&lt;/p&gt;
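
&lt;p&gt;As a rough sketch, the instructions behind &lt;code&gt;/legacy-code&lt;/code&gt; could look something like this. The first three rules are the ones quoted above; the last two, and the overall wording, are illustrative assumptions rather than the original skill file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;When working in unmigrated (legacy) code:

1. Follow existing patterns in the file you are editing
2. Do not introduce Clean Architecture imports
3. Match the style of surrounding code
4. (Assumed rule) Keep public method signatures and return types unchanged
5. (Assumed rule) If a fix seems to require a refactor, stop and ask first
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Rule 4 is the one that would protect the widgets depending on a legacy service's exact interface.&lt;/p&gt;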

&lt;p&gt;&lt;code&gt;/ui-component&lt;/code&gt; loads design system rules. &lt;code&gt;/riverpod&lt;/code&gt; enforces code generation with &lt;code&gt;@riverpod&lt;/code&gt; annotation and &lt;code&gt;AsyncNotifier&lt;/code&gt;. &lt;code&gt;/test-workflow&lt;/code&gt; sets testing conventions -- unit tests for use cases, widget tests for composed components, mock repositories via domain interfaces.&lt;/p&gt;

&lt;p&gt;Each skill changes the AI's behavior without changing the AI itself. The model is the same. The context is different. (For more on how AI agents and tool systems like these work, see &lt;a href="https://khal.it/blog/ki-agenten-mcp-tools-erklaert" rel="noopener noreferrer"&gt;AI Agents, MCP, and Tools Explained&lt;/a&gt;.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture Guardrails: The Biggest Win
&lt;/h2&gt;

&lt;p&gt;If I had to pick one concept from this entire setup that delivered the most value, it is architecture guardrails.&lt;/p&gt;

&lt;p&gt;During a migration, two valid architectural styles coexist for months. Every task requires a decision -- which style applies here? Humans handle this through judgment. AI does not have that judgment. Without explicit guardrails, it sees legacy code and "improves" it. That creates a third style -- half-migrated code that follows neither convention consistently. Worse than legacy code because it is unpredictable.&lt;/p&gt;

&lt;p&gt;The decision matrix in &lt;code&gt;CLAUDE.md&lt;/code&gt; solved this. Not a suggestion -- a rule. "Touching unmigrated code? Follow legacy patterns." No ambiguity.&lt;/p&gt;

&lt;p&gt;Here is what the same task looks like with and without guardrails.&lt;/p&gt;

&lt;p&gt;Without guardrails -- fixing a bug in a legacy service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="c1"&gt;// AI "improves" the legacy code while fixing the bug&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;TaskEither&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AppFailure&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;getProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;TaskEither&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;tryCatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'/products/&lt;/span&gt;&lt;span class="si"&gt;$id&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
      &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stack&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AppFailure&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;unexpected&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Problem: 12 widgets depend on Future&amp;lt;Product&amp;gt;, not TaskEither&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With guardrails -- same bug, same service:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;&lt;span class="c1"&gt;// AI fixes the bug using existing patterns&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ProductService&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;Future&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;getProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;async&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;_httpClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'/products/&lt;/span&gt;&lt;span class="si"&gt;$id&lt;/span&gt;&lt;span class="s"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Product&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;fromJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;is&lt;/span&gt; &lt;span class="n"&gt;TypeError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="n"&gt;ProductNotFoundException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;rethrow&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// Bug fixed, existing interface preserved&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second version is not "better code" in the abstract. It is the correct code for this context -- context-dependent correctness.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AI Is Good At (and Where It Falls Short)
&lt;/h2&gt;

&lt;p&gt;After 827 commits using Claude Code, I have a clear picture of where AI pair programming delivers value and where it does not.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where it excels
&lt;/h3&gt;

&lt;p&gt;Boilerplate is the obvious one. Clean Architecture is verbose by design -- entity classes, repository interfaces, use cases, data sources, DTOs, mappers, provider declarations. The structure is formulaic. Once the AI has seen two migrated features, it generates the scaffolding for the third with minimal correction.&lt;/p&gt;
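
&lt;p&gt;To give a sense of how formulaic this is, here is a minimal sketch of one slice: an entity plus its DTO and mapper. All names and fields are hypothetical, not taken from the retailer project, and it uses plain Dart without any packages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;// Hypothetical names throughout; a sketch, not the project's actual code.

// Domain entity: no knowledge of JSON or HTTP.
class Product {
  const Product({
    required this.id,
    required this.name,
    required this.priceInCents,
  });

  final String id;
  final String name;
  final int priceInCents;
}

// Data-layer DTO: mirrors the (assumed) API payload.
class ProductDto {
  const ProductDto({
    required this.id,
    required this.name,
    required this.price,
  });

  factory ProductDto.fromJson(Map&amp;lt;String, dynamic&amp;gt; json) =&amp;gt; ProductDto(
        id: json['id'] as String,
        name: json['name'] as String,
        price: (json['price'] as num).toDouble(),
      );

  final String id;
  final String name;
  final double price;

  // Mapper: the only place that translates API shape into domain shape.
  Product toEntity() =&amp;gt; Product(
        id: id,
        name: name,
        priceInCents: (price * 100).round(),
      );
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Every feature repeats this shape with different fields, which is why it is a good fit for generation and a poor use of human attention.&lt;/p&gt;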

&lt;p&gt;Test generation is close behind. "Write unit tests for this use case" with the project's testing conventions loaded produces usable tests 80% of the time. The remaining 20% need manual adjustment, usually around edge cases the AI cannot infer from the interface alone.&lt;/p&gt;
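
&lt;p&gt;A minimal sketch of what such a use-case test can look like, with the repository faked through its domain interface. All names are hypothetical. To stay self-contained it uses plain Dart with a nullable return instead of &lt;code&gt;TaskEither&lt;/code&gt; and a hand-written check instead of a test framework:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;// Hypothetical names throughout; plain Dart, no packages.

// Domain entity.
class Product {
  const Product(this.id, this.name);
  final String id;
  final String name;
}

// Domain interface: the use case depends on this, not on the data layer.
abstract class ProductRepository {
  Future&amp;lt;Product?&amp;gt; findById(String id);
}

// Use case under test.
class GetProductName {
  const GetProductName(this._repository);
  final ProductRepository _repository;

  Future&amp;lt;String&amp;gt; call(String id) async {
    final product = await _repository.findById(id);
    return product?.name ?? 'Unknown product';
  }
}

// Hand-written fake that implements the domain interface.
class FakeProductRepository implements ProductRepository {
  FakeProductRepository(this._products);
  final Map&amp;lt;String, Product&amp;gt; _products;

  @override
  Future&amp;lt;Product?&amp;gt; findById(String id) async =&amp;gt; _products[id];
}

// Test-style check; in a real project this would sit inside a test() block.
Future&amp;lt;void&amp;gt; checkGetProductName() async {
  final useCase = GetProductName(
    FakeProductRepository({'42': const Product('42', 'Coffee grinder')}),
  );
  final known = await useCase('42');
  final missing = await useCase('missing');
  if (known != 'Coffee grinder' || missing != 'Unknown product') {
    throw StateError('GetProductName returned an unexpected value');
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The missing-product case is the kind of edge case that tends to need a human: nothing in the interface says what the use case should return when the repository finds nothing.&lt;/p&gt;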

&lt;p&gt;Pattern consistency is something AI handles better than humans. A developer on their fifteenth provider declaration in a week starts taking shortcuts. The AI does not get tired. Every &lt;code&gt;AsyncNotifier&lt;/code&gt; follows the same structure. That consistency compounds over months.&lt;/p&gt;

&lt;h3&gt;
  
  
  Where it falls short
&lt;/h3&gt;

&lt;p&gt;Architectural decisions remain firmly human territory. "Should we introduce a caching layer here?" depends on traffic patterns, backend SLA, and six other factors the AI cannot observe. It will give you a confident answer. That answer may be wrong.&lt;/p&gt;

&lt;p&gt;Business context is invisible to the AI. It does not know that the German market has different legal requirements for price display than the Swiss market, or that certain API fields return stale data because the backend redesign is not finished. These are the things that cause real bugs.&lt;/p&gt;

&lt;p&gt;Subtle bugs are the dangerous category. The AI writes code that compiles, passes the tests it generated, and looks correct in review. But it might use the wrong comparison operator for a currency calculation or handle a timezone edge case incorrectly. AI-generated code needs the same scrutiny as human-written code. Arguably more, because its confidence makes it easier to rubber-stamp.&lt;/p&gt;

&lt;p&gt;The AI also lacks the judgment to not write code. Sometimes the correct response is "this duplicates existing functionality." The AI will always produce something.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Impact: ~30% Faster, Not Magic
&lt;/h2&gt;

&lt;p&gt;The migration took 22 weeks. Without AI-assisted development, my estimate is 30-32 weeks. That is roughly eight to ten weeks saved, or about 27-31% of the estimated timeline -- not the ten-times productivity claim you see in conference talks.&lt;/p&gt;

&lt;p&gt;The 30% comes from two sources. The mechanical parts of each migration step were faster -- scaffolding, boilerplate, initial test suites. And pattern consistency reduced the review cycle. Fewer "why does this feature handle errors differently?" conversations.&lt;/p&gt;

&lt;p&gt;What AI did not speed up: architectural planning, debugging production issues via Crashlytics traces, and the final QA pass on each step.&lt;/p&gt;

&lt;p&gt;The honest math: 70% of the speed gain came from boilerplate reduction. 30% came from consistency enforcement. Zero percent came from the AI making better architectural choices than I would have. I applied a similar Claude Code workflow in a &lt;a href="https://khal.it/blog/migrating-serverless-to-monolith-with-claude-code" rel="noopener noreferrer"&gt;serverless-to-monolith migration&lt;/a&gt; with comparable results.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If your team is planning a similar migration or evaluating how AI tooling fits into production workflows, &lt;a href="https://khal.it/leistungen/app-entwicklung" rel="noopener noreferrer"&gt;I have done this before&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Setting This Up for Your Project
&lt;/h2&gt;

&lt;p&gt;This works for any project, not just Flutter -- I use the same &lt;code&gt;CLAUDE.md&lt;/code&gt; and custom skills approach on web apps, backend services, and infrastructure projects. For a broader perspective on AI in app development, see &lt;a href="https://khal.it/blog/app-entwickeln-mit-ki" rel="noopener noreferrer"&gt;Building Apps with AI&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Start with &lt;code&gt;CLAUDE.md&lt;/code&gt;. Document your architecture in machine-readable terms. Not "we prefer clean code" -- the AI does not know what you mean by that. Instead: "Domain layer classes live in &lt;code&gt;lib/features/{name}/domain/&lt;/code&gt;. They must not import from &lt;code&gt;data/&lt;/code&gt; or &lt;code&gt;presentation/&lt;/code&gt;. Error handling uses &lt;code&gt;TaskEither&amp;lt;AppFailure, T&amp;gt;&lt;/code&gt;." Concrete paths. Concrete types. Concrete rules.&lt;/p&gt;
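
&lt;p&gt;A rule this concrete can also be checked mechanically. The following is a minimal sketch under stated assumptions: &lt;code&gt;findForbiddenImports&lt;/code&gt; is a hypothetical helper, not part of any tool, and the path layout is the example one from the rule above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight dart"&gt;&lt;code&gt;// Hypothetical helper: returns the import lines in a domain-layer file that
// reach into data/ or presentation/. Assumes the example path layout
// lib/features/{name}/domain/.
List&amp;lt;String&amp;gt; findForbiddenImports(String filePath, String source) {
  // Only files under lib/features/{name}/domain/ are subject to the rule.
  final isDomainFile =
      RegExp(r'lib/features/[^/]+/domain/').hasMatch(filePath);
  if (!isDomainFile) return const [];

  // Matches import lines whose path contains /data/ or /presentation/.
  final forbidden =
      RegExp(r'''^\s*import\s+['"][^'"]*/(data|presentation)/''');
  return source
      .split('\n')
      .where((line) =&amp;gt; forbidden.hasMatch(line))
      .map((line) =&amp;gt; line.trim())
      .toList();
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If a rule cannot be expressed as a check like this, that is often a sign it is still too vague for the AI as well.&lt;/p&gt;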

&lt;p&gt;Add a decision matrix. If your codebase has multiple styles (most do), codify which style applies when. Write it as a decision tree, not prose.&lt;/p&gt;

&lt;p&gt;Create your first custom skill for whatever task you do most often. Start with one. Add more as you identify repeated patterns in your AI interactions.&lt;/p&gt;

&lt;p&gt;Run the AI on low-risk tasks first. A utility function. A test for an existing module. Review the output carefully -- import paths, naming conventions, whether it followed the &lt;code&gt;CLAUDE.md&lt;/code&gt; rules. Update &lt;code&gt;CLAUDE.md&lt;/code&gt; when the instructions are ambiguous. Expand to feature work once the context is dialed in.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;CLAUDE.md&lt;/code&gt; is a living document. Mine changed over fifty times during the retailer project. Over ten months, it became a remarkably precise description of the project's architecture. A side effect: it is also the best onboarding document the project has.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Teams
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;CLAUDE.md&lt;/code&gt; and custom skills are project-level configuration, not personal preference. When the entire team uses them, a junior developer using &lt;code&gt;/migration&lt;/code&gt; gets the same structural scaffolding as a senior developer. The architecture decisions are encoded in the tooling, not locked inside one person's head.&lt;/p&gt;

&lt;p&gt;This shifts code review from "did you follow the pattern?" to "is the business logic correct?" And it forces the team to articulate rules that previously lived as tribal knowledge. "When you fix a bug in legacy code, follow legacy patterns" is a nuanced rule most teams never write down. The AI forces you to write it down, and the whole team benefits.&lt;/p&gt;

&lt;p&gt;There is a handover benefit too: the &lt;code&gt;CLAUDE.md&lt;/code&gt; and custom skills stay with the codebase. When a freelancer finishes an engagement, the team inherits a precise, machine-readable description of their own architecture. No knowledge walks out the door.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agentic Coding: Where This Is Heading
&lt;/h2&gt;

&lt;p&gt;The current setup -- AI as a careful assistant with explicit guardrails -- is an intermediate step. The next stage is agentic coding: the AI executes multi-step tasks autonomously, creates files, runs tests, fixes failures, and commits the result.&lt;/p&gt;

&lt;p&gt;For me, this is already partially reality in 2026. Claude Code can run a migration end-to-end -- create the file structure, generate the code, write tests, run them, and correct failures. I review the outcome instead of every intermediate step.&lt;/p&gt;

&lt;p&gt;But the same principle holds: autonomy without guardrails is dangerous. The more autonomy the AI gets, the more important the architecture rules in &lt;code&gt;CLAUDE.md&lt;/code&gt; become. Without them, autonomous coding is just vibe coding with extra steps.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;AI-assisted development in production is not about the AI being smart. It is about the developer being deliberate.&lt;/p&gt;

&lt;p&gt;In a production codebase, AI without project context generates code that looks right and behaves wrong. Without guardrails, it "improves" code that should not be touched. The infrastructure matters more than the model -- &lt;code&gt;CLAUDE.md&lt;/code&gt;, custom skills, explicit rules about which patterns apply where. That is what separates vibe coding from production-grade AI development. When this setup becomes your default rather than an experiment, AI is no longer assisting your development -- it is native to it.&lt;/p&gt;

&lt;p&gt;My results on the retailer project: 827 commits, crash-free rate from 85% to past 97%, ~30% faster migration timeline. Not because the AI was magic. Because the AI had context.&lt;/p&gt;

&lt;p&gt;The bar for AI-assisted development in 2026 is not "can the AI write code?" It can. The bar is: "can the AI write code that belongs in your codebase?" That takes work. The work is worth it.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I help teams set up AI-assisted development workflows for production codebases. If that is relevant for your project, &lt;a href="https://khal.it/leistungen/app-entwicklung" rel="noopener noreferrer"&gt;let's talk&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  What is the difference between vibe coding and AI-assisted development?
&lt;/h3&gt;

&lt;p&gt;Vibe coding means prompting an AI and committing the output with minimal review or structural guidance. AI-assisted development means providing the AI with explicit project context -- architecture rules, conventions, decision matrices -- so that its output is consistent with the existing codebase. Same models, different infrastructure.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is CLAUDE.md?
&lt;/h3&gt;

&lt;p&gt;A configuration file that Claude Code reads from your project root at the start of every session. It contains architecture descriptions, naming conventions, import rules, and decision trees that guide the AI's behavior -- project documentation written for the AI.&lt;/p&gt;

&lt;h3&gt;
  
  
  Can AI coding assistants handle architecture migrations?
&lt;/h3&gt;

&lt;p&gt;With guardrails, yes. Unguided, an AI will try to "improve" all code it touches, creating pattern chaos during a migration. Custom skills and decision matrices in &lt;code&gt;CLAUDE.md&lt;/code&gt; constrain the AI to follow the correct patterns for each context.&lt;/p&gt;

&lt;h3&gt;
  
  
  How much faster is AI-assisted Flutter development?
&lt;/h3&gt;

&lt;p&gt;Roughly 20-30% faster for implementation work. Boilerplate-heavy work sees the biggest gains. Architectural planning, debugging, and QA are not meaningfully faster.&lt;/p&gt;

&lt;h3&gt;
  
  
  Does this approach work with tools other than Claude Code?
&lt;/h3&gt;

&lt;p&gt;The concepts -- project context files, task-specific configurations, architecture guardrails -- apply broadly. The specific implementation (&lt;code&gt;CLAUDE.md&lt;/code&gt;, custom skills) is specific to Claude Code. Other tools have their own mechanisms. The principle is the same: give the AI explicit, machine-readable project context.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>flutter</category>
      <category>devtools</category>
      <category>claudecode</category>
    </item>
  </channel>
</rss>
