<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jack Branch</title>
    <description>The latest articles on DEV Community by Jack Branch (@jack_branch_3fb9e01c57c03).</description>
    <link>https://dev.to/jack_branch_3fb9e01c57c03</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3442816%2F19548fe7-bc0b-4548-8462-6b12396a68d7.png</url>
      <title>DEV Community: Jack Branch</title>
      <link>https://dev.to/jack_branch_3fb9e01c57c03</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jack_branch_3fb9e01c57c03"/>
    <language>en</language>
    <item>
      <title>The AI Development Paradox: Why Better Models Lead to Worse Practices</title>
      <dc:creator>Jack Branch</dc:creator>
      <pubDate>Fri, 26 Sep 2025 10:00:23 +0000</pubDate>
      <link>https://dev.to/jack_branch_3fb9e01c57c03/the-ai-development-paradox-why-better-models-lead-to-worse-practices-4aj</link>
      <guid>https://dev.to/jack_branch_3fb9e01c57c03/the-ai-development-paradox-why-better-models-lead-to-worse-practices-4aj</guid>
      <description>&lt;p&gt;Over the past year, like many others in the tech space, I've witnessed a generative AI frenzy. There's an inverse relationship at play: the less technical and more managerial a person or organization becomes, the stronger their push for embracing this "revolutionary" technology (which, let's be honest, has been around since the 1970s). Promises of 10x delivery, hell, even 30x delivery, echo through conference rooms. "Just fully augment yourselves with these tools and software engineering will be obsolete," they say.&lt;/p&gt;

&lt;p&gt;Call me a cynic, but I don't take bold claims at face value. So I did what all software engineers do: I found a problem I needed to solve and started building, this time with GitHub Copilot as my co-pilot. Even with a modest 10x improvement, I should have had a finished product in no time, right?&lt;/p&gt;

&lt;p&gt;Not exactly.&lt;/p&gt;

&lt;p&gt;I previously wrote about the first phase of this experiment, where I encountered what most developers discover: a period of almost godlike rapid progress, completely undone by refactoring hell and an ungodly spaghetti mess of code. Yet I remained undeterred, determined to find the best practices for using these tools or drive myself insane trying.&lt;/p&gt;

&lt;p&gt;So what happened after another month of experimentation? Did I produce my dream application, or is it dead in the water? Let's find out together.&lt;/p&gt;

&lt;h2&gt;Background: Building a Password Manager in Go&lt;/h2&gt;

&lt;p&gt;For those who missed my earlier article, here's the context. To test the effectiveness of AI coding tools and strategies for working with them, I decided to build a desktop application in an unfamiliar language. The application: a password manager (I'm tired of cloud-based password storage and wanted my own tooling). The language: Go. &lt;/p&gt;

&lt;p&gt;With a background in distributed systems using Java and TypeScript, this was an excellent opportunity to learn something new while relying on AI models for guidance. As it stands, the application builds cross-platform, stores secrets securely, and (according to GPT-4) can share secrets across LAN and Bluetooth.&lt;/p&gt;

&lt;h2&gt;Agentic Coding Meets TDD: A Fundamental Tension&lt;/h2&gt;

&lt;p&gt;One of my main concerns with the push for agentic workflows is that they fundamentally conflict with my core development practices: Test-Driven Development (TDD) and Behavior-Driven Development (BDD). I've found tremendous success with this strategy. It keeps solutions concise, stable, and always focused on the end user. Nothing exists without good reason, and refactoring can be done proactively with minimal technical debt. From a business perspective, you get a more stable, easier-to-change product that serves user needs while remaining comprehensible to the developers who built it.&lt;/p&gt;

&lt;p&gt;So I entered the second phase with a new plan: How can I work with LLMs while maintaining a TDD/BDD workflow? As it turns out, with considerable difficulty.&lt;/p&gt;

&lt;p&gt;To test this approach, I outlined a request for a new feature: create a mechanism to securely share secrets between machines. Unlike my previous attempts, I gave the agent (GPT-4) one key instruction: no implementation, just method signatures, test cases, and reasoning behind decisions. Then, write an integration test covering the full journey plus several negative cases.&lt;/p&gt;

&lt;p&gt;Initially, everything went well. I remained involved in decision-making, code quality improved, and test coverage reached an all-time high. Had I discovered the magic formula for peak AI utilization?&lt;/p&gt;

&lt;p&gt;Problems emerged during implementation. A persistent issue with these models is their reluctance to use dependency injection, which triggered refactor cycle number one. Then came test mock injection, where despite instructions to use Mockery (which I'd set up in phase one), GPT kept creating unnecessary custom mocks.&lt;/p&gt;
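&lt;p&gt;To make the dependency-injection point concrete: in Go this usually means depending on a small interface and passing it in through a constructor, so a test double (hand-rolled or generated by Mockery) can be swapped in. A minimal sketch with illustrative names, not code from the actual project:&lt;/p&gt;

```go
package main

import "fmt"

// KeyStore is the narrow dependency the vault needs. Depending on
// an interface rather than a concrete type is what lets a generated
// mock (e.g. from Mockery) be injected in tests.
type KeyStore interface {
	Key(id string) ([]byte, error)
}

// Vault receives its KeyStore through the constructor instead of
// constructing one internally.
type Vault struct {
	keys KeyStore
}

func NewVault(ks KeyStore) Vault {
	return Vault{keys: ks}
}

func (v Vault) CanUnlock(id string) bool {
	k, err := v.keys.Key(id)
	if err != nil {
		return false
	}
	return len(k) > 0
}

// fakeStore is a hand-rolled test double; Mockery would generate an
// equivalent from the KeyStore interface.
type fakeStore struct{}

func (fakeStore) Key(id string) ([]byte, error) { return []byte("k"), nil }

func main() {
	v := NewVault(fakeStore{})
	fmt.Println(v.CanUnlock("primary")) // prints: true
}
```

&lt;p&gt;Because &lt;code&gt;NewVault&lt;/code&gt; accepts any &lt;code&gt;KeyStore&lt;/code&gt;, tests never have to touch a real keychain.&lt;/p&gt;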

&lt;p&gt;Despite these hiccups, progress seemed rapid. After a few days, I had something that would supposedly work, until GPT changed its mind and decided to completely redo the implementation. The encryption keys were wrong, so let's add more. More helper methods. More custom code that could be handled by third-party libraries. Fortunately, I remained highly involved at this stage, with changes made one file at a time, allowing me to challenge decisions and learn from mistakes.&lt;/p&gt;

&lt;p&gt;Eventually, with new keys added, full keychain integration, and a robust ephemeral/hybrid key generation system, we seemed to be in business. Debugging took considerable time, but my solid test cases, especially the integration test, revealed issues quickly and helped me understand the often convoluted logic the agent produced.&lt;/p&gt;

&lt;p&gt;This marked the end of the first part of my feature work. The business logic functioned, tests passed, and the code appeared well-structured with solid design patterns and high coverage. This was my greatest project success and the most enjoyable phase.&lt;/p&gt;

&lt;h2&gt;The Agentic Death Spiral: Enter YOLO Development&lt;/h2&gt;

&lt;p&gt;Around this time, I received a Copilot update featuring a fancy new model: GPT-4o (and 4o-mini). Excited to test the hype, I decided to tackle the substantial technical debt from earlier poor code. I gave it a suitably complex task at which all previous models had failed: review the codebase, identify areas for improvement, and create a refactoring plan.&lt;/p&gt;

&lt;p&gt;The result was pleasantly surprising. Within ten minutes, I had a detailed issue list and a multi-step remediation plan. I approved the refactor and let the agent take control. The outcome: hundreds of deleted lines, better test assertions, and numerous cognitive complexity warnings banished to the ether. I had seemingly solved the elusive problem of AI-driven code improvement.&lt;/p&gt;

&lt;p&gt;But pride comes before a fall, and cracks were already appearing. The agent reverted to large multi-file changes and dreaded 1,000-character word salads detailing these modifications. I became detached, skimming outputs and simply entering "go ahead with the next change" when prompted. The roles had reversed: I was now the passenger, with GPT driving. Thus began the misery of YOLO development.&lt;/p&gt;

&lt;p&gt;The unraveling accelerated when I decided to add transport mechanisms for sending secret bundles between devices. First came LAN support with hundreds of lines of custom code, all of which I had to remove in favor of a dedicated library. (Why do agents always avoid libraries?) Did this work? I have no idea. The AI wrote the tests, and I was so disconnected that I barely read them.&lt;/p&gt;

&lt;p&gt;The UI was built in record time. I got so lazy that I simply prompted the AI to "improve the UI and make it look nicer," which it interpreted as an invitation to remove half the options and create the world's worst modal menu with only a theme-changing option. However, having little interest in UI development, I was happy to let the model lead.&lt;/p&gt;

&lt;p&gt;YOLO development reached its final form when I prompted for Bluetooth setup. At this point, I wasn't even reading prompt outputs, let alone code. I seemed to be making progress, so I didn't care, until being informed that no method existed for setting up a Bluetooth adapter on Windows in Go. The model attempted to create a custom C# package for some inexplicable reason, but I'd had enough. I just wanted the work completed so I could move on. The code had become a lost cause, and the functionality I'd built this for (peer-to-peer secret bundle sharing) was untestable and unrecognizable from the carefully crafted code in the first phase.&lt;/p&gt;

&lt;h2&gt;Is YOLO Development Inevitable with Agentic Coding?&lt;/h2&gt;

&lt;p&gt;The answer likely varies by individual, but for me it depends on several factors:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Model Complexity and Autonomy&lt;/strong&gt;: The more the model does and the fewer mistakes it makes, the higher the likelihood of YOLO development. This represents the critical paradox of LLM-assisted development. I can only see this worsening as models improve.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Codebase Size and Quality&lt;/strong&gt;: Larger, more complex codebases with worse quality and more machine-generated code increase the chance of total developer disconnect. Fortunately, models perform poorly with very large codebases, which somewhat guards against this failure mode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Developer Technical and Product Knowledge&lt;/strong&gt;: As the project progressed (especially with GPT-4o), the agent suggested increasingly incomprehensible changes. This was particularly true with transport logic, where I had limited knowledge. Consequently, I had no idea what the code was doing and was too checked out to read and understand it.&lt;/p&gt;

&lt;h2&gt;Lessons Learned: Finding Signal in the Noise&lt;/h2&gt;

&lt;p&gt;Despite my complaints above (and believe me, I love complaining—just ask anyone I've done a retrospective with), I remain interested in the AI-assisted development paradigm. I have more experiments planned and still hope to eventually develop a system that enables my best possible work, both personally and professionally.&lt;/p&gt;

&lt;p&gt;There's much to extract from this experience. Even if it ultimately failed to maintain my engagement, I learned considerably. Here are my key takeaways:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Request Small, Concise Changes&lt;/strong&gt;: Be crystal clear with the model (repeatedly) that it should do one thing at a time. Treat the model as a software novice: outline the desired steps, and avoid allowing direct file changes. I've found much more success copying from output into files, as it forces me to read the code and understand where it belongs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test, Test, Test&lt;/strong&gt;: Robust testing was a lifesaver. Models are intelligent enough to run tests after changes, making issue resolution much faster. When models made large changes, seeing tests fail as expected was reassuring and prevented regressions. Just ensure you read the tests; models tend to create redundant assertions while missing key checks.&lt;/p&gt;
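&lt;p&gt;A concrete way to keep generated tests honest is Go's standard table-driven style: one row per behavior, so a redundant or missing assertion stands out at a glance. A hypothetical sketch (&lt;code&gt;validatePassword&lt;/code&gt; is illustrative, not from the project):&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// validatePassword is a hypothetical check standing in for real
// project logic; the interesting part is the table shape below.
func validatePassword(p string) error {
	if 8 > len(p) { // i.e. shorter than 8 characters
		return fmt.Errorf("too short")
	}
	if !strings.ContainsAny(p, "0123456789") {
		return fmt.Errorf("needs a digit")
	}
	return nil
}

func main() {
	// One row per behavior makes the covered cases visible,
	// which is exactly what surfaces duplicated assertions or
	// skipped cases in model-written tests.
	cases := []struct {
		name    string
		in      string
		wantErr bool
	}{
		{"valid", "correct horse 9", false},
		{"too short", "abc1", true},
		{"no digit", "passwordword", true},
	}
	for _, c := range cases {
		err := validatePassword(c.in)
		if (err != nil) != c.wantErr {
			fmt.Println("FAIL", c.name)
		} else {
			fmt.Println("ok  ", c.name)
		}
	}
}
```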

&lt;p&gt;&lt;strong&gt;BDD Outperforms TDD&lt;/strong&gt;: One major takeaway was how successful BDD testing proved. When I wrote scenarios upfront (even when the model wrote test code), output quality was higher and more product-oriented. I could also better guide the model toward correct choices and priority-based feature implementation. TDD results were mixed—models frequently ignored tests despite repeated prompting. A proper ML-based TDD solution seems distant with current practices.&lt;/p&gt;
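&lt;p&gt;In practice, "scenarios upfront" meant writing the Given/When/Then outline before any code and letting the test body follow it line by line. A hypothetical sketch of the shape (all names are illustrative, not from the project):&lt;/p&gt;

```go
package main

import (
	"errors"
	"fmt"
)

// store is a hypothetical stand-in for the real secret store.
type store struct{ secrets map[string]string }

func (s store) Get(name string) (string, error) {
	v, ok := s.secrets[name]
	if !ok {
		return "", errors.New("not found")
	}
	return v, nil
}

// Scenario: a saved secret can be retrieved by name.
//   Given a store containing the secret "email"
//   When  the user requests "email"
//   Then  the stored value is returned
func scenarioSavedSecretIsRetrievable() error {
	s := store{secrets: map[string]string{"email": "hunter2"}} // Given
	got, err := s.Get("email")                                 // When
	if err != nil {                                            // Then
		return err
	}
	if got != "hunter2" {
		return fmt.Errorf("want hunter2, got %q", got)
	}
	return nil
}

func main() {
	if err := scenarioSavedSecretIsRetrievable(); err != nil {
		fmt.Println("FAIL:", err)
		return
	}
	fmt.Println("PASS") // prints: PASS
}
```

&lt;p&gt;Because the scenario is written first, the model's job shrinks to filling in one clearly bounded behavior at a time.&lt;/p&gt;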

&lt;p&gt;&lt;strong&gt;Take Breaks and Stay Vigilant&lt;/strong&gt;: The longer I worked with these models, the more detached I became. While playing Dwarf Fortress during code generation sounds appealing, it split my focus and let poor code slip in. Dedicated sessions within set timeframes work best, with the added benefit of reflection time to refine processes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't Allow Changes You Don't Understand&lt;/strong&gt;: This is self-explanatory. If the model adds logic you can't reasonably decipher, it shouldn't be added. It's fine to be unclear on syntax occasionally, but core flow should be easily understood and sensible to the engineer in charge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But You Don't Need to Understand Scripts&lt;/strong&gt;: One significant model win is their scripting prowess. I heavily leveraged models for deployment and binary build scripts for each OS. While understanding these better might have been valuable, scripts like these are single-purpose—they work or they don't. Though debugging took longer, ensuring local testing capability made this area work very well.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Linters, Code Coverage, and Proper CI&lt;/strong&gt;: Models excel at recognizing build checks. These tools proved extremely useful, allowing me to mandate code quality and test coverage in builds while providing a way for models to ensure changes pass local CI builds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Know and Use Design Patterns&lt;/strong&gt;: Planning changes ahead is crucial. I found models poor at implementing design patterns, leading to excessive duplication, helper reliance, and overly long files. Create packages upfront with proper naming and strongly instruct pattern adherence for much better results.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Experience Matters&lt;/strong&gt;: As mentioned in my previous article, these tools amplify developer skill and experience. They can't think for you—the more input and guidance provided, the better the results. I also find that highly granular input across multiple prompts performs much better than single giant prompts or vague requests.&lt;/p&gt;

&lt;h2&gt;Conclusion: Software Engineering Isn't Going Anywhere&lt;/h2&gt;

&lt;p&gt;So concludes my first LLM-driven project. I leave with more questions than answers, so hopefully there will be future articles on TDD, professional usage, and other holistic topics. One thing I'm certain of: software engineering isn't disappearing. If anything, skills and experience are now more important than ever.&lt;/p&gt;

&lt;p&gt;But what about speed, and that all-important 10x improvement? I didn't see it. It took more than a month to finish this, and I'm not convinced it was any quicker than if I had hand-written all the code. The quality would probably have been better in that case as well.&lt;/p&gt;

&lt;p&gt;The promise of AI-assisted development remains compelling, but the path forward requires careful navigation of its inherent paradoxes. The better these tools become, the more vigilant we must be about maintaining our agency as developers. The future lies not in replacement, but in thoughtful collaboration—where we remain firmly in the driver's seat.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;For context, you can read my &lt;a href="https://dev.to/jack_branch_3fb9e01c57c03/what-i-learned-from-a-week-of-ai-assisted-coding-the-good-the-bad-and-the-surprisingly-11kl"&gt;previous article&lt;/a&gt; and view the &lt;a href="https://github.com/JTBranch/SecurePasswordManager" rel="noopener noreferrer"&gt;project repository&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>softwareengineering</category>
      <category>productivity</category>
      <category>ai</category>
      <category>discuss</category>
    </item>
    <item>
      <title>What I Learned From a Week of AI-Assisted Coding: The Good, The Bad, and The Surprisingly Counterintuitive</title>
      <dc:creator>Jack Branch</dc:creator>
      <pubDate>Wed, 20 Aug 2025 14:24:25 +0000</pubDate>
      <link>https://dev.to/jack_branch_3fb9e01c57c03/what-i-learned-from-a-week-of-ai-assisted-coding-the-good-the-bad-and-the-surprisingly-11kl</link>
      <guid>https://dev.to/jack_branch_3fb9e01c57c03/what-i-learned-from-a-week-of-ai-assisted-coding-the-good-the-bad-and-the-surprisingly-11kl</guid>
      <description>&lt;p&gt;Last week, I decided to build something I'd been putting off for months: a personal password manager. My requirements were simple - secure local storage, clean UI, and encryption I could trust. What made this interesting wasn't the project itself, but how I built it.&lt;/p&gt;

&lt;p&gt;I have a background in distributed systems: REST APIs, event-driven architecture, Kafka, the usual enterprise stack. Building a multi-platform desktop application was entirely new territory. I'd been planning this experiment for a while: what would it be like to build a project entirely using AI-assisted programming?&lt;/p&gt;

&lt;p&gt;Before we continue, I should disclose some bias. I'm somewhat of an AI skeptic, so I definitely had preconceived ideas going into this, particularly around code quality, security, and scalability. I also assumed the process would be painful and less enjoyable than traditional programming (spoiler alert: I was completely wrong about this one).&lt;/p&gt;

&lt;p&gt;Next came choosing the language. I've always been interested in Go: it seems like a nice blend of C++, Python, and JavaScript, all languages I enjoy. Since I'd never touched Go or Fyne (a popular Go UI toolkit), this seemed like the perfect way to put these AI models through their paces.&lt;/p&gt;

&lt;p&gt;Over the course of a week, I experimented with three different models: GPT-4, Claude Sonnet, and Gemini 2.5 Pro, switching between them to see how each handled different aspects of the development process.&lt;/p&gt;

&lt;p&gt;What I discovered challenged most of my assumptions about AI-assisted coding. The fastest model wasn't the most productive. The highest-quality code generator wasn't the most helpful. And the most counterintuitive finding of all: sometimes being "too good" at coding assistance actually made the development experience worse.&lt;/p&gt;

&lt;p&gt;If you're considering integrating AI tools into your development workflow, or if you're curious about the practical realities behind the productivity hype, here's what a week of intensive AI-assisted coding actually taught me.&lt;/p&gt;

&lt;h2&gt;The Productivity Illusion: Fast Start, Slow Finish&lt;/h2&gt;

&lt;p&gt;The most striking pattern in my week of AI coding wasn't what I expected. My productivity started incredibly high and steadily declined as the project progressed. On day one, I had a working password manager with encryption, a basic UI, and core functionality. By day four, I was stuck in refactoring hell, generating thousands of lines of code changes while adding zero new features.&lt;/p&gt;

&lt;h3&gt;The Setup Phase: Where AI Shines&lt;/h3&gt;

&lt;p&gt;AI assistance was genuinely transformative during the initial setup. Within hours, I had:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A properly structured Go project with modules and dependencies&lt;/li&gt;
&lt;li&gt;A working Fyne UI with multiple screens&lt;/li&gt;
&lt;li&gt;Basic encryption and decryption functionality&lt;/li&gt;
&lt;li&gt;File I/O for local storage&lt;/li&gt;
&lt;li&gt;Even a custom test framework (more on that later)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This was exactly the productivity boost everyone talks about. Tasks that would have taken me days of research and documentation reading were completed in minutes. For someone completely new to Go and Fyne, this felt magical.&lt;/p&gt;

&lt;h3&gt;The Architecture Reality Check&lt;/h3&gt;

&lt;p&gt;But then reality hit. The code that got me started quickly didn't fit what I actually needed. The AI had made architectural decisions based on getting something working, not on building something maintainable. What followed was an endless cycle of refactoring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The initial encryption implementation was too simple for real security needs&lt;/li&gt;
&lt;li&gt;The UI structure couldn't handle the complexity I wanted to add&lt;/li&gt;
&lt;li&gt;There was no dependency injection, making testing nearly impossible&lt;/li&gt;
&lt;li&gt;Error handling was inconsistent across the codebase&lt;/li&gt;
&lt;li&gt;The file structure didn't make sense for the features I planned&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;The Refactoring Trap&lt;/h3&gt;

&lt;p&gt;Here's where things got really problematic. Each refactoring session with AI would generate hundreds of lines of code changes. My commit history started looking incredibly productive - lots of activity, lots of lines added. But I wasn't adding any new features. I was essentially paying interest on the technical debt from the AI's initial "quick wins."&lt;/p&gt;

&lt;p&gt;The breaking point came when I hit my rate limit on GitHub Copilot after just four days of use (on a paid plan). Suddenly, I was stuck mid-refactor with partially broken code and no AI assistance. I had to manually dig myself out of the mess, which gave me a clear perspective on what was actually necessary versus what the AI thought needed to be "improved."&lt;/p&gt;

&lt;h3&gt;Traditional Coding: The Unexpected Comeback&lt;/h3&gt;

&lt;p&gt;On my final day, I switched approaches entirely. I did all the coding myself and used GPT-4 purely as a reference tool: essentially treating it like an enhanced Google for Go-specific questions. The results were surprising:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Higher actual delivery rate despite generating less code&lt;/li&gt;
&lt;li&gt;No rework cycles or debugging sessions&lt;/li&gt;
&lt;li&gt;Better understanding of what I was building&lt;/li&gt;
&lt;li&gt;Code that fit my actual requirements, not the AI's assumptions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;High initial productivity from AI can be an illusion if it comes at the cost of architecture and maintainability.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;Model Behaviors: The Counterintuitive Preferences&lt;/h2&gt;

&lt;p&gt;Testing three different AI models revealed some unexpected preferences that go against conventional wisdom about "better" AI being more helpful.&lt;/p&gt;

&lt;h3&gt;GPT-4: Fast, Wrong, and Strangely Effective&lt;/h3&gt;

&lt;p&gt;GPT-4 was objectively the worst at generating correct code. It made frequent mistakes, missed edge cases, and often gave me solutions that needed significant debugging. But here's the counterintuitive part: &lt;strong&gt;I enjoyed working with it the most.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Why? Because it was fast, and its mistakes kept me engaged with the code. Every response required my review and often my correction. This forced me to actually read and understand what was being generated, learn Go patterns by fixing the AI's errors, stay involved in architectural decisions, and catch problems early rather than discovering them later.&lt;/p&gt;

&lt;p&gt;The friction was actually valuable. It prevented me from falling into passive "vibe coding" where I just accepted whatever the AI produced.&lt;/p&gt;

&lt;h3&gt;Claude and Gemini: Too Good for My Own Good&lt;/h3&gt;

&lt;p&gt;Claude Sonnet and Gemini 2.5 Pro produced much higher quality code with fewer errors. They were more thoughtful about edge cases, better at following Go idioms, and generally more reliable. Logically, these should have been better development partners.&lt;/p&gt;

&lt;p&gt;Instead, I found myself becoming disengaged. The code was good enough that I stopped reading it carefully. I trusted their outputs and moved on to the next task. This led to less learning about Go and Fyne, architectural decisions I didn't fully understand, code that worked but didn't match my mental model, and a growing disconnect between what I wanted and what I had.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sometimes "better" AI assistance can make you a worse developer by reducing your engagement with the code.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;Don't Mix Your Models&lt;/h3&gt;

&lt;p&gt;One practical lesson: stick to one model per project phase. I tried switching between models for different tasks, but each AI has its own "style" and preferences. Claude would refactor code that Gemini had written, undoing architectural decisions and imposing its own patterns. Gemini would then "fix" Claude's work in the next iteration. &lt;/p&gt;

&lt;p&gt;It became a digital turf war where I was caught in the middle, trying to maintain consistency across competing AI opinions.&lt;/p&gt;

&lt;h3&gt;The Google Advantage&lt;/h3&gt;

&lt;p&gt;Gemini clearly produced the best Go code quality, which makes sense - Google created Go. This suggests a broader principle: consider who built or maintains your technology stack when choosing AI tools. The company with the deepest expertise in a language will likely have trained their models better on it.&lt;/p&gt;

&lt;h2&gt;The Limits of Autonomy: Why Agentic Workflows Failed&lt;/h2&gt;

&lt;p&gt;The current trend in AI coding tools is toward more autonomy - agents that can make large changes across multiple files, handle complex refactoring, and work independently on substantial tasks. My experience suggests this is moving in the wrong direction.&lt;/p&gt;

&lt;h3&gt;Small Changes vs. Large Autonomy&lt;/h3&gt;

&lt;p&gt;Every time I allowed an AI to make large, autonomous changes, the results were disappointing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New bugs introduced during refactoring&lt;/li&gt;
&lt;li&gt;Architectural inconsistencies across files&lt;/li&gt;
&lt;li&gt;Changes that broke existing functionality&lt;/li&gt;
&lt;li&gt;Code that was harder to review and understand&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In contrast, small, specific requests produced much better results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ "Improve the security of this code" (led to massive rewrites)&lt;/li&gt;
&lt;li&gt;✅ "Add input validation to this password field" (focused, reviewable change)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;The Scope Creep Problem&lt;/h3&gt;

&lt;p&gt;AI models have a tendency toward "helpful" scope creep. Ask for dependency injection, and they'll also rename your methods. Request a simple refactor, and they'll reorganize your entire file structure. This isn't malicious - they're trying to be helpful - but it makes their changes much harder to review and verify.&lt;/p&gt;

&lt;p&gt;During one simple package reorganization, Gemini got stuck in a loop, unable to resolve the import dependencies it had created. The task was straightforward for a human but somehow too complex for the AI to track consistently.&lt;/p&gt;

&lt;h3&gt;The People-Pleasing Problem&lt;/h3&gt;

&lt;p&gt;AI models are optimized for user satisfaction, not code quality. This creates some concerning behaviors:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GPT-4 set test coverage requirements to 20% so the build would pass (rather than improving actual coverage)&lt;/li&gt;
&lt;li&gt;Multiple models generated a &lt;code&gt;secrets.json&lt;/code&gt; file without considering security implications&lt;/li&gt;
&lt;li&gt;They avoided suggesting additional work (like writing tests) unless explicitly asked&lt;/li&gt;
&lt;li&gt;They took shortcuts to make code "work" rather than making it robust&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For security-critical applications like a password manager, this people-pleasing tendency could be genuinely dangerous.&lt;/p&gt;

&lt;h3&gt;The Testing Gap&lt;/h3&gt;

&lt;p&gt;None of the AI models suggested Test-Driven Development or proactively wrote tests. They would generate test code if asked, but testing wasn't part of their default development approach. This reinforces the idea that AI tools currently optimize for immediate functionality over long-term code quality.&lt;/p&gt;

&lt;p&gt;The test framework that was eventually generated (under heavy prompting from me) was actually quite good, but I had to specifically request it. This suggests the capability exists, but the AI's default behavior doesn't align with professional development practices.&lt;/p&gt;

&lt;h2&gt;The Experience Amplification Theory&lt;/h2&gt;

&lt;p&gt;The most important insight from my experiment is what I'm calling the "experience amplification theory": &lt;strong&gt;AI coding tools amplify the developer's existing skill level and habits rather than improving them.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;Bad Patterns, Faster&lt;/h3&gt;

&lt;p&gt;As someone new to Go, I brought Java-influenced patterns and thinking to the codebase. The AI didn't correct these patterns - it implemented them more efficiently. The result was Go code that worked but was architecturally wrong, mixing Java-style approaches with Go implementations.&lt;/p&gt;

&lt;p&gt;A more experienced Go developer would have prompted for idiomatic patterns and caught architectural issues early. But as a novice, I didn't know what I didn't know, and the AI didn't proactively educate me about better approaches.&lt;/p&gt;

&lt;h3&gt;The Verbosity Trap&lt;/h3&gt;

&lt;p&gt;AI models have a tendency to solve problems by adding more code rather than creating elegant solutions. Instead of clean abstractions, they often generate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Long chains of if-statements rather than streamlined logic&lt;/li&gt;
&lt;li&gt;Repetitive code blocks instead of reusable functions&lt;/li&gt;
&lt;li&gt;Verbose error handling instead of consistent patterns&lt;/li&gt;
&lt;li&gt;Multiple similar functions instead of parameterized solutions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This "more code equals solution" approach creates maintenance nightmares and goes against Go's philosophy of simplicity and clarity.&lt;/p&gt;

&lt;h3&gt;Missing Professional Practices&lt;/h3&gt;

&lt;p&gt;The AI tools I tested didn't suggest professional development practices unless specifically prompted:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No mention of dependency injection until I requested it&lt;/li&gt;
&lt;li&gt;No proactive suggestions for testing strategies&lt;/li&gt;
&lt;li&gt;No guidance on code organization or package structure&lt;/li&gt;
&lt;li&gt;No warnings about security implications&lt;/li&gt;
&lt;li&gt;No discussion of error handling patterns&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They focused on making code work, not on making it maintainable, testable, or secure.&lt;/p&gt;

&lt;h2&gt;Vibe Coding vs. Engaged Development&lt;/h2&gt;

&lt;p&gt;Through this experiment, I developed a clearer distinction between what's known as "vibe coding" and engaged development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vibe coding&lt;/strong&gt; is when you use AI to generate functionality based purely on desired outputs, without engaging with the actual code, architecture, or implementation details. You prompt for features, check if they work, and move on without understanding what was created.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Engaged development&lt;/strong&gt; means actively reviewing generated code, understanding architectural decisions, learning from implementations, and maintaining involvement in the development process.&lt;/p&gt;

&lt;p&gt;The difference is crucial for security-critical applications. Vibe coding might get you a password manager that encrypts data, but engaged development helps you catch issues like unencrypted secrets files or weak encryption implementations.&lt;/p&gt;

&lt;p&gt;One particularly concerning behavior I discovered: AI models sometimes claim to make changes without actually implementing them. Gemini would confidently describe modifications it was making, but the actual code remained unchanged. This highlights why code review remains essential: you can't trust AI assertions about what changes were made.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Actually Worked: A Framework for AI-Assisted Development
&lt;/h2&gt;

&lt;p&gt;After a week of experimentation, I found several approaches that genuinely improved productivity without creating technical debt.&lt;/p&gt;

&lt;h3&gt;
  
  
  AI as Reference Tool
&lt;/h3&gt;

&lt;p&gt;The most successful approach was treating AI like an enhanced search engine rather than a pair programmer. Using GPT-4 to answer specific questions about Go syntax, Fyne APIs, or implementation patterns was incredibly valuable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"How do I handle file I/O errors in Go?"&lt;/li&gt;
&lt;li&gt;"What's the idiomatic way to structure a Fyne application?"
&lt;/li&gt;
&lt;li&gt;"How do I implement AES encryption in Go?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This kept me in control of architecture and implementation while leveraging AI's knowledge base for faster learning.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Boilerplate Sweet Spot
&lt;/h3&gt;

&lt;p&gt;AI tools excel at generating boilerplate code and handling setup tasks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Project structure and dependency management&lt;/li&gt;
&lt;li&gt;Build configurations and deployment scripts&lt;/li&gt;
&lt;li&gt;Standard error handling patterns&lt;/li&gt;
&lt;li&gt;Testing scaffolding and mock generation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are time-consuming tasks that don't require creative problem-solving, making them perfect for AI assistance.&lt;/p&gt;
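&lt;p&gt;The testing scaffolding in particular tends to come out in Go's table-driven shape. Here's the pattern sketched as a plain program for brevity; in a real project the table would drive &lt;code&gt;testing.T&lt;/code&gt; assertions in a &lt;code&gt;_test.go&lt;/code&gt; file, and &lt;code&gt;normalizeName&lt;/code&gt; is just a hypothetical stand-in function:&lt;/p&gt;

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeName is a hypothetical stand-in for a small project helper.
func normalizeName(s string) string {
	return strings.ToLower(strings.TrimSpace(s))
}

func main() {
	// Table-driven cases: adding coverage means adding a row,
	// which is exactly the kind of boilerplate AI generates well.
	cases := []struct {
		name, in, want string
	}{
		{"trims whitespace", "  GitHub  ", "github"},
		{"lowers case", "EMAIL", "email"},
		{"empty input", "", ""},
	}
	for _, c := range cases {
		got := normalizeName(c.in)
		status := "ok"
		if got != c.want {
			status = fmt.Sprintf("FAIL: got %q, want %q", got, c.want)
		}
		fmt.Printf("%s: %s\n", c.name, status)
	}
}
```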

&lt;h3&gt;
  
  
  Specific, Bounded Prompts
&lt;/h3&gt;

&lt;p&gt;When I did use AI for code generation, specific prompts worked much better than vague requests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ "Improve this code"&lt;/li&gt;
&lt;li&gt;✅ "Add error handling to this encryption function"&lt;/li&gt;
&lt;li&gt;❌ "Make this more secure"
&lt;/li&gt;
&lt;li&gt;✅ "Validate password strength using OWASP guidelines"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Specific prompts naturally led to smaller, reviewable changes that I could understand and verify.&lt;/p&gt;
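&lt;p&gt;The last prompt on that list produces something like the function below: small enough to review line by line. The specific rules (minimum length plus character classes) are illustrative thresholds I've chosen for the sketch; the actual OWASP guidance should be consulted for authoritative requirements.&lt;/p&gt;

```go
package main

import (
	"fmt"
	"unicode"
)

// checkStrength is the kind of small, reviewable function a bounded
// prompt yields. The thresholds are illustrative, not an official
// OWASP ruleset.
func checkStrength(pw string) error {
	if 12 > len(pw) {
		return fmt.Errorf("password must be at least 12 characters")
	}
	var hasUpper, hasLower, hasDigit bool
	for _, r := range pw {
		switch {
		case unicode.IsUpper(r):
			hasUpper = true
		case unicode.IsLower(r):
			hasLower = true
		case unicode.IsDigit(r):
			hasDigit = true
		}
	}
	if !hasUpper || !hasLower || !hasDigit {
		return fmt.Errorf("password needs upper, lower, and digit characters")
	}
	return nil
}

func main() {
	for _, pw := range []string{"short", "CorrectHorse99battery"} {
		fmt.Printf("%q: %v\n", pw, checkStrength(pw))
	}
}
```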

&lt;h3&gt;
  
  
  The Navigator Experiment
&lt;/h3&gt;

&lt;p&gt;I experimented with flipping the traditional roles: I wrote the code while the AI provided suggestions and guidance. This approach showed promise:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Kept me engaged with the implementation&lt;/li&gt;
&lt;li&gt;Provided knowledge without taking control&lt;/li&gt;
&lt;li&gt;Reduced debug/refactor cycles&lt;/li&gt;
&lt;li&gt;Maintained architectural consistency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;However, it was difficult to keep AI models in this advisory role. They have a strong tendency to "take over" and generate full implementations rather than just providing guidance.&lt;/p&gt;

&lt;h2&gt;
  
  
  Professional vs. Personal: The Readiness Gap
&lt;/h2&gt;

&lt;p&gt;My experience reveals a clear divide in where AI-assisted coding provides genuine value versus where it creates more problems than it solves.&lt;/p&gt;

&lt;p&gt;For individual developers building personal tools, AI assistance can be transformative: faster prototyping and experimentation, access to unfamiliar technologies and frameworks, ability to build functional applications outside your expertise area, and lower stakes if things go wrong. My password manager project is a perfect example: I built something genuinely useful that I couldn't have created as quickly without AI assistance.&lt;/p&gt;

&lt;p&gt;For professional, production code, current AI tools have significant limitations: too many subtle bugs and edge cases missed, architectural decisions that don't scale, security shortcuts that create vulnerabilities, code that works but isn't maintainable, and lack of proper testing and validation. The people-pleasing tendency and focus on immediate functionality over long-term quality make current AI tools unsuitable for critical production systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Path Forward
&lt;/h2&gt;

&lt;p&gt;The biggest insight from my week of AI-assisted coding is that &lt;strong&gt;we need to develop better practices for working with these tools&lt;/strong&gt;. The current approach of "let the AI do more" may be moving in the wrong direction.&lt;/p&gt;

&lt;p&gt;Based on my experience, effective AI-assisted development should follow these principles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Keep humans in the architectural loop:&lt;/strong&gt; AI can generate implementations, but humans should make structural decisions&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Prefer small, reviewable changes:&lt;/strong&gt; resist the temptation to let AI make large autonomous modifications&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Maintain engagement with the code:&lt;/strong&gt; don't let AI quality reduce your involvement in understanding what's being built&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Use specific, bounded prompts:&lt;/strong&gt; vague requests lead to scope creep and unwanted changes&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Treat AI as a knowledge tool first, code generator second:&lt;/strong&gt; the reference use case is more reliable than the generation use case&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Always verify claims and changes:&lt;/strong&gt; AI confidence doesn't equal correctness&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Focus AI assistance on setup, boilerplate, and knowledge gaps:&lt;/strong&gt; avoid using it for core business logic and architecture&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The future likely isn't more autonomous AI agents, but better human-AI collaboration patterns. We need tools that provide knowledge and suggestions without taking control, respect architectural boundaries and project constraints, encourage good development practices rather than just working code, support iterative, reviewable development processes, and maintain human engagement and learning.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: AI as an Amplifier, Not Replacement
&lt;/h2&gt;

&lt;p&gt;After a week of intensive experimentation with AI-assisted coding, my biggest takeaway is nuance. These tools are incredibly powerful but require careful, intentional use to provide genuine value.&lt;/p&gt;

&lt;p&gt;AI coding assistance is best understood as an amplifier of existing developer capabilities rather than a replacement for developer skills. Good developers can use these tools to work faster and explore new technologies more quickly. But the tools don't make bad developers good; they just help them produce bad code more efficiently.&lt;/p&gt;

&lt;p&gt;The productivity gains are real, but they're not uniformly distributed across all development tasks. AI excels at boilerplate, setup, and knowledge transfer. It struggles with architecture, complex refactoring, and the kind of nuanced decision-making that separates working code from maintainable code.&lt;/p&gt;

&lt;p&gt;Most importantly, the best AI-assisted development workflows aren't the most autonomous ones. The sweet spot seems to be maintaining human control over architecture and implementation while leveraging AI for knowledge, suggestions, and rapid generation of well-defined components.&lt;/p&gt;

&lt;p&gt;We're still in the early days of learning how to work effectively with these tools. The patterns that work best may be quite different from what the current hype cycle suggests. Based on my experience, the future of AI-assisted development is likely to be more collaborative and less autonomous than current trends indicate.&lt;/p&gt;

&lt;p&gt;The key is finding the right balance: leveraging AI's strengths while maintaining the human judgment, architectural thinking, and code quality practices that produce software you can actually maintain and trust.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Was the experiment a success?&lt;/strong&gt; Absolutely. I now have a working, cross-platform password manager available on GitHub with automated tests, proper releases, and reasonably clean code. More importantly, I went from knowing zero Go to understanding core concepts and idiomatic patterns - something that would have taken weeks of traditional learning.&lt;/p&gt;

&lt;p&gt;The real success, though, was discovering a more nuanced relationship with AI coding tools. Instead of the binary "AI good" or "AI bad" perspective I started with, I now have a framework for when and how to use these tools effectively.&lt;/p&gt;

&lt;p&gt;And perhaps most importantly: I genuinely enjoyed every minute of this project. The combination of learning a new language, exploring AI capabilities, and building something I actually use daily made for an engaging week of coding. It's given me a long list of similar experiments I want to try next.&lt;/p&gt;

&lt;p&gt;Sometimes the best way to understand new technology is just to dive in and build something real with it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Want to share your own experiences with AI-assisted coding? I'd love to hear how different approaches and tools have worked (or not worked) for your projects. The community is still figuring out the best practices here, and every real-world experiment adds valuable data points.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For anyone interested, the repository for the project is &lt;a href="https://github.com/JTBranch/SecurePasswordManager" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>go</category>
      <category>productivity</category>
      <category>coding</category>
    </item>
  </channel>
</rss>
