<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jaskaran Singh</title>
    <description>The latest articles on DEV Community by Jaskaran Singh (@jaskaran_singh).</description>
    <link>https://dev.to/jaskaran_singh</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3891457%2Fcf2bff88-3ae7-4d38-a2ed-d62c86263079.jpg</url>
      <title>DEV Community: Jaskaran Singh</title>
      <link>https://dev.to/jaskaran_singh</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jaskaran_singh"/>
    <language>en</language>
    <item>
      <title>AI Agents Are Shipping Features Without You. Now What?</title>
      <dc:creator>Jaskaran Singh</dc:creator>
      <pubDate>Wed, 22 Apr 2026 00:12:59 +0000</pubDate>
      <link>https://dev.to/jaskaran_singh/ai-agents-are-shipping-features-without-you-now-what-4eo0</link>
      <guid>https://dev.to/jaskaran_singh/ai-agents-are-shipping-features-without-you-now-what-4eo0</guid>
      <description>&lt;p&gt;&lt;em&gt;Jaskaran Singh — Senior Software Engineer, AI Trainer&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;A few weeks ago I watched an agent open a GitHub issue, write the fix, run the tests, and open a pull request. No human typed a line of code. The PR passed review.&lt;/p&gt;

&lt;p&gt;I didn't find this inspiring. I found it genuinely disorienting. I say that as someone who trains AI models for a living and is currently building an agent of my own.&lt;/p&gt;

&lt;p&gt;If you're a software engineer in 2026 and you haven't had that moment yet, you will. Agentic AI is being called the third seismic shift in software engineering this century, after open source and DevOps. That framing might be overblown. It might not be. Either way, something real is happening and it's worth thinking clearly about instead of panicking or dismissing it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Numbers Stopped Being Theoretical
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1551288049-bebda4e38f71%3Fw%3D1000%26q%3D80" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1551288049-bebda4e38f71%3Fw%3D1000%26q%3D80" alt="A dashboard of data and analytics representing AI adoption statistics" width="1000" height="667"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Source: &lt;a href="https://unsplash.com/photos/MacBook-Pro-on-table-beside-white-iMac-and-Magic-Mouse-Im7lZjxeLhg" rel="noopener noreferrer"&gt;Unsplash&lt;/a&gt; — Luke Chesser&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A &lt;a href="https://newsletter.pragmaticengineer.com/p/ai-tooling-2026" rel="noopener noreferrer"&gt;survey of nearly 1,000 engineers published in early 2026&lt;/a&gt; found that 95% use AI tools at least weekly, 75% use AI for half or more of their engineering work, and 55% regularly use AI agents. That last number is the one that matters. Copilots have been mainstream for two years. Agents are different.&lt;/p&gt;

&lt;p&gt;A copilot suggests. An agent acts. It reads your codebase, decides what to do, does it, checks whether it worked, and tries again if it didn't. The feedback loop is closed without you in it.&lt;/p&gt;
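
&lt;p&gt;That loop is simple enough to sketch. A minimal Python version, with the plan/apply/verify steps stubbed out (a real agent would wire them to a model and a sandboxed test run; every helper here is a stand-in):&lt;/p&gt;

```python
# Minimal sketch of an agent's closed feedback loop. The helpers are
# stand-ins: a real agent would ask a model for a patch and run the
# actual test suite in a sandbox.

def plan_change(task, feedback):
    # Stand-in for "read the codebase, decide what to do".
    return {"task": task, "revision": 0 if feedback is None else feedback + 1}

def apply_and_verify(change):
    # Stand-in for "apply the patch, run the tests".
    # Pretend the first attempt fails so the retry path is visible.
    passed = change["revision"] >= 1
    return passed, change["revision"]

def run_agent(task, max_attempts=3):
    feedback = None
    for _ in range(max_attempts):
        change = plan_change(task, feedback)         # decide
        passed, feedback = apply_and_verify(change)  # act, then check
        if passed:
            return change                            # loop closed, no human in it
    raise RuntimeError("agent gave up after retries")

print(run_agent("add pagination to /users")["revision"])  # prints 1
```

The point of the sketch is the shape, not the stubs: the verify step feeds back into the plan step until something passes, and nothing in that cycle requires a person.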

&lt;p&gt;In 2025, coding agents moved from experimental tools to production systems shipping real features to real customers. In 2026, single agents are becoming coordinated teams of agents.&lt;/p&gt;

&lt;p&gt;I've been watching this from an unusual angle. My job involves evaluating AI-generated code for quality: finding the failure modes, writing the rubrics, doing the multi-turn reviews. At the same time I'm building a Python agent that monitors the OINP immigration portal and pushes Telegram alerts whenever a new Masters Graduate stream draw drops. Two different relationships with the same technology, and both have given me a clearer picture than I'd have from either side alone.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Agents Are Actually Good At
&lt;/h2&gt;

&lt;p&gt;Agents handle implementation tasks well when the problem is well-scoped and verifiable. "Add pagination to this endpoint." "Write tests for this module." "Refactor this class to use dependency injection." Tasks with clear success criteria: the code runs, the tests pass, the interface contract is unchanged. The agent can verify its own work.&lt;/p&gt;

&lt;p&gt;Quality still varies. My evaluation work confirms what engineers describe: intuitions for delegation develop over time. People hand off tasks that are easily verifiable or low-stakes. That intuition is real and it matters. Knowing what to delegate is itself a skill now.&lt;/p&gt;

&lt;p&gt;Where agents fall apart is anything requiring judgment about what the right problem even is. An agent given an ambiguous brief will confidently solve the wrong version of it. I've seen this pattern repeatedly, not as an occasional edge case but as a consistent failure mode when the task specification has gaps. The agent doesn't ask for clarification. It infers, fills in, and proceeds. Sometimes the inference is right. When it's wrong, it's wrong in ways that are coherent and hard to catch. That's the part that should make you nervous.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Shift That's Actually Happening to Engineering Teams
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://www.gartner.com/en/newsroom/press-releases/2025-10-20-gartner-identifies-the-top-strategic-technology-trends-for-2026" rel="noopener noreferrer"&gt;Gartner predicts&lt;/a&gt; 80% of organizations will evolve large software engineering teams into smaller, AI-augmented teams by 2030. The trajectory is already visible. Teams that used to need eight engineers to maintain a product are running it with four. Not because the other four got fired, but because agent-assisted output per engineer went up enough that the headcount math changed.&lt;/p&gt;

&lt;p&gt;The pattern emerging in 2026: software development is moving toward human expertise focused on defining problems worth solving while AI handles the tactical implementation work.&lt;/p&gt;

&lt;p&gt;That framing is mostly right but it undersells something. "Defining problems worth solving" sounds clean and strategic. In practice it means writing a spec precise enough that an agent doesn't go off the rails, reviewing agent output at a level that catches subtle correctness issues, and making architecture decisions that hold up when the agent starts filling in implementations you didn't anticipate.&lt;/p&gt;

&lt;p&gt;Those are all hard skills. They're also different from the skills that got most of us into engineering. We learned by writing the implementation ourselves. The feedback loop of "I wrote this, it broke, I understand why" is how you build the mental models that make good judgment possible. Whether that judgment transfers cleanly to directing agents at tasks you've never done yourself is an open question. I don't think anyone knows yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  What This Means If You're Mid-Career
&lt;/h2&gt;

&lt;p&gt;I'm five years in. I've shipped production Android apps, done fintech work, and I'm now working at the AI training layer. The people who seem least threatened by this shift share one thing: they understand systems, not just syntax.&lt;/p&gt;

&lt;p&gt;A developer who knows Kotlin and can write Jetpack Compose components is in a different position than one who understands why coroutine cancellation works the way it does, when a &lt;code&gt;ViewModel&lt;/code&gt; scope is the wrong choice, and what the architectural consequences of a particular state management approach are three features down the road. The first kind of knowledge is increasingly delegatable. The second is what you need to review what the agent produces.&lt;/p&gt;

&lt;p&gt;This is not a comfortable message. It basically says the work that builds deep knowledge is being automated before you've had a chance to accumulate it through repetition. That's a real problem for junior developers and I don't have a clean answer to it. Engineers who actively seek out the "why" behind every pattern they use, even when an agent handed them that pattern, will pull ahead of those who treat agent output as a black box. That's my best guess.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Security Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1555066931-4365d14bab8c%3Fw%3D1000%26q%3D80" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fimages.unsplash.com%2Fphoto-1555066931-4365d14bab8c%3Fw%3D1000%26q%3D80" alt="A padlock on a keyboard representing code security" width="1000" height="667"&gt;&lt;/a&gt;&lt;em&gt;Source: &lt;a href="https://unsplash.com/photos/turned-on-gray-laptop-computer-4hbJ-eymZ1o" rel="noopener noreferrer"&gt;Unsplash&lt;/a&gt; — Lewis Kang'ethe Ngugi&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Agentic coding is changing security in two directions at once: as models get more capable, building security into products gets easier, and the same capabilities that help defenders help attackers.&lt;/p&gt;

&lt;p&gt;There's a third direction worth adding from my evaluation work: agents introduce security risks through confident implementation of insecure patterns. An agent writing a data pipeline reaches for the most direct path to working code. Input sanitization, parameterized queries, credential management, error handling that doesn't leak internals: these require deliberate thought. Agents do them inconsistently.&lt;/p&gt;

&lt;p&gt;The more autonomous the coding pipeline, the more critical it is to have security review that isn't the same agent that wrote the code. I've flagged SQL injection vulnerabilities in agent-generated Python and credential handling issues in agent-generated Kotlin. The code was functionally correct. It would have passed a cursory review. It shouldn't have shipped.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I'm Still Building Agents
&lt;/h2&gt;

&lt;p&gt;None of this made me stop building the OINP monitoring bot. It made me more deliberate about it.&lt;/p&gt;

&lt;p&gt;The thing I'm building isn't trying to do something clever. It checks a government webpage on a schedule, parses the draw results, compares against the last known state, and fires a Telegram message if something changed. The agent part is the parsing logic: handling inconsistencies in how the page is structured, dealing with cases where the data format shifts slightly. That's a good fit for what these tools are actually good at.&lt;/p&gt;
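
&lt;p&gt;The compare-against-last-known-state step at the heart of that loop is small. A Python sketch, with an illustrative draw schema (the real OINP page has no such &lt;code&gt;id&lt;/code&gt; field; the parsing layer is where the actual work lives):&lt;/p&gt;

```python
# Core of the monitor's change detection: compare freshly parsed draw
# results against the last known state and report only what is new.
# The "id" and "stream" fields are illustrative, not the real page schema.

def diff_draws(previous, current):
    seen = {draw["id"] for draw in previous}
    return [draw for draw in current if draw["id"] not in seen]

last_known = [{"id": "2026-03", "stream": "Masters Graduate"}]
latest = [
    {"id": "2026-03", "stream": "Masters Graduate"},
    {"id": "2026-04", "stream": "Masters Graduate"},
]

new_draws = diff_draws(last_known, latest)
# new_draws holds only the 2026-04 entry; each item in it would
# trigger one Telegram alert, and `latest` becomes the new saved state.
```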

&lt;p&gt;The immigration system in Canada is opaque in ways that are genuinely stressful for people on it. If a monitoring tool reduces that stress even slightly, it's worth the weekend. The judgment about what's worth building and why is still entirely mine.&lt;/p&gt;

&lt;p&gt;That's probably the honest answer to "now what." The judgment work is still yours. The implementation is increasingly negotiable.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Jaskaran Singh is a Senior Software Engineer working in AI training and evaluation, with production experience in Android development using Kotlin and Flutter. Currently building a Python-based OINP immigration monitoring agent.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="https://linkedin.com/in/jaskaranchana" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; · &lt;a href="https://jaskaranchana.github.io/Portfolio/" rel="noopener noreferrer"&gt;Portfolio&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>career</category>
      <category>security</category>
    </item>
    <item>
      <title>I Grade AI Code for a Living. Here's What Nobody Talks About.</title>
      <dc:creator>Jaskaran Singh</dc:creator>
      <pubDate>Tue, 21 Apr 2026 23:56:05 +0000</pubDate>
      <link>https://dev.to/jaskaran_singh/i-grade-ai-code-for-a-living-heres-what-nobody-talks-about-4do3</link>
      <guid>https://dev.to/jaskaran_singh/i-grade-ai-code-for-a-living-heres-what-nobody-talks-about-4do3</guid>
      <description>&lt;p&gt;&lt;em&gt;Jaskaran Singh — Senior Software Engineer, AI Trainer&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;I've spent the last year doing something most engineers haven't: reading AI-generated code all day and deciding whether it's actually good.&lt;/p&gt;

&lt;p&gt;Not "does it compile." Not "did the tests pass." Good as in, would I be comfortable shipping this to production at 2am on a Friday if something went wrong.&lt;/p&gt;

&lt;p&gt;The answer, more often than people want to admit, is no.&lt;/p&gt;

&lt;p&gt;I use LLMs myself. But after evaluating enough AI-generated code across Python, Java, Kotlin, and C/C++, I know the failure modes aren't random. They follow patterns. And once you see them, you can't unsee them in AI code or your own.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Job Nobody Has a Good Title For
&lt;/h2&gt;

&lt;p&gt;My official role is AI Trainer. What that actually means: I'm a human in the RLHF loop.&lt;/p&gt;

&lt;p&gt;Reinforcement Learning from Human Feedback works by having engineers like me evaluate model outputs against structured rubrics, then rank and rewrite them so the model learns what "better" looks like. I write adversarial prompts to expose failure modes. I do multi-turn code reviews, meaning I follow an entire back-and-forth between a user and a model across five or ten turns, and assess whether the reasoning held up or quietly drifted off the rails somewhere in the middle.&lt;/p&gt;

&lt;p&gt;Less "AI whisperer." More "very opinionated senior reviewer who never runs out of things to flag."&lt;/p&gt;




&lt;h2&gt;
  
  
  The Pattern That Bothers Me Most
&lt;/h2&gt;

&lt;p&gt;There's a category of bug I call "confident and wrong." The code compiles. It's readable. The variable names are sensible. It even has a comment explaining what it does. And it's still wrong. Not obviously wrong, but wrong in the way that only shows up under load, or with a specific input type, or after three other things happen first.&lt;/p&gt;

&lt;p&gt;Here's a real example. Prompt was something like: &lt;em&gt;"Write a function to fetch user details and cache the result."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The model produced:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="kd"&gt;object&lt;/span&gt; &lt;span class="nc"&gt;UserCache&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;cache&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HashMap&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;

    &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fetchFn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getOrPut&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;fetchFn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Clean. Concise. Totally broken in a concurrent environment.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;HashMap&lt;/code&gt; isn't thread-safe. Two coroutines calling &lt;code&gt;getOrPut&lt;/code&gt; simultaneously on the same key can corrupt the map. The model didn't add a mutex, didn't suggest &lt;code&gt;ConcurrentHashMap&lt;/code&gt;, didn't even mention the assumption that this runs single-threaded. It just wrote code that works in the demo and fails in production.&lt;/p&gt;

&lt;p&gt;The correct version uses &lt;code&gt;ConcurrentHashMap&lt;/code&gt; or wraps access with a &lt;code&gt;Mutex&lt;/code&gt; if you need atomic get-or-fetch semantics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="kd"&gt;object&lt;/span&gt; &lt;span class="nc"&gt;UserCache&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;cache&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ConcurrentHashMap&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;()&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;mutex&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Mutex&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;suspend&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fetchFn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;suspend&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="o"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;let&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mutex&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;withLock&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// double-checked after acquiring lock&lt;/span&gt;
            &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getOrPut&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nf"&gt;fetchFn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model's version would pass code review at most places. That's what worries me.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Edge Case Problem Is Structural, Not Random
&lt;/h2&gt;

&lt;p&gt;After a few hundred evaluations, I stopped thinking of missed edge cases as oversights. They're structural. LLMs optimize for the problem as stated. If the prompt doesn't mention null inputs, concurrent access, or network timeouts, the model won't think about them either.&lt;/p&gt;

&lt;p&gt;Good engineers treat those as implied. You don't wait to be asked "what if this list is empty." You just handle it.&lt;/p&gt;

&lt;p&gt;Here are the categories where models fail most consistently:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Concurrency.&lt;/strong&gt; Single-threaded assumptions that explode under real-world load. The &lt;code&gt;HashMap&lt;/code&gt; example above is the most common flavor.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure state propagation.&lt;/strong&gt; Functions that catch exceptions and return &lt;code&gt;null&lt;/code&gt; or &lt;code&gt;false&lt;/code&gt;, then callers that don't check the return value, and the whole chain silently fails. The model gets each function right in isolation. It gets the composition wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resource cleanup.&lt;/strong&gt; Network connections, file handles, database cursors left open because the happy path worked and nobody wrote the &lt;code&gt;finally&lt;/code&gt; block or used the right scoping construct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Behavioral drift across turns.&lt;/strong&gt; In turn 1, the model sets up a class a certain way. By turn 4, after a few "can you refactor this" prompts, it has made changes that contradict the original design without acknowledging it. The code still runs. The architecture is now inconsistent in ways that will cause problems in six months.&lt;/p&gt;
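
&lt;p&gt;Of those, failure state propagation is the one that looks most innocent in review, so it's worth a concrete sketch. Both functions below are illustrative; each is fine alone, and the composition is the bug:&lt;/p&gt;

```python
# Each function is defensible in isolation. Composed, the chain
# silently swallows the error and keeps going with garbage.

def load_config(path):
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return None  # failure encoded as None...

def connect(config):
    # ...and the caller never checks, so None flows straight through.
    return f"connecting with {config}"

print(connect(load_config("/no/such/file")))
# prints "connecting with None" -- the chain "worked", and it's wrong
```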




&lt;h2&gt;
  
  
  What I Actually Look For in a Code Review
&lt;/h2&gt;

&lt;p&gt;My rubric has eight criteria. The ones that surface the most issues:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Correctness under adversarial input.&lt;/strong&gt; Not "does it work with the example." Does it work when the input is empty, null, malformed, enormous, or concurrent? I'll trace through a model's code in my head with the worst inputs I can think of before scoring it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Explicitness of assumptions.&lt;/strong&gt; Code that works is not the same as code that communicates its constraints. If a function assumes its input is sorted, that needs to be in a comment, a precondition check, or the function name. The model almost never does this unprompted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error handling that means something.&lt;/strong&gt; There's a specific anti-pattern I call "error theater":&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="c1"&gt;// This is not error handling. This is error cosplay.&lt;/span&gt;
&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;riskyOperation&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nc"&gt;Log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;e&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"TAG"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"Something went wrong"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It looks like error handling. It isn't. The caller has no information. The system has no way to recover. The log message gets ignored. Good error handling changes what the caller can do. It doesn't just muffle the crash.&lt;/p&gt;
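
&lt;p&gt;One shape that meaningful error handling can take, sketched in Python (the &lt;code&gt;fetch_user&lt;/code&gt; stand-in is hypothetical): a typed exception that carries context, so the caller can retry, fall back, or surface a precise message, none of which a logged-and-swallowed null allows.&lt;/p&gt;

```python
# A typed exception carrying context, instead of Log.e-and-return-null.

class UserFetchError(Exception):
    def __init__(self, user_id, cause):
        super().__init__(f"failed to fetch user {user_id}: {cause}")
        self.user_id = user_id
        self.cause = cause

def fetch_user(user_id):
    # Stand-in for the risky network call.
    raise TimeoutError("upstream timed out")

def get_user(user_id):
    try:
        return fetch_user(user_id)
    except Exception as e:
        # Re-raise with context preserved. The caller now knows *what*
        # failed and *why*, and can decide what to do about it.
        raise UserFetchError(user_id, e) from e
```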

&lt;p&gt;&lt;strong&gt;Security surface.&lt;/strong&gt; SQL construction via string interpolation, credentials in code comments, user input passed to shell commands without sanitization. These come up. Not constantly, but often enough that I check every time.&lt;/p&gt;
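
&lt;p&gt;The SQL check in concrete terms, using the standard-library &lt;code&gt;sqlite3&lt;/code&gt; module to put the interpolated version next to the parameterized one (table and data are illustrative):&lt;/p&gt;

```python
# String interpolation vs. a parameterized query, side by side.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id TEXT, name TEXT)")
conn.execute("INSERT INTO users VALUES ('1', 'alice')")

user_input = "1' OR '1'='1"  # attacker-controlled

# What I flag: the input is concatenated into the SQL text itself.
rows_bad = conn.execute(
    f"SELECT name FROM users WHERE id = '{user_input}'"
).fetchall()  # the OR clause matches every row: injection succeeded

# What passes: the driver binds the value; it is never parsed as SQL.
rows_ok = conn.execute(
    "SELECT name FROM users WHERE id = ?", (user_input,)
).fetchall()  # no user has that literal id, so nothing comes back
```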




&lt;h2&gt;
  
  
  The Skill That Transferred Back
&lt;/h2&gt;

&lt;p&gt;I didn't expect this job to change how I write code. It did.&lt;/p&gt;

&lt;p&gt;Spending eight hours a day articulating &lt;em&gt;why&lt;/em&gt; something is wrong, not just flagging it but writing a clear explanation that a model can actually learn from, builds a habit of internal interrogation that's hard to turn off.&lt;/p&gt;

&lt;p&gt;Now, before I submit a PR, I run my own rubric. Is this thread-safe? What happens on retry? Who owns cleanup? Does this function do what its name says, or has it quietly acquired a second responsibility?&lt;/p&gt;

&lt;p&gt;That last one is underrated. Functions that do two things are where bugs live. The AI writes them constantly because function names get generated from the prompt context, and prompts often have two goals. "Fetch and validate" is two functions pretending to be one.&lt;/p&gt;
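
&lt;p&gt;The split is mechanical once you see it. A minimal Python sketch, with a hypothetical &lt;code&gt;fetch_raw_user&lt;/code&gt; standing in for the network call:&lt;/p&gt;

```python
# "Fetch and validate" as the two functions it actually is. Each one
# can now be tested, reused, and reviewed on its own.

def fetch_raw_user(user_id):
    # Stand-in for the network call.
    return {"id": user_id, "email": "a@example.com"}

def validate_user(payload):
    if not payload.get("email") or "@" not in payload["email"]:
        raise ValueError(f"invalid user payload: {payload!r}")
    return payload

# Composition stays one line at the call site.
user = validate_user(fetch_raw_user("42"))
```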




&lt;h2&gt;
  
  
  Where AI Code Actually Shines
&lt;/h2&gt;

&lt;p&gt;I've been critical, so let me be fair.&lt;/p&gt;

&lt;p&gt;AI-generated code is genuinely good at boilerplate. Serialization logic, configuration parsing, test scaffolding, adapters between interfaces that differ only in naming. Tedious work that models handle well. If I ask for a Room database entity with a DAO and a repository, the output is usually solid and saves thirty minutes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight kotlin"&gt;&lt;code&gt;&lt;span class="c1"&gt;// This kind of scaffolding? Models nail it.&lt;/span&gt;
&lt;span class="nd"&gt;@Entity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tableName&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"users"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;data class&lt;/span&gt; &lt;span class="nc"&gt;UserEntity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nd"&gt;@PrimaryKey&lt;/span&gt; &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="kd"&gt;val&lt;/span&gt; &lt;span class="py"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;Long&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;currentTimeMillis&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@Dao&lt;/span&gt;
&lt;span class="kd"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;UserDao&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SELECT * FROM users WHERE id = :userId"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;suspend&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;getUserById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nc"&gt;UserEntity&lt;/span&gt;&lt;span class="p"&gt;?&lt;/span&gt;

    &lt;span class="nd"&gt;@Insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;onConflict&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OnConflictStrategy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;REPLACE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;suspend&lt;/span&gt; &lt;span class="k"&gt;fun&lt;/span&gt; &lt;span class="nf"&gt;insertUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;UserEntity&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Models are also good at surfacing options I'd forgotten about. Not because they know my codebase, but because they've seen enough code to suggest a &lt;code&gt;StateFlow&lt;/code&gt; where I was reaching for &lt;code&gt;LiveData&lt;/code&gt;, or use &lt;code&gt;runCatching&lt;/code&gt; in a context where it genuinely fits.&lt;/p&gt;

&lt;p&gt;The mistake is treating it as something that reasons about your system. It doesn't know your system. It knows patterns. Those patterns overlap with your system most of the time; the rest of the time, they fail in ways that aren't obvious.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I Wrote This
&lt;/h2&gt;

&lt;p&gt;A few months ago I started noticing that engineers I respect were shipping AI-generated code without reviewing it seriously. Not because they're lazy. Because the code looked fine. That's the problem. It's calibrated to look fine.&lt;/p&gt;

&lt;p&gt;The engineers who work well with AI tooling treat it the way experienced engineers treat a junior developer: capable, useful, not fully trusted without review, and prone to specific failure patterns you learn over time.&lt;/p&gt;

&lt;p&gt;That framing changed how I work with it. I think it'll change how you work with it too.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Jaskaran Singh is a Senior Software Engineer working in AI training and evaluation. Previously built Android fintech apps at Comviva Technologies and Talentica Software. Currently building a Python-based OINP immigration monitoring bot on the side, because immigration status shouldn't require manually refreshing government websites.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Find me on &lt;a href="https://linkedin.com/in/jaskaranchana" rel="noopener noreferrer"&gt;LinkedIn&lt;/a&gt; or at my &lt;a href="https://jaskaranchana.github.io/Portfolio/" rel="noopener noreferrer"&gt;portfolio&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>career</category>
      <category>android</category>
      <category>kotlin</category>
    </item>
  </channel>
</rss>
