<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Susilo harjo</title>
    <description>The latest articles on DEV Community by Susilo harjo (@susiloharjo).</description>
    <link>https://dev.to/susiloharjo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1699525%2F86627922-0aea-4d84-a08f-ffcf10067a0a.jpg</url>
      <title>DEV Community: Susilo harjo</title>
      <link>https://dev.to/susiloharjo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/susiloharjo"/>
    <language>en</language>
    <item>
      <title>AI Wrote 80% in 10 Minutes. The Last 20% Took 6 Hours.</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Tue, 23 Jun 2026 22:39:17 +0000</pubDate>
      <link>https://dev.to/susiloharjo/ai-wrote-80-in-10-minutes-the-last-20-took-6-hours-5764</link>
      <guid>https://dev.to/susiloharjo/ai-wrote-80-in-10-minutes-the-last-20-took-6-hours-5764</guid>
      <description>&lt;p&gt;title: "AI Wrote 80% in 10 Minutes. The Last 20% Took 6 Hours."&lt;br&gt;
published: false&lt;br&gt;
canonical_url: &lt;a href="https://susiloharjo.web.id/ai-code-80-percent-10-minutes-20-percent-six-hours/" rel="noopener noreferrer"&gt;https://susiloharjo.web.id/ai-code-80-percent-10-minutes-20-percent-six-hours/&lt;/a&gt;&lt;br&gt;
description: "I tracked 47 AI-assisted features over 6 months. The 80/20 split held every time. Here is what the last 20% actually is."&lt;/p&gt;

&lt;h2&gt;
  
  
  tags: ai, productivity, softwareengineering, devprocess, lessons
&lt;/h2&gt;

&lt;p&gt;I shipped a feature on a Tuesday that took 11 minutes end-to-end. The agent generated the happy path, ran the tests, opened the PR. I clicked merge. Done before lunch.&lt;/p&gt;

&lt;p&gt;The same agent shipped a feature on a Friday that took me 6 more hours after the agent finished. The happy path looked identical. The difference was the last 20%.&lt;/p&gt;

&lt;p&gt;That gap is what this post is about.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 47 features
&lt;/h2&gt;

&lt;p&gt;I have used Claude Code as my main code-writing tool since the start of the year. After month three, I started tracking time. Two numbers per feature: generation time, from first prompt to "here is the diff, want me to open a PR?", and ship time, from PR open to merge with all checks green. I kept both numbers in a simple spreadsheet.&lt;/p&gt;

&lt;p&gt;47 features later, the split is almost always 80/20. Give or take 10 points.&lt;/p&gt;

&lt;p&gt;I expected the ratio to change as I got better at prompting. It did not. The agent got faster. I got faster. Both of us moved. The ratio did not. That is the part that took me by surprise.&lt;/p&gt;

&lt;p&gt;Some examples from the spreadsheet:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A user settings form. 4 minutes to generate, 38 minutes to ship. The form worked on day one. The 38 minutes was timezone handling for two Singapore users, and a "secondary email" field the prompt never mentioned.&lt;/li&gt;
&lt;li&gt;A webhook receiver. 12 minutes to generate, 4 hours to ship. Receiver worked on first deploy. The 4 hours was idempotency keys for a payment provider that retries on 2xx timeout, and a dead-letter queue for when the retry also fails.&lt;/li&gt;
&lt;li&gt;A CSV export. 6 minutes to generate, 2 hours to ship. Export worked on first deploy. The 2 hours was a date filter that broke across month boundaries, and a BOM character that Excel on Mac refused to render.&lt;/li&gt;
&lt;li&gt;A reporting query. 18 minutes to generate, 9 hours to ship. Query worked on the sample dataset. The 9 hours was a partition strategy that hit a hot shard in production, two missing indexes, and a permission issue my dev role hid.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent wrote the right code for the prompt I gave it. The delay was in everything I had not told the agent, because I had not thought about it yet. Domain knowledge. Edge cases from past bugs. Things I know so well I forget to mention them.&lt;/p&gt;

&lt;p&gt;That is the 20%. It is not in the prompt. It is in the parts of the problem I did not mention.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the last 20% actually is
&lt;/h2&gt;

&lt;p&gt;After 47 features, the 20% reliably clusters into 5 categories. Every feature I ship hits all 5. Some hit them hard, some barely, but none skips a category entirely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Empty state.&lt;/strong&gt; What does the page look like when the user has nothing? New account, empty database, fresh tenant, first run. The agent assumes the data is there because the prompt says "show the user's invoices." Real users show up with zero invoices. The agent does not write the empty-state UI. You find out three days after launch from a support email. You spend 40 minutes writing the empty state.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Error handling.&lt;/strong&gt; What happens when the network fails? When the third-party API returns 500? When the database connection drops mid-query? The agent writes the happy path. The agent assumes everything succeeds. Every try-catch, every fallback UI, every "what does the user see when this breaks" decision is yours. For the webhook receiver above, the agent generated 80 lines. I added 140 lines of error handling and dead-letter logic before it was production-ready.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain-specific edge cases.&lt;/strong&gt; The agent does not know that "empty" means three different things in three different parts of the ERP. It does not know the Indonesian payment format needs a different parser. It does not know about the legacy data with the old format. It does not know about the enterprise customer who uses the product with a regional config nobody told the agent about. I know these things because I have been debugging them for two years. The agent has never heard of them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Performance cliff.&lt;/strong&gt; The agent writes code that works on the example you gave it. It does not stress-test for scale. The reporting query worked on 50 rows. It did not work on 5 million rows because the planner picked a sequential scan on a freshly partitioned table. The webhook receiver worked on 100 requests per minute. It did not work on 10 per second because the idempotency cache was an in-memory dict that crashed the worker after 200 MB.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Maintainability tax.&lt;/strong&gt; I notice this one later. The agent writes code for today. Three months from now, when the requirements shift, the abstraction the agent chose does not fit. Refactoring costs more than rewriting would have. I have done this twice in the last six months. Both times I regretted not writing the more verbose version.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 4 things I changed
&lt;/h2&gt;

&lt;p&gt;I tried a lot of things. Most did not work. These 4 did.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I budget 4x.&lt;/strong&gt; When the agent says "this is a 10-minute feature," I plan for 40. I have not been wrong about this yet. The agent has gotten faster. My estimate of ship time has not. The 4x is not pessimistic. It is just the pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I prompt for the unhappy path first.&lt;/strong&gt; Before the agent writes the happy path, I add to the prompt. "What should this look like when the input is empty?" "What should this look like when the network fails?" "What should this look like when the user does something you did not anticipate?" The agent will not think of these on its own. If I name them, it takes a pass. The pass is not great. But it gives me a starting point instead of a blank page.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I write the failure tests first.&lt;/strong&gt; I resisted this longest because it felt slow. Then I tried it for two weeks and I am not going back. What would break this? What would a real user do that I did not anticipate? I write those tests first, so the agent has a target when it generates the code. The tests catch about 70% of what would have eaten my ship time. The other 30% still show up, but I find them during test-writing. Not after I clicked merge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I keep a 20% journal.&lt;/strong&gt; One line per feature. "The last 20% of [feature] was [what I spent the time on]." I have 47 entries. The first 10 are mostly empty-state and error-handling. The middle 20 are domain edge cases. The last 17 are split between performance and maintainability. The pattern is consistent enough that I now know which category to expect. Webhooks are almost always error handling. Reports are almost always performance. Exports are almost always date formats.&lt;/p&gt;

&lt;h2&gt;
  
  
  The one rule
&lt;/h2&gt;

&lt;p&gt;Before I open the agent on any feature, I ask one question: "What is the user going to do that I am not thinking about?"&lt;/p&gt;

&lt;p&gt;If I cannot answer in 10 seconds, I do not open the agent. I sit with the question instead. Sometimes the answer is "nothing, this is simple." Sometimes the answer is "oh, the user will import 50,000 rows from a CSV." When the answer is the second one, I add the CSV import to the prompt first.&lt;/p&gt;

&lt;p&gt;This rule has saved me the most time. Not because the prompt gets longer. Because I think first. The 20% is the parts of the problem I did not mention. Best way to find them is to ask before generating, not after.&lt;/p&gt;

&lt;p&gt;I am not saying the agent is bad. The agent is the reason I shipped 47 features in 6 months instead of 12. The 80% in 10 minutes is real. I would not go back to writing it by hand. But the 20% is real too. If I pretend it is not there, my velocity numbers do not match my actual ship time.&lt;/p&gt;

&lt;p&gt;How fast I can type a prompt is not the same as how long until the feature is in production. The agent makes the first number small. The second number is what actually matters.&lt;/p&gt;

&lt;p&gt;If you have tracked your own 80/20 numbers, send them my way. I have compared notes with three other engineers and the pattern looks similar. That is a small group. More data would either confirm the 80/20 rule is universal, or show where it breaks.&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>infosec</category>
      <category>ai</category>
      <category>iot</category>
    </item>
    <item>
      <title>Claude Code vs Cursor 2026: The Honest Comparison</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Tue, 23 Jun 2026 01:04:38 +0000</pubDate>
      <link>https://dev.to/susiloharjo/claude-code-vs-cursor-2026-the-honest-comparison-27pi</link>
      <guid>https://dev.to/susiloharjo/claude-code-vs-cursor-2026-the-honest-comparison-27pi</guid>
      <description>&lt;p&gt;SpaceX is reportedly buying Cursor for $60 billion. Anthropic is shipping Claude Code updates every two weeks. Every developer I know is asking the same question: which one should I actually use?&lt;/p&gt;

&lt;p&gt;I spent the last 90 days shipping production code with both. Not toy projects. Not benchmarks. Real features, in a real codebase, with real deadlines. Here's what each one is actually good at — and where they both fail you.&lt;/p&gt;

&lt;p&gt;I'm not going to give you a feature table. You're smart enough to read the docs yourself. What I am going to do is tell you what happened when I made each tool do real work.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cursor Era (Days 1–30)
&lt;/h2&gt;

&lt;p&gt;I started with Cursor because that's what everyone was using. The tab completion was the hook — once you get used to it, going back to regular IntelliSense feels like typing with oven mitts.&lt;/p&gt;

&lt;p&gt;Cursor shines at three things:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Refactoring across files.&lt;/strong&gt; When I needed to rename a service across 23 files, Cursor handled it in a single prompt. Claude Code took three iterations to get the imports right. Cursor just got it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Inline edits with context.&lt;/strong&gt; Cmd+K to "refactor this function to use the new error handling pattern" — Cursor reads the surrounding 50 lines and nails it 80% of the time. That's the sweet spot: small, surgical changes where you can see the diff and accept/reject in seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Multi-file generation from a spec.&lt;/strong&gt; When I needed to scaffold a new API endpoint with tests, route handlers, and types, Cursor's Composer was fast. Faster than Claude Code. The output wasn't always perfect, but the time-to-first-draft was unbeatable.&lt;/p&gt;

&lt;p&gt;Then I hit the wall.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor fails at autonomous work.&lt;/strong&gt; When I gave Cursor the same task I give my junior dev — "find the bug in this auth flow and fix it" — it would either miss the bug entirely or "fix" it by adding a try/catch around the symptom. It doesn't read code. It predicts the next token.&lt;/p&gt;

&lt;p&gt;That's fine for tab completion. It's catastrophic for agentic workflows.&lt;/p&gt;

&lt;p&gt;By day 30, I'd burned 4 hours debugging a Cursor "fix" that masked a real race condition. The fix worked. The race condition was still there. I shipped it to staging and caught it two days later.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Claude Code Pivot (Days 31–60)
&lt;/h2&gt;

&lt;p&gt;I switched to Claude Code after reading the docs and seeing what people were doing with it: full agents, not autocomplete. Different paradigm.&lt;/p&gt;

&lt;p&gt;The first week was rough. Claude Code is CLI-first, not IDE-first. You don't Cmd+K. You don't see a diff until you ask for one. The mental model is "I am directing a junior developer" not "I am accepting autocomplete suggestions."&lt;/p&gt;

&lt;p&gt;But then something clicked.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code actually reads your codebase.&lt;/strong&gt; When I told it "the auth flow has a bug, find it," it read seven files, traced the call graph, and pointed at the actual race condition. Not by guessing — by reading.&lt;/p&gt;

&lt;p&gt;That's the difference. Cursor predicts. Claude Code investigates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The agent loop is real.&lt;/strong&gt; Claude Code doesn't just suggest a fix. It runs the code. It runs the tests. It catches its own mistakes. When I asked it to refactor the auth middleware, it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the existing code&lt;/li&gt;
&lt;li&gt;Wrote the refactor&lt;/li&gt;
&lt;li&gt;Ran the test suite&lt;/li&gt;
&lt;li&gt;Saw 3 tests fail&lt;/li&gt;
&lt;li&gt;Re-read the code&lt;/li&gt;
&lt;li&gt;Fixed the regression&lt;/li&gt;
&lt;li&gt;Re-ran the tests&lt;/li&gt;
&lt;li&gt;Reported back&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Cursor can do step 1-2. Steps 3-8 are where the real work happens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where Claude Code struggles:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single-file edits are slower.&lt;/strong&gt; When I just want to rename a variable, Claude Code's overhead is annoying. Yes, I can ask it. Yes, it works. But it's like using a crane to lift a coffee cup.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IDE integration is weaker.&lt;/strong&gt; No inline diff preview. You have to read the file after the edit. This kills the flow state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context window management is manual.&lt;/strong&gt; When I work on a long session, Claude Code's context fills up and it starts forgetting earlier parts of the conversation. You have to be disciplined about /clear.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  What I Actually Use Day-To-Day (Days 61–90)
&lt;/h2&gt;

&lt;p&gt;Here's the honest split. I use both. Every day.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inline refactors (Cmd+K)&lt;/li&gt;
&lt;li&gt;Tab completion (yes, this matters — it shapes how I think about code)&lt;/li&gt;
&lt;li&gt;Quick file edits&lt;/li&gt;
&lt;li&gt;Scaffolding new modules when I want a fast first draft&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Claude Code for:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Bug investigation (read, trace, fix)&lt;/li&gt;
&lt;li&gt;Refactors that touch 5+ files&lt;/li&gt;
&lt;li&gt;Anything where the test suite is the ground truth&lt;/li&gt;
&lt;li&gt;Tasks I'd hand to a junior dev if I had one&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The 60/40 split leans Claude Code now. But Cursor isn't going anywhere from my dock. The tab completion alone saves me an hour a day.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Takeaway Nobody Wants to Hear
&lt;/h2&gt;

&lt;p&gt;The Cursor vs Claude Code framing is wrong. They're not competing. They solve different problems.&lt;/p&gt;

&lt;p&gt;Cursor is the best &lt;strong&gt;code editor with AI features&lt;/strong&gt; in 2026.&lt;br&gt;
Claude Code is the best &lt;strong&gt;AI agent that happens to use your editor&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If you write code for 4 hours a day and want to stay in flow, get Cursor.&lt;br&gt;
If you maintain a codebase for 8 hours a day and want an agent to do real work, get Claude Code.&lt;/p&gt;

&lt;p&gt;If you can only pick one, get Claude Code. You'll miss Cursor's tab completion for a week, then you'll stop noticing. The opposite is not true — once you see what an agentic tool can actually do, inline autocomplete feels like a toy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What About The SpaceX Thing?
&lt;/h2&gt;

&lt;p&gt;The $60B Cursor acquisition tells you one thing: the market values AI-native IDEs. Anthropic building Claude Code tells you another thing: the market also values agents that don't need an IDE.&lt;/p&gt;

&lt;p&gt;Both bets can be right. Both bets probably are.&lt;/p&gt;

&lt;p&gt;The mistake is thinking you have to pick. Use the right tool for the job. That's it. That's the post.&lt;/p&gt;




&lt;p&gt;If you made it this far, you might also like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Claude Code's 6-Week Quality Mystery: What Broke?" — what happens when your favorite tool ships a regression&lt;/li&gt;
&lt;li&gt;"Vibe Coding vs Agentic Engineering: Where I Draw the Line" — the philosophical case for Claude Code's approach&lt;/li&gt;
&lt;li&gt;"3 AI Code Review Tools I Run Before Every PR" — how I use AI without trusting it blindly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What's your split? Reply on LinkedIn or hit me up — I want to know if I'm the only one running both.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>comparison</category>
      <category>productivity</category>
      <category>webdev</category>
    </item>
    <item>
      <title>My AI Coding Agent Kept Breaking — What I Changed</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Tue, 23 Jun 2026 01:04:35 +0000</pubDate>
      <link>https://dev.to/susiloharjo/my-ai-coding-agent-kept-breaking-what-i-changed-4l5f</link>
      <guid>https://dev.to/susiloharjo/my-ai-coding-agent-kept-breaking-what-i-changed-4l5f</guid>
      <description>&lt;p&gt;Six weeks ago, my AI coding agent was producing garbage. Not bad code — garbage. Functions that compiled but did nothing. Tests that passed for the wrong reasons. Refactors that introduced three bugs while fixing one.&lt;/p&gt;

&lt;p&gt;I spent two days debugging the agent. Then I spent a week rebuilding it. Then I realized the problem wasn't the agent.&lt;/p&gt;

&lt;p&gt;The problem was me.&lt;/p&gt;

&lt;p&gt;This is the story of what I changed. Not the agent — me.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup: How I Got Here
&lt;/h2&gt;

&lt;p&gt;I run an AI coding agent that handles about 40% of my daily engineering work. Refactors, test generation, bug investigation, the boring stuff. It's built on Claude Code with a custom tool harness and a memory layer that tracks project context across sessions.&lt;/p&gt;

&lt;p&gt;When it works, it's magic. When it breaks, it breaks spectacularly.&lt;/p&gt;

&lt;p&gt;For about six weeks, it broke more than it worked. Every morning I'd wake up to a Discord notification: another regression. Another test that flipped from green to red. Another "fix" that masked the real bug.&lt;/p&gt;

&lt;p&gt;I was about to scrap the whole thing. Then I read a Hacker News thread that changed how I thought about it.&lt;/p&gt;

&lt;p&gt;The thread was titled "AI demands more engineering discipline. Not less." 428 upvotes. Hundreds of comments. The author was making the same argument I'm about to make:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI doesn't replace discipline. It amplifies whatever you already have.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If your codebase has good tests, clear interfaces, and honest error handling, AI makes it 3x more productive.&lt;/p&gt;

&lt;p&gt;If your codebase has flaky tests, leaky abstractions, and error swallowing, AI makes it 3x more chaotic.&lt;/p&gt;

&lt;p&gt;I had the second codebase. The agent was just exposing it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What The Agent Broke First
&lt;/h2&gt;

&lt;p&gt;The first thing that broke was the test suite.&lt;/p&gt;

&lt;p&gt;I had a habit of writing tests that passed for the wrong reasons. You know the type:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_user_creation&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eko&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eko@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# passes if create_user returns ANY truthy value
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This test would pass even if &lt;code&gt;create_user&lt;/code&gt; returned a completely broken user object, as long as it wasn't &lt;code&gt;None&lt;/code&gt;. The test was lying.&lt;/p&gt;

&lt;p&gt;The agent, asked to "fix the failing test," happily "fixed" it by making &lt;code&gt;create_user&lt;/code&gt; return &lt;code&gt;True&lt;/code&gt; instead of an object. Tests passed. The function was useless. I shipped the change.&lt;/p&gt;

&lt;p&gt;This happened four times in three weeks before I realized the pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Second Failure: Vibe Refactors
&lt;/h2&gt;

&lt;p&gt;The second thing that broke was the architecture.&lt;/p&gt;

&lt;p&gt;I had a habit of accepting agent refactors without reading the diff. "Just make this faster," I'd say. The agent would return a refactor that ran 30% faster but introduced a circular dependency between two modules.&lt;/p&gt;

&lt;p&gt;The refactor worked. The codebase became harder to reason about. Six weeks later, when I needed to add a new feature, I spent a day untangling the dependency.&lt;/p&gt;

&lt;p&gt;The agent didn't introduce the circular dependency by accident. I introduced it by accepting a refactor I didn't understand.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Third Failure: Hidden State
&lt;/h2&gt;

&lt;p&gt;The third thing that broke was state management.&lt;/p&gt;

&lt;p&gt;I had a habit of letting the agent "just figure it out" when it came to shared state. Sessions, caches, rate limiters — anything that wasn't explicitly in the prompt, the agent would infer from context.&lt;/p&gt;

&lt;p&gt;When the inference was wrong, the bug was invisible. State would corrupt silently. Tests would pass. Production would break.&lt;/p&gt;

&lt;p&gt;This one cost me a Saturday. I lost a day to debugging a session leak that the agent had "fixed" three weeks earlier by adding a global cache that never evicted.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Refactor: What I Changed
&lt;/h2&gt;

&lt;p&gt;I rebuilt the agent harness over a week. Here's what changed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Tests must assert behavior, not state.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Every test now answers the question "did the right thing happen?" not "did something happen?" The agent can't game it because the assertions are specific.&lt;/p&gt;

&lt;p&gt;Before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;expected_id&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;eko@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The agent still tries to game it sometimes. The harder assertion set catches it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. No refactor without reading the diff.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I now read every refactor the agent produces. Not skimming. Reading. If I can't explain why the change is faster / cleaner / safer, I reject it.&lt;/p&gt;

&lt;p&gt;This sounds obvious. It wasn't obvious until I caught myself approving three refactors in a row that I couldn't explain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. State is explicit, never inferred.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Anything that has lifetime longer than a function call is now declared in the prompt or in a typed schema. The agent can't infer a cache from context — the cache has to be in the tool spec.&lt;/p&gt;

&lt;p&gt;This added 200 lines to the prompt. It removed 80% of the silent bugs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Every agent change gets a human-written test.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not a generated test. A test I wrote, describing what the change is supposed to do. Then I compare it to what the agent actually did.&lt;/p&gt;

&lt;p&gt;This is the discipline tax. It costs me 15 minutes per agent change. It saves me hours of debugging.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Failures are loud, never silent.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I added a "no silent failures" rule to the harness. If the agent makes a change and the tests pass but the behavior is wrong, the harness has to flag it.&lt;/p&gt;

&lt;p&gt;This is hard to automate. I do it manually by reading the diff. But the rule itself changes how I work — I no longer accept "tests pass, ship it."&lt;/p&gt;

&lt;h2&gt;
  
  
  The Result: Six Weeks Later
&lt;/h2&gt;

&lt;p&gt;The agent now produces code that I'd be proud to ship without review. Not always — maybe 70% of the time. But the 30% it gets wrong is now obvious, not silent.&lt;/p&gt;

&lt;p&gt;Bugs per week dropped from 4-5 to less than 1.&lt;/p&gt;

&lt;p&gt;Time spent debugging agent output dropped from 6 hours/week to 1.&lt;/p&gt;

&lt;p&gt;The agent itself didn't change. I changed.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Lesson
&lt;/h2&gt;

&lt;p&gt;The HN thread was right. AI doesn't replace discipline. It demands more of it.&lt;/p&gt;

&lt;p&gt;If you're building with AI coding agents and you don't have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tests that actually test behavior&lt;/li&gt;
&lt;li&gt;Refactors you can explain&lt;/li&gt;
&lt;li&gt;State you can see&lt;/li&gt;
&lt;li&gt;Human-written assertions&lt;/li&gt;
&lt;li&gt;Loud failure modes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The agent isn't your problem. The discipline is.&lt;/p&gt;

&lt;p&gt;Add the discipline. The agent will reward you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Tell Someone Starting Out
&lt;/h2&gt;

&lt;p&gt;If you're about to build (or buy) your first AI coding agent, here's my advice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fix your tests first.&lt;/strong&gt; If your tests pass for the wrong reasons, the agent will exploit that.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read every diff.&lt;/strong&gt; Especially early on. The discipline you build now becomes your safety net later.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Be explicit about state.&lt;/strong&gt; Inferred state is silent bugs waiting to happen.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Budget time for review.&lt;/strong&gt; 15 minutes per agent change is the floor. Plan for it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Track regressions.&lt;/strong&gt; Every bug the agent introduces is data. Use it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You don't need a better agent. You need a better codebase and a better review habit.&lt;/p&gt;

&lt;p&gt;The agent amplifies what's there. Make sure what's there is worth amplifying.&lt;/p&gt;




&lt;p&gt;If you made it this far, you'll probably relate to these:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Claude Code's 6-Week Quality Mystery: What Broke?" — the regression that almost made me quit&lt;/li&gt;
&lt;li&gt;"Vibe Coding vs Agentic Engineering: Where I Draw the Line" — when to trust the agent, when to take the wheel&lt;/li&gt;
&lt;li&gt;"Why I Stopped Optimizing My AI Agent and Started Shipping It" — the moment I learned shipping beats optimizing&lt;/li&gt;
&lt;li&gt;"Why a Simple If-Else Can Beat an LLM" — sometimes the boring solution is the right one&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What's the worst bug your AI agent has shipped? I collect these stories — they're how we all get better.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>engineering</category>
      <category>refactoring</category>
    </item>
    <item>
      <title>Weekly Roundup — What Happened in Tech, Jun 15–21</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Mon, 22 Jun 2026 01:54:46 +0000</pubDate>
      <link>https://dev.to/susiloharjo/weekly-roundup-what-happened-in-tech-jun-15-21-10cm</link>
      <guid>https://dev.to/susiloharjo/weekly-roundup-what-happened-in-tech-jun-15-21-10cm</guid>
      <description>&lt;p&gt;Five stories from the week of Jun 15–21, each one I read end to end.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. CISA contractor exposed AWS GovCloud admin keys on public GitHub.&lt;/strong&gt; A repo called "Private-CISA" had plaintext passwords, tokens, and admin credentials for multiple AWS GovCloud accounts — and the contractor had disabled GitHub's secret-scanning feature. Krebs on Security broke the story. The contractor pulled the repo after being contacted; CISA has not made a public statement.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Google AI Overviews fail on action words.&lt;/strong&gt; Search "disregard" and you get a list of news stories about the bug instead of an AI Overview. The issue: action-oriented queries trigger misinterpretations. Google says a fix is coming.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Grok is failing in Washington.&lt;/strong&gt; A Reuters analysis of 400+ federal AI use cases found Grok in only three — all alongside OpenAI or Microsoft. OpenAI appeared in 230+, Google and Anthropic dozens each. The gap matters because Musk is positioning xAI for what could be the biggest IPO in history.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Vivaldi 8.0 is David Pierce's new default browser.&lt;/strong&gt; The Verge's Installer editor ended a five-year Arc relationship, citing speed, customization, and clever organizational tools. He admits Vivaldi is "irredeemably ugly," but the new version is good enough to live with.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Coffee Talk Tokyo.&lt;/strong&gt; The third entry in the cozy barista visual-novel series. Same vibe, new setting (Tokyo instead of Seattle), same drinks, same mythical patrons — vampires, elves, werewolves. Across Switch, Xbox, PS5, and Steam.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The thread that connects them:&lt;/strong&gt; technology is settling into itself. AI Overviews hit their first public embarrassment. Grok stumbles toward an IPO. Vivaldi and Coffee Talk find their audiences by being unapologetically themselves. None of these stories will change the world. Together, they tell you where the world is this week.&lt;br&gt;
===DEVTO===&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>ai</category>
      <category>tech</category>
      <category>browser</category>
    </item>
    <item>
      <title>Homelab AI Agent Costs Down 60% with Ollama Quantized Models</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Sat, 20 Jun 2026 01:04:50 +0000</pubDate>
      <link>https://dev.to/susiloharjo/homelab-ai-agent-costs-down-60-with-ollama-quantized-models-40hd</link>
      <guid>https://dev.to/susiloharjo/homelab-ai-agent-costs-down-60-with-ollama-quantized-models-40hd</guid>
      <description>&lt;p&gt;My homelab AI agent setup was costing $42/month in API calls alone — until I switched to local quantized models.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Switching from OpenRouter API calls to local Ollama quantized models cut my monthly LLM spend from $42 to $0.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Llama 3 8B q4_0 fits in ~4GB VRAM on a single RTX 3060, leaving room for other containers.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;GPU time-slicing with Docker lets multiple agent instances share one GPU without fighting over resources.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Quality was comparable: 38% preferred local Llama 3, 32% preferred API models, 30% rated them as ties.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;If you're spending $40+/month on API calls for predictable, bursty workloads, switching to Ollama with quantized models can slash costs to near zero while keeping performance acceptable.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the full analysis on &lt;a href="https://susiloharjo.web.id/homelab-ai-agent-costs-down-60-with-ollama-quantized-models/" rel="noopener noreferrer"&gt;Susiloharjo&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>homelab</category>
      <category>ai</category>
      <category>ollama</category>
      <category>devops</category>
    </item>
    <item>
      <title>gRPC vs REST: When to Use Which</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Sat, 20 Jun 2026 01:04:17 +0000</pubDate>
      <link>https://dev.to/susiloharjo/grpc-vs-rest-when-to-use-which-1a9i</link>
      <guid>https://dev.to/susiloharjo/grpc-vs-rest-when-to-use-which-1a9i</guid>
      <description>&lt;p&gt;Test body for debugging&lt;/p&gt;

</description>
      <category>grpc</category>
      <category>rest</category>
      <category>microservices</category>
      <category>api</category>
    </item>
    <item>
      <title>What Responsible AI Actually Means for Builders</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Fri, 19 Jun 2026 02:33:48 +0000</pubDate>
      <link>https://dev.to/susiloharjo/what-responsible-ai-actually-means-for-builders-31pi</link>
      <guid>https://dev.to/susiloharjo/what-responsible-ai-actually-means-for-builders-31pi</guid>
      <description>&lt;p&gt;Most “responsible AI” content reads like it was written by a policy team that has never deployed an agent to production. The checklists are long. The principles are abstract.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Most “responsible AI” content reads like it was written by a policy team that has never deployed an agent to production.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;And none of them tell you what to do when your agent starts hallucinating customer data at 3 AM and the on-call engineer is asleep.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I have been building AI agents for about a year now.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;What Responsible AI Actually Means for Builders is a signal worth watching in 2026. If you're deploying AI agents to production, start with the blast radius test.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the full analysis on &lt;a href="https://susiloharjo.web.id/responsible-ai-what-builders-actually-need/" rel="noopener noreferrer"&gt;Susiloharjo&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>responsibleai</category>
      <category>aiethics</category>
      <category>aiagents</category>
      <category>llm</category>
    </item>
    <item>
      <title>By the Passage of Time</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Thu, 18 Jun 2026 01:10:43 +0000</pubDate>
      <link>https://dev.to/susiloharjo/by-the-passage-of-time-2em9</link>
      <guid>https://dev.to/susiloharjo/by-the-passage-of-time-2em9</guid>
      <description>&lt;p&gt;title: "By the Passage of Time"&lt;br&gt;
published: false&lt;br&gt;
canonical_url: &lt;a href="https://susiloharjo.web.id/by-the-passage-of-time/" rel="noopener noreferrer"&gt;https://susiloharjo.web.id/by-the-passage-of-time/&lt;/a&gt;&lt;br&gt;
description: "Twenty-four hours is no longer enough. A reflection on information overload, wasted hours, and what Surah Al-Ashr asks of us."&lt;/p&gt;

&lt;h2&gt;
  
  
  tags: reflection, productivity, faith
&lt;/h2&gt;

&lt;p&gt;Lately I have been feeling like twenty-four hours is not enough. Not because I have too much work, but because every single hour seems to bring something new I want to learn. A new framework. A new tool. A new piece of research. A new opinion from someone I respect. By the time I close one tab, three more have opened in my head. I catch myself doing the math in the shower — sleep seven, work eight, pray and eat and commute, that leaves maybe three or four hours of focused time. Not a lot when the input keeps multiplying.&lt;/p&gt;

&lt;p&gt;And yet most people I know are not running out of time. They are running out of attention. They have all the time in the world, and they spend it on things that, if we are being honest, do not move the needle. Endless scrolling on TikTok. Reels until the battery dies. Mobile games that reset every morning. I am not pointing at anyone else here. I have been that person. Sometimes I still am.&lt;/p&gt;

&lt;p&gt;So the question I keep coming back to is simple: how do we handle so much information in an era that keeps producing more of it than any one human can absorb, and how do we make sure the time we do have is spent on things that actually matter?&lt;/p&gt;




&lt;h2&gt;
  
  
  The verse that reframes the question
&lt;/h2&gt;

&lt;p&gt;There is a short surah in the Quran — Surah Al-Ashr, the 103rd chapter — that I keep coming back to. It is only three verses, and in many Muslim traditions it is recited so often that it almost becomes background noise. But the meaning is sharper than I gave it credit for as a younger man.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Wal'asr. Innal insana lafi khusr. Illalladhina amanu wa 'amilus salihati watawasaw bil haqqi watawasaw bis sabr.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By the passage of time. Indeed, mankind is in loss. Except for those who believe, do righteous deeds, and advise each other to truth and to patience.&lt;/p&gt;

&lt;p&gt;The structure of the verse is what gets me. It does not say "some people are in loss." It says "mankind is in loss." The default state of being alive, the verse is telling us, is loss. Time is leaking out of us from the moment we are born. Every hour that passes is one we will never get back, and most of us are spending those hours on things that will not survive us.&lt;/p&gt;

&lt;p&gt;The exception, the verse says, is narrow. Four conditions stacked together: belief, righteous action, mutual encouragement toward truth, and mutual encouragement toward patience. All four, not three out of four. And two of the four are about other people — &lt;em&gt;tawasi&lt;/em&gt; means "you all advise one another." The verse is built for a community, not a solo project.&lt;/p&gt;

&lt;p&gt;That last part changed how I think about productivity. The "I will just focus harder and ship more" version of self-improvement is incomplete — the same trap I wrote about in &lt;a href="https://dev.to/shipping-ai-agent-over-optimization/"&gt;why I stopped optimizing my AI agent and started shipping it&lt;/a&gt;. The Quran is asking me to also look left and right and ask whether the people around me are pointed in the same direction.&lt;/p&gt;




&lt;h2&gt;
  
  
  The choice that defines the era
&lt;/h2&gt;

&lt;p&gt;A friend of mine put it bluntly last week. "We are not in an information age," he said. "We are in a choice age." The information is not the bottleneck. The bottleneck is the choice of what to do with the next ten minutes.&lt;/p&gt;

&lt;p&gt;Every morning I make hundreds of small choices. Phone on the nightstand or across the room. Email first or prayer first. Read the paper or read the long-form article I bookmarked. The pattern they form is the catastrophe, not any single one of them.&lt;/p&gt;

&lt;p&gt;I started tracking, for a week, what I actually did with the first hour of my day — not what I planned. Most days, the first sixty minutes were lost to scrolling, to messages, to "let me just check one thing." By the time I got to the work that mattered, it was already past nine and my brain was tired.&lt;/p&gt;

&lt;p&gt;The fix was not complicated. It was also not easy. I moved the phone to a different room. I started the morning with the things I actually believe in, not the things the algorithm wanted me to see. The first hour stopped being lost. It became the most valuable hour of the day, and the rest of the day reorganized itself around it.&lt;/p&gt;

&lt;p&gt;This is not a productivity hack. It is the same pattern that comes up in &lt;a href="https://dev.to/one-markdown-file-made-my-ai-agent-23-points-smarter/"&gt;one markdown file that made my AI agent 23 points smarter&lt;/a&gt; — the smallest unit of attention, repeated daily, compounds. Choosing, with intention, between the thing that feels good in the moment and the thing that builds something that lasts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tombo ati and the things we forget
&lt;/h2&gt;

&lt;p&gt;There is a Javanese song — &lt;em&gt;Tombo Ati&lt;/em&gt; — that Muslims in Indonesia have been singing for a long time. The full title is &lt;em&gt;Tombo Ati Sekawan Ewu Dinten&lt;/em&gt;, roughly "medicine for the heart, four thousand days." It is a list of remedies for a tired soul. The remedies are not what you would expect from a self-help book.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Jangan lupa sholat, jangan lupa baca Quran, jangan lupa sholat malam, berkumpullan dengan orang-orang sholeh, perbanyaklah berpuasa, dan zikir malam perpanjanglah, semoga Gusti Allah mencukupi, semoga sisa umur kita diridhai.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Don't forget to pray. Don't forget to read the Quran. Don't forget the night prayer. Gather with righteous people. Fast often. Extend the night remembrance. May God be enough for you. May the rest of our lives be blessed.&lt;/p&gt;

&lt;p&gt;That is the entire prescription. No morning routine optimized to the minute. No cold showers. No journal prompts. Just six things, most of them ancient, all of them free, all of them harder than they sound.&lt;/p&gt;

&lt;p&gt;I have been trying to take the song literally, and the resistance is real. The night prayer is the hardest. Gathering with good people requires admitting that I do not already know everything. Fasting regularly means telling my body no when every other voice in my culture is telling it yes. Each one of these is a small war with the version of me that wants to be comfortable.&lt;/p&gt;

&lt;p&gt;But the song is not naive. It is not promising that life will be smooth. It is promising that &lt;em&gt;God will be enough&lt;/em&gt; — not success, but sufficiency. The metric is "you will not run out of what you actually need," not abundance. That is a more interesting promise, and the only one I can defend after a hard week.&lt;/p&gt;




&lt;h2&gt;
  
  
  Better than yesterday, that is the target
&lt;/h2&gt;

&lt;p&gt;There is a phrase that has become a kind of motto for me lately. I did not invent it. I am not sure who did. It is this: the goal is to be better today than I was yesterday, and better tomorrow than I was today.&lt;/p&gt;

&lt;p&gt;No yearly revenue target. No follower count. No benchmark of being "successful" by anyone else's standard. Just a daily comparison against my own previous self.&lt;/p&gt;

&lt;p&gt;It sounds soft when I write it down. In practice it is brutal. "Better" is not a vibe — it is specific. Better at what? Better how? Measurable against what? If I cannot define what better looks like for today, I will not know if I hit it, and the day will blur into the next, and the next, and at the end of the year I will look up and realize I have spent three hundred and sixty-five days being the exact same person.&lt;/p&gt;

&lt;p&gt;I have started writing one sentence at the end of each day. Not a journal. Just "Today I did X. Tomorrow I want to do Y." Some days the sentence is embarrassing. Some days I am proud of it. Either way, the day is captured, the hour is accounted for, and the verse in Surah Al-Ashr is no longer a warning I nod at — it is a daily check.&lt;/p&gt;




&lt;h2&gt;
  
  
  The accountability we do not talk about
&lt;/h2&gt;

&lt;p&gt;The last part of the verse is the one most of us would rather skip. We are going to be held accountable — not in a vague karmic sense, but in a real, personal, no-deflecting way.&lt;/p&gt;

&lt;p&gt;When I was younger, I thought accountability was about the big decisions — the career, the marriage, the move. Now I think it is about the small ones. The phone I picked up instead of the book. The meeting I scheduled during the time I had promised to pray. The conversation I avoided because I did not want to be uncomfortable. Those are the ones that add up.&lt;/p&gt;

&lt;p&gt;I am writing it at the end of a week where I did some of the things right and a lot of them wrong. I am writing it because the verse keeps coming back, and the song keeps playing in my head, and I am tired of pretending that scrolling is the same as living.&lt;/p&gt;

&lt;p&gt;If any of this lands for you — if you are also feeling like twenty-four hours is not enough, and also feeling like you are spending them in ways you cannot quite defend — the next hour is a place to start. Not the next year. Not the next Monday. The next hour.&lt;/p&gt;

&lt;p&gt;The verse has been saying this for fourteen hundred years. Time I started listening.&lt;/p&gt;

</description>
      <category>reflection</category>
      <category>productivity</category>
      <category>faith</category>
      <category>writing</category>
    </item>
    <item>
      <title>Recruitment App With AI: A Design Thinking Case Study</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Wed, 17 Jun 2026 02:56:36 +0000</pubDate>
      <link>https://dev.to/susiloharjo/recruitment-app-with-ai-a-design-thinking-case-study-3p37</link>
      <guid>https://dev.to/susiloharjo/recruitment-app-with-ai-a-design-thinking-case-study-3p37</guid>
      <description>&lt;p&gt;Last month I built a recruitment portal from scratch — request form, approval flow, candidate filtering with AI, the whole nine yards. Before I wrote a single line of code, I sat through fifteen hours of interviews with HR managers, hiring managers, and candidates who had just been rejected. That is the part most articles about building products skip.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Last month I built a recruitment portal from scratch — request form, approval flow, candidate filtering with AI, the whole nine yards.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;They jump straight to the whiteboard sketch or the workshop exercise.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The portal handles request forms, multi-level approval, job posting, candidate registration, AI-assisted filtering, interview scheduling, psychological tests, salary offers, MCU (medical check-up), and onboarding logistics.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;Recruitment App With AI: A Design Thinking Case Study is a signal worth watching in 2026. If you're building or securing infrastructure, keep an eye on this trend.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the full analysis on &lt;a href="https://susiloharjo.web.id/recruitment-app-design-thinking-case-study/" rel="noopener noreferrer"&gt;Susiloharjo&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>designthinking</category>
      <category>product</category>
      <category>ai</category>
      <category>recruitment</category>
    </item>
    <item>
      <title>The Best Feature I Ever Shipped Was a One-Page Procedure</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Tue, 16 Jun 2026 01:04:09 +0000</pubDate>
      <link>https://dev.to/susiloharjo/the-best-feature-i-ever-shipped-was-a-one-page-procedure-4e9n</link>
      <guid>https://dev.to/susiloharjo/the-best-feature-i-ever-shipped-was-a-one-page-procedure-4e9n</guid>
      <description>&lt;p&gt;Last year a client asked for an AI agent to automate their customer complaint triage. Forty hours of scoping done, two weeks of build time blocked. I was three days from opening the IDE when I sat next to the customer service team for two hours and watched.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Last year a client asked for an AI agent to automate their customer complaint triage.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I expected to see overwhelmed agents drowning in tickets.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;What I saw was three CS staff handling 12 complaints a day each, perfectly fine, with one exception — they refused to escalate anything to the operations team.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;The Best Feature I Ever Shipped Was a One-Page Procedure is a signal worth watching in 2026. If you're building or securing infrastructure, keep an eye on this trend.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the full analysis on &lt;a href="https://susiloharjo.web.id/the-best-feature-i-ever-shipped-was-a-one-page-procedure/" rel="noopener noreferrer"&gt;Susiloharjo&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>design</category>
      <category>product</category>
      <category>process</category>
      <category>leadership</category>
    </item>
    <item>
      <title>One Markdown File Made My AI Agent 23 Points Smarter</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Tue, 16 Jun 2026 01:04:05 +0000</pubDate>
      <link>https://dev.to/susiloharjo/one-markdown-file-made-my-ai-agent-23-points-smarter-p8o</link>
      <guid>https://dev.to/susiloharjo/one-markdown-file-made-my-ai-agent-23-points-smarter-p8o</guid>
      <description>&lt;p&gt;Last week I read a paper that made me re-evaluate everything I have written about AI agent optimization. Microsoft and three Chinese universities published a method called SkillOpt . The result: a single Markdown file, between 300 and 2,000 tokens, lifted GPT-5.5 by an average of 23 points across six procedural benchmarks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Last week I read a paper that made me re-evaluate everything I have written about AI agent optimization.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Just a Markdown file that gets fed to the agent as context at inference time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The skill beats handwritten instructions, one-shot LLM-generated instructions, and four specialized training methods (Trace2Skill, TextGrad, GEPA, EvoSkill).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Bottom Line
&lt;/h2&gt;

&lt;p&gt;One Markdown File Made My AI Agent 23 Points Smarter is a signal worth watching in 2026. If you're building or securing infrastructure, keep an eye on this trend.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the full analysis on &lt;a href="https://susiloharjo.web.id/one-markdown-file-made-my-ai-agent-23-points-smarter/" rel="noopener noreferrer"&gt;Susiloharjo&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>process</category>
      <category>claudecode</category>
      <category>skill</category>
    </item>
    <item>
      <title>I Tried GROW Coaching in My 1:1s. It Cut Them in Half.</title>
      <dc:creator>Susilo harjo</dc:creator>
      <pubDate>Mon, 15 Jun 2026 01:05:13 +0000</pubDate>
      <link>https://dev.to/susiloharjo/i-tried-grow-coaching-in-my-11s-it-cut-them-in-half-5efb</link>
      <guid>https://dev.to/susiloharjo/i-tried-grow-coaching-in-my-11s-it-cut-them-in-half-5efb</guid>
      <description>&lt;p&gt;GROW Coaching Cut My 1:1s in Half&lt;/p&gt;

&lt;p&gt;I ran a 1:1 that lasted 47 minutes. The engineer walked out with a "plan" that fell apart in 2 days. I had spent my week solving her problem instead of coaching her through it.&lt;/p&gt;

&lt;p&gt;The fix was a 30-year-old coaching framework called GROW: Goal, Reality, Options, Way Forward.&lt;/p&gt;

&lt;p&gt;Four weeks in, my 1:1s dropped from 45 minutes to 15. The team unblocks themselves. I have 8 hours a week back.&lt;/p&gt;

&lt;p&gt;If you want to test this, run four 1:1s in the format before judging it. The first will feel awkward. The fourth will be the moment you stop wanting to go back.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read the full breakdown on &lt;a href="https://susiloharjo.web.id/grow-coaching-cycle-1on1-leadership/" rel="noopener noreferrer"&gt;Susiloharjo&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>leadership</category>
      <category>coaching</category>
      <category>productivity</category>
      <category>career</category>
    </item>
  </channel>
</rss>
