<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sergey Shkuratov</title>
    <description>The latest articles on DEV Community by Sergey Shkuratov (@s_a_shkuratov).</description>
    <link>https://dev.to/s_a_shkuratov</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3959622%2F42a6ac97-4a8f-494b-9d4a-1f49c95f54e6.jpg</url>
      <title>DEV Community: Sergey Shkuratov</title>
      <link>https://dev.to/s_a_shkuratov</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/s_a_shkuratov"/>
    <language>en</language>
    <item>
      <title>Working with AI Means Thinking More, Not Less</title>
      <dc:creator>Sergey Shkuratov</dc:creator>
      <pubDate>Sat, 20 Jun 2026 09:33:09 +0000</pubDate>
      <link>https://dev.to/s_a_shkuratov/working-with-ai-means-thinking-more-not-less-1295</link>
      <guid>https://dev.to/s_a_shkuratov/working-with-ai-means-thinking-more-not-less-1295</guid>
      <description>&lt;p&gt;Yes, this text is long. Yes, it repeats itself in places. I did not clean that up. A text that sounded too smooth while arguing that AI forces you to think more, not less, would be at least slightly dishonest. This is not fast food for quick consumption. And yes, don’t worry: you won’t hear anything especially new here. That is part of the problem too.&lt;/p&gt;

&lt;p&gt;There is a popular and very seductive story about AI in software development. Now that the machine can write code, the human gets to think less. You just point it in the right direction, and the model will quickly and cheaply do a significant part of the work on its own. In that picture, AI is primarily an accelerator for code production, and human thinking gradually shifts from necessity to optional extra.&lt;/p&gt;

&lt;p&gt;I keep feeling more and more strongly that this description is dangerously wrong.&lt;/p&gt;

&lt;p&gt;A more accurate formula for my own experience right now is this: &lt;strong&gt;I’m the tech lead, the AI is the entire team in one body&lt;/strong&gt;. And if you take that metaphor seriously, the conclusion is the exact opposite of the mainstream narrative. Working with AI is not a way to think less. It is a mode in which &lt;strong&gt;you need to think more, not less&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not because the AI is bad. But because it is too good at one very treacherous thing: it confidently and smoothly fills in what was left unsaid.&lt;/p&gt;

&lt;h2&gt;
  
  
  I’m the tech lead, the AI is the team
&lt;/h2&gt;

&lt;p&gt;At first this metaphor felt like a neat formulation. Now it feels like a literal description of what is going on.&lt;/p&gt;

&lt;p&gt;If you treat AI as a very fast and very capable executor, a lot of things become clearer immediately. It really can wipe out months of routine work. It can spin up prototypes quickly, take over test scaffolding, try out alternatives, make local edits, help break a task into parts, and sometimes even suggest a decent direction.&lt;/p&gt;

&lt;p&gt;On the surface, this really does look like a silver bullet, especially if the human knows the stack and can read code. The pace becomes so extreme that old assumptions about development speed can be thrown into the dustbin of history.&lt;/p&gt;

&lt;p&gt;But that is also exactly where the most dangerous substitution begins.&lt;/p&gt;

&lt;p&gt;Once you have an executor this strong, the temptation is to reduce your role to something like this: state the overall goal, wave your hand vaguely in the direction of the task, and then mostly stay out of the way. The system is smart, surely it will figure it out. And this is where the tech lead metaphor becomes genuinely useful: a good tech lead does not stop thinking just because the team is strong. On the contrary, the stronger the team, the more expensive mistakes in framing, boundaries, and verification become.&lt;/p&gt;

&lt;p&gt;A strong tech lead does not lose their work. The work is still there. It’s just not where people think it is. They do not have to personally write every line, but they do have to hold onto:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the larger goal for the near future;&lt;/li&gt;
&lt;li&gt;the boundaries of the change;&lt;/li&gt;
&lt;li&gt;the signs that a task is actually complete;&lt;/li&gt;
&lt;li&gt;an understanding of what must not be broken on the way there;&lt;/li&gt;
&lt;li&gt;and a way to verify that the team did not produce something externally polished but systemically dangerous.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you map this onto working with AI, it turns out that the core human responsibilities have not gone anywhere. If anything, they have become stricter.&lt;/p&gt;

&lt;h2&gt;
  
  
  The main risk is not bad code, but loss of ownership of intent
&lt;/h2&gt;

&lt;p&gt;When people talk about problems with AI in programming, they usually discuss fairly simple things: hallucinations, nonexistent functions, weird syntax, weak tests, generic code, unsafe fragments. All of that happens. But that is not the most unpleasant part.&lt;/p&gt;

&lt;p&gt;The real trouble starts when the code looks fine.&lt;/p&gt;

&lt;p&gt;It is clean. It is tidy. It passes tests. It has sensible variable names. It does exactly what was requested. If you look at it as a local artifact, it may look more than convincing.&lt;/p&gt;

&lt;p&gt;That is exactly why the danger here runs deeper than just bad code.&lt;/p&gt;

&lt;p&gt;The problem is that when working with AI, it becomes very easy to lose &lt;strong&gt;ownership of intent&lt;/strong&gt;. That means losing the actual link between:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what we are actually trying to achieve;&lt;/li&gt;
&lt;li&gt;why the system is designed this way rather than another;&lt;/li&gt;
&lt;li&gt;what constraints and invariants exist here;&lt;/li&gt;
&lt;li&gt;and how we distinguish a real solution that works in real life from a plausible imitation of a solution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once that ownership is lost, a very unpleasant state appears: &lt;strong&gt;“it works, but I don’t know why.”&lt;/strong&gt; And right behind it comes another one: &lt;strong&gt;“and I don’t know what will break it.”&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is especially treacherous because the failure does not happen at the moment of generation. At the moment of generation, everything may look excellent. The problem surfaces later — during the next change, at an edge case, on a repeated call, in a partial failure, when several locally reasonable decisions collide and together create systemic fragility.&lt;/p&gt;

&lt;p&gt;So the main trap here is not that the AI wrote nonsense. The main trap is that the human stopped being the owner of their own system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why AI forces you to think more
&lt;/h2&gt;

&lt;p&gt;This sounds paradoxical only at first glance. In reality it is fairly simple.&lt;/p&gt;

&lt;p&gt;The stronger the executor, the more dangerous an unclear framing becomes. The faster the work gets done, the faster mistakes in intent materialize. The better the system gets at filling gaps, the more dangerous every unstated assumption becomes.&lt;/p&gt;

&lt;p&gt;Earlier, a good developer would ask a follow-up question, or at least hold off on implementation. Now the model, which in some sense carries the experience of the whole world, fills in the blanks on its own. As models improve, what they fill in becomes more plausible, but not necessarily a better fit for this specific context. And the model does all of that silently.&lt;/p&gt;

&lt;p&gt;So AI does not lower the demands on thinking. It makes thinking more mandatory and more disciplined.&lt;/p&gt;

&lt;p&gt;Working with AI leaves you with no real choice but to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;write the goal down instead of holding it as a vague feeling;&lt;/li&gt;
&lt;li&gt;separate the larger goal from the local request;&lt;/li&gt;
&lt;li&gt;know in advance what counts as done;&lt;/li&gt;
&lt;li&gt;define a contract for each step: inputs, outputs, errors, edge cases;&lt;/li&gt;
&lt;li&gt;not accept the proposed decomposition automatically;&lt;/li&gt;
&lt;li&gt;not accept code based on external impression alone;&lt;/li&gt;
&lt;li&gt;not stop at the happy path;&lt;/li&gt;
&lt;li&gt;read diffs and run checks;&lt;/li&gt;
&lt;li&gt;and keep in mind what stands above the code: its behavior in the real world.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So with AI, it is not only the speed of code production that changes. The very point where human intelligence gets applied changes. It used to be enough to be the person who writes well by hand. Now you increasingly have to take pride in not losing the foundations of the task when the speed gets too high.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why “it seems to work” is a trap
&lt;/h2&gt;

&lt;p&gt;One of the nastiest effects of working with AI is that it very easily produces solutions that look complete from the outside.&lt;/p&gt;

&lt;p&gt;The feature exists. The behavior exists. The types are in place. There are some tests too. So of course it starts to feel done.&lt;/p&gt;

&lt;p&gt;But that feeling can be false in the most dangerous sense. Because external functionality and engineering reliability are not the same thing.&lt;/p&gt;

&lt;p&gt;You can get code that executes the stated scenario and still:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;violate existing project conventions;&lt;/li&gt;
&lt;li&gt;create unnecessary complexity;&lt;/li&gt;
&lt;li&gt;bypass an existing component instead of reusing it;&lt;/li&gt;
&lt;li&gt;introduce a fragile assumption;&lt;/li&gt;
&lt;li&gt;miss an important failure mode;&lt;/li&gt;
&lt;li&gt;fail to preserve a domain invariant;&lt;/li&gt;
&lt;li&gt;and in doing so buy today’s speed at the cost of future maintenance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is especially unpleasant because without review and without a full run of the checks, you suddenly end up almost in the role of a client for whom someone quickly assembled a “sort of working” product. From the outside, it’s alive. Inside, you don’t trust it.&lt;/p&gt;

&lt;p&gt;Then three scenarios remain, and all of them are bad. Either you live with anxiety. Or you postpone the analysis until the first incident. Or you start digging immediately, but only after speed has already produced the illusion of completion.&lt;/p&gt;

&lt;p&gt;That is why AI, for me, has turned from permission to think less into a demand to think more.&lt;/p&gt;

&lt;h2&gt;
  
  
  A local prompt does not carry the whole meaning of the project
&lt;/h2&gt;

&lt;p&gt;There is another reason why AI-assisted development forces more thinking. A local request almost never carries all the context that is actually needed for a good solution.&lt;/p&gt;

&lt;p&gt;What usually enters the system is not a full model of the project and not a careful list of invariants. What arrives is a local request: add a field, allow an action, change a state, insert a button, fix a flow, support a new behavior. Everything else has to be reconstructed.&lt;/p&gt;

&lt;p&gt;Before, a human would at least notice that something essential was missing. They would slow down, clarify, remember past decisions, ask colleagues, dig into documentation, go read code. AI, by contrast, is very good at taking a narrow slice of framing and quickly turning it into a locally convincing solution.&lt;/p&gt;

&lt;p&gt;That is the risk. Not that the model cannot do anything, but that it can continue too smoothly in places where the human should have stopped and asked: “Wait, on what basis do we think this is correct at all?”&lt;/p&gt;

&lt;p&gt;That leads to an important point: a ticket, a prompt, or a feature request is usually not a specification. It is only a trigger. Pretending it contains the project is exactly how drift begins.&lt;/p&gt;

&lt;p&gt;Which means that if the human gives the model nothing beyond the local request except maybe hope, then the model has to reconstruct everything else from indirect clues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;boundaries;&lt;/li&gt;
&lt;li&gt;domain agreements;&lt;/li&gt;
&lt;li&gt;sources of truth;&lt;/li&gt;
&lt;li&gt;prohibitions;&lt;/li&gt;
&lt;li&gt;the rationale behind previous decisions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And once that happens, AI starts rebuilding all of this from hints. Sometimes successfully. Sometimes not. But almost always with a risk of drift.&lt;/p&gt;

&lt;h2&gt;
  
  
  The tech lead holds not only the goal, but anti-drift discipline
&lt;/h2&gt;

&lt;p&gt;If we go back to the tech lead metaphor, it becomes clear that the role in AI-assisted development is even broader than just assigning tasks.&lt;/p&gt;

&lt;p&gt;The tech lead is needed not only to say “this is what we are doing.” They are also needed so that the project does not quietly start rewriting its own foundations piece by piece.&lt;/p&gt;

&lt;p&gt;AI is very good at helping with local execution. But precisely because of that, someone must hold onto the things that cannot be delegated wholesale:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;which rules count as normal in this project;&lt;/li&gt;
&lt;li&gt;which constraints must not be bypassed silently;&lt;/li&gt;
&lt;li&gt;which decisions have already been made and why;&lt;/li&gt;
&lt;li&gt;where the system’s real invariants live;&lt;/li&gt;
&lt;li&gt;which compromises are acceptable and which are not.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the human in the tech lead role becomes not just a source of tasks, but a carrier of &lt;strong&gt;anti-drift discipline&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is the discipline that keeps speed from turning into drift.&lt;/p&gt;

&lt;p&gt;It requires very boring things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;writing and rereading the goal;&lt;/li&gt;
&lt;li&gt;keeping steps manageable;&lt;/li&gt;
&lt;li&gt;recording important decisions in artifacts rather than leaving them in chat;&lt;/li&gt;
&lt;li&gt;reviewing not only the result but the line of reasoning;&lt;/li&gt;
&lt;li&gt;checking not only new tests but old invariants;&lt;/li&gt;
&lt;li&gt;asking not only “does it work?” but also “what must not happen here at all?” and “could something else quietly happen here too?”&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These boring things turn out to be some of the most expensive engineering work there is.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the tech lead actually does when working with AI
&lt;/h2&gt;

&lt;p&gt;If you try to reduce all of this to a very practical loop, the human in this model is left with at least the following responsibilities.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Hold the larger goal
&lt;/h3&gt;

&lt;p&gt;Not just the local prompt, but a longer line: what exactly are we trying to improve, what counts as success, and what matters most right now.&lt;/p&gt;

&lt;p&gt;Without that, AI easily starts optimizing local form instead of global meaning.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Break the work into isolated parts
&lt;/h3&gt;

&lt;p&gt;Not so large that you lose verifiability, and not so small that you drown in micromanagement.&lt;/p&gt;

&lt;p&gt;Good decomposition here is not bureaucracy. It is a way not to overload either yourself or the model.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Set boundaries
&lt;/h3&gt;

&lt;p&gt;What are we doing in this change, and what are we consciously not doing? Which parts of the system are in scope, and which are not? Where is a temporary solution acceptable, and where is it not?&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Define the signs of done
&lt;/h3&gt;

&lt;p&gt;Not in the sense of “well, it kind of works,” but in the sense of a verifiable contract: which inputs we support, which outputs we expect, which errors are acceptable, which edge cases must be preserved.&lt;/p&gt;
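&lt;p&gt;As a rough illustration of what such a verifiable contract might look like, here is a minimal sketch. The function &lt;code&gt;normalize_title&lt;/code&gt; and its rules are hypothetical and not taken from any real project; the point is only that inputs, outputs, the acceptable error, and one edge case are written down as checks that can be run.&lt;/p&gt;

```python
def normalize_title(raw: str) -> str:
    """Hypothetical step with an explicit contract.

    Input:  any string.
    Output: the title with outer whitespace removed and inner runs collapsed.
    Error:  ValueError if nothing is left after trimming.
    """
    collapsed = " ".join(raw.split())
    if not collapsed:
        raise ValueError("title must not be empty")
    return collapsed


def check_done() -> bool:
    """Executable signs of done: happy path, error case, edge case."""
    # Supported input and expected output.
    assert normalize_title("  Deploy   checklist ") == "Deploy checklist"

    # The one acceptable error: an empty title is rejected, not passed through.
    try:
        normalize_title("   ")
    except ValueError:
        pass
    else:
        raise AssertionError("empty titles must be rejected")

    # Edge case to preserve: normalizing twice changes nothing.
    once = normalize_title("a\tb\nc")
    assert normalize_title(once) == once
    return True
```

&lt;p&gt;Calling &lt;code&gt;check_done()&lt;/code&gt; either returns &lt;code&gt;True&lt;/code&gt; or fails loudly, which is the difference between a sign of done and a feeling of done.&lt;/p&gt;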

&lt;h3&gt;
  
  
  5. Read everything important
&lt;/h3&gt;

&lt;p&gt;You do not have to manually write everything yourself. But you do have to read everything important: diffs, new decisions, key tests, controversial spots.&lt;/p&gt;

&lt;h3&gt;
  
  
  6. Run the existing checks
&lt;/h3&gt;

&lt;p&gt;Do not stop at “the generated code passes its own tests.” All the checks that already exist matter, because those are what catch regressions against the old world.&lt;/p&gt;

&lt;h3&gt;
  
  
  7. Turn decisions into artifacts
&lt;/h3&gt;

&lt;p&gt;If an important decision lived only in your head or in a conversation, that is a bad decision from the perspective of long-term work with AI. Tomorrow’s agent, or tomorrow’s version of you, will start reconstructing it from scratch — and will most likely get it wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters beyond code
&lt;/h2&gt;

&lt;p&gt;This is bigger than code.&lt;/p&gt;

&lt;p&gt;The more AI can generate, the more valuable the ability becomes not to be a typing machine, but to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hold the task before the code exists;&lt;/li&gt;
&lt;li&gt;understand which context is mandatory;&lt;/li&gt;
&lt;li&gt;see long-tail consequences;&lt;/li&gt;
&lt;li&gt;distinguish what is locally correct from what is systemically dangerous;&lt;/li&gt;
&lt;li&gt;leave behind an artifact that helps the next human or the next agent avoid reinventing the foundations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is why the claim that “you need to think more, not less” is not some old craftsman whining about new tools. It is a literal description of the new work.&lt;/p&gt;

&lt;p&gt;AI removes part of the mechanics from the human. But everything tied to intent, boundaries, consequence-checking, and preserving meaning becomes more expensive, not less.&lt;/p&gt;

&lt;h2&gt;
  
  
  What changes in the feeling of the work itself
&lt;/h2&gt;

&lt;p&gt;The biggest shift here is not even procedural. It’s in your head.&lt;/p&gt;

&lt;p&gt;Before AI, you could maintain for a long time the image of a programmer as a person who mostly writes code and sort of thinks around that process. Now it increasingly feels like writing code is no longer the central part of the role. The central part of the role is holding the system of thinking around the code.&lt;/p&gt;

&lt;p&gt;So my work is less and less described as “I write the feature” and more and more as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;I hold the intent;&lt;/li&gt;
&lt;li&gt;I define the boundaries;&lt;/li&gt;
&lt;li&gt;I check whether understanding has been replaced by external plausibility;&lt;/li&gt;
&lt;li&gt;I make sure the project does not drift;&lt;/li&gt;
&lt;li&gt;I turn decisions into forms that will survive both tomorrow’s me and the next agent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In that sense, the tech lead metaphor is useful not only as a description of process. It is useful because it protects you from lying to yourself. It reminds you that once you have an extremely strong executor, the temptation to relax grows faster than the right to relax.&lt;/p&gt;

&lt;h2&gt;
  
  
  Thinking, you lead. Stop thinking, you get led.
&lt;/h2&gt;

&lt;p&gt;If I reduce all of this to a short version, my current conclusion is this.&lt;/p&gt;

&lt;p&gt;Working with AI is not a mode in which you finally get to stop thinking. It is a mode in which &lt;strong&gt;you need to think more, not less&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;More — because speed increases the price of ambiguity.&lt;/p&gt;

&lt;p&gt;More — because local framing rarely carries all the necessary context.&lt;/p&gt;

&lt;p&gt;More — because plausible code is more dangerous than obviously bad code.&lt;/p&gt;

&lt;p&gt;More — because someone still has to hold the goal, the boundaries, the invariants, and the method of verification.&lt;/p&gt;

&lt;p&gt;That is why the formula &lt;strong&gt;“I’m the tech lead, the AI is the whole team”&lt;/strong&gt; still feels like the most accurate one to me.&lt;/p&gt;

&lt;p&gt;It does not romanticize AI, and it does not let the human off the hook. It returns responsibility to where it belongs: the human who must not only start the work, but also understand what exactly is being started, why, and by what evidence the result will be shown to deserve trust rather than merely looking functional.&lt;/p&gt;

&lt;p&gt;The cruel irony is that the AI almost certainly already knows all these subtleties better than we do. If you ask it properly, it will tell you about project management, and code review, and contracts, and regressions, and all those old, good ways of not breaking a system through sheer stupidity. It will even suggest the right precautions.&lt;/p&gt;

&lt;p&gt;But we still have to ask.&lt;/p&gt;

&lt;p&gt;We still have to frame the question.&lt;/p&gt;

&lt;p&gt;We still have to notice that this is a place where a question is needed at all.&lt;/p&gt;

&lt;p&gt;And we still have to do the thinking.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>softwareengineering</category>
      <category>discuss</category>
      <category>productivity</category>
    </item>
    <item>
      <title>2026. Week 24: mobile as a test for backend honesty</title>
      <dc:creator>Sergey Shkuratov</dc:creator>
      <pubDate>Fri, 19 Jun 2026 08:57:50 +0000</pubDate>
      <link>https://dev.to/s_a_shkuratov/2026-week-24-mobile-as-a-test-for-backend-honesty-21nm</link>
      <guid>https://dev.to/s_a_shkuratov/2026-week-24-mobile-as-a-test-for-backend-honesty-21nm</guid>
      <description>&lt;p&gt;This week I wanted to “just start mobile development” for my checklist service: build a straightforward mobile client and cover the basic scenarios. The first questions looked practical: do I need a full editor on the phone, is the current API enough, and which stack and working setup should I choose?&lt;/p&gt;

&lt;p&gt;But very quickly it became clear that this was not really a story about one more client. Mobile became a useful spotlight: it showed the places where I had not fully described the system as a contract, and where it still depended on browser defaults.&lt;/p&gt;

&lt;h3&gt;
  
  
  1) Product framing: a phone is not a second “full editor”
&lt;/h3&gt;

&lt;p&gt;The request for a “full mobile editor” sounds logical until you break it down into real situations. In practice, it mixes two different user scenarios:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;edits on the go: quickly fix text, mark a step, replace one item, continue a checklist in the field;&lt;/li&gt;
&lt;li&gt;full editing: structural changes, bulk edits, careful work with formatting and visibility rules.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Once I separated these, the frame for the first version became much more realistic: the mobile client should first of all run checklists and support light edits, not fully copy the web editor.&lt;/p&gt;

&lt;p&gt;This matters not because of “laziness”, but because of the cost of the long tail. If you aim for full feature parity, you almost immediately pull in topics like offline sync, change conflicts, versioning, and complex UX for large edits. On a phone, these things are more expensive both to build and to use.&lt;/p&gt;

&lt;h3&gt;
  
  
  2) The API is “mostly enough”, but the bottleneck was not in the business methods
&lt;/h3&gt;

&lt;p&gt;Then came the good news: a quick audit showed that for the first online version of the mobile client, the business contract was already mostly there. Auth, templates, checklists, instances, invitations — all of this exists. There is OpenAPI and there are types, so I do not have to guess from loose documentation.&lt;/p&gt;

&lt;p&gt;At this point it would be easy to stop and say: “we can start building the app.”&lt;/p&gt;

&lt;p&gt;But the real knot was not there.&lt;/p&gt;

&lt;p&gt;The problem was that the current auth flow and session lifecycle in my project are tightly tied to the browser: cookies and expectations about how a client behaves inside a browser session.&lt;/p&gt;

&lt;p&gt;For the web this is natural and convenient. For mobile it is not “impossible”, but it is a boundary that quickly turns into architecture debt: a mobile client should not have to adapt itself to the browser habits of the system.&lt;/p&gt;

&lt;p&gt;So the paradox of the week sounds like this: the business API already looks mature, but the system as a whole is still not fully ready to honestly serve the same contract in different client environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  3) The main shift: the question is not “how do I adapt this for mobile?”, but “how do I describe session behavior independently from transport?”
&lt;/h3&gt;

&lt;p&gt;After several iterations, it became clear that mobile is not a demanding client that needs exceptions. It simply forces me to name what should already be defined:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what a session means in the system (in domain terms, not browser terms),&lt;/li&gt;
&lt;li&gt;how the client gets and refreshes the right to make requests,&lt;/li&gt;
&lt;li&gt;which errors are expected and what the client should do in response,&lt;/li&gt;
&lt;li&gt;how all of this should be tested.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As a result, the frame became stronger: not “two kinds of auth”, but one shared session model and two ways to deliver that model depending on the environment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cookies — natural for web;&lt;/li&gt;
&lt;li&gt;bearer token — natural for a native mobile client.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is neither cosmetic nor merely “preparation for the future”. It is a way to bring the contract to a state where it does not depend on implicit assumptions about transport.&lt;/p&gt;
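&lt;p&gt;One way to picture “one session model, two deliveries” is a small extraction step that hides the transport from the rest of the backend. This is only a sketch under assumptions: the cookie name &lt;code&gt;session&lt;/code&gt;, the &lt;code&gt;SessionCredential&lt;/code&gt; type, and the choice to check the bearer header first are all invented for illustration, and the headers are assumed to arrive as a plain mapping with an &lt;code&gt;Authorization&lt;/code&gt; key.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Mapping, Optional

SESSION_COOKIE = "session"  # hypothetical cookie name


@dataclass(frozen=True)
class SessionCredential:
    """Transport-independent view of what the client presented."""
    token: str
    transport: str  # "cookie" or "bearer"; kept only for logging and tests


def extract_credential(
    headers: Mapping[str, str], cookies: Mapping[str, str]
) -> Optional[SessionCredential]:
    """Return the session credential from either transport, or None."""
    # Native clients: "Authorization: Bearer TOKEN".
    scheme, _, value = headers.get("Authorization", "").partition(" ")
    if scheme.lower() == "bearer" and value.strip():
        return SessionCredential(token=value.strip(), transport="bearer")

    # Browsers: the same kind of token, delivered as a cookie.
    cookie_value = cookies.get(SESSION_COOKIE, "")
    if cookie_value:
        return SessionCredential(token=cookie_value, transport="cookie")

    return None
```

&lt;p&gt;Everything after this step (validation, refresh, expiry) could then work with &lt;code&gt;SessionCredential&lt;/code&gt; alone and never ask which kind of client sent it.&lt;/p&gt;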

&lt;p&gt;And this is where something unpleasant but useful came up: before that, I had not separated the layers clearly enough. I had “it works on the web” in my head, and that was enough. But as soon as a second class of client appears, it becomes obvious that some decisions are not really decisions at all, but fog.&lt;/p&gt;

&lt;p&gt;To clear that fog, I had to lift and define several system-level artifacts independently of the auth transport:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a description of login and session refresh flows (what the client does, what the server does);&lt;/li&gt;
&lt;li&gt;behavior rules for common failures (network error, expired session, invalid token, repeated request);&lt;/li&gt;
&lt;li&gt;a test matrix: which combinations of environments, clients, and scenarios must be tested for this to count as “real readiness”, not just a feeling of readiness.&lt;/li&gt;
&lt;/ul&gt;
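&lt;p&gt;The last two artifacts can be kept as plain data, so that both tests and client code can read them. The sketch below is deliberately simplified: the client reactions are placeholders I made up, not the actual rules of the service, and the matrix covers only clients and scenarios, leaving environments out.&lt;/p&gt;

```python
from enum import Enum
from itertools import product


class Failure(Enum):
    NETWORK_ERROR = "network_error"
    EXPIRED_SESSION = "expired_session"
    INVALID_TOKEN = "invalid_token"
    REPEATED_REQUEST = "repeated_request"


# Hypothetical client reactions; the real rules belong in the contract itself.
CLIENT_ACTION = {
    Failure.NETWORK_ERROR: "retry_with_backoff",
    Failure.EXPIRED_SESSION: "refresh_then_retry_once",
    Failure.INVALID_TOKEN: "drop_credentials_and_login",
    Failure.REPEATED_REQUEST: "treat_as_already_applied",
}

CLIENTS = ("web_cookie", "mobile_bearer")
SCENARIOS = ("login", "refresh") + tuple(f.value for f in Failure)


def build_test_matrix() -> list[tuple[str, str]]:
    """Pair every client with every scenario that has to be covered."""
    return list(product(CLIENTS, SCENARIOS))
```

&lt;p&gt;A test suite could iterate over &lt;code&gt;build_test_matrix()&lt;/code&gt; and fail on any pair that has no test, which turns “real readiness” into something countable.&lt;/p&gt;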

&lt;h3&gt;
  
  
  4) An episode about the editor and the boundary of “where an agent is useful”
&lt;/h3&gt;

&lt;p&gt;Against that background, there was also a revealing local episode inside the web checklist editor. For a long time I could not pin down a bug: the menu misbehaved when there was not enough space below it to open normally. The problem was clearly somewhere in low-level JS/DOM/layout mechanics, a layer I know less well than the product and architecture side.&lt;/p&gt;

&lt;p&gt;At some point I stopped pretending I should do “manual debugging until victory” and asked an agent to bring up Playwright and run the loop on its own: reproduce, measure, and test hypotheses.&lt;/p&gt;

&lt;p&gt;This turned out to be useful not only for the specific bug, but also as a process calibration. If the task is mostly instrumental work — reproduction, changing conditions, collecting facts — it makes sense to give the machine the role of executor in the research loop. The human still holds the frame: formulates what exactly we are looking for and prevents the process from drifting into random fixes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Weekly result
&lt;/h3&gt;

&lt;p&gt;The week started as an attempt to get into mobile development, and ended with a more important result: I saw the foundations of the system more clearly.&lt;/p&gt;

&lt;p&gt;First, I narrowed the product frame of the mobile client to something viable: running checklists plus light edits, without pretending it should do everything the web version does. Then it became clear that business API readiness is only half of the story. Real readiness starts where auth and session behavior are described as a contract that lives honestly both in the browser and in a native app.&lt;/p&gt;

&lt;p&gt;And maybe the main conclusion is this: new client environments are valuable not because they require a different UI or stack. They are valuable because they bring implicit assumptions into the light — especially around security, session behavior, and invariants, where “it works on the web” is not enough.&lt;/p&gt;

</description>
      <category>weeklyretro</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Not Every Lint Warning Is Cosmetic</title>
      <dc:creator>Sergey Shkuratov</dc:creator>
      <pubDate>Sun, 14 Jun 2026 08:32:54 +0000</pubDate>
      <link>https://dev.to/s_a_shkuratov/not-every-lint-warning-is-cosmetic-11f8</link>
      <guid>https://dev.to/s_a_shkuratov/not-every-lint-warning-is-cosmetic-11f8</guid>
      <description>&lt;p&gt;How old tools improve the work of new (non)humans.&lt;/p&gt;

&lt;p&gt;I noticed this pattern while working through a series of backend cleanup tasks driven by &lt;code&gt;pylint&lt;/code&gt;, &lt;code&gt;flake8&lt;/code&gt;, and &lt;code&gt;mypy&lt;/code&gt;: some warnings that I wanted to dismiss as housekeeping kept turning out to be the shortest path to hidden contracts in the code.&lt;/p&gt;

&lt;p&gt;As a rule, they looked like small cleanup tasks: naming style, function signatures, module size. But once I started fixing them, it became clear that the problem was not cosmetic. The check was simply the first thing pointing at a place where an important contract in the code was still resting on an unspoken assumption.&lt;/p&gt;

&lt;p&gt;For Python backend development, this is especially noticeable in places where the code already looks plausible and locally reasonable — including cases where the draft was assembled quickly with the help of an LLM. In those places, a warning is sometimes useful not as a demand to “make it cleaner”, but as an early signal: this is where it is worth checking boundaries, invariants, or the shape of the contract.&lt;/p&gt;

&lt;p&gt;Below are four short cases where cleanup turned out to be not quite cleanup.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Enum cleanup that exposed a database contract
&lt;/h2&gt;

&lt;p&gt;At first this looked almost cosmetic: I was simply normalizing naming style.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;eq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;lt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;After the rename:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Enum&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;EQ&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;LT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



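&lt;p&gt;But that was not the end of it. It turned out that the real contract lives not only in the Python enum, but also in which values SQLAlchemy reads from and writes to the database. By default, SQLAlchemy’s &lt;code&gt;Enum&lt;/code&gt; type persists member names rather than member values, so renaming &lt;code&gt;eq&lt;/code&gt; to &lt;code&gt;EQ&lt;/code&gt; also changed what the ORM expected to find in the column. So the real fix ended up looking like this:&lt;br&gt;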
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;operator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;MappedColumn&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;mapped_column&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nc"&gt;SQLEnum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;operator_enum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;values_callable&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;_enum_lower_names&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the warning was not really about enum style. It was about persistence semantics. From the outside it looked like cleanup, but in practice it forced me to make the contract between the Python enum and stored values explicit.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. An extra parameter that turned out to be an ownership check
&lt;/h2&gt;

&lt;p&gt;The next signal looked even more mundane: an extra parameter and a messy signature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delete_condition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AsyncSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;condition_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ChecklistItem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When I dug into it, it turned out the warning was not really about the shape of the function. It pointed at an under-specified domain contract. After the fix, the function explicitly required the &lt;code&gt;item_id&lt;/code&gt; and &lt;code&gt;checklist_id&lt;/code&gt; pair:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;delete_condition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AsyncSession&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;checklist_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;condition_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;get_draft_item&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;item_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;checklist_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;checklist_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What mattered here was not the signal about the signature itself, but the fact that it led to an ownership check. After the fix, what became explicit was not the “neatness” of the function, but the dependency between &lt;code&gt;item_id&lt;/code&gt; and &lt;code&gt;checklist_id&lt;/code&gt;.&lt;/p&gt;
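
&lt;p&gt;The body of &lt;code&gt;get_draft_item&lt;/code&gt; is not shown above. The sketch below illustrates the kind of ownership check such a helper could perform, using an in-memory dictionary instead of a database session. The &lt;code&gt;DraftItem&lt;/code&gt; and &lt;code&gt;ItemNotFound&lt;/code&gt; names, the synchronous signature, and the choice to report a foreign item as “not found” are all assumptions made for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass
from uuid import UUID, uuid4


class ItemNotFound(LookupError):
    """Raised when the item does not exist under the given checklist."""


@dataclass(frozen=True)
class DraftItem:
    id: UUID
    checklist_id: UUID


def get_draft_item(items: dict[UUID, DraftItem], item_id: UUID, checklist_id: UUID) -> DraftItem:
    """Return the item only if it belongs to the given checklist."""
    item = items.get(item_id)
    # A missing item and an item owned by another checklist fail the same way,
    # so the caller cannot tell which IDs exist elsewhere.
    if item is None or item.checklist_id != checklist_id:
        raise ItemNotFound(f"item {item_id} not found in checklist {checklist_id}")
    return item


checklist_a, checklist_b = uuid4(), uuid4()
item = DraftItem(id=uuid4(), checklist_id=checklist_a)
items = {item.id: item}
print(get_draft_item(items, item.id, checklist_a) is item)
```

&lt;p&gt;Asking for the same &lt;code&gt;item.id&lt;/code&gt; under &lt;code&gt;checklist_b&lt;/code&gt; raises &lt;code&gt;ItemNotFound&lt;/code&gt;, which is exactly the dependency the original signature left implicit.&lt;/p&gt;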

&lt;h2&gt;
  
  
  3. A module warning that turned out to be about boundaries
&lt;/h2&gt;

&lt;p&gt;The third case was about structure. A warning about module size and shape would have been easy to dismiss as a linting nitpick. But the monolithic &lt;code&gt;schemas.py&lt;/code&gt; had in fact stopped being maintainable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/schemas.py
&lt;/span&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;HTTPErrorDetail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AuditEvent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ChecklistItemCreate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;OrganisationCreate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TemplateListItem&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;...and so on for hundreds of lines.&lt;/p&gt;

&lt;p&gt;Instead of adding a suppression comment, I turned the module into a proper package:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# app/schemas/__init__.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.audit&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AuditEvent&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.checklists&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Checklist&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ChecklistItem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ItemCondition&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.enums&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Operator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PropName&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.orgs&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Organisation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Invite&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Grant&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;.templates&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TemplateListItem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TemplatePublic&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So the problem was not style as such. It was that the codebase was already asking for clearer boundaries. In this case, the file-size warning was not noise, but an early sign that the module needed to be split along responsibilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Docstrings that turned out to hold a local contract
&lt;/h2&gt;

&lt;p&gt;Another case initially looked purely documentation-related. &lt;code&gt;pylint&lt;/code&gt; and &lt;code&gt;flake8&lt;/code&gt; required proper docstrings for functions, methods, and classes, and from the outside this is easy to read as hygiene for the sake of hygiene.&lt;/p&gt;

&lt;p&gt;But in several places the fix worked differently. For example, in &lt;code&gt;checklist_router.py&lt;/code&gt; the docstring stopped being a generic phrase about a “router for Checklist entity” and turned into a short description of the real lifecycle contract: that published versions are immutable, that draft creation is explicit, that editor mutations operate only on the draft layer, and which denial/error semantics are considered normal here.&lt;/p&gt;

&lt;p&gt;A similar thing happened with middleware. There the docstring stopped paraphrasing the method and started recording important boundary conditions: when &lt;code&gt;Session-Log-ID&lt;/code&gt; is required, why the middleware returns &lt;code&gt;400&lt;/code&gt;, where exactly the request-scoped DB session lives, and why this is ASGI middleware rather than &lt;code&gt;BaseHTTPMiddleware&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The framing from &lt;a href="https://dev.to/mickyarun/i-replaced-scrum-jira-and-our-wiki-with-12-ai-agents-on-a-mac-mini-o7o"&gt;Arun Rajkumar’s post about agent-driven workflow&lt;/a&gt; is also useful here, because in that framing code and related artifacts act as a source of working context. Read that way, a meaningful docstring is not decorative documentation, but another explicit form of a local contract, useful not only to a human, but also to an AI agent: it no longer has to reconstruct those constraints from the implementation every single time.&lt;/p&gt;

&lt;p&gt;Of course, not every docstring has that effect. If a checker only pushes on form (imperative mood, blank lines, section headers), that is more a matter of formatting discipline. That still has value as a source of consistency, but it is a different kind of value. Where a docstring pins down preconditions, boundaries, error semantics, or lifecycle assumptions, however, a documentation warning stops being pure cosmetics.&lt;/p&gt;
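
&lt;p&gt;To make the difference concrete, here is a hypothetical docstring of the contract-carrying kind. The function name, its signature, and the use of &lt;code&gt;PermissionError&lt;/code&gt; are invented for illustration; only the lifecycle rules themselves come from the router description above.&lt;/p&gt;

```python
def update_draft_item(checklist_state: str) -> str:
    """Apply an editor mutation to a checklist draft.

    Contract:
    - Published versions are immutable; this function never edits them.
    - A draft must be created explicitly before any mutation.
    - Calling this on a non-draft checklist is a normal, expected denial
      and raises PermissionError.
    """
    if checklist_state != "draft":
        raise PermissionError("editor mutations operate only on the draft layer")
    return "updated"


print(update_draft_item("draft"))
```

&lt;p&gt;A reader, human or agent, can learn the denial semantics from the docstring alone, without reconstructing them from the function body.&lt;/p&gt;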

&lt;h2&gt;
  
  
  What follows from this
&lt;/h2&gt;

&lt;p&gt;Across all four cases, what matters is not the warning itself as a ritual, but the move from implicit to explicit. In one place I had to make persistence semantics explicit, in another an ownership boundary, in the third module boundaries, and in the fourth a local API or middleware contract.&lt;/p&gt;

&lt;p&gt;That is probably the most useful way to read such signals. Not as an order to make the code neater, but as a reason to check whether an important contract is still resting on a silent “this is obvious anyway.” This does not mean every warning hides a deep problem. But if it touches storage format, ownership, input shape, or a module boundary, I would no longer treat it as pure cosmetics.&lt;/p&gt;

</description>
      <category>python</category>
      <category>backend</category>
      <category>linting</category>
      <category>ai</category>
    </item>
    <item>
      <title>2026. Week 23: a UI task that stopped being small</title>
      <dc:creator>Sergey Shkuratov</dc:creator>
      <pubDate>Wed, 10 Jun 2026 08:41:33 +0000</pubDate>
      <link>https://dev.to/s_a_shkuratov/2026-week-23-a-ui-task-that-stopped-being-small-2n22</link>
      <guid>https://dev.to/s_a_shkuratov/2026-week-23-a-ui-task-that-stopped-being-small-2n22</guid>
      <description>&lt;h2&gt;
  
  
  I Thought This Would Be a Local UI Task
&lt;/h2&gt;

&lt;p&gt;This week, I thought I was solving a fairly narrow task: how to show group settings more neatly in the new checklist editor. The question looked local enough: how to read a group’s state right in the item row, and how to provide a convenient entry point for editing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foqla5kcput5tbiuma191.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foqla5kcput5tbiuma191.jpeg" alt="New editor" width="799" height="307"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;The &lt;code&gt;Morning chaos&lt;/code&gt; row is active now, and it is easy to see that the group of indicators on the right takes a significant part of the row’s space.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;At first, this looked like a normal UX improvement. I needed to find a form of presentation that would not force the user to open the detail panel too often, while also not overloading the item row.&lt;/p&gt;

&lt;h2&gt;
  
  
  The First Version Turned Out Too Noisy
&lt;/h2&gt;

&lt;p&gt;My first move was toward a richer inline representation. I wanted to show a set of signals directly in the group row, so that its state could be read quickly.&lt;/p&gt;

&lt;p&gt;It became clear quite fast that this was the wrong direction. This UI did not make reading easier; it added noise. Instead of a more “talkative” interface, I had to move in the opposite direction: remove extra elements and compress the state representation.&lt;/p&gt;

&lt;p&gt;In the end, the solution narrowed down to one summary pill in the group row and one shared popover for editing. That became the main UX lesson of this part of the week: in a dense editor UI, extra signals stop helping very easily.&lt;/p&gt;

&lt;h2&gt;
  
  
  Then a Deeper Problem Surfaced
&lt;/h2&gt;

&lt;p&gt;The story did not end there. While I was working on the interface, it became clear that the problem was not only about how it looked, but also about how it worked at all.&lt;/p&gt;

&lt;p&gt;During the work, an ambiguity in the semantics of visibility surfaced unexpectedly. Initially, I had designed it so that conditions like “if another item has such-and-such value” could make an invisible field visible. But the new interface represented this approach poorly, to the point that even I, the author, could not quickly understand what was going on when looking at the editor.&lt;/p&gt;

&lt;p&gt;After several experiments, I realized that the problem was not in the editor UX but in the semantics. In the end, I had to invert it: &lt;code&gt;effective_visible = is_visible &amp;amp;&amp;amp; conditions_pass&lt;/code&gt;. In other words, no condition can make an item visible if it is invisible by default. A visible item that has conditions, on the other hand, becomes invisible when those conditions fail.&lt;/p&gt;
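
&lt;p&gt;The inverted rule is small enough to write down as a function. The sketch below is in Python and assumes that multiple conditions on one item are combined with a logical AND, which the text does not state; the function and parameter names are illustrative.&lt;/p&gt;

```python
from collections.abc import Iterable


def effective_visible(is_visible: bool, condition_results: Iterable[bool] = ()) -> bool:
    """An item is shown only if it is visible by default and every condition passes."""
    # all() of an empty iterable is True, so an item without conditions
    # simply keeps its default visibility.
    return is_visible and all(condition_results)


# Print the full truth table for an item with a single condition.
for is_visible in (True, False):
    for condition in (True, False):
        print(is_visible, condition, effective_visible(is_visible, [condition]))
```

&lt;p&gt;Only one of the four rows is visible, which is the property that made the editor state readable again: conditions can hide an item but never reveal one.&lt;/p&gt;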

&lt;p&gt;A typical breaking change, and good that it happened early.&lt;/p&gt;

&lt;h2&gt;
  
  
  After That, the Most Honest Part of the Work Began
&lt;/h2&gt;

&lt;p&gt;Once the rule became clearer, the work was not finished. After all the edits, a more unpleasant but also more honest phase began: I had to verify that the system really behaved the way it was now described.&lt;/p&gt;

&lt;p&gt;This is where the E2E part began. It looked much less elegant than the idea of the solution itself. There were failures like &lt;code&gt;Expected: "hidden" Received: "visible"&lt;/code&gt;, there were timeouts around the drawer and popover, and there were situations where everything already looked fine locally, but the tests answered: no, the behavior is still not fully assembled.&lt;/p&gt;

&lt;p&gt;In the end, E2E became the real finish line of this task. Not the moment when the interface already looked convincing, but the moment when the target scenarios started to converge in a verifiable form.&lt;/p&gt;

&lt;h2&gt;
  
  
  In Parallel, Another Shift Was Taking Shape
&lt;/h2&gt;

&lt;p&gt;There was another line of work this week as well — backend cleanup. I do not want to expand on it here: it will get a separate text.&lt;/p&gt;

&lt;p&gt;Against that background, and after publishing &lt;a href="https://dev.to/s_a_shkuratov/llm-assisted-deploy-you-save-typing-not-thinking-5h91"&gt;LLM-Assisted Deploy: You Save Typing, Not Thinking&lt;/a&gt;, another thought started to take firmer shape for me. A meaningful part of the work should not disappear into local edits, commits, and one-off sessions. It should be brought to the state of a public artifact: a text, a case, a note, something from which one can later read not only the result, but also the line of thought, the constraints, and the way of verification.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Take Away From This Week
&lt;/h2&gt;

&lt;p&gt;In short, this was a week in which a small UI task refused to stay small.&lt;/p&gt;

&lt;p&gt;First, it ran into the need to simplify my own solution. Then it brought out behavior that had not been fully thought through or clearly stated. Then it demanded that this behavior be proved and shown separately through tests.&lt;/p&gt;

&lt;p&gt;That is probably why the week feels coherent. At its center was not one more product improvement but a more general movement: from a local change to an explicit contract, from an explicit contract to verification, and then further to proving and showing the work so that it would not dissolve inside the code.&lt;/p&gt;

</description>
      <category>uidesign</category>
      <category>weeklyretro</category>
    </item>
    <item>
      <title>LLM-Assisted Deploy: You Save Typing, Not Thinking</title>
      <dc:creator>Sergey Shkuratov</dc:creator>
      <pubDate>Sat, 06 Jun 2026 13:24:08 +0000</pubDate>
      <link>https://dev.to/s_a_shkuratov/llm-assisted-deploy-you-save-typing-not-thinking-5h91</link>
      <guid>https://dev.to/s_a_shkuratov/llm-assisted-deploy-you-save-typing-not-thinking-5h91</guid>
      <description>&lt;h3&gt;
  
  
  TL;DR
&lt;/h3&gt;

&lt;p&gt;An LLM helped me put together a deploy in about an hour, but only because I did not hand the deploy itself over to it.&lt;/p&gt;

&lt;h3&gt;
  
  
  What happened
&lt;/h3&gt;

&lt;p&gt;I know how to write deploy scripts. Not in theory — I’ve done it many times, and usually I just sit down and write one.&lt;/p&gt;

&lt;p&gt;This time the problem was not &lt;em&gt;how&lt;/em&gt; to write it. The problem was time. And I had absolutely no desire to break anything along the way.&lt;/p&gt;

&lt;p&gt;So I did not play the game of “the LLM will neatly put this together for me.” In semi-automated deployment, that is an unusually bad idea. Instead of a beautiful result, you get a beautifully broken site lying on the floor.&lt;/p&gt;

&lt;p&gt;I did something else.&lt;/p&gt;

&lt;p&gt;I defined the script structure myself, based on my previous experience. I spelled out what exactly the deploy had to do, where the boundaries were, what control points existed, what counted as success, and what was a reason not to proceed. Then I fed that text to the LLM and had it write the bash script.&lt;/p&gt;

&lt;p&gt;In other words, the model helped where the cost of error was relatively cheap: typing code.&lt;/p&gt;

&lt;p&gt;Everything that actually had a price stayed manual: decisions, review, checking the logic, and running it in a safe environment.&lt;/p&gt;

&lt;p&gt;In the end I got two scripts: &lt;code&gt;deploy-preprod.sh&lt;/code&gt; and &lt;code&gt;deploy-production.sh&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That separation mattered to me. Production should not “just go” directly. First the preprod gate, then production. And I also kept a standard trick of mine for production deploy confirmation — a kind of textual captcha: the script prints a random code to the console, and nothing moves until you type it in by hand. It is not protection from a malicious hacker. It is protection from an overly easy, overly mechanical “fine, let’s just ship it already.”&lt;/p&gt;
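
&lt;p&gt;The original scripts are bash and are not reproduced here. The logic of the confirmation step can be sketched in Python as follows; the function names, the code length, and the alphabet are assumptions chosen for illustration.&lt;/p&gt;

```python
import secrets
from typing import Callable

# Upper-case letters and digits without look-alikes such as O/0 or I/1.
ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789"


def make_confirmation_code(length: int = 6) -> str:
    """Generate a fresh random code for a single deploy attempt."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def confirm_deploy(
    read_line: Callable[[], str] = input,
    write: Callable[[str], None] = print,
) -> bool:
    """Show a random code and return True only if the operator types it back."""
    code = make_confirmation_code()
    write(f"Type {code} to confirm the production deploy:")
    return read_line().strip() == code
```

&lt;p&gt;Because the code changes on every run, it cannot be piped in from shell history or answered by muscle memory; the operator has to read the screen before production moves.&lt;/p&gt;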

&lt;p&gt;It took four iterations. Honestly, that is not a sign that the approach was bad. Quite the opposite.&lt;/p&gt;

&lt;p&gt;Across those passes, exactly the usual crap surfaced — the kind that makes deployment dangerous: typos, wrong variables, bad log parsing. Nothing conceptually interesting. The code itself looked fairly brisk. The irony was that the code had been written by the LLM — perhaps it was in a hurry too and did not want to get distracted from taking over the world.&lt;/p&gt;

&lt;p&gt;My verification was not in the “well, looks reasonable enough” genre either. A more reliable approach is to scan the &lt;code&gt;docker-compose&lt;/code&gt; logs automatically for obvious signs of errors once the containers have started, and then do a manual smoke check: log in and walk through the key flows. I did think about e2e, but for this task I decided it would be overkill. What I needed at that moment was not a perfect automation contour, but a reproducible deploy with explicit control points and predictable failure behavior.&lt;/p&gt;

&lt;p&gt;If you ground this in the actual scripts, the line between “accelerate” and “hand over control” becomes pretty clear:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;preprod reproduces an exact copy of the production site before the deploy itself, which is feasible because the site is small;&lt;/li&gt;
&lt;li&gt;production refuses to run with &lt;code&gt;latest&lt;/code&gt;;&lt;/li&gt;
&lt;li&gt;production refuses to do a no-op deploy on the same tags;&lt;/li&gt;
&lt;li&gt;before production, the preprod gate runs and fails if preprod is unhappy about anything;&lt;/li&gt;
&lt;li&gt;there is an explicit failure if Postgres does not come up;&lt;/li&gt;
&lt;li&gt;there is a log check for &lt;code&gt;error&lt;/code&gt;, &lt;code&gt;exception&lt;/code&gt;, &lt;code&gt;traceback&lt;/code&gt;, &lt;code&gt;panic&lt;/code&gt;, and &lt;code&gt;fatal&lt;/code&gt;;&lt;/li&gt;
&lt;li&gt;there is manual confirmation before production.&lt;/li&gt;
&lt;/ul&gt;
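
&lt;p&gt;Two of the guards in the list above, the tag checks and the log scan, can be sketched in a few lines. This is a Python illustration of the logic, not the original bash; the function names and error messages are assumptions, and the pattern is a deliberately crude case-insensitive substring match.&lt;/p&gt;

```python
import re

# Crude on purpose: any line containing one of these words is flagged,
# so a false alarm stops the deploy instead of a real error slipping through.
ERROR_PATTERN = re.compile(r"error|exception|traceback|panic|fatal", re.IGNORECASE)


def check_tags(new_tag: str, running_tag: str) -> None:
    """Refuse mutable tags and no-op deploys before anything is started."""
    if new_tag == "latest":
        raise ValueError("refusing to deploy the mutable tag 'latest'")
    if new_tag == running_tag:
        raise ValueError(f"tag {new_tag!r} is already running; refusing a no-op deploy")


def suspicious_lines(log_text: str) -> list[str]:
    """Return every log line that contains an obvious error marker."""
    return [line for line in log_text.splitlines() if ERROR_PATTERN.search(line)]
```

&lt;p&gt;Both checks fail loudly and early, which is the point: the script decides when not to proceed, so that decision does not depend on someone reading the logs carefully at the end of a long day.&lt;/p&gt;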

&lt;p&gt;So the LLM did not “do the deploy.” It helped me assemble the code around a structure and a set of constraints that I had already defined.&lt;/p&gt;

&lt;p&gt;All in all, it took about an hour.&lt;/p&gt;

&lt;h3&gt;
  
  
  Takeaways
&lt;/h3&gt;

&lt;p&gt;The LLM saved me time, but not where many people dream it will. Not on responsibility. Not on the engineering decision. Not on verification. It compressed the draft phase — the keyboard pounding. Everything critical was still manual engineering work.&lt;/p&gt;

&lt;p&gt;For tasks like this, that is what a sane &lt;code&gt;LLM-assisted&lt;/code&gt; mode looks like: do not delegate risk to the model; use it to strip away the mechanical part so more attention remains for architecture and for the control points where mistakes are actually expensive.&lt;/p&gt;

&lt;p&gt;A minute of thought increases uptime by an hour.&lt;/p&gt;

</description>
      <category>devops</category>
      <category>ai</category>
    </item>
    <item>
      <title>Documentation is code: LLMs don’t actually read it — and honestly, neither do we</title>
      <dc:creator>Sergey Shkuratov</dc:creator>
      <pubDate>Tue, 02 Jun 2026 09:46:17 +0000</pubDate>
      <link>https://dev.to/s_a_shkuratov/documentation-is-code-llms-dont-actually-read-it-and-honestly-neither-do-we-1fl6</link>
      <guid>https://dev.to/s_a_shkuratov/documentation-is-code-llms-dont-actually-read-it-and-honestly-neither-do-we-1fl6</guid>
      <description>&lt;p&gt;I learned this the hard way: when an LLM says “it matches the docs”, it can still be wrong for a boring reason—it didn’t &lt;em&gt;read&lt;/em&gt; the part that matters.&lt;/p&gt;

&lt;p&gt;I’m building a small SaaS (checklists as a service). No users yet. Plenty of documentation already. And at some point my docs stopped being an asset and started turning into a liability.&lt;/p&gt;

&lt;p&gt;This is the story of how I rebuilt my documentation so that an LLM could actually &lt;strong&gt;read it end-to-end&lt;/strong&gt;—and how that restructure helped me.&lt;/p&gt;

&lt;h2&gt;
  
  
  The moment I got scared: “silent misses”
&lt;/h2&gt;

&lt;p&gt;The docset grew. I kept asking the LLM to verify tasks against it.&lt;/p&gt;

&lt;p&gt;And then I noticed a pattern that felt worse than hallucinations.&lt;/p&gt;

&lt;p&gt;Not “the model invented stuff”, but “the model confidently said &lt;em&gt;it matches&lt;/em&gt;”—while quietly missing exceptions, prohibitions, and thresholds. Keyword scanning instead of reading.&lt;/p&gt;

&lt;p&gt;I called it &lt;strong&gt;silent drift&lt;/strong&gt;: code slowly moves away from conventions, while the invariants remain only in my head.&lt;/p&gt;

&lt;p&gt;In a project with roles, audit, and CI/CD security gates, that kind of drift isn’t “just messy docs”. It’s how you lose the ability to implement and review changes consistently.&lt;/p&gt;

&lt;h2&gt;
  
  
  I couldn’t do it manually (and I couldn’t delegate it fully)
&lt;/h2&gt;

&lt;p&gt;I knew I had to redo the documentation. But I also knew I couldn’t realistically do it all by hand.&lt;/p&gt;

&lt;p&gt;At the same time, I couldn’t just tell an LLM: “Rewrite everything according to approach X.” Not enough context, too easy to lose control.&lt;/p&gt;

&lt;p&gt;So I went with a third option: build a reliable process out of unreliable components—&lt;strong&gt;me + an LLM&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 1: I separated my docs into domains (and forced the model to actually read)
&lt;/h2&gt;

&lt;p&gt;First, I extracted &lt;strong&gt;domain areas&lt;/strong&gt; from the old documentation, that is, the vocabulary I was using to describe the project and its parts. I tried to keep domains mutually independent (so that I could still hold the overall framework in my head).&lt;/p&gt;

&lt;p&gt;Then I ran the same loop for each domain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I asked the LLM to &lt;strong&gt;read all old docs carefully&lt;/strong&gt; and extract requirements for that domain.&lt;/li&gt;
&lt;li&gt;I moved those requirements into a dedicated file and gave each one a &lt;strong&gt;project-unique ID&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;I asked the LLM to reread everything and check &lt;strong&gt;internal consistency&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;I fixed contradictions (sometimes by cross-checking the code).&lt;/li&gt;
&lt;li&gt;I repeated the consistency check (this caught small but nasty issues).&lt;/li&gt;
&lt;li&gt;I reviewed diffs manually to catch what was missing or implicitly assumed.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This took ~4 days (about 4 hours/day). Exhausting, but still much faster than doing it without an LLM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: I hit a wall—because I mixed “requirements” with “verification”
&lt;/h2&gt;

&lt;p&gt;After the requirements pass, I wanted to extract scenarios (the thing that connects domains and requirements).&lt;/p&gt;

&lt;p&gt;And suddenly the model started to stumble and hallucinate again.&lt;/p&gt;

&lt;p&gt;The fix turned out to be painfully simple: my requirements were still “too thick” because they contained &lt;strong&gt;verification sections&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Verification text was useful, but it didn’t belong inside requirements files. It confused the extraction step.&lt;/p&gt;

&lt;p&gt;So I separated verification into its own files per domain. After that, scenario extraction became stable again.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3 (the main artifact): I built per-subsystem digests
&lt;/h2&gt;

&lt;p&gt;Even with cleaner docs, there was still one big problem:&lt;/p&gt;

&lt;p&gt;An LLM is much more likely to &lt;em&gt;actually read&lt;/em&gt; &lt;strong&gt;one document&lt;/strong&gt; than to wander through folders and do keyword search across many files.&lt;/p&gt;

&lt;p&gt;So I built a small, boring artifact:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a &lt;strong&gt;registry file&lt;/strong&gt; listing subsystems and the docs that belong to each&lt;/li&gt;
&lt;li&gt;a tiny &lt;strong&gt;builder script&lt;/strong&gt; that concatenates those files into a single digest per subsystem&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, for each subsystem (authentication, access control, audit, security gates, plus a few project-specific ones), I have &lt;strong&gt;one consolidated document&lt;/strong&gt;.&lt;/p&gt;
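
&lt;p&gt;The registry and the builder are not shown in the post. A minimal sketch of the idea could look like the following, assuming the registry is a plain mapping from subsystem name to Markdown files; the file layout, names, and the &lt;code&gt;# Source:&lt;/code&gt; header are hypothetical.&lt;/p&gt;

```python
from pathlib import Path

# Hypothetical registry: subsystem name mapped to the docs that belong to it.
REGISTRY: dict[str, list[str]] = {
    "access-control": ["requirements/access-control.md", "verification/access-control.md"],
    "audit": ["requirements/audit.md", "verification/audit.md"],
}


def build_digest(docs_root: Path, subsystem: str, registry: dict[str, list[str]]) -> str:
    """Concatenate all docs of one subsystem into a single text."""
    parts = []
    for relative_path in registry[subsystem]:
        text = (docs_root / relative_path).read_text(encoding="utf-8")
        # Keep the source path as a header so each statement stays traceable to its file.
        parts.append(f"# Source: {relative_path}\n\n{text.strip()}\n")
    return "\n".join(parts)


def write_digests(docs_root: Path, out_dir: Path, registry: dict[str, list[str]]) -> None:
    """Write one digest file per subsystem listed in the registry."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for subsystem in registry:
        digest = build_digest(docs_root, subsystem, registry)
        (out_dir / f"{subsystem}.md").write_text(digest, encoding="utf-8")
```

&lt;p&gt;Calling &lt;code&gt;write_digests(Path("docs"), Path("build/digests"), REGISTRY)&lt;/code&gt; would produce one file per subsystem. A missing source file raises immediately, which keeps the registry honest as the docs move around.&lt;/p&gt;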

&lt;p&gt;I also keep a short “selection rules” note for myself: which digests to feed into the agent for a given task (e.g., access control vs audit logic). The LLM can check conformance well, but I don’t expect it to reliably infer what to check via chains of implicit assumptions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The payoff: the restructure wasn’t cosmetic
&lt;/h2&gt;

&lt;p&gt;After I rebuilt the registry and digests, I asked the LLM to check the whole codebase for conformance to each consolidated document.&lt;/p&gt;

&lt;p&gt;It found about &lt;strong&gt;15 bugs&lt;/strong&gt;. Some only manifested under specific conditions.&lt;/p&gt;

&lt;p&gt;At first, I was upset: how could this exist with so many tests?&lt;/p&gt;

&lt;p&gt;Then I realized: this was the clearest proof that the new documentation structure was doing real work.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I’m taking from this
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A big docset is not automatically verifiable.&lt;/li&gt;
&lt;li&gt;If you want LLM-assisted development to be stable, you need docs the model can &lt;strong&gt;read&lt;/strong&gt;, not just search.&lt;/li&gt;
&lt;li&gt;A tiny artifact (subsystem registry + digest builder) can become a point of leverage for your whole workflow.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’ve dealt with docs/code drift (especially with LLMs in the loop), I’d love to hear what helped—and what failed.&lt;/p&gt;

</description>
      <category>llm</category>
      <category>documentation</category>
    </item>
  </channel>
</rss>
