<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: pickuma</title>
    <description>The latest articles on DEV Community by pickuma (@pickuma).</description>
    <link>https://dev.to/pickuma</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3926669%2Fb3923c39-364a-4953-b8f7-aa962d6419e0.jpg</url>
      <title>DEV Community: pickuma</title>
      <link>https://dev.to/pickuma</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pickuma"/>
    <language>en</language>
    <item>
      <title>Git Plumbing in Practice: How CI, Review Tools, and AI Agents Build on Git's Primitives</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Fri, 12 Jun 2026 02:10:05 +0000</pubDate>
      <link>https://dev.to/pickuma/git-plumbing-in-practice-how-ci-review-tools-and-ai-agents-build-on-gits-primitives-2858</link>
      <guid>https://dev.to/pickuma/git-plumbing-in-practice-how-ci-review-tools-and-ai-agents-build-on-gits-primitives-2858</guid>
      <description>&lt;p&gt;Run &lt;code&gt;git log&lt;/code&gt; and you're using porcelain, the human-facing layer, which is the term Git's own manual uses for it. Run &lt;code&gt;git cat-file -p HEAD&lt;/code&gt; and you've dropped into plumbing: the low-level toolkit porcelain itself is built from. The split is not trivia. It's the reason a whole generation of developer tools (CI runners, stacked-diff CLIs, code review systems, and now AI coding agents) builds on Git rather than reinventing version control. They all program against the same small set of primitives, and once you can read those primitives, most "magic" tooling behavior becomes legible.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Object Model Is Smaller Than You Think
&lt;/h2&gt;

&lt;p&gt;Git's entire data model is four object types living in &lt;code&gt;.git/objects&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Blob&lt;/strong&gt; — file contents and nothing else. No filename, no permissions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tree&lt;/strong&gt; — one directory listing: mode, type, object ID, and name per entry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit&lt;/strong&gt; — a pointer to exactly one tree, zero or more parent commits, an author, a committer, and a message.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Annotated tag&lt;/strong&gt; — a named (optionally signed) pointer to another object.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every object is content-addressed: its ID is the SHA-1 hash of its type, size, and bytes (SHA-256 has been an opt-in repository format since Git 2.29). Identical content always hashes to the same ID, which is why a file left unchanged across 500 commits is stored as one blob referenced 500 times, not 500 copies.&lt;/p&gt;
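
&lt;p&gt;A minimal sketch of that addressing scheme, assuming the default SHA-1 object format and that &lt;code&gt;sha1sum&lt;/code&gt; from coreutils is installed: hashing the header &lt;code&gt;blob 6&lt;/code&gt;, a NUL byte, and the six bytes of &lt;code&gt;hello\n&lt;/code&gt; by hand should produce the same ID that &lt;code&gt;git hash-object&lt;/code&gt; reports. The variable names are illustrative.&lt;/p&gt;

```shell
# Blob ID as Git computes it (reads stdin; without -w nothing is written to the object store).
git_id=$(printf 'hello\n' | git hash-object --stdin)

# The same ID by hand: SHA-1 over "blob SIZE", a NUL byte, then the content bytes.
manual_id=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)

echo "$git_id"
echo "$manual_id"
```

&lt;p&gt;Both lines should print the same 40-character ID. Because the filename never enters the hash, two files with identical bytes in different directories share one blob.&lt;/p&gt;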

&lt;p&gt;On top of objects sit &lt;strong&gt;refs&lt;/strong&gt;, and a ref is almost embarrassingly simple: in a default SHA-1 repository where the branch is still stored as a loose file, &lt;code&gt;cat .git/refs/heads/main&lt;/code&gt; prints 40 hex characters. A branch is, at its simplest, a text file containing a commit ID. &lt;code&gt;HEAD&lt;/code&gt; is typically a one-line file pointing at one of those refs.&lt;/p&gt;

&lt;p&gt;You can verify all of this in under a minute:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git cat-file &lt;span class="nt"&gt;-p&lt;/span&gt; HEAD          &lt;span class="c"&gt;# the raw commit object&lt;/span&gt;
git cat-file &lt;span class="nt"&gt;-p&lt;/span&gt; HEAD^&lt;span class="o"&gt;{&lt;/span&gt;tree&lt;span class="o"&gt;}&lt;/span&gt;   &lt;span class="c"&gt;# the tree it points to&lt;/span&gt;
git ls-tree HEAD src/         &lt;span class="c"&gt;# one directory's entries&lt;/span&gt;
git rev-parse HEAD            &lt;span class="c"&gt;# resolve any ref to an ID&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These are plumbing commands, and they come with a contract porcelain doesn't offer: their output formats stay stable and script-friendly across Git versions, while &lt;code&gt;git log&lt;/code&gt;'s formatting is allowed to drift. That stability guarantee is what third-party tools build against.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Reading &lt;code&gt;.git&lt;/code&gt; is safe; writing into it by hand is not. Refs get compacted into &lt;code&gt;.git/packed-refs&lt;/code&gt;, and repositories on Git 2.45+ can use the reftable backend instead of loose files — so a hash you &lt;code&gt;echo&lt;/code&gt; into &lt;code&gt;refs/heads/&lt;/code&gt; may be shadowed, skip locking, and leave no reflog entry. Go through &lt;code&gt;git update-ref&lt;/code&gt;, &lt;code&gt;git symbolic-ref&lt;/code&gt;, and &lt;code&gt;git hash-object -w&lt;/code&gt;; they handle packing, locks, and reflogs for you.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What Real Tools Do With These Primitives
&lt;/h2&gt;

&lt;p&gt;Once you hold the object model in your head, existing tools stop looking like magic and start looking like four primitives composed differently.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CI systems tune object transfer, not checkouts.&lt;/strong&gt; GitHub Actions' &lt;code&gt;checkout&lt;/code&gt; action defaults to &lt;code&gt;fetch-depth: 1&lt;/code&gt;, a shallow fetch that transfers one commit with its trees and blobs and none of the history behind it. On a long-lived repository that's the difference between transferring one snapshot's worth of blobs and every blob ever written. Partial clone (&lt;code&gt;--filter=blob:none&lt;/code&gt;) takes a different cut: it keeps the full commit and tree history but defers blob downloads until checkout actually needs them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stacked-diff tools are ref editors.&lt;/strong&gt; Graphite, ghstack, and git-branchless implement "restack" by writing new commit objects that replay the same changes onto new parents, then pointing branch refs at them. There's no second storage engine for your code; the stack is a set of refs plus a dependency order the tool tracks.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code review systems use Git as their database.&lt;/strong&gt; Gerrit stores every patchset under a &lt;code&gt;refs/changes/...&lt;/code&gt; namespace, and since its NoteDb migration it keeps review comments and votes inside the repository itself using git notes — a mechanism that attaches metadata to a commit without changing the commit's hash. Replicating the review database is a &lt;code&gt;git fetch&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Alternative frontends keep the Git backend.&lt;/strong&gt; Jujutsu (jj) replaces the index and branching UX entirely but reads and writes standard Git object storage, so you can run jj locally while collaborators see an ordinary Git repo. Libraries like libgit2 (C), gitoxide (Rust), go-git, and isomorphic-git (JavaScript, runs in the browser) reimplement object and ref access without shelling out to a git binary — which is how browser-based editors clone and diff without a server-side checkout.&lt;/p&gt;

&lt;h2&gt;
  
  
  AI Coding Agents Treat Git as a Snapshot Engine
&lt;/h2&gt;

&lt;p&gt;The newest tenants on the plumbing are coding agents, and they lean on two properties: cheap isolation and cheap snapshots.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git worktree add&lt;/code&gt; gives one repository multiple working directories that share a single object database, each checked out to its own branch. That's the standard isolation move for agents — Claude Code, for one, can run subagents in disposable worktrees so parallel edits can't clobber each other, then remove a worktree that produced no changes. Spinning one up costs a directory and some ref bookkeeping, not a second clone.&lt;/p&gt;
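
&lt;p&gt;The isolation move can be sketched in a scratch repository. This is an illustration under stated assumptions, not how any particular agent implements it: the temporary directory, the &lt;code&gt;agent-task&lt;/code&gt; branch name, and the demo identity are all made up for the example.&lt;/p&gt;

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo/main"
cd "$repo/main"
# An empty commit gives the new branch something to start from.
git -c user.name=demo -c user.email=demo@example.com commit -q --allow-empty -m "initial commit"

# A second working directory on its own branch, sharing the same object database.
git worktree add -q -b agent-task "$repo/agent-task"
worktree_count=$(git worktree list | wc -l)
echo "worktrees: $worktree_count"

# Discard the worktree and its branch once the task is finished.
git worktree remove "$repo/agent-task"
git branch -q -D agent-task
remaining=$(git worktree list | wc -l)
echo "remaining: $remaining"
```

&lt;p&gt;The linked worktree holds only a checkout and a small pointer file back to the main repository, which is why creating and discarding one is cheap.&lt;/p&gt;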

&lt;p&gt;Snapshots fall out of content addressing. &lt;code&gt;git hash-object -w&lt;/code&gt; writes any file into the object store; &lt;code&gt;git write-tree&lt;/code&gt;, pointed at an alternate index via the &lt;code&gt;GIT_INDEX_FILE&lt;/code&gt; environment variable, captures an entire working state as a tree ID — without touching your branches, your index, or your history. An agent that wants a checkpoint between every edit doesn't need to invent a journaling format. The repository already is one: append-only, deduplicated, addressable by hash.&lt;/p&gt;

&lt;p&gt;The practical payoff when you evaluate agentic tools: ask how they isolate work and how they snapshot it. A tool that answers "worktrees and trees" inherits Git's guarantees — &lt;code&gt;git diff&lt;/code&gt; works, &lt;code&gt;git fsck&lt;/code&gt; works, your existing recovery muscle memory works. A tool that answers with a proprietary sidecar format makes you learn a second recovery model for the day something goes wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to Start Building
&lt;/h2&gt;

&lt;p&gt;You don't need libgit2 bindings to get value from the plumbing. Three small projects, in ascending order of effort:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A repo inspector.&lt;/strong&gt; Pipe &lt;code&gt;git for-each-ref&lt;/code&gt; and &lt;code&gt;git cat-file --batch&lt;/code&gt; into a script that answers a question your team actually has, such as which branches still contain a leaked config blob. &lt;code&gt;cat-file --batch&lt;/code&gt; is built for this: object IDs in on stdin, a header line plus raw object contents out on stdout, one process for thousands of lookups.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A deploy hook.&lt;/strong&gt; A bare repository plus &lt;code&gt;GIT_WORK_TREE=/srv/app git checkout -f&lt;/code&gt; inside a &lt;code&gt;post-receive&lt;/code&gt; hook is a complete push-to-deploy pipeline in roughly five lines of shell. Heroku-style deploys worked this way, and it still holds up for a single server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A snapshot tool.&lt;/strong&gt; Combine an alternate &lt;code&gt;GIT_INDEX_FILE&lt;/code&gt;, &lt;code&gt;git add -A&lt;/code&gt;, and &lt;code&gt;git write-tree&lt;/code&gt; to checkpoint a directory on a timer into refs under &lt;code&gt;refs/snapshots/&lt;/code&gt;. You get deduplicated, diffable backups with zero new dependencies.&lt;/li&gt;
&lt;/ol&gt;
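
&lt;p&gt;The third project can be sketched in a few commands. Treat this as a starting point under assumptions rather than a finished tool: the scratch repository, the &lt;code&gt;snapshot-index&lt;/code&gt; file name, the &lt;code&gt;refs/snapshots/latest&lt;/code&gt; ref, and the committer identity are hypothetical choices made for the example.&lt;/p&gt;

```shell
set -e
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
touch notes.txt   # stand-in for real working files

# Stage everything into a throwaway index so the real index stays untouched.
export GIT_INDEX_FILE="$repo/.git/snapshot-index"
git add -A
tree_id=$(git write-tree)
unset GIT_INDEX_FILE

# Wrap the tree in a commit and file it under a ref outside refs/heads/.
commit_id=$(git -c user.name=snapshot -c user.email=snapshot@example.com commit-tree "$tree_id" -m "snapshot")
git update-ref refs/snapshots/latest "$commit_id"

git ls-tree --name-only refs/snapshots/latest
```

&lt;p&gt;Because the ref lives outside &lt;code&gt;refs/heads/&lt;/code&gt;, the snapshot does not appear as a branch, yet &lt;code&gt;git diff refs/snapshots/latest&lt;/code&gt; and the other standard commands can still read it.&lt;/p&gt;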

&lt;p&gt;For the canonical deep dive, Chapter 10 of Pro Git ("Git Internals") is free online and walks the same ground with full examples; the plumbing section of &lt;code&gt;man git&lt;/code&gt; lists every low-level command with its stability contract.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-dev/git-plumbing-how-dev-tools-build-on-gits-primitives/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>δ-mem Explained: What Online Memory Means for LLM Agent Cost and Recall</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Fri, 12 Jun 2026 02:08:49 +0000</pubDate>
      <link>https://dev.to/pickuma/d-mem-explained-what-online-memory-means-for-llm-agent-cost-and-recall-3ji4</link>
      <guid>https://dev.to/pickuma/d-mem-explained-what-online-memory-means-for-llm-agent-cost-and-recall-3ji4</guid>
      <description>&lt;p&gt;A new arXiv preprint (2605.12357) introduces δ-mem — "delta-mem" — an online memory mechanism that promises persistent, low-overhead context for LLM agents across long-running sessions. If you build agents, RAG pipelines, or chat apps, that one sentence touches the three things you actually get paged about: latency, recall quality, and per-request token spend.&lt;/p&gt;

&lt;p&gt;We mapped the preprint's framing against the memory approaches developers run in production today. Here's the problem δ-mem is positioned to solve, what the paper does and doesn't commit to, and how to decide whether a memory layer like this belongs in your stack — without re-architecting anything on the strength of an abstract.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why memory is the expensive part of every agent
&lt;/h2&gt;

&lt;p&gt;The default memory strategy in most agent codebases is no strategy: append every message to the conversation array and resend the whole thing on each API call. That works until sessions get long. If your agent accumulates 2,000 tokens per turn, tool results included, the history alone reaches 100,000 tokens by turn 50, and you pay to reprocess all of it on every subsequent call. Cumulative input cost grows roughly quadratically with session length, and time-to-first-token degrades along with it.&lt;/p&gt;
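
&lt;p&gt;The arithmetic behind that growth can be checked in a few lines. This sketch assumes a simplified model in which every call resends the full history at a constant 2,000 tokens per turn, with no prompt caching or truncation; the variable names are illustrative.&lt;/p&gt;

```shell
tokens_per_turn=2000   # per-turn growth from the example above
turns=50

# History resent on the final call: every earlier turn plus the current one.
history_at_last_turn=$((tokens_per_turn * turns))

# Total input processed across the session: 2000 + 4000 + ... sums to 2000 * n(n+1)/2.
cumulative_input=$((tokens_per_turn * turns * (turns + 1) / 2))

echo "history at turn $turns: $history_at_last_turn tokens"
echo "cumulative input over $turns turns: $cumulative_input tokens"
```

&lt;p&gt;Under those assumptions, 50 turns cost 2,550,000 cumulative input tokens. Setting &lt;code&gt;turns=100&lt;/code&gt; gives 10,100,000, roughly four times as much for twice the session length, which is the quadratic shape in practice.&lt;/p&gt;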

&lt;p&gt;The standard mitigations each trade something away:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sliding windows and summarization&lt;/strong&gt; cap cost but are lossy. The constraint your user stated in turn 3 ("never touch the prod database") gets compressed into a summary, then compressed again, then it's gone — usually right before the agent needs it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vector-store RAG over chat history&lt;/strong&gt; keeps everything but relocates the problem. You now run an embedding pass and a retrieval call per turn, you make chunking decisions that silently shape recall, and multi-hop questions ("what did we decide after we ruled out option B?") sit exactly where similarity search is weakest.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Offline memory pipelines&lt;/strong&gt; — batch jobs that distill transcripts into a user profile after the session ends — can't help the session that's currently running.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That last gap is what "online" means in the paper's title: memory that updates during the session and is immediately usable, rather than reconstructed from logs afterwards. For a coding agent on hour two of a refactor, or a support bot on message 40 of an escalation, that distinction is the whole game.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the δ-mem preprint claims — and what to verify before you care
&lt;/h2&gt;

&lt;p&gt;The preprint's framing commits δ-mem to three properties:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Persistence across sessions.&lt;/strong&gt; Memory survives the end of a conversation, so the agent doesn't restart from zero context tomorrow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low overhead.&lt;/strong&gt; Maintaining memory shouldn't itself dominate your latency budget or token bill — the failure mode of several earlier memory systems, where the bookkeeping calls cost more than the history they replaced.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Online operation.&lt;/strong&gt; Updates happen incrementally as the session runs, not in a post-hoc batch job.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The δ in the name points at the mechanism: incremental updates — deltas — applied to a persistent memory state, instead of repeatedly reprocessing or re-summarizing the full history. That's the read the paper's framing suggests; the implementation specifics are what you should go to the PDF for.&lt;/p&gt;

&lt;p&gt;And when you do read it, four questions separate a useful result from a benchmark-shaped one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Which baselines?&lt;/strong&gt; Beating full-context replay or a naive sliding window is table stakes. A tuned RAG-over-history setup or a hierarchical memory system in the MemGPT lineage is the comparison that matters for anyone with an existing pipeline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overhead measured in what?&lt;/strong&gt; Tokens, wall-clock latency, or both — and at which session lengths. "Low overhead" at 20 turns says little about turn 500.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How is recall evaluated?&lt;/strong&gt; Multi-session QA over long horizons stresses memory differently than single-needle retrieval. Check whether the evaluation includes questions whose answers require composing facts from separate, distant turns.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Is there code?&lt;/strong&gt; A repo you can run against your own transcripts is worth more than any table in the paper.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;This is an unreviewed preprint, and as of this writing we have not seen independent replication of its results. Treat reported numbers as claims, not facts. The right-sized response is a time-boxed spike against your own workload — not a migration of your production memory layer.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Where a memory layer like this fits — and what to do this week
&lt;/h2&gt;

&lt;p&gt;Here's how the δ-mem class of system compares to what you're likely running now:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;Token cost as sessions grow&lt;/th&gt;
&lt;th&gt;Cross-session memory&lt;/th&gt;
&lt;th&gt;Extra infrastructure&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Full history replay&lt;/td&gt;
&lt;td&gt;Grows every turn&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Window + summarization&lt;/td&gt;
&lt;td&gt;Flat after the cap, lossy&lt;/td&gt;
&lt;td&gt;None, unless persisted separately&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAG over transcripts&lt;/td&gt;
&lt;td&gt;Roughly flat, plus retrieval tokens&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Embeddings + vector store&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Online memory (δ-mem class)&lt;/td&gt;
&lt;td&gt;Claimed near-flat&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;td&gt;Memory store + update path&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Whether the "claimed" row earns a place in your stack depends on your numbers, not the paper's. Two concrete moves:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Measure how much of your input is replayed history.&lt;/strong&gt; Log input token counts per call and tag the share that is prior-turn content versus new instructions and retrieved documents. If history is a minor slice of your spend, a memory layer solves a problem you don't have. If it dominates — common for tool-heavy agents, where every tool result gets dragged through every subsequent call — the ceiling on savings is large enough to justify the spike.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Put memory behind a seam.&lt;/strong&gt; If your conversation state is assembled inline wherever you call the model, you can't experiment with anything. A narrow interface — &lt;code&gt;getContext(sessionId)&lt;/code&gt; on the way in, &lt;code&gt;recordEvent(sessionId, event)&lt;/code&gt; on the way out — turns every future memory backend, δ-mem included, into a swappable implementation instead of a rewrite. Build the seam now; evaluate candidates as their code lands.&lt;/p&gt;

&lt;p&gt;If you're scaffolding that seam this week, an agent-capable editor shortens the loop considerably — generating the interface, a replay-based test harness, and a token-accounting script is exactly the kind of well-specified grunt work it's good at.&lt;/p&gt;

&lt;p&gt;The honest summary: δ-mem names a real and expensive problem, and "online, persistent, low-overhead" is the correct wishlist. Whether this particular mechanism delivers is unknowable from where we sit — but the instrumentation and the seam are worth building regardless of which memory system eventually wins.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-dev/delta-mem-online-memory-llm-agents-explained/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Agent-Native Infrastructure: What Actually Breaks When AI Agents Use Your Stack</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Fri, 12 Jun 2026 02:07:33 +0000</pubDate>
      <link>https://dev.to/pickuma/agent-native-infrastructure-what-actually-breaks-when-ai-agents-use-your-stack-1ho4</link>
      <guid>https://dev.to/pickuma/agent-native-infrastructure-what-actually-breaks-when-ai-agents-use-your-stack-1ho4</guid>
      <description>&lt;p&gt;The claim circulating in AI infrastructure circles is blunt: the stack you run today—identity, auth, storage, APIs—was designed around a human at a keyboard, and autonomous agents violate that assumption at every layer. The strong version of the argument says agents demand a full rewrite of core software primitives. We think the diagnosis is mostly correct and the prescription is premature. Here is where your existing stack genuinely breaks when an agent starts using it, where it merely bends, and what is worth building first.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your Identity Layer Assumes a Human Is Present
&lt;/h2&gt;

&lt;p&gt;OAuth 2.0 was finalized in 2012 as RFC 6749, and its central flow assumes a browser redirect and a person reading a consent screen. An agent has neither. So teams shipping agent features today fall back on the two primitives that don't require a human: API keys and service accounts. Both are static, long-lived, and scoped at provisioning time—which is exactly wrong for an agent that exists for ninety seconds, acts on behalf of one specific user, and may spawn sub-agents with narrower jobs.&lt;/p&gt;

&lt;p&gt;Three concrete problems follow. Attribution: when an agent updates a CRM record, your audit log shows the service account, not the user who delegated the task or the reasoning step that triggered the write. Revocation: killing one misbehaving agent means rotating a key shared by every agent in the fleet. Delegation: there is no standard way to express "this agent may read calendar events for user A, for this task, for the next ten minutes."&lt;/p&gt;

&lt;p&gt;The pieces exist in partial form. OAuth token exchange (RFC 8693) models on-behalf-of flows. SPIFFE gives workloads cryptographic identities. But nobody has assembled them into a default that a two-person team gets out of the box, and that gap—not model quality—is a large part of what makes agent deployments feel risky.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The most common failure mode in agent deployments is not hallucination—it is an over-permissioned credential. An agent holding an admin API key turns every prompt-injection attempt into a potential admin action. Scope agent credentials the way you scope production SSH access: per task, time-boxed, and logged.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Storage and APIs Expect Polite, Predictable Clients
&lt;/h2&gt;

&lt;p&gt;Your database schema encodes decisions made at design time: these tables, these access patterns, these indexes. Agents add a workload that schema-first design never anticipated—memory. An agent needs to recall what happened in previous sessions, retrieve facts by semantic similarity rather than primary key, and weigh recency against relevance. The current answer is to bolt a vector store next to Postgres and sync embeddings through an ETL job. That works until a source row changes and its embedding doesn't, and now your agent confidently cites stale data with no provenance trail to catch it.&lt;/p&gt;

&lt;p&gt;APIs have the inverse problem. REST contracts assume a developer read the documentation once, wrote correct client code, and shipped it. Agents generate calls at runtime. They retry ambiguously failed requests, fill parameters from inferred context, and parallelize in ways your rate limiter reads as abuse. Stripe normalized idempotency keys for payment APIs years ago; almost nothing else in a typical SaaS API surface offers them, machine-readable error semantics, or a dry-run mode that lets a caller preview side effects before committing them.&lt;/p&gt;

&lt;p&gt;Model Context Protocol, which Anthropic released in November 2024, addresses one slice of this: tool discovery and description, so a model can learn what an API does without scraping docs. It does not, by itself, settle per-task authorization, spend budgets, or execution safety. Treating MCP adoption as "agent-ready" is the new version of treating an OpenAPI spec as a security model.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Build First (It Is Not a Rewrite)
&lt;/h2&gt;

&lt;p&gt;The full-rewrite framing makes a good essay and a bad roadmap. Most of the breakage above can be contained at a boundary layer without touching your core services, and that is where we would start.&lt;/p&gt;

&lt;p&gt;First, an agent gateway. Every agent call enters through one proxy that mints a short-lived credential scoped to the current task, attaches a task ID to every downstream request, enforces a per-task spend and call budget, and writes the full trace to your audit log. You can assemble this from an off-the-shelf API gateway in days, not quarters, and it converts the attribution and revocation problems from architectural to operational.&lt;/p&gt;

&lt;p&gt;Second, provenance-first memory. Before reaching for a dedicated vector database, add pgvector to the Postgres you already run and store every embedded chunk with its source row ID and a timestamp. The query performance ceiling is real but distant for most products; the debugging value of knowing where a memory came from is immediate.&lt;/p&gt;

&lt;p&gt;Third, tier your write actions. Reads are free. Reversible writes—drafting an email, staging a change—get idempotency keys. Irreversible writes—sending, deleting, paying—require either a human approval step or a compensating-transaction plan. Most agent products today skip this triage entirely and either block everything or allow everything.&lt;/p&gt;

&lt;p&gt;A rewrite becomes worth discussing when agents stop being a feature and become the primary client of your system—when most inbound requests carry a task ID instead of a session cookie. Some companies will reach that point. Yours probably has not yet.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-dev/agent-native-infrastructure-what-breaks/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>devops</category>
      <category>cloud</category>
      <category>astro</category>
    </item>
    <item>
      <title>AI Subscription Sprawl: Why Your Company Is Paying for the Same Capability Three Times</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Fri, 12 Jun 2026 02:06:17 +0000</pubDate>
      <link>https://dev.to/pickuma/ai-subscription-sprawl-why-your-company-is-paying-for-the-same-capability-three-times-43kf</link>
      <guid>https://dev.to/pickuma/ai-subscription-sprawl-why-your-company-is-paying-for-the-same-capability-three-times-43kf</guid>
      <description>&lt;p&gt;Check your company's expense reports for the last quarter and you'll likely find the same pattern we keep hearing about from engineering leaders: a GitHub Copilot contract signed at the org level, a cluster of Cursor subscriptions expensed by individual team leads, a ChatGPT Team workspace someone in product spun up, and a Claude subscription a staff engineer pays for personally and quietly expenses. Four bills. Three of them doing substantially the same job.&lt;/p&gt;

&lt;p&gt;This didn't happen because anyone was careless. It happened because AI coding tools were adopted bottom-up, one $20 expense report at a time, while procurement processes were built for top-down SaaS purchases. Each individual subscription sat below the approval threshold that would have triggered a vendor review. Now the aggregate is a line item finance is starting to notice — and when the audit comes, engineering gets asked to justify the stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Duplicate AI Seats Pile Up
&lt;/h2&gt;

&lt;p&gt;The sprawl follows a predictable sequence. First, the company signs an official tool — usually GitHub Copilot, because it rides along on an existing GitHub Enterprise relationship and requires no new vendor onboarding. Then individual developers hit Copilot's limitations for their workflow and start expensing Cursor, because a $20/month receipt doesn't need a procurement ticket. Meanwhile, non-engineering teams adopt ChatGPT Team or Claude for writing and analysis, and engineers join those workspaces too because they want a general-purpose chat model alongside their IDE tooling.&lt;/p&gt;

&lt;p&gt;The result is capability overlap, not capability coverage:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;IDE autocomplete and agentic editing&lt;/strong&gt;: Copilot and Cursor both do this. Paying for both per seat means paying twice for the category.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chat-based reasoning and code review&lt;/strong&gt;: ChatGPT Enterprise and Claude Team overlap heavily for most day-to-day developer queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedded AI in existing SaaS&lt;/strong&gt;: Notion AI, Slack AI, Atlassian Intelligence, and similar add-ons each charge their own per-seat premium for model access you're already buying elsewhere.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;No single layer is wasteful on its own. The waste is in the stack.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Per-Seat Math Nobody Has Run
&lt;/h2&gt;

&lt;p&gt;Here's what published list pricing looks like for the common stack as of mid-2026 (enterprise tiers are negotiated, so treat these as floors, not finals):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Plan&lt;/th&gt;
&lt;th&gt;List price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Copilot&lt;/td&gt;
&lt;td&gt;Business&lt;/td&gt;
&lt;td&gt;$19/user/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GitHub Copilot&lt;/td&gt;
&lt;td&gt;Enterprise&lt;/td&gt;
&lt;td&gt;$39/user/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cursor&lt;/td&gt;
&lt;td&gt;Teams&lt;/td&gt;
&lt;td&gt;$40/user/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;Team&lt;/td&gt;
&lt;td&gt;$25–30/user/month&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ChatGPT&lt;/td&gt;
&lt;td&gt;Enterprise&lt;/td&gt;
&lt;td&gt;Custom (annual commitment, seat minimums)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude&lt;/td&gt;
&lt;td&gt;Team&lt;/td&gt;
&lt;td&gt;$25–30/user/month&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Run the numbers for a 200-engineer organization carrying Copilot Business ($19), Cursor Teams ($40), ChatGPT Team ($30 monthly billing), and Claude Team ($30 monthly billing) simultaneously: that's $119 per engineer per month, or $285,600 a year. If half of that spend is duplicative, as it usually is in the overlap categories above, roughly $143,000 a year is money that a procurement review could reclaim without removing any capability developers actually use.&lt;/p&gt;
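
&lt;p&gt;To rerun that calculation with your own headcount and prices, a short script is enough. The figures below are the list prices from the table above, and the 50% overlap is the assumption stated in the text, not a measured value; the variable names are illustrative.&lt;/p&gt;

```shell
engineers=200
# Copilot Business + Cursor Teams + ChatGPT Team + Claude Team, monthly list prices.
per_seat_monthly=$((19 + 40 + 30 + 30))

annual_total=$((per_seat_monthly * engineers * 12))
# Assumption from the text: half of that spend is duplicative.
reclaimable=$((annual_total / 2))

echo "per engineer per month: \$$per_seat_monthly"
echo "annual total: \$$annual_total"
echo "reclaimable at 50% overlap: \$$reclaimable"
```

&lt;p&gt;With these inputs the script reports $119 per engineer per month, $285,600 a year, and $142,800 reclaimable. Swapping in negotiated prices and measured active-seat counts turns it into a first-pass estimate for your own stack.&lt;/p&gt;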

&lt;blockquote&gt;
&lt;p&gt;Seat price is only the visible half of the bill. Cursor and several agentic tools layer usage-based pricing on top of the subscription, so heavy agent users can generate overages that dwarf their seat cost. And most of these contracts auto-renew annually — if you discover the duplication two weeks after renewal, you're carrying it for another year.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  A Procurement Playbook You Can Run This Quarter
&lt;/h2&gt;

&lt;p&gt;You don't need a FinOps team to fix this. You need an afternoon of data pulling and one uncomfortable meeting. Here's the sequence we'd run:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Inventory what's actually deployed.&lt;/strong&gt; Pull three sources: your SSO/identity provider logs (which AI domains are people authenticating to?), expense report line items matching AI vendors, and your corporate card statements. Expense data catches the shadow subscriptions SSO misses. Expect to find tools nobody on the leadership team knew about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Map seats to utilization, not headcount.&lt;/strong&gt; Copilot's admin dashboard, Cursor's team analytics, and ChatGPT Enterprise's workspace reports all show last-active dates and usage volume. A seat that hasn't generated a completion in 60 days is a refund waiting to happen. In our experience reviewing these dashboards, inactive-seat rates of 20–30% are common in tools that were rolled out org-wide rather than opt-in.&lt;/p&gt;
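&lt;p&gt;Once the last-active dates are exported, the 60-day check is a few lines. This is a minimal sketch: the tuple layout, user names, and dates are hypothetical, and each vendor's dashboard export will have its own format.&lt;/p&gt;

```python
from datetime import date, timedelta

# Hypothetical export rows: (user, tool, last_active). None means never used.
SEATS = [
    ("ana", "copilot", date(2026, 6, 1)),
    ("ben", "copilot", date(2026, 2, 14)),
    ("cy", "cursor", date(2026, 5, 20)),
    ("dee", "cursor", None),
]


def inactive_seats(seats, today, max_idle_days=60):
    """Return (user, tool) pairs with no activity in the last max_idle_days days."""
    cutoff = today - timedelta(days=max_idle_days)
    return [
        (user, tool)
        for user, tool, last_active in seats
        if last_active is None or cutoff > last_active
    ]


idle = inactive_seats(SEATS, today=date(2026, 6, 12))
inactive_rate = len(idle) / len(SEATS)
print(idle, f"{inactive_rate:.0%} inactive")
```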

&lt;p&gt;&lt;strong&gt;3. Pick a primary per category and make the overlap explicit.&lt;/strong&gt; You probably need one IDE-layer tool and one chat-layer tool, not two of each. The decision criteria that matter: which tool your developers actually choose when they have both (utilization data answers this), data processing terms, and whether the vendor supports your compliance requirements (SOC 2 report, zero-data-retention options, regional hosting).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Negotiate with the utilization data in hand.&lt;/strong&gt; Vendors price enterprise AI deals expecting consolidation pressure. Walking into a renewal with "we have 200 licensed seats and 120 monthly actives, and we're evaluating consolidating to your competitor" changes the conversation. Annual billing typically takes 15–20% off monthly list prices on its own.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Build a tool registry and an intake path.&lt;/strong&gt; The reason sprawl happened is that requesting a tool officially was slower than expensing it. Fix the incentive: a lightweight registry of approved AI tools, who owns each contract, what data classes are allowed in each, and a request form that gets answered in days, not quarters. This is a database, not a bureaucracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Put renewals on a calendar with a 90-day review trigger.&lt;/strong&gt; Every AI contract gets a review date one quarter before auto-renewal. That's when you re-pull utilization and decide: renew, renegotiate, or consolidate.&lt;/p&gt;
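&lt;p&gt;Steps 5 and 6 can share one small data structure. The sketch below is one possible shape for a registry entry; the field names, tool names, owners, and renewal dates are all made up for illustration.&lt;/p&gt;

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ToolContract:
    """One registry row: what the tool is, who owns it, what data it may see."""
    name: str
    owner: str
    allowed_data: tuple
    renewal: date

    def review_date(self, lead_days: int = 90) -> date:
        # The review trigger sits lead_days before the auto-renewal date.
        return self.renewal - timedelta(days=lead_days)


# Hypothetical entries.
REGISTRY = [
    ToolContract("ide-assistant", "platform-team", ("source code",), date(2027, 1, 15)),
    ToolContract("chat-assistant", "it-procurement", ("public docs",), date(2026, 9, 1)),
]


def due_for_review(registry, today):
    """Names of contracts whose review date has arrived."""
    return [c.name for c in registry if today >= c.review_date()]


due = due_for_review(REGISTRY, date(2026, 6, 12))
print(due)
```

&lt;p&gt;A spreadsheet does the same job; what matters is that the review date is derived from the renewal date rather than remembered.&lt;/p&gt;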

&lt;h2&gt;
  
  
  Shadow AI Is the Bill You Can't See on Any Invoice
&lt;/h2&gt;

&lt;p&gt;The duplicate seats are the measurable problem. The unmeasurable one is worse: developers using personal-tier AI accounts for work because the official tool is missing, slow to provision, or worse than what they can get for $20 of their own money.&lt;/p&gt;

&lt;p&gt;Consumer-tier AI accounts typically lack the data processing agreements, audit logs, and training opt-out guarantees that enterprise tiers carry. Source code pasted into a personal chat account may be handled under consumer terms your security team has never reviewed. You can't fix this by blocking domains — developers will route around blocks, and you'll lose the visibility you had.&lt;/p&gt;

&lt;p&gt;The pragmatic fix is an amnesty: announce that anyone using an unapproved AI tool can register it in the intake process with no penalty, and commit to either provisioning an approved equivalent or fast-tracking an evaluation. The goal is to make the sanctioned path faster than the shadow path. Governance that's slower than the workaround isn't governance — it's theater.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-dev/ai-subscription-sprawl-enterprise-procurement-playbook/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>saas</category>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How to Build a Skills Library for Your AI Engineering Team</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Fri, 12 Jun 2026 02:05:01 +0000</pubDate>
      <link>https://dev.to/pickuma/how-to-build-a-skills-library-for-your-ai-engineering-team-jbj</link>
      <guid>https://dev.to/pickuma/how-to-build-a-skills-library-for-your-ai-engineering-team-jbj</guid>
      <description>&lt;p&gt;Walk around any engineering team using Claude Code or Cursor and you'll find the same pattern: one engineer has a prompt that reliably generates migration-safe database changes, another has a debugging workflow that cuts incident triage from an hour to fifteen minutes, and neither knows the other's trick exists. The knowledge lives in personal dotfiles, Slack threads, and muscle memory.&lt;/p&gt;

&lt;p&gt;A skills library fixes this. It's a version-controlled collection of reusable instructions (prompts, workflows, checklists, and helper scripts) that your AI coding assistant loads on demand. Claude Code formalized this with Agent Skills (a &lt;code&gt;SKILL.md&lt;/code&gt; file with YAML frontmatter, one per skill directory under &lt;code&gt;.claude/skills/&lt;/code&gt;), and Cursor has an equivalent in project rules (&lt;code&gt;.cursor/rules/*.mdc&lt;/code&gt;). The format matters less than the discipline: treat the way your team talks to AI tools as shared infrastructure, not personal preference.&lt;/p&gt;

&lt;p&gt;We've been running a skills library across two of our own repositories for several months, and this article covers what we'd tell a team starting from zero: how to decide what becomes a skill, how to write skills that survive contact with real codebases, and how to version and distribute them so they don't rot.&lt;/p&gt;

&lt;h2&gt;
  
  
  What belongs in a skills library (and what doesn't)
&lt;/h2&gt;

&lt;p&gt;The failure mode for most teams isn't writing too few skills — it's writing the wrong ones. A skill earns its place when it encodes a decision your team has already made and doesn't want to re-litigate in every session.&lt;/p&gt;

&lt;p&gt;Good candidates:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workflows with ordered steps.&lt;/strong&gt; "How we do database migrations" is a skill: check for existing migrations, generate with the project's CLI, write the down migration, run against a local copy before committing. An AI assistant following five explicit steps beats one improvising from training data every time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Project-specific conventions the model can't infer.&lt;/strong&gt; Your error-handling wrapper, your feature-flag client, the internal package that replaces a popular open-source one. Without a skill, the assistant reaches for the public library it saw most during training.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Checklists that gate risky actions.&lt;/strong&gt; Pre-deploy verification, secrets scanning before commit, accessibility passes on new UI. Skills are good at converting "things senior engineers remember" into "things every session does."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Repeated prompt patterns.&lt;/strong&gt; If three engineers have independently written near-identical prompts for generating API client code, that's a skill announcing itself.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What doesn't belong: anything the model already does well unprompted (don't write a "how to write a for loop" skill), one-off instructions for a single task, and content that changes weekly. A skill that's stale is worse than no skill — the assistant will follow outdated instructions with full confidence.&lt;/p&gt;

&lt;p&gt;There's also a structural distinction worth getting right early. Always-loaded context files (&lt;code&gt;CLAUDE.md&lt;/code&gt;, &lt;code&gt;AGENTS.md&lt;/code&gt;, Cursor's &lt;code&gt;alwaysApply&lt;/code&gt; rules) consume context window in every single session, so they should stay short — project layout, build commands, hard constraints. Skills load on demand when their description matches the task, so they can afford to be detailed. Teams that dump everything into one giant context file pay a per-session tax for instructions that apply to 5% of sessions.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Never put credentials, internal hostnames, or customer data in skill files. Skills get committed to git, synced across machines, and pasted into bug reports. Treat every skill file as if it will eventually be public — because once it's in a shared repo with a long history, it effectively is.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Writing skills that actually get used
&lt;/h2&gt;

&lt;p&gt;A skill has two jobs: get selected at the right moment, and produce the right behavior once loaded. Teams consistently underinvest in the first job.&lt;/p&gt;

&lt;p&gt;In Claude Code, the frontmatter &lt;code&gt;description&lt;/code&gt; field is what the model reads when deciding whether a skill applies. A description like "database stuff" will fire rarely and randomly. A description like "Use when creating or modifying database migrations, including schema changes, index additions, and data backfills" fires precisely. Write descriptions the way you'd write a function's docstring for a colleague who only reads docstrings: state the trigger conditions, name the keywords a user would actually type.&lt;/p&gt;
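&lt;p&gt;A description check like this can run in CI. The sketch below is our own heuristic, not part of Claude Code: the skill name &lt;code&gt;db-migrations&lt;/code&gt;, the word-count threshold, and the trigger phrases are assumptions, and the frontmatter parser only handles flat key/value pairs.&lt;/p&gt;

```python
SKILL_TEXT = """---
name: db-migrations
description: Use when creating or modifying database migrations, including schema changes, index additions, and data backfills
---
1. Check for existing migrations.
"""


def read_frontmatter(text: str) -> dict:
    """Parse flat key/value frontmatter between the first two --- lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def description_problems(description: str, min_words: int = 8) -> list:
    """Flag descriptions that are unlikely to state trigger conditions (heuristic)."""
    problems = []
    if min_words > len(description.split()):
        problems.append("too short to name trigger conditions")
    if not description.lower().startswith(("use when", "use for")):
        problems.append("does not open with a trigger phrase")
    return problems


fields = read_frontmatter(SKILL_TEXT)
print(description_problems(fields["description"]))
print(description_problems("database stuff"))
```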

&lt;p&gt;For the body of the skill, three rules have held up for us:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Write steps, not essays.&lt;/strong&gt; Numbered procedures with explicit verification points ("run the test suite; if it fails, stop and report") outperform paragraphs of philosophy. The assistant executes instructions; it doesn't absorb vibes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep one skill per concern.&lt;/strong&gt; A 400-line mega-skill covering migrations, deployments, and code review will partially apply to everything and fully apply to nothing. Split it. Smaller skills also produce smaller diffs in review, which matters once your library has multiple contributors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Include the negative space.&lt;/strong&gt; The most valuable lines in our own skills are prohibitions: "do NOT hardcode affiliate URLs," "do NOT skip the post-publish step." Models are eager to be helpful; telling them where helpfulness becomes harm is exactly the tribal knowledge a library should capture.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Progressive disclosure is the pattern that keeps this scalable: the &lt;code&gt;SKILL.md&lt;/code&gt; stays under a page, and detailed references — API schemas, long examples, helper scripts — live in adjacent files the assistant reads only when needed. Claude Code skills support bundling scripts directly in the skill directory, which turns "instructions for doing X" into "a tool that does X," a meaningful reliability upgrade for anything involving exact output formats.&lt;/p&gt;

&lt;h2&gt;
  
  
  Versioning and distribution: treat skills like code
&lt;/h2&gt;

&lt;p&gt;The distribution question is where most skills libraries quietly die. An engineer writes five great skills, shares them in Slack, four people copy them, and six weeks later there are four divergent copies and nobody knows which is current.&lt;/p&gt;

&lt;p&gt;The fix is boring and effective: &lt;strong&gt;skills live in git, and changes go through pull requests.&lt;/strong&gt; From there you have three distribution models, in increasing order of ceremony:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In-repo skills&lt;/strong&gt; (&lt;code&gt;.claude/skills/&lt;/code&gt; committed to the project repository). Zero distribution cost — everyone who clones the repo has them. Right answer for project-specific skills, which is most of them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A shared skills repository&lt;/strong&gt; that engineers clone into their personal skills directory (or wire in via symlink or git submodule). Right answer for cross-project skills: your code review checklist, your incident response workflow, your documentation style.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugins or internal packages&lt;/strong&gt; with explicit versions. Claude Code supports plugin marketplaces for exactly this; it's worth the ceremony once you're distributing skills across multiple teams and need to roll back a bad skill update the way you'd roll back a bad library release.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Whichever model you pick, apply the same hygiene you'd apply to a shared library: a CODEOWNERS entry so skill changes get reviewed by someone who uses them, a changelog (or at minimum descriptive commit messages), and periodic pruning. We do a quarterly pass and delete any skill nobody can remember firing — every stale skill is a chance for the assistant to confidently do the wrong thing.&lt;/p&gt;

&lt;p&gt;One more practice that pays off: keep a lightweight index document — a table of skill names, one-line purposes, and owners. New engineers read the index on day one and immediately know what the team has already solved. A Notion page or a &lt;code&gt;README.md&lt;/code&gt; at the repo root both work; the point is that discoverability is a feature you build, not one you get for free.&lt;/p&gt;
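&lt;p&gt;The index is easiest to keep current when it is generated. A minimal sketch, assuming you already have each skill's name, purpose, and owner available as data (the entries here are invented):&lt;/p&gt;

```python
# Hypothetical skill metadata; in practice this could come from each skill's frontmatter.
SKILLS = [
    {"name": "review-checklist", "purpose": "Pre-merge review steps", "owner": "eng-leads"},
    {"name": "db-migrations", "purpose": "Create and verify schema migrations", "owner": "data-platform"},
]


def build_index(skills) -> str:
    """Render a Markdown table of skills, sorted by name."""
    rows = ["| Skill | Purpose | Owner |", "| --- | --- | --- |"]
    for skill in sorted(skills, key=lambda s: s["name"]):
        rows.append(f"| {skill['name']} | {skill['purpose']} | {skill['owner']} |")
    return "\n".join(rows)


index_md = build_index(SKILLS)
print(index_md)
```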

&lt;p&gt;Rolling this out doesn't require a mandate. Start with three skills that solve real, recurring pain — a migration workflow, a review checklist, a release procedure — put them in the repo, and mention them when they would have helped. Adoption follows usefulness. The teams we've seen succeed treat the library the way they treat their CI config: owned by everyone, reviewed like code, and improved every time it fails someone.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-dev/build-ai-skills-library-engineering-team/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>tutorial</category>
      <category>productivity</category>
    </item>
    <item>
      <title>The Best Document Scanners for a Paperless Home Office in 2026</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Wed, 10 Jun 2026 02:11:48 +0000</pubDate>
      <link>https://dev.to/pickuma/the-best-document-scanners-for-a-paperless-home-office-in-2026-35f</link>
      <guid>https://dev.to/pickuma/the-best-document-scanners-for-a-paperless-home-office-in-2026-35f</guid>
      <description>&lt;p&gt;Going paperless stalls for most people at the same point: the friction of actually digitizing the pile. A phone scanning app works for the occasional receipt, but for a real stack — contracts, tax records, manuals — a dedicated document scanner with an automatic feeder turns an afternoon of tedium into a few minutes of feeding pages. For a developer who likes searchable, organized files over physical clutter, it's a genuinely useful tool. This guide covers what matters and which to buy in 2026.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;These picks are compiled from independent reviews and buyer consensus — not paid placements, and not a claim that we have personally long-term tested every model. Confirm current software support for your OS at the link before buying.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What actually matters in a document scanner
&lt;/h2&gt;

&lt;p&gt;For documents, the specs that matter are different from a photo scanner. The big ones:&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;automatic document feeder (ADF)&lt;/strong&gt; is the whole point — it pulls in a stack of pages so you're not scanning one sheet at a time. &lt;strong&gt;Duplex&lt;/strong&gt; (two-sided) scanning in a single pass roughly halves the time for double-sided documents. &lt;strong&gt;Speed&lt;/strong&gt;, measured in pages per minute, determines how painful a big batch is. And &lt;strong&gt;reliable feeding&lt;/strong&gt; — not jamming or pulling two pages at once — matters more than any spec sheet number, which is why feeder quality is where the better brands earn their price.&lt;/p&gt;

&lt;p&gt;Then there's &lt;strong&gt;software&lt;/strong&gt;: good OCR (optical character recognition) turns scanned images into searchable, selectable text, which is what makes a digital archive actually useful. One-button workflows that scan straight to a searchable PDF in a chosen folder remove the friction that otherwise kills paperless habits. Resolution barely matters for text — 300 dpi is plenty — so don't pay for high dpi you won't use.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A scanner that's fast on paper but misfeeds or pulls double pages will frustrate you into giving up. Feeding reliability is hard to read from specs and is exactly where established document-scanner brands justify their cost. Weight this over a slightly higher pages-per-minute number.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Best for most people
&lt;/h2&gt;

&lt;p&gt;The ScanSnap iX1600 is the perennial recommendation for good reason. It scans both sides quickly, the feeder is dependable, and the software makes one-touch scanning to a searchable PDF genuinely effortless — which is the part that keeps a paperless habit alive. It's not cheap, but it's the scanner people stop shopping after, and the experience justifies the premium for anyone digitizing regularly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best value
&lt;/h2&gt;

&lt;p&gt;If the ScanSnap's price is hard to justify, the Brother ADS series delivers most of the experience for less. You get duplex scanning, a touchscreen for on-device workflows, and capable software, in a compact body. The feeder and software polish aren't quite at the ScanSnap's level, but for typical home-office volumes it's a strong value that gets you to a searchable archive without overspending.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best for flexible software
&lt;/h2&gt;

&lt;p&gt;The Epson WorkForce line is the choice when you want more say over how scans are processed and where they go. Its software offers flexible control over formats, destinations, and settings, with reliable duplex scanning underneath. It sits between the value and premium picks, and it suits people who want to tune their scanning workflow rather than rely solely on one-button presets.&lt;/p&gt;

&lt;p&gt;A document scanner is the tool that finally makes "go paperless" stick, by removing the friction that defeats most attempts. Get the Fujitsu ScanSnap iX1600 if you'll use it regularly, the Brother ADS for value, and weight feeder reliability and OCR over resolution — for paper, a smooth workflow beats a big spec sheet.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-home/best-document-scanners-paperless-home-office-2026/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>The Best Laptop Stands for a Desk Setup in 2026</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Wed, 10 Jun 2026 02:10:32 +0000</pubDate>
      <link>https://dev.to/pickuma/the-best-laptop-stands-for-a-desk-setup-in-2026-5oj</link>
      <guid>https://dev.to/pickuma/the-best-laptop-stands-for-a-desk-setup-in-2026-5oj</guid>
      <description>&lt;p&gt;Working all day on a laptop sitting flat on a desk forces you into the posture that physical therapists see constantly: head tilted down, neck craned forward, shoulders rounded over a screen that's far too low. A laptop stand fixes the screen height, bringing the display up toward eye level so your neck stays neutral. There's one catch that makes or breaks the benefit, and it's the thing most people miss when they buy one. This guide covers it and the picks worth buying in 2026.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;These picks are compiled from independent reviews and buyer consensus — not paid placements, and not a claim that we have personally long-term tested every model. Confirm current compatibility with your laptop size at the link before buying.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The rule that makes a stand actually work
&lt;/h2&gt;

&lt;p&gt;Here's the thing people get wrong: raising your laptop to eye level also raises its built-in keyboard and trackpad to a height where typing wrecks your shoulders and wrists. So a laptop stand on its own doesn't fix your ergonomics — it trades a neck problem for a wrist-and-shoulder problem.&lt;/p&gt;

&lt;p&gt;The stand only helps &lt;strong&gt;when paired with an external keyboard and mouse&lt;/strong&gt;. The screen goes up to eye level for your neck; the external keyboard sits at the right height on the desk for your hands. That combination is the actual ergonomic fix. If you're not willing to add a separate keyboard and mouse, a stand will make typing worse, not better — so budget for the whole setup, not just the riser.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Lifting the laptop puts its keyboard at an uncomfortable height for your arms and shoulders. The stand fixes your neck only if your hands move to a separate keyboard at desk level. Plan to buy both together — the stand alone solves one posture problem by creating another.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Best fixed stand for a desk
&lt;/h2&gt;

&lt;p&gt;The Rain Design mStand is the classic desk recommendation. It's a solid single-piece aluminum riser — no wobble, no folding mechanisms to fail — that raises your screen to a comfortable height and looks at home next to a monitor. It's fixed-height, which is fine for a permanent desk where you've dialed in your seating. Paired with an external keyboard, it's a clean, durable answer to laptop neck strain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best adjustable and packable
&lt;/h2&gt;

&lt;p&gt;If you work from multiple places or want to set the exact screen height, a packable adjustable stand like the Roost is the better fit. It folds down small, weighs little, and adjusts to bring the screen precisely to your eye level wherever you are. The Nexstand is a similar, more affordable take on the same idea. Either turns any table into an ergonomic setup — again, with an external keyboard in your bag.&lt;/p&gt;

&lt;h2&gt;
  
  
  Best budget adjustable
&lt;/h2&gt;

&lt;p&gt;The Nexstand offers the packable, height-adjustable formula at a lower price than the Roost. It folds flat for a bag, sets to several heights, and holds a laptop securely. The build isn't quite as refined, but functionally it does the same job — raising your screen anywhere — and for many people the savings make it the sensible adjustable pick.&lt;/p&gt;

&lt;p&gt;A laptop stand is a simple fix for the neck-craning posture that long laptop days create — but only as half of a setup. Get the Rain Design mStand for a permanent desk or a Roost/Nexstand for flexibility, and pair either with an external keyboard and mouse, because raising the screen without moving your hands just relocates the problem.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-home/best-laptop-stands-desk-setup-2026/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Dragonfly vs Redis in 2026: Is It Really a Faster Drop-In Cache?</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Wed, 10 Jun 2026 02:09:16 +0000</pubDate>
      <link>https://dev.to/pickuma/dragonfly-vs-redis-in-2026-is-it-really-a-faster-drop-in-cache-290f</link>
      <guid>https://dev.to/pickuma/dragonfly-vs-redis-in-2026-is-it-really-a-faster-drop-in-cache-290f</guid>
      <description>&lt;p&gt;Dragonfly markets itself with one line that gets engineers to pay attention: a drop-in Redis replacement that scales vertically on a single machine. You point your existing client at it, keep your &lt;code&gt;GET&lt;/code&gt;/&lt;code&gt;SET&lt;/code&gt;/&lt;code&gt;HSET&lt;/code&gt; code untouched, and get more throughput per box because it uses every core instead of one.&lt;/p&gt;

&lt;p&gt;That pitch is mostly accurate, and that's exactly why it deserves a careful read rather than a benchmark screenshot. "Drop-in" and "faster" are doing a lot of work in that sentence, and whether either holds depends on what your cache actually does. Here's how the two compare in 2026, and how to decide if a migration is worth your weekend.&lt;/p&gt;

&lt;h2&gt;
  
  
  What "drop-in" actually means
&lt;/h2&gt;

&lt;p&gt;Dragonfly implements the RESP wire protocol, so the same &lt;code&gt;redis-cli&lt;/code&gt;, the same &lt;code&gt;ioredis&lt;/code&gt; or &lt;code&gt;redis-py&lt;/code&gt; client, and the same connection string work without code changes. It also speaks the Memcached text protocol. For the common path — strings, hashes, lists, sorted sets, expirations, pub/sub — you genuinely can swap the endpoint and move on.&lt;/p&gt;

&lt;p&gt;The gap shows up at the edges. Redis has a deep surface area: Lua scripting semantics, the module ecosystem (RediSearch, RedisJSON, RedisBloom, RedisTimeSeries), cluster-mode hash-slot behavior, and a long tail of less-common commands. Dragonfly covers a large fraction of the core command set and has added scripting and some search capabilities, but it is not a byte-for-byte clone. If your application leans on a specific Redis module or depends on exact &lt;code&gt;WAIT&lt;/code&gt;/replication or cluster semantics, "drop-in" stops being literal.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Before you migrate anything, grep your codebase and your dependencies for the actual commands and modules you use. A cache that calls &lt;code&gt;SET&lt;/code&gt;, &lt;code&gt;GET&lt;/code&gt;, &lt;code&gt;INCR&lt;/code&gt;, and &lt;code&gt;EXPIRE&lt;/code&gt; will move cleanly. One that depends on RediSearch indexes or a custom Lua script with edge-case behavior needs a real compatibility test against your workload — not the vendor's feature matrix.&lt;/p&gt;
&lt;/blockquote&gt;
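&lt;p&gt;That audit can start as a script. The sketch below only catches raw &lt;code&gt;execute_command("...")&lt;/code&gt; calls, so treat it as a first pass: client method calls such as &lt;code&gt;r.get()&lt;/code&gt; need their own pattern. The "core" set is an illustrative assumption, not Dragonfly's actual support matrix.&lt;/p&gt;

```python
import re

# Assumption: a small illustrative "safe" set, not a real compatibility list.
CORE_COMMANDS = {"GET", "SET", "INCR", "EXPIRE", "HSET", "HGET", "DEL"}
# Command prefixes used by the Redis modules named above
# (RediSearch, RedisJSON, RedisBloom, RedisTimeSeries).
MODULE_PREFIXES = ("FT.", "JSON.", "BF.", "TS.")

CALL_PATTERN = re.compile(r"""execute_command\(\s*["']([A-Za-z.]+)["']""")


def audit_commands(source: str):
    """Split the commands found in source into module commands and untested ones."""
    used = {name.upper() for name in CALL_PATTERN.findall(source)}
    module_cmds = sorted(c for c in used if c.startswith(MODULE_PREFIXES))
    needs_testing = sorted(used - CORE_COMMANDS - set(module_cmds))
    return module_cmds, needs_testing


SAMPLE = '''
r.execute_command("SET", "k", "v")
r.execute_command("FT.SEARCH", "idx", "hello")
r.execute_command("WAIT", 1, 100)
'''

module_cmds, needs_testing = audit_commands(SAMPLE)
print("module commands:", module_cmds)
print("needs a compatibility test:", needs_testing)
```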

&lt;h2&gt;
  
  
  Where the architectures diverge
&lt;/h2&gt;

&lt;p&gt;The reason Dragonfly can claim more throughput on one node comes down to threading. Redis executes commands on a single thread. That design is deliberate — it makes operations atomic without locks and keeps reasoning simple — and Redis added I/O threads in 6.0 to parallelize socket reads and writes. But command execution itself stays on one core. To use a 32-core machine fully with Redis, the standard answer is to run multiple instances or a cluster and shard across them.&lt;/p&gt;

&lt;p&gt;Dragonfly takes the opposite stance. It uses a shared-nothing, multi-threaded architecture: the keyspace is partitioned across threads, each thread owns its slice, and there's no shared lock on the hot path. On a large multi-core box, that lets a single Dragonfly process saturate cores that a single Redis process leaves idle. It also ships a different hash table implementation (Dash) tuned for memory efficiency and incremental resizing, which reduces the memory and latency spikes that growing a conventional hash table can cause.&lt;/p&gt;
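&lt;p&gt;The partitioning idea fits in a toy model. This is not Dragonfly's implementation: it runs on one thread, and routing by CRC32 is our assumption. It only shows why a key that always maps to one owner never needs a shared lock.&lt;/p&gt;

```python
import zlib


class Shard:
    """One slice of the keyspace; in a real engine, exactly one thread owns it."""

    def __init__(self):
        self.data = {}


class ShardedStore:
    def __init__(self, num_shards: int):
        self.shards = [Shard() for _ in range(num_shards)]

    def _owner(self, key: str) -> Shard:
        # A stable hash sends a given key to the same shard every time.
        return self.shards[zlib.crc32(key.encode()) % len(self.shards)]

    def set(self, key, value):
        self._owner(key).data[key] = value

    def get(self, key):
        return self._owner(key).data.get(key)


store = ShardedStore(num_shards=4)
for i in range(100):
    store.set(f"user:{i}", i)

print(store.get("user:42"), [len(s.data) for s in store.shards])
```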

&lt;p&gt;Snapshotting is the other meaningful difference. Redis &lt;code&gt;BGSAVE&lt;/code&gt; forks the process, and on a write-heavy instance copy-on-write can balloon memory during the snapshot. Dragonfly performs point-in-time snapshots without a full fork, which avoids that memory spike and makes persistence on large datasets less of a tightrope walk.&lt;/p&gt;

&lt;p&gt;A word on the throughput numbers you'll see. Dragonfly's team publishes benchmarks showing large multiples over a single Redis instance on high-core machines. Those are real measurements, but they compare one Dragonfly process against one Redis process — and one Redis process was never meant to use 32 cores. The honest comparison is Dragonfly versus a sharded Redis or Valkey cluster on the same hardware, and against that baseline the gap narrows. The win Dragonfly offers is less "Redis is slow" and more "you can collapse a cluster into a single node and cut operational complexity."&lt;/p&gt;

&lt;h2&gt;
  
  
  The license angle, because it changed
&lt;/h2&gt;

&lt;p&gt;Licensing is part of this decision now in a way it wasn't a few years ago. Redis moved off the BSD license in March 2024 to a dual RSALv2/SSPLv1 model, which prompted the Valkey fork under the Linux Foundation, backed by AWS, Google, and Oracle and kept BSD-licensed. Redis 8 in 2025 added an AGPLv3 option. Dragonfly ships under the Business Source License 1.1: source-available, with a use restriction against offering it as a competing managed service, converting to Apache 2.0 after a set period.&lt;/p&gt;

&lt;p&gt;Neither Redis's nor Dragonfly's current license is the plain BSD that Redis users assumed for a decade. If your legal or procurement team cares about OSI-approved open source specifically, Valkey is the cleanest path, and that's a separate conversation from raw performance. Pick your axis before you pick your datastore.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;If your only goal is to fully use a large instance and you want a BSD license, benchmark Valkey before Dragonfly. It's the closest thing to "old Redis," it's adding its own multi-threading work, and it sidesteps both the BSL restriction and the module-compatibility question.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  When to switch — and when not to
&lt;/h2&gt;

&lt;p&gt;Dragonfly is a strong fit when you're running a large single cache instance, you're CPU-bound rather than memory-bound, and your command usage stays inside the core set. Collapsing a six-node Redis cluster into one Dragonfly box is a genuine operational simplification: fewer moving parts, no client-side sharding logic, simpler failover.&lt;/p&gt;

&lt;p&gt;It's the wrong move when you depend on Redis modules, when you have battle-tested Lua scripts with subtle behavior, or when your cache is small enough that a single Redis thread already keeps up — which describes a large share of real workloads. A cache serving a few thousand operations per second on a 2-core container gains nothing from 32-way threading. Don't migrate for a benchmark you'll never hit.&lt;/p&gt;

&lt;p&gt;The right way to decide is empirical: stand Dragonfly up beside your current cache, replay production-shaped traffic, and measure p99 latency and throughput on your hardware with your command mix. The vendor's numbers tell you what's possible; only your replay tells you what you'll get.&lt;/p&gt;
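&lt;p&gt;When you compare replays, compute the tail the same way for both systems. A minimal sketch using the nearest-rank percentile; the latency samples are invented, and a real replay would produce far more than 100 of them.&lt;/p&gt;

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical replay latencies in milliseconds: mostly fast, with a slow tail.
latencies_ms = [0.4] * 97 + [0.9, 3.5, 12.0]

p50 = percentile(latencies_ms, 50)
p99 = percentile(latencies_ms, 99)
print(f"p50={p50} ms  p99={p99} ms")
```

&lt;p&gt;The median hides the three slow requests entirely, which is why p99 is the number to compare.&lt;/p&gt;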




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-dev/dragonfly-vs-redis-2026/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>devops</category>
      <category>cloud</category>
      <category>astro</category>
    </item>
    <item>
      <title>Time-Series Cross-Validation: Why Standard K-Fold Ruins Trading Models</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Wed, 10 Jun 2026 02:08:00 +0000</pubDate>
      <link>https://dev.to/pickuma/time-series-cross-validation-why-standard-k-fold-ruins-trading-models-29l7</link>
      <guid>https://dev.to/pickuma/time-series-cross-validation-why-standard-k-fold-ruins-trading-models-29l7</guid>
      <description>&lt;p&gt;If you've trained a machine-learning model on market data and gotten suspiciously good cross-validation scores, there's a good chance your validation was lying to you. The default cross-validation everyone reaches for — k-fold — does something catastrophic on time-series data: it shuffles the rows. On a trading model, shuffling means training on tomorrow to predict yesterday, and the result is a backtest that looks brilliant and fails the moment real time only moves forward. None of this is investment advice.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why k-fold breaks on time series
&lt;/h2&gt;

&lt;p&gt;Standard k-fold cross-validation splits your data into k folds, often after shuffling the rows, trains on all but one of them, and tests on the held-out fold, rotating until every row has been a test row. This is the right tool when samples are independent, like classifying unrelated images.&lt;/p&gt;

&lt;p&gt;Market data is not independent across time, and the order is the whole point. When k-fold shuffles, it scatters future observations into the training set and past observations into the test set. Even unshuffled, k-fold tests every fold but the last with a model trained partly on later data. Either way, your model gets to "learn" from data that, in reality, hadn't happened yet when the prediction needed to be made. That's look-ahead bias, baked directly into your validation procedure, and it inflates your scores because predicting the past using the future is easy and useless.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;In time series, when something happened matters as much as what happened. Any validation that ignores chronological order is implicitly assuming you could have known the future — which is exactly the assumption that makes a trading backtest worthless. Preserving time order isn't a nicety; it's the core requirement.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Walk-forward validation: testing the way you'd trade
&lt;/h2&gt;

&lt;p&gt;The fix is to validate the way you'd actually deploy: train on the past, test on the future, and never the reverse. This is walk-forward validation.&lt;/p&gt;

&lt;p&gt;You train on an initial window, test on the period immediately after it, then move the window forward and repeat. In an &lt;strong&gt;expanding-window&lt;/strong&gt; version, the training set grows to include everything up to each test period, mimicking an investor who uses all history to date. In a &lt;strong&gt;rolling-window&lt;/strong&gt; version, the training set is a fixed-length window that slides forward, which adapts to changing regimes by forgetting the distant past. Either way, every test period is strictly later than the data the model trained on, so no future observation leaks into training. The scores you get are a far more honest estimate of how the model would have performed forward in time.&lt;/p&gt;
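&lt;p&gt;The two window styles can be sketched in a few lines of plain Python. This is a minimal illustration that works on row positions only; the function name and parameters are made up for the example, and it assumes the rows are already sorted by time.&lt;/p&gt;

```python
def walk_forward_splits(n_samples, initial_train, test_size, rolling=False):
    """Yield (train_indices, test_indices) pairs in chronological order.

    rolling=False gives an expanding window; rolling=True keeps the
    training window at a fixed length of initial_train samples.
    """
    for test_start in range(initial_train, n_samples - test_size + 1, test_size):
        train_start = test_start - initial_train if rolling else 0
        train = list(range(train_start, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test

# Ten time-ordered rows, four to start training, two per test period.
for train, test in walk_forward_splits(10, initial_train=4, test_size=2):
    print("train", train, "test", test)
```

&lt;p&gt;Every training index is smaller than every test index in the same pair, which is the property that matters. If you use scikit-learn, its &lt;code&gt;TimeSeriesSplit&lt;/code&gt; class provides a comparable splitter, so you rarely need to hand-roll this outside of a teaching example.&lt;/p&gt;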

&lt;blockquote&gt;
&lt;p&gt;Walk-forward validation retrains the model once per window, so with many short windows it can cost far more compute than a handful of k-fold fits. That cost is the price of an honest answer. A fast validation that lies is worse than a slow one that tells the truth, so resist the temptation to shuffle just because it's quicker.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Purging and embargoing for overlapping labels
&lt;/h2&gt;

&lt;p&gt;There's a subtler leak that walk-forward alone doesn't fully close. If your labels are built from windows of time — say, "the return over the next five days" — then a training sample near the boundary of your test set overlaps in time with test samples, quietly sharing information across the split.&lt;/p&gt;

&lt;p&gt;The fix, popularized in the quant-ML literature, is &lt;strong&gt;purging&lt;/strong&gt; and &lt;strong&gt;embargoing&lt;/strong&gt;: remove training samples whose label windows overlap the test set (purging), and add a small gap after the test set before training resumes (embargoing), so adjacent-in-time leakage doesn't sneak through. If your features or labels look forward over any horizon, you need this; if each sample is genuinely point-in-time, plain walk-forward is enough.&lt;/p&gt;
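&lt;p&gt;A minimal sketch of both steps, assuming integer time indices and a label for sample &lt;code&gt;i&lt;/code&gt; that is built from times &lt;code&gt;i&lt;/code&gt; through &lt;code&gt;i + horizon&lt;/code&gt;. The function name and parameters are hypothetical, and real implementations usually work on timestamps rather than positions.&lt;/p&gt;

```python
def purged_train_indices(n_samples, test_start, test_end, horizon, embargo=0):
    """Return training indices with purging and an embargo applied.

    Sample i carries a label built from times i .. i + horizon (inclusive).
    The test block is test_start .. test_end (inclusive).
    """
    # Purge before the test block: samples whose label window reaches into it.
    first_blocked = max(0, test_start - horizon)
    # After the test block, test labels extend to test_end + horizon, so those
    # samples are purged too; the embargo then skips a few more.
    last_blocked = min(n_samples - 1, test_end + horizon + embargo)
    blocked = set(range(first_blocked, last_blocked + 1))
    return [i for i in range(n_samples) if i not in blocked]

# 30 daily samples, test on days 10..14, five-day labels, two-day embargo.
print(purged_train_indices(30, test_start=10, test_end=14, horizon=5, embargo=2))
```

&lt;p&gt;With these numbers the training set keeps days 0 to 4 and days 22 to 29. Day 5 is dropped because its five-day label reaches day 10, the first test day.&lt;/p&gt;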

&lt;p&gt;The unifying principle behind all of it is one sentence: information from the future must never touch the training set. K-fold violates it by ignoring time order; walk-forward respects it by construction; purging and embargoing patch the edge cases. Get this right and your validation scores become trustworthy. Get it wrong and you'll keep deploying models that were never as good as your notebook claimed.&lt;/p&gt;

&lt;p&gt;Validation is where most trading models are secretly broken, and shuffled k-fold is the usual culprit. Switch to walk-forward, add purging and embargoing when your labels overlap, and hold to the one rule that makes results trustworthy: the future never gets to teach the past.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-investor/time-series-cross-validation-trading-models/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>investing</category>
      <category>finance</category>
      <category>beginners</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Portfolio Optimization with PyPortfolioOpt: Mean-Variance in Practice</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Wed, 10 Jun 2026 02:06:44 +0000</pubDate>
      <link>https://dev.to/pickuma/portfolio-optimization-with-pyportfolioopt-mean-variance-in-practice-4d2j</link>
      <guid>https://dev.to/pickuma/portfolio-optimization-with-pyportfolioopt-mean-variance-in-practice-4d2j</guid>
      <description>&lt;p&gt;PyPortfolioOpt is the library that makes modern portfolio theory feel approachable: feed it price history, and a handful of lines returns the "optimal" portfolio weights on the efficient frontier. It's a great on-ramp to Markowitz mean-variance optimization for developers. But there's a famous gap between the elegance of the math and the fragility of the result, and using the library well means understanding why the naive answer is usually wrong — and which of its features exist specifically to rescue it. None of this is investment advice.&lt;/p&gt;

&lt;h2&gt;
  
  
  What mean-variance optimization does
&lt;/h2&gt;

&lt;p&gt;Markowitz's idea, which earned him a share of the 1990 Nobel Memorial Prize in Economic Sciences, is that you shouldn't pick assets in isolation. You should pick the &lt;em&gt;combination&lt;/em&gt; that gives the most expected return for a given level of risk, accounting for how assets move together. The output is the efficient frontier: the set of portfolios where you can't get more return without taking more risk.&lt;/p&gt;

&lt;p&gt;PyPortfolioOpt implements this directly. You give it historical prices; it estimates expected returns and a covariance matrix, then solves for the weights that optimize a chosen objective: maximum Sharpe ratio, minimum volatility, or minimum risk for a target return. In code it's almost trivial: compute expected returns, compute the covariance, hand both to an optimizer, and read off the weights. That accessibility is exactly why it's so widely used to learn the concepts.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Mean-variance optimization isn't just about which assets return the most — it's about how they co-move. Two assets that hedge each other can both earn a place in the portfolio that neither would alone. That interaction, captured by the covariance matrix, is the whole point of optimizing a portfolio rather than ranking assets.&lt;/p&gt;
&lt;/blockquote&gt;
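&lt;p&gt;To see the co-movement effect in the smallest possible case, here is the closed-form minimum-variance portfolio for two assets in plain Python. This is a teaching sketch, not PyPortfolioOpt's API: the function names and the volatility and correlation figures are invented for illustration.&lt;/p&gt;

```python
def min_variance_weights(vol_a, vol_b, corr):
    """Closed-form minimum-variance weights for two assets (shorting allowed)."""
    var_a, var_b = vol_a ** 2, vol_b ** 2
    cov = corr * vol_a * vol_b
    w_a = (var_b - cov) / (var_a + var_b - 2 * cov)
    return w_a, 1 - w_a

def portfolio_vol(w_a, w_b, vol_a, vol_b, corr):
    """Volatility of a two-asset portfolio."""
    variance = (w_a * vol_a) ** 2 + (w_b * vol_b) ** 2 + 2 * w_a * w_b * corr * vol_a * vol_b
    return variance ** 0.5

# Same two assets (20% and 30% volatility), three different correlations.
for corr in (0.9, 0.0, -0.5):
    w_a, w_b = min_variance_weights(0.20, 0.30, corr)
    vol = portfolio_vol(w_a, w_b, 0.20, 0.30, corr)
    print(f"corr={corr:+.1f}  weights=({w_a:.2f}, {w_b:.2f})  vol={vol:.3f}")
```

&lt;p&gt;The individual volatilities never change across the three runs; only the correlation does, and the weights and the achievable portfolio volatility move with it.&lt;/p&gt;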

&lt;h2&gt;
  
  
  Why the naive output is fragile
&lt;/h2&gt;

&lt;p&gt;Here's the catch that every practitioner learns the hard way: naive mean-variance optimization is an "error-maximizer." It takes your estimates of expected return and treats them as truth, then aggressively tilts the portfolio toward whatever asset your noisy estimate happened to rate highest. Small errors in the inputs produce wildly different, often absurd outputs — 90% in one asset, large short positions, weights that swing violently when you add a month of data.&lt;/p&gt;

&lt;p&gt;The root problem is that expected returns are extraordinarily hard to estimate from historical data; the past average is a terrible predictor of the future. The optimizer doesn't know your inputs are guesses — it optimizes them as if they were facts, and amplifies their errors. A portfolio that looks "optimal" on paper is often just a bet on your estimation noise.&lt;/p&gt;
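&lt;p&gt;The amplification is easy to reproduce by hand. The sketch below computes unconstrained two-asset weights proportional to the inverse covariance matrix times the expected returns, assuming a zero risk-free rate. The function name and all input numbers are hypothetical, chosen only to show the sensitivity.&lt;/p&gt;

```python
def unconstrained_weights(mu, vols, corr):
    """Two-asset weights proportional to inverse(cov) @ mu, scaled to sum to 1."""
    (mu_a, mu_b), (vol_a, vol_b) = mu, vols
    var_a, var_b, cov = vol_a ** 2, vol_b ** 2, corr * vol_a * vol_b
    # Inverse of a 2x2 covariance matrix, up to a common factor that cancels on scaling.
    raw_a = var_b * mu_a - cov * mu_b
    raw_b = var_a * mu_b - cov * mu_a
    total = raw_a + raw_b
    return raw_a / total, raw_b / total

vols, corr = (0.20, 0.20), 0.95  # two similar, highly correlated assets
print(unconstrained_weights((0.08, 0.08), vols, corr))  # identical return estimates
print(unconstrained_weights((0.09, 0.08), vols, corr))  # one estimate nudged up by a point
```

&lt;p&gt;With identical estimates the split is even. Raising one estimate by a single percentage point, well inside normal estimation noise, pushes the optimizer into a leveraged long position in one asset funded by shorting the other.&lt;/p&gt;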

&lt;blockquote&gt;
&lt;p&gt;The unconstrained max-Sharpe portfolio almost always looks spectacular on the data used to build it and disappoints afterward, because it has fit the noise in your historical returns. Be suspicious of any optimizer output with extreme or highly concentrated weights — that's the signature of fitting estimation error, not finding edge.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The techniques that make it usable
&lt;/h2&gt;

&lt;p&gt;PyPortfolioOpt's real value is that it ships the tools to tame this fragility — and using them is the difference between a toy and something defensible.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Shrink the covariance estimate.&lt;/strong&gt; Instead of the raw sample covariance, use a shrinkage estimator (Ledoit-Wolf is the standard, and the library includes it), which pulls noisy estimates toward a more stable structure and produces far better-behaved portfolios.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't trust raw expected returns.&lt;/strong&gt; Rather than feeding in historical mean returns, many practitioners use the minimum-volatility objective (which ignores expected-return estimates entirely) or express their views through a Black-Litterman model, which the library also supports, so that views are blended with a market-implied prior instead of taken at face value. Optimizing purely for low risk sidesteps the hardest-to-estimate input.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Constrain and regularize.&lt;/strong&gt; Add weight bounds (no single asset above some cap, no shorting if you don't want it) and L2 regularization, both of which the library supports, to keep the optimizer from producing the extreme, concentrated allocations that signal overfitting.&lt;/p&gt;
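&lt;p&gt;To make the shrinkage idea concrete, here is a plain-Python sketch that blends a sample covariance matrix with a scaled-identity target. It is a simplification: the shrinkage intensity is passed in as a fixed number, whereas Ledoit-Wolf estimates it from the data. The function name and the example matrix are assumptions made for illustration.&lt;/p&gt;

```python
def shrink_covariance(cov, intensity):
    """Blend a sample covariance matrix with a scaled-identity target.

    intensity=0 returns the sample matrix; intensity=1 returns the target.
    Ledoit-Wolf estimates the intensity from data; here it is a fixed input.
    """
    n = len(cov)
    mean_var = sum(cov[i][i] for i in range(n)) / n
    return [
        [
            (1 - intensity) * cov[i][j] + intensity * (mean_var if i == j else 0.0)
            for j in range(n)
        ]
        for i in range(n)
    ]

sample = [[0.040, 0.038], [0.038, 0.040]]  # implied correlation of 0.95
for row in shrink_covariance(sample, intensity=0.3):
    print([round(x, 4) for x in row])
```

&lt;p&gt;The variances stay where they were while the covariance term is pulled toward zero, which makes the matrix less extreme and the resulting weights less jumpy. In PyPortfolioOpt itself, the estimator is exposed as &lt;code&gt;risk_models.CovarianceShrinkage(prices).ledoit_wolf()&lt;/code&gt;, weight caps go in through the &lt;code&gt;weight_bounds&lt;/code&gt; argument of &lt;code&gt;EfficientFrontier&lt;/code&gt;, and the L2 penalty is added with &lt;code&gt;objective_functions.L2_reg&lt;/code&gt;; check the documentation for your installed version before relying on exact signatures.&lt;/p&gt;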

&lt;p&gt;Used this way — shrinkage on the covariance, humility about expected returns, sensible constraints — PyPortfolioOpt produces portfolios that are diversified and reasonably stable. Used as a black box that you trust to hand you the "optimal" answer, it produces confident-looking nonsense. The library is excellent; the discipline is on you.&lt;/p&gt;

&lt;p&gt;PyPortfolioOpt lowers the barrier to portfolio optimization, which is both its gift and its hazard. The math is sound and the code is clean — but the difference between a fragile toy and a usable tool is entirely in whether you apply shrinkage, constraints, and skepticism about your own return estimates.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-investor/portfolio-optimization-pyportfolioopt-mean-variance/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>investing</category>
      <category>finance</category>
      <category>beginners</category>
      <category>productivity</category>
    </item>
    <item>
      <title>How to Handle a Take-Home Coding Assignment Without Burning a Weekend</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Wed, 10 Jun 2026 02:05:28 +0000</pubDate>
      <link>https://dev.to/pickuma/how-to-handle-a-take-home-coding-assignment-without-burning-a-weekend-89k</link>
      <guid>https://dev.to/pickuma/how-to-handle-a-take-home-coding-assignment-without-burning-a-weekend-89k</guid>
      <description>&lt;p&gt;A take-home is the one interview stage where nobody is watching you type, and that is exactly why candidates lose it. You get a prompt, an open-ended deadline, and a blank repo. The trap is treating it like a personal project with infinite scope instead of a graded exercise with a rubric you cannot see. We have read enough of these briefs to notice the pattern: the people who pass are not the ones who write the most code. They are the ones who read the prompt correctly and stop on time.&lt;/p&gt;

&lt;p&gt;This is a playbook for doing that. It assumes you have a real job, limited evenings, and no desire to spend 14 hours proving you can build a feature you will never ship.&lt;/p&gt;

&lt;h2&gt;
  
  
  Read the brief like a spec, not a suggestion
&lt;/h2&gt;

&lt;p&gt;Before you open an editor, separate the prompt into three buckets: hard requirements, soft signals, and noise. A hard requirement is anything phrased as "must," "the API should return," or a concrete input/output example. A soft signal is "we value clean code" or "bonus points for tests." Noise is the framing story about the fictional company.&lt;/p&gt;

&lt;p&gt;Most briefs include an explicit or implied time box. If it says "this should take about 3 hours," that number is the real constraint, not the feature list. Reviewers calibrate their expectations to that number. Shipping a sprawling solution that clearly took ten hours signals that you either cannot estimate or cannot stop, and both are things they are screening for. If no time box is given, set your own and write it down: most companies expect somewhere between 2 and 5 hours of focused work for an early-stage take-home.&lt;/p&gt;

&lt;p&gt;The single most useful move at this stage is to email back one or two scoping questions. "Should the API handle pagination, or is returning the full list fine for this exercise?" does two things. It de-risks your build, and it shows the reviewer you think about scope before writing code, which is most of what the job actually is.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Write your interpretation of the requirements as a short checklist in the README before you write any code. If you misread the brief, a reviewer can see your reasoning and grade the thinking, not just the broken output. A wrong answer with visible reasoning beats a wrong answer with none.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Build the boring 80 percent first
&lt;/h2&gt;

&lt;p&gt;Start with the thinnest version that satisfies every hard requirement, end to end, and commit it. A working solution that handles the happy path is worth more than a half-finished clever architecture. Reviewers run your code first; if it does not start, the rest of your decisions never get seen.&lt;/p&gt;

&lt;p&gt;Resist the two failure modes that eat take-homes alive. The first is gold-plating: adding a caching layer, a plugin system, or a config abstraction for a single use case. This is the exact overengineering a senior reviewer is trained to flag. If you would not add it to a real PR for this size of task, do not add it here. The second is premature breadth: starting five features and finishing none. Depth on the core requirement beats a tour of everything you know.&lt;/p&gt;

&lt;p&gt;Once the core works, spend your remaining budget on the things that signal craft cheaply: a few meaningful tests on the logic that matters, clear error messages on bad input, and named functions over clever one-liners. Commit in small, labeled steps. Your git history is part of the submission whether you intend it or not, and "initial commit" containing 600 lines reads very differently from a sequence of focused commits.&lt;/p&gt;

&lt;p&gt;AI assistants are fair game on almost every take-home now, and reviewers assume you used one. The differentiator is judgment, not abstinence. A tool like Cursor is genuinely faster for scaffolding boilerplate, generating test cases, and explaining an unfamiliar library, which frees your limited time for the decisions that actually get graded. What gets candidates rejected is shipping AI output they cannot explain: if you cannot defend a line in the follow-up call, delete it or rewrite it until you can.&lt;/p&gt;

&lt;h2&gt;
  
  
  Write the README that does your arguing for you
&lt;/h2&gt;

&lt;p&gt;The README is where take-homes are won, and most candidates treat it as an afterthought. The reviewer reads it before and after running your code, and it is your only chance to explain decisions they would otherwise have to guess at.&lt;/p&gt;

&lt;p&gt;Keep it short and answer four questions. How do I run this (exact commands, assume nothing)? What did you build versus skip, and why? What tradeoffs did you make under the time box? What would you do with another day? That last question is disproportionately powerful: it lets you name the missing test coverage, the error case you stubbed, or the scaling concern you saw but chose not to solve. Naming a limitation reads as senior. Hiding it and hoping reads as junior, and reviewers notice the difference immediately.&lt;/p&gt;

&lt;p&gt;Be honest about time. If you went over, say so and say where. "I spent extra time on the parser because the input format was ambiguous" is useful signal. Pretending a six-hour build took two helps no one and tends to show up in the code anyway.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Do not submit a private repo link without checking access, and do not paste a take-home prompt into a public forum or a public AI chat history. Some companies reuse prompts and watch for leaks, and a leaked brief can disqualify you. Keep the work in a private repo or a zip, exactly as the instructions request.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Before you submit, do one cold read. Clone your own repo into a fresh directory, follow your own README line by line, and confirm it runs from scratch. The number of submissions that fail on a missing dependency or an uncommitted file is high enough that simply checking puts you ahead of a real fraction of the pool.&lt;/p&gt;

&lt;h2&gt;
  
  
  A realistic time split
&lt;/h2&gt;

&lt;p&gt;For a 4-hour budget, a workable allocation is roughly 30 minutes reading and scoping, 2 hours on the core happy path, 45 minutes on tests and error handling, and 30 minutes on the README and the cold-read check, which leaves about 15 minutes of slack for whatever overruns. Scale the proportions, not the structure, for shorter or longer briefs. The reading and the README never shrink to zero, because they are the parts that make the rest legible.&lt;/p&gt;
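&lt;p&gt;If you want the scaled numbers rather than an estimate, the arithmetic fits in a few lines. The four phases listed above add up to 225 minutes; the helper below, whose name and phase labels are invented for this example, stretches or shrinks them proportionally to whatever budget you set.&lt;/p&gt;

```python
# Minutes per phase, taken from the 4-hour example in the text.
BASE_SPLIT_MIN = {
    "read and scope": 30,
    "core happy path": 120,
    "tests and error handling": 45,
    "README and cold read": 30,
}

def scale_split(budget_min, base=BASE_SPLIT_MIN):
    """Scale each phase in proportion to a new total budget, in whole minutes."""
    total = sum(base.values())
    return {phase: round(budget_min * minutes / total) for phase, minutes in base.items()}

# A brief that says "about 3 hours".
print(scale_split(180))
```

&lt;p&gt;Because each phase is rounded separately, the scaled minutes can land a minute or two off the budget for awkward totals, which is well within the precision this kind of planning needs.&lt;/p&gt;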

&lt;p&gt;The meta-skill a take-home tests is not raw coding speed. It is whether you can take an ambiguous request, scope it to a deadline, ship the part that matters, and communicate what you chose to leave out. That is the job. Treat the assignment as a small, honest demonstration of exactly that, and you will spend fewer evenings on it while passing more of them.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-junior/how-to-handle-a-take-home-coding-assignment/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Whimsical AI Review: Editable Diagrams and Flowcharts From a Prompt</title>
      <dc:creator>pickuma</dc:creator>
      <pubDate>Wed, 10 Jun 2026 02:04:12 +0000</pubDate>
      <link>https://dev.to/pickuma/whimsical-ai-review-editable-diagrams-and-flowcharts-from-a-prompt-463o</link>
      <guid>https://dev.to/pickuma/whimsical-ai-review-editable-diagrams-and-flowcharts-from-a-prompt-463o</guid>
      <description>&lt;p&gt;You describe a process in a sentence, and a few seconds later you have a flowchart you can actually edit — not a screenshot, but real nodes you can drag, relabel, and reconnect. That is the pitch behind Whimsical AI, and it is a different promise from the image generators that hand you a flat PNG you cannot touch.&lt;/p&gt;

&lt;p&gt;We ran Whimsical AI through the kinds of diagrams developers draw most: onboarding flows, request lifecycles, retry logic, rough architecture sketches, and planning mind maps. Here is where it earns its keep and where you will still reach for the cursor.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Whimsical AI actually generates
&lt;/h2&gt;

&lt;p&gt;Whimsical is a visual workspace — flowcharts, mind maps, wireframes, sticky notes, and docs on one canvas. The AI layer sits on top of that. You type a prompt like "flowchart for password reset including email verification and rate limiting," and it drops a structured diagram onto the canvas: labeled boxes, decision diamonds, and directional arrows, all placed as native Whimsical objects.&lt;/p&gt;

&lt;p&gt;The detail that matters is that the output is editable. When the model misnames a step or routes an arrow the wrong way, you fix it in place the same way you would fix anything you drew by hand. There is no re-prompting just to move one box. That single property is what separates a prompt-to-diagram tool from a text-to-image tool wearing a flowchart costume.&lt;/p&gt;

&lt;p&gt;Three generation modes are worth knowing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flowcharts&lt;/strong&gt; from a text description of a process or branching logic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mind maps&lt;/strong&gt; that expand a single topic into a tree of branches — useful for breaking down a feature or scoping a project before you commit to structure.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In-doc drafting&lt;/strong&gt;, where the AI helps write inside Whimsical's document blocks rather than on the canvas.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The mind-map-to-structure path is the one we kept coming back to. Starting from a messy brain dump, asking for a mind map, then reshaping the branches is faster than staring at a blank canvas trying to name your first node.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where prompt-to-diagram breaks down
&lt;/h2&gt;

&lt;p&gt;The model produces a confident first draft, and confidence is exactly the failure mode to watch for. It does not know your system. When you ask for an "authentication flow," it returns &lt;em&gt;a&lt;/em&gt; plausible authentication flow — a generic one — not yours. If your real flow has a quirk (a legacy token path, a feature flag, an out-of-band step), the AI will not invent it, and it will not flag the omission. You have to know what is missing.&lt;/p&gt;

&lt;p&gt;Quality also degrades with density. Linear and moderately branched flows come out clean. Dense diagrams — many nodes, many crossing edges, the kind of architecture map with twelve services and a message bus — arrive tangled, with overlapping boxes and arrows that need manual untangling before the picture reads. At that point you are editing more than you are generating.&lt;/p&gt;

&lt;p&gt;Naming is generic unless you spell it out. "User submits form" instead of "User submits checkout," "Check condition" instead of "Validate coupon code." The fix is prompt specificity: name the steps you care about in the prompt, and the diagram comes back closer to usable.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Treat an AI-generated diagram as a draft, not a source of truth. If the flowchart will guide a real implementation or onboard a new engineer, walk every node and edge against the actual code or runbook before you share it. A wrong-but-tidy diagram is more dangerous than no diagram, because people trust pictures.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Practically, the workflow that held up was: prompt for the skeleton, then spend two or three minutes correcting names, adding the steps the model could not know about, and straightening the layout. That is still far faster than drawing from zero — but it is editing, not magic, and budgeting for the edit pass is the difference between a tool that helps and one that disappoints.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who it fits, and the alternatives
&lt;/h2&gt;

&lt;p&gt;Whimsical AI is a strong fit if you already think in diagrams and want to skip the cold-start cost of the first ten boxes. It is good for documentation diagrams, planning sessions, and explaining a flow to teammates who would rather see a picture than read a paragraph. The editable-output model means you are never fighting the tool to make a small correction.&lt;/p&gt;

&lt;p&gt;It is a weaker fit if you need precise architecture diagrams with strict notation, or if your diagrams live in version control as code. For those, a text-based approach like Mermaid or PlantUML keeps the diagram diffable and reviewable in a pull request, which a canvas tool cannot match.&lt;/p&gt;

&lt;p&gt;If your diagrams mostly live inside written documents — specs, runbooks, project pages — and you would rather not run a second tool, a connected docs workspace can cover the lighter cases. Notion supports simple diagrams and has AI for drafting the surrounding text, so a flow that is "three boxes and a paragraph" may not need a dedicated canvas at all.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Write prompts the way you would brief a junior engineer: name the specific steps, the decision points, and the order. "Flowchart for checkout: cart, address, payment auth (retry on failure, max 3), confirmation email" produces a far more usable first draft than "checkout flowchart." The model fills gaps with generic guesses, so the more you specify, the less you clean up.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The honest summary: Whimsical AI removes the blank-canvas tax and gives you editable output, which is the right design for a diagramming assistant. It does not remove the need to know your own system or to review what it produces. Used as a fast first-draft engine with a deliberate edit pass, it saves real time. Used as an oracle, it will hand you a clean diagram of the wrong thing.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://pickuma.com/for-pm/whimsical-ai-review-diagrams-flowcharts-from-a-prompt/?utm_source=devto&amp;amp;utm_medium=crosspost&amp;amp;utm_campaign=blog" rel="noopener noreferrer"&gt;pickuma.com&lt;/a&gt;. Subscribe to &lt;a href="https://pickuma.com/rss.xml" rel="noopener noreferrer"&gt;the RSS&lt;/a&gt; or follow &lt;a href="https://bsky.app/profile/pickuma.bsky.social" rel="noopener noreferrer"&gt;@pickuma.bsky.social&lt;/a&gt; for new reviews.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
