<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Evan-dong</title>
    <description>The latest articles on DEV Community by Evan-dong (@evan-dong).</description>
    <link>https://dev.to/evan-dong</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3805708%2F6a9f71a4-d7de-4c0a-8ff7-ba23c9b2486a.png</url>
      <title>DEV Community: Evan-dong</title>
      <link>https://dev.to/evan-dong</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/evan-dong"/>
    <language>en</language>
    <item>
      <title>I Went Through 60 Curated Claude Fable 5 Cases So You Don't Have To: Here's What's Actually Useful</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Thu, 11 Jun 2026 10:11:59 +0000</pubDate>
      <link>https://dev.to/evan-dong/-i-went-through-60-curated-claude-fable-5-cases-so-you-dont-have-to-heres-whats-actually-1gbd</link>
      <guid>https://dev.to/evan-dong/-i-went-through-60-curated-claude-fable-5-cases-so-you-dont-have-to-heres-whats-actually-1gbd</guid>
      <description>&lt;p&gt;Every model launch buries you in the same noise: Twitter threads saying "this just works," demo clips with no numbers, benchmark pages that disagree with each other. When you actually want to know &lt;em&gt;will this work for my task, and what will it cost&lt;/em&gt;, the signal is scattered across a dozen places.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffxu6nqilg1stv979o3mi.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffxu6nqilg1stv979o3mi.jpeg" alt=" " width="799" height="421"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So I went through &lt;code&gt;awesome-claude-fable-5&lt;/code&gt;, a curated list of 60 real-world Claude Fable 5 cases, each logged with its source, evidence type, and date. The repo's stated principle is the reason I bothered: &lt;strong&gt;concrete evidence over hype&lt;/strong&gt; — reproducible prompts, demos, cost data, and caveats, not vibes.&lt;/p&gt;

&lt;p&gt;Here's how it's organized and the cases worth your time.&lt;/p&gt;

&lt;h2&gt;
  
  
  How each case is logged
&lt;/h2&gt;

&lt;p&gt;Every entry records the same four fields, which is what makes the list scannable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input&lt;/strong&gt; — prompt, screenshot, codebase, logs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Process&lt;/strong&gt; — which tool/model, how it was run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output&lt;/strong&gt; — result plus time, tokens, cost&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Evidence type&lt;/strong&gt; — Demo / Tutorial / Evaluation / Integration&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;60 cases across 8 categories: coding, agents &amp;amp; long-running automation, games &amp;amp; interactive demos, visual/design/video/3D, documents &amp;amp; research, tutorials &amp;amp; prompt resources, platform/API integration, and evaluations/comparisons/limits.&lt;/p&gt;

&lt;h2&gt;
  
  
  Coding: screenshot → working clone
&lt;/h2&gt;

&lt;p&gt;One case pastes a single GitHub UI screenshot and asks for a working clone. Reported: ~10 minutes of coding, ~$4.07. The category also covers a single-file macOS-style web OS and large-PR reviews. Good for gauging how far screenshot-to-frontend actually goes today.&lt;/p&gt;

&lt;h2&gt;
  
  
  Agents: real incident diagnosis, not toy demos
&lt;/h2&gt;

&lt;p&gt;The case that stuck with me: a self-hosted infra outage diagnosed by reading pod logs, querying Cloud SQL error logs, and comparing image digests — landing on "Kubernetes ran a 3-month-old cached image against a freshly migrated DB," then opening a fix PR in minutes, no web search.&lt;/p&gt;

&lt;p&gt;Another rebuilt a stalled site (GPT-5.5 and Claude 4.8 had plateaued at 85–90% even with a Figma reference) by going to the original Webflow source, pulling assets, and reproducing it nearly in one pass.&lt;/p&gt;

&lt;h2&gt;
  
  
  The pattern worth stealing: "Relay"
&lt;/h2&gt;

&lt;p&gt;This recurs across the repo and it's the most practical takeaway: &lt;strong&gt;plan, architect, and review with the expensive model; route high-volume implementation to cheaper ones&lt;/strong&gt; (4.8, GPT-5.5, Sonnet 4.6). One case phrases it as "think with Fable 5, build with a cheaper model, review with Fable 5." If you care about cost, this one idea pays for the read.&lt;/p&gt;

&lt;h2&gt;
  
  
  The limits are documented too (this is the good part)
&lt;/h2&gt;

&lt;p&gt;The repo doesn't hide the failures. One comparison ran the same Physarum simulation on both stacks: GPT-5.5 (Codex) finished in 17 min for ~$6; Fable 5 (Cursor) took 40+ min and &lt;strong&gt;$360.55&lt;/strong&gt;. That is roughly a 60x cost gap on the &lt;em&gt;same task&lt;/em&gt;. There's also a safety-routing mechanism: cyber/bio-chem/distillation queries get routed to Opus 4.8 instead of refused, triggering in &amp;lt;5% of sessions on average (measured 2–9% depending on the benchmark).&lt;/p&gt;

&lt;h2&gt;
  
  
  Reproducible prompts you can copy
&lt;/h2&gt;

&lt;p&gt;The tutorials section ships full prompts, not summaries: a DEVICE HARDENING security-pass prompt (model self-check → inventory → 11-category walkthrough → report), a &lt;code&gt;/goal&lt;/code&gt; site design-audit prompt with P0–P3 tagging and output paths, and a 4-phase Repo Audit prompt. Worth reading just for prompt-engineering structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trying it yourself
&lt;/h2&gt;

&lt;p&gt;Each case is meant to be reproducible, and the repo includes a &lt;code&gt;curl&lt;/code&gt; example against an EvoLink-compatible Messages API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;--request&lt;/span&gt; POST &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--url&lt;/span&gt; https://direct.evolink.ai/v1/messages &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s1"&gt;'Authorization: Bearer &amp;lt;token&amp;gt;'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s1"&gt;'Content-Type: application/json'&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--data&lt;/span&gt; &lt;span class="s1"&gt;'{ "model": "claude-fable-5", "max_tokens": 1024, "messages": [{ "role": "user", "content": "Hello, world" }] }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
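
&lt;p&gt;For scripting, the same request can be assembled in Python. This is a minimal sketch using only the standard library. It reuses the endpoint and model name from the &lt;code&gt;curl&lt;/code&gt; example; &lt;code&gt;build_request&lt;/code&gt; is a name made up here, and the token is a placeholder. It builds the request without sending it, so you can inspect it first.&lt;/p&gt;

```python
import json
import urllib.request

# Endpoint and model name are copied from the curl example above.
API_URL = "https://direct.evolink.ai/v1/messages"


def build_request(token: str, prompt: str, max_tokens: int = 1024) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "model": "claude-fable-5",
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder token; nothing is sent over the network here.
req = build_request("YOUR_TOKEN", "Hello, world")
print(req.get_method(), req.full_url)
```

&lt;p&gt;Passing the result to &lt;code&gt;urllib.request.urlopen&lt;/code&gt; would send it. The response shape is not shown in the repo excerpt, so parse it defensively.&lt;/p&gt;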



&lt;p&gt;The repo also lists the officially reported benchmark scores, with sources: SWE-Bench Pro 80.3%, Terminal-Bench 2.1 88.0%, OSWorld-Verified 85.0%.&lt;/p&gt;

&lt;h2&gt;
  
  
  One honest caveat
&lt;/h2&gt;

&lt;p&gt;Almost every number here is &lt;strong&gt;creator-reported&lt;/strong&gt; — pulled from X posts and public demos, curated &lt;em&gt;with&lt;/em&gt; its source rather than independently re-verified. The Physarum gap shows how much tool/setup/timing swings results. Treat the figures as a reference range, not gospel.&lt;/p&gt;

&lt;p&gt;You can browse &lt;a href="https://github.com/EvoLinkAI/awesome-claude-fable-5" rel="noopener noreferrer"&gt;the full 60-case list on GitHub&lt;/a&gt; (Korean README included).&lt;/p&gt;

&lt;p&gt;And if you want to reproduce a case and test the model directly, you can &lt;a href="https://evolink.ai/claude-fable-5?utm_source=community&amp;amp;utm_medium=devto&amp;amp;utm_campaign=awesome_claude_fable_5" rel="noopener noreferrer"&gt;try claude-fable-5 on EvoLink&lt;/a&gt;.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>10 Claude Code Settings I Wish I'd Configured on Day One</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Wed, 10 Jun 2026 09:44:31 +0000</pubDate>
      <link>https://dev.to/evan-dong/-10-claude-code-settings-i-wish-id-configured-on-day-one-3f57</link>
      <guid>https://dev.to/evan-dong/-10-claude-code-settings-i-wish-id-configured-on-day-one-3f57</guid>
      <description>&lt;p&gt;I used Claude Code daily for six months before I admitted the obvious: the defaults were costing me 10–15 hours a month. Not because the tool is bad — because I never configured it. The agent would wander off, forget decisions mid-session, or confidently rewrite code I hadn't asked it to touch, and I'd clean up after it.&lt;/p&gt;

&lt;p&gt;This isn't a "what is Claude Code" tutorial. It's the settings I landed on, written for people who already use it and want more out of it. Most sections include copy-paste config.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Cap CLAUDE.md at 8,000 characters
&lt;/h2&gt;

&lt;p&gt;My biggest early mistake was dumping everything into &lt;code&gt;CLAUDE.md&lt;/code&gt;. By March it was 40,000 characters. I assumed more context meant better output. Wrong, and the reason is structural: language models tend to recall information from the middle of a long context less reliably than information near the start or end. Past a certain size the agent stops &lt;em&gt;seeing&lt;/em&gt; middle-of-file rules. I asked it to reproduce a rule from line 300 of its own &lt;code&gt;CLAUDE.md&lt;/code&gt;; it couldn't.&lt;/p&gt;

&lt;p&gt;Hard cap the main file at 8,000 characters. Everything else goes into files loaded on demand:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.claude/
├── CLAUDE.md              # always-on context, ≤ 8,000 chars
├── skills/
│   ├── nestjs.md          # NestJS session rules
│   ├── migration.md       # TS migration rules
│   └── testing.md         # test strategy
└── specs/
    ├── domain-model.md
    ├── api-contracts.md
    └── infra.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;CLAUDE.md&lt;/code&gt; holds only what every session needs: stack, hard prohibitions, links to specs and skills. After the refactor: 6,000 chars, more accurate answers, ~35% lower session-init token cost.&lt;/p&gt;
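
&lt;p&gt;A cap is easier to keep when something checks it. The following is a minimal sketch, assuming the file lives at &lt;code&gt;.claude/CLAUDE.md&lt;/code&gt; as in the tree above. The function names &lt;code&gt;size_report&lt;/code&gt; and &lt;code&gt;check_file&lt;/code&gt; are made up for illustration and are not part of Claude Code.&lt;/p&gt;

```python
from pathlib import Path

CHAR_LIMIT = 8000  # the cap suggested in this section


def size_report(text: str, limit: int = CHAR_LIMIT) -> dict:
    """Return the character count and whether it stays within the limit."""
    count = len(text)
    return {"chars": count, "limit": limit, "within_limit": limit >= count}


def check_file(path: str, limit: int = CHAR_LIMIT) -> dict:
    """Read a file as UTF-8 and report its size against the limit."""
    return size_report(Path(path).read_text(encoding="utf-8"), limit)
```

&lt;p&gt;Calling &lt;code&gt;check_file(".claude/CLAUDE.md")&lt;/code&gt; from a pre-commit script or CI step, and failing when &lt;code&gt;within_limit&lt;/code&gt; is false, would keep the file from creeping back up.&lt;/p&gt;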

&lt;h2&gt;
  
  
  2. settings.json — the permissions that change your rhythm
&lt;/h2&gt;

&lt;p&gt;Most devs never open &lt;code&gt;settings.json&lt;/code&gt;. Add this and read/test commands stop interrupting you:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"permissions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"allow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git log:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git diff:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git status:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(npm test:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(npm run lint:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(npm run typecheck:*)"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"deny"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git push:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(git reset --hard:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Bash(rm -rf:*)"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Interruptions per session went from 15–20 to 2–3. &lt;code&gt;deny&lt;/code&gt; outranks &lt;code&gt;allow&lt;/code&gt;, so dangerous commands stay blocked. Commit &lt;code&gt;.claude/settings.json&lt;/code&gt; and the whole team gets the same guardrails.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. acceptEdits — enable it selectively, not globally
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;acceptEdits&lt;/code&gt; removes per-edit confirmations. It helps for a specific class of work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Enable for:&lt;/strong&gt; type migrations (JS→TS, fixing &lt;code&gt;any&lt;/code&gt;), mechanical fixes (renames, import swaps, formatting), test generation against existing logic, docs/comments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Don't enable for:&lt;/strong&gt; new features, architectural refactors, auth/payment/data-layer changes, or any task where you can't state the "done" criteria in 10 seconds.&lt;/p&gt;

&lt;p&gt;Real example: I enabled it for "clean up unused imports." The agent tidied the imports, then decided to rewrite a few services while it was in there. The rewrites were valid but worse than our existing patterns, and they cost 30 unplanned minutes of review. My rule: &lt;code&gt;acceptEdits&lt;/code&gt; only when automated tests can catch the mistake.&lt;/p&gt;

&lt;h2&gt;
  
  
  4. Hooks as guardrails — the 3 that earned their keep
&lt;/h2&gt;

&lt;p&gt;Hooks are the most underrated feature. They are shell commands that run at lifecycle events, such as before or after a tool call, and they receive JSON context on stdin.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stop hook — log every session:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# ~/.claude/hooks/stop.sh&lt;/span&gt;
&lt;span class="nv"&gt;TASK&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt; /tmp/claude_current_task 2&amp;gt;/dev/null &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"no task recorded"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="s1"&gt;'+%Y-%m-%d %H:%M'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt; | &lt;/span&gt;&lt;span class="nv"&gt;$TASK&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; ~/.claude/session_history.log
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A month of this gave me an honest report: ~2h15m/day with the agent, 4–5 sessions. I thought I used it "occasionally." The data disagreed.&lt;/p&gt;
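
&lt;p&gt;The log is plain text, so summarizing it takes only a few lines. This is a minimal sketch, assuming each line has the &lt;code&gt;date time | task&lt;/code&gt; shape that &lt;code&gt;stop.sh&lt;/code&gt; writes. The sample entries are invented for illustration, and each log line is counted as one session.&lt;/p&gt;

```python
from collections import Counter


def sessions_per_day(log_lines):
    """Count log entries per calendar day.

    Each line is expected to look like 'YYYY-MM-DD HH:MM | task text'.
    """
    counts = Counter()
    for line in log_lines:
        stamp, sep, _task = line.partition(" | ")
        if not sep:
            continue  # skip lines that do not match the expected format
        day = stamp.strip().split(" ")[0]
        counts[day] += 1
    return dict(counts)


# Invented sample entries, only to show the input shape.
sample = [
    "2026-06-01 09:12 | fix flaky test",
    "2026-06-01 14:40 | no task recorded",
    "2026-06-02 10:05 | migrate module to TS",
]
print(sessions_per_day(sample))  # {'2026-06-01': 2, '2026-06-02': 1}
```

&lt;p&gt;The log only records when an entry was written, so time spent per day needs another data source.&lt;/p&gt;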

&lt;p&gt;&lt;strong&gt;PreToolUse hook — block dangerous commands.&lt;/strong&gt; Return &lt;code&gt;exit 2&lt;/code&gt; to block and pass stderr back to the agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
&lt;span class="c"&gt;# ~/.claude/hooks/pre_tool.sh&lt;/span&gt;
&lt;span class="nv"&gt;INPUT&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cat&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;COMMAND&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$INPUT&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | jq &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'.tool_input.command // empty'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="nv"&gt;DANGEROUS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"(rm -rf|DROP TABLE|truncate|DELETE FROM.*WHERE 1=1|git reset --hard|kubectl delete)"&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$COMMAND&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-qiE&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$DANGEROUS&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"BLOCKED: potentially dangerous command: &lt;/span&gt;&lt;span class="nv"&gt;$COMMAND&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&amp;amp;2
  &lt;span class="nb"&gt;exit &lt;/span&gt;2
&lt;span class="k"&gt;fi
&lt;/span&gt;&lt;span class="nb"&gt;exit &lt;/span&gt;0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fired 7 times in 3 months. 5 were reasonable (I ran them manually after review); 2 I was glad it caught.&lt;/p&gt;
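
&lt;p&gt;Before trusting a pattern like this, it helps to see what it does and does not catch. The sketch below applies the same alternation as the &lt;code&gt;DANGEROUS&lt;/code&gt; variable in Python, with &lt;code&gt;re.IGNORECASE&lt;/code&gt; standing in for &lt;code&gt;grep -i&lt;/code&gt;. The function name &lt;code&gt;is_blocked&lt;/code&gt; and the sample commands are made up for illustration.&lt;/p&gt;

```python
import re

# Same alternation as DANGEROUS in pre_tool.sh; grep -i maps to re.IGNORECASE.
DANGEROUS = re.compile(
    r"(rm -rf|DROP TABLE|truncate|DELETE FROM.*WHERE 1=1|git reset --hard|kubectl delete)",
    re.IGNORECASE,
)


def is_blocked(command: str) -> bool:
    """Return True when the hook's pattern would match the command."""
    return DANGEROUS.search(command) is not None


# Hypothetical commands chosen to show both matches and misses.
samples = [
    "rm -rf build/",
    "psql -c 'drop table users'",
    "git reset --hard HEAD~1",
    "truncate -s 0 app.log",  # matches too: the pattern cannot tell SQL from the CLI tool
    "rm -r -f build/",        # not matched: the flags are split
    "git status",
]
for cmd in samples:
    print("BLOCK" if is_blocked(cmd) else "allow", cmd)
```

&lt;p&gt;The split-flag case slips through, so a substring pattern like this is best treated as a safety net rather than a complete filter.&lt;/p&gt;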

&lt;p&gt;&lt;strong&gt;Notification hook — ping when done:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/bin/bash&lt;/span&gt;
osascript &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s1"&gt;'display notification "Claude Code finished the task" with title "Claude Code"'&lt;/span&gt;
&lt;span class="c"&gt;# Linux: notify-send "Claude Code" "Task done"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



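&lt;p&gt;Wiring up the first two (the notification script gets its own &lt;code&gt;Notification&lt;/code&gt; entry in the same shape):&lt;br&gt;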
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"Stop"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.claude/hooks/stop.sh"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PreToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"~/.claude/hooks/pre_tool.sh"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5. Multi-model council for architecture decisions
&lt;/h2&gt;

&lt;p&gt;For hard-to-roll-back decisions (module layout, data schema, service contracts), I run the same task across models in parallel:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Opus:   "Design [X]. Priority: reliability and explicit contracts"
Sonnet: "Design [X]. Priority: development speed"
Haiku:  "Design [X]. Minimal complexity, minimal abstraction"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then a fourth Opus session synthesizes the best of the three against our context. ~20–25 minutes. On my last big refactor it replaced a week of team debate and landed cleaner than a 2–3 person discussion would have.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. "Effort levels" — match depth to the task
&lt;/h2&gt;

&lt;p&gt;I formalized this as a custom skill with four levels. (&lt;code&gt;/effort&lt;/code&gt; is &lt;em&gt;my&lt;/em&gt; skill, not built in; the built-in way to deepen reasoning is keywords like &lt;code&gt;think&lt;/code&gt; / &lt;code&gt;ultrathink&lt;/code&gt;.)&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/effort low     fast draft, no plan, just code        → small utilities, throwaway
/effort medium  short plan, then implement            → standard features, fixes
/effort high    detailed plan, then implement         → new components, public API changes
/effort max     analyze → plan → approval → implement → report; stop for OK at each step
                                                       → core changes, auth, schema migrations
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zero full rollbacks on &lt;code&gt;/effort max&lt;/code&gt; in six months. Bonus: if &lt;code&gt;/effort max&lt;/code&gt; feels like overkill for a task, the task is easier than you thought. The reverse holds too.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Spot context rot early and restart
&lt;/h2&gt;

&lt;p&gt;Long sessions degrade. Three tells:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The agent re-proposes something you already rejected this session&lt;/li&gt;
&lt;li&gt;It suggests changing code it just wrote&lt;/li&gt;
&lt;li&gt;It asks you to "clarify the task" you gave 10 messages ago&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Don't push through. Restart with a short handoff:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gh"&gt;# Context to continue&lt;/span&gt;
&lt;span class="gs"&gt;**Task:**&lt;/span&gt; [one sentence]
&lt;span class="gs"&gt;**Done:**&lt;/span&gt; [file/function]: [what], ...
&lt;span class="gs"&gt;**Ruled out (discussed):**&lt;/span&gt; [option]: [why], ...
&lt;span class="gs"&gt;**Next step:**&lt;/span&gt; [one concrete item]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The handoff takes 3–5 minutes to fill out, and in my experience that is 5–10x faster than dragging a rotted session along. My personal trigger: 3 messages with no forward progress → restart, no debate.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. The two-correction rule
&lt;/h2&gt;

&lt;p&gt;If you correct the agent twice on the same point in one session, stop. That's a signal the task framing or context is wrong, not that the agent is stubborn. Rebuild the task in one of three ways:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Split smaller.&lt;/strong&gt; Two failures on X may mean X is too big. Solve X1 then X2.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Give a concrete example&lt;/strong&gt; of the desired output instead of "do it right."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Write the skeleton&lt;/strong&gt; (structure, signatures) and let the agent fill it in. "Here's the service interface, implement it" beats "write a service for X."&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Estimated 15–20 hours saved over six months.&lt;/p&gt;

&lt;h2&gt;
  
  
  9. Split settings by environment
&lt;/h2&gt;

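&lt;p&gt;Settings merge hierarchically across three files: team &lt;code&gt;.claude/settings.json&lt;/code&gt; (committed), personal &lt;code&gt;.claude/settings.local.json&lt;/code&gt; (gitignored), and user-global &lt;code&gt;~/.claude/settings.json&lt;/code&gt;. When the same key is set in more than one, the more specific scope wins: local over team over user-global.&lt;/p&gt;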

&lt;p&gt;Team (strict, committed):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"permissions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"deny"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Bash(kubectl delete:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash(terraform destroy:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash(rm -rf:*)"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Personal (local, gitignored):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"permissions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"allow"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Bash(docker compose up --build:*)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bash(npm run db:reset:*)"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You move fast locally without losing the shared guardrails — &lt;code&gt;deny&lt;/code&gt; outranks &lt;code&gt;allow&lt;/code&gt;, so a local &lt;code&gt;allow&lt;/code&gt; can't override a team &lt;code&gt;deny&lt;/code&gt;. You can also switch permission modes at launch (&lt;code&gt;--permission-mode plan&lt;/code&gt;) for investigation-only sessions.&lt;/p&gt;
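
&lt;p&gt;To make the "deny outranks allow" behavior concrete, here is a toy model in Python. It is a sketch under stated assumptions, not Claude Code's actual matcher: it treats every rule as a plain command prefix, ignores other tool types and compound commands, and the names &lt;code&gt;rule_prefix&lt;/code&gt;, &lt;code&gt;decide&lt;/code&gt;, and &lt;code&gt;local_override&lt;/code&gt; are invented for illustration.&lt;/p&gt;

```python
def rule_prefix(rule: str) -> str:
    """Extract 'kubectl delete' from a rule written as 'Bash(kubectl delete:*)'."""
    inner = rule[rule.index("(") + 1 : rule.rindex(")")]
    return inner[:-2] if inner.endswith(":*") else inner


def decide(command: str, layers: list) -> str:
    """Toy decision: any matching deny wins, then any matching allow, else ask."""
    allow, deny = [], []
    for layer in layers:
        perms = layer.get("permissions", {})
        allow += [rule_prefix(r) for r in perms.get("allow", [])]
        deny += [rule_prefix(r) for r in perms.get("deny", [])]
    if any(command.startswith(p) for p in deny):
        return "deny"
    if any(command.startswith(p) for p in allow):
        return "allow"
    return "ask"


# The team and local files from this section.
team = {"permissions": {"deny": ["Bash(kubectl delete:*)", "Bash(terraform destroy:*)", "Bash(rm -rf:*)"]}}
local = {"permissions": {"allow": ["Bash(docker compose up --build:*)", "Bash(npm run db:reset:*)"]}}
# Hypothetical: a local allow that tries to loosen a team deny.
local_override = {"permissions": {"allow": ["Bash(kubectl delete:*)"]}}

print(decide("docker compose up --build api", [team, local]))      # allow
print(decide("kubectl delete pod web-1", [team, local_override]))  # deny
print(decide("npm install", [team, local]))                        # ask
```

&lt;p&gt;The second call is the case that matters: the local file allows the command, the team file denies it, and the deny wins.&lt;/p&gt;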

&lt;h2&gt;
  
  
  10. Skills: lazy-load project-specific context
&lt;/h2&gt;

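&lt;p&gt;Skills are markdown instruction files under &lt;code&gt;.claude/skills/&lt;/code&gt;, loaded on demand rather than at session start. Note that current Claude Code documentation describes each skill as its own directory containing a &lt;code&gt;SKILL.md&lt;/code&gt;, so the flat listing here may need adapting. The point: don't keep project-specific context in the agent's memory full-time.&lt;br&gt;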
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;.claude/skills/
├── nestjs-conventions.md
├── clean-arch.md
├── typescript-strict.md
├── git-workflow.md
├── testing-strategy.md
└── code-review-checklist.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each maps to a task type — writing a NestJS module → &lt;code&gt;nestjs-conventions&lt;/code&gt;, refactoring → &lt;code&gt;clean-arch&lt;/code&gt;, reviewing → &lt;code&gt;code-review-checklist&lt;/code&gt;. After moving to skills: main &lt;code&gt;CLAUDE.md&lt;/code&gt; from 40K to 6K chars, 30–40% lower token use in standard sessions, and the agent stopped mixing rules from different domains.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where this leads
&lt;/h2&gt;

&lt;p&gt;Once you're running multiple models in parallel (#5), the next question isn't prompt quality — it's routing and cost: who uses which model, and what does each workflow cost? That part lives outside Claude Code. Routing the CLI through a gateway gives you per-workflow, per-model cost visibility across providers; the &lt;a href="https://docs.evolink.ai/en/integration-guide/claude-code-cli?utm_source=community&amp;amp;utm_medium=devto&amp;amp;utm_campaign=claude_code_configs" rel="noopener noreferrer"&gt;EvoLink Claude Code CLI guide&lt;/a&gt; documents that setup if your team's at that stage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;Claude Code isn't plug-and-play — it's a platform you configure. Defaults run at "fine"; the right settings move it to "genuinely fast." By my estimate these paid back 50–60 hours in the first six months. If you've got settings that didn't make my list — especially hooks — drop them in the comments.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://docs.anthropic.com/en/docs/claude-code" rel="noopener noreferrer"&gt;Claude Code docs&lt;/a&gt;&lt;/p&gt;





</description>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>A Seedance 2.0 Prompt Template for More Controllable AI Video</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Fri, 05 Jun 2026 09:03:52 +0000</pubDate>
      <link>https://dev.to/evan-dong/a-seedance-20-prompt-template-for-more-controllable-ai-video-15c4</link>
      <guid>https://dev.to/evan-dong/a-seedance-20-prompt-template-for-more-controllable-ai-video-15c4</guid>
      <description>&lt;p&gt;I have been collecting working Seedance 2.0 prompts, and one thing keeps showing up:&lt;/p&gt;

&lt;p&gt;Good video prompts are not just descriptive. They are structured.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fftg98bv4h3nuo5fhct49.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fftg98bv4h3nuo5fhct49.png" alt=" " width="800" height="623"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We published the collection here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/EvoLinkAI/awesome-seedance-2.0-prompts" rel="noopener noreferrer"&gt;https://github.com/EvoLinkAI/awesome-seedance-2.0-prompts&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It has 160+ curated Seedance 2.0 prompt cases across product ads, POV shots, cinematic scenes, camera motion, first-frame control, fantasy, action, and reference-driven workflows.&lt;/p&gt;

&lt;p&gt;This is the template I would use when writing new prompts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The template
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Intent:
What should this clip communicate?

Scene:
Where does it happen? What should the environment feel like?

Subject:
Who or what is the main focus? What must stay consistent?

Camera:
Where does the camera start? How does it move? Are cuts allowed?

Timeline:
0-2s: ...
2-5s: ...
5-8s: ...

Style:
Lighting, color, lens feel, genre, texture.

Constraints:
What should not happen?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The most useful blocks are usually &lt;code&gt;Camera&lt;/code&gt;, &lt;code&gt;Timeline&lt;/code&gt;, and &lt;code&gt;Constraints&lt;/code&gt;.&lt;/p&gt;
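&lt;p&gt;If you write many prompts, it can help to assemble the template in code so that no block gets skipped. The following is a minimal sketch, not part of the repo: &lt;code&gt;build_prompt&lt;/code&gt; and the &lt;code&gt;BLOCKS&lt;/code&gt; list are hypothetical names, and the only thing taken from the template is the block order.&lt;/p&gt;

```python
# Block order follows the template above.
BLOCKS = ["Intent", "Scene", "Subject", "Camera", "Timeline", "Style", "Constraints"]


def build_prompt(fields):
    """Render a dict of block name to text as a structured prompt.

    Raises ValueError when a block is missing or empty, so gaps surface
    before the prompt is sent anywhere.
    """
    missing = [name for name in BLOCKS if not fields.get(name)]
    if missing:
        raise ValueError("missing blocks: " + ", ".join(missing))
    sections = []
    for name in BLOCKS:
        value = fields[name]
        # Timeline may be given as a list of (time range, action) pairs.
        if isinstance(value, list):
            value = "\n".join(f"{span}: {action}" for span, action in value)
        sections.append(f"{name}:\n{value}")
    return "\n\n".join(sections)
```

&lt;p&gt;Failing on a missing block is a deliberate choice here: an empty &lt;code&gt;Constraints&lt;/code&gt; block is easy to overlook when the prompt is typed by hand.&lt;/p&gt;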

&lt;h2&gt;
  
  
  Why this helps
&lt;/h2&gt;

&lt;p&gt;A prompt like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A man running through a city at night, cinematic, dramatic, realistic, fast camera.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;can work, but it leaves too much open.&lt;/p&gt;

&lt;p&gt;The model has to guess:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;whether the camera follows from behind or from the side;&lt;/li&gt;
&lt;li&gt;whether the shot cuts;&lt;/li&gt;
&lt;li&gt;whether the man should stay visually consistent;&lt;/li&gt;
&lt;li&gt;how fast the camera should move;&lt;/li&gt;
&lt;li&gt;what should happen at the end.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For video, those guesses matter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example: action shot
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Intent:
Create an 8-second continuous chase shot.

Scene:
Crowded old market street at dusk, warm lamps, narrow alleys, moving pedestrians.

Subject:
One young protagonist sprinting through the crowd. Keep the same outfit and silhouette throughout.

Camera:
Handheld tracking shot from behind, then side tracking, then a forward push as the character looks back.

Timeline:
0-2s: camera follows closely behind.
2-5s: side tracking as the character dodges pedestrians.
5-8s: push forward as the character glances back.

Style:
Cinematic thriller energy, readable motion, natural crowd ambience.

Constraints:
No cuts, no face morphing, no outfit change, no chaotic motion blur.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The trick is to stop writing one big paragraph and start writing a small shot plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example: product ad
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Intent:
Create a premium product hero shot.

Scene:
Dark reflective studio surface, subtle haze, controlled highlights.

Subject:
One luxury watch. Preserve the watch face, metal edges, proportions, and logo placement.

Camera:
Start with an extreme close-up on the dial, then slowly orbit right.

Timeline:
0-2s: close-up on dial and polished metal.
2-5s: slow orbit reveals the silhouette.
5-8s: centered hero frame.

Style:
Premium commercial lighting, crisp reflections, cinematic contrast.

Constraints:
No logo distortion, no extra text, no shape change, no abrupt cuts.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For products, "what not to change" is often more important than "make it beautiful."&lt;/p&gt;

&lt;h2&gt;
  
  
  How I use the repo
&lt;/h2&gt;

&lt;p&gt;I would use the GitHub repo as a reference library:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Pick a category close to your target output.&lt;/li&gt;
&lt;li&gt;Study the structure, not just the words.&lt;/li&gt;
&lt;li&gt;Start with short clips.&lt;/li&gt;
&lt;li&gt;Keep camera movement simple.&lt;/li&gt;
&lt;li&gt;Save prompt formats that produce stable results.&lt;/li&gt;
&lt;li&gt;Turn the best ones into templates.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The full open-source collection is here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/EvoLinkAI/awesome-seedance-2.0-prompts" rel="noopener noreferrer"&gt;https://github.com/EvoLinkAI/awesome-seedance-2.0-prompts&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The visual prompt page is here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://evolink.ai/seedance-2-0-prompts?utm_source=community&amp;amp;utm_medium=devto&amp;amp;utm_campaign=seedance2_prompts_02" rel="noopener noreferrer"&gt;Seedance 2.0 prompt examples&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you have a prompt format that works well for Seedance 2.0, especially for product videos or character consistency, PRs are welcome.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Claude Code Dynamic Workflows: What I Learned From a 111-Agent Research Run</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Thu, 04 Jun 2026 11:08:30 +0000</pubDate>
      <link>https://dev.to/evan-dong/claude-code-dynamic-workflows-what-i-learned-from-a-111-agent-research-run-12hg</link>
      <guid>https://dev.to/evan-dong/claude-code-dynamic-workflows-what-i-learned-from-a-111-agent-research-run-12hg</guid>
      <description>&lt;p&gt;I tried Claude Code's new dynamic workflows with Opus 4.8 on a real research task.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8o7i5m85lz9uo6zrb29q.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8o7i5m85lz9uo6zrb29q.png" alt=" " width="800" height="541"&gt;&lt;/a&gt;&lt;br&gt;
The short version: the agent count was interesting, but the failure handling was more useful.&lt;/p&gt;
&lt;h2&gt;
  
  
  What I ran
&lt;/h2&gt;

&lt;p&gt;I used the built-in workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/deep-research &amp;lt;research question&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The question was broad enough to need multiple angles: understand an AI project related to medical imaging, compare possible use cases, look at competitors, and identify product weak spots.&lt;/p&gt;

&lt;p&gt;After launching the workflow, I inspected it with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/workflows
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That view shows the phase structure, agent counts, token totals, and runtime.&lt;/p&gt;

&lt;h2&gt;
  
  
  The phase breakdown
&lt;/h2&gt;

&lt;p&gt;My run split into five phases:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Phase&lt;/th&gt;
&lt;th&gt;Agents&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Scope&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Define the research frame&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Search from several angles&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fetch&lt;/td&gt;
&lt;td&gt;28&lt;/td&gt;
&lt;td&gt;Retrieve and read sources&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Verify&lt;/td&gt;
&lt;td&gt;75&lt;/td&gt;
&lt;td&gt;Check key claims&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Synthesize&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Produce the final report&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Total: 111 agents.&lt;/p&gt;

&lt;p&gt;Important caveat: this does not mean 111 agents ran at the same time. The point is that the workflow had a script-like structure that coordinated many subagents across phases.&lt;/p&gt;

&lt;h2&gt;
  
  
  The useful failure
&lt;/h2&gt;

&lt;p&gt;The Verify phase extracted 25 claims and tried to check each claim with three independent agents.&lt;/p&gt;

&lt;p&gt;Then the run partially broke.&lt;/p&gt;

&lt;p&gt;In my case, 17 of the 25 claims ended up with a &lt;code&gt;killed&lt;/code&gt; status. The issue looked like a StructuredOutput failure.&lt;/p&gt;

&lt;p&gt;That sounds bad, but the workflow's interpretation was the useful part.&lt;/p&gt;

&lt;p&gt;It separated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;confirmed;&lt;/li&gt;
&lt;li&gt;contradicted;&lt;/li&gt;
&lt;li&gt;not verified.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Only 2 of the 17 killed claims were actually contradicted. The other 15 were incomplete checks, not false claims.&lt;/p&gt;

&lt;p&gt;That distinction matters.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;killed&lt;/code&gt; should not automatically mean "wrong." It means the check did not finish. If your agent turns an unfinished check into a confident conclusion, the report becomes misleading.&lt;/p&gt;
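&lt;p&gt;To make the distinction concrete, here is a rough sketch of how three per-agent results for one claim could be collapsed into a single verdict. This is my own illustration, not the workflow's actual logic: the aggregation rule and the function names are assumptions, and only the &lt;code&gt;killed&lt;/code&gt; status string comes from the run itself.&lt;/p&gt;

```python
from collections import Counter


def summarize_claim(agent_statuses):
    """Collapse per-agent results for one claim into a single verdict.

    agent_statuses: list of strings such as "confirmed", "contradicted", "killed".
    """
    finished = [s for s in agent_statuses if s != "killed"]
    if "contradicted" in finished:
        return "contradicted"
    # Require every agent to finish and agree before calling a claim confirmed.
    if finished and len(finished) == len(agent_statuses) and set(finished) == {"confirmed"}:
        return "confirmed"
    # Killed or incomplete checks carry no evidence either way.
    return "not verified"


def tally(claims):
    """Count verdicts across a dict of claim id to agent status list."""
    return Counter(summarize_claim(statuses) for statuses in claims.values())
```

&lt;p&gt;The key property is that a killed check can only ever lower a claim to "not verified"; it never produces "contradicted" on its own.&lt;/p&gt;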

&lt;h2&gt;
  
  
  Why workflows feel different from long chats
&lt;/h2&gt;

&lt;p&gt;In a normal long chat, intermediate state gets compressed into the final answer. That is convenient, but it can hide weak spots.&lt;/p&gt;

&lt;p&gt;With a workflow, the plan moves into a script. The runtime can hold phase state, branch logic, repeated checks, and failed agents without forcing everything into the chat context.&lt;/p&gt;

&lt;p&gt;That makes it better for tasks where process quality matters:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;codebase audits;&lt;/li&gt;
&lt;li&gt;large migrations;&lt;/li&gt;
&lt;li&gt;cross-checked research;&lt;/li&gt;
&lt;li&gt;repeated claim verification;&lt;/li&gt;
&lt;li&gt;multi-angle planning.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It is not a good fit for every task. For small edits or quick debugging, it is overkill.&lt;/p&gt;

&lt;h2&gt;
  
  
  My checklist
&lt;/h2&gt;

&lt;p&gt;Before running a dynamic workflow, I would do this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Narrow the question.&lt;/li&gt;
&lt;li&gt;Run a small version first.&lt;/li&gt;
&lt;li&gt;Open &lt;code&gt;/workflows&lt;/code&gt;, not just the final report.&lt;/li&gt;
&lt;li&gt;Inspect failed, killed, contradicted, and not-verified states separately.&lt;/li&gt;
&lt;li&gt;Do not treat &lt;code&gt;killed&lt;/code&gt; as a refutation.&lt;/li&gt;
&lt;li&gt;Save the workflow only after it proves useful.&lt;/li&gt;
&lt;li&gt;Watch token usage.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Takeaway
&lt;/h2&gt;

&lt;p&gt;Dynamic workflows are not just "more subagents."&lt;/p&gt;

&lt;p&gt;They are a way to turn a long task into an inspectable process.&lt;/p&gt;

&lt;p&gt;My run partially failed, but that failure was visible. For automated research and coding-agent work, visible failure is much better than a clean answer that hides uncertainty.&lt;/p&gt;

&lt;p&gt;If you are testing Opus 4.8 for Claude Code, coding agents, or long-running workflows, I keep the model notes and API routing details here: &lt;a href="https://evolink.ai/claude-opus-4-8?utm_source=devto&amp;amp;utm_medium=community&amp;amp;utm_campaign=opus48_dynamic_workflows&amp;amp;utm_content=dynamic-workflows-devto" rel="noopener noreferrer"&gt;Claude Opus 4.8 API for Coding Agents&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
    </item>
    <item>
      <title>I Generated a Full Brand Mockup Set from a Logo in 10 Minutes — Here's the Workflow</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Sat, 23 May 2026 12:02:18 +0000</pubDate>
      <link>https://dev.to/evan-dong/i-generated-a-full-brand-mockup-set-from-a-logo-in-10-minutes-heres-the-workflow-3ald</link>
      <guid>https://dev.to/evan-dong/i-generated-a-full-brand-mockup-set-from-a-logo-in-10-minutes-heres-the-workflow-3ald</guid>
      <description>

&lt;p&gt;Every time I finish a logo, the same problem comes back: how do I show what this brand actually looks like in the real world? Clients don't react to a mark on a white background. They want to see signage, cards, packaging, booths, and product surfaces.&lt;/p&gt;

&lt;p&gt;That part usually takes half a day of mockup sourcing and layout work. I wanted to see if Image 2 could compress that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who this is for:&lt;/strong&gt; designers, brand consultants, or developers building brand-adjacent tools who want a faster mockup workflow.&lt;/p&gt;

&lt;h2&gt;
  
  
  The framework
&lt;/h2&gt;

&lt;p&gt;I split the output into six categories before generating anything:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Logo material and finish studies&lt;/li&gt;
&lt;li&gt;Single branded item mockups&lt;/li&gt;
&lt;li&gt;Combined brand material sets&lt;/li&gt;
&lt;li&gt;Spatial and environmental scenes&lt;/li&gt;
&lt;li&gt;Human or usage scenarios&lt;/li&gt;
&lt;li&gt;Social media visuals&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The planning prompt
&lt;/h2&gt;

&lt;p&gt;Once you have the categories, use this prompt to generate a plan:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;I already have a logo. The industry is [industry]. The brand personality is [keywords]. Based on the uploaded logo, help me plan a complete logo mockup generation set. Include: 1) logo material variations, 2) single brand applications, 3) combined brand materials, 4) spatial scenes, 5) human or usage scenarios, and 6) social media visuals. For each category, provide the recommended visual direction, aspect ratio, and prompts that can be used directly in Image 2. Keep the logo recognizable, choose materials appropriate to the industry and brand personality, and make the outputs suitable for a brand proposal.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Test 1: Evolink (tech/API brand)
&lt;/h2&gt;

&lt;p&gt;Brand inputs: professional, stable, developer-friendly, enterprise-ready, connected.&lt;/p&gt;

&lt;p&gt;Image 2 generated logo finishes, interface materials, developer badges, documentation covers, and booth-style spatial visuals.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F77cf3e009e7494cf8460a94e86013a41085460e5adb7efe5a4be8cf7a2144189" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F77cf3e009e7494cf8460a94e86013a41085460e5adb7efe5a4be8cf7a2144189" alt="Evolink logo mockup direction" width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2Fcb93a6bb8caba4a62a34828089b4d0874980bdc727caca6f87e9cb295bd1324a" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2Fcb93a6bb8caba4a62a34828089b4d0874980bdc727caca6f87e9cb295bd1324a" alt="Evolink branded material mockup" width="760" height="1140"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F08a06d4457f482e2fbaa21ade3750d0f8b289f3ba4907e1c81c0e4c422a8d7f2" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F08a06d4457f482e2fbaa21ade3750d0f8b289f3ba4907e1c81c0e4c422a8d7f2" alt="Evolink high-resolution brand scene" width="1280" height="1706"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2Fd63947ecbf13922cfc89257b57249825488beb4e55d57c72796dc73d8f0e89ba" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2Fd63947ecbf13922cfc89257b57249825488beb4e55d57c72796dc73d8f0e89ba" alt="Evolink product and brand environment" width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F46579df9da91014d13292beb16e7fb615ed8322fb86b2af021fb5a950baef18c" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F46579df9da91014d13292beb16e7fb615ed8322fb86b2af021fb5a950baef18c" alt="Evolink booth concept" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2Fa489ac910f637f33c64878c79a2cad104538b846ae48256c535649238f2a7bac" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2Fa489ac910f637f33c64878c79a2cad104538b846ae48256c535649238f2a7bac" alt="Evolink brand application scene" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2Ff0bef8c0cdec0c8eda9b295d1788e93d7ee61496daf2e064be3014207c1ddf0e" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2Ff0bef8c0cdec0c8eda9b295d1788e93d7ee61496daf2e064be3014207c1ddf0e" alt="Evolink card or badge application" width="720" height="900"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After one round, Evolink stopped looking like just a geometric mark. It started to feel like a real company with booths, dashboards, documentation, and team assets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Test 2: MoriJoy (nature park brand)
&lt;/h2&gt;

&lt;p&gt;To see if this holds outside tech, I tried MoriJoy — a parent-child nature park. Logo: smiling treehouse + sapling. Palette: wood tones, cream, light green. Personality: soft, warm, healing, natural, playful, family-oriented.&lt;/p&gt;

&lt;p&gt;The mockup directions shifted completely: entrance signage, membership cards, staff name tags, children's drinkware, sticker sets, tote bags, family activity spaces.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F54d95fd065a77840bdb62a600cb0f876f11c795eebb0caee098586e200fbecf8" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F54d95fd065a77840bdb62a600cb0f876f11c795eebb0caee098586e200fbecf8" alt="MoriJoy entrance and environment" width="720" height="405"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F6e784147e8b1c2d7777a60ec060f5722a5052a161639b0d89a287b694d38aa43" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F6e784147e8b1c2d7777a60ec060f5722a5052a161639b0d89a287b694d38aa43" alt="MoriJoy branded item mockup" width="720" height="900"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F6b8a7e353df0985dac86f61230afb0bdfa0a2a706825b42d177031ecf6f429fb" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fcdn.gooo.ai%2Fweb-images%2F6b8a7e353df0985dac86f61230afb0bdfa0a2a706825b42d177031ecf6f429fb" alt="MoriJoy spatial brand scene" width="720" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Same workflow, completely different visual language. The system adapts the brand world around the logo.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The six-part framework prevents random outputs and keeps the set coherent.&lt;/li&gt;
&lt;li&gt;Defining brand personality keywords before generating makes a big difference.&lt;/li&gt;
&lt;li&gt;Not every output is perfect — you still need design judgment to pick the best directions.&lt;/li&gt;
&lt;li&gt;This works best for early-stage proposals and direction-setting, not final production.&lt;/li&gt;
&lt;li&gt;The workflow compresses what normally takes half a day into about 10 minutes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;If you're building brand proposals or need to show a logo in context quickly, this workflow is worth testing.&lt;/p&gt;

&lt;p&gt;Upload your logo and generate a full brand mockup set with Image 2: &lt;a href="https://evolink.ai/gpt-image-2?utm_source=devto&amp;amp;utm_medium=community&amp;amp;utm_campaign=image2_brand_mockup&amp;amp;utm_content=image2-brand-mockup-devto" rel="noopener noreferrer"&gt;https://evolink.ai/gpt-image-2&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>nanobanana</category>
    </item>
    <item>
      <title>Gemini 3.5 Flash Just Shipped — Here's When to Use It (and When Not To)</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Wed, 20 May 2026 10:33:55 +0000</pubDate>
      <link>https://dev.to/evan-dong/gemini-35-flash-just-shipped-heres-when-to-use-it-and-when-not-to-1aee</link>
      <guid>https://dev.to/evan-dong/gemini-35-flash-just-shipped-heres-when-to-use-it-and-when-not-to-1aee</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmuyutechnology.mintlify.app%2Fmintlify-assets%2F_next%2Fimage%3Furl%3D%252F_mintlify%252Fapi%252Fog%253Fdivision%253DGoogle%252BNative%252BAPI%252BFormat%2526title%253DGemini%252B3.5%252BFlash%252B-%252BNative%252BAPI%252B-%252BQuick%252BStart%2526description%253D-%252BUse%252BGoogle%252BNative%252BAPI%252Bformat%252Bto%252Bcall%252Bgemini-3.5-flash%252Bmodel%25250A-%252BSynchronous%252Bprocessing%252Bmode%25252C%252Breal-time%252Bresponse%25250A-%252BMinimal%252Bparameters%252Bfor%252Bquick%252Bstart%25250A-%252B%2525F0%25259F%252592%2525A1%252BNeed%252Bm%2526primaryColor%253D%2525231E90FF%2526lightColor%253D%2525231E90FF%2526darkColor%253D%2525231565C0%2526backgroundLight%253D%252523ffffff%2526backgroundDark%253D%252523090c10%26w%3D1200%26q%3D100" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fmuyutechnology.mintlify.app%2Fmintlify-assets%2F_next%2Fimage%3Furl%3D%252F_mintlify%252Fapi%252Fog%253Fdivision%253DGoogle%252BNative%252BAPI%252BFormat%2526title%253DGemini%252B3.5%252BFlash%252B-%252BNative%252BAPI%252B-%252BQuick%252BStart%2526description%253D-%252BUse%252BGoogle%252BNative%252BAPI%252Bformat%252Bto%252Bcall%252Bgemini-3.5-flash%252Bmodel%25250A-%252BSynchronous%252Bprocessing%252Bmode%25252C%252Breal-time%252Bresponse%25250A-%252BMinimal%252Bparameters%252Bfor%252Bquick%252Bstart%25250A-%252B%2525F0%25259F%252592%2525A1%252BNeed%252Bm%2526primaryColor%253D%2525231E90FF%2526lightColor%253D%2525231E90FF%2526darkColor%253D%2525231565C0%2526backgroundLight%253D%252523ffffff%2526backgroundDark%253D%252523090c10%26w%3D1200%26q%3D100" alt="Gemini 3.5 Flash Native API quick start" width="1200" height="630"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google launched Gemini 3.5 Flash at I/O 2026 today. The "budget" model now beats Gemini 3.1 Pro on agent and coding benchmarks. Here is what you actually need to know to decide whether to switch.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick specs
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Model ID:&lt;/strong&gt; &lt;code&gt;gemini-3.5-flash&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context:&lt;/strong&gt; 1M input / 65K output&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Input:&lt;/strong&gt; text, image, audio, video, PDF&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pricing:&lt;/strong&gt; $1.50/M input, $9.00/M output, $0.15/M cached input&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge cutoff:&lt;/strong&gt; January 2026&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Thinking:&lt;/strong&gt; on by default (medium), low mode available&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Where Flash wins
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benchmark&lt;/th&gt;
&lt;th&gt;Flash&lt;/th&gt;
&lt;th&gt;3.1 Pro&lt;/th&gt;
&lt;th&gt;Gap&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Terminal-Bench 2.1 (coding)&lt;/td&gt;
&lt;td&gt;76.2%&lt;/td&gt;
&lt;td&gt;70.3%&lt;/td&gt;
&lt;td&gt;+5.9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Atlas (tool calling)&lt;/td&gt;
&lt;td&gt;83.6%&lt;/td&gt;
&lt;td&gt;78.2%&lt;/td&gt;
&lt;td&gt;+5.4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Finance Agent v2&lt;/td&gt;
&lt;td&gt;57.9%&lt;/td&gt;
&lt;td&gt;43.0%&lt;/td&gt;
&lt;td&gt;+14.9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GDPval-AA (decision-making)&lt;/td&gt;
&lt;td&gt;1,656 Elo&lt;/td&gt;
&lt;td&gt;1,314 Elo&lt;/td&gt;
&lt;td&gt;+342&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CharXiv Reasoning&lt;/td&gt;
&lt;td&gt;84.2%&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Where Flash does NOT win
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Humanity's Last Exam:&lt;/strong&gt; 3.1 Pro leads (44.4% vs 40.2%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ARC-AGI-2:&lt;/strong&gt; 3.1 Pro leads (77.1% vs 72.1%)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SWE-Bench Pro:&lt;/strong&gt; Claude Opus 4.7 still leads&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Computer Use:&lt;/strong&gt; GPT-5.5 is the only model with production screen control. Flash does not support this.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Tool capabilities
&lt;/h2&gt;

&lt;p&gt;What ships:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Function Calling&lt;/li&gt;
&lt;li&gt;Structured Output&lt;/li&gt;
&lt;li&gt;Search-as-a-tool&lt;/li&gt;
&lt;li&gt;Code Execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What is missing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Computer Use&lt;/strong&gt; — no screen control, no clicking, no form filling. If your agent operates a browser or desktop app, you still need GPT-5.5 for that part.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  When to pick which model
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Agent needs to call multiple tools in sequence?
  → Gemini 3.5 Flash

Agent needs to refactor code across a large repo?
  → Claude Opus 4.7

Agent needs to control a browser or desktop?
  → GPT-5.5

Need all three depending on the task?
  → Route through a unified gateway
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
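&lt;p&gt;In code, the decision tree above can be as small as a lookup table. This is only a sketch: the task labels are hypothetical, the values are the display names used in this post rather than real API model IDs, and falling back to Flash for unknown tasks is my own assumption.&lt;/p&gt;

```python
# Task kinds are hypothetical labels; values are display names, not API model IDs.
ROUTES = {
    "tool_orchestration": "Gemini 3.5 Flash",
    "large_repo_refactor": "Claude Opus 4.7",
    "computer_use": "GPT-5.5",
}


def pick_model(task_kind, default="Gemini 3.5 Flash"):
    """Return the model suggested for a task kind, falling back to a default."""
    return ROUTES.get(task_kind, default)
```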



&lt;h2&gt;
  
  
  API call example
&lt;/h2&gt;

&lt;p&gt;Through EvoLink (single endpoint for Gemini, Claude, and GPT):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://direct.evolink.ai/v1beta/models/gemini-3.5-flash:generateContent &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_EVOLINK_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "contents": [{
      "role": "user",
      "parts": [{"text": "List the top 3 files changed in the last commit and explain what each change does"}]
    }]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you were using any previous Gemini model through EvoLink, swap the model ID. No other change needed.&lt;/p&gt;

&lt;p&gt;The benefit: same API key and endpoint for Flash, Opus 4.7, and GPT-5.5. Switch model IDs depending on the task. One bill, automatic failover.&lt;/p&gt;

&lt;p&gt;Full docs: &lt;a href="https://docs.evolink.ai/en/api-manual/language-series/gemini-3.5-flash/native-api/native-api-quickstart?utm_source=devto&amp;amp;utm_medium=community&amp;amp;utm_campaign=gemini35flash&amp;amp;utm_content=gemini-35-flash-agent-model-devto" rel="noopener noreferrer"&gt;EvoLink Gemini 3.5 Flash API Guide&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Things to keep in mind
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;These are Google's benchmarks.&lt;/strong&gt; Independent community testing will confirm or adjust them, so treat the exact numbers with the usual benchmark skepticism.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent cost is not unit cost.&lt;/strong&gt; A 20-step agent loop means 20 API calls. Fast and cheap per token is not the same as cheap per workflow run.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Thinking adds reasoning tokens.&lt;/strong&gt; The model thinks before answering. This increases output quality but also output token count. Watch your bills.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;3.5 Pro is expected next month.&lt;/strong&gt; If you need peak general reasoning from Google, it might be worth waiting.&lt;/li&gt;
&lt;/ol&gt;
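&lt;p&gt;The second and third points are easier to see with numbers. The sketch below estimates the cost of one agent run from the list prices in the quick specs. The workload figures (20 steps, 30,000 input and 2,000 output tokens per step) are hypothetical, and I am assuming that reasoning tokens are billed at the output rate and that cached input is simply billed at the cached rate; check the pricing page before relying on either.&lt;/p&gt;

```python
# Prices in USD per million tokens, as listed under Quick specs.
INPUT_PER_M = 1.50
OUTPUT_PER_M = 9.00
CACHED_INPUT_PER_M = 0.15


def run_cost(steps, input_tokens, output_tokens, cached_tokens=0):
    """Estimate the cost of one agent run in USD.

    Token counts are per step. output_tokens should include reasoning tokens.
    cached_tokens is the part of input_tokens billed at the cached rate.
    """
    fresh = input_tokens - cached_tokens
    per_step = (
        fresh * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    ) / 1_000_000
    return steps * per_step


# Hypothetical workload: 20 steps, 30,000 input and 2,000 output tokens per step.
print(round(run_cost(20, 30_000, 2_000), 2))  # prints 1.26
```

&lt;p&gt;At these assumed sizes a single run costs about $1.26, which is small per run but adds up quickly if the workflow fires hundreds of times a day.&lt;/p&gt;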

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://blog.google/technology/google-deepmind/gemini-3-5-flash/" rel="noopener noreferrer"&gt;Google I/O 2026 Gemini 3.5 Flash&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://llm-stats.com/blog/research/gemini-3.5-flash-launch" rel="noopener noreferrer"&gt;Benchmarks and pricing analysis&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.evolink.ai/en/api-manual/language-series/gemini-3.5-flash/native-api/native-api-quickstart" rel="noopener noreferrer"&gt;EvoLink API docs&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Pick models by task, not by tier. Flash for tool orchestration. Opus for code rewrites. GPT for screen control. That is the state of play.&lt;/em&gt;&lt;/p&gt;





</description>
      <category>ai</category>
      <category>claude</category>
      <category>programming</category>
      <category>api</category>
    </item>
    <item>
      <title>Codex Now Works from Your Phone — Plus Hooks and CI/CD Tokens</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Fri, 15 May 2026 07:57:10 +0000</pubDate>
      <link>https://dev.to/evan-dong/codex-now-works-from-your-phone-plus-hooks-and-cicd-tokens-4mj3</link>
      <guid>https://dev.to/evan-dong/codex-now-works-from-your-phone-plus-hooks-and-cicd-tokens-4mj3</guid>
      <description>

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7i9v79rp9y36mscg2glx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7i9v79rp9y36mscg2glx.png" alt="Codex mobile in the ChatGPT app" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I run Codex for long refactoring tasks. The annoying pattern: start a job, step away, come back to find Codex has been waiting 40 minutes for me to approve a permission check.&lt;/p&gt;

&lt;p&gt;The May 2026 updates fix that, along with a few other gaps I cared about.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Things That Matter
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Mobile access (preview, May 14)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Codex is now available in preview on the ChatGPT mobile app for iOS and Android. Your phone connects to the Codex session running on your Mac — you review diffs, approve commands, switch models, and follow terminal output from wherever you are. The code stays on the host machine.&lt;/p&gt;

&lt;p&gt;Setup: latest ChatGPT app + Codex for Mac → scan QR code to pair.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Hooks (GA, May 14)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can now inject custom scripts at six points in the Codex loop. The one I found most useful is &lt;code&gt;PreToolUse&lt;/code&gt;, which I use to scan Bash commands for credential patterns before they execute:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# ~/.codex/config.toml&lt;/span&gt;
&lt;span class="nn"&gt;[[hooks.PreToolUse]]&lt;/span&gt;
&lt;span class="py"&gt;matcher&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"^Bash$"&lt;/span&gt;

&lt;span class="nn"&gt;[[hooks.PreToolUse.hooks]]&lt;/span&gt;
&lt;span class="py"&gt;type&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"command"&lt;/span&gt;
&lt;span class="py"&gt;command&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"python3 ~/.codex/hooks/scan_credentials.py"&lt;/span&gt;
&lt;span class="py"&gt;timeout&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The hook receives JSON on stdin with the command about to run, and can return &lt;code&gt;{"decision": "block", "reason": "..."}&lt;/code&gt; to stop it.&lt;/p&gt;
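
&lt;p&gt;For illustration, here is a minimal sketch of the logic such a hook script might contain. It assumes the pending command arrives under &lt;code&gt;tool_input.command&lt;/code&gt; in the stdin JSON. That field name, the function names, and the regex patterns are placeholders for the example, not taken from the Hooks docs, so check the actual payload schema before relying on it.&lt;/p&gt;

```python
"""Hypothetical PreToolUse hook logic: flag Bash commands that look like they carry credentials."""
import json
import re
import sys

# Illustrative patterns only; extend them for the secrets your team actually uses.
CREDENTIAL_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"(?i)(api[_-]?key|secret|token|password)\s*=\s*\S{8,}"),
]


def check_command(payload):
    """Return a block decision dict if the command matches a pattern, else None."""
    # ASSUMPTION: the command string lives at payload["tool_input"]["command"].
    command = (payload.get("tool_input") or {}).get("command", "")
    for pattern in CREDENTIAL_PATTERNS:
        if pattern.search(command):
            return {
                "decision": "block",
                "reason": f"possible credential matched {pattern.pattern!r}",
            }
    return None


def main():
    """Read the hook payload from stdin and print a block decision if needed."""
    decision = check_command(json.load(sys.stdin))
    if decision is not None:
        print(json.dumps(decision))
```

&lt;p&gt;Saved as a hook script, the file would end with a call to &lt;code&gt;main()&lt;/code&gt;. Keeping the matching logic in &lt;code&gt;check_command&lt;/code&gt; means it can be unit tested without Codex running.&lt;/p&gt;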

&lt;p&gt;Other hook events: &lt;code&gt;SessionStart&lt;/code&gt;, &lt;code&gt;PostToolUse&lt;/code&gt;, &lt;code&gt;PermissionRequest&lt;/code&gt;, &lt;code&gt;UserPromptSubmit&lt;/code&gt;, &lt;code&gt;Stop&lt;/code&gt;. Enterprise teams can push managed hooks via &lt;code&gt;requirements.toml&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Caveat: per the docs, hooks do not yet intercept every shell call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Access tokens (Business/Enterprise, May 5)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If you run Codex in CI/CD, you no longer need to fake an interactive login. Access tokens carry workspace identity:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;CODEX_ACCESS_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;token&amp;gt;"&lt;/span&gt;
codex &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;--json&lt;/span&gt; &lt;span class="s2"&gt;"run tests and report coverage"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Business and Enterprise only. Tokens expire (configurable) and show up in governance logs under the creating user.&lt;/p&gt;

&lt;h2&gt;
  
  
  Also Shipped
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Remote SSH (GA)&lt;/strong&gt;: connect Codex to managed dev environments through a relay — no public SSH port needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HIPAA compliance&lt;/strong&gt;: for eligible Enterprise workspaces in local environments&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Multi-Agent Problem Remains
&lt;/h2&gt;

&lt;p&gt;These features make Codex better. But if you also use Claude Code and Gemini CLI, you still have three API keys, three dashboards, and no fallback between them.&lt;/p&gt;

&lt;p&gt;All three CLIs support custom endpoints:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# Codex: ~/.codex/config.toml&lt;/span&gt;
&lt;span class="nn"&gt;[api]&lt;/span&gt;
&lt;span class="py"&gt;base_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://api.evolink.ai/v1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude Code&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://api.evolink.ai"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Gemini CLI: ~/.gemini/.env&lt;/span&gt;
&lt;span class="nv"&gt;GOOGLE_GEMINI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://api.evolink.ai/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One gateway → unified cost, automatic fallback on 429/5xx, model switching via routing rules.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://evolink.ai/blog/one-endpoint-coding-clis?utm_source=devto&amp;amp;utm_medium=community&amp;amp;utm_campaign=codex_mobile_hooks&amp;amp;utm_content=codex-mobile-devto" rel="noopener noreferrer"&gt;Setup guide for routing all three CLIs →&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  References
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/codex/changelog" rel="noopener noreferrer"&gt;Codex Changelog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/codex/hooks" rel="noopener noreferrer"&gt;Hooks Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/codex/enterprise/access-tokens" rel="noopener noreferrer"&gt;Access Tokens&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>programming</category>
      <category>api</category>
    </item>
    <item>
      <title>I Tried TencentDB Agent Memory — Here's What the Token Reduction Looks Like</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Fri, 15 May 2026 07:47:26 +0000</pubDate>
      <link>https://dev.to/evan-dong/i-tried-tencentdb-agent-memory-heres-what-the-token-reduction-looks-like-2pm4</link>
      <guid>https://dev.to/evan-dong/i-tried-tencentdb-agent-memory-heres-what-the-token-reduction-looks-like-2pm4</guid>
      <description>&lt;h1&gt;
  
  
  I Tried TencentDB Agent Memory — Here's What the Token Reduction Looks Like
&lt;/h1&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7650bi3yd6scsa0uhm6d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7650bi3yd6scsa0uhm6d.jpg" alt="TencentDB Agent Memory four-tier architecture" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The context window problem in long-running agents is familiar: by turn 20, you are paying for tool logs the agent does not need anymore. Truncation loses detail. Summarization compresses but also forgets.&lt;/p&gt;

&lt;p&gt;Tencent Cloud open-sourced TencentDB Agent Memory (MIT license, May 2026), and it takes a different approach: offload the verbose stuff to local files, keep a Mermaid task graph in context, let the agent drill back in when it needs specifics.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;Four memory layers, each traceable back to raw data:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;L0 Conversation&lt;/strong&gt;: raw dialogue + tool logs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;L1 Atom&lt;/strong&gt;: structured facts extracted every N conversations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;L2 Scenario&lt;/strong&gt;: aggregated solution patterns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;L3 Persona&lt;/strong&gt;: user behavior profiles built over time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The short-term trick: verbose tool output gets offloaded to &lt;code&gt;refs/*.md&lt;/code&gt; files. In context, only a lightweight Mermaid graph remains. When the agent needs a specific output, it retrieves by &lt;code&gt;node_id&lt;/code&gt;.&lt;/p&gt;
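
&lt;p&gt;To make the offload idea concrete, here is a toy sketch of the pattern: verbose output goes to a file under &lt;code&gt;refs/&lt;/code&gt;, a small Mermaid graph stays in context, and the full text is fetched by &lt;code&gt;node_id&lt;/code&gt; on demand. This is not the project's code; the class, method names, and ID scheme are invented for illustration.&lt;/p&gt;

```python
import tempfile
from pathlib import Path


class OffloadStore:
    """Toy illustration: park verbose tool output on disk, keep a small graph in context."""

    def __init__(self, root):
        self.refs = Path(root) / "refs"
        self.refs.mkdir(parents=True, exist_ok=True)
        self.nodes = []  # (node_id, short label) pairs, in execution order

    def offload(self, label, output):
        """Write the full output to refs/NODE_ID.md and remember only the label."""
        node_id = f"n{len(self.nodes) + 1}"
        (self.refs / f"{node_id}.md").write_text(output, encoding="utf-8")
        self.nodes.append((node_id, label))
        return node_id

    def graph(self):
        """Render the in-context summary as a Mermaid flowchart."""
        lines = ["graph TD"]
        lines += [f'    {nid}["{label}"]' for nid, label in self.nodes]
        lines += [
            f"    {a[0]} --> {b[0]}" for a, b in zip(self.nodes, self.nodes[1:])
        ]
        return "\n".join(lines)

    def retrieve(self, node_id):
        """Load the full output back when the agent needs specifics."""
        return (self.refs / f"{node_id}.md").read_text(encoding="utf-8")


store = OffloadStore(tempfile.mkdtemp())
first = store.offload("run tests", "4,000 lines of pytest output would go here")
store.offload("read config", "full file contents would go here")
print(store.graph())          # the few lines that stay in context
print(store.retrieve(first))  # the detail, pulled back in by node_id
```

&lt;p&gt;The point of the design is that the graph grows by one short line per step while the expensive text stays out of the prompt until it is asked for.&lt;/p&gt;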

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;p&gt;According to the project's benchmarks (long-horizon sessions, not isolated turns):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Task&lt;/th&gt;
&lt;th&gt;Success Change&lt;/th&gt;
&lt;th&gt;Token Change&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;WideSearch&lt;/td&gt;
&lt;td&gt;33% → 50%&lt;/td&gt;
&lt;td&gt;−61.38%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SWE-bench&lt;/td&gt;
&lt;td&gt;58.4% → 64.2%&lt;/td&gt;
&lt;td&gt;−33.09%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PersonaMem&lt;/td&gt;
&lt;td&gt;48% → 76% accuracy&lt;/td&gt;
&lt;td&gt;N/A&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;WideSearch shows the largest token reduction, which makes sense: that is where context accumulates fastest. The SWE-bench improvement is real but modest (+5.8 points, roughly +9.9% relative).&lt;/p&gt;
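
&lt;p&gt;As a quick check on how the success figures can be read, the sketch below computes both the percentage-point change and the relative change for each row of the table. The function name is just a placeholder.&lt;/p&gt;

```python
def success_change(before, after):
    """Return (percentage-point change, relative change in percent)."""
    points = after - before
    return points, points / before * 100


# Before/after success rates copied from the table above.
results = {
    "WideSearch": (33.0, 50.0),
    "SWE-bench": (58.4, 64.2),
    "PersonaMem": (48.0, 76.0),
}

for task, (before, after) in results.items():
    points, relative = success_change(before, after)
    print(f"{task}: +{points:.1f} points, +{relative:.2f}% relative")

# WideSearch: +17.0 points, +51.52% relative
# SWE-bench: +5.8 points, +9.93% relative
# PersonaMem: +28.0 points, +58.33% relative
```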

&lt;p&gt;Important caveat: these are self-reported by the project team, not independently verified.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Setup (OpenClaw)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw plugins &lt;span class="nb"&gt;install&lt;/span&gt; @tencentdb-agent-memory/memory-tencentdb
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;//&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;~/.openclaw/openclaw.json&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"memory-tencentdb"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"offload"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"enabled"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;openclaw gateway restart
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is it. Storage defaults to SQLite + sqlite-vec, so no external DB is needed. Setting &lt;code&gt;offload.enabled&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt; is what activates the Mermaid compression; without it you only get long-term memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Layers of Cost Optimization
&lt;/h2&gt;

&lt;p&gt;Memory cuts tokens per call. But you are still paying the provider's per-token rate, and if the provider has an outage, the agent stalls.&lt;/p&gt;

&lt;p&gt;If you route agent LLM calls through a gateway, you get a second optimization layer: model routing (pick the cheapest capable model per task), automatic fallback on 429/5xx, and a unified cost dashboard.&lt;/p&gt;

&lt;p&gt;For Hermes, this means setting &lt;code&gt;MODEL_BASE_URL&lt;/code&gt; to a gateway endpoint:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--name&lt;/span&gt; hermes-memory &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MODEL_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://api.evolink.ai/v1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;MODEL_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-key &lt;span class="se"&gt;\&lt;/span&gt;
  hermes-memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fewer tokens × lower cost per token = compounding savings.&lt;/p&gt;
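
&lt;p&gt;A small worked example of how the two layers combine. The 33.09% token reduction is the SWE-bench figure from the table above; the 20% cheaper per-token rate is a made-up number used only to show the arithmetic.&lt;/p&gt;

```python
def combined_cost_ratio(token_reduction, price_reduction):
    """Fraction of the original bill that remains; both inputs are fractions such as 0.33."""
    return (1.0 - token_reduction) * (1.0 - price_reduction)


# 0.3309: SWE-bench token reduction quoted in the benchmarks table.
# 0.20: hypothetical per-token price reduction, for illustration only.
ratio = combined_cost_ratio(0.3309, 0.20)
print(f"remaining cost: {ratio:.1%}, total savings: {1 - ratio:.1%}")
# remaining cost: 53.5%, total savings: 46.5%
```

&lt;p&gt;Note that the savings multiply rather than add: 33.09% and 20% combine to about 46.5%, not 53.09%.&lt;/p&gt;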

&lt;p&gt;&lt;a href="https://evolink.ai/blog/unified-api-multi-model-ai-apps?utm_source=devto&amp;amp;utm_medium=community&amp;amp;utm_campaign=agent_memory_guide&amp;amp;utm_content=agent-memory-devto" rel="noopener noreferrer"&gt;More on unified API routing for multi-model apps →&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Only OpenClaw and Hermes are supported today&lt;/li&gt;
&lt;li&gt;Offloading is off by default&lt;/li&gt;
&lt;li&gt;SQLite is single-agent; concurrent access needs the Tencent Cloud Vector DB backend&lt;/li&gt;
&lt;li&gt;Benchmarks are project-reported&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/Tencent/TencentDB-Agent-Memory" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/Tencent/TencentDB-Agent-Memory/blob/main/README_CN.md" rel="noopener noreferrer"&gt;中文 README&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>api</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Codex vs Claude Code vs Cursor: What Changed in May 2026 and What Can Be Routed</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Thu, 14 May 2026 12:50:13 +0000</pubDate>
      <link>https://dev.to/evan-dong/codex-vs-claude-code-vs-cursor-what-changed-in-may-2026-and-what-can-be-routed-2997</link>
      <guid>https://dev.to/evan-dong/codex-vs-claude-code-vs-cursor-what-changed-in-may-2026-and-what-can-be-routed-2997</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd4yl3eu36rwu7etp6p6m.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fd4yl3eu36rwu7etp6p6m.png" alt="Cursor cloud agent development environments" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A common setup now is Codex CLI for scaffolding, Claude Code for refactoring, and Cursor for IDE or cloud-agent workflows. This month all three shipped infra updates — on different dates, solving different problems.&lt;/p&gt;

&lt;p&gt;Here is what changed, what can be wired through one endpoint, and what has to remain separate.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Updates
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Codex CLI — Windows Sandbox (May 13)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Codex now has real OS-level sandboxing on Windows. Dedicated user accounts (&lt;code&gt;CodexSandboxOffline&lt;/code&gt;/&lt;code&gt;CodexSandboxOnline&lt;/code&gt;), per-account firewall rules, helper binaries for privilege boundaries. Linux already had seccomp/bubblewrap; Windows finally caught up.&lt;/p&gt;

&lt;p&gt;Previously, Windows sandbox attempts used synthetic SIDs and proxy-based network blocking — programs could bypass them by implementing their own networking stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Claude Code — Doubled Five-Hour Rate Limits (May 6)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Anthropic doubled Claude Code's five-hour rate limits for Pro, Max, Team, and seat-based Enterprise plans. It also removed the peak-hours limit reduction for Pro and Max.&lt;/p&gt;

&lt;p&gt;This was announced May 6, a week before the other updates. If you were splitting refactoring tasks to stay under the five-hour cap, you have roughly 2x headroom.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cursor — Cloud Agent Environments (May 13)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Cloud agent environments now come with multi-repo support, Dockerfile config with build secrets, layer caching (70% faster on cache hits), version history, rollback, scoped egress and secrets, and audit logging.&lt;/p&gt;

&lt;p&gt;Dockerfile auto-config is in private beta, rolling out to Enterprise teams.&lt;/p&gt;

&lt;h2&gt;
  
  
  Quick Comparison
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Codex&lt;/th&gt;
&lt;th&gt;Claude Code&lt;/th&gt;
&lt;th&gt;Cursor&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Updated&lt;/td&gt;
&lt;td&gt;May 13&lt;/td&gt;
&lt;td&gt;May 6&lt;/td&gt;
&lt;td&gt;May 13&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;What&lt;/td&gt;
&lt;td&gt;Windows sandbox&lt;/td&gt;
&lt;td&gt;2x five-hour limits&lt;/td&gt;
&lt;td&gt;Cloud dev environments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Runs&lt;/td&gt;
&lt;td&gt;Local sandbox&lt;/td&gt;
&lt;td&gt;Local terminal&lt;/td&gt;
&lt;td&gt;Cloud + IDE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-repo&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Isolation&lt;/td&gt;
&lt;td&gt;OS-level&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Cloud containers&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Best for&lt;/td&gt;
&lt;td&gt;Scaffolding, Windows&lt;/td&gt;
&lt;td&gt;Refactoring, terminal&lt;/td&gt;
&lt;td&gt;End-to-end delivery&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The Problem When You Use More Than One
&lt;/h2&gt;

&lt;p&gt;You end up with multiple provider keys, billing dashboards, and rate-limit surfaces. When Claude Code hits its five-hour cap, nothing falls back automatically by default. To compare spend across the routable CLIs, you have to export data from separate consoles.&lt;/p&gt;

&lt;h2&gt;
  
  
  One Gateway Setup
&lt;/h2&gt;

&lt;p&gt;Point the routable CLIs at one endpoint:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight toml"&gt;&lt;code&gt;&lt;span class="c"&gt;# Codex: ~/.codex/config.toml&lt;/span&gt;
&lt;span class="nn"&gt;[api]&lt;/span&gt;
&lt;span class="py"&gt;base_url&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;"https://api.evolink.ai/v1"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Claude Code&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"https://api.evolink.ai"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;ANTHROPIC_AUTH_TOKEN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"your-evolink-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Gemini CLI: ~/.gemini/.env&lt;/span&gt;
&lt;span class="nv"&gt;GEMINI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-evolink-key
&lt;span class="nv"&gt;GOOGLE_GEMINI_BASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://api.evolink.ai/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What you get: unified cost dashboard, automatic fallback on 429/5xx, model switching via routing rules.&lt;/p&gt;
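
&lt;p&gt;As a rough illustration of what "automatic fallback on 429/5xx" means in practice, here is a toy sketch of the decision logic. It is not EvoLink's implementation; the function names, the model labels, and the fake transport are all invented for the example.&lt;/p&gt;

```python
# Status codes treated as "try the next candidate": rate limits and server errors.
RETRYABLE_STATUSES = {429} | set(range(500, 600))


def call_with_fallback(models, send):
    """Try each model in order; move on only when the status is 429 or 5xx.

    `send` is any callable that takes a model name and returns (status, body).
    Returns (model, status, body) for the first non-retryable response,
    or for the last attempt if every candidate was retryable.
    """
    if not models:
        raise ValueError("at least one model is required")
    last = None
    for model in models:
        status, body = send(model)
        last = (model, status, body)
        if status not in RETRYABLE_STATUSES:
            return last  # success, or an error such as 401 that retrying will not fix
    return last


def fake_send(model):
    """Stand-in for a real HTTP call; the model names are made up."""
    responses = {"primary": (429, "rate limited"), "backup": (200, "ok")}
    return responses[model]


print(call_with_fallback(["primary", "backup"], fake_send))
# ('backup', 200, 'ok')
```

&lt;p&gt;A 401 is deliberately not retried here: switching models does not help when the key itself is wrong.&lt;/p&gt;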

&lt;p&gt;Note: Cursor cloud environments use their own backend — not routable through third-party gateways yet. Local Cursor IDE completions can be routed.&lt;/p&gt;

&lt;p&gt;Full walkthrough: &lt;a href="https://evolink.ai/blog/one-endpoint-coding-clis?utm_source=devto&amp;amp;utm_medium=community&amp;amp;utm_campaign=coding_agents_infra&amp;amp;utm_content=coding-agents-devto" rel="noopener noreferrer"&gt;One Gateway for 3 Coding CLIs&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Watch Out For
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Config precedence:&lt;/strong&gt; env vars generally override config files, but if you set &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; globally and also have a project &lt;code&gt;.claude/settings.json&lt;/code&gt;, which one wins depends on the tool version.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth errors:&lt;/strong&gt; a 401 from the gateway usually means the old provider key is still being sent. Restart your terminal after changing env vars.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Upstream limits still apply:&lt;/strong&gt; the gateway adds fallback, but Anthropic's five-hour cap and OpenAI's daily quotas are still enforced upstream.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://openai.com/index/building-codex-windows-sandbox" rel="noopener noreferrer"&gt;OpenAI: Codex Windows Sandbox&lt;/a&gt; (May 13)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.cursor.com/blog/cloud-agent-development-environments" rel="noopener noreferrer"&gt;Cursor: Cloud Agent Environments&lt;/a&gt; (May 13)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.anthropic.com/news/higher-limits-spacex" rel="noopener noreferrer"&gt;Anthropic: Higher Limits&lt;/a&gt; (May 6)&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>programming</category>
      <category>api</category>
    </item>
    <item>
      <title>I Tried the Claude Code Skills Repo That Got 77K Stars — Here Is What Works and What Does Not</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Wed, 13 May 2026 12:13:27 +0000</pubDate>
      <link>https://dev.to/evan-dong/i-tried-the-claude-code-skills-repo-that-got-77k-stars-here-is-what-works-and-what-does-not-57a4</link>
      <guid>https://dev.to/evan-dong/i-tried-the-claude-code-skills-repo-that-got-77k-stars-here-is-what-works-and-what-does-not-57a4</guid>
      <description>&lt;p&gt;I kept running into the same problem with coding agents: I would describe a task, the agent would build something, and it was not what I meant. Not broken — just off.&lt;/p&gt;

&lt;p&gt;The fix turned out to be surprisingly low-tech. Matt Pocock published a repo of "skills" — small instruction files that go in your &lt;code&gt;.claude&lt;/code&gt; directory and change how the agent approaches work. The repo exploded: 77,000+ stars, 6,700+ forks, #1 on GitHub Trending.&lt;/p&gt;

&lt;p&gt;I installed it. Here is what I found.&lt;/p&gt;

&lt;h2&gt;
  
  
  Setup (About 60 Seconds)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx skills@latest add mattpocock/skills
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pick the skills you want. Select &lt;code&gt;/setup-matt-pocock-skills&lt;/code&gt; — it is the bootstrap.&lt;/p&gt;

&lt;p&gt;Inside your agent:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/setup-matt-pocock-skills
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It asks for your issue tracker (GitHub / Linear / local files), triage labels, and docs folder. Done.&lt;/p&gt;

&lt;p&gt;Works with Claude Code, Codex, Cursor, or anything that reads &lt;code&gt;.claude/&lt;/code&gt; directories.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 4 Skills I Actually Use
&lt;/h2&gt;

&lt;p&gt;There are 28 skills. I use 4 regularly.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;/grill-with-docs&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;This is the best one. Before you start coding, the agent asks you detailed questions about what you are building. Edge cases, constraints, why you are doing it this way.&lt;/p&gt;

&lt;p&gt;The output is a &lt;code&gt;CONTEXT.md&lt;/code&gt; file — a shared vocabulary that the agent reads in every future session. One of my projects had a 15-word phrase that got replaced with "materialization cascade." Every session after that was shorter and more accurate.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;/tdd&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Forces test-driven development: tests first, implementation second, verification third. Works great on isolated functions. Gets annoying on complex UI where the tests are hard to specify upfront.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;/diagnose&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Structured debugging instead of the agent guessing. Most useful when the error message does not point to the real problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;/caveman&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Extreme concision. The agent says as little as possible and just executes. Perfect for experienced devs who already know what they want.&lt;/p&gt;

&lt;h2&gt;
  
  
  Everything Else (Quick Map)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Alignment:&lt;/strong&gt; &lt;code&gt;/grill-me&lt;/code&gt; (non-code version), &lt;code&gt;/to-prd&lt;/code&gt;, &lt;code&gt;/to-issues&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Code quality:&lt;/strong&gt; &lt;code&gt;/prototype&lt;/code&gt;, &lt;code&gt;/triage&lt;/code&gt;, &lt;code&gt;/zoom-out&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Workflow:&lt;/strong&gt; &lt;code&gt;/handoff&lt;/code&gt;, &lt;code&gt;/write-a-skill&lt;/code&gt;, &lt;code&gt;/review&lt;/code&gt; (WIP), &lt;code&gt;/setup-pre-commit&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Honest Limits
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Skills are instructions, not plugins.&lt;/strong&gt; They do not make the model smarter — they make the conversation more structured.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;CONTEXT.md&lt;/code&gt; drifts.&lt;/strong&gt; You need to update it as the project evolves, or re-run the grill.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent-agnostic = lowest common denominator.&lt;/strong&gt; Skills work everywhere but cannot use agent-specific features like Claude Code's &lt;code&gt;/goal&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;28 skills is too many to learn at once.&lt;/strong&gt; Start with the four above.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  When It Gets Bigger
&lt;/h2&gt;

&lt;p&gt;Once a team runs multiple agents on the same codebase, the next problem is not prompt quality but routing: who uses which model, how much each workflow costs, and where the API keys live.&lt;/p&gt;

&lt;p&gt;Skills handle the agent behavior side. For the model routing side, connecting Claude Code CLI to a gateway makes the access path concrete. The &lt;a href="https://docs.evolink.ai/en/integration-guide/claude-code-cli?utm_source=devto&amp;amp;utm_medium=community&amp;amp;utm_campaign=claude_skills&amp;amp;utm_content=claude-skills-devto" rel="noopener noreferrer"&gt;EvoLink Claude Code CLI guide&lt;/a&gt; documents that setup if your team is at that stage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who Should Use This
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;✅ Daily coding agent user who wants structured quality control&lt;/li&gt;
&lt;li&gt;✅ Team with domain-specific terminology that confuses agents&lt;/li&gt;
&lt;li&gt;✅ Multi-developer project using coding agents on the same repo&lt;/li&gt;
&lt;li&gt;❌ One-off scripts or throwaway code&lt;/li&gt;
&lt;li&gt;❌ Projects too small for shared language docs&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;a href="https://github.com/mattpocock/skills" rel="noopener noreferrer"&gt;mattpocock/skills on GitHub&lt;/a&gt; | &lt;a href="https://skills.sh/mattpocock/skills" rel="noopener noreferrer"&gt;skills.sh installer&lt;/a&gt; | &lt;a href="https://docs.anthropic.com/en/docs/claude-code" rel="noopener noreferrer"&gt;Claude Code docs&lt;/a&gt;&lt;/p&gt;





</description>
      <category>ai</category>
      <category>claude</category>
      <category>programming</category>
      <category>api</category>
    </item>
    <item>
      <title>Stop Asking Claude Code for Markdown Specs. Ask for HTML Artifacts.</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Sat, 09 May 2026 09:30:17 +0000</pubDate>
      <link>https://dev.to/evan-dong/stop-asking-claude-code-for-markdown-specs-ask-for-html-artifacts-16ke</link>
      <guid>https://dev.to/evan-dong/stop-asking-claude-code-for-markdown-specs-ask-for-html-artifacts-16ke</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpbs.twimg.com%2Fmedia%2FHHz_ftzaIAAwkQs%3Fformat%3Djpg%26name%3Dmedium" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fpbs.twimg.com%2Fmedia%2FHHz_ftzaIAAwkQs%3Fformat%3Djpg%26name%3Dmedium" alt="Using Claude Code: The Unreasonable Effectiveness of HTML cover" width="1200" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Claude Code is very good at writing Markdown. That does not mean Markdown should be the default output for every task.&lt;/p&gt;

&lt;p&gt;Thariq from the Claude Code team recently described a workflow where he increasingly asks Claude Code for HTML instead of Markdown. The reason is practical: long Markdown specs are easy to generate but hard to read. HTML can turn the same information into a navigable, visual, and sometimes interactive artifact.&lt;/p&gt;

&lt;h2&gt;
  
  
  When HTML Beats Markdown
&lt;/h2&gt;

&lt;p&gt;Use HTML when the output is meant to be consumed by people, not maintained line by line in Git.&lt;/p&gt;

&lt;p&gt;Good fits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PR walkthroughs&lt;/li&gt;
&lt;li&gt;design option comparisons&lt;/li&gt;
&lt;li&gt;architecture explainers&lt;/li&gt;
&lt;li&gt;onboarding docs&lt;/li&gt;
&lt;li&gt;debugging reports&lt;/li&gt;
&lt;li&gt;one-off planning tools&lt;/li&gt;
&lt;li&gt;draggable prioritization boards&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Keep Markdown for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;READMEs&lt;/li&gt;
&lt;li&gt;changelogs&lt;/li&gt;
&lt;li&gt;durable docs&lt;/li&gt;
&lt;li&gt;API references&lt;/li&gt;
&lt;li&gt;anything that needs clean Git diff review&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Example: PR Review Artifact
&lt;/h2&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Summarize this PR in Markdown.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Try:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a single self-contained HTML PR walkthrough.
Render the important diff areas with inline annotations.
Color-code findings by severity.
Add a manual verification checklist at the bottom.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That gives reviewers something closer to a focused review interface than a wall of bullets.&lt;/p&gt;

&lt;h2&gt;
  
  
  Example: Implementation Options
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Generate 5 implementation approaches as one HTML file.
Use a comparison grid.
For each approach show:
- complexity
- migration risk
- test impact
- recommended use case
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is much easier to scan than five Markdown sections stacked vertically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trade-Offs
&lt;/h2&gt;

&lt;p&gt;HTML is not always better.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concern&lt;/th&gt;
&lt;th&gt;Markdown&lt;/th&gt;
&lt;th&gt;HTML&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Git diffs&lt;/td&gt;
&lt;td&gt;Great&lt;/td&gt;
&lt;td&gt;Noisy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long-term docs&lt;/td&gt;
&lt;td&gt;Great&lt;/td&gt;
&lt;td&gt;Mixed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Visual hierarchy&lt;/td&gt;
&lt;td&gt;Limited&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Interactivity&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;Possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sharing in browser&lt;/td&gt;
&lt;td&gt;Requires renderer&lt;/td&gt;
&lt;td&gt;Native&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The rule I use: Markdown is the source. HTML is the reading surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Prompt
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Create a self-contained HTML explainer for this feature.
Audience: an engineer who has not seen this code before.
Include a visual summary, annotated code snippets, risks, and a next-step checklist.
Do not add external dependencies.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The real insight is not "HTML is better than Markdown." It is that AI-generated output does not have to be plain text. If the model can generate a useful interface, ask for the interface.&lt;/p&gt;




&lt;p&gt;For teams building Claude Code workflows across multiple models, &lt;a href="https://evolink.ai?utm_source=devto&amp;amp;utm_medium=community&amp;amp;utm_campaign=claude_html_output&amp;amp;utm_content=claude-code-html-over-markdown" rel="noopener noreferrer"&gt;EvoLink&lt;/a&gt; provides unified API access to Claude and other frontier models.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>image</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>OpenAI's New Realtime Voice Models Can Think, Translate, and Transcribe — Here's What Developers Need to Know</title>
      <dc:creator>Evan-dong</dc:creator>
      <pubDate>Fri, 08 May 2026 13:36:06 +0000</pubDate>
      <link>https://dev.to/evan-dong/openais-new-realtime-voice-models-can-think-translate-and-transcribe-heres-what-developers-5hab</link>
      <guid>https://dev.to/evan-dong/openais-new-realtime-voice-models-can-think-translate-and-transcribe-heres-what-developers-5hab</guid>
      <description>&lt;p&gt;OpenAI just shipped three realtime voice models through its API. One reasons at GPT-5 level during live calls, one translates speech from 70+ languages in real time, and one does streaming transcription. All three are available today.&lt;/p&gt;

&lt;p&gt;Let me break down what matters for developers.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Models
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GPT-Realtime-2&lt;/strong&gt; handles voice conversations with GPT-5-level reasoning. The key difference from previous voice models: it can call tools mid-conversation without going silent. It narrates what it's doing while executing — OpenAI calls this "preamble."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GPT-Realtime-Translate&lt;/strong&gt; does real-time voice translation. 70+ input languages, 13 output languages. End-to-end audio processing (no intermediate text step), which preserves tone and emotion.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GPT-Realtime-Whisper&lt;/strong&gt; is streaming speech-to-text. Words appear as the speaker talks. Built for live captions and meeting transcription.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integration Options
&lt;/h2&gt;

&lt;p&gt;All three are served through the Realtime API, which supports three connection methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;WebRTC&lt;/strong&gt; — browser-based, lowest latency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WebSocket&lt;/strong&gt; — server-side, more control&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SIP&lt;/strong&gt; — telephony integration&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  GPT-Realtime-2: Voice Agents That Actually Work
&lt;/h2&gt;

&lt;p&gt;If you've built voice agents before, you know the pain: tool calls create dead air. The user asks something that requires a database lookup, and the agent goes silent for 2-3 seconds. Feels broken.&lt;/p&gt;

&lt;p&gt;GPT-Realtime-2 solves this with preamble — it talks through its actions while executing them. "Let me check your calendar... I see you have a meeting with Alex Kim in 12 minutes." The tool call happens in parallel with the speech.&lt;/p&gt;

&lt;p&gt;Other developer-relevant specs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;128K context window (up from 32K)&lt;/li&gt;
&lt;li&gt;Handles interruptions without losing context&lt;/li&gt;
&lt;li&gt;Better instruction following for system prompts&lt;/li&gt;
&lt;li&gt;Text tokens: $4/$16 per million (input/output)&lt;/li&gt;
&lt;li&gt;Audio tokens: $32/$64 per million (input/output)&lt;/li&gt;
&lt;/ul&gt;
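
&lt;p&gt;To make those rates concrete, here is a minimal Python sketch that prices a single call. The per-million rates come from the list above. The token counts and the names session_cost and example_call are made up for illustration, and the audio prices are assumed to follow the same input/output order as the text prices.&lt;/p&gt;

```python
# USD per million tokens, copied from the list above.
# Assumption: the audio figures follow the same input/output order as the text figures.
PRICES_PER_MILLION = {
    "text_in": 4.0,
    "text_out": 16.0,
    "audio_in": 32.0,
    "audio_out": 64.0,
}


def session_cost(tokens):
    """Return the USD cost for a dict of token counts keyed like PRICES_PER_MILLION."""
    return sum(
        PRICES_PER_MILLION[kind] * count / 1_000_000
        for kind, count in tokens.items()
    )


# Hypothetical token counts for one support call.
# Real counts would come from the usage data your own sessions report.
example_call = {
    "text_in": 20_000,
    "text_out": 2_000,
    "audio_in": 15_000,
    "audio_out": 10_000,
}

print(f"Estimated cost: ${session_cost(example_call):.2f}")
```

&lt;p&gt;With these made-up counts, audio accounts for most of the bill even though it is fewer tokens than text, so audio volume is probably the number to watch when budgeting.&lt;/p&gt;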

&lt;h2&gt;
  
  
  GPT-Realtime-Translate: The $0.034/min Disruption
&lt;/h2&gt;

&lt;p&gt;The translation model is priced at $0.034 per minute. For context, a human simultaneous interpreter costs $25-44 per minute.&lt;/p&gt;
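
&lt;p&gt;A quick sanity check on that gap, as a small Python sketch. Both per-minute rates are the figures quoted above. The 60-minute meeting length and the helper names cost_ratio and meeting_cost are illustrative assumptions.&lt;/p&gt;

```python
MODEL_PER_MIN = 0.034         # USD per minute, figure quoted in the article
HUMAN_PER_MIN = (25.0, 44.0)  # USD per minute range, figure quoted in the article


def cost_ratio(human_rate, model_rate=MODEL_PER_MIN):
    """How many times the human per-minute rate exceeds the model rate."""
    return human_rate / model_rate


def meeting_cost(minutes, rate_per_min):
    """Total USD cost of a meeting at a flat per-minute rate."""
    return minutes * rate_per_min


low, high = (cost_ratio(rate) for rate in HUMAN_PER_MIN)
print(f"Human interpreter: {low:.0f}x to {high:.0f}x the model rate")

minutes = 60  # hypothetical meeting length
print(f"Model cost for {minutes} min: ${meeting_cost(minutes, MODEL_PER_MIN):.2f}")
print(
    f"Human cost for {minutes} min: "
    f"${meeting_cost(minutes, HUMAN_PER_MIN[0]):.0f} to "
    f"${meeting_cost(minutes, HUMAN_PER_MIN[1]):.0f}"
)
```

&lt;p&gt;At the quoted rates the ratio works out to roughly 735x to 1,294x, which is why a three-orders-of-magnitude framing is a fair approximation rather than an exact figure.&lt;/p&gt;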

&lt;p&gt;Technical details:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Processes raw audio end-to-end (not cascaded speech-to-text-to-speech)&lt;/li&gt;
&lt;li&gt;Preserves speaker emotion and tone&lt;/li&gt;
&lt;li&gt;Works best with brief pauses between thoughts (labeled "turn-based" in docs)&lt;/li&gt;
&lt;li&gt;Occasional hallucinations still occur&lt;/li&gt;
&lt;li&gt;Supports language switching mid-stream&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The end-to-end approach is what makes the quality difference. A cascaded pipeline flattens speech into text, so tone and emotion are gone before the translated audio is synthesized. This model skips the text step entirely.&lt;/p&gt;

&lt;h2&gt;
  
  
  GPT-Realtime-Whisper: Streaming Transcription
&lt;/h2&gt;

&lt;p&gt;If you need real-time captions or meeting transcription, this is the model. Low-latency streaming output as the speaker talks.&lt;/p&gt;

&lt;h2&gt;
  
  
  What You Can Build
&lt;/h2&gt;

&lt;p&gt;The three models together cover the full voice infrastructure stack:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Customer support agents&lt;/strong&gt; that can reason, look up accounts, and process requests — all by voice&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time translation layers&lt;/strong&gt; for international meetings at roughly 1/1000th the cost of human interpreters (about 1/735 to 1/1,294 at the per-minute rates quoted above)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live captioning systems&lt;/strong&gt; for streaming, conferences, or accessibility&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multilingual voice assistants&lt;/strong&gt; that handle code-switching naturally&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Telephony bots&lt;/strong&gt; via SIP integration that feel like talking to a person&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://openai.com/index/advancing-voice-intelligence-with-new-models-in-the-api/" rel="noopener noreferrer"&gt;OpenAI Blog&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/api/docs/guides/realtime" rel="noopener noreferrer"&gt;Realtime API Docs&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/api/docs/models/gpt-realtime" rel="noopener noreferrer"&gt;Model Reference&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developers.openai.com/cookbook/examples/voice_solutions/one_way_translation_using_realtime_api" rel="noopener noreferrer"&gt;Translation Cookbook&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>api</category>
      <category>image</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
