DEV Community: Toruk Makto

I turned 14 business books into Claude Code skills that auto-trigger based on your question

Toruk Makto — Wed, 29 Apr 2026 02:08:28 +0000

why this exists

been using claude for almost all my business planning - pricing, customer interviews, marketing strategy, sales calls. the problem is claude knows these books from training data but only surface level. ask it about The Mom Test and it'll say "ask open-ended questions." ask it to actually score your customer conversation and it makes up random criteria every time.

wanted something structured. actual decision trees, scoring rubrics, templates that work the same way every time. started with The Mom Test after someone recommended it to me. turned it into a skill. then couldn't stop. 14 books later here we are.

what's actually inside each skill

every skill follows the same structure:

a decision tree at the top that tells you whether this is even the right framework for your problem. half the time founders think they have a messaging problem when it's actually distribution or pricing. the skill catches that before you waste time.

scored checklists you can use in real situations. the mom test skill scores your customer conversations on 10 specific criteria. spin selling has a call planning worksheet. $100M offers has an offer scoring rubric.

honest limitations. every skill tells you what the book got wrong, what's outdated, and when to stop using it. the lean startup skill flags that innovation accounting barely works outside software. crossing the chasm warns you the bowling alley model is mostly theoretical.

conflict resolution between books. storybrand says position yourself as the guide. obviously awesome is more product-centric. the skills map exactly where two frameworks disagree and how to resolve it depending on your situation.

who this actually helps

you're about to do customer interviews → mom test skill gives you exact questions to ask and a scoring rubric to evaluate answers

you're pricing a new product → monetizing innovation walks you through willingness-to-pay research before you build

you're writing your landing page → storybrand gives you a fill-in brandscript template so you stop talking about yourself and start talking about the customer's problem

your marketing isn't converting → the skill figures out whether it's messaging (storybrand), positioning (obviously awesome), channels (traction), or your offer itself ($100M offers)

you're preparing for a B2B sales call → spin selling gives you a call planner with situation, problem, implication, and need-payoff questions mapped out

how to use

clone the repo and symlink into claude code - skills auto-trigger based on your question. or just paste any SKILL.md into chatgpt/gemini/cursor as context. works the same way.

https://github.com/getagentseal/founder-playbook

free and open source. genuinely curious what books you'd want added next.

Reading your AI coding logs: cache hits, retry loops, and other signals

Toruk Makto — Thu, 16 Apr 2026 09:58:49 +0000

Last week I checked my AI coding spend and it was higher than my AWS bill. I'm paying for Claude Code, Codex, Cursor, the occasional Opus burst, and I had no visibility into where any of it went. Just a number going up.

Turns out every AI coding tool already writes session data to disk. Claude Code drops JSONL into ~/.claude/projects/. Codex writes to ~/.codex/sessions/YYYY/MM/DD/. Cursor uses a SQLite database. OpenCode uses SQLite. Pi uses JSONL. All of it is sitting there waiting to be read.

I started reading mine and the patterns are obvious once you look.

What the data shows

This is one week of my actual AI coding usage:

A few things jumped out immediately.

Cache hit rate matters more than I thought. Claude prices cache reads at 1/10th the cost of fresh input. Opus came in at 98.8% cache hits, which sounds great until I noticed Sonnet 4.6 was at 77.1%. That gap is real money. If your system prompt or the first few files in context are unstable, you're paying full price for the same tokens every turn.

Tool counts tell you the agent's mood. 2,126 Bash calls, 990 Reads, 742 Edits in a week. The Read:Edit ratio is roughly 1.3, which is fine. If Read had been 4x higher, I'd know the agent was spelunking instead of executing.

One-shot rate is brutal honesty. Coding shows 88% one shot. The other 12% needed retries (Edit → Bash → Edit). That's where time and tokens leak silently.

Model mix reveals overspending. Opus 4.6 cost $1219 this week. Sonnet 4.6 cost $38. Some of those Opus turns were small Q&A that Sonnet would have handled fine. I haven't run the experiment of routing them yet, but the gap suggests there's real money on the table.

Patterns worth watching for

Signal	What it usually means
Cache hit < 80%	System prompt or context unstable, caching not configured
Lots of Read calls per session	Agent re-reading files, missing context
Low one-shot rate (Coding < 30%)	Retry loops, agent struggling with edits
Opus dominating cost on small turns	Overpowered for the task
`dispatch_agent` heavy	Sub-agent fan-out, expected or excessive
No MCP usage	Either you don't use MCP, or your config is broken
Bash dominated by `git status`, `ls`	Agent exploring instead of executing
Conversation category dominant	Agent talking instead of doing

These aren't verdicts, just starting points. A 60% cache hit on a one-off experiment is fine. A persistent 60% across weeks is a config issue.

How I'm reading this data

There's a tool called codeburn that reads all the session formats and renders this dashboard in your terminal. It supports Claude Code, Codex, Cursor, OpenCode, and Pi. No proxy, no API keys just reads the local files.

npx codeburn report --period week

Repo: https://github.com/AgentSeal/codeburn

Open source, MIT.

Why this matters

We obsess over model choice and pricing tier. We argue about Opus vs Sonnet vs GLM. The discussion online is almost entirely about which model to use, never about what your agent is actually doing once it's running.

The session files have the answer. Every retry, every redundant Read, every cache miss, every misrouted model,it's all there. Looking at it once a week takes ten minuts and tells you more about your spend than any pricing comparison.

Try reading your own sessions for a week. Even if you don't use any tool, just cat a few JSONL files and look at the usage blocks. You'll spot at least one pattern you didn't expect.

Where do your AI coding tokens actually go?

Toruk Makto — Thu, 16 Apr 2026 09:49:05 +0000