Yurukusa

What 12 Anthropic Academy Quizzes Taught Me About My Own Blind Spots

Final part. This series started because I passed all 12 Anthropic Academy certifications — then looked at my wrong answers.

Over the past two weeks I've written about each misconception individually (read the full series). Here's the full picture.

The List

  1. Prompt caching has a 1,024-token minimum — and it fails silently. I was adding cache_control to short prompts for months, paying full price.

  2. Cache breakpoints go after the last tool definition — not the system prompt. I was leaving tool schemas uncached on every call.

  3. Extended Thinking returns two blocks — a thinking block and a text block, with cryptographic signatures to prevent tampering. My streaming parser worked by accident.

  4. Re-ranking is a separate LLM step — not just sorting by similarity score. I'd been skipping it entirely and sending noisy results to Claude.

  5. Anthropic doesn't provide an embedding model — the recommended provider is Voyage AI, requiring a separate account and API key.

  6. MCP Inspector exists — and it's the first tool you should use when debugging MCP servers, not Claude Desktop.

  7. ListToolsRequest is how clients discover tools — your tool descriptions are the instructions Claude reads to decide when to use each tool.

  8. Workflows ≠ Agents — predefined steps = workflow. Dynamic LLM-driven control flow = agent. I'd been calling everything an agent.

  9. Temperature defaults to 1.0 — maximum creativity by default. I'd been running classification at full temperature without realizing it.
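Misconceptions 1 and 2 above are both about where the cache breakpoint lives. A minimal sketch of the fix, assuming the `cache_control: {"type": "ephemeral"}` format from Anthropic's prompt-caching docs (the tool schemas themselves are hypothetical):

```python
# Place the cache breakpoint on the LAST tool definition so the entire
# prefix up to and including the tool block is cached in one shot.
# Marking the system prompt instead leaves the tool schemas uncached.

def mark_cache_breakpoint(tools: list[dict]) -> list[dict]:
    """Attach an ephemeral cache_control marker to the final tool only."""
    marked = [dict(t) for t in tools]  # shallow-copy each definition
    if marked:
        marked[-1]["cache_control"] = {"type": "ephemeral"}
    return marked

tools = [
    {"name": "get_weather", "description": "Look up current weather",
     "input_schema": {"type": "object", "properties": {}}},
    {"name": "search_docs", "description": "Search internal docs",
     "input_schema": {"type": "object", "properties": {}}},
]

tools = mark_cache_breakpoint(tools)
assert "cache_control" not in tools[0]          # earlier tools: unmarked
assert tools[-1]["cache_control"] == {"type": "ephemeral"}
```

Remember misconception 1, though: if the prefix before the breakpoint is under the minimum token count, this marker is silently ignored and you pay full price anyway.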
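On misconception 3: a parser that assumes `content[0]` is the text block breaks (or works by accident) once Extended Thinking is on. A sketch of block-type-aware parsing; the block dicts below mimic the documented shape but are illustrative, not a live API response:

```python
# An Extended Thinking response carries (at least) a "thinking" block and
# a "text" block. Read blocks by type, never by position.

def split_blocks(content: list[dict]) -> tuple[str, str]:
    """Return (thinking, answer) from a list of content blocks."""
    thinking = "".join(b["thinking"] for b in content if b["type"] == "thinking")
    answer = "".join(b["text"] for b in content if b["type"] == "text")
    return thinking, answer

response_content = [
    {"type": "thinking",
     "thinking": "Let me check the math first...",
     "signature": "..."},  # cryptographic signature: proves the block wasn't tampered with
    {"type": "text", "text": "The answer is 42."},
]

thinking, answer = split_blocks(response_content)
print(answer)  # → The answer is 42.
```

The same rule applies when streaming: each delta event names the block index and type, so accumulate per block rather than concatenating everything into one string.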
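Misconception 8 is easiest to see in code. A toy sketch of the distinction, with `model_decide` standing in for a real LLM call (hypothetical stub, not an Anthropic API):

```python
# Workflow: predefined steps in a fixed order. The code decides.
def workflow(doc: str) -> str:
    doc = doc.strip()
    doc = doc.lower()
    return doc

# Agent: the model chooses the next action each turn. Control flow is dynamic.
def agent(task: str, model_decide, max_turns: int = 5) -> str:
    state = task
    for _ in range(max_turns):
        action = model_decide(state)  # e.g. "search", "summarize", "done"
        if action == "done":
            break
        state = f"{state} -> {action}"
    return state

# Toy decision function: one "summarize" step, then stop.
decisions = iter(["summarize", "done"])
print(agent("report", lambda s: next(decisions)))  # → report -> summarize
```

If you can draw your system as a flowchart before it runs, it's a workflow; if the shape of the run only exists after the model has made its choices, it's an agent.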

The Pattern

These all share the same shape: things that don't cause errors but silently cost you money or degrade your results. The uncached prompts worked. The noisy RAG results produced answers. The temperature-1.0 classifications were usually right.

"It works" is not the same as "it works correctly." The gap between the two is where the money goes.

What I'd Recommend

If you've been using Claude's API for more than a month, take the quizzes. Not the courses (though those are good too) — the quizzes specifically. They're designed to catch the things you think you know but don't.

Three courses stood out:

  • Building with Claude's API — the final assessment covers caching, streaming, Extended Thinking, and tool use in ways that expose gaps
  • RAG with Claude — the re-ranking and contextual chunking sections changed how I build retrieval
  • Introduction to MCP + Advanced Topics — if you're building MCP servers, the protocol internals matter

Every certification is verifiable. Every quiz tests understanding over memorization. And apparently, every one of them can humble someone who thought they knew what they were doing.

Start here: anthropic.skilljar.com


Which of these surprised you the most? Or did the quizzes catch you on something I didn't mention?

Is your Claude Code setup actually safe? Run npx cc-health-check — a free 20-point diagnostic. Score below 80? The Claude Code Ops Kit fixes everything in one command.