DEV Community

jesus manrique
jesus manrique

Posted on • Originally published at guayoyo.tech

Macchiato Days 3, 4 & 5: Red Brutalism, a .sh Domain, and the Night Claude’s Metrics Nearly Made Me Smash My Monitor

Macchiato Days 3-5 — Header

Series: Macchiato's Log — Day 1 · Day 2 · Days 3-5


I promised a daily log. I failed. Three days off the radar. Not out of laziness. Because days 3, 4, and 5 were a meat grinder: a redesign, metrics that Claude hides like a gremlin, features my friends demanded at 11 PM, bugs that stung, and a domain I bought between fixes like grabbing coffee at 3 AM.

This isn't a tutorial. It's a confession.


Day 3 — Red on Black

Day 2 ended with a sidebar that sort of worked. Live metrics, a terminal grid, keyboard shortcuts. Functionally solid. Visually... a mess. The app still looked like a hackathon project: generic dark background, buttons ripped from Electron docs, inconsistent padding, icons screaming "a backend dev built this."

Day 3 I woke up with one fixed idea: Macchiato has to look the way it feels to use it. And using it feels like being up at 2 AM with three monitors, lights off, watching AI agents burn tokens while you sip cold coffee. That's not a pretty app. That's a control room.

What it was never going to be

Before I tell you what it is, let me tell you what I explicitly banned:

  • No glassmorphism. I'm not a crypto dashboard from 2023.
  • No identical card grids. If every panel has no personality, it's useless.
  • No side-stripe accent borders. That's a template, not design.
  • No emoji as icons. I'm not a vegan delivery app.
  • No gradient text. Text is for reading, not admiring.
  • No giant hero metric numbers. That's for selling, not working.
  • No hardcoded uppercase on body text. I'm not yelling data at you.

With that clear, I started building.

The palette

I pulled it from what I see at 2 AM every night: deep black for the background (#0A0A0B), layers of industrial grays (#0F0F10, #18181A, #222224) for panels and surfaces, zinc text (#E4E4E7) that reads without fatigue. Sharp, thin borders at #27272A — enough to separate panels, not to decorate.

And the accent: red. #E8443A.

Not Christmas red. Not error red. Dried blood red. Measured. Industrial. The kind of red you use to say "this is alive" without it looking like a fire alarm.

The reason is simple: amber is Guayoyo. Macchiato is something else. More aggressive. An espresso, not a latte. Red gives it character without being friendly. This tool doesn't want you to like it. It wants you to work fast.

Red is used for one thing only: live data — tokens burning, costs rising, active sessions. If you see red in Macchiato, something is happening. The rest of the interface is monochrome. That constraint makes every pixel of color mean something.

Engineer-grade typography

JetBrains Mono for everything that's data. Metrics, file paths, session names, commands. If it's a number or a path, it's monospaced. For labels, buttons, and navigation: the operating system's system-ui, the one that annoys you the least. No custom Google fonts. No Typekit. The app loads fast or it doesn't load at all.

Surgical sizes: 11px for dense metadata, 12px for table values, 13px for UI body, 15px for panel headings, 18px for view titles. Nothing larger than 22px in the entire app. If you need a huge number to feel important, open Excel.

Maximum letter-spacing of 0.02em on values — enough to read, not enough to look like a restaurant menu.

The result

I sent a screenshot to a designer friend at 10 PM. His exact words: "it looks like a Swiss bank terminal that's also angry."

Exactly.

Macchiato no longer looks like a weekend project. It looks like a tool that charges money to use it. Except it doesn't. But that's a topic for later.


Day 4 — The Night Claude's Metrics Nearly Broke Me

You need to understand something about Claude: Anthropic does not want you to know how much you're spending.

This isn't a conspiracy theory. It's product design. Claude Code does report metrics. But it reports them in inconsistent formats across API versions, across models, across request types. It buries them in text streams. It fragments them in incomplete JSON. It changes them without warning between minor versions.

On Day 2 we managed to read metrics. They were approximate. Good enough to give you a rough idea. But three different people told me the same thing: "this is cool, but how much am I spending exactly?" Not approximately. Exactly.

On Day 4 I set out to make Macchiato accurate to the cent. Naive.

Attempt 1: parse the stream

Every Claude request returns a stream of events. I thought it was just a matter of capturing the final event and reading usage. Beginner's mistake. Claude sends input tokens in the first chunk, output tokens in the last, and cache tokens — which is where the real savings are, sometimes 90% of the cost — it sends... in the middle. Or not at all. Or duplicated. Depends on the model, the API version, and apparently the phase of the moon.

First attempt: garbage. Metrics danced between requests. The same prompt cost $0.14 or $0.38 depending on how Claude had decided to report that day.

Attempt 2: HTTP headers

Anthropic's headers include cost fields on REST requests. But Claude Code uses streaming, and in streaming the headers arrive before the body. Without the full body, you don't know how many output tokens were generated. The math doesn't close until the stream ends, and by then the headers are history.

Second attempt: also garbage.

Attempt 3: state machine

Fine. If the stream is inconsistent, I'll track it myself. I built a state machine that receives every fragment of the stream, classifies it by type (input, output, cache_read, cache_write), and accumulates. When the stream closes, it totals.

It worked. For Claude Opus. Claude Sonnet reported differently. Haiku reported differently. Claude 3.5 reported completely differently — different field names, different units, sometimes tokens instead of credits.

By 11 PM I had the state machine supporting 4 models with 3 different metric formats each. The code looked like a lie detector, not an API parser.

Attempt 4: normalization table

I surrendered. I couldn't maintain a state machine with ifs per model and version. It was fragile, unreadable, and was going to break with the next model Anthropic released.

So I built a normalization table: every provider, every model, every API version, every token type has an explicit mapping to a unified schema inside Macchiato. Claude Opus 4.6 reports cache_creation_input_tokens → Macchiato stores it as cache.write. Claude 3.5 Sonnet reports cache_read_input_tokens in a different field with a different name → same table, different row, same unified schema.

The table has 47 rows. 47 different ways Anthropic tells you how much something cost you.

At 2:17 AM I ran ten test requests. Every single one matched to the cent. I sat there watching Macchiato's sidebar update with exact numbers for five minutes. Not debugging. Pleasure.

I sent the build to two friends. "Now I actually know what each prompt costs," one said. "Before I prayed, now I know," said the other.

That was the point. But I couldn't sleep easy. Because while I was fighting with metrics, they were testing the app. And they found things.


Day 5 — Broadcast, Bugs, Images, and a Domain

I slept four hours. I woke up to a WhatsApp message: "bro the app can't find my node."

The nvm bug

The tester had Node installed via nvm at a non-standard path: ~/.nvm/versions/node/v22.11.0/bin/node. Macchiato looked for node with which, which returned garbage on his system because nvm doesn't export to the global PATH until you run nvm use.

Fix: detection cascade. First which node, then command -v node, then find ~/.nvm -name node -type f, then scan common paths. If nothing works, the UI tells you "Node not found" instead of dying silently like a wounded animal.

The zsh bug

Another tester used zsh with oh-my-zsh and a custom three-line prompt with colors, icons, and probably his zodiac sign. Macchiato's PTY output parser choked on the ANSI escape sequences. The terminal showed corrupted glyphs and flickering.

I had to rewrite the ANSI parser from scratch. Not with regex. With a complete escape sequence table: CSI, OSC, DCS, all of them. Every sequence mapped to its action: move cursor, change color, clear line, hide cursor, show cursor. The new parser ignores cosmetic sequences and only processes the ones that affect real content.

Three hours of reading terminal documentation from 1987. But now Macchiato runs on bash, zsh, fish, and whatever people use who customize their prompt with weather emojis.

The grid bug

A third tester — the one with the three-monitor setup — reported that the terminal grid desynced when resizing the window. Move the left panel border 20 pixels and the right terminal compressed like an accordion. The internal PTY sessions weren't receiving the new column size.

Fix: a resize event system that propagates dimensions from each grid cell to its child PTY session in real time. Every resize triggers a virtual SIGWINCH on the correct session. The grid stays put, the terminals don't deform, the tester's three monitors stopped suffering.

Every bug was a stab wound. Every fix was a medal. This is how you build software people actually use: with feedback from people who aren't afraid to tell you it's broken.

Command broadcast

Between one fix and the next, a tester threw me an idea that should have been obvious: "I have three projects open in the grid. Why do I have to type git pull three times?"

Broadcast mode: a toggle in the toolbar. Flip it on, and every keystroke you type is sent simultaneously to every open terminal in the grid. Like tmux synchronize-panes, but cross-project. git pull across three different repos, one command. npm install in frontend and backend at the same time. docker compose up on two stacks.

I built it in 40 minutes. The first test was git status across three different projects. The tester sent me a ten-second WhatsApp audio that was nothing but laughter. That audio is worth more than any adoption metric.

Image paste

Claude Code and OpenCode accept images if you pass them by path. But the traditional workflow is ridiculous: save a screenshot, remember the path, type the path in the terminal, pray it has no spaces. Three steps and a prayer.

In Macchiato you now copy an image to the clipboard, hit Ctrl+V in any terminal's prompt, and the image is saved to a temp directory, auto-referenced in the command, and cleaned up when the session closes.

New flow: you see a bug, take a screenshot (Shift+Cmd+4 on Mac, Win+Shift+S on Windows), back to Macchiato, Ctrl+V, type "what's going on here," Enter. Claude sees the image and the context. No path. No extra command.

This isn't a feature. It's friction removal.

Metrics Part 2 — The Revenge

The Day 4 parser was accurate. For individual requests. But the testers — blessed testers — found the edge cases I didn't catch at 2 AM:

Truncated streams. When Anthropic rate-limits a request mid-stream, there's no final event with metrics. Macchiato showed "0 tokens" for that request. A lie. The request burned tokens, Anthropic just didn't tell you how many. Fix: detect truncated streams and display the metric as "incomplete" with a visual indicator. I don't show a fake number.

Bodyless errors. HTTP 429 rate limits, 500 server errors, timeouts. None include metrics. Macchiato showed the cost as zero because there was no data. Fix: map HTTP codes to metric status. If the API said 429, the UI shows "rate limited" in red, not a lying zero.

Long sessions. A tester left Macchiato open all afternoon and ran 63 requests. By request 40, the sidebar started lagging. Every new request triggered a full recalculation of session totals — adding 40 requests every time a new one arrives is O(n²). Fix: incremental aggregation cache. Each new request adds its metrics to an accumulator. The sidebar reads the accumulator, it doesn't recalculate. Render time after the fix: under 3ms with 100+ requests in the session.

I tested with a synthetic session of 100 consecutive requests, alternating models and deliberately triggering errors. The sidebar didn't flinch. Metrics exact. For real this time.

macchiato.sh

At 9 PM on Day 5, between a fix and a test, I opened Namecheap. Searched for macchiato.sh. Nobody had registered it. A .sh domain, clean, short, memorable. Fourteen dollars a year.

In two hours I had a minimal landing page up: app name, what it does, project status, and an email field for people who want to try the beta as soon as it drops. No templates. No animations. Bare HTML with the same brutalist palette as the app. Visual coherence from end to end.

Macchiato has a home now: macchiato.sh.


What's Coming: Containers, Clusters, and Real Scale

Three days. Surgical brutalism. Metrics tamed. Broadcast. Images. Street bugs crushed. Its own domain.

But this isn't over. What comes next is more ambitious:

  • Docker inside Macchiato. Run containers, inspect them, view logs, troubleshoot without leaving the app where your code agents already live. No jumping between terminal and Docker Desktop like a kangaroo.
  • Kubernetes. Switch cluster contexts with a shortcut. Pod logs in side panels. Port-forward with a button. What every dev working with K8s needs daily and no terminal tool gives you integrated with your AI agents.
  • Public beta. At macchiato.sh, soon. The first to sign up will be the first to break it. And I need them to break it. Because every bug found now is one that a lonely, frustrated user won't find three months from now.

A Coffee to Keep This Alive

I'll be direct: Macchiato was born to be free. I'm not putting up a paywall. I'm not hiding features behind an "enterprise" plan. I'm not charging you for using your own money on AI APIs.

But macchiato.sh costs. My time costs. And what's coming — Docker, K8s, testing on Windows, Mac, and Linux, signed builds, distribution infrastructure — costs more.

If Macchiato helps you, if it saves you hours, if it makes life easier in that terminal you stare at at 2 AM just like I do, consider throwing a few dollars. Soon I'll enable donations at macchiato.sh. It's not a subscription. It's not a hidden SaaS. It's a coffee. Literally. So the project stays free for everyone.

In the meantime, if you want to be there when the beta drops: macchiato.sh. Leave your email. The first ones in will get bugs. But they'll be our bugs.


Guayoyo Tech

Macchiato is an independent project, but it was born inside Guayoyo Tech. If your company needs to integrate code agents into your development stack — with real metrics, cost control, without depending on a single AI provider — we can help.

Let's talk — Free 30-minute initial consultation.

Top comments (0)