DEV Community: Kamil Buksakowski

How to Set Up Google Workspace Email for Multiple Domains Using Cloudflare

Kamil Buksakowski — Sat, 13 Jun 2026 19:35:43 +0000

I was setting up yet another project and ran into a deceptively simple question: how do I handle email so it looks professional, lets me write as a brand, stays manageable, and does not require paying for a separate Google Workspace user for every address?

Sounds simple, but I ended up learning it the hard way because I had never properly set this up before. One detail is especially easy to miss: if you previously enabled Cloudflare Email Routing on the domain, it can conflict with Google Workspace MX records. Email will not work correctly until you disable it. I cover that in the relevant step.

This tutorial shows the exact setup: one Google Workspace account, multiple domains, multiple brand addresses, and one inbox to manage everything. I'll also show the DNS records, Gmail settings, and the small places where this setup can easily go wrong.

The idea: one inbox, many brands

Before you click anything, it's worth separating the roles of your addresses. This is the whole philosophy of the setup:

private Gmail (johndoe@gmail.com) → owner account, recovery, old logins. You don't use it on the website or in contact with clients.
main business inbox (john@johndoe.dev) → command center, Google Workspace.
public project email (contact@taskly.pl) → the address you give to clients.

The key: you don't create a separate Gmail for every project. Everything lands in one Workspace inbox, and you reply "as a brand". You'll wire up the next product with the same pattern (contact@newproduct.pl → the same inbox).

Step 1 — Google Workspace on the main domain

Go to workspace.google.com and start registration. When choosing a plan, you probably don't need the pricier Standard plan at the beginning — Business Starter is plenty for an MVP. You get Gmail on your own domain, Drive, Calendar, Meet, and an admin panel. The 2 TB and advanced Gemini from the higher plan are overkill for an MVP.

While creating the account, pick the "I already have a domain" option, enter johndoe.dev, and create the user john. Result: john@johndoe.dev.

On the contact details screen, your private Gmail will only serve as contact/recovery — it won't be the target business mail.

Step 2 — verifying the domain in Cloudflare

Google needs to confirm the domain is yours. It will show a TXT record to add. Choose manual verification (Switch to manual verification) — it's clearer and does not require connecting Google directly to your Cloudflare account.

In Cloudflare (for the johndoe.dev domain) → DNS → Records → Add record:

Type: TXT
Name: @
Content: google-site-verification=...   // paste the exact value from Google
TTL: Auto
Comment: Google Workspace verification

Save, go back to Google, and click Verify. If it doesn't pass right away — wait 1–3 minutes. DNS sometimes needs a moment.

Important: Cloudflare Email Routing vs Google MX records

If you previously had Cloudflare Email Routing enabled on the domain (e.g. to forward mail to another address), pay attention here — otherwise you can skip this step.

Email Routing keeps its own MX records:

route1.mx.cloudflare.net
route2.mx.cloudflare.net
route3.mx.cloudflare.net

These records block adding Google's MX and create a confusing setup — you end up with two mail systems at once. As a result, mail doesn't work the way it should, and the Google wizard can keep showing an error for a long time even though the DNS "looks" correct.

Important: disabling just the routing rules is not enough. You have to disable the entire Email Routing service. Check it like this:

Cloudflare → Email → Email Routing → Settings

The status has to be Disabled, not just "rules disabled". If you see Status: Enabled or DNS records: Locked — Google still sees a conflict. Click Disable until you reach:

Status: Disabled
DNS records: Not configured (or Unlocked)

Google's MX records only have a chance to work properly after Email Routing is actually disabled and the old routeX.mx.cloudflare.net records are removed.

Step 3 — Google MX record

In Cloudflare for johndoe.dev → DNS → Records → Add record:

Type: MX
Name: @
Mail server: smtp.google.com
Priority: 1
TTL: Auto
Comment: Google Workspace MX

For new Google Workspace setups, Google usually shows a single MX record: smtp.google.com with priority 1 — no longer the old sets of five ASPMX records. Use exactly what the Google wizard shows you, and don't guess.

A small note from experience: if Google still complains, some panels normalize the MX value and may expect smtp.google.com. with a trailing dot. Cloudflare usually handles that for you — don't force the dot.

Step 4 — SPF

To improve deliverability and help receiving servers trust your mail, add an SPF record. In Cloudflare:

Type: TXT
Name: @
Content: v=spf1 include:_spf.google.com ~all
TTL: Auto
Comment: Google Workspace SPF

This doesn't block the MX itself, but it's worth adding right away. We'll handle DKIM and DMARC at the end, after Gmail is active.

After saving, go back to Google and click Retry. Also do a quick test: send a mail from your private Gmail to john@johndoe.dev and reply the other way. If both directions work — the mail is fine, even if the Google wizard is still complaining.

Step 5 — adding support@ as an email alias

You want support@johndoe.dev, but not as a separate, paid user (every extra user is another paid Workspace seat). You do it as an alias to the existing account.

In Google Admin:

admin.google.com → Directory → Users → [your user]
→ User information → Alternate email addresses / Email aliases

In the alias field, enter only support (Google appends the domain itself). Final result:

support@johndoe.dev → lands in → john@johndoe.dev

Watch out: don't click "Add user" or "Set up mail routing" — that creates a new paid account / is for migration, not for an alias.

After adding, test it: send from your private Gmail to support@johndoe.dev — it should land in john@johndoe.dev.

Step 6 — adding the project domain as a Secondary domain

Now you attach the project domain taskly.pl to the same Workspace. There's an important choice here:

Google Admin → Account → Domains → Manage domains → Add a domain

Google will ask about the domain type. For this setup, I prefer Secondary domain, not Domain alias. A domain alias can also work in simpler cases (it maps aliased addresses onto existing users), but a secondary domain gives you more explicit control over project-specific addresses and lets you create normal, separate ones:

contact@taskly.pl
billing@taskly.pl
support@taskly.pl

Next, Google will ask for DNS verification — this time you add the TXT record in Cloudflare in the taskly.pl domain, not in johndoe.dev:

Type: TXT
Name: @
Content: google-site-verification=...
TTL: Auto

And after verification — the MX record (same as before, smtp.google.com, priority 1).

Finally, add the project alias (just like support@ in step 5):

contact@taskly.pl → lands in → john@johndoe.dev

Step 7 — "Send as" in Gmail (the client sees the brand, you have one inbox)

The alias alone makes mail arrive in your inbox. But when you reply, by default it goes out from john@johndoe.dev — and we want the client to see the brand. So we wire up sending.

In Gmail (on the Workspace account):

Settings → See all settings → Accounts → Send mail as → Add another email address

Enter:

Name: Taskly
Email address: contact@taskly.pl

Leave the "Treat as an alias" checkbox checked — these addresses belong to you / your projects and should behave as identities of the same inbox. Gmail will send a verification email to contact@taskly.pl — which will land in your inbox anyway. You confirm the code and you're done.

Now the most important setting, so replies go out as the brand and not as your main address:

Settings → Accounts → When replying to a message:
→ check: "Reply from the same address the message was sent to"

Result: someone writes to contact@taskly.pl → your reply goes out as Taskly. The client doesn't see johndoe.dev. You handle everything from a single inbox.

Step 8 — DKIM and DMARC (deliverability)

The last technical step: DKIM signatures and a DMARC policy. Without them, your mail is more likely to land in spam.

DKIM — generating in Google Admin

Google Admin → Apps → Google Workspace → Gmail → Authenticate email (DKIM)

Pick the domain (e.g. johndoe.dev), click Generate new record, leave the key length at 2048 bit, prefix google. Google will show a TXT record:

Name: google._domainkey
Value: v=DKIM1; k=rsa; p=...   // very long key, copy it ALL, all the way to the end

You paste this record in Cloudflare for the given domain:

Type: TXT
Name: google._domainkey
Content: v=DKIM1; k=rsa; p=MIIBIJANBgkq...   // the entire value from Google
TTL: Auto
Comment: Google Workspace DKIM

After saving, go back to Google Admin and click Start authentication. The "up to 48 hours" propagation message is normal — it usually clears faster. Once DKIM kicks in, the button changes to Stop authentication — that means it's working, and you don't click it. Repeat the whole process for taskly.pl.

DMARC — in Cloudflare

For each domain, add in Cloudflare:

Type: TXT
Name: _dmarc
Content: v=DMARC1; p=none; rua=mailto:john@johndoe.dev
TTL: Auto
Comment: Google Workspace DMARC

p=none is good to start with — it only monitors, doesn't aggressively block anything. After a few weeks, once everything works, you can tighten it to quarantine or reject. For a serious setup, you may later want to send DMARC reports to a dedicated mailbox or a DMARC monitoring tool instead of your main inbox — the aggregate XML reports can pile up fast.

Final setup

After these steps you have a solid, production-friendly mail foundation for a small business or MVP:

On your landing page you put contact@taskly.pl and handle clients from Gmail, without exposing your private address. Next product? Same pattern: a new domain as a "secondary domain" → a contact@… alias → "send as". One inbox, many brands.

If you ever had Cloudflare Email Routing enabled, it's worth checking first whether it's set to Disabled — that's the most common reason mail on a fresh Workspace doesn't start working.

Quick checklist if email still does not work

Cloudflare Email Routing is fully disabled, not only the routing rules.
There are no leftover Cloudflare MX records like route1.mx.cloudflare.net.
There is only one MX setup for the domain.
Google MX points to smtp.google.com with priority 1.
SPF exists as a TXT record on @.
DKIM was generated for the correct domain.
DMARC exists on _dmarc.
The Gmail alias was added both in Google Admin and in Gmail "Send mail as".
Gmail is set to reply from the same address the message was sent to.

Happy shipping! 🚀

MySQL Too many connections: How I Debugged It and Scaled the Connection Pool

Kamil Buksakowski — Sun, 15 Mar 2026 13:02:43 +0000

Practical guide to the MySQL Too many connections error — from a local test to RDS Proxy.

Lately, I’ve spent quite a lot of time debugging and understanding how database connections actually work. Then the topic of scaling those connections also came up.

I wrote this article to share the conclusions and observations I arrived at. It also includes a simple local test that helps explain how DB connections work and what the Too many connections error really means.

This is not an article written from the perspective of textbook theory. It’s more of a practical take after spending time analyzing the problem, running tests, and observing how the application behaves.

If you spot a mistake here or think something could be explained better — feel free to let me know.

What MySQL Too many connections really means

The message itself is pretty simple.

In practice, this error means that the total demand for database connections is greater than the number of connections available on the DB side — in other words, greater than max_connections.

In SQL, you set the max_connections parameter according to your database resources and your infrastructure setup. Depending on what your application and environment look like, that number will vary.

Let’s take a simple example:

MySQL
2 GB RAM
one database instance
max_connections = 200

Now let’s assume the following setup:

the database is running on a server,
we have a web application hitting that database,
a developer connects to the same database locally from their machine.

At this point alone, we already have at least two sources generating DB connections:

the web application,
the developer’s local machine.

Now add the fact that the developer also connects through a client like DBeaver to inspect the data. That’s another connection, or another use of the existing pool.

Now add functions like await Promise.all(), where we fire off multiple database queries in parallel.

And this is exactly where we start moving toward the risk of eventually seeing: Too many connections.

A simple max_connections = 10 example

To illustrate it better, let’s take an example where max_connections equals 10.

Let’s assume the web application uses between 2 and 8 connections at peak.

If it reaches 8, that leaves 2 free.

Now:

one developer connects to the database through DBeaver,
the total becomes 9,
another developer does the same,
the total becomes 10.

And at that point, we are already at the limit.

Any additional client trying to connect may get Too many connections, because it would be trying to become the eleventh connection to the database.

This is exactly the kind of message you will also see on the DB client side when you try to connect to a database that no longer has any free connections.

This is obviously a simplified model, but it explains the mechanics of the problem well.

The same can be applied to production.

If the total demand for connections exceeds max_connections, sooner or later you will get an error.

Where connection spikes and overages come from

Most often, they come from a few things:

poorly written code can generate too much parallel traffic to the DB,
connections are not being released properly,
there is no sensible pool limit configured on the application side,
the application scales, but the database configuration does not keep up with the number of instances,
additional consumers of the same database appear: DBeaver, local tests, jobs, cron, migrations, integrations.

One thing is worth clarifying here.

When I talk about Promise.all(), I do not mean that Promise.all() magically creates new connections by itself. I mean situations where you run many database queries in parallel inside Promise.all(), which increases simultaneous demand for DB resources.

That is an important difference.

Why setting an application-side connection pool limit matters

In my opinion, this should be standard.

If your DB has max_connections = 10, and in your backend code you set a limit so the application can use at most 5 connections, then a single backend instance should not be able to consume the entire pool by itself.

That does not mean the global problem disappears. Other connection sources still exist. But it does mean the backend stops behaving aggressively toward the database.

And this leads to an interesting side effect.

If the application cannot use more than 5 connections and more traffic comes in, some requests will simply wait longer. So instead of an error, you get increased response time.

Of course, this is not a guarantee that the error will never appear, because the same database may also be used by other processes, other application instances, or additional clients.

By setting a limit on the application side, you reduce the risk of Too many connections, but during peak hours, once the available connection pool is exhausted, some things will simply run slower.

My local MySQL Too many connections test

To understand this better, I ran a simple test.

The setup looked like this:

a local database running in Docker,
max_connections = 15,
an endpoint that executed more than 15 parallel DB queries.

To start with, I set a low local max_connections value so I could trigger the problem easily and observe it in controlled conditions.

Variant 1 - without an application-side limit

I called the endpoint.

Result: the Too many connections error appeared.

Here you can see the effect on the NestJS application side — once the connection limit was exceeded, the backend started returning Too many connections.

Variant 2 — with an application-side limit

I added a parameter so the backend could not use more than 10 connections.

Result:

the error disappeared, but the request took longer to complete.

And this is one of the most important conclusions from the whole article for me:

A code-side limit is a simple protective mechanism that helps reduce the risk of exhausting DB connections.

If you want to verify this yourself, I encourage you to test it on the mysql-too-many-connections-repro repository. (https://github.com/KamilBuksa/mysql-too-many-connections-repro)

What is the right number of connections?

In my opinion, that is the wrong question.

A better question is:

what number of connections is appropriate for my infrastructure?

And this is where things start getting interesting.

Because the answer depends on things like:

whether your backend scales vertically or horizontally,
how many application instances you have,
how many different clients use the same database,
how intensively you use parallel queries,
what the traffic looks like.

If you scale the backend vertically, the topic is simpler. You have one instance, more resources, and you can tune the pool and max_connections.

If you scale the backend horizontally, it gets harder.

Because then you need not only to increase max_connections, but also to calculate how much one instance of the application should be allowed to consume at most.

Example:

max_connections = 15
3 backend instances
you configure 5 connections per instance

At first glance, it looks fine.

But once you add:

DBeaver,
a developer’s local connection,
another process,
a job runner,

you quickly realize that 3 x 5 is no longer such an ideal setup.

And that is exactly why this topic is context-dependent. There is no single answer for everyone.

Scaling the backend is not the same as scaling MySQL

These are two separate things.

You can scale the backend by adding more instances, while the database itself remains the same. And that is exactly when connection management stops being “a local problem of one application.”

If the backend scales horizontally and the application starts gaining traffic, sooner or later you end up asking:

how do I manage connections so that several instances do not kill one database?

And that is where proxy comes in.

How RDS Proxy helps with MySQL connection management

I’ll show this with AWS RDS Proxy as an example.

My simplified mental model looked like this:

without a proxy, if the total number of connections generated by the application and all instances together exceeds max_connections, errors will start appearing,

with a proxy, part of that traffic will be handled by a layer managing the connection pool, and instead of an immediate error, some requests will simply wait longer.

In other words:

without a proxy, you will see Too many connections sooner,
with a proxy, the system has a better chance of spreading the problem over time.

That was exactly what interested me the most.

Put simply: the application no longer connects directly to the database, but to an intermediate layer that manages the DB connection pool. Because of that, a large number of clients on the application side does not have to mean exactly the same number of physical connections to the database itself.

It is also worth clarifying that a proxy does not magically make the database able to handle everything. If the real problem is heavy queries or long-running transactions, the proxy will not fix that. Its job is mainly to manage connections better.

My working hypothesis about RDS Proxy and Too many connections

My hypothesis was this:

if, without RDS Proxy, I trigger a scenario where the sum of connections used by the application exceeds max_connections on the database side, I will get a Too many connections error.

If I connect the application through RDS Proxy, then in that same scenario the request will not fail immediately — it will wait for a free connection and complete later.

That was exactly the mechanism I wanted to confirm with a test.

Test 1: MySQL Too many connections without RDS Proxy

On an EC2 server, I deployed an endpoint that, when called, exceeded max_connections in the DB.

I also disabled the application-side limit so that the application would not block me from exceeding the limit and I could trigger the problem directly.

After calling the endpoint:

Too many connections errors appeared.

Test 2: MySQL connections behavior with RDS Proxy

The setup was:

the same endpoint exceeding the number of connections,
no application-side limit,
in .env, a connection through RDS Proxy instead of directly to RDS.

After calling the endpoint:

the error did not appear,
the endpoint took longer to execute,
but in the end the request completed.

And that was exactly the effect I expected.

Is setting a connection pool limit enough?

Not always.

In my opinion, a code-side limit is mainly enough when:

you have one backend instance,
you scale mostly vertically,
your traffic is predictable,
and the whole setup is fairly simple.

In that case, a proxy may be unnecessary.

But if you have several backend instances, it becomes a different story.

Because the code-side limit works locally for one instance, while RDS Proxy works at a higher level — as a connection-management layer for a larger number of clients.

In simplified terms:

the code-side limit controls one instance,
the proxy helps control the problem more broadly.

From the user’s perspective, it is obviously better if a request waits a little longer instead of failing immediately. In many cases that is exactly what will happen, although under heavier overload you can still get timeouts or other errors.

And that is exactly the practical value of the proxy for me.

What I concluded from the MySQL connection pool tests

Based on these tests, I confirmed one practical observation for myself:

when the total number of DB connections starts exceeding what the database can safely handle directly, RDS Proxy can turn part of those immediate failures into waiting for a resource.

And yes — someone could say:

but you did not actually test this with several additional backend instances running at once

That is true. I did not run a full 1:1 autoscaling simulation with several instances.

But the behavior itself was clear to me:

If, without the proxy, exceeding the limit causes errors, and with the proxy, in a similar scenario, the request simply starts waiting longer, then you can see what class of problem this solution addresses.

That was enough of a signal for me to better understand the purpose of the proxy.

What is the best connection pool size?

There is no single answer.

Context is king.

It depends on the system, traffic, number of instances, the intensity of parallel queries, and how many other processes use the same database.

You cannot do this properly once and for all in isolation from the context.

When RDS Proxy actually makes sense

In my opinion, not blindly.

At the very beginning of a project, RDS Proxy will very often be overkill and an unnecessary extra cost.

The hard truth is that most projects never even reach the point where scaling DB connections becomes a real problem.

So first:

observe the traffic,
observe how the application evolves,
watch whether you are actually reaching the stage where instances are being added and connections start competing.

In many cases, instead of implementing a proxy, companies simply:

increase DB RAM,
increase DB resources,
increase max_connections.

And honestly? That is okay too.

It is faster, simpler, and often enough.

Only later, when costs start rising or the setup becomes harder to manage, does real interest in a proxy begin.

From what I have seen, companies are more likely to first increase DB resources and max_connections than to jump straight into RDS Proxy. And honestly, that does not surprise me at all.

RDS Proxy is more advanced. It gives you more possibilities, but it also requires more work, more understanding, and changes how the application connects to the database — which by itself can create new complexity in an existing setup.

So I would not treat it as the default first move.

But it is definitely worth understanding how it works.

Final thoughts on MySQL Too many connections

Too many connections is rarely the problem of one function or one endpoint.

It is a sum problem — all the clients, processes, and instances that use the same database at the same time.

An application-side limit is the first and simplest protective mechanism. It is worth setting. But with many backend instances, that alone often stops being enough — and that is exactly where a proxy starts doing useful work.

That does not mean you need to implement it right away. Often, it is enough to first understand your setup and observe how the system behaves under load.

If this article helps someone understand the topic faster than it took me the first time around, that is great.

Thanks for your time.

Letting Claude Code Test Your Backend - Verifying Business Logic via API and SQL

Kamil Buksakowski — Sun, 08 Mar 2026 12:07:16 +0000

What if an AI agent could test your backend by calling API endpoints and verifying results directly in SQL?

I tried exactly that using Claude Code. The agent called API endpoints, inspected a Docker database via SQL, and validated a decision tree of business logic scenarios automatically.

When a single operation triggers cascading changes across multiple entities, the number of scenarios grows quickly. In practice, this often ends up with manually clicking endpoints in Postman and checking the database state after every operation.

Instead of running tests manually, I wrote down the decision tree of scenarios in a markdown file and handed it to Claude Code. Then I gave it access to a local Docker database and an API service that seeds test data.

Claude executed operations through the API, verified the system state using SQL queries, analyzed the results, and reported PASS / FAIL for each scenario.

In practice, it behaves like an agent running integration tests — but without writing test code.

In this article, I will show how to let Claude Code call your API, inspect a local database, and automatically validate complex decision trees.

This workflow is still experimental, but it demonstrates an interesting direction for backend testing automation — especially in systems with complex business logic and many edge-case combinations.

Tools used in this article:

Claude Code
Docker Desktop 29.1.5 (Mac)
DBeaver — database client

Demo repository: github.com/KamilBuksa/claude-code-local-db-testing

Running a Local Database with Docker

We start by setting up the environment. The repository includes a ready-to-use docker-compose.yml with MariaDB — one command is enough:

docker compose up -d

Once the containers are running, we can connect to the database. In Docker Desktop it should look like this:

Viewing the Database in DBeaver

To see what's happening in the database in real time, I connected DBeaver.

Open DBeaver → New Database Connection → choose MariaDB:

Click Test Connection — it should display "Connected (59ms)". Then click Finish.

After connecting, the full database structure becomes visible (to generate the structure, run npm run start:dev):

Everything is ready.

Domain: HR with Cascading Statuses

The application models a company operating across multiple offices. Employees belong to departments, and departments operate inside buildings. Each entity has its own lifecycle.

Building

ACTIVE — has at least one active department with employees
VACANT — all departments are empty or disbanded
CLOSED — manually closed, does not change automatically

Department

ACTIVE — has at least one employee
EMPTY — the last employee left the department (the department still exists)
DISBANDED — dissolved by an admin or automatically when the last employee becomes deactivated

Employee — ACTIVE or DEACTIVATED. Each employee can belong to multiple departments with roles MANAGER or MEMBER.

The key cascading rule: when a department changes status, the system checks whether the building should change to VACANT. Conversely, when an active department appears, the building returns to ACTIVE.

I defined five edge-case scenarios for three operations:

#	Operation	Situation	Expected result
1	Remove employee from department	The only active department in the building. The only employee voluntarily leaves	Department: `EMPTY` · Building: `VACANT`
2	Disband department	The only department in the building	Department: `DISBANDED` · Building: `VACANT`
3	Disband department	Other departments in the building remain active	Department: `DISBANDED` · Building: `ACTIVE`
4	Deactivate employee	The only active department in the building. The only employee is deactivated	Department: `DISBANDED` · Building: `VACANT` · Employee: `DEACTIVATED`
5	Deactivate employee	The employee is not the only person in any department	Departments and buildings: `ACTIVE` · Employee: `DEACTIVATED`

What happens step by step:

The only employee voluntarily leaves the department → the department becomes EMPTY. The building no longer has any active departments, so it becomes VACANT.
When an admin disbands a department, it becomes DISBANDED. Because it was the only department in the building, the building becomes VACANT.
Disbanding one department does not affect the building if other departments are still active — the building remains ACTIVE.
Deactivating the employee automatically disbands the department (DISBANDED) instead of merely emptying it (EMPTY) — this is the key difference compared to case 1.
Deactivating an employee who is not the only member of any department does not trigger any cascade — departments and buildings remain unchanged.

Prompt for Claude Code

The decision tree — the full table of scenarios with setups and expected results — is stored in docs/test-cases.md in the repository. Claude has access to it via CLAUDE.md. I used a single short prompt:

Seed the database, then test each scenario from @docs/test-cases.md.
For every case: set up the state, call the endpoint, verify via SQL, report PASS / FAIL.

Claude read the file with the test cases, planned the execution order, and started working.

Claude Code in Action

Claude started by verifying that the API was running:

First, a curl request to /buildings to confirm the service responds. Then it loaded the test data:

The /mock/seed endpoint creates a complete dataset: buildings, departments, employees, and their memberships. Claude saved the returned IDs and proceeded with the tests.

To verify the database state, Claude used docker exec to run SQL queries directly:

Claude did not just execute queries — it analyzed the results and understood the context, explaining why the result was correct before moving to the next case.

First Test: Case 4

Claude began with Case 4 — deactivating the only employee in the department:

State reset, minimal setup: one building, one IT department, and one employee (John) as the only manager. After deactivating John, the expected result is: department → DISBANDED, building → VACANT.

Case 4: PASS. The cascading logic works correctly.

Parallel Tests

After verifying Case 4, I asked Claude to run the remaining scenarios in parallel. It prepared the following cases:

Each test received a separate agent, and all started simultaneously:

Result:

All five scenarios: PASS. The cascading business logic works correctly for every edge case.

Demo Repository

The full demo is publicly available: github.com/KamilBuksa/claude-code-local-db-testing

Simply follow the setup instructions in the README.md, run claude in the repository directory, and type:

Let's play around and test the decision tree.

Claude will read the context from CLAUDE.md, recall the decision tree, and guide the entire testing process:

In the screenshot you can see that Claude immediately starts working — it reads the context from CLAUDE.md and begins by seeding the test data.

Important Notes

A few things to keep in mind before trying this yourself:

Never give Claude Code access to a production database — it's unsafe. In this article we work with an isolated local environment running in Docker.
The repository uses schema synchronization instead of migrations for quick setup — in production environments you should use migrations.
All endpoints are public for testing purposes — this is not recommended for real applications.
Tested on Docker Desktop 29.1.5 on Mac — if the docker command does not work, you likely need a newer version.

Summary

What happened here in short? Claude executed operations through curl calls to the API, verified SQL results via docker exec, understood the context of the results, and reported PASS/FAIL.

This workflow is still experimental. I had a lot of fun exploring Claude's behavior during this process. In real projects it may be useful to create a database dump from the testing environment and restore it locally to start with realistic data. Preparing the decision tree in a markdown file beforehand is also important.

Personally, I never trust AI 100%, so I would still manually verify critical cases through the UI after connecting the frontend. However, this kind of testing can detect issues earlier and save time.

This approach works particularly well when testing complex decision trees, where the number of scenarios grows quickly and manual testing becomes impractical.

Thanks for reading. Happy coding! 🚀

Parallel work with Claude Code in iTerm2 - a workflow inspired by Boris Cherny

Kamil Buksakowski — Mon, 16 Feb 2026 07:40:47 +0000

A while ago, Boris Cherny's setup for working with Claude Code went viral. I noticed how he works in parallel across several terminals and how much it simplifies context management.

In this article, I'm showing a workflow inspired by Boris's setup, extended with Git worktrees for real work without conflicts. I'm still testing this workflow in practice, but it already works well enough to be a valuable starting point for others. That's why this tutorial exists.

Installation

brew install --cask iterm2

Open iTerm2 via Spotlight (⌘ + Space, type "iterm").

After launching, you'll see:

Creating tabs

Create several tabs with the shortcut:

⌘T

Repeat 5 times (or as many as you need).

Switching between tabs:

⌘1 – first tab
⌘2 – second tab
⌘3 – third tab

By default, all tabs have the name -zsh, which quickly becomes hard to read.

Boris's setup looks much cleaner – each tab has a simple, numbered name.

Numbering tabs

Double-click the name of the first tab (where it says -zsh).
A Set Tab Title window will appear.

Type the tab number (e.g., 1 for the first, 2 for the second) and click OK.

On the screen:

red circle 1 – double-click the tab name
red circle 2 – type the number and confirm

Repeat the operation for the remaining tabs (2, 3, 4, 5, …).

Final result:

Simple flow for small tasks

Simplest approach: one tab = one branch.

Before running Claude Code:

git checkout -b feature/auth

then:

claude

This approach works well for small, independent tasks in different parts of the project.

Keep in mind that when running the same project in several terminals, Claude Code works on the same file state. If tasks start touching similar code areas, conflicts can appear. That's why we use this approach consciously and only for simple cases.

Here's how it looks for me:

Extended flow for larger tasks

When working on larger features or several tasks in parallel, just switching branches in one directory quickly becomes inconvenient.

Problem

Claude Code operates on the filesystem state of a given folder, regardless of the current branch. Even if we have different branches open in several terminals, we're still working on the same working directory, which increases the risk of conflicts.

Solution

Git worktrees – multiple working directories, no conflicts.

Setup Git Worktrees:

1. Creating worktrees (e.g., at the start of the day)

cd ~/projects/my-super-project

git worktree add ../worktrees/task1 -b feature/carrier-popup
git worktree add ../worktrees/task2 -b feature/email-validation
git worktree add ../worktrees/task3 -b feature/carrier-filters

2. Running Claude Code in each worktree

# Tab 1
cd ~/projects/worktrees/task1
claude "Task 1 description"

# Tab 2
cd ~/projects/worktrees/task2
claude "Task 2 description"

# Tab 3
cd ~/projects/worktrees/task3
claude "Task 3 description"

Each terminal now works on a separate folder and a separate branch.

3. Finishing feature and cleanup

After finishing work and code review:

# In each worktree
git add .
git commit -m "feat: change description"
git push origin feature/branch-name

Removing worktrees:

# In main repo
cd ~/projects/my-super-project
git worktree remove ../worktrees/task1
git worktree remove ../worktrees/task2
git worktree remove ../worktrees/task3

Notes

Worktrees outside repo (../worktrees/) → do not require .gitignore
Each worktree = separate branch = separate changes
Each worktree has its own node_modules, but shares .git – operations on the repo are shared

Notifications in iTerm2

After Claude Code finishes a command, you can get a notification (sometimes with a slight delay).

Open iTerm2 settings:

⌘ ,

Then:

Profiles (top bar)
Terminal (right panel)
Check:
- ✓ Flash visual bell
- ✓ Show bell icon in tabs
- ✓ Notification Center Alerts

Additionally, in System Settings (macOS):
Settings → Notifications → iTerm2 → Allow

Result:

BONUS – keyboard shortcut conflict

If you use Command + Right Arrow in Claude Code to jump to the end of a line, iTerm2 might switch tabs.

To disable this:

1) Open settings:

⌘ ,

2) Go to Keys
3) Find: Previous Tab and Next Tab
4) Set both to "-" (minus = disabled)

Now Command + Right Arrow works correctly in Claude Code.

Summary

Numbered tabs, conscious use of branches, and Git worktrees let you work sensibly in parallel with Claude Code without fighting with context and code conflicts.

This is a workflow that works especially well when working on multiple tasks at the same time.

Happy coding! 🚀

Spec Driven Development with AI - How I Built My Portfolio Site

Kamil Buksakowski — Wed, 14 Jan 2026 07:01:46 +0000

Intro

I want to share the process of building a website using AI. With smart AI usage, you really don't need programming skills to set up a portfolio site. This is a simple project you can do yourself. In this article, I'll show you the entire process. There are many AI tools out there, I'm a fan of Claude Code, so I went with that. The article goes from general to specific - the further down you read, the more nuances you'll find that are worth remembering and might slip past if you're new to this.

AI as a smart advisor who always has time

You can start your journey with the web version of Claude / ChatGPT or whatever. Treat AI as your entry point to the topic. The more you progress, the more tools you'll add - your personal advisor will help choose the right ones for your case. For me, it was Claude Code because I use it daily. I chose Cursor as my IDE, switched to it from free VS Code, which is also great but not as AI-friendly as Cursor. In short, Claude Code is the executor and Cursor is the environment where the executor works.

Pro tip: If you're non-technical and some terms are foreign to you - use this simple trick: copy the article to AI and ask "What did the author mean by Cursor as IDE, are there free alternatives?". This way you can set up any setup you dream of, often changing tools. It takes some time but it's fun, I recommend it.

What do I want to do? Brainstorming the site

This is the first phase where I think about what I want to achieve. No pressure, I casually write in a notepad what features I'd like to have, any notes, if I have a concept then what style to use, etc.

For me it was something like this:

I'm planning to make my portfolio site. I'm a developer and I'd like it to reflect who I am, what my achievements are. I have one project I work on as a hobby when I have free time, I'd also like to showcase it. Maybe a projects section with "show more?". Definitely need an "about me" section, contact section. I have an avatar so maybe we'll put it in the "about me" section. I'm not connecting email, my email will be hardcoded. I have an article on Medium, dev, we'll put links to those platforms. We'll also link my LinkedIn. Additionally, we'll add a blog section. I value minimalism and simplicity, I'd like to keep the site in that style. I'd like the project to follow the latest web development standards, both on small and large screens.

Even though there's not much text, this is a good enough introduction to start Spec Driven Development, which means directing AI to achieve the results we want.

Finally, we'll have 4 files:

requirements.md → project requirements (WHAT we're building)
plan.md → architecture plan (HOW we're building)
tasks.md → list of tasks to complete (what SPECIFICALLY to do)
STATUS.md → current project status (WHERE we are)

The structure can look like this:

docs/
    ├── requirements.md
    ├── plan.md
    ├── tasks.md
    ├── STATUS.md

Having our notes, we can move to the first .md file. This format is AI-friendly, it's the standard.

Developing requirements with AI

Time to transform our rough notes into structured specs.

WHAT are we building? — requirements.md file

We already know roughly what we want to achieve. Time to write the requirements file requirements.md.

This will be a file informing AI why the project is being created. We won't write it manually. We launch Claude Code typing something like:

Using my initial plan here our previously prepared notes help me prepare the requirements file requirements.md. Then we'll work on the plan, later we'll define the list of tasks to complete, but for now let's focus only on writing down requirements. At the end we'll save it to a file.

Claude Code will think, might ask some questions, it's worth having a discussion. Here I thought I'd add dark mode and after deeper discussion with AI we concluded that the blog section is unnecessary (since there will be links to Medium and Dev.to). Here's the final requirements.md file:

HOW are we building? — plan.md file

We (me and AI 😛) already know what we're building, now we need to define how. We need to choose the tech stack we'll build in. The site is very simple so we don't need to overcomplicate and we can discuss with AI a stack that fits our requirements. My choice was:

Next.js because it has good documentation and you can host it for free on the Vercel platform
for styling Tailwind CSS + shadcn/ui to use ready-made components

After choosing the stack, we can ask AI to research what architecture is worth applying in a portfolio project. Simple project, so the architecture doesn't have to be fancy.

We already have the stack and architecture. We'll host the app on Vercel because it's friendly for Next.js. The domain remains. Half the internet runs on Cloudflare and they have cheap domains per year (about $11/year) → so I decided to buy it there later. Let's move to tasks.

What SPECIFICALLY to do? — tasks.md

Just as we generated a plan based on requirements, we generate tasks based on the plan. We can write in Claude Code:

Review the files @requirements.md and @plan.md and then prepare a tasks.md file that will contain a list of tasks to complete. Tasks should be arranged in a sensible TODO. Spend time as an analyst and think about the tasks. Use ultrathink mode

AI will think and as a result generate a pretty good action plan. If we remember something, we can extend it. We have tasks! Time to track progress.

WHERE are we → STATUS.md

Context is the biggest pain point of working with AI. You want to work so AI knows what you're thinking about and has access to needed information. That's exactly what the STATUS.md file is for - it informs AI what stage you're at. By design, I wanted:

STATUS.md file to edit automatically after a series of tasks
after opening a new session and asking Claude Code "where did we leave off?" it would answer based on the STATUS.md file and we'd continue work

To be safe, before the end of a session I asked Claude "Should we update the STATUS.md file?" It happened that it forgot, so I asked.

Claude Code is your friend. Don't know how to prepare STATUS.md file? Ask AI 😈

You can for example type:

Review the files @requirements.md, @plan.md, and @tasks.md, then come up with a progress tracking system. I suggest a STATUS.md file that you'll automatically update after each task. I want to achieve such a result that when I close the Claude Code session and open a new one I'll ask "Where did we leave off?" and you'll know! Don't overcomplicate, the project is small. Review the structure, let's not over-engineer. Ultrathink on this

A few questions and you'll come up with something great. So we have it, we have mini Spec Driven Development. Below is the result:

Final docs touches before work

At this stage it's worth running the /init command in Claude Code. It will make Claude get familiar with the project and generate CLAUDE.md, which it always has access to, it's in its context. Running the command looks like this:

After generating CLAUDE.md let's also type:

Review my work workflow based on @requirements.md, @plan.md, @tasks.md and STATUS.md → make the task tracking system aligned with CLAUDE.md. I don't want to remind you every time about saving progress and where to get the status from, so adjust CLAUDE.md to my workflow. Before closing a session I want to have an up-to-date STATUS.md, and when opening a new one I want you to always review STATUS.md so you can answer the question "What will we be working on now?" Ultrathink on this

This is a very important element for our workflow. We want Claude Code to have the most current project state.

Development

Only at this stage do we connect GitHub, upload the project and start coding. Actually, we ask AI to do it 🙏. I manually pushed the code to GitHub, and then asked Claude Code to use GitHub CLI and make commits. In my case, the first commit to GitHub had several thousand lines of text, not code but just .md files. Well-coordinated specs will make AI handle things great. Remember to commit quite often, preferably after each bigger task.

From this point on, it gets much easier. If you've made it through all this, you'll see how smoothly it'll go now. We navigate AI to go through tasks. You'll probably get a few errors, then ask AI to analyze. For example, I got an error that the Tailwind config file was created in v3, but the Tailwind styles themselves were already in v4. Due to this inconsistency, some styles didn't work. Going through tasks, you'll naturally decide that something is unnecessary, maybe you'll want to add something - that's normal, add it. For me, at the end of the process, I added a second language and another Tech Stack section. But it's best to finish the plan and then beautify.

Testing

After you go through all tasks, it's time for testing. What you don't like, screenshot it and upload to Claude Code asking for a fix, change. You can also ask for audits, for example:

Analyze my site in terms of the latest UX/UI standards. Focus on refining existing elements instead of proposing new ones. Ultrathink on this

You'll see how much such requests give. Don't save on iterations. Ask for different things.

SEO standards, mobile standards, legal standards, performance standards. We don't need to worry too much about security because we have a static site without any integrations. AI is only as good as its user. It's worth asking open questions:

Do you see any elements we might have missed, maybe SEO, performance?

Claude knows a lot. Questions matter.

MCP Playwright

During testing, mcp_playwright came in handy. In short, it did for me what you do manually - tested in Google browser. Usage is simple, first you install:

claude mcp add playwright npx @playwright/mcp@latest

After installation you tell Claude:

Use mcp_playwright to test my site in terms of UX/UI. Suggest improvements if you notice elements that are unreadable or not working.

Claude automatically starts the server and tests:

Lighthouse

There's a plugin to install in Google Chrome called Lighthouse. It lets you measure your site's performance and SEO. I recommend installing it and doing analytics. Copy the report, paste it into Claude Code and ask:

Review the Lighthouse report. Suggest improvements that will help raise the report quality so our site is as Lighthouse-friendly as possible.

The report generated via the plugin looks like this:

You can also use Lighthouse CLI and ask Claude directly to conduct the analysis.

I'd suggest using Lighthouse Chrome only when the site is running on Vercel. Otherwise, some errors might pop up.

Deployment

After the phase of various tests and checking prompts, we're ready for deployment.

For hosting we choose Vercel in the free hobby version.

Vercel — hosting

Uploading is very simple. We create an account, connect with GitHub, choose the repo and click deploy. Done! The site is already running on a test domain from Vercel. This is what a successful deployment looks like:

It's worth going to the Analytics tab and enabling tracking. Remember then to add a cookie consent banner - not every user gives consent. Tracking lets you measure the number of visits and country of origin.

Enabling analytics in Vercel panel:

However, if we want to have our own domain, we'll need Cloudflare.

Cloudflare — domain

At this point, it's time to choose a domain name. You check availability on Cloudflare - if it's free, you click buy and go through payment.

Domain search on Cloudflare:

You have the domain, now it's time for DNS. In Cloudflare you set up records that will connect the domain with Vercel. Don't know how? Paste screenshots into Claude and ask step by step.

Final DNS setup:

DNS in Cloudflare ready, now we connect the domain in Vercel:

Now we have everything. Sometimes you need to wait up to 24 hours for DNS to propagate. When it does - done. That's basically the end. We have a project running on the internet.

Summary

In the article above, I wanted to show what the process of creating a website by a non-technical person can look like. I believe that currently with AI we can do a lot, we don't need to limit ourselves. If previously setting up a site with hosting and domain could be overwhelming, now with AI support it's enough to ask the right questions, paste the right screenshots. Say where we are and what we want to achieve.

This Spec Driven Development system I developed with Claude Code. It works well in small projects. As proof - my site was created exactly like this: https://www.kamilbuksakowski.dev

Result:

I'm satisfied. I encourage you to experiment with Claude Code and don't be afraid of new topics. AI is a great teacher 🚀

From Frustration to Automation: My 3-Month Journey with i18n Translations

Kamil Buksakowski — Tue, 09 Dec 2025 07:00:25 +0000

TL;DR

I optimized my translation writing process!

Two steps:

I type the error intention or literal error,
I run the script.

Boom! That’s it, I just added 7 translations to JSON files. Commit and done. Notice I typed “intentions”, not literal translations 😈

How was it before?

Come up with a translation,
Come up with a translation key,
Translate to other languages,
Add translation to JSON files,
Replace the key in the code.

As you can see, tons of work. But this was the problem that got me started on automating translations. You can read about my entire journey and the final version below.

Frustration and Problem Definition

In the beginning, there was frustration. I was adding translations manually and getting annoyed that it was boring and a waste of time. I knew it could be faster, but how? I started by defining what I wanted to achieve and what I knew.

What do I want to achieve? I want translations to be added with minimal effort.

What do I know? I know the folder structure and that it’s JSON I want to fill with appropriate translations.

src/
└── i18n/
    ├── de/
    │   └── translation.json          # 🇩🇪 German
    ├── en/
    │   └── translation.json          # 🇬🇧 English (source)
    ├── es/
    │   └── translation.json          # 🇪🇸 Spanish
    ├── fr/
    │   └── translation.json          # 🇫🇷 French
    ├── it/
    │   └── translation.json          # 🇮🇹 Italian
    ├── nl/
    │   └── translation.json          # 🇳🇱 Dutch
    └── pl/
        └── translation.json          # 🇵🇱 Polish

I know I want translations to work in two steps:

I type the translation in code,
AI replaces the key in code and adds appropriate translations to the right JSON file.

Nothing more. Knowing what I wanted to achieve, I was ready to take on the challenge.

First Encounter with AI, First Disappointment

I started getting into AI and the process of writing code faster. So I started using Claude Code, and I was impressed by how well it handled code. Then it was time to tackle my problem — I asked it to add translations for 7 languages.

It added the files, but the whole thing took over a minute and was very expensive (lots of tokens). If you didn’t clearly specify the rules for finding keys, it tried to read the entire translation JSON. Each JSON over 25k tokens and it didn’t even read the whole thing! Because there’s a 25k limit, then it chunks it. So after approving Claude Code’s operations and over a minute — it worked. I didn’t like the result: slow, expensive. I abandoned the topic for a few months.

The Problem That Wouldn’t Go Away. Custom Commands in Action

This topic really bothered me! How to do it right, it must be possible, right? And that’s how I stumbled upon Custom Commands. Why tell the LLM what to do every time and which translations to add where and how, when I can create a command run from the Claude Code terminal that will have instructions.

The result of my work was the command:

/i18n-extract "Translation here as argument"

Goal? I type the command, and as a result it should do all the steps I was doing manually.

How did it go? Badly, on the plus side the rules were written in the i18n-extract.md file, but still: long, expensive.

EUREKA! What if I move some of the work to Bash?

At some point I had a developer epiphany. I’ll tell Claude that what’s mechanical should be executed in Bash instead of doing it itself.

I went back to the beginning: why do I need AI in translations? To do translations! So simple yet so easy to overcomplicate 😅

I took this thought further and came to what I thought was a brilliantly simple division: mechanical layer and creative layer.

Creative layer: AI handles generating translations.

Mechanical layer: Bash will handle all operations of finding translations to replace, saving to files, etc.

After the dust of joy settled, the first problem appeared. How is Bash supposed to find the fragment in code that it needs to send to Claude Code? That was a tough question.

Hmm, what would I actually like the full process to look like?

I type throw new Error(‘Email is already taken’),
I run the command and done.

Then I remembered why I’m doing this too — to make it convenient for me, and since I’m building it for myself, I can set the rules to make it work.

The '#go' marker.

A very simple concept that solved the problem of telling Bash where to find what it’s looking for in the codebase.

At the end we simply add #go, to tell Bash “I’m here”. We assume there’s only one #go in the project.

new Error('Email is already taken #go')

'#go', despite being simple, was a breakthrough for me. While working on this solution, or rather typing the error “Email is already taken #go”, I thought you could make a significant upgrade and a serious convenience boost.

Intention mode vs literal

What’s wrong with “Email is already taken #go”?

We have to type the error message! The literal one it should throw.

What if we split this into two modes? Intention mode and literal. Literal would translate 1:1 what we typed, and intention mode would come up with the translation for us.

Another problem: How to tell Bash when to activate intention mode and when literal? Keywords! How powerful they are! I adopted a simple assumption: if the error contains keywords (“Need”, “Create”, “Generate”) it will activate intention mode and come up with a translation. Example of activating intention mode based on the word “need”:

throw new Error(
  'Need error that user reached project limit on free plan and must upgrade to premium to add more #go'
);

Thus keywords became responsible for activating intention mode (my favorite) or literal mode.

Going through this long journey, after fine-tuning the Bash script to change, delete, replace what was needed — it was time for tests.

Flow:

1) I type the error intention in code.

throw new Error(
  'Need error that user reached project limit on free plan and must upgrade to premium to add more #go'
);

2) I run the custom command from Claude Code terminal: /i18n-extract. As a result, it generated and populated translations to 7 files.

// de/translation.json
{
  "PROJECT_LIMIT_REACHED_UPGRADE_REQUIRED": "Sie haben die maximale Anzahl an Projekten für Ihren aktuellen Free-Tarif erreicht. Um weitere Projekte hinzuzufügen und erweiterte Funktionen für Ihr Logistikmanagement freizuschalten, führen Sie bitte ein Upgrade auf unseren Premium-Tarif durch.\n"
}

// en/translation.json
{
  "PROJECT_LIMIT_REACHED_UPGRADE_REQUIRED": "You have reached the maximum number of projects allowed on your current Free plan. To add more projects and unlock advanced logistics management features, please upgrade to our Premium plan.\n"
}

// es/translation.json
{
  "PROJECT_LIMIT_REACHED_UPGRADE_REQUIRED": "Ha alcanzado el número máximo de proyectos permitidos en su plan Free actual. Para añadir más proyectos y desbloquear funciones avanzadas de gestión logística, actualice a nuestro plan Premium.\n"
}

// fr/translation.json
{
  "PROJECT_LIMIT_REACHED_UPGRADE_REQUIRED": "Vous avez atteint le nombre maximum de projets autorisés dans votre offre Free actuelle. Pour ajouter davantage de projets et débloquer les fonctionnalités avancées de gestion logistique, veuillez passer à notre offre Premium.\n"
}

// it/translation.json
{
  "PROJECT_LIMIT_REACHED_UPGRADE_REQUIRED": "Ha raggiunto il numero massimo di progetti consentiti nel Suo piano Free attuale. Per aggiungere ulteriori progetti e sbloccare le funzionalità avanzate di gestione logistica, effettui l'upgrade al nostro piano Premium.\n"
}

// nl/translation.json
{
  "PROJECT_LIMIT_REACHED_UPGRADE_REQUIRED": "U heeft het maximale aantal projecten bereikt dat is toegestaan in uw huidige Free-abonnement. Om meer projecten toe te voegen en geavanceerde logistieke managementfuncties te ontgrendelen, kunt u upgraden naar ons Premium-abonnement.\n"
}

// pl/translation.json
{
  "PROJECT_LIMIT_REACHED_UPGRADE_REQUIRED": "Osiągnięto maksymalną liczbę projektów dostępną w ramach aktualnego planu Free. Aby dodać więcej projektów i uzyskać dostęp do zaawansowanych funkcji zarządzania logistyką, prosimy o przejście na plan Premium.\n"
}

Finally, AI replaced the error in code with the i18n key.

Before:

throw new Error('Need error that user reached project limit on free plan and must upgrade to premium to add more #go',);

After:

throw new NotFoundException({key: 'translation.PROJECT_LIMIT_REACHED_UPGRADE_REQUIRED'});

The quality of translations was good. The whole thing took about 20 seconds with a stopwatch in hand, in intention mode. I had mixed feelings, I felt it could be better and faster.

EUREKA x 2 — pure Bash!

Better, better. How to do it better, what’s wrong, why so many thoughts when the translation process itself only takes a few seconds?

The answer was interesting to me. It turned out that most of the time was spent on process orchestration through Claude Code. In other words, when I executed the command, Claude Code was deciding that Bash needed to run, then again, then again and again. But why? Why does Claude need to decide this when we know the full flow. Here it was worth going back to the earlier thinking:

Why do I need AI in translations? To do translations!

I took it literally. I refined the previously established architectural division:

Claude Code should only come up with the key and translations. It shouldn’t direct the process. It should be responsible for the creative part.
Bash should do everything else. It directs the entire process and at the right moment makes an API Call to Claude to generate translations and that’s it! It’s responsible for the mechanical layer.

What needed to be done? Remove the command run from Claude Code CLI level and instead run a script written in Bash, which works on our assumptions:

finds the fragment in code we want to translate using the #go marker,
determines whether to use intention or literal mode based on keywords contained in the error,
API Call to Claude to translate,
execution of the mechanical part by Bash (replacing translations, changing keys, cleanup, etc.).

Real intention mode flow:

1) I type the error intention in code that I want to throw (contains the keyword “Need”).

throw new Error(
  'Need error that user reached project limit on free plan and must upgrade to premium to add more #go'
);

2) I run Bash using “time” to immediately measure time.

time bash .claude/scripts/i18n-extract-full.sh --yes

3) The process executes.

Terminal output: 769 tokens used, 7 languages translated in 8.9 seconds

On the screen we see details:

total of 769 tokens used (only key and translation generation),
translations added to 7 languages,
the whole thing took 8.9 seconds.

4) Done, 7 translations added to respective JSONs and key replaced, ready to commit.

Real literal mode flow:

1) I type the error in code 1:1 as it should be translated:

throw new Error('Email is already taken #go');

2) I run Bash.

time bash .claude/scripts/i18n-extract-full.sh --yes

3) The process executes.

Terminal output: 340 tokens used, 7 languages translated in 3.8 seconds

On the screen we see details:

total of 340 tokens used,
translations added to 7 languages,
the whole thing took about 3.8 seconds for literal mode.

4) Done, ready to commit.

Since the literal flow is much simpler, we got down to 3.8 seconds and minimal token usage.

Comparison

As we’ve proven — custom commands don’t work well for this type of task. Below I’m attaching a comparison that clearly shows that for the type of problem discussed in the article, Bash wins.

Comparison table: Custom command 20s vs Plain Bash 3.8–8.9s, showing 2–5x speed improvement and 10x fewer LLM calls

Summary

This article was created to share the evolution of a solution to a specific problem. We often start with a complicated version, and through successive iterations we arrive at a simple solution. Simplicity is a form of art.

The AI hype sometimes makes you tend to overcomplicate. That’s how I was at the beginning. I started with the “everything on AI side” solution, then “custom commands”, then “custom commands + Bash” and only at the end “Full Bash”. I wonder if there hadn’t been an AI hype, would I have started digging into Bash right away 🤔

The solution shown in the article is part of my custom workflow, saves a significant amount of time and frustration, and was interesting to build. Below I’m recording the biggest EUREKA moments I experienced and conclusions:

you’re in the driver’s seat → the #go marker solved a lot of problems, let’s be creative,
just do it, even if the first solution will be overcomplicated, slow, you won’t use it. From idea to result can take a long time, it took me about 3 months for the thought to germinate and evolve during the process,
orchestration on the Bash side + LLM only for the creative layer. Clear division that simplifies a lot,
with custom workflow don’t be afraid to use keywords like “need” to trigger creative mode — ultimately it’s a solution for us, it’s important that it works and you want to use it, it should be convenient.

Thank you if you made it this far! What automations have you managed to create? Maybe you have practical use cases for custom commands?

Feel free to discuss in the comments 👇