There's a moment in every project where you have a working endpoint, you know
you should write tests for it, and you also know you're about to spend the next
hour wiring up an HTTP client, an assertion library, and a dozen little helpers
before you write a single meaningful check.
I got tired of that moment. So I built two-go:
a small, zero-dependency library for testing HTTP APIs from Node. You build a
request with a chainable API, attach the checks you care about, and await it.
import { go } from "two-go";
await go("https://api.example.com")
  .get("/users")
  .bearer(token)
  .expectStatus(200)
  .expectHeader("content-type", /json/)
  .expectJson("data[0].id", 1);
That one chain sends the request and runs all three checks. If any of them
fails, it throws, so there's no special runner to configure. It works on its
own, and it drops straight into node:test, Jest, Vitest, or Mocha with no
plugin.
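If you're wondering how one expression can both collect checks and be awaited, the usual trick is a thenable: an object with its own then() method. Here's a minimal sketch of that pattern. It is not two-go's source; MiniRequest and its injected fetchImpl parameter are hypothetical names I'm using only for illustration.

```javascript
// Hypothetical mini-builder, NOT two-go's implementation.
// It only illustrates the "chain, then await" pattern.
class MiniRequest {
  constructor(url, fetchImpl = fetch) {
    this.url = url;
    this.fetchImpl = fetchImpl; // injectable so the sketch can run offline
    this.checks = [];
  }

  expectStatus(code) {
    this.checks.push((res) => {
      if (res.status !== code) {
        throw new Error(`expected status ${code}, got ${res.status}`);
      }
    });
    return this; // returning `this` is what makes the calls chainable
  }

  expectHeader(name, pattern) {
    this.checks.push((res) => {
      const value = res.headers.get(name) ?? "";
      if (!pattern.test(value)) {
        throw new Error(`header ${name}="${value}" does not match ${pattern}`);
      }
    });
    return this;
  }

  // `await` calls then(), so nothing is sent until the chain is awaited.
  then(resolve, reject) {
    return this.run().then(resolve, reject);
  }

  async run() {
    const res = await this.fetchImpl(this.url);
    for (const check of this.checks) check(res); // first failing check throws
    return res;
  }
}

// Usage (inside an async function or an ES module):
//   await new MiniRequest("https://api.example.com/users").expectStatus(200);
```

Because a failed check surfaces as a rejected promise, any test runner that understands async functions can report it without a plugin.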
That's the core. But the part I actually want to talk about in this post is the
part that didn't exist in API testing libraries when I started: letting an AI
agent do the testing work for you. two-go ships two things for that: an
optional AI layer, and an MCP server.
Why bother with an AI layer at all?
Most of the friction in API testing isn't writing assertions. It's staring
at a response and deciding what to assert. What status should this return?
What's the shape of data[0]? Which fields could leak? Which weird payloads
break it?
Those are exactly the questions a language model is good at taking a first pass
at. So two-go has an optional two-go/ai entry point that can:
- draft a test suite from a live endpoint or a sample response,
- explain why a test failed,
- review a response for likely bugs,
- and generate adversarial payloads to fuzz an endpoint.
The important design choice: the AI layer never touches the core. two-go still has zero runtime dependencies. The AI module talks to the provider over fetch with your own key, and it works with OpenAI, Anthropic, or any compatible endpoint, including a local model.
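To make "talks to the provider over fetch" concrete, here's a rough sketch of a zero-dependency call to an OpenAI-style chat completions route. This is not two-go's internal code. buildChatRequest and askModel are hypothetical names, and the sketch covers only the OpenAI-compatible request shape (Anthropic's native API uses different headers and paths).

```javascript
// Sketch only: a dependency-free call to an OpenAI-compatible endpoint.
// Function names and option names here are assumptions for illustration.
function buildChatRequest({ baseURL, apiKey, model, prompt }) {
  return {
    // Trim trailing slashes so "…/v1" and "…/v1/" both work.
    url: `${baseURL.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "content-type": "application/json",
        authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

async function askModel(options, fetchImpl = fetch) {
  const { url, init } = buildChatRequest(options);
  const res = await fetchImpl(url, init);
  if (!res.ok) throw new Error(`provider returned HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content; // text of the first completion
}
```

Pointing baseURL at a local server that speaks the same route is all it takes to swap in a local model, which is why no SDK is needed.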
Drafting a suite from a real endpoint
export OPENAI_API_KEY=sk-...
two-go ai gen https://api.example.com/users -o test/users.twogo.mjs
# or from a saved response, with a different provider
two-go ai gen ./sample.json --provider anthropic -o test/users.twogo.mjs
Or from code:
import { aiGenerateTests } from "two-go/ai";
const code = await aiGenerateTests({
  endpoint: "/users",
  baseUrl: "https://api.example.com",
  sample: { data: [{ id: 1, name: "Ada" }] },
  provider: "openai", // or "anthropic", or a custom { baseURL } for a local model
});
The output is a normal *.twogo.mjs file. It goes into git and runs in CI like anything else. Treat it as a first draft. It gets the boilerplate and the obvious checks out of the way, and you tighten the assertions from there.
Explaining a failure (after the fact, never changing the result)
When a test fails, you can ask the model what probably went wrong. This is advisory: it runs after the failure and never changes pass or fail.
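import { go } from "two-go";
import { explainFailure } from "two-go/ai";
// Assumes `api` is a client bound to the base URL, as in the first example.
const api = go("https://api.example.com");
try {
  await api.get("/users").expectStatus(200);
} catch (err) {
  const why = await explainFailure(err, {
    response: err.response,
    provider: "openai",
  });
  console.log(why); // likely cause plus a suggested fix
  throw err; // rethrow so the test still fails
}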
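One way to make the "never changes pass or fail" guarantee explicit is to wrap the pattern in a helper that always rethrows the original error, even when the explanation step itself breaks. This is a sketch under my own assumptions: withExplanation is a hypothetical helper, not something two-go exports, and explain stands in for whatever function produces the explanation.

```javascript
// Hypothetical helper, not part of two-go: keeps the explanation advisory.
async function withExplanation(testFn, explain, log = console.error) {
  try {
    return await testFn();
  } catch (err) {
    try {
      log(await explain(err));
    } catch (explainErr) {
      // A bad key or a provider outage must not mask the real failure.
      log(`explanation unavailable: ${explainErr.message}`);
    }
    throw err; // always rethrow the original assertion error
  }
}
```

The test result depends only on testFn. The model can add context to the log, but it can't turn a red test green or a green test red.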
Reviewing and fuzzing
aiReview looks at a response and hands back a list of likely problems: a leaked token, a wrong type, a field that shouldn't be there. aiFuzz generates adversarial payloads you then send with the normal client.
import { go } from "two-go";
import { aiReview, aiFuzz } from "two-go/ai";
const api = go("https://api.example.com"); // assumed client, as in the first example
const res = await api.get("/me");
const findings = await aiReview(res, { provider: "openai" });
// [{ severity, field, message }]
const payloads = await aiFuzz({
  endpoint: "/users",
  method: "POST",
  schema: { type: "object", properties: { name: { type: "string" } } },
});
for (const body of payloads) {
  const r = await api.post("/users").json(body);
  if (r.status >= 500) console.log("possible bug on payload", body, "->", r.status);
}
Both are advisory. aiReview gives you findings, aiFuzz gives you inputs, and you decide what to do with them. The model proposes; your assertions dispose.
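If you do want some findings to block a build, that decision can live in a few lines of your own code. The sketch below assumes the { severity, field, message } shape from the example above. The function names and the severity values "high" and "critical" are my assumptions, not documented two-go behavior.

```javascript
// Hypothetical gate, not a two-go API. Severity names are assumptions.
function blockingFindings(findings, blocking = ["high", "critical"]) {
  return findings.filter((f) => blocking.includes(f.severity));
}

function assertNoBlockingFindings(findings, blocking) {
  const bad = blockingFindings(findings, blocking);
  if (bad.length > 0) {
    const lines = bad.map((f) => `${f.severity}: ${f.field}: ${f.message}`);
    throw new Error(
      `review found ${bad.length} blocking issue(s):\n${lines.join("\n")}`
    );
  }
}
```

Everything below the threshold stays a log line, so a noisy model can't fail CI on its own.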
The MCP server: hand the tools to the agent
The AI layer above is two-go calling out to a model. The MCP server flips the direction: it lets an agent like Claude drive two-go directly, as a set of tools.
If you haven't run into it yet, MCP (Model Context Protocol) is an open standard for exposing tools to AI agents. two-go ships an MCP server
that runs over stdio with no dependencies, no URL, no account, and no API key. It's all local.
Install once so the two-go-mcp command is on your PATH:
npm install -g two-go
Then wire it into your client. For Claude Code:
claude mcp add two-go two-go-mcp
For Claude Desktop, Cursor, Windsurf, Copilot CLI, or Kiro,
drop this into the client's MCP config:
{
  "mcpServers": {
    "two-go": { "command": "two-go-mcp" }
  }
}
(VS Code uses a top-level servers key and a "type": "stdio" field; Codex and Gemini have their own mcp add commands. The README has a copy-paste block for each.)
The tools it exposes to the agent:
- http_request: send a request, get back status, headers, timing, and body.
- gen_openapi / gen_postman: generate a suite from a spec or collection.
- infer_schema: infer a JSON schema from a value.
- validate_schema: validate a value against a schema.
Once it's connected, you can just talk to your agent: "call the staging users endpoint and tell me if the shape changed," or "generate a test suite from this OpenAPI file." The agent makes the calls through two-go and reasons over the real responses.
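To give a feel for what "tell me if the shape changed" involves, here's a deliberately tiny shape inference and diff. It is far simpler than a real JSON Schema tool and is not how two-go's infer_schema works; inferShape and diffShapes are hypothetical names for this sketch.

```javascript
// Illustrative only: reduces a JSON value to a nested description of types.
function inferShape(value) {
  if (value === null) return "null";
  if (Array.isArray(value)) {
    // Describe arrays by their first element; empty arrays stay untyped.
    return value.length > 0 ? [inferShape(value[0])] : [];
  }
  if (typeof value === "object") {
    const shape = {};
    for (const key of Object.keys(value).sort()) {
      shape[key] = inferShape(value[key]);
    }
    return shape;
  }
  return typeof value; // "string", "number", or "boolean"
}

// Returns a list of paths where two inferred shapes disagree.
function diffShapes(before, after, path = "$") {
  if (typeof before === "string" || typeof after === "string") {
    return before === after
      ? []
      : [`${path}: ${JSON.stringify(before)} -> ${JSON.stringify(after)}`];
  }
  if (Array.isArray(before) !== Array.isArray(after)) {
    return [`${path}: array/object mismatch`];
  }
  if (Array.isArray(before)) {
    // An empty array carries no element type, so there is nothing to compare.
    if (before.length === 0 || after.length === 0) return [];
    return diffShapes(before[0], after[0], `${path}[0]`);
  }
  const keys = new Set([...Object.keys(before), ...Object.keys(after)]);
  const out = [];
  for (const key of [...keys].sort()) {
    if (!(key in before)) out.push(`${path}.${key}: added`);
    else if (!(key in after)) out.push(`${path}.${key}: removed`);
    else out.push(...diffShapes(before[key], after[key], `${path}.${key}`));
  }
  return out;
}
```

An agent with real responses in hand does this kind of comparison for you, and then explains what the difference probably means.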
And because the server logic is importable, you can host it yourself:
import { createServer } from "two-go/mcp";
const server = createServer();
const response = await server.handle({
  jsonrpc: "2.0",
  id: 1,
  method: "tools/list",
});
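If you haven't seen the wire format, the sketch below shows the JSON-RPC shapes MCP uses for listing and calling tools. It is a toy in-memory handler, not two-go's server: createMiniServer and the tool map it takes are hypothetical, and a real server also handles initialization and other methods.

```javascript
// Toy handler showing MCP's tools/list and tools/call message shapes.
// Not two-go's implementation; names here are assumptions.
function createMiniServer(tools) {
  // `tools` is a Map of name -> { description, inputSchema, run(args) }.
  return {
    async handle(msg) {
      const ok = (result) => ({ jsonrpc: "2.0", id: msg.id, result });
      const fail = (code, message) => ({
        jsonrpc: "2.0",
        id: msg.id,
        error: { code, message },
      });

      if (msg.method === "tools/list") {
        return ok({
          tools: [...tools].map(([name, t]) => ({
            name,
            description: t.description,
            inputSchema: t.inputSchema,
          })),
        });
      }

      if (msg.method === "tools/call") {
        const tool = tools.get(msg.params.name);
        if (!tool) return fail(-32602, `Unknown tool: ${msg.params.name}`);
        const text = await tool.run(msg.params.arguments ?? {});
        return ok({ content: [{ type: "text", text }] });
      }

      return fail(-32601, "Method not found"); // standard JSON-RPC code
    },
  };
}
```

Whatever carries the messages (stdio, a socket, a test harness), the agent only ever sees requests and responses in this shape.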
The rest of the box (so the AI output has somewhere to land)
The AI features are only useful because the generated tests run against a real library. The non-AI core is the bulk of two-go:
- HTTP client + inline checks: expectStatus, expectJson, expectHeader, expectJsonSchema, expectSorted, and a long list of others.
- expect() for any value: Jest-style matchers, with .not, .resolves, and .rejects.
- Soft assertions: collect every failure in a run, throw once at the end.
- Polling: eventually / pollUntil for slow or eventually-consistent endpoints.
- Snapshots, sessions with {{token}} chaining, a fake-data generator, async helpers (parallelLimit, mapLimit, waterfall), a ~170-function utility belt, and a JSON schema validator + inference.
- Importers that turn an OpenAPI doc or Postman collection into a starting suite, with a CLI: two-go gen openapi ./openapi.json -o test/api.twogo.mjs.
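Two of the patterns in that list are easy to picture in plain JavaScript. The sketch below shows the general idea behind soft assertions and polling. The names softly and retryUntil and their options are hypothetical; two-go's own helpers may have different names and signatures.

```javascript
// Generic sketches of two patterns, not two-go's own helpers.

// Soft assertions: run every check, remember the failures, throw once.
function softly(checks) {
  const errors = [];
  for (const check of checks) {
    try {
      check();
    } catch (err) {
      errors.push(err);
    }
  }
  if (errors.length > 0) {
    throw new AggregateError(
      errors,
      `${errors.length} of ${checks.length} checks failed`
    );
  }
}

// Polling: retry an async probe until it passes or the deadline is reached.
// The probe signals "not yet" by throwing.
async function retryUntil(probe, { timeoutMs = 5000, intervalMs = 200 } = {}) {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    try {
      return await probe();
    } catch (err) {
      if (Date.now() >= deadline) throw err; // out of time: surface last error
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Soft assertions trade fail-fast for a complete report, which helps when one response has several problems. Polling fits endpoints where the right answer shows up a little after the write.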
All of it ships with hand-written TypeScript types and zero runtime dependencies. Node 18+ (it uses the built-in fetch), ESM only.
A realistic flow, end to end
Here's how the pieces fit together in practice:
- You have an endpoint and an OpenAPI doc. Run two-go gen openapi to get a skeleton suite.
- Ask your agent (via the MCP server) to call the live endpoint and flag any drift from the doc.
- Run aiReview on a sample response to surface leaks and type mismatches.
- Tighten the generated checks into real assertions by hand.
- Commit the *.twogo.mjs files. They run in CI like any other test.
The AI does the tedious first pass: boilerplate, obvious checks, "here's what looks off." You keep the judgment calls. Nothing about the AI layer is load-bearing. Pull the key and the core still works exactly the same.
Try it
npm install two-go --save-dev
The roadmap (JUnit/JSON reporters, a GraphQL helper, a cookie jar, request-level retry) is open in the issues. If you build something with it, or break it, I'd love to hear about it.