I shipped a production B2B marketplace for the Russian HVAC industry — wentmarket.ru — alone, in two months, end to end. Not an MVP. Full commerce, seven external integrations, seventeen engineering calculation engines, 101 Prisma models, 446 test files, real paying customers.
This post is the breakdown of what was actually in it, how the work was split between me and Claude Code, and the parts I would not delegate again.
If your first reaction is "solo-built in two months is a red flag, not a feature" — good, mine would be too. Stick around. The answer is disclosure plus a list of concrete things that are in the repo.
What "AI-directed engineering" actually means here
The division of labor across the two months:
- Me: product scope, domain model, database schema, integration architecture, performance budget, edge cases, production QA, every decision that touches revenue or data integrity.
- Claude Code: CRUD handlers, first-pass tests, boilerplate React components, email templates, refactors I specify line by line, anything with an obvious shape.
This is not autocomplete. It is a workflow: I write the spec, Claude writes the first draft, I review and rewrite maybe 30–50% of it, Claude applies the rewrite, I ship. The failure mode is not "AI writes broken code" (current models rarely do on well-scoped tasks), the failure mode is "AI writes working code for the wrong problem." That is a human problem and it does not go away with better models.
In exchange for the discipline of writing clear specs, I get maybe a 3× throughput improvement on everything that is not an architecture decision. On a two-month solo sprint that compounds into roughly a team-of-four's worth of output — which is the number I actually felt.
The product, itemized
wentmarket is a two-sided marketplace for HVAC (ventilation equipment). B2B procurement, B2C retail, and a contractor-oriented engineering layer that helps specify the right fan / air handling unit / silencer for a duty point.
Commerce
- 584 products, 2,308 configurable product variants, 2,139 motor options, 62 categories
- B2B portal with client-specific pricing pulled from 1C ERP, quote workflow, PDF submittal generation
- B2C checkout with CDEK delivery rates, Yookassa payment with 54-FZ fiscal receipts and idempotent webhooks
- 35-section admin panel for moderation, inventory, orders, manual overrides
Engineering
- 17 calculation engines total, including:
- Fan selection by duty point (flow + pressure) across 18,141 polynomial pressure–flow curves indexed in Postgres, 4 ms average match time
- Air-handling-unit configurator with 10 sub-engines (psychrometric, ε-NTU heat-exchanger sizing, octave-band acoustics, filter selection, recuperator, fan-block, silencer, damper, control, life-cycle cost)
- Silencer selector (acoustic sum across 6 components)
- Life-cycle-cost calculator (15-year horizon, energy + maintenance)
- VFD (variable-frequency drive) sizing
- Duct pressure-loss calculator
- BIM model generator for engineering deliverables
- 385 automated engineering tests against manufacturer reference data; CI blocks merges at >0.5% deviation
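To make the ε-NTU heat-exchanger sizing concrete, here is a minimal counterflow sketch. All names and numbers below are mine for illustration, not from the repo; the standard ε-NTU relations are applied under the usual constant-cp assumption.

```typescript
// Effectiveness-NTU sizing for a counterflow heat exchanger (sketch).
// Units: massFlow kg/s, cp J/(kg*K), temperatures in deg C, ua W/K.
interface Stream {
  massFlow: number;
  cp: number;
  tIn: number;
}

// Counterflow effectiveness as a function of NTU and capacity ratio Cr.
function effectivenessCounterflow(ntu: number, cr: number): number {
  if (cr === 1) return ntu / (1 + ntu); // balanced-flow limit
  const e = Math.exp(-ntu * (1 - cr));
  return (1 - e) / (1 - cr * e);
}

// Heat duty (W) transferred for a given UA using the eps-NTU method.
function heatDuty(hot: Stream, cold: Stream, ua: number): number {
  const cHot = hot.massFlow * hot.cp;
  const cCold = cold.massFlow * cold.cp;
  const cMin = Math.min(cHot, cCold);
  const cMax = Math.max(cHot, cCold);
  const eps = effectivenessCounterflow(ua / cMin, cMin / cMax);
  return eps * cMin * (hot.tIn - cold.tIn); // q = eps * Cmin * dT_max
}
```

The real sub-engine also has to couple this with psychrometrics (condensation on the cold side changes the effective cp), which is where most of the reference-data calibration went.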
Integrations (each with retry, timeout, HMAC/signature verification, graceful degradation)
- 1C ERP — bidirectional product and order sync over Russian SOAP
- Bitrix24 CRM — full service layer, ~1,500 LOC, lead + deal + customer + funnel analytics
- Yookassa — payment processing, 54-FZ fiscal receipts, SHA-256 idempotency
- CDEK — delivery cost calculation, order creation, courier dispatch, PDF waybill
- Meilisearch — typo-tolerant search with Russian synonyms, Prisma fallback if Meilisearch is down
- Telegram — bot with 40+ commands + Mini App with HMAC-verified initData auth
- DaData — company lookup by INN; Sentry for error monitoring
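The signature-verification and idempotency properties that every webhook handler above needs can be sketched like this. This is a generic pattern, not Yookassa's actual wire protocol, and all names are illustrative; in production the deduplication set would be a unique-keyed database table, not in-memory state.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// Illustrative in-memory dedup store; production would use a DB table
// with a unique constraint on the payload hash.
const processed = new Set<string>();

// HMAC-SHA-256 signature check with a constant-time comparison.
function verifySignature(rawBody: string, signature: string, secret: string): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signature);
  return a.length === b.length && timingSafeEqual(a, b);
}

// Apply the side effect at most once per payload; duplicate deliveries
// are acknowledged but ignored. Returns true on first processing.
function handleOnce(rawBody: string, apply: () => void): boolean {
  const key = createHash("sha256").update(rawBody).digest("hex");
  if (processed.has(key)) return false;
  processed.add(key);
  apply();
  return true;
}
```

The point of the SHA-256 key is that payment providers retry deliveries aggressively; without idempotency, a retried `payment.succeeded` becomes a double-shipped order.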
Stack
Next.js 16 (App Router, Turbopack), React 19, TypeScript strict, Prisma ORM, PostgreSQL, Redis, BullMQ, NextAuth v5, Docker, GitHub Actions, Vitest (unit + integration), Playwright (E2E across desktop + mobile viewports). Linux VPS, systemd service, Nginx reverse proxy with 50 r/s rate limiting. Redis L2 cache, BullMQ job queues, distributed rate limiting, PostgreSQL soft-delete via Prisma $extends.
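The soft-delete mechanism mentioned above can be sketched with a Prisma client extension. The pure helper is the whole trick: every read gets `deletedAt: null` injected, and hard deletes become updates. The wiring shown in comments assumes a `deletedAt DateTime?` column on each model; it is a shape sketch, not the repo's actual extension.

```typescript
// Soft-delete filter injected into every read query (sketch).
type Where = Record<string, unknown>;

// Merge `deletedAt: null` into an arbitrary (possibly missing) where clause.
function withNotDeleted(where: Where | undefined): Where {
  return { ...(where ?? {}), deletedAt: null };
}

// Wiring via Prisma's $extends query component (requires a generated
// PrismaClient, so shown as a comment for shape only):
//
// const db = prisma.$extends({
//   query: {
//     $allModels: {
//       async findMany({ args, query }) {
//         args.where = withNotDeleted(args.where);
//         return query(args);
//       },
//       // Hard deletes are rewritten into a timestamp update.
//       async delete({ model, args }) {
//         return (prisma as any)[model].update({
//           where: args.where,
//           data: { deletedAt: new Date() },
//         });
//       },
//     },
//   },
// });
```

Doing this at the client-extension layer means 101 models get the behavior without per-model boilerplate, at the cost of having to remember the raw client bypasses it.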
Testing
- 446 test files total
- 1,574 unit tests (Vitest, 60%+ coverage)
- 165 Playwright E2E tests across 17 user-journey spec files
- 385 engineering tests, manufacturer-reference-data parity
That is not decoration. The 385 engineering tests exist because fan selection is a safety-adjacent decision — a mis-sized fan in a kitchen ventilation system is an actual fire risk.
The parts that took longer than they should have, honestly
1C ERP integration. Russian accounting software speaks a SOAP dialect that is half ISO-20022 and half vibes. The schema is publicly undocumented in English. Took about two weeks, including a very educational detour into Microsoft Windows character encoding on the wire. If you have not touched `windows-1251` in a while, it is still there, waiting.

Polynomial curve ingestion. The 18,141 curves came from manufacturer PDFs, scraped HTML tables, Delphi data files, and Excel sheets with merged cells. Each format needed its own parser and a verification harness that plotted the result against the original datasheet to catch fit drift. Curves are stored as `double precision[]` coefficient arrays in Postgres; Horner's-method evaluation runs at ~30 ns per curve. Wrote that up separately, with the OSS extract published as polynomial-fan-matcher.

The AHU configurator UX. The first version was a drag-and-drop canvas. I showed it to three HVAC engineers; they bounced off it in 30 seconds. Rebuilt it around what they actually do: enter a duty point, receive three matching AHUs, override individual components in focused forms. Time-to-first-viable-configuration went from ~12 min to ~40 sec with the same users. That call killed six weeks of work. I would do it again.
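The coefficient-array storage and Horner evaluation described above look roughly like this. Function names and the simplified one-sided match criterion are mine for illustration; the published extract is polynomial-fan-matcher.

```typescript
// Evaluate a stored pressure-flow curve P(q) from its coefficient array.
// Coefficients are ordered low-degree-first: [c0, c1, c2, ...].
function hornerEval(coeffs: number[], q: number): number {
  // Horner's method: one multiply and one add per coefficient,
  // a single pass from the highest-degree term down.
  let p = 0;
  for (let i = coeffs.length - 1; i >= 0; i--) {
    p = p * q + coeffs[i];
  }
  return p;
}

// Simplified duty-point check: the fan qualifies if its curve delivers
// at least the required pressure at the required flow.
function matchesDutyPoint(coeffs: number[], flow: number, pressure: number): boolean {
  return hornerEval(coeffs, flow) >= pressure;
}
```

At ~30 ns per evaluation, brute-forcing all 18,141 curves for one duty point is under a millisecond of pure math; the 4 ms figure is dominated by the Postgres index pass that prunes the candidate set first.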
Testing the engineering engines. This is where the "Claude does tests" claim has to be qualified. Claude writes test scaffolding and first-pass assertions — but the actual reference data (expected outputs) comes from manufacturer datasheets that I had to digitize by hand, and the tolerance bands come from ASHRAE / СП 60.13330. That calibration is irreducibly human, irreducibly domain-specific, and it is where the value is.
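The shape of one of those parity tests, with the 0.5% merge-blocking gate described earlier, is roughly this. The helper names and reference numbers are illustrative; the real suite expresses the same check as Vitest assertions against hand-digitized datasheet points.

```typescript
// Relative deviation of an engine output from a datasheet reference value.
function relativeDeviation(actual: number, expected: number): number {
  return Math.abs(actual - expected) / Math.abs(expected);
}

// CI gate: any engineering result more than 0.5% off its manufacturer
// reference blocks the merge.
const MAX_DEVIATION = 0.005;

function assertParity(actual: number, expected: number): void {
  const dev = relativeDeviation(actual, expected);
  if (dev > MAX_DEVIATION) {
    throw new Error(`deviation ${(dev * 100).toFixed(2)}% exceeds 0.5% gate`);
  }
}
```

The tolerance itself is the human part: 0.5% is meaningful only because the reference points were digitized carefully and the band was chosen against what ASHRAE / СП 60.13330 consider acceptable, not because the number is magic.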
What I would not delegate to Claude, even next time
- Domain model decisions. A `Product` vs. a `ProductSeries` vs. a `ProductVariant` vs. a `MotorOption` is not a naming problem, it is a commercial model. Getting that wrong costs you the next six months.
- Integration contracts. What exact payload do we send to 1C when a B2B client places a quote that has not yet been approved? The answer depends on the customer's internal workflow, not on the 1C documentation. A model cannot solve that for you.
- Error-handling philosophy. Fail fast or degrade gracefully? The answer is different for the payment webhook (fail fast, write to the incident table, the human decides) and for the search index (degrade gracefully, fall back to Prisma `LIKE`, never block commerce on search availability). Picking the right mode per surface is an architecture call.
- What not to build. Every feature I did not build saved me more time than the ones I did build. "Should we add a chatbot" / "should we build a mobile app" / "should we add multi-currency pricing" all got answered "not in V1", and that is why V1 shipped.
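The "degrade gracefully" mode for search can be sketched as a race against a timeout with a database fallback. Names and the timeout value are illustrative, not from the repo.

```typescript
// Graceful degradation for search: try Meilisearch under a deadline,
// fall back to a database LIKE-style query on failure or timeout.
type Product = { id: string; name: string };
type SearchFn = (q: string) => Promise<Product[]>;

async function searchWithFallback(
  q: string,
  meili: SearchFn,   // primary: Meilisearch client call
  dbLike: SearchFn,  // fallback: Prisma contains / SQL LIKE query
  timeoutMs = 300,
): Promise<Product[]> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error("search timeout")), timeoutMs);
  });
  try {
    return await Promise.race([meili(q), timeout]); // fast path
  } catch {
    return dbLike(q); // never block commerce on search availability
  } finally {
    if (timer !== undefined) clearTimeout(timer); // avoid a stray rejection
  }
}
```

The payment webhook gets the opposite treatment on purpose: there, silently degrading would mean losing money-moving events, so it fails fast and writes an incident for a human.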
Numbers that matter, briefly
- 584 products, 2,308 variants, 2,139 motor options, 62 categories
- 101 Prisma models, 40+ enums, 35-section admin panel
- 18,141 polynomial curves, 4 ms duty-point match, 385 engineering tests
- 446 test files, 60%+ unit coverage, 165 E2E specs across desktop + mobile
- 7 production integrations, each with retry / timeout / HMAC / graceful fallback
- Two months, one engineer, with Claude Code as a build-accelerator
- Live, paying customers
Why I am posting this
Two honest reasons.
One: I think the industry is still calibrating what "AI-accelerated solo engineering" actually looks like on a serious product. A lot of the public discourse is either "solo founders ship magical 10× speed with AI now" or "it is a glorified autocomplete, nothing has changed." Neither is useful. The real thing is closer to: with the right discipline, one competent senior engineer can ship what a team of four used to ship, on a real B2B product with real integrations, in a short window. I wanted to describe that specifically rather than generically.
Two: I am looking for offers. Open to engineering roles (remote or relocation), licensing the engineering engines to a Western HVAC or building-simulation vendor, partnership conversations, or acquisition discussions. I am in Russia (UTC+3) and fluent in written English; my spoken English is still developing, so I work async-first.
If any of that is interesting:
- Live product: wentmarket.ru
- OSS extract (fan-matching polynomial engine): polynomial-fan-matcher
- 5-min demo video: youtu.be/sZnvwEfCwVk
- Contact: goncharov.artur.02@gmail.com
Happy to go deeper on any specific engine, integration, or architectural decision in the comments.