~K¹yle Million

Posted on Jun 5

How I Built a Multi-Product AI API Business on Railway — FastAPI + Stripe + x402 + MCP in One Stack

#mcp #python #ai #programming

A few months ago I wanted to sell access to AI tools I'd built — a passive income analyzer, an AI strategy consultant, an autonomous commerce layer. The standard approach is: pick a SaaS platform, pay their cut, accept their limitations.

Instead I built ACE — an AI commerce engine that handles everything itself. It's been running on Railway for months, delivering license keys, processing Stripe webhooks, and accepting x402 micropayments. Here's the architecture.

What ACE does

ACE is a license server that sits between a payment event and an AI tool delivery. When someone pays:

Stripe fires a webhook
ACE validates the event
ACE generates a Fernet-encrypted license key scoped to the purchased product
Resend delivers the key via email
The customer uses the key to call the API

It handles three products today:

YIELD INTELLIGENCE — passive income opportunity scanner (Treasury rates, dividend screening, AI analysis pass)
COUNSELOR — AI infrastructure strategy and architecture consulting
ACE Autonomous Commerce — agent-to-agent purchase execution with spending limit enforcement

All three are MCP-native (Streamable HTTP, spec 2025-11-25) and A2A-discoverable via agent cards.

The stack

FastAPI (Python)
├── Stripe webhook handler (payment → license)
├── License generation (Fernet encryption)
├── Resend email delivery
├── SQLite WAL (license tracking, idempotency guards)
├── x402 middleware (micropayment route protection)
└── MCP server wrapper (tools/call interface)

Deployed on Railway. Cost at current scale: $0/mo (Railway's free tier covers it until traffic hits).

The payment layer: Stripe + x402

ACE supports two payment modes because its customers are both humans and agents.

Stripe handles human subscription billing. Monthly tiers with Stripe webhooks trigger license key generation. The webhook handler is idempotent — replayed events are caught by SQLite deduplication before a second key is issued.

x402 handles agent-to-agent micropayments. When an AI agent calls a protected endpoint without payment, ACE returns HTTP 402 with a USDC payment challenge. The agent pays on Base mainnet, retries with the X-PAYMENT header, and ACE verifies the on-chain settlement before serving the response.

This split matters: humans subscribe monthly, agents pay per call. Same API, two billing rails.

License key architecture

from cryptography.fernet import Fernet

# Key generation (on webhook)
fernet = Fernet(FERNET_KEY)
license_key = fernet.encrypt(
    json.dumps({
        "product": "yield_intelligence",
        "tier": "professional",
        "customer_id": stripe_customer_id,
        "issued_at": datetime.utcnow().isoformat(),
        "expires_at": (datetime.utcnow() + timedelta(days=30)).isoformat(),
    }).encode()
).decode()

The key is the payload. There's no separate lookup table for validation — decryption is the validation. If the Fernet decrypt succeeds and the expiry hasn't passed, the key is valid.

SQLite WAL handles the idempotency:

# Before issuing any key
cursor.execute(
    "SELECT license_key FROM licenses WHERE stripe_payment_intent = ?",
    (payment_intent_id,)
)
if cursor.fetchone():
    return existing_key  # replay protection

The MCP wrapper

Each product exposes itself as an MCP tool. The wrapper is minimal:

@app.post("/mcp")
async def mcp_endpoint(request: MCPRequest, api_key: str = Depends(validate_key)):
    if request.method == "tools/list":
        return {"tools": TOOLS_SCHEMA}

    if request.method == "tools/call":
        tool_name = request.params["name"]
        arguments = request.params["arguments"]

        result = await TOOL_HANDLERS[tool_name](arguments)
        return {"content": [{"type": "text", "text": json.dumps(result)}]}

The A2A agent card lives at /.well-known/agent.json and contains the pricing manifest, supported protocols, and capability descriptions. Another agent can discover ACE, query its pricing, and decide whether to call it — all before any payment happens.

The BYOK model

ACE's AI products use Claude as the reasoning layer, but the cost structure passes through to the customer. BYOK (Bring Your Own Key): the customer brings their Anthropic API key, ACE applies it to the analysis call, returns the result.

This means ACE's marginal cost for AI calls is zero. The license fee covers the infrastructure and IP (the prompts, the data pipeline, the business logic around the AI call) — not the compute.

# Customer's key is scoped to their request context
async def analyze_yield_opportunities(args: dict, customer_key: str):
    client = AsyncAnthropic(api_key=customer_key)
    response = await client.messages.create(
        model="claude-opus-4-7",
        messages=[{
            "role": "user",
            "content": build_yield_analysis_prompt(args, live_rates)
        }]
    )
    return parse_analysis_response(response)

Deployment on Railway

Railway's Dockerfile deploy:

FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]

Environment variables in Railway (STRIPE_SECRET_KEY, FERNET_KEY, RESEND_API_KEY, etc.) — zero secrets in the repo.

Railway's auto-deploy on push means the CI/CD story is: push to main, Railway redeploys in ~2 minutes. No GitHub Actions config needed for this scale.

What I'd do differently

SQLite is fine until it isn't. WAL mode handles concurrent reads well but a sudden traffic spike would hit write contention. The migration path is to Postgres — Railway makes this a one-env-var change — but I'm deferring it until there's actually load.

The x402 middleware needs better error messaging. Currently a malformed payment header returns a generic 402. Adding a structured error body (which field failed, what the correct format is) would reduce debugging time for agent developers calling the API.

Monitoring is thin. I have health checks and Railway's built-in metrics. A proper observability layer (structured logs → something queryable) would make debugging production issues faster. Currently it's railway logs and grep.

Current status

ACE is live. License delivery is automated. The endpoint is MCP-native and A2A-discoverable. Revenue is thin (early stage) but the architecture runs without my involvement.

The next milestone is traffic — getting agent developers who need these tools to find ACE in the places they look (MCP registries, model marketplaces, dev.to articles like this one).

If you're building something that needs passive income analysis, AI architecture strategy, or an autonomous commerce layer in your agent stack — ACE is at ace-license-server-production.up.railway.app.

Built by ~K¹ (William Kyle Million) · IntuiTek¹

DEV Community