I Built and Launched an AI Document API in Under a Week — Here's Exactly How I Did It
TL;DR: I built Condensare — an AI-powered document processing API that turns any uploaded file into structured notes and contextualised implementation suggestions. It's live on RapidAPI. This is the full breakdown of how I built it, deployed it, tested it and shipped it.
The Problem
A few weeks ago I had a simple frustration.
I was uploading documents to various AI tools trying to get useful summaries and actionable insights, and every single one felt generic. The output didn't know what industry I was in, what scale my business operated at, or what I actually wanted to do with the information.
It just summarised. That was it.
So I built Condensare — a document condensing and AI suggestion API that takes any uploaded file and returns structured notes plus contextualised implementation suggestions.
The Stack
I kept it simple and production-focused:
| Layer | Technology |
|---|---|
| API Server | Node.js + Express |
| AI | OpenAI GPT-4o |
| Deployment | Vercel (serverless) |
| Marketplace | RapidAPI |
| File Uploads | Multer |
| Document Parsing | pdf-parse, mammoth, xlsx |
Vercel serverless was the right call for an API marketplace product — zero infrastructure management, global distribution out of the box, automatic scaling. You don't want to be managing servers when you're trying to get to market fast.
Project Structure
```
condensare-api/
├── src/
│   ├── app.js                  # Express entry point
│   ├── routes/
│   │   └── condense.js         # Route definitions
│   ├── middleware/
│   │   └── authenticate.js     # Auth + plan management
│   ├── services/
│   │   ├── parseService.js     # File parsing logic
│   │   ├── condenseService.js  # AI condensing logic
│   │   └── suggestService.js   # AI suggestions logic
│   └── utils/
│       └── fileUtils.js        # File validation helpers
├── vercel.json
└── package.json
```
The Two Endpoints
POST /api/v1/condense/parse
Available on: FREE, PRO, MEGA, ULTRA
The lightweight endpoint. Upload any supported file and get back extracted raw text plus metadata — word count, character count, MIME type, a unique request ID and processing time. No AI involved, just clean text extraction.
Example response:
```json
{
  "success": true,
  "data": {
    "id": "c934e6f2-a2d8-416e-8782-c40e4e5eb8d3",
    "filename": "report.pdf",
    "mimeType": "application/pdf",
    "characterCount": 9786,
    "wordCount": 1438,
    "extractedText": "Full extracted text content...",
    "parsedAt": "2026-02-24T10:00:00.000Z",
    "processingTimeMs": 385
  }
}
```
POST /api/v1/condense
Available on: PRO, MEGA, ULTRA
The full AI pipeline. File gets parsed, text gets passed to GPT-4o with a carefully engineered prompt, and you get back two structured objects:
- Condensed Notes — title, summary, key points, organised sections, keywords
- AI Suggestions — prioritised recommendations, quick wins, long term goals, risk warnings
Example response:
```json
{
  "success": true,
  "data": {
    "id": "2f05d51f-1469-4399-8d93-2005f6215028",
    "plan": "pro",
    "priorityProcessing": false,
    "processingTimeMs": 11641,
    "condensedNotes": {
      "title": "Comprehensive Guide to Setting Up a Project",
      "summary": "Step-by-step guide covering project setup...",
      "keyPoints": [
        "Create a project directory and navigate to it.",
        "Install Node.js and initialise a React project.",
        "Connect the project to a GitHub repository."
      ],
      "sections": [
        {
          "heading": "Setup",
          "content": "Instructions for creating the project directory."
        }
      ],
      "keywords": ["React", "Firebase", "Node.js"],
      "tone": "business",
      "depth": "standard"
    },
    "suggestions": {
      "suggestions": [
        {
          "title": "Integrate AI Features",
          "description": "Leverage AI to enhance user experience.",
          "priority": "high",
          "effort": "medium",
          "impact": "high"
        }
      ],
      "quickWins": ["Set up a GitHub repo immediately."],
      "longTermGoals": ["Build a CI/CD pipeline."],
      "warnings": ["Monitor Firebase costs as usage scales."]
    }
  }
}
```
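Because /condense is PRO and up, a FREE key has to be bounced before any file parsing or OpenAI spend happens. A minimal gate might look like this, assuming the authenticate middleware (shown later) has already attached `req.planLimits`. This is my sketch, not necessarily the real middleware:

```javascript
// Hypothetical plan gate: rejects any plan whose limits don't include
// AI suggestions, e.g. a FREE key hitting /condense. Assumes
// authenticate() has already run and set req.planLimits.
const requireAiPlan = (req, res, next) => {
  if (!req.planLimits || !req.planLimits.aiSuggestions) {
    return res.status(403).json({
      success: false,
      error: 'Your plan does not include the AI pipeline. Upgrade to PRO or higher.',
      code: 403
    });
  }
  next();
};
```

Rejecting before the file is parsed matters: a 403 should never cost you an OpenAI call.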
File Parsing
Supporting multiple file types meant handling each format differently under the hood:
```javascript
const parseFile = async (file) => {
  const { mimetype, buffer } = file;

  switch (mimetype) {
    case 'application/pdf':
      return await parsePdf(buffer);
    case 'application/vnd.openxmlformats-officedocument.wordprocessingml.document':
      return await parseDocs(buffer);
    case 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet':
      return await parseXlsx(buffer);
    case 'text/plain':
    case 'text/markdown':
    case 'text/csv':
      return buffer.toString('utf-8');
    default:
      throw new Error(`Unsupported file type: ${mimetype}`);
  }
};
```
Each parser returns normalised plain text. From the AI's perspective it just sees text — the complexity of handling different formats is completely abstracted away.
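The post doesn't show fileUtils.js, but a validation helper in that spirit might look like the following. The 10 MB cap is an assumed number for illustration, not a documented limit:

```javascript
// Hypothetical validation helper. Checks the Multer file object's
// MIME type against the supported list and enforces an assumed size
// cap before any parsing happens.
const SUPPORTED_MIME_TYPES = new Set([
  'application/pdf',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  'text/plain',
  'text/markdown',
  'text/csv'
]);
const MAX_FILE_BYTES = 10 * 1024 * 1024; // assumed limit

const validateFile = (file) => {
  if (!file) return { valid: false, error: 'No file provided.' };
  if (!SUPPORTED_MIME_TYPES.has(file.mimetype)) {
    return { valid: false, error: `Unsupported file type: ${file.mimetype}` };
  }
  if (file.size > MAX_FILE_BYTES) {
    return { valid: false, error: 'File exceeds the maximum upload size.' };
  }
  return { valid: true };
};
```

Validating before parsing is what lets the "No file provided" test case return a clean 400 instead of a parser crash.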
Authentication and Plan Management
Authentication works through RapidAPI's proxy secret mechanism. Every request through the RapidAPI gateway includes an X-RapidAPI-Proxy-Secret header proving it came through their infrastructure.
Plan management works through the X-RapidAPI-Subscription header which RapidAPI appends automatically:
```javascript
const authenticate = (req, res, next) => {
  const proxySecret = req.headers['x-rapidapi-proxy-secret'];
  const expectedSecret = process.env.RAPIDAPI_SECRET;

  // Validate proxy secret
  if (!proxySecret || proxySecret !== expectedSecret) {
    return res.status(401).json({
      success: false,
      error: 'Unauthorised. Invalid or missing API credentials.',
      code: 401
    });
  }

  // Detect plan from subscription header
  const subscriptionHeader = req.headers['x-rapidapi-subscription'];
  const plan = VALID_PLANS.includes(subscriptionHeader?.toLowerCase())
    ? subscriptionHeader.toLowerCase()
    : 'free';

  // Attach plan and limits to request
  req.plan = plan;
  req.planLimits = PLAN_LIMITS[plan];
  next();
};
```
The four tiers and their limits:
```javascript
const PLAN_LIMITS = {
  free:  { requestsPerDay: 20,    requestsPerMinute: 5,   aiSuggestions: false },
  pro:   { requestsPerDay: 500,   requestsPerMinute: 30,  aiSuggestions: true },
  mega:  { requestsPerDay: 2000,  requestsPerMinute: 60,  aiSuggestions: true },
  ultra: { requestsPerDay: 10000, requestsPerMinute: 100, aiSuggestions: true, priorityProcessing: true }
};
```
Prompt Engineering
Getting consistently structured JSON output from GPT-4o took more iteration than anything else in the project. Early prompts returned well-written prose but in formats that varied between requests — inconsistent field names, missing sections, sometimes prose instead of JSON.
The solution was to be extremely explicit:
```javascript
const buildCondensePrompt = ({ text, tone, depth, industry, scale, focus, goal }) => `
You are a document analysis AI. Analyse the following document and return a JSON object.

IMPORTANT: Return ONLY valid JSON. No text before or after. No markdown code blocks.

Document:
"""
${text}
"""

Parameters:
- Tone: ${tone}
- Depth: ${depth}
- Industry: ${industry || 'general'}
- Scale: ${scale || 'medium'}
${focus ? `- Focus: ${focus}` : ''}
${goal ? `- Goal: ${goal}` : ''}

Return this exact structure:
{
  "condensedNotes": {
    "title": "string",
    "summary": "string",
    "keyPoints": ["string"],
    "sections": [{ "heading": "string", "content": "string" }],
    "keywords": ["string"],
    "tone": "${tone}",
    "depth": "${depth}"
  },
  "suggestions": {
    "suggestions": [{ "title": "string", "description": "string", "priority": "high|medium|low", "effort": "high|medium|low", "impact": "high|medium|low" }],
    "quickWins": ["string"],
    "longTermGoals": ["string"],
    "warnings": ["string"]
  }
}
`;
```
The strict structure instruction plus showing the exact JSON shape expected eliminated the inconsistency completely.
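Even so, I'd keep a defensive unwrap in front of `JSON.parse`, since models occasionally reintroduce a markdown fence despite instructions. A small guard like this is cheap insurance (my own sketch, not necessarily what condenseService.js does):

```javascript
// Strips an optional markdown code fence (```json ... ```) before
// parsing, and throws a descriptive error instead of a bare
// SyntaxError if the model still returned something unparseable.
const parseModelJson = (raw) => {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, '')
    .replace(/\s*```$/, '');
  try {
    return JSON.parse(cleaned);
  } catch (err) {
    throw new Error(`Model returned non-JSON output: ${err.message}`);
  }
};
```

The descriptive error also makes production logs far easier to read than a raw `Unexpected token` message.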
Vercel Deployment
Deploying an Express app to Vercel as a serverless function requires a vercel.json config:
```json
{
  "version": 2,
  "builds": [{ "src": "src/app.js", "use": "@vercel/node" }],
  "routes": [{ "src": "/(.*)", "dest": "src/app.js" }]
}
```
One issue I hit was Swagger UI not rendering on Vercel. The swagger-ui-express package tries to serve static assets from the local filesystem which doesn't work in a serverless environment. The fix was to serve a simple HTML page that loads Swagger UI from the unpkg CDN instead:
```javascript
app.get('/docs', (req, res) => {
  res.send(`
    <!DOCTYPE html>
    <html>
      <head>
        <title>Condensare API Docs</title>
        <link rel="stylesheet" href="https://unpkg.com/swagger-ui-dist/swagger-ui.css" />
      </head>
      <body>
        <div id="swagger-ui"></div>
        <script src="https://unpkg.com/swagger-ui-dist/swagger-ui-bundle.js"></script>
        <script>
          SwaggerUIBundle({
            url: '/docs/spec',
            dom_id: '#swagger-ui'
          });
        </script>
      </body>
    </html>
  `);
});
```
Cleaner, faster, no dependency on local static files.
Environment Variables
```bash
# Set via Vercel CLI — never in code
vercel env add OPENAI_API_KEY
vercel env add RAPIDAPI_SECRET
vercel env add NODE_ENV
```
The RAPIDAPI_SECRET must exactly match the proxy secret shown in your RapidAPI provider dashboard under Security → Firewall Settings. No extra spaces, no modifications. This one caught me out — the secret in Vercel needs to be a fresh paste from the RapidAPI dashboard, not typed manually.
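One way to catch that class of mistake early is a fail-fast check at startup, before the first real request fails with a confusing 401. `assertEnv` here is a hypothetical helper, not part of the actual codebase:

```javascript
// Fail fast if a required env var is missing or whitespace-padded.
// A padded RAPIDAPI_SECRET will fail the exact-match comparison in
// the auth middleware and is miserable to debug from 401s alone.
const assertEnv = (names, env = process.env) => {
  for (const name of names) {
    const value = env[name];
    if (!value) throw new Error(`Missing required env var: ${name}`);
    if (value !== value.trim()) {
      throw new Error(`Env var ${name} has leading/trailing whitespace`);
    }
  }
};

// At startup, something like:
// assertEnv(['OPENAI_API_KEY', 'RAPIDAPI_SECRET']);
```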
Testing
Before listing on RapidAPI I built a Postman collection covering every layer:
| Test | Expected | Result |
|---|---|---|
| Health check | 200 OK | ✅ |
| No auth headers | 401 | ✅ |
| Wrong proxy secret | 401 | ✅ |
| FREE plan hits /condense | 403 | ✅ |
| FREE plan hits /parse | 200 | ✅ |
| PRO plan with file | 200 | ✅ |
| No file provided | 400 | ✅ |
| PRO full pipeline | 200 | ✅ |
| ULTRA with focus + goal | 200 | ✅ |
Every test passed locally, then every test passed again against the production Vercel URL. Only after that did I move to the RapidAPI listing.
To test the real proxy secret without going through RapidAPI, I called the deployment directly with curl.exe from PowerShell. One gotcha: PowerShell uses backticks for line continuation, not bash-style backslashes.

```powershell
curl.exe -X POST "https://your-deployment.vercel.app/api/v1/condense" `
  -H "X-RapidAPI-Proxy-Secret: your-proxy-secret" `
  -H "X-RapidAPI-Subscription: pro" `
  -F "file=@./document.pdf" `
  -F "tone=business" `
  -F "depth=standard" `
  -F "industry=ecommerce" `
  -F "scale=medium"
```
RapidAPI Listing Tips
A few things I learned setting up the provider listing:
Use OpenAPI import. Write a full openapi.yaml spec and import it — this populates all endpoint documentation automatically rather than configuring everything manually field by field.
Feature toggles matter. The monetisation section lets you toggle which features appear on each plan tier in the pricing table. Setting this up properly makes the pricing page significantly more convincing because subscribers can see exactly what they get at each level.
Proxy secret != API key. These are two different things. The API key is what subscribers use to identify themselves. The proxy secret is what RapidAPI adds to every request to prove it came through their gateway. Your server validates the proxy secret, not the API key.
Test before going public. RapidAPI's built-in testing tool can be unreliable. Test using curl against your production URL with the real proxy secret before making the API public.
Pricing
Pricing an API is harder than pricing SaaS because every request has a real cost. OpenAI API calls are not free, so the pricing needs to cover costs at scale while staying competitive.
| Plan | Price | Requests/Day | Key Features |
|---|---|---|---|
| FREE | $0/mo | 20 | Parse endpoint only |
| PRO | $9.99/mo | 500 | Full AI pipeline, brief depth |
| MEGA | $29.99/mo | 2,000 | Standard + detailed depth |
| ULTRA | $79.99/mo | 10,000 | Focus + goal params, priority processing |
The FREE tier exists purely to get developers into the funnel. Let them test the parse endpoint, see the response quality, understand the API structure — then upgrade when they want the AI pipeline.
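Those tiers can be stress-tested with a back-of-envelope cost model. Everything below is an illustrative assumption, one cent of OpenAI cost per request and every subscriber maxing their daily quota, not Condensare's actual unit economics:

```javascript
// Worst-case monthly OpenAI spend per tier: every subscriber exhausts
// their daily quota for 30 days. One cent per request is an assumed
// figure for illustration, not a measured cost. Working in cents
// keeps the arithmetic exact.
const COST_CENTS_PER_REQUEST = 1;

const worstCaseMonthlyCostUsd = (requestsPerDay, days = 30) =>
  (requestsPerDay * days * COST_CENTS_PER_REQUEST) / 100;
```

Under these assumptions a maxed-out PRO subscriber (500 requests/day) would cost $150/month against a $9.99 price. In practice most subscribers use a fraction of their quota, but the model makes it obvious that per-request AI cost, not server cost, is the number to watch as volume grows.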
What I'd Do Differently
Prompt engineering earlier. Getting consistent structured JSON output needed more iteration than expected. Starting with the prompt structure before writing any route handlers would have saved time.
Structured logging from day one. Vercel's function logs are fine for debugging but structured request logging with plan tier, processing time and token usage per request would make cost optimisation much easier as volume grows.
OpenAPI spec first. Writing the OpenAPI spec before building the API forces you to think about request/response structure upfront. I wrote it after, which meant going back to adjust a few things.
What's Next
- Monitor usage patterns to validate pricing tiers
- Add PowerPoint (.pptx) and HTML file support
- Build a simple web UI on top of the API for non-developers
- Set up structured logging and cost dashboards
Final Thoughts
The whole thing — API, deployment, testing, RapidAPI listing — took under a week of focused work. The hardest parts were prompt engineering for consistent structured output and getting the Vercel serverless deployment working correctly with file uploads.
If you're thinking about building an API product, the RapidAPI marketplace is a genuinely good distribution channel. The tooling is mature, the developer audience is large and the monetisation infrastructure is all handled for you.
The full API is live on RapidAPI now. If you're building something that needs document processing or AI summarisation, give it a try.
Built with Node.js, Express, OpenAI GPT-4o and Vercel. Available on RapidAPI across four pricing tiers starting at free.