I Built and Launched an AI Document API in Under a Week — Here's Exactly How I Did It
TL;DR: I built Condensare — an AI-powered document processing API that turns any uploaded file into structured notes and contextualised implementation suggestions. It's live on RapidAPI. This is the full breakdown of how I built it, deployed it, tested it and shipped it.
The Problem
A few weeks ago I had a simple frustration.
I was uploading documents to various AI tools trying to get useful summaries and actionable insights, and every single one felt generic. The output didn't know what industry I was in, what scale my business operated at, or what I actually wanted to do with the information.
It just summarised. That was it.
So I built Condensare — a document condensing and AI suggestion API that takes any uploaded file and returns structured notes plus contextualised implementation suggestions.
The Stack
I kept it simple and production-focused:
| Layer | Technology |
|---|---|
| API Server | Node.js + Express |
| AI | OpenAI GPT-4o |
| Deployment | Vercel (serverless) |
| Marketplace | RapidAPI |
| File Uploads | Multer |
| Document Parsing | pdf-parse, mammoth, xlsx |
Vercel serverless was the right call for an API marketplace product — zero infrastructure management, global distribution out of the box, automatic scaling. You don't want to be managing servers when you're trying to get to market fast.
Project Structure
```
condensare-api/
├── src/
│   ├── app.js                  # Express entry point
│   ├── routes/
│   │   └── condense.js         # Route definitions
│   ├── middleware/
│   │   └── authenticate.js     # Auth + plan management
│   ├── services/
│   │   ├── parseService.js     # File parsing logic
│   │   ├── condenseService.js  # AI condensing logic
│   │   └── suggestService.js   # AI suggestions logic
│   └── utils/
│       └── fileUtils.js        # File validation helpers
├── vercel.json
└── package.json
```
The Two Endpoints
POST /api/v1/condense/parse
Available on: FREE, PRO, MEGA, ULTRA
The lightweight endpoint. Upload any supported file and get back extracted raw text plus metadata — word count, character count, MIME type, a unique request ID and processing time. No AI involved, just clean text extraction.
Example response:
```json
{
  "success": true,
  "data": {
    "id": "c934e6f2-a2d8-416e-8782-c40e4e5eb8d3",
    "filename": "report.pdf",
    "mimeType": "application/pdf",
    "characterCount": 9786,
    "wordCount": 1438,
    "extractedText": "Full extracted text content...",
    "parsedAt": "2026-02-24T10:00:00.000Z",
    "processingTimeMs": 385
  }
}
```
POST /api/v1/condense
Available on: PRO, MEGA, ULTRA
The full AI pipeline. File gets parsed, text gets passed to GPT-4o with a carefully engineered prompt, and you get back two structured objects:
- Condensed Notes — title, summary, key points, organised sections, keywords
- AI Suggestions — prioritised recommendations, quick wins, long term goals, risk warnings
Example response:
```json
{
  "success": true,
  "data": {
    "id": "2f05d51f-1469-4399-8d93-2005f6215028",
    "plan": "pro",
    "priorityProcessing": false,
    "processingTimeMs": 11641,
    "condensedNotes": {
      "title": "Comprehensive Guide to Setting Up a Project",
      "summary": "Step-by-step guide covering project setup...",
      "keyPoints": [
        "Create a project directory and navigate to it.",
        "Install Node.js and initialise a React project.",
        "Connect the project to a GitHub repository."
      ],
      "sections": [
        {
          "heading": "Setup",
          "content": "Instructions for creating the project directory."
        }
      ],
      "keywords": ["React", "Firebase", "Node.js"],
      "tone": "business",
      "depth": "standard"
    },
    "suggestions": {
      "suggestions": [
        {
          "title": "Integrate AI Features",
          "description": "Leverage AI to enhance user experience.",
          "priority": "high",
          "effort": "medium",
          "impact": "high"
        }
      ],
      "quickWins": ["Set up a GitHub repo immediately."],
      "longTermGoals": ["Build a CI/CD pipeline."],
      "warnings": ["Monitor Firebase costs as usage scales."]
    }
  }
}
```
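Because /condense is PRO and up, a FREE key has to be bounced before any file parsing or OpenAI spend happens. A minimal gate might look like this, assuming the authenticate middleware (shown later) has already attached `req.planLimits`. This is my sketch, not necessarily the real middleware:

```javascript
// Hypothetical plan gate: rejects any plan whose limits don't include
// AI suggestions, e.g. a FREE key hitting /condense. Assumes
// authenticate() has already run and set req.planLimits.
const requireAiPlan = (req, res, next) => {
  if (!req.planLimits || !req.planLimits.aiSuggestions) {
    return res.status(403).json({
      success: false,
      error: 'Your plan does not include the AI pipeline. Upgrade to PRO or higher.',
      code: 403
    });
  }
  next();
};
```

Rejecting before the file is parsed matters: a 403 should never cost you an OpenAI call.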
File Parsing
Supporting multiple file types meant handling each format differently under the hood:
```javascript
const parseFile = async (file) => {
  const { mimetype, buffer } = file;

  switch (mimetype) {
    case 'application/pdf':
      return await parsePdf(buffer);
    case 'application/vnd.openxmlformats-officedocument.wordprocessingml.document':
      return await parseDocs(buffer);
    case 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet':
      return await parseXlsx(buffer);
    case 'text/plain':
    case 'text/markdown':
    case 'text/csv':
      return buffer.toString('utf-8');
    default:
      throw new Error(`Unsupported file type: ${mimetype}`);
  }
};
```
Each parser returns normalised plain text. From the AI's perspective it just sees text — the complexity of handling different formats is completely abstracted away.
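The post doesn't show fileUtils.js, but a validation helper in that spirit might look like the following. The 10 MB cap is an assumed number for illustration, not a documented limit:

```javascript
// Hypothetical validation helper. Checks the Multer file object's
// MIME type against the supported list and enforces an assumed size
// cap before any parsing happens.
const SUPPORTED_MIME_TYPES = new Set([
  'application/pdf',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  'text/plain',
  'text/markdown',
  'text/csv'
]);
const MAX_FILE_BYTES = 10 * 1024 * 1024; // assumed limit

const validateFile = (file) => {
  if (!file) return { valid: false, error: 'No file provided.' };
  if (!SUPPORTED_MIME_TYPES.has(file.mimetype)) {
    return { valid: false, error: `Unsupported file type: ${file.mimetype}` };
  }
  if (file.size > MAX_FILE_BYTES) {
    return { valid: false, error: 'File exceeds the maximum upload size.' };
  }
  return { valid: true };
};
```

Validating before parsing is what lets the "No file provided" test case return a clean 400 instead of a parser crash.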
Authentication and Plan Management
Authentication works through RapidAPI's proxy secret mechanism. Every request through the RapidAPI gateway includes an X-RapidAPI-Proxy-Secret header proving it came through their infrastructure.
Plan management works through the X-RapidAPI-Subscription header which RapidAPI appends automatically:
```javascript
const authenticate = (req, res, next) => {
  const proxySecret = req.headers['x-rapidapi-proxy-secret'];
  const expectedSecret = process.env.RAPIDAPI_SECRET;

  // Validate proxy secret
  if (!proxySecret || proxySecret !== expectedSecret) {
    return res.status(401).json({
      success: false,
      error: 'Unauthorised. Invalid or missing API credentials.',
      code: 401
    });
  }

  // Detect plan from subscription header
  const subscriptionHeader = req.headers['x-rapidapi-subscription'];
  const plan = VALID_PLANS.includes(subscriptionHeader?.toLowerCase())
    ? subscriptionHeader.toLowerCase()
    : 'free';

  // Attach plan and limits to request
  req.plan = plan;
  req.planLimits = PLAN_LIMITS[plan];
  next();
};
```
The four tiers and their limits:
```javascript
const PLAN_LIMITS = {
  free:  { requestsPerDay: 20,    requestsPerMinute: 5,   aiSuggestions: false },
  pro:   { requestsPerDay: 500,   requestsPerMinute: 30,  aiSuggestions: true },
  mega:  { requestsPerDay: 2000,  requestsPerMinute: 60,  aiSuggestions: true },
  ultra: { requestsPerDay: 10000, requestsPerMinute: 100, aiSuggestions: true, priorityProcessing: true }
};
```
Prompt Engineering
Getting consistently structured JSON output from GPT-4o took more iteration than anything else in the project. Early prompts returned well-written prose but in formats that varied between requests — inconsistent field names, missing sections, sometimes prose instead of JSON.
The solution was to be extremely explicit:
```javascript
const buildCondensePrompt = ({ text, tone, depth, industry, scale, focus, goal }) => `
You are a document analysis AI. Analyse the following document and return a JSON object.

IMPORTANT: Return ONLY valid JSON. No text before or after. No markdown code blocks.

Document:
"""
${text}
"""

Parameters:
- Tone: ${tone}
- Depth: ${depth}
- Industry: ${industry || 'general'}
- Scale: ${scale || 'medium'}
${focus ? `- Focus: ${focus}` : ''}
${goal ? `- Goal: ${goal}` : ''}

Return this exact structure:
{
  "condensedNotes": {
    "title": "string",
    "summary": "string",
    "keyPoints": ["string"],
    "sections": [{ "heading": "string", "content": "string" }],
    "keywords": ["string"],
    "tone": "${tone}",
    "depth": "${depth}"
  },
  "suggestions": {
    "suggestions": [{ "title": "string", "description": "string", "priority": "high|medium|low", "effort": "high|medium|low", "impact": "high|medium|low" }],
    "quickWins": ["string"],
    "longTermGoals": ["string"],
    "warnings": ["string"]
  }
}
`;
```
The strict structure instruction plus showing the exact JSON shape expected eliminated the inconsistency completely.
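Even so, I'd keep a defensive unwrap in front of `JSON.parse`, since models occasionally reintroduce a markdown fence despite instructions. A small guard like this is cheap insurance (my own sketch, not necessarily what condenseService.js does):

```javascript
// Strips an optional markdown code fence (```json ... ```) before
// parsing, and throws a descriptive error instead of a bare
// SyntaxError if the model still returned something unparseable.
const parseModelJson = (raw) => {
  const cleaned = raw
    .trim()
    .replace(/^```(?:json)?\s*/i, '')
    .replace(/\s*```$/, '');
  try {
    return JSON.parse(cleaned);
  } catch (err) {
    throw new Error(`Model returned non-JSON output: ${err.message}`);
  }
};
```

The descriptive error also makes production logs far easier to read than a raw `Unexpected token` message.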
Vercel Deployment
Deploying an Express app to Vercel as a serverless function requires a vercel.json config:
```json
{
  "version": 2,
  "builds": [{ "src": "src/app.js", "use": "@vercel/node" }],
  "routes": [{ "src": "/(.*)", "dest": "src/app.js" }]
}
```
One issue I hit was Swagger UI not rendering on Vercel. The swagger-ui-express package tries to serve static assets from the local filesystem which doesn't work in a serverless environment. The fix was to serve a simple HTML page that loads Swagger UI from the unpkg CDN instead:
```javascript
app.get('/docs', (req, res) => {
  res.send(`
    <!DOCTYPE html>
    <html>
      <head>
        <title>Condensare API Docs</title>
        <link rel="stylesheet" href="https://unpkg.com/swagger-ui-dist/swagger-ui.css" />
      </head>
      <body>
        <div id="swagger-ui"></div>
        <script src="https://unpkg.com/swagger-ui-dist/swagger-ui-bundle.js"></script>
        <script>
          SwaggerUIBundle({
            url: '/docs/spec',
            dom_id: '#swagger-ui'
          });
        </script>
      </body>
    </html>
  `);
});
```
Cleaner, faster, no dependency on local static files.
Environment Variables
```bash
# Set via Vercel CLI — never in code
vercel env add OPENAI_API_KEY
vercel env add RAPIDAPI_SECRET
vercel env add NODE_ENV
```
The RAPIDAPI_SECRET must exactly match the proxy secret shown in your RapidAPI provider dashboard under Security → Firewall Settings. No extra spaces, no modifications. This one caught me out — the secret in Vercel needs to be a fresh paste from the RapidAPI dashboard, not typed manually.
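One way to catch that class of mistake early is a fail-fast check at startup, before the first real request fails with a confusing 401. `assertEnv` here is a hypothetical helper, not part of the actual codebase:

```javascript
// Fail fast if a required env var is missing or whitespace-padded.
// A padded RAPIDAPI_SECRET will fail the exact-match comparison in
// the auth middleware and is miserable to debug from 401s alone.
const assertEnv = (names, env = process.env) => {
  for (const name of names) {
    const value = env[name];
    if (!value) throw new Error(`Missing required env var: ${name}`);
    if (value !== value.trim()) {
      throw new Error(`Env var ${name} has leading/trailing whitespace`);
    }
  }
};

// At startup, something like:
// assertEnv(['OPENAI_API_KEY', 'RAPIDAPI_SECRET']);
```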
Testing
Before listing on RapidAPI I built a Postman collection covering every layer:
| Test | Expected | Result |
|---|---|---|
| Health check | 200 OK | ✅ |
| No auth headers | 401 | ✅ |
| Wrong proxy secret | 401 | ✅ |
| FREE plan hits /condense | 403 | ✅ |
| FREE plan hits /parse | 200 | ✅ |
| PRO plan with file | 200 | ✅ |
| No file provided | 400 | ✅ |
| PRO full pipeline | 200 | ✅ |
| ULTRA with focus + goal | 200 | ✅ |
Every test passed locally, then every test passed again against the production Vercel URL. Only after that did I move to the RapidAPI listing.
To test the real proxy secret without going through RapidAPI, I called the deployment directly with curl.exe from PowerShell. One gotcha: PowerShell uses backticks for line continuation, not bash-style backslashes.

```powershell
curl.exe -X POST "https://your-deployment.vercel.app/api/v1/condense" `
  -H "X-RapidAPI-Proxy-Secret: your-proxy-secret" `
  -H "X-RapidAPI-Subscription: pro" `
  -F "file=@./document.pdf" `
  -F "tone=business" `
  -F "depth=standard" `
  -F "industry=ecommerce" `
  -F "scale=medium"
```
RapidAPI Listing Tips
A few things I learned setting up the provider listing:
Use OpenAPI import. Write a full openapi.yaml spec and import it — this populates all endpoint documentation automatically rather than configuring everything manually field by field.
Feature toggles matter. The monetisation section lets you toggle which features appear on each plan tier in the pricing table. Setting this up properly makes the pricing page significantly more convincing because subscribers can see exactly what they get at each level.
Proxy secret != API key. These are two different things. The API key is what subscribers use to identify themselves. The proxy secret is what RapidAPI adds to every request to prove it came through their gateway. Your server validates the proxy secret, not the API key.
Test before going public. RapidAPI's built-in testing tool can be unreliable. Test using curl against your production URL with the real proxy secret before making the API public.
Pricing
Pricing an API is harder than pricing SaaS because every request has a real cost. OpenAI API calls are not free, so the pricing needs to cover costs at scale while staying competitive.
| Plan | Price | Requests/Day | Key Features |
|---|---|---|---|
| FREE | $0/mo | 20 | Parse endpoint only |
| PRO | $9.99/mo | 500 | Full AI pipeline, brief depth |
| MEGA | $29.99/mo | 2,000 | Standard + detailed depth |
| ULTRA | $79.99/mo | 10,000 | Focus + goal params, priority processing |
The FREE tier exists purely to get developers into the funnel. Let them test the parse endpoint, see the response quality, understand the API structure — then upgrade when they want the AI pipeline.
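Those tiers can be stress-tested with a back-of-envelope cost model. Everything below is an illustrative assumption, one cent of OpenAI cost per request and every subscriber maxing their daily quota, not Condensare's actual unit economics:

```javascript
// Worst-case monthly OpenAI spend per tier: every subscriber exhausts
// their daily quota for 30 days. One cent per request is an assumed
// figure for illustration, not a measured cost. Working in cents
// keeps the arithmetic exact.
const COST_CENTS_PER_REQUEST = 1;

const worstCaseMonthlyCostUsd = (requestsPerDay, days = 30) =>
  (requestsPerDay * days * COST_CENTS_PER_REQUEST) / 100;
```

Under these assumptions a maxed-out PRO subscriber (500 requests/day) would cost $150/month against a $9.99 price. In practice most subscribers use a fraction of their quota, but the model makes it obvious that per-request AI cost, not server cost, is the number to watch as volume grows.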
What I'd Do Differently
Prompt engineering earlier. Getting consistent structured JSON output needed more iteration than expected. Starting with the prompt structure before writing any route handlers would have saved time.
Structured logging from day one. Vercel's function logs are fine for debugging but structured request logging with plan tier, processing time and token usage per request would make cost optimisation much easier as volume grows.
OpenAPI spec first. Writing the OpenAPI spec before building the API forces you to think about request/response structure upfront. I wrote it after, which meant going back to adjust a few things.
What's Next
- Monitor usage patterns to validate pricing tiers
- Add PowerPoint (.pptx) and HTML file support
- Build a simple web UI on top of the API for non-developers
- Set up structured logging and cost dashboards
Final Thoughts
The whole thing — API, deployment, testing, RapidAPI listing — took under a week of focused work. The hardest parts were prompt engineering for consistent structured output and getting the Vercel serverless deployment working correctly with file uploads.
If you're thinking about building an API product, the RapidAPI marketplace is a genuinely good distribution channel. The tooling is mature, the developer audience is large and the monetisation infrastructure is all handled for you.
The full API is live on RapidAPI now. If you're building something that needs document processing or AI summarisation, give it a try.
Built with Node.js, Express, OpenAI GPT-4o and Vercel. Available on RapidAPI across four pricing tiers starting at free.