I Built an API That Parses Any Contract into Structured JSON
Every company deals with contracts. NDAs, employment agreements, leases, SaaS terms — they all contain critical data buried in unstructured text. Extracting that data manually is slow, expensive, and error-prone.
So I built Clausify — an API that takes any contract document and returns clean, structured JSON with all the key fields extracted.
What It Does
Upload a contract (PDF, Word, scanned image, or text file), and Clausify returns structured data like this:
{
  "parties": [
    {
      "name": "Acme Corporation",
      "address": "123 Business Ave, New York, NY",
      "representative": "John Smith, CEO"
    },
    {
      "name": "Beta Technologies LLC",
      "address": "456 Tech Road, San Francisco, CA",
      "representative": "Jane Doe, CTO"
    }
  ],
  "effective_date": "2024-01-15",
  "duration": "3 years",
  "obligations_of_receiving_party": "Shall not disclose any Confidential Information to third parties without prior written consent.",
  "governing_law": "State of New York"
}
One API call. No ML pipeline to set up. No training data needed.
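If you want typed access instead of raw dictionaries, here is a minimal sketch that maps the sample response above onto dataclasses. The class names (`Party`, `NdaExtraction`) and the helper `parse_nda` are hypothetical, not part of Clausify, and the sketch assumes the payload has exactly the shape shown in the sample; missing keys simply become `None`.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Party:
    name: str
    address: Optional[str] = None
    representative: Optional[str] = None


@dataclass
class NdaExtraction:
    parties: List[Party]
    effective_date: Optional[str]
    duration: Optional[str]
    governing_law: Optional[str]


def parse_nda(payload: dict) -> NdaExtraction:
    """Map the sample response shape onto dataclasses (hypothetical helper)."""
    parties = [
        Party(
            name=p.get("name", ""),
            address=p.get("address"),
            representative=p.get("representative"),
        )
        for p in payload.get("parties", [])
    ]
    return NdaExtraction(
        parties=parties,
        effective_date=payload.get("effective_date"),
        duration=payload.get("duration"),
        governing_law=payload.get("governing_law"),
    )


# A trimmed copy of the sample response shown above
sample = json.loads("""
{
  "parties": [
    {"name": "Acme Corporation", "address": "123 Business Ave, New York, NY",
     "representative": "John Smith, CEO"},
    {"name": "Beta Technologies LLC", "address": "456 Tech Road, San Francisco, CA",
     "representative": "Jane Doe, CTO"}
  ],
  "effective_date": "2024-01-15",
  "duration": "3 years",
  "governing_law": "State of New York"
}
""")

nda = parse_nda(sample)
print([p.name for p in nda.parties], nda.effective_date)
```

Keeping every field optional is deliberate: extracted contracts often omit a clause, and a missing value is easier to handle downstream than a `KeyError`.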
How to Use It
1. Get Your API Key
Sign up on RapidAPI and subscribe to the free plan (20 requests/month).
2. Parse a Contract
cURL:
curl -X POST "https://clausify.p.rapidapi.com/v1/extractions" \
  -H "X-RapidAPI-Key: YOUR_API_KEY" \
  -H "X-RapidAPI-Host: clausify.p.rapidapi.com" \
  -F "file=@contract.pdf" \
  -F "contract_type=nda"
Python:
import requests

url = "https://clausify.p.rapidapi.com/v1/extractions"
headers = {
    "X-RapidAPI-Key": "YOUR_API_KEY",
    "X-RapidAPI-Host": "clausify.p.rapidapi.com",
}

# Send the file as multipart form data along with the contract type
with open("contract.pdf", "rb") as f:
    response = requests.post(
        url,
        headers=headers,
        files={"file": f},
        data={"contract_type": "nda"},
    )

response.raise_for_status()  # fail loudly on HTTP errors
data = response.json()
print(data["result"])
JavaScript:
import { readFile } from "node:fs/promises";

// Node 18+ ships fetch, FormData, and Blob; run as an ES module for top-level await
const form = new FormData();
form.append("file", new Blob([await readFile("contract.pdf")]), "contract.pdf");
form.append("contract_type", "nda");

const response = await fetch(
  "https://clausify.p.rapidapi.com/v1/extractions",
  {
    method: "POST",
    headers: {
      "X-RapidAPI-Key": "YOUR_API_KEY",
      "X-RapidAPI-Host": "clausify.p.rapidapi.com",
    },
    body: form,
  }
);
const data = await response.json();
console.log(data.result);
3. Choose Your Contract Type
Clausify has 6 built-in templates optimized for different contract types:
| Type | What It Extracts |
|---|---|
| `general` | Parties, dates, obligations, price, terms |
| `nda` | Confidential info definition, duration, exclusions |
| `employment` | Salary, benefits, probation, non-compete |
| `lease` | Rent, deposit, maintenance, renewal terms |
| `saas` | SLA, data handling, liability, auto-renewal |
| `procurement` | Goods, quantity, delivery, warranty, penalties |
Or pass custom fields to extract exactly what you need:
-F 'fields=["vendor_name", "payment_deadline", "penalty_clause"]'
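In Python, the same form fields can be built with a small helper. This is a sketch, not an official client: `build_form_fields` is a hypothetical name, and it assumes that `fields` is sent as a JSON-encoded string alongside `contract_type`, as the cURL fragment above suggests.

```python
import json
from typing import Dict, Iterable, Optional

# The six built-in template names from the table above
CONTRACT_TYPES = {"general", "nda", "employment", "lease", "saas", "procurement"}


def build_form_fields(
    contract_type: str, fields: Optional[Iterable[str]] = None
) -> Dict[str, str]:
    """Build the non-file form fields for an extraction request (hypothetical helper)."""
    if contract_type not in CONTRACT_TYPES:
        raise ValueError(f"unknown contract type: {contract_type!r}")
    form = {"contract_type": contract_type}
    if fields:
        # Assumption: the API expects a JSON array encoded as a string
        form["fields"] = json.dumps(list(fields))
    return form


form = build_form_fields(
    "procurement", ["vendor_name", "payment_deadline", "penalty_clause"]
)
print(form)
```

The returned dictionary can be passed as `data=form` to the `requests.post` call from step 2. Validating the type locally catches a typo before it costs one of your monthly requests.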
Use Cases
Legal Tech Platforms — Auto-populate case management systems with contract data.
HR Software — Parse employment contracts to extract salary, benefits, and start dates.
Real Estate Tools — Extract lease terms, rent amounts, and renewal dates from rental agreements.
Procurement Systems — Pull vendor info, pricing, and delivery dates from purchase orders.
Due Diligence — Batch-process hundreds of contracts during M&A and extract key terms.
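For the batch scenario, a simple loop that records failures without aborting the run is usually enough. The sketch below is one possible shape under stated assumptions: `batch_extract` is a hypothetical helper, and `extract` stands for whatever function wraps your API call (for example, the `requests.post` snippet from step 2). Here it is replaced by a fake so the example runs offline.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable, Tuple


def batch_extract(
    paths: Iterable[str], extract: Callable[[Path], dict]
) -> Tuple[Dict[str, dict], Dict[str, str]]:
    """Run `extract` on every path; return (results, failures) keyed by file name."""
    results: Dict[str, dict] = {}
    failures: Dict[str, str] = {}
    for path in map(Path, paths):
        try:
            results[path.name] = extract(path)
        except Exception as exc:  # keep going so one bad file does not stop the batch
            failures[path.name] = str(exc)
    return results, failures


def fake_extract(path: Path) -> dict:
    """Stand-in for the real API call; rejects anything that is not a PDF."""
    if path.suffix != ".pdf":
        raise ValueError("unsupported file type")
    return {"source": path.name}


results, failures = batch_extract(["a.pdf", "b.pdf", "notes.xyz"], fake_extract)
print(results)
print(failures)
```

Keep the monthly quota of your plan in mind when sizing a batch, since every file is one request.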
How I Built It
The tech stack is straightforward:
- FastAPI for the API framework
- GPT for intelligent extraction (not regex — it understands context)
- PyMuPDF for PDF text extraction
- python-docx for Word documents
- GPT Vision for scanned documents (built-in OCR)
- Railway for deployment
- RapidAPI for distribution and billing
The core logic is simple: parse the document into text, send it to GPT with a contract-type-specific prompt, and return structured JSON. The magic is in the prompt engineering — each contract type has an optimized extraction template.
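The three steps described above (document to text, type-specific prompt, structured JSON) can be sketched roughly as follows. This is not Clausify's actual source: the prompt strings, function names, and the `complete` callable are all assumptions for illustration. The model call is injected as a plain function so the sketch stays independent of any particular GPT client, and the PyMuPDF and python-docx imports are deferred so the module loads even when they are not installed.

```python
import json
from pathlib import Path
from typing import Callable, Dict

# Hypothetical prompt templates; the real ones are not published
PROMPTS: Dict[str, str] = {
    "general": "Extract parties, dates, obligations, price, and terms.",
    "nda": "Extract the confidential information definition, duration, and exclusions.",
}


def pdf_to_text(path: str) -> str:
    import fitz  # PyMuPDF

    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def docx_to_text(path: str) -> str:
    import docx  # python-docx

    return "\n".join(p.text for p in docx.Document(path).paragraphs)


def document_to_text(path: str) -> str:
    """Step 1: pick a text extractor based on the file extension."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        return pdf_to_text(path)
    if suffix == ".docx":
        return docx_to_text(path)
    if suffix == ".txt":
        return Path(path).read_text(encoding="utf-8")
    raise ValueError(f"unsupported file type: {suffix}")


def build_prompt(text: str, contract_type: str) -> str:
    """Step 2: combine the type-specific instructions with the contract text."""
    instructions = PROMPTS.get(contract_type, PROMPTS["general"])
    return f"{instructions}\nRespond with a single JSON object only.\n\nCONTRACT:\n{text}"


def extract(text: str, contract_type: str, complete: Callable[[str], str]) -> dict:
    """Step 3: call the model (injected as `complete`) and parse its JSON reply."""
    return json.loads(complete(build_prompt(text, contract_type)))


def fake_model(prompt: str) -> str:
    """Offline stand-in for the model call."""
    return '{"duration": "3 years"}'


print(extract("This Agreement lasts three years.", "nda", fake_model))
```

In a real service, `complete` would wrap the GPT client, and scanned images would take a separate vision path that this sketch leaves out.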
Pricing
| Plan | Price | Requests |
|---|---|---|
| Basic | Free | 20/month |
| Pro | $9.99/mo | 200/month |
| Ultra | $29.99/mo | 800/month |
| Mega | $69.99/mo | 3000/month |
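To compare plans by effective cost per request, divide the monthly price by the quota. The sketch below does that arithmetic and picks the cheapest plan that covers a given volume. It assumes the quotas are hard monthly limits with no overage pricing, which the table does not state; `cheapest_plan` is a hypothetical helper.

```python
from typing import Optional

# (name, monthly price in USD, requests per month), copied from the table above
PLANS = [
    ("Basic", 0.00, 20),
    ("Pro", 9.99, 200),
    ("Ultra", 29.99, 800),
    ("Mega", 69.99, 3000),
]


def cheapest_plan(requests_per_month: int) -> Optional[str]:
    """Return the lowest-priced plan whose quota covers the volume, or None."""
    candidates = [
        (price, name) for name, price, quota in PLANS if quota >= requests_per_month
    ]
    return min(candidates)[1] if candidates else None


for name, price, quota in PLANS:
    print(f"{name}: ${price / quota:.4f} per request")

print(cheapest_plan(150))
```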
Try It
The free tier gives you 20 requests per month to test with your own contracts.
I'd love to hear your feedback — what contract types or features would be most useful for your workflow? Drop a comment below.