Your business runs on documents. PDFs, CSVs, images, text files. Now AI can understand them all through a single TypeScript API.
The challenge? Each format requires different parsing. PDFs need visual analysis. CSVs need tabular understanding. Different APIs. Different libraries. Different headaches.
NeuroLink solves this with a unified multimodal API. One interface handles every format. Auto-detection identifies file types. Smart routing selects the right provider. You write one code path for all documents.
TL;DR
- Process PDFs natively (no OCR needed)
- Analyze CSV data with AI insights
- One API call for any document type
- Full TypeScript type safety
- Production-ready pipeline patterns
Read on for the complete guide...
Table of Contents
- Why Unified Document Processing Matters
- PDF Processing
- CSV Data Analysis
- Cross-Format Analysis
- Production Patterns
- Provider Comparison
Why Unified Document Processing Matters
Traditional document AI requires juggling multiple tools:
| Document Type | Traditional Approach | Problems |
|---|---|---|
| PDF | pdf-parse + OCR + separate API | Loses visual context, slow |
| CSV | csv-parser + custom formatting | No semantic understanding |
| Images | sharp + vision API | Separate pipeline |
| Text | fs.readFile + custom parsing | No intelligence |
NeuroLink replaces this complexity with one API call:
import { NeuroLink } from "@juspay/neurolink";
const ai = new NeuroLink();
// Process ANY document type
const result = await ai.generate({
  input: {
    text: "Analyze this document and extract key insights",
    files: ["report.pdf", "data.csv"]
  }
});
The SDK handles everything:
- Format Detection: Magic bytes identify file type accurately
- Provider Selection: Routes PDFs to vision-capable providers
- Text Optimization: Formats tabular data for LLM consumption
- Error Handling: Graceful fallbacks for edge cases
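To make the format-detection step concrete, here is a minimal magic-byte check in TypeScript. This is an illustrative sketch, not NeuroLink's internal code; the signature names (`detectFormat`, the literal return union) are invented for this example.

```typescript
// Sketch of magic-byte file-type detection: the first few bytes of a file
// identify its format regardless of the file extension.
// Illustrative only -- not NeuroLink's actual implementation.
function detectFormat(bytes: Uint8Array): "pdf" | "png" | "jpeg" | "unknown" {
  // PDF files begin with "%PDF" (0x25 0x50 0x44 0x46)
  if (bytes[0] === 0x25 && bytes[1] === 0x50 && bytes[2] === 0x44 && bytes[3] === 0x46) {
    return "pdf";
  }
  // PNG files begin with 0x89 "PNG"
  if (bytes[0] === 0x89 && bytes[1] === 0x50 && bytes[2] === 0x4e && bytes[3] === 0x47) {
    return "png";
  }
  // JPEG files begin with 0xFF 0xD8 0xFF
  if (bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff) {
    return "jpeg";
  }
  return "unknown";
}
```

Content-based detection like this is why a mislabeled file (say, a PDF renamed to `.txt`) can still be routed to the right pipeline.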
PDF Processing
PDFs are the workhorse of business documents. NeuroLink processes them natively, preserving visual context that OCR-based approaches lose.
Why Native PDF Matters
Traditional PDF processing converts to text, destroying valuable information:
| Approach | Charts | Tables | Images | Layout |
|---|---|---|---|---|
| OCR-based | Lost | Partial | Lost | Lost |
| Text extraction | Lost | Lost | Lost | Lost |
| Native (NeuroLink) | Preserved | Preserved | Analyzed | Understood |
Native processing sends the PDF directly to vision-capable models. The AI sees exactly what humans see.
Basic PDF Analysis
import { NeuroLink } from "@juspay/neurolink";
const ai = new NeuroLink();
const result = await ai.generate({
  input: {
    text: "What is the total revenue in this financial report?",
    pdfFiles: ["quarterly-report.pdf"]
  },
  provider: "vertex",
  maxTokens: 1000
});
console.log(result.content);
// "The Q3 2025 report shows total revenue of $42.3M..."
Structured Data Extraction with Schema
Extract structured JSON from unstructured PDFs using schema enforcement:
const invoice = await ai.generate({
  input: {
    text: "Extract invoice details in JSON format",
    pdfFiles: ["invoice.pdf"]
  },
  provider: "anthropic",
  schema: {
    type: "object",
    properties: {
      vendor: { type: "string" },
      invoiceNumber: { type: "string" },
      date: { type: "string" },
      lineItems: {
        type: "array",
        items: {
          type: "object",
          properties: {
            description: { type: "string" },
            quantity: { type: "number" },
            unitPrice: { type: "number" },
            total: { type: "number" }
          }
        }
      },
      subtotal: { type: "number" },
      tax: { type: "number" },
      total: { type: "number" }
    }
  },
  output: { format: "json" }
});
console.log(JSON.parse(invoice.content));
// { vendor: "Acme Corp", invoiceNumber: "INV-2025-001", ... }
Schema enforcement guarantees the output structure. No more parsing inconsistent responses.
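If you want a belt-and-braces check on the returned JSON before it enters your system, a minimal runtime type guard for the invoice shape might look like this. This is an illustrative sketch (in practice a library such as Zod or Ajv is a more robust choice); the `Invoice` and `LineItem` interfaces mirror the schema above.

```typescript
// Minimal runtime shape check for the extracted invoice JSON.
// Illustrative only -- a production app would likely use Zod or Ajv.
interface LineItem {
  description: string;
  quantity: number;
  unitPrice: number;
  total: number;
}

interface Invoice {
  vendor: string;
  invoiceNumber: string;
  date: string;
  lineItems: LineItem[];
  subtotal: number;
  tax: number;
  total: number;
}

function isInvoice(value: unknown): value is Invoice {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  // Top-level string and number fields
  if (!["vendor", "invoiceNumber", "date"].every((k) => typeof v[k] === "string")) return false;
  if (!["subtotal", "tax", "total"].every((k) => typeof v[k] === "number")) return false;
  // Every line item must carry the four expected fields
  if (!Array.isArray(v.lineItems)) return false;
  return v.lineItems.every(
    (item) =>
      typeof item === "object" && item !== null &&
      typeof (item as LineItem).description === "string" &&
      typeof (item as LineItem).quantity === "number" &&
      typeof (item as LineItem).unitPrice === "number" &&
      typeof (item as LineItem).total === "number"
  );
}
```

A guard like this narrows the parsed value to `Invoice` for the TypeScript compiler, so downstream code gets full type safety on the extracted data.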
Multi-PDF Comparison
Compare multiple documents in a single request:
const comparison = await ai.generate({
  input: {
    text: "Compare Q1 and Q2 reports. What changed in revenue and expenses?",
    pdfFiles: ["q1-report.pdf", "q2-report.pdf"]
  },
  provider: "vertex",
  maxTokens: 2000
});
CLI PDF Commands
Process PDFs directly from the command line:
# Basic PDF analysis
npx @juspay/neurolink generate "Summarize this contract" \
--pdf contract.pdf \
--provider vertex
# Multiple PDFs
npx @juspay/neurolink generate "Compare these invoices" \
--pdf invoice1.pdf \
--pdf invoice2.pdf \
--provider anthropic
CSV Data Analysis
CSV files contain the data that drives decisions. NeuroLink transforms raw data into actionable insights.
Basic CSV Analysis
const insights = await ai.generate({
  input: {
    text: "What are the key trends in this sales data? Identify top performers.",
    csvFiles: ["sales-2024.csv"]
  }
});
console.log(insights.content);
// "Key trends from your sales data:
// 1. Q4 showed strongest growth at 23% MoM
// 2. Top performer: Sarah Chen ($2.3M total)..."
Advanced CSV Options
Control how CSV data is processed:
const analysis = await ai.generate({
  input: {
    text: "Identify the top 10 customers by total revenue",
    csvFiles: ["customers.csv"]
  },
  csvOptions: {
    maxRows: 1000,
    formatStyle: "markdown",
    includeHeaders: true
  }
});
For large files, maxRows caps how many rows are sent to the model, preventing token overflow while still giving the model a representative sample of the data.
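The row cap and markdown formatting can be pictured with a small sketch. This is illustrative only, not NeuroLink's internals; the `toMarkdownTable` helper and its parameters are invented for this example.

```typescript
// Sketch: format parsed CSV rows as a markdown table for LLM consumption,
// capped at maxRows. Illustrative only -- not NeuroLink's actual code.
function toMarkdownTable(headers: string[], rows: string[][], maxRows: number): string {
  const capped = rows.slice(0, maxRows);
  const lines = [
    `| ${headers.join(" | ")} |`,
    `| ${headers.map(() => "---").join(" | ")} |`,
    ...capped.map((row) => `| ${row.join(" | ")} |`),
  ];
  // Tell the model when rows were dropped so it doesn't assume completeness
  if (rows.length > maxRows) {
    lines.push(`(${rows.length - maxRows} more row(s) omitted)`);
  }
  return lines.join("\n");
}
```

Markdown tables are a good default for LLM input because most models have seen them extensively in training and parse column alignment reliably.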
CLI CSV Commands
# Analyze CSV data
npx @juspay/neurolink generate "Find trends" --csv sales.csv
# Multiple CSVs
npx @juspay/neurolink generate "Compare datasets" --csv q1.csv --csv q2.csv
# With options
npx @juspay/neurolink generate "Summarize top rows" \
--csv large-data.csv \
--csv-max-rows 500
Cross-Format Analysis
One of NeuroLink's most powerful features is cross-referencing data across formats:
const verification = await ai.generate({
  input: {
    text: "Does the transaction data in the CSV match the totals in the PDF report?",
    files: [
      "transactions.csv",
      "monthly-report.pdf"
    ]
  },
  provider: "vertex"
});
NeuroLink's auto-detection handles mixed formats seamlessly. Pass any combination of supported files and the SDK routes each to the appropriate processing pipeline.
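Conceptually, the routing step groups a mixed file list by pipeline. Here is an extension-based sketch; it is illustrative only (NeuroLink's real detection also verifies file types by content, as noted earlier), and the `Pipeline` type and `routeFiles` function are invented for this example.

```typescript
// Sketch: group a mixed file list by processing pipeline based on extension.
// Illustrative only -- real detection would also check magic bytes.
type Pipeline = "pdf" | "csv" | "image" | "text";

function routeFiles(files: string[]): Record<Pipeline, string[]> {
  const routes: Record<Pipeline, string[]> = { pdf: [], csv: [], image: [], text: [] };
  for (const file of files) {
    const ext = file.slice(file.lastIndexOf(".")).toLowerCase();
    if (ext === ".pdf") routes.pdf.push(file);
    else if (ext === ".csv") routes.csv.push(file);
    else if ([".png", ".jpg", ".jpeg", ".webp"].includes(ext)) routes.image.push(file);
    else routes.text.push(file); // Fall back to plain-text handling
  }
  return routes;
}
```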
Production Patterns
Real-world document processing requires robust patterns for scale and reliability.
Document Processing Pipeline
import { NeuroLink } from "@juspay/neurolink";
import fs from "fs";
import path from "path";
interface ProcessingResult {
  file: string;
  summary: string;
  success: boolean;
  error?: string;
}

class DocumentPipeline {
  private ai: NeuroLink;

  constructor() {
    this.ai = new NeuroLink({
      conversationMemory: { enabled: true }
    });
  }

  async processDirectory(dirPath: string): Promise<ProcessingResult[]> {
    const files = fs.readdirSync(dirPath);
    const results: ProcessingResult[] = [];

    for (const file of files) {
      const filePath = path.join(dirPath, file);
      // Skip subdirectories: only regular files are sent to the model
      if (!fs.statSync(filePath).isFile()) continue;
      const ext = path.extname(file).toLowerCase();
      const provider = this.selectProvider(ext);

      try {
        const result = await this.ai.generate({
          input: {
            text: "Extract key information and create a summary",
            files: [filePath]
          },
          provider
        });
        results.push({ file, summary: result.content, success: true });
      } catch (error: any) {
        results.push({ file, summary: "", success: false, error: error.message });
      }
    }

    return results;
  }

  private selectProvider(ext: string): string {
    if (ext === ".pdf") return "vertex";
    return "openai";
  }
}
Streaming for Long Documents
When processing lengthy documents, streaming provides real-time feedback:
async function analyzeWithStreaming(filePath: string) {
  const ai = new NeuroLink();

  const result = await ai.stream({
    input: {
      text: "Provide a comprehensive analysis of this document",
      files: [filePath]
    },
    provider: "vertex",
    maxTokens: 4000
  });

  for await (const chunk of result.stream) {
    if (chunk.type === "text") {
      process.stdout.write(chunk.content);
    }
  }
}
Error Handling
async function processDocumentSafely(filePath: string, query: string) {
  const ai = new NeuroLink();

  try {
    const result = await ai.generate({
      input: { text: query, files: [filePath] },
      provider: "vertex"
    });
    return { success: true, content: result.content };
  } catch (error: any) {
    switch (error.code) {
      case "FILE_TOO_LARGE":
        console.log("Document exceeds page limit, splitting...");
        return await processLargeDocument(filePath, query);
      case "PROVIDER_NOT_CAPABLE":
        console.log("Falling back to vision-capable provider");
        return await processWithFallback(filePath, query);
      case "RATE_LIMIT_EXCEEDED":
        console.log("Rate limited, retrying with backoff");
        return await retryWithBackoff(() => processDocumentSafely(filePath, query));
      default:
        throw error;
    }
  }
}
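The retryWithBackoff helper used above is left undefined in the snippet. One possible implementation is exponential backoff with a capped number of attempts; the attempt count and base delay here are illustrative defaults, not values prescribed by NeuroLink.

```typescript
// One way to implement the retryWithBackoff helper: retry a failing async
// operation with exponentially growing delays. Illustrative sketch only.
async function retryWithBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      // Wait 1x, 2x, 4x... the base delay between attempts
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  // All attempts failed: surface the last error to the caller
  throw lastError;
}
```

A production version might also add jitter to the delay so that many clients rate-limited at the same moment do not all retry in lockstep.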
Provider Comparison
Not all providers handle documents equally:
| Provider | Native PDF | Max Pages | CSV | Best For |
|---|---|---|---|---|
| Vertex AI | Yes | 100 | Yes | Cost-effective PDF processing |
| Anthropic | Yes | 100 | Yes | Best reasoning quality |
| Google AI | Yes | 100 | Yes | Large file support |
| OpenAI | Yes | 100 | Yes | General purpose |
| Bedrock | Yes | 100 | Yes | AWS integration |
Recommendation: Use Vertex AI for cost-effective PDF-heavy workloads. Fall back to Anthropic for complex documents requiring strong reasoning.
Get Started
Install NeuroLink and start processing documents:
npm install @juspay/neurolink
Full documentation: docs.neurolink.ink
Found this helpful? Drop a comment below with your questions or share your document processing experience!
Want to try NeuroLink?
- GitHub: github.com/juspay/neurolink
- Star the repo if you find it useful!
Follow us for more AI development content:
- Dev.to: @neurolink
- Twitter: @Neurolink__