As web applications become more sophisticated, handling document processing at scale presents unique challenges. When building https://www.acetoolz.com, a professional online tools platform, I needed to create a robust PDF processing pipeline that could handle multiple operations like compression, conversion, merging, and OCR while maintaining performance and security.
In this article, I'll walk you through building a serverless PDF processing pipeline using Next.js, TypeScript, and the iLovePDF API that handles documents reliably and at scale.
The Challenge
PDF processing involves several technical hurdles:
- File Size Limitations: Vercel has a 4.5MB payload limit for serverless functions
- Processing Time: Complex PDF operations can exceed serverless timeout limits
- Memory Constraints: Large PDF files can cause memory issues in serverless environments
- Security: Handling sensitive documents requires careful data management
- User Experience: Users expect fast, reliable processing with real-time feedback
Architecture Overview
Our solution uses a hybrid approach combining client-side validation, serverless API routes, and third-party processing services:
Client Upload → Next.js API Route → iLovePDF API → Processed File → Client Download
Implementation Deep Dive
1. Tool Configuration System
First, I created a typed configuration system for PDF tools:
// types/index.ts
export interface Tool {
  id: string;
  title: string;
  description: string;
  category: string;
  inputs: ToolInput[];
  actions?: ToolAction[]; // optional: some tools run a single implicit action
  settings: ToolSetting[];
}
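The supporting shapes referenced above look roughly like this; I've trimmed them to the fields this article actually uses, so treat anything beyond that as illustrative:

// types/index.ts (continued) - trimmed to the fields used in this article
export interface ToolInput {
  id: string;
  type: 'file' | 'text' | 'number';
  label: string;
  accept?: string;   // file inputs only, e.g. '.pdf'
  required?: boolean;
  maxSize?: number;  // bytes
}

export interface ToolSetting {
  id: string;
  type: 'select' | 'toggle';
  label: string;
  options?: { value: string; label: string }[];
  defaultValue?: string;
}

export interface ToolAction {
  id: string;
  label: string;
  endpoint: string;  // the API route the action posts to
}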
// lib/tools/pdfCompressor.ts
export const pdfCompressorTool: Tool = {
  id: 'pdf-compressor',
  title: 'Compress PDF',
  description: 'Reduce PDF file size while maintaining good quality',
  category: 'PDF',
  inputs: [
    {
      id: 'pdf-file',
      type: 'file',
      label: 'Upload PDF File',
      accept: '.pdf',
      required: true,
      maxSize: 15 * 1024 * 1024, // 15MB for PRO users
    },
  ],
  settings: [
    {
      id: 'compression-level',
      type: 'select',
      label: 'Compression Level',
      options: [
        { value: 'standard', label: 'Standard Compression' },
        { value: 'high', label: 'High Compression' },
        { value: 'maximum', label: 'Maximum Compression' },
      ],
      defaultValue: 'standard',
    },
  ],
};
2. Role-Based Access Control
Different user tiers get different capabilities:
// API route with role-based limits
import { NextRequest, NextResponse } from 'next/server';
import { auth } from '@/auth'; // NextAuth session helper

const FREE_TIER_LIMIT = 5 * 1024 * 1024; // 5MB
const PRO_TIER_LIMIT = 15 * 1024 * 1024; // 15MB

export async function POST(request: NextRequest) {
  const session = await auth();
  if (!session) {
    return NextResponse.json(
      { error: 'Please sign in to use the PDF compressor' },
      { status: 401 }
    );
  }

  const isPremium = session.user?.role?.name === 'PDF_Pro' ||
    session.user?.role?.name === 'PRO';
  const sizeLimit = isPremium ? PRO_TIER_LIMIT : FREE_TIER_LIMIT;

  const formData = await request.formData();
  const file = formData.get('file') as File;

  if (file.size > sizeLimit) {
    return NextResponse.json(
      {
        error: `File too large. ${isPremium ? 'PRO' : 'Free'} tier limit: ${Math.round(sizeLimit / 1024 / 1024)}MB`,
        upgradeRequired: !isPremium
      },
      { status: 413 }
    );
  }

  // ...hand off to the processing pipeline below
}
3. iLovePDF Integration
To bypass Vercel's payload limits, we use iLovePDF's cloud processing:
// lib/utils/ilovepdf-upload.ts
export interface ILovePDFTaskAuth {
  taskId: string;
  server: string;
  uploadToken: string;
  expire: number;
  remainingCredits?: number;
}

export interface ILovePDFUploadLimits {
  maxFiles: number;
  maxSizePerFile: number;  // bytes
  allowedFormats: string[]; // MIME types
}

export const validatePDFForUpload = (file: File, limits: ILovePDFUploadLimits): string | null => {
  if (!limits.allowedFormats.includes(file.type)) {
    return 'Invalid file format. Only PDF files are allowed.';
  }
  if (file.size > limits.maxSizePerFile) {
    return `File too large. Maximum size: ${Math.round(limits.maxSizePerFile / 1024 / 1024)}MB`;
  }
  return null;
};
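For reference, here's a minimal sketch of the task-creation and upload helpers the pipeline below relies on. It assumes iLovePDF's documented REST flow (authenticate with your public project key, start a task for a tool, then upload to the worker server the API assigns); error handling is omitted for brevity:

// lib/utils/ilovepdf-upload.ts (continued)
const ILOVEPDF_API = 'https://api.ilovepdf.com/v1';

export const createILovePDFTask = async (tool: string): Promise<ILovePDFTaskAuth> => {
  // Exchange the public project key for a short-lived JWT
  const authRes = await fetch(`${ILOVEPDF_API}/auth`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ public_key: process.env.ILOVEPDF_PUBLIC_KEY }),
  });
  const { token } = await authRes.json();

  // Start a task; the API assigns a worker server for the uploads
  const startRes = await fetch(`${ILOVEPDF_API}/start/${tool}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  const { server, task } = await startRes.json();

  return {
    taskId: task,
    server,
    uploadToken: token,
    expire: Date.now() + 60 * 60 * 1000, // assumed token lifetime, for local bookkeeping only
  };
};

export const uploadToILovePDF = async (file: File, auth: ILovePDFTaskAuth) => {
  const formData = new FormData();
  formData.append('task', auth.taskId);
  formData.append('file', file);

  const res = await fetch(`https://${auth.server}/v1/upload`, {
    method: 'POST',
    headers: { Authorization: `Bearer ${auth.uploadToken}` },
    body: formData,
  });
  return res.json(); // { server_filename } is what the process step references
};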
4. Processing Pipeline
The complete processing flow handles multiple PDF operations:
// api/tools/pdf-compress/route.ts
export async function POST(request: NextRequest) {
  try {
    // 1. Authentication & Authorization (sizeLimit comes from the
    //    role check shown in the previous section)
    const session = await auth();
    const formData = await request.formData();
    const file = formData.get('file') as File;
    const compressionLevel = formData.get('compressionLevel') as string;

    // 2. Validation
    const validation = validatePDFForUpload(file, {
      maxFiles: 1,
      maxSizePerFile: sizeLimit,
      allowedFormats: ['application/pdf']
    });
    if (validation) {
      return NextResponse.json({ error: validation }, { status: 400 });
    }

    // 3. Create iLovePDF Task
    const taskAuth = await createILovePDFTask('compress');

    // 4. Upload to iLovePDF
    const uploadResult = await uploadToILovePDF(file, taskAuth);

    // 5. Process with compression settings
    const compressionSettings = COMPRESSION_LEVELS[compressionLevel];
    const processResult = await processILovePDFTask(
      taskAuth.taskId,
      compressionSettings
    );

    // 6. Download processed file
    const processedBuffer = await downloadFromILovePDF(processResult.downloadUrl);

    // 7. Return to client
    return new NextResponse(processedBuffer, {
      headers: {
        'Content-Type': 'application/pdf',
        'Content-Disposition': `attachment; filename="compressed_${file.name}"`,
        'Content-Length': processedBuffer.length.toString(),
      },
    });
  } catch (error) {
    console.error('PDF compression failed:', error);
    return NextResponse.json(
      { error: 'PDF compression failed. Please try again.' },
      { status: 500 }
    );
  }
}
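COMPRESSION_LEVELS is a small lookup that maps our UI options onto the parameters iLovePDF's compress tool expects. The sketch below assumes the compression levels the API documents (low, recommended, extreme):

// Map UI compression options onto iLovePDF compress parameters
const COMPRESSION_LEVELS: Record<string, { compression_level: 'low' | 'recommended' | 'extreme' }> = {
  standard: { compression_level: 'low' },
  high: { compression_level: 'recommended' },
  maximum: { compression_level: 'extreme' },
};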
5. Client-Side Integration
The React component provides real-time feedback:
// components/tools/PDFCompressorTool.tsx
'use client';

import { useState } from 'react';

type CompressionSettings = { level: string }; // trimmed for the article

export default function PDFCompressorTool() {
  const [processing, setProcessing] = useState(false);
  const [progress, setProgress] = useState(0);

  const handleCompress = async (file: File, settings: CompressionSettings) => {
    setProcessing(true);
    setProgress(0);

    const formData = new FormData();
    formData.append('file', file);
    formData.append('compressionLevel', settings.level);

    try {
      const response = await fetch('/api/tools/pdf-compress', {
        method: 'POST',
        body: formData,
      });

      if (!response.ok) {
        // The API returns { error: string } on failure
        const { error } = await response.json();
        throw new Error(error ?? 'Compression failed');
      }

      // Trigger a browser download of the processed file
      const blob = await response.blob();
      const url = URL.createObjectURL(blob);
      const a = document.createElement('a');
      a.href = url;
      a.download = `compressed_${file.name}`;
      a.click();
      URL.revokeObjectURL(url);
    } catch (error) {
      console.error('Compression failed:', error);
    } finally {
      setProcessing(false);
      setProgress(0);
    }
  };

  return (
    <div className="pdf-compressor">
      {/* UI implementation */}
    </div>
  );
}
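One wrinkle: fetch() doesn't expose upload progress, so the progress bar above can only move during the upload if you switch to XMLHttpRequest. A minimal sketch of such a helper (uploadWithProgress is my name for it, not an existing API):

// Drive a progress bar from upload events; fetch() can't report these
const uploadWithProgress = (
  url: string,
  formData: FormData,
  onProgress: (pct: number) => void
): Promise<Blob> =>
  new Promise((resolve, reject) => {
    const xhr = new XMLHttpRequest();
    xhr.open('POST', url);
    xhr.responseType = 'blob';
    xhr.upload.onprogress = (e) => {
      if (e.lengthComputable) onProgress(Math.round((e.loaded / e.total) * 100));
    };
    xhr.onload = () =>
      xhr.status >= 200 && xhr.status < 300
        ? resolve(xhr.response)
        : reject(new Error(`Upload failed with status ${xhr.status}`));
    xhr.onerror = () => reject(new Error('Network error during upload'));
    xhr.send(formData);
  });

Wiring onProgress to setProgress gives users feedback during the slowest part of the round trip.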
Advanced Features
Error Handling & Retry Logic
const retryOperation = async <T>(operation: () => Promise<T>, maxRetries = 3): Promise<T> => {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      if (attempt === maxRetries) throw error;
      // Exponential backoff: 2s, 4s, 8s...
      const delay = Math.pow(2, attempt) * 1000;
      await new Promise(resolve => setTimeout(resolve, delay));
    }
  }
  throw new Error('retryOperation: retries exhausted'); // unreachable; satisfies the type checker
};
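In the pipeline, the network-bound steps get wrapped in it, for example:

// Retry the flaky network hop instead of failing the whole request
const uploadResult = await retryOperation(() => uploadToILovePDF(file, taskAuth));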
Usage Tracking & Analytics
// Track API usage for analytics
const trackToolUsage = async (userId: string, toolId: string, metadata: any) => {
  await prisma.toolUsage.create({
    data: {
      userId,
      toolId,
      metadata,
      timestamp: new Date(),
    },
  });
};
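A typical call site, right after a successful compression (the metadata fields are just what I found useful to chart later; startedAt is a hypothetical timestamp captured at the top of the handler):

// Record what was processed so usage can be charted per tool and per tier
await trackToolUsage(session.user.id, 'pdf-compressor', {
  fileSizeBytes: file.size,
  compressionLevel,
  durationMs: Date.now() - startedAt, // startedAt: captured when the handler began
});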
Security Considerations
- File Validation: Strict MIME type checking and file signature validation (sketched after this list)
- Size Limits: Role-based file size restrictions
- Rate Limiting: Prevent abuse with usage quotas
- Temporary Storage: Automatic cleanup of processed files
- Authentication: NextAuth.js with database sessions
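The file signature check deserves a closer look, since the MIME type is supplied by the client and can be spoofed. Every real PDF starts with the bytes %PDF-, so the server reads the first five bytes before trusting an upload; a minimal sketch:

// MIME types are client-supplied, so verify the PDF magic bytes too
const PDF_MAGIC = [0x25, 0x50, 0x44, 0x46, 0x2d]; // '%PDF-'

export const hasPDFSignature = async (file: File): Promise<boolean> => {
  const header = new Uint8Array(await file.slice(0, 5).arrayBuffer());
  return PDF_MAGIC.every((byte, i) => header[i] === byte);
};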
Performance Optimizations
Dynamic Imports
// Lazy load PDF processing components
import dynamic from 'next/dynamic';

const PDFCompressorTool = dynamic(
  () => import('@/components/tools/PDFCompressorTool'),
  {
    ssr: false,
    loading: () => <div>Loading PDF Compressor...</div>,
  }
);
Streaming Responses
// Stream large files to improve memory usage: instead of buffering the
// processed PDF, pipe the upstream download body straight through
const upstream = await fetch(processResult.downloadUrl);
return new NextResponse(upstream.body, {
  headers: { 'Content-Type': 'application/pdf' },
});
Edge Runtime (where applicable)
// Opt lightweight routes into the Edge runtime for lower cold-start latency
export const runtime = 'edge';
The Edge runtime lacks most Node.js APIs, so the heavier PDF routes stay on the default Node.js runtime.
Deployment & Scaling
The pipeline is deployed on Vercel with the following considerations:
- Function Timeout: 30 seconds for Pro plans (see the route config after this list)
- Memory Limits: 1GB for serverless functions
- Cold Starts: Minimized with proper caching strategies
- Regional Deployment: Edge functions for better latency
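On recent Next.js versions the timeout can be raised per route with segment config, the same mechanism as the runtime flag shown earlier (exceeding the default duration assumes a Pro plan):

// app/api/tools/pdf-compress/route.ts - route segment config
export const maxDuration = 30; // seconds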
Monitoring & Observability
// Custom metrics and logging
import { withMonitoring } from '@/lib/monitoring';

export const POST = withMonitoring(async (request: NextRequest) => {
  // Implementation with automatic metrics collection
});
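withMonitoring is a small in-house wrapper, not a library API; a minimal sketch of what it does:

// lib/monitoring.ts - time each request and log the outcome
import { NextRequest, NextResponse } from 'next/server';

type Handler = (request: NextRequest) => Promise<NextResponse>;

export const withMonitoring = (handler: Handler): Handler =>
  async (request) => {
    const start = Date.now();
    try {
      const response = await handler(request);
      console.log(
        `${request.method} ${request.nextUrl.pathname} -> ${response.status} in ${Date.now() - start}ms`
      );
      return response;
    } catch (error) {
      console.error(
        `${request.method} ${request.nextUrl.pathname} failed after ${Date.now() - start}ms`,
        error
      );
      throw error;
    }
  };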
Results & Impact
This architecture has processed more than 10,000 PDF files with:
- 99.9% uptime across all PDF tools
- Average processing time: 3-8 seconds
- Customer satisfaction: 4.8/5 stars
- Cost efficiency: 80% reduction vs dedicated servers
Lessons Learned
- External APIs are crucial for bypassing serverless limitations
- User feedback during processing significantly improves UX
- Role-based limits help manage costs while offering premium features
- Comprehensive error handling prevents user frustration
- Security-first design is essential for document processing
Future Enhancements
- WebAssembly for client-side processing of smaller files
- Real-time collaboration features
- Custom compression algorithms for specific use cases
Conclusion
Building a serverless PDF processing pipeline requires careful consideration of limitations and creative solutions. By combining Next.js serverless functions with external processing APIs, we created a scalable, secure, and user-friendly system that handles document processing at scale.
The complete implementation is powering the PDF tools at https://www.acetoolz.com/pdf, where you can try the live PDF compressor and other tools.
What challenges have you faced with document processing in serverless environments? Share your experiences in the comments below!