Handling large file uploads can be challenging, especially when you need reliability, reusability, and good performance. In this post, I'll walk through creating a microservice designed specifically for multi-part file uploads using Node.js, Express, and AWS S3.
Why Multi-Part Uploads?
Traditional file uploads have several limitations:
- Timeouts on large files
- No way to resume if the connection drops
- Memory pressure on the server
- Slower transfers, since everything moves as a single stream
Multi-part uploads solve these by:
- Breaking the file into smaller chunks
- Uploading each chunk independently
- Allowing chunks to be uploaded in parallel
- Making it possible to resume a failed upload
Architecture Overview
Our microservice will have:
- Initiation endpoint - Starts the upload process
- Chunk upload endpoint - Handles individual parts
- Completion endpoint - Finalizes the upload
- S3 integration - Stores the files
Implementation
1. Setting Up the Project
mkdir upload-microservice
cd upload-microservice
npm init -y
npm install express multer aws-sdk cors dotenv uuid
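The server code below reads its configuration from environment variables, so a local .env file along these lines (placeholder values only) keeps credentials out of the code:

# .env (placeholder values)
AWS_REGION=us-east-1
AWS_ACCESS_KEY=your-access-key-id
AWS_SECRET_KEY=your-secret-access-key
S3_BUCKET=your-bucket-name
PORT=3000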
2. Basic Server Setup
// server.js
require('dotenv').config();
const express = require('express');
const cors = require('cors');
const { S3 } = require('aws-sdk');

const app = express();
app.use(cors());
app.use(express.json());

// S3 client configured from environment variables
const s3 = new S3({
  region: process.env.AWS_REGION,
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY,
    secretAccessKey: process.env.AWS_SECRET_KEY
  }
});

const PORT = process.env.PORT || 3000;
app.listen(PORT, () => {
  console.log(`Upload microservice running on port ${PORT}`);
});
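At this point, running node server.js should start the service and log the port it's listening on.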
3. Initiate Upload Endpoint
const { v4: uuidv4 } = require('uuid');

app.post('/initiate-upload', async (req, res) => {
  const { fileName, fileType, fileSize } = req.body;

  // Prefix the object key with a UUID so uploads with the same file name don't collide
  const fileKey = `${uuidv4()}-${fileName}`;

  // Create the multipart upload in S3
  const params = {
    Bucket: process.env.S3_BUCKET,
    Key: fileKey,
    ContentType: fileType
  };

  try {
    const data = await s3.createMultipartUpload(params).promise();
    res.json({
      uploadId: data.UploadId,
      fileKey: data.Key
    });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});
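Note that the uploadId returned here is S3's own UploadId for the multipart upload; the client has to send it back with every part and with the completion request.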
4. Upload Part Endpoint
const multer = require('multer');
// Keep each chunk in memory; chunks are small (e.g. 5MB), so this avoids temp files
const upload = multer({ storage: multer.memoryStorage() });

app.post('/upload-part', upload.single('chunk'), async (req, res) => {
  const { uploadId, fileKey, partNumber } = req.body;
  const chunk = req.file.buffer;

  const params = {
    Bucket: process.env.S3_BUCKET,
    Key: fileKey,
    PartNumber: parseInt(partNumber, 10),
    UploadId: uploadId,
    Body: chunk
  };

  try {
    const data = await s3.uploadPart(params).promise();
    res.json({
      partNumber: partNumber,
      eTag: data.ETag
    });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});
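S3 returns an ETag for each uploaded part. The client needs to collect these, because completing the upload in the next step requires the part number and ETag of every part.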
5. Complete Upload Endpoint
app.post('/complete-upload', async (req, res) => {
  const { uploadId, fileKey, parts } = req.body;

  const params = {
    Bucket: process.env.S3_BUCKET,
    Key: fileKey,
    UploadId: uploadId,
    MultipartUpload: {
      // S3 expects the parts list in ascending PartNumber order
      Parts: parts
        .map(part => ({
          PartNumber: parseInt(part.partNumber, 10),
          ETag: part.eTag
        }))
        .sort((a, b) => a.PartNumber - b.PartNumber)
    }
  };

  try {
    await s3.completeMultipartUpload(params).promise();
    res.json({ message: 'Upload completed successfully' });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});
Client-Side Implementation
Here's a simplified client-side example (the initiateUpload, uploadChunk, and completeUpload helpers are sketched after it):

async function uploadFile(file) {
  // 1. Initiate the upload and get S3's uploadId plus the object key
  const { uploadId, fileKey } = await initiateUpload(file.name, file.type, file.size);

  // 2. Split the file into chunks and upload them
  // S3 requires every part except the last to be at least 5MB
  const chunkSize = 5 * 1024 * 1024; // 5MB chunks
  const chunks = Math.ceil(file.size / chunkSize);
  const uploadedParts = [];

  // Chunks are uploaded sequentially here for simplicity;
  // they could also be uploaded in parallel with Promise.all
  for (let i = 0; i < chunks; i++) {
    const start = i * chunkSize;
    const end = Math.min(start + chunkSize, file.size);
    const chunk = file.slice(start, end);

    const part = await uploadChunk(chunk, uploadId, fileKey, i + 1);
    uploadedParts.push(part);
  }

  // 3. Complete the upload
  await completeUpload(uploadId, fileKey, uploadedParts);
}
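For completeness, here's one way those three helpers could look. This is a minimal sketch: UPLOAD_SERVICE_URL is a placeholder for wherever you deploy the service, and the 'chunk' field name matches the multer configuration above.

// Placeholder base URL -- point this at your deployment
const UPLOAD_SERVICE_URL = 'http://localhost:3000';

async function initiateUpload(fileName, fileType, fileSize) {
  const res = await fetch(`${UPLOAD_SERVICE_URL}/initiate-upload`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ fileName, fileType, fileSize })
  });
  return res.json(); // { uploadId, fileKey }
}

async function uploadChunk(chunk, uploadId, fileKey, partNumber) {
  const formData = new FormData();
  formData.append('chunk', chunk); // field name must match upload.single('chunk')
  formData.append('uploadId', uploadId);
  formData.append('fileKey', fileKey);
  formData.append('partNumber', partNumber);

  const res = await fetch(`${UPLOAD_SERVICE_URL}/upload-part`, {
    method: 'POST',
    body: formData
  });
  return res.json(); // { partNumber, eTag }
}

async function completeUpload(uploadId, fileKey, parts) {
  const res = await fetch(`${UPLOAD_SERVICE_URL}/complete-upload`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ uploadId, fileKey, parts })
  });
  return res.json();
}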
Deployment Considerations
- Scaling: Use containers (Docker) and orchestration (Kubernetes)
- Monitoring: Add logging and metrics
- Security: Implement authentication and rate limiting
- Resumability: Add endpoints to list already-uploaded parts (see the sketch below)
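To give a flavor of that last point, here's a rough sketch of a list-parts endpoint using the same S3 client. The route name and response shape are my own choices, but s3.listParts is the underlying SDK call:

// Returns the parts S3 has already received, so a client can resume an interrupted upload
app.get('/uploaded-parts', async (req, res) => {
  const { uploadId, fileKey } = req.query;

  const params = {
    Bucket: process.env.S3_BUCKET,
    Key: fileKey,
    UploadId: uploadId
  };

  try {
    const data = await s3.listParts(params).promise();
    res.json({
      parts: (data.Parts || []).map(part => ({
        partNumber: part.PartNumber,
        eTag: part.ETag,
        size: part.Size
      }))
    });
  } catch (err) {
    res.status(500).json({ error: err.message });
  }
});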
Benefits of This Approach
- Handles large files efficiently
- Supports pause/resume functionality
- Enables parallel uploads for faster transfers
- Reduces memory pressure on the server
- Works well with cloud storage solutions
Next Steps
- Add file validation
- Implement progress tracking
- Add authentication
- Set up proper error handling and retries (a simple retry wrapper is sketched below)
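As a starting point for the retry item, here's one possible client-side retry wrapper; the attempt count and delays are arbitrary placeholder choices:

// Retries an async operation a few times with a simple backoff before giving up
async function withRetry(operation, attempts = 3, delayMs = 1000) {
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (attempt === attempts) throw err;
      // Wait a little longer after each failed attempt
      await new Promise(resolve => setTimeout(resolve, delayMs * attempt));
    }
  }
}

// Example: retry a single chunk upload
// const part = await withRetry(() => uploadChunk(chunk, uploadId, fileKey, i + 1));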
The complete code is available on GitHub. Let me know in the comments if you'd like me to cover any specific aspects in more detail!