In this article, we’ll go over how you can handle large files in your Node.js applications using various large file uploading techniques. These techniques can also be applied to other programming languages and frameworks as needed.
Possible issues when uploading a large file
Uploading a large file in an application can pose several expected and unexpected challenges, since you cannot predict the location, device type, or network conditions a user will upload from.
Upload speed, bandwidth, and latency
The larger a file, the more bandwidth and time it takes to upload. While a small file usually uploads without issue, a large file is a different story and can quickly become a pain point for your users.
A single large file upload can consume significant bandwidth.
Typically, upload speed problems occur when you transfer large amounts of data in a single batch to your server.
In this scenario, regardless of the end user’s location, all files are directed to a single destination via the same route, creating a traffic jam similar to that in Manhattan during rush hour.
When your users upload multiple large files simultaneously, it can cause your servers to become paralyzed: the speed decreases and the server response time increases, especially if your server is not configured to handle large file uploads.
Upload failures and timeouts
When uploading a single large file, upload failures and timeouts may occur for various reasons:
- If there is a network interruption during the upload process
- If the file isn’t sent within the allotted timeout period
- If the server takes too long to respond to a request due to overload
Whatever the reason may be, upload failures are likely to occur if you do not handle large files properly.
Mobile & edge case constraints
Globally, a majority of users access the internet via mobile devices, which often present unique challenges and edge cases to consider when enabling large-file uploads.
For one, mobile devices generally have slower network speeds and stricter data limits compared to wired connections, leading to issues such as timeouts and network interruptions.
What about a situation where a user switches to another app while an upload is in progress, and the browser suspends background activity to conserve battery? Or the device may be in battery-saving mode, where the mobile OS aggressively shuts down non-essential tasks and interrupts the upload.
It is also possible that the device has limited storage or the OS has specific limitations that may cause issues with uploading large files. This could result in the device crashing or stalling uploads.
Server limitations
Handling large file uploads is not just about configuring servers; it also involves determining the maximum file size your server can handle.
Factors such as your server’s storage, memory, and the number of requests it can process concurrently all put a ceiling on what it can handle.
Yes, your server might be able to handle a single large file upload by default, but what happens when 1,000 users upload large files simultaneously, and your server needs to process them all?
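As a first line of defense, you can make these limits explicit instead of relying on defaults. Below is a minimal sketch, assuming an Express server with multer; the 50 MB cap and the /upload route are placeholders chosen purely for illustration:
import express from 'express';
import multer from 'multer';

const app = express();

// Cap JSON bodies so metadata requests stay small
app.use(express.json({ limit: '1mb' }));

// Reject any single uploaded file larger than 50 MB (illustrative value)
const upload = multer({
  dest: 'uploads/',
  limits: { fileSize: 50 * 1024 * 1024 },
});

app.post('/upload', upload.single('file'), (req, res) => {
  res.json({ message: 'File received', size: req.file.size });
});

// Multer reports an oversized file as a MulterError with code LIMIT_FILE_SIZE
app.use((err, req, res, next) => {
  if (err instanceof multer.MulterError && err.code === 'LIMIT_FILE_SIZE') {
    return res.status(413).json({ error: 'File too large' });
  }
  next(err);
});

app.listen(3000);
Knowing and enforcing these limits up front makes it much easier to reason about what happens under heavy concurrent load.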
Proven techniques for handling large file uploads
How do you handle large files and avoid problems related to large file uploads while ensuring your users have a good experience? The following techniques are some of the best practices for handling file uploads.
1. Chunk file uploads: Split large files into manageable parts
Chunking involves breaking a large file into smaller parts (chunks) and uploading them individually. This method makes error management easier, as only the failed chunk needs to be re-uploaded. Additionally, chunking helps efficiently manage memory and bandwidth usage.
To implement chunking, programmatically split the large files into smaller chunks and upload them sequentially to your server. Next, set up a backend service to combine all the chunk files into the original file.
Once that is done, you can then return a response or the URL of the uploaded file to the frontend of your application, allowing users to access it.
For example, say you have an HTML form with an input that you would like to use for uploading large files (in this case, let’s consider a large file to be a file above 10 MB) with the following markup:
<form id="upload-form">
<input type="file" id="file-upload">
<button type="submit">Upload file</button>
</form>
Using JavaScript on the frontend, you can split the large file into smaller chunks and upload them individually using the code below:
document.getElementById('upload-form').addEventListener('submit', e => {
e.preventDefault();
uploadFile();
});
async function uploadFile() {
const fileInput = document.getElementById('file-upload');
const file = fileInput.files[0];
if (!file) {
alert('Please select a file to upload');
return;
}
const chunkSize = 5 * 1024 * 1024; // 5MB chunks
const totalChunks = Math.ceil(file.size / chunkSize);
const fileId = generateFileId(); // Generate unique ID for this upload
console.log(`Uploading file: ${file.name}, Size: ${file.size}, Chunks: ${totalChunks}`);
try {
// Upload all chunks
for (let chunkIndex = 0; chunkIndex < totalChunks; chunkIndex++) {
const start = chunkIndex * chunkSize;
const end = Math.min(start + chunkSize, file.size);
const chunk = file.slice(start, end);
await uploadChunk(chunk, chunkIndex, totalChunks, fileId, file.name);
// Update progress
const progress = ((chunkIndex + 1) / totalChunks) * 100;
console.log(`Upload progress: ${progress.toFixed(2)}%`);
}
await mergeChunks(fileId, totalChunks, file.name);
console.log('File upload completed successfully!');
} catch (error) {
console.error('Upload failed:', error);
}
}
async function uploadChunk(chunk, chunkIndex, totalChunks, fileId, fileName) {
const formData = new FormData();
formData.append('chunk', chunk);
formData.append('chunkIndex', chunkIndex);
formData.append('totalChunks', totalChunks);
formData.append('fileId', fileId);
formData.append('fileName', fileName);
const response = await fetch('/upload-chunk', {
method: 'POST',
body: formData
});
if (!response.ok) {
throw new Error(`Failed to upload chunk ${chunkIndex}`);
}
return response.json();
}
async function mergeChunks(fileId, totalChunks, fileName) {
const response = await fetch('/merge-chunks', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
fileId,
totalChunks,
fileName
})
});
if (!response.ok) {
throw new Error('Failed to merge chunks');
}
return response.json();
}
function generateFileId() {
  // Combine a timestamp with a random suffix for a reasonably unique ID
  return Date.now().toString(36) + Math.random().toString(36).substring(2);
}
The code above splits a large file into smaller chunks (5 MB each) and uploads them to the server one at a time.
There are three important functions in the code: uploadFile, uploadChunk, and mergeChunks.
I’ll show you what each one does and how they work together.
Let’s start with the uploadFile function, which is the main function that calls the other functions.
This function performs multiple steps for splitting the file:
- First, it gets the file from the form input.
- Then it generates a fileId for the file and, using the chunkSize value of 5 MB, calculates how many chunks the file needs to be split into.
- Next, using the Blob.slice method, it splits the file into chunks, loops through them, and calls the uploadChunk function for each one.
- When every chunk has been uploaded, it calls the mergeChunks function.
The uploadChunk function is quite straightforward; it creates a new FormData object and adds all the necessary information about the uploaded chunk to it.
This information includes values like chunk, chunkIndex, totalChunks, fileId, and fileName. It then sends a POST request using the Fetch API to the /upload-chunk endpoint to upload each chunk to the server.
When every chunk has been uploaded successfully, the mergeChunks function is called. This function sends the fileId, totalChunks, and fileName values in another request to the server’s /merge-chunks endpoint, this time to merge all the uploaded chunks back into the complete original file.
An example of how you might implement this in a Node.js backend that uses Express.js and expects to receive large files in chunks would look like this:
import express from 'express';
import multer from 'multer';
import fs from 'fs';
import path from 'path';
const app = express();
const PORT = 3000;
app.use(express.json());
app.use(express.static('public'));
// Create directories for uploads and temporary chunks
const uploadsDir = './uploads';
const chunksDir = './chunks';
fs.mkdirSync(uploadsDir, { recursive: true });
fs.mkdirSync(chunksDir, { recursive: true });
// Configure multer for chunk uploads with field parsing
const upload = multer({
storage: multer.diskStorage({
destination: (req, file, cb) => {
// Create a temporary directory first, we'll move the file later
const tempDir = path.join(chunksDir, 'temp');
fs.mkdirSync(tempDir, { recursive: true });
cb(null, tempDir);
},
filename: (req, file, cb) => {
// Use a temporary filename
cb(null, `temp-${Date.now()}-${Math.random().toString(36)}`);
},
}),
});
// Endpoint to upload individual chunks
app.post('/upload-chunk', upload.single('chunk'), (req, res) => {
const { fileId, chunkIndex, totalChunks } = req.body;
// Validate required fields after multer processes the request
if (!fileId) {
// Clean up the temporary file
if (req.file) {
fs.unlinkSync(req.file.path);
}
return res.status(400).json({ error: 'fileId is required' });
}
if (chunkIndex === undefined) {
// Clean up the temporary file
if (req.file) {
fs.unlinkSync(req.file.path);
}
return res.status(400).json({ error: 'chunkIndex is required' });
}
if (!req.file) {
return res.status(400).json({ error: 'No chunk uploaded' });
}
// Create the proper directory and move the file
const chunkDir = path.join(chunksDir, fileId);
fs.mkdirSync(chunkDir, { recursive: true });
const finalChunkPath = path.join(chunkDir, `chunk-${chunkIndex}`);
// Move the file from temp location to final location
fs.renameSync(req.file.path, finalChunkPath);
console.log(
`Received chunk ${chunkIndex} of ${totalChunks} for file ${fileId}`
);
res.json({
message: `Chunk ${chunkIndex} uploaded successfully`,
chunkIndex: parseInt(chunkIndex),
fileId,
});
});
// Endpoint to merge all chunks into the final file
app.post('/merge-chunks', async (req, res) => {
const { fileId, totalChunks, fileName } = req.body;
try {
const chunkDir = path.join(chunksDir, fileId);
const finalFilePath = path.join(uploadsDir, fileName);
// Create write stream for the final file
const writeStream = fs.createWriteStream(finalFilePath);
// Read and write chunks in order
for (let i = 0; i < totalChunks; i++) {
const chunkPath = path.join(chunkDir, `chunk-${i}`);
if (!fs.existsSync(chunkPath)) {
throw new Error(`Chunk ${i} is missing`);
}
await new Promise((resolve, reject) => {
const readStream = fs.createReadStream(chunkPath);
readStream.pipe(writeStream, { end: false });
readStream.on('end', resolve);
readStream.on('error', reject);
});
}
writeStream.end();
// Wait for the write stream to finish
await new Promise((resolve, reject) => {
writeStream.on('finish', resolve);
writeStream.on('error', reject);
});
// Clean up chunks directory
fs.rmSync(chunkDir, { recursive: true, force: true });
console.log(`File ${fileName} merged successfully`);
res.json({
message: 'File uploaded successfully',
fileName,
filePath: finalFilePath,
});
} catch (error) {
console.error('Error merging chunks:', error);
res.status(500).json({ error: 'Failed to merge chunks' });
}
});
app.listen(PORT, () => {
console.log(`Server running on http://localhost:${PORT}`);
});
The backend code creates all the endpoints used by the frontend to upload the chunk files.
It uses multer as middleware to intercept each uploaded file in a request and process it before the route handler runs. Let’s look at each endpoint and what it does.
- /upload-chunk: When a chunk is sent to this endpoint, it first validates the metadata that accompanies it. If everything checks out, it stores the chunk in a /chunks/<fileId> directory, with the chunkIndex embedded in the chunk’s file name so the order of the chunks is known when they need to be merged later.
- /merge-chunks: When this endpoint is requested, it uses the fileId to locate the uploaded chunks, then loops through them in order and writes them into a single file using fs.createWriteStream. This recreates the original uploaded file in the /uploads directory.
If you’d rather not implement chunking in your application from scratch, you could use a ready-made solution like Uploadcare’s File Uploader, which supports chunking out of the box.
2. Resumable uploads: Ensure reliability
Implementing resumable uploads helps maintain upload integrity in case of interruptions: users can pause an upload (or have it cut off by a flaky connection) and continue from where they left off instead of starting over. This is particularly valuable for large files and for environments with unstable network connections.
When implementing resumable uploads, include clear error messages and next steps to inform the user about what happened during the upload process and what to do next.
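Building on the chunked uploader from earlier, here is a minimal sketch of what resuming could look like on the frontend. It assumes a hypothetical GET /uploaded-chunks/:fileId endpoint (not part of the backend shown above) that returns the chunk indexes the server has already stored, so the client only re-sends the missing ones:
async function resumeUpload(file, fileId) {
  const chunkSize = 5 * 1024 * 1024; // same 5 MB chunks as before
  const totalChunks = Math.ceil(file.size / chunkSize);

  // Ask the server which chunks it already has (assumed endpoint)
  const response = await fetch(`/uploaded-chunks/${fileId}`);
  const { uploadedChunks } = await response.json();
  const alreadyUploaded = new Set(uploadedChunks);

  for (let chunkIndex = 0; chunkIndex < totalChunks; chunkIndex++) {
    if (alreadyUploaded.has(chunkIndex)) continue; // skip chunks that already made it

    const start = chunkIndex * chunkSize;
    const chunk = file.slice(start, Math.min(start + chunkSize, file.size));
    await uploadChunk(chunk, chunkIndex, totalChunks, fileId, file.name);
  }

  await mergeChunks(fileId, totalChunks, file.name);
}
On the server, such an endpoint could simply list the files in the /chunks/<fileId> directory and return their indexes.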
3. Streaming large files in real time
Another useful method is streaming, where the file is uploaded as it is being read. This is particularly beneficial for large files, as it reduces stress on both the client and the server, and enables continuous data transfer.
Some use cases where streaming a large file might come in handy are:
- When uploading very large media files in the GB range, like 4K videos, game assets, or raw datasets
- When the server has limited RAM, or when you use a serverless environment for your backend
An example of this is piping the incoming request stream directly to disk with the fs.createWriteStream method in Node.js, so the file is written as it arrives instead of being buffered in memory:
// example of streaming a large file upload in Node.js
const fs = require('fs');
const http = require('http');
const server = http.createServer((req, res) => {
  // Write the request body to disk as it arrives, chunk by chunk
  const writeStream = fs.createWriteStream('largefile.mp4');
  req.pipe(writeStream);
  writeStream.on('finish', () => res.end('Upload complete'));
  writeStream.on('error', () => {
    res.statusCode = 500;
    res.end('Upload failed');
  });
});
server.listen(3000, () => console.log('Server running on port 3000'));
4. Use a CDN and upload files to the closest data center
Using a Content Delivery Network (CDN) effectively handles large file uploads and ensures a smooth and reliable user experience. A CDN is a distributed server network that delivers content to users based on their geographic location.
By using a CDN for your backend services and uploading files to the closest data center, you can significantly speed up large-file uploads in your application.
At Uploadcare, we use Amazon S3, which receives numerous batches of data simultaneously and stores each in globally distributed edge locations. To further cut upload times and latency, we use an acceleration feature that enables fast transfers between a browser and an S3 bucket.
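If you upload to S3 from your own backend, the acceleration feature can be switched on in the client configuration. Here is a minimal sketch using the AWS SDK for JavaScript v3; the bucket name, region, and file path are placeholders, and Transfer Acceleration must already be enabled on the bucket:
import fs from 'fs';
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

// useAccelerateEndpoint routes requests through the S3 Transfer Acceleration endpoint
const s3 = new S3Client({ region: 'us-east-1', useAccelerateEndpoint: true });

async function uploadToS3(filePath, key) {
  await s3.send(
    new PutObjectCommand({
      Bucket: 'your-bucket-name', // placeholder bucket name
      Key: key,
      Body: fs.createReadStream(filePath),
      ContentLength: fs.statSync(filePath).size, // streams need an explicit length
    })
  );
  console.log(`Uploaded ${key} via the accelerated endpoint`);
}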
Adopting this method can help you produce a wow effect for your users.
For example, if a user is in Singapore, the uploaded data doesn’t try to reach the primary AWS server in the US. Instead, it goes to the nearest data center, which is 73% faster.
A speed estimate for uploading data to AWS with and without the transfer acceleration feature
Check out the speed comparison and possible acceleration for your target regions in this speed checker.
Conclusion
While handling large files can be tedious, it can be streamlined by applying these techniques throughout the upload process to provide a better user experience.
Start by implementing chunking during uploads, then migrate to a CDN as you scale. Your users will notice the difference.
Happy coding!
