In a previous post, we built a file upload service using Go with local storage and Amazon S3 for cloud-based storage. But what if you need to handle large files—think multi-gigabyte video files or massive datasets? 😅 That’s where things can get tricky. You don’t want your server bogging down or running out of memory.
In this post, we’ll explore how to handle large file uploads efficiently using streaming and chunking with AWS S3. This way, even the largest files won’t bring your app to its knees.
Here’s what we’ll cover:
- Why handling large files requires special care.
- Streaming large files directly to S3 with minimal memory usage.
- Chunking large files and reassembling them on S3.
- Best practices for large file uploads in a production environment.
Ready to get those large files flying into the cloud? Let’s dive in! 🌥️
Step 1: Why Handling Large Files Is Different
When dealing with large file uploads, the last thing you want is to load an entire file into memory. For smaller files, this is no big deal, but with larger files, you’ll quickly hit the limits of your server’s memory, especially when handling multiple simultaneous uploads.
Streaming and chunking are key techniques that allow you to handle these large files efficiently.
- Streaming: Upload files to S3 as they’re being received by the server, rather than loading the whole file into memory.
- Chunking: Break large files into smaller parts (chunks) and upload each chunk individually. This is especially useful for resuming failed uploads or for uploading in parallel.
Step 2: Streaming Large Files Directly to S3
We’ll use the AWS SDK to stream the file from the user’s upload request directly into S3, minimizing the amount of memory we need on the server.
Updating the Upload Handler
Instead of storing the entire file in memory or on disk before uploading it to S3, we can use streams to send the file in real-time. Let’s modify our existing fileUploadHandler
to handle large files more efficiently.
import (
	"fmt"
	"net/http"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
)

func fileUploadHandler(w http.ResponseWriter, r *http.Request) {
	// Limit the request size (e.g., 10GB max size)
	r.Body = http.MaxBytesReader(w, r.Body, 10<<30)

	// Parse the multipart form data. The 10MB argument is only the in-memory
	// threshold: parts larger than that are spooled to temporary files on disk
	// instead of being held in RAM.
	err := r.ParseMultipartForm(10 << 20)
	if err != nil {
		http.Error(w, "File too large", http.StatusRequestEntityTooLarge)
		return
	}

	// Retrieve the file from the form
	file, handler, err := r.FormFile("file")
	if err != nil {
		http.Error(w, "Error retrieving file", http.StatusBadRequest)
		return
	}
	defer file.Close()

	// Set up AWS session
	sess, err := session.NewSession(&aws.Config{
		Region: aws.String("us-west-1"),
	})
	if err != nil {
		http.Error(w, "Error connecting to AWS", http.StatusInternalServerError)
		return
	}

	// Create the S3 client
	s3Client := s3.New(sess)

	// Send the file to S3. PutObject reads from the multipart file handle,
	// so the whole upload never sits in your application's memory.
	_, err = s3Client.PutObject(&s3.PutObjectInput{
		Bucket: aws.String("your-bucket-name"),
		Key:    aws.String(handler.Filename),
		Body:   file, // io.ReadSeeker backed by memory or a temp file
		ACL:    aws.String("public-read"),
	})
	if err != nil {
		http.Error(w, "Error uploading file to S3", http.StatusInternalServerError)
		return
	}

	fmt.Fprintf(w, "File uploaded successfully to S3!")
}
With this handler, the whole file is never held in memory at once: Go’s multipart parser keeps only small parts in RAM, spills larger ones to a temporary file, and PutObject reads from that handle while sending the data to S3. Your memory footprint stays small no matter how big the upload is, which is a lifesaver for large files!
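If you’d rather avoid the temporary file entirely and stream the request body straight through to S3, here’s a minimal sketch of one way to do it: read the upload with r.MultipartReader() and hand each file part to the SDK’s s3manager uploader, which sends it to S3 in fixed-size buffers. The handler name, bucket, and region are placeholders carried over from the example above; the only genuinely new dependency is the s3manager package.

import (
	"fmt"
	"io"
	"net/http"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3/s3manager"
)

// streamUploadHandler reads the multipart body part by part and hands the
// file part to s3manager.Uploader -- no ParseMultipartForm, no temp files.
func streamUploadHandler(w http.ResponseWriter, r *http.Request) {
	// Limit the request size (e.g., 10GB max size)
	r.Body = http.MaxBytesReader(w, r.Body, 10<<30)

	mr, err := r.MultipartReader()
	if err != nil {
		http.Error(w, "Expected multipart form data", http.StatusBadRequest)
		return
	}

	sess, err := session.NewSession(&aws.Config{
		Region: aws.String("us-west-1"),
	})
	if err != nil {
		http.Error(w, "Error connecting to AWS", http.StatusInternalServerError)
		return
	}
	uploader := s3manager.NewUploader(sess)

	for {
		part, err := mr.NextPart()
		if err == io.EOF {
			break // no more parts in the request
		}
		if err != nil {
			http.Error(w, "Error reading multipart data", http.StatusBadRequest)
			return
		}
		if part.FormName() != "file" {
			continue // skip non-file form fields
		}

		// The uploader accepts a plain io.Reader, so the part is streamed
		// to S3 as it arrives instead of being buffered in full.
		_, err = uploader.Upload(&s3manager.UploadInput{
			Bucket: aws.String("your-bucket-name"),
			Key:    aws.String(part.FileName()),
			Body:   part,
		})
		if err != nil {
			http.Error(w, "Error uploading file to S3", http.StatusInternalServerError)
			return
		}
	}

	fmt.Fprintf(w, "File uploaded successfully to S3!")
}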
Step 3: Chunking Large Files
If you want to take it a step further, you can break files into chunks on the client side and upload them in smaller pieces. This is especially useful for handling flaky connections or massive files, where restarting an upload from scratch would be painful.
Client-Side Chunking Example
On the client side, break the file into smaller chunks and upload each one separately. Here’s an example using JavaScript:
async function uploadFileInChunks(file) {
  const chunkSize = 5 * 1024 * 1024; // 5MB per chunk
  const totalChunks = Math.ceil(file.size / chunkSize);

  for (let i = 0; i < totalChunks; i++) {
    const start = i * chunkSize;
    const end = Math.min(file.size, start + chunkSize);
    const chunk = file.slice(start, end);

    const formData = new FormData();
    formData.append("chunk", chunk);
    formData.append("chunkIndex", i);
    formData.append("filename", file.name);

    await fetch("/upload-chunk", {
      method: "POST",
      body: formData,
    });
  }
}
Server-Side Handling of Chunks
On the server side, you can receive these chunks and upload each one to S3 as its own object:
func chunkUploadHandler(w http.ResponseWriter, r *http.Request) {
	// Parse the multipart form
	err := r.ParseMultipartForm(10 << 20)
	if err != nil {
		http.Error(w, "Error parsing form", http.StatusBadRequest)
		return
	}

	// Retrieve the chunk and file info
	file, _, err := r.FormFile("chunk")
	if err != nil {
		http.Error(w, "Error retrieving chunk", http.StatusBadRequest)
		return
	}
	defer file.Close()

	filename := r.FormValue("filename")
	chunkIndex := r.FormValue("chunkIndex")

	// Set up the AWS session
	sess, err := session.NewSession(&aws.Config{
		Region: aws.String("us-west-1"),
	})
	if err != nil {
		http.Error(w, "Error connecting to AWS", http.StatusInternalServerError)
		return
	}
	s3Client := s3.New(sess)

	// Store this chunk as its own object, named after the original file plus
	// the chunk index. PutObject does not append to an existing object --
	// combining the chunks into one file is a separate step (see the
	// multipart upload sketch below).
	_, err = s3Client.PutObject(&s3.PutObjectInput{
		Bucket: aws.String("your-bucket-name"),
		Key:    aws.String(filename + ".part" + chunkIndex),
		Body:   file,
		ACL:    aws.String("public-read"),
	})
	if err != nil {
		http.Error(w, "Error uploading chunk to S3", http.StatusInternalServerError)
		return
	}

	fmt.Fprintf(w, "Chunk %s uploaded successfully!", chunkIndex)
}
This method lets you upload the chunks of a file independently, retry only the pieces that fail, and parallelize the transfer—perfect for very large uploads where reliability is critical. Keep in mind that the chunks land in S3 as separate objects; to end up with a single file, you stitch them together with S3’s native multipart upload API, sketched below.
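For reference, here’s a rough sketch of what that assembly step can look like with the aws-sdk-go multipart APIs (CreateMultipartUpload, UploadPart, CompleteMultipartUpload). The function name, bucket name, and the idea of having every part reader in one place are simplifications for illustration: in a real chunked-upload flow you’d persist the upload ID and part ETags between requests (for example in a database) and call CompleteMultipartUpload once the final chunk arrives. Also note that S3 requires every part except the last to be at least 5MB.

import (
	"io"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/service/s3"
)

// assembleParts stitches a set of uploaded parts into a single S3 object
// using S3's native multipart upload API. Simplified sketch: it assumes all
// part readers are available in one place.
func assembleParts(s3Client *s3.S3, key string, parts []io.ReadSeeker) error {
	bucket := aws.String("your-bucket-name")

	// 1. Start the multipart upload and remember its ID.
	created, err := s3Client.CreateMultipartUpload(&s3.CreateMultipartUploadInput{
		Bucket: bucket,
		Key:    aws.String(key),
	})
	if err != nil {
		return err
	}

	// 2. Upload each part (part numbers start at 1) and collect the ETags.
	var completed []*s3.CompletedPart
	for i, body := range parts {
		partNumber := aws.Int64(int64(i + 1))
		out, err := s3Client.UploadPart(&s3.UploadPartInput{
			Bucket:     bucket,
			Key:        aws.String(key),
			UploadId:   created.UploadId,
			PartNumber: partNumber,
			Body:       body,
		})
		if err != nil {
			// Abort so S3 doesn't keep (and bill you for) the orphaned parts.
			s3Client.AbortMultipartUpload(&s3.AbortMultipartUploadInput{
				Bucket:   bucket,
				Key:      aws.String(key),
				UploadId: created.UploadId,
			})
			return err
		}
		completed = append(completed, &s3.CompletedPart{
			ETag:       out.ETag,
			PartNumber: partNumber,
		})
	}

	// 3. Tell S3 to combine the parts into one object.
	_, err = s3Client.CompleteMultipartUpload(&s3.CompleteMultipartUploadInput{
		Bucket:          bucket,
		Key:             aws.String(key),
		UploadId:        created.UploadId,
		MultipartUpload: &s3.CompletedMultipartUpload{Parts: completed},
	})
	return err
}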
Step 4: Best Practices for Large File Uploads
- Limit Request Sizes: Always set a reasonable max request size (MaxBytesReader) to prevent users from overwhelming your server.
- Multipart Uploads for S3: AWS S3 supports multipart uploads, which are ideal for large files. You can upload parts in parallel and even resume failed uploads.
- Secure File Uploads: Ensure you validate file types and use secure connections (HTTPS) for file uploads. Sanitize file names to prevent directory traversal attacks (see the sketch after this list).
- Progress Indicators: If you’re chunking files, implement a progress indicator for a better user experience, especially for large files.
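To make the security point concrete, here’s a minimal sketch of the kind of filename check you might run before handing anything to S3. The function name and the allowed-extension list are just example values, not a complete validation strategy—content sniffing, size limits, and virus scanning are separate concerns.

import (
	"fmt"
	"path/filepath"
	"strings"
)

// allowedExtensions is an example whitelist -- adjust it to whatever your
// application actually accepts.
var allowedExtensions = map[string]bool{
	".jpg": true, ".png": true, ".mp4": true, ".zip": true,
}

// sanitizeFilename strips directory components (so "../../etc/passwd"
// becomes "passwd") and rejects extensions that aren't on the whitelist.
func sanitizeFilename(name string) (string, error) {
	base := filepath.Base(name)
	if base == "" || base == "." || base == string(filepath.Separator) {
		return "", fmt.Errorf("invalid file name")
	}
	if !allowedExtensions[strings.ToLower(filepath.Ext(base))] {
		return "", fmt.Errorf("file type %q not allowed", filepath.Ext(base))
	}
	return base, nil
}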
Wrapping Up
Handling large file uploads doesn’t have to be a headache. By using streaming and chunking techniques with Go and S3, you can efficiently manage even the biggest files without tanking your server’s memory. Whether you’re building a file storage service, video platform, or media-heavy app, you’re now equipped to handle massive uploads like a pro. 🎉
Have you implemented large file uploads in your projects? Drop your experience or tips in the comments, and let’s keep the conversation going! 😎