Neel Patel

Handling Large File Uploads in Go with AWS S3: Stream Like a Pro

In a previous post, we built a file upload service using Go with local storage and Amazon S3 for cloud-based storage. But what if you need to handle large files (think multi-gigabyte video files or huge datasets)? 😅 That’s where things get tricky. You don’t want your server bogging down or running out of memory.

In this post, we’ll explore how to handle large file uploads efficiently using streaming and chunking with AWS S3. This way, even the largest files won’t bring your app to its knees.

Here’s what we’ll cover:

  1. Why handling large files requires special care.
  2. Streaming large files directly to S3 with minimal memory usage.
  3. Chunking large files and reassembling them on S3.
  4. Best practices for large file uploads in a production environment.

Ready to get those large files flying into the cloud? Let’s dive in! 🌥️


Step 1: Why Handling Large Files Is Different

When dealing with large file uploads, the last thing you want is to load an entire file into memory. For smaller files, this is no big deal, but with larger files, you’ll quickly hit the limits of your server’s memory, especially when handling multiple simultaneous uploads.

Streaming and chunking are key techniques that allow you to handle these large files efficiently.

  • Streaming: Upload files to S3 as they’re being received by the server, rather than loading the whole file into memory.
  • Chunking: Break large files into smaller parts (chunks) and upload each chunk individually. This is especially useful for resuming failed uploads or for uploading in parallel.

Step 2: Streaming Large Files Directly to S3

We’ll use the AWS SDK to stream the file from the user’s upload request directly into S3, minimizing the amount of memory we need on the server.

Updating the Upload Handler

Instead of loading the entire file into memory or saving our own copy to disk before uploading it to S3, we can pass the uploaded file through to S3 as we read it. Let’s modify our existing fileUploadHandler to handle large files more efficiently.

import (
    "fmt"
    "net/http"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3"
)

func fileUploadHandler(w http.ResponseWriter, r *http.Request) {
    // Limit the request size (e.g., 10GB max size)
    r.Body = http.MaxBytesReader(w, r.Body, 10<<30)

    // Parse the multipart form data (up to 10 MB is kept in memory;
    // larger parts are buffered to temporary files on disk)
    err := r.ParseMultipartForm(10 << 20)
    if err != nil {
        http.Error(w, "File too large", http.StatusRequestEntityTooLarge)
        return
    }

    // Retrieve the file from the form
    file, handler, err := r.FormFile("file")
    if err != nil {
        http.Error(w, "Error retrieving file", http.StatusBadRequest)
        return
    }
    defer file.Close()

    // Set up AWS session
    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-west-1"),
    })
    if err != nil {
        http.Error(w, "Error connecting to AWS", http.StatusInternalServerError)
        return
    }

    // Create the S3 client
    s3Client := s3.New(sess)

    // Upload the file to S3 without building another copy of it in memory
    _, err = s3Client.PutObject(&s3.PutObjectInput{
        Bucket: aws.String("your-bucket-name"),
        Key:    aws.String(handler.Filename),
        Body:   file, // multipart.File is an io.ReadSeeker, so S3 reads it as a stream
        ACL:    aws.String("public-read"),
    })
    if err != nil {
        http.Error(w, "Error uploading file to S3", http.StatusInternalServerError)
        return
    }

    fmt.Fprintf(w, "File uploaded successfully to S3!")
}

In this approach, the handler never builds an in-memory copy of the whole file: ParseMultipartForm keeps only the first 10 MB in RAM and buffers the rest to a temporary file, and PutObject then streams the file from there to S3. That alone is a lifesaver for large files!
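
If you want true end-to-end streaming with no temporary files at all, one option is to skip ParseMultipartForm entirely, read the request with r.MultipartReader(), and hand each file part to the SDK’s s3manager.Uploader, which performs a concurrent multipart upload under the hood. Here’s a minimal sketch of that approach; the streamUploadHandler name, bucket, and region are placeholders you’d swap for your own:

import (
    "fmt"
    "io"
    "net/http"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/s3/s3manager"
)

func streamUploadHandler(w http.ResponseWriter, r *http.Request) {
    // Cap the request size (e.g., 10GB max)
    r.Body = http.MaxBytesReader(w, r.Body, 10<<30)

    // MultipartReader gives us the raw multipart stream;
    // don't call ParseMultipartForm first or the body gets buffered
    reader, err := r.MultipartReader()
    if err != nil {
        http.Error(w, "Expected multipart form data", http.StatusBadRequest)
        return
    }

    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-west-1"),
    })
    if err != nil {
        http.Error(w, "Error connecting to AWS", http.StatusInternalServerError)
        return
    }

    // The uploader splits the stream into parts and uploads them concurrently
    uploader := s3manager.NewUploader(sess)

    for {
        part, err := reader.NextPart()
        if err == io.EOF {
            break // no more parts
        }
        if err != nil {
            http.Error(w, "Error reading multipart stream", http.StatusBadRequest)
            return
        }
        if part.FormName() != "file" || part.FileName() == "" {
            continue // skip non-file fields
        }

        // part is an io.Reader, so the uploader consumes it as it arrives
        _, err = uploader.Upload(&s3manager.UploadInput{
            Bucket: aws.String("your-bucket-name"),
            Key:    aws.String(part.FileName()),
            Body:   part,
        })
        if err != nil {
            http.Error(w, "Error uploading file to S3", http.StatusInternalServerError)
            return
        }
    }

    fmt.Fprintf(w, "File uploaded successfully to S3!")
}

With this version, nothing close to the full file ever sits in memory or on disk: the uploader buffers data in small parts (5 MB each by default) and sends them to S3 as fast as it reads them.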


Step 3: Chunking Large Files

If you want to take it a step further, you can break files into chunks on the client side and upload them in smaller pieces. This is especially useful for handling flaky connections or massive files, where restarting an upload from scratch would be painful.

Client-Side Chunking Example

On the client side, break the file into smaller chunks and upload each one separately. Here’s an example using JavaScript:

async function uploadFileInChunks(file) {
  const chunkSize = 5 * 1024 * 1024; // 5MB per chunk
  const totalChunks = Math.ceil(file.size / chunkSize);

  for (let i = 0; i < totalChunks; i++) {
    const start = i * chunkSize;
    const end = Math.min(file.size, start + chunkSize);
    const chunk = file.slice(start, end);

    const formData = new FormData();
    formData.append("chunk", chunk);
    formData.append("chunkIndex", i);
    formData.append("filename", file.name);

    await fetch("/upload-chunk", {
      method: "POST",
      body: formData,
    });
  }
}

Server-Side Handling of Chunks

On the server side, you receive each chunk and store it in S3 as its own object; once every chunk has arrived, they can be stitched back together (more on that below):

func chunkUploadHandler(w http.ResponseWriter, r *http.Request) {
    // Parse the multipart form
    err := r.ParseMultipartForm(10 << 20)
    if err != nil {
        http.Error(w, "Error parsing form", http.StatusBadRequest)
        return
    }

    // Retrieve the chunk and file info
    file, _, err := r.FormFile("chunk")
    if err != nil {
        http.Error(w, "Error retrieving chunk", http.StatusBadRequest)
        return
    }
    defer file.Close()

    filename := r.FormValue("filename")
    chunkIndex := r.FormValue("chunkIndex")

    // Set up the AWS session
    sess, err := session.NewSession(&aws.Config{
        Region: aws.String("us-west-1"),
    })
    if err != nil {
        http.Error(w, "Error connecting to AWS", http.StatusInternalServerError)
        return
    }

    s3Client := s3.New(sess)

    // Store this chunk as its own object in S3; the chunks can be
    // stitched back together later with S3's multipart upload API
    _, err = s3Client.PutObject(&s3.PutObjectInput{
        Bucket: aws.String("your-bucket-name"),
        Key:    aws.String(fmt.Sprintf("%s.part%s", filename, chunkIndex)), // e.g. "video.mp4.part3", so chunks don't collide
        Body:   file,
        ACL:    aws.String("public-read"),
    })
    if err != nil {
        http.Error(w, "Error uploading chunk to S3", http.StatusInternalServerError)
        return
    }

    fmt.Fprintf(w, "Chunk %s uploaded successfully!", chunkIndex)
}

This method lets you upload the chunks of a file independently and then reassemble them in the cloud, which is perfect for very large uploads where reliability is critical.
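
For completeness, here’s a minimal sketch of that reassembly step using S3’s multipart upload API together with UploadPartCopy, which builds the final object out of the chunk objects server-side without re-downloading them. The completeUpload name and the totalChunks parameter are illustrative; in practice you’d track the chunk count yourself and call this from a “finish upload” endpoint:

// completeUpload stitches the chunk objects ("video.mp4.part0",
// "video.mp4.part1", ...) back into a single S3 object.
func completeUpload(s3Client *s3.S3, bucket, filename string, totalChunks int) error {
    // 1. Start a multipart upload for the final object
    create, err := s3Client.CreateMultipartUpload(&s3.CreateMultipartUploadInput{
        Bucket: aws.String(bucket),
        Key:    aws.String(filename),
    })
    if err != nil {
        return err
    }

    // 2. Copy each chunk object into the upload as a numbered part
    var parts []*s3.CompletedPart
    for i := 0; i < totalChunks; i++ {
        partNumber := int64(i + 1) // S3 part numbers start at 1
        copyResult, err := s3Client.UploadPartCopy(&s3.UploadPartCopyInput{
            Bucket:     aws.String(bucket),
            Key:        aws.String(filename),
            UploadId:   create.UploadId,
            PartNumber: aws.Int64(partNumber),
            CopySource: aws.String(fmt.Sprintf("%s/%s.part%d", bucket, filename, i)),
        })
        if err != nil {
            return err
        }
        parts = append(parts, &s3.CompletedPart{
            ETag:       copyResult.CopyPartResult.ETag,
            PartNumber: aws.Int64(partNumber),
        })
    }

    // 3. Complete the upload; S3 assembles the final object
    _, err = s3Client.CompleteMultipartUpload(&s3.CompleteMultipartUploadInput{
        Bucket:   aws.String(bucket),
        Key:      aws.String(filename),
        UploadId: create.UploadId,
        MultipartUpload: &s3.CompletedMultipartUpload{
            Parts: parts,
        },
    })
    return err
}

One caveat: S3 requires every part except the last to be at least 5 MB, which is why the client-side chunk size above is set to 5 MB. Once the multipart upload completes, you can delete the intermediate .partN objects.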


Step 4: Best Practices for Large File Uploads

  1. Limit Request Sizes: Always set a reasonable max request size (MaxBytesReader) to prevent users from overwhelming your server.

  2. Multipart Uploads for S3: AWS S3 supports multipart uploads, which are ideal for large files: you can upload parts in parallel and resume failed uploads. The s3manager.Uploader and UploadPartCopy sketches above both build on this API.

  3. Secure File Uploads: Validate file types, use secure connections (HTTPS) for uploads, and sanitize file names to prevent directory traversal attacks (see the sketch after this list).

  4. Progress Indicators: If you’re chunking files, implement a progress indicator for a better user experience, especially for large files.
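
To make point 3 a bit more concrete, here’s a small sketch of the kind of checks you might run before handing a file to S3. The sanitizeFilename and validateUpload helpers and the allowedTypes list are illustrative, not part of any library; tailor them to what your app actually accepts:

import (
    "fmt"
    "io"
    "net/http"
    "path/filepath"
    "strings"
)

// Whitelist of MIME types your app is willing to store
var allowedTypes = map[string]bool{
    "video/mp4":       true,
    "application/zip": true,
    "image/png":       true,
}

// sanitizeFilename strips any path components, so "../../etc/passwd"
// becomes just "passwd"
func sanitizeFilename(name string) string {
    return filepath.Base(strings.ReplaceAll(name, "\\", "/"))
}

// validateUpload sniffs the first 512 bytes to detect the real content type
// instead of trusting the Content-Type header sent by the client
func validateUpload(file io.ReadSeeker) (string, error) {
    buf := make([]byte, 512)
    n, err := file.Read(buf)
    if err != nil && err != io.EOF {
        return "", err
    }
    contentType := http.DetectContentType(buf[:n])

    // Rewind so the upload to S3 starts from the beginning of the file
    if _, err := file.Seek(0, io.SeekStart); err != nil {
        return "", err
    }

    if !allowedTypes[contentType] {
        return "", fmt.Errorf("file type %s is not allowed", contentType)
    }
    return contentType, nil
}

In fileUploadHandler you’d call sanitizeFilename(handler.Filename) when building the S3 key and validateUpload(file) right after FormFile, rejecting the request if either check fails.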


Wrapping Up

Handling large file uploads doesn’t have to be a headache. By using streaming and chunking techniques with Go and S3, you can efficiently manage even the biggest files without tanking your server’s memory. Whether you’re building a file storage service, video platform, or media-heavy app, you’re now equipped to handle massive uploads like a pro. 🎉

Have you implemented large file uploads in your projects? Drop your experience or tips in the comments, and let’s keep the conversation going! 😎
