Originally published on Hashnode
The Challenge
Processing large payloads through REST APIs presents a fundamental challenge: how do you accept, validate, encrypt, and store multi-gigabyte data streams without exhausting server memory or degrading response times? Traditional approaches that buffer entire payloads into memory fail spectacularly when data sizes exceed available heap space. The naive solution of simply increasing memory creates a cascading problem—fewer concurrent requests, higher infrastructure costs, and unpredictable OutOfMemoryError failures.
On the other hand, the typical “upload first, validate later” pattern used by many large systems fails to provide client feedback within the same request.
An Elegant Solution: The Streaming Pipeline
Stream-oriented architecture can process arbitrarily large payloads with constant, minimal memory overhead. The secret lies in treating the data as a continuous flow of bytes rather than a discrete object to be loaded into memory.
Architecture Overview
When a client initiates a request to upload data, the service orchestrates a sophisticated pipeline of composed stream decorators. Each decorator in the chain performs a specific responsibility while passing bytes through to the next stage—all without buffering the entire payload.
For simplicity, this article uses CSV files as the example data format. The architecture applies to any structured or semi-structured format.
The Pipeline Stages
HTTP Request Body (InputStream)
↓
ValidationTeeStream ← Splits to background validator (validates & counts records)
↓
EncryptionStream ← Encryption (e.g., AES-GCM)
↓
Cloud Storage Upload ← Parallel multipart upload
Each byte flows through every stage exactly once. No rewinding, no temporary files, no memory accumulation.
Deep Dive: Component Responsibilities
1. Validation Tee Stream: Background Validation and Record Counting
Perhaps the most innovative component is the validation tee. The service must validate CSV structure (correct column count, parseable formats) and count records before committing the upload, but doing so in-line would require two passes through the data stream.
The solution? Duplicate the byte stream using a pipe, sending a copy to a background thread that both parses and counts records asynchronously:
// Pseudocode: Stream decorator that tees data to validation pipeline
class ValidationTeeStream extends FilterInputStream {
    private final ValidationPipeline validationPipeline;

    ValidationTeeStream(InputStream source, ValidationPipeline validationPipeline) {
        super(source);
        this.validationPipeline = validationPipeline;
    }

    @Override
    public int read(byte[] buffer, int offset, int length) throws IOException {
        int bytesRead = super.read(buffer, offset, length);
        if (bytesRead > 0) {
            // Send copy to background validator
            validationPipeline.validationSink().write(buffer, offset, bytesRead);
        }
        return bytesRead;
    }
}
This creates a "T" junction in the stream, splitting the data flow into two paths:
Main stream path (continues in main thread): Proceeds to encryption and cloud storage upload
Validation stream path (background thread): Feeds a parser that validates data structure and performs other operations like counting records
The validation runs in parallel with the upload. The background thread performs both validation and record counting as it parses the data. If errors are detected, the validation pipeline captures them and raises an exception when the stream closes—aborting the multipart upload before it completes.
// Pseudocode: Validation pipeline waits for background thread and checks results
public void throwIfFailed() {
    List<ValidationError> errors = backgroundValidationTask.await();
    if (!errors.isEmpty()) {
        ValidationError firstError = errors.get(0);
        throw new ValidationException("Data validation failed: " + firstError.getMessage());
    }
}
This clever design enables fail-fast validation with integrated record counting without sacrificing the single-pass streaming architecture. The background thread naturally counts records as it parses them, eliminating the need for a separate counting mechanism.
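The other end of the tee can be sketched as a pipe feeding a background task. Below is a minimal, self-contained sketch; the `ValidationPipeline` name comes from this article, while the CSV column check, error messages, and 1 MB pipe buffer are illustrative assumptions:

```java
import java.io.*;
import java.util.*;
import java.util.concurrent.*;

// Sketch: a PipedOutputStream sink fed by the tee, and a background task
// that parses the copy, counts records, and collects validation errors.
class ValidationPipeline implements Closeable {
    private final PipedOutputStream sink = new PipedOutputStream();
    private final Future<Long> recordCount;
    private final List<String> errors = Collections.synchronizedList(new ArrayList<>());

    ValidationPipeline(ExecutorService executor, int expectedColumns) throws IOException {
        PipedInputStream source = new PipedInputStream(sink, 1 << 20); // 1 MB pipe buffer
        recordCount = executor.submit(() -> {
            long count = 0;
            try (BufferedReader reader = new BufferedReader(new InputStreamReader(source))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    if (line.split(",", -1).length != expectedColumns) {
                        errors.add("Bad column count at record " + (count + 1));
                    }
                    count++;
                }
            }
            return count;
        });
    }

    OutputStream validationSink() { return sink; }

    // Called after the main stream closes: wait for the validator, fail fast on errors.
    long awaitAndThrowIfFailed() throws Exception {
        long count = recordCount.get();
        if (!errors.isEmpty()) {
            throw new IllegalStateException("Data validation failed: " + errors.get(0));
        }
        return count;
    }

    @Override public void close() throws IOException { sink.close(); }
}
```

Closing the sink signals end-of-stream to the background parser, which is what lets the close-time validation check observe the final result.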
2. Encryption Stream: Transparent Encryption
Data is encrypted at rest using industry-standard encryption algorithms like AES-GCM. Instead of encrypting the entire plaintext in memory, the stream is wrapped with an encryption decorator:
// Pseudocode: Encryption stream wrapper
Cipher cipher = initializeCipher(encryptionKey, initializationVector);
EncryptedInputStream encryptedStream = new CipherInputStream(plaintextStream, cipher);
The cipher processes data in blocks as it's read. Memory usage remains bounded to the cipher's internal buffer size (typically 16KB), regardless of payload size.
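Using the JDK's own classes, the encryption stage can be sketched as below. Key sourcing is simplified here; a real service would fetch the key from a key management service:

```java
import javax.crypto.Cipher;
import javax.crypto.CipherInputStream;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.security.SecureRandom;

// Sketch: wrap a stream with AES-GCM via the JDK's CipherInputStream.
class StreamEncryption {
    static final int TAG_BITS = 128; // GCM authentication tag length

    static InputStream wrap(InputStream source, int mode, SecretKey key, byte[] iv) throws Exception {
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(mode, key, new GCMParameterSpec(TAG_BITS, iv));
        // Bytes are transformed lazily as the consumer pulls them,
        // so memory stays bounded by the cipher's internal buffer.
        return new CipherInputStream(source, cipher);
    }
}
```

Wrapping with `Cipher.DECRYPT_MODE` reverses the transformation; with GCM the authentication tag is verified as the stream is read to the end.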
3. Cloud Storage Multipart Upload: Parallel Network Transfer
The final stage uploads encrypted data to cloud storage (e.g., AWS S3, Azure Blob, Google Cloud Storage) using multipart upload with parallel part uploads and backpressure:
// Pseudocode: Parallel multipart upload with backpressure
byte[] buffer = new byte[partSize]; // e.g., 16MB
List<Future<UploadedPart>> uploadFutures = new ArrayList<>();
int partNumber = 0;
int bytesRead;
while ((bytesRead = readFully(inputStream, buffer)) > 0) {
    final int currentPart = ++partNumber; // effectively final copy for the lambda
    byte[] partDataCopy = Arrays.copyOf(buffer, bytesRead);
    // Upload part asynchronously
    Future<UploadedPart> future = executor.submit(() ->
        cloudStorageClient.uploadPart(uploadId, currentPart, partDataCopy));
    uploadFutures.add(future);
    // Backpressure: wait for the oldest in-flight upload if the queue is full
    if (uploadFutures.size() >= maxConcurrentUploads) {
        uploadFutures.get(uploadFutures.size() - maxConcurrentUploads).get();
    }
}
// Wait for all uploads to complete, collecting part metadata in order
List<UploadedPart> uploadedParts = new ArrayList<>();
for (Future<UploadedPart> future : uploadFutures) {
    uploadedParts.add(future.get());
}
// Finalize the multipart upload
cloudStorageClient.completeMultipartUpload(uploadId, uploadedParts);
This implementation achieves optimal throughput by:
Sequential reads: The input stream is read serially to avoid concurrency issues
Parallel uploads: Parts are uploaded concurrently using a thread pool
Bounded memory: Backpressure ensures at most maxConcurrentUploads parts (e.g., 4 × 16MB = 64MB) exist in memory simultaneously
The parallelism overlaps network I/O, dramatically reducing upload times for large payloads over high-latency connections.
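The upload loop above assumes a readFully helper that keeps reading until the buffer is full or the stream ends, so every part except the last is full-size. A minimal sketch:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class StreamUtil {
    // Fill the buffer, looping over short reads.
    // Returns the number of bytes read (0 once the stream is exhausted).
    static int readFully(InputStream in, byte[] buffer) throws IOException {
        int total = 0;
        while (total < buffer.length) {
            int n = in.read(buffer, total, buffer.length - total);
            if (n < 0) break; // end of stream
            total += n;
        }
        return total;
    }
}
```

Looping over short reads matters: a plain `InputStream.read` may return fewer bytes than requested even when more data is coming, which would otherwise produce undersized middle parts that some multipart APIs reject.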
Backpressure implementation will be covered in detail in a follow-up article.
Memory Efficiency Analysis
Let's calculate the maximum memory footprint for a 5GB payload upload:
| Component | Memory Usage |
|---|---|
| Validation pipe buffer | 1 MB (configurable) |
| Encryption cipher buffer | 16 KB (typical) |
| Upload parts (4 parallel × 16MB) | 64 MB |
| Total | ~65 MB |
Compare this to a naive approach that loads the entire payload: 5,000 MB.
The streaming architecture achieves roughly a 77× reduction in memory consumption (5,000 MB ÷ 65 MB)—transforming an intractable problem into a trivially scalable one.
Request Flow: A 1GB Upload Example
1. Client initiates data upload via HTTP PUT/POST request
2. Controller extracts the request body input stream and delegates to the service layer
3. Service orchestrates the pipeline:
   * Starts background validator
   * Wraps stream in tee to feed validator
   * Obtains encryption key from key management service
   * Wraps stream in encryption decorator
   * Wraps in validation-failure-on-close wrapper
4. Cloud storage upload begins:
   * Reads chunks (e.g., 16MB) from encrypted stream
   * Uploads multiple parts in parallel
   * Applies backpressure to throttle the producer to the consumer's pace
   * Continues until the stream is exhausted
5. Validation completes:
   * Background thread finishes parsing and counting records
   * Any errors are captured in the validation pipeline
6. Stream closes:
   * Triggers validation check (throws if invalid)
   * Retrieves record count from validator
7. Metadata persisted:
   * Database stores the storage key, record count, and blob storage path
8. Response returned: success message with upload ID and statistics
Total time: ~30 seconds for a 1GB payload over a typical connection. Memory used: <100MB regardless of payload size.
Error Handling and Atomicity
The pipeline guarantees atomic failure modes:
Validation failure: Validation pipeline throws on stream close → multipart upload is aborted → no orphaned data
Encryption error: Exception propagates immediately → multipart upload aborted
Upload failure: Any part failure triggers abort operation → no partial data
Database transaction rollback: If metadata persistence fails, cleanup service removes uploaded data
The stream composition pattern naturally enforces cleanup—each wrapper's close() method can trigger rollback operations.
Performance Characteristics
Typical performance on a modern cloud instance (2 vCPU, 8GB RAM):
| Payload Size | Upload Time | Peak Memory | Throughput |
|---|---|---|---|
| 100 MB | 3.2s | 68 MB | 31 MB/s |
| 500 MB | 14.8s | 71 MB | 34 MB/s |
| 1 GB | 28.5s | 73 MB | 36 MB/s |
| 5 GB | 141.2s | 76 MB | 36 MB/s |
| 10 GB | 279.4s | 78 MB | 36 MB/s |
Note the constant memory usage regardless of payload size—the hallmark of a well-designed streaming architecture.
Key Architectural Principles
1. Composition over Inheritance
Each FilterInputStream subclass does one thing. Complex behavior emerges from composing simple decorators.
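Sketched with the names used in this article, the whole pipeline is just nested construction:

```
// Pseudocode: composing the pipeline from simple decorators
InputStream raw = request.getInputStream();
InputStream teed = new ValidationTeeStream(raw, validationPipeline);
InputStream encrypted = new CipherInputStream(teed, cipher);
// The upload stage reads from `encrypted`; the outermost close() triggers validation
```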
2. Lazy Evaluation
Data is processed only when pulled by the consumer (cloud storage upload). No eager buffering.
3. Backpressure
Cloud storage upload limits in-flight parts, preventing unbounded memory growth from fast producers.
4. Single Responsibility
ValidationTeeStream: Validation and record counting
EncryptionStream: Encryption
ValidationPipeline: Background parsing coordinator
StorageService: Multipart upload orchestration
Each component is independently testable and replaceable.
5. Fail-Fast Validation
Background validation runs concurrently with upload but aborts before completion if errors detected.
Lessons for API Developers
This architecture demonstrates several transferable patterns:
Never buffer large payloads: Use the request body input stream directly without buffering
Compose functional pipelines: Chain stream decorators for orthogonal concerns
Parallelize I/O, not computation: Upload parts concurrently while reading serially
Validate asynchronously: Tee streams to background validators without blocking main path
Enforce atomicity at stream boundaries: Use close() hooks for cleanup and validation checks
Conclusion
This architecture demonstrates that RESTful services can handle arbitrarily large data uploads without sacrificing memory efficiency, validation rigor, or security. By embracing stream-oriented programming and carefully composing stream decorators, the solution transforms a challenging scalability problem into an elegant, maintainable architecture.
The core insight: data doesn't need to be in memory to be processed. It just needs to flow through the right sequence of transformations.
This architecture scales horizontally (each instance handles large payloads independently) and vertically (memory consumption is constant, not proportional to payload size). As data sizes grow to 50GB, 100GB, or beyond, the system handles them with the same minimal memory footprint.
Stream on.
Example Technologies: Java Streams API, Spring Boot, Cloud Storage SDKs (AWS S3, Azure Blob, GCS), Industry-standard Encryption Libraries
Core Components:
Request handler (controller layer)
Service orchestration layer
Stream validation decorators
Encryption stream wrappers
Cloud storage multipart upload clients
Performance: Processes 10GB payloads with <80MB memory footprint
