DEV Community

Prashanth Mangipudi
Prashanth Mangipudi

Posted on

How I Exported 1 Million+ Oracle Records to CSV in Under 30 Seconds Using Node.js

The Problem

During the modernization of a legacy enterprise application, one feature initially appeared deceptively simple:

Export data to CSV.

The legacy application was built using Java Struts and Hibernate and supported exporting datasets containing more than one million records.

As part of our modernization effort, I was migrating functionality to a new stack:

  • React frontend
  • Node.js backend
  • Oracle database

One key requirement remained unchanged:

The export had to be synchronous.

Users should click Export and immediately receive a downloadable file.

No background jobs.
No queues.
No polling.
No email notifications.

Just a direct download experience.

Why This Is Hard

The first instinct many developers have is:

  1. Query the database
  2. Load all rows into memory
  3. Convert to CSV
  4. Return the file

For a million records, this approach quickly breaks down.

Problems include:

  • High memory consumption
  • Event loop blocking
  • Slow response times
  • Application crashes
  • Reverse proxy timeouts

The challenge wasn't generating a CSV.

The challenge was generating a CSV while keeping memory usage nearly constant.

Understanding What the Legacy Java Application Was Doing

Although the implementation details were hidden behind Hibernate, the application was effectively relying on a streaming model:

Database Cursor
→ Fetch Records in Chunks
→ Write to Output Stream
→ Send to Browser

The goal was not to reinvent the feature.

The goal was to reproduce the same streaming philosophy in Node.js.

The Architecture I Chose

Instead of loading all records into memory,I designed a fully streamed pipeline.

Oracle ResultSet Cursor
→ Chunk Fetching
→ CSV Stream
→ GZIP Compression
→ HTTP Response Stream
→ Browser Download

Each component processes data incrementally.

At no point does the application hold the entire dataset in memory.

Key Optimization #1: Bypass the ORM

Our application uses TypeORM for standard CRUD operations.

For exports, however, ORM abstractions become expensive.

Entity hydration, decorators, and object transformation introduce unnecessary overhead.

For this feature I bypassed TypeORM entirely and used the native Oracle driver.

This allowed direct access to Oracle ResultSets and cursor-based fetching.

Key Optimization #2: Cursor-Based Fetching

Rather than executing a query and retrieving every row at once, I used Oracle ResultSets with configurable fetch sizes.

The application fetches rows in batches and immediately pushes them downstream for processing.

This keeps memory consumption stable regardless of whether the export contains ten thousand rows or one million rows.

Key Optimization #3: Stream CSV Generation

Instead of creating a giant array and converting it to CSV afterward, records are converted as they are read from Oracle.

This creates a continuous flow:

Database → CSV → Response

No intermediate buffering.

No giant objects.

No memory spikes.

Key Optimization #4: GZIP Compression

One surprising win came from compression.

CSV files compress extremely well.

A large export that might otherwise require hundreds of megabytes can often be reduced dramatically before transmission.

Benefits include:

  • Reduced network transfer time
  • Faster downloads
  • Lower bandwidth consumption

The compression step became an important part of achieving our performance target.

Key Optimization #5: Dedicated Export Connection Pool

Exports have very different behavior compared to normal API traffic.

Long-running queries and large result sets can monopolize database connections.

To avoid impacting application traffic, I introduced a dedicated Oracle connection pool specifically for export workloads.

This isolated export traffic from normal user requests and improved system stability.

Results

After implementing:

  • Native Oracle driver
  • Dedicated export connection pool
  • Cursor-based fetching
  • Streaming CSV generation
  • GZIP compression

I successfully handled exports containing over one million records while maintaining a stable memory footprint and meeting our performance expectations.

Lessons Learned

The biggest lesson was that exporting large datasets is primarily a streaming problem.

Most performance issues come from unnecessary buffering and object creation.

When working with large exports:

  • Stream everything
  • Avoid ORM abstractions where appropriate
  • Fetch data in chunks
  • Compress responses
  • Measure memory usage continuously

Modern Node.js is fully capable of handling enterprise-scale exports when designed around streaming principles.

Final Thoughts

Migration projects often focus on feature parity.

In this case, understanding the architectural philosophy behind the legacy Java implementation was more important than translating code line-by-line.

The result was a Node.js solution that preserved the user experience while leveraging modern streaming capabilities to efficiently handle millions of records.

Top comments (0)