This post was originally published on The Capsized Eight Blog
Dealing with various sources of data in web applications requires us to build services that extract information from CSV, Excel, and other file types. In those cases, it's best to use an existing library, or, if your backend is on Rails, a gem. There are many gems with very cool features, such as Roo. You can also use Ruby's plain CSV library.
Either way, small CSV files are handled easily. But what if you need to import a large CSV file (~100 MB / ~1M rows)?
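The plain-Ruby baseline looks roughly like this minimal sketch: parse the file with the standard CSV library and turn each row into a hash. The sample data and the `parse_forecast_csv` helper are illustrative, not from the original post; in a Rails app, each hash could feed something like `Forecast.create!(row)` (a hypothetical model).

```ruby
require "csv"

# Parse CSV text into an array of row hashes keyed by header names.
# In a Rails app each hash could be passed to a hypothetical model,
# e.g. Forecast.create!(row) -- one INSERT per row, which is what
# makes this approach slow for large files.
def parse_forecast_csv(csv_text)
  CSV.parse(csv_text, headers: true).map(&:to_h)
end

sample = <<~CSV
  city,temperature
  Zagreb,21
  Split,27
CSV

rows = parse_forecast_csv(sample)
# Each element of rows is a plain Hash, e.g. {"city" => "Zagreb", ...}
```

This works fine for small files, but issuing one `INSERT` per row is exactly what becomes painful at ~1M rows.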
I found that PostgreSQL has a really powerful yet very simple command called COPY, which copies data between a file and a database table.
It can be used both ways:
- to import data from a CSV file into a database table
- to export data from a database table to a CSV file.
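Both directions can be driven from Ruby by building the SQL and handing it to the database. This is a sketch under assumptions: the `forecasts` table name and file paths are placeholders, and in a Rails app you might execute the resulting string with `ActiveRecord::Base.connection.execute(sql)`.

```ruby
# Build a COPY statement that imports a CSV file into a table.
# Assumes the file's columns match the table's columns in order.
def copy_import_sql(table, path)
  "COPY #{table} FROM '#{path}' CSV HEADER;"
end

# Build the reverse: export a table's contents to a CSV file.
def copy_export_sql(table, path)
  "COPY #{table} TO '#{path}' CSV HEADER;"
end

sql = copy_import_sql("forecasts", "tmp/forecast.csv")
# In Rails: ActiveRecord::Base.connection.execute(sql)
```

Note that server-side `COPY ... FROM` reads the file on the database server, so the path must be readable by the PostgreSQL server process.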
Example of usage:
COPY forecasts FROM 'tmp/forecast.csv' CSV HEADER;
This piece of SQL code will import the content of a CSV file into our
forecasts table. Note one thing: it assumes that the number and order of columns in the table match those in the CSV file.
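One way to loosen the column-order assumption (a sketch of my own, not from the original post) is to read the file's header row and pass an explicit column list to COPY, which PostgreSQL supports as `COPY table (col1, col2, ...) FROM ...`. The table name and the throwaway file below are placeholders.

```ruby
require "csv"
require "tempfile"

# Read the CSV header and build a COPY statement with an explicit
# column list, so the file's column order need not match the table's.
def copy_sql_with_columns(table, path)
  header = CSV.open(path, &:first) # first line holds the column names
  "COPY #{table} (#{header.join(', ')}) FROM '#{path}' CSV HEADER;"
end

# Demo with a throwaway file standing in for tmp/forecast.csv.
file = Tempfile.new(["forecast", ".csv"])
file.write("city,temperature\nZagreb,21\n")
file.flush
sql = copy_sql_with_columns("forecasts", file.path)
```

This only helps with ordering and subsetting; the listed columns must still exist in the table, and values must be castable to their types.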
Importing a CSV file with ~1M rows now takes under 4 seconds, which is blazing fast compared to the previous solutions!
For more details, read the original blogpost.