DEV Community

Discussion on: Beyond CSV files: using Apache Parquet columnar files with Dask to reduce storage and increase performance. Try it now!

Paddy3118 • Edited

Sadly, I don't use Dask, but in the past I have used zcat on the Linux command line to stream a compressed CSV into a script's stdin. The script can then process the data without the whole file ever being uncompressed in memory or on disk.
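
For anyone who wants to try the zcat approach, here is a minimal sketch of what the receiving script could look like. The function name `count_rows` and the idea of simply counting rows are my own placeholders, not anything from the article; the point is only that `csv.reader` pulls one row at a time from the stream, so memory use stays flat regardless of file size.

```python
import csv
import sys


def count_rows(stream):
    """Read CSV rows one at a time from `stream`; return (header, data_row_count).

    `stream` can be any iterable of text lines, such as sys.stdin.
    Counting rows is a placeholder for whatever per-row work you need.
    """
    reader = csv.reader(stream)
    header = next(reader, None)  # None if the input is empty
    rows = 0
    for _row in reader:
        rows += 1  # replace with real per-row processing
    return header, rows


# Only read stdin when something is actually piped in.
if __name__ == "__main__" and not sys.stdin.isatty():
    header, rows = count_rows(sys.stdin)
    print(f"columns={header} rows={rows}")
```

Assuming the script is saved as process.py and the data is in data.csv.gz (both hypothetical names), it would be run as `zcat data.csv.gz | python process.py`.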

Jorge PM

Cool, I can totally see a use case for that: streaming into something like Apache Kafka. I will prototype a couple of things and see if it can become another little article. Thanks!