Don't store data you don't need. It sounds obvious, but a lot of the data you end up reading later is not actually useful.
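A quick stdlib sketch of the idea: project records down to the fields the pipeline actually uses before writing them out. The field names and record shape here are hypothetical, just for illustration.

```python
import json

# Hypothetical raw records: only user_id and amount are used downstream;
# the rest is dead weight that inflates storage and every later read.
raw = [
    {"user_id": i, "amount": i * 0.5, "debug_trace": "x" * 200, "raw_headers": "y" * 100}
    for i in range(1_000)
]

NEEDED = {"user_id", "amount"}  # the fields the pipeline actually consumes
slim = [{k: v for k, v in rec.items() if k in NEEDED} for rec in raw]

raw_bytes = len(json.dumps(raw))
slim_bytes = len(json.dumps(slim))
print(f"raw: {raw_bytes} bytes, slim: {slim_bytes} bytes")
```

Dropping the unused fields at write time shrinks every read that follows, not just the storage bill.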
Don't read the data you don't need. Discard it as early as possible, using indices or whatever other tools your database or framework provides.
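A minimal sketch with SQLite (Python's stdlib `sqlite3`): an index lets the engine jump straight to the rows you want instead of scanning the whole table. The table and index names are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 100, "payload") for i in range(10_000)],
)
conn.execute("CREATE INDEX idx_events_user ON events(user_id)")

# EXPLAIN QUERY PLAN confirms the engine does an index SEARCH,
# not a full table SCAN, so the non-matching rows are never read.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42"
).fetchall()
print(plan)
```

Without the index, the same query would read all 10,000 rows to return roughly 100.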
Run heavy operations later. For example, filtering data is cheaper than aggregating it, so when processing data, always filter first and leave the other heavy steps (joins, aggregations, and so on) for later.
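A toy example of the ordering, with made-up (category, value) rows: both orderings give the same answer, but filtering first means the aggregation only ever touches the matching rows.

```python
from collections import defaultdict

# Hypothetical rows; we only care about category "A" (1 row in 10).
rows = [("A" if i % 10 == 0 else "B", i) for i in range(100_000)]

# Filter first, then aggregate: the sum only touches matching rows.
filtered = [v for cat, v in rows if cat == "A"]
total = sum(filtered)

# Aggregating everything and selecting afterwards gives the same result,
# but the heavy step processes 10x more rows than it needs to.
totals = defaultdict(int)
for cat, v in rows:
    totals[cat] += v
assert total == totals["A"]
print(len(filtered), "rows aggregated instead of", len(rows))
```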
Sort your data before storing it. Sorted data compresses much better, and you get the full power of current hardware (sequential reads are roughly 100x faster than random access).
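The compression effect is easy to demonstrate with the stdlib: the same values, compressed in shuffled versus sorted order. Sorting groups equal values into long runs, which a codec like zlib eats for breakfast.

```python
import random
import zlib

# The same 100,000 values from a small alphabet, two different orders.
random.seed(0)
values = random.choices(range(32), k=100_000)

shuffled = bytes(values)          # values in arrival order
ordered = bytes(sorted(values))   # same values, sorted before storing

shuffled_size = len(zlib.compress(shuffled))
ordered_size = len(zlib.compress(ordered))
print(f"shuffled: {shuffled_size} bytes, sorted: {ordered_size} bytes")
```

The exact ratio depends on the data, but the sorted copy is typically smaller by an order of magnitude or more.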
Following these four rules, I process large datasets 100-1000x faster than I used to.
(image from craiyon.com generated with "f1 going fast")