How we made our SQL database QuestDB even faster and more accurate

Nicolas Hourcard ・ Originally published at questdb.io ・ 1 min read

See our article here

About a month ago, we posted about using SIMD instructions to make aggregation calculations faster.

Many comments suggested that we implement compensated summation (aka Kahan summation), as the naive method can produce inaccurate and unreliable results. So we spent some time integrating the Kahan and Neumaier summation algorithms. This post summarises a few things we learned along the way.

We thought Kahan would badly affect the performance since it uses 4x as many operations as the naive approach. However, some comments also suggested we could use prefetch and co-routines to pull the data from RAM to cache in parallel with other CPU instructions. We got phenomenal results thanks to these suggestions, with Kahan sums nearly as fast as the naive approach.
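To make the trade-off concrete, here is a minimal scalar sketch of the three approaches in Python. It is only an illustration of the textbook algorithms, not QuestDB's vectorized implementation, and the function names are made up for this example. Note that the Kahan loop body performs four floating-point operations per element where the naive loop performs one.

```python
def naive_sum(values):
    """Plain left-to-right summation: one addition per element."""
    total = 0.0
    for x in values:
        total += x
    return total


def kahan_sum(values):
    """Kahan compensated summation: four floating-point ops per element."""
    total = 0.0
    c = 0.0  # running compensation for lost low-order bits
    for x in values:
        y = x - c            # apply the correction carried from the last step
        t = total + y        # low-order bits of y may be lost here
        c = (t - total) - y  # recover what was lost (algebraically zero)
        total = t
    return total


def neumaier_sum(values):
    """Neumaier's variant: also handles an element larger than the running sum."""
    total = 0.0
    c = 0.0  # accumulated correction, applied once at the end
    for x in values:
        t = total + x
        if abs(total) >= abs(x):
            c += (total - t) + x  # low-order bits of x were lost
        else:
            c += (x - t) + total  # low-order bits of total were lost
        total = t
    return total + c


values = [1.0, 1e100, 1.0, -1e100]  # the exact sum is 2.0
print(naive_sum(values), kahan_sum(values), neumaier_sum(values))
```

On this input the naive and Kahan sums both come out as 0.0, while the Neumaier sum returns the exact 2.0, which is one reason to offer both compensated variants.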

A lot of you also asked if we could compare this with Clickhouse. As they implement Kahan summation, we ran a quick comparison. Here's what we got for summing 1bn doubles with nulls using the Kahan algorithm. The details of how this was done are in the post.

QuestDB: 68ms
Clickhouse: 139ms

Thanks for reading and please leave us a star if you find the project interesting!

Nic

Nicolas Hourcard

@nicquestdb

Co-founder of QuestDB, the fastest open source time-series database

Discussion


Hi Nicolas,

I have been following QuestDB for some time now. I use QuestDB as an example of a well-written, minimal-dependency Java project.

It would be great if you could link the commits/diffs for the Kahan and Neumaier summations too in this post so we can look into the changes required for such an undertaking.

Another idea for a blog post would be tips on how to vectorize Java code. AFAIK, either the JVM auto-vectorizes the code or we need to call C++ code via JNI.

Thanks,
Raunak

 

Hi Raunak,

Thank you for the kind comments. Here is the commit diff that added the Kahan and Neumaier vector summations.

Vlad

 

Awesome one Nicolas!
cc Vlad

You all are amazing!⚡️