Dario Sindičić

QPP - My final thesis - first part

QPP, or query performance prediction, is an interesting field in which you try to predict how long a given SQL query will run. If you can do that, you can do load balancing, offer SLA contracts with performance guarantees, or refuse to run a query when you see that it would run forever. The problem is very tricky and has been investigated since the 1980s. Many different approaches have been shown to work in lab conditions. They can be divided into two main categories: the first one includes machine learning, and the second one doesn't.
I wanted to investigate the second one because, sadly, I see that this part of computer science is being forgotten. My task was to build a QPP predictor that works in a load-balancing scenario. A load-balancing scenario implies that QPP doesn't have to be 100% accurate. For example, it doesn't have to say that a query will run for 2 seconds, but it does have to say that query 1 will run two times faster than query 2. So the output of QPP only has to be linearly related to the running time of the query.
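To see why a relative score is enough for load balancing, here is a minimal sketch in Python. It is not the thesis implementation: the `assign_queries` helper, the greedy "least predicted load" rule and the example scores are all assumptions made for illustration. It shows that multiplying every prediction by the same constant does not change where the queries end up.

```python
def assign_queries(predicted_costs, n_servers):
    """Greedy dispatch: send each query to the server with the least predicted load."""
    loads = [0.0] * n_servers
    assignment = []
    for cost in predicted_costs:
        # Pick the server whose accumulated predicted load is smallest so far.
        target = min(range(n_servers), key=lambda s: loads[s])
        loads[target] += cost
        assignment.append(target)
    return assignment


# Hypothetical QPP outputs for five queries, in arbitrary units.
scores = [4.0, 2.0, 7.0, 1.0, 3.0]

# The same scores rescaled by a constant factor, as if they were real seconds.
# (A power of two keeps the floating point arithmetic exact.)
scaled = [s * 8 for s in scores]

same_placement = assign_queries(scores, 2) == assign_queries(scaled, 2)
print(same_placement)
```

The dispatcher only ever compares loads with each other, so the unknown constant between the score and the real running time cancels out.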
A query plan is the plan by which the RDBMS will execute a query. This plan can be used to predict the query's running time. Wu2013 suggests that calibration queries can be used to measure the cost of the basic operations in an RDBMS, like seq page cost, cpu tuple cost, cpu index tuple cost and cpu operator cost. After that, join order is one of the most important things. In order to reduce cardinality error, hash join or merge join is applied before nested loop join.
This is the first part, in which I wanted to introduce basic information about my final thesis and get you into the QPP field if you aren't already familiar with it. In the next blog post I will post performance results compared to the round-robin and least-tasks policies.
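Until then, the calibration idea mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method from Wu2013 or from the thesis: the helpers `calibrate_unit` and `predict_time`, the underscore spelling of the cost unit names, and every number below are made up. The idea is that each calibration query exercises one basic operation, which gives a cost per operation, and a plan's predicted time is the weighted sum of how often it performs each operation.

```python
def calibrate_unit(elapsed_seconds, operation_count):
    """Estimate one cost unit from a calibration query that exercises only that operation."""
    return elapsed_seconds / operation_count


def predict_time(operation_counts, units):
    """Weighted sum: how often each basic operation runs times its calibrated cost."""
    return sum(units[name] * count for name, count in operation_counts.items())


# Made-up calibration measurements: (elapsed seconds, number of operations).
units = {
    "seq_page_cost": calibrate_unit(0.8, 1000),
    "cpu_tuple_cost": calibrate_unit(0.5, 100000),
    "cpu_index_tuple_cost": calibrate_unit(0.2, 100000),
    "cpu_operator_cost": calibrate_unit(0.1, 100000),
}

# Made-up operation counts for two hypothetical query plans.
plan_a = {"seq_page_cost": 5000, "cpu_tuple_cost": 400000, "cpu_operator_cost": 400000}
plan_b = {"seq_page_cost": 200, "cpu_index_tuple_cost": 30000, "cpu_operator_cost": 30000}

time_a = predict_time(plan_a, units)
time_b = predict_time(plan_b, units)
print(f"plan_a: {time_a:.3f}s, plan_b: {time_b:.3f}s, ratio: {time_a / time_b:.1f}")
```

In a real system the operation counts would come from the query plan, so any cardinality error in the plan carries straight into the prediction.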
Stay tuned.
Dariodsa
