This example will check how to optimize a loop with queries inside. If you are experienced/senior, this may be easy or not useful. After this post, you will be able to solve this problem with a production-ready solution.
Context
This post discusses a real-life scenario where we need to import data for a restaurant and summarize it for the marketing team within a week. The data we have is 24,000 receipts from the last year, and each receipt contains the date, a list of food IDs, a promotion ID, an employee ID, and satisfaction ratings.
date
List -> id of food
id from promotion
id from employee
satisfaction (0-5)
Analysis
To summarize the data, we need to retrieve more information about the food items, check the promotion details, and obtain the employee's name for each receipt. We can achieve this by looping through each receipt and making database queries to get the required information. In code it will be something like this
loop(receive from receives){
allFood = getFoodsById(receive.foodIds) // 1*foods database query
promotion = getPromotionById(receive.promotionId) // 1 database query
employee = getEmployeeById(receive.employeId) // 1 database query
addSumaryInformation(allFood, promotion, employee)
}
However, with (2 + foods) * 24,000 database queries. We are going to assume that we have 10 foods for each order. It would take approximately 20 days to complete the process, which is longer than the required time frame. This delay could be problematic since this is a critical process that needs to be performed every day.
Formula:
((12)*24000)*0,1s = 28800 seconds = 480 h = 20 days
Solution
The good news is that reducing the processing time is possible and easy by optimizing the code. For instance, we can reduce the number of database queries by fetching all the necessary information in a single query. Additionally, we can use caching to avoid making multiple calls to the database for the same information. By implementing these optimizations, we can significantly reduce the processing time and meet the marketing team's deadline.
In conclusion, by optimizing the code, we can efficiently import and summarize data for the restaurant within the required timeframe and meet the marketing team's needs.
Code
Assumption: We had looped the list elements and got all ids in
Map<id, Food> foods = getAllFoodsByIds(foodIds)
Map<id, Promotion> foods = getAllPromotionsByIds(foodIds)
Map<id, Employee> foods = getAllEmployeesByIds(employeeIds)
loop(receive from i){
allFood = ... here will have a loop that gets from the map each key
promotion = promotions.get(receive.promotionId)
employee = getEmployeeById(receive.employeeId)
addSumaryInformation(allFood, promotion, employee)
}
If each big query is 1 second and we have three queries, this will be 3 seconds. We can assume 5 minutes if we need to make more queries (database max IN in SQL) or process the result. This time will be much lower than the original 20 days. It is a 0,017% of the initial time.
If each big query is 1 second and we have three queries, this will be 3 seconds. We can assume 5 minutes if we need to make more queries (database max IN in SQL) or process the result. This time will be much lower than the original 20 days. It is a 0,017% of the initial time.
Note: In reality, the time can be much lower. I tested this in Java/Spring/JPA, and the time reduction has been around 90/95%.
Note2: This solution may increase the consumption of other resources.
Thank you for reading! Check my website to get more information about me.
Top comments (0)