I can think about the following approach:

We go through the dataset from the very first record and split it into pages of a predefined size. The goal of the process is to end up with user_id boundaries, as you mentioned before. Example for a page size of 10 000:
page 1, user_ids: 1 - 13 122
page 2, user_ids: 13 125 - 23 421
page 3, user_ids: 23 423 - 35 008
page 4, ...
You may notice that the difference between the boundary user_ids is more than 10 000 - this is because there can be gaps (deleted users).
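As a rough illustration, here is a minimal sketch of building that boundary cache. It assumes an integer primary key and a hypothetical `fetch_user_id_page(after_id, limit)` helper that returns the next `limit` user_ids in ascending order; the names are placeholders, not an existing API.

```python
PAGE_SIZE = 10_000

def build_boundaries(fetch_user_id_page):
    """Return a list of (first_user_id, last_user_id) tuples, one per page."""
    boundaries = []
    last_seen = 0
    while True:
        ids = fetch_user_id_page(after_id=last_seen, limit=PAGE_SIZE)
        if not ids:
            break
        # Gaps from deleted users are fine: we only record ids that exist,
        # so (last - first) may exceed PAGE_SIZE even though the page holds
        # exactly PAGE_SIZE users.
        boundaries.append((ids[0], ids[-1]))
        last_seen = ids[-1]
    return boundaries
```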
When you need to render all users for page #3, you look into the cache and use the user_id boundaries for that page.
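The lookup step could then turn a page number into a plain range scan on the primary key, so the database never has to skip over earlier pages with OFFSET. A hedged sketch, assuming `boundaries` is the cached list from above and `db` is a DB-API style connection with `?` placeholders (e.g. sqlite3):

```python
def fetch_page(db, boundaries, page_number):
    # Page numbers are 1-based; each cache entry stores that page's id range.
    first_id, last_id = boundaries[page_number - 1]
    return db.execute(
        "SELECT * FROM users WHERE id BETWEEN ? AND ? ORDER BY id",
        (first_id, last_id),
    ).fetchall()
```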
But when a user on page #3 is deleted, you need to recalculate the cache for that page and all pages after it. There is no reason to do that for pages #1 and #2, since the boundaries for the users there are unchanged.
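A sketch of that partial invalidation, reusing the hypothetical `PAGE_SIZE` and `fetch_user_id_page` from the earlier sketch: everything before the affected page keeps its cached boundaries, and scanning resumes from the last kept page.

```python
def recalc_from(boundaries, deleted_user_id, fetch_user_id_page):
    # Find the page the deleted user belonged to.
    page_index = next(
        i for i, (first, last) in enumerate(boundaries)
        if first <= deleted_user_id <= last
    )
    kept = boundaries[:page_index]          # earlier pages stay cached
    last_seen = kept[-1][1] if kept else 0  # resume after the last kept page
    rebuilt = []
    while True:
        ids = fetch_user_id_page(after_id=last_seen, limit=PAGE_SIZE)
        if not ids:
            break
        rebuilt.append((ids[0], ids[-1]))
        last_seen = ids[-1]
    return kept + rebuilt
```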
Sorry for the delay with the response this time.