Context
In many systems, some queries are slow. Maybe they hit a large dataset, or they depend on multiple joins or external services. You can cache the results, sure. But when that cache expires, your next unlucky user is stuck waiting for the whole thing to recompute. That’s not ideal.
I had a similar situation. The query was too slow to run live, so I cached the results with a 1-day expiry. But I didn't want any user, not even one, to hit the raw query when the cache expired.
So I came up with a simple fix: proactive cache refresh, without background cron jobs or complex scheduling.
The Idea
Instead of setting the cache expiry to 1 day, I set it to 2 days.
But here’s the trick:
- When I fetch the cached data, I check how old it is.
- If it's in the last day of its life (older than 1 day), I kick off a background job to refresh the cache.
- Users always get the cached response instantly, and the cache is rebuilt in the background before it expires.
Concretely: the cache is written at hour 0 with a 48-hour TTL. A request at hour 30 sees an entry older than 24 hours, serves the hour-0 data instantly, and enqueues a refresh. Moments later the cache holds fresh data with a new 48-hour TTL.
Why This Works
- No cron needed: You don't need a separate job scheduler.
- No user waits: Even if the cache is stale, users still get a fast response.
- Always fresh: The cache is refreshed regularly, just not synchronously.
Example Code (Ruby Version)
class CacheBackedQuery
  TTL = 2.days.to_i               # hard expiry
  REFRESH_THRESHOLD = 1.day.to_i  # refresh once the entry is older than this

  # Procs can't be serialized (Marshal.dump raises TypeError on a Proc),
  # so queries are registered under a name and the job looks them up by it.
  REGISTRY = {}

  def self.register(name, &block)
    REGISTRY[name.to_sym] = block
  end

  def initialize(cache_key:, query_name:)
    @cache_key = cache_key
    @query_name = query_name
  end

  def get_data
    cached = Cache.get_key(@cache_key) # Cache can be Redis, Memcached, etc.
    if cached
      parsed = JSON.parse(cached)
      age = Time.now.to_i - parsed['created_at'].to_i
      enqueue_refresh_job if age > REFRESH_THRESHOLD
      return parsed['data']
    end

    # Cold cache: this should almost never run once the cache is warm.
    data = REGISTRY.fetch(@query_name.to_sym).call
    Cache.set_key_and_expiry(@cache_key, { data: data, created_at: Time.now.to_i }.to_json, TTL)
    data
  end

  private

  def enqueue_refresh_job
    # Pass the name, not the Proc; job arguments must be serializable.
    RefreshCacheJob.perform_async(@cache_key, @query_name.to_s)
  end
end
Usage Example
# Register the query once (for example, in an initializer) so the background
# job can find it by name; a Proc can't travel through the queue.
CacheBackedQuery.register(:popular_users) do
  # Assumes a login_count column on users; top 10 users by login count.
  User.order(login_count: :desc)
      .limit(10)
      .map { |u| { id: u.id, count: u.login_count.to_i } }
end

fetcher = CacheBackedQuery.new(
  cache_key: "popular_users:account_#{account.id}",
  query_name: :popular_users
)

result = fetcher.get_data # e.g. [{ id: 42, count: 318 }, ...]
And in your job file:
class RefreshCacheJob
  include Sidekiq::Job # perform_async implies Sidekiq here; any async job library works

  def perform(cache_key, query_name)
    # Job arguments arrive as JSON, so the name comes back as a String.
    data = CacheBackedQuery::REGISTRY.fetch(query_name.to_sym).call
    Cache.set_key_and_expiry(
      cache_key,
      { data: data, created_at: Time.now.to_i }.to_json,
      CacheBackedQuery::TTL
    )
  end
end
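Assuming the :popular_users query was registered as above, you can sanity-check the job inline, without a queue (the cache key here is illustrative):
  # Runs the refresh synchronously and prints the freshly cached payload.
  RefreshCacheJob.new.perform("popular_users:account_1", "popular_users")
  puts JSON.parse(Cache.get_key("popular_users:account_1"))['data']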
Trade-offs
- Slightly stale data: Users typically see data that's at most a day old; if no request lands during the refresh window, an entry can age up to the full 2-day TTL before it's rebuilt.
- Cache churn: Several concurrent requests can each enqueue a refresh before the first one finishes, and rarely-read keys get refreshed for nobody. Acceptable for most use cases; see the dedup sketch after this list.
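If duplicate refresh jobs become a problem, a short-lived lock key can deduplicate the enqueue. Here's a minimal sketch, assuming the cache exposes an atomic set-if-not-exists operation (SETNX in Redis); Cache.set_if_not_exists is a hypothetical helper, not part of the code above:
  # Drop-in replacement for enqueue_refresh_job in CacheBackedQuery.
  # Only the first caller in a 10-minute window enqueues the job;
  # concurrent callers see the lock and skip.
  def enqueue_refresh_job
    lock_key = "#{@cache_key}:refreshing"
    return unless Cache.set_if_not_exists(lock_key, '1', 10.minutes.to_i)
    RefreshCacheJob.perform_async(@cache_key, @query_name.to_s)
  end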
When to Use This
✅ Works great for dashboards, analytics, or infrequently changing data.
❌ Not ideal if you need real-time accuracy.
Credits
Big thanks to Shantilal from my team for pairing on this solution. We iterated on the idea together and refined it into a clean, reusable pattern.
Final Thoughts
This pattern is simple, effective, and needs no new infrastructure beyond the cache and job queue you likely already have.
It spares users the slow path and doesn't rely on cron or a scheduler; refresh jobs are enqueued on demand by ordinary reads.
Give it a try if you have slow queries that shouldn’t ever run in the foreground.