<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sahil Kapoor</title>
    <description>The latest articles on DEV Community by Sahil Kapoor (@isahilkapoor).</description>
    <link>https://dev.to/isahilkapoor</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3620686%2F45545c92-3406-42fb-a3cb-f72599e02a7d.jpg</url>
      <title>DEV Community: Sahil Kapoor</title>
      <link>https://dev.to/isahilkapoor</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/isahilkapoor"/>
    <language>en</language>
    <item>
      <title>Synchronized Expiration in Distributed Systems</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Mon, 02 Mar 2026 17:02:00 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/synchronized-expiration-in-distributed-systems-28oj</link>
      <guid>https://dev.to/isahilkapoor/synchronized-expiration-in-distributed-systems-28oj</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgg03wfgi1ye7sigi7x9t.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgg03wfgi1ye7sigi7x9t.jpg" alt="Synchronized Expiration in Distributed Systems" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It’s 12:05 PM. Your user traffic is perfectly flat. Database CPU is stable at 38%. Suddenly, within seconds, CPU spikes to 100%. Connection pools max out. Query latency jumps from 40ms to over 3 seconds. Timeouts trigger retries. Retries amplify load. Amplified load creates more timeouts.&lt;/p&gt;

&lt;p&gt;By 12:07 PM, your database is effectively dead.&lt;/p&gt;

&lt;p&gt;When you dig into the logs, you don’t find a DDoS attack or a viral marketing push. You find something much more ordinary and much more dangerous. At 12:00 PM, a cron job ran, or perhaps a massive influx of users logged in for a scheduled event. Their data was cached with a standard 5-minute Time-To-Live (TTL). Now, at 12:05 PM, thousands of cache keys have expired at the exact same millisecond. Your application servers, finding the cache empty, all rush to the database simultaneously to fetch the same data.&lt;/p&gt;

&lt;p&gt;This is &lt;strong&gt;Synchronized Expiration&lt;/strong&gt;, often leading to the "Thundering Herd" problem. In a distributed system, relying on simple, fixed TTLs is a ticking time bomb.&lt;/p&gt;

&lt;p&gt;When the cache is healthy, it absorbs and smooths demand. When many keys expire together, regeneration load is released in a concentrated burst instead of being spread over time. The failure sequence looks like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx0dj1klmk7x06cpsct2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvx0dj1klmk7x06cpsct2.jpg" alt="Synchronized Expiration in Distributed Systems" width="800" height="232"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Many large-scale outages are triggered by synchronized events rather than gradual overload.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why TTL Is a Time Bomb in Distributed Systems
&lt;/h2&gt;

&lt;p&gt;In low-traffic systems, TTL behaves exactly as expected. Data expires, the next request fetches fresh data, and the system continues normally.&lt;/p&gt;

&lt;p&gt;In highly concurrent distributed systems, TTL creates hidden synchronization points.&lt;/p&gt;

&lt;p&gt;When a hot object is first requested, hundreds of application servers cache it at roughly the same moment. If the TTL is 300 seconds, those servers will independently decide the data is stale at roughly the same moment 300 seconds later.&lt;/p&gt;

&lt;p&gt;Real-world traffic follows a power-law distribution. A tiny percentage of hot keys absorb the majority of reads. Applications like Instagram experience this constantly. When a global celebrity like Cristiano Ronaldo or Taylor Swift posts, tens of millions of users open the app to view the same profile, follower count, and engagement metrics. Those profile objects become extremely hot keys.&lt;/p&gt;

&lt;p&gt;If those keys expire simultaneously, millions of requests instantly bypass cache protection and converge on backend databases. The database is not sized for total traffic. It is sized for cache misses. When misses spike beyond expected levels, collapse is immediate.&lt;/p&gt;

&lt;p&gt;Autoscaling also does not solve this. Scaling decisions operate on seconds-to-minutes timescales. Cache stampedes occur in milliseconds.&lt;/p&gt;


&lt;p&gt;To survive scale, expiration must be treated as a probabilistic and distributed process, not a fixed boundary.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. TTL Jitter
&lt;/h2&gt;

&lt;p&gt;The simplest and highest-leverage fix is to introduce randomness into expiration.&lt;/p&gt;

&lt;p&gt;Instead of assigning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TTL = 300 seconds

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You assign:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TTL = 300 seconds + random(-30, +30)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This converts expiration from a synchronized event into a distributed one. The same amount of regeneration work occurs, but it is spread over time instead of concentrated into a single spike.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo65c6lzmackvhrn5c4om.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fo65c6lzmackvhrn5c4om.jpg" alt="Synchronized Expiration in Distributed Systems" width="800" height="256"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The trade-off is bounded inconsistency. Some users may see data slightly longer than others. In exchange, peak database load drops dramatically. This is almost always worth it.&lt;/p&gt;
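&lt;p&gt;A minimal Python sketch of jittered TTLs (the &lt;code&gt;base_ttl&lt;/code&gt; and &lt;code&gt;spread&lt;/code&gt; names are illustrative, not from any particular library):&lt;/p&gt;

```python
import random

def jittered_ttl(base_ttl: int = 300, spread: int = 30) -> int:
    """Return a TTL spread uniformly across [base_ttl - spread, base_ttl + spread]."""
    return base_ttl + random.randint(-spread, spread)

# Keys cached in the same second now expire across a 60-second band,
# e.g. cache.set(key, value, ex=jittered_ttl())  # redis-py-style call, for context
```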

&lt;h2&gt;
  
  
  2. Request Coalescing
&lt;/h2&gt;

&lt;p&gt;Jitter reduces alignment but does not eliminate regeneration pressure for extremely hot keys. Request coalescing ensures only one worker regenerates the cache while others wait or use stale data.&lt;/p&gt;

&lt;p&gt;Using Redis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SET lock:key worker_id NX PX 5000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the SET succeeds, that worker queries the database and updates the cache while the other workers defer. Only one regeneration query reaches the database per expiry.&lt;/p&gt;

&lt;p&gt;However, distributed locking introduces its own failure modes. If a worker pauses due to garbage collection or network delay, the lock may expire prematurely. Another worker acquires the lock and performs the same regeneration. The original worker resumes and overwrites the cache with stale data.&lt;/p&gt;

&lt;p&gt;Production-grade systems mitigate this using fencing tokens or monotonic version numbers. Each regeneration receives a strictly increasing version. Only newer versions are allowed to update the cache.&lt;/p&gt;
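&lt;p&gt;The lock-plus-fencing-token logic can be sketched against an in-memory stand-in for Redis. Everything here (&lt;code&gt;FakeRedis&lt;/code&gt;, the key names) is illustrative; in production the same calls would go through a real Redis client, with the version check done atomically:&lt;/p&gt;

```python
import time

class FakeRedis:
    """In-memory stand-in for the two Redis commands this sketch needs."""
    def __init__(self):
        self.data = {}

    def set_nx_px(self, key, value, ttl_ms):
        # SET key value NX PX ttl_ms -> True only if the key is absent/expired
        entry = self.data.get(key)
        if entry is not None and entry[1] > time.monotonic():
            return False
        self.data[key] = (value, time.monotonic() + ttl_ms / 1000)
        return True

    def incr(self, key):
        entry = self.data.get(key)
        value = (int(entry[0]) if entry else 0) + 1
        self.data[key] = (str(value), float("inf"))
        return value

store = FakeRedis()
cache = {}  # key -> (payload, version)

def regenerate(key, load_from_db):
    # Only the lock winner regenerates. It stamps a monotonically
    # increasing fencing token so a stale, paused worker cannot
    # overwrite a newer value later.
    if not store.set_nx_px(f"lock:{key}", "worker-1", 5000):
        return None  # another worker holds the lock; defer
    token = store.incr(f"version:{key}")
    payload = load_from_db()
    current = cache.get(key)
    if current is None or current[1] < token:  # reject out-of-order writes
        cache[key] = (payload, token)
    return cache[key][0]
```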

&lt;h2&gt;
  
  
  3. Stale-While-Revalidate
&lt;/h2&gt;

&lt;p&gt;Traditional caching blocks on expiration:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Cache Miss → Database Query → Response&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Stale-While-Revalidate changes the model:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Cache Expired → Serve Stale Data → Refresh in Background&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Consider a Cricket World Cup final. Virat Kohli is batting. Tens of millions of users refresh simultaneously to check his score. If the cached player key expires, blocking all requests on regeneration would overwhelm the database. Instead, the system can serve the slightly stale data immediately and refresh the cache asynchronously.&lt;/p&gt;

&lt;p&gt;This isn't just a backend application pattern. Modern CDNs (like Cloudflare and Fastly) and reverse proxies (like Nginx and Varnish) support this natively via the &lt;code&gt;Cache-Control: stale-while-revalidate&lt;/code&gt; HTTP header.&lt;/p&gt;

&lt;p&gt;X applies similar logic to celebrity follower counts. When millions of users open a celebrity profile, the follower count may be a few seconds behind reality. The system prioritizes availability over perfect real-time accuracy.&lt;/p&gt;
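&lt;p&gt;In application code, the pattern can be sketched like this. The thread-based refresh and the helper names are illustrative; a real implementation would also cap staleness and handle refresh failures:&lt;/p&gt;

```python
import threading
import time

cache = {}          # key -> (value, expires_at)
refreshing = set()  # keys with an in-flight background refresh
_guard = threading.Lock()

def get(key, load_from_db, ttl=300):
    now = time.monotonic()
    entry = cache.get(key)
    if entry is None:
        # True miss: nothing stale to serve, so this request blocks once.
        value = load_from_db()
        cache[key] = (value, now + ttl)
        return value
    value, expires_at = entry
    if now >= expires_at:
        with _guard:
            start_refresh = key not in refreshing
            if start_refresh:
                refreshing.add(key)
        if start_refresh:
            # Expired: serve the stale value below, refresh off the request path.
            def refresh():
                cache[key] = (load_from_db(), time.monotonic() + ttl)
                with _guard:
                    refreshing.discard(key)
            threading.Thread(target=refresh, daemon=True).start()
    return value
```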

&lt;h2&gt;
  
  
  4. Probabilistic Early Refresh
&lt;/h2&gt;

&lt;p&gt;Standard TTL creates a step function: &lt;em&gt;Fresh → Expired&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Probabilistic refresh can convert this into a ramp. The closer a key gets to expiration, the higher the probability a request triggers a background refresh.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if cache_age &amp;gt; threshold:
    probability = (cache_age - threshold) / (TTL - threshold)
    if random() &amp;lt; probability:
        async_refresh()

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures popular keys are refreshed before expiration. Regeneration happens gradually and invisibly, instead of synchronously and catastrophically.&lt;/p&gt;
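&lt;p&gt;The pseudocode maps directly to Python. Here is a runnable version of just the decision step; the 240-second threshold is an illustrative choice, not a standard:&lt;/p&gt;

```python
import random

def should_refresh(cache_age: float, ttl: float = 300.0, threshold: float = 240.0) -> bool:
    """Probability ramps from 0 (at threshold) up to 1 (at TTL)."""
    if cache_age <= threshold:
        return False
    probability = (cache_age - threshold) / (ttl - threshold)
    return random.random() < probability
```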

&lt;p&gt;You don't always have to write this math from scratch. Modern local caching libraries, like Java’s &lt;a href="https://github.com/ben-manes/caffeine?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Caffeine&lt;/strong&gt;&lt;/a&gt;, have features like &lt;code&gt;refreshAfterWrite&lt;/code&gt; built in, which refresh entries asynchronously after a configured age and coalesce concurrent loads for L1 caches without requiring custom locking code.&lt;/p&gt;

&lt;h2&gt;
  
  
  5. Cache Warming
&lt;/h2&gt;

&lt;p&gt;Cache warming proactively refreshes hot keys before users request them. There are two approaches:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mathematical approach:&lt;/strong&gt;  Track access frequency using probabilistic structures like &lt;a href="https://redis.io/docs/latest/develop/data-types/probabilistic/count-min-sketch/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;Count-Min Sketch&lt;/a&gt; to identify hot keys dynamically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Domain-driven approach:&lt;/strong&gt;  Use business knowledge. In cricket or fantasy platforms, tomorrow’s match schedule is known in advance. Player profiles, statistics, and leaderboards can be preloaded overnight when database load is low. Similarly, Instagram/X know which accounts have hundreds of millions of followers. Their profiles are continuously refreshed in cache regardless of access patterns.&lt;/p&gt;

&lt;p&gt;This shifts regeneration work to off-peak periods when database contention is lower.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Eviction Trap:&lt;/strong&gt;  Cache warming is dangerous if you lack memory headroom. If you aggressively warm tomorrow's data, you might trigger an Out of Memory (OOM) event or an LRU eviction policy that deletes &lt;em&gt;today's&lt;/em&gt; hot keys. Always align your bulk warming scripts with your distributed cache memory limits.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Multi-Layer Caching: Reducing Blast Radius
&lt;/h2&gt;

&lt;p&gt;Mature systems use layered caching:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Layer 1:&lt;/strong&gt;  Local in-process cache (Caffeine, in-memory LRU)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layer 2:&lt;/strong&gt;  Distributed cache (Redis, Memcached)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Layer 3:&lt;/strong&gt;  Database&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The L1 cache acts as a localized shock absorber. If a key expires in Redis, regeneration load is limited to the number of application nodes, not the total number of user requests.&lt;/p&gt;

&lt;p&gt;For example: 1 million requests per second hitting 100 nodes becomes at most 100 regeneration events (one per node), not 1 million.&lt;/p&gt;

&lt;h2&gt;
  
  
  Observability during Cache Stampede
&lt;/h2&gt;

&lt;p&gt;A cache stampede often looks exactly like a database bottleneck if you don't have the right metrics. To detect synchronized expiration before it causes an outage, you must monitor:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Spiky Cache Miss Rates:&lt;/strong&gt;  If your miss rate is usually flat but occasionally spikes in perfect vertical lines, your TTLs are aligned.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;p99 Database Latency vs. Cache Miss Correlation:&lt;/strong&gt;  If your 99th percentile DB latency perfectly overlaps with cache miss spikes, your cache is acting as a trigger, not a shield.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis CPU vs. Network I/O:&lt;/strong&gt;  If Redis CPU spikes but network throughput drops, you might be experiencing extreme lock contention from poorly implemented coalescing.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;Caching plays a central role in controlling database load. Deterministic TTL creates hidden synchronization points that eventually collapse under load. Production systems distribute regeneration work across time:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Jitter&lt;/strong&gt;  distributes regeneration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coalescing&lt;/strong&gt;  limits concurrency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stale-while-revalidate&lt;/strong&gt;  protects availability.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Probabilistic refresh&lt;/strong&gt;  eliminates hard boundaries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache warming&lt;/strong&gt;  prevents peak regeneration entirely.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-layer caching&lt;/strong&gt;  reduces blast radius.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability&lt;/strong&gt;  detects the stampede before it happens.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Final Thought:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
At scale, synchronized expiration produces failure patterns similar to traffic floods. Treating expiration as a hard boundary concentrates regeneration into short intervals. Add jitter. Use background refresh. Ensure regeneration remains controlled under concentrated access.&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>redis</category>
    </item>
    <item>
      <title>Redis in Modern Systems</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Thu, 05 Feb 2026 00:00:27 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/redis-in-modern-systems-1pob</link>
      <guid>https://dev.to/isahilkapoor/redis-in-modern-systems-1pob</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9ddo1g4q4xbhbv2kafx.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fn9ddo1g4q4xbhbv2kafx.jpg" alt="Redis in Modern Systems" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you built backend systems in the late 2000s or early 2010s, you remember the pain. Databases were the bottleneck. Scaling meant vertical upgrades, buying bigger hardware until you ran out of budget or physics. Every performance problem turned into another index, another replica, or another late night trying to shave milliseconds off complex SQL JOINs.&lt;/p&gt;

&lt;p&gt;Then Redis showed up.&lt;/p&gt;

&lt;p&gt;For the first time, you could put state in memory, share it across processes, give it a TTL, and move on. APIs got faster. Databases stopped melting. Systems felt forgiving for the first time. That first experience stuck. For many teams, Redis became &lt;em&gt;“the cache”&lt;/em&gt;, a tactical fix for performance problems.&lt;/p&gt;

&lt;p&gt;Redis, short for &lt;strong&gt;REmote DIctionary Server&lt;/strong&gt;, was originally built as a network-accessible in-memory dictionary, but with a crucial difference. It exposed structured data and deterministic operations rather than opaque blobs. That design choice mattered more than it seemed at the time.&lt;/p&gt;

&lt;p&gt;But stopping at “Redis is a cache” is a mistake. Originally released in 2009 by Salvatore Sanfilippo, Redis was designed to answer a specific question: how do you serve data fast enough for real-time applications? Over the last 15 years, that narrow goal evolved into a general-purpose in-memory data structure server.&lt;/p&gt;

&lt;p&gt;Today, Redis sits at the core of the modern tech stack. It is the shared memory between stateless microservices. It is the coordination layer for distributed locks. It is the buffer for high-velocity telemetry.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Is Redis Fast?
&lt;/h2&gt;

&lt;p&gt;When engineers ask why Redis is fast, the common answer is &lt;em&gt;"it runs in RAM."&lt;/em&gt; While true, that is only half the story. If you wrote a naive Java or Python application that stored data in RAM, it would still likely be slower than Redis under high concurrency.&lt;/p&gt;

&lt;p&gt;Redis performance is the result of three specific architectural choices:&lt;/p&gt;

&lt;h3&gt;
  
  
  The Single-Threaded Event Loop
&lt;/h3&gt;

&lt;p&gt;Redis (mostly) uses a single thread to handle commands. This seems counter-intuitive in an era of multi-core CPUs, but it is a feature, not a bug.&lt;/p&gt;

&lt;p&gt;By running on a single thread, Redis avoids context switching and race conditions. It never needs to acquire a lock to update a value, because no other thread can touch that value at the same time. This creates  &lt;strong&gt;predictable tail latency&lt;/strong&gt;. In multi-threaded systems, performance often degrades non-linearly as threads fight for locks. In Redis, performance is linear until the CPU is saturated.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: Modern Redis isn't strictly single-threaded. It uses background threads ("bio" threads) for heavy tasks like closing file descriptors (UNLINK) and flushing data to disk (fsync), keeping the main event loop unblocked.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  I/O Multiplexing
&lt;/h3&gt;

&lt;p&gt;How does a single thread handle 50,000 concurrent client connections? Redis uses I/O multiplexing (typically epoll on Linux or kqueue on macOS/BSD).&lt;/p&gt;

&lt;p&gt;Instead of blocking the thread waiting for a client to send data, Redis asks the kernel to monitor all open socket connections. When a socket becomes readable, the kernel wakes up the Redis thread, which processes the command, writes the response to a buffer, and moves instantly to the next socket.&lt;/p&gt;

&lt;h3&gt;
  
  
  Memory Efficiency and Specialized Encodings
&lt;/h3&gt;

&lt;p&gt;Redis is obsessive about memory layout. It doesn't just store standard linked lists or hash tables. It adapts the underlying data structure based on the size of the data to optimize for CPU cache locality.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Ziplists / Listpacks:&lt;/strong&gt;  If you store a small list or hash, Redis stores it as a contiguous block of memory (a byte array) rather than a structure with pointers. This reduces memory fragmentation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IntSets:&lt;/strong&gt;  If a Set contains only integers, Redis stores them as a sorted array of integers. This uses a fraction of the memory required for a standard hash table.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  From Cache to Control Plane: The Use Cases
&lt;/h2&gt;

&lt;p&gt;Most teams adopt Redis in stages as system complexity increases. What begins as simple caching often grows into session storage and analytics, and eventually becomes a coordination layer for distributed systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  Phase 1: Read Optimization (Caching)
&lt;/h3&gt;

&lt;p&gt;This is where everyone starts. The application checks Redis; if the data is missing, it queries the database and populates Redis.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Value:&lt;/strong&gt;  Reduced latency and reduced database load.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Risk:&lt;/strong&gt;  Cache invalidation. If the database changes and Redis isn't updated, users see stale data. The hardest part of caching is not storing data, but knowing when to delete it.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 2: Transient State (Sessions &amp;amp; Analytics)
&lt;/h3&gt;

&lt;p&gt;Here, Redis is the  &lt;strong&gt;primary&lt;/strong&gt;  store for data that can be lost without catastrophe.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Session Storage:&lt;/strong&gt;  User sessions are read on every request. Storing them in a database is overkill; storing them in a stateless JWT works but makes revocation hard. Redis is the middle ground: fast reads with instant revocation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HyperLogLog:&lt;/strong&gt;  For analytics, you can use probabilistic structures. A HyperLogLog allows you to count unique items (like daily active users) with an error rate of &amp;lt;1% using only 12KB of memory, regardless of how many users you have.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Phase 3: Distributed Coordination (The Control Plane)
&lt;/h3&gt;

&lt;p&gt;This is where Redis becomes critical infrastructure. In a microservices architecture, you need a way for services to agree on the state of the world.&lt;/p&gt;

&lt;h4&gt;
  
  
  Distributed Locks
&lt;/h4&gt;

&lt;p&gt;If you have five worker processes processing payments, how do you ensure a specific order isn't processed twice? You use a Redis lock.&lt;br&gt;&lt;br&gt;
Using &lt;code&gt;SET resource_name my_random_value NX PX 30000&lt;/code&gt;, you can acquire a lock that auto-expires in 30 seconds.&lt;/p&gt;

&lt;p&gt;The critical part is releasing the lock safely. You cannot just DEL the key, because you might delete a lock held by &lt;em&gt;another&lt;/em&gt; process if yours took too long. You must use a  &lt;strong&gt;Lua script&lt;/strong&gt;  to check ownership and delete atomically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;-- Release lock only if ownership matches
if redis.call("get", KEYS[1]) == ARGV[1] then
    return redis.call("del", KEYS[1])
else
    return 0
end

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
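&lt;p&gt;The same ownership check, simulated in Python against a plain dict so the semantics are easy to test. This is a sketch only: in real Redis the get-and-delete must run as the atomic Lua script above, because two separate commands leave a race window.&lt;/p&gt;

```python
store = {}  # stands in for Redis: key -> lock owner

def acquire(key, owner_id):
    # Mirrors SET key owner NX: succeeds only if the lock is free.
    if key in store:
        return False
    store[key] = owner_id
    return True

def release(key, owner_id):
    # Mirrors the Lua script: delete only if we still own the lock.
    if store.get(key) == owner_id:
        del store[key]
        return 1
    return 0
```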



&lt;h4&gt;
  
  
  Rate Limiting
&lt;/h4&gt;

&lt;p&gt;API Gateways often rely on Redis to enforce quotas (e.g., "100 requests per minute"). Using the &lt;code&gt;INCR&lt;/code&gt; and &lt;code&gt;EXPIRE&lt;/code&gt; commands, a cluster of stateless servers can enforce a global limit without talking to each other directly.&lt;/p&gt;
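&lt;p&gt;A fixed-window version of that counter pattern, sketched against a dict. In a real deployment the increment and expiry would be Redis &lt;code&gt;INCR&lt;/code&gt; and &lt;code&gt;EXPIRE&lt;/code&gt; calls, ideally fused in a Lua script so the check is atomic; the window logic here is one illustrative scheme, not the only one:&lt;/p&gt;

```python
import time

windows = {}  # (client_id, window) -> count; stands in for Redis counter keys

def allow(client_id, limit=100, window_seconds=60, now=None):
    now = time.time() if now is None else now
    window = int(now // window_seconds)
    bucket = (client_id, window)
    # INCR is atomic in Redis; a plain dict update stands in for it here.
    windows[bucket] = windows.get(bucket, 0) + 1
    # EXPIRE would let Redis drop old windows; the dict simply keeps them.
    return windows[bucket] <= limit
```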

&lt;h2&gt;
  
  
  Redis and Microservices
&lt;/h2&gt;

&lt;p&gt;Microservices amplify coordination problems. Once a system is decomposed into many stateless services, shared state does not disappear. It just becomes harder to manage. Rate limits, idempotency, retries, leader election, and job ownership all become cross-cutting concerns that no single service naturally owns.&lt;/p&gt;

&lt;p&gt;Redis fits microservices not because it is fast, but because it provides a shared, low-latency coordination layer without forcing services to share databases or schemas. In microservice architectures, Redis commonly owns  &lt;strong&gt;coordination invariants&lt;/strong&gt; , not domain data. Earlier use cases like rate limiting and idempotency become architectural glue rather than isolated features.&lt;/p&gt;

&lt;p&gt;From a design standpoint, Redis reduces the need for "coordination microservices" whose sole responsibility is managing shared state. Entire services can disappear when their responsibilities collapse into Redis primitives. The trade-off is  &lt;strong&gt;blast radius&lt;/strong&gt;. Redis becomes part of the system’s control plane. If Redis is slow or unavailable, many services feel it at once.&lt;/p&gt;

&lt;p&gt;That means Redis usage in microservices must be intentional. Invariants owned by Redis should be explicit, documented, and tested. Lua scripts should be versioned like application code. Failure modes should be designed, not discovered in production.&lt;/p&gt;

&lt;h2&gt;
  
  
  Persistence: RDB vs. AOF
&lt;/h2&gt;

&lt;p&gt;One of the biggest shocks for new Redis administrators is discovering that Redis can lose data. Understanding persistence is mandatory for production environments.&lt;/p&gt;

&lt;h3&gt;
  
  
  RDB (Redis Database Snapshot)
&lt;/h3&gt;

&lt;p&gt;RDB creates a point-in-time snapshot of your dataset at specified intervals (e.g., "every 5 minutes if 100 keys changed").&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pros:&lt;/strong&gt;  Compact binary files, fast startup time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons:&lt;/strong&gt;  If Redis crashes, you lose all data written since the last snapshot.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The "Fork" Danger:&lt;/strong&gt;  To create a snapshot, Redis calls the fork() syscall. The OS uses Copy-on-Write (CoW) to clone the process memory. If your dataset is huge (e.g., 40GB) and you have heavy write traffic during the snapshot, your memory usage can double, potentially causing the OS to OOM-kill Redis.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  AOF (Append Only File)
&lt;/h3&gt;

&lt;p&gt;AOF logs every write operation received by the server. When Redis restarts, it replays the log to reconstruct the dataset.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pros:&lt;/strong&gt;  Much higher durability. You can configure fsync to run every second.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons:&lt;/strong&gt;  The file grows indefinitely (though Redis rewrites it in the background) and recovery is slower than RDB.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Recommendation:&lt;/strong&gt;  For pure caching, disable both (or use RDB for warm restarts). For a general-purpose store, use RDB + AOF with fsync set to &lt;code&gt;everysec&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzjdr0q1t7y9fq3bglzeu.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzjdr0q1t7y9fq3bglzeu.jpg" alt="Redis in Modern Systems" width="800" height="638"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;RDB snapshot lifecycle showing copy-on-write memory amplification during fork() under write-heavy load&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Scaling Redis
&lt;/h2&gt;

&lt;p&gt;When a single Redis node runs out of memory or CPU, you have two paths.&lt;/p&gt;

&lt;h3&gt;
  
  
  Path A: High Availability with Redis Sentinel
&lt;/h3&gt;

&lt;p&gt;Sentinel is a system designed to keep Redis online. It monitors a primary node and its replicas. If the primary dies, Sentinel coordinates an election and promotes a replica to primary.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Architecture:&lt;/strong&gt;  1 Primary + N Replicas.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Limitation:&lt;/strong&gt;  You are limited by the RAM of a single node. You cannot store 200GB of data if your server only has 64GB. Writes are also limited to the throughput of one node.&lt;/p&gt;

&lt;h3&gt;
  
  
  Path B: Horizontal Scaling with Redis Cluster
&lt;/h3&gt;

&lt;p&gt;Redis Cluster shards data across multiple nodes. It splits the keyspace into 16,384 "hash slots." Every key belongs to a slot, and every slot belongs to a node.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Architecture:&lt;/strong&gt;  N Primaries + N Replicas.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros:&lt;/strong&gt;  You can scale to Terabytes of RAM.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;  Complexity. Client libraries must be "cluster-aware." Multi-key operations (like transactions or MGET) are only allowed if all keys involved hash to the same slot. This requires careful data modeling using "hash tags" (e.g., {user:100}:profile and {user:100}:orders ensure both keys land on the same shard).&lt;/p&gt;

&lt;h2&gt;
  
  
  The Ecosystem Fracture: Redis vs. Valkey
&lt;/h2&gt;

&lt;p&gt;In 2024, Redis Ltd. transitioned the project from the permissive BSD license to the RSAL/SSPL license. This restricted cloud providers from selling managed Redis services without paying. In response, the Linux Foundation, backed by AWS, Google, and Oracle, forked the last open version of Redis to create  &lt;strong&gt;Valkey&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What does this mean for you?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Functionally, Valkey 7.x/8.x and Redis 7.x are nearly identical for the end user. The commands, protocols, and data structures remain the same. However, for new deployments, many engineering teams and Linux distributions are defaulting to Valkey to ensure long-term open-source compatibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  Production Anti-Patterns and Failure Modes
&lt;/h2&gt;

&lt;p&gt;There are some common, repeatable failure modes that appear when Redis is used beyond simple caching and becomes part of the system’s coordination layer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Latency Killers: Blocking the Event Loop
&lt;/h3&gt;

&lt;h4&gt;
  
  
  The &lt;code&gt;KEYS&lt;/code&gt; Command
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;KEYS pattern*&lt;/code&gt; is an O(N) operation. On a dataset with millions of keys, Redis scans everything. Because Redis executes commands on a single thread,  &lt;strong&gt;the entire server stops responding&lt;/strong&gt;  until the scan finishes.&lt;/p&gt;

&lt;p&gt;This is not a slow query problem. It is a full stop.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fix:&lt;/strong&gt;  Rename or disable &lt;code&gt;KEYS&lt;/code&gt; in &lt;code&gt;redis.conf&lt;/code&gt;. Use &lt;code&gt;SCAN&lt;/code&gt;, which is incremental and non-blocking.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The Big Key
&lt;/h4&gt;

&lt;p&gt;Storing massive hashes, lists, or sets creates latency cliffs. Deleting a 500MB collection forces Redis to free memory element by element, blocking the event loop.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fix:&lt;/strong&gt;  Use &lt;code&gt;UNLINK&lt;/code&gt; instead of &lt;code&gt;DEL&lt;/code&gt; for non-blocking deletes. Model data so collections remain bounded.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Availability Killers: Self-Inflicted Outages
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Connection Storms
&lt;/h4&gt;

&lt;p&gt;When application fleets restart, thousands of clients may reconnect simultaneously. TLS handshakes and authentication can drive Redis CPU to 100 percent, causing connection timeouts, retries, and cascading failure.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fix:&lt;/strong&gt;  Use connection pooling, jittered reconnects, and exponential backoff.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Scalability Killers: Load That Does Not Distribute
&lt;/h3&gt;

&lt;h4&gt;
  
  
  The Hot Key
&lt;/h4&gt;

&lt;p&gt;In a healthy cluster, traffic spreads evenly. In a hot key scenario, a single key (for example, &lt;code&gt;global_feature_flags&lt;/code&gt;) receives the majority of traffic. One shard saturates while others sit idle. The system appears overloaded despite unused capacity.&lt;/p&gt;

&lt;p&gt;Redis Cluster cannot solve this automatically.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fix:&lt;/strong&gt;  Redesign key structure, introduce client-side caching, or split the key into multiple buckets.&lt;/li&gt;
&lt;/ul&gt;
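&lt;p&gt;The bucket-splitting fix can be sketched as follows: writers update every copy of the key, readers pick one copy at random, and the cluster hashes the copies onto different shards. The bucket count and key names here are illustrative:&lt;/p&gt;

```python
import random

BUCKETS = 8

def read_key(base, n=BUCKETS):
    """Readers fan out across n copies of the hot key."""
    return f"{base}:{random.randrange(n)}"

def write_keys(base, n=BUCKETS):
    """Writers must update all n copies to keep them consistent."""
    return [f"{base}:{i}" for i in range(n)]

random.seed(7)
reads = {read_key("global_feature_flags") for _ in range(100)}
assert reads <= set(write_keys("global_feature_flags"))
assert len(reads) > 1  # read load actually spreads across buckets
```

&lt;p&gt;The trade-off is write amplification: every update costs n writes, which is usually acceptable for read-heavy keys like feature flags.&lt;/p&gt;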

&lt;h3&gt;
  
  
  Correctness Killers: When Performance Settings Delete Data
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Maxmemory Policy Misconfiguration
&lt;/h4&gt;

&lt;p&gt;What happens when Redis runs out of RAM depends entirely on &lt;code&gt;maxmemory-policy&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;noeviction&lt;/code&gt;: Writes fail once memory is exhausted.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;allkeys-lru&lt;/code&gt;: Any key can be evicted, including coordination state.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The trap appears when Redis is used for both caching and durable queues. An eviction policy intended for cache keys can silently delete queue items or locks.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fix:&lt;/strong&gt;  Use &lt;code&gt;volatile-lru&lt;/code&gt; and ensure only cache keys have TTLs. Never allow eviction to apply to coordination state.&lt;/li&gt;
&lt;/ul&gt;
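&lt;p&gt;A minimal &lt;code&gt;redis.conf&lt;/code&gt; sketch of this separation (the memory limit is illustrative):&lt;/p&gt;

```
# Only keys that carry a TTL are eligible for eviction, so
# coordination state (locks, queues) written without a TTL
# can never be silently deleted under memory pressure.
maxmemory 2gb
maxmemory-policy volatile-lru
```

&lt;p&gt;The policy is only safe if the convention holds: cache writers always set a TTL, and coordination writers never do.&lt;/p&gt;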

&lt;h2&gt;
  
  
  Closing Thoughts
&lt;/h2&gt;

&lt;p&gt;Redis is easy to adopt but hard to master. Teams that use Redis only as a cache benefit from speed. Teams that understand it as a distributed data structure server gain architectural leverage. They can build queues, real-time analytics, and coordination systems without adding new infrastructure complexity.&lt;/p&gt;

&lt;p&gt;The difference is not in the commands you type, but in understanding the consequences of those commands on the single-threaded engine underneath. Treat Redis with the same respect you treat your primary database, and it will be the most reliable part of your stack.&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>redis</category>
    </item>
    <item>
      <title>Make the Consequence Real</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Wed, 28 Jan 2026 05:19:46 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/make-the-consequence-real-4m3</link>
      <guid>https://dev.to/isahilkapoor/make-the-consequence-real-4m3</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tb8a17f4kof6924uz0z.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9tb8a17f4kof6924uz0z.jpg" alt="Make the Consequence Real" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;There is a specific threshold in leadership where your instincts for problem solving still fire, but the outcomes stop being reliably good. You are still competent. Teams still ship. But the impact you expect does not materialize.&lt;/p&gt;

&lt;p&gt;Nothing is obviously broken. That is what makes this stage dangerous.&lt;/p&gt;

&lt;p&gt;The skills that got you here have not failed outright. They have started failing quietly. Decisions still get made. Work still moves. But the second-order effects drift. The same effort produces less leverage. The same interventions produce weaker results.&lt;/p&gt;

&lt;p&gt;At this level, the cost of being wrong is no longer a missed deadline. It is paid by other people. It shows up in culture, in customer trust, and in the long-term health of the organization. Because these costs are indirect, they are easy to rationalize away until they accumulate into something harder to undo.&lt;/p&gt;

&lt;p&gt;This transition is not about acquiring a new checklist of skills. It is about shedding old certainties. Senior roles are genuinely new work. A Director role can take years to become merely competent at. Senior leadership often takes longer, because the work shifts away from direct execution and into shaping systems, influencing incentives, and steering human behavior under real uncertainty.&lt;/p&gt;

&lt;p&gt;Engineering prepares you exceptionally well for the early part of this journey. Clarity. Elegant design. Better abstractions. Better tooling. These are the levers that reward you early in your career. For a long time, the belief holds that clear thinking yields better systems, and better systems yield better outcomes.&lt;/p&gt;

&lt;p&gt;Then, gradually, it stops working.&lt;/p&gt;

&lt;p&gt;The failure mode is subtle. It rarely announces itself as an outage or a crisis. Instead of obvious failure, you get noise. More meetings. More coordination. More exceptions. Decisions take longer because every path now crosses someone else’s context. As a senior leader, you sense the slowdown before you can name it, long before any dashboard can point to a single broken thing.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The Limits of Delegation&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Many leaders instinctively reach for delegation when this slowdown appears. Delegate until it hurts. Give away the work you enjoy most. Give away the work that makes you feel useful. Give away the work that reinforces your identity as a strong engineer.&lt;/p&gt;

&lt;p&gt;This advice is not wrong. It is incomplete.&lt;/p&gt;

&lt;p&gt;Delegation at this level is not primarily about freeing time. It is about scaling judgment. You stop being a node in the system and become an architect of the system itself, responsible for how decisions get made when you are not in the room.&lt;/p&gt;

&lt;p&gt;That shift is uncomfortable precisely because it signals progress. Your influence becomes indirect. The feedback loop lengthens. You no longer get the satisfaction of fixing things yourself, only the slower evidence that the system is learning.&lt;/p&gt;

&lt;p&gt;If this shift is not internalized, senior roles quietly devolve into performance art. You attend meetings. You offer thoughtful opinions. Then, when things wobble, you pull yourself back into execution to compensate. This often works just long enough to embed the habit, especially in fast-growing organizations, before it abruptly fails.&lt;/p&gt;

&lt;p&gt;The visible behaviors of senior leadership are familiar. Speaking last. Reading the room. Asking thoughtful questions. These behaviors matter. They build trust and prevent obvious mistakes. But they are posture, not process. On their own, they do not change how an organization actually behaves.&lt;/p&gt;

&lt;p&gt;Director and VP roles demand a different operating model. Delegation only works if the system creates its own friction. If you delegate authority without architecting the consequences of failure, you have not built a team. You have simply abdicated your job.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Making the Consequence Real&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Engineering leadership is about making consequences real, even when doing so feels emotionally expensive, culturally risky, occasionally lonely, and deeply at odds with the part of you that would rather be liked.&lt;/p&gt;

&lt;p&gt;As Andy Grove once put it:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;A manager’s output is the output of their organization.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is not a statement about control. At scale, you rarely have that. It is a statement about leverage. Your real leverage lies in deciding where your attention goes and, more importantly, where it stubbornly refuses to leave.&lt;/p&gt;

&lt;p&gt;So what are these consequences, concretely?&lt;/p&gt;

&lt;p&gt;They are not punishments, warnings, or threats. They are predictable costs that appear when a problem is not addressed. Most leaders hesitate not because these costs are unclear, but because they feel politically or emotionally uncomfortable.&lt;/p&gt;

&lt;p&gt;In practice, consequences look like this. Quality slips and feature work slows because more time is spent explaining failures to customers and stakeholders. Reliability is deprioritized and senior leaders spend recurring time in reviews and postmortems instead of future planning. Teams avoid hard work and autonomy shrinks. Decisions that were once delegated now require visibility and discussion. When problems persist, leadership attention stays fixed on that area week after week while other initiatives wait.&lt;/p&gt;

&lt;p&gt;These are not arbitrary penalties. They are the natural outcomes of unresolved issues. Making the consequence real means refusing to absorb these costs silently on behalf of the system.&lt;/p&gt;

&lt;p&gt;At this level, you do not delegate solutions. You set direction, and you deliberately decide which consequences you are willing to personally carry and which ones you will push back into the organization.&lt;/p&gt;

&lt;p&gt;That last part is why this is hard.&lt;/p&gt;

&lt;p&gt;Making a consequence real means accepting that you will disappoint people, introduce friction, and surface conflicts that were previously hidden. It means spending your most limited resource, your attention, repeatedly on the same uncomfortable issue while other priorities compete loudly for it.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Designing Consequences That Teach&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Here is the mistake many leaders make. They add consequences for teams while insulating themselves from the cost.&lt;/p&gt;

&lt;p&gt;Consequences only teach when they first constrain leaders.&lt;/p&gt;

&lt;p&gt;If leadership can move on while a problem persists, the organization learns how to signal progress instead of changing behavior. For a consequence to teach, it must force leaders to keep paying attention when they would rather declare closure and focus elsewhere.&lt;/p&gt;

&lt;p&gt;In practice, this often means recurring reviews, repeated conversations, and sustained focus until outcomes actually change. The discomfort is the point. When leaders are forced to stay with a problem, pretending it is solved stops being an option.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;When the System Learns&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The system shifts when it becomes clear that problems will not be quietly carried by leadership anymore.&lt;/p&gt;

&lt;p&gt;At first, teams respond because leadership attention is present. Over time, they respond because avoiding the problem no longer helps. Waiting does not reduce scrutiny. Escalation does not make it disappear.&lt;/p&gt;

&lt;p&gt;That is when behavior changes. Issues are raised earlier because delay has no upside. Decisions get made closer to the work because responsibility is real. Conversations shorten because fewer things are being negotiated implicitly. The system starts correcting itself not because anyone is watching closely, but because ignoring problems has stopped working.&lt;/p&gt;

&lt;p&gt;This transition takes longer than most leaders expect. In our case, it required more than a year of sustained attention. It was tempting many times to declare partial victory and move on. Those were usually the moments that mattered most.&lt;/p&gt;

&lt;p&gt;Eventually, the extra attention is no longer needed. The behavior has become normal.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What the Role Is Actually For&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The goal of engineering leadership is not control. It is not accountability in the transactional sense. It is to architect conditions that make the right behavior inevitable by aligning incentives, focusing attention, and fostering continuous learning over time.&lt;/p&gt;

&lt;p&gt;For a long time, leaders are rewarded for absorbing pain. They smooth over missed commitments, translate rough edges for customers, and quietly carry systemic debt so teams can keep moving. That instinct is useful early on. At scale, it becomes the problem.&lt;/p&gt;

&lt;p&gt;The work, at this stage, is letting the cost of decisions land where those decisions are made. Lost autonomy, slowed progress, and sustained leadership attention become visible signals rather than private burdens. Only then does the organization get accurate feedback about how it is actually operating.&lt;/p&gt;

&lt;p&gt;When consequences are made real, behavior changes without coercion. Teams adjust not because they were told to, but because the system finally reflects reality back to them. Learning stops being optional.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;That is what the role is actually for.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>engineering</category>
    </item>
    <item>
      <title>An Ode to Stack Overflow: The Community That Taught a Generation to Think</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Thu, 15 Jan 2026 08:39:34 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/an-ode-to-stack-overflow-the-community-that-taught-a-generation-to-think-14i6</link>
      <guid>https://dev.to/isahilkapoor/an-ode-to-stack-overflow-the-community-that-taught-a-generation-to-think-14i6</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn7kto9drpoizegnw31t.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffn7kto9drpoizegnw31t.jpg" alt="An Ode to Stack Overflow: The Community That Taught a Generation to Think" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have been part of Stack Overflow for a long time. Long enough that my profile still feels like a piece of personal history rather than a line on a resume. Over the years, I crossed 11,000 reputation points, earned more than 180 badges, and, by conservative estimates, helped close to a million developers through answers, edits, and quiet improvements to existing posts. None of that felt transactional. It felt like belonging to a living, opinionated, sometimes pedantic, but deeply committed community of engineers trying to make the internet a little more correct.&lt;/p&gt;

&lt;p&gt;What made Stack Overflow special was not just the answers. It was the process. You learned how to ask better questions. You learned that clarity mattered, that reproductions mattered, that context mattered. Posts were edited by strangers who cared enough to make your question readable. Answers were challenged, refined, and sometimes replaced entirely. You waited. Four hours. Sometimes four days. And when someone finally replied, there was a shared sense of discovery. We found it. Not you. Not me. We.&lt;/p&gt;

&lt;p&gt;The rules were strict. Often frustrating. But they shaped a culture that rewarded precision and filtered noise. Over time, many of us internalized those standards. They made us better engineers, not just faster ones.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cliff Edge
&lt;/h2&gt;

&lt;p&gt;For a long time, the decline of Stack Overflow was subtle. Activity softened gradually, almost imperceptibly, like a city emptying out after peak hours. Then, sometime after 2022, it fell off a cliff.&lt;/p&gt;

&lt;p&gt;In March 2023, Stack Overflow saw roughly 87,000 questions. By December 2024, that number had collapsed to around 25,000. Fifteen years of growth erased in about eighteen months.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphps3ecb888ggn7kuf2f.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fphps3ecb888ggn7kuf2f.jpg" alt="An Ode to Stack Overflow: The Community That Taught a Generation to Think" width="800" height="418"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;That number matters because it sends the site back to its earliest days. The last time Stack Overflow saw activity at this level was in mid‑2009, less than a year after launch. This was not a moderation tweak or a temporary low; it was a structural change in how developers seek help.&lt;/p&gt;

&lt;p&gt;Since the launch of ChatGPT in November 2022, developers have largely stopped asking questions in public. They ask them in private, inside chat windows that answer instantly (and forget just as quickly).&lt;/p&gt;

&lt;h2&gt;
  
  
  The Twist
&lt;/h2&gt;

&lt;p&gt;At first glance, this looks like a familiar internet story. Community fades, traffic collapses, revenue follows. Except that is not what happened.&lt;/p&gt;

&lt;p&gt;Here is the twist. While the community evaporated, &lt;strong&gt;Stack Overflow’s revenue actually went up&lt;/strong&gt;. The company is reportedly pulling in over 115 million dollars a year. The reason is simple. They stopped selling ads to humans and started selling humans to machines.&lt;/p&gt;

&lt;p&gt;As developers stopped asking questions, companies like OpenAI and Google became desperate for high‑quality, well‑structured answers to train their models. Stack Overflow pivoted. It took the 58 million questions we wrote, debated, edited, and refined for free and licensed them as one of the most valuable training datasets on the internet. The most valuable asset was no longer the active community. It was the archive we left behind.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Irony
&lt;/h2&gt;

&lt;p&gt;When you ask a question on Stack Overflow, the answer becomes a public artifact. A junior developer three years from now can still find it. When you ask an AI, the answer is ephemeral. It exists for you for a few seconds and then disappears. We are trading a public library for a private whisper. We are no longer building a shared knowledge base. We are simply consuming one.&lt;/p&gt;

&lt;p&gt;There is a bitter irony here.&lt;/p&gt;

&lt;p&gt;The only reason these models can fix your code today is because people like me spent more than a decade answering questions for internet points. They were trained on our pedantry, our debates, our edits, and our insistence on correctness. They stand on the shoulders of reputation systems and gold badges that once rewarded contribution.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;We built the dataset that replaced us.&lt;/strong&gt; We are no longer the customers. We are the product, frozen in amber and sold by the gigabyte.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Trade‑Off
&lt;/h2&gt;

&lt;p&gt;I used to spend hours crafting the perfect minimal, reproducible example to post on Stack Overflow, just to avoid getting downvoted. Now I paste a messy error log into ChatGPT or Claude and say, fix this.&lt;/p&gt;

&lt;p&gt;It is faster. It is easier. And it is safer. A bot does not judge you for not reading the documentation. It does not close your question or mark it as off‑topic.&lt;/p&gt;

&lt;p&gt;But here is what keeps bothering me: &lt;strong&gt;the knowledge freeze.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;AI models are trained on the past. Your answers from 2016 help ChatGPT write React code today. But when React 20 ships next year, where does the new training data come from if everyone asks the AI and no one asks in public?&lt;/p&gt;

&lt;p&gt;We are burning the furniture to keep the house warm. By moving from a public square to a private chat window, we have stopped documenting the cutting edge. The AI knows everything we did, but it has no reliable way to learn what we are doing now.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Remains
&lt;/h2&gt;

&lt;p&gt;Even now, I still type stackoverflow.com directly into my browser from time to time. Not because I expect new answers, but out of habit. Muscle memory. Nostalgia. I miss the feeling of asking a question in public and trusting that another human, somewhere else in the world, would take the time to help.&lt;/p&gt;

&lt;p&gt;Stack Overflow itself seems to recognize this transition. Its leadership speaks of a new era that blends community knowledge with AI‑assisted workflows. Perhaps that future will work. Perhaps the platform will survive, transformed into something adjacent rather than central.&lt;/p&gt;

&lt;p&gt;But whatever comes next, the old Stack Overflow mattered. It taught a generation of developers how to think, not just how to code. It showed us that generosity scaled, that strangers could collaborate at internet scale, and that patience was sometimes the price of truth.&lt;/p&gt;

&lt;p&gt;If this really is the end of that era, it deserves acknowledgment. Not as a failure, but as a foundation.&lt;/p&gt;

&lt;p&gt;And for that, I remain grateful.&lt;/p&gt;

</description>
      <category>opinions</category>
    </item>
    <item>
      <title>How Cybersecurity Will Evolve in 2026</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Mon, 12 Jan 2026 04:35:41 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/how-cybersecurity-will-evolve-in-2026-274f</link>
      <guid>https://dev.to/isahilkapoor/how-cybersecurity-will-evolve-in-2026-274f</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjcz3g0dswqquz8laeqnw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjcz3g0dswqquz8laeqnw.jpg" alt="How Cybersecurity Will Evolve in 2026" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Over the last few years, even the organizations we tend to put on a pedestal for being “security‑mature” got absolutely hammered. It wasn’t just the small fish. &lt;a href="https://www.hhs.gov/hipaa/for-professionals/special-topics/change-healthcare-cybersecurity-incident-frequently-asked-questions/index.html?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;Change Healthcare&lt;/a&gt;, a massive, heavily regulated provider, was knee‑capped by ransomware in early 2024 after attackers slipped in using stolen credentials and moved laterally with little resistance. &lt;a href="https://sec.okta.com/articles/2023/11/unauthorized-access-oktas-support-case-management-system-root-cause/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;Okta&lt;/a&gt;, a company whose entire existence revolves around identity, had to admit in late 2023 that its own support system was compromised, leading to session token theft that bypassed the very protections it sells. Then came the &lt;a href="https://www.wired.com/story/snowflake-breach-ticketmaster-santander-ticketek-hacked?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;Snowflake&lt;/a&gt; mess in mid‑2024, where the damage wasn’t driven by some brilliant zero‑day, but by customers getting burned through credential reuse and service accounts with far too much access.&lt;/p&gt;

&lt;p&gt;When teams sat down for post‑incident reviews, the same ugly details kept resurfacing. It wasn’t sophisticated nation‑state magic. It was critical infrastructure running on three‑year‑old spreadsheets, printed runbooks, and asset inventories that were maybe 60 percent accurate on a good day. Nobody could say with real confidence which systems were clean and which weren’t. That kind of uncertainty never shows up on a polished executive dashboard. It shows up at 2 a.m., when you hesitate to revoke access because you don’t know which production system will fall over, or when recovery stalls because no one fully trusts the backups.&lt;/p&gt;

&lt;p&gt;The failure wasn’t a lack of tools. We’re drowning in tools. It was the stubborn belief that “keeping them out” was the same thing as being safe. By the time organizations reached 2025, that belief was already dead in the water.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the last breach cycle actually exposed
&lt;/h2&gt;

&lt;p&gt;If you dig into the logs from the last cycle, very little of it feels new. It’s the same story we’ve seen before, just faster. These incidents were the predictable bill coming due for systems that scaled far more quickly than our ability, or willingness, to manage them properly.&lt;/p&gt;

&lt;p&gt;Ransomware rarely began with clever exploitation. It started with a valid login and a handful of built‑in administrative tools used to elevate privileges. In many cases, recovery paths had been quietly broken for months by updates or configuration drift, but no one noticed until alarms finally went off. I’ve sat in reviews where teams didn’t even realize an intruder was present until the restore attempt failed. At that point, the conversation stops being about “controls” and starts being about whether the business survives the week.&lt;/p&gt;

&lt;p&gt;Identity failures followed the same pattern. MFA existed, sure, but attackers simply walked around it using session hijacking and token abuse. Okta and Entra ID sessions stayed alive for days, long after users stopped paying attention. We poured energy into hardening the login prompt and largely ignored what happened &lt;em&gt;after&lt;/em&gt; someone got in.&lt;/p&gt;

&lt;p&gt;And the cloud environment only magnified the blast radius. Service accounts hoarded permissions like digital packrats. Third‑party integrations stayed active years after the teams that bought them had moved on. When something broke, it rarely broke in isolation; it broke wide.&lt;/p&gt;

&lt;h2&gt;
  
  
  If you’re still trying to out‑click an LLM, you’ve lost
&lt;/h2&gt;

&lt;p&gt;Here’s the reality we have to face. AI stripped away the friction that once slowed attackers down. Reconnaissance, phishing, and crafting malware used to take time and skill. Large language models turned those steps into cheap, repeatable scripts. The gap between initial access and total disaster shrank to almost nothing.&lt;/p&gt;

&lt;p&gt;Defensive teams didn’t get faster. We’re still human. I’ve watched analysts manually triage alerts, alt‑tabbing between six different tools and arguing over priority levels, while an attacker’s script quietly issued new OAuth tokens in the background. It’s painful to watch.&lt;/p&gt;

&lt;p&gt;That’s why the traditional SOC model is failing. It’s not because analysts are bad at their jobs. It’s because the model assumes a human can type and think faster than software. Once you say that out loud, the flaw becomes obvious. Security operations have to behave like systems, not help desks. Correlation, investigation, and containment need to happen continuously, without waiting for a human to notice the pattern. People still matter, but mostly at the edges: deciding when to pull the plug on an account, isolate a production environment, or make a messy disclosure call.&lt;/p&gt;

&lt;h2&gt;
  
  
  Assume you’re already cooked (and plan accordingly)
&lt;/h2&gt;

&lt;p&gt;With how tangled modern vendor ecosystems have become, the idea of “perfect exclusion” is mostly theater. Something is going to fail. The only real question is whether you’re ready for that moment or still designing as if it won’t happen to you.&lt;/p&gt;

&lt;p&gt;Resilience isn’t a slogan. It shows up in the boring details. Backups that are &lt;em&gt;actually&lt;/em&gt; isolated, not just logically separated. Recovery plans that have been exercised under pressure, not just approved in a quarterly review. I once saw a disaster‑recovery drill fall apart because the one person who knew the password for a legacy dependency was on a fishing trip. That’s the lived reality of “resilience.”&lt;/p&gt;

&lt;p&gt;Strong programs don’t avoid incidents entirely. They keep incidents small. They limit how far access spreads, how long systems stay unreliable, and how much trust erodes while control is restored.&lt;/p&gt;

&lt;h2&gt;
  
  
  Identity is the only perimeter left, and it’s a mess
&lt;/h2&gt;

&lt;p&gt;When incidents are dissected honestly, very few start with a broken firewall. They start when an attacker successfully acts as someone who already belongs inside the system.&lt;/p&gt;

&lt;p&gt;Cloud breaches made this impossible to ignore. Long‑lived sessions, over‑permissioned service accounts, and forgotten roles turned identity into the easiest path through the environment. Okta and Entra ID tokens often stayed valid long enough for a minor compromise to snowball into a major one.&lt;/p&gt;

&lt;p&gt;Access decisions can no longer be static. Context changes. Devices drift. Behavior shifts. Trust has to decay unless something continuously earns it back. Credentials that live forever and permissions that never expire don’t fail loudly. They fail months later, quietly, when nobody remembers why they exist.&lt;/p&gt;

&lt;p&gt;Cloud infrastructure amplifies every one of these mistakes. Permissions grow faster than reviews. Infrastructure‑as‑code drifts from reality. Integrations outlive the teams that approved them. During live incidents, teams often realize a vendor still has access only because that access is being actively abused right in front of them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Regulation isn’t paperwork anymore
&lt;/h2&gt;

&lt;p&gt;For smaller companies, regulation stopped being theoretical the moment enforcement timelines tightened. It doesn’t arrive as a blog post or a checklist. It shows up as an angry email asking for logs, access histories, and explanations on deadlines that don’t care how your systems were built.&lt;/p&gt;

&lt;p&gt;Teams that treated logging and access control as afterthoughts end up reconstructing history under stress. Teams that baked those controls into their systems move faster, with fewer surprises. The difference only becomes obvious once someone external is asking uncomfortable questions.&lt;/p&gt;

&lt;h2&gt;
  
  
  We don’t have a talent gap. We have a priority problem
&lt;/h2&gt;

&lt;p&gt;The industry loves to blame a “shortage of people,” mostly because it avoids a harder conversation about incentives and trade‑offs.&lt;/p&gt;

&lt;p&gt;What’s actually scarce are organizations willing to pay for senior engineers who understand how systems fail, how identity propagates, and how to read a PCAP without a wizard. It’s easier to buy another tool than to change how teams are built. The teams that hold up under pressure are usually smaller. They automate routine work, keep system boundaries explicit, and reserve human attention for decisions that actually matter.&lt;/p&gt;

&lt;h2&gt;
  
  
  One last uncomfortable truth
&lt;/h2&gt;

&lt;p&gt;A lot of companies will spend the next year swapping tools, renaming teams, and telling themselves that this time the platform will save them. New dashboards will get rolled out. Old problems will get relabeled. Very little will change in how systems are actually designed or operated.&lt;/p&gt;

&lt;p&gt;If you’re building or running a security program right now, the uncomfortable work is much less glamorous. It’s asking which identities you would disable first if something went wrong. It’s finding out whether you can actually restore your most critical system without guesswork.&lt;/p&gt;

&lt;p&gt;And this is where AI actually earns its keep. Not as a silver bullet, but as leverage. Machines can sift through logs, correlate identity signals, and surface real risk faster than any human team ever could. Used well, AI gives security teams breathing room. It takes the grunt work off their plate so people can focus on judgment calls that still require context, accountability, and experience.&lt;/p&gt;

&lt;p&gt;The real decision is whether you design your systems to take advantage of that leverage, or keep pretending humans will somehow be fast enough on their own.&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>ai</category>
    </item>
    <item>
      <title>Software Is Not Flexible. It Hardens as It Grows</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Mon, 05 Jan 2026 02:42:25 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/software-is-not-flexible-it-hardens-as-it-grows-1n46</link>
      <guid>https://dev.to/isahilkapoor/software-is-not-flexible-it-hardens-as-it-grows-1n46</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1l6y108bkxirh0eph6cr.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1l6y108bkxirh0eph6cr.jpg" alt="Software Is Not Flexible. It Hardens as It Grows" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We’ve been misled by the word  &lt;strong&gt;&lt;em&gt;soft&lt;/em&gt;&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Software&lt;/em&gt; has earned a strange reputation over the years. Not just for being powerful, but for being forgiving. Easy to bend. Easy to revise. Easy to clean up once the real requirements finally become clear. That belief doesn’t usually come from engineers. In most teams, it is reinforced by product timelines, business pressure, and the comfort of knowing that nothing ships in steel or concrete. Code feels negotiable. Mistakes feel reversible.&lt;/p&gt;

&lt;p&gt;Early in a project, code behaves like &lt;strong&gt;Lego&lt;/strong&gt; blocks. You snap things together, pull them apart, rebuild the castle into a spaceship, and nothing breaks in any visible way. If a decision turns out to be wrong, you patch around it and move on. Compared to steel, concrete, or wiring, software feels forgiving, almost indulgent of experimentation.&lt;/p&gt;

&lt;p&gt;As a codebase grows, services multiply, more developers contribute code, and business logic keeps getting layered in, it starts to behave more like  &lt;strong&gt;wet concrete&lt;/strong&gt;. Early on, you can still push it around, move walls, redraw boundaries, and smooth over rough edges. But once the system is coordinating across services, holding persistent data, serving real users, and supporting revenue, it begins to set. From that point on, change no longer feels like editing. It feels like demolition.&lt;/p&gt;

&lt;p&gt;Most teams don’t notice the moment this shift happens. They notice it later, when things that used to be easy start taking longer, and when &lt;em&gt;small changes begin to carry an uncomfortable amount of risk.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The Lego mindset is part of how teams get there. It shows up as small technical shortcuts that feel harmless at the time, like a missed cache key in Redis that quietly sends traffic back to SQL Server, or a background job that bypasses the domain layer because it is “just internal.” If you stack bricks just because they fit, you don’t end up with flexibility. You end up with a structure that works right up until it doesn’t. One edge case, one traffic spike, or one new requirement later, the wobble starts.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Optimising for speed of assembly is not the same as optimising for survival under load.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Real engineering is about whether the internal bonds hold when the system is stressed, not about how fast you can put something together. The Lego approach optimises for getting to “done,” but it does very little to optimise for load, failure modes, or long-term change.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Double Standard
&lt;/h2&gt;

&lt;p&gt;Most systems don’t fall apart when they are failing. They fall apart right after they start working.&lt;/p&gt;

&lt;p&gt;Once something is live and people depend on it, the definition of risk quietly shifts. Shipping fast still matters, but breaking things suddenly matters more. So teams start making small compromises to keep momentum going. They skip an abstraction to hit a deadline. They hardcode a path because it is faster than untangling the model. They promise themselves they will clean it up later, when things are calmer.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Remember, that calm never comes.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In every other discipline, scale brings more rigor, not less. Bridges do not loosen their standards as traffic increases. Aviation systems do not get more casual as routes expand. Software, somehow, is treated as the exception. We justify this because we can’t see the cracks. There is no sagging beam or visible fracture. Everything looks fine until it isn’t, so we tell ourselves the risk is abstract, manageable, and something we can deal with later.&lt;/p&gt;

&lt;p&gt;We would never accept that logic anywhere else.&lt;/p&gt;

&lt;p&gt;Imagine a pilot saying:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“We’re behind schedule. Let’s skip the pre-flight check. We’ll just push an engine patch mid-air if things start smoking.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or a surgeon saying:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“I’m skipping the hand-washing to save time. We can treat the sepsis during the next sprint.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Or a civil engineer saying:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“The foundation is cracking, but the client needs the penthouse finished by Friday. We’ll refactor the basement once the building is full of people.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In those fields, discipline is the price of entry. In software, we still talk about it as if it were a nice-to-have.&lt;/p&gt;

&lt;p&gt;Once users depend on a behavior, a temporary hack quietly stops being temporary. It becomes an  &lt;strong&gt;undocumented contract&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Physics of Slowdown
&lt;/h2&gt;

&lt;p&gt;Once those contracts exist, the system starts to change how it behaves.&lt;/p&gt;

&lt;p&gt;This is where &lt;em&gt;cohesion&lt;/em&gt; and &lt;em&gt;coupling&lt;/em&gt; stop being abstract ideas and start showing up in daily work. Responsibilities that belong together end up scattered. A change in one place quietly reaches into others. Code stops reflecting intent and starts reflecting everything it has ever had to accommodate.&lt;/p&gt;

&lt;p&gt;You see the effects long before anyone names the cause. Engineers hesitate before making small changes. Deployments get delayed, not because the change is large, but because nobody is confident about what else might be affected.&lt;/p&gt;

&lt;p&gt;At first, this friction is easy to ignore. Then it compounds.&lt;/p&gt;

&lt;p&gt;Features that once took days start taking weeks. Bug fixes trigger regressions. Releases stop feeling routine and start feeling dangerous. The  &lt;strong&gt;Total Cost of Ownership&lt;/strong&gt;  does not rise gradually. It jumps.&lt;/p&gt;

&lt;p&gt;When teams say they have slowed down, this is usually why. Not because engineers suddenly care too much about quality, but because the system has lost its ability to absorb change.&lt;/p&gt;

&lt;p&gt;Other industries never had the luxury of pretending otherwise. A sloppy manufacturing supply chain can bankrupt a company in a quarter. A loose telecom protocol can take an entire network down. Poorly planned urban infrastructure costs lives.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;In those fields, technical debt isn’t a metaphor. It’s a suicide note.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Software is the only place where we’ve convinced ourselves that the bill can be outrun indefinitely.&lt;/p&gt;

&lt;h2&gt;
  
  
  Knowing When to Step Back
&lt;/h2&gt;

&lt;p&gt;The instinct to defer discipline usually comes from good intent, not ignorance. Lean Startup thinking pushed back, rightly, against over-engineering before product-market fit. Many teams wasted time building futures that never arrived.&lt;/p&gt;

&lt;p&gt;The problem begins when that mindset outlives its usefulness and teams fail to recognise the inflection point.&lt;/p&gt;

&lt;p&gt;Once a product works, &lt;em&gt;later rarely comes&lt;/em&gt;. Usage grows, data accumulates, teams expand, and the system quietly shifts from experiment to infrastructure. At that point, treating discipline as optional stops being pragmatic.&lt;/p&gt;

&lt;p&gt;This is the moment where product and business leadership matters most. Speed does not come from ignoring technical limits. It comes from knowing when the system needs time, not features.&lt;/p&gt;

&lt;p&gt;Slowing down here does not mean stopping. Refactoring and rebuilding are not aesthetic exercises. They are strategic interventions to restore the system’s ability to change. Sometimes that means untangling a dependency. Sometimes it means rebuilding a critical path.&lt;/p&gt;

&lt;p&gt;The only sustainable approach is incremental change. Patterns like the &lt;a href="https://sahilkapoor.com/the-strangler-fig-pattern/" rel="noopener noreferrer"&gt;&lt;em&gt;Strangler Fig&lt;/em&gt;&lt;/a&gt; exist because hardened systems can’t be reset, only replaced piece by piece.&lt;/p&gt;

&lt;p&gt;Knowing when to listen to these signals is what keeps speed, reliability, and trust intact as systems grow.&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>opinions</category>
    </item>
    <item>
      <title>Best AI Tools for Product Managers in 2026</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Fri, 02 Jan 2026 09:00:00 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/best-ai-tools-for-product-managers-in-2026-5ad2</link>
      <guid>https://dev.to/isahilkapoor/best-ai-tools-for-product-managers-in-2026-5ad2</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjvvcs8bxrycpie2sq058.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjvvcs8bxrycpie2sq058.jpg" alt="Best AI Tools for Product Managers in 2026" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Two years ago, most product managers were experimenting with generic AI tools like ChatGPT to speed up writing or brainstorming. In 2026, the landscape looks fundamentally different. A full stack of focused AI tools now exists across discovery, prototyping, research, automation, and storytelling. This shift forces a change in mindset. You can either adapt and move 10x faster from PRDs to MVPs, or fall behind while others compound their speed.&lt;/p&gt;

&lt;p&gt;This post is a practical follow-up to my earlier essay &lt;a href="https://sahil-kapoor.medium.com/ai-and-the-rise-of-the-full-stack-product-manager-37069669bf0a?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;em&gt;AI and the Rise of the Full-Stack Product Manager&lt;/em&gt;&lt;/a&gt;. In that piece, I argued that AI is turning PMs into full-stack builders who can think, design, execute, and automate without waiting on others. Here, I break down the most useful AI tools for PMs in 2026, with hands-on perspectives on where they shine and how they are actually used.&lt;/p&gt;

&lt;h2&gt;
  
  
  🚀 Prompt to MVP Tools
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://bolt.new/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Bolt.new&lt;/strong&gt;&lt;/a&gt;: From idea to full-stack MVP
&lt;/h3&gt;

&lt;p&gt;Bolt is one of the fastest ways to turn a raw idea into a working product. You describe your product in plain English, and Bolt generates a complete web application including front end, back end, and database. It is ideal for stakeholder demos, overnight validation experiments, or pressure-testing ideas before pulling in engineering resources. It is not designed for complex production architectures, but it excels at speed and clarity during early validation.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://lovable.dev/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Lovable&lt;/strong&gt;&lt;/a&gt;: Vibe coding made real
&lt;/h3&gt;

&lt;p&gt;Lovable feels like pair programming with an invisible engineering team. You describe workflows, screens, and logic, and it builds usable UIs and backend flows. PMs can use it to ship internal tools, feedback portals, or experimental pilots that would otherwise remain stuck in the backlog. It does not replace refined design systems, but it unlocks independent shipping.&lt;/p&gt;

&lt;h2&gt;
  
  
  🎨 Storytelling and Presentation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://gamma.app/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Gamma&lt;/strong&gt;&lt;/a&gt;: AI decks without PowerPoint pain
&lt;/h3&gt;

&lt;p&gt;Gamma changes how PMs communicate strategy and progress. You feed it structured bullets or raw notes, and it generates clean, interactive presentations. It works well for roadmap reviews, leadership updates, investor decks, and product launches. Deep brand control is limited, but the storytelling speed is unmatched.&lt;/p&gt;

&lt;h2&gt;
  
  
  🖌️ UX Design and Flow Builders
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://uxpilot.ai/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;UXPilot&lt;/strong&gt;&lt;/a&gt;: AI-powered wireframes in minutes
&lt;/h3&gt;

&lt;p&gt;UXPilot works as both a Figma plugin and a standalone tool. From prompts, it generates wireframes, screens, and end-to-end flows. PMs use it to align teams visually early, test multiple journeys quickly, or reduce design back-and-forth before involving designers. It is not intended for final UI polish, but it saves significant design bandwidth.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://uizard.io/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Uizard&lt;/strong&gt;&lt;/a&gt;: From sketch to prototype
&lt;/h3&gt;

&lt;p&gt;Uizard turns sketches, screenshots, or text prompts into interactive prototypes. It is particularly useful in workshops where ideas need to come alive immediately. While it struggles with complex design systems, it excels at collaboration, ideation, and rapid validation.&lt;/p&gt;

&lt;h2&gt;
  
  
  📊 Strategic Product Management
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://www.productboard.com/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Productboard AI&lt;/strong&gt;&lt;/a&gt;: Prioritization at scale
&lt;/h3&gt;

&lt;p&gt;Productboard AI clusters large volumes of feedback into themes and links them to features and outcomes. PMs use it to defend roadmap decisions, plan sprints with confidence, and bring structure to overwhelming customer input. It works best when feedback pipelines are clean and well-integrated.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://zeda.io/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Zeda.io&lt;/strong&gt;&lt;/a&gt;: AI-led product discovery
&lt;/h3&gt;

&lt;p&gt;Zeda aggregates signals from support tickets, NPS, and community channels to surface real customer pain points. It helps validate ideas with evidence and identify patterns across silos. Initial setup requires effort, but the discovery payoff is substantial.&lt;/p&gt;

&lt;h2&gt;
  
  
  📝 Daily PM Copilots
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://www.notion.so/product/ai?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Notion AI&lt;/strong&gt;&lt;/a&gt;: AI inside your workspace
&lt;/h3&gt;

&lt;p&gt;Notion AI lives where many PMs already work. It helps draft PRDs from messy notes, summarize long documents, and propose structured outlines. While it can oversimplify technical nuance, it reliably saves hours on daily documentation.&lt;/p&gt;

&lt;h2&gt;
  
  
  🔍 Research Second Brains
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://notebooklm.google/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;NotebookLM&lt;/strong&gt;&lt;/a&gt;: Google’s AI research notebook
&lt;/h3&gt;

&lt;p&gt;NotebookLM acts like a personal research analyst. PMs upload PDFs, docs, and reports, then ask questions, generate summaries, or even produce audio briefings. It remains experimental, but it is extremely effective for research-heavy product work.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://saner.ai/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Saner.AI&lt;/strong&gt;&lt;/a&gt;: Auto-organized knowledge hub
&lt;/h3&gt;

&lt;p&gt;Saner automatically organizes notes, tasks, and information into a searchable system. It shines for PMs who suffer from fragmented context across tools and want a single, recallable knowledge base.&lt;/p&gt;

&lt;h2&gt;
  
  
  ⚡ Workflow and Marketing Automation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://gumloop.com/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Gumloop&lt;/strong&gt;&lt;/a&gt;: No-code growth automations
&lt;/h3&gt;

&lt;p&gt;Gumloop is built for marketing and growth workflows. PMs use it to collect surveys, analyze responses with AI, and push insights into tools like Notion or Slack. It is especially valuable for cutting repetitive operational work.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://n8n.io/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;n8n&lt;/strong&gt;&lt;/a&gt;: Developer-grade automations
&lt;/h3&gt;

&lt;p&gt;n8n is a powerful automation platform suited for technical PMs. It enables complex multi-step workflows, deep integrations, and AI-driven product ops across systems. The learning curve is higher, but the leverage is significant.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://zapier.com/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Zapier&lt;/strong&gt;&lt;/a&gt;: The easiest automation entry point
&lt;/h3&gt;

&lt;p&gt;Zapier remains the fastest way to connect tools without code. PMs use it for lightweight automations like summarizing feedback, syncing reports, or connecting forms. It is not ideal for complex logic, but its simplicity is unmatched.&lt;/p&gt;

&lt;h2&gt;
  
  
  ⭐ Other Noteworthy AI Tools
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://amplitude.com/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Amplitude AI&lt;/strong&gt;&lt;/a&gt;: Helps forecast churn and predict user behavior.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://otter.ai/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Otter.ai&lt;/strong&gt;&lt;/a&gt;: Records and summarizes meetings automatically.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.figma.com/ai?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Figma AI&lt;/strong&gt;&lt;/a&gt;: Speeds up wireframing and ideation.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://clickup.com/ai?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;ClickUp Brain&lt;/strong&gt;&lt;/a&gt;: Helps with task prioritization.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.crayon.co/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Crayon&lt;/strong&gt;&lt;/a&gt;: Tracks competitor moves in real time.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.pendo.io/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Pendo&lt;/strong&gt;&lt;/a&gt;: Blends product analytics with in-app engagement.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.usemotion.com/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Motion&lt;/strong&gt;&lt;/a&gt;: Organizes tasks and schedules with AI.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;The best AI tools for product managers in 2026 are no longer novelty experiments. They are becoming the default operating system for modern product work. AI will not replace product managers, but PMs who use AI will consistently outpace those who do not.&lt;/p&gt;

&lt;p&gt;This stack is only a starting point. If you have used AI tools that fundamentally changed how you build, ship, or learn, share them. I will keep evolving this list as the ecosystem matures.&lt;/p&gt;

</description>
      <category>productmanagement</category>
      <category>ai</category>
    </item>
    <item>
      <title>Embedding Flutter Modules into Native Android and iOS Apps</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Tue, 30 Dec 2025 09:55:00 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/embedding-flutter-modules-into-native-android-and-ios-apps-53ff</link>
      <guid>https://dev.to/isahilkapoor/embedding-flutter-modules-into-native-android-and-ios-apps-53ff</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0seg3b9z4g5y7gtojc0s.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0seg3b9z4g5y7gtojc0s.jpg" alt="Embedding Flutter Modules into Native Android and iOS Apps" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We have an app built in native iOS and Android. It is a large codebase with years of history, complex navigation, and a fairly involved CI/CD setup.&lt;/p&gt;

&lt;p&gt;The product team wanted to add a small mini-game inside the Rewards tab. The goal was to make the rewards section stickier. Static coupons were not cutting it, and they wanted a "Falling Coins" style interaction to drive daily logins.&lt;/p&gt;

&lt;p&gt;I was already working with Flutter at the time, and I had some bandwidth. My lead asked if I could take this on without disturbing the main native teams. It was clearly a side feature, valuable for engagement but not worth weeks of duplicated native work.&lt;/p&gt;

&lt;p&gt;Given those constraints, I had to figure out how to build this without slowing down the native release train. The options were not great:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;WebViews:&lt;/strong&gt;  We tried this first. The touch latency was awful, and it felt like a cheap webpage stuck inside a premium app.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unity:&lt;/strong&gt;  Overkill. Adding the Unity runtime would have bloated our APK and IPA size by 20MB or more for a game that users might play for 30 seconds.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Native:&lt;/strong&gt;  Writing physics logic and collision detection twice, once in Swift and once in Kotlin, and trying to keep the gravity constants in sync was not appealing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That is when we decided to try &lt;a href="https://docs.flutter.dev/add-to-app?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Flutter Add-to-App&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Experiment: A Game Engine in Disguise
&lt;/h3&gt;

&lt;p&gt;We treated Flutter not as a UI framework, but as a lightweight rendering engine. We used the official Add-to-App documentation to integrate a Flutter module into our existing Gradle and CocoaPods setups.&lt;/p&gt;

&lt;p&gt;The game itself was a simple &lt;strong&gt;2D game&lt;/strong&gt;, built using &lt;a href="https://flame-engine.org/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;Flame&lt;/strong&gt;&lt;/a&gt;, a game engine on top of Flutter. We were dealing with sprites, basic physics, collision detection, and a predictable game loop, exactly the kind of workload where Flutter performs well. We bundled our game assets, images and sounds, directly inside the module, which kept the native project clean.&lt;/p&gt;

&lt;p&gt;Here is why this approach clicked. In the game, we had to calculate the trajectory of falling coins and detect when the user’s basket caught them.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;In native:&lt;/strong&gt;  I would have had to write the same updatePosition logic in Kotlin and Swift, hoping the math matched perfectly on both platforms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In Flutter:&lt;/strong&gt;  I wrote the physics logic once in Dart.&lt;/li&gt;
&lt;/ul&gt;
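
&lt;p&gt;As an illustration of what “write it once” meant, the shared update step looked conceptually like the Dart sketch below. The class, constant, and function names here are made up for the example, not the real game code, and the gravity value is a placeholder:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class Coin {
  double x, y, velocityY;
  Coin(this.x, this.y, this.velocityY);
}

// One gravity constant, shared by both platforms (hypothetical value).
const double gravity = 400.0; // pixels per second squared

// Called once per frame by the game loop with the elapsed time in seconds.
void updatePosition(Coin coin, double dt) {
  coin.velocityY += gravity * dt;
  coin.y += coin.velocityY * dt;
}

// True when the coin has reached the basket's row and overlaps it horizontally.
bool caughtBy(Coin coin, double basketX, double basketWidth, double basketY) {
  return coin.y &gt;= basketY &amp;&amp;
      coin.x &gt;= basketX &amp;&amp;
      coin.x &lt;= basketX + basketWidth;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because this logic exists exactly once, there is no risk of the two platforms drifting apart in feel.&lt;/p&gt;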

&lt;p&gt;The result was consistency. The game felt identical on an iPhone 14 and a Samsung S21. The animation stayed locked at 60fps because Flutter draws directly to its own Skia and Impeller rendering pipeline, bypassing native UI hierarchy limitations.&lt;/p&gt;

&lt;p&gt;Sometimes, watching hot reload work instantly on both the iOS simulator and Android emulator, it honestly feels like Flutter is taking over the parts of mobile development that used to be the most painful. I have written more about this shift earlier in &lt;a href="https://sahilkapoor.com/flutter-is-taking-over/" rel="noopener noreferrer"&gt;Flutter Is Taking Over&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Gotchas
&lt;/h3&gt;

&lt;p&gt;While the game logic was smooth, the integration required some architectural work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Engine warm-up&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first time we launched the game activity, there was a visible black screen for about 400 milliseconds. This happens because the Flutter engine takes time to spin up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt;  We implemented pre-warming. We initialized the FlutterEngine in Application.onCreate on Android and AppDelegate on iOS, and cached it for reuse.&lt;/p&gt;

&lt;p&gt;Here is the code we added to Application.kt to warm the engine before the user ever sees it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Pre-warm the engine in the background
val flutterEngine = FlutterEngine(this)

// Start executing Dart code to warm up the engine
flutterEngine.dartExecutor.executeDartEntrypoint(
    DartExecutor.DartEntrypoint.createDefault()
)

// Cache it globally to be picked up by the Activity later
FlutterEngineCache.getInstance().put("my_game_engine_id", flutterEngine)

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By the time the user tapped the game tab, the engine was already hot and waiting, with no visible lag. On iOS, we did the exact same thing in AppDelegate using a &lt;a href="https://api.flutter.dev/objcdoc/Classes/FlutterEngineGroup.html?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;FlutterEngineGroup&lt;/strong&gt;&lt;/a&gt; to pre-warm and reuse engines efficiently across view controllers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Talking to the host&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The game needed to talk to the native app when something meaningful happened, for example when a user won a coupon or finished a round. This is where &lt;a href="https://docs.flutter.dev/platform-integration/platform-channels?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;&lt;strong&gt;MethodChannel&lt;/strong&gt;&lt;/a&gt; comes in.&lt;/p&gt;

&lt;p&gt;We treated the Flutter module as a self-contained game engine, but delegated anything related to user state, persistence, or rewards back to the native app. Flutter emitted events, and the host app decided what to do with them.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dart:&lt;/strong&gt;  Calls &lt;code&gt;platform.invokeMethod("couponWon", {"value": 10})&lt;/code&gt; when the game logic determines a reward.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kotlin / Swift:&lt;/strong&gt;  Listens for the couponWon event, validates it, and persists it using existing native infrastructure like Room, Core Data, or backend APIs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This separation kept the Flutter side focused purely on gameplay, while the native app remained the source of truth for user data and rewards. MethodChannels are simple, but they require discipline. Method names become an implicit contract between Flutter and native code, and changing them needs coordination.&lt;/p&gt;
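
&lt;p&gt;As a rough sketch, the native side of that contract can be registered like this. The channel name &lt;code&gt;game/rewards&lt;/code&gt; and the handler body are illustrative placeholders, not the actual production identifiers:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import io.flutter.embedding.engine.FlutterEngine
import io.flutter.plugin.common.MethodChannel

// Hypothetical channel name; it must match the string used on the Dart side.
const val REWARDS_CHANNEL = "game/rewards"

fun registerGameChannel(flutterEngine: FlutterEngine) {
    MethodChannel(flutterEngine.dartExecutor.binaryMessenger, REWARDS_CHANNEL)
        .setMethodCallHandler { call, result -&gt;
            when (call.method) {
                "couponWon" -&gt; {
                    val value = call.argument&lt;Int&gt;("value") ?: 0
                    // Validate and persist via existing native infrastructure
                    // (Room, backend APIs, etc.) -- illustrative only.
                    result.success(null)
                }
                else -&gt; result.notImplemented()
            }
        }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;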

&lt;p&gt;&lt;strong&gt;3. The navigation struggle&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This was the trickiest part. When a user is inside the Flutter game, they expect the Android hardware Back button to pause the game, not kill the Activity immediately.&lt;/p&gt;

&lt;p&gt;We intercepted the back gesture on the native side and asked Flutter whether it could handle it. If the game was running, Flutter paused it. If the user was already at a menu, Flutter told native to go ahead and close the screen.&lt;/p&gt;
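
&lt;p&gt;On Android, that hand-off can be sketched roughly like this. The channel and method names are illustrative, and error handling is reduced to the essentials:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Inside the FlutterActivity subclass hosting the game.
// "game/navigation" and "onBackPressed" are hypothetical names.
onBackPressedDispatcher.addCallback(this) {
    val channel = MethodChannel(
        flutterEngine!!.dartExecutor.binaryMessenger,
        "game/navigation"
    )
    channel.invokeMethod("onBackPressed", null, object : MethodChannel.Result {
        override fun success(handled: Any?) {
            // true: Flutter paused the game and consumed the event.
            if (handled != true) finish() // already at a menu: close the screen
        }

        override fun error(code: String, message: String?, details: Any?) = finish()

        override fun notImplemented() = finish()
    })
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;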

&lt;h3&gt;
  
  
  Pros and Cons
&lt;/h3&gt;

&lt;p&gt;If you are considering embedding Flutter for a specific feature like a game, a dashboard, or a complex form, here is an honest breakdown.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pixel consistency:&lt;/strong&gt;  The game looks identical across platforms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt;  For 2D animations and casual games, Flutter outperforms WebViews and is significantly lighter than Unity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iteration speed:&lt;/strong&gt;  Tweaking gameplay variables took seconds with hot reload. In native code, each change would have required a full rebuild.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;App size:&lt;/strong&gt;  Even with optimizations, linking the Flutter engine added roughly 5 to 7MB to the app. Android App Bundles helped, but the base increase is real.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory usage:&lt;/strong&gt;  Running native navigation and a Flutter engine simultaneously increases RAM usage. We explicitly cleaned up the engine on exit to avoid issues on low-end devices.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context switching:&lt;/strong&gt;  Moving between Android Studio and VS Code can be mentally taxing during active development.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Final Thoughts
&lt;/h3&gt;

&lt;p&gt;Looking back, this was less about Flutter and more about being pragmatic.&lt;/p&gt;

&lt;p&gt;We had a side feature to ship, limited time, and no appetite to pull the native teams into a long build cycle. Flutter Add-to-App gave us a way to move fast, keep the feature isolated, and avoid writing the same logic twice.&lt;/p&gt;

&lt;p&gt;I would not use this approach for core app flows or anything deeply tied to navigation and app state. But for side features like games, promos, or experimental surfaces, it works surprisingly well if you are clear about boundaries.&lt;/p&gt;

&lt;p&gt;If you are thinking about trying this, start small. Treat Flutter as a tool, not a strategy, and use it where it genuinely saves time. In our case, it did exactly that.&lt;/p&gt;


&lt;p&gt;This piece was written by &lt;a href="https://sahilkapoor.com/author/harshit-sachan/" rel="noopener noreferrer"&gt;Harshit Sachan&lt;/a&gt; as part of the Guest Posts series on Sahil's Playbook.&lt;/p&gt;

</description>
      <category>engineering</category>
      <category>guestposts</category>
    </item>
    <item>
      <title>MongoDB Data Modeling: How to Design Schemas for Real-World Applications</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Tue, 23 Dec 2025 08:29:36 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/mongodb-data-modeling-how-to-design-schemas-for-real-world-applications-115i</link>
      <guid>https://dev.to/isahilkapoor/mongodb-data-modeling-how-to-design-schemas-for-real-world-applications-115i</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F525217pb2smvo9xmf4fo.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F525217pb2smvo9xmf4fo.jpg" alt="MongoDB Data Modeling: How to Design Schemas for Real-World Applications" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every time I see a MongoDB system that performs beautifully at scale, it’s never because the team did something exotic. It’s because they aligned their schema with one simple truth:  &lt;strong&gt;your data model must follow your application’s access patterns.&lt;/strong&gt;  Not theoretical relationships. Not entity diagrams. Actual reads and writes.&lt;/p&gt;

&lt;p&gt;MongoDB is built for this. Once you stop thinking in terms of entities and start thinking in terms of how your application consumes data, schema design becomes far more intuitive.&lt;/p&gt;

&lt;p&gt;This piece breaks down the practical way MongoDB expects you to model data for real-world systems, the patterns that make distributed queries fast, and the anti-patterns that quietly destroy performance.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;1. The Golden Rule: Data Accessed Together, Stored Together&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;MongoDB’s core strength is  &lt;strong&gt;data locality&lt;/strong&gt;. When all the data you need for a screen or an API call lives inside a single document, you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;predictable read performance&lt;/li&gt;
&lt;li&gt;fewer network hops&lt;/li&gt;
&lt;li&gt;minimal coordination overhead across nodes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Imagine a user profile screen that shows &lt;code&gt;user info&lt;/code&gt;, &lt;code&gt;subscription details&lt;/code&gt;, &lt;code&gt;last 3 orders&lt;/code&gt;, and &lt;code&gt;preferences&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In MongoDB, this works best when these pieces live inside a single document. Your application reads once, renders once, and moves on.&lt;/p&gt;

&lt;p&gt;Here’s how a real-world User document might look:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "_id": 101,
  "name": "Aditi Sharma",
  "email": "aditi@example.com",
  "preferences": {
    "language": "en",
    "theme": "dark"
  },
  "recent_orders": [
    {
      "order_id": 9001,
      "amount": 450,
      "placed_at": "2024-12-10T12:00:00Z"
    },
    {
      "order_id": 9002,
      "amount": 199,
      "placed_at": "2024-12-11T16:00:00Z"
    }
  ],
  "subscription": {
    "tier": "Gold",
    "renewal": "2025-01-01"
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One API call. One predictable latency. No fan-out queries.&lt;/p&gt;

&lt;p&gt;This design philosophy is the backbone of fast MongoDB systems: group fields that are read together into one document so your read path stays stable and efficient.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;2. Embed vs Reference: The Practical Decision Matrix&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;MongoDB gives you two big tools: embedding and referencing. The challenge is knowing when to use which.&lt;/p&gt;

&lt;p&gt;A clean mental model is this:  &lt;strong&gt;how many items sit on the many side of your relationship and how often are they accessed?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let’s break it down.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;A. One-to-Few: Embed&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If the child objects are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;small&lt;/li&gt;
&lt;li&gt;bounded&lt;/li&gt;
&lt;li&gt;frequently accessed with the parent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then embedding is perfect.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example: User + Addresses&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "name": "Sahil",
  "addresses": [
    { "type": "home", "city": "Gurgaon" },
    { "type": "office", "city": "Bangalore" }
  ]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bounded arrays shine here. Fast reads, minimal overhead. The key idea is that when the list will never grow beyond a small, safe upper limit, embedding ensures consistent performance without worrying about document bloat.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;B. One-to-Many: Reference&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If you have potentially thousands of children, embedding becomes impractical. Document size grows. Updates become slow.&lt;/p&gt;

&lt;p&gt;A classic example is products and reviews.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Product document:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "_id": 77,
  "title": "Don 3",
  "price": 399
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Review document:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "product_id": 77,
  "rating": 5,
  "comment": "Insane movie!"
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This keeps your primary document light and responsive. The reviews load only when the user requests them, which is exactly the point: when a related dataset grows large, referencing preserves both agility and performance.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;C. One-to-Squillions: Hybrid&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Unbounded relationships like logs, activity feeds, or transactions require a hybrid model using &lt;strong&gt;bucketing&lt;/strong&gt;, &lt;strong&gt;sharded collections&lt;/strong&gt;, or &lt;strong&gt;capped collections&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The idea is to avoid:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;unbounded arrays&lt;/li&gt;
&lt;li&gt;massive documents&lt;/li&gt;
&lt;li&gt;unpredictable write behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MongoDB works best when documents stay reasonably sized. For unbounded data, spread writes across multiple documents instead of forcing everything into a single growing structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;3. Production-Proven Patterns That Make MongoDB Fly&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The following patterns aren’t theoretical. They show up everywhere across high-scale systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;A. Subset Pattern (The Homepage Problem)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Let’s say a movie has 10,000 reviews. The homepage needs only the top 3.&lt;/p&gt;

&lt;p&gt;Embedding all 10,000 is impractical. Querying reviews separately for every homepage view is expensive.&lt;/p&gt;

&lt;p&gt;The subset pattern solves this by keeping only the frequently accessed slice of data inside the main document.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "_id": 77,
  "title": "Don 3",
  "top_reviews": [
    { "rating": 5, "user": "Aarav", "comment": "🔥" },
    { "rating": 4, "user": "Reema", "comment": "Loved it" }
  ]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives instant page loads while keeping the full review set separate.&lt;/p&gt;

&lt;p&gt;You’ll see this pattern everywhere:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;product listings&lt;/li&gt;
&lt;li&gt;home feeds&lt;/li&gt;
&lt;li&gt;dashboards&lt;/li&gt;
&lt;li&gt;content cards&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It optimizes for the 95 percent case by keeping just enough data in the parent document to serve the common path quickly.&lt;/p&gt;
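&lt;p&gt;To see how the embedded slice stays bounded, here is a minimal plain-JavaScript sketch of the maintenance step (function and variable names are illustrative; no database is involved, so the logic is visible on its own):&lt;/p&gt;

```javascript
// Sketch of the subset pattern's maintenance step in plain JavaScript,
// simulating what a single MongoDB update would do: append the new
// review, then keep only the highest-rated few in the parent document.

function addReview(product, review, limit = 3) {
  const updated = [...(product.top_reviews || []), review]
    .sort((a, b) => b.rating - a.rating) // best reviews first
    .slice(0, limit);                    // cap the embedded subset
  return { ...product, top_reviews: updated };
}

let movie = { _id: 77, title: "Don 3", top_reviews: [] };
movie = addReview(movie, { rating: 5, user: "Aarav", comment: "🔥" });
movie = addReview(movie, { rating: 4, user: "Reema", comment: "Loved it" });
movie = addReview(movie, { rating: 3, user: "Vik", comment: "Okay" });
movie = addReview(movie, { rating: 5, user: "Noor", comment: "Rewatch!" });

console.log(movie.top_reviews.length); // stays at 3 no matter how many arrive
```

&lt;p&gt;In MongoDB itself this is one atomic update: &lt;code&gt;db.movies.updateOne({ _id: 77 }, { $push: { top_reviews: { $each: [review], $sort: { rating: -1 }, $slice: 3 } } })&lt;/code&gt;.&lt;/p&gt;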

&lt;h3&gt;
  
  
  &lt;strong&gt;B. Extended Reference Pattern (Minimizing Follow-Up Calls)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Sometimes a reference isn’t enough. Your API often needs a few extra fields from the referenced document.&lt;/p&gt;

&lt;p&gt;Instead of making another query, you store just those fields alongside the reference.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example: Order document embedding commonly used customer fields:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "order_id": 99,
  "customer": {
    "id": 123,
    "name": "Jane Doe",
    "avatar": "jane.jpg"
  }
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn’t about duplicating entire objects. It’s about tuning the document so your read path becomes a single operation.&lt;/p&gt;

&lt;p&gt;It’s especially powerful in microservices where latency adds up quickly. The broader idea is to store the small fields that your read path depends on so you avoid extra lookups during critical flows.&lt;/p&gt;
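&lt;p&gt;The write-time step is simple enough to sketch in a few lines. The field names below mirror the example document; which fields get copied is the design decision, so treat the selection here as illustrative:&lt;/p&gt;

```javascript
// Extended reference pattern: at order-creation time, copy only the
// customer fields the order's read path needs, alongside the reference id.

function buildOrder(orderId, customer) {
  return {
    order_id: orderId,
    customer: {
      id: customer.id,         // the reference itself
      name: customer.name,     // duplicated on purpose: rendering the
      avatar: customer.avatar, // order list now needs no second query
    },
  };
}

const fullCustomer = {
  id: 123,
  name: "Jane Doe",
  avatar: "jane.jpg",
  email: "jane@example.com",       // not copied: the order page never shows it
  addresses: [{ city: "Pune" }],   // not copied either
};

const order = buildOrder(99, fullCustomer);
console.log(Object.keys(order.customer)); // only the fields the read path needs
```

&lt;p&gt;The trade-off is deliberate: the copied fields can go stale, so this works best for fields that change rarely (names, avatars) or that should be frozen at order time anyway.&lt;/p&gt;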

&lt;h3&gt;
  
  
  &lt;strong&gt;C. Bucket Pattern (For Logs and Event Streams)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Logs arrive continuously. Storing each log event as an individual document introduces huge overhead.&lt;/p&gt;

&lt;p&gt;MongoDB’s bucket pattern groups related events into a single document.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "user_id": 123,
  "day": "2024-12-10",
  "events": [
    { "ts": 1702212010, "type": "click" },
    { "ts": 1702212022, "type": "scroll" }
  ]
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This cuts your writes massively. Queries also become more predictable.&lt;/p&gt;
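&lt;p&gt;A sketch of the append side, in plain JavaScript so the grouping logic is visible (the cap value is an illustrative assumption; a real collection would keep full buckets on disk, whereas the in-memory map below only tracks the currently open bucket per key):&lt;/p&gt;

```javascript
// Bucket pattern append: events are grouped per (user_id, day), and each
// bucket is capped so documents stay reasonably sized.

const MAX_EVENTS_PER_BUCKET = 1000; // illustrative cap

function appendEvent(buckets, userId, event) {
  // Derive the bucket key from the event's timestamp (seconds since epoch).
  const day = new Date(event.ts * 1000).toISOString().slice(0, 10);
  const key = `${userId}:${day}`;
  let bucket = buckets.get(key);
  if (!bucket || bucket.events.length >= MAX_EVENTS_PER_BUCKET) {
    // Open a fresh bucket for a new day or when the current one is full.
    bucket = { user_id: userId, day, events: [] };
    buckets.set(key, bucket);
  }
  bucket.events.push({ ts: event.ts, type: event.type });
  return buckets;
}

const buckets = new Map();
appendEvent(buckets, 123, { ts: 1702212010, type: "click" });
appendEvent(buckets, 123, { ts: 1702212022, type: "scroll" });
console.log(buckets.size); // one document holds both events
```

&lt;p&gt;In MongoDB the same step is an upsert keyed on the bucket identity, e.g. &lt;code&gt;db.events.updateOne({ user_id, day }, { $push: { events: ev } }, { upsert: true })&lt;/code&gt;, so each day of activity lands in one document instead of hundreds.&lt;/p&gt;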

&lt;h2&gt;
  
  
  &lt;strong&gt;4. Anti-Patterns That Hurt Real-World Systems&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;These traps look harmless when data is small but explode at scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;A. Unbounded Arrays&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "log_entries": []
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An array like this grows forever, and every update has to rewrite an ever-larger document. Your database becomes slower and slower until the document hits MongoDB’s hard 16 MB size limit. Always bucket or reference.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;B. Overly Fragmented Collections&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Some teams create a separate collection for every small entity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;users&lt;/li&gt;
&lt;li&gt;addresses&lt;/li&gt;
&lt;li&gt;preferences&lt;/li&gt;
&lt;li&gt;phone numbers&lt;/li&gt;
&lt;li&gt;tags&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each extra collection increases the number of queries required to assemble a single response.&lt;/p&gt;

&lt;p&gt;High-scale MongoDB systems aggressively minimize the number of collections needed for a single screen.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;C. Bloated Documents&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Embedding large blobs like images or PDFs inside documents leads to heavy reads.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "user": "Aditi",
  "profile_pic": "&amp;lt;2MB binary&amp;gt;"
}

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even a simple metadata lookup now transfers megabytes.&lt;/p&gt;

&lt;p&gt;Keep large objects in object storage or GridFS. MongoDB should carry metadata, not media, so each request moves only the bytes it truly needs.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;5. How MongoDB Wants You to Think About Data&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;MongoDB rewards schemas that follow how your application actually consumes data.&lt;/p&gt;

&lt;p&gt;The right mental model is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If multiple fields are always read together, embed them.&lt;/li&gt;
&lt;li&gt;If a child grows large or unbounded, reference it.&lt;/li&gt;
&lt;li&gt;If a child is partially read frequently, embed a subset.&lt;/li&gt;
&lt;li&gt;If writes dominate, keep documents small.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This approach keeps reads predictable, writes efficient, documents maintainable, and performance steady as scale increases.&lt;/p&gt;

&lt;p&gt;Model around access patterns, not abstract entities, and your system will remain predictable even as demand grows.&lt;/p&gt;
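&lt;p&gt;The four rules above can be captured in a tiny decision helper. This is a hand-rolled heuristic for illustration only (the count threshold and the return labels are invented, not an official API):&lt;/p&gt;

```javascript
// The embed / reference / subset rules, encoded as a small heuristic.
// Thresholds are illustrative; real decisions also weigh document size
// and write frequency.

function modelingChoice({ alwaysReadTogether, childCount, unbounded, hotSubsetSize }) {
  if (unbounded) return "reference (or bucket) the child";
  if (hotSubsetSize > 0 && hotSubsetSize < childCount) {
    return "embed a subset, reference the rest";
  }
  if (alwaysReadTogether && childCount <= 10) return "embed";
  return "reference";
}

console.log(modelingChoice({ alwaysReadTogether: true, childCount: 3, unbounded: false, hotSubsetSize: 0 }));
// → "embed"  (one-to-few, read together: addresses on a user)
console.log(modelingChoice({ alwaysReadTogether: false, childCount: 10000, unbounded: false, hotSubsetSize: 3 }));
// → "embed a subset, reference the rest"  (top reviews on a product)
```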

&lt;h3&gt;
  
  
  &lt;strong&gt;A Real Example: Swiggy-Style Order Flows&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Take a food delivery app. On the order history screen, the user only needs a lightweight summary of each order: the order_id, restaurant name, amount, a thumbnail, and maybe the top few items. On the order detail page, the same order expands into the full item list, delivery timeline events, delivery agent details, payment breakdown, and the restaurant’s full address.&lt;/p&gt;

&lt;p&gt;A practical schema for this might look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// orders collection: optimized for history listings and quick lookups
{
  _id: ObjectId("675abc123..."),
  user_id: 123,
  restaurant: {
    id: 45,
    name: "Bombay Biryani",
    thumbnail: "biryani-thumb.jpg"
  },
  amount: 395,
  summary_items: [
    { name: "Chicken Biryani", qty: 1 },
    { name: "Gulab Jamun", qty: 2 }
  ],
  created_at: ISODate("2024-12-10T13:05:00Z"),
  status: "delivered"
}

// order_items collection: full detail for the order detail page
{
  order_id: ObjectId("675abc123..."),
  items: [
    {
      name: "Chicken Biryani",
      qty: 1,
      price: 250
    },
    {
      name: "Gulab Jamun",
      qty: 2,
      price: 125
    }
  ],
  restaurant_address: {
    line1: "Sector 29",
    city: "Gurgaon",
    lat: 28.4595,
    lng: 77.0266
  },
  payment_breakdown: {
    subtotal: 375,
    taxes: 45,
    delivery_fee: 25,
    discounts: 50,
    total: 395
  }
}

// order_events collection: bucketed delivery timeline
{
  order_id: ObjectId("675abc123..."),
  day: "2024-12-10",
  events: [
    { ts: 1702212010, type: "created" },
    { ts: 1702212110, type: "accepted_by_restaurant" },
    { ts: 1702212310, type: "picked_up" },
    { ts: 1702212610, type: "delivered" }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this design, the history screen queries only the &lt;code&gt;orders&lt;/code&gt; collection, the detail page pulls in the &lt;code&gt;order_items&lt;/code&gt; document when needed, and the tracking UI reads from &lt;code&gt;order_events&lt;/code&gt;. Each flow reads just enough data to do its job and nothing more. This becomes a clean split between fast summary access and deeper detail access, ensuring the common path stays lightweight:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Orders contain a  &lt;strong&gt;subset&lt;/strong&gt;  of frequently accessed fields.&lt;/li&gt;
&lt;li&gt;Full details live in their own structure.&lt;/li&gt;
&lt;li&gt;Delivery events use  &lt;strong&gt;buckets&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Restaurant metadata is embedded  &lt;strong&gt;if used often&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The result is an absurdly fast system for millions of users, even during lunch peak. The lesson: tune your schema to the real flow of data consumption and the system naturally scales.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Indexing Strategies
&lt;/h2&gt;

&lt;p&gt;Index design in MongoDB isn’t an afterthought. It’s what turns a well-structured schema into a fast system. Here are battle-tested indexing patterns that pair naturally with the modeling techniques above.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;A. Single-Field Indexes for High-Cardinality Fields&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Fields like &lt;code&gt;email&lt;/code&gt;, &lt;code&gt;product_id&lt;/code&gt;, or &lt;code&gt;order_id&lt;/code&gt; should always be indexed because they are frequently used in equality filters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Fast lookup by product
 db.reviews.createIndex({ product_id: 1 });

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;B. Compound Indexes for Common Query Shapes&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;MongoDB matches queries to indexes by prefix. If most of your queries look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; db.orders.find({ user_id: 123 }).sort({ placed_at: -1 })

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then your index should match that shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; db.orders.createIndex({ user_id: 1, placed_at: -1 });

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This avoids in-memory sorts and keeps pagination fast.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;C. Indexing Embedded Fields&lt;/strong&gt;
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Index the user's city inside embedded addresses
 db.users.createIndex({ "addresses.city": 1 });

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Embedded objects and arrays can be indexed directly. MongoDB creates multikey indexes automatically when arrays are involved.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;D. Partial Indexes for Sparse or Optional Fields&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Useful when only a subset of documents contains the field. This keeps indexes small and efficient.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; db.orders.createIndex(
   { "subscription.renewal": 1 },
   { partialFilterExpression: { "subscription.renewal": { $exists: true } } }
 );

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;E. TTL Indexes for Bucketed or Ephemeral Data&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Great for logs, events, sessions. TTL + buckets gives extremely efficient log deletion.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; db.events.createIndex(
   { created_at: 1 },
   { expireAfterSeconds: 86400 }
 );

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;F. Prefix Rule Reminder&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If you create an index like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; db.orders.createIndex({ user_id: 1, placed_at: -1, amount: 1 });

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MongoDB can use it for queries that include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;user_id&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;user_id&lt;/code&gt; + &lt;code&gt;placed_at&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;user_id&lt;/code&gt; + &lt;code&gt;placed_at&lt;/code&gt; + &lt;code&gt;amount&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But NOT for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;placed_at&lt;/code&gt; alone&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;amount&lt;/code&gt; alone&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Design indexes around actual query patterns. The principle behind all indexing in MongoDB is simple: optimize for the queries that hit your system most often, not hypothetical ones.&lt;/p&gt;
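&lt;p&gt;The prefix rule is mechanical enough to express as a small checker. This is a simplified model for intuition only; the real query planner has more nuance (covered queries, index intersection), but the prefix rule is the dependable baseline:&lt;/p&gt;

```javascript
// A compound index can serve a query's filter fields only when those
// fields form a leading prefix of the index's key order.

function usesIndexPrefix(indexKeys, queryFields) {
  // Every queried field must line up, in order, with the index's keys.
  return queryFields.every((field, i) => indexKeys[i] === field);
}

const idx = ["user_id", "placed_at", "amount"];

console.log(usesIndexPrefix(idx, ["user_id"]));                        // true
console.log(usesIndexPrefix(idx, ["user_id", "placed_at"]));           // true
console.log(usesIndexPrefix(idx, ["user_id", "placed_at", "amount"])); // true
console.log(usesIndexPrefix(idx, ["placed_at"]));                      // false
console.log(usesIndexPrefix(idx, ["amount"]));                         // false
```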

&lt;h2&gt;
  
  
  &lt;strong&gt;Bringing It All Together&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Good MongoDB schema design feels like UI-driven modeling. You organize data based on the screens and API calls your application actually serves.&lt;/p&gt;

&lt;p&gt;When you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;embed intentionally&lt;/li&gt;
&lt;li&gt;duplicate selectively for performance&lt;/li&gt;
&lt;li&gt;reference when data grows large&lt;/li&gt;
&lt;li&gt;bucket when data grows endlessly&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MongoDB becomes one of the most efficient databases to operate. Predictable performance. Reduced infra cost. Cleaner code.&lt;/p&gt;

&lt;p&gt;This design mindset is what separates MongoDB systems that scale effortlessly from those that collapse under real-world workloads. The overarching idea: when you align schema, access patterns, and indexing, MongoDB delivers consistent performance even as complexity grows.&lt;/p&gt;

</description>
      <category>engineering</category>
    </item>
    <item>
      <title>Your Career Isn’t a Ladder, It’s a Metamorphosis</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Sat, 20 Dec 2025 08:28:52 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/your-career-isnt-a-ladder-its-a-metamorphosis-1pen</link>
      <guid>https://dev.to/isahilkapoor/your-career-isnt-a-ladder-its-a-metamorphosis-1pen</guid>
      <description>&lt;p&gt;💭&lt;/p&gt;

&lt;p&gt;This piece was written by &lt;a href="https://www.linkedin.com/in/nikkhita-b-5113799b/?ref=sahilkapoor.com" rel="noopener noreferrer"&gt;Nikkhita Bhowmick&lt;/a&gt; as part of the Guest Posts series on Sahil's Playbook.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vlx1m9otty4v5m6ezrp.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1vlx1m9otty4v5m6ezrp.jpg" alt="Your Career Isn’t a Ladder, It’s a Metamorphosis" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;We imagine careers as ladders: linear and predictable. Climb long enough, and you assume you’ll reach somewhere impressive.&lt;/p&gt;

&lt;p&gt;But real growth feels nothing like climbing. It feels like shedding.&lt;/p&gt;

&lt;p&gt;Every transition - role change, burnout, reinvention - isn’t a step up but a quiet shift happening beneath the surface. Growth doesn’t look like progress. It looks like metamorphosis.&lt;/p&gt;

&lt;p&gt;Careers don’t progress. They transform.&lt;/p&gt;

&lt;h2&gt;
  
  
  Egg Stage: Curiosity
&lt;/h2&gt;

&lt;p&gt;At the start, you have no shape, only potential. You’re guided by curiosity, not certainty. Tools and processes don’t matter yet. You simply feel pulled toward creation.&lt;/p&gt;

&lt;p&gt;Curiosity gives you permission to explore without pressure. You experiment freely, not to prove anything but to understand what excites you. You follow impulses that make no strategic sense yet shape your instincts for years.&lt;/p&gt;

&lt;p&gt;When I created my first product, there was no roadmap. Just late nights, messy notes, half-finished ideas, and the thrill of turning imagination into something real.&lt;/p&gt;

&lt;p&gt;That period teaches you that passion is the first form of discipline. &lt;em&gt;The beginning is shapeless, but full of possibility.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Larva Stage: Rapid Learning and Consumption
&lt;/h2&gt;

&lt;p&gt;The larva’s job is simple: grow fast. This is the phase where you say "yes" to everything. You absorb constantly, stretching your limits in every direction.&lt;/p&gt;

&lt;p&gt;You learn tools you don’t fully understand yet. You break things and fix them. You copy, improvise, borrow patterns, discard others, and rebuild your mental models daily. You work with people who are far ahead of you, and their pace becomes your pace.&lt;/p&gt;

&lt;p&gt;It is chaotic, but energizing. Your confidence rises and collapses weekly. You outgrow old assumptions quickly. This is the steepest learning curve of your career - the one that builds raw capability.&lt;/p&gt;

&lt;p&gt;But it’s not your final form. It’s preparation. &lt;em&gt;This is the stage where speed matters more than polish.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Pupa (Chrysalis) Stage: Internal Transformation
&lt;/h2&gt;

&lt;p&gt;Then comes the stillness, the phase most people misinterpret. From the outside, you look inactive. But internally, everything is dissolving and reorganizing.&lt;/p&gt;

&lt;p&gt;This phase usually begins with disruption: a project ends, a team dissolves, a company pivots, or you hit burnout. The noise that once filled your days goes silent. Without the constant churn of deliverables, deeper questions rise to the surface:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What do I want to build next?&lt;/li&gt;
&lt;li&gt;What kind of work aligns with who I am becoming?&lt;/li&gt;
&lt;li&gt;What version of myself am I outgrowing?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a major project I worked on shut down overnight due to regulation, the stillness was disorienting. Yet, it forced reflection that growth phases never allow. &lt;em&gt;Reinvention often disguises itself as collapse, but this is where identity reshapes.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Butterfly Stage: Application and Expansion
&lt;/h2&gt;

&lt;p&gt;Movement returns, but with intention. You’re no longer driven by adrenaline. You’re driven by alignment.&lt;/p&gt;

&lt;p&gt;You make decisions from clarity rather than the fear of missing out. You choose projects that fit your values. You decline work that doesn’t reflect your direction. You build with a calmer mind but sharper precision.&lt;/p&gt;

&lt;p&gt;This is the phase where everything you learned in earlier cycles clicks into place. You operate with quiet confidence instead of constant proving. You recognize patterns faster. You invest effort where leverage is highest.&lt;/p&gt;

&lt;p&gt;When I began developing a travel website and a messenger app, it wasn’t about chasing momentum; it was about choosing work that felt meaningful. That’s what fluency looks like: &lt;em&gt;creating with purpose instead of pressure.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cycle Is Continuous
&lt;/h2&gt;

&lt;p&gt;In nature, metamorphosis happens once. In careers, it loops endlessly.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A new industry resets you to the  &lt;strong&gt;Egg&lt;/strong&gt;  phase.&lt;/li&gt;
&lt;li&gt;A new role pulls you into rapid learning ( &lt;strong&gt;Larva&lt;/strong&gt; ) again.&lt;/li&gt;
&lt;li&gt;A confusing period becomes your  &lt;strong&gt;Chrysalis&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Clarity lifts you into the  &lt;strong&gt;Butterfly&lt;/strong&gt;  stage.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each stage is necessary. Each stage is temporary. And each stage equips you for the next version of your work and identity.&lt;/p&gt;

&lt;p&gt;None of this is regression. It is evolution unfolding in cycles.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;You will return to every stage again and again. That is the design.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;You’re not climbing a career. You’re becoming.&lt;/p&gt;

&lt;p&gt;Every pause, collapse, and reinvention is part of how you grow. And here’s the real truth: the next version of you is already forming inside the one you are now.&lt;/p&gt;

</description>
      <category>guestposts</category>
      <category>opinions</category>
    </item>
    <item>
      <title>2026 Prediction: Foundation Models Are Becoming a Black Hole for AI Startups</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Wed, 17 Dec 2025 02:46:35 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/2026-prediction-foundation-models-are-becoming-a-black-hole-for-ai-startups-o52</link>
      <guid>https://dev.to/isahilkapoor/2026-prediction-foundation-models-are-becoming-a-black-hole-for-ai-startups-o52</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fay189rvh5r6la7b3je38.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fay189rvh5r6la7b3je38.jpg" alt="2026 Prediction: Foundation Models Are Becoming a Black Hole for AI Startups" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you have been tracking AI startups over the last two years, something should already feel off.&lt;/p&gt;

&lt;p&gt;Products continue to ship and teams continue to raise money, yet momentum quietly stalls. Customers stop expanding usage. Categories stop compounding. This is not random, and it is not simply bad execution. It is a pattern that emerges when market structure shifts faster than most companies are willing to acknowledge.&lt;/p&gt;

&lt;p&gt;As we move through the end of 2025 and look toward 2026, the AI market is entering a compression phase. Foundation models are no longer just improving in quality or efficiency. They are steadily absorbing the surface area that entire products were built on.&lt;/p&gt;

&lt;p&gt;In earlier technology cycles, platforms needed ecosystems to succeed. Cloud needed applications to justify migration. Mobile needed developers to create new use cases. AI does not work the same way. As models improve, they do not create room for more startups. They reduce the need for them by collapsing capabilities directly into the model or into the platforms that distribute those models.&lt;/p&gt;

&lt;p&gt;OpenAI has recently shipped an agent framework. That release should not be read as a routine feature update. It should be read as a warning. Categories built around autonomous sales agents, copilots, and task wrappers are already under pressure. What looked defensible in early 2025 is beginning to resemble orchestration around someone else’s roadmap.&lt;/p&gt;

&lt;p&gt;Google is rolling multimodal awareness directly into Android. When the operating system can see, hear, and reason across applications by default, entire classes of AI wearables and assistants start losing their justification. This shift does not arrive with loud announcements or forced migrations. It happens quietly, through defaults that users accept without thinking.&lt;/p&gt;

&lt;p&gt;Apple is extending Siri with on-screen context. When a platform can understand what is already on the screen, contextual assistant apps do not get a grace period. Features that once justified standalone companies become a line item in an operating system update. Users do not complain or churn angrily. They simply stop needing the product.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;As we head into 2026, building too close to the model is becoming a structural risk.&lt;/strong&gt; &lt;em&gt;The gravity is already forming.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"If your product can be replaced by a better prompt, it is already dead."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The only durable advantage left is owning the workflow where decisions turn into irreversible action.&lt;/p&gt;

&lt;p&gt;This is the lens founders should be using right now, not to predict winners or to explain failures after the fact, but to decide what needs to change while there is still time to change it.&lt;/p&gt;

&lt;p&gt;What follows are the patterns to look for and the responses that matter if you want a real chance of surviving the 2026 shakeout.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. From Copilots to Workflows
&lt;/h3&gt;

&lt;p&gt;Over the last two years, the industry has fixated on copilots. The idea was straightforward: place a chat interface next to an existing workflow and let the model assist the user when asked.&lt;/p&gt;

&lt;p&gt;That approach is becoming fragile. A product that waits for the user to ask a question now competes directly with the operating system, the browser, and the model provider itself. In that setup, you are effectively selling answers, and answers are rapidly becoming free.&lt;/p&gt;

&lt;p&gt;Companies that want to survive need to move beyond copilots and start owning workflows. The shift is subtle in interface design but decisive in outcome. Instead of selling smarter responses, these products eliminate work altogether.&lt;/p&gt;

&lt;p&gt;In a wrapper model, the system waits for a prompt, drafts a restocking email, and stops there. In a workflow model, the system detects inventory changes, creates the purchase order, updates the system of record, and queues the email automatically, leaving the human to simply click "Approve".&lt;/p&gt;
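
&lt;p&gt;The contrast can be sketched in a few lines. This is a minimal illustration with invented stand-ins for the ERP and outbox (&lt;code&gt;StubERP&lt;/code&gt;, &lt;code&gt;Outbox&lt;/code&gt;), not a reference implementation:&lt;/p&gt;

```python
# Minimal sketch of wrapper vs. workflow; all names are hypothetical.
# A wrapper drafts text on request; a workflow detects an event,
# updates the system of record, and queues the action for approval.

class StubERP:
    def __init__(self):
        self.records = []
        self._next_id = 1

    def create_purchase_order(self, sku, qty):
        po = {"id": self._next_id, "sku": sku, "qty": qty}
        self._next_id += 1
        return po

    def record(self, po):
        self.records.append(po)

class Outbox:
    def __init__(self):
        self.pending = []

    def queue(self, message, needs_approval=True):
        self.pending.append({"message": message, "needs_approval": needs_approval})

def wrapper_mode(prompt, llm):
    # Wrapper: waits for a prompt, drafts the email, and stops there.
    return llm(prompt)

def workflow_mode(event, llm, erp, outbox):
    # Workflow: acts on the inventory change end to end.
    if event["on_hand"] == 0:
        po = erp.create_purchase_order(event["sku"], event["reorder_qty"])
        erp.record(po)  # update the system of record
        email = llm("Draft a restocking email for PO %d" % po["id"])
        outbox.queue(email)  # the human only clicks Approve
        return po
    return None
```

&lt;p&gt;The wrapper returns text and stops; the workflow leaves only the approval step to the human.&lt;/p&gt;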

&lt;p&gt;&lt;strong&gt;Models can answer questions,&lt;/strong&gt; but they cannot own permissions, liability, or execution inside private systems. That boundary, where action replaces suggestion, is where defensibility now lives.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Intelligence Is Becoming a Commodity
&lt;/h3&gt;

&lt;p&gt;One of the hardest lessons of 2025 is that raw intelligence is not a moat. Frontier models converge faster than most product roadmaps can adapt. Pricing changes overnight. Capabilities leak across providers. If your margins depend on a single model staying special, they will compress.&lt;/p&gt;

&lt;p&gt;Companies that want to survive should start treating intelligence like electricity. That means abstraction, routing, and cost awareness by default, rather than reverence for any single provider.&lt;/p&gt;

&lt;p&gt;Simple tasks should run on fast, inexpensive, and often open models. Expensive reasoning should be reserved for the small number of cases where it creates real value. If you cannot switch models quickly, you do not control your economics.&lt;/p&gt;
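
&lt;p&gt;A cost-aware router can start as a simple lookup table. The model names and per-token prices below are placeholders for illustration, not real vendor pricing:&lt;/p&gt;

```python
# Sketch of cost-aware routing. Model names and prices are assumptions.
# Simple task kinds default to the cheap tier; only tasks flagged as
# hard pay for expensive reasoning.

ROUTES = {
    "classify": {"model": "small-open-model", "usd_per_1k_tokens": 0.0002},
    "extract":  {"model": "small-open-model", "usd_per_1k_tokens": 0.0002},
    "reason":   {"model": "frontier-model",   "usd_per_1k_tokens": 0.0150},
}

def route(task_kind):
    # Unknown task kinds fall back to the cheapest tier by default.
    return ROUTES.get(task_kind, ROUTES["classify"])
```

&lt;p&gt;The point is not the table itself but that routing decisions live in one place you control, so economics can change without touching product code.&lt;/p&gt;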

&lt;p&gt;The danger zone sits in the middle. Generic wrappers with no routing logic and no proprietary data become resellers of someone else’s API, with worse margins and no leverage.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Distribution Comes Before AI
&lt;/h3&gt;

&lt;p&gt;For years, founders asked whether incumbents could move fast enough to compete with startups. That was the wrong question. Speed was never the deciding factor.&lt;/p&gt;

&lt;p&gt;As we move toward 2026, it is becoming clear that intelligence depreciates while context compounds. Distribution, data access, and trust accumulate slowly and are extremely difficult to replicate.&lt;/p&gt;

&lt;p&gt;Companies like Notion, HubSpot, and Shopify are not winning because their models are superior. They are winning because their AI has permission to read, write, and act inside workflows that already exist.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The invisible interface is winning.&lt;/strong&gt; The most effective AI features do not announce themselves. They show up as a CRM that auto fills fields, a design tool that proposes layouts, or a code editor that fixes bugs before you hit run.&lt;/p&gt;

&lt;p&gt;If you are starting from zero distribution, AI does not create demand. It amplifies what is already there.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Build Outside the Event Horizon
&lt;/h2&gt;

&lt;p&gt;If you are building today, you cannot rely on being smarter than the model. You must be structurally different from it. The companies that survive the 2026 crash will not win by out reasoning foundation models. They will win by embedding intelligence into systems that models cannot replace, and by doing so early enough that the shift compounds rather than arrives too late.&lt;/p&gt;

&lt;p&gt;The era of prompt engineering is over. We are now entering the era of flow engineering, where the primary challenge is not generating better answers, but designing systems that move work forward with minimal human intervention.&lt;/p&gt;

&lt;p&gt;Here are four concrete actions companies should take if they want to survive what 2026 is likely to bring.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Regain Control Over Cost, Margins, and Roadmap
&lt;/h3&gt;

&lt;p&gt;Hard coding a single provider’s API keys into your production environment is technical suicide. It locks your roadmap, your margins, and ultimately your fate to a vendor that has both the incentive and the capability to move up the stack and compete directly with you.&lt;/p&gt;

&lt;p&gt;The action here is simple but non negotiable. You need an abstraction layer between your product and the intelligence layer. Models should be treated as interchangeable infrastructure, not as a core part of your identity.&lt;/p&gt;

&lt;p&gt;The goal is operational freedom. You should be able to swap the backend model, from GPT to Claude to a fine tuned open source alternative, without changing a single line of frontend code or retraining your users.&lt;/p&gt;
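
&lt;p&gt;A minimal sketch of such an abstraction layer, with hypothetical adapter classes; a production version would add retries, streaming, and telemetry:&lt;/p&gt;

```python
# Sketch of a provider abstraction layer; interfaces and names are
# hypothetical. The product calls complete(); which vendor serves the
# request is a config change, not a code change.

class ModelBackend:
    """Uniform interface every provider adapter must implement."""
    def complete(self, prompt):
        raise NotImplementedError

class OpenAIBackend(ModelBackend):
    def complete(self, prompt):
        return "openai:" + prompt  # a real adapter would call the vendor API

class LocalBackend(ModelBackend):
    def complete(self, prompt):
        return "local:" + prompt   # e.g. a fine-tuned open model

BACKENDS = {"openai": OpenAIBackend(), "local": LocalBackend()}

def complete(prompt, provider="openai"):
    # Swapping providers is one string in config; callers never change.
    return BACKENDS[provider].complete(prompt)
```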

&lt;p&gt;Here is the litmus test. If OpenAI were to double its pricing tomorrow, could you shift most of your traffic to a cheaper model in under an hour? If the answer is no, you are not running a business. You are operating as a hostage to someone else’s roadmap.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Turn User Corrections Into a Compounding Data Moat
&lt;/h3&gt;

&lt;p&gt;Most startups are quietly throwing away the only data that actually matters. When a user accepts an AI output, that tells you very little. When a user corrects an AI output, that tells you exactly where generic intelligence breaks down in your domain.&lt;/p&gt;

&lt;p&gt;Those corrections are training gold.&lt;/p&gt;

&lt;p&gt;The action here is to instrument your systems so that you capture the diff between what the model generated and what the human finalized. Every edit, deletion, override, and reordering should be stored and associated with context.&lt;/p&gt;
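
&lt;p&gt;One possible instrumentation sketch, assuming a simple append-only store; the record schema here is an assumption, not a standard:&lt;/p&gt;

```python
# Sketch of correction capture. Store the diff between what the model
# generated and what the human shipped, plus enough context to use it
# as fine-tuning data later. Acceptances are deliberately skipped.

import difflib
import json
import time

def capture_correction(context, model_output, final_output, store):
    if model_output == final_output:
        return None  # acceptance tells you little; skip it
    diff = list(difflib.unified_diff(
        model_output.splitlines(), final_output.splitlines(), lineterm=""))
    record = {
        "ts": time.time(),
        "context": context,            # workflow, doc type, user segment
        "model_output": model_output,
        "final_output": final_output,
        "diff": diff,
    }
    store.append(json.dumps(record))
    return record
```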

&lt;p&gt;Over time, this dataset of corrections becomes your real moat. It allows you to fine tune smaller, cheaper models that outperform frontier models inside your specific workflow. This is how you simultaneously lower costs and improve quality, something generic AI products struggle to do.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Shift Users From Prompting to Approving
&lt;/h3&gt;

&lt;p&gt;If your core experience starts with a blinking cursor and the question "How can I help you?", you are placing cognitive burden on the user at the exact moment when AI should be removing it. Users are already exhausted by prompting, and that fatigue will only increase.&lt;/p&gt;

&lt;p&gt;The action here is to move away from reactive chat interfaces and toward optimistic systems that act on context by default. Do not wait for instructions. Use the data you already have to anticipate what needs to be done.&lt;/p&gt;

&lt;p&gt;Instead of asking users to create, present a draft state and ask them to verify. A travel product should not ask where the user wants to go. It should present a complete itinerary based on calendar, budget, and history, and then ask for confirmation.&lt;/p&gt;
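
&lt;p&gt;In code, the shift looks like replacing an open prompt with a pre-built draft. The itinerary fields below are invented for illustration:&lt;/p&gt;

```python
# Sketch of an optimistic, approve-first flow. The system drafts from
# context it already has; the user's only job is to verify.

def draft_itinerary(calendar_gap, budget_usd, history):
    # Build a complete draft instead of asking an open-ended question.
    destination = history["favorite_destinations"][0]
    return {
        "status": "draft",
        "destination": destination,
        "dates": calendar_gap,
        "est_cost_usd": min(budget_usd, 1200),  # cap at a sample trip cost
        "action": "Approve or edit",
    }

def approve(draft):
    # Verification is one click, not a blank-cursor conversation.
    draft["status"] = "booked"
    return draft
```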

&lt;p&gt;This shift moves cognitive load from creation, which is hard and slow, to verification, which is easy and fast. That single change is often the difference between novelty usage and real adoption.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Capture Value Created, Not Humans Logged In
&lt;/h3&gt;

&lt;p&gt;Per user per month pricing is fundamentally incompatible with AI. The entire purpose of automation is to reduce the number of humans required to complete a task. If you succeed, seat based revenue shrinks.&lt;/p&gt;

&lt;p&gt;The action is to price against outcomes, not logins. Your pricing model should reflect the work your system completes, not the number of people who touched it.&lt;/p&gt;

&lt;p&gt;Do not charge per seat. Charge per invoice processed, per contract reviewed, per shipment reconciled, or per hire completed.&lt;/p&gt;
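
&lt;p&gt;A metering sketch makes the difference concrete. The unit prices are made up; the point is that revenue tracks completed outcomes rather than seats:&lt;/p&gt;

```python
# Sketch of outcome-based metering with invented unit prices.

UNIT_PRICES_USD = {
    "invoice_processed": 0.40,
    "contract_reviewed": 5.00,
    "shipment_reconciled": 1.25,
}

class Meter:
    def __init__(self):
        self.events = []

    def record(self, outcome, count=1):
        # Each event is a completed unit of work, not a login.
        self.events.append((outcome, count))

    def bill(self):
        return sum(UNIT_PRICES_USD[o] * n for o, n in self.events)
```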

&lt;p&gt;Companies that continue to charge for seats will increasingly fight churn. Companies that charge for work done will capture the value they actually create.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Bottom Line
&lt;/h3&gt;

&lt;p&gt;The 2026 AI shakeout is unlikely to be dramatic. It will be quiet, gradual, and structural.&lt;/p&gt;

&lt;p&gt;Generic products will be absorbed as intelligence commoditizes. Products that own execution, permissions, and workflow will endure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Do not build a better wrapper.&lt;/strong&gt; &lt;em&gt;Build a better workflow.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>startup</category>
      <category>opinions</category>
    </item>
    <item>
      <title>Beyond Paycheques: What Truly Keeps Employees Engaged</title>
      <dc:creator>Sahil Kapoor</dc:creator>
      <pubDate>Sat, 13 Dec 2025 17:37:00 +0000</pubDate>
      <link>https://dev.to/isahilkapoor/beyond-paycheques-what-truly-keeps-employees-engaged-3ika</link>
      <guid>https://dev.to/isahilkapoor/beyond-paycheques-what-truly-keeps-employees-engaged-3ika</guid>
      <description>&lt;p&gt;💭&lt;/p&gt;

&lt;p&gt;This piece was written by Kriti Jain as part of the Guest Posts series on Sahil's Playbook.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmrhrc8s331pt57k8vbfd.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmrhrc8s331pt57k8vbfd.jpg" alt="Beyond Paycheques: What Truly Keeps Employees Engaged" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;When I got into HR a couple of years ago, I honestly thought the equation was straightforward: pay people fairly and they’ll stay. Simple, right? But a few onboarding marathons, interview cycles, chai breaks, and emotional “What’s next for me?” conversations later, I realised something important: people don’t stay for money. They stay for meaning.&lt;/p&gt;

&lt;p&gt;One of my favourite quotes from &lt;em&gt;Work Rules!&lt;/em&gt; shaped my entire perspective:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;“People tend to quit bosses, not companies. And they stay when they feel trusted, valued, and seen.”&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Once I understood that, everything I noticed at work began to shift. Slowly, patterns appeared. And here’s what they taught me.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;1. Recognition over random perks&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The quickest way to lift someone’s spirit is not a trophy or a stage moment. It’s a genuine, quiet acknowledgment: &lt;strong&gt;“You did great today.”&lt;/strong&gt; That one sentence can change how someone feels about their work.&lt;/p&gt;

&lt;p&gt;It surprised me how often people remembered these small moments more than any award function. Recognition softened conversations, dissolved tension, and reshaped the energy in rooms. The office stopped feeling mechanical. It became human again.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;2. Growth isn’t a perk, it’s fuel&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;There was one exit interview I’ll never forget. The employee said, &lt;em&gt;“I didn’t see where I could go next.”&lt;/em&gt; That hit differently because it was painfully honest.&lt;/p&gt;

&lt;p&gt;We had opportunities, but we weren’t showing them. The moment we made growth visible through mentorship circles, learning paths, and accessible resources, people’s attitudes changed. The workload didn’t shrink, but the feeling of being stuck disappeared.&lt;/p&gt;

&lt;p&gt;Growth isn’t an extra. It’s oxygen. And without it, even talented people eventually suffocate.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;3. Flexibility equals respect&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;If you’d asked me two years ago, I would’ve defended the 9-to-5 structure with full confidence. Today? Not at all.&lt;/p&gt;

&lt;p&gt;Once we offered flexibility, something unexpected happened. Productivity didn’t drop. It improved. People stopped performing “activity” and started producing results. Trust created space for better work and better humans.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trust always outperforms time-tracking.&lt;/strong&gt;  Every. Single. Time.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;4. Well-being isn’t fluff, it’s survival&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Burnout rarely announces itself. It shows up quietly through delayed replies, tired eyes, or a camera that stays off. Watching someone slowly fade out isn’t dramatic. It’s heartbreaking. And preventable.&lt;/p&gt;

&lt;p&gt;We introduced simple practices: casual check-ins, easy support channels, short wellness breaks, and even the occasional no-meeting day. None of these were big moves, but together they made work feel breathable again.&lt;/p&gt;

&lt;p&gt;You can’t pour from an empty cup. And HR shouldn’t wait for someone to shatter before offering help.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;5. Culture is the glue people don’t talk about&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Culture isn’t built through policy documents. It’s built in moments.&lt;/p&gt;

&lt;p&gt;The surprise cupcake on a desk.&lt;br&gt;
The unexpected shout-out.&lt;br&gt;
The inside joke from last year’s team outing that refuses to die.&lt;/p&gt;

&lt;p&gt;These tiny exchanges hold teams together far more than coffee machines or game rooms ever will. What truly matters is the energy people walk in with, the energy they take home, and the energy that makes them decide, &lt;em&gt;“Yes, I’ll come back tomorrow.”&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;The real HR tea&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Paychecks may attract talent, but they don’t make people stay. What truly anchors them is:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Belonging. Growth. Trust. Culture. Well-being. Purpose.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;HR isn’t just about onboarding, hiring, or policies. It’s about understanding people: their fears, hopes, dreams, and motivations.&lt;/p&gt;

&lt;p&gt;Two years in, here’s the truth I’ve learned and lived:  &lt;strong&gt;retention isn’t built on salaries. It’s built on heart, honesty, and human connection.&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>guestposts</category>
      <category>hiring</category>
      <category>opinions</category>
    </item>
  </channel>
</rss>
