DEV Community

Raul Smith

Debugging CPU Contention on Android With Perfetto Scheduler Traces

I noticed it first as a feeling, not a metric.

Scrolling felt heavy. Not broken. Just resistant. Animations missed a beat every few seconds. Touch input landed late, like the screen was thinking before responding. Logs showed nothing alarming. Memory stayed stable. Frame rates looked fine in short tests.

Yet users kept describing the same thing in reviews. Sluggish. Sticky. Inconsistent.

That was when I stopped blaming code paths and started suspecting time itself.

The moment CPU time became the suspect

I had already trimmed allocations, flattened view hierarchies, and moved work off the main thread. On paper, the app should have been smooth.

But paper assumes uninterrupted execution.

On real devices, my app was not alone. Media playback, system services, location updates, notifications, and another foreground app all wanted CPU time at the same moment.

Nothing was crashing. Nothing was timing out. The system was simply choosing who got to run.

That is when I opened Perfetto.

What Perfetto shows that logs never will

A scheduler trace in Perfetto does not tell you what your code did. It tells you when your code was allowed to run.

The first trace I captured was humbling. Threads I assumed were running continuously were being paused and resumed constantly. Small slices of execution scattered across timelines. Long gaps where nothing ran at all.

The main thread was technically fast. Each chunk of work was short. The problem was spacing.

Frames missed deadlines not because work was heavy, but because work was delayed.

CPU contention does not look like slowness. It looks like interruption.

Reading scheduler traces without guessing

At first, the scheduler view looks overwhelming. Colored bars everywhere. Threads bouncing across cores.

I focused on three things.

First, threads that are runnable but not running. When a thread is ready for CPU time but still waiting for a core, that gap can matter more than how long the code itself takes.

Second, frequency drops. When the trace records CPU frequency, Perfetto shows when cores downscale. Lower frequency means the same work takes longer, even if scheduling looks fair.

Third, competing processes. I lined up my app’s threads against system services and other apps. The overlap told the story.

My app was behaving. The device was busy.
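
The first of those three signals, time spent runnable but not running, is easy to total once thread state intervals are exported from a trace. Below is a minimal Python sketch under stated assumptions: the record layout, the `runnable_wait_by_thread` helper, and the sample numbers are all hypothetical, and the state labels ("Running", "R" for runnable, "S" for sleeping) are only modeled on the kind of labels a scheduler trace export might use.

```python
from collections import defaultdict

# Hypothetical record layout: (thread_name, start_ns, duration_ns, state).
# State labels are an assumption: "Running" = on a core,
# "R" = runnable but waiting for a core, "S" = sleeping.
def runnable_wait_by_thread(records):
    """Return {thread: (total_wait_ns, longest_wait_ns)} for runnable-but-waiting time."""
    totals = defaultdict(lambda: [0, 0])
    for thread, _start_ns, duration_ns, state in records:
        if state == "R":
            totals[thread][0] += duration_ns
            totals[thread][1] = max(totals[thread][1], duration_ns)
    return {thread: tuple(values) for thread, values in totals.items()}

# Made-up sample data, not taken from a real trace.
sample = [
    ("main", 0, 2_000_000, "Running"),
    ("main", 2_000_000, 6_000_000, "R"),
    ("main", 8_000_000, 1_000_000, "Running"),
    ("main", 9_000_000, 3_000_000, "R"),
    ("worker", 0, 4_000_000, "S"),
    ("worker", 4_000_000, 500_000, "R"),
]

waits = runnable_wait_by_thread(sample)
for thread, (total_ns, longest_ns) in sorted(waits.items()):
    print(f"{thread}: waited {total_ns / 1e6:.1f} ms total, {longest_ns / 1e6:.1f} ms at worst")
```

In this made-up sample the main thread runs for 3 ms but waits for 9 ms, which is the pattern described above: short work, long spacing.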

Why background work made everything worse

One surprise came from my own background tasks.

I had scheduled background processing to keep the UI light. In isolation, it worked. Under contention, it backfired.

Background threads kept waking up, asking for CPU time, and forcing the scheduler to juggle more runnable work. That increased context switching and pushed UI tasks later.

Perfetto made this obvious. Each background wake introduced tiny delays that stacked up into visible jank.

I reduced concurrency instead of increasing it. Fewer threads. Longer idle periods. More predictable execution.
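
One way to express "fewer threads" is to funnel background tasks through a small fixed pool instead of letting each task wake its own thread. The Python sketch below only illustrates the idea: `run_batched` and the sample jobs are hypothetical, and an Android app would use its own executor or dispatcher rather than this code.

```python
from concurrent.futures import ThreadPoolExecutor

def run_batched(tasks, max_workers=1):
    """Run zero-argument callables on a small fixed pool and return results in order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(task) for task in tasks]
        # Collect in submission order so callers see predictable results.
        return [future.result() for future in futures]

# Hypothetical background jobs: each one just squares a number.
jobs = [lambda n=n: n * n for n in range(5)]
results = run_batched(jobs, max_workers=1)
print(results)
```

With `max_workers=1`, at most one background thread is ever runnable, so the scheduler has fewer competitors to juggle against the UI thread. The trade-off is lower throughput when the device is otherwise idle.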

The UI improved immediately.

Thermal pressure was the invisible multiplier

Another trace captured after a long session looked completely different.

CPU frequency dropped earlier in the session. My threads ran in shorter slices and were interrupted more often.

Nothing in my code had changed. The device state had.

Perfetto helped me accept a hard truth. Performance is not stable across time. It degrades as the device heats, even if memory and network stay calm.

Designs that rely on consistent CPU availability fail quietly.
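
The arithmetic behind the frequency effect is simple enough to sketch. This Python example assumes a deliberately simplified model in which a frame needs a fixed number of CPU cycles and nothing else (memory stalls, core migration) changes. The cycle count and both clock speeds are hypothetical numbers chosen for illustration.

```python
def work_time_ms(cycles, freq_hz):
    """Milliseconds needed to execute `cycles` at a constant clock of `freq_hz`."""
    return cycles / freq_hz * 1000.0

FRAME_BUDGET_MS = 1000.0 / 60  # frame budget at 60 Hz, about 16.67 ms

# Hypothetical values, not measured on a real device.
cycles_per_frame = 24_000_000
cool_ms = work_time_ms(cycles_per_frame, 2_400_000_000)  # device cool: 2.4 GHz
hot_ms = work_time_ms(cycles_per_frame, 1_200_000_000)   # device throttled: 1.2 GHz

for label, ms in (("cool", cool_ms), ("hot", hot_ms)):
    verdict = "fits" if ms <= FRAME_BUDGET_MS else "misses"
    print(f"{label}: {ms:.1f} ms per frame, {verdict} the {FRAME_BUDGET_MS:.2f} ms budget")
```

Under these assumptions the same frame takes about 10 ms on a cool device and about 20 ms once the clock halves, so identical code goes from comfortable to late without a single line changing.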

Fixes that actually worked

I stopped chasing micro-optimizations and focused on timing resilience.

I broke long tasks into resumable chunks. I avoided assuming a callback would run immediately. I reduced background wakeups. I moved non-urgent work to moments when the app already had focus.
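
Breaking a long task into resumable chunks can be sketched with a generator that works until a small time budget is spent and then hands control back. This is an illustrative Python sketch, not the code from the app: `process_in_chunks`, its `budget_s` parameter, and the 4 ms default are all hypothetical.

```python
import time

def process_in_chunks(items, handle, budget_s=0.004):
    """Handle items until the time budget is spent, then yield progress and pause.

    Each yielded value is the number of items finished so far. The caller
    decides when to resume by asking the generator for its next value.
    """
    deadline = time.monotonic() + budget_s
    done = 0
    reported = 0
    for item in items:
        handle(item)
        done += 1
        if time.monotonic() >= deadline:
            yield done  # give the caller (for example, a UI loop) a turn
            reported = done
            deadline = time.monotonic() + budget_s
    if done != reported:
        yield done  # final progress report for the last partial chunk

# Usage: a generous budget lets this small job finish in one chunk.
seen = []
steps = list(process_in_chunks(range(5), seen.append, budget_s=60.0))
print(steps, seen)
```

Because the generator keeps its own position, a delayed resume costs nothing but time. The work picks up where it stopped instead of assuming it would run uninterrupted.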

Most importantly, I tested under contention. Streaming video while scrolling. Running navigation while syncing. Heating the device intentionally.

Perfetto became part of my regular workflow, not a last resort.

Why this matters beyond one app

These lessons changed how I think about mobile systems entirely.

CPU time is not a resource you own. It is a resource you borrow.

That mindset shift has shaped how I approach projects, including work with mobile app development teams in Charlotte, where apps live alongside heavy user workloads and system services on real devices, not test benches.

Architectures that assume uninterrupted execution break first. Architectures that tolerate delay survive longer.

The quiet resolution

After the fixes, nothing magical happened. No sudden jump in benchmark numbers. No dramatic charts.

Users stopped complaining about jank.

That was enough.

Perfetto did not make my app faster. It made my understanding slower and more careful.

Once I stopped asking the CPU to behave and started watching how it actually behaved, the system stopped feeling unpredictable.

It was never random. It was just shared.
