80% of your presentation tests live in the reducer tier and run in 5ms. The other four tiers exist, but they are narrow on purpose — device-bound work is the rarest thing the pyramid measures. Strict MVI is what makes that shape possible in the first place.
Five tiers at a glance
The shape is fixed: a wide reducer base, a thinner middleware band, a narrow store-integration strip, a screen layer, and a handful of macrobenchmarks on top. A mid-sized feature produces 30–50 reducer tests, 10–20 middleware tests, 5–10 store integration tests, 5–10 screen tests, and 2–4 macrobenchmarks. Wall-clock budgets are equally uneven: under a second for the reducer tier, a few seconds for middleware, about five seconds for integration, and device-bound time for everything above.
If a feature has fewer reducer tests than middleware and integration combined, either the reducer is doing too little or authors are defaulting to integration tests out of habit. The pyramid also diagnoses the codebase, not only the test suite. The tier where an assertion lives is proof of what the code couples to: a state-shape claim that requires a coroutine dispatcher hints at reducer impurity; a middleware test that touches Compose hints that presentation concerns are leaking into async orchestration. The rule for each new behaviour is to write the lowest-tier test that can prove it, and to move up only when that tier refuses to see the bug.
Rule: write every presentation test at the lowest tier that can prove the behaviour.
Reducer tier — 80% of tests
class FactsReducerTest {

    @Test
    fun `Refresh on idle sets isRefreshing and clears error`() {
        val out = FactsReducer.reduce(
            state = FactsState(error = UiText.Raw("stale")),
            intent = FactsIntent.Refresh,
        )
        assertTrue(out.state.isRefreshing)
        assertNull(out.state.error)
        assertTrue(out.followUps.isEmpty())
    }

    @Test
    fun `Refresh while already refreshing returns same instance`() {
        val state = FactsState(isRefreshing = true)
        val out = FactsReducer.reduce(state, FactsIntent.Refresh)
        assertSame(state, out.state)
    }

    @Test
    fun `Retry sets loading and emits LoadNextPage follow-up`() {
        val out = FactsReducer.reduce(
            state = FactsState(error = UiText.Raw("x")),
            intent = FactsIntent.Retry,
        )
        assertTrue(out.state.isLoading)
        assertEquals(listOf(FactsIntent.LoadNextPage), out.followUps)
    }
}
Error-to-UiText mappings are the obvious parameterised block:
@RunWith(Parameterized::class)
class FactsReducerErrorMappingTest(
    private val cause: Throwable,
    private val expected: UiText,
) {

    companion object {
        @JvmStatic
        @Parameterized.Parameters(name = "{0} -> {1}")
        fun data() = listOf(
            arrayOf(IOException(), UiText.Resource(R.string.error_network)),
            arrayOf(ServerException(500, "x"), UiText.Resource(R.string.error_server)),
            arrayOf(RuntimeException(), UiText.Resource(R.string.error_unknown)),
        )
    }

    @Test
    fun `FactsLoadFailed maps cause to UiText`() {
        val out = FactsReducer.reduce(FactsState(), FactsIntent.FactsLoadFailed(cause))
        assertEquals(expected, out.state.error)
    }
}
No @Before, no Dispatchers.setMain, no runTest, no MockK, no Turbine, no Hilt. Anything from that list inside a reducer test means the test belongs a tier up. The canonical shape is three tests per intent — happy path, no-op branch with assertSame, and one edge or error branch — plus a parameterised block for error mappings. Follow-up intents get an exact list assertion, not a contains-check; the reducer's follow-up list is a total contract, and any looseness there hides re-entrant loops that only surface in integration later. See https://medium.com/@ouday.khaled/pure-reducers-in-kotlin-why-your-android-unit-tests-should-run-in-5-milliseconds-a4277cea3c38 for the full reducer contract.
Test names follow the BDD pattern <Intent> <condition> <outcome> — Refresh on idle sets isRefreshing and clears error. The intent is the grammatical subject; the entry point into the Store is send(intent), so the failure message reads like the bug report. Every feature also ships a seeded fuzz test that folds 10,000 random intents through the reducer and asserts global invariants — no duplicate facts by id, currentPage >= 1, bookmarkLoadingIds stays an ImmutableSet. Because the reducer is pure, the whole fuzz run clears in under 100 ms and surfaces invariant bugs that would take a week to reproduce manually.
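The fuzz harness can be sketched in plain Kotlin. This is an illustrative miniature, not the article's actual FactsReducer: the state, intents, and invariants here are stand-ins, but the shape — a fixed seed, a fold over random intents, and a `check` per step — is the whole technique.

```kotlin
import kotlin.random.Random

// Illustrative mini-reducer: a pure fold over intents, mirroring the
// shape of a FactsReducer fuzz test without the real domain types.
data class MiniState(
    val factIds: List<String> = emptyList(),
    val currentPage: Int = 1,
)

sealed interface MiniIntent {
    data class FactReceived(val id: String) : MiniIntent
    object NextPage : MiniIntent
    object Refresh : MiniIntent
}

fun reduce(state: MiniState, intent: MiniIntent): MiniState = when (intent) {
    is MiniIntent.FactReceived ->
        // Invariant: never store the same fact id twice.
        if (intent.id in state.factIds) state
        else state.copy(factIds = state.factIds + intent.id)
    MiniIntent.NextPage -> state.copy(currentPage = state.currentPage + 1)
    MiniIntent.Refresh -> state.copy(factIds = emptyList(), currentPage = 1)
}

fun main() {
    val rng = Random(42) // fixed seed: every failure replays identically
    var state = MiniState()
    repeat(10_000) {
        val intent = when (rng.nextInt(3)) {
            0 -> MiniIntent.FactReceived("fact-${rng.nextInt(50)}")
            1 -> MiniIntent.NextPage
            else -> MiniIntent.Refresh
        }
        state = reduce(state, intent)
        // Global invariants must hold after every single step.
        check(state.factIds.size == state.factIds.distinct().size) { "duplicate fact id" }
        check(state.currentPage >= 1) { "currentPage fell below 1" }
    }
    println("10000 intents folded, invariants held")
}
```

Because the fold is pure, a failing seed is a complete, millisecond-scale reproduction: rerun with the same seed and the same step fails.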
Rule: if a reducer test imports a coroutine utility, delete the import or move the test.
Middleware tier with Turbine
A middleware test wires three inputs — an Intent, a fake repository, and a State snapshot — and asserts two outputs: the list of dispatched intents and the list of emitted effects. Each arrow is a contract point.
@Test
fun `ToggleBookmark success dispatches Started then Confirmed`() = runTest {
    val dispatched = mutableListOf<FactsIntent>()
    val emitted = mutableListOf<FactsEffect>()
    middleware.process(
        intent = FactsIntent.ToggleBookmark("a"),
        state = FactsState(facts = persistentListOf(CatFact(id = "a", isBookmarked = false))),
        scope = backgroundScope,
        dispatch = { dispatched += it },
        emit = { emitted += it; true },
    )
    advanceUntilIdle()
    assertEquals(FactsIntent.BookmarkStarted("a"), dispatched[0])
    assertTrue(dispatched[1] is FactsIntent.BookmarkConfirmed)
    assertTrue(emitted.isEmpty())
}
Cancellation keys, debounce, and throttle are the tier's hardest work, and advanceTimeBy makes them deterministic:
@Test
fun `rapid ToggleBookmark cancels the in-flight job`() = runTest {
    fakeRepo.bookmarkDelay = 200L
    val dispatched = mutableListOf<FactsIntent>()
    val state = FactsState(facts = persistentListOf(CatFact(id = "a", isBookmarked = false)))

    middleware.process(FactsIntent.ToggleBookmark("a"), state, backgroundScope, { dispatched += it }, { true })
    advanceTimeBy(50L)
    middleware.process(FactsIntent.ToggleBookmark("a"), state, backgroundScope, { dispatched += it }, { true })
    advanceUntilIdle()

    val confirmed = dispatched.filterIsInstance<FactsIntent.BookmarkConfirmed>()
    assertEquals(1, confirmed.size)
}
Use StandardTestDispatcher with setMain/resetMain, backgroundScope from runTest, and advanceTimeBy/advanceUntilIdle for debounce and cancellation. Prefer a real use case backed by a fake repository over mocking the use case — the fake is the observable surface, and every expectation lands on it rather than on the use case's internals. Middleware tests never construct their own CoroutineScope: a hand-rolled CoroutineScope(Job()) keeps running after the test finishes and leaks state across cases.
Three async surfaces live here and nowhere else: debounced input, keyed per-entity cancellation, and onStart pipelines that observe a repository Flow. A SearchChanged test advances the virtual clock past 350ms and confirms exactly one SearchQueryCommitted emerges, not three. A bookmark test fires two ToggleBookmark("a") intents 50ms apart and asserts that only one BookmarkConfirmed("a") lands. An onStart test emits into the fake repository's Flow and asserts a FactsReceived dispatch follows. If a middleware test tries to assert on store.state, it is already wearing two hats and belongs a tier up.
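The debounce contract the SearchChanged test pins down is easy to state as a pure function. This stdlib-only sketch is an illustration of the contract, not the real middleware (which uses coroutine operators over a virtual clock): an event commits only if no newer event arrives within the window.

```kotlin
// Trailing-edge debounce over timestamped events: an event is committed
// only if no newer event arrives within windowMs of it.
fun debounceCommits(events: List<Pair<Long, String>>, windowMs: Long): List<String> {
    val sorted = events.sortedBy { it.first }
    return sorted
        .filterIndexed { i, (time, _) ->
            val next = sorted.getOrNull(i + 1)
            next == null || next.first - time >= windowMs
        }
        .map { it.second }
}

fun main() {
    // Three keystrokes 100 ms apart, then silence: only the last commits.
    val events = listOf(0L to "c", 100L to "ca", 200L to "cat")
    println(debounceCommits(events, windowMs = 350L)) // prints [cat]
}
```

The middleware test asserts exactly this shape against the virtual clock: three SearchChanged intents in, one SearchQueryCommitted out.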
Rule: capture dispatch and emit as lists; assert the lists, not the middleware's internals.
Store integration tier
@Test
fun `Refresh success updates state with facts`() = runTest {
    fakeRepo.nextRefreshReturns(listOf(CatFact(id = "a", fact = "a", isBookmarked = false)))
    store.state.test {
        assertEquals(FactsState(), awaitItem())
        store.send(FactsIntent.Refresh)
        assertTrue(awaitItem().isRefreshing)
        val final = expectMostRecentItem()
        assertFalse(final.isRefreshing)
        assertEquals(1, final.facts.size)
        cancelAndIgnoreRemainingEvents()
    }
}
Real reducer, real middlewares, fake repository, Turbine on store.state and store.effects. Verify the endpoints — initial state, final state via expectMostRecentItem(), one effect per effect kind. Do not assert every intermediate emission; that is a brittle reimplementation of the reducer test. SavedStateHandle restore is a store test, not a middleware test: the contract crosses both collaborators, and the test proves that constructing a new FactsStoreImpl with a pre-seeded SavedStateHandle restores searchInput and currentPage into the first state.value.
Keep this tier deliberately narrow. One happy path per feature and one failure per async intent family is enough — eight tests is generous. Middleware ordering regressions surface here as telemetry firing against pre-reducer state or save-state capturing the wrong snapshot, and the fix is always a single-line reorder in the Store's middlewares = listOf(...) list. Anti-patterns like state-in-effect or SharedFlow for effects also surface here as lost emissions when Turbine times out on store.effects.test { awaitItem() }; those belong in https://medium.com/@ouday.khaled/10-mvi-anti-patterns-senior-android-reviewers-reject-on-sight-44bd94fd5358.
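Why ordering regressions reduce to a one-line reorder is visible in a minimal store sketch (illustrative names, not the article's FactsStoreImpl): each middleware observes the state the Store holds at the moment it runs, so a telemetry middleware placed before the reduce step logs the pre-reducer snapshot.

```kotlin
// Minimal store: middlewares run in list order, then the reducer runs.
data class State(val count: Int = 0)

fun interface Middleware {
    fun process(intent: String, state: State, log: MutableList<String>)
}

class MiniStore(private val middlewares: List<Middleware>) {
    var state = State()
        private set
    val log = mutableListOf<String>()

    fun send(intent: String) {
        // Every middleware sees the state *before* this intent is reduced.
        middlewares.forEach { it.process(intent, state, log) }
        if (intent == "Increment") state = state.copy(count = state.count + 1)
    }
}

fun main() {
    val telemetry = Middleware { intent, state, log ->
        log += "telemetry saw count=${state.count} for $intent"
    }
    val store = MiniStore(middlewares = listOf(telemetry))
    store.send("Increment")
    println(store.log.single()) // telemetry saw count=0 for Increment
}
```

Moving a middleware across the reduce boundary (or reordering the listOf) changes which snapshot it captures, which is exactly the class of bug the store tier exists to catch.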
Rule: store tests prove the wiring, not the branches.
Screen and macrobenchmark
@Test
fun `clicking fact card dispatches FactClicked with correct id`() {
    val intents = mutableListOf<FactsIntent>()
    composeRule.setContent {
        CatFactTheme {
            FactsScreen(
                state = FactsState(facts = persistentListOf(fact)),
                onIntent = { intents += it },
            )
        }
    }
    composeRule.onNodeWithTag("fact_card_abc").performClick()
    assertEquals(listOf(FactsIntent.FactClicked("abc")), intents)
}
Compose tests drive the Screen with a fabricated State and a single onIntent capture list. One interactive element, one assertion — the list asserts both which intent and its payload. The Route is omitted by design; Route wiring is covered by the store tier and by the flagship https://medium.com/@ouday.khaled/the-strict-mvi-playbook-how-staff-android-engineers-structure-jetpack-compose-at-scale-410a8717f1f7.
Macrobenchmark occupies the tip. Four benchmarks cover the hot paths per intent family:
@Test
fun scrollFactsList() = benchmarkRule.measureRepeated(
    packageName = "com.catfact.app",
    metrics = listOf(FrameTimingMetric()),
    startupMode = StartupMode.WARM,
    iterations = 5,
    setupBlock = { /* open app, seed list */ },
) {
    device.findObject(By.res("facts_list")).fling(Direction.DOWN)
}
A TraceSectionMetric("FactsReducer.reduce") over 100µs per reduction is a regression; a frame spike on resume points at effect-channel overflow during the stopped window. Scroll regressions usually trace back to a reducer allocating a new ImmutableList on every emission, which the no-op detection rule in https://medium.com/@ouday.khaled/pure-reducers-in-kotlin-why-your-android-unit-tests-should-run-in-5-milliseconds-a4277cea3c38 prevents. The baseline profile covers the full intent journey — refresh, search, toggle, navigate, retry — not only cold start and scroll, so a user-critical intent that regresses by five frames shows up before the next release train.
Rule: device-bound tests count the render, not the branches.
Coverage targets
| Layer | Minimum target |
|---|---|
| Reducer | 100% — pure, no excuse |
| Middleware | 85% — branches around cancellation |
| Store integration | 70% — happy + 1 failure per feature |
| Screen | ≥ 1 intent-dispatch test per element |
The reducer and middleware tiers combine to 85%+ of the presentation surface — verified before a device ever boots. The store tier picks up the wiring gaps. Screen tests are counted by interactive elements, not by lines, because a Compose node with no dispatch point proves nothing; coverage tools register the click handler as executed whether or not the intent it dispatches is correct.
Wire these thresholds into jacocoCombinedReport and fail the build under them. A pure reducer that drops below 100% means a branch was added without a test; a middleware below 85% usually means a cancellation path slipped through the catch (e: CancellationException) { throw e } rethrow and landed in state.error. One follow-up practice pays back twice: every reproduced bug ships with a captured intent sequence that replays through the reducer in milliseconds, and that capture lands in the commit as a regression test at the base tier.
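The thresholds translate directly into a JacocoCoverageVerification task. A hedged sketch in Gradle Kotlin DSL, assuming a standard JaCoCo Gradle setup and a package-per-tier layout — the task name and package patterns here are illustrative, not the project's actual configuration:

```kotlin
// build.gradle.kts — fail the build when a tier drops below its floor.
tasks.register<JacocoCoverageVerification>("jacocoCombinedVerification") {
    violationRules {
        rule {
            element = "PACKAGE"
            includes = listOf("com.catfact.*.presentation.reducer")
            limit {
                counter = "LINE"
                value = "COVEREDRATIO"
                minimum = "1.00".toBigDecimal() // reducers: pure, no excuse
            }
        }
        rule {
            element = "PACKAGE"
            includes = listOf("com.catfact.*.presentation.middleware")
            limit {
                counter = "LINE"
                value = "COVEREDRATIO"
                minimum = "0.85".toBigDecimal() // cancellation branches
            }
        }
    }
}
```

Wire the verification task into `check` so the floor is enforced on every CI run, not only when someone remembers to open the report.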
Rule: the build enforces the pyramid, not the author.
Where to go next
- The Strict-MVI Playbook — the architecture that makes this pyramid possible.
- Pure Reducers in Kotlin — the 5ms base tier, in depth.
- 10 MVI Anti-Patterns Reviewers Reject on Sight — what makes the store tier lie.