The fastest test suite is the one that doesn't need a device. That sounds like a testing tip, but it's really an architecture result: if your logic lives in a framework-free core behind interfaces, the vast majority of it is plain Kotlin you can test in commonTest — no emulator, no simulator, no flake. This post walks the full testing pyramid I run on KMP suites, from pure domain tests up through snapshots and performance.
Why the hexagonal core pays off in tests
CoreLib holds domain logic and port interfaces only — no Android, no platform SDKs. That single rule means a use case never touches a real network, GPS, or Bluetooth stack; it touches an interface. In tests you pass a fake. So the bulk of the suite is fast, deterministic, and lives in commonMain's sibling commonTest:
// commonTest: pure, no platform, runs on every target
import kotlin.test.Test
import kotlin.test.assertTrue

class PlayabilityCalculatorTest {
    @Test
    fun penalizesHighWind() {
        val ctx = scoringContext(windMph = 35, tempF = 68, rain = false)
        val score = PlayabilityCalculator().calculateScore(ctx)
        assertTrue(score.value < 50)
    }
}
kotlin.test gives you @Test and assert* functions that map to JUnit on JVM/Android and to the Kotlin/Native test runner on iOS, so you write a test once and run it on every target.
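For readers who want something to run the test against, here is a minimal sketch of what a pure domain class like this could look like. The types, the scoringContext builder defaults, and every weight below are hypothetical; the post does not show the real scoring rules. The point is only that nothing here imports a platform API.

```kotlin
// Hypothetical domain types; the real ScoringContext is not shown in this post.
data class ScoringContext(val windMph: Int, val tempF: Int, val rain: Boolean)
data class PlayabilityScore(val value: Int)

class PlayabilityCalculator {
    fun calculateScore(ctx: ScoringContext): PlayabilityScore {
        var score = 100
        // Illustrative weight: 3 points off per mph of wind above 10.
        score -= maxOf(0, ctx.windMph - 10) * 3
        // Illustrative weight: 1 point off per degree outside a 55..85 °F comfort band.
        if (ctx.tempF < 55) score -= 55 - ctx.tempF
        if (ctx.tempF > 85) score -= ctx.tempF - 85
        // Illustrative flat penalty for rain.
        if (ctx.rain) score -= 30
        // Keep the result inside 0..100 regardless of how penalties stack.
        return PlayabilityScore(score.coerceIn(0, 100))
    }
}

// Test-data builder with calm defaults, so each test states only what it cares about.
fun scoringContext(windMph: Int = 5, tempF: Int = 70, rain: Boolean = false) =
    ScoringContext(windMph, tempF, rain)
```

Because the class is plain Kotlin with no constructor dependencies, the test can instantiate it directly on any target.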
Faking platforms with expect/actual
When a test does need a platform seam, you don't mock the SDK — you depend on the port and substitute a fake. For the rare cases where a common test needs a platform primitive, expect/actual lets you provide a test implementation per target. The result: 90%+ of behavior is verified in commonTest, and platform code shrinks to thin, separately-tested adapters.
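The port-and-fake substitution can be sketched in a few lines. WeatherPort, FakeWeatherPort, and WindWarningUseCase are hypothetical names invented for illustration, and the 25 mph threshold is arbitrary; expect/actual is not shown here because it needs per-target source sets.

```kotlin
// Port: the only thing the core knows about weather. No SDK types leak through it.
interface WeatherPort {
    fun currentWindMph(): Int
}

// Use case in the core. It depends on the port, never on a concrete platform client.
class WindWarningUseCase(
    private val weather: WeatherPort,
    private val thresholdMph: Int = 25, // arbitrary illustrative threshold
) {
    fun shouldWarn(): Boolean = weather.currentWindMph() >= thresholdMph
}

// Fake used in commonTest: a plain class whose answer the test controls.
class FakeWeatherPort(var windMph: Int) : WeatherPort {
    override fun currentWindMph(): Int = windMph
}

fun main() {
    val fake = FakeWeatherPort(windMph = 35)
    val useCase = WindWarningUseCase(fake)
    println(useCase.shouldWarn()) // true: 35 >= 25

    fake.windMph = 5
    println(useCase.shouldWarn()) // false: 5 < 25
}
```

A hand-written fake like this compiles on every target, which is why it fits commonTest better than a mocking library tied to one runtime.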
Testing the reactive layer with Turbine
KMP code is full of Flow/StateFlow. Asserting on emissions by hand is painful; Turbine makes it readable:
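import app.cash.turbine.test
import kotlin.test.Test
import kotlin.test.assertEquals
import kotlinx.coroutines.test.runTest

@Test
fun emitsConnectingThenConnected() = runTest {
    bleClient.connectionState.test {
        assertEquals(DISCONNECTED, awaitItem()) // a StateFlow emits its current value first
        bleClient.connect("device-1")
        assertEquals(CONNECTING, awaitItem())
        assertEquals(CONNECTED, awaitItem())
        cancelAndIgnoreRemainingEvents()
    }
}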
runTest + a test dispatcher make coroutine time deterministic, so these run instantly and don't flake.
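The Turbine test assumes a bleClient whose state can be driven without a radio. A minimal sketch of such a fake follows, assuming a hypothetical BleClient port and ConnectionState enum (neither is shown in this post) and the kotlinx-coroutines library on the classpath.

```kotlin
import kotlinx.coroutines.delay
import kotlinx.coroutines.flow.MutableStateFlow
import kotlinx.coroutines.flow.StateFlow
import kotlinx.coroutines.flow.asStateFlow

// Hypothetical state model and port; the real ones are not shown in the post.
enum class ConnectionState { DISCONNECTED, CONNECTING, CONNECTED }

interface BleClient {
    val connectionState: StateFlow<ConnectionState>
    suspend fun connect(deviceId: String)
}

// Fake adapter: walks through the same states a real client would, with no Bluetooth stack.
class FakeBleClient(private val handshakeMillis: Long = 100) : BleClient {
    private val state = MutableStateFlow(ConnectionState.DISCONNECTED)
    override val connectionState: StateFlow<ConnectionState> = state.asStateFlow()

    override suspend fun connect(deviceId: String) {
        state.value = ConnectionState.CONNECTING
        // Suspension point between the two writes. StateFlow conflates rapid updates,
        // so without it a collector could miss CONNECTING.
        delay(handshakeMillis)
        state.value = ConnectionState.CONNECTED
    }
}
```

Inside runTest the delay should be skipped in virtual time, so the fake keeps the intermediate state observable without slowing the suite down.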
Enforcing the architecture with ArchUnit
The reason the core stays testable is that nothing leaks platform dependencies into it — and that's itself a test. ArchUnit turns the architecture rules into assertions that fail the build when someone violates them:
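import com.tngtech.archunit.core.importer.ClassFileImporter
import com.tngtech.archunit.lang.syntax.ArchRuleDefinition.classes
import kotlin.test.Test

// Assumption: "com.example" stands in for your own root package.
private val importedClasses = ClassFileImporter().importPackages("com.example")

@Test
fun domainHasNoPlatformDependencies() {
    classes().that().resideInAPackage("..core.domain..")
        .should().onlyDependOnClassesThat()
        .resideOutsideOfPackages("android..", "platform..", "java.net..")
        .check(importedClasses)
}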
This is the highest-leverage test in the suite: it keeps every other test fast by guaranteeing the boundary never erodes. (In CI this runs in the cheap "gate before you build" step.)
Coverage with Kover, as a gate
Kover measures multiplatform coverage and can fail the build below a threshold — so coverage is enforced, not aspirational:
kover {
    reports {
        verify { rule { minBound(80) } } // build fails under 80%
    }
}
A badge on each library's README turns that into a credibility signal for anyone browsing the repo.
UI: snapshot testing with Paparazzi
Compose UI gets verified without a device using Paparazzi, which renders composables to images and diffs them against goldens:
@get:Rule val paparazzi = Paparazzi() // JUnit rule that renders composables on the JVM

@Test fun radarView_dark() {
    paparazzi.snapshot { ViewPointTheme(dark = true) { RadarView(rssi = mapOf("A" to -60)) } }
}
Catches visual regressions (spacing, color, layout) in CI, no emulator needed. For interaction logic, Compose UI tests / Robolectric cover the Android side; iOS UI is exercised in simulator tests.
Performance as a test: Macrobenchmark
Speed regressions are bugs too. Macrobenchmark measures startup time and jank/frame timing on a real device build, and Baseline Profiles bake in the wins:
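@get:Rule val benchmarkRule = MacrobenchmarkRule()

@Test fun startup() = benchmarkRule.measureRepeated(
    packageName = "com.example.app",
    metrics = listOf(StartupTimingMetric()),
    iterations = 5,
    startupMode = StartupMode.COLD
) {
    pressHome()            // leave the app so the launch starts from the home screen
    startActivityAndWait() // launch the default activity and wait for its first frame
}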
Track the numbers over time and a PR that regresses startup gets caught before it ships.
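One way to turn "track the numbers" into a pass/fail check is to compare medians against a stored baseline. This is a sketch under stated assumptions: the helper names and the 10% tolerance are hypothetical, and how you persist the baseline samples is left out entirely.

```kotlin
// Median of the measured iterations; less sensitive to one slow run than the mean.
fun median(samplesMs: List<Double>): Double {
    require(samplesMs.isNotEmpty()) { "need at least one sample" }
    val sorted = samplesMs.sorted()
    val mid = sorted.size / 2
    return if (sorted.size % 2 == 1) sorted[mid] else (sorted[mid - 1] + sorted[mid]) / 2.0
}

// True when the current median is slower than the baseline median
// by more than the allowed fraction (hypothetical default: 10%).
fun isStartupRegression(
    baselineMs: List<Double>,
    currentMs: List<Double>,
    tolerance: Double = 0.10,
): Boolean = median(currentMs) > median(baselineMs) * (1 + tolerance)

fun main() {
    val baseline = listOf(400.0, 410.0, 420.0, 405.0, 415.0) // median 410
    val slower = listOf(480.0, 470.0, 490.0, 475.0, 485.0)   // median 480
    val similar = listOf(420.0, 415.0, 425.0, 410.0, 430.0)  // median 420

    println(isStartupRegression(baseline, slower))  // true: 480 is more than 10% above 410
    println(isStartupRegression(baseline, similar)) // false: 420 is within 10% of 410
}
```

The tolerance exists because device measurements are noisy; a threshold that is too tight makes the nightly job cry wolf.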
The pyramid, summarized
| Layer | Tool | Where | Speed |
|---|---|---|---|
| Domain logic | kotlin.test, MockK | commonTest | ⚡ instant |
| Reactive flows | Turbine | commonTest | ⚡ instant |
| Architecture rules | ArchUnit | JVM | ⚡ instant |
| Coverage gate | Kover | all | fast |
| UI snapshots | Paparazzi | JVM render | fast |
| Android UI/integration | Robolectric / Compose UI test | Android | medium |
| iOS | XCTest (simulator) | macOS runner | medium |
| Performance | Macrobenchmark | device | slow (nightly) |
The shape is deliberate: nearly everything that can fail is caught in the fast, deviceless tiers, so CI stays cheap and developers get answers in seconds. The slow tiers (device perf, simulator) run where they belong — not on every keystroke.
Takeaway
A KMP testing suite isn't a pile of tools; it's a consequence of architecture. Keep the core framework-free, depend on ports, and the majority of your behavior becomes pure commonTest that runs on every target without a device. Then layer ArchUnit to protect the boundary, Kover to enforce coverage, Turbine for flows, Paparazzi for pixels, and Macrobenchmark for speed — and let CI run the cheap tiers on every push and the expensive ones on a schedule.