myougaTheAxo

Testing AI-Generated Android Apps: A Pragmatic Strategy

When you use AI to generate Android apps—whether through Claude, Codex, or other tools—you inherit both a gift and a responsibility. The gift is rapid prototyping. The responsibility is quality assurance. This guide covers a pragmatic testing strategy that prevents your AI-generated Kotlin apps from becoming unmaintainable nightmares.

Why AI Apps Need Different Testing

AI-generated code excels at scaffolding and boilerplate but often struggles with edge cases and domain logic. The shape of your testing pyramid stays classic, but where you point each layer shifts toward the logic AI can't infer:

  • Unit Tests (70%): Test the logic AI can't infer (business rules, ViewModel state transitions, data transformations)
  • Integration Tests (20%): Test Room DAO layers, API clients, and data flow
  • UI Tests (10%): Test layouts and navigation (AI is decent at UI code)

The key insight: AI-generated UI code is usually correct. AI-generated business logic needs skepticism.
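As a concrete example of the kind of business rule worth unit testing, here is a hypothetical task-title validator (the function name and rules are illustrative, not from any specific app). AI-generated code often skips the normalization step, which is exactly the edge case a test catches:

```kotlin
// Hypothetical business rule: titles must be non-blank and at most 100
// characters after trimming. A whitespace-only title slipping through is
// the classic AI-generated edge-case bug.
fun validateTaskTitle(raw: String): String? {
    val title = raw.trim()
    return when {
        title.isEmpty() -> null     // reject blank or whitespace-only input
        title.length > 100 -> null  // reject overly long titles
        else -> title               // return the normalized title
    }
}

fun main() {
    check(validateTaskTitle("Buy milk") == "Buy milk")
    check(validateTaskTitle("   ") == null)                // whitespace-only
    check(validateTaskTitle("  Buy milk  ") == "Buy milk") // trimming
    check(validateTaskTitle("x".repeat(101)) == null)      // length limit
    println("all checks passed")
}
```

Four assertions cover every branch of this function; that is the kind of cheap, high-value coverage the 70% unit-test layer is for.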

Testing Pyramid for AI-Generated Apps

          /\
         /  \   UI Tests (10%)
        /____\
       /      \
      /        \  Integration Tests (20%)
     /__________\
    /            \
   /              \  Unit Tests (70%)
  /________________\

Unit tests focus on:

  1. ViewModel state mutations
  2. Business logic branches
  3. Data transformations
  4. Error handling

Integration tests focus on:

  1. Database operations
  2. Repository patterns
  3. API client calls
  4. Data flow end-to-end

UI tests are minimal—just verify navigation and critical user flows.

Pattern 1: ViewModel Testing with FakeDao

AI loves generating clean ViewModels. Test them with a fake repository:

// Requires kotlinx-coroutines-test. The ViewModel's viewModelScope runs on
// Dispatchers.Main, so swap Main for a test dispatcher before each test.
@OptIn(ExperimentalCoroutinesApi::class)
class TaskViewModelTest {
    private val dispatcher = StandardTestDispatcher()
    private lateinit var fakeDao: FakeTaskDao
    private lateinit var viewModel: TaskViewModel

    @Before
    fun setup() {
        Dispatchers.setMain(dispatcher)
        fakeDao = FakeTaskDao()
        viewModel = TaskViewModel(TaskRepository(fakeDao))
    }

    @After
    fun tearDown() {
        Dispatchers.resetMain()
    }

    @Test
    fun addTask_updatesState() = runTest(dispatcher) {
        viewModel.addTask("Buy milk")

        advanceUntilIdle()

        val state = viewModel.uiState.value
        assertEquals(1, state.tasks.size)
        assertEquals("Buy milk", state.tasks[0].title)
    }

    @Test
    fun deleteTask_removesFromList() = runTest(dispatcher) {
        // Setup
        fakeDao.insertTask(TaskEntity(id = 1, title = "Test"))
        viewModel.deleteTask(1)

        advanceUntilIdle()

        val state = viewModel.uiState.value
        assertEquals(0, state.tasks.size)
    }

    @Test
    fun addTask_emptyTitle_doesNotAdd() = runTest(dispatcher) {
        viewModel.addTask("")

        advanceUntilIdle()

        val state = viewModel.uiState.value
        assertEquals(0, state.tasks.size)
    }
}

// Backed by a MutableStateFlow so collectors see every change. A one-shot
// flow { emit(tasks.toList()) } would never re-emit after inserts or
// deletes, and the ViewModel's state would silently go stale.
class FakeTaskDao : TaskDao {
    private val tasks = MutableStateFlow<List<TaskEntity>>(emptyList())

    override suspend fun insertTask(task: TaskEntity) {
        tasks.value = tasks.value + task
    }

    override suspend fun deleteTask(id: Int) {
        tasks.value = tasks.value.filterNot { it.id == id }
    }

    override fun observeTasks(): Flow<List<TaskEntity>> = tasks
}

Why this works: The ViewModel doesn't care if the DAO is real or fake. Fake objects run instantly and never hit the database. This tests the ViewModel logic in isolation, where AI-generated bugs live.
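The substitution principle at work here can be shown without any Android dependency at all. A minimal, framework-free sketch (all names hypothetical) of swapping a fake behind an interface:

```kotlin
// The class under test depends only on an interface, so a fake backed by
// an in-memory list substitutes cleanly for a real database-backed DAO.
interface TaskStore {
    fun insert(title: String)
    fun all(): List<String>
}

class InMemoryTaskStore : TaskStore {
    private val items = mutableListOf<String>()
    override fun insert(title: String) { items.add(title) }
    override fun all(): List<String> = items.toList()
}

// Stands in for the ViewModel: holds the logic we actually want to verify.
class TaskPresenter(private val store: TaskStore) {
    fun addTask(title: String) {
        if (title.isNotBlank()) store.insert(title.trim())
    }
    fun taskCount(): Int = store.all().size
}

fun main() {
    val presenter = TaskPresenter(InMemoryTaskStore())
    presenter.addTask("Buy milk")
    presenter.addTask("   ")  // blank: rejected by the business rule
    check(presenter.taskCount() == 1)
    println("fake-backed test passed")
}
```

The production code would inject a real Room-backed implementation in the same slot; nothing about the presenter changes.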

Pattern 2: Room DAO Testing with inMemoryDatabaseBuilder

Room DAOs are often correct from AI, but integration testing catches schema mismatches and query bugs:

@RunWith(AndroidJUnit4::class)
class TaskDaoTest {
    private lateinit var db: AppDatabase
    private lateinit var taskDao: TaskDao

    @Before
    fun setup() {
        db = Room.inMemoryDatabaseBuilder(
            ApplicationProvider.getApplicationContext(),
            AppDatabase::class.java
        ).build()
        taskDao = db.taskDao()
    }

    @After
    fun tearDown() {
        db.close()
    }

    @Test
    fun insertAndRetrieveTask() = runTest {
        val task = TaskEntity(id = 1, title = "Test Task", completed = false)
        taskDao.insertTask(task)

        // getTaskById returns Flow<TaskEntity?>, so a missing row yields null
        val retrieved = taskDao.getTaskById(1).first()

        assertEquals(task.title, retrieved?.title)
        assertEquals(task.completed, retrieved?.completed)
    }

    @Test
    fun updateTask_modifiesExisting() = runTest {
        val original = TaskEntity(id = 1, title = "Original", completed = false)
        taskDao.insertTask(original)

        val updated = original.copy(title = "Updated", completed = true)
        taskDao.updateTask(updated)

        val retrieved = taskDao.getTaskById(1).first()
        assertEquals("Updated", retrieved?.title)
        assertEquals(true, retrieved?.completed)
    }

    @Test
    fun deleteTask_removesRecord() = runTest {
        taskDao.insertTask(TaskEntity(id = 1, title = "ToDelete", completed = false))
        taskDao.deleteTask(1)

        val result = taskDao.getTaskById(1).first()
        assertNull(result)
    }

    @Test
    fun observeTasks_emitsAllTasks() = runTest {
        taskDao.insertTask(TaskEntity(id = 1, title = "Task 1", completed = false))
        taskDao.insertTask(TaskEntity(id = 2, title = "Task 2", completed = true))

        val tasks = taskDao.observeTasks().first()

        assertEquals(2, tasks.size)
    }
}

Key insight: inMemoryDatabaseBuilder() creates a database in RAM that is destroyed after each test. This is fast and isolated. If your AI-generated DAO queries have typos or schema issues, they'll fail here.

Pattern 3: Compose UI Testing

AI generates Compose code well. Test only user interactions and state changes:

// mockViewModel() is assumed to be a hand-rolled test double that records
// calls (addTaskCalled, lastTaskTitle) and exposes its state flow for setup.
class TaskScreenTest {
    @get:Rule
    val composeTestRule = createComposeRule()

    @Test
    fun clickAddButton_opensAddDialog() {
        composeTestRule.setContent {
            TaskScreen(viewModel = mockViewModel())
        }

        composeTestRule.onNodeWithTag("add_button").performClick()

        composeTestRule.onNodeWithText("Add Task").assertIsDisplayed()
    }

    @Test
    fun typeTaskTitle_updatesList() {
        val viewModel = mockViewModel()
        composeTestRule.setContent {
            TaskScreen(viewModel = viewModel)
        }

        composeTestRule.onNodeWithTag("title_input").performTextInput("Buy milk")
        composeTestRule.onNodeWithTag("add_button").performClick()

        // Verify state changed (mocked ViewModel can track this)
        assertTrue(viewModel.addTaskCalled)
        assertEquals("Buy milk", viewModel.lastTaskTitle)
    }

    @Test
    fun emptyTaskList_showsEmptyState() {
        val emptyViewModel = mockViewModel().apply {
            _uiState.value = TaskUiState(tasks = emptyList())
        }
        composeTestRule.setContent {
            TaskScreen(viewModel = emptyViewModel)
        }

        composeTestRule.onNodeWithText("No tasks yet").assertIsDisplayed()
    }
}

Principle: UI tests should verify navigation and critical user journeys, not implementation details. AI-generated UI is usually correct—test the state transitions, not the @Composable function internals.

When NOT to Test

This is crucial. AI tempts you to over-test. Here's what you should skip:

  1. Layout previews: Don't test that a Spacer(modifier = Modifier.height(16.dp)) renders. Visual testing is manual.
  2. Google's library code: Don't test Room, Compose, or Jetpack internals. They're tested by Google.
  3. Simple data classes: Don't test auto-generated copy(), equals(), or hashCode().
  4. Configuration changes: Unless you've manually written complex state restoration, the framework handles it.
  5. Accessibility properties: Automated testing can't validate UX—test through real devices or accessibility audits.

Focus your effort where bugs hide: domain logic, state transitions, and edge cases in your code.
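For instance, a data transformation such as splitting tasks into active and completed is cheap to cover exhaustively (the names here are illustrative, not from any specific template):

```kotlin
// Hypothetical transformation: partition tasks into active and completed,
// preserving insertion order. Pure functions like this are where a handful
// of boundary-case assertions pays off.
data class Task(val title: String, val completed: Boolean)

fun partitionTasks(tasks: List<Task>): Pair<List<Task>, List<Task>> =
    tasks.partition { !it.completed }  // first = active, second = completed

fun main() {
    val (active, done) = partitionTasks(
        listOf(Task("a", false), Task("b", true), Task("c", false))
    )
    check(active.map { it.title } == listOf("a", "c"))
    check(done.map { it.title } == listOf("b"))
    check(partitionTasks(emptyList()) == Pair(emptyList<Task>(), emptyList<Task>()))
    println("transformation tests passed")
}
```

The empty-list case is the one AI-generated callers most often mishandle, so it earns an assertion of its own.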

Testing Structure for AI Apps

Organize your test directory like this:

src/test/
├── kotlin/
│   ├── viewmodel/
│   │   ├── TaskViewModelTest.kt
│   │   └── FilterViewModelTest.kt
│   ├── repository/
│   │   └── TaskRepositoryTest.kt
│   └── util/
│       ├── FakeTaskDao.kt
│       └── TestDispatchers.kt

src/androidTest/
├── kotlin/
│   ├── dao/
│   │   ├── TaskDaoTest.kt
│   │   └── CategoryDaoTest.kt
│   └── ui/
│       ├── TaskScreenTest.kt
│       └── SettingsScreenTest.kt

Unit tests run on the JVM (fast). Instrumented tests (androidTest) run on a device or emulator (slower).
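To make this split work, the module's test dependencies typically look something like the following Gradle sketch. The artifact coordinates are real, but the versions are illustrative; check the current releases for your project:

```kotlin
// build.gradle.kts (module) -- illustrative test dependencies only.
dependencies {
    // JVM unit tests (src/test): fast, no device required
    testImplementation("junit:junit:4.13.2")
    testImplementation("org.jetbrains.kotlinx:kotlinx-coroutines-test:1.8.1")

    // Instrumented tests (src/androidTest): run on a device or emulator
    androidTestImplementation("androidx.test.ext:junit:1.1.5")
    androidTestImplementation("androidx.room:room-testing:2.6.1")
    // Version resolved by the Compose BOM if your project uses one
    androidTestImplementation("androidx.compose.ui:ui-test-junit4")
}
```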

Coverage Goals for AI Code

  • Unit tests: 70-80% coverage (aim for critical paths)
  • Integration tests: 40-50% coverage (DAO layer + critical repositories)
  • UI tests: 20-30% coverage (critical user journeys only)

Don't obsess over 100% coverage. AI code has predictable bugs. Cover the logic, not the scaffolding.

Debugging Failed Tests

When an AI-generated test fails:

  1. Check the assertion message first—it tells you what expected vs. actual.
  2. Check for async issues: runTest { advanceUntilIdle() } waits for pending coroutines before your assertions run.
  3. Check the fake object—is your FakeDao actually returning data?
  4. Check the database schema—Room migration issues hide here.
  5. Check mocking setup—are your mocks returning the right values?

Most AI test failures are setup issues, not logic bugs.

Quick Checklist

  • [ ] Unit tests for ViewModels (70% of tests)
  • [ ] Fake DAO for isolated ViewModel testing
  • [ ] Room DAO tests with inMemoryDatabaseBuilder
  • [ ] Compose UI tests for critical user flows
  • [ ] No tests for layout implementation details
  • [ ] No tests for Google library code
  • [ ] Test structure matches src/test + src/androidTest
  • [ ] Run tests before every commit

Takeaway

AI-generated Android code is scaffolding: solid structure that needs scrutiny in the logic layers. Your testing pyramid should be heavy on unit tests (fakes are your friend), light on UI tests (AI's Compose code is usually fine), and anchored by integration tests at the DAO level.

All 8 templates follow testable architecture. https://myougatheax.gumroad.com
