Why teachers need explainable AI, not just accurate AI — building the KC dashboard

#numpath #adaptivelearning #python #vue

What We Built

NumPath's teacher dashboard previously showed one number per student: 7-day accuracy. A teacher looking at "Emma — 43%" has no idea whether Emma is struggling with borrowing, place value, number sense, or all three. The number is technically correct and completely unactionable.

In this post I'll walk through how we added a Knowledge Component (KC) mastery panel to the dashboard — colour-coded progress bars per skill that expand to show p_mastery %, mastery level label, and opportunity count. The backend piece is a single endpoint backed by a left-join use case. The research reason it matters is more interesting than the code.

The Design Decision

The core choice was: what data does a teacher actually need?

We had three options:

1. Accuracy-only (what we had): fast to compute, no additional queries, but unactionable
2. Raw Bayesian Knowledge Tracing (BKT) parameters: show p_mastery, p_learn, p_guess, p_slip. Complete, but overwhelming for a classroom teacher
3. KC mastery levels: translate p_mastery into a three-tier label (Novice / Developing / Mastered) with colour coding, keeping the raw number available on expand

We chose option 3. The mastery level thresholds are defined as named constants in get_kc_states.py:

_MASTERY_DEVELOPING = 0.40
_MASTERY_MASTERED   = 0.80

def _mastery_level(p_mastery: float) -> str:
    if p_mastery >= _MASTERY_MASTERED:
        return "Mastered"
    if p_mastery >= _MASTERY_DEVELOPING:
        return "Developing"
    return "Novice"
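
As a quick check of where the boundaries fall, the sketch below repeats the function and runs it on a handful of values. The sample inputs are illustrative, not real student data; the point is only that both thresholds act as inclusive lower bounds.

```python
_MASTERY_DEVELOPING = 0.40
_MASTERY_MASTERED = 0.80


def _mastery_level(p_mastery: float) -> str:
    if p_mastery >= _MASTERY_MASTERED:
        return "Mastered"
    if p_mastery >= _MASTERY_DEVELOPING:
        return "Developing"
    return "Novice"


# Illustrative inputs, including the exact threshold values.
samples = [0.0, 0.12, 0.40, 0.67, 0.80]
levels = {p: _mastery_level(p) for p in samples}
for p, level in levels.items():
    print(f"{p:.2f} -> {level}")
```

A p_mastery of exactly 0.40 is already "Developing", and exactly 0.80 is already "Mastered".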

One deliberate UX choice: a student with no attempts at all still shows all 5 skills at 0% / Novice. There's no "no data yet" placeholder. The teacher sees the full KC grid from day one, because an empty bar is information ("this student hasn't encountered this skill yet"), not an error.

The access control pattern is worth noting too. We added a require_authenticated dependency — any valid JWT — and enforced role logic in the route handler:

@router.get("/{student_id}/kc-states", response_model=KCStatesResponse)
async def get_kc_states(
    student_id: uuid.UUID,
    db: AsyncSession = Depends(get_db),
    auth: dict = Depends(require_authenticated),
) -> KCStatesResponse:
    role = auth.get("role")
    if role == "student" and auth.get("sub") != str(student_id):
        raise HTTPException(status_code=status.HTTP_403_FORBIDDEN, detail="Access denied")
    ...

Students see their own KC states. Teachers see any student's. The rule lives in one place — the route — rather than being split across two separate dependency functions.
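
The rule itself is small enough to state as a pure function, which makes it easy to test in isolation. The sketch below is a hypothetical helper, not code from the NumPath repository: the name can_view_kc_states and the shape of the claims dict are assumptions, and it simply mirrors the check in the route above, where only the student role is restricted.

```python
import uuid


def can_view_kc_states(claims: dict, student_id: uuid.UUID) -> bool:
    """Hypothetical helper mirroring the route check: students may only
    read their own KC states; every other authenticated role passes."""
    if claims.get("role") == "student":
        return claims.get("sub") == str(student_id)
    return True


emma = uuid.uuid4()
other = uuid.uuid4()

own = can_view_kc_states({"role": "student", "sub": str(emma)}, emma)
peer = can_view_kc_states({"role": "student", "sub": str(other)}, emma)
teacher = can_view_kc_states({"role": "teacher", "sub": str(other)}, emma)
print(own, peer, teacher)
```

One consequence of writing the rule this way is that any non-student role is allowed through. If more roles are added later, an explicit allow-list may be a safer default.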

Why It Matters for the Research

The MacLellan intelligent tutoring system (ITS) framework's "Teacher-in-the-Loop" principle isn't just about giving teachers a screen. It's about giving them information they can act on. A 43% accuracy number tells a teacher "this student is struggling." A KC panel that shows SUB_BORROW at 12% (Novice, 8 attempts) while PLACE_VALUE is at 67% (Developing, 14 attempts) tells a teacher "this student needs targeted borrowing practice, and they've already tried eight times, so hints aren't landing."

That's the difference between a reporting tool and a teaching tool. The randomised controlled trial (RCT) we're designing in Phase 4 will measure whether teachers who have KC-level visibility actually intervene differently than those who see accuracy alone. This dashboard is the instrument we're studying, not just a convenience feature.

What We Learned

The left-join strategy (two separate queries plus a dict lookup in application code) turned out to be cleaner than an ORM outerjoin(). With SQLAlchemy's outerjoin(), every skill the student has not attempted comes back with None in the KC columns, and that None handling is easy to get wrong. Two queries and a dict lookup with a fallback default are more readable and easier to test with mocks:

kc_by_skill_id = {record.skill_id: record for record in kc_records}

summaries = [
    KCStateSummary(
        skill_code=skill.code,
        # Skills with no KC record for this student default to 0.0.
        p_mastery=(
            round(kc_by_skill_id[skill.id].p_mastery, 3)
            if skill.id in kc_by_skill_id
            else 0.0
        ),
        ...
    )
    for skill in all_skills
]

Nine unit tests covering the use case ran in 0.03s with no live database. That's the payoff for keeping the domain logic in a use case rather than inline in the route.
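
To make the merge step concrete, here is a self-contained sketch that runs without a database. The Skill and KCRecord dataclasses, the build_summaries function, the NUMBER_SENSE skill code, and the mastery_level and opportunity_count field names are stand-ins assumed for illustration; the snippet above only shows skill_code and p_mastery on KCStateSummary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:  # stand-in for a row of the skills reference table
    id: int
    code: str


@dataclass(frozen=True)
class KCRecord:  # stand-in for one student's stored KC state
    skill_id: int
    p_mastery: float
    opportunity_count: int


@dataclass(frozen=True)
class KCStateSummary:  # field names beyond skill_code/p_mastery are assumed
    skill_code: str
    p_mastery: float
    mastery_level: str
    opportunity_count: int


def mastery_level(p: float) -> str:
    if p >= 0.80:
        return "Mastered"
    if p >= 0.40:
        return "Developing"
    return "Novice"


def build_summaries(all_skills, kc_records):
    """Left join in application code: every skill yields exactly one row."""
    by_skill_id = {r.skill_id: r for r in kc_records}
    summaries = []
    for skill in all_skills:
        record = by_skill_id.get(skill.id)  # None if never attempted
        p = round(record.p_mastery, 3) if record is not None else 0.0
        count = record.opportunity_count if record is not None else 0
        summaries.append(KCStateSummary(skill.code, p, mastery_level(p), count))
    return summaries


skills = [Skill(1, "SUB_BORROW"), Skill(2, "PLACE_VALUE"), Skill(3, "NUMBER_SENSE")]
records = [KCRecord(1, 0.12, 8), KCRecord(2, 0.67, 14)]
summaries = build_summaries(skills, records)
for s in summaries:
    print(s.skill_code, s.p_mastery, s.mastery_level, s.opportunity_count)
```

Because the two inputs are plain lists, a test can pass an empty records list and assert that every skill comes back at 0.0 / Novice with no mocking of the session at all.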

What's Next

Phase 2 of the KC dashboard adds recent attempt history to the student detail panel — the specific problems a student got wrong, with their classified mistake codes, so a teacher can see patterns as they form.

Key Takeaways

Accuracy is output, KC mastery is signal — a single accuracy number is not enough for a teacher to act; per-KC mastery state is the minimum viable explainability for an ITS
Empty is informative, not broken — showing all KCs at 0% for a new student tells a teacher "this skill hasn't been practised yet"; hiding it implies the data is missing
Two queries + dict > one complex join — for small, static reference data (5 skills), two simple queries and a dict lookup are more readable, testable, and maintainable than an ORM outer join with nullable column handling