DEV Community

Oliver Samuel
Oliver Samuel

Posted on

CHW Monthly Activity Aggregation: Turning Visit Logs into Insight

Community Health Workers (CHWs) generate a huge amount of visit-level data: every household visit, assessment, and follow-up. On its own, this raw data is hard to use. Our CHW Monthly Activity Aggregation project turns those raw records into a clean monthly summary that's ready for dashboards and performance reviews.

At a high level, this project is a dbt (data build tool) project that reads a fact table of CHW activities and produces a single, analytics-ready table: public.chw_activity_monthly, with one row per CHW per reporting month.

What Problem Does This Project Solve?

  • Today's challenge: Managers often get raw logs (“John visited household 123 at 10:15”) instead of understandable summaries (“John visited 28 households in March, with 6 pregnancy visits”).

  • Our solution: We automatically group all visits into monthly summaries per CHW, applying agreed-upon business rules, so decision-makers see consistent, comparable metrics over time.

  • Source: public.fct_chw_activity (via dbt source marts.fct_chv_activity)

  • Output: public.chw_activity_monthly, incremental model keyed by ['chv_id', 'report_month']

  • Tech stack: dbt + Postgres, orchestrated via Docker

Project layout
Project layout

How the Data Flows

The project follows a simple story:

1. Start with raw visit data

Each row in fct_chv_activity is a single CHW activity: who visited, when, what type of visit, and where.

2. Clean and filter

We:

  • Drop records with missing CHW IDs
  • Drop records with missing activity dates
  • Exclude visits marked as deleted

3. Assign a “reporting month”

Instead of just using calendar months, we use a 26th-of-the-month rule:

  • Visits on the 1st to 25th belong to that month
  • Visits on the 26th or later are counted in the next month

This matches the business reporting cycle CHW programs often use.

4. Aggregate to monthly metrics

For each CHW and reporting month, we calculate:

  • total_activities
  • unique_households_visited
  • unique_patients_served
  • pregnancy_visits
  • child_assessments
  • family_planning_visits

5. Store the result as a reusable table

The final table public.chw_activity_monthly is what reporting tools (e.g., Power BI, Looker) will connect to.

Data Flow
A simple diagram (whiteboard or slide) showing arrows from fct_chv_activity → cleaning/filtering → month assignment → chw_activity_monthly with the key metrics.

Key Components

Macro: month_assignment(date_column) in macros/month_assignment.sql

Implements the 26th-day rule using SQL case + date_trunc and a + interval '1 month' when the day ≥ 26.

Model: models/starter_code/chw_activity_monthly.sql

  • materialized = 'incremental'
  • incremental_strategy = 'delete+insert'
  • unique_key = ['chv_id', 'report_month']

Within the model:

  • CTE raw

    • Selects from {{ source('marts', 'fct_chv_activity') }}
    • Filters out null activity_date, null chv_id, and is_deleted = true
    • For incremental runs, only processes months at or after the earliest report_month currently present
  • CTE assigned

    • Calls {{ month_assignment('activity_date') }} to compute report_month
  • CTE aggregated

    • Groups by chv_id, report_month
    • Computes counts and conditional sums for activity types

Model
chw_activity_monthly.sql model with the CTE structure visible and the config block at the top.

Business Rules in More Detail

  • Only valid, non-deleted records are included (no missing CHW, no missing date, no deleted events).
  • Report month:

    • A visit on March 24 counts in March.
    • A visit on March 27 counts in April.
  • Metrics per CHW per month:

    • Count total activities
    • Count unique households and unique patients
    • Break out program categories: pregnancy visits, child assessments, family planning visits

If the organization later changes how months are defined or what counts as, say, a “pregnancy visit,” we can adjust these rules centrally in the dbt code, and all downstream dashboards will automatically align.

fct\_chv\_activity
A small, anonymized excerpt of fct_chv_activity and the corresponding rows in chw_activity_monthly, to illustrate how several visits roll up into one summary row.

SELECT 
       chv_id,
       report_month,
       total_activities,
       unique_households_visited,
       unique_patients_served,
       pregnancy_visits,
       child_assessments,
       family_planning_visits
FROM public.chw_activity_monthly
ORDER BY chv_id, report_month
LIMIT 5;
Enter fullscreen mode Exit fullscreen mode

query\_output

Testing and Data Quality

The project uses dbt’s testing framework to keep the model trustworthy:

  • Not-null tests on key fields like chv_id and report_month
  • Uniqueness test that enforces each (chv_id, report_month) pair appears only once (via dbt_utils.unique_combination_of_columns)

This ensures you do not accidentally have duplicate rows or missing identifiers in your analytical table.

dbt test
Terminal output from dbt test --select chw_activity_monthly showing all tests passing.

How to Run the Project

These steps assume you have Docker installed and are running in the project root folder.

1. Start the Docker services

docker compose up -d
Enter fullscreen mode Exit fullscreen mode

This brings up:

  • A dbt runner container
  • A Postgres database with the CHW activity data loaded

Docker run
Terminal view showing docker compose up -d completing successfully, with containers listed via docker ps.

2. Open a shell in the dbt runner

docker exec -it dbt_runner bash
cd chw_project/
Enter fullscreen mode Exit fullscreen mode

Inside this container, you're now in the dbt/chw_project directory, where the dbt project is configured (via dbt_project.yml and profiles.yml).

3. Build the monthly activity model

Run:

dbt run
Enter fullscreen mode Exit fullscreen mode

dbt will:

  • Compile the SQL
  • Apply the filters and the month assignment
  • Materialize or update the public.chw_activity_monthly table

A successful run
Terminal output of dbt run, showing chw_activity_monthly as OK or SUCCESS with timing.

4. Run tests for the model

Still inside the dbt runner container:

dbt test --select chw_activity_monthly
Enter fullscreen mode Exit fullscreen mode

You should see the tests for:

  • Not-null constraints
  • Unique combination of chv_id + report_month

Summary output
Test summary output, highlighting that all tests are passing.

5. Inspect the final table in Postgres

From your host machine:

docker exec -it dbt_postgres bash
psql -U dbt_user -d analytics
Enter fullscreen mode Exit fullscreen mode

In the psql prompt:

SELECT *
FROM public.chw_activity_monthly
LIMIT 20;
Enter fullscreen mode Exit fullscreen mode

This shows sample rows for CHWs, each representing a single CHW’s activity in a reporting month.

psql result

Recommendations

Once the table exists and has passed tests, analytics or program teams can:

  • Point their BI tool to public.chw_activity_monthly
  • Build dashboards like:

    • CHW monthly productivity (activities per month)
    • Coverage (households visited by CHW and region)
    • Program mix (share of pregnancy vs. child vs. family-planning visits)
  • Track trends over time and identify:

    • Under-served areas (few households reached)
    • High-performing CHWs
    • Seasonal patterns in demand

Because all these dashboards share the same centrally defined table and business rules, reports across teams will be consistent and comparable.

Conclusion

This project turns complex CHW visit logs into a simple summary table, so you can clearly see who is doing what, where, and when.
It’s a dbt incremental model over fct_chv_activity, using a macro for the 26th-day reporting rule, strong record-level filtering, and dbt tests for key constraints.

The result is a reliable foundation for CHW performance analytics and program decision-making, with transparent and maintainable logic behind every number.


Top comments (0)