Mandar Nilange

Posted on Apr 30

Building a Lossless Multi-Provider Health Layer in Flutter (Strava, HealthKit, Health Connect, Oura)

#flutter #dart #opensource #healthtech

A deep dive into Health Forge, a federated, zero-backend Flutter toolkit that unifies HealthKit, Health Connect, Oura, and Strava (with Garmin next on the roadmap) without throwing away the metrics each provider is famous for.

If you have ever tried to build a serious health app in Flutter that pulls from more than one provider, you have probably had this exact moment of grief:

You wire up the popular health package. It works. You sync sleep from Apple Watch. You sync sleep from Oura. You feel productive. Then you realize Oura's Readiness Score — the single most useful number Oura produces — is gone. Not stored as null. Not flagged as unsupported. Just… not in the model. Because the unified schema doesn't have a slot for it.

So you start writing per-provider wrappers. Now you have one API for HealthKit, a different one for Strava, a third for Oura's REST API, and your "platform layer" is doing more dispatching than your business logic. One of those wrappers is GPL. Another vendors its own commercial backend. The package that promised to "just work" has metastasized.

This is the problem Health Forge was built to solve. It is an MIT-licensed, federated, zero-backend Flutter toolkit that aggregates health data from multiple providers into a unified model without losing the provider-specific metrics that make those providers worth integrating in the first place.

This article is the design walkthrough — the trade-offs, the architecture decisions, and the parts that genuinely surprised me on the way to v0.1.

Repo: https://github.com/mandarnilange/health_forge
Packages on pub.dev: health_forge_core, health_forge, health_forge_apple, health_forge_ghc, health_forge_oura, health_forge_strava

The Flutter health-data landscape, honestly

Before I started, I tried very hard not to write yet another health package. The existing options break into three families, and each one fails in a different way:

Approach	What it does well	Where it falls over
The `health` package	One API across HealthKit and Health Connect	Strips provider-specific metrics. No Oura, Strava, Garmin.
Individual provider wrappers	High fidelity per provider	Every wrapper has a different API surface, varying maintenance, occasional GPL contamination.
Commercial SDKs (Terra, Vital, Spike)	Genuinely good DX	Vendor lock-in, mandatory backend, closed source, monthly bill scaling with your users.

I wanted a fourth option: a unified model that doesn't lie to you about what the data was, with no backend, MIT-licensed, and federated so I only ship code for the providers I actually use.

So I wrote one.

Design goal #1: Don't strip provider metrics. Don't pollute the core either.

This is the central tension. A unified model is the whole point — but the moment you drop Oura's readinessScore or Strava's sufferScore, your "unified" app is no better than its lowest-common-denominator schema.

I solved this with a type-map extension mechanism sitting on top of every record:

abstract class ProviderExtension {
  Map<String, dynamic> toJson();
  String get typeKey;
}

mixin HealthRecordMixin {
  // ... envelope fields ...
  Map<Type, ProviderExtension> get extensions;

  T? extension<T extends ProviderExtension>() => extensions[T] as T?;
}

That extension<T>() method is the entire payoff. From your app code:

final sleep = result.records.whereType<SleepSession>().first;

// Common, normalized fields — works for any provider
print('Slept ${sleep.endTime.difference(sleep.startTime)}');

// Provider-specific magic, type-safe
final oura = sleep.extension<OuraSleepExtension>();
if (oura != null) {
  print('Readiness: ${oura.readinessScore}');
  print('Body temp deviation: ${oura.temperatureDeviation}°C');
}

The core schema doesn't know OuraSleepExtension exists. The Oura package registers it at init time:

ProviderExtensionRegistry.register(
  'oura_sleep',
  OuraSleepExtension.fromJson,
);

This is what made the difference. The unified SleepSession is yours. The Oura-flavored bonus data rides along, survives JSON serialization, survives the cache, survives an isolate hop, and disappears from your binary if you don't depend on health_forge_oura.

Trade-off worth naming out loud: extensions aren't compile-time guaranteed. If you forget to register an extension type at app startup, deserialization silently drops it. I considered an annotation-driven registry; for v0.1 I optimized for "tiny core, explicit registration." This will likely tighten in v0.2.
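To make that trade-off concrete, here is a minimal sketch of how a string-keyed registry could round-trip extensions through JSON. The names ExtensionRegistry, encode, and decode are hypothetical stand-ins written for this illustration, and the extension is trimmed to a single field; the package's actual registry may be shaped differently.

```dart
import 'dart:convert';

// Hypothetical stand-ins for the article's types, not the package's code.
abstract class ProviderExtension {
  String get typeKey;
  Map<String, dynamic> toJson();
}

class OuraSleepExtension implements ProviderExtension {
  OuraSleepExtension({required this.readinessScore});

  factory OuraSleepExtension.fromJson(Map<String, dynamic> json) =>
      OuraSleepExtension(readinessScore: json['readinessScore'] as int);

  final int readinessScore;

  @override
  String get typeKey => 'oura_sleep';

  @override
  Map<String, dynamic> toJson() => {'readinessScore': readinessScore};
}

typedef ExtensionDecoder = ProviderExtension Function(Map<String, dynamic>);

class ExtensionRegistry {
  final Map<String, ExtensionDecoder> _decoders = {};

  void register(String typeKey, ExtensionDecoder decoder) =>
      _decoders[typeKey] = decoder;

  /// Serializes each extension under its typeKey.
  String encode(Iterable<ProviderExtension> extensions) =>
      jsonEncode({for (final e in extensions) e.typeKey: e.toJson()});

  /// Rebuilds the type map. Entries whose typeKey has no registered
  /// decoder are skipped, which is the silent drop described above.
  Map<Type, ProviderExtension> decode(String source) {
    final raw = jsonDecode(source) as Map<String, dynamic>;
    final result = <Type, ProviderExtension>{};
    for (final entry in raw.entries) {
      final decoder = _decoders[entry.key];
      if (decoder == null) continue;
      final ext = decoder(entry.value as Map<String, dynamic>);
      result[ext.runtimeType] = ext;
    }
    return result;
  }
}
```

Decoding the same payload before and after calling register('oura_sleep', OuraSleepExtension.fromJson) yields an empty map and then a populated one, with no error in either case. A stricter variant could throw or log on an unknown typeKey instead of skipping it.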

Design goal #2: Federate the providers

If you only use Apple Health, you should not pay for Oura's HTTP client, OAuth flow, and rate limiter. Conversely, when Garmin's API has a breaking change, your Apple-only app should not be forced into a coordinated upgrade.

The dependency graph is enforced as a hard rule:

health_forge_core (pure Dart)
        ▲
        │
        │ depended on by
        │
   ┌────┴───────────────────────────────┐
   │                                    │
health_forge (Flutter)         health_forge_{provider}
                                       │
                                       │ never depends on
                                       ▼
                                 each other or on health_forge

Concretely:

health_forge_core is pure Dart with no Flutter imports. It runs in an isolate. It runs on a server. It runs in a Dart CLI tool.
health_forge is the Flutter client — registry, auth orchestrator, query builder, cache, sync.
Provider packages depend on health_forge_core and nothing else from this repo.

The mono-repo is managed with melos, which means a single command bootstraps everything and runs analyze/test across all eight packages:

dart pub global activate melos
dart run melos bootstrap
dart run melos run analyze   # zero warnings required
dart run melos run test

Why this matters in practice: when I want to ship the Garmin adapter, I don't need to coordinate a release of health_forge_apple. I bump health_forge_garmin and that's it. Independent versioning is a feature.

The ugly side: cross-package envelope changes (e.g., adding a new Provenance field) require coordinated releases. The solution there is discipline — bump core, then sweep the providers, then bump the Flutter client. Annoying, but predictable.

Design goal #3: Conflict resolution that you can actually audit

Here's a fun problem nobody warns you about: if your user wears an Apple Watch and an Oura ring, both will report sleep. Both will report heart rate. Sometimes the windows overlap. Sometimes they conflict.

The naïve answer is "let the user pick a primary device." That's not good enough — Apple Watch is better at workouts, Oura is better at overnight metrics. The right answer depends on the metric.

Health Forge ships a MergeEngine with five built-in strategies and an audit trail:

final mergeConfig = MergeConfig(
  defaultStrategy: ConflictStrategy.priorityBased,
  strategiesByMetric: {
    MetricType.sleepSession: ConflictStrategy.priorityBased,
    MetricType.heartRate:    ConflictStrategy.mostGranular,
    MetricType.weight:       ConflictStrategy.average,
  },
  providerPriorities: {
    MetricType.sleepSession: [
      DataProvider.oura,
      DataProvider.apple,
      DataProvider.ghc,
    ],
  },
  timeOverlapThresholdSeconds: 300,
);

final result = await QueryExecutor(
  registry: forge.registry,
  mergeEngine: MergeEngine(config: mergeConfig),
).execute(query);

print('Resolved: ${result.records.length}');
print('Conflicts: ${result.conflictReport.entries.length}');
for (final entry in result.conflictReport.entries) {
  print('  ${entry.metric}: dropped ${entry.dropped.length}, '
        'kept by ${entry.strategy}');
}

The five strategies:

Strategy	When to use it
`priorityBased`	You know which provider you trust per metric. Most apps live here.
`keepAll`	You don't want to throw away anything; render with attribution.
`average`	Aggregate dashboards over numeric metrics like weight or resting heart rate (RHR).
`mostGranular`	Prefer 1 Hz heart-rate series over a 5-minute summary.
`custom`	Pass a callback. Domain-specific logic for clinical apps.

The non-negotiable feature: every merge decision lands in a ConflictReport. You can render it, log it, or surface it to the user. Health data is high-stakes — silent dedup is a bug, not a feature.
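As an illustration of what a priority-based merge with an audit trail involves, here is a simplified sketch. Sample, ConflictEntry, MergeResult, and mergeByPriority are hypothetical names invented for this example, and the threshold is read here as "minimum shared time for two records to count as the same event", which is an assumption about how timeOverlapThresholdSeconds behaves.

```dart
// Hypothetical, simplified types; the real MergeEngine API may differ.
class Sample {
  const Sample(this.provider, this.start, this.end);
  final String provider;
  final DateTime start;
  final DateTime end;
}

class ConflictEntry {
  ConflictEntry(this.kept);
  final Sample kept;
  final List<Sample> dropped = [];
}

class MergeResult {
  MergeResult(this.records, this.conflicts);
  final List<Sample> records;
  final List<ConflictEntry> conflicts;
}

/// Two samples conflict when they share at least [threshold] of time.
bool overlaps(Sample a, Sample b, Duration threshold) {
  final start = a.start.isAfter(b.start) ? a.start : b.start;
  final end = a.end.isBefore(b.end) ? a.end : b.end;
  return end.difference(start) >= threshold;
}

/// Keeps the highest-priority sample from each overlapping group and
/// records every dropped sample instead of discarding it silently.
MergeResult mergeByPriority(
  List<Sample> samples,
  List<String> priority, {
  Duration threshold = const Duration(seconds: 300),
}) {
  int rank(Sample s) {
    final i = priority.indexOf(s.provider);
    return i == -1 ? priority.length : i; // unknown providers rank last
  }

  // Visit samples best-first so a kept sample is never outranked later.
  final sorted = [...samples]..sort((a, b) => rank(a).compareTo(rank(b)));
  final entries = <ConflictEntry>[];
  for (final s in sorted) {
    final match = entries.where((e) => overlaps(e.kept, s, threshold));
    if (match.isEmpty) {
      entries.add(ConflictEntry(s));
    } else {
      match.first.dropped.add(s);
    }
  }
  return MergeResult(
    [for (final e in entries) e.kept],
    [for (final e in entries) if (e.dropped.isNotEmpty) e],
  );
}
```

With an Oura night and an overlapping Apple night, plus an unrelated Apple workout, this keeps the Oura night and the workout and reports one conflict listing the dropped Apple night. The lookup is quadratic, which is fine for a sketch but not for thousands of samples.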

A real lesson from building this: the merge engine had to be pure Dart, no exceptions. I want to run it in an isolate when a query returns thousands of records. The moment a Flutter type sneaks into core, isolate transfer breaks. ADR-0004 ("pure Dart core") sounds dogmatic on paper; on the ground, it's the only thing that lets the merge engine scale.

The custom strategy is the one place this leaks. A custom callback runs on the calling isolate, because Dart can send a closure to another isolate only when everything it captures is itself sendable, and the engine can't guarantee that for arbitrary user code. That's a documented trade-off, not a bug.
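For the built-in strategies, handing work to a background isolate can be as small as the sketch below. It uses Isolate.run from dart:isolate (available since Dart 2.19) and a record type (Dart 3). Reading, averageReadings, and averageInBackground are hypothetical names, and the "merge" is reduced to a plain average so the example stays short.

```dart
import 'dart:isolate';

// Hypothetical record shape built from plain Dart types only, so a list
// of them can be copied to another isolate.
typedef Reading = ({String provider, double value});

/// Top-level function with no captured state, safe to run in any isolate.
double averageReadings(List<Reading> readings) {
  if (readings.isEmpty) return double.nan;
  final total = readings.fold<double>(0, (sum, r) => sum + r.value);
  return total / readings.length;
}

/// Runs the computation on a short-lived background isolate. The closure
/// captures only `readings`, which holds sendable data.
Future<double> averageInBackground(List<Reading> readings) =>
    Isolate.run(() => averageReadings(readings));
```

The pattern holds only while every type in the payload stays pure Dart, which is the practical reason for keeping Flutter types out of the core.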

Design goal #4: Make REST adapters as easy as native SDK adapters

Health Forge has two flavors of provider:

Native SDK adapters: health_forge_apple (HealthKit) and health_forge_ghc (Health Connect). These bridge to platform code through Pigeon-generated, type-safe platform channels.
REST API adapters: health_forge_oura and health_forge_strava. These speak HTTP.

The risk was that REST adapters would feel like second-class citizens — different auth, different errors, different caching. Instead, every adapter implements the same interface:

abstract class HealthProvider {
  DataProvider get providerType;
  ProviderCapabilities get capabilities;

  Future<AuthResult> authorize();
  Future<void> deauthorize();
  Stream<HealthRecord> query(HealthQuery query);
}

The REST adapter pattern (ADR-0007) standardizes the messy parts: OAuth 2.0 with PKCE, a Dio HTTP client wrapped in a rate limiter, paginated streaming, and a token store that persists across launches.
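Paginated streaming is the piece that maps most directly onto the Stream-returning query method. A rough sketch, assuming a cursor-style endpoint: Page and paginate are hypothetical names, and the fetchPage callback stands in for whatever HTTP call the adapter makes.

```dart
/// Hypothetical page shape returned by a REST endpoint.
typedef Page<T> = ({List<T> items, String? nextToken});

/// Emits items one page at a time. The next page is requested only after
/// the current page's items have been delivered, so a listener that
/// cancels early also stops further fetches.
Stream<T> paginate<T>(
  Future<Page<T>> Function(String? token) fetchPage,
) async* {
  String? token; // null requests the first page
  do {
    final page = await fetchPage(token);
    yield* Stream.fromIterable(page.items);
    token = page.nextToken;
  } while (token != null);
}
```

A caller can consume this with await for, or collect it with toList() in tests by passing a fake fetchPage that serves pages from a map.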

Strava has the most interesting twist — its public API rate-limits on two dimensions simultaneously (15-minute and daily windows). The DualRateLimiter is one of the parts of the codebase I'm proudest of, and it's also the kind of thing you would never want to write three times. Putting it in a shared adapter pattern means the next REST provider (Garmin? Whoop? Polar?) gets it for free.
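The idea behind a two-window limiter can be sketched in a few dozen lines. This is not the package's DualRateLimiter; DualWindowLimiter and its parameters are hypothetical, the limits are left to the caller rather than hard-coded, and it uses sliding windows, which are stricter than limits that reset on fixed clock boundaries.

```dart
import 'dart:collection';

/// One sliding window: at most [limit] requests per [span].
class _Window {
  _Window(this.limit, this.span);
  final int limit;
  final Duration span;
  final Queue<DateTime> _hits = Queue();

  /// Drops expired hits, then returns how long to wait (zero if free).
  Duration waitTime(DateTime now) {
    while (_hits.isNotEmpty && now.difference(_hits.first) >= span) {
      _hits.removeFirst();
    }
    if (_hits.length < limit) return Duration.zero;
    return span - now.difference(_hits.first);
  }

  void record(DateTime now) => _hits.addLast(now);
}

/// Hypothetical dual limiter: a request passes only if both windows allow it.
class DualWindowLimiter {
  DualWindowLimiter({
    required int shortLimit,
    required int dailyLimit,
    DateTime Function()? clock, // injectable so tests can control time
  })  : _short = _Window(shortLimit, const Duration(minutes: 15)),
        _daily = _Window(dailyLimit, const Duration(days: 1)),
        _clock = clock ?? DateTime.now;

  final _Window _short;
  final _Window _daily;
  final DateTime Function() _clock;

  /// Returns true and counts the request, or false if either window is full.
  bool tryAcquire() {
    final now = _clock();
    if (retryAfter(now) > Duration.zero) return false;
    _short.record(now);
    _daily.record(now);
    return true;
  }

  /// The longer of the two waits, since both windows must have room.
  Duration retryAfter([DateTime? at]) {
    final now = at ?? _clock();
    final a = _short.waitTime(now);
    final b = _daily.waitTime(now);
    return a > b ? a : b;
  }
}
```

A request is counted against both windows at once, and retryAfter reports the longer wait, so a caller never retries into the window that is still full. Injecting the clock keeps the class testable without real delays.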

What the application code actually looks like

Enough architecture. Here's the developer-facing surface for an app that wants Apple + Oura sleep:

# pubspec.yaml — only what you use
dependencies:
  health_forge: ^0.1.1
  health_forge_apple: ^0.1.1
  health_forge_oura: ^0.1.1

import 'package:health_forge/health_forge.dart';
import 'package:health_forge_apple/health_forge_apple.dart';
import 'package:health_forge_oura/health_forge_oura.dart';

Future<void> main() async {
  // Register extensions before constructing the client
  HealthForgeOura.init();

  final forge = HealthForgeClient();
  forge.use(AppleHealthProvider());
  forge.use(OuraHealthProvider(
    clientId: const String.fromEnvironment('OURA_CLIENT_ID'),
    redirectUri: 'myapp://oauth/oura',
  ));

  // One call kicks off every provider's auth flow
  await forge.auth.authorizeAll();

  // Build a query — fluent, type-safe, isolate-friendly
  final query = (forge.query()
        ..forMetrics([MetricType.sleepSession])
        ..inRange(TimeRange(
          start: DateTime.now().subtract(const Duration(days: 30)),
          end: DateTime.now(),
        )))
      .build();

  final executor = QueryExecutor(
    registry: forge.registry,
    mergeEngine: MergeEngine(
      config: const MergeConfig(
        defaultStrategy: ConflictStrategy.priorityBased,
        providerPriorities: {
          MetricType.sleepSession: [DataProvider.oura, DataProvider.apple],
        },
      ),
    ),
  );

  final result = await executor.execute(query);

  for (final session in result.records.whereType<SleepSession>()) {
    final oura = session.extension<OuraSleepExtension>();
    print('${session.startTime} | '
          'duration=${session.endTime.difference(session.startTime).inHours}h | '
          'readiness=${oura?.readinessScore ?? "—"}');
  }
}

A few things worth noting:

The health_forge_oura package brought along its OAuth flow, rate limiter, and extension registration. None of that is in your code.
The extension<T>() call returns null for Apple records, which carry no OuraSleepExtension, so the print statement degrades gracefully.
The merge engine ran on an isolate-safe code path; for a query that returns thousands of samples, you can hand the merge step off to a background isolate entirely.
Caching is opt-in via CacheManager. The default is in-memory; the Drift-backed DriftCacheManager gives you SQLite-backed offline-first behavior with no extra server. ADR-0006 covers the schema if you're curious.

What I learned that wasn't on the architecture diagrams

A few things only became clear after the code was running:

1. freezed and inheritance don't mix, but mixins are fine. I started by trying to give every record a base class with the envelope fields. freezed generates classes that can't extend other freezed classes. The fix was a mixin (HealthRecordMixin) with abstract getters that each freezed factory implements. Slight boilerplate per record, but full code-gen compatibility and no inheritance lock-in.

2. Zero-backend means *zero backend.* I had to talk myself out of "but what if we just had a tiny edge function for token refresh…" several times. Every concession to a backend changes the licensing story, the privacy story, and the "MIT toolkit you can ship in a paid app" story. Tokens live in flutter_secure_storage. Refreshes happen on-device. Done.

3. Coverage gates are load-bearing. The repo enforces ≥90% line coverage per package in CI (excluding generated files). When I refactored the merge engine in week three, that gate caught two regressions a more permissive threshold would have missed. For a library where downstream consumers can't easily fork, the test suite is the API contract.

4. ADRs paid for themselves twice. When I came back to the project after a two-week gap, the seven ADRs in design/adr/ reminded me why the core was pure Dart, why extensions used a type map instead of inheritance, and why the cache schema was denormalized. I almost talked myself into "fixing" each of those decisions before reading the rationale.
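The mixin approach from point 1 can be shown without the code generator. In this sketch the two record classes are hand-written stand-ins for what freezed would generate, and RecordEnvelope, SleepRecord, and WorkoutRecord are hypothetical names rather than the package's types.

```dart
// Hypothetical envelope mixin: abstract getters plus shared behavior.
mixin RecordEnvelope {
  String get id;
  DateTime get startTime;
  DateTime get endTime;

  /// Shared logic written once against the abstract getters.
  Duration get duration => endTime.difference(startTime);
}

// Neither class extends the other; each satisfies the envelope by
// declaring final fields that implement the mixin's getters.
class SleepRecord with RecordEnvelope {
  SleepRecord({
    required this.id,
    required this.startTime,
    required this.endTime,
  });

  @override
  final String id;
  @override
  final DateTime startTime;
  @override
  final DateTime endTime;
}

class WorkoutRecord with RecordEnvelope {
  WorkoutRecord({
    required this.id,
    required this.startTime,
    required this.endTime,
    required this.sport,
  });

  @override
  final String id;
  @override
  final DateTime startTime;
  @override
  final DateTime endTime;

  final String sport; // field specific to this record type
}
```

Both classes can be handled as RecordEnvelope, so code that only needs the envelope (sorting by startTime, computing duration) never has to know the concrete record type.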

What's next

Status as of v0.1.1:

Package	Status
`health_forge_core`	Stable — 21 record types, merge engine, 5 strategies
`health_forge`	Stable — registry, auth, queries, in-memory + Drift cache
`health_forge_apple`	Device-tested on iOS
`health_forge_ghc`	Device-tested on Android
`health_forge_oura`	Code-complete, fully unit-tested, awaiting end-to-end testing against live API
`health_forge_strava`	Code-complete, fully unit-tested, awaiting end-to-end testing against live API
`health_forge_garmin`	Next on the roadmap

The example app in example/ is a working Flutter dashboard with real OAuth flows for Oura and Strava (via deep links), mock providers for desktop development, a 10-card metric dashboard, and a query/browse screen.

If any of this is useful to you, I'd love to hear:

What providers do you actually need? (Polar, Whoop, Withings have all been suggested.)
Do the conflict-resolution strategies cover your use case, or do you need something the custom callback can't express?
How does your team feel about running the merge engine on a background isolate vs. on the main one?

The repo is at https://github.com/mandarnilange/health_forge. Issues, PRs, and "your design here is wrong because…" comments are all genuinely welcome. The contributor guide enforces TDD, zero analyzer warnings, and an ADR for any architecture-level change — same rules I held myself to.

Thanks for reading.

If you build something with Health Forge — or if it makes you want to build a competing thing — drop a comment. I learn more from the second category than the first.