Beck_Moulton

Ingesting 100M Heartbeats: Scaling Wearable Tech Without Going Broke

So, you’ve got a "million-dollar idea" for a health monitoring SaaS. You’re going to track heart rates, HRV, and stress levels from smartwatches in real-time. It sounds amazing on a pitch deck.

Then you write the MVP. You spin up a standard Postgres instance on AWS RDS. You get 100 beta testers. Everything is smooth.

Then you get 10,000 users.

Suddenly, your database CPU is pegged at 100%, your IOPS credits are gone, and your AWS bill looks like a ransom note. Why? Because writing 10,000 rows every single second is a completely different beast than a typical CRUD app.

Here is how we tackle the hidden infrastructure costs of continuous health monitoring without bankrupting a startup.

The Math of "Continuous"

Let’s be real about the numbers. If a device sends a heartbeat payload once per second (1 Hz):

  • 1 User = 86,400 writes/day.
  • 1,000 Users = 86.4 million writes/day.
  • Payload size: Even a tiny 100-byte JSON packet means gigabytes of ingest traffic daily.
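A quick back-of-envelope script makes the growth curve concrete (the `dailyIngest` helper and the 100-byte payload size are just the assumptions above, not real production numbers):

```javascript
// Back-of-envelope ingest math for a 1 Hz device stream.
const HZ = 1;                 // one sample per second
const SECONDS_PER_DAY = 86_400;
const PAYLOAD_BYTES = 100;    // tiny JSON packet

function dailyIngest(users) {
  const writes = users * HZ * SECONDS_PER_DAY;
  const gigabytes = (writes * PAYLOAD_BYTES) / 1e9;
  return { writes, gigabytes };
}

console.log(dailyIngest(1));      // → { writes: 86400, gigabytes: 0.00864 }
console.log(dailyIngest(10_000)); // → { writes: 864000000, gigabytes: 86.4 }
```

At 10,000 users that's 864 million writes and ~86 GB of raw ingest traffic per day, before indexes or replication.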

Standard relational databases (like MySQL or vanilla Postgres) are optimized for transactional integrity (ACID), not for absorbing millions of tiny writes per second. The B-Tree indexing overhead alone will kill your write throughput.

Strategy 1: Stop the "Chatty" Protocol (Batching)

The first mistake is treating a wearable like a chat app. Do not open a WebSocket or API request for every single heartbeat. The network overhead (TCP handshakes, HTTP headers) is often larger than the data itself.

The Fix: Buffer on the device.
Ideally, the device should collect 60 seconds of data and send it in one compressed chunk.
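Here is a minimal sketch of what that client-side buffer could look like. `HeartbeatBuffer` and `sendBatch` are hypothetical names; `sendBatch` stands in for whatever transport you actually use (HTTPS POST, MQTT publish, etc.):

```javascript
// Minimal client-side buffer: collect readings locally and flush them
// as one batch instead of one request per heartbeat.
class HeartbeatBuffer {
  constructor(sendBatch, flushEveryMs = 60_000) {
    this.sendBatch = sendBatch;
    this.points = [];
    this.timer = setInterval(() => this.flush(), flushEveryMs);
  }

  add(point) {
    this.points.push(point);
  }

  async flush() {
    if (this.points.length === 0) return;
    const batch = this.points;
    this.points = []; // swap before the await so new points aren't lost
    await this.sendBatch(batch); // one request for ~60 points, not 60 requests
  }

  stop() {
    clearInterval(this.timer);
  }
}
```

In practice you'd also gzip the batch before sending, since 60 near-identical JSON objects compress extremely well.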

On the backend, instead of INSERTing one row at a time, use bulk inserts. Here is a sloppy but effective example in Node.js showing the difference:

```javascript
// DON'T DO THIS
// (forEach doesn't await the callback, so this fires thousands of
// concurrent queries — one round trip and one transaction per point)
data.points.forEach(async (point) => {
  await db.query('INSERT INTO heartbeats VALUES (...)', [point]); // RIP Database
});

// DO THIS
const format = require('pg-format');

// pg-format's %L expands the nested array into a safely escaped
// multi-row VALUES list
const values = data.points.map(p => [p.userId, p.value, p.timestamp]);
const query = format('INSERT INTO heartbeats (user_id, bpm, time) VALUES %L', values);

await db.query(query); // One network round trip, one transaction
```

This simple change reduces your IOPS (Input/Output Operations Per Second) by a factor of ~50-100, because 60 points become a single insert.

Strategy 2: Use a Time-Series Database (TSDB)

I love Postgres. But for raw metrics, you need a store built for append-only, time-ordered data.

Tools like TimescaleDB (which sits on top of Postgres) or InfluxDB are lifesavers here. They use "hypertables" or specific storage engines that partition data by time chunks.

Why does this save money? Compression.

Time-series data is highly repetitive.
```
{ time: 10:00:01, bpm: 70 }
{ time: 10:00:02, bpm: 70 }
{ time: 10:00:03, bpm: 71 }
```

TSDBs use delta-of-delta compression. Instead of storing the full timestamp and full integer every time, they store the tiny changes between them. We’ve seen storage costs drop by 90% just by switching from vanilla Postgres tables to compressed hypertables.
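A toy illustration of the idea (this is not TimescaleDB's actual on-disk format, just the principle): store the first value, then only how the *change* changes. Regular 1 Hz timestamps collapse into runs of zeros, which any compressor eats for breakfast.

```javascript
// Toy delta-of-delta encoder: keep the first value, then the
// difference between consecutive deltas.
function deltaOfDelta(values) {
  const out = [values[0]];
  let prevDelta = 0;
  for (let i = 1; i < values.length; i++) {
    const delta = values[i] - values[i - 1];
    out.push(delta - prevDelta);
    prevDelta = delta;
  }
  return out;
}

const timestamps = [1000, 1001, 1002, 1003, 1004];
console.log(deltaOfDelta(timestamps)); // → [1000, 1, 0, 0, 0]
```

Five timestamps become one value plus four tiny integers, which is why near-constant streams like resting heart rate shrink so dramatically.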

If you want to read more technical deep dives on database selection, I publish frequent updates on my tech guides and tutorials.

Strategy 3: The Art of Downsampling (Rollups)

Here is the hard truth: Nobody needs second-by-second resolution from 3 months ago.

A doctor might need granular data for yesterday's arrhythmia event, but for the trend report from last year? They just need the daily average, min, and max.

The Fix: Continuous Aggregates.

Don't calculate averages on the fly (that’s slow). Pre-calculate them as data comes in and drop the raw data later.

In SQL (Timescale syntax), it looks a bit like this:

```sql
-- Create a view that updates automatically
CREATE MATERIALIZED VIEW hourly_heartrate
WITH (timescaledb.continuous) AS
SELECT
  time_bucket('1 hour', time) AS bucket,
  user_id,
  avg(bpm) AS avg_bpm,
  max(bpm) AS max_bpm
FROM heartbeats
GROUP BY bucket, user_id;

-- Add a retention policy to delete raw data after 7 days
SELECT add_retention_policy('heartbeats', INTERVAL '7 days');
```

Now, your storage doesn't grow infinitely. You keep high-res data for a week (for the immediate alerts) and low-res data forever (for the long-term trends).

Conclusion

Building for "Continuous Monitoring" isn't just about code; it's about physics and economics.

  1. Buffer on the client to save network calls.
  2. Batch on the server to save IOPS.
  3. Compress with a TSDB to save disk space.
  4. Downsample old data to keep the database fast.

It’s tempting to over-engineer, but usually, a disciplined data lifecycle policy is worth more than a fancy Kubernetes cluster.

Have you dealt with high-frequency ingestion before? Let me know in the comments how you handled the load!

Happy coding. 🚀
