Building the future for benefits tracking

#webdev #typescript #automation #nextjs

Most people have no idea what's left in their benefits plan. Your employer pays for dental, vision, massage, and more, and at the end of the year whatever you didn't use just disappears. I rarely knew what I had left, and when I did I'd forget to use it before it reset, so I built something to fix that.

BenefitTrack emails you every month with exactly how much you have left to spend across all your benefits, so you stop losing money to coverage you already paid for.

The core problem

BenefitTrack runs a monthly browser automation against insurance carrier portals that change without warning and have no public API. What's difficult is that we don't control when those portals change. A portal can restructure a page, change how it signals a successful login, or drop a new modal on the home screen, and we find out only when something breaks at 8 AM on the first of the month.

Why Stagehand over selectors

The obvious approach is a traditional scraper: find the login form, click the button by its CSS class, wait for a redirect, and parse the balances. That works right up until the portal changes its markup, at which point the scrape either crashes or, worse, succeeds against a page that no longer means what you think it does.

To mitigate this we chose to build with Stagehand, a browser automation framework from Browserbase that lets you describe what you want in plain language instead of hardcoding selectors.

// selector-based: breaks if the class name or DOM structure changes
await page.click('.btn-primary.submit-login');

// Stagehand: works as long as there's a sign in button on the page
await stagehand.act('click the sign in button');

The catch is that every LLM call costs time and money. A full scrape takes a few minutes and costs more than a selector-based one. On a portal that can change at any time without notice, that extra cost pays for itself the first time the scrape survives a change you didn't see coming.

The stack

We're building BenefitTrack on Next.js, NestJS, Supabase, and Vercel. Together they let us move fast and keep costs near zero while we're still getting off the ground.

The one genuinely interesting part is how we handle long-running scrapes. A full portal login with MFA can run several minutes, well past a normal serverless timeout. Vercel's waitUntil gets around it by letting the request return right away while the work keeps going in the background.

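import { Body, Controller, HttpCode, Post } from '@nestjs/common';
import { waitUntil } from '@vercel/functions';
// ScrapeService and ScrapeRequestDto are app-local; their import paths are omitted here.

@Controller('api/scrape') // route prefix assumed from the /api/scrape path the orchestrator calls
export class ScrapeController {
  constructor(private readonly scrapeService: ScrapeService) {}

  @Post()
  @HttpCode(202)
  run(@Body() body: ScrapeRequestDto) {
    // keep the function alive after the response is sent
    waitUntil(
      this.scrapeService.run(body).catch(() => {
        // errors are already logged + persisted to scrape_tasks by the service
      }),
    );

    return { accepted: true };
  }
}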

The orchestrator fires a POST to /api/scrape for each employee, gets a 202 back, and moves on. Each scrape runs on its own and writes its own result to the database. However, if a function dies mid-scrape there's no recovery, and there's a hard 800-second ceiling per invocation. Both are fine for now, but at real scale a durable job queue would be ideal.

Handling credentials

Scraping a carrier portal means holding a user's login for that portal, which is about as sensitive as data gets. Credentials are encrypted at rest and only decrypted in memory for the length of a single scrape, and they never get written to logs or error traces.
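
To make "encrypted at rest, decrypted only in memory" concrete, here is a minimal sketch using Node's built-in crypto module with AES-256-GCM. The algorithm choice, the EncryptedCredential shape, and both function names are assumptions for illustration, not a description of what BenefitTrack actually runs.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Hypothetical stored shape; not BenefitTrack's actual schema.
interface EncryptedCredential {
  iv: string;         // base64, unique per encryption
  authTag: string;    // base64, GCM integrity tag
  ciphertext: string; // base64
}

// Encrypt with AES-256-GCM. `key` must be 32 bytes and should come from a secret store.
function encryptCredential(plaintext: string, key: Buffer): EncryptedCredential {
  const iv = randomBytes(12); // 96-bit nonce, the recommended size for GCM
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  return {
    iv: iv.toString('base64'),
    authTag: cipher.getAuthTag().toString('base64'),
    ciphertext: ciphertext.toString('base64'),
  };
}

// Decrypt in memory only; throws if the ciphertext or tag has been tampered with.
function decryptCredential(stored: EncryptedCredential, key: Buffer): string {
  const decipher = createDecipheriv('aes-256-gcm', key, Buffer.from(stored.iv, 'base64'));
  decipher.setAuthTag(Buffer.from(stored.authTag, 'base64'));
  return Buffer.concat([
    decipher.update(Buffer.from(stored.ciphertext, 'base64')),
    decipher.final(),
  ]).toString('utf8');
}
```

In a setup like this, the decrypted string would live only in a local variable for the duration of one scrape and would never be passed to a logger or attached to an error.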

Concurrency and reliability

Once a few dozen scrapes are running at once, there's a lot more that can go wrong than in a single run.

First is Browserbase itself: our current plan allows 25 concurrent sessions and 25 new sessions per minute. We stay under those limits on our side with a Bottleneck limiter configured as a token bucket.

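import Bottleneck from 'bottleneck';

const limiter = new Bottleneck({
  maxConcurrent: 25,
  reservoir: 25,
  reservoirRefreshAmount: 25,
  reservoirRefreshInterval: 60_000, // reset the reservoir to 25 tokens every minute
});

await Promise.all(
  employeeCarriers.map((ec) =>
    limiter.schedule(async () => {
      // payload is the scrape request for this employee-carrier (built from ec; construction not shown)
      await fetch(`${baseUrl}/api/scrape`, {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' }, // so the server parses the body as JSON
        body: JSON.stringify(payload),
      });
    }),
  ),
);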

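Each scheduled dispatch spends one token, and the reservoir resets to 25 every minute. Exceed either limit and Browserbase answers with a 429. One caveat: because each dispatch returns as soon as the 202 arrives, maxConcurrent here caps in-flight dispatch requests rather than live browser sessions, so it is the reservoir that actually paces session creation.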

The next problem is avoiding running the same scrape twice. Re-triggering the job is essential when something fails halfway, so the system has to know who already succeeded and leave them alone. Before dispatching, it looks up every scrape_task with status: 'success' in the current calendar month and queues only the employees without one.
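
The pre-dispatch filter can be sketched as a pure function. The row shape, the field names employeeCarrierId and createdAt, the helper names, and the choice of UTC for the month boundary are all assumptions for illustration; only scrape_task and its success status come from the description above.

```typescript
// Hypothetical row shape for a scrape_task record.
interface ScrapeTask {
  employeeCarrierId: string;
  status: 'queued' | 'running' | 'success' | 'failed';
  createdAt: Date;
}

// First instant of the calendar month containing `now` (UTC is an assumption).
function startOfMonthUtc(now: Date): Date {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), 1));
}

// Return only the ids with no successful task in the current calendar month.
function selectPending(allIds: string[], tasks: ScrapeTask[], now: Date): string[] {
  const monthStart = startOfMonthUtc(now).getTime();
  const done = new Set(
    tasks
      .filter((t) => t.status === 'success' && t.createdAt.getTime() >= monthStart)
      .map((t) => t.employeeCarrierId),
  );
  return allIds.filter((id) => !done.has(id));
}
```

In a real system the filter would more likely be a database query than an in-memory pass, but the rule is the same: a success from this month excludes the employee, and anything else leaves them in the queue.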

The email send has its own check, and that separation is deliberate. Before sending, it looks for a notification already sent that month for that employee. The two have separate checks because they fail independently: a scrape can succeed while the email fails, or an email can go out before a scrape ever finishes. A single combined check would let a failure on one side quietly block the other.

Sessions hang too. Browserbase times out or a portal stalls while loading a page, and the task just sits in running. An hourly cron job finds anything stuck in running or queued for more than 30 minutes, marks it failed, and sends it back out. A loop guard stops the re-dispatch after two sweep-marked failures in a month, since otherwise a genuinely broken scraper would keep retrying itself until the month ran out.
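
Here is a rough sketch of the sweep decision as a pure function. The task shape, the failedBySweep flag, and the planSweep name are hypothetical, and I am reading the loop guard as "the second sweep-marked failure in a month is not re-dispatched", which is one interpretation of the rule above.

```typescript
// Hypothetical task shape; only the status values come from the article.
interface SweepTask {
  id: string;
  employeeCarrierId: string;
  status: 'queued' | 'running' | 'success' | 'failed';
  updatedAt: Date;
  failedBySweep: boolean; // true when an earlier sweep marked this task failed
}

interface SweepPlan {
  markFailed: string[]; // task ids to mark as failed
  redispatch: string[]; // employeeCarrierIds to send back out
}

const STUCK_AFTER_MS = 30 * 60 * 1000; // 30 minutes
const MAX_SWEEP_FAILURES = 2;          // loop guard threshold per month

function planSweep(tasksThisMonth: SweepTask[], now: Date): SweepPlan {
  const plan: SweepPlan = { markFailed: [], redispatch: [] };

  // Count sweep-marked failures already recorded this month.
  const sweepFailures = new Map<string, number>();
  for (const t of tasksThisMonth) {
    if (t.status === 'failed' && t.failedBySweep) {
      sweepFailures.set(t.employeeCarrierId, (sweepFailures.get(t.employeeCarrierId) ?? 0) + 1);
    }
  }

  for (const t of tasksThisMonth) {
    const inFlight = t.status === 'running' || t.status === 'queued';
    const stuck = inFlight && now.getTime() - t.updatedAt.getTime() > STUCK_AFTER_MS;
    if (!stuck) continue;

    plan.markFailed.push(t.id);
    const failures = (sweepFailures.get(t.employeeCarrierId) ?? 0) + 1; // include this one
    sweepFailures.set(t.employeeCarrierId, failures);
    if (failures < MAX_SWEEP_FAILURES) plan.redispatch.push(t.employeeCarrierId);
  }
  return plan;
}
```

Keeping the decision separate from the database writes makes the guard easy to test without a cron job or a live session.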

And scrapes can lie. The session finishes, the LLM hands back a result, and the data is still wrong: empty benefit types, negative amounts, amountUsed > totalAmount. So before anything is saved, the service validates every field and throws on the first bad one. That matters because of idempotency: a scrape_task marked success blocks the retry path. If bad data gets written as a success, the employee gets a wrong email and nothing ever retries it. Failing the task and running it again is better than saving a wrong result that never gets corrected.
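
A minimal sketch of that validation step is below. amountUsed and totalAmount are the names used above; the benefitType field, the BenefitBalance interface, and the validateBalances function are assumptions for illustration.

```typescript
// Hypothetical shape of one extracted balance.
interface BenefitBalance {
  benefitType: string;
  totalAmount: number;
  amountUsed: number;
}

// Throws on the first invalid field so the task is recorded as failed, not success.
function validateBalances(balances: BenefitBalance[]): void {
  for (const b of balances) {
    if (b.benefitType.trim() === '') {
      throw new Error('empty benefit type');
    }
    if (!Number.isFinite(b.totalAmount) || b.totalAmount < 0) {
      throw new Error(`invalid totalAmount for ${b.benefitType}`);
    }
    if (!Number.isFinite(b.amountUsed) || b.amountUsed < 0) {
      throw new Error(`invalid amountUsed for ${b.benefitType}`);
    }
    if (b.amountUsed > b.totalAmount) {
      throw new Error(`amountUsed exceeds totalAmount for ${b.benefitType}`);
    }
  }
}
```

The point of throwing rather than returning a list of warnings is that the task never reaches the success state, so the retry path stays open.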

The bug that taught me the most

The one that stuck with me was an auth-detection loop. A user's session was perfectly valid, but my check kept deciding they were logged out and kicking off a full re-login. One symptom, and underneath it three separate wrong assumptions, all of which I dug out in a single day.

First, I was reading the URL to decide whether login had worked, and the post-login URL wasn't consistent. So I switched to looking for the member's name and a "Sign Out" button as proof they were in, except both were tucked inside a menu on the logged-in home page and neither was actually visible. And under all of that, the check was firing before the page had finished loading at all, since the portal sat behind a bot-protection layer that could take a minute or more to paint.

Three causes, one symptom. The real takeaway was that the thing I'd chosen to mean "logged in" was never a reliable signal to begin with.
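
One way to handle the timing half of that problem is to poll for a content-based signal with a generous deadline instead of checking once. The sketch below is a generic helper under my own assumptions: waitForSignal is a hypothetical name, and the check callback stands in for whatever the automation layer uses to decide that logged-in content is on the page.

```typescript
// Poll `check` until it returns true or `timeoutMs` elapses.
// Resolves to true if the signal appeared, false if the deadline passed first.
async function waitForSignal(
  check: () => Promise<boolean>,
  timeoutMs: number,
  intervalMs: number,
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return true;
    if (Date.now() >= deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

With a portal that can take a minute or more to paint, the timeout would need to be comfortably longer than that, and a false result would mean "not confirmed yet" rather than "logged out".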

Where it stands today

Every carrier integration is built by hand, and right now only one portal is fully working. Adding a carrier means writing a new scraper from scratch, and a large enough layout change can still break extraction even with Stagehand soaking up the smaller stuff.

There's also no automated retry for real code bugs. The stuck-task sweep handles transient failures like a hung session, but it can't save a scraper that navigates to the wrong page or pulls the wrong field. That kind of thing takes a code change and a manual re-run.

What building it taught me

Almost none of the hard parts were things I prepared for. The trouble came from things I couldn't have guessed: a portal that took ninety seconds to load, a button hidden behind a menu, redirect chains that shifted from one session to the next. I couldn't have planned for these, but I was flexible enough to change my approach the moment they showed up.

The other lesson: when you're scraping a portal you don't own, don't trust URLs or specific elements to sit where you left them. Anchor on the content a real person would use to know they've landed in the right place.

Idempotency turned out to matter far more than it sounds like it should. If we got this wrong, someone could get multiple emails per month, or none at all. For a product whose entire job is one correct email a month, that's not a small thing.

BenefitTrack is available today for a limited number of carriers, with more being added over the coming weeks. If staying on top of your benefits sounds useful, take a look at benefittrack.ca.