web4browser

Posted on Jun 29

How to Diff Browser Profile State Before Reusing It in Playwright

#playwright #automation #testing #debugging

A browser profile can be available and still not be ready.

That is one of the easiest bugs to miss in long-running browser automation.

A worker picks up a profile. The lock is free. The user data directory exists. The proxy test passes. Playwright can launch the browser.

Then the task opens a login screen, lands in the wrong workspace, uses stale local storage, or continues after the previous run already failed.

The issue is not always Playwright.

It is often profile state drift.

If your automation depends on logged-in account context, start with an account context check before browser automation. Then add one more guardrail: diff the profile before reusing it.

Availability is not readiness

A simple worker queue usually checks something like this:

type ProfileLease = {
  profileId: string;
  userDataDir: string;
  lockedBy?: string;
  lockedUntil?: string;
};

If the profile is not locked, the worker runs.

That answers one question:

Can this worker open the profile?

It does not answer the better question:

Is this profile still in the expected state for this task?

That difference matters.
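To make the gap concrete, here is a minimal sketch of the lock check. It assumes `lockedUntil` is an ISO 8601 timestamp and that an expired lock counts as free; the original type does not specify either, and the helper name `isLeaseFree` is hypothetical.

```typescript
type ProfileLease = {
  profileId: string;
  userDataDir: string;
  lockedBy?: string;
  lockedUntil?: string; // assumed: ISO 8601 timestamp
};

// Hypothetical helper: answers only "can a worker open this profile?"
function isLeaseFree(lease: ProfileLease, now: Date = new Date()): boolean {
  if (!lease.lockedBy) return true;
  // Held with no expiry: treat as busy.
  if (!lease.lockedUntil) return false;
  const expiresAt = Date.parse(lease.lockedUntil);
  // An unparseable timestamp is treated as still locked, to fail closed.
  if (Number.isNaN(expiresAt)) return false;
  return expiresAt <= now.getTime();
}
```

Nothing in this function looks at cookies, storage, proxy mapping, or the result of the last run. A `true` result means the profile is available, not that it is ready.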

For logged-in browser automation, the profile is not just a folder. It may carry cookies, local storage, session state, extensions, timezone assumptions, language settings, proxy mapping, and the residue of the last run.

Playwright can reuse browser state through a persistent context. That is useful. It is also why stale state can become dangerous.

The failure pattern

This usually appears after the system has been running for a while.

At first, everything looks stable:

profile acquired
browser launched
page opened
task started

Then the task fails in a way that does not look like a code bug:

expected: dashboard page
actual: login page

expected: account_us_042 context
actual: unknown account context

expected: workspace "client-a"
actual: workspace picker

expected: continue task
actual: verification required

The worker did not crash.

The profile was not missing.

The browser opened correctly.

But the profile state was no longer the state the task expected.

That is the gap this article focuses on.

If your application exposes a safe read-only marker, such as a workspace ID, account label, or profile banner, include that marker in your snapshot. If it does not, treat account identity as unknown and route uncertain states to review.
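One way to handle such a marker is a three-way result rather than a boolean. The sketch below is hypothetical: the helper name, the result labels, and the idea that the marker arrives as a string (or `null` when the page shows none) are all assumptions for illustration.

```typescript
type MarkerCheck = "match" | "mismatch" | "unknown";

// Hypothetical helper: compares a read-only marker read from the page
// (for example a workspace ID) with the value the task expects.
function checkAccountMarker(
  expected: string,
  observed: string | null | undefined
): MarkerCheck {
  const seen = observed?.trim();
  // No marker on the page means identity is unknown, not confirmed.
  if (!seen) return "unknown";
  return seen === expected ? "match" : "mismatch";
}
```

An `"unknown"` result would go to review, as described above. Treating `"mismatch"` as a reason to stop the run is a reasonable default, but that choice is yours.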

Define a baseline after a known good run

Do not diff against memory.

Diff against a saved baseline.

A baseline is the profile state you trusted after the last successful run.

type ProfileBaseline = {
  profileId: string;
  accountKey: string;

  expectedProxyId: string;
  expectedRegion: string;
  expectedTimezone: string;
  expectedLanguage: string;

  browserMajorVersion: string;

  lastKnownUrl: string;
  lastRunStatus: "success" | "review" | "failed";

  criticalCookieNames: string[];
  expectedLocalStorageKeys: string[];

  updatedAt: string;
};

This is not meant to be a full fingerprint dump.

It is an operational snapshot.

It tells the next worker what the profile was supposed to look like when it was last considered safe for automation.

Include registry state, not only page state

Some profile state comes from the browser.

Some state comes from your own system.

For example, proxy identity usually comes from your profile registry, not from the page itself. The page may help you verify region, but the assigned proxy ID should come from the system that maps profiles to proxies.

That means your preflight should compare two layers:

Registry metadata before launch
Browser-visible state after launch

A minimal registry record might look like this:

type ProfileRegistryRecord = {
  profileId: string;
  userDataDir: string;
  assignedProxyId: string;
  assignedRegion: string;
};

The browser can tell you some things.

Your profile registry should tell you which proxy, region, owner, project, or account group the profile is supposed to belong to.

Do not mix those sources together without naming them.

Also be careful with the word region.

assignedRegion is registry intent, not proof of real egress. It tells you what the profile is supposed to use. If real network location matters, verify it with a separate non-account network check before opening the logged-in page.
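Because the registry layer needs no browser, it can be checked before launch. The following is a sketch under that assumption; `findRegistryMismatches` and the two narrow types are hypothetical names, shaped so that a `ProfileBaseline` and a `ProfileRegistryRecord` can be passed in directly.

```typescript
type RegistryExpectation = {
  expectedProxyId: string;
  expectedRegion: string;
};

type RegistryAssignment = {
  assignedProxyId: string;
  assignedRegion: string;
};

// Returns the names of registry fields that no longer match the baseline.
// An empty array means the registry layer agrees with the baseline; it says
// nothing about real network egress.
function findRegistryMismatches(
  baseline: RegistryExpectation,
  record: RegistryAssignment
): string[] {
  const mismatches: string[] = [];
  if (baseline.expectedProxyId !== record.assignedProxyId) {
    mismatches.push("proxy");
  }
  if (baseline.expectedRegion !== record.assignedRegion) {
    mismatches.push("region");
  }
  return mismatches;
}
```

If this returns anything, the worker can stop without ever opening the logged-in profile.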

Collect a lightweight current snapshot

Before launching the real task, run a preflight check.

The preflight should avoid destructive actions. Do not click buttons, submit forms, publish content, change account settings, or accept unexpected prompts.

Just open a stable page, inspect state, record what you see, and close.

import { chromium } from "playwright";

type ProfileSnapshot = {
  profileId: string;

  assignedProxyId: string;
  assignedRegion: string;

  currentUrl: string;
  timezone: string;
  language: string;

  cookieNames: string[];
  localStorageKeys: string[];
};

async function collectProfileSnapshot(
  profile: ProfileRegistryRecord,
  preflightUrl: string
): Promise<ProfileSnapshot> {
  const context = await chromium.launchPersistentContext(profile.userDataDir, {
    headless: true,
  });

  // Close the context even if navigation or evaluation throws, so a failed
  // preflight does not leave a browser process holding the profile directory.
  try {
    const page = await context.newPage();

    await page.goto(preflightUrl, {
      waitUntil: "domcontentloaded",
      timeout: 30_000,
    });

    // With no URL argument, this returns every cookie in the context.
    const cookies = await context.cookies();

    // localStorage is scoped per origin: these keys belong to whatever
    // origin the page ended up on, which may differ after a redirect.
    const localStorageKeys = await page.evaluate(() => {
      return Object.keys(window.localStorage).sort();
    });

    const timezone = await page.evaluate(() => {
      return Intl.DateTimeFormat().resolvedOptions().timeZone;
    });

    const language = await page.evaluate(() => navigator.language);

    return {
      profileId: profile.profileId,

      // Registry layer: copied from the record, not observed in the browser.
      assignedProxyId: profile.assignedProxyId,
      assignedRegion: profile.assignedRegion,

      // Browser layer: observed after navigation, including any redirect.
      currentUrl: page.url(),
      timezone,
      language,
      cookieNames: cookies.map((cookie) => cookie.name).sort(),
      localStorageKeys,
    };
  } finally {
    await context.close();
  }
}

This check is intentionally small.

You are not trying to prove that the account is safe forever. You are trying to catch obvious drift before the worker performs real work.

Also remember that opening a persistent context can still update some browser state. Treat this as a minimal preflight, not a pure read-only inspection.

Diff by severity

Do not reduce profile drift to one boolean.

Some differences are harmless. Some should trigger review. Some should stop the run immediately.

Use severity.

type DiffSeverity = "ok" | "soft" | "hard";

type ProfileDiff = {
  severity: DiffSeverity;
  field: string;
  reason: string;
};

function diffProfileState(
  baseline: ProfileBaseline,
  snapshot: ProfileSnapshot
): ProfileDiff[] {
  const diffs: ProfileDiff[] = [];

  if (baseline.expectedProxyId !== snapshot.assignedProxyId) {
    diffs.push({
      severity: "hard",
      field: "proxy",
      reason: `Expected proxy ${baseline.expectedProxyId}, got ${snapshot.assignedProxyId}`,
    });
  }

  if (baseline.expectedRegion !== snapshot.assignedRegion) {
    diffs.push({
      severity: "hard",
      field: "region",
      reason: `Expected region ${baseline.expectedRegion}, got ${snapshot.assignedRegion}`,
    });
  }

  if (baseline.expectedTimezone !== snapshot.timezone) {
    diffs.push({
      severity: "hard",
      field: "timezone",
      reason: `Expected ${baseline.expectedTimezone}, got ${snapshot.timezone}`,
    });
  }

  if (baseline.expectedLanguage !== snapshot.language) {
    diffs.push({
      severity: "soft",
      field: "language",
      reason: `Expected ${baseline.expectedLanguage}, got ${snapshot.language}`,
    });
  }

  const missingCriticalCookies = baseline.criticalCookieNames.filter(
    (name) => !snapshot.cookieNames.includes(name)
  );

  if (missingCriticalCookies.length > 0) {
    diffs.push({
      severity: "hard",
      field: "cookies",
      reason: `Missing critical cookies: ${missingCriticalCookies.join(", ")}`,
    });
  }

  const missingStorageKeys = baseline.expectedLocalStorageKeys.filter(
    (key) => !snapshot.localStorageKeys.includes(key)
  );

  if (missingStorageKeys.length > 0) {
    diffs.push({
      severity: "soft",
      field: "localStorage",
      reason: `Missing localStorage keys: ${missingStorageKeys.join(", ")}`,
    });
  }

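  if (baseline.lastRunStatus === "failed") {
    diffs.push({
      severity: "hard",
      field: "lastRunStatus",
      reason: "Previous run failed. Review the profile before reuse.",
    });
  } else if (baseline.lastRunStatus === "review") {
    // A run that ended in review has not been cleared for reuse yet.
    diffs.push({
      severity: "soft",
      field: "lastRunStatus",
      reason: "Previous run ended in review. Confirm the review is closed.",
    });
  }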

  return diffs;
}

The goal is not perfect detection.

The goal is to prevent blind reuse.
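The snapshot also records `currentUrl`, which the diff above does not compare. One possible check is to classify where the preflight page actually landed. This is a sketch only: the path hints are placeholder assumptions, not paths from any real application, and `classifyLandingUrl` is a hypothetical helper.

```typescript
type UrlCheck = "expected" | "auth_wall" | "unexpected";

// Assumed path prefixes that indicate a login or verification wall.
// These are placeholders; use the paths your application really serves.
const AUTH_PATH_HINTS = ["/login", "/signin", "/verify"];

function classifyLandingUrl(requestedUrl: string, currentUrl: string): UrlCheck {
  const requested = new URL(requestedUrl);
  const current = new URL(currentUrl);

  const path = current.pathname.toLowerCase();
  if (AUTH_PATH_HINTS.some((hint) => path.startsWith(hint))) {
    return "auth_wall";
  }

  // Query strings and fragments are ignored; only origin and path count.
  const sameOrigin = requested.origin === current.origin;
  const samePath = requested.pathname === current.pathname;
  return sameOrigin && samePath ? "expected" : "unexpected";
}
```

An `"auth_wall"` result maps naturally to a hard diff, and `"unexpected"` (a workspace picker, for instance) to a soft one, though that mapping is a judgment call for your workflow.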

Route the profile before the task runs

Once you have diffs, route the profile.

type ProfileRoute = "run" | "human_review" | "quarantine";

function decideProfileRoute(diffs: ProfileDiff[]): ProfileRoute {
  if (diffs.some((diff) => diff.severity === "hard")) {
    return "quarantine";
  }

  if (diffs.some((diff) => diff.severity === "soft")) {
    return "human_review";
  }

  return "run";
}

Then use the route before starting the real task.

const snapshot = await collectProfileSnapshot(
  profile,
  "https://example.com/dashboard"
);

const diffs = diffProfileState(baseline, snapshot);
const route = decideProfileRoute(diffs);

if (route === "quarantine") {
  await saveDiffReport(profile.profileId, route, diffs);
  throw new Error("Profile drift detected. Quarantine before reuse.");
}

if (route === "human_review") {
  await createReviewTicket(profile.profileId, diffs);
  throw new Error("Profile needs review before automation continues.");
}

await runRealTask(profile);

This small routing step changes the model.

A profile is no longer only:

free
busy

It becomes:

ready to run
needs review
unsafe to reuse

That is a better model for multi-account automation.

Save a diff report

When a profile is blocked, save enough evidence for the next developer or operator.

{
  "profile_id": "profile_us_042",
  "route": "quarantine",
  "checked_at": "2026-06-29T08:00:00Z",
  "diffs": [
    {
      "severity": "hard",
      "field": "proxy",
      "reason": "Expected proxy_us_resi_12, got proxy_de_dc_04"
    },
    {
      "severity": "hard",
      "field": "timezone",
      "reason": "Expected America/New_York, got Europe/Berlin"
    },
    {
      "severity": "hard",
      "field": "cookies",
      "reason": "Missing critical cookies: session_id"
    }
  ],
  "next_action": "Open manually, confirm account state, refresh baseline only after review."
}

A useful report should answer five questions:

Which profile was checked?
What changed?
Why did that change matter?
Did the worker run or stop?
What should the next person do?

Without this report, the next run becomes guesswork.
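The article's `saveDiffReport` call is left undefined, so here is one hedged sketch of what it might persist. `buildDiffReport`, the `DiffReport` type, and the default `next_action` wording for the non-quarantine routes are assumptions; the field names follow the JSON example above.

```typescript
type DiffSeverity = "ok" | "soft" | "hard";
type ProfileDiff = { severity: DiffSeverity; field: string; reason: string };
type ProfileRoute = "run" | "human_review" | "quarantine";

type DiffReport = {
  profile_id: string;
  route: ProfileRoute;
  checked_at: string;
  diffs: ProfileDiff[];
  next_action: string;
};

// Assumed default guidance per route; adjust the wording to your runbook.
const NEXT_ACTION: Record<ProfileRoute, string> = {
  run: "No action needed.",
  human_review: "Review the soft diffs, then rerun the preflight.",
  quarantine:
    "Open manually, confirm account state, refresh baseline only after review.",
};

function buildDiffReport(
  profileId: string,
  route: ProfileRoute,
  diffs: ProfileDiff[],
  checkedAt: Date = new Date()
): DiffReport {
  return {
    profile_id: profileId,
    route,
    checked_at: checkedAt.toISOString(),
    diffs: [...diffs], // copy so later mutation does not alter the report
    next_action: NEXT_ACTION[route],
  };
}
```

Each of the five questions maps to one field: `profile_id`, `diffs[].field`, `diffs[].reason`, `route`, and `next_action`.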

Common mistakes

Only checking locks

Locks prevent two workers from using the same profile at the same time.

They do not prove that the profile is healthy.

A locked profile can be stale. An unlocked profile can be unsafe.

Treating every cookie difference as critical

Some cookies do not matter for the workflow.

Some are required.

Do not compare every cookie blindly. Define the small set of critical cookie names or session indicators your workflow actually depends on.
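A name-only check can be extended to catch critical cookies that exist now but will expire before the task finishes. The sketch below assumes the cookie objects carry `expires` as Unix time in seconds with `-1` for session cookies, which is how Playwright reports them; the helper name and the `minRemainingSeconds` parameter are hypothetical.

```typescript
// Minimal shape of the cookie fields used here.
type CookieLike = { name: string; expires: number };

// Returns critical cookie names that are missing, or whose every copy
// expires within the next `minRemainingSeconds`.
function findUnusableCriticalCookies(
  cookies: CookieLike[],
  criticalNames: string[],
  minRemainingSeconds: number,
  nowSeconds: number = Date.now() / 1000
): string[] {
  const deadline = nowSeconds + minRemainingSeconds;
  return criticalNames.filter((name) => {
    const copies = cookies.filter((cookie) => cookie.name === name);
    // Usable if at least one copy is a session cookie or outlives the deadline.
    const usable = copies.some(
      (cookie) => cookie.expires === -1 || cookie.expires > deadline
    );
    return !usable;
  });
}
```

Non-critical cookies never enter the comparison, so analytics or consent cookies coming and going cannot block a run.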

Updating the baseline too early

Never refresh the baseline just because the preflight ran.

Only refresh it after a known good task, a controlled account reset, or a human review.

Otherwise, the baseline becomes a record of drift instead of a guardrail against drift.
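One way to enforce this rule is to make every baseline write state its reason and reject the ones that are not allowed. The reason labels and the `canRefreshBaseline` helper below are hypothetical names for the three cases named above.

```typescript
type RefreshReason =
  | "task_succeeded"
  | "controlled_reset"
  | "human_review_approved"
  | "preflight_ran";

// Only the three reasons named in the text may refresh the baseline.
const ALLOWED_REFRESH_REASONS: ReadonlySet<RefreshReason> =
  new Set<RefreshReason>([
    "task_succeeded",
    "controlled_reset",
    "human_review_approved",
  ]);

function canRefreshBaseline(reason: RefreshReason): boolean {
  return ALLOWED_REFRESH_REASONS.has(reason);
}
```

Keeping `"preflight_ran"` in the type but out of the allowed set makes the forbidden case explicit instead of relying on callers to remember it.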

Running headless through uncertain states

Headless preflight is useful for cheap checks.

But if the page shows login, verification, account warnings, permission changes, payment steps, publishing actions, or unexpected workspace state, the task should stop.

Do not push through uncertain account states silently.

A practical checklist

Before reusing a browser profile, check:

Is this the intended profile for the account?
Is the assigned proxy still the expected proxy?
Does the assigned region still match the baseline?
Is the expected session still present?
Are critical cookies still available?
Are required local storage keys still available?
Does the timezone match the account environment?
Does the language still match the expected workflow?
Did the previous run finish cleanly?
Is there an open review ticket for this profile?
Should the baseline be refreshed, or should the profile be quarantined?

The important shift is simple:

Reuse a profile only after proving it is still the right environment for the task.

For teams that manage many long-lived browser profiles, this is where a structured profile environment boundary becomes useful. In a workspace like Web4 Browser, the profile is treated as part of a larger account environment, not just a folder that Playwright can open.

Profile reuse is powerful.

Profile reuse without state diffing is guesswork.

DEV Community