DEV Community: Eugene Yakhnenko

Making OAuth Testable: Rethinking OIDC Clients in JavaScript

Eugene Yakhnenko — Sun, 03 May 2026 00:14:10 +0000

The real pain point

Most OAuth/OIDC integrations in JavaScript are difficult to test in a meaningful way. Testing usually involves mocking network calls, faking redirects, stubbing token responses, and simulating browser state. The result is that you are not testing OAuth. You are testing your mocks.

The typical test for an OIDC login flow looks something like this: intercept the fetch call to the token endpoint, return a hardcoded JSON response, check that the UI updated. You have verified that your code handles a specific shape of data. You have not verified that your code actually implements the OIDC protocol correctly.

This is not a minor distinction. OAuth and OIDC are security protocols. The value of testing them comes from exercising the real behavior: actual redirects, actual token exchanges, actual state validation. When every external interaction is replaced with a stub, the test becomes a tautology.

The problem is not OAuth itself. It is how we structure clients.

Why OIDC clients are hard to test

Most OIDC libraries combine several concerns into a single abstraction:

Protocol logic: PKCE code challenges, state parameters, nonce validation, token parsing
HTTP: fetch calls, interceptors, retry logic
Storage: localStorage, sessionStorage, cookies
Framework concerns: React hooks, Angular services, Vue composables

This creates implicit behavior. A single useAuth() hook might trigger discovery, check for stored tokens, initiate a background refresh, and update reactive state, all before the component finishes mounting. None of these steps are visible to the caller.

It also creates tight coupling to the runtime. You cannot test the protocol logic without also dealing with fetch, the DOM, and framework-specific rendering. And so the instinct is to mock everything. Replace fetch with a spy. Stub sessionStorage. Fake the redirect.

When everything is coupled, everything has to be mocked. And when everything is mocked, you are testing a simulation, not the thing itself.

A different approach: treat OIDC as a protocol

OIDC does not need to be a runtime-driven client. If you look at what the protocol actually requires, most of it is pure computation: building requests, validating callbacks, parsing tokens, and checking expiration. All of these take data in and return data out. They do not need fetch. They do not need localStorage. They do not need a DOM.

The protocol is pure. IO is not. The mistake most libraries make is treating these as one thing.
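To make "data in, data out" concrete, here is a minimal sketch of callback validation as a pure function. The names (validateCallback, CallbackResult) and the error labels are hypothetical, chosen for illustration; they are not taken from any particular library.

```typescript
// Hypothetical result shape for illustration only.
type CallbackResult =
  | { ok: true; code: string }
  | { ok: false; error: "state_mismatch" | "provider_error" | "missing_code"; detail?: string };

// Pure function: takes the callback URL and the state that was sent,
// returns a result. No fetch, no storage, no DOM.
function validateCallback(callbackUrl: string, expectedState: string): CallbackResult {
  const params = new URL(callbackUrl).searchParams;

  // Reject callbacks whose state does not match the one we generated.
  if (params.get("state") !== expectedState) {
    return { ok: false, error: "state_mismatch" };
  }

  // The provider may report an error instead of a code.
  const providerError = params.get("error");
  if (providerError !== null) {
    return { ok: false, error: "provider_error", detail: providerError };
  }

  const code = params.get("code");
  if (code === null || code === "") {
    return { ok: false, error: "missing_code" };
  }
  return { ok: true, code };
}
```

Because the function only reads its arguments, a tampered state parameter can be tested by passing a different string. Nothing needs to be stubbed.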

Architecture shift: separate protocol from runtime

The idea is straightforward: split the OIDC client into two layers.

The first layer is a functional core. It contains every piece of protocol logic, and nothing else. No fetch calls. No storage access. No global state. No framework imports. Every function takes explicit parameters and returns a result. A function like buildTokenRequest takes a discovery document, a code, and a code verifier, and returns an object with a URL, headers, and body. It does not send the request. That is someone else's job.
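A rough sketch of what such a function could look like follows. The interfaces and the clientId and redirectUri parameters are assumptions added so the example is complete; this is not the actual oidc-js signature.

```typescript
// Hypothetical shapes for illustration; not the actual oidc-js API.
interface DiscoveryDocument {
  token_endpoint: string;
}

interface TokenRequestInput {
  discovery: DiscoveryDocument;
  clientId: string; // assumption: a public client using PKCE
  redirectUri: string; // assumption: same URI used in the authorization request
  code: string;
  codeVerifier: string;
}

interface HttpRequestDescription {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Pure function: describes the token request but never sends it.
function buildTokenRequest(input: TokenRequestInput): HttpRequestDescription {
  const body = new URLSearchParams({
    grant_type: "authorization_code",
    code: input.code,
    redirect_uri: input.redirectUri,
    client_id: input.clientId,
    code_verifier: input.codeVerifier,
  });
  return {
    url: input.discovery.token_endpoint,
    method: "POST",
    headers: { "Content-Type": "application/x-www-form-urlencoded" },
    body: body.toString(),
  };
}
```

Whoever owns the IO takes the returned object and sends it with whatever HTTP mechanism it has. The core never learns which one was used.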

The second layer is a set of adapters. Each adapter is framework-specific and handles the IO that the core deliberately avoids. A React adapter composes core functions with fetch and React state. An Angular adapter uses HttpClient and services. A Vue adapter uses composables. A Svelte adapter uses stores.

The adapters are thin. They call core functions to build requests, execute those requests using whatever HTTP mechanism the framework provides, and pass responses back through core functions for parsing and validation.

The result:

Protocol logic has zero dependencies, not even on fetch. It uses only the Web Crypto API for PKCE generation.
No framework concerns leak into the core. React does not exist in the token parsing code.
No hidden side effects. Every IO operation is explicit and visible in the adapter layer.
Testing boundaries are clear. You can test the core with pure unit tests. You can test the adapters with integration tests. Neither requires mocking the other.

Testing OAuth without mocks

This is where the architecture pays off. Because the core is pure, you can test it exhaustively with straightforward unit tests. Pass in a discovery document. Get back an authorization URL. Verify the parameters. No HTTP server needed. No browser needed. No mocks needed.
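As an illustration of that kind of unit test, here is a sketch of an authorization URL builder and a check against it. The function name, the input shape, and the example values are hypothetical; the PKCE challenge is assumed to have been computed elsewhere.

```typescript
// Hypothetical input shape for illustration only.
interface AuthorizationRequestInput {
  authorizationEndpoint: string; // taken from the discovery document
  clientId: string;
  redirectUri: string;
  scope: string;
  state: string;
  nonce: string;
  codeChallenge: string; // PKCE S256 challenge, computed elsewhere
}

// Pure function: returns the URL the browser should navigate to.
function buildAuthorizationUrl(input: AuthorizationRequestInput): string {
  const url = new URL(input.authorizationEndpoint);
  url.searchParams.set("response_type", "code");
  url.searchParams.set("client_id", input.clientId);
  url.searchParams.set("redirect_uri", input.redirectUri);
  url.searchParams.set("scope", input.scope);
  url.searchParams.set("state", input.state);
  url.searchParams.set("nonce", input.nonce);
  url.searchParams.set("code_challenge", input.codeChallenge);
  url.searchParams.set("code_challenge_method", "S256");
  return url.toString();
}

// The "test" is plain input/output checking: no server, no browser, no mocks.
const authUrl = new URL(
  buildAuthorizationUrl({
    authorizationEndpoint: "https://idp.example/oauth2/authorize",
    clientId: "demo-client",
    redirectUri: "https://app.example/callback",
    scope: "openid profile",
    state: "state-123",
    nonce: "nonce-456",
    codeChallenge: "example-challenge",
  }),
);
console.log(authUrl.searchParams.get("state")); // "state-123"
console.log(authUrl.searchParams.get("code_challenge_method")); // "S256"
```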

But unit testing the core is only half the story. The real value comes from what this architecture enables at the integration level: testing the full OIDC flow against a real identity provider.

The test setup uses Autentico, a lightweight OIDC provider built for testing. Autentico is a single binary with no external dependencies. In CI, the full setup takes roughly 500 milliseconds: generate cryptographic secrets, create an admin user, register a client, start the server. That is fast enough to spin up a fresh identity provider instance for every individual test.

The goal is not to test Autentico. It is to remove the need for mocks entirely by making the provider disposable.

Each test gets its own Autentico instance with its own database, its own users, and its own registered clients. There is no shared state between tests. No leftover sessions. No token caches that bleed across test boundaries. If a test fails, it fails because of the code under test, not because a previous test left the identity provider in an unexpected state.

The fixture handles everything programmatically:

Generates random cryptographic secrets (access token, refresh token, CSRF, RSA signing key)
Creates a fresh SQLite database
Runs the onboarding step to set up an admin user
Starts the server on an isolated port
Registers an OAuth client with the correct redirect URIs
Creates a test user with known credentials
Waits for the health check endpoint to respond
Tears everything down after the test completes

No manual configuration. No shared test environment. No Docker containers. Just a binary that starts in under a second.

Deterministic end-to-end tests

With a real identity provider running per test, the end-to-end tests exercise the actual protocol flow through a real browser.

Using Playwright, each test performs the full sequence: navigate to the application, click login, get redirected to the identity provider, fill in credentials, submit, get redirected back with an authorization code, exchange the code for tokens, fetch user info, and verify the UI reflects the authenticated state.

Nothing is intercepted. Nothing is stubbed. The browser makes real HTTP requests. The identity provider issues real tokens signed with a real RSA key. The application parses real JWT claims and validates real nonces.

The tests assert both UI state and the exact protocol sequence. A traffic tracker records every fetch request and browser navigation to the identity provider in the order they occur, filtered to OIDC-relevant paths. After each test, assertions verify not just that the login succeeded, but that the exact expected sequence happened in order:

GET  /.well-known/openid-configuration   # app loads, fetches discovery
NAV  /oauth2/authorize                    # browser redirects to IdP
GET  /.well-known/openid-configuration   # app reloads after callback, fetches discovery again
POST /oauth2/token                        # exchanges authorization code for tokens
GET  /oauth2/userinfo                     # fetches user profile

The two discovery calls are not a bug. There is a full page navigation between them. The first happens when the app mounts. The browser then navigates to the authorization endpoint. After the user authenticates, the IdP redirects back, the app reloads from scratch, and discovery is fetched again before the token exchange. The sequence tracker makes this visible. An earlier version of the test suite tracked fetches and navigations separately, which made it look like both discoveries happened together. The combined sequence revealed the actual interleaving.
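A combined tracker of this kind can be quite small. The sketch below is a hypothetical reconstruction, not the suite's actual code: it keeps fetches and navigations in one ordered list, filters to the paths listed above, and compares against an exact expected sequence.

```typescript
type TrafficKind = "GET" | "POST" | "NAV";

// OIDC-relevant paths, taken from the sequence shown above.
const OIDC_PATHS = [
  "/.well-known/openid-configuration",
  "/oauth2/authorize",
  "/oauth2/token",
  "/oauth2/userinfo",
];

// Hypothetical tracker: one ordered list for both fetches and navigations.
class TrafficTracker {
  private entries: string[] = [];

  record(kind: TrafficKind, rawUrl: string): void {
    const path = new URL(rawUrl).pathname;
    if (OIDC_PATHS.indexOf(path) !== -1) {
      this.entries.push(`${kind} ${path}`);
    }
  }

  sequence(): string[] {
    return this.entries.slice();
  }
}

// Fails unless the recorded traffic matches the expectation exactly, in order.
function assertSequence(actual: string[], expected: string[]): void {
  const same =
    actual.length === expected.length && actual.every((entry, i) => entry === expected[i]);
  if (!same) {
    throw new Error(`Unexpected OIDC traffic:\n${actual.join("\n")}`);
  }
}

const EXPECTED_LOGIN_SEQUENCE = [
  "GET /.well-known/openid-configuration",
  "NAV /oauth2/authorize",
  "GET /.well-known/openid-configuration",
  "POST /oauth2/token",
  "GET /oauth2/userinfo",
];
```

In a Playwright test, record would presumably be fed from the page's request and navigation events. Because both kinds of traffic land in the same list, the interleaving is preserved.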

Most tests assert outcomes. These tests also assert the protocol itself. A token refresh that should not have happened. A missing userinfo request. A navigation that fired before a fetch it was supposed to follow. These are the kinds of issues that mock-based tests cannot detect, because the mocks only respond to the calls you anticipated.

The tests also verify security properties:

Tokens are never stored in localStorage or sessionStorage
Callback URL parameters (code, state) are cleaned up after processing
Sessions are not preserved across page reloads (in-memory only)
The back button after logout does not expose authenticated content
Tampered state parameters trigger the correct error

Each of these tests runs against the real flow. The assertion that tokens are not in storage is meaningful because real tokens were actually issued and processed. The assertion about state mismatch is meaningful because a real authorization request was initiated with a real state parameter.
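The callback cleanup property is easy to picture as a pure function. This is a minimal sketch with a hypothetical name (stripCallbackParams), not the client's actual implementation.

```typescript
// Removes the OIDC callback parameters and keeps everything else in the URL.
function stripCallbackParams(href: string): string {
  const url = new URL(href);
  for (const key of ["code", "state"]) {
    url.searchParams.delete(key);
  }
  return url.toString();
}

console.log(
  stripCallbackParams("https://app.example/callback?code=abc&state=xyz&tab=profile"),
); // "https://app.example/callback?tab=profile"
```

In a browser, the result would typically be applied with history.replaceState so the code and state never linger in the address bar or the history entry.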

Running tests across frameworks

Because the core is framework-agnostic and each adapter is a thin wrapper, the same test suite runs against every framework. The same spec file tests React, Angular, Vue, Svelte, Lit, Solid, and Preact. Each framework gets its own dev server on an isolated port, its own Autentico instance on a separate port, and its own database.

A shell script orchestrates the runs with configurable parallelism. Locally, with all frameworks running in parallel, the full suite completes in under a minute. In CI, they run sequentially to stay within resource limits.

The test names are prefixed with the framework identifier, so failures are immediately attributable:

[React] OIDC Login Flow > completes full login flow with tokens
[Angular] RequireAuth > auto-refreshes expired token when navigating to protected page
[Vue] Security > tokens are not stored in localStorage or sessionStorage

This setup catches framework-specific regressions. A change to the Svelte adapter that accidentally double-fires a discovery request will fail the traffic assertion even though the UI behavior looks correct.

What this catches that mocks don't

One concrete example: token refresh race conditions.

The test for automatic token refresh works like this. First, complete a full login. Then, override Date.now in the browser to simulate time passing beyond the token's expiration. Then navigate to a protected page. The RequireAuth guard should detect the expired token, attempt a refresh, and let the user through if the refresh succeeds.

The tricky part is restoring the clock. If the test restores Date.now through Playwright's page.evaluate after the refresh, that call arrives as a macrotask, while the framework's state update from the refresh response runs in the microtask chain. The component therefore re-renders with the new token while Date.now still returns the fake expired time, which triggers another refresh.

The solution is to patch window.fetch alongside Date.now, and restore the real clock from inside the fetch promise chain, before the framework processes the response.
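Here is a sketch of that patch, under the assumption that it is injected into the page before navigating to the protected route. The helper name and the Patchable interface are hypothetical; in the browser the target would be window.

```typescript
// Hypothetical helper for illustration; in the browser, target would be window.
interface Patchable {
  Date: { now: () => number };
  fetch: (input: string) => Promise<unknown>;
}

function expireClockUntilNextFetch(target: Patchable, skewMs: number): void {
  const realNow = target.Date.now;
  const realFetch = target.fetch;

  // Pretend skewMs milliseconds have passed, so stored tokens look expired.
  target.Date.now = () => realNow() + skewMs;

  target.fetch = (input: string) =>
    realFetch(input).then((response) => {
      // This callback runs before the continuation of whoever awaited the
      // patched fetch, so the clock is already real again when the caller
      // processes the response and re-renders.
      target.Date.now = realNow;
      target.fetch = realFetch;
      return response;
    });
}
```

The restore happens inside the same promise chain as the response, so no macrotask can slip in between the refresh completing and the clock returning to normal.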

This is not a hypothetical edge case. It is a real bug that surfaced during development. A mock-based test would never catch it because the mock controls both the clock and the response, and there is no actual async flow to create the race condition.

Another example: the test that revokes a refresh token server-side, then navigates to a protected page. The guard attempts a refresh, gets a failure from the real identity provider, and falls back to a full login redirect. With mocks, you would return a 400 from a stubbed endpoint. With a real provider, the revocation is real, the failure is real, and the redirect is real. If the client's error handling has a subtle bug in how it interprets the provider's error response, the real test catches it. The mock never will, because the mock returns exactly the error format you expected.

Tradeoffs

This approach is not free. There are real costs.

Running a real identity provider adds setup complexity. The test fixture is more involved than a simple beforeEach that sets up mocks. The Autentico binary needs to be downloaded, and each test pays the cost of starting a server process.

A single test provider gives you deterministic behavior, but it does not cover provider-specific quirks. Real-world OIDC providers have subtle differences in token formats, claim structures, and error responses. Testing against Autentico validates the protocol, not every provider's interpretation of it.

The tests are slower than pure unit tests. A full E2E test with browser automation, server startup, and real HTTP exchanges takes seconds, not milliseconds. The per-test Autentico instance adds roughly 500 milliseconds of overhead. For a single test, that is noticeable. Across a full suite with parallelism, it is manageable.

This is not the fastest way to test auth. It is the most reliable. When the suite passes, you know that the full OIDC flow works in a real browser against a real provider. When it fails, the failure points to an actual problem, not a gap between your mocks and reality.

Takeaway

OAuth is not inherently hard to test. It becomes hard when protocol logic, IO, and framework concerns are mixed into one abstraction. When they are separated, each piece becomes testable on its own terms.

The protocol layer is pure computation. Test it with inputs and outputs. The adapter layer is framework-specific IO. Test it against a real provider. The identity provider setup is fast enough to be disposable. Give each test a fresh instance and eliminate shared state entirely.

When the suite passes, you are not trusting mocks. You are verifying the protocol itself.

This approach is implemented in oidc-js, a zero-dependency, cross-framework OIDC client built around a functional core and thin adapters, tested end-to-end with Autentico, a lightweight OIDC provider built for exactly this kind of workflow.

What 200 Concurrent Users Taught Me About SQLite Performance

Eugene Yakhnenko — Tue, 28 Apr 2026 20:00:51 +0000

I was about to release Autentico 2.0. The feature work was done, tests were passing, docs were updated. Before tagging the release I figured I'd spend some time on performance. Run some stress tests, see where things stand, maybe squeeze out some easy wins. What followed was a week-long detour through profiling, architecture design, benchmarking, and a humbling lesson about assumptions.

Autentico is a self-contained OAuth 2.0 / OpenID Connect identity provider built with Go and SQLite. One binary, one database file, no external dependencies. The benchmark workload is a full PKCE authorization code flow: authorize, login with password, token exchange, token introspection, and refresh. Five HTTP requests per iteration, four or five SQLite writes per iteration, and one bcrypt password verification.

Profiling on the Wrong Machine

I started with k6 stress tests on my older i5 laptop. 100 virtual users, 30 seconds, the full auth flow. The results were fine but not great. So I profiled.

90% of CPU time was spent in bcrypt.CompareHashAndPassword.

That's the function that verifies a user's password against the stored hash. It's intentionally slow (that's the point of bcrypt), it's CPU-bound, and it was dominating everything else. SQLite writes took microseconds. JWT signing was negligible. HTTP routing was invisible. Just bcrypt, eating all available cores.

The conclusion seemed obvious: bcrypt is the bottleneck, and you can't make bcrypt faster. You can only do more of it in parallel. But on a single machine running SQLite, you can't just add more instances. SQLite is single-writer, single-file. You can't horizontally scale the traditional way.

Or can you?

Designing Verifico

The bottleneck wasn't the database. It was one function call. So what if you scaled just that function?

I explored the options systematically:

CQRS with SQLite replication. LiteFS can replicate SQLite across nodes, one primary for writes, replicas for reads. A real architecture, but it solves a general scaling problem. Mine was specific. I didn't need to distribute reads and writes. I needed to distribute bcrypt.

Postgres. The standard answer for outgrowing SQLite. But Postgres doesn't solve bcrypt CPU. You'd still run CompareHashAndPassword on the application server. Multiple instances behind a load balancer would spread the load, but you'd be paying for full application instances (database connections, memory, middleware) when all you need is more CPU for one function.

Child processes. Spawn separate processes for bcrypt work. But Go already parallelizes CPU-bound work across all cores via goroutines and the runtime scheduler. On a single machine, you can't beat Go's built-in parallelism. Separate processes just add IPC overhead.

Sticky sessions. Route users to specific instances. But you need a shared lookup table, which needs a shared database, which is the problem you're trying to avoid.

Then the idea clicked: keep Autentico as a single instance, owning the database and handling everything. But when it needs to verify a password, send the hash and the plaintext to a remote worker. The worker runs bcrypt and returns true or false. Workers are stateless, trivial, and can run on the cheapest hardware available.

I called it Verifico ("I verify" in Italian, matching Autentico's naming). Same binary, new subcommand: autentico verifico start. One HTTP endpoint, one function call, a shared secret for auth, and round-robin load balancing with automatic fallback to local bcrypt if workers are down.
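The dispatch logic is simple enough to sketch. Autentico is written in Go, so the TypeScript below is only an illustration of the idea; the names are hypothetical, and trying each worker once before falling back is an assumption about the retry policy.

```typescript
// A verifier takes a stored hash and a plaintext password and answers true or false.
type Verify = (hash: string, password: string) => Promise<boolean>;

// Hypothetical dispatcher: rotate through remote workers, fall back to local bcrypt.
function createRoundRobinVerifier(workers: Verify[], local: Verify): Verify {
  let next = 0;
  return async (hash, password) => {
    // Assumption: try each worker at most once per request.
    for (let attempt = 0; attempt < workers.length; attempt++) {
      const worker = workers[next];
      next = (next + 1) % workers.length;
      try {
        return await worker(hash, password);
      } catch {
        // Worker unreachable or failed: move on to the next one.
      }
    }
    // No workers configured, or all of them are down.
    return local(hash, password);
  };
}
```

A wrong password is a normal false result, not an error, so it is returned as-is. Only a failed call to a worker triggers the fallback path.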

The security model went through its own journey. I started at mTLS (operationally heavy for a boolean endpoint), worked through AES encryption (reimplementing TLS poorly), and landed on a shared secret over a private network. The password already traveled over the public internet to reach Autentico. One more hop inside a VPC is no worse.

It Worked

On the i5, Verifico delivered real improvements. With the server constrained to 2 cores and workers handling bcrypt, non-login endpoints dropped from seconds to single-digit milliseconds. The server's cores were free for HTTP handling, SQLite queries, and JWT signing. Throughput scaled linearly with worker count, up to about 6 cores. At 8 it flattened out.

I was pleased. Built a clean solution, benchmarked it, it worked. Ready to ship.

It Didn't Work

Then I ran the same benchmarks on a modern Ryzen 7 desktop. 16 cores, faster single-thread performance, more cache.

I constrained Autentico to 2 cores and started adding 2-core workers: 2+2, 2+2+2, all the way up to 2+7x2. On the i5, throughput had kept climbing with each worker up to 6 cores. On the Ryzen:

Config	iter/s	Login p95
2 server + 2 worker	15.4/s	3.61s
2 server + 4 worker	15.4/s	3.68s
2 server + 6 worker	15.2/s	3.58s
2 server + 10 worker	15.0/s	3.60s
2 server + 14 worker	14.7/s	3.76s

Flat. Five configurations, 2 to 14 worker cores, and throughput barely moved. Adding workers did nothing.

The Ryzen was simply faster at bcrypt. Even at the default cost of 10, each core chewed through password hashes fast enough that bcrypt stopped being the bottleneck. The real contention was elsewhere entirely.

I had spent days designing, implementing, and benchmarking a solution for a bottleneck that was hardware-specific.

Finding the Real Bottleneck

I went back to profiling, this time on the Ryzen. A Go block profile under load revealed that every contention point was at database/sql.(*DB).conn. Goroutines waiting for a connection from the pool. Not SQLite's file lock, not disk I/O. The Go connection pool.

Reads accounted for 65% of total contention, writes 35%. The top offenders were all routine operations: looking up a client by ID, creating a session, creating a token. Fast queries, stuck waiting in line.

The Boring Win: WAL Mode

SQLite's default rollback journal locks the entire database during writes, blocking all readers. WAL (Write-Ahead Logging) changes this: readers see a consistent snapshot while writes go to a separate log. The change is one line:

PRAGMA journal_mode = WAL;

It's persistent. Set it once and every future connection inherits it. No application code changes.

Results at 200 virtual users, 30 seconds:

Cores	Without WAL	With WAL	Improvement
1	13.4 iter/s	16.7 iter/s	+25%
2	23.6 iter/s	31.3 iter/s	+33%
4	32.2 iter/s	49.8 iter/s	+55%
6	33.0 iter/s	54.3 iter/s	+65%
8	31.9 iter/s	50.2 iter/s	+57%

One pragma. No code changes. Up to 65% throughput improvement. But WAL alone hits a ceiling around 6 cores and actually regresses past that.

The Real Scaling Win: Read/Write Pool Split

WAL allows concurrent readers alongside a single writer. The natural next step: give readers their own connection pool.

I split the single *sql.DB into two pools. A write pool with one connection (serializing all mutations, eliminating SQLITE_BUSY errors) and a read pool with multiple connections for concurrent SELECT queries.

The key was making this invisible to callers. Instead of updating every file that touches the database, I wrote a DB wrapper that routes by method: Exec and Begin go to the writer, Query and QueryRow go to the reader pool. Every package just calls db.GetDB() and the routing happens automatically. Zero changes to business logic.

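// DB routes statements to one of two pools over the same SQLite file.
type DB struct {
    writer *sql.DB // single connection: serializes all mutations
    reader *sql.DB // several connections: concurrent SELECTs under WAL
}

// Exec always goes to the writer pool.
func (d *DB) Exec(query string, args ...any) (sql.Result, error) {
    return d.writer.Exec(query, args...)
}

// Query always goes to the reader pool.
func (d *DB) Query(query string, args ...any) (*sql.Rows, error) {
    return d.reader.Query(query, args...)
}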

This also required some iteration. The first attempt was slower due to a bug where pooled connections weren't getting their PRAGMA settings. Once fixed:

Cores	WAL Only	WAL + Pool Split	Improvement
4	49.8 iter/s	57.0 iter/s	+14%
6	54.3 iter/s	76.1 iter/s	+40%
8	50.2 iter/s	88.3 iter/s	+76%
unlimited	45.9 iter/s	101.4 iter/s	+121%

Where WAL alone plateaus and regresses, the pool split keeps scaling. At 500 virtual users over 60 seconds, the pool split delivered 3.5x the throughput of the main branch with 59-78% latency reduction across all endpoints. Zero errors on both configurations.

The read pool sweet spot was 4 connections. More than that floods the writer with contention when all those concurrent reads finish simultaneously and try to write. The auto-calculation min(available CPUs, 4) with a floor of 2 covers most cases.
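The sizing rule is a clamp. Autentico's real code is Go and may differ; the TypeScript below is only an illustration, and the function name is hypothetical.

```typescript
// min(available CPUs, 4) with a floor of 2.
function readPoolSize(availableCpus: number): number {
  return Math.max(2, Math.min(availableCpus, 4));
}

console.log(readPoolSize(1)); // 2
console.log(readPoolSize(3)); // 3
console.log(readPoolSize(16)); // 4
```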

What Shipped in 2.0

Two changes made it into the release:

WAL mode, enabled by default. Free performance for every deployment.

Read/write connection pool split, transparent to users. The server auto-tunes the read pool size based on available CPUs.

Verifico didn't ship. The benchmarks on the Ryzen showed it wasn't solving a real bottleneck, so there was no reason to add the complexity. The code is there if the need ever materializes on constrained hardware, but for now it's a solution waiting for a problem.

What I Learned

Profiling tells the truth, but only about the machine you're sitting at. I should have known better. In my early years I spent time writing x86 assembly with FASM, where you learn that certain instructions cost more clock cycles than others and that two CPUs at the same clock speed can have very different real-world performance thanks to pipeline optimizations, L1/L2/L3 cache differences, and branch prediction. I knew hardware isn't uniform. What I didn't expect was that the scaling behavior would change. I assumed that if adding worker cores improved throughput on one machine, it would improve throughput on another, maybe at different absolute numbers but with the same shape. Instead, the Ryzen's faster per-core bcrypt performance shifted the bottleneck entirely. The curve wasn't the same shape at a different scale. It was a different curve.

The boring fix usually wins. WAL mode is in the SQLite documentation. Connection pooling is a well-understood pattern. Together they more than doubled throughput. Neither required novel architecture.

Build the optimization, then question it. I don't regret building Verifico. The design process (working through CQRS, Postgres, gRPC, mTLS, landing on the simplest thing) was valuable, and it works for its intended use case. But I should have validated the assumption on more than one machine before committing to it.

Don't benchmark at low concurrency and call it done. Some of the intermediate results at 100 virtual users looked promising for approaches that fell apart at 200. Always test at your target load.

Autentico is an open-source OAuth 2.0 / OpenID Connect identity provider. Version 2.0 is coming soon.

Why your drawing app uses 2% CPU when you're not using it

Eugene Yakhnenko — Sun, 19 Apr 2026 18:45:14 +0000

A measured comparison of Figma, tldraw, Excalidraw, and Skedoodle, and the architectural choice that makes the difference.

Open your browser. Go to any drawing or whiteboarding app: tldraw, Excalidraw, Figma, whatever you use. Put it on a blank canvas. Don't touch anything.

Open your browser's task manager.

That app is probably using 1–3% CPU right now. Not the browser as a whole. Not all your tabs combined. Just that one page, sitting there, doing nothing visible. Figma alone burns 3.49%. Multiply across every "modern web app" tab you keep open and you start to understand why your fan spins up when you're not using the computer.

I wanted to know where that CPU was going. I built a Playwright rig, loaded tldraw, Excalidraw, and Figma on a blank canvas, and sampled CPU for 30 seconds across 5 runs. I also measured a drawing app of my own, Skedoodle.

Here's the result:

Three apps pay a tax. One sits at the measurement noise floor. This post is about where each one's idle CPU goes, and then about a more surprising finding underneath: rendering architecture isn't what determines active CPU.

Methodology in one paragraph

I built a small Playwright-based perf framework that opens each app in Chromium, sits on a blank canvas for 30 seconds, and samples Chrome DevTools Protocol Performance.metrics every 500ms. The reported "CPU%" is ΔTaskDuration / wall_clock, attributed to the page, not the whole browser process. Median of 5 runs; whiskers on the chart are min–max. Machine: Microsoft Surface, Intel Core i5-1035G7 (4 cores, 8 threads), Arch Linux. The full methodology and raw data are in the repo. pnpm --filter skedoodle-perf baseline reproduces every number in this post.
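For readers who want the arithmetic spelled out, here is a minimal sketch of the CPU% calculation and the median across runs. The type and function names are hypothetical and not taken from the repo; it assumes TaskDuration is cumulative and that both values are in seconds.

```typescript
// One sample: cumulative task time and the wall-clock time it was taken at.
interface MetricSample {
  taskDurationSec: number;
  wallClockSec: number;
}

// CPU% = delta(TaskDuration) / delta(wall clock), as a percentage.
function cpuPercent(first: MetricSample, last: MetricSample): number {
  const wall = last.wallClockSec - first.wallClockSec;
  if (wall <= 0) {
    throw new Error("samples must be in chronological order");
  }
  return ((last.taskDurationSec - first.taskDurationSec) / wall) * 100;
}

// Median of the per-run results.
function median(values: number[]): number {
  if (values.length === 0) {
    throw new Error("median of an empty list is undefined");
  }
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}
```

For example, 0.45 seconds of task time accumulated over a 30-second window works out to about 1.5% CPU.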

Why tldraw ticks every frame

tldraw ships a component called TickManager. It does what the name suggests: runs forever. Here's the relevant code:

start() {
  this.isPaused = false
  this.cancelRaf?.()
  this.cancelRaf = throttleToNextFrame(this.tick)
  this.now = Date.now()
}

@bind
tick() {
  if (this.isPaused) { return }
  const now = Date.now()
  const elapsed = now - this.now
  this.now = now
  this.editor.inputs.updatePointerVelocity(elapsed)
  this.editor.emit('frame', elapsed)
  this.editor.emit('tick', elapsed)
  this.cancelRaf = throttleToNextFrame(this.tick)   // re-arm for next frame
}

Every frame (60Hz), tick runs and re-schedules itself via requestAnimationFrame. No dirty-flag guard. It ticks regardless of whether anything changed.

And the tick does real work. It updates pointer velocity even when the pointer hasn't moved. It drains an event queue that might be empty. It fires 'frame' and 'tick' to anyone listening. The listeners do their own work: viewport and camera animation checks, scribble handlers (no-op when idle, but still a function call), and a PerformanceManager._onFrame that computes getCulledShapes() on every frame.

None of that is wasted effort inside tldraw's model. Pointer velocity enables gesture recognition and flick handling. Frame events drive camera tweens and smooth zoom-to-fit. Culling keeps active-draw fast at scale. If you want those features, something has to run the tick. Multiply 60 ticks per second by roughly a quarter of a millisecond of work each and you get about 15ms of work per second, or ~1.5% CPU, as the price of having them.

Skedoodle doesn't have most of those features. So it doesn't tick.

Excalidraw's React reconciler

Excalidraw does not run a perpetual rAF. Their throttleRAF helper is pull-based; it only schedules when called:

export const throttleRAF = <T extends any[]>(fn: (...args: T) => void) => {
  let timerId: number | null = null;
  let lastArgs: T | null = null;
  const scheduleFunc = () => {
    timerId = window.requestAnimationFrame(() => {
      timerId = null;
      const args = lastArgs;
      lastArgs = null;
      if (args) { fn(...args); }
    });
  };
  // ...
};

And their AnimationController explicitly stops itself when there's nothing left to animate:

if (AnimationController.animations.size === 0) {
  AnimationController.isRunning = false;
  return;   // loop stops here when idle
}

So where does Excalidraw's 1.18% idle CPU go? Into React's reconciler. Excalidraw stores hover state, current tool, and pointer position in React component state. Each setState triggers componentDidUpdate, which runs a ~160-line prev/next diff, commits to its store, fires onChange listeners, and toggles theme classes.

This is a reasonable design choice. Keeping interaction state in React gives you the normal React ergonomics: declarative rendering, hooks, standard event handling. The cost is that any internal state change wakes the reconciler, and at idle there's still enough internal churn (hover ticks, mouse-move handlers, periodic state sync) to keep it awake on a steady cadence.

It's a different shape of problem from tldraw's: not a perpetual rAF, but a steady drip of React work, landing at roughly the same cost.

Figma is a different kind of cost

Figma's idle CPU is the highest of the four, and most of it isn't rendering. At idle, the page is running: a websocket keepalive to Figma's backend, CRDT bookkeeping for the file you have open, cursor-presence logic for other collaborators, autosave timers, and the usual authenticated-product chatter (telemetry, experiment assignment, analytics).

The perf runs for the other three apps (tldraw OSS, Excalidraw, Skedoodle) were local and unauthenticated. None of them were paying for collab infrastructure at measurement time. Figma doesn't offer a local-only mode, so its number reflects a shipping collaborative product rather than a fair architectural peer. Keep it in the chart as a ceiling on "what a production collaborative whiteboard costs idle," not as a comparison against the other three.

The rest of this post is about the other three.

Skedoodle's 0.09%: event-driven rendering

Skedoodle's idle CPU is near zero because nothing polls. The canvas doesn't repaint unless a user event changed the scene. State mutations don't wake a reconciler because canvas state isn't in React. There's no equivalent of tldraw's TickManager.

Two choices enforce this.

The renderer's internal loop is disabled. Skedoodle is built on Two.js, a thin 2D renderer. Two.js can drive its own internal requestAnimationFrame loop through the autostart option. With autostart: true, Two.js calls update() every frame for you, forever. If it were on, Skedoodle would measure about the same as tldraw.

Skedoodle's canvas setup explicitly keeps it off. client/src/canvas/canvas.hook.tsx:

return new Two({
  autostart: false,
  fitted: true,
  width: container.clientWidth,
  height: container.clientHeight,
  type: twoType,
}).appendTo(container);

With autostart: false, Two.js never calls update() on its own. Something in the application has to call it.

The application only calls update() on user events. Skedoodle's entire render-scheduling layer is this one method on the canvas manager:

throttledTwoUpdate = () => {
  const updateFrequency = useOptionsStore.getState().updateFrequency;

  if (updateFrequency === 0) {
    this.two?.update?.();
  } else {
    if (!this._throttledUpdate || this._lastFrequency !== updateFrequency) {
      this._lastFrequency = updateFrequency;
      this._throttledUpdate = throttle(() => {
        this.two?.update?.();
      }, updateFrequency);
    }
    this._throttledUpdate();
  }
};

Three things to notice.

First, the function reads updateFrequency from a Zustand store on every call. That's deliberate: the user can change the throttle rate from the UI at runtime, and the next invocation picks up the new value without re-instantiation.

Second, if updateFrequency is 0, the call goes through immediately with no throttling. This is "High Performance" mode in the UI. For interactions like dragging a single shape or editing a bezier handle, unthrottled gives the most responsive feel and costs nothing extra, because the call rate is already bounded by the pointer event rate.

Third, for any non-zero frequency, Skedoodle builds a throttled wrapper using lodash's throttle (leading + trailing edges) and caches it. The cache is keyed on the frequency value, so changing the throttle rate invalidates and rebuilds; otherwise the same wrapper is reused across calls.
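The caching behavior described here can be sketched in isolation. This is a minimal standalone version, not Skedoodle's code: makeScheduler, getFrequency, and throttleFn are hypothetical names, and the throttle function is injected so the sketch does not depend on lodash.

```javascript
// Sketch: rebuild the throttled wrapper only when the frequency changes.
// `throttleFn(fn, wait)` is assumed to behave like lodash's throttle.
function makeScheduler(update, getFrequency, throttleFn) {
  let cached = null;
  let lastFrequency = null;

  return function schedule() {
    const frequency = getFrequency(); // read on every call, so runtime changes are picked up
    if (frequency === 0) {
      update(); // unthrottled path: call straight through
      return;
    }
    if (!cached || lastFrequency !== frequency) {
      lastFrequency = frequency;
      cached = throttleFn(update, frequency); // cache keyed on the frequency value
    }
    cached();
  };
}

// Usage with a stand-in throttle that only counts how often a wrapper is built.
let builds = 0;
let updates = 0;
let frequency = 16;
const countingThrottle = (fn) => { builds++; return fn; };
const schedule = makeScheduler(() => { updates++; }, () => frequency, countingThrottle);

schedule();      // builds the wrapper
schedule();      // reuses it
frequency = 33;
schedule();      // frequency changed, wrapper rebuilt
frequency = 0;
schedule();      // bypasses the wrapper entirely
```

Injecting the throttle keeps the cache logic testable without timers; in an application you would pass lodash's throttle instead of the counting stand-in.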

Who calls this? Tool handlers do, after they've mutated scene state — the brush tool on every pointer-move, the shape tool after adjusting dimensions, the pointer tool on selection changes. Zustand store mutations that affect scene state call it. Nothing else. When the user is sitting still, throttledTwoUpdate() isn't called, two.update() doesn't run, and the canvas doesn't repaint.
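To make the difference concrete, here is a small back-of-the-envelope sketch that counts update() calls over an idle period under the two scheduling models. The function names are hypothetical, and the 60 FPS tick rate is an assumption for illustration, not a measurement of any of the apps.

```javascript
// Sketch: how many update() calls happen while the user does nothing?
function countTickUpdates(idleSeconds, fps) {
  // A perpetual tick calls update() once per frame whether or not anything changed.
  return idleSeconds * fps;
}

function countEventDrivenUpdates(events) {
  // An event-driven loop calls update() at most once per scene-changing event.
  return events.length;
}

const idleSeconds = 10;
const tickCalls = countTickUpdates(idleSeconds, 60);
const eventCalls = countEventDrivenUpdates([]); // no user input while idle

console.log(tickCalls, eventCalls); // 600 0
```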

That's the 0.09%. It isn't a trick. It's what's left when you remove everything that was polling.

The throttle rate (10, 30, 60, or 120 FPS, or "High Performance" for unthrottled) is exposed in the Settings panel as "Update Frequency." That last detail matters: it's evidence that the event-driven model is product surface, not accidental. A thick-library architecture couldn't offer that knob, because the library owns its own tick rate.

The tie that's the real story

Here's what happens when everyone's actually drawing: a synthesized 15-second pointer trace (Archimedean spiral, 60 Hz, 902 events) replayed identically across all four apps.
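A trace like this is straightforward to synthesize. The sketch below is an illustration rather than the perf framework's generator: the spiral constants, the canvas center, and the event shape are hypothetical, and splitting the 902 events into one pointerdown, 900 pointermoves (15 s at 60 Hz), and one pointerup is my guess at how that count decomposes.

```javascript
// Sketch: synthesize a pointer trace along an Archimedean spiral (r = a + b * theta).
function spiralTrace({ seconds = 15, hz = 60, cx = 400, cy = 300, a = 5, b = 4, turns = 6 } = {}) {
  const moveCount = seconds * hz; // 900 pointermove samples

  const point = (i) => {
    const theta = (i / moveCount) * turns * 2 * Math.PI;
    const r = a + b * theta;
    return {
      x: cx + r * Math.cos(theta),
      y: cy + r * Math.sin(theta),
      t: (i * 1000) / hz, // timestamp in milliseconds
    };
  };

  const events = [{ type: 'pointerdown', ...point(0) }];
  for (let i = 1; i <= moveCount; i++) {
    events.push({ type: 'pointermove', ...point(i) });
  }
  events.push({ type: 'pointerup', ...point(moveCount) });
  return events;
}

const trace = spiralTrace();
console.log(trace.length); // 902
```

Because the trace is a pure function of its parameters, replaying it gives every app exactly the same input.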

Skedoodle and tldraw land 0.23 percentage points apart on median CPU across five runs. That's noise-floor territory, and it's the finding that most changed how I think about this.

These two apps have nothing architecturally in common on the rendering side. Skedoodle uses Two.js's SVG renderer. tldraw built its own React-plus-canvas rendering stack from scratch. Totally different choices. Same cost.

Which means active-draw CPU is not determined by which rendering library you pick. It's determined by whether your app does anything else while rendering. tldraw ticks every frame and drains the queue; Skedoodle runs its throttled update. Both do roughly the same amount of shape-drawing work per user event. Same number.

Excalidraw is ~8 points higher, almost certainly rough.js doing stroke roughening on every pointer event. Figma saturates a CPU core: every stroke routes through a WASM renderer, a CRDT, autosave persistence, and telemetry. Different cost structure entirely.

Put together: idle cost and active cost are different problems. Idle is about what your app does when no one's asking it to do anything, which is architectural. Active is about how much work each user interaction triggers, which is workload-dependent. The first is a design choice. The second is mostly inherent.

Picking Two.js over tldraw won't make drawing faster. Picking an event-driven architecture over a perpetual tick will make your app disappear when no one's using it.

The tax

There's a reason most drawing apps use something thicker than Two.js. With a thin renderer you write your own interaction layer: selection, hit-testing, handles, undo/redo, snapping, the whole surface. In Skedoodle's case that's roughly 5,000 lines of application code that exists specifically because the library didn't provide it. tldraw, Fabric, and Konva give you all of that.

Other costs worth being honest about:

You re-discover bug classes. An early version of Skedoodle had a selection/hover layering bug where selection chrome rendered beneath newly drawn shapes; a mature transformer library would have prevented it by owning the chrome layer. Similar categories (pointer capture during fast drags, z-ordering of rotation handles against content, hit-testing that treats stroked paths as fills, coordinate math that breaks at extreme zoom levels) are things a library like tldraw or Konva has already worked through. With a thin renderer, you encounter them yourself, usually after they ship.
Upgrade path is closer to the metal. When the library has a bug, it's more likely to be your problem.

The flip side of "own your render loop" is "own your interaction stack." It's not free, just moved.

When not to do this

Three workloads where event-driven rendering stops helping:

Scenes that legitimately need every frame. Tween systems, physics, particle effects, animated cursors. If the scene changes without user input, an event-driven loop has nothing to trigger it. You need a tick. tldraw's model fits.
Thousands of continuously moving shapes. When redraws are expensive and frequent, the cost isn't in whether to call update(), it's in whether the renderer can batch, dirty-rect, or cull. Thin renderers without those primitives stop helping.
Transformer-heavy interaction surfaces. If your product is defined by multi-select, rotation handles, and snapping across transformed groups, the LOC cost of building that yourself is large and front-loaded. tldraw's transformer is legitimately good. Buy it; don't rebuild it.

Skedoodle's workload is sparse updates driven by user input. Event-driven fits. If it were a particle simulator, I'd want every frame to fire and I'd run a TickManager too.

Try it yourself

The perf framework is committed alongside the Skedoodle source, with a written-down methodology and a 5-run baseline. pnpm install && pnpm --filter skedoodle-perf baseline reproduces every number in this post. Figma needs a one-time auth capture, and the README walks through it.

The architectural choice here is event-driven rendering: nothing polls, renders happen because something changed. autostart: false enforces it at the Two.js boundary. throttledTwoUpdate enforces it inside the application. Neither alone is the whole story; the combination is.

Your drawing app doesn't have to use 2% CPU when you're not using it. It uses that much because of a choice.

Source: github.com/eugenioenko/skedoodle. The perf framework lives in the perf/ directory; baseline numbers in perf_results.md; the research notes that became this post in article_notes.md.

I Found 5 Security Bugs in My OAuth2 Provider on My First Try (With an MCP Security Tool)

Eugene Yakhnenko — Thu, 09 Apr 2026 04:51:02 +0000

I built Autentico, a self-contained OAuth 2.0 / OpenID Connect identity provider in Go. I took spec compliance seriously. Every code path is annotated with the RFC section it implements, I passed the OpenID Foundation conformance suite, and I ran OWASP ZAP scans against it. I thought I was in good shape.

Then I connected go-appsec/toolbox to Claude Code, browsed my app for ten minutes, and found five vulnerabilities (including a HIGH severity issue) on my very first session with the tool. I had almost no prior experience with security testing.

Here's how that happened.

The foundation: RFC annotations and conformance testing

When I built Autentico, I wanted to do things by the book. Every return path, every validation check, every error response references the exact spec section that mandates it:

// RFC 7009 §2.1: "The authorization server first validates the client
// credentials (in case of a confidential client)."
authenticatedClient, err := client.AuthenticateClientFromRequest(r)

// RFC 6749 §10.4: refresh token MUST be bound to the client it was issued to;
// presenting a refresh token issued to a different client MUST be rejected.
if authToken.ClientID != "" && request.ClientID != "" && authToken.ClientID != request.ClientID {

// RFC 7662 §2.2: REQUIRED. Whether the token is currently active.
Active bool `json:"active"`

I reviewed 10 RFCs and specs across the OAuth2 and OIDC ecosystem, tracking every MUST, SHOULD, and MAY requirement in compliance tables. I ran the OpenID Foundation conformance suite (oidcc-basic-certification-test-plan) and passed. I had unit tests, e2e tests, functional tests, and browser tests.

This gave me confidence in the spec compliance of the implementation. But spec compliance and security are not the same thing.

Traditional scanning: OWASP ZAP

I ran an OWASP ZAP API scan (both authenticated and unauthenticated) against 169 URLs. The results were useful but shallow:

Missing OWASP security headers (X-Frame-Options, CSP, Permissions-Policy, etc.)
A couple of endpoints returning 500 instead of 404 for nonexistent resources

I fixed everything in one PR. Final ZAP results: 0 FAIL, 112 PASS, 4 WARN (all informational). Clean bill of health from the scanner.

ZAP tests what it can see from the outside: headers, status codes, common injection patterns. It doesn't understand OAuth flows, MFA logic, or token lifecycle. For that, I needed something different.

Enter go-appsec/toolbox

go-appsec/toolbox is an MCP (Model Context Protocol) server designed for collaborative security testing between humans and AI agents. It's not a scanner; it's a workbench. The idea is simple:

You handle the browser: log in, navigate the app, trigger the flows you want tested
The AI agent watches the traffic through a proxy, analyzes it, and suggests or executes attacks

The tool provides MCP tools for traffic capture (proxy_poll), request replay with modifications (replay_send), JWT inspection (jwt_decode), cookie analysis (cookie_jar), out-of-band testing (oast_create), and more. You connect it to Claude Code (or any MCP-compatible client), and the AI agent uses these tools to probe your application while you drive the browser.

Setup

The setup took minutes:

Start the toolbox MCP server with proxy on port 8080
Configure the browser to proxy through it
Connect the MCP server to Claude Code via claude mcp add
Browse the application to capture traffic

I captured about 112 proxy flows covering OAuth authorization, token exchange, admin CRUD, account management, and MFA enrollment. Then I asked Claude to start testing.

What I found: 5 vulnerabilities on my first try

I want to emphasize: this was my first time using the tool. I had no prior pentesting experience and very little knowledge of how to use go-appsec/toolbox effectively. I was learning the workflow as I went. Despite that, the collaboration between the tool and the AI agent produced real, actionable findings.

The standout: unauthenticated token introspection (HIGH)

The /oauth2/introspect endpoint returned full token metadata (active status, scopes, user ID, and claims) without requiring any client credentials. Anyone who had a token value could check whether it was active and extract its claims.

The AI agent found this by using request_send to POST to the introspect endpoint with no authorization header. The response came back 200 OK with active: true and full claim data. This is the kind of finding that demonstrates the tool's workflow: it captured the legitimate introspect request during browsing, stripped the credentials, replayed it, and confirmed the server didn't enforce authentication. Fixed within minutes during the same session.
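For readers who want to see the shape of the fix: RFC 7662 requires the introspection endpoint to demand some form of authorization. The sketch below is a minimal illustration in JavaScript, not Autentico's Go code. The in-memory stores, the client ID and secret, and the request shape are all hypothetical.

```javascript
// Sketch: refuse introspection unless the caller authenticates as a client.
const clients = new Map([['web-app', 's3cret']]); // hypothetical client registry
const tokens = new Map([['tok-123', { scope: 'openid profile', sub: 'user-1' }]]);

function authenticateClient(authorizationHeader) {
  if (!authorizationHeader || !authorizationHeader.startsWith('Basic ')) return null;
  const decoded = Buffer.from(authorizationHeader.slice(6), 'base64').toString('utf8');
  const separator = decoded.indexOf(':');
  if (separator < 0) return null;
  const id = decoded.slice(0, separator);
  const secret = decoded.slice(separator + 1);
  // Simplification: a real server would use a constant-time comparison.
  return clients.get(id) === secret ? id : null;
}

function introspect(request) {
  const clientId = authenticateClient(request.headers.authorization);
  if (!clientId) {
    return { status: 401, body: { error: 'invalid_client' } };
  }
  const record = tokens.get(request.body.token);
  // Unknown or expired tokens reveal nothing beyond active: false.
  if (!record) return { status: 200, body: { active: false } };
  return { status: 200, body: { active: true, ...record } };
}

const anonymous = introspect({ headers: {}, body: { token: 'tok-123' } });
const basic = 'Basic ' + Buffer.from('web-app:s3cret').toString('base64');
const authenticated = introspect({ headers: { authorization: basic }, body: { token: 'tok-123' } });
console.log(anonymous.status, authenticated.body.active); // 401 true
```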

The other four

The remaining findings were two MEDIUM and two LOW severity issues:

PKCE not enforced for public clients. The agent used replay_send on a captured authorize flow with code_challenge removed. The server accepted it.
Refresh tokens not rotated on use. The agent hit the token endpoint twice with the same refresh token. Both succeeded.
CSRF error leaked internal config. A POST without the CSRF cookie returned the environment variable name and value in the error message.
Stored XSS in client_name (no exploitable render context). A <script> tag was accepted in the admin API, though the output was HTML-encoded.
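Refresh token rotation, the second item above, is simple to state in code: every successful redemption spends the presented token and issues a new one. The sketch below is a minimal in-memory illustration, not Autentico's implementation. The store, the token format, and the function names are hypothetical.

```javascript
// Sketch: single-use refresh tokens with rotation.
const refreshTokens = new Map(); // token -> { userId, used }
let counter = 0;

function issueRefreshToken(userId) {
  const token = `rt-${++counter}`; // placeholder; real tokens must be unguessable
  refreshTokens.set(token, { userId, used: false });
  return token;
}

function redeemRefreshToken(token) {
  const record = refreshTokens.get(token);
  if (!record || record.used) {
    return { ok: false, error: 'invalid_grant' };
  }
  record.used = true; // the presented token is now spent
  return { ok: true, refreshToken: issueRefreshToken(record.userId) };
}

const first = issueRefreshToken('user-1');
const firstUse = redeemRefreshToken(first);   // succeeds and returns a new token
const secondUse = redeemRefreshToken(first);  // replay of the spent token fails
console.log(firstUse.ok, secondUse.ok); // true false
```

A stricter variant would also revoke the whole token family when a spent token is replayed, since that replay suggests the token leaked.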

What passed (23 tests)

Importantly, the tool also confirmed a lot of things were solid: redirect URI validation (6 bypass variants attempted), JWT alg:none confusion, scope escalation, admin authorization enforcement, username enumeration timing, SQL injection, mass assignment, and account lockout logic. All held up.

What the toolbox author found: more issues, deeper logic bugs

After I shared my experience, the toolbox author ran their own session against Autentico. With deeper knowledge of both the tool and security testing methodology, they found five additional vulnerabilities, all logic-level bugs that require understanding how OAuth and MFA flows interact:

MFA enforcement bypass (#172)

This one is the best example of what AI-assisted testing can find that scanners can't. MFA enforcement had four independent gaps that reinforced each other:

The password grant issued tokens without any MFA challenge, even when require_mfa was enabled
Pre-MFA sessions weren't invalidated when the policy changed
An attacker with a bearer token could rotate a user's TOTP secret without presenting a valid OTP code
MFA could be disabled with just the account password, no TOTP code required

No single gap is obvious in isolation. Finding them requires reasoning about the interaction between authentication flows, token grants, and policy enforcement. A scanner sees endpoints; the AI agent understood the MFA lifecycle.
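Two of these gaps come down to checks that belong in one central place. The sketch below shows what such checks might look like. It is an illustration only, not how Autentico fixed them: the policy and user shapes, the verifyTotp callback, and the mfa_required error code are hypothetical.

```javascript
// Sketch: central policy checks for two of the gaps listed above.
function authorizePasswordGrant(policy) {
  // The password grant has no step for an interactive MFA challenge,
  // so refuse it outright while MFA is required.
  if (policy.requireMfa) return { ok: false, error: 'mfa_required' };
  return { ok: true };
}

function authorizeMfaChange(user, submittedCode, verifyTotp) {
  // Rotating or disabling TOTP must prove possession of the current secret,
  // not just possession of a bearer token or password.
  if (user.totpSecret && !verifyTotp(user.totpSecret, submittedCode)) {
    return { ok: false, error: 'invalid_otp' };
  }
  return { ok: true };
}

// Usage with a stub verifier that accepts a single fixed code.
const stubVerify = (secret, code) => code === '123456';
const enrolledUser = { totpSecret: 'JBSWY3DPEHPK3PXP' };

const grant = authorizePasswordGrant({ requireMfa: true });
const badChange = authorizeMfaChange(enrolledUser, undefined, stubVerify);
const goodChange = authorizeMfaChange(enrolledUser, '123456', stubVerify);
console.log(grant.ok, badChange.ok, goodChange.ok); // false false true
```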

Password grant authenticating deactivated users (#174)

The AuthenticateUser() function didn't check deactivated_at, while every other user lookup in the codebase did. A soft-deleted user could authenticate via the password grant and receive fresh tokens indefinitely. The admin who deleted the user would have no idea. This is a one-line fix (AND deactivated_at IS NULL) but finding it requires noticing the inconsistency across query patterns.
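One way to make this class of inconsistency harder to reintroduce is to route every authentication path through a single lookup that applies the filter. The sketch below illustrates the idea with an in-memory array standing in for the users table. It is not Autentico's Go code, and the field names and hash comparison are simplified placeholders.

```javascript
// Sketch: one shared lookup so every auth path applies the deactivation filter.
const users = [
  { username: 'alice', passwordHash: 'hash-a', deactivatedAt: null },
  { username: 'bob', passwordHash: 'hash-b', deactivatedAt: '2026-01-01T00:00:00Z' },
];

function findActiveUser(username) {
  // Equivalent in spirit to: WHERE username = ? AND deactivated_at IS NULL
  return users.find((u) => u.username === username && u.deactivatedAt === null) ?? null;
}

function authenticateUser(username, passwordHash) {
  const user = findActiveUser(username);
  // Simplification: a real server verifies the password against a salted hash.
  if (!user || user.passwordHash !== passwordHash) return null;
  return user;
}

console.log(authenticateUser('alice', 'hash-a') !== null); // true
console.log(authenticateUser('bob', 'hash-b') !== null);   // false: soft-deleted
```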

Admin API audience validation bypass (#183)

The admin API only checked that the user had the admin role. Any token belonging to an admin user was accepted regardless of which client issued it. A malicious app registered with the IdP could trick an admin into authorizing it, then replay that token against the admin API for full control. The fix requires tokens to also carry the admin audience in their aud claim, which only tokens issued through the admin client do by default.
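The difference between a role-only check and a role-plus-audience check fits in a few lines. This is a minimal sketch on already-verified token claims, not Autentico's middleware. The admin-api audience value and the roles claim name are hypothetical.

```javascript
// Sketch: require both the admin role and the admin audience.
const ADMIN_AUDIENCE = 'admin-api'; // hypothetical audience value

function authorizeAdminRequest(claims) {
  // Per the JWT spec, aud may be a single string or an array of strings.
  const audiences = Array.isArray(claims.aud) ? claims.aud : [claims.aud];
  const hasRole = Array.isArray(claims.roles) && claims.roles.includes('admin');
  const hasAudience = audiences.includes(ADMIN_AUDIENCE);
  return hasRole && hasAudience;
}

// An admin's token issued through some other client is rejected.
console.log(authorizeAdminRequest({ roles: ['admin'], aud: ['third-party-app'] })); // false
console.log(authorizeAdminRequest({ roles: ['admin'], aud: ['admin-api'] }));       // true
```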

The other ones:

Empty aud claim in access tokens (#171). Tokens had "aud": [], and the admin middleware didn't validate azp, so a token from any client worked on the admin API.
Missing Cache-Control: no-store headers (#173). Sensitive API responses (user lists, settings, sessions) could be cached by browsers and proxies.
Blind SSRF in federation discovery (#177). The HTTP client followed redirects to internal/loopback addresses when fetching federated IdP discovery documents.
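For the SSRF item, the usual defense is to check the resolved address of every hop, redirects included, against a blocklist before connecting. The sketch below covers only the address check for IPv4 literals. It is an illustration, not Autentico's fix: the function name is hypothetical, and IPv6 and DNS resolution are deliberately left out.

```javascript
// Sketch: reject IPv4 literals in loopback, private, and link-local ranges.
function isBlockedIPv4(address) {
  const parts = address.split('.');
  const wellFormed =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!wellFormed) return true; // not a plain IPv4 literal: fail closed

  const [a, b] = parts.map(Number);
  return (
    a === 0 ||                            // "this network" 0.0.0.0/8
    a === 10 ||                           // private 10.0.0.0/8
    a === 127 ||                          // loopback 127.0.0.0/8
    (a === 169 && b === 254) ||           // link-local 169.254.0.0/16
    (a === 172 && b >= 16 && b <= 31) ||  // private 172.16.0.0/12
    (a === 192 && b === 168)              // private 192.168.0.0/16
  );
}

console.log(isBlockedIPv4('127.0.0.1'));       // true
console.log(isBlockedIPv4('169.254.169.254')); // true
console.log(isBlockedIPv4('8.8.8.8'));         // false
```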

The takeaway

I tested my OAuth2 provider with three approaches:

Approach	What it found	Depth
OIDC Conformance Suite	Spec compliance gaps	Protocol-level
OWASP ZAP	Missing headers, error handling	Surface-level
go-appsec/toolbox + AI	10 vulnerabilities including auth bypass, MFA gaps, SSRF	Logic-level

The traditional tools did their job. They confirmed my implementation followed the specs and had standard security headers in place. But the logic-level vulnerabilities (the ones that actually matter for an identity provider) only surfaced when an AI agent could reason about how the pieces fit together.

What surprised me most is that I didn't need to be a security expert to get value from this. The MCP collaboration model means the agent brings security testing knowledge and methodology, while you bring the application context (which flows matter, what the admin UI does, how MFA is supposed to work). Together, you cover ground that neither could alone.

Ten minutes of browsing. First time using the tool. Five findings, three fixed on the spot. That's a pretty compelling return on investment for any developer who cares about the security of what they're building.

All 10 findings across both sessions have been fixed and are tracked in the Autentico GitHub repository. All thanks to go-appsec/toolbox.

Why I Built an Identity Provider in Go and SQLite

Eugene Yakhnenko — Thu, 26 Mar 2026 23:44:40 +0000

When I set out to build Auténtico, my primary goal was to create a fully-featured OpenID Connect Identity Provider where operational simplicity was the first-class design principle.

Identity infrastructure is notoriously complex. A typical self-hosted setup involves a database server, a cache tier like Redis, a worker queue, and the identity service itself. When I needed a lightweight OpenID Connect (OIDC) server to run on a small 2GB RAM VPS, I realized the existing landscape was either operationally exhausting or structurally flawed for my specific needs.

This is the story of how (and why) I built Auténtico, a self-contained, single-binary OIDC provider backed by SQLite that removes the ceremony from identity management.

The Itch: Finding the Right Lightweight IdP

My journey started because I was researching and implementing a frontend OIDC library for product needs at my company. That scratched an itch, and I evolved it into a functional backend OIDC protocol server in Go.

Months later, when I needed a lightweight Identity Provider, I evaluated the popular options but quickly hit roadblocks:

Casdoor: I didn't like how they treated private data. Their demo instances recycle accounts every 5 minutes, making it impossible to truly test account deletion.
PocketId: This is a fantastic tool, but it had a critical UX flaw for my needs: it is passkey-only by default.

While passkeys are the future, the current ecosystem is heavily fragmented. If a user is on an older OS or a restrictive browser, a passkey-only IdP completely locks them out.

The Antidote: Zero-Ceremony Architecture

I decided to convert my OIDC protocol server into a full IdP, ensuring that every architectural decision was evaluated against a single question: does this reduce or increase the operational burden on the person running this?

Auténtico removes the entire traditional identity stack:

Single Binary: The entire IdP runs as one Go binary.
Embedded SQLite: There is no separate database server. The entire state lives in one file. Eliminating the external database removes connection pool tuning, credential rotation, and network partitions.
No External Infrastructure: No Redis, no Postgres, no message queues. Background cleanup goroutines automatically purge expired tokens, sessions, and auth codes.
Embedded UIs: Both the Admin dashboard (React/Ant Design) and the user-facing Account UI (React/Tailwind) are compiled directly into the binary using go:embed. There are zero separate frontend deployments.

Flexibility Over Dogma: Solving the Passkey Trap

To solve the hardware and OS fragmentation issues I experienced with passkeys, I ensured Auténtico wouldn't trap operators into a single authentication path.

Instead, Auténtico offers three distinct authentication modes that are switchable at runtime without restarting the server:

password
password_and_passkey
passkey_only

If you deploy passkey_only and discover your users' specific browser combinations are failing, you can instantly flip a setting in the Admin UI to fall back to passwords. For robust security without passkeys, it includes standard fallback methods like TOTP (with in-browser QR enrollment) and Email OTP. For users with modern browsers, it fully supports hardware-backed FIDO2 authentication and even allows first-login registration in one seamless flow.

The "Deliberately Un-clever" Architecture & The AI Accelerator

To make this work, the codebase had to be deliberately un-clever. I designed a strict vertical-slice architecture where each package (like pkg/login or pkg/token) owns its exact slice of functionality with a predictable structure:

model.go
handler.go
service.go
Database CRUD

This strictness had a massive secondary benefit: it created the perfect environment for AI. Because I spent the time establishing this blueprint, I reached a tipping point where I could hand off the boilerplate. AI agents seamlessly followed the patterns to generate the CRUD operations and rapidly write over 700 tests (hitting 80% coverage) precisely because the architectural constraints were so rigid.

The Scale Ceiling (And Why It Doesn't Matter)

The immediate pushback to this architecture is always: "SQLite doesn't scale."

I am intentionally honest about the scale ceiling: SQLite serializes writes. Auténtico is not designed for active-active multi-region deployments or massive enterprise horizontal scaling.

However, let's look at the math:

Concurrency	Error rate	Login p95	Token p95	Assessment
20 VUs	0%	86ms	54ms	Comfortable — imperceptible to users
100 VUs	0%	611ms	647ms	Supported — fully functional
500 VUs	0%	3.36s	3.89s	Degraded — users feel the wait

Performance tests with k6 show the system degrades gracefully via SQLite's busy timeout—queueing requests and adding latency rather than throwing errors.

For most teams running internal tools, small-to-mid-sized apps, or self-hosted environments, trading infinite horizontal scaling for zero operational overhead is absolutely the right choice.

Conclusion

Operational simplicity does not mean protocol simplicity.

Auténtico ships with:

OIDC Discovery — publishes /.well-known/openid-configuration so relying parties auto-configure without hardcoding endpoints
JWK Set — exposes public signing keys at /.well-known/jwks.json for independent token verification
RS256 JWT Signing — asymmetric signing; the private key never leaves the IdP
OAuth2/OIDC protocol: Implements the OIDC protocol
Admin UI: For admins to manage clients, users, and sessions
Account UI: For users to manage their profile
Swagger OpenAPI docs: Publishes API specification docs

If you are a small team, an indie developer, or just someone who wants to deploy an Identity Provider without taking on a second job as a sysadmin, sometimes the best architecture is the one you barely have to think about.

The Lightweight JavaScript Framework Renaissance of 2026

Eugene Yakhnenko — Tue, 24 Mar 2026 01:43:52 +0000

Best JavaScript Frameworks in 2026: For AI and Humans

The JavaScript framework landscape in 2026 looks different from what it did three years ago. Not because React disappeared or Vue lost relevance, but because something shifted in how code gets written. AI coding assistants now author a significant portion of frontend code. That changes the evaluation criteria in ways the existing framework rankings haven't caught up with yet.

This article covers both the established giants and the growing category of lightweight libraries that are having a quiet renaissance. The goal is to help you pick the right tool given who, or what, will be writing most of your code.

The New Evaluation Criteria

The classic framework checklist covered performance, ecosystem, learning curve, and job market. Those still matter. But in 2026, two new questions belong on that list:

How much does this framework cost an AI to get right?

Every framework has footguns. The question is whether those footguns require deep framework-specific knowledge to avoid, or whether they're the kind of mistakes any developer (human or AI) would catch on a first read. Frameworks with fewer implicit rules produce more reliable AI-generated code.

Can you run it without a build pipeline?

For quick prototypes, internal tools, and AI-generated demos, the ability to drop a script tag and go is genuinely valuable. Not every project needs a bundler, and forcing one adds friction that compounds when an AI agent is setting up the environment.

The Heavy Framework Tax

React, Vue, Angular, and Svelte dominate the ecosystem. They dominate for real reasons: massive communities, mature tooling, rich ecosystems, and years of production hardening. None of what follows is an argument to abandon them.

But they carry weight.

React requires understanding hooks ordering rules, useEffect dependency arrays, stale closure behavior, and the distinction between controlled and uncontrolled components. These are not obvious from the surface syntax. AI agents generating React code make predictable, repeatable mistakes in all of these areas. The community has documented them extensively, which means LLMs have seen the patterns, but also means the footguns are well-established and hard to train away.

Vue 3 is more approachable. The Composition API is clean, and <script setup> reduces boilerplate significantly. The reactivity model is intuitive. But the template compiler is a black box, the distinction between ref and reactive trips up new users (human and AI alike), and the ecosystem split between Options API and Composition API adds cognitive overhead.

Angular is the most structured of the group. Structure helps, but Angular's DI system, decorators, zone.js, and now the signals migration mean there is a lot of framework-specific knowledge required before you can write idiomatic code. It remains the right choice for large enterprise teams where that structure is the point.

Svelte compiles away at build time, which is elegant. But the compiler is the framework. You cannot use Svelte without a build step, the template syntax is non-standard HTML, and the reactivity model (especially the $: syntax in Svelte 4 and the runes in Svelte 5) requires knowing Svelte specifically. An AI agent that hasn't seen enough Svelte in training will produce subtly wrong reactive code.

None of this is fatal. Millions of applications run on these frameworks and will continue to. But there is a real cost, and it is higher when an AI is holding the pen.

The Light Library Renaissance

A different category of tools has been growing steadily: small, focused libraries that add reactivity and component structure on top of the browser's native model rather than replacing it. They tend to share a few traits:

No build step required for core functionality
Templates that stay close to HTML, or use standard JS tagged literals
Signal-based or proxy-based reactivity with simple rules
Minimal framework-specific concepts to learn

In 2026, this category is no longer a niche. It is a legitimate choice for a wide range of projects, and in many cases the better choice for AI-generated code.

Arrow.js

Arrow.js (@arrow-js/core) is one of the most technically interesting entries in this space. It was built by Standard Agents and has an architecture that reads like a deliberate response to framework complexity.

The core model is simple: reactive state is a plain object wrapped in reactive(), and templates are JavaScript tagged literals using the html tag.

import { reactive, html } from '@arrow-js/core'

const state = reactive({ count: 0 })

html`
  <div>
    <p>Count: ${() => state.count}</p>
    <button @click="${() => state.count++}">+</button>
  </div>
`(document.getElementById('app'))

A few things stand out here. Reactive expressions in templates are just arrow functions. Static values render once; functions are tracked and re-run when dependencies change. That distinction is explicit in the syntax, not hidden behind a compiler.

The reactivity model is proxy-based rather than signal-based. This means you mutate properties directly (state.count++), and array mutations like .push() trigger updates without requiring a reassignment. For developers coming from plain JavaScript, this feels natural.
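To show how proxy-based tracking works in general, here is a minimal sketch in plain JavaScript. It is not Arrow's implementation: the reactive and watch functions below are simplified stand-ins written for this example, and they handle only top-level properties, not nested objects or arrays.

```javascript
// Sketch of proxy-based reactivity: reads subscribe, writes notify.
let activeEffect = null;

function reactive(target) {
  const subscribers = new Map(); // property -> Set of effects that read it
  return new Proxy(target, {
    get(obj, key, receiver) {
      if (activeEffect) {
        if (!subscribers.has(key)) subscribers.set(key, new Set());
        subscribers.get(key).add(activeEffect);
      }
      return Reflect.get(obj, key, receiver);
    },
    set(obj, key, value, receiver) {
      const result = Reflect.set(obj, key, value, receiver);
      // Copy before iterating: re-running an effect re-subscribes it.
      for (const effect of [...(subscribers.get(key) ?? [])]) effect();
      return result;
    },
  });
}

function watch(fn) {
  const run = () => {
    activeEffect = run;
    try { fn(); } finally { activeEffect = null; }
  };
  run(); // the first run records which properties fn depends on
}

const state = reactive({ count: 0 });
const rendered = [];
watch(() => rendered.push(`Count: ${state.count}`));
state.count++;
state.count++;
console.log(rendered); // [ 'Count: 0', 'Count: 1', 'Count: 2' ]
```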

Components are defined with component(), a factory function that runs once per slot and returns a template. Local state, side effects, and cleanup all live inside the factory.

const Counter = component((props) => {
  const local = reactive({ clicks: 0 })
  return html`<button @click="${() => local.clicks++}">
    ${() => local.clicks}
  </button>`
})

Arrow's package ecosystem is notably complete. Beyond the core, it ships SSR with @arrow-js/ssr, client-side hydration with @arrow-js/hydrate, hydration boundary recovery, and a QuickJS/WASM sandbox (@arrow-js/sandbox) for safely running user-authored Arrow code in the browser. That last one is unusual and speaks to an AI-native use case: letting agents generate and execute code without granting them access to the host page.

The tradeoff is that templates live in JavaScript. The html tagged literal approach means your markup is a JS string, not a file an HTML tool understands natively. That is a different philosophy from the HTML-first camp, and whether it is a feature or a limitation depends on the project.

Kasper.js

Kasper.js takes the opposite position on the templates-vs-JavaScript question. Templates are valid HTML. Directives are standard HTML attributes prefixed with @. Any developer who knows HTML can read a Kasper template without knowing Kasper.

<template>
  <div>
    <p>Count: {{count.value}}</p>
    <button @on:click="increment()">+</button>
    <div @if="count.value > 10">You clicked a lot.</div>
  </div>
</template>

<script>
import { Component, signal } from 'kasper-js'

export class Counter extends Component {
  count = signal(0)
  increment() { this.count.value++ }
}
</script>

The reactivity model uses signals with explicit .value reads and writes. This is more verbose than Arrow's proxy approach, but it makes reactive reads visible in both code and templates. An AI agent reading a Kasper template knows exactly which values are reactive and when they update.
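The signal model with explicit .value access can be sketched in a few lines of plain JavaScript. This is an illustration of the general technique, not Kasper's implementation: the signal and effect functions here are simplified stand-ins written for this example.

```javascript
// Sketch of a signal: reading .value subscribes, writing .value notifies.
let currentEffect = null;

function signal(initial) {
  let value = initial;
  const subscribers = new Set();
  return {
    get value() {
      if (currentEffect) subscribers.add(currentEffect);
      return value;
    },
    set value(next) {
      if (Object.is(next, value)) return; // unchanged value: nothing to do
      value = next;
      for (const run of [...subscribers]) run();
    },
  };
}

function effect(fn) {
  const run = () => {
    const previous = currentEffect;
    currentEffect = run;
    try { fn(); } finally { currentEffect = previous; }
  };
  run();
}

const count = signal(0);
const log = [];
effect(() => log.push(count.value));
count.value++;   // triggers the effect
count.value = 1; // same value, no re-run
console.log(log); // [ 0, 1 ]
```

Every reactive read goes through .value, which is what makes dependencies visible at the call site.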

Components are classes. This is a deliberate choice for AI compatibility: classes have well-defined ownership, explicit methods, and a lifecycle that maps directly to familiar OOP patterns. There are no hook rules, no dependency arrays, no rules about where you can call what. onMount, onChanges, onRender, onDestroy: the lifecycle is what it says it is.

Cleanup is handled through a single AbortController that every component owns. this.watch(), this.effect(), this.computed(), and all @on: event listeners are released automatically when the component is destroyed. No return () => cleanup(). No forgetting to unsubscribe.
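The underlying platform feature is that addEventListener accepts an AbortSignal, and aborting the signal removes every listener registered with it. The sketch below shows that mechanism with a hypothetical BaseComponent class. It is not Kasper's Component class, only an illustration of the pattern.

```javascript
// Sketch: one AbortController per component releases every listener at once.
class BaseComponent {
  #controller = new AbortController();

  listen(target, type, handler) {
    // Listeners registered with a signal are removed when the signal aborts.
    target.addEventListener(type, handler, { signal: this.#controller.signal });
  }

  destroy() {
    this.#controller.abort();
  }
}

const bus = new EventTarget();
const component = new BaseComponent();
let clicks = 0;
component.listen(bus, 'click', () => { clicks++; });

bus.dispatchEvent(new Event('click')); // handled
component.destroy();
bus.dispatchEvent(new Event('click')); // ignored: the listener was released
console.log(clicks); // 1
```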

The expression evaluator is worth noting: it is a custom recursive-descent parser, not eval and not new Function. This means Kasper templates work under strict Content Security Policies, which matters for enterprise and regulated environments. The parser is more capable than it might sound: it covers the full practical range of JavaScript expressions, including arrow functions, optional chaining, nullish coalescing, object and array literals with spread, typeof, instanceof, postfix and prefix operators, and a pipeline operator (|>). The only meaningful gaps compared to full JavaScript are statement-level constructs like async/await, for loops, and switch, none of which belong in a template expression anyway.
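To give a feel for the technique, here is a tiny recursive-descent evaluator for arithmetic over a scope object. It is a sketch of the general approach and far smaller than what the paragraph describes. It is not Kasper's parser, and the grammar (numbers, identifiers, + - * /, parentheses) is chosen only for illustration.

```javascript
// Sketch: evaluate an expression against a scope without eval or new Function.
function evaluate(source, scope = {}) {
  // Simplification: characters outside this token set are silently skipped.
  const tokens = source.match(/\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*/()]/g) ?? [];
  let position = 0;
  const peek = () => tokens[position];
  const next = () => tokens[position++];

  // expression := term (('+' | '-') term)*
  function parseExpression() {
    let value = parseTerm();
    while (peek() === '+' || peek() === '-') {
      const operator = next();
      const right = parseTerm();
      value = operator === '+' ? value + right : value - right;
    }
    return value;
  }

  // term := factor (('*' | '/') factor)*
  function parseTerm() {
    let value = parseFactor();
    while (peek() === '*' || peek() === '/') {
      const operator = next();
      const right = parseFactor();
      value = operator === '*' ? value * right : value / right;
    }
    return value;
  }

  // factor := number | identifier | '(' expression ')'
  function parseFactor() {
    const token = next();
    if (token === undefined) throw new SyntaxError('Unexpected end of expression');
    if (token === '(') {
      const value = parseExpression();
      if (next() !== ')') throw new SyntaxError('Expected ")"');
      return value;
    }
    if (/^\d/.test(token)) return Number(token);
    if (/^[A-Za-z_]/.test(token)) {
      if (!Object.prototype.hasOwnProperty.call(scope, token)) {
        throw new ReferenceError(`${token} is not defined`);
      }
      return scope[token];
    }
    throw new SyntaxError(`Unexpected token ${token}`);
  }

  const result = parseExpression();
  if (position < tokens.length) throw new SyntaxError(`Unexpected token ${peek()}`);
  return result;
}

console.log(evaluate('count * 2 + (limit - 1)', { count: 3, limit: 10 })); // 15
```

Because identifiers resolve only against the scope object, the expression cannot reach globals, which is the property that matters under a strict Content Security Policy.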

The no-build-step story is genuine. One CDN import (16KB gzipped) and you have signals, a router, slots, and lazy loading. The Vite plugin adds single-file .kasper components on top, but it is optional.

<script type="module">
  import { App, Component, signal } from 'https://cdn.jsdelivr.net/npm/kasper-js/dist/kasper.min.js'

  class Counter extends Component {
    count = signal(0)
  }
  Counter.template = `<button @on:click="count.value++">{{count.value}}</button>`

  App({ root: document.body, entry: 'counter', registry: { counter: { component: Counter } } })
</script>

Kasper also ships an llms.txt at kasperjs.top/llms.txt, a machine-readable reference file specifically for AI agents. It covers the full API surface in a compact format designed for context windows, which reflects where the framework sees the ecosystem going.

Others Worth Knowing

Alpine.js is the simplest entry point in the light library category. Directives live directly on HTML elements as attributes (x-data, x-show, x-on:click). There is no component system, no build step, and very little to learn. It is excellent for adding interactivity to server-rendered pages. It is not the right tool for building SPAs.

Lit comes from Google and is built on Web Components. Templates use tagged literals like Arrow, and reactivity is property-based. Lit components are real custom elements, which means they work in any framework or no framework. The tradeoff is that Web Component conventions (shadow DOM, attribute reflection, property vs attribute distinctions) add complexity that pure library approaches avoid.

Solid.js is a compiler-based framework like Svelte, but its output is fine-grained reactive updates with no virtual DOM. Performance is exceptional. The JSX surface looks like React, which helps with adoption, but the mental model is fundamentally different: components run once, and reactivity is tracked through signal reads. Solid is worth learning if performance is a primary concern and you are comfortable with a build step.

Petite-Vue is a distribution of Vue designed for progressive enhancement. It is small, requires no build step, and works well when you need Vue's template syntax on an existing server-rendered page. It is not a full SPA framework.

Comparison at a Glance

Framework	Build Required	Reactivity	Template Style	AI-Friendly
React	Yes	Hooks	JSX	Moderate
Vue 3	Optional	Proxy/Ref	HTML + compiler	Moderate
Angular	Yes	Signals/Zone	HTML + compiler	Low
Svelte 5	Yes	Runes	HTML + compiler	Moderate
Arrow.js	No	Proxy	JS tagged literals	High
Kasper.js	No	Signals	Valid HTML	High
Alpine.js	No	Proxy	Inline HTML	High
Lit	No	Properties	JS tagged literals	High
Solid	Yes	Signals	JSX	Moderate

How to Choose

Use React, Vue, or Angular when you are joining an existing team, building a product that needs a large ecosystem, or hiring for a team that already knows the framework. The community, tooling, and hiring pool are real advantages that lightweight alternatives cannot match yet.

Use Svelte or Solid when bundle size and runtime performance are primary constraints and you are comfortable with a compiler in the pipeline.

Use Arrow.js when you want the smallest possible runtime, prefer JavaScript-centric templates, need SSR with hydration out of the box, or are building tooling where the sandbox package is relevant.

Use Kasper.js when HTML-first templates matter (for readability, CSP compliance, or AI-generated code), when you want class-based components with automatic cleanup, or when a no-build-step option has real value for your workflow.

Use Alpine.js when you have server-rendered HTML and want to add interactivity without touching the build pipeline.

Use Lit when Web Components interoperability is a requirement.

Conclusion

The right framework in 2026 is still context-dependent. The established four are not going anywhere, and for many teams they remain the correct answer.

But the light library category has matured. Arrow.js and Kasper.js in particular are not toys or experiments: they are complete, well-tested solutions with clear architectural philosophies. They are simpler by design, not by omission. And in an era where AI agents write a growing share of frontend code, simpler-by-design has compounding returns.

The best framework is the one your team, and your tools, can use correctly. In 2026, that calculation includes AI as a member of the team.

Building a JavaScript Framework (and Failing Twice at Reactivity)

Eugene Yakhnenko — Mon, 23 Mar 2026 01:46:01 +0000

About five years ago, I didn't set out to build a framework I'd use in production.

I just wanted to understand them.

I had already written parsers and interpreters before, so I knew the mechanics: tokenization, ASTs, execution. But frameworks felt different. They weren't just about parsing code; they were about state, updates, and keeping the UI in sync. Reactivity was the part I didn't understand. So I decided to build one from scratch.

I started with the pieces I knew: a scanner, a parser, a JavaScript interpreter, and an HTML template parser. After a while, I had a working system: a small component model and a template engine that could render real views. It looked like a framework.

But it was missing the one thing that actually makes a framework feel alive: reactivity.

<template>
  <h1>{{count.value}}</h1>
  <div class="actions">
    <button @on:click="count.value -= 1">-</button>
    <button @on:click="count.value += 1">+</button>
  </div>
</template>

<script>
  import { signal, Component } from "kasper-js";

  export class Counter extends Component {
    count = signal(0);
  }
</script>

<style>
  h1 {
    font-size: 2rem;
  }
</style>

The Part That Failed Twice

I tried implementing reactivity early on. It didn't work. I tried using Proxy, I tried just using a render() function, and it was not clicking. There were other parts of the framework I struggled with as well, but reactivity left an imprint.

Later, I tried again. This time I got something running, but it was fragile. It only worked for a single component. Child updates broke. State invalidation was inconsistent. It looked like reactivity, but you couldn't trust it.

So I dropped it again.

Coming Back to It

The project sat dormant for a long time, almost abandoned. Other projects took over. Life moved on. After a few years away, I returned to it. This time, I approached it differently. Instead of trying to "add reactivity," I focused on correctness first, testing everything, and simplifying assumptions.

With the help of AI agents, I rebuilt the system around signals.

That changed everything.

Once signals were in place, the architecture became much simpler: no virtual DOM, no diffing, direct updates to real DOM nodes, fine-grained reactivity. But the real breakthrough wasn't just the model. It was the process.
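For readers who have not seen the model before, a signal can be sketched in a few dozen lines. This is a minimal illustration of the general pattern, not the framework's actual code: the Signal class and the signal() and effect() helpers below are simplified assumptions, and a string array stands in for the real DOM nodes that would be updated.

```typescript
type Effect = () => void;

// The effect currently running, if any. Reading a signal while an
// effect is active subscribes that effect to the signal.
let activeEffect: Effect | null = null;

class Signal<T> {
  private current: T;
  private readonly subscribers = new Set<Effect>();

  constructor(initial: T) {
    this.current = initial;
  }

  get value(): T {
    if (activeEffect) this.subscribers.add(activeEffect);
    return this.current;
  }

  set value(next: T) {
    if (Object.is(next, this.current)) return; // unchanged: nothing to update
    this.current = next;
    // Re-run only the effects that actually read this signal.
    Array.from(this.subscribers).forEach((run) => run());
  }
}

function signal<T>(initial: T): Signal<T> {
  return new Signal(initial);
}

function effect(fn: Effect): void {
  const run: Effect = () => {
    const previous = activeEffect;
    activeEffect = run;
    try {
      fn();
    } finally {
      activeEffect = previous;
    }
  };
  run(); // first run collects the dependencies
}

// Usage: the array stands in for a text node being updated in place.
const count = signal(0);
const rendered: string[] = [];

effect(() => {
  rendered.push(`count is ${count.value}`);
});

count.value = 1;
count.value = 1; // same value, so the effect does not re-run
count.value = 2;

console.log(rendered); // ["count is 0", "count is 1", "count is 2"]
```

There is no tree to diff: each effect knows exactly which signals it read, so a write touches only the code that depends on it.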

600 Tests Later

I started adding tests. Then more tests. Then hundreds more. With AI assistance, I reached 600+ test cases. At that point, something unexpected happened: the AI couldn't generate any new meaningful tests. Everything obvious and most non-obvious cases were already covered.

The tests were meaningful. It felt complete.

But it wasn't. Of course it wasn't. Just because 600+ tests pass doesn't mean your system has no bugs.

The Real Test Wasn't Tests

The codebase looked solid. The tests passed. But there was still a problem: no one had actually used the framework to build real apps.

So I tried something different. Instead of writing apps manually, I asked AI agents to build them.

And they failed immediately.

Not because the framework was broken, but because the AI didn't know how to use it. This was a surprising moment. The system worked, but it wasn't understandable.

The Missing Piece: Documentation for AI

That's when I introduced something modern: llms.txt.

A dedicated, structured specification designed for AI agents. Not marketing docs. Not tutorials. Just syntax, rules, constraints, and examples. Think of it as a "principal engineer version" of the API.

Then I started a loop: give the AI the spec, ask it to build an app, observe where it fails, update the spec, repeat.

After a few iterations, something remarkable happened. The AI started generating full apps on the first try: todo apps, CRUD interfaces, Kanban boards, tree views, infinite scroll, even Game of Life. All working.

Article about my experience with llms.txt

A Surprising Insight About AI

At one point, the AI became very confident about a "necessary" architectural change. It proposed a redesign that would require around 100 lines of changes. We tried it. It failed repeatedly.

After stepping back and analyzing the problem, the real fix was 5 lines of code.

That moment stuck with me. AI can be incredibly helpful, but it can also confidently overcomplicate problems.

When Tests Stop Helping

With 600+ tests, the system looked stable. But once the AI started generating real applications, new edge cases appeared: subtle rendering issues, lifecycle timing problems, data edge cases that no unit test would have caught in isolation.

So I kept going. Built more apps (shopping cart, dashboards, editors, product listing, interactive tables with pagination), fed failures back into the system, and added more tests. Real usage found things that testing alone never would.

What Actually Made the Framework Stable

Looking back, it wasn't one thing. It was the combination of:

A simple reactive model (signals)
Relentless testing (600+ cases)
Real-world usage (apps, not just tests)
AI as both a developer and a user

The Unexpected Win

One design choice I made years ago turned out to be critical: the template syntax was valid HTML. Originally, this was just for better syntax highlighting. But later, it made the framework significantly more AI-friendly. No custom grammar. No ambiguity. Just HTML with extensions.

What I'd Do Differently

If I started today, I would design the reactive model first, write tests earlier (a lot earlier), treat AI as a first-class user from day one, and create a machine-readable spec alongside human docs.

Where It Ended Up

After years of on-and-off work, multiple failures, and hundreds of tests, the framework is stable. Not because it's perfect, but because it survived repeated redesigns, real usage, and constant pressure from both humans and machines.

Building the framework wasn't the hardest part. Making it correct, usable, and understandable for both humans and AI was the real challenge.

Try it out!

Learn more about kasper.js at:

We Had to Write Docs for AI: llms.txt Changed Everything

Eugene Yakhnenko — Mon, 23 Mar 2026 01:37:14 +0000

Most developers write documentation for humans.

While building my JavaScript framework, I ran into a problem I didn't expect: the framework worked, but AI couldn't use it. Not "wasn't perfect." Not "made small mistakes." It completely failed to build even basic apps correctly unless it had the source code of the framework available.

The Moment Things Broke

After years of work, I finally had a stable system: a custom scanner, parser, interpreter, a template engine with components, a signal-based reactivity system, and around 600 tests covering edge cases. I thought I was done.

So I tried something simple: "Build a todo app using this framework."

What I got back looked confident, but was completely wrong. Wrong syntax. Wrong mental model. Invented features that didn't exist.

This wasn't a bug in the framework. It was a documentation failure.

README Is Not Enough Anymore

Traditional documentation is designed for humans: narrative explanations, gradual onboarding, examples mixed with storytelling.

AI doesn't work like that. It doesn't "read" docs. It pattern-matches and guesses. So when the documentation is incomplete, ambiguous, or too prose-heavy, AI fills in the gaps. Confidently. Incorrectly.

The Solution: llms.txt

The solution was simple in hindsight: treat AI like a strict compiler, not a reader.

I created a new file: llms.txt. Not marketing docs. Not tutorials. Just raw, explicit specification.

The rules were strict:

No prose. No storytelling, no explanations. Only syntax, rules, and constraints.

No ambiguity. There's a big difference between:

You can use @if for conditional rendering.

and:

@if="condition"
- condition must be a valid JS expression
- evaluates to truthy/falsy
- false removes node from DOM

Complete surface area. All directives, template expressions, components, lifecycle hooks, signal behavior, everything explicitly defined. Nothing implied.

Minimal but real examples:

<ul>
  <li @each="item in items" @key="item.id">{{ item.name }}</li>
</ul>

Reading the Docs + Source Code

Even with llms.txt, the AI couldn't just guess everything. It needed to read a lot of source code, inspect function signatures, understand how signals propagate, see how component lifecycle worked. Only then could it map the spec to the actual implementation and generate working apps.

Building apps wasn't magic. It was AI + spec + code comprehension.

The Iteration Loop

I didn't just hand over the file and hope. The loop looked like this:

Give AI the current llms.txt and source access
Ask it to build a real app (todo, kanban, etc.)
Observe failures
Fix the spec
Repeat

A few things became clear along the way.

Missing features aren't always obvious. At one point, AI kept trying to use @keydown.enter. I had never documented it, but the framework already supported it. The fix was to update the spec, not the code.

Ambiguity is worse than missing features. Undocumented features lead to confident guesses. Vaguely documented features lead to confident wrong guesses. Explicit rules always win.

AI exposes your own blind spots. It suggested massive architectural rewrites: redesign scope tracking, refactor core systems. All seemed convincing. The result: 100 lines of changes, none of which worked. The real fix? Five lines of code. AI can be very persuasive about the wrong solution.

When It Finally Clicked

After a few iterations of refining llms.txt and reading source code, AI could reliably generate todo apps, Kanban boards, tree views, infinite scroll, and Game of Life (first try), fully working, following spec.

Real apps also exposed edge cases that 600 unit tests never would: shopping carts, form wizards, markdown editors, live dashboards. The tests covered everything within a known model. Real usage kept expanding the model.

Two Types of Documentation

There are now two distinct audiences for docs:

Human docs: explain concepts, tell the story, teach mental models.

AI docs (llms.txt): define rules, eliminate ambiguity, maximize correctness.

Both are necessary. They serve completely different purposes and shouldn't be conflated.

The Unexpected Payoff

One design decision made early on turned out to help here too: the template syntax was valid HTML. This meant free syntax highlighting, editor support, and, it turns out, AI-friendly defaults. The more your syntax looks like existing patterns, the less AI has to guess.

Final Thought

The hardest part wasn't building the framework. It wasn't reactivity or performance.

It was making the system understandable to something that doesn't actually understand.

We're entering a world where humans write ideas and AI writes implementations. In that world, specification becomes the product. It is not just a supplement to the code, but the thing that makes the code usable at all.

Learn more about kasper.js at:

Adding Attribute-Based Access Control to a Real-Time Collaborative App with OpenTDF

Eugene Yakhnenko — Fri, 20 Mar 2026 07:52:05 +0000

I built Skedoodle, an open-source real-time collaborative sketching app. Think a lightweight Figma for doodling: multiple users connect over WebSocket, draw on a shared infinite canvas, and see each other's cursors move in real time. It's built with React, TypeScript, Two.js for vector graphics, and Zustand for state management, with an Express backend handling persistence and real-time sync.

Building the interactive parts was the fun challenge. Throttled rendering at 60fps, path simplification algorithms to keep stroke data lean, touch support, pan and zoom on an infinite canvas, undo/redo that works across multiple collaborators. Skedoodle is a proper interactive app, not a toy demo.

But it had a glaring gap: no authorization. Authentication? Sure, users logged in via OIDC. But once you were in, you could access any sketch if you knew the ID. Think YouTube: every video is technically accessible if you have the link, even "unlisted" ones. Skedoodle had the same problem. There was no way to control who could see or edit what.

I needed to fix this. And rather than hand-roll role checks and a collaborators table, I wanted to use a proper policy engine — one that could handle the simple case today and scale to more complex scenarios without rewriting everything.

How This Project Started

This whole project started because I was working with an AI agent to generate an llms.txt for OpenTDF, a structured documentation file designed to give AI agents enough context to work with a platform. Once we had it, the obvious next step was to test it: take a real project with no authorization at all, point an agent at the llms.txt, and see if it could build a correct ABAC integration from scratch.

Skedoodle was the perfect candidate. A real collaborative app, with authentication but zero authorization. The experiment: could an AI agent, armed only with OpenTDF's llms.txt and a description of the access model I wanted, deliver a working integration?

Why OpenTDF

OpenTDF is an open-source platform maintained by Virtru that provides attribute-based access control (ABAC) alongside end-to-end encryption via the Trusted Data Format specification.

What drew me in was how lightweight the authorization integration is. OpenTDF is known for its encryption capabilities, but the ABAC engine stands entirely on its own. You don't need to encrypt anything to use it. You define policies, and the platform makes access decisions. That's exactly what I needed: a centralized policy engine that could answer "does this user have access to this sketch?" based on attributes rather than hardcoded role checks.

The ABAC model is straightforward:

You define namespaces and attributes (e.g., https://skedoodle.com/attr/sketch-access)
Each attribute has values and a rule (AnyOf, AllOf, or Hierarchy)
Subject mappings connect identity provider claims to attribute entitlements
When someone requests access, the platform evaluates their entitlements against the resource's required attributes and returns permit or deny

No SDKs to embed, no agents to deploy. It's a JSON API you call. Your app manages the data, OpenTDF manages the policy.
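To give a feel for what the three rule types mean, here is a simplified sketch of how a decision could be evaluated for a single attribute. It is not OpenTDF's implementation: the AttributeDefinition shape and the isPermitted() function are hypothetical, and the Hierarchy case assumes values are listed from highest to lowest, with a higher value covering everything below it.

```typescript
type Rule = "AnyOf" | "AllOf" | "Hierarchy";

// Hypothetical shape for illustration only.
interface AttributeDefinition {
  rule: Rule;
  // For Hierarchy, values are assumed to be ordered highest first.
  values: string[];
}

function isPermitted(
  definition: AttributeDefinition,
  required: string[], // values attached to the resource
  entitled: string[]  // values the subject is entitled to
): boolean {
  const held = new Set(entitled);

  if (definition.rule === "AnyOf") {
    // One matching value is enough.
    return required.some((value) => held.has(value));
  }
  if (definition.rule === "AllOf") {
    // Every required value must be held.
    return required.every((value) => held.has(value));
  }

  // Hierarchy: the subject's best value must rank at or above each required value.
  const rank = (value: string): number => definition.values.indexOf(value);
  const heldRanks = entitled.map(rank).filter((r) => r >= 0);
  const best = Math.min(...heldRanks); // Infinity when nothing is held
  return required.every((value) => rank(value) >= 0 && best <= rank(value));
}

// Per-sketch access, as used later in this article: one value per sketch, AnyOf.
const sketchAccess: AttributeDefinition = { rule: "AnyOf", values: ["sketch-7", "sketch-42"] };
console.log(isPermitted(sketchAccess, ["sketch-42"], ["sketch-7", "sketch-42"])); // true
console.log(isPermitted(sketchAccess, ["sketch-42"], ["sketch-7"]));              // false

// A classification-style attribute using Hierarchy.
const classification: AttributeDefinition = {
  rule: "Hierarchy",
  values: ["secret", "confidential", "public"],
};
console.log(isPermitted(classification, ["confidential"], ["secret"])); // true
console.log(isPermitted(classification, ["confidential"], ["public"])); // false
```

In the real platform this evaluation happens on the server; the application only sends the subject and the resource's attribute values and reads back permit or deny.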

The Access Model

What I wanted was straightforward:

Owner creates a sketch and always has full access
Owner can invite other users by username
Owner can remove any collaborator
Collaborators can draw on the sketch and can leave voluntarily
Collaborators cannot remove other collaborators or the owner
No public access — every sketch requires an explicit ABAC grant. Read-only public sharing could be layered on later as a separate attribute.

Simple enough for users to understand, but it needs proper enforcement at every layer: REST API, WebSocket connections, and the real-time command stream.

Building It with an AI Agent

I used Claude Code as my coding agent. The agent fetched OpenTDF's llms.txt at runtime, which gave it the architectural overview, API reference, Connect RPC URL patterns, protobuf enum values, and curl examples it needed to understand the platform.

The agent:

Read the docs and correctly chose ABAC authorization over full TDF encryption, understanding that per-command encryption would be impractical for real-time collaboration
Designed an attribute scheme (one attribute value per sketch, AnyOf rule) that maps cleanly to the sharing model
Built the entire integration: REST API, WebSocket authorization, OpenTDF service with subject mapping lifecycle, and client UI

The llms.txt gave the agent enough context to use the right API patterns without guessing — the correct RPC URL format, the exact enum values for condition operators and boolean types, the entity identifier structure for GetDecisions. I described the access model I wanted, and it delivered a working integration.

The ongoing iteration — refining the architecture, debugging access issues, removing redundant layers — was also done collaboratively with the agent, with llms.txt as the shared reference for how OpenTDF's APIs work. When we hit an issue where ABAC returned PERMIT but the app still denied access, the agent was able to trace the problem because it understood the full authorization flow from the docs.

How the Integration Works

ABAC as the Single Source of Truth

There's no collaborators table in the database. OpenTDF is the sole authority for access control. The database stores sketches, commands, and users. Who has access to what is entirely managed through OpenTDF subject mappings.

This is a deliberate design choice. Instead of maintaining a local access control table and keeping it in sync with a policy engine, the application delegates all authorization to OpenTDF. The only local concept of "role" is ownership: the Sketch table has an ownerId field. Everything else — who can access which sketch, whether a given user is permitted — comes from ABAC.

Policy Structure

On server startup, the service registers Skedoodle's policy structure with OpenTDF:

Namespace: https://skedoodle.com
Attribute: sketch-access (rule: AnyOf)

Each sketch gets its own attribute value. Subject mappings are actively managed as part of the application lifecycle:

Sketch created → register an attribute value, create a subject mapping for the owner
Collaborator invited → create a subject mapping linking the user's username to the sketch's attribute value
Collaborator removed → delete the subject mapping
Access check → call GetDecisions to verify the user has a valid entitlement

The Sharing Workflow

Three endpoints handle collaboration:

POST   /api/sketches/:id/collaborators           Owner invites by username
DELETE /api/sketches/:id/collaborators/:username  Owner removes, or user leaves
GET    /api/sketches/:id/collaborators            List who has access

When an owner invites a collaborator, the app creates a subject mapping in OpenTDF:

const result = await rpc(
  "policy.subjectmapping.SubjectMappingService",
  "CreateSubjectMapping",
  {
    attributeValueId: valueId,
    actions: [{ name: "read" }],
    newSubjectConditionSet: {
      subjectSets: [
        {
          conditionGroups: [
            {
              booleanOperator: "CONDITION_BOOLEAN_TYPE_ENUM_OR",
              conditions: [
                {
                  subjectExternalSelectorValue: ".username",
                  operator: "SUBJECT_MAPPING_OPERATOR_ENUM_IN",
                  subjectExternalValues: [username],
                },
              ],
            },
          ],
        },
      ],
    },
  }
);

This tells the platform: when a user's Keycloak .username matches, grant them the sketch's attribute value entitlement.
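The rpc helper used in these snippets is not shown in the article. A minimal sketch of what such a helper might look like follows, assuming the usual Connect convention of a JSON POST to baseUrl/Service/Method and a bearer token for authentication. The createRpc factory, the RpcConfig shape, and the injected fetchImpl are hypothetical names chosen for illustration (injecting fetch keeps the helper easy to test).

```typescript
// Minimal structural type so the helper does not depend on DOM or Node typings.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
  text(): Promise<string>;
}>;

// Hypothetical configuration; the base URL and token handling are assumptions.
interface RpcConfig {
  baseUrl: string;
  token: string;
  fetchImpl: FetchLike;
}

function createRpc(config: RpcConfig) {
  return async function rpc<T = unknown>(
    service: string,
    method: string,
    body: unknown
  ): Promise<T> {
    // Connect-style unary call: POST {baseUrl}/{package.Service}/{Method} with a JSON body.
    const response = await config.fetchImpl(`${config.baseUrl}/${service}/${method}`, {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${config.token}`,
      },
      body: JSON.stringify(body),
    });

    if (!response.ok) {
      const detail = await response.text();
      throw new Error(`${service}/${method} failed with ${response.status}: ${detail}`);
    }
    return (await response.json()) as T;
  };
}
```

In an application, fetchImpl would typically be the global fetch, and the token would come from whatever service account the backend uses to talk to the platform.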

Listing collaborators queries ListSubjectMappings and filters for mappings that match the sketch's attribute value. Removing a collaborator deletes the mapping. There's no local state to keep in sync.

Access Checks

Every protected operation — loading a sketch, fetching commands, joining a WebSocket room, saving commands — calls GetDecisions:

const result = await rpc("authorization.AuthorizationService", "GetDecisions", {
  decisionRequests: [
    {
      actions: [{ name: "read" }],
      entityChains: [
        {
          id: "user",
          entities: [{ userName: username }],
        },
      ],
      resourceAttributes: [
        {
          attributeValueFqns: [
            `https://skedoodle.com/attr/sketch-access/value/${sketchId}`,
          ],
        },
      ],
    },
  ],
});
const allowed = result.decisionResponses?.[0]?.decision === "DECISION_PERMIT";

If the platform denies access or is unreachable, the request is rejected. This is a deliberate choice — ABAC is the single source of truth, so there's no stale local copy to fall back to. In a production deployment where availability is critical, you'd want to run OpenTDF with redundancy, or introduce a short-lived decision cache as a buffer. For Skedoodle, fail-closed is the right tradeoff: denying access temporarily is better than granting it incorrectly.
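The fail-closed behavior and the optional short-lived cache can be sketched together. This is an illustration under stated assumptions rather than Skedoodle's code: createAccessChecker, the Decide callback (which would wrap the GetDecisions call above), and the injectable clock are hypothetical names.

```typescript
// Wraps the real policy call; resolves to true for PERMIT, false for DENY,
// and rejects when the platform cannot be reached.
type Decide = (username: string, sketchId: string) => Promise<boolean>;

interface CachedDecision {
  allowed: boolean;
  expiresAt: number;
}

function createAccessChecker(decide: Decide, ttlMs: number, now: () => number = Date.now) {
  const cache = new Map<string, CachedDecision>();

  return async function checkAccess(username: string, sketchId: string): Promise<boolean> {
    const key = `${username}\u0000${sketchId}`;

    // Serve a decision only while it is still fresh.
    const cached = cache.get(key);
    if (cached && cached.expiresAt > now()) return cached.allowed;

    try {
      const allowed = await decide(username, sketchId);
      cache.set(key, { allowed, expiresAt: now() + ttlMs });
      return allowed;
    } catch {
      // Fail closed: an unreachable policy engine never results in a grant.
      cache.delete(key);
      return false;
    }
  };
}
```

A cache like this trades revocation latency for availability: a revoked user may keep access for up to ttlMs unless the revocation path also clears the cached entry, so the TTL should stay short.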

WebSocket Enforcement

Real-time collaboration adds a wrinkle. You can't call a policy service on every brush stroke. The approach:

Authorize on join: call GetDecisions when a user connects
Enforce at the room level: owners and collaborators can draw, the role is set once at join time
Kick on revocation: when access is removed via the API, immediately disconnect the user

// When an owner removes a collaborator
const mappingId = await opentdfService.findSubjectMappingId(targetUsername, sketchId);
if (mappingId) {
  await opentdfService.deleteSubjectMapping(mappingId);
}

const room = rooms.get(sketchId);
if (room) {
  room.kickClientByUsername(targetUsername);
}

The client handles revocation gracefully with a dialog explaining what happened and options to go back.
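On the server side, the room only needs to remember who joined and be able to drop a user on demand. The sketch below is a simplified illustration, not Skedoodle's Room class: the ClientSocket interface, the one-connection-per-username assumption, the access-revoked message, and the 4403 close code (taken from the application-reserved 4000-4999 range) are all assumptions.

```typescript
// Minimal socket surface so the sketch does not depend on a WebSocket library.
interface ClientSocket {
  send(data: string): void;
  close(code: number, reason: string): void;
}

type Role = "owner" | "collaborator";

class Room {
  // Simplification: one connection per username.
  private readonly clients = new Map<string, { socket: ClientSocket; role: Role }>();

  // Called once per connection, after the access check has returned PERMIT.
  join(username: string, socket: ClientSocket, role: Role): void {
    this.clients.set(username, { socket, role });
  }

  // Relay a drawing command to everyone else. No policy call happens here:
  // membership in the room is the enforcement point.
  broadcast(from: string, command: string): boolean {
    if (!this.clients.has(from)) return false;
    this.clients.forEach((client, username) => {
      if (username !== from) client.socket.send(command);
    });
    return true;
  }

  // Called from the REST handler right after the subject mapping is deleted.
  kickClientByUsername(username: string): boolean {
    const client = this.clients.get(username);
    if (!client) return false;
    client.socket.send(JSON.stringify({ type: "access-revoked" }));
    client.socket.close(4403, "Access revoked");
    this.clients.delete(username);
    return true;
  }
}
```

Because a kicked user is removed from the room, any command they send afterwards is rejected without another round trip to the policy engine, and reconnecting triggers a fresh access check that now fails.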

Listing Sketches from ABAC

To show a user their sketches, the app queries both the database and OpenTDF in parallel:

const [ownedSketches, abacSketchIds] = await Promise.all([
  prisma.sketch.findMany({ where: { ownerId: req.userId } }),
  opentdfService.listSketchIdsForUser(req.username),
]);

Owned sketches come from the database. Shared sketches come from OpenTDF by iterating subject mappings and extracting sketch IDs from attribute value FQNs. The two lists are merged, deduped, and returned with roles.
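The extraction and merge steps might look roughly like this. It is a sketch, not the actual service code: sketchIdFromFqn and mergeSketchIds are hypothetical helpers, and the only thing taken from the integration itself is the FQN format used in the GetDecisions call above.

```typescript
const FQN_PREFIX = "https://skedoodle.com/attr/sketch-access/value/";

// Pull the sketch ID out of an attribute value FQN, or return null
// when the FQN belongs to some other attribute.
function sketchIdFromFqn(fqn: string): string | null {
  if (!fqn.startsWith(FQN_PREFIX) || fqn.length === FQN_PREFIX.length) return null;
  return fqn.slice(FQN_PREFIX.length);
}

interface ListedSketch {
  id: string;
  role: "owner" | "collaborator";
}

// Merge owned IDs (from the database) with shared FQNs (from subject mappings).
// Ownership wins when a sketch appears in both lists.
function mergeSketchIds(ownedIds: string[], sharedFqns: string[]): ListedSketch[] {
  const roles = new Map<string, ListedSketch["role"]>();

  ownedIds.forEach((id) => roles.set(id, "owner"));
  sharedFqns.forEach((fqn) => {
    const id = sketchIdFromFqn(fqn);
    if (id !== null && !roles.has(id)) roles.set(id, "collaborator");
  });

  return Array.from(roles.entries()).map(([id, role]) => ({ id, role }));
}

const listed = mergeSketchIds(
  ["a", "b"],
  [`${FQN_PREFIX}b`, `${FQN_PREFIX}c`, "https://example.com/attr/other/value/z"]
);
console.log(listed);
// [ { id: "a", role: "owner" }, { id: "b", role: "owner" }, { id: "c", role: "collaborator" } ]
```

The owner's own subject mapping shows up in the shared list too, which is why the merge has to dedupe and let the owner role take precedence.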

What This Shows About ABAC

This integration replaced what would typically be a collaborators join table, a set of role-checking queries, and manual sync logic — with a handful of API calls to a policy engine.

Where ABAC gets interesting is what happens next. Today Skedoodle's access model is simple: per-sketch, per-user grants. But the same infrastructure supports:

Mapping team membership to sketch access (subject mappings based on group claims instead of individual usernames)
Classification-based access (new attributes with AllOf or Hierarchy rules)
Cross-organization sharing (attribute values scoped to external identity providers)

These would be policy changes — new attributes, new subject mappings — not application code changes. The checkAccess() call stays the same.

The Timeline

The entire integration took one afternoon:

Phase	Time
Switch identity provider to Keycloak	15 min
Create Keycloak client + test users	10 min
Collaborator API + OpenTDF subject mapping lifecycle	15 min
WebSocket authorization + kick-on-revoke	15 min
Client UI (share dialog, access denied, role badges)	20 min
OpenTDF ABAC service integration	15 min
Debugging and polish	20 min

The OpenTDF integration itself was the smallest piece. Most of the work was building the sharing UX and enforcing access at the WebSocket layer. OpenTDF slotted in cleanly because it's designed to be an authorization service you call, not a framework you restructure your app around.

Key Takeaways

ABAC can be your single source of truth for access control. Instead of maintaining a collaborators table and keeping it in sync with a policy engine, Skedoodle delegates all authorization to OpenTDF. The application code doesn't contain access control logic beyond "ask OpenTDF and respect the answer."

The integration surface is small. Six API operations cover the entire authorization model, callable from any language over plain HTTP. In JavaScript, fetch is all you need.

Real-time apps need smart enforcement points. You can't call a policy service on every WebSocket message. Authorize on connect, enforce roles at the room level, and handle revocation proactively by disconnecting users as soon as their access is removed.

llms.txt makes AI-assisted integration practical. The agent built a working ABAC integration from documentation alone. Structured, machine-readable docs lower the barrier to adoption — not just for AI agents, but for any developer exploring a new platform.

ABAC scales where RBAC doesn't. Roles are fine until you need to express "users in department X with clearance level Y can access resources tagged with classification Z." That sentence maps directly to ABAC attributes. Trying to model it with roles leads to an explosion of role combinations.

Try It

The OpenTDF integration lives in a dedicated fork: skedoodle-opentdf. It includes everything you need to run the full stack locally.

If you're building an app that needs access control beyond basic ownership — especially if you want centralized policy management or the flexibility to evolve your authorization model over time — ABAC with OpenTDF is worth a look.

Kneel Before Zod!

Eugene Yakhnenko — Fri, 16 Jan 2026 02:56:00 +0000

TypeScript has changed the game for JavaScript developers by adding static type checking, but it doesn’t automatically handle data validation, especially when dealing with external sources like APIs or user input.
Let’s break down the challenges of data validation in TypeScript, explore possible solutions, and take a closer look at Zod, a powerful validation library.

Why Data Validation Matters in TypeScript

Data validation is all about making sure the data you receive is in the right format and contains the right information. This is especially important when handling external data, like API responses, user input or data from local storage. When you define types in TypeScript, they help during development, but they don’t actually enforce anything at runtime. So even if you expect an API to return a certain structure, TypeScript won’t stop it from giving you something completely different.
You've probably experienced this issue tons of times with errors like:

VM228:1 Uncaught TypeError: Cannot read properties of undefined (reading 'something')

Compile-Time vs. Runtime Gap

One of the biggest challenges in TypeScript data validation is the difference between what TypeScript checks at compile time and what actually happens at runtime. For example, when you fetch data from an API, TypeScript assumes it matches your type definitions, but in reality, there’s no guarantee.
Same issue when reading from localStorage. Even when JSON.parse() succeeds, there's no guarantee that the data has the shape you're expecting.
This gap means that without extra validation, your app could end up working with incorrect or unexpected data.

interface User {
  id: number;
  email: string;
}

const fetchUser = async (id: number): Promise<User> => {
  const response = await fetch(`/api/users/${id}`);
  const data = await response.json();
  return data; // But nothing ensures data actually matches the User interface
};

const retrieveUser = (): User | null => {
  try {
    const data = localStorage.getItem('user');
    if (data === null) return null; // Nothing stored under this key
    const user = JSON.parse(data);
    return user; // But nothing ensures data actually matches the User interface
  } catch {
    return null; // JSON.parse throws on malformed JSON
  }
};

API interfaces are contracts, and usually this is not an issue, especially if you are also the maintainer of the API.

Solutions for TypeScript Data Validation

Type Guards and Assertion Functions

TypeScript's built-in type guards provide a simple validation mechanism:
Type Guards Docs

function isUser(data: unknown): data is User {
  return (
    data !== null &&
    typeof data === "object" &&
    "id" in data &&
    typeof data.id === "number" &&
    "email" in data &&
    typeof data.email === "string"
  );
}

// Usage
const processUser = (data: unknown) => {
  if (isUser(data)) {
    // TypeScript knows data is User here
    console.log(data.email);
  } else {
    throw new Error("Invalid user data");
  }
};

This approach works but becomes unwieldy for complex objects, requiring manual implementation of validation logic.
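The other half of this section's title, assertion functions, works the same way but throws instead of returning a boolean. Here is a small sketch: assertIsUser and parseUser are illustrative names, and the User interface repeats the two-field shape from the earlier example.

```typescript
interface User {
  id: number;
  email: string;
}

// An assertion function: if it returns normally, TypeScript treats
// `data` as User for the rest of the calling scope.
function assertIsUser(data: unknown): asserts data is User {
  if (typeof data !== "object" || data === null) {
    throw new TypeError("Expected an object");
  }
  const record = data as Record<string, unknown>;
  if (typeof record["id"] !== "number") {
    throw new TypeError("Expected id to be a number");
  }
  if (typeof record["email"] !== "string") {
    throw new TypeError("Expected email to be a string");
  }
}

function parseUser(json: string): User {
  const data: unknown = JSON.parse(json);
  assertIsUser(data);
  return data; // narrowed to User, no cast needed
}

console.log(parseUser('{"id": 1, "email": "ada@example.com"}').email); // ada@example.com
```

The error messages are more specific than a plain true/false guard, but every field still has to be checked by hand, which is the same maintenance burden in a different shape.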

Zod as a Solution for TypeScript Validation

Zod is a TypeScript-first schema validation library with static type inference. It allows defining schemas that validate data at runtime while automatically inferring TypeScript types.
Zod Docs

import { z } from "zod";

const UserSchema = z.object({
  id: z.string(),
  email: z.string().email(),
});

// Extract the inferred type
type User = z.infer<typeof UserSchema>;
// { id: string; email: string }

The retrieve from local storage function would look like:

const retrieveUser = (): User | null => {
  try {
    const data = localStorage.getItem('user');
    if (data === null) return null; // Nothing stored under this key
    const user = JSON.parse(data);
    const validatedUser = UserSchema.parse(user); // Throws if the shape is wrong
    return validatedUser; // User matches the type
  } catch {
    return null;
  }
};

Pros of Zod

TypeScript-First Design

Zod was built specifically for TypeScript, resulting in excellent type inference and integration with TypeScript's type system. This enables catching type errors during development rather than at runtime.

Schema-to-Type Inference

The z.infer<typeof schema> pattern allows extracting TypeScript types directly from validation schemas, ensuring perfect alignment between validation and types.

Comprehensive Schema Options

Zod supports a wide range of validation options, from simple primitives to complex structures including objects, arrays, tuples, unions, and even functions.

Handy Zod utility: validateSchemaOrThrow

Here is a handy utility for validating data against a schema. When validation succeeds, it returns the parsed data. When it fails, it throws a single Error whose message joins all of the Zod issue messages.

import { z, ZodRawShape } from "zod";

export function validateSchemaOrThrow<T extends ZodRawShape>(
  schema: z.ZodObject<T>,
  data: unknown
): ReturnType<z.ZodObject<T>["parse"]> {
  const parsed = schema.safeParse(data);

  if (!parsed.success) {
    const error = parsed.error.issues.map(issue => issue.message).join(", ");
    throw new Error(error);
  }

  return parsed.data;
}

For example, this is how it could be used in a Next.js route handler:

export async function POST(request: NextRequest) {
  try {
    const req = await request.json();
    const credentials = validateSchemaOrThrow(LoginSchema, req);
    const authUser = await loginUserOrThrow(credentials);

    return NextResponse.json({ data: authUser }, { status: 200 });
  } catch (err: any) {
    const error = err?.message || "Unexpected login error";
    return NextResponse.json({ error, data: null }, { status: 409 });
  }
}

More info at

https://zod.dev/

Handling Tech Debt while Shipping Features

Eugene Yakhnenko — Fri, 16 Jan 2026 02:54:13 +0000

Picture this: You're halfway through building that exciting new feature everyone's been asking for. You're in the zone. The code is flowing. And then... you discover a bug. Not in your new code—in the old system your feature depends on. What do you do? Fix it now? File a ticket and move on? Pretend you didn't see it? Is it actually a bug or is it a bug in your understanding of the requirements?

If you've been there (and honestly, who hasn't?), you know this moment of choice happens constantly during development. The reality of building software isn't a clean, linear path from requirements to deployment. It's more like exploring a house where opening one door reveals three more doors you didn't know existed, and sometimes those doors are stuck.

Let's talk about how to handle this reality without burning out, missing deadlines, or letting your codebase turn into a maintenance nightmare.

Why This Matters More Than You Think

Here's a sobering fact: when you switch from working on your feature to investigating that bug, it takes your brain a good amount of time to fully get back into the zone afterward. Not the five minutes you hoped. For a team getting interrupted multiple times a day, that can add up to 10-20 hours of lost productivity every week.

And it's not just about time. Studies show that interrupted tasks take twice as long to complete and contain twice as many errors. It's a vicious cycle: poor code quality from interrupted work creates new bugs, which create more interruptions, which create more poor code.

But here's some good news: teams that handle these interruptions well don't eliminate them (that's impossible). They build systems to manage them efficiently.

First Things First: Not Everything is Urgent

The fastest way to chaos is treating every discovered issue like a five-alarm fire. Most things aren't. You need a simple way to decide what actually needs your attention right now.

Here's a framework that works:

P0/Critical: System crashes, data loss, security breaches. Drop everything.

P1/High: Significant features broken but workarounds exist. Handle this sprint or immediately after.

P2/Medium: Degraded experience but not blocking. Can wait until next sprint if needed.

P3/Low: Cosmetic issues, minor UX friction. Backlog material.

The key word here is "actually." Is this actually critical, or does it just feel urgent because you discovered it today?

A helpful trick: use RICE scoring for the gray areas. Score each issue on Reach (how many users), Impact (how badly affected), Confidence (how sure you are), and Effort (how hard to fix). Then calculate: (Reach × Impact × Confidence) / Effort. Higher scores win. This removes emotion from the decision.
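
The RICE formula above is simple enough to script. A minimal sketch, assuming a hypothetical Issue shape; the scales in the comments (impact from 0.25 to 3, confidence from 0 to 1, effort in person-days) are one common convention and not something the formula itself prescribes:

```typescript
// Hypothetical issue shape; the field scales are assumptions for illustration.
interface Issue {
  title: string;
  reach: number;      // users affected per period
  impact: number;     // e.g. 0.25 (minimal) up to 3 (massive)
  confidence: number; // 0 to 1
  effort: number;     // person-days, must be greater than zero
}

// (Reach x Impact x Confidence) / Effort
function riceScore(issue: Issue): number {
  if (issue.effort <= 0) {
    throw new RangeError("effort must be greater than zero");
  }
  return (issue.reach * issue.impact * issue.confidence) / issue.effort;
}

// Highest score first; the input array is left untouched.
function rankByRice(issues: Issue[]): Issue[] {
  return [...issues].sort((a, b) => riceScore(b) - riceScore(a));
}

const ranked = rankByRice([
  { title: "Export times out on large reports", reach: 200, impact: 2, confidence: 0.8, effort: 4 },
  { title: "Checkout button misaligned", reach: 5000, impact: 0.5, confidence: 1, effort: 1 },
]);

for (const issue of ranked) {
  console.log(`${issue.title}: ${riceScore(issue)}`);
}
```

With these made-up numbers the cosmetic checkout issue outranks the export timeout purely on reach, which is a useful reminder that the score is only as good as the estimates fed into it.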

Plan for the Unexpected (Because It Will Happen)

Here's where some teams go wrong: they plan sprints as if nothing unexpected will happen. Every hour is allocated to planned work. When interruptions inevitably arrive, the sprint explodes.

Successful teams build buffer into every sprint:

10-15% Corporate overhead: Meetings, emails, ceremonies
60-75% Planned work: Your actual features
10-15% Unplanned work: The buffer for surprises

"But that means we'll deliver less!" I hear you saying. Actually, no. You'll deliver more consistently because you're planning realistically. Some sprints, you'll have fewer interruptions and pull ahead. Others, you'll use the full buffer. Over time, it averages out, but without the constant feeling of failure.

How much buffer do you need? It depends on the team, product, and environment. Track your actual interrupt load for a few sprints and adjust accordingly.
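
Turning tracked interrupt load into a buffer size is plain arithmetic. A rough sketch, assuming you log total capacity and unplanned hours per sprint; the SprintLog shape and the sample numbers are invented for illustration:

```typescript
// Hypothetical record of one past sprint; field names are assumptions.
interface SprintLog {
  capacityHours: number;  // total team hours available in the sprint
  unplannedHours: number; // hours spent on interrupts and surprise work
}

// Share of capacity consumed by unplanned work across all logged sprints,
// rounded to a whole percentage.
function suggestedBufferPercent(history: SprintLog[]): number {
  const capacity = history.reduce((sum, s) => sum + s.capacityHours, 0);
  const unplanned = history.reduce((sum, s) => sum + s.unplannedHours, 0);
  if (capacity <= 0) {
    throw new Error("need at least one sprint with positive capacity");
  }
  return Math.round((unplanned / capacity) * 100);
}

const bufferPercent = suggestedBufferPercent([
  { capacityHours: 400, unplannedHours: 52 },
  { capacityHours: 400, unplannedHours: 38 },
  { capacityHours: 360, unplannedHours: 50 },
]);

console.log(`Reserve about ${bufferPercent}% for unplanned work`);
```

Weighting by total hours rather than averaging per-sprint percentages keeps a short sprint from skewing the result.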

The Superman Strategy: Protecting Focus Time

For teams dealing with production systems or customer support, here's a game-changer: the Superman rotation.

Instead of spreading interrupts across everyone (death by a thousand distractions), one person handles all interrupts for a set period—a week, a sprint, whatever makes sense. Everyone else gets uninterrupted focus time.

Yes, one person's productivity takes a hit. But the rest of the team's productivity increases, and the net result is usually positive. Plus, the Superman builds deep knowledge of system issues and common problems.

Keys to making this work:

Rotate fairly: Nobody should be permanently on interrupt duty
Provide backup: Have a secondary person for escalation
Be realistic: Junior developers might need help; that's okay
Give them side work: They can tackle documentation, tools, or admin tasks between interrupts

Turn Fires Into Fireproofing

The difference between reactive and proactive teams isn't that proactive teams have fewer problems. It's that they prevent the same problem from happening twice.

After any major issue, follow this pipeline:

Fix the immediate problem (the symptom)
Conduct a quick Root Cause Analysis within 24-48 hours: Why did this happen? Was it missing tests? Unclear requirements? Architecture gap?
Create a prevention artifact: A runbook, automated test, monitoring rule, or architectural change
Track and prioritize improvements: Work the highest-impact, lowest-effort ones into your tech debt time

Example: A payment processing bug blocks the team. Don't just fix it; ask why your tests didn't catch it. Should there be integration tests? Add them. Document the scenario. Set up monitoring. Now the same failure is far less likely to happen again.

Protect Your Brain: Reduce Context Switching

Even well-managed interrupts cause context switching. Here's how to minimize the damage:

Reserve focus time blocks: Many teams reserve specific hours as interrupt-free. No meetings, no Slack questions (unless production is literally on fire). Make it a team norm.

Set response time expectations:

Interrupt now (call, DM): Production down, security breach
Same-day response: Code reviews, sprint blockers (within 8 hours)
Next-day response: General questions, non-urgent bugs (within 24 hours)
Async only: Status updates, docs (no immediate response needed)

Limit work-in-progress: One or two active items per developer, maximum. Finish before starting new work. It feels slower but actually speeds things up.

The Missing Requirements Problem

Sometimes the "issue" isn't a bug—it's that you start building and realize the requirements were incomplete. This happens constantly. If you are not breaking things, you are not solving hard problems, and incomplete requirements are part of that.

Three-tier response:

Critical path clarification: If it blocks current work, pause and clarify immediately with your product owner. This should be a 30-minute conversation, not a three-day delay.
Scope decision: Is this part of the current feature? If yes, add it. If no, capture it for later.
Document for next time: Update your requirements template so this gap doesn't recur.

Don't fall into the false choice between "delay everything for perfect requirements" and "build something incomplete." Address blocking gaps now; defer the rest.

Measure What Matters

How do you know if your interrupt management is working? Track:

Cycle time: How long from issue discovery to fix? Faster is better.
Deployment frequency: Are you shipping consistently or sporadically?
Bug escape rate: What percentage of bugs reach production?
Developer satisfaction: Survey your team on focus time and stress levels.

If any of these are trending wrong, your process needs adjustment.
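
Two of these metrics, cycle time and bug escape rate, can be computed from data most issue trackers already export. A minimal sketch, assuming a hypothetical BugRecord shape with ISO 8601 timestamps; adapt the field names to whatever your tracker provides:

```typescript
// Hypothetical bug record; timestamps are ISO 8601 strings.
interface BugRecord {
  discoveredAt: string;
  fixedAt: string;
  foundInProduction: boolean;
}

const HOUR_MS = 60 * 60 * 1000;

// Mean hours from discovery to fix. Returns 0 for an empty list.
function averageCycleTimeHours(bugs: BugRecord[]): number {
  if (bugs.length === 0) return 0;
  const totalMs = bugs.reduce(
    (sum, b) => sum + (Date.parse(b.fixedAt) - Date.parse(b.discoveredAt)),
    0,
  );
  return totalMs / bugs.length / HOUR_MS;
}

// Percentage of bugs that reached production. Returns 0 for an empty list.
function bugEscapeRatePercent(bugs: BugRecord[]): number {
  if (bugs.length === 0) return 0;
  const escaped = bugs.filter((b) => b.foundInProduction).length;
  return (escaped / bugs.length) * 100;
}

const sprintBugs: BugRecord[] = [
  { discoveredAt: "2026-01-05T09:00:00Z", fixedAt: "2026-01-05T15:00:00Z", foundInProduction: false },
  { discoveredAt: "2026-01-06T09:00:00Z", fixedAt: "2026-01-07T09:00:00Z", foundInProduction: true },
];

console.log(averageCycleTimeHours(sprintBugs), bugEscapeRatePercent(sprintBugs));
```

Compare the numbers sprint over sprint rather than against an absolute target; the trend is what tells you whether the process is working.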

Common Traps to Avoid

Priority inflation: If 50% of your issues are "P0 critical," your definitions are broken. Typically, 5-10% should be P0.

Treating interrupts as planning failure: They're not. They're inevitable in live software. The question is how you handle them.

Permanent interrupt duty: Rotate fairly or you'll burn people out.

Skipping root cause analysis: Fixing the 20th payment bug without understanding why they keep happening means you're firefighting forever. Take the time to prevent recurrence.

Process creep: Don't add so much overhead that the meetings about interrupts are worse than the interrupts themselves.

Start Simple

You don't need to implement everything at once. Here is a simple four-week plan to get started:

Week 1:

Define your priority levels and share with the team
Reserve 10-15% of your sprint for unplanned work
Start a weekly 15-minute triage meeting

Week 2-3:

Try Superman rotation if your team handles interrupts
Protect time blocks as focus time
Set max 2 active items per developer

Week 4+:

Do root cause analysis on major issues
Track your metrics
Adjust based on what you learn

The Bottom Line

Building software means dealing with unexpected issues. The question is not if unexpected issues will happen (they will), but rather when.

Teams that excel at this aren't more talented or better equipped. They just accept reality and build around it: clear prioritization, reserved capacity, focused triage, root cause prevention, and protected focus time.

Your codebase will never be perfect. There will always be tech debt, bugs, and surprises. But with the right system, you can ship features, maintain quality, and keep your team healthy.

The best time to start was yesterday. The second-best time is right now.

Custom HTTP Interceptors in HLS.js for Video Streaming

Eugene Yakhnenko — Fri, 16 Jan 2026 02:52:32 +0000

When building video streaming applications, you might need to intercept HTTP requests for authentication, custom decryption, or analytics during video playback. HLS.js makes this surprisingly straightforward with custom loaders. Let's explore what HLS.js is and how to implement HTTP interceptors for video fragment loading.

What is HLS.js?

HLS.js is a JavaScript library that enables HTTP Live Streaming (HLS) playback in browsers that don't natively support it. While Safari handles HLS natively, browsers like Chrome, Firefox, and Edge need HLS.js to parse .m3u8 playlists and stream video fragments seamlessly.

The library handles:

Parsing HLS manifests (.m3u8 files)
Downloading and buffering video segments
Adaptive bitrate switching based on network conditions
Video playback coordination

Installation

Install HLS.js via npm or pnpm:

npm install hls.js
# or
pnpm add hls.js

Basic Usage

Here's a minimal React implementation:

import { useEffect, useRef } from "react";
import Hls from "hls.js";

export function HlsPlayer() {
  const videoRef = useRef<HTMLVideoElement>(null);
  const playlistUrl = "https://example.com/video.m3u8";

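  useEffect(() => {
    const video = videoRef.current;
    if (!video) return;

    if (Hls.isSupported()) {
      const hls = new Hls();
      hls.loadSource(playlistUrl);
      hls.attachMedia(video);
      // Release the player and its pending requests on unmount
      return () => hls.destroy();
    }

    // Safari plays HLS natively, so hand it the playlist directly
    if (video.canPlayType("application/vnd.apple.mpegurl")) {
      video.src = playlistUrl;
    }
  }, []);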

  return <video ref={videoRef} controls />;
}

Implementing a Custom HTTP Interceptor

The real power comes when you need to intercept fragment requests.
Create a custom loader by extending Hls.DefaultConfig.loader:

class CustomLoader extends Hls.DefaultConfig.loader {
  load(context: any, config: any, callbacks: any) {
    if (context.frag) {
      // Handle video fragments with custom logic
      fetchVideoFragment(callbacks, context);
    } else {
      // Use default loader for playlists
      super.load(context, config, callbacks);
    }
  }
}

The context.frag check distinguishes between playlist requests (handled by the default loader) and video fragment requests (where we apply custom logic).

Custom Fragment Fetcher

Here's how to fetch fragments with custom handling:

async function fetchVideoFragment(callbacks: any, context: any) {
  const start = performance.now();

  try {
    const response = await fetch(context.url, {
      // Add custom headers here if needed
      headers: {
        'Authorization': '{{ Bearer token }}',
        'X-Custom-Header': '{{Custom value}}'
      }
    });

    // fetch only rejects on network failure, so surface HTTP errors explicitly
    if (!response.ok) {
      throw new Error(`HTTP ${response.status} for ${context.url}`);
    }

    const buffer = await response.arrayBuffer();
    const end = performance.now();

    callbacks.onSuccess(
      { data: buffer, url: context.url },
      {
        loading: { start, end },
        loaded: buffer.byteLength,
        retry: 0,
      },
      context
    );
  } catch (error) {
    // hls.js 1.x expects onError(error, context, networkDetails, stats)
    callbacks.onError(
      {
        code: 0,
        text: (error as Error).message,
        type: "networkError",
      },
      context,
      null,
      { loading: { start, end: performance.now() }, retry: 0 }
    );
  }
}

This approach lets you:

Add authentication tokens to fragment requests
Decrypt encrypted video segments
Track loading performance metrics
Implement custom error handling
Apply transformations to video data before playback
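
Of the items above, tracking loading performance is the easiest to bolt on, because the fetcher already measures start, end, and byte length. A small sketch of how those values could be aggregated; FragmentLoadSample and summarizeFragmentLoads are hypothetical names and not part of HLS.js:

```typescript
// One measurement per fragment, taken from the values the fetcher already has.
interface FragmentLoadSample {
  url: string;
  startMs: number; // performance.now() before the request
  endMs: number;   // performance.now() after the body was read
  bytes: number;   // buffer.byteLength
}

interface FragmentLoadSummary {
  count: number;
  averageDurationMs: number;
  averageThroughputKbps: number;
}

function summarizeFragmentLoads(samples: FragmentLoadSample[]): FragmentLoadSummary {
  if (samples.length === 0) {
    return { count: 0, averageDurationMs: 0, averageThroughputKbps: 0 };
  }
  const totalMs = samples.reduce((sum, s) => sum + (s.endMs - s.startMs), 0);
  const totalBits = samples.reduce((sum, s) => sum + s.bytes * 8, 0);
  return {
    count: samples.length,
    averageDurationMs: totalMs / samples.length,
    // Bits per millisecond is numerically equal to kilobits per second.
    averageThroughputKbps: totalMs > 0 ? totalBits / totalMs : 0,
  };
}

const summary = summarizeFragmentLoads([
  { url: "https://example.com/seg-1.ts", startMs: 0, endMs: 500, bytes: 250_000 },
  { url: "https://example.com/seg-2.ts", startMs: 1000, endMs: 1500, bytes: 250_000 },
]);

console.log(summary);
```

In the fetcher, you would push a sample right after computing end and report the summary to your analytics backend on whatever interval suits you.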

Putting It Together

Wire up the custom loader when initializing HLS:

export function HlsPlayerHttpInterceptor() {
  const videoRef = useRef<HTMLVideoElement>(null);
  const playlistUrl = "https://example.com/video.m3u8";

  useEffect(() => {
    const video = videoRef.current;
    if (!video) return;

    if (Hls.isSupported()) {
      const hls = new Hls({
        // uses the custom loader
        loader: CustomLoader,
      });

      hls.loadSource(playlistUrl);
      hls.attachMedia(video);
      // Release the player and its pending requests on unmount
      return () => hls.destroy();
    }
  }, []);

  return <video ref={videoRef} controls />;
}

Use Cases

Custom HTTP interceptors are particularly useful for:

Protected content: Adding authentication headers to video fragment requests
DRM workflows: Decrypting video segments before playback
Analytics: Tracking fragment load times and network performance
Caching strategies: Implementing custom caching logic
A/B testing: Routing requests to different CDN endpoints
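
As an illustration of the last use case, fragment URLs can be rewritten to a different CDN host before the request is made. This is only a sketch under stated assumptions: the host names are placeholders, viewerId is a hypothetical stable identifier, and routeFragmentUrl is not an HLS.js API.

```typescript
// Placeholder CDN hosts; replace with your own endpoints.
const CDN_HOSTS = ["cdn-a.example.com", "cdn-b.example.com"];

// Small deterministic string hash (djb2 variant) so the same viewer
// always lands in the same bucket.
function hashString(value: string): number {
  let hash = 5381;
  for (let i = 0; i < value.length; i++) {
    hash = ((hash * 33) ^ value.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Swap the host of a fragment URL, keeping path and query string intact.
function routeFragmentUrl(
  fragmentUrl: string,
  viewerId: string,
  hosts: string[] = CDN_HOSTS,
): string {
  const host = hosts[hashString(viewerId) % hosts.length];
  if (host === undefined) {
    throw new Error("at least one CDN host is required");
  }
  const url = new URL(fragmentUrl);
  url.host = host;
  return url.toString();
}

const routed = routeFragmentUrl(
  "https://origin.example.com/video/seg-001.ts?token=abc",
  "viewer-42",
);
console.log(routed);
```

Inside the custom fetcher this would replace the plain context.url passed to fetch. Hashing a stable identifier rather than picking a host at random keeps each viewer on one CDN for the whole session, which makes the A/B comparison cleaner.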

DEV Community: Eugene Yakhnenko

Making OAuth Testable: Rethinking OIDC Clients in JavaScript

The real pain point

Why OIDC clients are hard to test

A different approach: treat OIDC as a protocol

Architecture shift: separate protocol from runtime

Testing OAuth without mocks

Deterministic end-to-end tests

Running tests across frameworks

What this catches that mocks don't

Tradeoffs

Takeaway

What 200 Concurrent Users Taught Me About SQLite Performance

Profiling on the Wrong Machine

Designing Verifico

It Worked

It Didn't Work

Finding the Real Bottleneck

The Boring Win: WAL Mode

The Real Scaling Win: Read/Write Pool Split

What Shipped in 2.0

What I Learned

Why your drawing app uses 2% CPU when you're not using it

Methodology in one paragraph

Why tldraw ticks every frame

Excalidraw's React reconciler

Figma is a different kind of cost

Skedoodle's 0.09%: event-driven rendering

The tie that's the real story

The tax

When not to do this

Try it yourself

I Found 5 Security Bugs in My OAuth2 Provider on My First Try (With an MCP Security Tool)

The foundation: RFC annotations and conformance testing

Traditional scanning: OWASP ZAP

Enter go-appsec/toolbox

Setup

What I found: 5 vulnerabilities on my first try

The standout: unauthenticated token introspection (HIGH)

The other four

What passed (23 tests)

What the author found: 10 more issues, deeper logic bugs

MFA enforcement bypass (#172)

Password grant authenticating deactivated users (#174)

Admin API audience validation bypass (#183)

The other ones:

The takeaway

Why I Built an Identity Provider in Go and SQLite

The Itch: Finding the Right Lightweight IdP

The Antidote: Zero-Ceremony Architecture

Flexibility Over Dogma: Solving the Passkey Trap

The "Deliberately Un-clever" Architecture & The AI Accelerator

The Scale Ceiling (And Why It Doesn't Matter)

Conclusion

Links

The Lightweight JavaScript Framework Renaissance of 2026

Best JavaScript Frameworks in 2026: For AI and Humans

The New Evaluation Criteria

The Heavy Framework Tax

The Light Library Renaissance

Arrow.js

Kasper.js

Others Worth Knowing

Comparison at a Glance

How to Choose

Conclusion

Building a JavaScript Framework (and Failing Twice at Reactivity)

The Part That Failed Twice

Coming Back to It

600 Tests Later

The Real Test Wasn't Tests

The Missing Piece: Documentation for AI

A Surprising Insight About AI

When Tests Stop Helping

What Actually Made the Framework Stable

The Unexpected Win

What I'd Do Differently

Where It Ended Up

Try it out!

We Had to Write Docs for AI: llms.txt Changed Everything