Ja

How Building a Conlang Editor Taught Me What Tagged Unions Can Actually Be in TypeScript

I'm building a text editor for constructed languages. The kind of thing where you define phonological rules, morphology tables, and interlinear glossing, and the editor interprets your grammar in real time to produce translations and grammatical scores. It is absurdly niche. I love it.

The problem is that a project like this needs an incremental parser, and TypeScript makes type-safe incremental parsing genuinely painful. You end up with sprawling discriminated unions full of ceremony, verbose type guards that the compiler can only half-verify, and switch statements that nobody trusts. I kept reaching for ADTs and functional patterns to tame it, but every library that offered them either wanted me to adopt a whole runtime or didn't compose beyond the one thing it was good at.

So I kept building aljabr. And it kept growing.

Current Aljabr logo as of v0.3.0

If you read my earlier posts, you might remember aljabr as a pattern matching library. Define tagged unions, get exhaustive match(), move on. That was v0.1. By v0.2, it had reactive signals built on the same union substrate. But v0.3 is where it crossed a line I didn't plan for.
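For readers who haven't seen the earlier posts, here is a rough sketch of what "define tagged unions, get exhaustive match()" means in plain TypeScript. This is not aljabr's implementation or API: `Shape`, `matchTag`, and `area` are hypothetical names, and the `tag` field is an assumed discriminant.

```typescript
// Hypothetical stand-in, not aljabr's code: a tagged union is any object
// carrying a string `tag` field.
type Shape =
  | { tag: "Circle"; radius: number }
  | { tag: "Rect"; width: number; height: number }

// One handler per tag. Leaving a tag out is a compile-time error,
// which is what makes the match exhaustive.
function matchTag<U extends { tag: string }, R>(
  value: U,
  handlers: { [K in U["tag"]]: (variant: Extract<U, { tag: K }>) => R },
): R {
  // The cast is confined to this helper; callers stay fully typed.
  const handler = handlers[value.tag as U["tag"]] as unknown as (v: U) => R
  return handler(value)
}

function area(shape: Shape): number {
  return matchTag(shape, {
    Circle: ({ radius }) => Math.PI * radius ** 2,
    Rect: ({ width, height }) => width * height,
  })
}
```

Deleting the `Rect` handler makes the call fail to compile, which is the property a hand-written switch statement never quite guarantees.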

I needed to ingest grammar definitions from external files. Untyped JSON, user-edited config, raw text payloads. I looked at Zod and ArkType for validation, and both are excellent at what they do, but wiring their output back into aljabr's union model felt like duct-taping two type systems together. Errors came back in their error types, not mine. The validated data had to be manually reconstructed into my variants. It worked, but it was ugly in a way that mattered.

So I built aljabr/schema. A decode/encode pipeline where schemas are themselves aljabr variants (you can match() over a schema), errors surface as a Validation union with accumulated field paths, and Schema.variant() decodes directly into your tagged union constructors. No glue layer. No adapter boilerplate.

Then I needed resource management for WebSocket connections in the editor's live preview. Then retry policies for flaky network fetches. Then structured error classification so I could tell the difference between "your grammar file has a syntax error" and "the WebSocket died unexpectedly." Each piece grew out of an actual problem in the parser project, and each one composes through the same union + match model that started all of this.

The result is four independent entry points. Use one, use all of them, ignore what you don't need:

aljabr           union factories, match(), when() arms, select(), is.* patterns
aljabr/prelude   Result, Option, Validation, Signal, Derived, Ref, Scope, Resource
aljabr/schema    decode/encode pipeline, Schema.variant(), DecodeError union
aljabr/signals   signal(), memo(), effect(), scope(), query()

Zero dependencies. No runtime. No fibers. No DI container.

What came out of it

Instead of listing everything, let me show you the parts I keep reaching for.

The pattern vocabulary (and what you can do with it)

The TC39 pattern matching proposal has been moving forward, and ts-pattern has done beautiful work showing what structural selection can look like in TypeScript today. I wanted that same expressiveness, but flowing through nominal tagged unions instead of structural dispatch.

The foundation is the is.* namespace. These are type wildcards and combinators that you use inside when() arms to describe what a field should look like:

when({ age: is.number },                          handler)  // age is a number
when({ name: is.not.nullish },                     handler)  // name is present
when({ key: is.union("Tab", "Enter") },            handler)  // key is one of these
when({ status: is.not("error") },                  handler)  // anything except "error"
when({ data: is.variant(Result) },                 handler)  // data is any Result variant
when({ val: is.union(is.variant(Result), is.string) }, handler)  // Result or a string

They compose the way you'd expect. is.not(is.union("a", "b")) works. is.not.array is a plain value you can pass directly. The is.not.* mirrors exist for every wildcard so you can write BDD-style patterns without wrapping everything in is.not(is.thing).
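To make the composition concrete, here is a minimal sketch of how wildcard combinators like these could be modeled as plain predicates. It is an illustration under assumptions, not how aljabr implements `is.*`: the names `Pattern`, `check`, `anyOf`, `not`, and `matchesFields` are all hypothetical.

```typescript
// Hypothetical model: a pattern is either a literal value or a predicate.
type Pattern = string | number | boolean | null | ((value: unknown) => boolean)

// Literals match by strict equality, predicates by calling them.
const check = (pattern: Pattern, value: unknown): boolean =>
  typeof pattern === "function" ? pattern(value) : pattern === value

// Two wildcards, written as ordinary predicates.
const isNumber: Pattern = (v) => typeof v === "number"
const isNullish: Pattern = (v) => v === null || v === undefined

// Combinators take patterns and return patterns, so they nest freely.
const anyOf = (...patterns: Pattern[]): Pattern =>
  (v) => patterns.some((p) => check(p, v))
const not = (pattern: Pattern): Pattern => (v) => !check(pattern, v)

// Object-level check: every listed field must satisfy its pattern.
function matchesFields(
  fields: Record<string, Pattern>,
  value: Record<string, unknown>,
): boolean {
  return Object.entries(fields).every(([key, p]) => check(p, value[key]))
}
```

Because `not` and `anyOf` both return a `Pattern`, `not(anyOf("a", "b"))` needs no special casing, which is the same closure property the `is.*` combinators rely on.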

Where it gets interesting is when you combine these with select(). select("name") binds a matched field to a named slot in the handler's second argument, and the optional inner pattern constraint actually narrows the extracted type at compile time:

import { union, match, when, select, is, __, Union } from "aljabr"

const Expr = union({
  Literal: (value: number) => ({ value }),
  BinOp:   (op: string, left: number, right: number) => ({ op, left, right }),
  Ident:   (name: string, scope: string | null) => ({ name, scope }),
})
type Expr = Union<typeof Expr>

function describe(expr: Expr): string {
  return match(expr, {
    Literal: ({ value }) => `literal ${value}`,
    BinOp: [
      when(
        { op: select("op"), left: select("l"), right: select("r") },
        (_, { op, l, r }) => `${l} ${op} ${r}`,
        // op: string, l: number, r: number — each typed from the variant
      ),
    ],
    Ident: [
      when(
        { name: select("n"), scope: select("s", is.not(is.nullish)) },
        (_, { n, s }) => `${s}::${n}`,
        // s is string, not string | null — the inner pattern narrowed it
      ),
      when(__, ({ name }) => name),
    ],
  })
}

No casts anywhere. If you've used ts-pattern's P.select(), the shape will feel familiar, but here it flows through tag-first dispatch, so your variants stay nominal and serialize cleanly. The wildcards give you the vocabulary, select() gives you the extraction, and they compose without getting in each other's way.
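If it helps to picture the runtime half of select(), the sketch below shows one way named extraction can work: a marker object records the slot name, and the matcher collects bindings as it walks the pattern. This is a guess at the general technique, not aljabr's code. `selectSlot` and `extract` are hypothetical names, and the compile-time narrowing described above is deliberately left out.

```typescript
// A select marker records which slot a field's value should be bound to,
// plus an optional guard the value must pass.
type Select = {
  readonly kind: "select"
  readonly slot: string
  readonly guard: ((v: unknown) => boolean) | undefined
}

const selectSlot = (slot: string, guard?: (v: unknown) => boolean): Select =>
  ({ kind: "select", slot, guard })

type FieldPattern = Select | string | number | boolean | null

const isSelect = (p: FieldPattern): p is Select =>
  typeof p === "object" && p !== null && p.kind === "select"

// Returns the bindings when every field matches, or undefined when the
// arm should be skipped and the next one tried.
function extract(
  pattern: Record<string, FieldPattern>,
  value: Record<string, unknown>,
): Record<string, unknown> | undefined {
  const bindings: Record<string, unknown> = {}
  for (const [key, p] of Object.entries(pattern)) {
    const field = value[key]
    if (isSelect(p)) {
      if (p.guard && !p.guard(field)) return undefined
      bindings[p.slot] = field
    } else if (p !== field) {
      return undefined
    }
  }
  return bindings
}
```

With a non-nullish guard on the scope slot, an identifier whose scope is null produces undefined here, which corresponds to falling through to the catch-all arm in the `Ident` case above.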

Crossing the data boundary

Here's where the conlang editor story comes back. My grammar definition files look something like this on the wire:

{
  "type": "Rule",
  "name": "vowel-harmony",
  "pattern": { "type": "Sequence", "elements": ["V", "C", "V"] },
  "transform": { "type": "Assimilate", "target": 2, "source": 0 }
}

Decoding that into typed aljabr variants used to mean writing a manual decoder, handling every discriminant, and hoping I didn't forget a case. Now:

// getTag is used below; importing it from the root entry point is an assumption
import { union, match, getTag, Union } from "aljabr"
import { Schema, decode } from "aljabr/schema"

const GrammarNode = union({
  Rule:       (p: { name: string; pattern: unknown; transform: unknown }) => ({ ...p }),
  Sequence:   (p: { elements: string[] }) => ({ ...p }),
  Assimilate: (p: { target: number; source: number }) => ({ ...p }),
})
type GrammarNode = Union<typeof GrammarNode>

const GrammarSchema = Schema.variant(GrammarNode, {
  Rule: Schema.object({
    name:      Schema.string(),
    pattern:   Schema.object({ elements: Schema.array(Schema.string()) }),
    transform: Schema.object({ target: Schema.number(), source: Schema.number() }),
  }),
  Sequence:   Schema.object({ elements: Schema.array(Schema.string()) }),
  Assimilate: Schema.object({ target: Schema.number(), source: Schema.number() }),
})

const result = decode(GrammarSchema, rawPayload)

// result is Validation<GrammarNode, DecodeError>
match(result, {
  Valid:       ({ value }) => applyRule(value),
  Invalid:    ({ errors }) => errors.forEach(e =>
    console.error(`${e.path.join(".")}: ${getTag(e)}`)
  ),
  Unvalidated: () => {},
})

decode returns a Validation, not a Result, because external data can fail in multiple places simultaneously. Missing field here, type mismatch there. All errors accumulate with their full path. The DecodeError variants (TypeMismatch, MissingField, InvalidLiteral, UnrecognizedVariant, Custom) are themselves a tagged union, so you match over them the same way you match over everything else.
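The accumulate-instead-of-short-circuit idea can be sketched in plain TypeScript. This is a simplified model, not the aljabr/schema engine: `Decoded`, `DecodeIssue`, `str`, `num`, and `object` are hypothetical names, and the errors carry plain messages where the library uses DecodeError variants.

```typescript
type DecodeIssue = { path: string[]; message: string }
type Decoded<T> =
  | { tag: "Valid"; value: T }
  | { tag: "Invalid"; errors: DecodeIssue[] }
type Decoder<T> = (input: unknown, path: string[]) => Decoded<T>

const invalid = (path: string[], message: string) =>
  ({ tag: "Invalid" as const, errors: [{ path, message }] })

const str: Decoder<string> = (input, path) =>
  typeof input === "string" ? { tag: "Valid", value: input } : invalid(path, "expected string")

const num: Decoder<number> = (input, path) =>
  typeof input === "number" ? { tag: "Valid", value: input } : invalid(path, "expected number")

function object<T>(fields: { [K in keyof T]: Decoder<T[K]> }): Decoder<T> {
  return (input, path) => {
    if (typeof input !== "object" || input === null) return invalid(path, "expected object")
    const record = input as Record<string, unknown>
    const out: Record<string, unknown> = {}
    const errors: DecodeIssue[] = []
    for (const key of Object.keys(fields) as (keyof T & string)[]) {
      // Each field is decoded with its own extended path.
      const result = fields[key](record[key], [...path, key])
      if (result.tag === "Valid") out[key] = result.value
      else errors.push(...result.errors) // keep going: collect, don't bail out
    }
    return errors.length === 0
      ? { tag: "Valid", value: out as unknown as T }
      : { tag: "Invalid", errors }
  }
}

const ruleDecoder = object({
  name: str,
  transform: object({ target: num, source: num }),
})

// Three problems in one payload: wrong type, wrong type, missing field.
const bad = ruleDecoder({ name: 42, transform: { target: "2" } }, [])
// bad is Invalid with paths: name, transform.target, transform.source
```

A Result-style decoder would have stopped at `name` and reported one error. Collecting all three in a single pass is what makes the output usable as editor diagnostics.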

I could have wrapped Zod. I could have written an adapter for ArkType. Both are great libraries and if you're already using them, aljabr ships defineDecoder and defineCodec helpers for exactly that. But for my use case, having the schema engine speak the same algebraic language as the rest of the stack meant one less translation layer between what I think and what the code says.

When things go wrong (and they will)

The editor's live preview opens a WebSocket, subscribes to file watchers, and polls a language server. These all need teardown guarantees. If a reactive dependency changes, the old connection has to close before the new one opens. If the component unmounts, everything needs to clean up. If the WebSocket dies mid-connection, I need to know whether that was a domain error, an unexpected crash, or an intentional cancellation.

// Schedule is used in the options below; importing it from the prelude is an assumption
import { Signal, Resource, Schedule, watchEffect, match, Fault } from "aljabr/prelude"

const roomId = Signal.create("lobby")

const WsResource = Resource(
  () => connectWebSocket(`/rooms/${roomId.get()!}`),
  (ws) => ws.close(),
)

const handle = watchEffect(
  async (signal, scope) => {
    const ws = await scope.acquire(WsResource)
    return receiveNextMessage(ws, signal)
  },
  (effect) => match(effect, {
    Done:   ({ value }) => renderMessage(value),
    Failed: ({ fault }) => match(fault, {
      Fail:        ({ error }) => showDomainError(error),
      Defect:      ({ thrown }) => reportCrash(thrown),
      Interrupted: () => {}, // intentional cancellation, nothing to do
    }),
    Stale: () => {},
  }),
  {
    eager: true,
    schedule: Schedule.Exponential({ base: 1000, max: 30000 }),
    timeout: 10000,
  },
)

Scope holds finalizers that run in LIFO order on disposal. Resource pairs acquire with release. watchEffect gives each run a fresh scope that auto-disposes when the effect re-runs or stops. Fault classifies failures into three variants: Fail (domain error, retryable), Defect (unexpected runtime panic), and Interrupted (abort signal fired). Schedule is a tagged union of retry policies.
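The LIFO guarantee is easiest to see in a stripped-down model. The sketch below is not aljabr's Scope: `MiniScope` is a hypothetical name, it only handles synchronous release functions, and it skips the error handling a real implementation would need when a finalizer throws.

```typescript
type Finalizer = () => void

class MiniScope {
  private finalizers: Finalizer[] = []

  // Acquire a resource and register its release in one step, so the two
  // can never drift apart.
  acquire<T>(open: () => T, release: (resource: T) => void): T {
    const resource = open()
    this.finalizers.push(() => release(resource))
    return resource
  }

  // Run finalizers newest-first, so whatever was acquired last (and may
  // depend on earlier resources) is released first.
  dispose(): void {
    while (this.finalizers.length > 0) {
      const finalizer = this.finalizers.pop()!
      finalizer()
    }
  }
}

const log: string[] = []
const scope = new MiniScope()
scope.acquire(() => "socket", (name) => log.push(`close ${name}`))
scope.acquire(() => "watcher", (name) => log.push(`close ${name}`))
scope.dispose()
// log is now ["close watcher", "close socket"]
```

Giving each effect run a fresh scope then amounts to calling `dispose()` on the previous run's scope before the next run starts.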

Every piece here is a matchable union. The error handling, the resource lifecycle, the retry policy. It's the same match() you use for everything else.
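As a rough illustration of the failure classification and the exponential retry policy described above, here is a sketch in plain TypeScript. The classification rules and the delay formula are assumptions made for the example, not aljabr's documented behavior: `DomainError`, `classify`, and `exponentialDelay` are hypothetical names.

```typescript
// Hypothetical marker class for expected, retryable failures.
class DomainError extends Error {}

type FaultSketch =
  | { tag: "Fail"; error: DomainError }
  | { tag: "Defect"; thrown: unknown }
  | { tag: "Interrupted" }

// Assumed rules: an aborted signal wins, known domain errors are Fail,
// and anything else is treated as an unexpected Defect.
function classify(thrown: unknown, signal: { aborted: boolean }): FaultSketch {
  if (signal.aborted) return { tag: "Interrupted" }
  if (thrown instanceof DomainError) return { tag: "Fail", error: thrown }
  return { tag: "Defect", thrown }
}

// Assumed formula: delay before retry number `attempt` (0-based) is
// base * 2^attempt milliseconds, capped at max.
function exponentialDelay(attempt: number, base: number, max: number): number {
  return Math.min(base * 2 ** attempt, max)
}
```

Under these assumptions, a base of 1000 and a max of 30000 give delays of 1000, 2000, 4000, 8000, 16000, and then 30000 for every later attempt.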

And yes, await using scope = Scope() works if you're on a runtime that supports TC39 Explicit Resource Management. Aljabr implements Symbol.asyncDispose.

Not everything, on purpose

I want to be clear about what this is and isn't, because the feature surface might suggest I'm trying to compete with things I'm not.

This is not Effect. Effect is a full runtime with fibers, structured concurrency, a service layer, and a mature ecosystem. If you want that level of infrastructure and you're willing to invest in the framework, Effect is excellent and I genuinely mean that. Aljabr has no runtime. There are no fibers. There is no dependency injection container. It's a toolkit, not a platform.

This is not ts-pattern either. ts-pattern is the best structural pattern matcher in the TypeScript ecosystem and it works over arbitrary objects you didn't define. Aljabr's dispatch is tag-first and nominal. If you need to pattern match over third-party types or data shapes whose definitions you don't control, ts-pattern is probably the better tool. Aljabr is for the case where you're defining the unions yourself and you want the pattern matching, the schemas, the reactive state, and the error handling to all compose through that same union model.

Think of it as the part of the algebraic programming story that doesn't require buying into a framework. Define your domain as tagged unions. Match over them exhaustively. Validate external data into them. React to them. Manage their lifetimes. One model, all the way through.

The road from here

The docs are the most complete they've ever been. There's a getting started guide, full API references for every module, and advanced pattern guides covering signal protocols, reactive UI composition, resource lifetimes, and parser construction.

On the roadmap: React bindings via useSyncExternalStore (so aljabr's reactive primitives drive React renders without lifting into React state), reactive array iterators with per-index fine-grained tracking, and a React Hook Form resolver for the schema module. All of that is tracked in the repo if you want to follow along.

npm install aljabr

That conlang editor still isn't done. But the toolkit underneath it is ready for you to build your own weird thing with.

Source and docs on GitHub
