Polymorphic Typing: Fewer Tokens, Stronger Guarantees
Who this is for. If you've wondered whether it's possible to have strict typing without the verbose annotations of Java or TypeScript — the answer is yes. This article explains how, and why it's critical for LLMs.
33.6% of all LLM-generated code failures are type errors. Can we eliminate them without making the LLM generate type annotations?
Part of Token Economics of Code series.
The Problem: Types Cost Tokens
// TypeScript: ~50% of tokens are type annotations
function add(a: number, b: number): number {
  return a + b;
}
Each annotation means more tokens, more context consumed (quadratic attention cost), and more opportunities for mistakes.
The ideal: 100% type safety with zero type annotations.
Hindley-Milner: A 1960s Solution to a 2020s Problem
The Hindley-Milner algorithm infers types automatically, requiring zero annotations. Used in Haskell, OCaml, F#, Elm.
-- Synoema: zero annotations, 100% type safety
add x y = x + y
-- Compiler infers: Int -> Int -> Int
id x = x
-- Compiler infers: forall a. a -> a
map f [] = []
map f (x:xs) = f x : map f xs
-- Compiler infers: forall a b. (a -> b) -> List a -> List b
How It Works
Algorithm W works in three steps:
Step 1: Constraint generation. For each expression, create type variables and record constraints.
Step 2: Unification. Solve the constraint system — replace type variables with concrete types. Conflicts = type errors.
Step 3: Generalization. Remaining free variables become polymorphic: id : a -> a becomes id : forall a. a -> a.
Key property: HM always finds the most general type.
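To make the unification step concrete, here is a minimal sketch in Python. This is illustrative only (the real Synoema inferencer is 1,908 lines of Rust); the type representation and function names are my own, and the occurs check is omitted for brevity.

```python
# Toy unification, the core of Algorithm W's step 2 (illustrative sketch).
# Types are tuples: ("var", name), ("con", name) for Int/Bool,
# ("fun", arg_type, ret_type) for functions. Occurs check omitted.

def resolve(t, subst):
    """Follow substitution links until a concrete type or a free variable."""
    while t[0] == "var" and t[1] in subst:
        t = subst[t[1]]
    return t

def unify(a, b, subst):
    """Solve the constraint a = b, extending subst; a conflict is a type error."""
    a, b = resolve(a, subst), resolve(b, subst)
    if a == b:
        return subst
    if a[0] == "var":
        subst[a[1]] = b                # bind the type variable
        return subst
    if b[0] == "var":
        subst[b[1]] = a
        return subst
    if a[0] == b[0] == "fun":          # unify argument and result types
        subst = unify(a[1], b[1], subst)
        return unify(a[2], b[2], subst)
    raise TypeError(f"cannot unify {a} with {b}")

# Constraints generated for `add x y = x + y`: `+` forces x = Int, y = Int.
s = {}
s = unify(("var", "x"), ("con", "Int"), s)
s = unify(("var", "y"), ("con", "Int"), s)
```

After solving, both `x` and `y` resolve to `Int`, so the inferred type of `add` is `Int -> Int -> Int` — exactly the annotation the TypeScript version had to spell out.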
Let-Polymorphism
id x = x
main =
  a = id 42    -- id used as Int -> Int
  b = id true  -- id used as Bool -> Bool
  a            -- No error! id is polymorphic.
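The mechanism behind this is instantiation: after generalization, `id` has the scheme `forall a. a -> a`, and each use site gets its own fresh copy of the quantified variables. A small sketch (again illustrative Python, not the Synoema implementation; the type encoding matches the earlier toy):

```python
# Sketch of let-polymorphism: each use of a generalized binding
# instantiates its type scheme with fresh type variables.

import itertools

fresh = itertools.count()

def instantiate(scheme):
    """Replace each quantified variable in the scheme with a fresh one."""
    quantified, body = scheme
    mapping = {q: ("var", f"t{next(fresh)}") for q in quantified}
    def walk(t):
        if t[0] == "var":
            return mapping.get(t[1], t)
        if t[0] == "fun":
            return ("fun", walk(t[1]), walk(t[2]))
        return t
    return walk(body)

# id : forall a. a -> a
id_scheme = (["a"], ("fun", ("var", "a"), ("var", "a")))

use1 = instantiate(id_scheme)  # fresh copy for `id 42`
use2 = instantiate(id_scheme)  # fresh copy for `id true`
# The two copies use distinct type variables, so binding use1's argument
# to Int and use2's to Bool never conflicts.
```

This is why `id 42` and `id true` coexist: unification constrains two independent copies of the scheme, never the scheme itself.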
Interaction with Constrained Decoding
At each generation step, the compiler determines valid types for the next expression:
-- LLM generates: map ??? [1 2 3]
-- Compiler knows: ??? : Int -> t (function from Int)
-- Valid: \x -> x + 1, \x -> x * 2
-- Invalid: \x -> x ++ "hello" (String != Int)
Grammar + type constraints narrow valid continuations by orders of magnitude.
Comparison
| Approach | Static type guarantees | Tokens spent on types | Runtime type errors |
|---|---|---|---|
| Python (duck typing) | None | 0 | Many |
| TypeScript | Yes | ~30-50% of code | Few |
| Java | Yes | ~40-60% of code | Few |
| Synoema (HM) | Yes | 0 | None |
Try It
git clone https://github.com/Delimitter/synoema
cd synoema/lang
cargo run -p synoema-repl -- eval "id x = x; id 42"
cargo run -p synoema-repl -- eval "map f [] = []; map f (x:xs) = f x : map f xs; map (\x -> x * 2) [1 2 3]"
TypeScript vs Synoema — same guarantees, 44% fewer tokens:
// TypeScript: 25 tokens
function map<A, B>(f: (a: A) => B, xs: A[]): B[] { ... }
-- Synoema: 14 tokens
map f [] = []
map f (x:xs) = f x : map f xs
Impact on LLM Code Generation
With HM, LLMs generate only semantics — the compiler handles types. This is why Synoema achieves 74.8% fewer type errors than syntax-only constrained decoding.
What's Next
Next in the series: measuring every token across 16 algorithms in 5 languages.
Part of Token Economics of Code series by @andbubnov. HM type inference: 1,908 lines of Rust, 61 tests.