Conor Maguire

Posted on Mar 25

Building an Embeddable Family Tree Viewer in TypeScript

#genealogy #gedcom #wikitree #familytree

The Problem

I'm building irishroots.ie, a genealogy platform focused on Irish heritage. At some point I needed to embed a family tree visualisation on the site — something users could explore, navigate, and ideally edit inline.

I assumed this would be a two-hour job. Grab a library, wire it up, done.

Six hours of research later, I was starting a new repo.

What's Already Out There (And Why It Wasn't Enough)

The GEDCOM ecosystem

GEDCOM (.ged) is the de facto exchange format for genealogy data. It was first released in 1984 by The Church of Jesus Christ of Latter-day Saints, the long-lived 5.5 version followed in 1996, and it is still the format every serious genealogy tool exports to. If you want to interoperate with Ancestry, FamilySearch, MacFamilyTree, or any desktop genealogy software, you need to speak GEDCOM.

The JavaScript ecosystem has a few parsers, but the visualisation options are thin:

d3.js family tree examples: these are great demos, but they're demos. They assume a simple hierarchical structure (one parent, one child), which immediately breaks down when you have remarriages, half-siblings, or ancestors who married their distant cousins (which, in Irish rural genealogy, happens constantly).
Gramps Web: a full web application with its own Python backend and frontend, not an embeddable widget.
Treant.js: old, unmaintained, not GEDCOM-aware.
FamilyChart: the closest thing I found to what I needed, but it has dependencies, an unusual data model, and no GEDCOM support.

None of them handle the GEDCOM FAM record model properly. In GEDCOM, families are first-class objects — a FAM record links two spouses and their children. It's not a simple tree; it's a graph of couples and descendants. Any layout engine that treats it as a hierarchy will produce wrong output the moment someone has two marriages.

The WikiTree API

WikiTree is a collaborative genealogy platform with a public JSON API. It's particularly strong on Irish and British genealogy — there are millions of well-sourced profiles. I wanted to be able to load a WikiTree profile directly into the viewer.

WikiTree's API is... documented, but eccentric. It returns person records with Father, Mother, and Spouses fields, but reconstructing a renderable family graph from this requires multi-round fetching (fetch the root, then fetch each parent, then fetch each parent's parents...), and the API has rate limits and authentication requirements that make naive depth-first fetching painful.
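
One way to tame the multi-round fetching is to walk the pedigree one generation at a time and batch each generation's IDs into a single request. The sketch below assumes a caller-supplied fetchProfiles function (hypothetical, standing in for whatever wrapper you put around the real API) and a minimal profile shape; only the Father and Mother field names come from the description above.

```typescript
// Hypothetical minimal profile shape; only Father/Mother are named in the article.
interface Profile {
  Id: number;
  Father?: number;
  Mother?: number;
}

// Hypothetical batch fetcher supplied by the caller (not a real WikiTree call).
type FetchProfiles = (ids: number[]) => Promise<Profile[]>;

async function loadAncestors(
  rootId: number,
  depth: number,
  fetchProfiles: FetchProfiles,
): Promise<Map<number, Profile>> {
  const seen = new Map<number, Profile>();
  let frontier: number[] = [rootId];

  for (let level = 0; level <= depth && frontier.length > 0; level++) {
    // One request per generation instead of one per person.
    const profiles = await fetchProfiles(frontier);
    for (const p of profiles) seen.set(p.Id, p);

    const next = new Set<number>();
    for (const p of profiles) {
      for (const parentId of [p.Father, p.Mother]) {
        // Skip unknown parents (0 or undefined) and anyone already loaded,
        // so shared ancestors are fetched only once.
        if (parentId && !seen.has(parentId)) next.add(parentId);
      }
    }
    frontier = Array.from(next);
  }
  return seen;
}
```

With depth 3 this issues at most four requests regardless of how many ancestors are found, which is much friendlier to rate limits than one request per person.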

None of the existing family tree libraries had WikiTree support.

The Decision: Build It

Given that I needed:

GEDCOM parsing (including the hairy stuff: CONC/CONT continuation lines, BOM handling, CRLF normalisation)
A proper graph layout that handles multiple marriages and half-siblings
WikiTree JSON API support
Zero runtime dependencies (I'm embedding this in a Wagtail CMS; the last thing I need is to manage a transitive dependency chain)
A clean embedding API: new FamilyTreeViewer('#container', { gedcom: text })

...building it myself was the right call. Here's how it came together.

Architecture

The library is structured as a clean pipeline:

GEDCOM text / WikiTree API
        ↓
    [Parser / Loader]
        ↓
    GedcomNode[] (raw AST)
        ↓
    buildTree() → Tree (typed model)
        ↓
    computeLayout() → LayoutResult (x/y coordinates)
        ↓
    SVGRenderer → <svg> in DOM

Each stage is independently testable. There are 61 Vitest tests covering the parser, serializer, layout engine, and edit operations.

The GEDCOM Parser

GEDCOM is line-oriented. Every line consists of a level number, an optional cross-reference ID, a tag, and an optional value:

0 @I1@ INDI
1 NAME John /Murphy/
1 SEX M
1 BIRT
2 DATE 15 MAR 1887
2 PLAC County Mayo
1 FAMS @F1@
0 @F1@ FAM
1 HUSB @I1@
1 WIFE @I2@
1 CHIL @I3@

The parser is two-pass: tokenize into RawLine[], then build a node tree by tracking a level stack. The tricky parts:

BOM and line ending normalisation. Files come from everywhere: Windows genealogy software loves CRLF, and some tools emit a UTF-8 BOM. Strip these before anything else.

const normalized = text
  .replace(/^\uFEFF/, '')    // BOM
  .replace(/\r\n/g, '\n')    // CRLF → LF
  .replace(/\r/g, '\n');     // bare CR → LF

CONC and CONT lines. Long values (like notes) can be split across lines using CONC (concatenate, no space) and CONT (continue, with newline). Per GEDCOM 5.5.1, CONC appends directly, with no space injected, so it's the writer's responsibility to split at a safe point. It's an easy rule to get wrong: a parser that adds a space on CONC will silently break words in half.
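
The CONC/CONT rule is easiest to see in code. This is a minimal sketch that folds continuation children into their parent's value; the GedcomNode shape here is a cut-down assumption for illustration, not necessarily the library's real type.

```typescript
// Cut-down node shape for illustration (an assumption, not the library's type).
interface GedcomNode {
  tag: string;
  value?: string;
  children: GedcomNode[];
}

// Folds CONC/CONT children into the parent's value and drops them from the tree.
function mergeContinuations(node: GedcomNode): GedcomNode {
  let value = node.value ?? '';
  let merged = false;
  const children: GedcomNode[] = [];

  for (const child of node.children) {
    if (child.tag === 'CONC') {
      value += child.value ?? '';          // append directly, no separator
      merged = true;
    } else if (child.tag === 'CONT') {
      value += '\n' + (child.value ?? ''); // start a new line
      merged = true;
    } else {
      children.push(mergeContinuations(child));
    }
  }

  const result: GedcomNode = { tag: node.tag, children };
  if (merged || node.value !== undefined) result.value = value;
  return result;
}
```

A NOTE of "Emigrated to Bos" followed by CONC "ton in 1905." comes out as "Emigrated to Boston in 1905.", with the word rejoined and no stray space.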

ANSEL encoding. Old files may declare CHAR ANSEL in the header — a character encoding used by genealogy software in the 90s. We don't decode it (that would require a lookup table and is a rabbit hole), but we warn loudly and proceed as UTF-8.
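
Stepping back from the edge cases, the two-pass structure itself is small. Below is a minimal sketch of a tokenizer plus a level-stack builder, assuming the input has already been normalised as shown earlier. The field names on RawLine and GedcomNode are illustrative assumptions, and malformed lines are simply skipped here, which a real parser would probably want to report.

```typescript
interface RawLine {
  level: number;
  xref?: string;  // e.g. "@I1@"
  tag: string;    // e.g. "NAME"
  value?: string;
}

interface GedcomNode extends RawLine {
  children: GedcomNode[];
}

// level, optional @XREF@, tag, optional value
const LINE_RE = /^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\S+)(?: (.*))?$/;

// Pass 1: split each line into its four fields.
function tokenize(text: string): RawLine[] {
  const lines: RawLine[] = [];
  for (const raw of text.split('\n')) {
    const m = LINE_RE.exec(raw);
    if (!m) continue; // blank or malformed line: skipped in this sketch
    lines.push({ level: Number(m[1]), xref: m[2], tag: m[3], value: m[4] });
  }
  return lines;
}

// Pass 2: rebuild the nesting from the level numbers alone.
function buildNodes(lines: RawLine[]): GedcomNode[] {
  const roots: GedcomNode[] = [];
  const stack: GedcomNode[] = []; // stack[i] is the currently open node at level i
  for (const line of lines) {
    const node: GedcomNode = { ...line, children: [] };
    stack.length = Math.min(stack.length, line.level); // close deeper levels
    const parent = stack[line.level - 1];
    if (parent) parent.children.push(node);
    else roots.push(node); // level 0, or a level jump with no open parent
    stack[line.level] = node;
  }
  return roots;
}
```

Running buildNodes(tokenize(text)) on the sample above yields two root nodes, INDI and FAM, with DATE and PLAC nested under BIRT.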

The Data Model

The parser produces a raw AST. buildTree() walks it and builds a typed Tree object with Individual and Family records:

interface Individual {
  id: string;
  givenName: string;
  surname: string;
  sex: 'M' | 'F' | 'U';
  events: EventRecord[];       // BIRT, DEAT, MARR, etc.
  familyAsChild?: string;      // @F1@ — the family this person was born into
  familiesAsSpouse: string[];  // [@F1@, @F2@] — families this person formed
  notes: string[];
}

interface Family {
  id: string;
  husbandId?: string;
  wifeId?: string;
  childIds: string[];
}

This mirrors the GEDCOM model directly. The FAM record is preserved as a first-class object — not flattened into edges — because the layout engine needs it.
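
To illustrate why keeping FAM records as objects pays off, here is a small sketch that separates full siblings from half-siblings using nothing but the two record types. The siblingsOf helper and the Map-based storage are assumptions made for the example; the interfaces are trimmed copies of the ones above.

```typescript
interface Individual {
  id: string;
  familyAsChild?: string;
  familiesAsSpouse: string[];
}

interface Family {
  id: string;
  husbandId?: string;
  wifeId?: string;
  childIds: string[];
}

// Hypothetical helper: full siblings share the birth family,
// half-siblings come from a parent's other families.
function siblingsOf(
  id: string,
  individuals: Map<string, Individual>,
  families: Map<string, Family>,
): { full: string[]; half: string[] } {
  const birthFam = families.get(individuals.get(id)?.familyAsChild ?? '');
  if (!birthFam) return { full: [], half: [] };

  const full = birthFam.childIds.filter((c) => c !== id);
  const half = new Set<string>();
  for (const parentId of [birthFam.husbandId, birthFam.wifeId]) {
    if (!parentId) continue;
    for (const famId of individuals.get(parentId)?.familiesAsSpouse ?? []) {
      if (famId === birthFam.id) continue; // same marriage: already counted as full
      for (const c of families.get(famId)?.childIds ?? []) half.add(c);
    }
  }
  return { full, half: Array.from(half) };
}
```

With a flattened parent-child edge list, the same question needs extra bookkeeping to work out which children share both parents.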

The Layout Engine

This is where it gets interesting.

A family tree is not a tree. It's a DAG (directed acyclic graph) where nodes are individuals and edges are parent-child relationships. It's also rarely a tidy DAG: in endogamous populations (like historical Irish townlands where everyone is related to everyone), the same individual can be reached through several lineages, sometimes at different generation depths, so one person can be an ancestor of the root twice over.

The layout assigns each individual to a generation — an integer row — and then positions them horizontally within that row.

Generation assignment is two-phase:

BFS from root ancestors (individuals who have no familyAsChild in the tree), propagating generation = parent.generation + 1 to each child. Spouses are not propagated through — you don't want a spouse "dragging" their in-laws into the wrong generation just because they married into a different lineage.
Spouse alignment pass. After BFS, if a husband is generation 3 and his wife (from a different lineage) computed as generation 2, move the one with the smaller generation number to the larger one, so both land on row 3. This keeps couples on the same row without corrupting the parent-child lineage.
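
Phase 1 can be sketched roughly as follows. This is an illustrative version rather than the library's exact code: the Map-based inputs are an assumption, and so is the tie-break of keeping the deepest generation when a child is reachable along several paths. It also assumes the data is acyclic; a file in which someone is recorded as their own ancestor would loop forever here.

```typescript
interface Individual {
  id: string;
  familyAsChild?: string;
  familiesAsSpouse: string[];
}

interface Family {
  id: string;
  childIds: string[];
}

// Phase 1: BFS from root ancestors, following parent → child links only.
function assignGenerations(
  individuals: Map<string, Individual>,
  families: Map<string, Family>,
): Map<string, number> {
  const gen = new Map<string, number>();
  const queue: string[] = [];

  individuals.forEach((person) => {
    // Root ancestor: no birth family recorded in this tree.
    if (!person.familyAsChild || !families.has(person.familyAsChild)) {
      gen.set(person.id, 0);
      queue.push(person.id);
    }
  });

  for (let i = 0; i < queue.length; i++) {
    const parent = individuals.get(queue[i]);
    if (!parent) continue;
    const childGen = (gen.get(parent.id) ?? 0) + 1;
    for (const famId of parent.familiesAsSpouse) {
      for (const childId of families.get(famId)?.childIds ?? []) {
        // Assumption: keep the deepest value so a child never sits above a parent.
        if ((gen.get(childId) ?? -1) < childGen) {
          gen.set(childId, childGen);
          queue.push(childId); // revisit so descendants pick up the new depth
        }
      }
    }
  }
  return gen;
}
```

Note that a spouse with no recorded parents starts at generation 0 even if they married into generation 1, which is exactly the mismatch the alignment pass then corrects.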

// Phase 2: align spouses
for (const fam of tree.getAllFamilies()) {
  if (!fam.husbandId || !fam.wifeId) continue;
  const hGen = gen.get(fam.husbandId) ?? 0;
  const wGen = gen.get(fam.wifeId) ?? 0;
  const aligned = Math.max(hGen, wGen);
  gen.set(fam.husbandId, aligned);
  gen.set(fam.wifeId, aligned);
}

Ghost nodes are synthesized for families where only one spouse is known. The ghost gets a dashed border and no name — it signals "partner unknown" without breaking the layout symmetry.
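
Ghost synthesis can be done as a pure pre-layout step. The sketch below is one possible shape for it, under stated assumptions: the GhostNode type and the "ghost:" ID prefix are invented for illustration, and the original families are left untouched.

```typescript
interface Family {
  id: string;
  husbandId?: string;
  wifeId?: string;
  childIds: string[];
}

// Hypothetical placeholder record; the renderer would draw it dashed and unnamed.
interface GhostNode {
  id: string;
  familyId: string;
  ghost: true;
}

function synthesizeGhosts(families: Family[]): { families: Family[]; ghosts: GhostNode[] } {
  const ghosts: GhostNode[] = [];
  const filled = families.map((fam) => {
    const known = [fam.husbandId, fam.wifeId].filter(Boolean).length;
    if (known !== 1) return fam; // both spouses known, or none: nothing to add

    // Hypothetical ID scheme; it only needs to be unique and recognisable.
    const ghost: GhostNode = { id: `ghost:${fam.id}`, familyId: fam.id, ghost: true };
    ghosts.push(ghost);
    return fam.husbandId
      ? { ...fam, wifeId: ghost.id }
      : { ...fam, husbandId: ghost.id };
  });
  return { families: filled, ghosts };
}
```

Because the ghost occupies the missing spouse slot, the layout can treat every family as a couple and keep the child drop line centred.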

Collapsible subtrees. The viewer tracks a Set<string> of expanded family IDs. Collapsing a family cascades: walk the family's children, find any families they formed as spouses, and collapse those too. The getVisibleSet() function re-derives which individuals are visible on each render, which makes the expand/collapse state trivial to manage.
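
The cascade described above amounts to a short graph walk. Here is a minimal sketch of it, assuming Map-based lookups and a collapseFamily helper whose name and signature are made up for the example.

```typescript
interface Individual {
  id: string;
  familiesAsSpouse: string[];
}

interface Family {
  id: string;
  childIds: string[];
}

// Collapses a family and every family formed by its descendants.
function collapseFamily(
  famId: string,
  expanded: Set<string>,
  individuals: Map<string, Individual>,
  families: Map<string, Family>,
): void {
  const pending: string[] = [famId];
  while (pending.length > 0) {
    const id = pending.pop() as string;
    // Set.delete returns false if the family was already collapsed,
    // in which case everything below it is collapsed too.
    if (!expanded.delete(id)) continue;
    for (const childId of families.get(id)?.childIds ?? []) {
      pending.push(...(individuals.get(childId)?.familiesAsSpouse ?? []));
    }
  }
}
```

Since the expanded set is the only state, re-expanding a family later is just a matter of adding its ID back.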

The Renderer

The renderer produces SVG entirely through the DOM API — no innerHTML, no string templates, no third-party SVG library. Every element is document.createElementNS('http://www.w3.org/2000/svg', ...).

Layering. The SVG has four <g> layers, painted bottom-up: edges → couple connectors → node cards → expand/collapse buttons. This avoids z-index battles and keeps edges behind cards.

Orthogonal elbow paths. Parent-child connections drop vertically from a family midpoint, then branch horizontally to each child. This produces the classic family tree look without any curve-fitting:

  [John Murphy] ═══ [Mary O'Brien]
          │
    ┌─────┴─────┐
    │           │
[Patrick]   [Brigid]
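
The elbow geometry reduces to a handful of absolute move, vertical, and horizontal path commands. This is a sketch of a pure function that builds the path data string; the elbowPath name, the Point type, and the choice to place the horizontal bar halfway between the couple and the nearest child are assumptions for illustration.

```typescript
interface Point {
  x: number;
  y: number;
}

// Builds the `d` attribute for an SVG <path>: drop from the couple midpoint,
// run a horizontal bar, then drop to the top of each child card.
function elbowPath(from: Point, children: Point[]): string {
  if (children.length === 0) return '';

  // Assumption: the bar sits halfway between the couple and the nearest child.
  const barY = (from.y + Math.min(...children.map((c) => c.y))) / 2;
  const xs = [from.x, ...children.map((c) => c.x)];

  const parts = [
    `M ${from.x} ${from.y} V ${barY}`,                   // vertical drop from midpoint
    `M ${Math.min(...xs)} ${barY} H ${Math.max(...xs)}`, // horizontal bar
    ...children.map((c) => `M ${c.x} ${barY} V ${c.y}`), // one drop per child
  ];
  return parts.join(' ');
}
```

The resulting string can be set as the d attribute of a single path element, so each family needs only one DOM node for all its connectors.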

CSS injection. Styles are injected once as a <style id="ftv-styles-v1"> tag. Idempotent — if you mount two viewers on the same page, the second mount detects the tag is already there and skips. All class names use the ftv- prefix to avoid collisions.

The WikiTree Loader

The WikiTree loader is dynamically imported (await import('./loaders/wikitree.js')), so a bundler can split it into its own chunk that users who only need GEDCOM never load.

It fetches profiles depth-first via WikiTree's JSON API, building up a Tree object directly (bypassing GEDCOM entirely — there's no need for a round-trip through text format). Dates come back as "YYYY-MM-DD" strings with "0000-00-00" for unknowns:

export function parseWikiDate(raw: string | undefined): ParsedDate {
  if (!raw || raw === '0000-00-00') return {};
  // handles partial dates: "1887-00-00" → year only
  const [y, m, d] = raw.split('-').map(Number);
  const year = y || undefined;
  const month = m || undefined;
  const day = d || undefined;
  return { year, date: formatDate(year, month, day) };
}

The loader normalises WikiTree's Spouses map (an object keyed by numeric ID, which is unusual) into Family records, then hands the completed Tree to the same layout engine that handles GEDCOM files.
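
The normalisation step might look something like the sketch below. Everything about the profile shape apart from the Spouses map itself is an assumption (including the Gender values and the fields inside each spouse entry), as is the ID scheme for the generated families.

```typescript
// Hypothetical slice of a profile; only the Spouses map is described in the article.
interface WikiProfile {
  Id: number;
  Gender?: string; // assumed to be "Male" or "Female"
  Spouses?: Record<string, { Id: number }>;
}

interface Family {
  id: string;
  husbandId?: string;
  wifeId?: string;
  childIds: string[];
}

function familiesFromSpouses(profile: WikiProfile): Family[] {
  const spouses: Record<string, { Id: number }> = profile.Spouses ?? {};
  const self = String(profile.Id);

  return Object.keys(spouses).map((key) => {
    const spouse = spouses[key];
    const other = String(spouse.Id);
    // Order the pair so the same couple gets the same ID from either profile.
    const low = Math.min(profile.Id, spouse.Id);
    const high = Math.max(profile.Id, spouse.Id);
    const id = `F${low}-${high}`; // hypothetical ID scheme
    return profile.Gender === 'Female'
      ? { id, husbandId: other, wifeId: self, childIds: [] }
      : { id, husbandId: self, wifeId: other, childIds: [] };
  });
}
```

Deriving the family ID from both spouse IDs means that loading either partner first produces the same Family record, which makes deduplication a simple Map lookup.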

The Edit Engine

The EditEngine handles add/update/remove operations with cascading consistency:

Removing an individual removes them from all FAM records that reference them
If removing them leaves a family with no members, the family is removed too
All mutations emit a tree:change event on the EventBus, which triggers a re-render and calls the onSave callback
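
The first two rules can be sketched as a single function over the tree's maps. This is an illustrative version only: the removeIndividual name and Map-based storage are assumptions, and emitting tree:change is left to the caller.

```typescript
interface Individual {
  id: string;
  familyAsChild?: string;
  familiesAsSpouse: string[];
}

interface Family {
  id: string;
  husbandId?: string;
  wifeId?: string;
  childIds: string[];
}

// Removes a person, scrubs every reference to them, and returns the IDs
// of any families that ended up with no members.
function removeIndividual(
  id: string,
  individuals: Map<string, Individual>,
  families: Map<string, Family>,
): string[] {
  individuals.delete(id);
  const removedFamilies: string[] = [];

  families.forEach((fam, famId) => {
    if (fam.husbandId === id) delete fam.husbandId;
    if (fam.wifeId === id) delete fam.wifeId;
    fam.childIds = fam.childIds.filter((c) => c !== id);

    if (!fam.husbandId && !fam.wifeId && fam.childIds.length === 0) {
      families.delete(famId); // deleting during Map.forEach is safe
      removedFamilies.push(famId);
    }
  });

  // Drop dangling references from the people who remain.
  individuals.forEach((p) => {
    if (p.familyAsChild && !families.has(p.familyAsChild)) delete p.familyAsChild;
    p.familiesAsSpouse = p.familiesAsSpouse.filter((f) => families.has(f));
  });
  return removedFamilies;
}
```

Returning the removed family IDs gives the caller enough information to update any expand/collapse state before re-rendering.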

The EventBus is a simple typed pub/sub:

type Events = {
  'node:click': { id: string };
  'node:hover': { id: string };
  'tree:change': Record<string, never>;
};

TypeScript ensures you can't emit or subscribe to a non-existent event type, and the payload shape is enforced at the call site.
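
For reference, a typed pub/sub of this kind fits in about twenty lines. The class below is a minimal sketch, not the library's implementation; the on/emit method names and the unsubscribe-function return value are assumptions.

```typescript
type Events = {
  'node:click': { id: string };
  'node:hover': { id: string };
  'tree:change': Record<string, never>;
};

class EventBus<E extends Record<string, unknown>> {
  // Internal storage is loosely typed; the public methods carry the real types.
  private handlers = new Map<keyof E, Array<(payload: any) => void>>();

  // Subscribes a handler and returns a function that unsubscribes it.
  on<K extends keyof E>(type: K, handler: (payload: E[K]) => void): () => void {
    const list = this.handlers.get(type) ?? [];
    this.handlers.set(type, list);
    list.push(handler);
    return () => {
      const i = list.indexOf(handler);
      if (i >= 0) list.splice(i, 1);
    };
  }

  emit<K extends keyof E>(type: K, payload: E[K]): void {
    // Iterate over a copy so handlers can unsubscribe while being called.
    for (const handler of (this.handlers.get(type) ?? []).slice()) handler(payload);
  }
}

const bus = new EventBus<Events>();
const clicked: string[] = [];
const off = bus.on('node:click', ({ id }) => clicked.push(id));
bus.emit('node:click', { id: '@I3@' });
off();
// bus.emit('node:clik', { id: '@I3@' });  // compile error: unknown event name
// bus.emit('node:click', { name: 'x' }); // compile error: wrong payload shape
```

The generic parameter K ties the event name to its payload, which is what lets the compiler reject both of the commented-out calls.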

The Embedding API

The public interface is designed to be minimal:

import { FamilyTreeViewer } from 'family-tree-viewer';

// From a GEDCOM file
const viewer = new FamilyTreeViewer('#my-container', {
  gedcom: gedcomText,
  theme: 'light',
  readonly: false,
  onSave: (updatedGedcom) => {
    localStorage.setItem('family-tree', updatedGedcom);
  },
});

// From WikiTree
await viewer.loadWikiTree('Murphy-1234', { depth: 3 });

// Programmatic control
viewer.selectPerson('@I3@');
viewer.fitToScreen();
viewer.destroy();

The build produces both ESM (family-tree-viewer.js) and UMD (family-tree-viewer.umd.cjs) via Vite's library mode, so it works with bundlers and plain <script> tags.

What I Learned

GEDCOM is nastier than it looks. The spec is 200 pages. Real files from real software have non-compliant quirks (line length violations, missing headers, non-standard tags). A parser that handles the spec is a starting point; a parser that handles real files is a different project.

Graph layout is genuinely hard. The two-phase BFS + spouse alignment approach works well for most trees but can produce unintuitive results for highly endogamous families. Layout algorithms for genealogical graphs are an active area of research.

Zero dependencies is a real constraint, not just a boast. Every time I wanted to reach for a utility — date parsing, SVG path helpers, a test utility — I had to write it or do without. This makes the bundle small (the ESM build is ~35 KB) and removes an entire category of supply chain risk for an embedded library.

Dynamic import is great for optional features. The WikiTree loader is only fetched if you call loadWikiTree(). This keeps the core bundle small for users who only need GEDCOM.

What's Next

Export to image (PNG/SVG download)
Printing layout (multi-page PDF-friendly layout)
Better handling of highly endogamous graphs (Sugiyama-style layer assignment)
GEDCOM 7.0 support (the 2021 revision is finally gaining traction)

The library is open source under AGPL v3. If you're building something in the genealogy space — or just need an embeddable family tree widget that actually understands GEDCOM — it might be a useful starting point.

Check out the demo!

Built as part of the Irish genealogy project at irishroots.ie. The source is on GitHub.
