Eresh Gorantla

Posted on Mar 20

I Built a Schema Migration Tool for Cassandra Because Nothing Else Worked

#devops #schema #database #python

Flyway and Liquibase treat Cassandra like PostgreSQL. They miss async DDL, distributed locking, and partial failures. So I built cqltrack, an open-source CLI and Python library with schema agreement checks, LWT locking, and a CQL linter.

And then open-sourced it.

If you manage Apache Cassandra in production, you've felt this pain. You need to add a column, create an index, or restructure a table. So you SSH into a node, open cqlsh, and paste your CQL. Then you do it again on staging. Then someone on your team does the same change slightly differently on another cluster.

Sound familiar?

I looked at existing tools. Flyway has a Cassandra plugin. Liquibase can talk CQL. But they all treat Cassandra like PostgreSQL with a different query language. They miss the hard parts:

DDL is asynchronous. When you CREATE TABLE, the coordinator returns success before all nodes know the table exists. Your next statement — CREATE INDEX on that table — lands on a different node that hasn't seen it yet. It fails. Flyway doesn't handle this. Neither does Liquibase.

There are no DDL transactions. If statement 3 of 5 fails, statements 1 and 2 already executed. You can't roll back. Your database is in a half-migrated state.

Multiple processes can migrate simultaneously. Two CI runners deploying to the same cluster. Both see pending migrations. Both try to apply them. Chaos.

So I built cqltrack (published on PyPI as cql-track).

What It Does

cqltrack is a schema migration tool that actually understands Cassandra's distributed nature. Install it with pip:

pip install cql-track

14 commands. One CLI. Everything you need:

cqltrack init         Create keyspace + tracking tables
cqltrack migrate      Apply pending migrations
cqltrack rollback     Undo migrations
cqltrack status       Applied vs pending at a glance
cqltrack history      Full audit trail with timing
cqltrack lint         Static analysis for dangerous CQL
cqltrack diff         Compare schemas across environments
cqltrack snapshot     Export live schema as CQL
cqltrack baseline     Adopt on existing databases
cqltrack pending      CI gate — exit code 1 if unapplied
cqltrack validate     Detect modified migration files
cqltrack repair       Accept intentional file changes
cqltrack new          Scaffold a migration file
cqltrack profiles     List environment profiles
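
If your pipeline steps are written in Python rather than shell, the CI gate can be wrapped in a few lines. This is a minimal sketch: the only thing taken from the command list above is that cqltrack pending exits with code 1 when migrations are unapplied. Treating exit code 0 as "nothing pending" and everything else as an error is my assumption, and has_pending_migrations is a made-up helper name.

```python
import subprocess


def has_pending_migrations(command=("cqltrack", "pending")):
    """Run the gate command and translate its exit code into a boolean.

    Assumption: 0 means nothing is pending, 1 means unapplied migrations
    exist (as in the command list above), anything else is an error.
    """
    completed = subprocess.run(list(command), capture_output=True, text=True)
    if completed.returncode == 0:
        return False
    if completed.returncode == 1:
        return True
    raise RuntimeError(
        f"unexpected exit code {completed.returncode}: {completed.stderr.strip()}"
    )
```

A deploy script could call has_pending_migrations() and refuse to roll out the application until it returns False.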

But the real value is in what happens under the hood.

Architecture

Two entry points — CLI for terminal/CI, Python API for embedding in applications. Same engine underneath.

Schema Agreement — The Problem Nobody Talks About

Cassandra propagates schema changes through gossip. After a DDL statement, nodes converge asynchronously. The common "fix" is sleep(5). That's not a fix.

cqltrack polls system.local and system.peers after every DDL statement, waiting until all nodes report the same schema_version UUID. Only then does it proceed.

Configurable timeout (default: 30s). DML statements (INSERT, UPDATE) skip this entirely — they don't need agreement.
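
To make the polling idea concrete, here is a rough sketch of a schema agreement wait. It is not cqltrack's actual code. It assumes a session object whose execute() returns rows with a schema_version attribute (which is what the DataStax Python driver's default named-tuple rows give you), and the function names, the 0.5s poll interval, and the list of DDL keywords are all illustrative choices.

```python
import time

# Assumption: these keywords are enough to recognise DDL in this sketch
DDL_PREFIXES = ("CREATE", "ALTER", "DROP")


def is_ddl(statement):
    """DDL changes the schema; DML such as INSERT or UPDATE does not."""
    return statement.lstrip().upper().startswith(DDL_PREFIXES)


def schema_versions(session):
    """Collect the distinct schema_version values seen by the coordinator."""
    local = session.execute("SELECT schema_version FROM system.local")
    peers = session.execute("SELECT schema_version FROM system.peers")
    rows = list(local) + list(peers)
    return {row.schema_version for row in rows if row.schema_version is not None}


def wait_for_schema_agreement(session, timeout=30.0, interval=0.5,
                              sleep=time.sleep, clock=time.monotonic):
    """Poll until every node reports the same schema_version or time runs out."""
    deadline = clock() + timeout
    while True:
        if len(schema_versions(session)) == 1:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def run_statement(session, statement, timeout=30.0):
    """Execute one statement; only DDL waits for agreement afterwards."""
    session.execute(statement)
    if is_ddl(statement) and not wait_for_schema_agreement(session, timeout=timeout):
        raise TimeoutError(f"no schema agreement within {timeout}s")
```

The sleep and clock parameters exist only so the loop can be exercised in tests without real waiting.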

Distributed Locking with Cassandra's Own LWT

No external dependencies. No ZooKeeper. No Redis. cqltrack uses Cassandra's Lightweight Transactions to implement a distributed lock.

When two CI runners or Kubernetes pods try to migrate simultaneously, only one acquires the lock. The other waits, then finds nothing pending.

Key design decisions:

TTL safety net — if a worker crashes without releasing, the lock auto-expires in 10 minutes
Ownership check — DELETE IF owner = me prevents one process from releasing another's lock
SERIAL consistency — full linearizability regardless of your configured consistency level
30 retries, 2s intervals — patient waiting before giving up
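
The design decisions above can be sketched roughly like this. It is an illustration of the pattern, not cqltrack's implementation: the migration_lock table, the MigrationLock class, and the lock name are hypothetical. The sketch relies on result.was_applied, which the DataStax Python driver exposes on LWT results, and on the driver's %s placeholders for unprepared statements.

```python
import time
import uuid

# Hypothetical table for this sketch:
#   CREATE TABLE migration_lock (name text PRIMARY KEY, owner text)
# TTL 600 is the 10-minute safety net: a crashed holder's row expires by itself.
ACQUIRE_CQL = (
    "INSERT INTO migration_lock (name, owner) VALUES (%s, %s) "
    "IF NOT EXISTS USING TTL 600"
)
# The IF condition makes sure only the current owner can release the lock.
RELEASE_CQL = "DELETE FROM migration_lock WHERE name = %s IF owner = %s"

# With the DataStax driver, one way to pin SERIAL consistency is to wrap
# these strings in SimpleStatement(..., serial_consistency_level=ConsistencyLevel.SERIAL).


class MigrationLock:
    def __init__(self, session, name="migrate", owner=None,
                 retries=30, interval=2.0, sleep=time.sleep):
        self.session = session
        self.name = name
        self.owner = owner or str(uuid.uuid4())
        self.retries = retries
        self.interval = interval
        self.sleep = sleep

    def acquire(self):
        """Try the LWT insert up to `retries` times, pausing between attempts."""
        for attempt in range(self.retries):
            result = self.session.execute(ACQUIRE_CQL, (self.name, self.owner))
            if result.was_applied:
                return True
            if attempt + 1 < self.retries:
                self.sleep(self.interval)
        return False

    def release(self):
        """Delete the lock row only if this process still owns it."""
        result = self.session.execute(RELEASE_CQL, (self.name, self.owner))
        return result.was_applied
```

A caller would wrap the migration run in acquire() and a finally block that calls release(), so the TTL only matters when the process dies outright.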

Partial Failure Tracking

Statement 3 fails. Statements 1 and 2 already ran. Cassandra has no rollback.

When a migration fails:

It's recorded in the history table with status = failed
cqltrack status shows it as pending — ready for retry
cqltrack history shows it as FAILED — full audit trail
Fix the CQL file, run migrate again — clean retry

The linter helps prevent retryability issues. It warns you if you write CREATE TABLE without IF NOT EXISTS — because on retry, that statement would fail even though the table already exists from the first attempt.
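
Here is a small sketch of the failure-tracking flow described above, using an in-memory list in place of the real history table. The MigrationRecord fields and both function names are illustrative assumptions, not cqltrack's schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MigrationRecord:
    version: str
    status: str                              # "applied" or "failed"
    failed_statement: Optional[int] = None   # 1-based position of the failing statement
    error: Optional[str] = None


def apply_migration(version, statements, execute, history):
    """Run statements in order; stop at the first failure and record it.

    Earlier statements stay applied because Cassandra cannot roll them back.
    """
    for position, statement in enumerate(statements, start=1):
        try:
            execute(statement)
        except Exception as exc:
            record = MigrationRecord(version, "failed", position, str(exc))
            history.append(record)
            return record
    record = MigrationRecord(version, "applied")
    history.append(record)
    return record


def pending_versions(all_versions, history):
    """A version stays pending until some history entry marks it applied."""
    applied = {record.version for record in history if record.status == "applied"}
    return [version for version in all_versions if version not in applied]
```

Because a failed entry never counts as applied, the migration shows up as pending again while the history keeps the failed attempt for auditing.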

Not Just a CLI — A Python Library

Starting from v1.1.0, cqltrack works as a programmatic API. Embed migrations directly in your application startup:

FastAPI:

from contextlib import asynccontextmanager
from fastapi import FastAPI
from cqltrack import CqlTrack

@asynccontextmanager
async def lifespan(app: FastAPI):
    with CqlTrack("cqltrack.yml") as tracker:
        tracker.init()
        tracker.migrate()
        yield

app = FastAPI(lifespan=lifespan)

Django:

from django.apps import AppConfig
from cqltrack import CqlTrack

class MyAppConfig(AppConfig):
    def ready(self):
        with CqlTrack("cqltrack.yml") as tracker:
            tracker.init()
            tracker.migrate()

Plain Python — no YAML needed:

from cqltrack import CqlTrack

with CqlTrack(contact_points=["127.0.0.1"], keyspace="my_app") as t:
    t.init()
    t.migrate()

The distributed lock ensures only one Gunicorn worker or Kubernetes pod runs migrations. Others wait, then find nothing pending.

Built-in Static Analysis

8 rules that catch real problems:

Rule	What it catches
`no-rollback`	Missing `@down` section
`empty-rollback`	Empty rollback — false sense of safety
`drop-no-if-exists`	`DROP TABLE` without `IF EXISTS` — not idempotent
`create-no-if-not-exists`	`CREATE TABLE` without `IF NOT EXISTS`
`column-drop`	`ALTER TABLE DROP` in UP section — permanent data loss
`truncate`	`TRUNCATE` — wipes all data
`pk-alter`	PRIMARY KEY change — impossible in Cassandra
`type-change`	Column type change — very limited support

Context-aware: ALTER TABLE DROP in @down (rollback) is expected and not flagged. No running Cassandra needed.
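
To show how context-aware checks like these can work without a running cluster, here is a deliberately small regex-based sketch covering a handful of the rules. It is not the real linter. It assumes that any line containing @down starts the rollback section and that everything before it is the up section; the exact file format, the patterns, and the function names are my assumptions.

```python
import re

FLAGS = re.IGNORECASE

# (rule name, pattern, sections in which the rule is checked)
RULES = [
    ("create-no-if-not-exists",
     re.compile(r"\bCREATE\s+TABLE\b(?!\s+IF\s+NOT\s+EXISTS\b)", FLAGS),
     ("up", "down")),
    ("drop-no-if-exists",
     re.compile(r"\bDROP\s+TABLE\b(?!\s+IF\s+EXISTS\b)", FLAGS),
     ("up", "down")),
    # Dropping a column is expected in a rollback, so only the up section is checked
    ("column-drop",
     re.compile(r"\bALTER\s+TABLE\s+\S+\s+DROP\b", FLAGS),
     ("up",)),
    ("truncate", re.compile(r"\bTRUNCATE\b", FLAGS), ("up", "down")),
]


def split_sections(text):
    """Split a migration file into up/down text at the first @down marker line."""
    sections = {"up": [], "down": []}
    current, has_down = "up", False
    for line in text.splitlines():
        if "@down" in line:
            current, has_down = "down", True
            continue
        sections[current].append(line)
    return {name: "\n".join(lines) for name, lines in sections.items()}, has_down


def lint(text):
    """Return a list of (rule, section) findings for one migration file."""
    sections, has_down = split_sections(text)
    findings = []
    if not has_down:
        findings.append(("no-rollback", "down"))
    elif not sections["down"].strip():
        findings.append(("empty-rollback", "down"))
    for name, pattern, applies_to in RULES:
        for section in applies_to:
            if pattern.search(sections[section]):
                findings.append((name, section))
    return findings
```

A real linter would also need to ignore keywords inside comments and string literals, which this sketch does not attempt.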

Schema Diff Across Environments

$ cqltrack diff --source dev --target prod

Schema diff: myapp_dev <-> myapp_prod

  table       audit_log                     only in dev
  column      users.phone                   only in dev        text
  index       idx_users_email               only in dev

3 difference(s) found.

Compares tables, columns (with types), partition keys, clustering keys, indexes, and UDTs. Works across clusters via profiles or between keyspaces on the same cluster.
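
The core of a diff like this is a set comparison over schema metadata. The sketch below handles only tables and columns and assumes each schema has already been loaded into a plain dict of the form {table: {column: cql_type}}; that shape and the diff_schemas name are illustrative, not how cqltrack represents schemas internally.

```python
def diff_schemas(source, target, source_name="source", target_name="target"):
    """Compare two {table: {column: cql_type}} mappings.

    Returns a sorted list of (kind, name, detail) tuples.
    """
    differences = []
    for table in sorted(set(source) | set(target)):
        if table not in target:
            differences.append(("table", table, f"only in {source_name}"))
            continue
        if table not in source:
            differences.append(("table", table, f"only in {target_name}"))
            continue
        src_cols, tgt_cols = source[table], target[table]
        for column in sorted(set(src_cols) | set(tgt_cols)):
            qualified = f"{table}.{column}"
            if column not in tgt_cols:
                detail = f"only in {source_name} ({src_cols[column]})"
            elif column not in src_cols:
                detail = f"only in {target_name} ({tgt_cols[column]})"
            elif src_cols[column] != tgt_cols[column]:
                detail = f"type differs: {src_cols[column]} vs {tgt_cols[column]}"
            else:
                continue
            differences.append(("column", qualified, detail))
    return differences
```

Keys, indexes, and UDTs would follow the same pattern with their own metadata dicts.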

Multi-Environment Profiles

One YAML file. All environments:

profiles:
  dev:
    keyspace:
      name: myapp_dev
  prod:
    cassandra:
      contact_points: [prod-1, prod-2, prod-3]
      consistency: LOCAL_QUORUM
    keyspace:
      name: myapp_prod

cqltrack --profile prod migrate
cqltrack --profile dev diff --target-profile prod

Config resolution (last wins):
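
As a generic illustration of what "last wins" means for nested settings, here is a small deep-merge sketch. The layer order cqltrack actually uses is not reproduced here, so the layers a caller passes (for example defaults followed by a profile) are purely hypothetical, as are the function names.

```python
import copy


def deep_merge(base, override):
    """Return a new dict where values from `override` win, merging nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_config(*layers):
    """Fold any number of config layers left to right; later layers win."""
    resolved = {}
    for layer in layers:
        resolved = deep_merge(resolved, layer)
    return resolved
```

With this shape, a profile that sets only keyspace.name keeps every other setting from the layers before it.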

The Full Migration Lifecycle

How It Compares

	cassandra-migrate	cassandra-migration	cqltrack
Schema agreement	No	No	Yes
Distributed lock (LWT)	No	Basic	Yes + TTL
Partial failure tracking	No	No	Yes
CQL linter	No	No	8 rules
Schema diff	No	No	Yes
Multi-env profiles	No	No	Yes
JSON output for CI	No	No	Yes
Astra DB	No	No	Yes
Baseline adoption	No	No	Yes
Programmatic API	No	No	Yes

Compatibility

Verified with:

Apache Cassandra 3.x, 4.x, 5.x
DataStax Enterprise (DSE)
DataStax Astra DB (cloud)

Amazon Keyspaces (often called AWS Keyspaces) is not supported yet because it doesn't support Lightweight Transactions the way open-source Cassandra does. It's on the roadmap.

Get Started

pip install cql-track

GitHub: github.com/ereshzealous/cql-track
PyPI: pypi.org/project/cql-track
Docs: USAGE.md
Examples: FastAPI, Django, plain Python

MIT licensed. Contributions welcome.

If you've struggled with Cassandra schema management, give it a try. Star the repo if it saves you time.

If you get a chance to try it out, I'd appreciate any feedback. Feel free to open an issue at github.com/ereshzealous/cql-track/issues if you run into anything.
