DEV Community

Eresh Gorantla

I Built a Schema Migration Tool for Cassandra Because Nothing Else Worked

Flyway and Liquibase treat Cassandra like PostgreSQL. They miss async DDL, distributed locking, and partial failures. So I built cqltrack — open-source CLI + Python library with schema agreement, LWT locking, and a CQL linter.

And then open-sourced it.


If you manage Apache Cassandra in production, you've felt this pain. You need to add a column, create an index, or restructure a table. So you SSH into a node, open cqlsh, and paste your CQL. Then you do it again on staging. Then someone on your team does the same change slightly differently on another cluster.

Sound familiar?

I looked at existing tools. Flyway has a Cassandra plugin. Liquibase can talk CQL. But they all treat Cassandra like PostgreSQL with a different query language. They miss the hard parts:

DDL is asynchronous. When you CREATE TABLE, the coordinator returns success before all nodes know the table exists. Your next statement — CREATE INDEX on that table — lands on a different node that hasn't seen it yet. It fails. Flyway doesn't handle this. Neither does Liquibase.

There are no DDL transactions. If statement 3 of 5 fails, statements 1 and 2 already executed. You can't roll back. Your database is in a half-migrated state.

Multiple processes can migrate simultaneously. Two CI runners deploying to the same cluster. Both see pending migrations. Both try to apply them. Chaos.

So I built cqltrack.

What It Does

cqltrack is a schema migration tool that actually understands Cassandra's distributed nature. Install it with pip:

pip install cql-track

14 commands. One CLI. Everything you need:

cqltrack init         Create keyspace + tracking tables
cqltrack migrate      Apply pending migrations
cqltrack rollback     Undo migrations
cqltrack status       Applied vs pending at a glance
cqltrack history      Full audit trail with timing
cqltrack lint         Static analysis for dangerous CQL
cqltrack diff         Compare schemas across environments
cqltrack snapshot     Export live schema as CQL
cqltrack baseline     Adopt on existing databases
cqltrack pending      CI gate — exit code 1 if unapplied
cqltrack validate     Detect modified migration files
cqltrack repair       Accept intentional file changes
cqltrack new          Scaffold a migration file
cqltrack profiles     List environment profiles

But the real value is in what happens under the hood.


Architecture

Two entry points — CLI for terminal/CI, Python API for embedding in applications. Same engine underneath.

[Figure: High-level architecture]


Schema Agreement — The Problem Nobody Talks About

Cassandra propagates schema changes through gossip. After a DDL statement, nodes converge asynchronously. The common "fix" is sleep(5). That's not a fix.

cqltrack polls system.local and system.peers after every DDL statement, waiting until all nodes report the same schema_version UUID. Only then does it proceed.

[Figure: Schema agreement]

Configurable timeout (default: 30s). DML statements (INSERT, UPDATE) skip this entirely — they don't need agreement.
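
As a rough illustration of the polling loop described above, here is a minimal sketch. It is not cqltrack's actual code: `fetch_versions` is a hypothetical callable standing in for the queries against system.local and system.peers, and the poll interval is an assumption.

```python
import time
from typing import Callable, Iterable, Optional


def wait_for_schema_agreement(
    fetch_versions: Callable[[], Iterable[Optional[str]]],
    timeout: float = 30.0,
    interval: float = 0.2,
) -> bool:
    """Poll until every node reports the same schema version.

    fetch_versions (hypothetical) returns one schema_version per node,
    e.g. the values read from system.local and system.peers.
    Returns True on agreement, False if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        versions = set(fetch_versions())
        # Agreement: exactly one distinct, non-null version across all nodes.
        if len(versions) == 1 and None not in versions:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A caller would run this after each DDL statement and treat a False result as a failed migration step rather than continuing blindly.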


Distributed Locking with Cassandra's Own LWT

No external dependencies. No ZooKeeper. No Redis. cqltrack uses Cassandra's Lightweight Transactions to implement a distributed lock.

[Figure: Distributed locking]

When two CI runners or Kubernetes pods try to migrate simultaneously:

[Figure: Locking mechanism]

Key design decisions:

  • TTL safety net — if a worker crashes without releasing, the lock auto-expires in 10 minutes
  • Ownership check — DELETE IF owner = me prevents one process from releasing another's lock
  • SERIAL consistency — full linearizability regardless of your configured consistency level
  • 30 retries, 2s intervals — patient waiting before giving up
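
The design decisions above can be sketched roughly as follows. This is an illustration, not cqltrack's implementation: the table and column names (`cqltrack_lock`, `lock_name`, `owner`) are made up, and `execute` is a hypothetical callable that runs one LWT statement at SERIAL consistency and returns whether Cassandra applied it.

```python
import time
import uuid
from typing import Callable, Sequence

# Hypothetical table and column names; the real schema may differ.
ACQUIRE_CQL = (
    "INSERT INTO cqltrack_lock (lock_name, owner) VALUES (%s, %s) "
    "IF NOT EXISTS USING TTL 600"  # TTL safety net: 10 minutes
)
RELEASE_CQL = "DELETE FROM cqltrack_lock WHERE lock_name = %s IF owner = %s"

# execute(cql, params) -> True when the conditional statement was applied.
Execute = Callable[[str, Sequence[str]], bool]


class MigrationLock:
    def __init__(self, execute: Execute, name: str = "migration",
                 retries: int = 30, delay: float = 2.0):
        self.execute = execute
        self.name = name
        self.owner = str(uuid.uuid4())  # unique per process
        self.retries = retries
        self.delay = delay

    def acquire(self) -> bool:
        """Try to insert the lock row, retrying while another owner holds it."""
        for attempt in range(self.retries):
            if self.execute(ACQUIRE_CQL, (self.name, self.owner)):
                return True
            if attempt < self.retries - 1:
                time.sleep(self.delay)
        return False

    def release(self) -> bool:
        """Delete the lock row only if this process still owns it."""
        return self.execute(RELEASE_CQL, (self.name, self.owner))
```

Because the release is conditional on the owner, a process whose lock already expired and was taken over by someone else cannot delete the new holder's row.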

Partial Failure Tracking

Statement 3 fails. Statements 1 and 2 already ran. Cassandra has no rollback.

[Figure: Partial failures]

When a migration fails:

  1. It's recorded in the history table with status = failed
  2. cqltrack status shows it as pending — ready for retry
  3. cqltrack history shows it as FAILED — full audit trail
  4. Fix the CQL file, run migrate again — clean retry

The linter helps prevent retryability issues. It warns you if you write CREATE TABLE without IF NOT EXISTS — because on retry, that statement would fail even though the table already exists from the first attempt.
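
To make the failure handling concrete, here is a minimal sketch of statement-by-statement execution with outcome recording. It is an assumption-laden illustration: the history rows are plain dicts, the `applied` status name and the `failed_statement` field are invented for the example, and `execute` is a hypothetical callable that raises on error.

```python
from typing import Callable, Dict, List, Sequence


def apply_migration(
    version: str,
    statements: Sequence[str],
    execute: Callable[[str], None],
    history: List[dict],
) -> bool:
    """Run statements in order and append the outcome to history."""
    for index, statement in enumerate(statements, start=1):
        try:
            execute(statement)
        except Exception as exc:
            # Earlier statements stay applied; we can only record what happened.
            history.append({
                "version": version,
                "status": "failed",
                "failed_statement": index,
                "error": str(exc),
            })
            return False
    history.append({"version": version, "status": "applied"})
    return True


def pending(versions: Sequence[str], history: List[dict]) -> List[str]:
    """A version is pending unless its latest history row says applied."""
    latest: Dict[str, str] = {}
    for row in history:
        latest[row["version"]] = row["status"]
    return [v for v in versions if latest.get(v) != "applied"]
```

A failed version stays in the pending list, so the next run retries it from the first statement. That is why idempotent statements (IF NOT EXISTS, IF EXISTS) matter.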


Not Just a CLI — A Python Library

Starting with v1.1.0, cqltrack also exposes a Python API. Embed migrations directly in your application startup:

FastAPI:

from contextlib import asynccontextmanager
from fastapi import FastAPI
from cqltrack import CqlTrack

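@asynccontextmanager
async def lifespan(app: FastAPI):
    # Startup: create the tracking tables if needed, then apply pending migrations.
    with CqlTrack("cqltrack.yml") as tracker:
        tracker.init()
        tracker.migrate()
        # The app serves requests while suspended here; the with block exits at shutdown.
        yield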

app = FastAPI(lifespan=lifespan)

Django:

from django.apps import AppConfig
from cqltrack import CqlTrack

class MyAppConfig(AppConfig):
    def ready(self):
        with CqlTrack("cqltrack.yml") as tracker:
            tracker.init()
            tracker.migrate()

Plain Python — no YAML needed:

from cqltrack import CqlTrack

with CqlTrack(contact_points=["127.0.0.1"], keyspace="my_app") as t:
    t.init()
    t.migrate()

The distributed lock ensures only one Gunicorn worker or Kubernetes pod runs migrations. Others wait, then find nothing pending.

[Figure: Kubernetes deployment]

Built-in Static Analysis

8 rules that catch real problems:

Rule                      What it catches
no-rollback               Missing @down section
empty-rollback            Empty rollback — false sense of safety
drop-no-if-exists         DROP TABLE without IF EXISTS — not idempotent
create-no-if-not-exists   CREATE TABLE without IF NOT EXISTS
column-drop               ALTER TABLE DROP in UP section — permanent data loss
truncate                  TRUNCATE — wipes all data
pk-alter                  PRIMARY KEY change — impossible in Cassandra
type-change               Column type change — very limited support

Context-aware: ALTER TABLE DROP in @down (rollback) is expected and not flagged. No running Cassandra needed.
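
For a sense of how two of these rules could work without a running cluster, here is a minimal regex-based sketch. It only borrows the rule names from the table; the patterns are my own assumption, and the sketch ignores comments, string literals, and the @down context that the real linter accounts for.

```python
import re
from typing import List, Tuple

# Flag the statement unless the idempotency guard follows the keyword.
RULES = {
    "create-no-if-not-exists": re.compile(
        r"\bCREATE\s+TABLE\b(?!\s+IF\s+NOT\s+EXISTS\b)", re.IGNORECASE
    ),
    "drop-no-if-exists": re.compile(
        r"\bDROP\s+TABLE\b(?!\s+IF\s+EXISTS\b)", re.IGNORECASE
    ),
}


def lint(cql: str) -> List[Tuple[int, str]]:
    """Return (line number, rule name) pairs, sorted by line."""
    findings = []
    for rule, pattern in RULES.items():
        for match in pattern.finditer(cql):
            # Line number = newlines before the match start, plus one.
            lineno = cql.count("\n", 0, match.start()) + 1
            findings.append((lineno, rule))
    return sorted(findings)
```

Matching against the whole text rather than line by line means a guard placed on the following line still counts.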


Schema Diff Across Environments

$ cqltrack diff --source dev --target prod

Schema diff: myapp_dev <-> myapp_prod

  table       audit_log                     only in dev
  column      users.phone                   only in dev        text
  index       idx_users_email               only in dev

3 difference(s) found.

[Figure: cqltrack schema diff]

Compares tables, columns (with types), partition keys, clustering keys, indexes, and UDTs. Works across clusters via profiles or between keyspaces on the same cluster.
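
The core of such a comparison can be sketched as a set difference over schema metadata. The snippet below is a simplified assumption, not the tool's code: it models a schema as a plain dict of table name to column types and covers only tables and columns, not keys, indexes, or UDTs.

```python
from typing import Dict, List, Tuple

Schema = Dict[str, Dict[str, str]]  # table -> {column: CQL type}


def diff_schemas(
    source: Schema,
    target: Schema,
    source_name: str = "source",
    target_name: str = "target",
) -> List[Tuple[str, str, str]]:
    """Return (kind, object name, description) for every difference."""
    diffs = []
    for table in sorted(set(source) | set(target)):
        if table not in target:
            diffs.append(("table", table, f"only in {source_name}"))
            continue
        if table not in source:
            diffs.append(("table", table, f"only in {target_name}"))
            continue
        src_cols, tgt_cols = source[table], target[table]
        for column in sorted(set(src_cols) | set(tgt_cols)):
            name = f"{table}.{column}"
            if column not in tgt_cols:
                diffs.append(("column", name, f"only in {source_name} ({src_cols[column]})"))
            elif column not in src_cols:
                diffs.append(("column", name, f"only in {target_name} ({tgt_cols[column]})"))
            elif src_cols[column] != tgt_cols[column]:
                diffs.append(("column", name, f"type {src_cols[column]} != {tgt_cols[column]}"))
    return diffs
```

In practice the two dicts would be built from each cluster's schema metadata before being compared.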


Multi-Environment Profiles

One YAML file. All environments:

profiles:
  dev:
    keyspace:
      name: myapp_dev
  prod:
    cassandra:
      contact_points: [prod-1, prod-2, prod-3]
      consistency: LOCAL_QUORUM
    keyspace:
      name: myapp_prod

Select a profile on the command line:

cqltrack --profile prod migrate
cqltrack --profile dev diff --target-profile prod

Config resolution (last wins):

[Figure: Config resolution]
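
"Last wins" layering is commonly implemented as a recursive merge, which the sketch below illustrates. The layers used in the usage note (built-in defaults, a profile, command-line overrides) and their order are assumptions for the example, not a statement of cqltrack's actual resolution order.

```python
from typing import Any, Dict


def merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with override applied on top of base, recursively."""
    result = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are mappings: merge them instead of replacing.
            result[key] = merge(result[key], value)
        else:
            result[key] = value
    return result


def resolve(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Apply layers left to right; later layers win on conflicts."""
    config: Dict[str, Any] = {}
    for layer in layers:
        config = merge(config, layer)
    return config
```

For example, `resolve(defaults, profile, cli_overrides)` keeps any default that neither the profile nor the command line sets, while a key set in several layers takes its value from the last one.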


The Full Migration Lifecycle

[Figure: Migration lifecycle]

How It Compares

Feature                    cassandra-migrate   cassandra-migration   cqltrack
Schema agreement           No                  No                    Yes
Distributed lock (LWT)     No                  Basic                 Yes + TTL
Partial failure tracking   No                  No                    Yes
CQL linter                 No                  No                    8 rules
Schema diff                No                  No                    Yes
Multi-env profiles         No                  No                    Yes
JSON output for CI         No                  No                    Yes
Astra DB                   No                  No                    Yes
Baseline adoption          No                  No                    Yes
Programmatic API           No                  No                    Yes

Compatibility

Verified with:

  • Apache Cassandra 3.x, 4.x, 5.x
  • DataStax Enterprise (DSE)
  • DataStax Astra DB (cloud)

Amazon Keyspaces is not supported yet: its Lightweight Transactions do not behave the way open-source Cassandra's do. Support is on the roadmap.


Get Started

pip install cql-track

GitHub: github.com/ereshzealous/cql-track
PyPI: pypi.org/project/cql-track
Docs: USAGE.md
Examples: FastAPI, Django, plain Python

MIT licensed. Contributions welcome.

If you've struggled with Cassandra schema management, give it a try. Star the repo if it saves you time.

If you get a chance to try it out, I'd appreciate any feedback. Feel free to open an issue at github.com/ereshzealous/cql-track/issues if you run into anything.

