DB data migration is a pain

#database #java #opensource #sql

JDBMig: A Lightweight CLI Tool for Cross-Database Data Migration

When a client tasked me with migrating data from MySQL to PostgreSQL (and later Oracle), I conducted an extensive search for existing open-source solutions. While there are excellent commercial tools available, I found a gap in the open-source market for something that was both simple and highly functional without excessive overhead.

At the time, I was working extensively with Java and JDBC. Given that JDBC drivers exist for virtually every data source, I decided to architect my own utility: JDBMig.

Core Objectives

The design philosophy was driven by four technical requirements:

Simplicity: A streamlined CLI interface for terminal-based operations.
OS Agnostic: Full portability across different operating systems.
DB Independence: Leveraging JDBC to ensure compatibility with any SQL-compliant database.
Data Mapping: The ability to map data structures with support for minor transformations or "adjustments" during the process.

Architecture & Workflow

The utility implements an ETL-like (Extract, Transform, Load) logic, decoupled into two primary phases to ensure data integrity and flexibility:

Export Phase: The tool connects to the source database and serializes the dataset into a structured JSON format.
Import Phase: The utility parses the JSON payload and executes the ingestion logic into the target schema.

Schema Management & Hooks

Automating the full DDL (Data Definition Language) generation for different dialects (e.g., converting MySQL AUTO_INCREMENT to PostgreSQL SERIAL or Sequences) is notoriously complex. Given the project's constraints, I opted for a pragmatic approach using lifecycle hooks:

beforeImport hook: Executes a custom SQL script to initialize the schema (tables, types, etc.) on the target DB prior to data ingestion.
afterImport hook: Executes a final SQL script to handle post-migration tasks, such as generating indices, constraints, and updating sequences.

This approach provides the developer with full control over the schema mapping without the tool needing to handle complex SQL dialect translations.

Usage

java -jar JDBMig.jar --import --config config/config.json --dataDir /path/to/exported/data

--export: Export data from the source database to JSON files.
--import: Import data from JSON files into the target database.
--config: Path to your configuration file.
--dataDir: Directory to store/read exported JSON files.
--useConn: Specifies which connection to use if multiple are defined.

Config file example:

{
  "dataDir": "data/",
  "fieldToLowerCase": false,
  "prettyPrint": false,
  "useConn": "MYSQL_connection",
  "tables": ["table1", "table2"],
  "MYSQL_connection": {
    "type": "mysql",
    "initString": null,
    "jdbcUrl": "jdbc:mysql://localhost:3306/DATABASE_NAME",
    "user": "user",
    "password": "password"
  },
  "drivers": [
    {"name": "mysql", "className": "com.mysql.jdbc.Driver", "jarFile": "lib/mysql-5.1.18.jar"},
    {"name": "postgres", "className": "org.postgresql.Driver", "jarFile": "lib/pgsql-42.2.15.jar"}
  ]
}