DEV Community

Cover image for How I Automated Database Seeding in CI/CD with One API Call
DDLTODATA
DDLTODATA

Posted on

How I Automated Database Seeding in CI/CD with One API Call

Automate PostgreSQL test data generation in your CI/CD pipeline. No more flaky tests from inconsistent data.


Manual test data generation doesn't scale. When you're running hundreds of CI builds per day, you need data generation that's fast, consistent, and fully automated.

DDL to Data provides a REST API that generates realistic PostgreSQL test data in milliseconds — fast enough for every pipeline run.

Quick Start

1. Get Your API Key

Sign up at ddltodata.com/tiers — the free tier includes 50,000 rows/month.

2. Save Your Schema

curl -X POST https://api.ddltodata.com/schemas \
  -H "Content-Type: application/json" \
  -H "X-API-Key: YOUR_API_KEY" \
  -d '{
    "name": "users",
    "ddl": "CREATE TABLE users (id SERIAL PRIMARY KEY, first_name VARCHAR(50), last_name VARCHAR(50), email VARCHAR(100) UNIQUE, created_at TIMESTAMP DEFAULT NOW());"
  }'
Enter fullscreen mode Exit fullscreen mode

3. Generate Data

curl https://api.ddltodata.com/generate/users?rows=100 \
  -H "X-API-Key: YOUR_API_KEY"
Enter fullscreen mode Exit fullscreen mode

Response:

[
  {
    "id": 1,
    "first_name": "Sarah",
    "last_name": "Chen",
    "email": "sarah.chen@techstartup.io",
    "created_at": "2024-03-15T09:23:41"
  },
  ...
]
Enter fullscreen mode Exit fullscreen mode

That's it. Three lines to generate 100 rows of realistic test data.

API Reference

Base URL

https://api.ddltodata.com
Enter fullscreen mode Exit fullscreen mode

Authentication

All requests require an API key in the header:

X-API-Key: YOUR_API_KEY
Enter fullscreen mode Exit fullscreen mode

Endpoints

Save Schema

POST /schemas
Enter fullscreen mode Exit fullscreen mode
{
  "name": "products",
  "ddl": "CREATE TABLE products (id SERIAL PRIMARY KEY, name VARCHAR(100), price DECIMAL(10,2));"
}
Enter fullscreen mode Exit fullscreen mode

Generate Data

GET /generate/{schema_name}?rows={count}&format={format}
Enter fullscreen mode Exit fullscreen mode

Parameters:

  • rows — Number of rows (default: 10, max: 10,000)
  • format — Output format: json, csv, or sql

Generate Related Data

For multi-table schemas with foreign keys:

POST /generate/related?format={format}
Enter fullscreen mode Exit fullscreen mode
{
  "tables": {
    "users": 10,
    "orders": 50,
    "order_items": 200
  }
}
Enter fullscreen mode Exit fullscreen mode

Returns data for all tables with referential integrity maintained.

List Schemas

GET /schemas
Enter fullscreen mode Exit fullscreen mode

Delete Schema

DELETE /schemas/{schema_name}
Enter fullscreen mode Exit fullscreen mode

CI/CD Integration

GitHub Actions

name: Test with Generated Data

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest

    services:
      postgres:
        image: postgres:15
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: test_db
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - name: Create tables
        run: psql -h localhost -U postgres -d test_db -f schema.sql
        env:
          PGPASSWORD: postgres

      - name: Seed test data
        run: |
          curl -s "https://api.ddltodata.com/generate/users?rows=100&format=sql" \
            -H "X-API-Key: ${{ secrets.DDL_API_KEY }}" | \
            psql -h localhost -U postgres -d test_db
        env:
          PGPASSWORD: postgres

      - name: Run tests
        run: npm test
Enter fullscreen mode Exit fullscreen mode

CircleCI

version: 2.1

jobs:
  test:
    docker:
      - image: cimg/node:18.0
      - image: cimg/postgres:15.0
        environment:
          POSTGRES_USER: postgres
          POSTGRES_DB: test_db
          POSTGRES_PASSWORD: postgres

    steps:
      - checkout

      - run:
          name: Wait for Postgres
          command: dockerize -wait tcp://localhost:5432 -timeout 1m

      - run:
          name: Create tables
          command: psql -h localhost -U postgres -d test_db -f schema.sql

      - run:
          name: Seed test data
          command: |
            curl -s "https://api.ddltodata.com/generate/users?rows=100&format=sql" \
              -H "X-API-Key: ${DDL_API_KEY}" | \
              psql -h localhost -U postgres -d test_db

      - run:
          name: Run tests
          command: npm test
Enter fullscreen mode Exit fullscreen mode

GitLab CI

test:
  image: node:18
  services:
    - postgres:15
  variables:
    POSTGRES_DB: test_db
    POSTGRES_USER: postgres
    POSTGRES_PASSWORD: postgres
  script:
    - apt-get update && apt-get install -y postgresql-client curl
    - psql -h postgres -U postgres -d test_db -f schema.sql
    - |
      curl -s "https://api.ddltodata.com/generate/users?rows=100&format=sql" \
        -H "X-API-Key: ${DDL_API_KEY}" | \
        psql -h postgres -U postgres -d test_db
    - npm test
Enter fullscreen mode Exit fullscreen mode

Jenkins

pipeline {
    agent any

    environment {
        DDL_API_KEY = credentials('ddl-api-key')
    }

    stages {
        stage('Seed Database') {
            steps {
                sh '''
                    curl -s "https://api.ddltodata.com/generate/users?rows=100&format=sql" \
                      -H "X-API-Key: ${DDL_API_KEY}" | \
                      psql -h localhost -U postgres -d test_db
                '''
            }
        }

        stage('Test') {
            steps {
                sh 'npm test'
            }
        }
    }
}
Enter fullscreen mode Exit fullscreen mode

Language Examples

Python

import requests

API_KEY = "your_api_key"
BASE_URL = "https://api.ddltodata.com"

headers = {"X-API-Key": API_KEY}

# Generate data
response = requests.get(
    f"{BASE_URL}/generate/users",
    params={"rows": 100, "format": "json"},
    headers=headers
)

users = response.json()
print(f"Generated {len(users)} users")
Enter fullscreen mode Exit fullscreen mode

JavaScript/Node.js

const API_KEY = process.env.DDL_API_KEY;

async function generateTestData(schemaName, rows = 100) {
  const response = await fetch(
    `https://api.ddltodata.com/generate/${schemaName}?rows=${rows}`,
    {
      headers: { 'X-API-Key': API_KEY }
    }
  );
  return response.json();
}

const users = await generateTestData('users', 100);
console.log(`Generated ${users.length} users`);
Enter fullscreen mode Exit fullscreen mode

Go

package main

import (
    "encoding/json"
    "fmt"
    "net/http"
    "os"
)

func generateTestData(schemaName string, rows int) ([]map[string]interface{}, error) {
    url := fmt.Sprintf("https://api.ddltodata.com/generate/%s?rows=%d", schemaName, rows)

    req, _ := http.NewRequest("GET", url, nil)
    req.Header.Set("X-API-Key", os.Getenv("DDL_API_KEY"))

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    var data []map[string]interface{}
    json.NewDecoder(resp.Body).Decode(&data)
    return data, nil
}
Enter fullscreen mode Exit fullscreen mode

Ruby

require 'net/http'
require 'json'

API_KEY = ENV['DDL_API_KEY']

def generate_test_data(schema_name, rows: 100)
  uri = URI("https://api.ddltodata.com/generate/#{schema_name}?rows=#{rows}")

  req = Net::HTTP::Get.new(uri)
  req['X-API-Key'] = API_KEY

  res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) do |http|
    http.request(req)
  end

  JSON.parse(res.body)
end

users = generate_test_data('users', rows: 100)
puts "Generated #{users.length} users"
Enter fullscreen mode Exit fullscreen mode

Best Practices

Store API Key Securely

Never commit API keys to version control. Use environment variables or secrets management:

  • GitHub Actions: Repository secrets
  • CircleCI: Project environment variables
  • GitLab: CI/CD variables (masked)
  • Jenkins: Credentials plugin

Cache Schemas

Save schemas once, generate many times. Don't re-upload schemas on every pipeline run.

Use Appropriate Row Counts

  • Unit tests: 10-50 rows
  • Integration tests: 100-500 rows
  • Load tests: 1,000-10,000 rows

Handle Rate Limits

The API returns 429 Too Many Requests if you exceed limits. Implement exponential backoff:

import time
import requests

def generate_with_retry(schema_name, rows, max_retries=3):
    for attempt in range(max_retries):
        response = requests.get(
            f"https://api.ddltodata.com/generate/{schema_name}?rows={rows}",
            headers={"X-API-Key": API_KEY}
        )

        if response.status_code == 429:
            time.sleep(2 ** attempt)
            continue

        return response.json()

    raise Exception("Max retries exceeded")
Enter fullscreen mode Exit fullscreen mode

Performance

  • Response time: Under 100ms for most requests
  • Throughput: Generate 10,000 rows in a single request
  • Uptime: 99.9% SLA on paid plans

Fast enough that data generation won't slow down your CI/CD pipeline.

Pricing

  • Free — 50,000 rows/month, $0
  • Developer — 250,000 rows/month, $9/mo
  • Team — 1,000,000 rows/month, $29/mo

View all plans →

Get Started

  1. Get your API key: ddltodata.com/tiers
  2. Save your schema: POST /schemas
  3. Generate data: GET /generate/{schema}?rows=100
  4. Integrate with CI/CD: Use examples above

Questions? Reach out at support@ddltodata.com or check the User Guide.

Top comments (0)