Dexter

Posted on Nov 23

Why We Chose Go to Rewrite Our DB-to-Elasticsearch Sync Tool

The Challenge: Building a Better CDC Tool

In the modern data landscape, real-time synchronization from databases to search engines has become a critical requirement. Whether you're building e-commerce search, analytics dashboards, or log aggregation systems, you need reliable, fast, and maintainable CDC (Change Data Capture) solutions.

When we started ElasticRelay, we looked at existing solutions like Logstash, Debezium + Kafka Connect, and Apache Flink. While powerful, they often came with significant overhead:

Complex deployment: Multi-service architectures requiring Kafka clusters, ZooKeeper coordination, and JVM tuning
Resource intensive: High memory footprint and CPU usage, especially for smaller workloads
Configuration complexity: YAML/JSON configurations that quickly become unwieldy
Operational burden: Multiple moving parts, each with their own failure modes

We decided to build something different: a lightweight, reliable, and developer-friendly CDC tool that just works™.

Why Go? The Technical Decision

After evaluating several languages including Java, Python, and Rust, we chose Go for ElasticRelay's core data plane. Here's why:

1. Goroutines: Built-in Concurrency Without the Complexity

CDC workloads are inherently concurrent. You're reading from multiple database tables, transforming data in parallel, and writing to multiple Elasticsearch indices simultaneously. Go's goroutine model made this natural:

// ElasticRelay's parallel snapshot processing
func (m *ParallelSnapshotManager) Start(ctx context.Context, tables []string) error {
    // Create worker pool; workers share the caller's context so that
    // cancelling ctx stops all of them
    m.workers = make([]*SnapshotWorker, m.config.WorkerPoolSize)
    for i := 0; i < m.config.WorkerPoolSize; i++ {
        worker := NewSnapshotWorker(i, m)
        m.workers[i] = worker
        go worker.Run(ctx) // Each worker runs in its own goroutine
    }

    // Process table chunks concurrently
    for _, tableName := range tables {
        go m.processTable(tableName) // Parallel table processing
    }

    // Start returns without waiting; the goroutines keep running in the background
    return nil
}

What would require thread pools, executors, and complex synchronization in Java becomes elegant and readable in Go. Our parallel snapshot processing can handle millions of records across dozens of tables with just a few hundred lines of code.
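The fan-out described above can be sketched with nothing but the standard library. This is a minimal illustration, not ElasticRelay's actual code: `snapshotTables`, the `process` callback, and the table names are hypothetical, and unlike the `Start` method shown earlier, this version blocks until every table has been handled.

```go
package main

import (
	"context"
	"fmt"
	"sort"
	"sync"
)

// snapshotTables fans table names out to a fixed number of workers and
// blocks until every table is processed or ctx is cancelled.
// The process callback stands in for real snapshot logic (hypothetical).
func snapshotTables(ctx context.Context, tables []string, workers int, process func(string) error) []error {
	jobs := make(chan string)
	var (
		wg   sync.WaitGroup
		mu   sync.Mutex
		errs []error
	)

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for table := range jobs {
				if err := process(table); err != nil {
					mu.Lock()
					errs = append(errs, fmt.Errorf("table %s: %w", table, err))
					mu.Unlock()
				}
			}
		}()
	}

feed:
	for _, t := range tables {
		select {
		case jobs <- t:
		case <-ctx.Done():
			break feed // stop handing out work once the caller cancels
		}
	}
	close(jobs) // lets the workers' range loops finish
	wg.Wait()
	return errs
}

// runSnapshotDemo processes three made-up tables with two workers and
// returns the sorted names of the tables that were processed.
func runSnapshotDemo() []string {
	var (
		mu   sync.Mutex
		done []string
	)
	snapshotTables(context.Background(), []string{"users", "orders", "items"}, 2, func(table string) error {
		mu.Lock()
		done = append(done, table)
		mu.Unlock()
		return nil
	})
	sort.Strings(done)
	return done
}

func main() {
	fmt.Println(runSnapshotDemo()) // prints [items orders users]
}
```

Closing the `jobs` channel and then waiting on the `sync.WaitGroup` is what gives the caller a definite "all tables done" point, which a bare `go` statement does not provide.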

2. Channels: Elegant Data Pipeline Architecture

CDC systems are essentially data pipelines. Go's channels provided the perfect abstraction for building our processing stages:

type ParallelSnapshotManager struct {
    tableQueue chan *TableTask    // Tables waiting to be processed
    chunkQueue chan *ChunkTask    // Data chunks ready for processing
    resultChan chan *ProcessResult // Completed chunks
}

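// Data flows naturally through the pipeline
func (w *SnapshotWorker) Run(ctx context.Context) {
    for {
        select {
        case chunk := <-w.manager.chunkQueue:
            result := w.processChunk(chunk)
            // Guard the send too, so a cancelled worker cannot block
            // forever on a full result channel
            select {
            case w.manager.resultChan <- result:
            case <-ctx.Done():
                return
            }
        case <-ctx.Done():
            return
        }
    }
}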

This channel-based architecture makes our system naturally backpressure-aware and resource-bounded, as long as the channels are unbuffered or have a fixed buffer size. If Elasticsearch is slow, the channels fill up, sends block, and upstream processors automatically slow down.
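To make the backpressure effect concrete, here is a small self-contained sketch. The names (`runBackpressureDemo`, `queue`) and the one-millisecond delay are made up for illustration; the slow consumer merely stands in for a busy Elasticsearch sink.

```go
package main

import (
	"fmt"
	"time"
)

// runBackpressureDemo pushes `total` items through a channel with the given
// capacity while a deliberately slow consumer drains it. It returns the
// largest queue depth the producer observed right after a send.
func runBackpressureDemo(total, capacity int) int {
	queue := make(chan int, capacity)
	done := make(chan struct{})

	go func() { // slow consumer
		defer close(done)
		for range queue {
			time.Sleep(time.Millisecond)
		}
	}()

	maxDepth := 0
	for i := 0; i < total; i++ {
		queue <- i // blocks once `capacity` items are already waiting
		if d := len(queue); d > maxDepth {
			maxDepth = d
		}
	}
	close(queue)
	<-done
	return maxDepth
}

func main() {
	fmt.Println("max queue depth:", runBackpressureDemo(50, 4))
}
```

However fast the producer is, the observed depth can never exceed the channel's capacity, so the amount of in-flight data is capped by a number you choose up front.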

3. Single Binary Deployment: DevOps Simplicity

One of Go's killer features for infrastructure tools is single binary deployment:

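# Build once as a static binary (CGO disabled so it can run on a scratch image)
CGO_ENABLED=0 go build -o elasticrelay ./cmd/elasticrelay

# Dockerfile: Docker deployment is trivial
# (scratch ships no CA certificates; copy them in if the binary makes TLS connections)
FROM scratch
COPY elasticrelay /elasticrelay
ENTRYPOINT ["/elasticrelay"]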

Compare this to a typical Kafka Connect + Debezium setup:

JVM with specific version requirements
Kafka cluster (3+ nodes for production)
ZooKeeper ensemble (3+ nodes), unless Kafka runs in KRaft mode
Connect worker nodes
Plugin management and classpath configuration

ElasticRelay runs as a single process with minimal resource requirements. Our users report production deployments running stably on 2-core, 4GB RAM instances handling millions of daily events.

4. Memory Efficiency: Streaming Without the Bloat

JVM-based tools often struggle with memory efficiency due to garbage collection overhead and object allocation patterns. Go's efficient memory model and garbage collector allowed us to build truly streaming processors:

// Stream processing with controlled memory usage
func (w *SnapshotWorker) processChunkStream(chunk *ChunkTask) error {
    // Process in configurable batches to control memory
    batchSize := w.config.BatchSize // Typically 1000-10000 records

    for {
        batch, err := w.fetchBatch(chunk, batchSize)
        if err != nil {
            // Surface fetch failures instead of treating them as end-of-data
            return fmt.Errorf("fetch batch: %w", err)
        }
        if len(batch) == 0 {
            return nil // Chunk fully processed
        }

        // Transform and send immediately - no accumulation
        if err := w.processBatch(batch); err != nil {
            return err
        }
        // batch is reassigned on the next iteration, so the previous slice
        // becomes collectable without an explicit nil assignment
    }
}

This approach keeps memory usage bounded by the batch size rather than by the table size. We've successfully synchronized tables with 100+ million records while maintaining memory usage under 4GB.
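One common way to drive such a batch loop is keyset pagination, where each fetch asks for rows after the last ID seen. The sketch below assumes that technique; the post does not say how `fetchBatch` advances its cursor, and `fetchFunc`, `streamBatches`, and the simulated table are hypothetical.

```go
package main

import "fmt"

// fetchFunc returns up to limit row IDs greater than afterID, in ascending order.
type fetchFunc func(afterID, limit int) ([]int, error)

// streamBatches walks a table in keyset order, handing each batch to handle
// before fetching the next one, so at most one batch is held at a time.
func streamBatches(fetch fetchFunc, batchSize int, handle func([]int) error) error {
	lastID := 0
	for {
		batch, err := fetch(lastID, batchSize)
		if err != nil {
			return fmt.Errorf("fetch after id %d: %w", lastID, err)
		}
		if len(batch) == 0 {
			return nil // no rows left
		}
		if err := handle(batch); err != nil {
			return err
		}
		lastID = batch[len(batch)-1] // resume after the last row seen
	}
}

// runStreamDemo simulates a table holding IDs 1..rows and reports how many
// batches were handled, the largest batch size, and the total rows seen.
func runStreamDemo(rows, batchSize int) (batches, largest, seen int) {
	fetch := func(afterID, limit int) ([]int, error) {
		var out []int
		for id := afterID + 1; id <= rows && len(out) < limit; id++ {
			out = append(out, id)
		}
		return out, nil
	}
	_ = streamBatches(fetch, batchSize, func(b []int) error {
		batches++
		seen += len(b)
		if len(b) > largest {
			largest = len(b)
		}
		return nil
	})
	return batches, largest, seen
}

func main() {
	fmt.Println(runStreamDemo(10, 4)) // prints 3 4 10
}
```

The largest batch never exceeds `batchSize`, which is the property that keeps memory flat as the table grows.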

5. Rich Ecosystem: Standing on Giants' Shoulders

Go's ecosystem provided excellent libraries for our specific use case:

go-mysql: Battle-tested MySQL binlog parsing
elastic/go-elasticsearch: Official Elasticsearch client with bulk operations
gRPC-Go: High-performance service communication
Testify: Comprehensive testing framework

// MySQL binlog parsing with go-mysql
syncer := replication.NewBinlogSyncer(replication.BinlogSyncerConfig{
    ServerID: cfg.ServerID,
    Flavor:   "mysql",
    Host:     cfg.DBHost,
    Port:     uint16(cfg.DBPort),
    User:     cfg.DBUser,
    Password: cfg.DBPassword,
})

// Elasticsearch bulk operations (the request body is the first argument)
res, err := es.Bulk(
    bulkBody,
    es.Bulk.WithIndex(indexName),
    es.Bulk.WithRefresh("wait_for"),
)

The integration was seamless, and the libraries' Go-idiomatic APIs made our code clean and maintainable.
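The bulk body passed to Elasticsearch is newline-delimited JSON: one action line followed by one source line per document, each terminated by a newline. A minimal sketch of building it with only the standard library follows; the `doc` type, `buildBulkBody`, and the sample record are hypothetical and not taken from ElasticRelay.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// doc is a hypothetical document: an ID plus its source fields.
type doc struct {
	ID     string
	Source map[string]any
}

// buildBulkBody renders docs as bulk NDJSON: an "index" action line and a
// source line per document.
func buildBulkBody(docs []doc) (*bytes.Buffer, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf) // Encode writes a trailing "\n" after each value
	for _, d := range docs {
		action := map[string]any{"index": map[string]any{"_id": d.ID}}
		if err := enc.Encode(action); err != nil {
			return nil, fmt.Errorf("encode action for %s: %w", d.ID, err)
		}
		if err := enc.Encode(d.Source); err != nil {
			return nil, fmt.Errorf("encode source for %s: %w", d.ID, err)
		}
	}
	return &buf, nil
}

// demoBulkBody builds the body for a single made-up document.
func demoBulkBody() string {
	body, err := buildBulkBody([]doc{
		{ID: "1", Source: map[string]any{"name": "keyboard", "qty": 2}},
	})
	if err != nil {
		return "error: " + err.Error()
	}
	return body.String()
}

func main() {
	fmt.Print(demoBulkBody())
	// prints:
	// {"index":{"_id":"1"}}
	// {"name":"keyboard","qty":2}
}
```

Because `*bytes.Buffer` implements `io.Reader`, the returned buffer can be handed to a bulk call as the request body.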

Real-World Performance: The Numbers Don't Lie

The Go rewrite delivered significant performance improvements:

Metric	Legacy Solution	ElasticRelay (Go)	Improvement
Initial Sync Time	27 hours (100M records)	2-4 hours	85%+ faster
Memory Usage	8-16GB (unbounded)	2-4GB (controlled)	75% reduction
Binary Size	200MB+ (with dependencies)	15MB (static binary)	90%+ smaller
Cold Start Time	2-3 minutes	5-10 seconds	90%+ faster
Resource Requirements	8 cores, 16GB RAM	2 cores, 4GB RAM	75% reduction

Architecture Highlights: Go-Powered Design Patterns

Graceful Degradation with Interface-Based Design

Go's interfaces enabled us to build a system that gracefully handles failures:

type SinkServiceServer interface {
    BulkWrite(stream pb.SinkService_BulkWriteServer) error
    DescribeIndex(context.Context, *pb.DescribeIndexRequest) (*pb.DescribeIndexResponse, error)
}

// Real implementation
type ElasticsearchSink struct { /* ... */ }

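// Fallback implementation for DLQ
type DummySinkServer struct{}

func (d *DummySinkServer) BulkWrite(stream pb.SinkService_BulkWriteServer) error {
    // Immediately fail to trigger DLQ processing
    return errors.New("sink unavailable - triggering DLQ")
}

// DescribeIndex is needed to satisfy SinkServiceServer; the fallback has no index to describe
func (d *DummySinkServer) DescribeIndex(context.Context, *pb.DescribeIndexRequest) (*pb.DescribeIndexResponse, error) {
    return nil, errors.New("sink unavailable")
}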

When Elasticsearch is unavailable, ElasticRelay automatically routes events to a Dead Letter Queue (DLQ) and continues processing. This resilience-by-default approach prevents data loss during outages.
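The routing step itself is not shown in the post, so here is one way it could look, reduced to plain interfaces. Everything in this sketch (`Event`, `Sink`, `fallbackSink`, the in-memory DLQ) is a hypothetical stand-in rather than ElasticRelay's gRPC types.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical change event.
type Event struct{ ID string }

// Sink is anything that can accept a batch of events.
type Sink interface {
	Write(batch []Event) error
}

// unavailableSink always fails, simulating an Elasticsearch outage.
type unavailableSink struct{}

func (unavailableSink) Write([]Event) error { return errors.New("sink unavailable") }

// memoryDLQ parks events in memory; a real DLQ would need durable storage.
type memoryDLQ struct{ events []Event }

func (q *memoryDLQ) Write(batch []Event) error {
	q.events = append(q.events, batch...)
	return nil
}

// fallbackSink tries the primary sink and parks the batch in the DLQ on failure.
type fallbackSink struct {
	primary Sink
	dlq     Sink
}

func (s fallbackSink) Write(batch []Event) error {
	primaryErr := s.primary.Write(batch)
	if primaryErr == nil {
		return nil
	}
	if dlqErr := s.dlq.Write(batch); dlqErr != nil {
		return fmt.Errorf("primary failed (%v) and DLQ write failed: %w", primaryErr, dlqErr)
	}
	return nil // batch is safe in the DLQ; the pipeline can keep going
}

// runDLQDemo writes two events through a failing primary sink and returns
// how many ended up in the DLQ (-1 if the write itself failed).
func runDLQDemo() int {
	dlq := &memoryDLQ{}
	sink := fallbackSink{primary: unavailableSink{}, dlq: dlq}
	if err := sink.Write([]Event{{ID: "a"}, {ID: "b"}}); err != nil {
		return -1
	}
	return len(dlq.events)
}

func main() {
	fmt.Println("events parked in DLQ:", runDLQDemo()) // prints events parked in DLQ: 2
}
```

Since both the real sink and the DLQ satisfy the same interface, the caller does not need to know which one absorbed the batch.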

Context-Driven Cancellation

Go's context package provided elegant cancellation and timeout handling:

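func (m *ParallelSnapshotManager) processWithTimeout(
    ctx context.Context,
    table string,
) error {
    // Create timeout context for this specific table
    tableCtx, cancel := context.WithTimeout(ctx, 30*time.Minute)
    defer cancel()

    // processTable is expected to return a channel that yields the table's final error
    select {
    case err := <-m.processTable(tableCtx, table):
        return err
    case <-tableCtx.Done():
        // tableCtx is derived from ctx, so this case fires for both the
        // per-table timeout and a global cancellation; tell them apart here
        if errors.Is(tableCtx.Err(), context.DeadlineExceeded) {
            return fmt.Errorf("table %s processing timeout", table)
        }
        return tableCtx.Err() // Global cancellation
    }
}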

This pattern bounds how long the caller waits on any single table, and cancellations propagate cleanly through the entire system as long as each stage honors the context it is given.
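Telling a timeout apart from a cancellation comes down to inspecting the context's error with `errors.Is`. A small runnable sketch, with hypothetical helper names (`runWithTimeout`, `waitUntilDone`) and deliberately short durations:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// runWithTimeout runs work under a per-call deadline derived from parent and
// classifies the outcome by inspecting the returned error.
func runWithTimeout(parent context.Context, timeout time.Duration, work func(context.Context) error) string {
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()

	err := work(ctx)
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, context.DeadlineExceeded):
		return "timeout"
	case errors.Is(err, context.Canceled):
		return "cancelled"
	default:
		return "failed: " + err.Error()
	}
}

// waitUntilDone simulates work that only stops when its context ends.
func waitUntilDone(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// runCancelDemo triggers one deadline expiry and one parent cancellation.
func runCancelDemo() (string, string) {
	timedOut := runWithTimeout(context.Background(), 10*time.Millisecond, waitUntilDone)

	parent, cancel := context.WithCancel(context.Background())
	cancel() // cancel the parent before the work even starts
	cancelled := runWithTimeout(parent, time.Hour, waitUntilDone)

	return timedOut, cancelled
}

func main() {
	fmt.Println(runCancelDemo()) // prints timeout cancelled
}
```

The second call returns immediately even though its own timeout is an hour, because a context derived from a cancelled parent is cancelled from the start.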

The Developer Experience Factor

Beyond performance, Go significantly improved our development experience:

1. Fast Build Times

# Complete rebuild in seconds, not minutes
time make build
real    0m3.245s
user    0m5.234s
sys     0m1.456s

2. Excellent Tooling

go fmt        # Consistent formatting
go vet        # Static analysis
go test -race # Race condition detection
go mod tidy   # Dependency management

3. Cross-Platform Builds

# Build for multiple platforms from one machine
make build-all
# Produces: linux/amd64, darwin/amd64, darwin/arm64, windows/amd64

Challenges and Trade-offs

Go wasn't perfect for every aspect of our system:

1. Error Handling Verbosity

Go's explicit error handling can be verbose:

// Typical Go error handling pattern
// (the variable is named cfg so it does not shadow the config package)
cfg, err := config.LoadMultiConfig(configFile)
if err != nil {
    return fmt.Errorf("failed to load config: %w", err)
}

orchServer, err := orchestrator.NewMultiOrchestrator(grpcAddr)
if err != nil {
    return fmt.Errorf("failed to create orchestrator: %w", err)
}

While verbose, this explicitness helped us build more robust error handling and better observability.
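Part of what the `%w` verb buys is that callers can still inspect the underlying cause after several layers of wrapping. A short sketch using only the standard library; `loadConfig`, `describeLoadError`, and the file path are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// loadConfig reads a config file and wraps any failure with context.
func loadConfig(path string) ([]byte, error) {
	data, err := os.ReadFile(path)
	if err != nil {
		return nil, fmt.Errorf("failed to load config: %w", err)
	}
	return data, nil
}

// describeLoadError shows that the wrapped cause stays inspectable.
func describeLoadError(path string) string {
	_, err := loadConfig(path)
	var pathErr *fs.PathError
	switch {
	case err == nil:
		return "loaded"
	case errors.Is(err, fs.ErrNotExist):
		return "missing"
	case errors.As(err, &pathErr):
		return "cannot read " + pathErr.Path
	default:
		return "other"
	}
}

func main() {
	// The path below is made up and is expected not to exist.
	fmt.Println(describeLoadError("no-such-dir/elasticrelay.yaml")) // prints missing
}
```

`errors.Is` matches sentinel values through the wrap chain, while `errors.As` extracts a typed error such as `*fs.PathError` when you need its fields.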

2. Generics Adoption

Before Go 1.18, the lack of generics led to some code duplication. Post-1.18, we've been gradually adopting generics for type-safe collections and algorithms.
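As an illustration of the kind of helper generics make possible, here is a type-safe chunking function. It is a hypothetical example rather than code from ElasticRelay; before Go 1.18 a helper like this needed one copy per element type or a detour through `interface{}`.

```go
package main

import "fmt"

// Chunk splits items into consecutive slices of at most size elements.
// The returned slices share memory with items; nothing is copied.
func Chunk[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	chunks := make([][]T, 0, (len(items)+size-1)/size)
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		chunks = append(chunks, items[start:end])
	}
	return chunks
}

func main() {
	fmt.Println(Chunk([]int{1, 2, 3, 4, 5}, 2))      // prints [[1 2] [3 4] [5]]
	fmt.Println(Chunk([]string{"a", "b", "c"}, 3)) // prints [[a b c]]
}
```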

3. Dynamic Configuration

Go's strong typing sometimes clashes with the need for dynamic configuration. We solved this with interface-based plugin systems:

type TransformRule interface {
    Apply(record map[string]interface{}) (map[string]interface{}, error)
    Validate() error
}

// Different rule implementations
type FieldRenameRule struct { /* ... */ }
type DataTypeConversionRule struct { /* ... */ }
type CustomScriptRule struct { /* ... */ }
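To show how such rules compose, here is a sketch of one implementation plus a small runner. The `From` and `To` fields and the `applyRules` helper are assumptions made for illustration, since the post elides the struct bodies.

```go
package main

import (
	"errors"
	"fmt"
)

// TransformRule mirrors the interface shown above.
type TransformRule interface {
	Apply(record map[string]interface{}) (map[string]interface{}, error)
	Validate() error
}

// FieldRenameRule renames one field; From and To are hypothetical fields.
type FieldRenameRule struct {
	From, To string
}

func (r FieldRenameRule) Validate() error {
	if r.From == "" || r.To == "" {
		return errors.New("rename rule needs both From and To")
	}
	return nil
}

// Apply returns a copy of record with From renamed to To, leaving the input untouched.
func (r FieldRenameRule) Apply(record map[string]interface{}) (map[string]interface{}, error) {
	out := make(map[string]interface{}, len(record))
	for k, v := range record {
		out[k] = v
	}
	if v, ok := out[r.From]; ok {
		delete(out, r.From)
		out[r.To] = v
	}
	return out, nil
}

// applyRules validates and applies each rule in order.
func applyRules(record map[string]interface{}, rules ...TransformRule) (map[string]interface{}, error) {
	for i, rule := range rules {
		if err := rule.Validate(); err != nil {
			return nil, fmt.Errorf("rule %d invalid: %w", i, err)
		}
		var err error
		if record, err = rule.Apply(record); err != nil {
			return nil, fmt.Errorf("rule %d failed: %w", i, err)
		}
	}
	return record, nil
}

// renameDemo renames user_name to username on a made-up record and reports
// the new value and whether the old key is still present.
func renameDemo() (string, bool) {
	out, err := applyRules(
		map[string]interface{}{"user_name": "dexter", "id": 7},
		FieldRenameRule{From: "user_name", To: "username"},
	)
	if err != nil {
		return "error: " + err.Error(), false
	}
	_, oldPresent := out["user_name"]
	return fmt.Sprint(out["username"]), oldPresent
}

func main() {
	fmt.Println(renameDemo()) // prints dexter false
}
```

New rule types only need to satisfy the two-method interface, so configuration can select them at runtime without the runner changing.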

Lessons Learned: Go Best Practices for Infrastructure Tools

1. Start with Interfaces

Define your interfaces first, implementations second. This enables testing, mocking, and graceful degradation patterns.

2. Embrace Channels for Pipeline Architecture

Channels naturally model data flow, and bounded channels give you backpressure handling almost for free.

3. Use Context Everywhere

Context enables clean cancellation, timeouts, and tracing throughout your system.

4. Design for Single Binary Deployment

Minimize external dependencies and embrace Go's static linking capabilities.

5. Profile Early and Often

Go's built-in profiling tools (go tool pprof) make performance optimization straightforward.
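For a sense of how little code a profile takes, the sketch below captures a heap profile with the standard `runtime/pprof` package. It writes to an in-memory buffer purely so the example is self-contained; in practice you would write to a file and open it with go tool pprof. The function name is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// captureHeapProfile writes a heap profile into memory and returns its size in bytes.
func captureHeapProfile() (int, error) {
	runtime.GC() // run a collection first so the profile reflects up-to-date statistics
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return 0, fmt.Errorf("write heap profile: %w", err)
	}
	return buf.Len(), nil
}

func main() {
	n, err := captureHeapProfile()
	if err != nil {
		fmt.Println("profiling failed:", err)
		return
	}
	fmt.Println("heap profile bytes:", n)
}
```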

The Road Ahead: Go's Role in ElasticRelay's Future

As ElasticRelay evolves toward supporting PostgreSQL, MongoDB, and advanced data governance features, Go continues to be the right choice:

Performance: Our parallel processing architecture scales linearly with core count
Reliability: Explicit error handling and testing culture reduce production issues
Maintainability: Go's simplicity keeps our codebase approachable for new team members
Ecosystem: Rich libraries for databases, message queues, and cloud services

Conclusion: Go for the Win

Choosing Go for ElasticRelay's rewrite was one of our best technical decisions. The combination of:

Built-in concurrency (goroutines + channels)
Memory efficiency (streaming processing + efficient GC)
Deployment simplicity (single binary)
Developer productivity (fast builds + excellent tooling)
Rich ecosystem (mature libraries for our use case)

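...enabled us to build a CDC tool that, going by the numbers above, completes an initial sync roughly 7x faster or better, uses about a quarter of the memory, and deploys as a single binary instead of a multi-service stack.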

If you're building infrastructure tools and considering Go, we highly recommend it. The language's design philosophy of simplicity, clarity, and pragmatism aligns perfectly with the needs of reliable, high-performance systems.

Want to try ElasticRelay? Check out our GitHub repository or read our Getting Started guide.

Questions? Join our community discussions or reach out on Twitter.

The ElasticRelay team is passionate about building better data infrastructure tools. Follow our journey as we make real-time data synchronization simple, reliable, and accessible to every developer.

Tags: #golang #cdc #elasticsearch #dataengineering #opensource #mysql #performance
