DEV Community

Chen Debra

Running Apache DolphinScheduler Without ZooKeeper: Two Proven Registry Alternatives

When designing a distributed scheduling platform, the choice of a service registry has a direct impact on both system architecture and operational complexity. Although Apache DolphinScheduler uses ZooKeeper as its default registry center, it also provides multiple alternatives that allow users to choose the solution that best fits their existing infrastructure and operational capabilities.

Instead of forcing every deployment to rely on ZooKeeper, DolphinScheduler offers greater flexibility by supporting both JDBC Registry and Etcd Registry, giving organizations more options for different deployment scenarios.

Why Do Some Teams Prefer Not to Use ZooKeeper?

ZooKeeper has long been the de facto coordination service for distributed systems. It is mature, reliable, and battle-tested. However, many teams still hesitate to introduce it into their environments for several practical reasons.

Higher Operational Complexity

Running ZooKeeper requires deploying and maintaining an independent cluster. In production environments, at least three nodes are typically recommended to achieve high availability. For smaller teams or organizations with limited infrastructure resources, maintaining an additional distributed system increases both operational overhead and maintenance costs.

An Additional Technology Stack

Many organizations already operate around relational databases such as MySQL or PostgreSQL. Introducing ZooKeeper means adopting and maintaining another technology stack, which requires additional expertise and increases the learning curve for operations teams.

Extra Infrastructure Costs

ZooKeeper consumes dedicated computing resources, including CPU, memory, storage, and networking. For organizations aiming to simplify infrastructure or reduce resource consumption, these additional requirements may become an unnecessary burden.

Registry Alternatives in Apache DolphinScheduler

To address different deployment requirements, Apache DolphinScheduler currently provides two production-ready alternatives to ZooKeeper:

  • JDBC Registry
  • Etcd Registry

Each solution offers the same core capabilities required by the scheduler while targeting different infrastructure preferences.

Option 1: JDBC Registry

The JDBC Registry eliminates the need for a separate registry service by storing registry data in an existing relational database such as MySQL or PostgreSQL.

How It Works

Instead of relying on an external coordination service, the JDBC Registry reproduces the essential capabilities of ZooKeeper using relational database tables.

Event Notification

The JdbcRegistryDataChangeListenerAdapter converts database changes—including record creation, updates, and deletions—into DolphinScheduler Event notifications that trigger registered SubscribeListener callbacks.

Internally, the registry detects changes through polling or trigger-based mechanisms, effectively simulating ZooKeeper's Watcher functionality.
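For the polling case, change detection can be pictured as a diff between two successive snapshots of the registry table. The sketch below is only an illustration of that idea, not DolphinScheduler's code: `SnapshotDiff`, `ChangeType`, and `Change` are hypothetical names, and in-memory maps stand in for query results.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical sketch: derive change events by diffing two polled snapshots.
final class SnapshotDiff {

    enum ChangeType { ADD, UPDATE, REMOVE }

    static final class Change {
        final ChangeType type;
        final String key;
        final String value; // new value for ADD/UPDATE, last known value for REMOVE

        Change(ChangeType type, String key, String value) {
            this.type = type;
            this.key = key;
            this.value = value;
        }

        @Override
        public String toString() {
            return type + " " + key + "=" + value;
        }
    }

    // Compares the previous poll result with the current one.
    static List<Change> diff(Map<String, String> previous, Map<String, String> current) {
        List<Change> changes = new ArrayList<>();
        for (Map.Entry<String, String> e : current.entrySet()) {
            if (!previous.containsKey(e.getKey())) {
                changes.add(new Change(ChangeType.ADD, e.getKey(), e.getValue()));
            } else if (!Objects.equals(previous.get(e.getKey()), e.getValue())) {
                changes.add(new Change(ChangeType.UPDATE, e.getKey(), e.getValue()));
            }
        }
        for (Map.Entry<String, String> e : previous.entrySet()) {
            if (!current.containsKey(e.getKey())) {
                changes.add(new Change(ChangeType.REMOVE, e.getKey(), e.getValue()));
            }
        }
        return changes;
    }

    public static void main(String[] args) {
        Map<String, String> before = new LinkedHashMap<>();
        before.put("/nodes/master/10.0.0.1:5678", "cpu=0.2");
        before.put("/nodes/worker/10.0.0.2:1234", "cpu=0.5");

        Map<String, String> after = new LinkedHashMap<>();
        after.put("/nodes/master/10.0.0.1:5678", "cpu=0.4"); // value changed
        after.put("/nodes/worker/10.0.0.3:1234", "cpu=0.1"); // new row

        diff(before, after).forEach(System.out::println);
    }
}
```

Each `Change` would then be handed to whatever listeners subscribed to the affected key or path. The keys and values in `main` are made-up sample data.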

Distributed Locking

The JDBC Registry provides two methods for acquiring distributed locks:

  • acquireLock(String key)
  • acquireLock(String key, long timeout)

The first method blocks until the lock is acquired, while the second gives up once the specified timeout elapses.
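The two call styles can be illustrated with a small in-memory stand-in. This is a sketch under stated assumptions rather than the actual registry code: `RowLockSketch` is a hypothetical class, a shared map plays the role of a lock table with a unique lock key, the timeout is assumed to be in milliseconds, and the lock is not reentrant.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not DolphinScheduler's implementation: a shared map
// stands in for a lock table with a unique constraint on the lock key.
final class RowLockSketch {

    private static final long RETRY_INTERVAL_MS = 10;

    private final ConcurrentMap<String, String> lockTable;
    private final String owner;

    RowLockSketch(ConcurrentMap<String, String> lockTable, String owner) {
        this.lockTable = lockTable;
        this.owner = owner;
    }

    // Blocking variant: retries until the "row" can be inserted.
    // Returns false only if the waiting thread is interrupted.
    boolean acquireLock(String key) {
        while (!tryInsert(key)) {
            if (!pause()) {
                return false;
            }
        }
        return true;
    }

    // Timeout variant: gives up once timeoutMs milliseconds have elapsed.
    boolean acquireLock(String key, long timeoutMs) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        while (!tryInsert(key)) {
            if (System.nanoTime() - deadline >= 0 || !pause()) {
                return false;
            }
        }
        return true;
    }

    // Deletes the row only when this client owns it.
    boolean releaseLock(String key) {
        return lockTable.remove(key, owner);
    }

    // putIfAbsent models an INSERT that fails on a duplicate key.
    private boolean tryInsert(String key) {
        return lockTable.putIfAbsent(key, owner) == null;
    }

    // Sleeps between attempts; restores the interrupt flag on interruption.
    private boolean pause() {
        try {
            Thread.sleep(RETRY_INTERVAL_MS);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> table = new ConcurrentHashMap<>();
        RowLockSketch master1 = new RowLockSketch(table, "master-1");
        RowLockSketch master2 = new RowLockSketch(table, "master-2");

        System.out.println(master1.acquireLock("/lock/failover"));     // lock is free
        System.out.println(master2.acquireLock("/lock/failover", 50)); // held by master-1
        System.out.println(master1.releaseLock("/lock/failover"));
        System.out.println(master2.acquireLock("/lock/failover", 50)); // free again
    }
}
```

In a real database the same effect would come from a unique index on the lock key, so that only one client's insert can succeed at a time.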

Locks are managed through database records to ensure mutual exclusion across distributed nodes. Registry entries are categorized into two types:

  • EPHEMERAL — Temporary entries that are automatically cleaned up through heartbeat detection when a client disconnects or fails.
  • PERSISTENT — Permanent entries that remain available until explicitly removed.

This mechanism enables reliable distributed coordination without requiring ZooKeeper.
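The difference between the two entry types can be sketched as a cleanup pass driven by heartbeat timestamps. Everything below is illustrative: `EphemeralCleanupSketch` and its methods are hypothetical, time is passed in explicitly to keep the example deterministic, and the 60-second timeout simply mirrors the `session-timeout: 60s` value used in the configuration later in this article.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: EPHEMERAL entries disappear once their owner's
// heartbeat is older than the session timeout; PERSISTENT entries stay.
final class EphemeralCleanupSketch {

    enum DataType { EPHEMERAL, PERSISTENT }

    private static final class Row {
        final DataType type;
        final String clientId;

        Row(DataType type, String clientId) {
            this.type = type;
            this.clientId = clientId;
        }
    }

    private final long sessionTimeoutMs;
    private final Map<String, Row> rows = new HashMap<>();
    private final Map<String, Long> lastHeartbeatMs = new HashMap<>();

    EphemeralCleanupSketch(long sessionTimeoutMs) {
        this.sessionTimeoutMs = sessionTimeoutMs;
    }

    void put(String key, DataType type, String clientId) {
        rows.put(key, new Row(type, clientId));
    }

    // Records that clientId was alive at nowMs.
    void heartbeat(String clientId, long nowMs) {
        lastHeartbeatMs.put(clientId, nowMs);
    }

    boolean exists(String key) {
        return rows.containsKey(key);
    }

    // Removes EPHEMERAL rows whose owner has not sent a heartbeat within
    // the session timeout and returns the removed keys.
    List<String> removeExpired(long nowMs) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Row>> it = rows.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Row> e = it.next();
            Row row = e.getValue();
            Long last = lastHeartbeatMs.get(row.clientId);
            boolean expired = last == null || nowMs - last > sessionTimeoutMs;
            if (row.type == DataType.EPHEMERAL && expired) {
                removed.add(e.getKey());
                it.remove();
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        EphemeralCleanupSketch registry = new EphemeralCleanupSketch(60_000);
        registry.put("/nodes/worker/10.0.0.2:1234", DataType.EPHEMERAL, "worker-1");
        registry.put("/config/tenant", DataType.PERSISTENT, "worker-1");
        registry.heartbeat("worker-1", 0);

        System.out.println(registry.removeExpired(30_000)); // heartbeat still fresh
        System.out.println(registry.removeExpired(61_000)); // worker-1 went silent
        System.out.println(registry.exists("/config/tenant"));
    }
}
```

Because a lock held by a crashed node is just another EPHEMERAL row, the same cleanup pass is what would free it for other nodes.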

Deployment Steps

Step 1: Initialize the Registry Tables

Execute the appropriate initialization script according to your database:

  • MySQL: src/main/resources/mysql_registry_init.sql
  • PostgreSQL: src/main/resources/postgresql_registry_init.sql

Step 2: Update the Configuration

Add the following configuration to:

  • master-server/conf/application.yaml
  • worker-server/conf/application.yaml
  • api-server/conf/application.yaml

registry:
  type: jdbc
  heartbeat-refresh-interval: 3s
  session-timeout: 60s
  hikari-config:
    jdbc-url: jdbc:mysql://127.0.0.1:3306/dolphinscheduler
    username: root
    password: root
    maximum-pool-size: 5
    connection-timeout: 9000
    idle-timeout: 600000

Step 3: Add the Database Driver

If you are using MySQL, copy the mysql-connector-java.jar driver into the DolphinScheduler classpath. The MySQL JDBC driver is intentionally not bundled with the official distribution package, so it must be provided separately.

Results

With the JDBC registry, the MasterServer and WorkerServer of Apache DolphinScheduler store their registry data in a relational database instead of ZooKeeper. Database transactions ensure data consistency, while the heartbeat mechanism enables service discovery and failure detection.

This approach is particularly suitable for environments that already have a mature database operations team, allowing organizations to take full advantage of their existing database infrastructure.

Option 2: Etcd Registry

Etcd is a distributed key-value store designed for the cloud-native era and is especially well suited for Kubernetes and other cloud-native environments. The Etcd registry implementation in Apache DolphinScheduler is built on the jetcd client library and provides functionality similar to ZooKeeper.

How It Works

Event Listening

The EtcdRegistry class uses Etcd's Watch API to monitor changes (create, update, and delete operations) on a specified key or key prefix. It converts the underlying Etcd watch events into DolphinScheduler Event objects and triggers the SubscribeListener callback to provide real-time notifications.
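An Etcd watch reports only puts and deletes, so distinguishing a create from an update needs one extra piece of information. A common approach relies on the key's `version` field, which is 1 immediately after a key is created and increases with every later modification. The sketch below shows that mapping with hypothetical types (`WatchEventMapper`, `EtcdEventType`, `RegistryEventType`); whether DolphinScheduler's `EtcdRegistry` uses exactly this rule is an assumption.

```java
import java.util.Optional;

// Hypothetical sketch: translate a raw watch event kind plus the key's
// version into a registry-level event type.
final class WatchEventMapper {

    // Mirrors the kinds of event a watch can report.
    enum EtcdEventType { PUT, DELETE, UNRECOGNIZED }

    enum RegistryEventType { ADD, UPDATE, REMOVE }

    static Optional<RegistryEventType> toRegistryEvent(EtcdEventType type, long keyVersion) {
        switch (type) {
            case PUT:
                // Version 1 means this put created the key.
                return Optional.of(keyVersion == 1L
                        ? RegistryEventType.ADD
                        : RegistryEventType.UPDATE);
            case DELETE:
                return Optional.of(RegistryEventType.REMOVE);
            default:
                // Unknown event kinds are ignored rather than guessed at.
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(toRegistryEvent(EtcdEventType.PUT, 1));
        System.out.println(toRegistryEvent(EtcdEventType.PUT, 7));
        System.out.println(toRegistryEvent(EtcdEventType.DELETE, 0));
        System.out.println(toRegistryEvent(EtcdEventType.UNRECOGNIZED, 0));
    }
}
```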

Distributed Locking

EtcdKeepAliveLeaseManager grants leases with a specified TTL and continuously keeps them alive using Etcd's keep-alive mechanism. If a client disconnects, the lease expires automatically, releasing the lock without requiring manual intervention.

Connection Health Monitoring

EtcdConnectionStateListener monitors the connection state between Apache DolphinScheduler and the Etcd cluster, so that once a lost connection is restored, distributed locks can be re-established and services re-registered automatically.

Configuration

1. Update the Configuration Files

Add the following configuration to:

  • master-server/conf/application.yaml
  • worker-server/conf/application.yaml
  • api-server/conf/application.yaml

registry:
  type: etcd
  endpoints: "http://etcd0:2379, http://etcd1:2379, http://etcd2:2379"
  namespace: dolphinscheduler
  connection-timeout: 9s
  retry-delay: 60ms
  retry-max-delay: 300ms
  retry-max-duration: 1500ms
  # Optional SSL configuration
  cert-file: "deploy/kubernetes/dolphinscheduler/etcd-certs/ca.crt"
  key-cert-chain-file: "deploy/kubernetes/dolphinscheduler/etcd-certs/client.crt"
  key-file: "deploy/kubernetes/dolphinscheduler/etcd-certs/client.pem"
  # Optional authentication configuration
  user: ""
  password: ""
  authority: ""

2. SSL Configuration Notes

If SSL is enabled on the Etcd server, make sure your JDK is Java 8u252 (released in April 2020) or newer. JDK 11 is also fully supported. The Docker images currently use JDK 8u362, which works correctly.

This requirement exists because native ALPN support was introduced starting with Java 8u252.
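A quick way to verify the requirement on a given host is to inspect the `java.version` system property. The helper below is a hypothetical sketch, not part of DolphinScheduler: it assumes the legacy `1.8.0_NNN` format for Java 8 and treats Java 9 and later as supported, since ALPN has been part of the standard TLS API since Java 9.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: decide from a java.version string whether the JDK
// is 8u252 or newer.
final class AlpnJdkCheck {

    // Legacy scheme, e.g. "1.8.0_252" or "1.8.0_362-b09".
    private static final Pattern JAVA_8 = Pattern.compile("^1\\.8\\.0_(\\d+)");

    // Modern scheme, e.g. "11.0.2", "17", "9-ea".
    private static final Pattern MODERN = Pattern.compile("^(\\d+)");

    static boolean supportsAlpn(String javaVersion) {
        Matcher java8 = JAVA_8.matcher(javaVersion);
        if (java8.find()) {
            return Integer.parseInt(java8.group(1)) >= 252;
        }
        if (javaVersion.startsWith("1.")) {
            // Java 7 and older, or a Java 8 build without an update number.
            return false;
        }
        Matcher modern = MODERN.matcher(javaVersion);
        return modern.find() && Integer.parseInt(modern.group(1)) >= 9;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println(version + " -> ALPN available: " + supportsAlpn(version));
    }
}
```

Running it with the same `java` binary that launches the DolphinScheduler services tells you whether the SSL options above can be expected to work on that host.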

Results

With the Etcd registry, Apache DolphinScheduler can fully leverage Etcd's strong consistency and high availability to deliver low latency, excellent scalability, and simplified deployment in cloud-native environments.

This option is especially suitable for teams that already use Kubernetes and the Etcd technology stack.

Why This Design?

The design philosophy behind providing multiple registry implementations in Apache DolphinScheduler is straightforward: reduce dependencies on external components and allow users to choose the registry that best fits their environment.

This philosophy has been consistently reflected throughout the evolution of Apache DolphinScheduler. The project previously implemented a Redis-based queue, but the Redis implementation was eventually removed to reduce external dependencies.

Today, multiple registry options are provided not to replace ZooKeeper, but to give users greater flexibility when selecting the deployment architecture that best meets their requirements.

For Kubernetes deployments, this flexibility is reflected in the values.yaml configuration of the Helm Chart:

zookeeper:
  enabled: true  # Enabled by default

registryEtcd:
  enabled: false  # Enable manually if needed

registryJdbc:
  enabled: false  # Enable manually if needed

Conclusion

Apache DolphinScheduler provides two production-ready alternatives to ZooKeeper: JDBC Registry and Etcd Registry.

Both implementations support the same core capabilities required by a distributed scheduler, including service registration, service discovery, event notification, heartbeat management, and distributed locking.

The JDBC Registry is particularly attractive for teams with mature relational database infrastructure, while the Etcd Registry offers a seamless experience for organizations embracing Kubernetes and cloud-native technologies.

Instead of forcing every deployment to depend on a single coordination service, DolphinScheduler allows users to make infrastructure decisions based on their own operational requirements.

This flexibility reflects one of the project's core principles: software should adapt to its users—not the other way around.

As distributed systems continue to evolve, reducing unnecessary dependencies while providing deployment choices becomes increasingly valuable. By supporting multiple registry implementations, Apache DolphinScheduler enables organizations to build reliable scheduling platforms using the technologies they already know and trust.

Notes

  • ZooKeeper remains the default registry implementation in Apache DolphinScheduler and has not been deprecated.
  • By default, the JDBC Registry uses the same database as DolphinScheduler metadata, although a separate database can also be configured.
  • The Etcd Registry supports both SSL/TLS encryption and user authentication, allowing deployments to meet different security requirements.
  • In pseudo-cluster deployments, users can choose among ZooKeeper, MySQL (JDBC Registry), and Etcd as the registry implementation.
  • All registry configurations support environment variables, making them easy to integrate into containerized and cloud-native deployment workflows.
