Preface
As an open-source distributed task scheduling system, Apache DolphinScheduler often needs to be scaled up or down in production as business demands change. This article walks through both expanding and shrinking a DolphinScheduler cluster, to help operations teams adjust cluster size safely and efficiently.
Cluster Expansion Operations
1. Pre-expansion Preparation
Before performing the expansion, make sure of the following:
- Node type to be added: Master or Worker
- Number of nodes to be added
- Whether each machine that will host a new node has the required software installed
Important Tip: A single physical machine should not run multiple Master or Worker service processes simultaneously.
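One way to check the tip above before installing anything is to count the Master and Worker JVMs already running on the machine. The sketch below is only an illustration: the helper name count_ds_servers is hypothetical, and it assumes the services show up in jps output under the process names MasterServer and WorkerServer.

```shell
# Hypothetical helper: count Master/Worker JVMs in jps-style output read from stdin.
# Assumes the processes are listed as "<pid> MasterServer" / "<pid> WorkerServer".
count_ds_servers() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  grep -E -c '^[0-9]+ (MasterServer|WorkerServer)$' || true
}
```

Running `jps | count_ds_servers` on a candidate machine should print 0 if no Master or Worker is running there yet.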
2. Basic Environment Setup
2.1 Required Software Installation
All new nodes must install:
- JDK 1.8+: JAVA_HOME environment variable must be configured
- Basic tools: such as wget, tar, etc.
Optional for Worker nodes:
- Hadoop/Hive/Spark clients (if corresponding task types are to be executed)
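The requirements above can be checked with a short script before going further. This is a minimal sketch, not part of DolphinScheduler itself; the function names missing_tools and java_home_ok are made up for illustration.

```shell
# Hypothetical helper: print every command from the argument list that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Hypothetical helper: succeed only if JAVA_HOME is set and contains an executable bin/java.
java_home_ok() {
  [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]
}
```

For example, `missing_tools wget tar` prints nothing when both tools are installed, and `java_home_ok || echo "JAVA_HOME is not usable"` flags a missing or broken JDK setup.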
2.2 Obtain Installation Package
- Confirm the version of the existing cluster and download the same version of the installation package
- Determine a unified installation directory (e.g., /opt/dolphinscheduler)
- Download and extract the installation package to the target directory
- Add the database driver package (e.g., mysql-connector-java)
mkdir -p /opt
tar -zxvf apache-dolphinscheduler-<version>-bin.tar.gz -C /opt
mv /opt/apache-dolphinscheduler-<version>-bin /opt/dolphinscheduler
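To guard against downloading the wrong version, the version string can be read back from the package file name and compared with the one used by the existing cluster. The sketch below assumes the file follows the apache-dolphinscheduler-<version>-bin.tar.gz naming shown above; pkg_version is a hypothetical helper name.

```shell
# Hypothetical helper: extract <version> from apache-dolphinscheduler-<version>-bin.tar.gz.
pkg_version() {
  local name
  name=$(basename "$1")
  name=${name#apache-dolphinscheduler-}   # strip the fixed prefix
  printf '%s\n' "${name%-bin.tar.gz}"     # strip the fixed suffix
}
```

A comparison such as `[ "$(pkg_version "$new_pkg")" = "$cluster_version" ]` (both variables are placeholders) can then stop the rollout early if the versions differ.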
3. System User Configuration
Create the deployment user and configure sudo privileges on all new nodes:
useradd dolphinscheduler
echo "dolphinscheduler123" | passwd --stdin dolphinscheduler
echo 'dolphinscheduler ALL=(ALL) NOPASSWD: ALL' >> /etc/sudoers
# Comment out "Defaults requiretty" (if present) so sudo works without a TTY
sed -i 's/^Defaults[[:space:]]\+requiretty/#&/' /etc/sudoers
4. Configuration File Adjustments
4.1 Copy Configuration Files
Copy the conf directory from an existing node to the new node and double-check:
- datasource.properties: Database connection info
- zookeeper.properties: ZooKeeper connection info
- common.properties: Resource storage configuration
- dolphinscheduler_env.sh: Environment variables
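A quick way to double-check the copy is to compute one fingerprint for the whole conf directory on each node and compare the results. This is a sketch under assumptions: conf_fingerprint is a hypothetical helper, and it relies on sha256sum from GNU coreutils being available.

```shell
# Hypothetical helper: print a single SHA-256 fingerprint for all files under a directory.
# Assumes GNU coreutils (sha256sum) is installed.
conf_fingerprint() {
  (
    cd "$1" &&
      find . -type f -exec sha256sum {} + |
      sort -k 2 |          # order by file name so the result is stable
      sha256sum |
      cut -d ' ' -f 1
  )
}
```

If `conf_fingerprint /opt/dolphinscheduler/conf` prints the same value on the existing node and on the new node, the two copies contain the same files with the same contents.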
4.2 Configure Environment Variables
Edit conf/env/dolphinscheduler_env.sh. Sample configuration:
export HADOOP_HOME=/opt/soft/hadoop
export JAVA_HOME=/opt/soft/java
export PATH=$JAVA_HOME/bin:$PATH
Create a symbolic link to Java:
sudo ln -s $JAVA_HOME/bin/java /usr/bin/java
4.3 Update Cluster Configuration
Edit bin/env/install_env.sh on all nodes:
# Add Master node example
ips="ds1,ds2,ds3"
masters="master1,master2,ds1,ds2"
# Add Worker node example
workers="worker1:default,worker2:default,ds3:default"
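When these lists are maintained by script rather than by hand, it helps to add hosts idempotently so that a rerun does not create duplicates. The following is a minimal sketch; add_host is a hypothetical helper, not something shipped with DolphinScheduler.

```shell
# Hypothetical helper: append an entry to a comma-separated list unless it is already there.
add_host() {
  local list host
  list=$1
  host=$2
  case ",$list," in
    *",$host,"*) printf '%s\n' "$list" ;;                 # already present: unchanged
    *)           printf '%s\n' "${list:+$list,}$host" ;;  # append, no leading comma if empty
  esac
}
```

For instance, `add_host "master1,master2" ds1` yields master1,master2,ds1, and Worker entries can be passed with their group, as in `add_host "$workers" ds3:default`.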
5. Permission Setup & Cluster Restart
Set directory permissions:
sudo chown -R dolphinscheduler:dolphinscheduler /opt/dolphinscheduler
Restart the cluster:
# Stop all services
bin/stop-all.sh
# Start all services
bin/start-all.sh
6. Expansion Verification
- Use the jps command to check service processes
- Check log files on each node
- Confirm the status of new nodes via the Web UI Monitoring Center
- Confirm the status of new nodes via the Web UI Monitoring Center
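For the log check, a small filter that shows only the most recent error lines can save time on each node. This is a rough sketch: recent_errors is a hypothetical helper, the log path is passed in because it varies between versions, and matching on the literal string ERROR is an assumption about the log format.

```shell
# Hypothetical helper: print the last N lines containing "ERROR" from a log file (default 20).
recent_errors() {
  local log_file limit
  log_file=$1
  limit=${2:-20}
  grep 'ERROR' "$log_file" | tail -n "$limit"
}
```

Pointing it at the new node's Master or Worker log right after startup should print nothing on a healthy node.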
Cluster Shrinking Operations
1. Pre-shrink Preparation
Clearly identify the node types and quantities to be removed, ensuring the operation will not affect existing task execution.
2. Shrinking Steps
2.1 Stop Services on Target Nodes
Run the following on the nodes to be removed:
# Stop Master service
bin/dolphinscheduler-daemon.sh stop master-server
# Stop Worker service
bin/dolphinscheduler-daemon.sh stop worker-server
Use jps to confirm services have been stopped.
2.2 Update Cluster Configuration
Edit bin/env/install_env.sh on all nodes and remove the corresponding node configurations:
# Master shrink example
masters="master1,master2" # Removed ds1, ds2
# Worker shrink example
workers="worker1:default,worker2:default" # Removed ds3
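The same edit can be scripted by filtering the host out of the comma-separated value. The sketch below is illustrative only; remove_host is a hypothetical helper that matches on the host name and ignores any ":group" suffix, and it assumes host names contain no shell wildcard characters.

```shell
# Hypothetical helper: drop a host from a comma-separated list.
# Entries may be "host" or "host:group"; only the host part is compared.
remove_host() {
  local list host out entry old_ifs
  list=$1
  host=$2
  out=
  old_ifs=$IFS
  IFS=,
  for entry in $list; do
    if [ "${entry%%:*}" != "$host" ]; then
      out="${out:+$out,}$entry"
    fi
  done
  IFS=$old_ifs
  printf '%s\n' "$out"
}
```

Calling `remove_host "worker1:default,worker2:default,ds3:default" ds3` gives worker1:default,worker2:default, matching the Worker example above.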
3. Post-shrink Check
- Confirm the remaining nodes are running properly
- Check whether task scheduling is affected
- Monitor system resource usage
Notes
- Version Consistency: Ensure all nodes use the same version of DolphinScheduler
- Config Synchronization: All nodes must have identical configuration files
- Service Dependencies: Worker nodes must install necessary clients for specific task types
- Resource Permissions: Ensure the deployment user has sufficient permissions for the resource storage system
- Operation Order: Always stop services before modifying configurations to avoid inconsistencies
By following these steps, you can safely scale your DolphinScheduler cluster up or down to flexibly meet changing business needs. It is recommended to perform these operations during off-peak business hours and to back up your system beforehand.