DEV Community

Chen Debra
Chen Debra

Posted on

6 Common SQL Operation Tips in DolphinScheduler

Summary: Part 4 of the Apache DolphinScheduler Backend SQL Experience Series
Keywords: Big Data, Data Quality, Workflow Scheduling

Overview

After researching DolphinScheduler and applying it in an actual project for a period of time, I’ve gathered some practical backend SQL tips to share.

I'm using MySQL as the backend database for DolphinScheduler in this example.

Let’s use tasks with names containing “ods_xf_act” as an example.

1. Modify Task Group

UPDATE t_ds_task_definition a
JOIN t_ds_task_definition_log b ON a.`code` = b.`code` AND a.version = b.version
SET a.task_group_id = 19, b.task_group_id = 19
WHERE a.name LIKE '%ods_xf_act%'
Enter fullscreen mode Exit fullscreen mode

2. Batch Modify Task Execution Type

UPDATE t_ds_process_definition a
JOIN t_ds_process_definition_log b ON a.code = b.code AND a.version = b.version
SET a.execution_type = 1, b.execution_type = 1
WHERE a.name LIKE '%ods_xf_act%';
Enter fullscreen mode Exit fullscreen mode

3. Check Timer Configurations

Use this query to decide how to configure your timers.

SELECT crontab, COUNT(*) FROM t_ds_schedules
GROUP BY crontab
ORDER BY COUNT(*) DESC
Enter fullscreen mode Exit fullscreen mode

4. Batch Modify Timers

Modifying timers one by one through the frontend is tedious, so I chose to update them in the backend in bulk.

1. Identify Timers That Need Updating

SELECT t1.id, t1.process_definition_code, crontab, t2.name 
FROM t_ds_schedules t1
JOIN t_ds_process_definition t2 
ON t1.process_definition_code = t2.`code`
WHERE name LIKE '%ods_xf_act%'
AND crontab LIKE '%0 0 5 *%'
Enter fullscreen mode Exit fullscreen mode

2. Update to the Desired Crontab Timer

UPDATE t_ds_schedules t1
JOIN t_ds_process_definition t2 
ON t1.process_definition_code = t2.`code`
SET t1.crontab = '0 0 11 * * ? *'
WHERE name LIKE '%ods_xf_act%'
AND crontab LIKE '%0 0 5 *%'
Enter fullscreen mode Exit fullscreen mode

3. Update the Trigger Table for the New Crontab

Since the timer has been changed from 5 to 11, the WHERE clause now targets crontab values of 11.

UPDATE qrtz_cron_triggers t1
SET t1.CRON_EXPRESSION = '0 0 11 * * ? *'
WHERE t1.TRIGGER_NAME IN (
  SELECT CONCAT("job_", t1.id) 
  FROM t_ds_schedules t1
  JOIN t_ds_process_definition t2 
  ON t1.process_definition_code = t2.`code`
  WHERE name LIKE '%ods_xf_act%'
  AND crontab LIKE '%0 0 11 *%'
)
Enter fullscreen mode Exit fullscreen mode

4. Update the Start Time for the New Timer Trigger

Since there's a time zone difference when updating NEXT_FIRE_TIME, subtract 8 hours.

UPDATE qrtz_triggers t1
SET t1.NEXT_FIRE_TIME = ROUND(UNIX_TIMESTAMP(DATE_SUB("2024-07-23 11:00:00", INTERVAL 8 HOUR)) * 1000)
WHERE t1.TRIGGER_NAME IN (
  SELECT CONCAT("job_", t1.id) 
  FROM t_ds_schedules t1
  JOIN t_ds_process_definition t2 
  ON t1.process_definition_code = t2.`code`
  WHERE name LIKE '%ods_xf_act%'
  AND crontab LIKE '%0 0 11 *%'
)
Enter fullscreen mode Exit fullscreen mode

5. Notification Policy Set to “Do Not Send,” But Still Receiving Alerts

Issue: Originally set to “send on failure,” then changed to “do not send,” but alerts were still triggered.

Root Cause: The associated alert group was not cleared when switching to “do not send.” This is a bug.

Temporary Solution:

SELECT t1.*
FROM t_ds_schedules t1
JOIN t_ds_process_definition t2 
ON t1.process_definition_code = t2.`code`
WHERE name LIKE '%ods_xf_act%'
Enter fullscreen mode Exit fullscreen mode

Update all records with warning_type = 0 and set the corresponding warning_group_id to 0.

6. Task Group Queue Shows No Tasks on UI but Resources Are Fully Occupied

View Task Group List

SELECT * FROM t_ds_task_group
ORDER BY create_time DESC
Enter fullscreen mode Exit fullscreen mode

If a task group is full but no tasks are visible on the UI, you can manually adjust the values.

View Task Group Queue and Mark Unfinished Tasks as Completed (Status = 2)

-- t_ds_task_group_queue.`status` tinyint(4) DEFAULT '-1' COMMENT '-1: waiting  1: running  2: finished'

SELECT * FROM t_ds_task_group_queue
WHERE 1=1
AND status <> 2  -- finished
ORDER BY create_time DESC
Enter fullscreen mode Exit fullscreen mode

View Task Instances and Mark Unfinished Tasks as Completed (State = 7)

-- t_ds_task_instance.`state` tinyint(4) DEFAULT NULL COMMENT 
-- 'Status: 0 commit succeeded, 1 running, 2 prepare to pause, 3 pause, 4 prepare to stop, 
-- 5 stop, 6 fail, 7 succeed, 8 need fault tolerance, 9 kill, 10 wait for thread, 11 wait for dependency to complete'

-- id refers to task_id from t_ds_task_group_queue

SELECT * FROM t_ds_task_instance
WHERE state <> 7  -- success
AND id IN (
  SELECT id FROM t_ds_task_group_queue
  WHERE 1=1
  AND status <> 2  -- finished
  ORDER BY create_time DESC
)
ORDER BY submit_time DESC
LIMIT 100
Enter fullscreen mode Exit fullscreen mode

Top comments (0)