DEV Community

Apache SeaTunnel
Apache SeaTunnel

Posted on

Field-Tested & Proven Free Real-Time MySQL Oracle Sync with SeaTunnel, Step-by-Step Guide

When it comes to tools for syncing MySQL to Oracle, besides the traditional OGG, SeaTunnel (ST) should definitely be on the list—it’s simply too convenient to use ~

Our production system has been running steadily for 4 months now, with 3 source databases, perfectly supported! Highly recommend everyone to give it a try~

1. How to use SeaTunnel? Let’s check the help info

./bin/seatunnel.sh -h
Usage: seatunnel.sh [options]
  Options:
    --async                         Run the job asynchronously, when the job 
                                    is submitted, the client will exit
                                    (default: false)
    -can, --cancel-job              Cancel job by JobId
    --check                         Whether check config (default: false)
    -cj, --close-job                Close client the task will also be closed 
                                    (default: true)
    -cn, --cluster                  The name of cluster
    -c, --config                    Config file
    --decrypt                       Decrypt config file, When both --decrypt 
                                    and --encrypt are specified, only 
                                    --encrypt will take effect (default: false) 
    -m, --master, -e, --deploy-mode SeaTunnel job submit master, support 
                                    [local, cluster] (default: cluster)
    --encrypt                       Encrypt config file, when both --decrypt 
                                    and --encrypt are specified, only 
                                    --encrypt will take effect (default: false) 
    --get_running_job_metrics       Gets metrics for running jobs (default: false) 
    -h, --help                      Show the usage message
    -j, --job-id                    Get job status by JobId
    -l, --list                      list job status (default: false)
    --metrics                       Get job metrics by JobId
    -n, --name                      SeaTunnel job name (default: SeaTunnel)
    -r, --restore                   restore with savepoint by jobId
    -s, --savepoint                 savepoint job by jobId
    --set-job-id                    Set custom job id for job
    -i, --variable                  Variable substitution, such as -i 
                                    city=beijing, or -i date=20190318. We use 
',' as separator, when inside "", ',' are 
                                    treated as normal characters instead of 
                                    delimiters. For example, -i 
                                    city="beijing,shanghai". If you want to 
                                    use dynamic parameters, you can use the 
                                    following format: -i date=$(date +%Y%m%d). (default: [])
Enter fullscreen mode Exit fullscreen mode

Reference explanation:

  1. Parameters related to job submission
Parameter Abbreviation Description
--config -c Required parameter, specifies the configuration file path
--name -n Sets the job name (default is "SeaTunnel")
--master -m
or -e
Specifies the deployment mode: local or cluster (default value)
--async None Runs the job in asynchronous mode, and the client exits immediately after submission (default is false)
--variable -i Variable substitution, format: -i key=value, multiple variables are separated by commas
  1. Parameters related to job management
Parameter Abbreviation Description
--job-id -j Get job status by JobId
--list -l List status of all jobs
--cancel-job -can Cancel job by JobId
--close-job -cj Close task when closing client (default true)
--savepoint -s Create savepoint for job with specified JobId
--restore -r Restore job from savepoint
--metrics None Get monitoring information of specified job
--get_running_job_metrics None Get monitoring information of running jobs

2. How to install? You can refer to the previous article

【Data Sync】First Experience with SeaTunnel: 5000 Words to Get You Started with Oracle-CDC

3. Daily Ops: Job Management

3.1 View jobs: running means it’s in progress, of course you’ll also see other statuses

./bin/seatunnel.sh -l
Enter fullscreen mode Exit fullscreen mode

3.2 Pause job

./bin/seatunnel.sh -s 967714059992432641
Enter fullscreen mode Exit fullscreen mode

3.3 Restart a paused job

./bin/seatunnel.sh -r 967714059992432641 -c $SEATUNNEL_HOME/config/mysql_virdb_config 
Enter fullscreen mode Exit fullscreen mode

Since no background run was added during restore, it stayed in the foreground. After exiting with ctrl+c, the status became:

PS: When restoring, be sure to add the relevant parameters, job name, whether to run in the background, etc.

./bin/seatunnel.sh -r 967714059992432641 -c $SEATUNNEL_HOME/config/mysql_virdb_config --async -n job_mysql_virdb
Enter fullscreen mode Exit fullscreen mode

Check the job status again—this is the result we want!

3.4 Cancel job

This command cancels the specified job. Once canceled, the job will stop, and its status will become CANCELED.

Batch canceling is supported, allowing you to cancel multiple jobs at once.

All checkpoint info for canceled jobs will be deleted and cannot be restored with seatunnel.sh -r.

./bin/seatunnel.sh -can 967714059992432641
Enter fullscreen mode Exit fullscreen mode

PS: The status after canceling looks very similar to directly exiting with ctrl+c. Restarting may result in partial data loss.

4. SeaTunnel Logging Configuration

Config file: $SEATUNNEL_HOME/config/log4j2.properties

4.1 Configure separate log file for each job (takes effect after restarting a job)

Current config:

rootLogger.appenderRef.file.ref = fileAppender
appender.file.layout.pattern = [%X{ST-JID}] %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p [%-30.30c{1.}] [%t] - %m%n
Enter fullscreen mode Exit fullscreen mode

Change to:

rootLogger.appenderRef.file.ref = routingAppender
appender.file.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p [%-30.30c{1.}] [%t] - %m%n
Enter fullscreen mode Exit fullscreen mode

4.2 SeaTunnel supports scheduled deletion of old log files to avoid disk space shortage

In $SEATUNNEL_HOME/config/seatunnel.yaml, add the following config:
Default values below (time is in minutes, e.g., 1440 = 1440 minutes):

seatunnel:
engine:
history-job-expire-minutes:1440
telemetry:
logs:
scheduled-deletion-enable:true
Enter fullscreen mode Exit fullscreen mode

Explanation:

  • history-job-expire-minutes: Sets retention time (in minutes) for historical jobs and logs. After this time, expired job info and logs are automatically deleted.
  • scheduled-deletion-enable: Enables scheduled cleanup (default true). Logs are deleted once they reach the expiration time. If disabled, logs will remain forever and must be managed manually—otherwise, disk usage may be impacted. It’s recommended to configure this reasonably as needed.

5. Web UI for Job Monitoring

PS: Finally, it’s here—a Web UI for job monitoring! Awesome! Let’s see how to configure and access it.

5.1 Configure Web UI

Config file: $SEATUNNEL_HOME/config/seatunnel.yaml, default config:

seatunnel:
engine:
http:
enable-http:true
port:8080
Enter fullscreen mode Exit fullscreen mode

5.2 Access Web UI

Open browser, go to http://ip:8080

5.3 Data sync validation

Compare using latest create/update time and row counts.

6. Common Errors

6.1 Must configure database and table names, otherwise error occurs

... (log error omitted, showing database option required)
Enter fullscreen mode Exit fullscreen mode

6.2 Error 2: ID issue

... (log error omitted, showing can't find field [ID])
Enter fullscreen mode Exit fullscreen mode

6.3 If already exists, error will be prompted

Error Msg = ORA-00955: name is already used by an existing object
Enter fullscreen mode Exit fullscreen mode

Web UI and server queries may show inconsistency

jobid: 967961857958608897 was paused before; after restart, Web UI showed 2 records.


6.4 Could be due to data delay—after a few days, it disappeared

7. References

Top comments (0)