Let me get straight to it: I used to transfer data from Files.com (or any other platform where files got dropped) to our cloud buckets using scripts. It was all okay while the file sizes were within a few MBs, but things got painful once they grew into GBs. Transfers started taking a lot more time.
To speed things up, I tried running the transfer on a VM. It did get faster, but not faster faster, especially when the size crossed 400+ GB.
That’s when I started looking for a better way to connect my GCP/AWS buckets directly with these storage platforms, something that could make the transfer process faster and more reliable. And that’s where rclone came into the picture.
Rclone
I have set it up on my VM as a scheduled job that runs the backups/transfers with ease (there's a rough cron sketch at the end of this post).
sudo apt update
curl https://rclone.org/install.sh | sudo bash
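Once that finishes, a quick sanity check that the binary landed on the PATH:
rclone version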
That's the usual installation process. Once done with it, let's set up the config; this is where we mention the details of the storage platforms we're transferring data from and to:
rclone config
It's going to throw a few options at you; pick the one to create a new remote.
From there it will take you through the list of storage platforms supported by rclone that can be used as remotes.
Choose the one you prefer. I used Files.com, gave the remote a name that I'll use to refer to it later on, and did the auth using an API key here.
PS: You might not find the API key option right away, so wait for the "edit advanced config" option.
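To double-check what got saved, rclone can list the remotes it knows about and dump a remote's stored settings (filescom here is just the name I gave mine):
rclone listremotes            # should print filescom:
rclone config show filescom   # prints the stored settings for that remote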
Now we're done with one remote; moving on to the next, follow the same steps as the first one: rclone config -> new remote -> pick the one you want and provide the auth method. I went with a GCS bucket here, mentioned the project number, and performed the auth using a service account JSON key.
Also, if you're particular about object ACLs and storage classes, you can pick the appropriate ones from the options.
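For reference, the entry that ends up in the rclone config file looks roughly like this; the exact key names depend on your rclone version and the choices you make in the wizard, and the project number and key path below are placeholders:
[gcs]
type = google cloud storage
project_number = 123456789012
service_account_file = /path/to/service-account-key.json
object_acl = private
storage_class = ARCHIVE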
Once you're done with it, you can check whether the remote works by listing it with the ls command along with the remote name:
rclone ls filescom:
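The same idea works for the GCS side; lsd lists the buckets the service account can see, and ls lists the objects inside one:
rclone lsd gcs:               # lists the buckets visible to the service account
rclone ls gcs:vault-archive   # lists objects inside a specific bucket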
And to copy files, the usual syntax is:
rclone copy <source> <destination> [flags]
We've got a bunch of flags: --progress to show the progress, --transfers [number] to set the number of parallel transfers, --dry-run to perform a simulation, and --exclude or --include to filter files (there's a quick dry-run example after the flag breakdown below). Here's the command I used for the actual transfer:
rclone copy filescom:/hawk gcs:vault-archive/ -P --transfers=8 --checkers=10 --buffer-size=64M \
  --fast-list --retries=5 --low-level-retries=10 --timeout=5m --contimeout=30s --retries-sleep=10s \
  --log-file=/home/mohamed-roshan-k/rclone_transfer.log --log-level=INFO
-P = progress bar
--checkers = number of parallel checkers that verify whether a file already exists in the destination
--buffer-size = in-memory buffer size used per file during the transfer
--retries = number of times it should retry the transfer if it fails
--low-level-retries = similar to --retries, but for low-level network and file errors
--timeout = aborts the transfer if it's stuck for longer than the mentioned time
--contimeout = connection timeout
--retries-sleep = interval between each retry
--log-file = path to the logs
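Before kicking off the real transfer, a dry run with a filter is a cheap way to sanity-check what would actually get copied (the *.zip pattern here is just an example):
rclone copy filescom:/hawk gcs:vault-archive/ --dry-run --include "*.zip" -P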
Some screenshots of the time taken for the transfer.
Do note, the process can be made faster if we increase:
- Transfer = --transfers
- Checkers = --checkers
- Buffer size = --buffer-size
If your VM has the specs to handle the increased load (CPU, RAM, and network), you’ll see a noticeable improvement in performance (pretty obvious but yea)
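And since I run this as a job on the VM, here's roughly what the crontab entry looks like; the schedule below is just a placeholder, and the rclone binary path may differ on your machine (check with which rclone):
# crontab -e  ->  run the transfer every night at 02:00 (hypothetical schedule)
0 2 * * * /usr/bin/rclone copy filescom:/hawk gcs:vault-archive/ --transfers=8 --checkers=10 --buffer-size=64M --log-file=/home/mohamed-roshan-k/rclone_transfer.log --log-level=INFO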