
Parallel incremental FTP deploy in CI pipeline

Pavel Kutáč ・2 min read

Automated deployment is a must-have nowadays. Several tools can upload files over FTP, but none of them can both upload only the changed files and do so in parallel. I combined two tools to achieve this, with examples for GitLab.

🇨🇿 The article is also available in Czech at kutac.cz

TL;DR Everything described in the article is available in the public repository, together with a ready-to-go Docker image and examples for the GitLab CI pipeline.


FTP Deployment + LFTP = Parallel incremental upload

Earlier, I forked and modified the FTP Deployment tool, which computes hashes of all files, compares them with the .htdeployment file saved on the server, and creates, uploads, or deletes only the changed files. However, it cannot work in parallel.
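
The idea of hash-based change detection can be sketched roughly as follows. This is only an illustration in Python, not the actual FTP Deployment implementation: the function names `hash_tree` and `plan_changes`, the choice of SHA-256, and the plain dictionary used as a manifest are all assumptions made for this example and do not reflect the real .htdeployment format.

```python
import hashlib
from pathlib import Path


def hash_tree(root):
    """Map each file path (relative to root, POSIX style) to its SHA-256 hex digest."""
    root = Path(root)
    hashes = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            hashes[path.relative_to(root).as_posix()] = digest
    return hashes


def plan_changes(local, remote):
    """Compare local hashes with the manifest stored during the last deploy.

    Both arguments are dicts of {relative_path: hash}.
    Returns (to_upload, to_delete) as sorted lists of relative paths.
    """
    # New files and files whose hash differs must be uploaded.
    to_upload = sorted(p for p, h in local.items() if remote.get(p) != h)
    # Files known to the server but missing locally must be deleted.
    to_delete = sorted(p for p in remote if p not in local)
    return to_upload, to_delete
```

For example, with a previous manifest of `{"index.php": "aaa", "old.css": "bbb"}` and current hashes of `{"index.php": "aaa", "app.js": "ccc"}`, `plan_changes(current, previous)` would list `app.js` for upload and `old.css` for deletion, while the unchanged `index.php` is skipped.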

But there is another great and popular tool called LFTP, which can run operations in parallel. It also supports a mirror command, but that cannot be used in many cases. More about that below.

In my project, I use the modified fork of FTP Deployment to track changes and prepare lists of directories and files to upload or delete. In the second step, Bash scripts consume those lists and execute LFTP commands.
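
One detail worth keeping in mind when preparing such lists is ordering: a directory has to exist before files are uploaded into it in parallel, and a directory can only be removed after its children are gone. The sketch below shows one possible way to derive that ordering in Python. It is an assumption for illustration only (the project itself uses Bash scripts), and the helper names `dirs_to_create` and `dirs_deletion_order` are hypothetical.

```python
from pathlib import PurePosixPath


def dirs_to_create(files):
    """Return every parent directory of the given files, parents before children."""
    dirs = set()
    for f in files:
        for parent in PurePosixPath(f).parents:
            # Skip ".", which stands for the deployment root itself.
            if str(parent) != ".":
                dirs.add(str(parent))
    # Sort by depth so that a parent is always created before its children.
    return sorted(dirs, key=lambda d: (d.count("/"), d))


def dirs_deletion_order(dirs):
    """Order directories deepest first, so children are removed before parents."""
    return sorted(dirs, key=lambda d: (-d.count("/"), d))
```

Once the directories exist, the uploads themselves no longer depend on each other, which is what makes running them in parallel safe.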

ℹ️ Thanks to the combination of both tools, my deploy pipeline runs much faster. With many changed files in the vendor folder, I have seen the time drop from 30 minutes to 5 minutes.

See the README and example files for a detailed explanation and ready-to-go config files. There is also a prepared Docker image, which is based on the newest PHP and contains the modified version of FTP Deployment, LFTP, utilities to replace environment variables, and a few more tools.


Why can't lftp mirror always be used?

The mirror command of LFTP can synchronize two folders, so one could say that FTP Deployment is superfluous. In some cases, that may really be true. However, LFTP checks only the file size and the file's modification time. If either differs between the local file and the file on the FTP server, the file is synced. This approach has several flaws:

  1. Some FTP servers ignore the modification time sent with a file and just set the current time. Because of this, the file is always considered changed and is synced on every run.
  2. If some files are created in the pipeline, their modification time can differ between runs, so they may always be considered changed. The vendor folder is a typical example.
  3. Both issues above can be solved with the --ignore-time flag. Then only the file size is checked when files are compared. But if a change leaves the size the same, for example when a single character is replaced, the file will not be synced, which is not acceptable for production use.
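
The third point is easy to demonstrate. The following minimal Python sketch compares two versions of a small file that differ in a single character. The sample contents and the helper names `same_by_size` and `same_by_hash` are made up for illustration; they only model the two comparison strategies and are not LFTP or FTP Deployment code.

```python
import hashlib

# Two versions of a hypothetical config file: one character differs,
# but the length in bytes is identical.
old_content = b"DEBUG = 1\n"
new_content = b"DEBUG = 0\n"


def same_by_size(a, b):
    """Size-only comparison, as when modification time is ignored."""
    return len(a) == len(b)


def same_by_hash(a, b):
    """Content comparison based on a SHA-256 digest."""
    return hashlib.sha256(a).digest() == hashlib.sha256(b).digest()


if __name__ == "__main__":
    print("equal by size:", same_by_size(old_content, new_content))
    print("equal by hash:", same_by_hash(old_content, new_content))
```

The size check reports the two versions as equal, so the change would never be uploaded, while the hash comparison detects it.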

Let me know what you think about my solution!

Discussion

Ben Sinclair

Nice!

A (long) while ago I made burdock, which is, in my own words, "a cheap Python knock-off of dandelion", because I wanted features but didn't know Ruby.

I never really used it past the one company where I only had FTP access to their servers, but it's designed to be git-aware and offered a couple of features you might consider in your (much newer and better) app:

  • dry runs
  • fake first deployment (only upload metadata file and assume everything's already in place)
  • multiple profiles
Pavel Kutáč Author

Hi, thank you for the comment!

Actually, most of the features you mentioned are already supported. Because I just combined existing tools, FTP Deployment and LFTP, it's more a matter of configuring those tools.

Both FTP Deployment and LFTP support dry runs. A fake first deployment is also possible with FTP Deployment. And since I mostly want to use it in CI, profiles can be handled via environment variables, which is also safer than hardcoding credentials into the pipeline config file.