What is Proper Continuous Integration?

Marko Anastasov on October 04, 2018

Continuous integration (CI) is confusing. As with all ideas, everybody does their own version of it in practice. CI is a solution to the problems ...
Ben Halpern

It takes us just under 10 minutes to push to dev.to. It really would be lovely to improve this, or at least work to never make it worse. I know we're perfectly in the realm of average, but this is something I feel could be much faster if it were tuned and gardened properly.

Derek Hopper

We're about the same currently. We use Semaphore, but don't use their parallel feature. Every push gets built. We don't normally push every single commit.

We use Firefox for integration tests. I feel we could improve performance by switching to headless chrome, but Firefox has been pretty reliable for us over the years.

David J Eddy

I got tired of the complexity of using headless Chrome, so I built a Docker container for it. Then, 6 months later, I found Cypress.io: a life changer.

Marko Anastasov Author

Ben, where is most of the time spent?

It sounds like it's less than 10mins for both CI + deployment, which is great!

Joe Hobot

What makes you think that's great?

Marko Anastasov Author

CI + deployment < 10mins for a medium+ web app with probably hundreds of thousands of users sounds good to me in general. Most importantly developers can stay in the zone and keep moving fast. Is there room for optimization — probably, as Ben hinted, but it's impossible to say anything useful without looking at the code, configuration & infrastructure being used.

Austin Hardaway

Is there some goalpost that gave you a 10 minute gate? I feel like CI shouldn't imply speed as much as continuity, i.e. one automated process commit->test->deploy (not necessarily one technical process).

simonhaisz

CI is all about the quick feedback loop. The 10 minute target is a common one and for a very good reason - responding to failures.

One of the principles is that master should always be green. When something does go wrong it needs to be fixed fast. The sooner you learn about a failure the sooner you can react. If you react quickly enough you don't have as much of a context switch to fix the issue. Less time also means fewer commits from others landing on a broken branch and further muddying the waters.

Some people have automated tools that will automatically revert changes after a CI failure. The more old-school practice (that we have where I work) is that after you push you don't leave your desk until you get the green alert, so any issues in master can be dealt with as soon as possible. If it took more than 10 minutes not many people would follow the practice 😁

Now don't get me wrong, a process that takes 30 minutes or an hour is better than no process at all. But there are real advantages to tightening that loop. Depending upon your solution and your environment it may not be worth the investment to do so. Having only a handful of committers is a lot different from having dozens or hundreds.

Austin Hardaway

Thanks, that makes a lot of sense! And clearly in this arena, as in most, faster is always better. Faster feedback == faster fixes. However, I do think that a separation of concerns is warranted here. A slower CI pipeline is still a CI pipeline, and I worry about including a speed goal as an implied meaning of a term that describes the functionality of a thing. It probably does stem from not having experience on a team that large though...

Marko Anastasov Author

Our biggest Ruby on Rails app has 80k lines of test code, unit + integration => entire CI build takes 7mins. With automatic parallelization, of course. Otherwise it'd take an hour and a half.
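
For scale, 90 minutes down to 7 is roughly a 13x speedup. The sketch below is one generic way to split a test suite across parallel workers: assign the longest test files first, always to the least loaded worker. It is an illustration only, not a description of how Semaphore's automatic parallelization works; `partition_by_duration`, `wall_clock`, and the file names and timings are hypothetical.

```python
import heapq


def partition_by_duration(durations, workers):
    """Greedy longest-first split of test files across parallel workers.

    durations: mapping of test file name -> seconds it took on a previous run.
    Returns one (total_seconds, [files]) pair per worker.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Min-heap keyed on current load; the worker index breaks ties.
    heap = [(0.0, i, []) for i in range(workers)]
    heapq.heapify(heap)
    longest_first = sorted(durations.items(), key=lambda kv: kv[1], reverse=True)
    for name, seconds in longest_first:
        load, i, files = heapq.heappop(heap)  # least loaded worker so far
        files.append(name)
        heapq.heappush(heap, (load + seconds, i, files))
    return [(load, files) for load, _, files in sorted(heap, key=lambda t: t[1])]


def wall_clock(buckets):
    """The build is only as fast as its most loaded worker."""
    return max(load for load, _ in buckets)


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    timings = {"a_spec": 5, "b_spec": 4, "c_spec": 3, "d_spec": 3, "e_spec": 1}
    buckets = partition_by_duration(timings, workers=2)
    print(buckets, wall_clock(buckets))
```

The split is only as good as the timing data behind it, and one very slow test file still sets a floor on the total build time no matter how many workers are added.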

Derek Hopper

Thanks for sharing the time without parallelization. I think a lot of teams and developers out there can benefit from hearing that.

Sten

I second this. I started to write a comment but you said it all :)

José Coelho

Great article Marko.
How do you reduce the time of a build?
In my company we have a big project, with no tests, and usually the build takes around 20 to 40 minutes.

Any tips or tricks to reduce such builds? (Building with Maven.)

Yury
mvn install -o -T 1C -pl modulename -am

Using modules for stuff that can be built independently and then running the build in parallel should help a bit. Using incremental builds is also faster because you don't need to rebuild everything if you only changed one file. And if you have all your dependencies in your local repository, you don't need to download the internet.

Vincent Milum Jr

Simplicity is key. If tests are taking that long, that is usually an indication that the underlying application framework or the testing framework itself is fundamentally flawed. I know this isn't always the case, but focusing on simplicity really does lead to incredible performance. In the base libraries that I develop, the largest has over 700 tests now. The entire test suite still completes in under 1 second, even on modest hardware.

A major issue that I see with most testing frameworks is the level of repetition within them, usually in the form of each test creating a virtualized environment of some kind, running a test that completes in a few milliseconds, then tearing down the environment and repeating the process for the next test. I've seen this setup and takedown account for literally 99.9%+ of the execution time in testing.

Doing setup and takedown only once in my testing script is what enabled this massive level of performance, and it has the added benefit of catching bugs that were missed previously. It turned out that some methods in the library left the library core in an inconsistent state. On a single-test basis this was missed, but subsequent calls to the library assumed the state was consistent and would fail if and only if they ran without the core being taken down and reinitialized. To help catch these, unit tests are now run in random order on every execution, and there are multiple executions in a row per git push (possible because it can now complete so quickly).
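
A minimal sketch of the two ideas in this comment, written with Python's `unittest` purely for illustration (the commenter's own language and framework aren't stated): build the expensive environment once per test class instead of once per test, and shuffle the test order on each run so hidden state dependencies surface. `ExpensiveEnvironment`, `LibraryTests`, `shuffled_suite`, and `run_shuffled` are hypothetical names.

```python
import random
import unittest


class ExpensiveEnvironment:
    """Stand-in for a costly fixture (database, VM, library core)."""

    instances_created = 0

    def __init__(self):
        ExpensiveEnvironment.instances_created += 1
        self.state = {}


class LibraryTests(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Runs once for the whole class, not once per test.
        cls.env = ExpensiveEnvironment()

    def test_write(self):
        self.env.state["key"] = "value"
        self.assertEqual(self.env.state["key"], "value")

    def test_read_missing(self):
        # Must pass whether or not test_write has already run.
        self.assertIsNone(self.env.state.get("missing"))

    def test_length_is_non_negative(self):
        self.assertGreaterEqual(len(self.env.state), 0)


def shuffled_suite(test_case, seed=None):
    """Build a suite whose tests run in random order.

    Pass the same seed again to reproduce a failing order.
    """
    names = list(unittest.defaultTestLoader.getTestCaseNames(test_case))
    random.Random(seed).shuffle(names)
    return unittest.TestSuite(test_case(name) for name in names)


def run_shuffled(test_case, seed=None):
    runner = unittest.TextTestRunner(verbosity=0)
    return runner.run(shuffled_suite(test_case, seed))


if __name__ == "__main__":
    run_shuffled(LibraryTests)
```

The trade-off is the one the comment describes: shared setup makes tests fast but lets state leak between them, so random ordering is what keeps that leakage visible instead of hidden.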

Marko Anastasov Author

"This setup and take down I've seen account for literally 99.9%+ of the execution time in testing."

I can confirm this based on my experience too. For example, let's say you're working on a large Ruby on Rails web app, tens of thousands of LoC. The default approach when writing any new piece of test code is to include a test helper which basically adds the complete monolith as a dependency to your test. This isn't always necessary, some code can stand alone.

On a more granular level, a new test case that you're writing may not need all the data that other tests in the file need to load, etc. It's all about thinking a little more carefully about what we're doing.

Miguel Vieira

Are you talking about all kinds of tests, like integration and acceptance, or only unit tests!?

Pavol Rajzak

That's the thing. In big projects, you should have multiple levels of tests. The unit tests should be the fastest ones and should be executed after each push. The integration tests are the ones that should be executed before merging a feature branch into the master branch (if it triggers deployment). Any other long-running tests (like separate test automation that clicks around the application and takes tens of minutes to finish) should be run nightly.
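
As a rough sketch, the tiers above can be written down as a mapping from pipeline trigger to the suites it runs. The trigger and suite names are hypothetical, and making each tier include the faster tiers below it is an assumption added here, not something the comment specifies.

```python
from enum import Enum


class Trigger(Enum):
    PUSH = "push"            # every push to any branch
    PRE_MERGE = "pre_merge"  # before merging a feature branch into master
    NIGHTLY = "nightly"      # scheduled run


# Assumption: slower tiers also re-run the faster suites.
SUITES_BY_TRIGGER = {
    Trigger.PUSH: ["unit"],
    Trigger.PRE_MERGE: ["unit", "integration"],
    Trigger.NIGHTLY: ["unit", "integration", "ui_automation"],
}


def suites_for(trigger):
    """Return the test suites to run for a given trigger."""
    return list(SUITES_BY_TRIGGER[trigger])


if __name__ == "__main__":
    for trigger in Trigger:
        print(trigger.value, "->", ", ".join(suites_for(trigger)))
```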

Miguel Vieira

Exactly Pavol! When I read this article I get scared, my integration tests take at least 3 hours... And yes, I am doing like you describe.

Marko Anastasov Author

3+ hours means the entire team cannot deploy more than twice during work hours. If your engineering and business leadership is OK with that, then what can you do...

Is that the best possible outcome? I think not, see my comment above on how fast we can go with a 1.5h (sequential) build.
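
The arithmetic behind "no more than twice" can be spelled out, assuming an 8-hour workday and pipeline runs that happen one after another (both assumptions, as is the helper name `max_sequential_runs`):

```python
def max_sequential_runs(workday_minutes: int, pipeline_minutes: int) -> int:
    """How many back-to-back pipeline runs fit in one workday."""
    if pipeline_minutes <= 0:
        raise ValueError("pipeline_minutes must be positive")
    return workday_minutes // pipeline_minutes


WORKDAY = 8 * 60  # assumption: an 8-hour workday, in minutes

if __name__ == "__main__":
    for label, minutes in [
        ("3 h integration run", 180),
        ("10 min pipeline", 10),
        ("7 min pipeline", 7),
    ]:
        runs = max_sequential_runs(WORKDAY, minutes)
        print(f"{label}: at most {runs} runs per day")
```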

Miguel Vieira

As you said, the whole unit test suite must run in less than 10 minutes (on average)... and our unit tests are no different. But our integration tests take more than 3 hours (depending, of course, on the characteristics of the machine).

When every test runs seeds and migrations across hundreds of endpoints, I think that time is acceptable. I don't think your reasoning applies to this particular kind of test (integration and acceptance): these will take several hours, depending on how big your project and team are, and we only run them when we think everything is ready to deploy to production. In daily sprint work, of course, we run the unit tests in less than 10 minutes...

There are many telltale signs that easily differentiate a unit test from an integration test: encapsulation, complexity and test failure. So, I think we're talking about different things...

If you have a different approach... I am all ears :)

Marko Anastasov Author

What I refer to as our 1h30min sequential build being cut down to 7mins does include all Cucumber scenarios too (integration & acceptance tests). They all run seeding and db migration prior to launching Firefox for UI testing. So it really is possible. :) We use a Semaphore feature called Boosters, which is Ruby only atm.

Mihail Malo

A bit offtopic, but what's the fastest free CI for open-source?
Still Semaphore at the moment?

Marko Anastasov Author

Still Semaphore ⚡️ 😉

Mihail Malo

Ok, gonna set some up this week, don't fail me Marko Anastasov :v

Has anything been written about Semaphore + pnpm?
Or maybe Semaphore + Rush + pnpm 😱
The doc on caching mentions only npm and yarn atm

Sorry for being so bothersome 😎

Marko Anastasov Author

No worries, ask away!

No official docs on pnpm yet, but this will work:

curl -L https://unpkg.com/@pnpm/self-installer | node
pnpm install

Btw I totally recommend that you try Semaphore 2.0. In that case follow 2.0 Node.js docs to set up caching of node modules.

Mihail Malo

I got some caching to work, and it was actually faster than clean install despite pnpm's speed and Semaphore's fast npm downloads.

But I haven't managed to make completely offline repeat installs work.
I'll try to remember to ping you once that's done, so you can add it to the Semaphore docs as well as pnpm docs.

I've been trying to get @microsoft/rush working as well, can you tell me what git environment is set up, and exactly what does the checkout command do? (It appears it's a sparser checkout than rush check needs)

Marko Anastasov Author

checkout does a shallow git clone by default, but you can override that, see docs. The script is open source.

Not sure what you mean by "what git environment"?

I'd be happy to include tips on using Rush in Semaphore docs, feel free to share a recipe. :)

Mihail Malo

I'm still in the process of understanding what rush actually wants, but I'll share what I learn.

Martin Kiesel

Wait an hour gitlab.com/gitlab-org/gitlab-ee/pi....

(I know some people pointed this out already in the comments, but what kind of tests are we talking about here!?)
