DEV Community

Hexydec

The State of Minification in PHP – How 1 Project Grew into 6 (Part 2)

In part 1 of this post, I presented to you my journey of writing a set of compilers to minify HTML/CSS/JS using PHP, along with the software I wrote to see how they stacked up against the competition.

These are the results.

The Contenders

I used Google, Packagist, and GitHub to find projects to go up against, and picked the ones that were most used. Where the original project had been abandoned, I picked forks that maintained them, although some of those forks hadn't been updated in a while either.

I also picked a script on GitHub Gist that seemed to have some traction. It combines an HTML, CSS, and Javascript minifier.

So here they are (note that the stats may not be completely up to date):

HTML Minifiers

| HTML Minifier | Type | Commits | Stars | Dependents | Maintained |
| --- | --- | --- | --- | --- | --- |
| hexydec/htmldoc | Compiler | 212 | 11 | 1 | Yes |
| mrclay/minify | RegExp | 764 | 2900 | 3731 | Yes |
| taufik-nurrohman | RegExp | 48 | 199 | Unknown | Yes |
| voku/html-min | Compiler | 203 | 106 | 151 | No |
| pfaciana/tiny-html-minifier | Linear Consumer | 46 | 123 | 0 | No |
| deruli/html-minifier | Compiler | 39 | 2 | 0 | No |

CSS Minifiers

| CSS Minifier | Type | Commits | Stars | Dependents | Maintained |
| --- | --- | --- | --- | --- | --- |
| hexydec/cssdoc | Compiler | 169 | 1 | 1 | Yes |
| mrclay/minify | RegExp | 764 | 2900 | 3731 | Yes |
| taufik-nurrohman | RegExp | 48 | 199 | Unknown | No |
| matthiasmullie/minify | RegExp | 439 | 1700 | 4544 | Yes |
| wikimedia/minify | RegExp | 220 | 0 | 12 | Yes |
| tubalmartin/cssmin | RegExp | 123 | 223 | 5586 | No |
| natxet/cssmin | Compiler | 38 | 79 | 1308 | No |
| websharks/css-minifier | RegExp | 10 | 17 | 159 | No |

Javascript Minifiers

| JS Minifier | Type | Commits | Stars | Dependents | Maintained |
| --- | --- | --- | --- | --- | --- |
| hexydec/jslite | Compiler | 64 | 2 | 1 | Yes |
| mrclay/jsmin | Linear Consumer | 764 | 2900 | 3731 | Yes |
| taufik-nurrohman | RegExp | 48 | 199 | Unknown | No |
| matthiasmullie/minify | RegExp | 439 | 1700 | 4544 | Yes |
| wikimedia/minify | RegExp | 220 | 0 | 12 | Yes |
| tedivm/jshrink | Linear Consumer | 110 | 25 | 0 | No |

Software

The tests were run on a VPS running Ubuntu and PHP 8, with 2 Intel Xeon CPUs and 2GB of RAM.

Metrics

To understand the results, there are 3 metrics here that we need to look at:

  • Compression Ratio – How much compression was achieved by each package (Gzipped and non-gzipped)
  • Speed – How fast does each package perform the compression
  • Reliability – Is the output valid
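As a rough sketch of how the first two metrics could be measured (this is not the benchmark software itself, and it is written in Python rather than PHP purely for illustration), the snippet below times a minifier and reports the percentage saved, both raw and gzipped. The names `percent_saved`, `measure`, and `toy_minify` are hypothetical, and treating "compression" as the percentage reduction in size is an assumption.

```python
import gzip
import time


def percent_saved(original: bytes, minified: bytes) -> float:
    """Percentage reduction in size, e.g. 100 bytes -> 75 bytes gives 25.0."""
    if not original:
        return 0.0
    return (1 - len(minified) / len(original)) * 100


def measure(minify, source: str) -> dict:
    """Time one minifier call and report raw and gzipped savings."""
    start = time.perf_counter()
    output = minify(source)
    elapsed = time.perf_counter() - start

    raw_in, raw_out = source.encode("utf-8"), output.encode("utf-8")
    return {
        "seconds": elapsed,
        "compression": percent_saved(raw_in, raw_out),
        # Gzip both sides so the figure reflects what is sent over the wire.
        "gzip": percent_saved(gzip.compress(raw_in), gzip.compress(raw_out)),
    }


def toy_minify(css: str) -> str:
    """Toy stand-in for a real minifier: trims each line and joins them."""
    return "".join(line.strip() for line in css.splitlines())


result = measure(toy_minify, "a {\n    color: red;\n}\n\nb {\n    color: blue;\n}\n")
```

On an input this small the gzipped figure is not meaningful, because the gzip header dominates. Real test files are needed for it to say anything useful.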

The first two were fairly straightforward to calculate, but the third needed some extra programming to plug in packages or external services that validate the code, and there are some caveats:

  • Not all of the input is valid, so where the number of errors in the output equals the number of errors in the input, the result is considered valid
  • The HTML and CSS tests both use the W3C validator service. On certain errors the validator doesn't report the errors inside the affected tree, so if the minifier fixes such an error, the output may show more errors than the input when that tree has more errors underneath it, causing a false positive

Because of these false positives, the software lets you see the validation errors and view the output.

Scoring System

For each metric, a score will be calculated like this:

(100 / (Max - Min)) * (Value - Min)

This means that the slowest / lowest compression / least reliable will score 0 and the fastest / highest compression / most reliable will score 100, and other values will be graded between the highest and lowest.
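A minimal sketch of that scaling, in Python for illustration only. The function name `score` and the `higher_is_better` flag are assumptions. The flag is there because for timings the smallest value is the best one, so the scaled result has to be flipped for the fastest package to score 100. Returning 100 when every value is identical is also an assumption, since the formula is undefined in that case.

```python
def score(value: float, values: list[float], higher_is_better: bool = True) -> float:
    """Scale a value to 0-100 relative to the worst and best values seen."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 100.0  # assumption: everyone tied, so everyone gets full marks
    scaled = 100 / (hi - lo) * (value - lo)
    # For metrics such as time, where smaller is better, invert the scale.
    return scaled if higher_is_better else 100 - scaled


# Timings in seconds quoted in the CSS results: fastest, natxet/cssmin, slowest.
css_times = [0.047, 0.94, 8.4]
natxet_speed = score(0.94, css_times, higher_is_better=False)

# Lowest and highest compression ratios quoted in the CSS results.
css_ratios = [21.17, 24.63]
best_compression = score(24.63, css_ratios)
```

With those figures `natxet_speed` rounds to 89, which agrees with the speed score shown for natxet/cssmin in the CSS results. Only the minimum and maximum affect the scale, so the other packages' timings are not needed for this check.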

To get the totals, the scores are added up. Because there are two compression metrics (gzipped and non-gzipped), the reliability and speed scores are each multiplied by 2 first, so that the two compression scores together carry the same weight as each of the other metrics. The total is shown as a percentage of the maximum possible score.
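The weighting can be sketched as follows, again in Python purely for illustration. The function name `total` is hypothetical, and dividing by 6 (the sum of the weights) is an inference from the percentages in the results tables rather than something spelled out above.

```python
def total(reliability: float, speed: float, compression: float, gzip_score: float) -> float:
    """Weighted average: reliability and speed count double, compression is split in two."""
    weighted = 2 * reliability + 2 * speed + compression + gzip_score
    return weighted / 6  # 2 + 2 + 1 + 1 weights


# Scores taken from the HTML results table.
htmldoc_total = total(94, 44, 100, 100)
voku_total = total(93, 23, 60, 45)
```

Rounded, these give 79 and 56, matching the totals shown for hexydec/htmldoc and voku/html-min. Because the published per-metric scores are already rounded, a total recomputed this way can differ by a point from the published figure for some rows.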

A Note

Before looking at the results, I want to note that all the developers, including me, have worked very hard on their creations, and many of the packages are well tested and used by millions of websites every day.

Currently my software is only used by a handful of websites.

The Results

HTML Minification Test

[Image: HTML Minification Test Results]

The scores for the HTML test are as follows:

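| Minifier | Reliability | Speed | Compression | Gzip | Total |
| --- | --- | --- | --- | --- | --- |
| voku/html-min | 93 | 23 | 60 | 45 | 56% |
| mrclay/minify | 80 | 30 | 41 | 32 | 49% |
| taufik-nurrohman | 0 | 100 | 0 | 0 | 33% |
| deruli/html-minifier | 100 | 0 | 44 | 36 | 47% |
| hexydec/htmldoc | 94 | 44 | 100 | 100 | 79% |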

taufik-nurrohman’s minifier was the fastest, followed by mrclay/minify. Both beat the next fastest, which was mine, by about 8x. Unsurprisingly, both are RegExp based. All the compilers were slower, the slowest being deruli/html-minifier, which is a PHP port of the Blink tokeniser. It was 4x slower than the fastest compiler (mine), but this is not surprising since the original code was written in C++, which is much faster than PHP.

My software gave the highest compression at 11.00%. The next highest, voku/html-min, achieved 6.71%, which is 39% lower. Note that my HTML minifier also minifies inline CSS and Javascript, and I don't think any of the others do, which is perhaps why its compression was so much higher. Without compressing the inline CSS and JS, the compression ratio for my software would have been 8.2%.

All the compilers performed better than their RegExp counterparts on the compression metric.

The worst performer was taufik-nurrohman at 0.20%. Note that only the results considered valid were taken into account, so with that many errors in its results the compression was always going to be lower. Strangely, though, most of the results that were valid seemed to produce bigger output than the input.

All the minifiers tested showed more errors in the output than in the input on at least one test, although after a brief analysis of the input and output, it looks like most of them are false positives. Certainly for my software, the errors captured by the validator were all present in the input code, but for some reason were not reported for the input.

There was one exception to this, the Gist script by taufik-nurrohman, which showed output errors in over 90% of the test websites.

Even though there were false positives, I took the results as they came out for scoring.

Overall I am pretty happy with the performance of my software. Discounting the errors, which I think are all false positives, it was the fastest compiler and also produced the best compression, along with the highest overall score.

The script from taufik-nurrohman did not perform well. Whilst it was the fastest, it had the most errors and also by far the lowest compression.

The most popular project, mrclay/minify, performed well. A few more errors were reported than for my software, but again I think these are all false positives.

CSS Minification Test

[Image: CSS Minification Test Results]

The scores for the CSS test are as follows:

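| Minifier | Reliability | Speed | Compression | Gzip | Total |
| --- | --- | --- | --- | --- | --- |
| matthiasmullie/minify | 100 | 0 | 46 | 43 | 48% |
| tubalmartin/cssmin | 100 | 79 | 68 | 73 | 83% |
| mrclay/minify | 100 | 63 | 36 | 45 | 68% |
| natxet/cssmin | 100 | 89 | 0 | 0 | 63% |
| cerdic/css-tidy | 100 | 73 | 29 | 60 | 73% |
| websharks/css-minifier | 100 | 100 | 12 | 67 | 80% |
| wikimedia/minify | 100 | 100 | 37 | 55 | 82% |
| taufik-nurrohman | 100 | 99 | 28 | 7 | 72% |
| hexydec/cssdoc | 100 | 72 | 100 | 100 | 91% |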

All the contenders performed well here. No minifier produced more validation errors than there were in the input.

The lack of errors here is likely a result of the fact that CSS is easier to parse than other languages, due to its fixed and predictable format.

The fastest minifier here was wikimedia/minify. Along with websharks/css-minifier it blitzed the rest of the field, minifying all 12 test files in 0.047 seconds. The slowest was matthiasmullie/minify, which completed the tests in 8.4 seconds. With that one exception, the RegExp based minifiers were again all faster than the compilers. The fastest compiler was natxet/cssmin at 0.94 seconds.

All the contenders were very close in their compression ratio, which again is likely due to the predictable format of CSS. The lowest compression was natxet/cssmin at 21.17% (16.70% gzipped). The highest was achieved by my software at 24.63% (20.14% gzipped).

One anomaly here was matthiasmullie/minify. It mostly performed well on compression and would have completed the tests in good time, but for some reason it had trouble processing a couple of the test files: one had significantly lower compression than the others, and another took 6 of its 8 seconds overall to minify.

Here my software was not the fastest compiler, but it did achieve the highest compression. I ran these tests on Ubuntu, but on my Windows laptop it was by far the fastest compiler. I guess the OS and the platform's implementation of PHP make a difference to how the code performs.

Javascript Minification Test

[Image: Javascript Minification Test Results]

The scores for the Javascript test are as follows:

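| Minifier | Reliability | Speed | Compression | Gzip | Total |
| --- | --- | --- | --- | --- | --- |
| matthiasmullie/minify | 100 | 0 | 0 | 0 | 33% |
| mrclay/jsmin | 100 | 33 | 96 | 96 | 77% |
| tedivm/jshrink | 100 | 76 | 96 | 96 | 91% |
| wikimedia/minify | 100 | 86 | 97 | 94 | 94% |
| taufik-nurrohman | 0 | 100 | 7 | 0 | 34% |
| hexydec/jslite | 100 | 27 | 100 | 100 | 76% |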

For the Javascript tests, all the contenders produced valid results except for taufik-nurrohman, which had 2/10 invalid results. It was also the fastest, completing the tests in 0.18 seconds. wikimedia/minify was the next fastest at 1.75 seconds.

With regards to speed, the RegExp based minifiers were out front again (except one). But there are more linear consumers and my compiler in the mix, showing how much more difficult it is to parse Javascript than HTML and CSS. Indeed when I started my project it was RegExp based, but I rewrote it as a compiler because I found that I couldn’t achieve what I wanted to do using regular expressions alone.
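To illustrate why regular expressions alone run into trouble with Javascript, here is a small sketch in Python (not taken from any of the packages tested, and the function names are hypothetical). It compares a naive regular expression that strips `//` comments with a tiny linear consumer that walks the input once and remembers whether it is inside a string.

```python
import re


def strip_comments_regex(js: str) -> str:
    """Naive approach: delete everything from // to the end of the line."""
    return re.sub(r"//[^\n]*", "", js)


def strip_comments_scanner(js: str) -> str:
    """Linear consumer: one pass over the input, tracking string state."""
    out = []
    i = 0
    quote = None  # quote character of the string we are inside, if any
    while i < len(js):
        ch = js[i]
        if quote:
            out.append(ch)
            if ch == "\\" and i + 1 < len(js):
                out.append(js[i + 1])  # keep the escaped character untouched
                i += 1
            elif ch == quote:
                quote = None  # closing quote reached
        elif ch in "\"'":
            quote = ch
            out.append(ch)
        elif js.startswith("//", i):
            # Skip to the end of the line, leaving the newline in place.
            while i < len(js) and js[i] != "\n":
                i += 1
            continue
        else:
            out.append(ch)
        i += 1
    return "".join(out)


source = 'var url = "https://example.com"; // homepage\n'
broken = strip_comments_regex(source)    # 'var url = "https:\n'
intact = strip_comments_scanner(source)  # 'var url = "https://example.com"; \n'
```

The regular expression cuts the URL in half because it has no idea it is inside a string. Even the scanner is far from complete: it ignores template literals, regular expression literals, and block comments, each of which needs more context to handle safely.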

After the RegExp based minifiers, the Linear Consumers came next in the speed ranking, with my compiler and then matthiasmullie/minify bringing up the rear. My software had the highest compression ratio in both non-gzip and gzip with 33.97% and 26.67% respectively.

The slowest, with the lowest compression, was matthiasmullie/minify. Again there was an anomaly where it was unable to compress one of the test files to anywhere close to the ratio achieved by all the others. This had a big impact on its overall compression ratio, which ended up being about half that of the others (17.55%). Without this problem it would have been on par. It also struggled in the speed department, clocking in at a total of 11.4 seconds. Interestingly, on Windows it was one of the fastest, whilst the most popular, mrclay/jsmin, was the slowest.

I am happy to be leading the pack on compression ratio across the board here. My goal was to produce software that was fast enough to use on the fly with inline Javascript, so I didn’t want to get too crazy with the optimisations. I only picked the ones that were safe to perform and didn’t need too much context to know whether they were safe.

Hats off to the devs who write the Javascript based minifiers whose AST tracks all of the variables and scopes, and which are thus able to perform much more heavy-handed optimisations in a safe manner. That is something all the software packages here opted not to do, probably both for processing speed and to reduce the complexity of their software.

Conclusion

It is interesting how the structure of the input language determined the spread of speed and ratio, with the more predictably structured CSS showing the contenders much closer on all metrics. The difference in speed, ratio, and reliability between the RegExp based minifiers and the compilers was also interesting: there was a clear gap in speed, with the RegExp software being faster, whilst also being slightly less reliable and achieving lower compression.

The results here are just an indication of the performance and quality of the software packages presented. With different inputs, or running on other operating systems, the results will be different. Also, the scoring system weights each metric evenly against the others, whereas real world requirements may be different.

On quality, I think this is harder to measure than the benchmark presented here, which didn't take into account metrics like code coverage for tests, or whether the project is maintained.

What do you think? Please tell me your thoughts in the comments.

How My Software Performed

I am definitely happy with the performance of my own software. Whilst none of my packages were the fastest overall, which I expected, they were also not the slowest, and they proved to be reliable in all tests (bar the false positives, which all the competitors seemed to suffer from in the HTML test).

I am also proud that my software achieved the highest compression across the board, helped by the software I wrote to compare it to the competition.

In the scoring, my HTML and CSS minifiers came out in 1st place, with the Javascript minifier coming in 4th. Speed was the issue here, and being the only compiler in the list, it wasn't a surprise that it didn't score that well on the speed metric, although it wasn't the slowest and was only 1% behind the most popular package. Perhaps I will rework it into a linear consumer at some point.

For my own process, writing and testing this software has been extremely challenging, but has enabled me to learn new concepts, implementation techniques, and overall improve my process. Before starting these projects, I hadn’t:

  • Done any serious testing with PHPUnit
  • Published anything on Packagist
  • Used Composer or published anything that required it
  • Written a WordPress plugin
  • Done much with promises in Javascript
  • Produced a code coverage report
  • Used GitHub’s build system
  • Published any articles on dev.to

To all developers who are starting out or who haven’t published their own project yet, I would highly recommend it. You will be pushed out of your comfort zone, and be required to learn new techniques, processes, platforms and systems to be able to get your software to a point where it is stable, tested, documented, and basically good enough for other developers to pull into their production projects.

If you want to run these tests yourself, or want to try out my software, all my projects are available on GitHub and other platforms with the following links:
