DEV Community

James Moberg

PDF Generation, Bloat and Optimization

The State of my CFML PDF Generation

My current comparison is with CFDocument on the deprecated CF2016 Developer Edition. I plan to repeat the tests with CF2021 and CF2023 soon, on my personal developer workstation, using CommandBox and a testing framework that I'm in the process of developing. I have not personally downloaded or used CF2025 yet [1]. I believe that it may be using the same iText library as CF10. (Can anyone confirm this?) I also haven't compared against the Flying Saucer implementation in Lucee 5.3+.

This will be part of a series: I intend to run many more tests and to share tips we've learned for improving overall PDF quality, performance and file size.

Technologies

  • CFDocument: Built-in Adobe ColdFusion function that creates PDF output from a text block containing CFML and HTML. (NOTE: Some ACF overhead may occur due to adding the "Adobe ColdFusion Developer/Trial Edition - Not for Production Use" watermark.)

  • CFHTMLtoPDF: Built-in Adobe ColdFusion function that creates PDFs from HTML using a WebKit-based rendering engine. (NOTE: Some ACF overhead may occur due to adding the "Adobe ColdFusion Developer/Trial Edition - Not for Production Use" watermark.) I tried testing this on a CF2016 Standard production server, but got the error message "No Service manager is available." (We don't use CFHTMLTOPDF, so it may be disabled at the CFAdmin level.)

  • WKHTMLTOPDF (LGPLv3; portable): Tends to be faster and to generate smaller PDFs. It can also run concurrently and generate PDFs in the background without using a ColdFusion thread or impacting the Java heap memory.

  • Ghostscript (GNU Affero GPL license; portable): An interpreter for the PostScript(R) language and PDF files. It runs on various embedded operating systems and platforms including Windows, macOS, the wide variety of Unix and Unix-like platforms, and VMS systems.
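Because WKHTMLTOPDF and Ghostscript are standalone executables, they can be run as separate operating system processes. The following is a minimal sketch of that idea, written in Python rather than CFML for illustration; it is not the author's test script (that is in the gist linked under Source Code). The executable names (`wkhtmltopdf`, `gs`), the `/ebook` preset and the helper names are assumptions. On Windows the Ghostscript console executable is typically `gswin64c`.

```python
import shutil
import subprocess


def wkhtmltopdf_cmd(html_path, pdf_path, exe="wkhtmltopdf"):
    """Build the argument list for converting an HTML file to a PDF."""
    return [exe, "--quiet", str(html_path), str(pdf_path)]


def ghostscript_cmd(src_pdf, dst_pdf, exe="gs", preset="/ebook"):
    """Build the argument list for rewriting a PDF through Ghostscript's
    pdfwrite device. The preset is an assumption, not the author's setting."""
    return [
        exe,
        "-sDEVICE=pdfwrite",
        f"-dPDFSETTINGS={preset}",
        "-dNOPAUSE",
        "-dBATCH",
        "-dQUIET",
        f"-sOutputFile={dst_pdf}",
        str(src_pdf),
    ]


def run(cmd, timeout=60):
    """Run a command as a separate OS process and raise if it fails."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    subprocess.run(cmd, check=True, timeout=timeout)
```

A typical sequence would be `run(wkhtmltopdf_cmd("page.html", "raw.pdf"))` followed by `run(ghostscript_cmd("raw.pdf", "optimized.pdf"))`. Building the argument list separately from running it keeps the commands easy to log and inspect.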

Test an empty page with only a "Hello World" H1 header

There's a huge difference in generated file size between the engines, and Ghostscript optimization doesn't help much at this size (it even makes the CFHTMLTOPDF output larger).

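| Server | Engine | Generation | File size (bytes) | Ghostscript duration | Ghostscript file size (bytes) | Percent smaller |
|--------|--------|------------|-------------------|----------------------|-------------------------------|-----------------|
| CF2016 | CFDocument | 486 ms | 40,837 | 10 ms | 40,068 | 1.89% |
| CF2016 | CFHTMLTOPDF | 1,286 ms | 77,179 | 9 ms | 98,953 | -28.20% |
| CF2016 | WKHTMLTOPDF | 223 ms | 7,539 | 9 ms | 5,167 | 31.47% |
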
<!DOCTYPE html>
<html lang="en"><head>
<meta charset="utf-8">
</head>
<body>
<h1>Hello World</h1>
</body>
</html>
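
The "Percent Smaller" column in the table above can be reproduced with a short calculation. This sketch assumes the percentage is taken relative to the original (pre-Ghostscript) file size, so a negative value means Ghostscript made the file larger. The function name is hypothetical, and results may differ from the table in the last digit depending on rounding.

```python
def percent_smaller(original_bytes, optimized_bytes):
    """Return how much smaller the optimized file is, as a percentage
    of the original size. A negative result means the file grew."""
    if original_bytes <= 0:
        raise ValueError("original_bytes must be positive")
    return (original_bytes - optimized_bytes) / original_bytes * 100


# File sizes (bytes) taken from the "Hello World" table above.
for engine, before, after in [
    ("CFDocument", 40_837, 40_068),
    ("CFHTMLTOPDF", 77_179, 98_953),
    ("WKHTMLTOPDF", 7_539, 5_167),
]:
    print(f"{engine}: {percent_smaller(before, after):.2f}%")
```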

Test with a locally downloaded 4.7 MB NASA JPG image rendered at 400px wide

The file size gap is much wider here, especially for CFDocument, which turned the 4.7 MB image into a 24 MB PDF. Optimizing with Ghostscript reduces the CFDocument output to 74,478 bytes.

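| Server | Engine | Generation | File size (bytes) | Ghostscript duration | Ghostscript file size (bytes) | Percent smaller |
|--------|--------|------------|-------------------|----------------------|-------------------------------|-----------------|
| CF2016 | CFDocument | 8,046 ms | 24,311,787 | 50 ms | 74,478 | 99.693% |
| CF2016 | CFHTMLTOPDF | 2,256 ms | 120,258 | 1,017 ms | 136,211 | -11.71% |
| CF2016 | WKHTMLTOPDF | 1,638 ms | 492,115 | 45 ms | 21,012 | 95.73% |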

I also ran this test script on a production server for CFDocument to eliminate the watermark. The generated file size was consistently about 24 MB, but the Ghostscript result was 47,028 bytes (36.86% smaller than the watermarked result) and still more than twice as large as the optimized WKHTMLTOPDF output.

<!DOCTYPE html>
<html lang="en"><head>
<meta charset="utf-8">
</head>
<body>
<h1>Hello World</h1>
    <img src="/tempdirectory/images_yyyymmddHHnnss.jpg" width="400" height="248" border="1">
</body>
</html>
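
The "Generation" and "FileSize" columns in the tables above could be collected with a small timing harness. The sketch below is only an illustration of that measurement, not the testing framework the author is developing; `measure` and `fake_generator` are hypothetical names, and the placeholder generator writes dummy bytes instead of calling a real PDF engine.

```python
import os
import tempfile
import time


def measure(generate, pdf_path):
    """Time one generation call and report the resulting file size.

    `generate` is any callable that writes a file to `pdf_path`.
    """
    start = time.perf_counter()
    generate(pdf_path)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return {"ms": round(elapsed_ms), "bytes": os.path.getsize(pdf_path)}


def fake_generator(pdf_path):
    """Placeholder that writes 1,024 dummy bytes instead of a real PDF."""
    with open(pdf_path, "wb") as fh:
        fh.write(b"\0" * 1024)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print(measure(fake_generator, os.path.join(tmp, "out.pdf")))
```

In a real run, `fake_generator` would be replaced by a callable that invokes the engine under test. Averaging several runs would give steadier timings than a single call.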

[1] I haven't personally interacted with CF2025 beyond accessing it online via CFFiddle. I haven't had time to read the terms and do not wish to be legally bound by any rules that I may not fully understand.

Source Code

https://gist.github.com/JamoCA/b957c34cddea38f4bd2d777b41e348ac

Top comments (1)

Paolo Olocco

Hi James, how were you able to test CFHTMLTOPDF with CommandBox? I tried to configure the Adobe PDFServlet, but it wouldn't work.