We will keep this tutorial short and sweet. It walks through the steps for converting any HTML page to a PDF document, including its resource files such as images (PNG, JPG, SVG), styles and scripts.
We will use Flying Saucer, an open-source library that is built on an older, modified version of iText.
The main steps here are:
- Creating CSS for print (not covered here, because it differs for every platform and web page)
- Creating a custom iText renderer (we override the standard one with a small change that makes the PdfWriter available at an earlier stage)
- Creating a custom User-Agent (we override the standard one and add an option for showing SVG images in the generated PDF)
- Main Class
Let's start with the Main Class. In this class we request the page to get its source code. As an example, we will use the "where am i" web page, because it is simple and clean. You can get the source code in several ways; we recommend the following:
- Creating and executing an HTTP client (does not require any extra library)
- Using ChromeDriver and Selenium (slower, but a good fit for dynamic content, when you need the page as it looks after JavaScript has run)
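The first option can be sketched with the HTTP client that ships with the JDK (Java 11 or newer). This is only a minimal sketch, not the code used for this tutorial: the class name `PageFetcher`, its method names and the timeout values are assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

// Hypothetical helper: downloads the raw HTML of a page with the JDK's HTTP client.
final class PageFetcher {

    private PageFetcher() {
    }

    // Builds a plain GET request; the 30-second timeout is an arbitrary choice.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .timeout(Duration.ofSeconds(30))
                .GET()
                .build();
    }

    // Sends the request and returns the response body decoded as UTF-8.
    static String fetchHtml(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        HttpResponse<String> response = client.send(
                buildRequest(url),
                HttpResponse.BodyHandlers.ofString(StandardCharsets.UTF_8));
        if (response.statusCode() != 200) {
            throw new IOException(
                    "Unexpected HTTP status " + response.statusCode() + " for " + url);
        }
        return response.body();
    }
}
```

Calling `PageFetcher.fetchHtml(url)` from the Main Class would return the page source as a string, ready for the cleaning step. Treating every status other than 200 as an error is a deliberately strict choice, so that an error page is not silently converted to PDF.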
After that, we need to call the new HtmlCleaner().clean(html) method to clean the HTML. This matters because Flying Saucer expects well-formed XML (XHTML) as its input.
Now we have cleaned, prepared HTML that we need to convert into a PDF document. First of all, you need to create an OutputStream that points to the destination of your PDF. You can create the destination file dynamically, so that opening the stream does not fail just because the file does not exist yet.
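One way to read "dynamic creation" is to create any missing parent folders before opening the stream. The following is a minimal sketch under that assumption, using only the JDK; `PdfDestination` and `open` are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: opens the PDF destination, creating missing folders first.
final class PdfDestination {

    private PdfDestination() {
    }

    // Creates the parent directories if needed, then opens the file for writing.
    // With no open options, Files.newOutputStream creates the file or truncates
    // an existing one.
    static OutputStream open(Path target) throws IOException {
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.newOutputStream(target);
    }
}
```

The stream can still fail for other reasons, such as missing write permissions, so the caller should keep handling IOException and close the stream when the PDF is finished.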
After that, you need to create the PdfWriter yourself. For that, create a Rectangle (A3 format in our case) and an iText Document that uses it as its page size. Then get the writer instance from the Document and the OutputStream, which should point to the PDF document.
When you are done with that, initialize a new ITextRenderer. This will be our modified one: its constructor takes two parameters, the PdfWriter and the Document.
Next, adding an ITextFontResolver is optional, but you should set the CustomUserAgent, which takes care of showing our SVG images (this library does not support that by default).
Finally, call the createPdf function on the renderer.
Custom iText User-Agent
This part is very important when you are generating a PDF: since this library does not support showing SVG images, we have to do that ourselves.
First of all, you need to override the getImageResource method. Inside it, call our new function that takes care of showing the SVG image if the extension is .svg; otherwise, call the method from the parent class.
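The extension check itself needs nothing from Flying Saucer. Here is a minimal sketch of it in plain Java; `SvgUris.isSvg` is a hypothetical helper, and ignoring the query string, the fragment and letter case is an assumption about how your image URLs look.

```java
import java.util.Locale;

// Hypothetical helper: decides whether an image URI points to an SVG file.
final class SvgUris {

    private SvgUris() {
    }

    // Returns true when the path part of the URI ends with ".svg".
    // The query string and fragment are cut off first, and case is ignored.
    static boolean isSvg(String uri) {
        if (uri == null) {
            return false;
        }
        String path = uri;
        int query = path.indexOf('?');
        if (query >= 0) {
            path = path.substring(0, query);
        }
        int fragment = path.indexOf('#');
        if (fragment >= 0) {
            path = path.substring(0, fragment);
        }
        return path.toLowerCase(Locale.ROOT).endsWith(".svg");
    }
}
```

Inside the overridden getImageResource, a check like this would decide between the SVG rendering function and the parent implementation.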
After that, you need the function that renders the SVG image. It must use the PdfWriter that we created in the Main Class; if it uses a different writer, the PDF will not come out correctly.
Custom iText Renderer
The last step is our custom iText renderer. We need it because we want to pass the same PdfWriter to the createPdf function and to our custom User-Agent. All you need to do is initialize _writer in the constructor and set the writer on the ITextOutputDevice.