Ardas Group Inc.
How to Preview Document or File in a Browser for SaaS

File preview seems like such a common feature that many developers assume a quick Google search for 'How to preview Word document in browser' will turn up a perfect solution for their project. After all, how can something as popular as file preview be hard to achieve? In this article, we analyze how to build a web application that solves this problem for the different types of documents used in SaaS projects.

Why do you need to view a doc file in the browser?

One of the main reasons is that in-browser preview reduces the time needed to find and check documents. Users are often reluctant to download documents because downloads clutter their devices. Also, not everyone has the software needed to view documents in every format.

The component we developed is used to preview both uploaded documents and documents generated from a template.

Why view a doc file in the browser for Fintech SaaS Solutions

Accounting and ERP systems often require files to be attached in order to comply with reporting rules. In these situations it is not possible to get rid of paper completely and switch to digital, so scans have to be stored, and a document preview spares users a long search for the document they need.

Fintech project where users upload invoices and financial reports.

A receipt from the project where the amounts and other data were recognized.

As a rule, any system that turns paper documents into digital information requires scanning and recognition. Personal finance accounting systems are one example.

Why view a PDF file in a web browser for CRM, HR and the Rest of Human Resource Systems
Information about a client or an employee is the main thing such systems store, and it is often necessary to attach a scan of a passport or a cooperation agreement. A file preview speeds up finding the necessary information.

HR, staffing, and B2B systems, where employees upload scans of their documents or contracts of the serving companies.
Why display a Word document in HTML for Legaltech Systems
Legaltech is currently one of the areas lagging furthest behind in technical development. Its work processes are quite conservative, so it still deals with a lot of paper. Users most often need documents in the system in two cases:

for easy storage, accounting and printing at the right time;
for document recognition and translation into a digital version;

Legaltech systems in which you need to store and manage any documents, including separate templates, and separately filled ones.

As an example, a contract in a Word document from one of our past fintech SaaS projects.

Why preview a Word document in the browser for the Healthcare Industry

A hospital management system typically handles a large number of documents and images (X-rays, ultrasounds, patient photos, etc.). Besides the variety of document types, these files more often than usual come in uncommon formats that classical solutions cannot process.

X-rays saved as an image on the server, which can be previewed in the browser in the healthcare SaaS project.

Is it secure to open a doc in the browser for preview?

Please note that off-the-shelf solutions have a big security issue: you have to send files to third-party services. Our team solves this problem on its own so that the data stays on your own server.

How do we ensure the safety of your data:

File transfer occurs via HTTPS / SSL;

Safe storage with Amazon S3, Google Cloud or Microsoft Azure;

For secure storage solutions, our team prefers to work with Amazon S3 due to features such as:

Low latency and high throughput;

Storing objects with 99.999999999% durability across multiple AZs;

Resilient to events affecting the entire Availability Zone;

Designed for 99.99% availability over a given year;

Availability backed by the Amazon S3 Service Level Agreement;

SSL support for data transfer and data encryption at rest;

S3 lifecycle management to automatically migrate objects to other S3 storage classes.

Off-the-shelf solutions: pros and cons

Free online document viewers do exist. Unfortunately, there are only two, and neither is perfect, but they greatly expand your options. So here are our money savers: Google Docs and Office Web Apps.

Preview files with Google Docs Viewer

This is not an official solution: Google provides no documentation on how to use it properly, but developers figured it out anyway. Although Google Docs Viewer is no longer supported, it still works!

Pros:

Many supported file types, so you will probably find every type you would like to preview: images, videos, text, code, Microsoft Office file types, PDFs, Adobe file types, SVGs, font file types, archive file types and more;

25MB file limit;

Works on every popular desktop and mobile browser which is very important if you want to make a preview on mobile devices.

Cons:

Along with the lack of support from Google, it tends to throw random errors that result in no preview at all. What's more, there is no way to check whether it failed: the inline embed gives you no information about it (no browser event or anything);

Microsoft file types like .ppt, .doc, .xls, etc. are not Google file types, so the viewer has some trouble displaying them. They will still show up, but in .doc files, for example, some images might jump to the next line or page instead of appearing in a row.
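Since Google publishes no documentation for the viewer, the following is only a minimal sketch of how such an embed URL is commonly built. The `gview` endpoint and its `url` and `embedded` parameters are the pattern that circulates among developers, not an official API, and `googleViewerUrl` is a hypothetical helper name. The file must be reachable by Google's servers, which is exactly the third-party exposure discussed earlier.

```javascript
// Hypothetical helper: builds an embeddable Google Docs Viewer URL
// for a publicly reachable file.
// Assumption: the endpoint and its `url` / `embedded` parameters follow the
// commonly circulated, undocumented pattern and may stop working at any time.
function googleViewerUrl(fileUrl) {
  return (
    "https://docs.google.com/gview?url=" +
    encodeURIComponent(fileUrl) +
    "&embedded=true"
  );
}
```

The result would typically be used as the `src` of an `iframe`. Because the embed reports no failure event, the page cannot tell from the URL alone whether the preview actually rendered.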

Preview files with Office Web Apps
Microsoft also offers its own solution for previewing files on your website. It is surely the best option for Office file types, because it is the best at parsing them into HTML.

Pros:

Faster loading than Google Docs;

Always successfully displays the result - no random errors;

Most accurate .doc and .ppt parser.

Cons:

Supports only Microsoft Office file types: .ppt(x), .doc(x), .xls(x);

10MB limit for docs/ppts, 5MB for xls;

Little or no mobile support: it throws errors, displays nothing, and is not responsive below ~700px width.
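The file type and size limits above can be checked before embedding. Here is a minimal sketch, assuming the widely used `view.officeapps.live.com/op/embed.aspx?src=` embed address and treating the limits as binary megabytes. The helper names `fileExtension` and `officeViewerUrl` are hypothetical.

```javascript
const MB = 1024 * 1024; // assumption: the limits are counted in binary megabytes

// Size limits per extension, taken from the list above.
const OFFICE_LIMITS = new Map([
  ["doc", 10 * MB], ["docx", 10 * MB],
  ["ppt", 10 * MB], ["pptx", 10 * MB],
  ["xls", 5 * MB], ["xlsx", 5 * MB],
]);

// Returns the lowercase extension of the file a URL points to, or "" if none.
function fileExtension(fileUrl) {
  const path = new URL(fileUrl).pathname;
  const dot = path.lastIndexOf(".");
  return dot === -1 ? "" : path.slice(dot + 1).toLowerCase();
}

// Returns an embed URL, or null when the file type is unsupported
// or the file exceeds the limit for its type.
function officeViewerUrl(fileUrl, sizeBytes) {
  const limit = OFFICE_LIMITS.get(fileExtension(fileUrl));
  if (limit === undefined || sizeBytes > limit) {
    return null;
  }
  return (
    "https://view.officeapps.live.com/op/embed.aspx?src=" +
    encodeURIComponent(fileUrl)
  );
}
```

Returning `null` lets the caller fall back to another viewer or to a plain download link.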

You may think that creating a custom document and file preview is more complicated and expensive. We can assure you that it is not as scary as it sounds, and it is much more secure, especially if you deal with personal information.

Our Experience with a custom document previewer for the browser
In one of our projects, we had to implement a component for previewing documents in the following formats: jpeg, png, tiff, pdf, xls, xlsx, doc, docx.

The component had to provide the following functionality:

Page through the document, scroll;

Enlarge / Reduce Document Page;

Download document;

Print document.

Since the documents are confidential, they should not be processed on third-party resources.
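The paging and zoom requirements listed above come down to a small piece of viewer state. The sketch below is illustrative only and not the project's code. The action names, the zoom bounds and the zoom step are hypothetical values chosen for the example.

```javascript
// Hypothetical zoom settings for the example.
const MIN_ZOOM = 0.5;
const MAX_ZOOM = 3;
const ZOOM_STEP = 0.25;

// Keeps a value inside the [min, max] range.
function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

// Pure state transition: takes { page, pageCount, zoom } and an action name,
// returns the next state without mutating the input.
function updateViewer(state, action) {
  switch (action) {
    case "next-page":
      return { ...state, page: clamp(state.page + 1, 1, state.pageCount) };
    case "prev-page":
      return { ...state, page: clamp(state.page - 1, 1, state.pageCount) };
    case "zoom-in":
      return { ...state, zoom: clamp(state.zoom + ZOOM_STEP, MIN_ZOOM, MAX_ZOOM) };
    case "zoom-out":
      return { ...state, zoom: clamp(state.zoom - ZOOM_STEP, MIN_ZOOM, MAX_ZOOM) };
    default:
      return state;
  }
}
```

Because the function is pure, it can be plugged into a React `useReducer` hook or tested on its own.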

Word document preview

The diagram below shows two processes. The first is uploading documents and converting them. The second is opening documents for preview. Depending on the interface, they can happen one right after the other, but upload and conversion can also happen just once, while the preview is performed repeatedly.

In the first process, the user uploads the document to the webserver, and the original is saved on the file server. The webserver then sends the document to Gotenberg, which converts it into PDF. As a result, we have two documents: the original and the converted one. If the original is not needed, it can be deleted.

When the document has already been saved in pdf, the webserver writes all the necessary information to the database to bind the path to this document, depending on the task.
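The conversion step in this process is a single HTTP call. The sketch below assumes Gotenberg 7 or later, where Office documents are posted to the `/forms/libreoffice/convert` route in a multipart field named `files`. Older versions use a different route. The function names are hypothetical, and the example is written in JavaScript (Node 18+ for the global `fetch`, `FormData` and `Blob`) for illustration, whereas the server described here is written in Java.

```javascript
// Builds the multipart request for Gotenberg's LibreOffice conversion route.
// Assumption: Gotenberg 7+, route /forms/libreoffice/convert, form field "files".
function buildConvertRequest(gotenbergBaseUrl, fileName, bytes) {
  const body = new FormData();
  // The file name matters: the converter uses its extension to detect the format.
  body.append("files", new Blob([bytes]), fileName);
  const url = gotenbergBaseUrl.replace(/\/+$/, "") + "/forms/libreoffice/convert";
  return { url, body };
}

// Sends the document to Gotenberg and returns the converted PDF as bytes.
async function convertToPdf(gotenbergBaseUrl, fileName, bytes) {
  const { url, body } = buildConvertRequest(gotenbergBaseUrl, fileName, bytes);
  const response = await fetch(url, { method: "POST", body });
  if (!response.ok) {
    throw new Error("Conversion failed with status " + response.status);
  }
  return new Uint8Array(await response.arrayBuffer());
}
```

A call such as `convertToPdf("http://localhost:3000", "contract.docx", bytes)` would return the PDF to store next to the original. The host and port here are placeholders for wherever the Gotenberg container runs.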

As for the preview, in this case, the user clicks on the preview button, thereby sending a request to the webserver. In turn, the latter sends a request to the database, which already has all the information, including the path to the document. If the server gives a positive answer, then the document can be downloaded and sent to the browser, where it will be displayed using the Javascript libraries.
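The lookup part of the preview request can be sketched with an in-memory map standing in for the database. This is an illustration under that assumption, with hypothetical names (`createDocumentIndex`, `register`, `resolve`). A real system would query its own database and add access checks.

```javascript
// In-memory stand-in for the database table that maps
// document IDs to the path of the converted PDF on the file server.
function createDocumentIndex() {
  const pdfPaths = new Map();
  return {
    // Called after conversion, when the PDF has been saved.
    register(documentId, pdfPath) {
      pdfPaths.set(documentId, pdfPath);
    },
    // Called on a preview request; reports whether the document is known.
    resolve(documentId) {
      if (!pdfPaths.has(documentId)) {
        return { found: false };
      }
      return { found: true, pdfPath: pdfPaths.get(documentId) };
    },
  };
}
```

On a positive answer, the webserver reads the file at `pdfPath` and streams it to the browser. On a negative one, it can return a "not found" response instead.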

Download and preview user documents - sequence diagram
Next, we had to develop a second process, in which a document template must be filled in automatically. In this case, the user sends the webserver the ID and the required document. After receiving the template, the webserver replaces the keys in it with user data from the database.

The final version goes to Gotenberg, where it is converted to PDF. After receiving the file, the webserver sends it to the file server to be saved. As in the previous case, the webserver records the path to the document in the database, so the next request from the user finds a ready document.
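The key replacement step can be illustrated on plain text. The sketch assumes placeholders written as `${key}`, which is a hypothetical syntax chosen for the example, and it works on a string, not on a real docx file. In an actual Word document a placeholder may be split across several XML runs, which is one reason to use a dedicated library on the server.

```javascript
// Replaces every ${key} placeholder with the matching value from `data`.
// Placeholders with no matching key are left untouched so they stay visible.
// Assumption: the ${key} syntax is hypothetical, chosen for this example.
function fillTemplate(templateText, data) {
  return templateText.replace(/\$\{(\w+)\}/g, (placeholder, key) => {
    const known = Object.prototype.hasOwnProperty.call(data, key);
    return known ? String(data[key]) : placeholder;
  });
}
```

For example, filling `"Dear ${name}, total: ${amount}"` with `{ name: "Ann", amount: 10 }` gives `"Dear Ann, total: 10"`.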

Preview generated documents - sequence diagram
Now let’s review the development process in detail.

Client-side implementation (frontend, Javascript)
The first issue we faced is that the browser does not support all formats: it can show pictures and PDFs, but not the rest (spreadsheets, Word documents, TIFF). For any exotic format, the system can be extended by adding a converter from that format to PDF.

For implementation, we adhered to the following principle of work:

To display a PDF document, the @mikecousins/react-pdf component is used;

If you need to display a picture in PNG or JPEG format, then the jsPDF library is used, which creates a PDF file;

If you need to display a picture in TIFF format, then the tiff library is used, which converts the image and transfers the data to jsPDF;

The print-js library is used to print PDF files.

In this case, we used libraries such as:

jsPDF - used to create pdf files;
@mikecousins/react-pdf - react component, used to display pdf files;

TIFF - converts a tiff image to canvas;

Print-js - print PDF files;

File-saver - for downloading files.
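The routing rules described above can be summarized as a small dispatch function. This is a sketch with hypothetical strategy names. It only decides which path a file takes and does not call any of the listed libraries.

```javascript
// Maps a file name to the preview path described above.
// The strategy names are hypothetical labels for this example.
function pickPreviewStrategy(fileName) {
  const dot = fileName.lastIndexOf(".");
  const extension = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  switch (extension) {
    case "pdf":
      return "render-pdf"; // shown directly by the PDF component
    case "png":
    case "jpg":
    case "jpeg":
      return "image-to-pdf"; // wrapped into a PDF in the browser
    case "tif":
    case "tiff":
      return "tiff-to-pdf"; // decoded to canvas first, then wrapped into a PDF
    case "doc":
    case "docx":
    case "xls":
    case "xlsx":
      return "server-convert"; // converted to PDF on the server
    default:
      return "unsupported";
  }
}
```

Keeping the decision in one place makes it easy to add a new format later: a new `case` plus a converter to PDF.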

Server-side implementation
Many libraries did not fit: they had problems with encodings in different languages, and formatting was not fully supported (colors disappeared, text indents broke, italics were lost, etc.).

The solution was the open-source project Gotenberg, which works like a charm. It is based on the LibreOffice engine, so it does not have these problems. Gotenberg is a Docker-powered stateless API for converting HTML, Markdown, and Office documents to PDF.

Preview Excel document with graphs

Our team had two tasks: first, simply display the uploaded documents, and second, display templates that are filled in automatically from an attached database.

Documents are generated from a template in the docx format, which is essentially an archive of XML files, so we use the Docx4j library. It opens OpenOffice and MS Office documents so that you can work with them in Java: modify the document and then save it to the server.

Contents of docx or xlsx files
The docx, xlsx file format is a zip archive containing XML text, graphics, and other data.
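Because these files are zip archives, an upload can be sanity-checked by its first bytes before it is sent for conversion. The sketch below checks for the standard zip local file header signature (`PK` followed by bytes 3 and 4). The function name is hypothetical, and the check only shows that the file is a zip container, not that it is a valid docx or xlsx.

```javascript
// Returns true when the bytes start with the zip local file header
// signature 0x50 0x4B 0x03 0x04 ("PK\x03\x04").
function looksLikeZipContainer(bytes) {
  return (
    bytes.length >= 4 &&
    bytes[0] === 0x50 &&
    bytes[1] === 0x4b &&
    bytes[2] === 0x03 &&
    bytes[3] === 0x04
  );
}
```

A file named `.docx` that fails this check is likely mislabeled or corrupted, so it can be rejected before conversion.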

Summing up, the implementation used React on the client side and Java, Docker, and Gotenberg on the server side.

Final Thoughts on how to preview a document in web browser
Looked at in detail, implementing document previews in a browser is not a complicated process, provided the security requirements are met and the right technologies are selected.

However, if you do not use ready-made solutions to open a Word doc in the browser, it is better to get help from an experienced team that has already solved a similar problem. Ardas is ready to assist you with displaying Word documents in the browser as well as file previews for your project. Please contact us for details.

Originally taken from https://ardas-it.com/how-to-preview-document-or-file-in-a-browser-for-saas
