Discussion on: Converting Word to PDF Using A Python-Based Lambda

A few things I changed in 2021 to make this work on the Python 3.8 runtime in Lambda:

According to the brotlipy API documentation, change decompressor.process to decompressor.decompress.
Build or copy the brotlipy dependency in a Linux environment, since the target Lambda runtime is Amazon Linux.
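
A minimal sketch of where the renamed call sits, assuming the archive is read in chunks. The function name stream_decompress and the chunk size are hypothetical, not from the original post; any object with a decompress(bytes) method, such as brotlipy's brotli.Decompressor(), can be passed in.

```python
def stream_decompress(decompressor, src, dst, chunk_size=1024 * 1024):
    """Read src in chunks, decompress each chunk, and write the result to dst.

    decompressor: any object exposing decompress(bytes) -> bytes
    src, dst:     binary file-like objects
    Returns the number of decompressed bytes written.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        # With brotlipy this is the call that used to be decompressor.process(chunk).
        out = decompressor.decompress(chunk)
        dst.write(out)
        total += len(out)
    return total
```

With brotlipy installed, the call would look roughly like stream_decompress(brotli.Decompressor(), archive_file, buffer), where archive_file is the compressed LibreOffice archive opened in binary mode.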

Create fonts/fonts.conf in your dependency package with the following content (assuming LibreOffice is extracted under /tmp/instdir):

<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
<dir>/tmp/instdir/share/fonts/truetype</dir>
<cachedir>/tmp/fonts-cache/</cachedir>
<config></config>
</fontconfig>

Set these environment variables:
FONTCONFIG_FILE=/var/task/fonts/fonts.conf
HOME=/tmp

Update the return statement from '{}/program/soffice'... to '{}/program/soffice.bin'...
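
If you prefer not to set the environment variables in the Lambda configuration, here is a minimal sketch that sets the same two values from Python before soffice is launched. The helper name configure_fontconfig is hypothetical; the paths are the ones listed above.

```python
import os


def configure_fontconfig(env=None):
    """Set the variables fontconfig and LibreOffice rely on inside Lambda.

    env defaults to os.environ, so child processes started with
    subprocess inherit the values. Passing a dict is useful for testing.
    """
    env = os.environ if env is None else env
    # Points fontconfig at the fonts.conf shipped in the deployment package.
    env["FONTCONFIG_FILE"] = "/var/task/fonts/fonts.conf"
    # /tmp is the writable location in Lambda, so use it as the home directory.
    env["HOME"] = "/tmp"
    return env
```

Call configure_fontconfig() once at the top of the handler module, before the first conversion.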

To run LibreOffice, I used subprocess in Python. Please note that you have to call the command twice to make it work (reason still unknown).

import subprocess

soffice_path = load_libre_office()  # helper from the post being discussed
word_file_path = "/tmp/file.docx"
conv_cmd = f"{soffice_path} --headless --norestore --invisible --nodefault --nofirststartwizard --nolockcheck --nologo --convert-to pdf:writer_pdf_Export --outdir /tmp {word_file_path}"
# Run the conversion; if the first call fails, run it a second time.
response = subprocess.run(conv_cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
if response.returncode != 0:
    response = subprocess.run(conv_cmd.split(), stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if response.returncode != 0:
        print("cannot convert this document to pdf")
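
The retry above can be wrapped in a small helper. This is only a sketch: convert_to_pdf and its attempts parameter are hypothetical names, not part of the original post. It passes the same flags, but builds the command as a list, so a file path containing spaces is not broken apart by split().

```python
import subprocess


def convert_to_pdf(soffice_path, word_file_path, outdir="/tmp", attempts=2):
    """Run soffice up to `attempts` times and return the last CompletedProcess.

    Assumes attempts >= 1. The caller checks returncode to see whether
    the conversion finally succeeded.
    """
    cmd = [
        soffice_path,
        "--headless", "--norestore", "--invisible", "--nodefault",
        "--nofirststartwizard", "--nolockcheck", "--nologo",
        "--convert-to", "pdf:writer_pdf_Export",
        "--outdir", outdir,
        word_file_path,
    ]
    result = None
    for _ in range(attempts):
        result = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        if result.returncode == 0:
            break  # converted, no need to retry
    return result
```

Usage would be along the lines of result = convert_to_pdf(load_libre_office(), "/tmp/file.docx"), followed by a check of result.returncode and, on failure, a look at result.stderr.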

One thing to note: I didn't wrap pdf:writer_pdf_Export in quotes, as in ... --convert-to "pdf:writer_pdf_Export"..., because it won't work. Many bloggers wrote this command wrong, which makes the conversion fail.
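
A likely explanation, offered as an assumption rather than something confirmed in the original post: the command is split with str.split() and run without a shell, so nothing strips the quotes and they reach soffice as part of the filter name. The small sketch below shows the difference between str.split() and shell-style splitting with shlex.split().

```python
import shlex

fragment = '--convert-to "pdf:writer_pdf_Export"'

# str.split() only splits on whitespace, so the quote characters
# stay inside the second argument.
plain = fragment.split()

# shlex.split() applies shell-style quoting rules and removes the quotes.
shell_like = shlex.split(fragment)

print(plain)       # ['--convert-to', '"pdf:writer_pdf_Export"']
print(shell_like)  # ['--convert-to', 'pdf:writer_pdf_Export']
```

So if a command copied from a blog post contains quotes, either remove them or split the string with shlex.split() before passing it to subprocess.run.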

Enjoy serverless LibreOffice with Python. Cheers!