Quite often, I have to submit the same documents in either
.jpg/jpeg format. I used online conversion sites to convert between the two formats. Recently a concern had been lingering in my mind that why am I uploading my personal documents on clouds and shouldn't I just create a converter so that the process stays local on my device. Hence, I decided to make a program to do the conversions.
I'd like to include this as a documented step :) The reason is short: I do web development in my day-to-day life, so I decided that coding this small program in Python will be refreshing.
You can either implement conversions on your own or you can look for packages that are well-maintained and have a good download count. I chose following packages for conversions:
I used PAGE to build static GUI. It is a very simple and easy-to-use drag-and-drop GUI generator. I made two GUIs, one for the converter and the other for showing success message. You can also handle all in one.
Note: tkinter applications look different on different OS. I use Windows. If you want to make application same across all platforms, you can add conditional layouts.
After generating scripts from PAGE, I made the GUI dynamic on the basis of
- If the
conversion_mode == 'jpg', then set text accordingly and show entry field for inputting name of output PDF file.
- Similarly, if the
conversion_mode == 'pdf', then make relevant textual changes. No need to add entry field as the output JPGs will have the PDF's name followed by index.
I used tkinter's filedialog module to upload files. I set
initialdir to the directory where script is placed.
curr_dir = os.getcwd()
Upload PDF: Since single PDF file will be uploaded, hence
filedialog.askopenfilename(initialdir=curr_dir, title="Select PDF", filetypes=[("PDF File", "*.pdf")])
Upload JPGs: Multiple JPG files can be uploaded to create a PDF file, hence
filedialog.askopenfilenames(initialdir=curr_dir, title="Select JPG(s)", filetypes=[("JPG File", "*.jpeg *.jpg")
To convert JPG/JPEG files to a single PDF, simply provide
file_paths, which is a list of image file paths, to img2pdf's
import img2pdf with open(name+".pdf", "wb+") as file: file.write(img2pdf.convert(*file_paths))
name is the output PDF name provided by the user.
pdf2image, firstly download
poppler on your system. (Instructions). Since I'm a Windows user, I downloaded poppler and placed the
bin/ folder in the same directory as the script.
poppler_path=r"./bin" if (sys.platform == "win32" or sys.platform == "cygwin") else None
Image files created are stored in
output_folder. To avoid out-of-memory error, I set
paths_only=True so that
convert_from_path(...) doesn't need to return PIL Image objects.
from pdf2image import convert_from_path from pdf2image.exceptions import ( PDFInfoNotInstalledError, PDFPageCountError, PDFSyntaxError ) convert_from_path( file_path, output_folder=output_folder_path, output_file=output_folder_name, fmt="jpg", paths_only=True, poppler_path=r"./bin" if (sys.platform == "win32" or sys.platform == "cygwin") else None )
You can run the function in chunks (see) if the PDF file has hundreds of pages or you want to work on Image object and are concerned about running into OOM error.
After conversion, I shift the user to different interface where success message is displayed.
Get the full implementation from my GitHub.
So this is how I made a PDF ⇌ JPG converter that fulfills my requirements and now I don't have to worry about security/privacy.
Share your thoughts. Feedback's always welcome :)