PDFs are great for distribution because they open on virtually any device with a PDF reader. Many people, however, prefer listening to their content over reading it. Fortunately, we can use Python to turn any PDF into an audio file that plays on most popular devices.
The first step in creating a PDF reader with Python is installing the right libraries. In this tutorial, we’ll be using PyPDF2 and pyttsx3. PyPDF2 is a library for reading and manipulating PDF files, while pyttsx3 converts the extracted text into speech.
These libraries are not installed by default, so we need to install them using pip. We can do this by opening a command prompt or terminal and entering the following command:
pip install PyPDF2 pyttsx3
1. Read PDF file using Python
Reading plain files in Python is straightforward: we open them with the built-in open() function.
To read the contents of a PDF, however, we need the PyPDF2 library. We create a PdfReader object and pass it the path of the PDF file. (Older PyPDF2 releases called this class PdfFileReader; that name was deprecated and then removed in version 3.0.)
from PyPDF2 import PdfReader
# create a PdfReader object
reader = PdfReader("/path/to/file.pdf")
2. Get the page to read
A PDF file may contain multiple pages. So we need to specify which page we want to read.
PyPDF2 exposes the pages through the reader's pages property, which returns a list-like sequence of all the pages in the PDF. We can find the number of pages by taking its length and fetch a single page by index. (The numPages property from older releases was removed in PyPDF2 3.0.)
number_of_pages = len(reader.pages)
page = reader.pages[0]
We can loop through the pages when converting a large PDF into an audiobook.
for page in reader.pages:
    ...  # do the rest
# or
for i in range(len(reader.pages)):
    page = reader.pages[i]
    ...  # do the rest
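For an audiobook, the per-page text usually ends up in one string. The following is a minimal sketch of that loop, assuming each page object exposes an extract_text() method as in current PyPDF2 releases. The name collect_text is a hypothetical helper, not part of the library.

```python
def collect_text(pages):
    """Join the text of every page into one string, skipping empty pages."""
    parts = []
    for page in pages:
        # Scanned or image-only pages may yield an empty string (or None)
        text = (page.extract_text() or "").strip()
        if text:
            parts.append(text)
    # Separate pages with a blank line so the boundaries stay visible
    return "\n\n".join(parts)
```

With a reader from the earlier step, this could be called as collect_text(reader.pages).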
3. Convert PDF to text using Python
Once we decide which page to read, we need to extract the text content from that page. In PyPDF2, we can use the page's extract_text() method (named extractText() before version 3.0).
text = page.extract_text()
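Text extracted from a PDF often keeps the line breaks and end-of-line hyphens of the printed layout, which can make speech sound choppy. The sketch below shows one possible cleanup pass using only the standard library. The helper name clean_text is an assumption for illustration, and the hyphen rule is a rough heuristic: it also merges genuine compounds that happen to break at a line end.

```python
import re


def clean_text(raw):
    """Tidy extracted PDF text so a speech engine reads it more smoothly."""
    # Rejoin words hyphenated across a line break: "exam-\nple" -> "example"
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Collapse every run of whitespace (including newlines) to a single space
    text = re.sub(r"\s+", " ", text)
    return text.strip()
```

It would be applied to the extracted page text before handing it to the speech engine.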
4. Configure the pyttsx3 engine
Now that we have the text content of our PDF, we need to convert it into audio. For this, we'll use the pyttsx3 library.
pyttsx3 is a text-to-speech conversion library for Python. It works offline and runs on multiple platforms, including Windows, Linux, and macOS.
The first step is to create an engine instance by calling pyttsx3.init().
import pyttsx3
engine = pyttsx3.init()
Once the engine is ready, we can set the voice (male, female), volume, and speaking rate, all of which are optional.
Set the voice—male/female
To set the voice, we first need the list of voices available in the engine.
voices = engine.getProperty('voices')
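Now, we can set the voice property on the engine. On a typical Windows installation, index 0 is a male voice and index 1 a female voice, but the available voices and their order vary by platform, so check the list on your own machine.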
#Male
engine.setProperty('voice', voices[0].id)
#Female
engine.setProperty('voice', voices[1].id)
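Because the order of the voices list differs between machines, picking a voice by name can be more robust than a fixed index. Here is a minimal sketch, assuming each voice object has the name and id attributes that pyttsx3 voices provide. The function pick_voice_id is a hypothetical helper, not part of pyttsx3.

```python
def pick_voice_id(voices, keyword):
    """Return the id of the first voice whose name contains keyword."""
    for voice in voices:
        # name may be missing on some drivers, so fall back to ""
        if keyword.lower() in (voice.name or "").lower():
            return voice.id
    # No match: fall back to the first installed voice, if there is one
    return voices[0].id if voices else None
```

The returned id can be passed to engine.setProperty('voice', ...) once it has been checked for None. The keyword depends on the voices installed on your system.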
Set reading speed
We can change the reading speed by setting the rate property to the desired words per minute.
engine.setProperty('rate', 150)
If we only want to speed up (or slow down) the reading rather than set a specific rate, we can read the current rate and adjust it.
# refer to the current value
rate = engine.getProperty('rate')
engine.setProperty('rate', rate+50)
Change the volume
In pyttsx3, the volume ranges from 0 to 1: 0 mutes the voice, and 1 sets it to maximum.
engine.setProperty('volume', 0.75)
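The rate and volume settings can be wrapped in one small function that guards against out-of-range values. This is only a sketch: apply_settings is a hypothetical helper, and the lower bound of 50 words per minute is an arbitrary assumption rather than a pyttsx3 limit.

```python
def apply_settings(engine, rate_delta=0, volume=0.75):
    """Shift the speaking rate by rate_delta and set a clamped volume."""
    current = engine.getProperty('rate')
    # Keep the rate from dropping to zero or below (50 is an arbitrary floor)
    new_rate = max(50, current + rate_delta)
    # pyttsx3 expects volume between 0.0 and 1.0, so clamp it
    volume = max(0.0, min(1.0, volume))
    engine.setProperty('rate', new_rate)
    engine.setProperty('volume', volume)
    return new_rate, volume
```

For example, apply_settings(engine, rate_delta=50) speeds up the reading by 50 words per minute while leaving the volume at 0.75.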
5. Read or Save the audio
We've done all the background work. Now let's dive into action. To speak the text, we call the say method to queue it and then the runAndWait method to perform the actual speech.
engine.say(text)
engine.runAndWait()
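Finally, we can save the audio to a file using the engine's save_to_file() method. Like say, it only queues the command, so we call runAndWait() afterwards to actually write the file. The audio encoding depends on the platform's speech driver, so the result may not be a true MP3 even with that extension.
engine.save_to_file(text, 'test.mp3')
engine.runAndWait()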
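Speaking and saving follow the same pattern: queue a command, then let the engine process it. The sketch below combines both paths in one function, assuming an engine object with the say, save_to_file, and runAndWait methods used above. The name speak_or_save is a hypothetical helper.

```python
def speak_or_save(engine, text, out_path=None):
    """Read text aloud, or write it to out_path when one is given."""
    if not text or not text.strip():
        return False  # nothing to say, e.g. an image-only page
    if out_path is None:
        engine.say(text)
    else:
        engine.save_to_file(text, out_path)
    # Both calls only queue work; runAndWait() processes the queue
    engine.runAndWait()
    return True
```

Skipping empty text avoids producing silent output for pages that contain no extractable text.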
Putting it all together
We've walked through the code line by line to understand it better. Here is the complete script that converts one PDF page to speech and saves it to a file.
import pyttsx3
from PyPDF2 import PdfReader
# create a PdfReader object
reader = PdfReader("/path/to/file.pdf")
# extract text from page 1 (index 0)
page = reader.pages[0]
text = page.extract_text()
# create a pyttsx3 engine
engine = pyttsx3.init()
# set the voice (the index order depends on the voices installed)
voices = engine.getProperty('voices')
engine.setProperty('voice', voices[1].id)
# set the speed/rate of speech
engine.setProperty('rate', 150)
# set the volume
engine.setProperty('volume', 0.75)
# queue the save command, then run the engine to write the file
engine.save_to_file(text, 'test.mp3')
engine.runAndWait()
Final thoughts
PDFs are everywhere, but readers’ preferences are changing.
Many people would rather listen than read, because listening leaves them free to do something else at the same time. Reading doesn’t offer that flexibility. Listening is also more convenient for people with vision impairment.
Audio versions aren’t available for every PDF resource on the internet, but Python programmers can convert them into audio files with a few lines of code.
In this article, we’ve discussed how to convert PDFs into audio files. We’ve also looked at ways to modify the speech with different volumes, voices, and speeds.