
Convert any .pdf file 📚 into an audio 🔈 book with Python

Mustafa Anas on January 07, 2020

(edit: I am glad you all liked this project! It got to be the top Python article of the week!) A while ago I was messing around with google's Text...
 

I am really intrigued by this article. I tried everything to install the pdftotext library on my Mac but was unsuccessful. I keep getting this error: "error: command 'gcc' failed with exit status 1"
I installed the OS dependencies and Poppler using brew, but it didn't work. Can anyone help me?

 

Make sure you have these two installed:
python-dev
libevent-dev

 

Yup, I installed them. No matter what I do, I keep getting this error: "ERROR: Command errored out with exit status 1"
And I installed gcc too!

I just started getting the same thing on my system (Ubuntu). After a lot of Google/StackExchange, this worked (copy from my annotations):

For whatever reason, in order to install the following two, I had to install some stuff on my Ubuntu Mate system-wide to get rid of compile errors:

sudo apt-get install python3-setuptools python3-dev libpython3-dev
sudo apt-get update
sudo apt-get install build-essential libpoppler-cpp-dev pkg-config python-dev

I'm using PyCharmCE. After the above, I could use this in the PyCharm terminal:

pip3 install pdftotext
pip3 install gtts

After I did all of that, success! The program works like a charm (hehe).

Cheers!

A pleasure to finally be able to give back a little!

I have a Mac, brother. I can't use apt-get. What should I do now?

Are you using the default Python 2.7? You may need to use Python 3.x.

I got this working on the Mac with Python 3.7.4, a virtual env, and brew. Works fine.

I am using Docker on my MacBook without any issue. It is a great way to start working with any environment, stack, etc.

 

They list what has to be installed for the various operating systems here: pypi.org/project/pdftotext/

 

Have you tried to install the OS dependencies as specified in the docs? github.com/jalan/pdftotext#macos

 

I would suggest adding two lines to save the MP3 file to the same location and name as the PDF file.

from os.path import splitext

outname = splitext(filelocation)[0] + '.mp3'

then use:

final_file.save(outname)

 

Oh, fantastic! I was looking to add this myself, but I don't know how to code in Python. Thanks for bringing it up!

 
 

Really cool and quick project! One thing I would suggest is to use python's join() method instead of looping over the list of strings. I think that's the more "pythonic" way and should also perform a little better.
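
To illustrate the suggestion, here is a minimal sketch comparing the two approaches. The `pages` list is a stand-in for whatever page strings the PDF reader yields; the variable names are made up for this example and are not taken from the article.

```python
# Stand-in for the page strings extracted from a PDF (hypothetical data)
pages = ["First page text.", "Second page text."]

# Loop version: concatenates one page at a time
text_loop = ""
for page in pages:
    text_loop += page

# join() version: builds the final string in a single call
text_joined = "".join(pages)
```

Both versions produce the same string; join() is simply shorter and avoids repeated concatenation.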

 

Thanks for the tip!
I will definitely start using that.

 

My favorite part is (if I am not mistaken) that this would work for a PDF in any language, as long as Google Text-to-Speech supports that language.

 

Hahaha omg, how could I not think about researching that?
You're right.
Check this out:
cloud.google.com/text-to-speech/

 

I am on Fedora and had to install the following dependencies before I could pip install pdftotext.

The sequence would be:

sudo dnf install gcc-c++ pkgconfig poppler-cpp-devel python-devel redhat-rpm-config
pip install pdftotext gtts
 

This is a life-saving procedure you shared. I tried it and it works like a charm. Thank you so very much.

I have a question though...
I know this is a simple approach meant to explain the basics (and it's awesome). Is it possible to change the reader's voice and reading speed?

 

I am glad you liked it!
The intention of all my writing is to be as simple as possible so readers of all levels can understand.
If you wish to know more about customizing this API, please check this page:
gtts.readthedocs.io/en/latest/

 

An observation here (I'm sure this has to do with the gtts engine though):

The reader would rather spell some words than pronounce them, and it's a bit strange. I did a conversion where the word "first" was spelt rather than pronounced. Initially, I thought this occurs when words are not properly written and the text recognition engine is affected. "Five" was pronounced fai-vee-e, and there were other spellings like that.

Overall though, it is manageable and one can make good sense out of the readings. Now I can "read" my e-books faster with this ingenious solution.

Thanks again, @mustapha

 

Thanks a lot for the article. I searched a lot for something like this, and now I am able to read (listen to) all my untouched PDFs.

 

That was my intention.
Glad you liked it :)

 

I tried this on Win10 but was unable to install the pdftotext package in Python 3.8.
Hence, I did it another way:

github.com/suryabranwal/TIL/blob/m...

 

I have a problem running [vagrant@centos8 ~]$ sudo pip3 install pdftotext on CentOS 8:
error: command 'gcc' failed with exit status 1
Command "/usr/bin/python3.6 -u -c "import setuptools, tokenize;__file__='/tmp/pip-build-7_3v7vuh/pdftotext/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-ac0irxfy-record/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-build-7_3v7vuh/pdftotext/

I'm running Python 3.6.8. Do I have to use Python 3.8 explicitly?

 

Really cool!
However, when I tried to convert a decent-sized PDF file (3.0 MB), I got the following error:

"gtts.tts.gTTSError: 500 (Internal Server Error) from TTS API. Probable
cause: Uptream API error. Try again later."

Is gTTS blocking me from using their API? How should I resolve this?
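
The thread does not establish the cause of this error. One thing to try, assuming the amount of text sent in one go is a factor, is to split the extracted text into smaller pieces and convert them one at a time. The sketch below is only an illustration: `split_text` and the `max_chars` value are hypothetical, not part of gTTS or the article.

```python
def split_text(text, max_chars=3000):
    """Split text into chunks of at most max_chars characters.

    Breaks only on whitespace, so a single word longer than
    max_chars is kept whole in its own chunk.
    """
    chunks, current = [], ""
    for word in text.split():
        # Start a new chunk when adding this word would exceed the limit
        if current and len(current) + 1 + len(word) > max_chars:
            chunks.append(current)
            current = word
        else:
            current = f"{current} {word}" if current else word
    if current:
        chunks.append(current)
    return chunks
```

Each chunk could then be passed to gTTS(text=chunk, lang='en') and saved to its own MP3 file, ideally with a short pause between requests. Whether this avoids the 500 error is untested here.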

 
 
 

Do you have any demo audio files? I'm really interested to hear it. :)

 

Run this code and hear the result:

from gtts import gTTS
final_file = gTTS(text='Demo String', lang='en')  # store file in variable
final_file.save("Generated Speech.mp3")  # save file to computer
 

Good stuff, Mustafa! I created a GitHub project for this in case anyone wants to see how this is set up on an Ubuntu 18.04 workstation.

github.com/hseritt/pdf2voice

 

Thank you for sharing the repo, Harlin!

 
 
 

Awesome, awesome, awesome! I'm guessing they're ok to listen to?

 
 
 
 

Nice one Mustafa!

I'm curious what would happen if the PDF has images or mathematical equations?
