DEV Community

Viorel PETCU

Dyslexic DEV? No problem!

Given

that I am dyslexic and also have aphantasia, which is a fascinating quirk of the brain, you can imagine I have a bit of a love/hate relationship with reading.

On the one hand, I must keep up with the news and trends of the software engineering industry. On the other hand, the serious and useful information I need is in written form, and that always drains my reading battery because of the above-average amount of concentration I have to put in.

When

daylight saving weekend rolled around, I decided to do something useful with the "additional" hour, so I set out to improve this situation for people like me.

I decided to build something that allows us to take the text from any article in any language (English, German, Romanian in my case) and convert it to an mp3 file so that we can "listen to the article" and not drain our weak reading battery.

Then

I remembered one of the most important UNIX principles:

"DO ONE THING AND DO IT WELL"

so I challenged myself to:

  • use WELL DONE existing THINGS
  • write only a shell script
  • have fewer than 31 lines (today is Halloween 🎃)
  • use only CLI tools, pipes, and shell commands
  • time-box it to 1h.

🤪

I did it! But this was only possible because so many wonderful people have developed so many great projects and shared them with the rest of us. There is a multitude of software out there, and any UNIX-based OS lets us interconnect it seamlessly. Simply amazing 🤩

Result

The script:

#!/bin/sh
# this script does some text editing on the:
#   $1 - input file
# and stores the result in the:
#   out-$1 - intermediate file (one sentence per line)
# which it then sends, line by line, via curl to http://localhost:5002/api/tts served by:
#   docker run -it -p 5002:5002 synesthesiam/mozillatts:en
# you can replace the TAG at the end with any language supported by TTS
# it will:
#  - produce a .wav file for each sentence in the intermediate file
#  - join the wav files into a single one
#  - turn it into an .mp3 file named audio-$1.mp3

# split on "." (text after the last "." on a line is skipped), then swap quotes for ´
awk -F'[.]' '{ for (i=1; i<NF; i++) print $i "." }' "$1" | sed "s/['\"]/´/g" > "out-$1"

input="out-$1"
n=0
while IFS= read -r line
do
  n=$((n + 1)) # zero-padded counter: keeps the parts in order, no name clashes within one second
  curl --location --request POST 'http://localhost:5002/api/tts' --header 'Content-Type: text/plain' --data-raw "$line" -o "audio_$(printf '%05d' "$n").wav"
done < "$input"

ls audio_*.wav | awk '{print "file " $0}' > wav-list.txt

ffmpeg -f concat -safe 0 -i wav-list.txt -vn -ar 44100 -ac 2 -b:a 128k "audio-$1.mp3"

rm "out-$1" audio_*.wav wav-list.txt
open "audio-$1.mp3" # "open" is the macOS command; on Linux, xdg-open is the usual equivalent
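The two text-only steps of the script (splitting the article into sentences and building the list file for FFmpeg) can be tried on their own, without Docker or FFmpeg running. This is a minimal sketch, not the script itself: the function names `split_sentences` and `make_concat_list` are made up for illustration, and the list entries use the quoted `file '<name>'` form that FFmpeg's concat demuxer also accepts.

```shell
#!/bin/sh
# Sketch of the two text-only steps; the function names are illustrative.

# Read text on stdin and print one sentence per line, splitting on ".".
# Like the script above, text after the last "." on a line is skipped.
split_sentences() {
  awk -F'[.]' '{ for (i = 1; i < NF; i++) print $i "." }'
}

# Print one "file '<name>'" line per argument, the format that
# ffmpeg's concat demuxer reads from its list file.
make_concat_list() {
  for f in "$@"; do
    printf "file '%s'\n" "$f"
  done
}

# Example: two sentences in, two lines out.
printf 'Hello there. How are you.\n' | split_sentences

# Example: list entries for two parts, in the order given.
make_concat_list audio_00001.wav audio_00002.wav
```

Note that the second sentence keeps its leading space, and that a line without any "." (a headline, for example) produces no output at all, so it never reaches the TTS server.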

The list of ingredients:

  1. Mozilla/TTS
  2. FFmpeg CLI (this thing went to Mars 🔴)
  3. UNIX-based OS
  4. Docker CLI
  5. the above script
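Before cooking, it can help to confirm that the CLI ingredients are actually installed. A minimal sketch, assuming a POSIX shell; `missing_tools` is a hypothetical helper name and not part of the script above, and the tool list is simply the commands the script and the recipe call.

```shell
#!/bin/sh
# Sketch: print the names of the given commands that are not on PATH.
# missing_tools is an illustrative name, not part of the original script.
missing_tools() {
  missing=""
  for tool in "$@"; do
    # command -v fails when the command cannot be found
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  # drop the leading space before printing
  printf '%s\n' "${missing# }"
}

# The CLI tools the script and the recipe rely on.
result=$(missing_tools docker curl awk ffmpeg)
if [ -n "$result" ]; then
  echo "missing: $result"
else
  echo "all tools found"
fi
```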

The "recipe":

 # in your terminal run: 
 docker run -it -p 5002:5002 synesthesiam/mozillatts:en
 # replace the 'en' tag with 'de', 'fr', 'ro' etc.
 # select and copy the text you want to listen to 
 # paste it into a file: `article.txt`
 # save the script as `audiofy.sh` next to the text file
 # in the terminal run:
 sh audiofy.sh article.txt

Demo

I used a snippet from the README file of the fantastic Mozilla/TTS for my demo:

Mozilla TTS README snippet

Have a listen to the output on SoundCloud (use the open-in-new-tab function so you can see the text and marvel at the natural-sounding synthetic voice).

Conclusion

Even though I set aside only 1h for this project, I do love the result and will return to it for improvements. Implementing this was way too much fun.

Also, in the time it took me to write this post, I listened to most of the news articles that popped up on my Google News feed, because I had converted them as a test for my script.

Synergy: achieved!
Productivity: increased!
Reading battery: protected!

✌️

Top comments (2)

Calin Baenen • Edited

So... You invented a screen reader, that instead of reading, compiles the text into an audio file..?

Viorel PETCU • Edited

Not a bad summary. Yes, it turns text into an .mp3, but it all runs locally on your machine (no privacy concerns). And with TTS you could train and use your own voice. Also, it produces files longer than 34 seconds, which is currently a limitation of TTS.

PS: I did not invent any of it, I just did "some plumbing" by piping together some really cool tools!