Python audio processing at lightspeed ⚡ Part 4: simpleaudio, spectrum animations

#python #audio #video

This part is a miscellaneous look at audio libraries that do things other than play sound, plus one more library that can play sound without PortAudio. That library is simpleaudio. It is very, very basic, but its simplicity is useful if you just want to play sounds without processing them. It only appears to play WAV files, though.

So first I'll quickly cover simpleaudio since there's not much to it and then I'll look at spectrum animation making.

Simpleaudio

Installing this library is just a matter of running pip install simpleaudio. It doesn't have any dependencies that I know of. The following code will play a WAV file asynchronously:



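import wave
import simpleaudio as sa

# Read only the sample data; the WAV header is not audio and should not be played
with wave.open(path_to_file, 'rb') as w:
    b = w.readframes(w.getnframes())
    play_obj = sa.play_buffer(b, w.getnchannels(), w.getsampwidth(), w.getframerate())

As you can see, you need to read the raw sample data yourself and feed it into simpleaudio, although it also accepts numpy arrays. play_buffer takes, in order, the audio data, the number of channels to output to, the bytes per sample (as we are directly playing raw sound data) and the sample rate. Here those three values are read from the file's header with the wave module; for a typical 16-bit stereo file at 44.1 kHz they would be 2, 2 and 44100.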

Playback can be stopped with play_obj.stop(). play_obj.is_playing() checks whether the sound is still playing, and play_obj.wait_done() blocks until the sound has finished playing.
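As a rough sketch of how these three calls fit together, the snippet below builds a half-second test tone with the standard library and cuts playback short. make_tone and play_briefly are names made up for this example and are not part of simpleaudio. simpleaudio is imported inside play_briefly so that the tone generator can be used on its own:

```python
import math
import struct
import time


def make_tone(freq_hz=440.0, seconds=0.5, sample_rate=44100):
    """Return mono 16-bit little-endian PCM bytes for a sine tone (hypothetical helper)."""
    n = round(seconds * sample_rate)
    samples = [
        int(16000 * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(n)
    ]
    return struct.pack('<%dh' % n, *samples)


def play_briefly(pcm, sample_rate=44100, stop_after=0.2):
    """Start playback, then stop it early if it is still running."""
    import simpleaudio as sa  # needs simpleaudio and a working audio device

    play_obj = sa.play_buffer(pcm, 1, 2, sample_rate)  # 1 channel, 2 bytes per sample
    time.sleep(stop_after)    # playback continues in the background meanwhile
    if play_obj.is_playing():
        play_obj.stop()
    play_obj.wait_done()      # returns once playback has ended
```

Calling play_briefly(make_tone()) should play roughly the first 0.2 seconds of a 440 Hz tone and then go quiet.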

For aesthetic reasons you might prefer an audio object created with the above parameters instead of directly playing it. This can be accomplished by using WaveObject like so:



import wave
import simpleaudio as sa

# Creates wave object from audio data (raw sample bytes you have already loaded)
wave_obj = sa.WaveObject(audio_data, 2, 2, 44100)
# Plays wave object
play_obj = wave_obj.play()
# Creates wave object from a file
wave_obj = sa.WaveObject.from_wave_file(path_to_file)
# Creates a wave object from wave.open() handle
wave_read = wave.open(path_to_file, 'rb')
wave_obj = sa.WaveObject.from_wave_read(wave_read)

Simpleaudio can play 8-, 16- and 24-bit integer bit depths and 32-bit floating point depths. The sample rates that can be used are 8, 11.025, 16, 22.05, 32, 44.1, 48, 88.2, 96, and 192 kHz but the ones that work depend on your system and/or sound card.
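Before handing a file to simpleaudio it can help to check its format against the list above. The sketch below uses only the standard wave module. describe_wav and the two constants are hypothetical names for this example. Note that wave only reads integer PCM files, so the 32-bit float case is not covered here:

```python
import wave

# Values taken from the list above: 8-, 16- and 24-bit integer samples
SUPPORTED_INT_WIDTHS = {1, 2, 3}  # bytes per sample
SUPPORTED_RATES = {8000, 11025, 16000, 22050, 32000, 44100,
                   48000, 88200, 96000, 192000}  # Hz


def describe_wav(path):
    """Return (channels, bytes_per_sample, sample_rate, looks_supported)."""
    with wave.open(path, 'rb') as w:
        channels = w.getnchannels()
        width = w.getsampwidth()
        rate = w.getframerate()
    ok = width in SUPPORTED_INT_WIDTHS and rate in SUPPORTED_RATES
    return channels, width, rate, ok
```

The first three values of the returned tuple are the same numbers play_buffer and WaveObject expect after the audio data. Even when the last value is True, whether the rate actually works still depends on your system and sound card.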

Audio_display

This Python package consists of one program, fft2png, which generates a spectrum of a sound, kind of like the spectrum visual effects you see in media players. It doesn't have any callable Python functions that I know of.

The program is called like fft2png -i path-to-wav-file -o some-filename-sequence.png. The output filename should contain a mask so that a sequence of pictures is generated, one per frame. In other words, output-{:06}.png will make pictures named output-000000.png, output-000001.png, etc. until it has created as many frames as you want. There is an -r option that specifies the desired frame rate; a higher rate samples the sound at more time positions per second to make the spectrums.
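The {:06} part of the mask looks like Python's str.format syntax. Assuming fft2png fills it in with the frame index (an assumption, not something confirmed from its source), you can preview the filenames a mask would produce like this. frame_names is a hypothetical helper name:

```python
def frame_names(mask, count, start=0):
    """Expand a str.format-style mask into a list of frame filenames."""
    return [mask.format(i) for i in range(start, start + count)]


names = frame_names('output-{:06}.png', 3)
print(names)
# ['output-000000.png', 'output-000001.png', 'output-000002.png']
```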

Make sure you run this in an empty directory because it will generate a lot of images.

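When that is finished you can combine the images into a video using ffmpeg. You need to adjust the arguments to ffmpeg yourself. For the example above it would be: ffmpeg -framerate 30 -i output-%06d.png output-file.mp4. Here %06d is shorthand for the zero-padded frame numbers 000000, 000001, etc., so the one pattern matches the whole sequence. The -framerate value should match the rate the frames were generated at, otherwise the video will play too fast or too slow.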

That command will make an MP4 video that looks something like this, depending on the sound you analyzed in the first place. In this example I visualized a short sound effect:

Using -R renders the spectrum differently. What you just saw is the default renderer, -R 0. Here are demos of using -R n with n set to 1, 2 and 3:

In fft2png the --color switch takes a hex number to use as the bar color. The default is FFFFFF which makes completely white bars. On a side note, if you have trouble picking a color, you don't have a color picker handy and you don't mind installing the PyQt5 module, you can run this code snippet to create a color picker to use:



# As you can see this code was ripped from pythonspot.com - credit to them
import sys
from PyQt5.QtWidgets import QApplication, QWidget, QPushButton, QColorDialog
from PyQt5.QtGui import QIcon
from PyQt5.QtCore import pyqtSlot
from PyQt5.QtGui import QColor

class App(QWidget):

    def __init__(self):
        super().__init__()
        self.title = 'PyQt5 color dialog - pythonspot.com'
        self.left = 10
        self.top = 10
        self.width = 320
        self.height = 200
        self.initUI()

    def initUI(self):
        self.setWindowTitle(self.title)
        self.setGeometry(self.left, self.top, self.width, self.height)

        button = QPushButton('Open color dialog', self)
        button.setToolTip('Opens color dialog')
        button.move(10,10)
        button.clicked.connect(self.on_click)

        self.show()

    @pyqtSlot()
    def on_click(self):
        self.openColorDialog()

    def openColorDialog(self):
        color = QColorDialog.getColor()

        if color.isValid():
            print(color.name())

if __name__ == '__main__':
    app = QApplication(sys.argv)
    ex = App()
    sys.exit(app.exec_())
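The dialog above prints colors in the form #rrggbb (that is what QColor.name() returns), while fft2png wants the bare hex digits, as noted just below. A small helper can do the conversion. to_fft2png_color is a hypothetical name for this sketch and is not part of PyQt5 or fft2png, and the uppercase output is only chosen to match the examples in this article:

```python
import re


def to_fft2png_color(text):
    """Strip a leading '#' or '0x' and return six uppercase hex digits."""
    value = text.strip()
    if value.startswith('#'):
        value = value[1:]
    elif value.lower().startswith('0x'):
        value = value[2:]
    if not re.fullmatch(r'[0-9a-fA-F]{6}', value):
        raise ValueError('expected six hex digits, got %r' % text)
    return value.upper()


print(to_fft2png_color('#0c2f49'))  # prints 0C2F49
```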

In the following spectrum I used --color 0C2F49 to make the bars a shade of blue. Don't put 0x or # at the beginning of the hex number or it won't work.

The result:

--blending controls the influence of previous frames on the next frame generated. The closer this is to 0, the more jittery the spectrum becomes, as more of the spectrum only uses samples from the current frame. A blending closer to 1 makes it use more samples from previous frames than from the current one. The default blending is 0.7. This spectrum has a blending of 0.1:
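One common way to get this kind of behavior is exponential smoothing, where each displayed bar height is a weighted mix of the previous displayed height and the new measurement. The sketch below only illustrates that idea, under the assumption that blending acts as the weight on the previous frame. It is not taken from fft2png's source, and blend_frames is a hypothetical name:

```python
def blend_frames(frames, blending=0.7):
    """Smooth a sequence of bar-height lists with exponential blending.

    Each output frame is blending * previous_output + (1 - blending) * current.
    """
    if not 0.0 <= blending <= 1.0:
        raise ValueError('blending must be between 0 and 1')
    out = []
    previous = None
    for current in frames:
        if previous is None:
            smoothed = list(current)  # nothing to blend with yet
        else:
            smoothed = [blending * p + (1.0 - blending) * c
                        for p, c in zip(previous, current)]
        out.append(smoothed)
        previous = smoothed
    return out


# A single bar that jumps from 0 to 1: low blending follows the jump quickly,
# high blending lags behind it.
jump = [[0.0], [1.0], [1.0]]
fast = blend_frames(jump, blending=0.1)
slow = blend_frames(jump, blending=0.7)
print(fast[-1], slow[-1])
```

With a blending of 0.1 the bar has nearly reached its new height after two frames, while with 0.7 it is only about halfway there, which is the smooth-versus-jittery trade-off described above.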

Other fft2png options include --bar-width to change the width of the bars, --bar-spacing to control their spacing, --bar-count to control the number of bars and --image-height to control the height of the generated images.
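If you end up rendering many variations, it can be convenient to build the fft2png command line from Python. This sketch uses only the switches mentioned in this article. build_fft2png_command is a hypothetical helper, the file names and values in the usage lines are placeholders, and the exact value formats each switch accepts should be checked against fft2png's own help output:

```python
import subprocess


def build_fft2png_command(wav_path, out_mask, rate=None, renderer=None,
                          color=None, blending=None, bar_width=None,
                          bar_spacing=None, bar_count=None, image_height=None):
    """Return the argument list for fft2png; options left as None are omitted."""
    cmd = ['fft2png', '-i', wav_path, '-o', out_mask]
    optional = [
        ('-r', rate),
        ('-R', renderer),
        ('--color', color),
        ('--blending', blending),
        ('--bar-width', bar_width),
        ('--bar-spacing', bar_spacing),
        ('--bar-count', bar_count),
        ('--image-height', image_height),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    return cmd


cmd = build_fft2png_command('sound.wav', 'output-{:06}.png',
                            rate=30, color='0C2F49', blending=0.1)
print(' '.join(cmd))
# To actually run it (requires fft2png on your PATH):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than one long string means the {:06} mask reaches fft2png untouched by the shell.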

Appendix: Converting MP4 to GIF without online tools

This section does not use any Python. It should be accessible to those who don't know the language. After making the MP4 videos as shown above, the next challenge was to convert them to GIFs to display them in this article. I didn't want to upload the videos in their raw form because that would take up precious bandwidth.

ffmpeg is able to convert MP4 into the GIF format. But you should not convert directly to GIF, because the result loses quality: a GIF can only hold 256 colors, and a direct conversion uses a generic palette that doesn't account for the colors actually in the MP4 video. Instead, first you make a "palette" as shown here:



ffmpeg -i input.mp4 -filter_complex "[0:v] palettegen" palette.png

Without going into detail about the parameters used here, this creates a small image containing the colors (up to 256) that best represent the MP4 video. It is then sampled when you convert the MP4 into a GIF with:



ffmpeg -i input.mp4 -i palette.png -filter_complex "[0:v][1:v] paletteuse" output.gif

So, don't attempt to create a GIF from an MP4 by itself; use the MP4 together with its palette.

And we're done (finally!)

To me it feels like there aren't many more modules left to cover without being redundant, so I'm wrapping up the series here with this Part 4. However, I understand that other modules do have their place even if it feels like I've already seen their functionality before (I'm specifically referring to the multitude of audio editors in Python), so I will cover those in due time, just without the "Part x" moniker.

If you see anything incorrect here please let me know so I can fix it.

Image by Gordon Johnson from Pixabay