DEV Community

Discussion on: Github: A High-Performance Library for Audio Analysis

Collapse
 
audioflux profile image
van

In the field of deep learning for audio, the mel spectrogram is the most commonly used audio feature. The performance of mel spectrogram features can be benchmarked and compared using audio feature extraction libraries such as the following:

  • audioFlux: developed in C with a Python wrapper, it has different bridging processes for different platforms, and supports OpenBLAS, MKL, etc.
  • TorchAudio: developed in PyTorch, which is optimized for CPUs and uses MKL as its backend. This evaluation does not include the GPU version of PyTorch.
  • librosa: developed purely in Python, mainly based on NumPy and SciPy, with NumPy using OpenBLAS as its backend.
  • Essentia: developed in C++ with a Python wrapper, it uses Eigen and FFTW as its backend.