<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: van</title>
    <description>The latest articles on DEV Community by van (@audioflux).</description>
    <link>https://dev.to/audioflux</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1035571%2F597fca54-d5a5-406d-885a-8914df29c562.png</url>
      <title>DEV Community: van</title>
      <link>https://dev.to/audioflux</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/audioflux"/>
    <language>en</language>
    <item>
      <title>An AI cover model for your own voice.</title>
      <dc:creator>van</dc:creator>
      <pubDate>Wed, 14 Aug 2024 06:20:17 +0000</pubDate>
      <link>https://dev.to/audioflux/an-ai-cover-model-for-your-own-voice-14bm</link>
      <guid>https://dev.to/audioflux/an-ai-cover-model-for-your-own-voice-14bm</guid>
      <description>&lt;p&gt;Just one minute of voice, clone your own voice timbre, currently completely free to experience and use. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://lamucal.com/ai-cover" rel="noopener noreferrer"&gt;https://lamucal.com/ai-cover&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>music</category>
      <category>web</category>
      <category>javascript</category>
    </item>
    <item>
      <title>Lamucal: An AI-powered music generation tool for creating tabs, chords, lyrics, and melodies.</title>
      <dc:creator>van</dc:creator>
      <pubDate>Wed, 27 Mar 2024 04:29:54 +0000</pubDate>
      <link>https://dev.to/audioflux/lamucal-an-ai-powered-music-generation-tool-for-creating-tabs-chords-lyrics-and-melodies-c8n</link>
      <guid>https://dev.to/audioflux/lamucal-an-ai-powered-music-generation-tool-for-creating-tabs-chords-lyrics-and-melodies-c8n</guid>
      <description>&lt;p&gt;&lt;a href="https://lamucal.ai"&gt;https://lamucal.ai&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Cases&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Music Transcription&lt;/li&gt;
&lt;li&gt;Lyric Synchronization&lt;/li&gt;
&lt;li&gt;Musical Exploration&lt;/li&gt;
&lt;li&gt;Chord Matching&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Feature Highlights&lt;/strong&gt; &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;AI-Enhanced Music Generation: Swiftly transforms music from sources into playable chords.&lt;/li&gt;
&lt;li&gt;Lyric Synchronization: Ensures lyrics align perfectly with music.&lt;/li&gt;
&lt;li&gt;Interactive Learning Feature: Ideal for musicians to play along with instruments.&lt;/li&gt;
&lt;li&gt;Broad Song Selection: Access over 40 million songs for musical learning and enjoyment.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy62yfv6hp63xv9fomam.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpy62yfv6hp63xv9fomam.gif" alt="Image description" width="1138" height="582"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>beginners</category>
      <category>ai</category>
      <category>discuss</category>
    </item>
    <item>
      <title>AI-Generated Tabs &amp; Chords for Any Song</title>
      <dc:creator>van</dc:creator>
      <pubDate>Wed, 27 Mar 2024 04:22:23 +0000</pubDate>
      <link>https://dev.to/audioflux/ai-generated-tabs-chords-for-any-song-3bjf</link>
      <guid>https://dev.to/audioflux/ai-generated-tabs-chords-for-any-song-3bjf</guid>
      <description>&lt;p&gt;&lt;a href="https://lamucal.ai"&gt;Lamucal&lt;/a&gt; is an artificial intelligence-enhanced tool designed for generating tabs, chords, lyrics, and melodies of any song. It can swiftly convert music or songs from an array of sources such as YouTube, Deezer, SoundCloud, and MP3 into playable chords.&lt;/p&gt;

&lt;p&gt;Users can also edit, transpose and separate tracks with ease. Apart from generating music details, Lamucal offers precise lyric synchronization, ensuring that the lyrics align perfectly with the chords and rhythm patterns.&lt;/p&gt;

&lt;p&gt;It also possesses an interactive learning feature which could be valuable for musicians wanting to play along with guitar, ukulele, or piano. Users can explore a vast selection of over 40 million songs, varying from classic songs to new songs of the month, providing a broad scope for musical exploration and learning.&lt;/p&gt;

&lt;p&gt;Furthermore, the tool includes an accurate chord matching feature to fine-tune your music playing experience. Lamucal is not confined to a desktop version but has a mobile application available for both iOS and Android platforms.&lt;/p&gt;

&lt;p&gt;Integration with social media platforms like Facebook, TikTok, and YouTube extends its reach, permitting users to share their musical exploits. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgywvdwy1a3n99iownn1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgywvdwy1a3n99iownn1.png" alt="Image description" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>news</category>
      <category>machinelearning</category>
      <category>programming</category>
    </item>
    <item>
      <title>Real-time audio source separation, generate lyrics, chords, beat and tabs.</title>
      <dc:creator>van</dc:creator>
      <pubDate>Wed, 27 Mar 2024 04:13:45 +0000</pubDate>
      <link>https://dev.to/audioflux/real-time-audio-source-separation-generate-lyrics-chords-beat-and-tabs-4j48</link>
      <guid>https://dev.to/audioflux/real-time-audio-source-separation-generate-lyrics-chords-beat-and-tabs-4j48</guid>
      <description>&lt;p&gt;GitHub: &lt;a href="https://github.com/DoMusic/Hybrid-Net"&gt;https://github.com/DoMusic/Hybrid-Net&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An AI-powered multimodal project focused on music, generating chords, beats, lyrics, melody, and tabs for any song. &lt;/p&gt;

&lt;p&gt;The online experience, &lt;a href="https://lamucal.ai"&gt;https://lamucal.ai&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ut8lb66txz9yxvbg5b1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4ut8lb66txz9yxvbg5b1.png" alt="Image description" width="800" height="431"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnckgonc5tbilr29r6tgt.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnckgonc5tbilr29r6tgt.gif" alt="Image description" width="1138" height="582"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>news</category>
    </item>
    <item>
      <title>Online guitar tuner</title>
      <dc:creator>van</dc:creator>
      <pubDate>Mon, 18 Dec 2023 13:05:17 +0000</pubDate>
      <link>https://dev.to/audioflux/online-guitar-tuner-12j9</link>
      <guid>https://dev.to/audioflux/online-guitar-tuner-12j9</guid>
      <description>&lt;p&gt;&lt;a href="https://www.aifasttune.com/"&gt;Open in online&lt;/a&gt;&lt;br&gt;&lt;br&gt;
&lt;strong&gt;FastTune&lt;/strong&gt; is an AI tuner for guitar, ukulele, bass, banjo, mandolin, violin, etc.  &lt;/p&gt;

&lt;p&gt;It utilizes the transformer-based tuneNN network model for abstract timbre modeling, supporting tuning for 12+ instrument types.   &lt;/p&gt;

&lt;p&gt;The online experience is based on web audio and wasm; &lt;a href="https://aifasttune.com"&gt;see the site here&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;
  &lt;a href="https://aifasttune.com"&gt;&lt;img alt="open in online experience" src="https://res.cloudinary.com/practicaldev/image/fetch/s--aePPRV7S--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://img.shields.io/badge/Open%2520In%2520Online%2520Tuner-blue%3Flogo%3Dv%26style%3Dfor-the-badge%26logoColor%3Dgreen" width="206" height="28"&gt;&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
 &lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bR6gxRa4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://raw.githubusercontent.com/FastTune/FastTune/master/image/fasttune.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bR6gxRa4--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_66%2Cw_800/https://raw.githubusercontent.com/FastTune/FastTune/master/image/fasttune.gif" width="650" height="362"&gt;&lt;/a&gt;
&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>webdev</category>
      <category>ai</category>
    </item>
    <item>
      <title>Github: A High-Performance Library for Audio Analysis</title>
      <dc:creator>van</dc:creator>
      <pubDate>Thu, 27 Apr 2023 12:39:23 +0000</pubDate>
      <link>https://dev.to/audioflux/github-a-high-performance-library-for-audio-analysis-266g</link>
      <guid>https://dev.to/audioflux/github-a-high-performance-library-for-audio-analysis-266g</guid>
      <description>&lt;p&gt;&lt;a href="https://github.com/libAudioFlux/audioFlux"&gt;Audioflux&lt;/a&gt; is a deep learning tool library for audio and music analysis, feature extraction. It supports dozens of time-frequency analysis transformation methods and hundreds of corresponding time-domain and frequency-domain feature combinations. It can be provided to deep learning networks for training, and is used to study various tasks in the audio field such as Classification, Separation, Music Information Retrieval(MIR) and ASR etc.&lt;/p&gt;

&lt;p&gt;Project: &lt;a href="https://github.com/libAudioFlux/audioFlux"&gt;https://github.com/libAudioFlux/audioFlux&lt;/a&gt;&lt;br&gt;
Benchmark: &lt;a href="https://github.com/libAudioFlux/audioFlux/issues/22"&gt;https://github.com/libAudioFlux/audioFlux/issues/22&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--KWD8G10J--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/67e78yi8nbdhizelitwi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--KWD8G10J--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/67e78yi8nbdhizelitwi.png" alt="Benchmark for Linux-AMD" width="800" height="442"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
    </item>
    <item>
      <title>Who else can match such speed? A performance leap prompted by a single issue</title>
      <dc:creator>van</dc:creator>
      <pubDate>Tue, 25 Apr 2023 05:46:18 +0000</pubDate>
      <link>https://dev.to/audioflux/who-else-can-match-such-speed-a-performance-leap-prompted-by-a-single-issue-4ocp</link>
      <guid>https://dev.to/audioflux/who-else-can-match-such-speed-a-performance-leap-prompted-by-a-single-issue-4ocp</guid>
      <description>&lt;p&gt;Recently, I open-sourced a small project about audio feature extraction and analysis. As someone in the AI audio field, I felt that I lacked a deep understanding of audio features during my research, so I created this project as a way to learn and practice.&lt;/p&gt;

&lt;p&gt;Although it was a small project for learning and practice, I was confident in it because most of the core algorithms were implemented in C and wrapped in Python. I thought it should be faster than libraries implemented purely in Python. I also did a simple performance comparison with other related Python libraries, and the results were indeed faster. However, I didn't expect to hit a snag later on!!!&lt;/p&gt;

&lt;p&gt;Two weeks ago, I received an issue from a user saying 'Speed is slow, am I missing something?' When I took a closer look, I was shocked to find that my library was the slowest. I quickly ran it on my own computer, and it was even worse than the results given by the user. This was a big blow!!! The relevant issue can be found here: &lt;a href="https://github.com/libAudioFlux/audioFlux/issues/18#issuecomment-1498371872"&gt;https://github.com/libAudioFlux/audioFlux/issues/18#issuecomment-1498371872&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;After careful analysis, I found that the sample size I used for testing was too small, and when the sample size was large, the performance was slow, mainly because of matrix multiplication. After subsequent optimization, it was much faster than other libraries, but there was still a performance gap compared to PyTorch's official torchaudio library.&lt;/p&gt;

&lt;p&gt;I accepted my fate since torchaudio is simply superior. After a week of hard work, I tried various technical optimizations such as OpenBLAS, Eigen, MKL, FFTW, SIMD, and parallel computing. I tested the performance with different sample sizes, CPUs, and system platforms. The results are shown in the following figure:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--CLM9D9Ai--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.v2ex.co/PiSXG9Hx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--CLM9D9Ai--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.v2ex.co/PiSXG9Hx.png" width="800" height="527"&gt;&lt;/a&gt;&lt;br&gt;
 &lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Q1VHD6Eb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.v2ex.co/VdTUkm4A.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Q1VHD6Eb--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://i.v2ex.co/VdTUkm4A.png" width="800" height="527"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The graphs show the benchmark results on Linux/AMD and macOS/Intel, respectively.&lt;/p&gt;

&lt;p&gt;Here is the detailed benchmark report: &lt;a href="https://github.com/libAudioFlux/audioFlux/tree/master/benchmark"&gt;https://github.com/libAudioFlux/audioFlux/tree/master/benchmark&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Overall, &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;on Linux/AMD processors, audioflux is slightly faster than torchaudio, but on Linux/Intel, it is slightly slower. &lt;/li&gt;
&lt;li&gt;On macOS, for large sample sizes, audioflux is faster than torchaudio, and Intel is significantly faster than M1; for small sample sizes, torchaudio is faster than audioflux.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After various arduous optimizations, the performance of audioflux is much faster than the previous version and other related libraries. I have done everything I can to optimize its performance, but it still cannot beat torchaudio. I hope everyone can give me a thumbs up and follow me, and I look forward to outperforming torchaudio in the future!!!&lt;/p&gt;

&lt;p&gt;If you are interested, please give us a star.&lt;br&gt;
Project address: &lt;a href="https://github.com/libAudioFlux/audioFlux"&gt;https://github.com/libAudioFlux/audioFlux&lt;/a&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>python</category>
      <category>audio</category>
      <category>ai</category>
    </item>
    <item>
      <title>Using the Python library audioFlux to learn audio analysis</title>
      <dc:creator>van</dc:creator>
      <pubDate>Thu, 30 Mar 2023 10:37:01 +0000</pubDate>
      <link>https://dev.to/audioflux/using-the-python-library-audioflux-to-learn-audio-analysis-22c4</link>
      <guid>https://dev.to/audioflux/using-the-python-library-audioflux-to-learn-audio-analysis-22c4</guid>
      <description>&lt;p&gt;&lt;a href="https://github.com/libAudioFlux/audioFlux"&gt;AudioFlux&lt;/a&gt; is a Python library that provides deep learning tools for audio and music analysis and feature extraction. It supports various time-frequency analysis transformation methods, which are techniques for analyzing audio signals in both the time and frequency domains. Some examples of these transformation methods include the short-time Fourier transform (STFT), the constant-Q transform (CQT), and the wavelet transform.&lt;/p&gt;

&lt;p&gt;In addition to the time-frequency analysis transformations, AudioFlux also supports hundreds of corresponding time-domain and frequency-domain feature combinations. These features can be used to represent various characteristics of the audio signal, such as its spectral content, its temporal dynamics, and its rhythmic patterns. These features can be extracted from the audio signal and used as input to deep learning networks for classification, separation, music information retrieval (MIR) tasks, and automatic speech recognition (ASR). &lt;/p&gt;

&lt;p&gt;For example, in music classification, AudioFlux could extract a set of features from a piece of music, such as its spectral centroid, mel-frequency cepstral coefficients (MFCCs), and its zero-crossing rate. These features could then be used as input to a deep learning network trained to classify the music into different genres, such as rock, jazz, or hip-hop. AudioFlux provides a comprehensive set of tools for analyzing and processing audio signals. This is an essential asset for professionals and scholars studying and applying methods to analyze audio and music.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/libAudioFlux"&gt;https://github.com/libAudioFlux&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>programming</category>
      <category>deeplearning</category>
      <category>opensource</category>
    </item>
    <item>
      <title>In the field of audio, the relationship between various transforms.</title>
      <dc:creator>van</dc:creator>
      <pubDate>Tue, 28 Feb 2023 15:30:12 +0000</pubDate>
      <link>https://dev.to/audioflux/in-the-field-of-audio-the-relationship-between-various-transforms-478g</link>
      <guid>https://dev.to/audioflux/in-the-field-of-audio-the-relationship-between-various-transforms-478g</guid>
      <description>&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--xYgxPFom--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qevrntlbrmitojmqbptf.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--xYgxPFom--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qevrntlbrmitojmqbptf.png" alt="transforms" width="880" height="659"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>beginners</category>
      <category>tutorial</category>
      <category>opensource</category>
    </item>
    <item>
      <title>Mel spectrogram and MFCC</title>
      <dc:creator>van</dc:creator>
      <pubDate>Tue, 28 Feb 2023 14:27:27 +0000</pubDate>
      <link>https://dev.to/audioflux/mel-spectrogram-and-mfcc-2gca</link>
      <guid>https://dev.to/audioflux/mel-spectrogram-and-mfcc-2gca</guid>
      <description>&lt;h2&gt;
  
  
  Algorithm flow
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F77itb11zc9r0e2p1pepb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F77itb11zc9r0e2p1pepb.png" alt="Algorithm flow" width="800" height="310"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; Read 220Hz audio data
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;audioflux&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;

&lt;span class="n"&gt;audio_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sample_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;220&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Extract spectrogram of dB
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;low_fre&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="n"&gt;spec_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fre_band_arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mel_spectrogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samplate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;low_fre&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;low_fre&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;spec_dB_arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;power_to_db&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spec_arr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Show mel spectrogram plot
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;audioflux.display&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fill_spec&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;

&lt;span class="c1"&gt;# calculate x/y-coords
&lt;/span&gt;&lt;span class="n"&gt;audio_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;x_coords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;audio_len&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;spec_arr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y_coords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fre_band_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;low_fre&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;fill_spec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spec_dB_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;x_coords&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x_coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;y_coords&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;y_coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;x_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;log&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Mel Spectrogram&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;colorbar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;%+2.0f dB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidmrjz96bmaazgbi3l6y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fidmrjz96bmaazgbi3l6y.png" alt="mel spectrogram" width="640" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extract mfcc data
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;cc_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mfcc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samplate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Show mfcc plot
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# calculate x-coords
&lt;/span&gt;&lt;span class="n"&gt;audio_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;x_coords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;audio_len&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cc_arr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;fill_spec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cc_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;x_coords&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x_coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;time&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;MFCC&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;colorbar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F80lh0hj6z6v5oneb02jh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F80lh0hj6z6v5oneb02jh.png" alt="mfcc" width="640" height="480"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>audioFlux: A library for audio and music analysis, feature extraction.</title>
      <dc:creator>van</dc:creator>
      <pubDate>Tue, 28 Feb 2023 12:20:38 +0000</pubDate>
      <link>https://dev.to/audioflux/audioflux-a-library-for-audio-and-music-analysis-feature-extraction-3g6m</link>
      <guid>https://dev.to/audioflux/audioflux-a-library-for-audio-and-music-analysis-feature-extraction-3g6m</guid>
      <description>&lt;h3&gt;
  
  
  Category Submission:
&lt;/h3&gt;

&lt;p&gt;C/Python&lt;/p&gt;

&lt;h3&gt;
  
  
  Screenshots
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--gLNVry4A--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fzmr70s6at2c0eyqlw8l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--gLNVry4A--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/fzmr70s6at2c0eyqlw8l.png" alt="feature architecture diagrams" width="880" height="495"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Description
&lt;/h3&gt;

&lt;p&gt;A library for audio and music analysis and feature extraction. It supports dozens of time-frequency analysis and transformation methods, as well as hundreds of corresponding time-domain and frequency-domain feature combinations, which can be provided to deep learning networks for training and used to study classification, separation, music information retrieval (MIR), ASR, and other tasks in the audio field.&lt;/p&gt;

&lt;h3&gt;
  
  
  Link to Source Code
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/libAudioFlux/audioFlux"&gt;https://github.com/libAudioFlux/audioFlux&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Permissive License
&lt;/h3&gt;

&lt;p&gt;MIT&lt;/p&gt;

&lt;h3&gt;
  
  
  How I built it
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Systematic and multi-dimensional feature extraction and combination can be flexibly used for various task research and analysis.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Performance is efficient: the core is mostly implemented in C, and FFT hardware acceleration on different platforms makes large-scale data feature extraction convenient.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It is suitable for mobile platforms and supports real-time computation of audio streams on mobile devices.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Additional Resources/Info
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://audioflux.top/"&gt;Document&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Code Demo
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt; pip &lt;span class="nb"&gt;install &lt;/span&gt;audioflux
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;numpy&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;audioflux&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;matplotlib.pyplot&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;audioflux.display&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fill_spec&lt;/span&gt;

&lt;span class="c1"&gt;# Get a 220Hz's audio file path
&lt;/span&gt;&lt;span class="n"&gt;sample_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;utils&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sample_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;'220'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Read audio data and sample rate
&lt;/span&gt;&lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sample_path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Extract mel spectrogram
&lt;/span&gt;&lt;span class="n"&gt;spec_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mel_fre_band_arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mel_spectrogram&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radix2_exp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samplate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;spec_arr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spec_arr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Extract mfcc
&lt;/span&gt;&lt;span class="n"&gt;mfcc_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;af&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mfcc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cc_num&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mel_num&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radix2_exp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samplate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Display
&lt;/span&gt;&lt;span class="n"&gt;audio_len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;audio_arr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="c1"&gt;# calculate x/y-coords
&lt;/span&gt;&lt;span class="n"&gt;x_coords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;linspace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;audio_len&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;sr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;spec_arr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;shape&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;y_coords&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;np&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mel_fre_band_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fill_spec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spec_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;x_coords&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x_coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_coords&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;y_coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;x_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'time'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'log'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'Mel Spectrogram'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;colorbar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;subplots&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fill_spec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mfcc_arr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;axes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;x_coords&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;x_coords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x_axis&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'time'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;'MFCC'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;fig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;colorbar&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ax&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;plt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---mC6UF9r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0j9yx2du85ho3iuwuv0k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---mC6UF9r--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/0j9yx2du85ho3iuwuv0k.png" alt="mel spectrogram and mfcc" width="880" height="371"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>opensource</category>
      <category>c</category>
    </item>
  </channel>
</rss>
