ระบบการแยกแยะชนิดของเพลง โดยใช้ Machine Learning

WachirawitPhreuk — Sun, 14 Apr 2024 17:20:41 +0000

เพลงในโลกของเรานั้นมีหลากหลายแนวด้วยกัน ไม่ว่าจะเป็นป๊อปที่หลายคนคุ้นหู หรือแนวเพลงร็อคที่เร้าใจและทำให้คุณได้โยกหัว เพลงทุกแนวนั้นเป็นศิลปะ ที่บอกถึงเรื่องราวของตัวเอง และความรู้สึกผ่านตัวโน้ตและจังหวะ

ในบทความนี้ เราต้องการสร้างโมเดลที่จำแนกประเภทของเพลงต่างๆ ได้โดย ซึ่งเราต้องการรู้ว่าตัวโมเดลจะสามารถจำแนกประเภทตัวอย่างเพลงที่ถูกป้อนเข้าให้ได้ถูกต้องและแม่นยำหรือเปล่า

โดยทั้งหมดนี้เราจะใช้ Machine Learning กับโค้ดภาษา Python และทำใน Google Colab

ก่อนอื่น ดาวน์โหลดไฟล์นี้ หรือนำเข้า Google Drive ไฟล์

ขั้นตอนการทำงาน

ขั้นตอนที่ 1: นำเข้า Libraries และ Dataset
Libraries ที่นำเข้านั้นประกอบไปด้วย Pandas สำหรับนำเข้า dataset, Matplotlib และ Seaborn สำหรับการแสดงข้อมูลออกเป็นภาพ, Numpy สำหรับการ scaling และ correlation และ Librosa สำหรับการแสดงข้อมูลเสียงให้เป็นภาพ

import numpy as np 
import pandas as pd 
import matplotlib.pyplot as plt 
import seaborn as sns 
import librosa.display

จากนั้นจึงนำเข้าไฟล์ ‘file.csv’ ที่ได้ดาวน์โหลดไว้

music_data = pd.read_csv('file.csv') 
music_data.head(5)

Output

ขั้นตอนที่ 2: วิเคราะห์ข้อมูลที่นำเข้ามา

ขั้นตอนนี้เป็นการเช็คว่าข้อมูลที่เราได้มานั้น มีอะไรบ้าง มีลักษณะเป็นอย่างไร

เช่นหาว่าแต่ละ label มีกี่อัน

music_data['label'].value_counts()

Output

จาก Output พบว่าเราได้มีการใส่เพลงในแต่ละประเภททั้งหมด 100 อัน

หรือวิเคราะห์เพลงในไฟล์ว่ามีรูปแบบอย่างไร ในตัวอย่างนี้เราจะให้วิเคราะห์หมวดหมู่ Disco โดยใช้ไฟล์เสียง disco.00001.wav

path = 'drive/MyDrive/genres_original/disco/disco.00001.wav'
plt.figure(figsize=(14, 5)) 
x, sr = librosa.load(path) 
librosa.display.waveshow(x, sr=sr) 

print("Disco")

Output

วิเคราะห์เพลงหมวดหมู่ Jazz โดยใช้ไฟล์เสียง jazz.00001.wav

path = 'drive/MyDrive/genres_original/jazz/jazz.00001.wav'
plt.figure(figsize=(14, 5)) 
x, sr = librosa.load(path) 
librosa.display.waveshow(x, sr=sr) 

print("Jazz")

Output

ขั้นตอนที่ 3: ประมวลผลข้อมูล

เพื่อให้การทำงานของตัวโมเดลแยกแนวเพลงนั้นทำงานได้ เราจำเป็นต้องแปลงข้อมูล labels ให้เป็น integer โดยใช้ LabelEncoder()

from sklearn import preprocessing 
label_encoder = preprocessing.LabelEncoder() 
music_data['label'] = label_encoder.fit_transform(music_data['label'])

เพื่อความถูกต้อง และเพื่อให้ตัวโมเดลทำงานได้เร็ว ไม่ช้าเกินไปเพราะขนาดของข้อมูล เราจึง drop คอลัมน์ filename ออก และปรับขนาดข้อมูล

X = music_data.drop(['label','filename'],axis=1) 
y = music_data['label']
cols = X.columns 
minmax = preprocessing.MinMaxScaler() 
np_scaled = minmax.fit_transform(X) 

# new data frame with the new scaled data. 
X = pd.DataFrame(np_scaled, columns = cols)

ขั้นตอนที่ 4: ฝึกโมเดล

เริ่มด้วยการแยกตัวโมเดลด้วย train_test_split

from sklearn.model_selection import train_test_split 

X_train, X_test, y_train, y_test = train_test_split(X, y, 
                                                    test_size=0.3, 
                                                    random_state=111) 
X_train.shape, X_test.shape, y_train.shape, y_test.shape

เมื่อทำการแยกตัวโมเดลเรียบร้อย เราต้องการที่จะทดสอบตัวจำแนกประเภทด้วย 5 โมเดลนี้ประกอบไปด้วย KNeighborsClassifier, Decision Tree Classifier, Random Forest, Logistics Regression, Cat Boost, Gradient Boost

from sklearn.metrics import accuracy_score 
from sklearn.neighbors import KNeighborsClassifier 
from sklearn.tree import DecisionTreeClassifier 
from sklearn.ensemble import RandomForestClassifier 
from sklearn.linear_model import LogisticRegression 
import catboost as cb 
from xgboost import XGBClassifier 

rf = RandomForestClassifier(n_estimators=1000, max_depth=10, random_state=0) 
cbc = cb.CatBoostClassifier(verbose=0, eval_metric='Accuracy', loss_function='MultiClass') 
xgb = XGBClassifier(n_estimators=1000, learning_rate=0.05) 

for clf in (rf, cbc, xgb): 
    clf.fit(X_train, y_train) 
    preds = clf.predict(X_test) 
    print(clf.__class__.__name__,accuracy_score(y_test, preds))

ขั้นตอนที่ 5: Neural Network

ในขั้นตอนนี้จะเป็นการประเมินตัว Dataset ด้วย Neural Network, Neural Network คือโมเดลที่วิเคราะห์แพทเทิร์นของข้อมูลและการเลือกทางเลือกใน Machine learning ซึ่งเหมาะมากในการประเมินความถูกต้องของการจำแนก

import tensorflow.keras as keras 
from tensorflow.keras import Sequential 
from tensorflow.keras.layers import *

model = Sequential() 

model.add(Flatten(input_shape=(58,))) 
model.add(Dense(256, activation='relu')) 
model.add(BatchNormalization()) 
model.add(Dense(128, activation='relu')) 
model.add(Dropout(0.3)) 
model.add(Dense(10, activation='softmax')) 
model.summary()

Output

จากนั้นจึงนำรวบรวมโมเดล โดยใส่ epochs หรือจำนวนข้อมูลที่ใช้ในการเทรน ซึ่งในบทความนี้ใช้ 100 เราจึงใส่ไป 100

# compile the model 
adam = keras.optimizers.Adam(lr=1e-4) 
model.compile(optimizer=adam, 
            loss="sparse_categorical_crossentropy", 
            metrics=["accuracy"]) 

hist = model.fit(X_train, y_train, 
                validation_data = (X_test,y_test), 
                epochs = 100, 
                batch_size = 32)

ขั้นตอนที่ 6: ประเมินผล

ขั้นตอนนี้จะเป็นการประเมินว่าโมเดลนี้มีความถูกต้องเท่าไร โดยเริ่มจากการรันโค้ดนี้

test_error, test_accuracy = model.evaluate(X_test, y_test, verbose=1) 
print(f"Test accuracy: {test_accuracy}")

Output ของโมเดลที่ใช้ epochs ทั้งหมด 100

แล้วนำมาพล็อตกราฟ

fig, axs = plt.subplots(2,figsize=(10,10)) 

# accuracy 
axs[0].plot(hist.history["accuracy"], label="train") 
axs[0].plot(hist.history["val_accuracy"], label="test")  
axs[0].set_ylabel("Accuracy") 
axs[0].legend() 
axs[0].set_title("Accuracy") 

# Error 
axs[1].plot(hist.history["loss"], label="train") 
axs[1].plot(hist.history["val_loss"], label="test")  
axs[1].set_ylabel("Error") 
axs[1].legend() 
axs[1].set_title("Error") 

plt.show()

Output

จากการทดสอบนี้ ทำให้เราสงสัยว่า ถ้าหากเรากำหนดให้จำนวนข้อมูลที่ใช้ลดลงจาก 100 เหลือเพียง 10 เท่านั้น ตัวข้อมูลนั้นจะมีความแม่นยำขนาดไหน

Output เมื่อ epochs เท่ากับ 10

ซึ่งจากกราฟที่ได้มานั้น ไม่แม่นยำเท่าการใช้ข้อมูลที่มากเท่า 100

สรุป

โมเดลการจำแนกเพลงให้อยู่ในแต่ละประเภทนี้ จะมีประสิทธิภาพมากยิ่งขึ้นเมื่อมีข้อมูลเพลงที่มาก และใช้ Ensemble Learning และ Neural nets ในการสร้างโมเดล

จากตัวอย่างนี้ ทำให้เราพอเห็นไกด์ไลน์ในการทำโมเดลแยกประเภทได้บ้าง โดยเมื่อยิ่งมีจำนวนเพลงมากเท่าไร ความแม่นยำของข้อมูลก็จะมากขึ้นเท่านั้น ซึ่งสามารถใช้ได้กับการเทรนโมเดลตัวอื่นๆ ที่ไม่ใช่การจำแนกประเภทได้อีกด้วย

References:
https://www.geeksforgeeks.org/music-genre-classifier-using-machine-learning/
https://drive.google.com/drive/folders/189jIlfQ-E94eHfKVksbAUSXKX68Ju1GI

DEV Community: WachirawitPhreuk

ระบบการแยกแยะชนิดของเพลง โดยใช้ Machine Learning

ขั้นตอนการทำงาน

สรุป