DEV Community

kojix2

Easy machine learning with Ruby using Rumale

What is Rumale

A powerful library for machine learning written in pure Ruby!
Rumale was created by @yoshoku.
https://github.com/yoshoku/rumale

Rumale (Ruby machine learning) is a machine learning library for Ruby. It provides machine learning algorithms with interfaces similar to scikit-learn in Python. Rumale supports Linear / Kernel Support Vector Machine, Logistic Regression, Linear Regression, Ridge, Lasso, Factorization Machine, Naive Bayes, Decision Tree, AdaBoost, Gradient Tree Boosting, Random Forest, Extra-Trees, K-nearest neighbor classifier, K-Means, Gaussian Mixture Model, DBSCAN, Power Iteration Clustering, Multidimensional Scaling, t-SNE, Principal Component Analysis, and Non-negative Matrix Factorization.

Install

gem install rumale
# the examples below also require daru and rdatasets
gem install daru rdatasets

Prepare a dataset

require 'rumale'
require 'daru'
require 'rdatasets'

# Load the iris dataset
iris = RDatasets.load(:datasets, :iris)
# => Daru::DataFrame

# Labels: species names encoded as integers (Numo::Int32, shape [150])
iris_labels = iris['Species'].to_a
encoder = Rumale::Preprocessing::LabelEncoder.new
labels = encoder.fit_transform(iris_labels) 

# Samples: the four feature columns (Numo::DFloat, shape [150, 4])
# (Daru -> NArray)
samples = Numo::DFloat[*iris[0..3].to_matrix.to_a]
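
To make the label-encoding step concrete, here is a minimal plain-Ruby sketch of the idea. It is only an illustration, not Rumale's implementation: the `species` array is made-up sample data, and the sketch assumes integer IDs are assigned to class names in sorted order.

```ruby
# Plain-Ruby sketch of label encoding (illustrative only, not Rumale's code).
species = %w[setosa versicolor virginica setosa virginica]

# Assumption: each distinct class name gets an integer ID in sorted order.
classes = species.uniq.sort
index_of = classes.each_with_index.to_h

# Encode names to integers, then decode them back to check the mapping.
encoded = species.map { |name| index_of[name] }
decoded = encoded.map { |id| classes[id] }

p classes # => ["setosa", "versicolor", "virginica"]
p encoded # => [0, 1, 2, 0, 2]
```

Keeping the `classes` array around is what makes it possible to turn predicted integers back into species names later.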

Classification models

# Support vector machine
model = Rumale::LinearModel::SVC.new(
  reg_param: 0.0001,
  fit_bias: true,
  max_iter: 3000,
  random_seed: 1
)

Various classifiers

# Pick one; each line is an alternative to the SVC model above
model = Rumale::Tree::DecisionTreeClassifier.new(random_seed: 1)
model = Rumale::Ensemble::RandomForestClassifier.new(random_seed: 1)
model = Rumale::NearestNeighbors::KNeighborsClassifier.new(n_neighbors: 5)
model = Rumale::NaiveBayes::GaussianNB.new
# etc...

Cross validation

# Stratified K-fold splitter (keeps class proportions similar in each fold)
kf = Rumale::ModelSelection::StratifiedKFold.new(
  n_splits: 5,
  random_seed: 1
)

cv = Rumale::ModelSelection::CrossValidation.new(
  estimator: model,
  splitter: kf
)
report = cv.perform(samples, labels)
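
For readers wondering what "stratified" means here, the following plain-Ruby sketch shows one simple way to build folds that keep class proportions equal. `stratified_folds` is a hypothetical helper written for this illustration; it does not shuffle and is not how Rumale implements its splitter.

```ruby
# Hypothetical helper: split sample indices into folds while keeping
# the class ratios of each fold close to those of the full dataset.
def stratified_folds(labels, n_splits)
  folds = Array.new(n_splits) { [] }
  # Group sample indices by class, then deal each class round-robin into folds.
  labels.each_index.group_by { |i| labels[i] }.each_value do |indices|
    indices.each_with_index { |idx, k| folds[k % n_splits] << idx }
  end
  folds
end

# Toy labels: 3 classes with 10 samples each (made-up data).
labels = [0] * 10 + [1] * 10 + [2] * 10
folds = stratified_folds(labels, 5)

folds.each_with_index do |test_ids, k|
  counts = test_ids.group_by { |i| labels[i] }.transform_values(&:size)
  puts "fold #{k}: #{counts}"
end
```

Each fold serves once as the test set while the remaining folds are used for training, so every sample is tested exactly once.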

Result

scores = report[:test_score]
puts scores.sum / scores.size
# 0.9466666666666667

Learning and Predicting

# Learning
model.fit(samples, labels)

# Predicting
# predict accepts a 2D NArray of samples (here Numo::DFloat, shape [150, 4])
p model.predict(samples).to_a
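
Once predictions are converted to a plain Array, checking them against the true labels takes a few lines of Ruby. The sketch below uses a hypothetical `accuracy` helper and made-up label arrays. Note that predicting on the same samples used for fitting, as above, tends to give an optimistic score compared with cross validation.

```ruby
# Hypothetical helper: fraction of predictions that match the true labels.
def accuracy(y_true, y_pred)
  raise ArgumentError, 'length mismatch' unless y_true.size == y_pred.size

  correct = y_true.zip(y_pred).count { |truth, pred| truth == pred }
  correct.to_f / y_true.size
end

# Made-up labels for illustration: 5 of the 6 predictions are correct.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]

puts accuracy(y_true, y_pred)
```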

Save and load models

# Save a model
File.binwrite("model.dat", Marshal.dump(model))

# Load a model
model = Marshal.load(File.binread("model.dat"))
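
The same Marshal round trip can be tried without any trained model. In this sketch a plain Hash of made-up parameters stands in for the model, and the file name `toy_model.dat` is arbitrary.

```ruby
require 'tmpdir'

# Stand-in for a trained model: a Hash of made-up parameters.
original = { weights: [0.5, -1.2, 3.0], bias: 0.1 }

# Serialize to a binary file, then read it back.
path = File.join(Dir.tmpdir, 'toy_model.dat')
File.binwrite(path, Marshal.dump(original))
restored = Marshal.load(File.binread(path))

puts restored == original # => true
File.delete(path)
```

Marshal.load can instantiate arbitrary Ruby objects, so only load model files that come from a source you trust.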

Enjoy!
