Support Vector Machines (SVM) are a powerful technique for classification and regression that separates data using hyperplanes. The method is particularly useful for data with unknown or irregular distributions, where classical assumptions such as linearity may not hold.
If we have labeled data, SVM attempts to find the best separating hyperplane between classes. Consider a simple two-class dataset:
Red and Blue points represent different classes.
Many lines or planes can separate the two classes.
The goal is to find the optimal line or hyperplane, which maximizes the distance (margin) from the nearest data points of each class.
Margin (m) is the distance between the nearest points of each class and the hyperplane:
m = 2 / ||a||, for the line y = ax + b
Maximizing the margin ensures the classifier is robust to noise and unseen test data.
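To make the classification picture concrete before moving on, here is a minimal sketch using e1071; the two simulated Gaussian clusters are an illustrative assumption, not a dataset from this article:

# Minimal two-class example: simulate two clusters and fit a linear-kernel SVM
library(e1071)

set.seed(1)
n  <- 20
df <- data.frame(
  x1 = c(rnorm(n, mean = 0), rnorm(n, mean = 3)),
  x2 = c(rnorm(n, mean = 0), rnorm(n, mean = 3)),
  class = factor(rep(c("red", "blue"), each = n))
)

# A linear kernel looks for the maximum-margin separating hyperplane
fit <- svm(class ~ x1 + x2, data = df, kernel = "linear")

# Plot the decision regions and support vectors
plot(fit, df)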
Creating a Sample Dataset in R
We can simulate a simple dataset with two features:
# Sample data
x = 1:20
y = c(3,4,5,4,8,10,10,11,14,20,23,24,32,34,35,37,42,48,53,60)

# Create a data frame
train = data.frame(x, y)

# Plot the data
plot(train, pch = 16)
At first glance, the relationship between x and y looks roughly linear, suggesting a simple linear regression could also be effective.
Linear Regression vs SVM
Linear Regression:
# Fit the linear model
model <- lm(y ~ x, train)

# Plot the regression line
abline(model)
SVM:
library(e1071)
# Fit the SVM model
model_svm <- svm(y ~ x, train)

# Predict values on the training data
pred <- predict(model_svm, train)

# Plot the predictions
points(train$x, pred, col = "blue", pch = 4)
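It is worth knowing what svm() did by default here: with a numeric response, e1071 runs eps-regression with a radial (RBF) kernel. You can confirm this by inspecting the fitted model:

# Inspect the fitted model's type, kernel, and default hyperparameters
summary(model_svm)
# Output includes: SVM-Type: eps-regression, SVM-Kernel: radial,
# cost: 1, epsilon: 0.1 (the package defaults)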
Comparing Model Performance (RMSE)
# Linear regression RMSE
lm_error <- sqrt(mean(model$residuals^2))  # ~3.83

# SVM RMSE
svm_error <- sqrt(mean((train$y - pred)^2))  # ~2.70
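Since the same RMSE expression appears in both calculations, a small helper keeps the comparisons tidy; a minimal sketch (the name rmse is our own choice, not from the article):

# Reusable RMSE helper (name and signature are illustrative)
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

rmse(train$y, predict(model, train))  # linear regression, ~3.83
rmse(train$y, pred)                   # SVM, ~2.70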
Even in this simple example, the SVM fits the data more closely than linear regression, achieving a noticeably lower RMSE.
Tuning SVM for Better Accuracy
SVM performance can be improved by tuning:
epsilon: the width of the insensitive tube in eps-regression (errors inside the tube are not penalized)
cost: the penalty for errors outside the tube (larger values fit the training data more tightly)
Grid search with cross-validation is straightforward in R:
# Grid search over epsilon and cost (tune() performs 10-fold cross-validation by default)
svm_tune <- tune(svm, y ~ x, data = train,
                 ranges = list(epsilon = seq(0, 1, 0.01),
                               cost = 2^(2:9)))
# Extract the best model
best_mod <- svm_tune$best.model
best_mod_pred <- predict(best_mod, train)

# Calculate RMSE
best_mod_RMSE <- sqrt(mean((train$y - best_mod_pred)^2))  # ~1.29
# Plot tuned model predictions
plot(train, pch = 16)
points(train$x, best_mod_pred, col = "blue", pch = 4)
The grid search finds better hyperparameters and reduces the training RMSE substantially, in this case from ~2.70 to ~1.29.
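To see which hyperparameters the search actually selected, you can inspect the tuning object directly; the exact values depend on the random cross-validation folds:

# Inspect the tuning results
print(svm_tune)            # best epsilon and cost, plus best CV performance
svm_tune$best.parameters   # best parameter combination as a one-row data frame
svm_tune$best.performance  # cross-validated mean squared error of the best model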
Visualizing SVM Tuning
plot(svm_tune)
Darker regions indicate lower cross-validation error
Use this plot to narrow the search range for further fine-tuning (see the sketch after this list)
Avoid overfitting by not using excessively fine steps
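As an illustration of narrowing the search, a second pass might look like the following; the ranges are assumptions you would read off your own tuning plot, not values from the article:

# Hypothetical second-pass grid over a narrowed region (ranges are illustrative)
svm_tune2 <- tune(svm, y ~ x, data = train,
                  ranges = list(epsilon = seq(0, 0.2, 0.01),
                                cost = 2^(4:7)))
svm_tune2$best.parameters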
Key Takeaways
SVM is robust: It maximizes the margin, making it less sensitive to noise and small perturbations in the data.
Linear vs Non-linear:
Linear SVM for simple datasets
Kernel SVM (e.g., the RBF/Gaussian kernel or a polynomial kernel) for complex, non-linear data; see the sketch after this list
Tuning matters: Cost and epsilon significantly impact performance
Comparison to Regression:
Linear regression works well for truly linear patterns
SVM excels when data is noisy or non-linear
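In e1071, the kernel is chosen via the kernel argument of svm(); a minimal sketch comparing the default radial kernel with an explicitly linear one on this article's dataset:

# The default kernel is "radial" (RBF/Gaussian); a linear kernel can be requested explicitly
model_rbf    <- svm(y ~ x, train)                     # kernel = "radial" by default
model_linear <- svm(y ~ x, train, kernel = "linear")

# Compare training RMSE under each kernel
sqrt(mean((train$y - predict(model_rbf, train))^2))
sqrt(mean((train$y - predict(model_linear, train))^2))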
Why Use SVM?
Works with non-linear and unknown data distributions
Provides interpretable hyperplanes
Performs well even with high-dimensional data
Flexible through kernel methods
Caution: SVM can overfit if tuning isn’t handled carefully, especially on small datasets.
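One simple safeguard is to evaluate the tuned model on data it was not trained on. A minimal hold-out sketch, assuming you have enough observations to spare (the 70/30 split and seed are arbitrary choices):

# Hold-out evaluation sketch (split ratio and seed are illustrative)
set.seed(42)
idx <- sample(seq_len(nrow(train)), size = floor(0.7 * nrow(train)))
tr  <- train[idx, ]
te  <- train[-idx, ]

fit     <- svm(y ~ x, tr)
pred_te <- predict(fit, te)
sqrt(mean((te$y - pred_te)^2))  # test RMSE: a fairer estimate of generalization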
Full R Code
x = 1:20
y = c(3,4,5,4,8,10,10,11,14,20,23,24,32,34,35,37,42,48,53,60)
train = data.frame(x, y)
plot(train, pch=16)
# Linear regression
model <- lm(y ~ x, train)
abline(model)
# SVM
library(e1071)
model_svm <- svm(y ~ x, train)
pred <- predict(model_svm, train)
points(train$x, pred, col="blue", pch=4)
# RMSE
lm_error <- sqrt(mean(model$residuals^2))
svm_error <- sqrt(mean((train$y - pred)^2))
# Grid search for tuning
svm_tune <- tune(svm, y ~ x, data=train,
ranges=list(epsilon=seq(0,1,0.01),
cost=2^(2:9)))
best_mod <- svm_tune$best.model
best_mod_pred <- predict(best_mod, train)
best_mod_RMSE <- sqrt(mean((train$y - best_mod_pred)^2))
# Plot results
plot(svm_tune)
plot(train, pch=16)
points(train$x, best_mod_pred, col="blue", pch=4)
This walkthrough highlights both the conceptual understanding behind SVM and a hands-on R implementation, from a first fit through hyperparameter tuning.
At Perceptive Analytics, we help organizations unlock actionable insights from their data. Recognized among leading AI Consulting Companies, we guide businesses in adopting AI solutions that enhance forecasting, automate processes, and improve decision-making. Organizations looking to hire Power BI consultants rely on us to build scalable dashboards, automate reporting, and provide real-time insights that drive smarter business outcomes.