
Support Vector Machines: Geometrical Interpretation

SVMs are one of the most popular machine learning techniques and can be used for both classification and regression tasks. SVMs started becoming popular in the 1990s. In short, an SVM is a supervised machine learning model that uses a hyperplane to differentiate two classes, and its objective (for classification problems) is to maximize the distance between the +ve and the -ve points. So let’s understand the geometrical interpretation of SVM in detail.
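Before diving into the geometry, here is a minimal sketch of what using an SVM looks like in practice. It assumes scikit-learn is installed; the 2-D points and labels are made up purely for illustration:

```python
# A minimal sketch: fitting a linear SVM classifier on a tiny toy dataset.
# Assumes scikit-learn is installed; the data below is illustrative only.
from sklearn.svm import SVC

# Toy 2-D points: +ve class (label 1) and -ve class (label 0)
X = [[2.0, 3.0], [3.0, 3.5], [2.5, 4.0],   # +ve points
     [7.0, 1.0], [8.0, 0.5], [7.5, 2.0]]   # -ve points
y = [1, 1, 1, 0, 0, 0]

# kernel='linear' fits a separating hyperplane; a very large C
# approximates the hard-margin SVM discussed in this post.
clf = SVC(kernel='linear', C=1e6)
clf.fit(X, y)

print(clf.predict([[3.0, 3.0], [7.0, 1.5]]))  # -> [1 0]
```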

Geometrical Interpretation:

1. Introduction

Suppose we have some points, and just to keep things simple, let the data be linearly separable.

As we can see, this data is linearly separable, so there are multiple hyperplanes that can separate it; here we are just considering these two hyperplanes for simplicity.
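To make the “multiple hyperplanes” point concrete, here is a small sketch. The two (w, b) pairs are hand-picked assumptions for illustration; both hyperplanes w·x + b = 0 separate the same toy data:

```python
# Sketch: with linearly separable data, many different hyperplanes can
# separate the two classes. The data and the two (w, b) pairs below are
# hand-picked assumptions for illustration, not fitted values.
import numpy as np

X_pos = np.array([[2.0, 3.0], [3.0, 3.5], [2.5, 4.0]])   # +ve points
X_neg = np.array([[7.0, 1.0], [8.0, 0.5], [7.5, 2.0]])   # -ve points

def separates(w, b):
    """True if w.x + b > 0 for every +ve point and < 0 for every -ve point."""
    return bool((X_pos @ w + b > 0).all() and (X_neg @ w + b < 0).all())

plane_a = (np.array([-1.0, 0.0]), 5.0)   # the vertical line x = 5
plane_b = (np.array([-1.0, 1.0]), 2.0)   # the tilted line y = x - 2

print(separates(*plane_a), separates(*plane_b))  # True True: both separate
```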

Now let’s change the data a little for easier understanding.

Now the question arises: out of the two hyperplanes π1 and π2, which should we prefer?

If we choose π1, then there are many points close to the hyperplane (marked by a circle). As these points are very close to the hyperplane, a slight change in the hyperplane could cause them to be misclassified, which is something we must try to avoid.

Another fact we need to understand: points that are closer to a hyperplane have a lower probability of belonging to their class than points that are far from the hyperplane (on the same side, not in the opposite direction).

In the above example, the +ve point closer to π1 has a probability of 0.55 of belonging to its class, while the point farther away has a probability of 0.9 of belonging to its class.

Thus the objective of SVM is to find a hyperplane that separates the +ve and the -ve points as widely as possible. So we will choose the hyperplane π2 over π1. FYI, a hyperplane that tries to separate the +ve points from the -ve points as widely as possible is called a margin-maximizing hyperplane. So π2 here is a margin-maximizing hyperplane.
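One way to compare candidate hyperplanes numerically is the distance of their nearest point, dist(x, π) = |w·x + b| / ||w||: among hyperplanes that separate the data, the margin-maximizing choice is the one whose closest point is farthest away. A sketch, reusing the illustrative toy data and hand-picked hyperplanes from above:

```python
# Sketch: comparing candidate separating hyperplanes by the distance of
# their nearest point, dist(x, pi) = |w.x + b| / ||w||.
# Data and (w, b) pairs are the illustrative ones from the sketches above.
import numpy as np

X = np.array([[2.0, 3.0], [3.0, 3.5], [2.5, 4.0],
              [7.0, 1.0], [8.0, 0.5], [7.5, 2.0]])

def nearest_distance(w, b):
    """Distance from the hyperplane w.x + b = 0 to the closest point in X."""
    return np.min(np.abs(X @ w + b) / np.linalg.norm(w))

plane_a = (np.array([-1.0, 0.0]), 5.0)
plane_b = (np.array([-1.0, 1.0]), 2.0)

# Among separating hyperplanes, prefer the one whose nearest point is
# farthest away -- that is the margin-maximizing choice.
print(nearest_distance(*plane_a), nearest_distance(*plane_b))
```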

2. Margin-Maximizing Hyperplane

If we keep drawing hyperplanes parallel to the hyperplane π, at some point we will get a plane that intersects the first +ve point; we call it π+. Similarly, we will get a plane that intersects the first -ve point; we call it π-. Both π+ and π- are parallel to π (and hence to each other). The points lying on π+ and π- are called support vectors (important for understanding the mathematical formulation).
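In scikit-learn, the support vectors of a fitted linear SVM can be read off directly. A sketch continuing the illustrative toy example (the very large C approximates the hard-margin case discussed here):

```python
# Sketch: after fitting a (near) hard-margin linear SVM, scikit-learn
# exposes the support vectors -- the points lying on pi+ and pi-.
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 3.0], [3.0, 3.5], [2.5, 4.0],
              [7.0, 1.0], [8.0, 0.5], [7.5, 2.0]])
y = np.array([1, 1, 1, 0, 0, 0])

clf = SVC(kernel='linear', C=1e6).fit(X, y)

print(clf.support_vectors_)  # the points lying on pi+ and pi-
print(clf.support_)          # their indices in X
```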

Let the distance between π+ and π- be d; this distance is also called the margin.

We want to maximize the value of d, because a greater value of d means that the +ve and the -ve points are farther away from each other, and the wider the gap, the better for us. So SVM basically tries to maximize the margin, i.e. dist(π+, π-).
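In the standard formulation (which the mathematical part builds on), π+ and π- are taken to be the planes w·x + b = +1 and w·x + b = -1, so the margin works out to d = 2 / ||w||. A quick sketch of reading d off a fitted model, again on the illustrative toy data:

```python
# Sketch: with pi+ and pi- taken as w.x + b = +1 and w.x + b = -1,
# the margin is d = 2 / ||w||.
import numpy as np
from sklearn.svm import SVC

X = np.array([[2.0, 3.0], [3.0, 3.5], [2.5, 4.0],
              [7.0, 1.0], [8.0, 0.5], [7.5, 2.0]])
y = np.array([1, 1, 1, 0, 0, 0])

clf = SVC(kernel='linear', C=1e6).fit(X, y)

w = clf.coef_[0]                    # normal vector of the separating hyperplane
margin = 2.0 / np.linalg.norm(w)    # d = dist(pi+, pi-)
print(margin)
```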

If margin increases => misclassification decreases => generalization accuracy increases (accuracy on future unseen data).

