Have you heard of Machine Learning? Classification? Regression?

What's the difference between these tasks?

It all comes down to the data and the type of problem you want to solve. All Machine Learning algorithms use data.

Computers work with numbers, so any information you have must be represented numerically. In the most basic form, each record is an (x, y) pair, which can be drawn as a point in a two-dimensional plane.
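To make the "everything becomes numbers" idea concrete, here is a minimal sketch. The measurements and color labels are made up for illustration, and using `np.unique` to turn text labels into integer codes is just one convenient option, not the only way to do it.

```python
import numpy as np

# Hypothetical records: two measurements and a color label per record.
measurements = [(1.5, 2.0), (3.0, 0.5), (2.2, 4.1), (0.3, 1.7)]
colors = ["red", "blue", "blue", "red"]

# Each record becomes one row of a numeric array: column 0 is x, column 1 is y.
X = np.array(measurements)

# Text labels become integer codes; `classes` keeps the code-to-name mapping.
classes, y = np.unique(colors, return_inverse=True)

print(X.shape)   # (4, 2)
print(classes)   # ['blue' 'red']
print(y)         # [1 0 0 1]
```

The array `X` holds the inputs and `y` holds the labels, which is the shape of data most sklearn estimators expect.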

### Classification vs regression

In classification, the task is to decide, given a new data point, whether it belongs to class blue or class red. The output is a discrete value.

Regression instead tries to predict a continuous value. That's why the points in the regression plot have only one color.

Is this hard to implement? Not really. The sklearn module (scikit-learn) ships with these algorithms out of the box, and many examples are available for both classification and regression.

The program below creates the plots:

```
#!/usr/bin/python3
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_blobs, make_regression
from sklearn.svm import LinearSVC, LinearSVR

# Global plot styling
params = {'legend.fontsize': 7,
          'figure.figsize': (7, 3),
          'axes.labelsize': 8,
          'axes.titlesize': 9,
          'xtick.labelsize': 10,
          'ytick.labelsize': 10}
plt.rcParams.update(params)


def make_classification_example(axis, random_state):
    # Two noisy clusters of points, one per class
    X, y = make_blobs(n_samples=100, n_features=2, centers=2, cluster_std=2.7, random_state=random_state)
    axis.scatter(X[y == 0, 0], X[y == 0, 1], color="red", s=10, label="Disease")
    axis.scatter(X[y == 1, 0], X[y == 1, 1], color="blue", s=10, label="Healthy")
    clf = LinearSVC().fit(X, y)
    # The separating line satisfies w[0]*x + w[1]*y + intercept = 0; solve for y
    w = clf.coef_[0]
    a = -w[0] / w[1]
    xx = np.linspace(-5, 7)
    yy = a * xx - (clf.intercept_[0]) / w[1]
    # Plot the separating line on top of the points
    axis.plot(xx, yy, color="black", label="Model")
    # Use the axis passed in (not the global ax1) and hide the tick labels
    axis.tick_params(labelbottom=False, labelleft=False)
    axis.set_xlabel("Gene 1")
    axis.set_ylabel("Gene 2")
    axis.legend()


def make_regression_example(axis, random_state):
    # One noisy feature with a continuous target
    X, y = make_regression(n_samples=100, n_features=1, noise=30.0, random_state=random_state)
    axis.scatter(X[:, 0], y, color="blue", s=10, label="Patients")
    clf = LinearSVR().fit(X, y)
    # The model is linear, so its predictions fall on a straight line
    axis.plot(X[:, 0], clf.predict(X), color="black", label="Model")
    # Use the axis passed in (not the global ax2) and hide the tick labels
    axis.tick_params(labelbottom=False, labelleft=False)
    axis.set_xlabel("Gene 1")
    axis.set_ylabel("Survived (years)")
    axis.legend()


# A fixed seed makes the generated data reproducible
random_state = np.random.RandomState(42)
f, (ax1, ax2) = plt.subplots(ncols=2)
ax1.set_title("Classification")
make_classification_example(ax1, random_state)
ax2.set_title("Regression")
make_regression_example(ax2, random_state)
plt.savefig("classification.vs.regression.png", bbox_inches="tight")
```


## Discussion (1)

Regression aims to predict a continuous output value. For example, say that you are trying to predict the revenue of a certain brand as a function of many input parameters. A regression model would literally be a function which can output potentially any revenue number based on certain inputs. It could even output revenue numbers which never appeared anywhere in your training set.

Classification aims to predict which class (a discrete integer or categorical label) the input corresponds to. For example, say that you had divided the sales into Low and High, and you were trying to build a model that could predict Low or High sales (binary, or two-class, classification). The inputs might even be the same as before, but the output would be different. In the case of classification, your model would output either "Low" or "High," and in theory every input would generate only one of these two responses.
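As a rough sketch of the two paragraphs above: the same inputs can feed a regressor and a classifier, and only the target changes. The advertising-spend numbers, the 50.0 cut-off between Low and High, and the choice of `LinearRegression` and `LogisticRegression` are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Made-up inputs: advertising spend (in thousands) for eight periods.
ad_spend = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0]])
# Made-up revenue, exactly linear in spend to keep the example predictable.
revenue = 10.0 * ad_spend[:, 0] + 5.0

# Same inputs, different target: bucket revenue into two sales classes.
sales_class = np.where(revenue >= 50.0, "High", "Low")

reg = LinearRegression().fit(ad_spend, revenue)
clf = LogisticRegression().fit(ad_spend, sales_class)

new_spend = np.array([[2.5]])
predicted_revenue = reg.predict(new_spend)[0]  # about 30.0, a value absent from the training set
predicted_label = clf.predict(new_spend)[0]    # always either "Low" or "High"
print(predicted_revenue, predicted_label)
```

The regressor returns a revenue figure that appears nowhere in `revenue`, while the classifier can only ever answer with one of the two labels it was trained on.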

The above description is true for any data science method, but there is a gray area: some algorithms predict a probability, which is a continuous value between 0 and 1. By the above definition, you can consider them regression algorithms (think of logistic regression). At the same time, this probability refers to classes, so they can be used for classification (just set a threshold for the probability: everything with probability < 0.5 goes into one class, and everything with > 0.5 into the other). How you "classify" these algorithms is a philosophical question, of little practical importance.
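The thresholding idea can be sketched as follows, assuming sklearn's `LogisticRegression` and a tiny made-up dataset. The 0.5 threshold is the one mentioned above; nothing forces that particular value.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up data: one feature, two classes.
X = np.array([[1.0], [2.0], [3.0], [4.0], [5.0], [6.0]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# predict_proba returns one column per class; column 1 is P(class 1).
prob_class_1 = model.predict_proba(X)[:, 1]

# Turn the continuous probabilities into discrete labels with a threshold.
threshold = 0.5
labels = (prob_class_1 > threshold).astype(int)

print(prob_class_1)
print(labels)
```

For a two-class problem, thresholding at 0.5 gives the same labels as calling `model.predict(X)` directly; moving the threshold trades one kind of error for the other.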

net-informations.com/ds/iq/default...