SUPPORT VECTOR MACHINES
CONTENTS
* MACHINE LEARNING ALGORITHM
* CLASSIFICATION USING SVM
* KERNELS IN SVM
* ISSUES
* SVM REGRESSION
* IMPLEMENTATION
MACHINE LEARNING ALGORITHM
In general, a machine learning algorithm is used to predict outputs for future inputs based upon previously collected data.
Support vector machines (SVMs) are supervised machine learning algorithms applied to both regression and classification problems. SVMs are grounded in statistical learning theory, and their robustness and accuracy have made them among the most widely used algorithms in machine learning.
CLASSIFICATION USING SVM
• In this classification algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features) with the value of each feature being the value of a particular coordinate. Then we perform classification by finding the hyperplane that best differentiates the two classes.
• A hyperplane is the boundary that separates data belonging to different classes: a line in two dimensions, a plane in three, and in general an (n−1)-dimensional subspace.
[Figure: a hyperplane separating two classes of points]
CLASSIFICATION OF LINEAR DATA
Consider linearly separable data that can be represented graphically in just two dimensions. Such data can be classified easily using a hyperplane.
Plotting the data on a graph shows that:
• More than one hyperplane can divide the data.
• The quality of the classification depends upon which hyperplane is chosen.
CHOOSING THE BEST HYPER-PLANE
• According to the SVM algorithm, we find the points closest to the separating line from both classes. These points are called support vectors.
• Now we compute the distance between the line and the support vectors. This distance is called the margin.
• Our goal is to maximize the margin.
• The hyperplane for which the margin is maximum is the optimal hyperplane.
• Thus SVM tries to place the decision boundary so that the separation between the two classes is as wide as possible, as the sketch below illustrates.
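As a concrete illustration, here is a minimal sketch using scikit-learn (an assumption of this deck's Python setting; the toy dataset is made up). It fits a linear SVM and reads off the support vectors and the margin width 2/||w||, where w is the learned weight vector in standard SVM notation:

import numpy as np
from sklearn.svm import SVC

# Toy linearly separable data: two small clusters in 2-D.
X = np.array([[1, 2], [2, 3], [2, 1], [6, 5], [7, 7], [8, 6]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel='linear', C=1e6)   # very large C approximates a hard margin
clf.fit(X, y)

print("Support vectors:\n", clf.support_vectors_)
w = clf.coef_[0]                    # weight vector of the separating hyperplane
print("Margin width:", 2 / np.linalg.norm(w))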
CLASSIFICATION OF NON-LINEAR DATA
Classifying linearly separable data is easy: construct a hyperplane. But non-linear data cannot be separated by drawing a straight line.
The solution is to map the data into another space in which it can be separated linearly.
Consider non-linearly separable data, such as one class lying inside a circle and the other outside it. Add one more dimension, a z-axis, with z = x² + y².
Now the data is clearly linearly separable. Let the line separating the data in the higher dimension be z = k, where k is a constant. Since z = x² + y², we get x² + y² = k, which is the equation of a circle. So we can project this linear separator in the higher dimension back into the original dimensions using this transformation.
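Here is a minimal sketch of that lifting on synthetic circular data (the dataset and radii are illustrative assumptions; scikit-learn is assumed): after adding the coordinate z = x² + y², a plain linear SVM separates the two rings perfectly.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
theta = rng.uniform(0, 2 * np.pi, 100)
radius = np.concatenate([rng.uniform(0, 1, 50),    # inner class
                         rng.uniform(2, 3, 50)])   # outer class
X = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])
y = np.array([0] * 50 + [1] * 50)

# Lift into 3-D by adding the dimension z = x^2 + y^2.
Z = np.column_stack([X, (X ** 2).sum(axis=1)])

clf = SVC(kernel='linear').fit(Z, y)
print("Training accuracy after lifting:", clf.score(Z, y))  # 1.0 expected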
KERNELS IN SVM
Non-linear data can be classified by adding an extra dimension, but finding the correct transformation is not easy every time. For this purpose we use kernels.
Kernels are used because explicitly constructing a high-dimensional feature space is computationally costly.
DIFFERENT TYPES OF KERNEL FUNCTIONS
A kernel function is defined as a function that corresponds to a dot product of two feature vectors in some expanded feature space:
K(xᵢ, xⱼ) = φ(xᵢ) · φ(xⱼ)
Now we only need to compute K(xᵢ, xⱼ); we never need to perform computations in the high-dimensional space explicitly. This is what is called the kernel trick.
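To see the kernel trick concretely, here is a small sketch (the vectors are arbitrary examples): for 2-D inputs, the kernel K(x, y) = (x · y)² equals the dot product of the explicit degree-2 feature maps φ(x) = (x₁², √2·x₁x₂, x₂²), so the mapping itself never has to be computed.

import numpy as np

def phi(v):
    # Explicit degree-2 feature map from 2-D input space into 3-D feature space.
    return np.array([v[0] ** 2, np.sqrt(2) * v[0] * v[1], v[1] ** 2])

x = np.array([1.0, 2.0])
z = np.array([3.0, 1.0])

print(np.dot(phi(x), phi(z)))   # dot product in feature space: 25.0
print(np.dot(x, z) ** 2)        # kernel computed in input space: 25.0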
Some commonly used kernel functions are:
Polynomial kernel
Gaussian kernel
Sigmoid kernel
Linear Kernel:
The linear kernel is the simplest kernel function. It is given by the inner product plus an optional constant c: K(x, y) = x · y + c.
Polynomial Kernel:
The polynomial kernel is a non-stationary kernel: K(x, y) = (α x · y + c)^d, with slope α, constant c, and degree d. Polynomial kernels are well suited to problems where all the training data is normalized.
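For reference, here is how these kernels are exposed in scikit-learn's SVC (scikit-learn assumed; its coef0 parameter plays the role of the constant c):

from sklearn.svm import SVC

linear  = SVC(kernel='linear')                     # K(x, y) = x . y
poly    = SVC(kernel='poly', degree=3, coef0=1.0)  # K(x, y) = (gamma x . y + c)^d
rbf     = SVC(kernel='rbf', gamma=0.5)             # Gaussian: exp(-gamma ||x - y||^2)
sigmoid = SVC(kernel='sigmoid', coef0=0.0)         # tanh(gamma x . y + c)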
IMPORTANT KERNEL ISSUES
For most kernel functions we do not know the corresponding mapping function explicitly, so we do not know to which dimension the data was raised. So even though raising the data to a higher dimension increases the likelihood that it becomes separable, we cannot guarantee it. We will see a compromise solution for this problem shortly.
Secondly, a strong kernel, which lifts the data to an infinite-dimensional space, may sometimes lead to the severe problem of overfitting.
Symptoms of overfitting:
1. Low margin → poor classification performance.
2. Large number of support vectors → slow computation.
The biggest limitation of SVM lies in the choice of the kernel (the best choice of kernel for a given problem is still an open research question). A second limitation is speed and size, mostly in training: training on large sets is expensive, although the trained model typically keeps only a small number of support vectors, which minimizes the computational requirements during testing. The optimal design of multiclass SVM classifiers is also an active research area.
OVERFITTING PROBLEM
A well-known problem with machine learning methods is overtraining.
In statistics, overfitting is "the production of an analysis that corresponds too closely or exactly to a particular set of data, and may therefore fail to fit additional data or predict future observations reliably".
In the classic illustration, a green line represents an overfitted model and a black line a regularized model. While the green line best follows the training data, it is too dependent on that data and is likely to have a higher error rate on new, unseen data compared to the black line.
Note also that data may be linearly separable, yet only with a low margin.
All these problems lead us to a compromise solution: the SOFT MARGIN.
Allowing softness in the margin (i.e. a low value of the cost parameter C) permits errors to be made while fitting the model to the training data.
Conversely, a hard margin results in a model that allows zero training errors.
Allowing some errors on the training set can be helpful, because it may produce a model that generalizes better to new datasets, as the sketch below suggests.
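A small sketch of this trade-off (scikit-learn assumed; the blob data is illustrative): the cost parameter C controls margin softness, and lowering it typically admits more support vectors along with some training errors.

from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two overlapping clusters, so a perfect separation is impossible.
X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.5, random_state=0)

for C in (0.01, 1.0, 100.0):        # from soft towards hard margin
    clf = SVC(kernel='linear', C=C).fit(X, y)
    print(f"C={C}: {len(clf.support_vectors_)} support vectors, "
          f"training accuracy {clf.score(X, y):.2f}")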
SUPPORT VECTOR REGRESSION (SVR)
Regression tries to fit the data to a model in order to predict a continuous quantity.
The Support Vector Regression (SVR) process is very similar to classification, with minor differences: the output is a real number, predicted by substituting the input's attribute values into a function constructed from the relationships between the attributes of the already available data, as sketched below.
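A minimal SVR sketch, assuming scikit-learn (the noisy sine data is illustrative): the epsilon parameter defines a tube around the fitted function inside which training errors are ignored.

import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)   # 80 sample points
y = np.sin(X).ravel() + rng.normal(0, 0.1, 80)      # noisy sine target

reg = SVR(kernel='rbf', C=10.0, epsilon=0.1).fit(X, y)
print("Prediction at x = 2.0:", reg.predict([[2.0]])[0])  # close to sin(2) ~ 0.91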
IMPLEMENTING SVM IN PYTHON
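A minimal end-to-end sketch, assuming scikit-learn and its bundled iris dataset; the parameter choices (RBF kernel, C=1.0) are illustrative defaults rather than tuned values.

from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load a standard multiclass dataset and hold out 30% for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

# Fit the classifier and evaluate on unseen data.
clf = SVC(kernel='rbf', C=1.0, gamma='scale')
clf.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))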
REAL TIME APPLICATIONS
The aim of using SVM is to correctly classify unseen data. SVMs have a number of applications in several fields.
Some common applications of SVM are:
• Face detection
• Text and hypertext categorization
• Classification of images
• Bioinformatics
• Protein fold and remote homology detection
• Handwriting recognition
• Generalized predictive control (GPC)
QUERIES?
THE END
MANASWINI MYSORE
15BD1A0531
