SVM, A Machine Learning Algorithm
Friendly Introduction to Machine Learning
Learn from experience · Learn from data · Follow instructions
What is Machine Learning?
Machine Learning is the science of getting computers to act without being
explicitly programmed.
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
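The contrast can be made concrete with a toy sketch (the task, data, and threshold rule here are invented purely for illustration): a hand-written rule versus the same kind of rule derived from labeled examples.

```python
# Traditional programming: a human writes the rule.
def is_hot_rule(temp_c):
    return temp_c > 30  # threshold chosen by the programmer

# Machine learning: the rule (here, a threshold) is derived from data.
labeled = [(15, False), (18, False), (22, False),
           (31, True), (35, True), (40, True)]

def learn_threshold(examples):
    hot = [t for t, label in examples if label]
    cold = [t for t, label in examples if not label]
    # Midpoint between the coolest "hot" and warmest "cold" reading
    # acts as the learned decision boundary.
    return (min(hot) + max(cold)) / 2

threshold = learn_threshold(labeled)

def is_hot_learned(temp_c):
    return temp_c > threshold
```

The learned classifier agrees with the hand-written one on this data, but unlike the hand-written rule it adapts automatically when the data changes.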
Why is it trending now?
 Availability of many different kinds of data.
 Greater computational power.
 Exponential decrease in the price of powerful computing resources.
 Ability to train computers to do things that are very difficult to program by hand.
Machine Learning Methods
ML Methods
 Supervised Learning: Classification, Regression
 Unsupervised Learning: Clustering, Association
Supervised Learning: Regression
Bedrooms | Sq. feet | Neighbourhood | Rent
2 | 2000 | K.R.Puram | 25000
3 | 2000 | Banashankari | 20000
1 | 850 | K.R.Puram | 15000
3 | 2500 | K.R.Puram | 35000
4 | 3000 | Whitefield | 40000
2 | 1500 | K.R.Puram | ?
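One simple way to fill in the missing rent is a nearest-neighbour sketch (not what any particular tool does; the distance weighting below is invented for illustration): find the most similar known flat and copy its rent.

```python
# Predict the unknown rent by copying the rent of the most similar listing.
listings = [
    (2, 2000, "K.R.Puram", 25000),
    (3, 2000, "Banashankari", 20000),
    (1, 850, "K.R.Puram", 15000),
    (3, 2500, "K.R.Puram", 35000),
    (4, 3000, "Whitefield", 40000),
]

def distance(a, b):
    beds_a, sqft_a, hood_a = a
    beds_b, sqft_b, hood_b = b
    # Scale sq. feet down so both numeric features contribute comparably,
    # and add a fixed penalty when the neighbourhood differs (crude, but
    # enough for a sketch).
    return (abs(beds_a - beds_b)
            + abs(sqft_a - sqft_b) / 1000
            + (0 if hood_a == hood_b else 1))

def predict_rent(query):
    nearest = min(listings, key=lambda row: distance(query, row[:3]))
    return nearest[3]

rent = predict_rent((2, 1500, "K.R.Puram"))  # nearest listing: 2 BHK, 2000 sq. ft
```

A real regression model would interpolate between listings rather than copy one, but the supervised setup is the same: known (features, rent) pairs train a predictor for unseen rows.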
Supervised Learning: Classification
Sepal Length | Sepal Width | Petal Length | Petal Width | Class
5.1 | 3.5 | 1.4 | 0.2 | Iris-setosa
4.9 | 3.0 | 1.4 | 0.2 | Iris-setosa
7.0 | 3.2 | 4.7 | 1.4 | Iris-versicolor
6.4 | 3.2 | 4.5 | 1.5 | Iris-versicolor
6.4 | 3.2 | 5.3 | 2.3 | Iris-virginica
7.9 | 3.8 | 6.4 | 2.0 | Iris-virginica
Unsupervised Learning
Bedrooms | Sq. feet | Neighbourhood
2 | 2000 | K.R.Puram
3 | 2000 | Banashankari
1 | 850 | K.R.Puram
3 | 2500 | K.R.Puram
4 | 3000 | Whitefield
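With no Rent column there is nothing to predict; an unsupervised method instead groups similar rows. A minimal one-dimensional k-means sketch on the square footage (k = 2 and the starting centers are invented for illustration):

```python
# Minimal k-means on a single feature (sq. feet), k = 2.
sqft = [2000, 2000, 850, 2500, 3000]

def kmeans_1d(values, centers, iters=10):
    for _ in range(iters):
        # Assignment step: each value joins its nearest center.
        groups = {c: [] for c in centers}
        for v in values:
            nearest = min(centers, key=lambda c: abs(v - c))
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group.
        centers = [sum(g) / len(g) if g else c for c, g in groups.items()]
    return sorted(centers)

centers = kmeans_1d(sqft, centers=[1000, 2500])
# The small flat (850 sq. ft) forms one cluster; the larger flats form the other.
```

No labels were used: the algorithm discovered the "small vs. large" grouping from the feature values alone, which is the essence of unsupervised learning.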
Data → split into Training Data and Test Data
Training Data → Machine Learning Algorithm → Classifier
Test Data → Classifier → Prediction
Support Vector Machine
 Vladimir Vapnik laid most of the groundwork for SVM while working on his PhD thesis in the Soviet Union in the 1960s.
 A supervised machine learning algorithm.
 Used for both classification and regression.
 Fundamentally a binary classifier.
Support Vector Machine
SVM is a classification method that performs its task by constructing separating hyperplanes.
Identifying the right hyperplane?
Margin
The margin is the perpendicular distance between the hyperplane and the nearest data points on either side; SVM chooses the hyperplane that maximizes this margin.
How to compute MARGIN?
 Consider w perpendicular to the median (the separating hyperplane).
 Consider an unknown sample u that we would like to classify.
 Decision rule:
w • u ≥ c ⇒ u is +, or
w • u + b ≥ 0 ⇒ u is +, where c = −b.
 Constraints on the training samples:
w • x+ + b ≥ +1 (+ve sample)
w • x− + b ≤ −1 (−ve sample)
 Introduce yi = +1 for + data and yi = −1 for − data, so both constraints become one:
yi(w • xi + b) ≥ 1, i.e.
yi(w • xi + b) − 1 ≥ 0
 Margin = 2/||w||
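A quick numeric check of the margin formula (the weight vector here is invented): with w = (3, 4), ||w|| = 5, so the margin should be 2/5 = 0.4, and two points lying on the supporting hyperplanes w·x = +1 and w·x = −1 along the direction of w should be exactly that far apart.

```python
import math

w = (3.0, 4.0)            # example weight vector (illustrative only), b = 0
norm_w = math.hypot(*w)   # ||w|| = 5.0
margin = 2.0 / norm_w     # distance between the two supporting hyperplanes

# x_plus lies on w·x = +1 and x_minus on w·x = -1, both along w/||w||;
# their separation should equal the margin.
scale = 1.0 / (norm_w ** 2)
x_plus = (w[0] * scale, w[1] * scale)
x_minus = (-w[0] * scale, -w[1] * scale)
gap = math.hypot(x_plus[0] - x_minus[0], x_plus[1] - x_minus[1])
```

The geometric check (`gap`) and the closed-form value (`margin`) coincide, confirming the 2/||w|| formula for this example.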
Maximizing the MARGIN
 Margin = 2/||w||
 Maximizing 2/||w|| = maximizing 1/||w|| = minimizing ||w|| = minimizing (1/2)||w||²
 The constrained minimization is solved using Lagrange multipliers.
 w • x+ + b ≥ 1 (+ve sample)
w • x− + b ≤ −1 (−ve sample)
 Decision rule: classify u as + if Σi αi yi (xi • u) + b ≥ 0.
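The slides appeal to Lagrange multipliers without showing the algebra; the standard derivation (textbook material, not specific to this deck) runs as follows.

```latex
% Lagrangian of the constrained problem: minimize (1/2)||w||^2
% subject to y_i (w . x_i + b) - 1 >= 0
L = \frac{1}{2}\lVert w \rVert^{2}
    - \sum_i \alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right]

% Setting the partial derivatives to zero:
\frac{\partial L}{\partial w} = 0 \;\Rightarrow\; w = \sum_i \alpha_i y_i x_i
\qquad
\frac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_i \alpha_i y_i = 0

% Substituting w back gives a decision rule that depends on the data
% only through dot products with the unknown sample u:
\text{classify } u \text{ as } +
\quad \text{if} \quad
\sum_i \alpha_i y_i \,(x_i \cdot u) + b \;\ge\; 0
```

The dot-product form of the final rule is what later makes the kernel trick possible: xi · u can be replaced by a kernel function without changing anything else.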
Types of Data: Linearly Separable
Non-Linearly Separable Data
What if the data is not linearly separable?
Idea: map the data into a higher-dimensional space where it becomes separable, e.g. by adding a new feature
Z = X² + Y²
(In the original X–Y plane one class encircles the other; along the new Z axis a flat plane separates them.)
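The lift to a higher dimension can be checked in a few lines (the toy points below are invented): samples inside a circle and samples outside it are not linearly separable in (x, y), but after adding z = x² + y², a single threshold on z separates them.

```python
# Points inside the unit circle vs. points outside it: not linearly
# separable in (x, y), but separable by a threshold on z = x^2 + y^2.
inner = [(0.0, 0.5), (0.3, -0.2), (-0.4, 0.4), (0.1, 0.1)]
outer = [(2.0, 0.0), (-1.5, 1.5), (0.0, -2.2), (1.8, -1.1)]

def lift(point):
    x, y = point
    return (x, y, x * x + y * y)   # the new z coordinate

z_inner = [lift(p)[2] for p in inner]
z_outer = [lift(p)[2] for p in outer]

# Any plane z = c with max(z_inner) < c < min(z_outer) separates the classes.
separable = max(z_inner) < min(z_outer)
```

In practice SVMs never compute the lifted coordinates explicitly; a kernel function evaluates the dot products in the higher-dimensional space directly, which is why this idea scales.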
Where does SVM get its name?
 The separating hyperplane is usually determined by only a handful of data points.
 The points that determine the hyperplane are called support vectors.
 The hyperplane itself acts as the classifying machine.
WEKA: A Machine Learning Tool
 Waikato Environment for Knowledge Analysis
 Open-source software tool
 Developed at the University of Waikato
 A collection of visualization tools and learning algorithms
Iris Dataset
Attributes:
1. Sepal Length
2. Sepal Width
3. Petal Length
4. Petal Width
(Figure: Iris flowers)
Implementation of SVM in WEKA
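Weka's SMO classifier handles the optimization internally behind its GUI; for readers who want to see the moving parts in code, here is a minimal pure-Python sketch of training a linear SVM by stochastic subgradient descent (the Pegasos scheme, a simpler relative of the SMO algorithm Weka uses; the toy data and parameters are invented).

```python
# Minimal linear-SVM training via the Pegasos subgradient scheme.
# Toy 2-D data; a bias term is folded in as a constant third feature.
data = [
    ((2.0, 2.0, 1.0), +1), ((3.0, 1.0, 1.0), +1), ((2.0, 3.0, 1.0), +1),
    ((-2.0, -2.0, 1.0), -1), ((-1.0, -3.0, 1.0), -1), ((-3.0, -1.0, 1.0), -1),
]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def train(samples, lam=0.01, steps=5000):
    w = [0.0, 0.0, 0.0]
    for t in range(1, steps + 1):
        x, y = samples[t % len(samples)]   # cycle through samples deterministically
        eta = 1.0 / (lam * t)              # decaying step size
        if y * dot(w, x) < 1:              # margin violated: hinge-loss subgradient
            w = [(1 - eta * lam) * wi + eta * y * xi for wi, xi in zip(w, x)]
        else:                              # margin satisfied: only regularizer acts
            w = [(1 - eta * lam) * wi for wi in w]
    return w

w = train(data)
predictions = [1 if dot(w, x) >= 0 else -1 for x, _ in data]
```

On this cleanly separated data the learned hyperplane classifies every training point correctly; real implementations like Weka's SMO solve the dual problem instead, which also enables kernels.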

Support Vector Machine and Implementation using Weka

  • 1.
    SVM, A MachineLearning Algorithm
  • 2.
    Friendly Introduction toMachine Learning Learn from experience Learn from data Follow Instructions
  • 3.
    What is MachineLearning? Machine Learning is the science of getting computers to act without being explicitly programmed. Computer Data Program Output Computer Data Output Program Traditional Programming: Machine Learning:
  • 4.
    Why is ittrending now?  Availability of many different kinds of data.  Greater computational power.  Exponential decrease in the price of powerful computing resources.  Train computers to do things that are very difficult to program.
  • 5.
    Machine Learning Methods MLMethods Supervised Learning Unsupervised Learning ClassificationRegression Clustering Association
  • 6.
    Supervised Learning :Regression Bedrooms Sq. feet Neighbourhood Rent 2 2000 K.R.Puram 25000 3 2000 Banashankari 20000 1 850 K.R.Puram 15000 3 2500 K.R.Puram 35000 4 3000 Whitefield 40000 2 1500 K.R.Puram ?
  • 7.
    Supervised Learning :Classification Sepal Length Sepal Width Petal Length Petal Width Class 5.1 3.5 1.4 0.2 Iris - Setosa 4.9 3.0 1.4 0.2 Iris - Setosa 7.0 3.2 4.7 1.4 Iris - Versicolor 6.4 3.2 4.5 1.5 Iris - Versicolor 6.4 3.2 5.3 2.3 Iris - Verginica 7.9 3.8 6.4 2.0 Iris - Verginica
  • 8.
    Unsupervised Learning Bedrooms Sq.feet Neighbourhood 2 2000 K.R.Puram 3 2000 Banashankari 1 850 K.R.Puram 3 2500 K.R.Puram 4 3000 Whitefield
  • 9.
  • 10.
  • 11.
    Support Vector Machine Vladimir Vapnik laid most of the groundwork for SVM while working on his PhD thesis in the Soviet Union in 1960s.  A supervised Machine Learning algorithm  Used for both classification and regression  It’s a binary classifier
  • 12.
    Support Vector Machine . SVMis a classifier method that performs classification tasks by constructing hyperplanes.
  • 13.
  • 14.
    Identifying the rightHyperplane? Margin Maximum perpendicular distance between the nearest data point and hyperplane - Margin
  • 15.
    How to computeMARGIN?  Consider w perpendicular to median.  Consider u, which we would like to classify.  Decision rule: w • u ≥ c ⇒ u is +, or w • u + b ≥ 0 ⇒ u is +, where c= -b.  w • x++ b ≥ 1 (+ve sample) w • x-+ b ≤ −1 (-ve sample)
  • 16.
     Introduce yi= 1 for + data and yi = -1 for - data. yi(x • w + b) ≥ 1, or yi(x • w + b) − 1 ≥ 0,  Margin = How to compute MARGIN?
  • 17.
    Maximizing the MARGIN Margin = 2||w||-1  maximizing 2||w||-1 = maximizing ||w||-1 = minimizing ||w|| = minimizing  Using Lagrange multipliers
  • 18.
     Using Lagrangemultipliers
  • 19.
     w •x++ b ≥ 1 (+ve sample) w • x-+ b ≤ −1 (-ve sample)  Decision Rule:
  • 21.
    Types of Data:Linearly Separable
  • 22.
    Non - LinearlySeparable Data
  • 24.
    What if thedata is not linearly separable? Idea : Separable in higher dimension X Y Z=X2 +Y2 X Z Y
  • 25.
    Where does SVMget its name?  Separating plane is usually determined by only a handful of data points.  The points that help determine the hyperplane are called Support Vectors.  The hyperplane itself is a classifying machine.
  • 26.
    WEKA: A Machine Learningtool  Waikato Environment for Knowledge Analysis  Open source software tool  Developed at The University of Waikato  Collection of visualization tools and algorithms
  • 31.
    IRIS Dataset Attributes: 1. SepalLength 2. Sepal Width 3. Petal Length 4. Petal Width
  • 32.
  • 34.
  • 36.
    References:  https://www.youtube.com/watch?v=IpGxLWOIZy4  https://weka.waikato.ac.nz/dataminingwithweka https://www.analyticsvidhya.com  https://medium.com/@ageitgey  https://www.youtube.com/watch?v=_PwhiWxHK8o&t=99s  http://www.svm-tutorial.com  https://www.youtube.com/watch?v=ZDfVal_4HMA  http://www.eric-kim.net/eric-kim-net/posts/1/kernel_trick.html  http://www4.stat.ncsu.edu/~post/todd/SVMslides.pdf