Machine Learning
By - Er. Suraj Awal
Learning
- The process of acquiring new knowledge, or modifying existing knowledge, to adapt to new situations
- Involves 3 factors : change, generalization and improvement
- The learner changes, and the change is determined and represented in an efficient way
- Performance improves not only on the original task but on all similar tasks
- The possibility of performance degradation must be addressed and prevented
Types of Learning
Rote Learning:
- The system stores all the information it has computed before
- Stored information is retrieved when needed instead of being recomputed
- Worthwhile only if the time to retrieve a result is less than the time to recompute it
- Eg : Samuel's checker-playing program (stores, and retrieves when needed, the board positions it has encountered in previous games); a minimal caching sketch follows
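Rote learning is essentially memoization. The sketch below is illustrative only: the `evaluate` function and its scoring rule are invented stand-ins, not Samuel's actual evaluation; only the store-then-retrieve pattern matters.

```python
# Rote learning as memoization: results computed once are stored and
# retrieved later, which pays off only when lookup beats recomputation.
cache = {}

def evaluate(position):
    """Stand-in for an expensive board evaluation."""
    if position in cache:                  # retrieve stored knowledge
        return cache[position]
    score = sum(ord(c) for c in position)  # placeholder computation
    cache[position] = score                # store for future games
    return score

print(evaluate("e2e4"))  # computed the first time
print(evaluate("e2e4"))  # retrieved from the cache
```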
Types of Learning
Learning By Analogy:
- Acquiring new knowledge about an entity by transferring knowledge from a known, similar entity
- Eg: Consider two analogous problem domains. Kirchhoff's current law gives, for currents meeting at a junction, Ic = Ia + Ib; transferring the same knowledge to the analogous hydraulic problem, we can derive for flow rates:
Qc = Qa + Qb
Types of Learning
Explanation Based Learning:
- Learn from a single example x by explaining why x is an example of the
target concept.
- Explanation is then generalized.
Types of Learning
Learning By Example (Inductive Learning):
- Learning concepts by drawing inductive inferences from a set of facts
- Each example in the domain is described by its features (the facts) and assigned a class
- A decision tree is built from the examples
- Eg : Iterative Dichotomizer 3 (ID3)
Iterative Dichotomizer 3 (ID3) Algorithm
- An algorithm to generate a decision tree
- Decision nodes and leaf nodes connected by arcs
- Top-down approach
- Entropy (equivalently, information gain) is used to select the most useful attribute for classification:
H = − Σᵢ pᵢ log₂(pᵢ)
where pᵢ is the proportion of examples belonging to class i (a short Python version follows)
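As a quick sanity check of the formula, here is a minimal entropy function; the last call uses the 9-positive / 5-negative class split of the weather data introduced later:

```python
import math

def entropy(labels):
    """Shannon entropy H = -sum(p_i * log2(p_i)) over class frequencies."""
    n = len(labels)
    return sum(-(labels.count(c) / n) * math.log2(labels.count(c) / n)
               for c in set(labels))

print(entropy([1, 1, 0, 0]))             # 1.0  (maximally mixed)
print(entropy([1, 1, 1, 1]))             # 0.0  (pure)
print(round(entropy([1]*9 + [0]*5), 3))  # 0.94 for the 14-day data
```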
Iterative Dichotomizer 3 (ID3) Algorithm
Algorithm:
1. Create root node
2. If all examples are positive, create positive leaf node and stop
3. If all examples are negative, create negative leaf node and stop
4. Otherwise,
a) Calculate the entropy of each candidate attribute and select the one with the lowest value (highest information gain) as the next node
b) Partition the examples into subsets, one per value of the chosen attribute
c) Repeat on each subset until all examples are classified
(A compact sketch of this recursion is given below.)
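The steps above map directly onto a short recursive function. This is an illustrative sketch assuming discrete attribute values and binary 0/1 labels, with examples given as dicts; it is not an optimized implementation:

```python
import math

def entropy(labels):
    n = len(labels)
    return sum(-(labels.count(c) / n) * math.log2(labels.count(c) / n)
               for c in set(labels))

def id3(rows, labels, attributes):
    """rows: list of {attribute: value} dicts; labels: parallel 0/1 list."""
    if all(y == 1 for y in labels):
        return "Yes"                              # step 2: positive leaf
    if all(y == 0 for y in labels):
        return "No"                               # step 3: negative leaf
    if not attributes:                            # nothing left: majority vote
        return "Yes" if labels.count(1) >= labels.count(0) else "No"

    def split_score(a):                           # step 4a: total split entropy
        return sum(len([y for r, y in zip(rows, labels) if r[a] == v]) *
                   entropy([y for r, y in zip(rows, labels) if r[a] == v])
                   for v in {r[a] for r in rows})

    best = min(attributes, key=split_score)
    tree = {best: {}}
    for v in {r[best] for r in rows}:             # step 4b: partition
        sub = [(r, y) for r, y in zip(rows, labels) if r[best] == v]
        tree[best][v] = id3([r for r, _ in sub], [y for _, y in sub],
                            [a for a in attributes if a != best])
    return tree                                   # step 4c: recurse
```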
Iterative Dichotomizer 3 (ID3) Algorithm
The weather data for 14 days is given below (Outlook: S = Sunny, O = Overcast, R = Rainy; Temp: H = Hot, M = Mild, C = Cool; Humidity: H = High, N = Normal; Wind: W = Weak, S = Strong; Play Tennis: 1 = Yes, 0 = No):
Days Outlook Temp Humidity Wind Play Tennis
1 S H H W 0
2 S H H S 0
3 O H H W 1
4 R M H W 1
5 R C N W 1
6 R C N S 0
7 O C N S 1
Iterative Dichotomizer 3 (ID3) Algorithm
Days Outlook Temp Humidity Wind Play Tennis
8 S M H W 0
9 S C N W 1
10 R M N W 1
11 S M N S 1
12 O M H S 1
13 O H N W 1
14 R M H S 0
Iterative Dichotomizer 3 (ID3) Algorithm
Observe the data to find the root node by analyzing which attribute creates the most homogeneous branches. The Entropy column gives, for each attribute, the total Σᵥ |Sᵥ| · H(Sᵥ) summed over the subsets Sᵥ produced by splitting on that attribute (reproduced by the sketch below):

Attribute   Value   Negative   Positive   Entropy
Outlook     S       3          2          9.71
            O       0          4
            R       2          3
Temp        H       2          2          12.75
            M       2          4
            C       1          3
Humidity    H       4          3          11.04
            N       1          6
Wind        W       2          6          12.49
            S       3          3
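The totals in the table can be checked mechanically. The following sketch recomputes Σᵥ |Sᵥ| · H(Sᵥ) for each attribute from the 14 rows above:

```python
import math

# (Outlook, Temp, Humidity, Wind, Play) for the 14 days above
data = [
    ("S","H","H","W",0), ("S","H","H","S",0), ("O","H","H","W",1),
    ("R","M","H","W",1), ("R","C","N","W",1), ("R","C","N","S",0),
    ("O","C","N","S",1), ("S","M","H","W",0), ("S","C","N","W",1),
    ("R","M","N","W",1), ("S","M","N","S",1), ("O","M","H","S",1),
    ("O","H","N","W",1), ("R","M","H","S",0),
]

def H(labels):
    n = len(labels)
    return sum(-(labels.count(c) / n) * math.log2(labels.count(c) / n)
               for c in set(labels))

for i, name in enumerate(["Outlook", "Temp", "Humidity", "Wind"]):
    total = sum(len(sub) * H(sub)
                for v in {row[i] for row in data}
                for sub in [[row[4] for row in data if row[i] == v]])
    print(name, round(total, 2))
# Outlook 9.71, Temp 12.75, Humidity 11.04, Wind 12.49 -> Outlook wins
```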
Iterative Dichotomizer 3 (ID3) Algorithm
The root node is the Outlook attribute, as it has the lowest entropy.
The root node branches to 3 values: Sunny (S), Overcast (O) and Rainy (R).
In the Overcast case, all the examples are positive. So it generates a 'Yes' leaf node.
For Sunny and Rainy, we again calculate the entropy of each remaining attribute over the corresponding subset to find the next node in the tree.
Iterative Dichotomizer 3 (ID3) Algorithm
                        For Sunny Case          For Rainy Case
Attribute   Value    Neg   Pos   Entropy     Neg   Pos   Entropy
Temp        H        2     0     2.0         0     0     4.75
            M        1     1                 1     2
            C        0     1                 1     1
Humidity    H        3     0     0           1     1     4.75
            N        0     2                 1     2
Wind        W        2     1     4.75        0     3     0
            S        1     1                 2     0
Iterative Dichotomizer 3 (ID3) Algorithm
The next node for the Sunny branch is Humidity, and for the Rainy branch it is Wind.
The overall decision tree formed is (shown as a diagram in the original slide):
Outlook = Sunny → Humidity: High → No, Normal → Yes
Outlook = Overcast → Yes
Outlook = Rainy → Wind: Weak → Yes, Strong → No
Genetic Algorithm
- Based on natural selection and genetic inheritance
- Generates a set of random solutions to a problem and makes them compete
- Only the fittest solutions survive
- Each solution is represented as a chromosome
- A set of such solutions forms a population
Genetic Algorithm
Genetic Operators (crossover and mutation are sketched below):
1. Selection (replicates the most successful solutions found in a population, at a rate proportional to their relative quality)
2. Crossover (decomposes two distinct solutions and randomly mixes their parts to form novel solutions)
3. Mutation (makes small random changes to a candidate solution)
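As an illustration, here are minimal single-point crossover and mutation operators for list-encoded chromosomes; the function names are our own, and the allele set {-1, 0, 1} anticipates the worked example that follows:

```python
import random

def crossover(p1, p2, site):
    """Single-point crossover: swap the tails of two parents after `site`."""
    return p1[:site] + p2[site:], p2[:site] + p1[site:]

def mutate(chrom, alleles=(-1, 0, 1), rate=0.1):
    """Replace each gene with a random allele with probability `rate`."""
    return [random.choice(alleles) if random.random() < rate else g
            for g in chrom]

print(crossover([1, -1], [0, 1], site=1))  # -> ([1, 1], [0, -1])
```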
Genetic Algorithm
Algorithm:
1. Produce an initial population of individuals
2. Evaluate the fitness of all individuals
3. Select fitter individuals for reproduction
4. Recombine the selected individuals
5. Mutate individuals
6. Evaluate fitness of modified individuals
7. Generate a new population
8. Is a solution found?
a) Yes : terminate
b) No : repeat from step 3
(A generic sketch of this loop is given below.)
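A minimal, illustrative version of the loop. The parameter choices are our own assumptions (truncation selection of the top half, single-point crossover, per-gene mutation); `fitness`, `random_individual` and `target` are supplied by the problem:

```python
import random

def genetic_algorithm(fitness, random_individual, target, pop_size=20,
                      generations=100, alleles=(-1, 0, 1), mutation_rate=0.1):
    pop = [random_individual() for _ in range(pop_size)]       # step 1
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)                    # steps 2/6
        if fitness(pop[0]) >= target:                          # step 8a
            return pop[0]
        parents = pop[:pop_size // 2]                          # step 3
        children = []
        while len(children) < pop_size:
            p1, p2 = random.sample(parents, 2)                 # step 4
            site = random.randrange(1, len(p1))
            child = p1[:site] + p2[site:]                      # crossover
            child = [random.choice(alleles)                    # step 5
                     if random.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children                                         # step 7
    return max(pop, key=fitness)   # best found if no exact solution (8b)
```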
Genetic Algorithm (Flowchart)
(flowchart of the steps above; diagram not reproduced)
Genetic Algorithm (Example)
Derivation of the given truth table using a Genetic Algorithm:
Inputs Output
A B Z
0 0 1
0 1 0
1 0 1
1 1 0
Genetic Algorithm (Example)
Introduce weights w1 and w2 on the inputs.
The weighted sum of the inputs is then:
Y = w1 * A + w2 * B
Let Ze be the estimated value of Z such that:
Ze = 0 if Y < 0, and Ze = 1 otherwise
Assume the weights can take the discrete values -1, 0 and 1.
The goal is to find values of w1 and w2 that make Ze = Z for all entries of A and B.
Genetic Algorithm (Example)
Let the fitness function be the number of correct entries (out of 4), and assume a population of size 4.
Randomly initialize the first generation of the population as:
[ {-1, 0} , {0, 1} , {0, 0}, {1, -1} ]
The fitness values of the population are then:
F = { 2, 2, 2, 3 }
Genetic Algorithm (Example)
On reproduction, the chromosome with the lowest fitness is removed and a copy of the one with the highest fitness is added. We get:
[ {1, -1}, {1, -1}, {0, 1} , {0, 0} ] with f = {3, 3, 2, 2}
Crossover (chromosomes 1 and 3 at site 1; chromosomes 2 and 4 at site 1) gives:
[ {1, 1}, {0, -1}, {1, 0}, {0, -1} ]
The fitness becomes: f = {2, 4, 2, 4}
Hence the correct solution is the chromosome with fitness 4: {0, -1}, i.e. w1 = 0 and w2 = -1 (verified by the check below).
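The hand-computed fitness values can be verified directly; this short check scores every chromosome that appears in the example:

```python
# Truth table rows (A, B, Z) and the fitness rule from the example:
# Ze = 1 if w1*A + w2*B >= 0 else 0; fitness = number of rows with Ze == Z.
rows = [(0, 0, 1), (0, 1, 0), (1, 0, 1), (1, 1, 0)]

def fitness(w1, w2):
    return sum((1 if w1 * a + w2 * b >= 0 else 0) == z for a, b, z in rows)

for w in [(-1, 0), (0, 1), (0, 0), (1, -1), (1, 1), (1, 0), (0, -1)]:
    print(w, fitness(*w))
# -> the initial population scores 2, 2, 2, 3; after crossover,
#    {0, -1} scores 4/4 and is the solution.
```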
Fuzzy Learning
- A knowledge representation technique used when notions cannot be defined precisely and depend on their context
- Truth values may range anywhere between completely true and completely false
- Crisp variables represent precise quantities
- A fuzzy set A of universe X is defined by a membership function μA : X → [0, 1]
where, μA(x) = 1 (if x is totally in A)
μA(x) = 0 (if x is not in A)
0 < μA(x) < 1 (if x is partially in A)
(a small sketch of such a membership function follows)
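For concreteness, here is a common triangular membership function; the shape and the breakpoints in the demo are illustrative assumptions, not taken from the slides:

```python
def triangular(x, a, b, c):
    """Membership in a triangular fuzzy set: rises from a, peaks at b,
    falls to zero at c. Returns a value in [0, 1]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

# A crisp value can belong partially to several fuzzy sets at once:
t = 22
print("cold:",     triangular(t, -1, 0, 15))   # 0.0  (not in the set)
print("moderate:", triangular(t, 10, 20, 30))  # 0.8  (partially in)
```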
Fuzzy Inference
- The form described here is Mamdani inference, the most common method
- Has 4 stages:
a) Fuzzification of Input Variable
b) Rule Evaluation
c) Aggregation of Rule Output
d) Defuzzification
Fuzzy Inference
Fuzzification:
- Input variables are mapped to the fuzzy regions they belong to, based on their membership values
Rule Evaluation:
- The fuzzified inputs are applied to the antecedents of the fuzzy rules
- The result is a membership value for each rule's consequent
Fuzzy Inference
Aggregation of Rule Output:
- The membership functions of all rule consequents are combined into a single fuzzy set for each output variable
Defuzzification:
- The aggregated output fuzzy set is converted to a crisp number
- This is the final output of the system
Fuzzy Inference (Example - Fuzzy Room Cooler)
Assumptions:
- The rate of flow of water needs to be controlled, based on the fan speed and the temperature, so as to maintain the room temperature
- The fuzzy terms are:
Temperature: Cold, Moderate and Hot
Fan speed: Slack, Medium and Fast
Flow rate: Negative, Medium and Positive
Fuzzy Inference (Example - Fuzzy Room Cooler)
Definition of Linguistic Variables:
(membership-function plots for temperature, fan speed and water flow rate; graphs not reproduced)
Fuzzy Inference (Example - Fuzzy Room Cooler)
Fuzzy Rules (rows = Temperature, columns = Fan Speed; entries give the flow-rate consequent, '-' = no rule):
             Slack   Medium   Fast
Cold           -       -       N
Moderate       -       M       P
Hot            M       P       P
Fuzzy Inference (Example - Fuzzy Room Cooler)
Fuzzy inference for temperature 34 and fan speed 35 rpm:
1. Fuzzification
Temp : (Moderate, Hot) with memberships (0.15, 0.31)
Fan speed : (Medium, Fast) with memberships (0.33, 0.33)
2. Rule Evaluation: take the minimum membership value of each rule's antecedents
- (Moderate, Medium) → M = 0.15
- (Moderate, Fast) → P = 0.15
- (Hot, Medium) → P = 0.31
- (Hot, Fast) → P = 0.31
Fuzzy Inference (Example - Fuzzy Room Cooler)
Fuzzy inference for temperature 34 and fan speed 35 rpm:
3. Aggregation of rule output
- The clipped regions of the water-flow-rate profile together form an area in the graph
- The center of gravity of that area gives a point on the flow-rate axis
4. Defuzzification
- The crisp number corresponding to the obtained point is the desired flow rate
(a sketch of the rule-evaluation and aggregation steps follows)
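Steps 2 and 3 can be written out in a few lines. This sketch takes the fuzzified memberships from step 1 as given (the plots they come from are omitted) and uses min for AND and max for aggregation, as in Mamdani inference; full defuzzification would additionally need the flow-rate membership functions:

```python
# Fuzzified inputs from step 1 (taken as given; plots omitted)
temp = {"moderate": 0.15, "hot": 0.31}
fan  = {"medium": 0.33, "fast": 0.33}

# Rule table: (temperature term, fan term) -> flow-rate term
rules = {("cold", "fast"): "N",
         ("moderate", "medium"): "M", ("moderate", "fast"): "P",
         ("hot", "slack"): "M", ("hot", "medium"): "P", ("hot", "fast"): "P"}

# Rule evaluation (AND = min), then aggregation per consequent (OR = max)
aggregated = {}
for (t, f), out in rules.items():
    if t in temp and f in fan:
        strength = min(temp[t], fan[f])
        aggregated[out] = max(aggregated.get(out, 0.0), strength)
print(aggregated)  # -> {'M': 0.15, 'P': 0.31}, matching step 2
```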
Boltzmann Machine
- A stochastic (probabilistic) recurrent neural network
- A stochastic extension of the Hopfield network
- The network is run repeatedly by choosing a unit and setting its state probabilistically
- After a long run at a certain temperature, the probability of a global state depends only on that state's energy. This is the case of convergence (thermal equilibrium).
- The structure (visible and hidden units with symmetric connections) is shown as a diagram in the original slide
Boltzmann Machine (Algorithm)
Positive Phase:
- Clamp data vector on visible units
- Let hidden units reach thermal equilibrium
- Sample si*sj for all pairs of units
- Repeat for all data vectors in training set
Negative Phase:
- Do not clamp any units
- Let whole network reach thermal equilibrium
- Sample si*sj for all pairs of units
- Repeat many times to get a good estimate
Boltzmann Machine (Algorithm)
Weight Update:
- Update each weight by an amount proportional to the difference between the correlations ⟨si·sj⟩ measured in the two phases:
Δw_ij = η ( ⟨si·sj⟩positive − ⟨si·sj⟩negative )
(a minimal sketch follows)
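The update itself is one line. Below is an illustrative sketch (the function name and learning rate are our own) that assumes the two phases have already produced matrices of sampled pairwise averages:

```python
import numpy as np

def update_weights(W, pos_corr, neg_corr, lr=0.01):
    """W: (n x n) symmetric weight matrix; pos_corr / neg_corr: matrices
    of sampled <s_i * s_j> averages from the positive and negative phases."""
    # Strengthen weights where the clamped data correlates units more
    # strongly than the free-running model does, and weaken them otherwise.
    W = W + lr * (pos_corr - neg_corr)
    np.fill_diagonal(W, 0.0)  # Boltzmann machines have no self-connections
    return W
```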
The End