Principal Component Analysis (PCA)
• Principal component analysis (PCA) is a method of dimensionality reduction and feature extraction that transforms the data from the original d-dimensional space into a new coordinate system of dimension p, where p <= d.
• PCA was invented in 1901 by Karl Pearson as an analogue of the principal axis theorem in mechanics.
• It was later independently developed and named by Harold Hotelling in the 1930s.
Goals
• The main goal of a PCA analysis is to identify patterns in data.
• It is primarily used to reduce the dimensionality of a data set.
• PCA aims to detect the correlation between variables.
Transformation
• In order to approximate the space spanned by the original data points
X = [x1, x2, x3, …, xd],
we can choose p based on what percentage of the variance of the original data we would like to retain (illustrated in the sketch below).
• The first principal component has the maximum variance, so it accounts for the most significant variation in the data.
• The second principal component has the second-highest variance, and so on, until the last principal component, which has the minimum variance.
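A minimal sketch of this choice, assuming a NumPy data matrix X of shape (n_samples, d) and a hypothetical helper choose_p: the eigenvalues of the covariance matrix give the variance along each principal axis, and p is the smallest number of components whose cumulative share reaches the requested percentage.

import numpy as np

def choose_p(X, variance_to_keep=0.95):
    # Center the data and compute the covariance matrix of the d features
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)
    # Eigenvalues = variances along the principal axes, sorted in descending order
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]
    cumulative = np.cumsum(eigenvalues / eigenvalues.sum())
    # Smallest p whose components together retain the requested share of variance
    p = min(int(np.searchsorted(cumulative, variance_to_keep)) + 1, X.shape[1])
    return p, cumulative

# Example with random data (stand-in for a real d-dimensional data set)
X = np.random.rand(100, 10)
p, cumulative = choose_p(X, variance_to_keep=0.95)
print(p, cumulative)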
PCA Approach
1. Standardize the data.
2. Perform Singular Value Decomposition to get the eigenvectors and eigenvalues.
3. Sort the eigenvalues in descending order and choose the k eigenvectors that correspond to the k largest eigenvalues.
4. Construct the projection matrix from the selected k eigenvectors.
5. Transform the original dataset via the projection matrix to obtain a k-dimensional feature subspace (see the code sketch after these steps).
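A minimal NumPy sketch of these five steps, assuming X is an (n_samples, d) array; it is an illustration of the approach, not the exact pipeline used for the plots in these slides.

import numpy as np

def pca(X, k):
    # 1. Standardize the data (zero mean, unit variance per feature)
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    # 2. Singular Value Decomposition of the standardized data:
    #    rows of Vt are the eigenvectors of the covariance matrix,
    #    squared singular values are proportional to its eigenvalues
    U, S, Vt = np.linalg.svd(X_std, full_matrices=False)
    eigenvalues = (S ** 2) / (X.shape[0] - 1)
    # 3. SVD returns singular values already sorted in descending order,
    #    so the first k rows of Vt match the k largest eigenvalues
    # 4. Projection matrix: d x k matrix of the top-k eigenvectors
    W = Vt[:k].T
    # 5. Project the standardized data onto the k-dimensional feature subspace
    return X_std @ W, eigenvalues

# Example usage on random data
X = np.random.rand(150, 6)
X_pca, eigenvalues = pca(X, k=2)
print(X_pca.shape)  # (150, 2)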
After PCA
Different types of PCA scatter plots for better understanding
Linear Discriminant Analysis (LDA)
Introduction
• Linear Discriminant Analysis (LDA) is used for dimensionality reduction of data with a large number of attributes.
• It is a pre-processing step for pattern-classification and machine learning applications.
• It is used for feature extraction.
• It is a linear transformation that maximizes the separation between multiple classes.
• The original dichotomous discriminant analysis was developed by Sir Ronald Fisher in 1936.
Feature Subspace:
To reduce the dimensions of a d-dimensional data set, we project it onto a k-dimensional subspace (where k < d).
To ensure the feature subspace represents the data well:
• Compute the scatter matrices from the dataset.
• Compute the eigenvectors and eigenvalues from the scatter matrices.
• Generate the k-dimensional data from the d-dimensional dataset.
Scatter Matrix:
• Within-class scatter matrix
• Between-class scatter matrix
• The goal is to maximize the between-class measure while minimizing the within-class measure (standard definitions below).
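For reference, the standard scatter-matrix definitions (assuming c classes, class sample sets D_i with n_i samples each, class means m_i, and overall mean m) are:

S_W = \sum_{i=1}^{c} \sum_{x \in D_i} (x - m_i)(x - m_i)^T

S_B = \sum_{i=1}^{c} n_i (m_i - m)(m_i - m)^T

LDA then seeks the projection directions that maximize the between-class scatter relative to the within-class scatter, e.g. the leading eigenvectors of S_W^{-1} S_B.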
LDA steps:
1. Compute the d-dimensional mean vectors for the different classes.
2. Compute the scatter matrices (within-class and between-class).
3. Compute the eigenvectors and corresponding eigenvalues for the scatter matrices.
4. Sort the eigenvalues and choose the eigenvectors with the largest eigenvalues to form a d×k-dimensional matrix.
5. Transform the samples onto the new subspace (see the code sketch below).
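A minimal NumPy sketch of these five steps, assuming X is an (n_samples, d) array and y holds integer class labels; it uses the common eigendecomposition of S_W^{-1} S_B and is an illustration rather than the exact code behind these slides.

import numpy as np

def lda(X, y, k):
    classes = np.unique(y)
    d = X.shape[1]
    overall_mean = X.mean(axis=0)

    # 1. d-dimensional mean vector for each class
    means = {c: X[y == c].mean(axis=0) for c in classes}

    # 2. Within-class (S_W) and between-class (S_B) scatter matrices
    S_W = np.zeros((d, d))
    S_B = np.zeros((d, d))
    for c in classes:
        X_c = X[y == c]
        diff = X_c - means[c]
        S_W += diff.T @ diff
        mean_diff = (means[c] - overall_mean).reshape(d, 1)
        S_B += X_c.shape[0] * (mean_diff @ mean_diff.T)

    # 3. Eigenvectors and eigenvalues of S_W^-1 S_B (pinv for numerical safety)
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(S_W) @ S_B)

    # 4. Sort by eigenvalue (descending) and keep the top k eigenvectors -> d x k matrix
    order = np.argsort(eigvals.real)[::-1]
    W = eigvecs[:, order[:k]].real

    # 5. Project the samples onto the new k-dimensional subspace
    return X @ W

# Example usage on toy data with two classes
X = np.random.rand(100, 4)
y = np.repeat([0, 1], 50)
X_lda = lda(X, y, k=1)
print(X_lda.shape)  # (100, 1)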
Different types of LDA scatter plots for better understanding