Data Analytics For Beginners | Introduction To Data Analytics | Data Analytics Using R | Edureka
The document outlines a data analytics master program offered by Edureka, covering topics such as data cleaning, statistics, data visualization, and machine learning. It details the roles and responsibilities of a data analyst, required skills, and provides insights into salary expectations. Additionally, it emphasizes the significance of data analytics in generating insights and improving business processes.
Data Analytics MasterProgram www.edureka.co/masters-program/data-analyst-certification
Topics For Today’s Session
Introduction To Data Analytics
Data Cleaning and Manipulation
Statistics
Data Visualization
Machine Learning
Roles, Responsibilities & Salary
Hands-On
Why Data Analytics?
01 Gather Hidden Insights
02 Generate Reports
03 Perform Market Analysis
04 Improve Business Requirement
What is Data Analytics?
Data Analytics refers to the techniques used to analyse data in order to enhance productivity and business gain.
Business Administration, Exploratory Data Analysis, Growth in Business
Who is a Data Analyst?
Collect Data, Analyse Data, Create Reports
Data Analyst Skills
Statistics
Data Cleaning
EDA
Data Visualization
Machine Learning
Statistics
Statistics is a branch of mathematics dealing with data collection and organization, analysis, interpretation and presentation.
Analyse Data, Build a Model, Infer Result
Categories in Statistics – Descriptive Statistics
Descriptive statistics uses the data to provide descriptions of the population, either through numerical calculations or through graphs or tables.
There are mainly two measures you need to understand in Descriptive Statistics.
01 Measures of Centre
02 Measures of Spread
Descriptive Statistics – Measures of Centre
There are three terms you need to understand in Measures of Centre: Mean, Median and Mode.
Mean
The average of all the values in a sample is called the Mean.
Mean = (110 + 110 + 93 + 96 + 90 + 110 + 110 + 110) / 8 = 103.625
Median
The central value of the ordered sample set is called the Median. When the number of values is even, it is the average of the two middle values.
21, 21, 21.3, 22.8, 23, 23, 23, 23
Median = (22.8 + 23) / 2 = 22.9
Mode
The value that occurs most often in the sample set is known as the Mode.
21, 21, 22, 23, 24, 25, 25, 25, 26
Mode = 25
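The three measures of centre can be checked with a few lines of code. The following is a minimal sketch in Python, used here only for illustration (the session's own hands-on uses R). It relies on the standard-library statistics module and reuses the sample values from the slides; the variable names are assumptions made for this example.

```python
import statistics

# Sample values taken from the Mean, Median and Mode examples above.
mean_sample = [110, 110, 93, 96, 90, 110, 110, 110]
median_sample = [21, 21, 21.3, 22.8, 23, 23, 23, 23]
mode_sample = [21, 21, 22, 23, 24, 25, 25, 25, 26]

# Mean: sum of the values divided by the number of values.
mean_value = statistics.mean(mean_sample)

# Median: middle value of the sorted data; with an even count,
# the average of the two middle values.
median_value = statistics.median(median_sample)

# Mode: the most frequently occurring value.
mode_value = statistics.mode(mode_sample)

print("Mean:  ", mean_value)
print("Median:", round(median_value, 2))
print("Mode:  ", mode_value)
```

The results should agree with the hand calculations on the slides: 103.625, 22.9 and 25.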
Descriptive Statistics – Measures of Spread
Range
Range is a measure of how spread apart the values in a dataset are.
Range = Max(x_i) - Min(x_i)
Inter Quartile Range
The Inter Quartile Range (IQR) is a measure of variability based on dividing a dataset into quartiles.
Example: in the ordered dataset 1 2 3 4 5 6 7 8, the quartiles Q1, Q2 and Q3 split the values into four equal parts, and IQR = Q3 - Q1.
Variance
Variance describes how much a random variable differs from its expected value. It is computed from the squares of the deviations from the mean.
Standard Deviation
Standard Deviation is a measure of the dispersion of a set of data from its mean.
\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2}
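To make the four measures of spread concrete, here is a small Python sketch (an illustration only, not part of the original session, which uses R). It uses the standard-library statistics module on the dataset 1 to 8 from the quartile example. Two assumptions are worth flagging: the population formulas (dividing by N) are used for variance and standard deviation, and the quartiles follow the default "exclusive" method of statistics.quantiles, which requires Python 3.8 or later. Other quartile conventions can give slightly different Q1 and Q3 values.

```python
import statistics

# Dataset from the quartile example above.
values = [1, 2, 3, 4, 5, 6, 7, 8]

# Range: difference between the largest and smallest value.
value_range = max(values) - min(values)

# Quartiles: three cut points that divide the sorted data into four parts.
# Assumption: the default 'exclusive' method of statistics.quantiles.
q1, q2, q3 = statistics.quantiles(values, n=4)
iqr = q3 - q1

# Population variance: mean of the squared deviations from the mean.
variance = statistics.pvariance(values)

# Population standard deviation: square root of the population variance.
std_dev = statistics.pstdev(values)

print("Range:", value_range)
print("Q1, Q2, Q3:", q1, q2, q3)
print("IQR:", iqr)
print("Variance:", variance)
print("Standard deviation:", round(std_dev, 4))
```

If the data were a sample drawn from a larger population, statistics.variance and statistics.stdev (which divide by N - 1) would be the usual choice instead.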
Categories in Statistics – Inferential Statistics
Inferential statistics generalizes from a sample to a larger dataset and applies probability to draw a conclusion. It allows us to infer the parameters of the data based on a statistical model, using sample data.
Inferential Statistics – Hypothesis Testing
Statisticians use hypothesis testing to formally check whether a hypothesis is accepted or rejected.
Hypothesis testing is conducted in the following manner:
1. State the Hypotheses – This stage involves stating the null and alternative hypotheses.
2. Formulate an Analysis Plan – This stage involves the construction of an analysis plan.
3. Analyse Sample Data – This stage involves the calculation and interpretation of the test statistic as described in the analysis plan.
4. Interpret Results – This stage involves the application of the decision rule described in the analysis plan.
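The four stages can be walked through with a toy example. The sketch below, written in Python for illustration only, tests whether a coin is fair using an exact two-sided binomial test. Everything specific in it is an assumption made for the example: the coin-flip scenario, the counts (60 heads in 100 flips), the 0.05 significance level and the helper name binomial_two_sided_p_value. None of it comes from the session itself.

```python
from math import comb


def binomial_two_sided_p_value(successes, trials, p_null=0.5):
    """Exact two-sided p-value: total probability, under the null hypothesis,
    of every outcome that is no more likely than the observed one."""
    def pmf(k):
        return comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)

    observed = pmf(successes)
    # Small relative tolerance so rounding does not exclude equally likely outcomes.
    return sum(pmf(k) for k in range(trials + 1) if pmf(k) <= observed * (1 + 1e-9))


# Stage 1: state the hypotheses.
#   Null hypothesis: the coin is fair (probability of heads = 0.5).
#   Alternative hypothesis: the coin is not fair.
# Stage 2: formulate an analysis plan.
#   Exact binomial test, significance level 0.05 (assumed for this example).
alpha = 0.05

# Stage 3: analyse the sample data (hypothetical counts).
heads, flips = 60, 100
p_value = binomial_two_sided_p_value(heads, flips)

# Stage 4: interpret the results using the decision rule from the plan.
reject_null = p_value < alpha

print("p-value:", round(p_value, 4))
print("Reject the null hypothesis:", reject_null)
```

With these made-up counts the p-value comes out slightly above 0.05, so the null hypothesis is not rejected. A smaller sample with a more lopsided result, such as 9 heads in 10 flips, gives a p-value of 22/1024, which would lead to rejection at the same level.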
Descriptive vs Inferential Statistics
Descriptive Statistics | Inferential Statistics
Concerned with properties of the population | Makes inferences from the sample
Presents data in a meaningful manner | Compares and predicts future outcomes
Outcomes are shown in the form of charts, tables and graphs | Outcomes are in the form of probability scores
Describes the known data | Tries to make conclusions beyond the data available
Measures of central tendency and spread of data | Hypothesis testing and analysis of variance
Data Cleaning and Manipulation
Data Cleaning
Data Cleaning is the process of detecting and correcting corrupt or inaccurate records in a database.
Data Manipulation
Data Manipulation is the process of changing data to make it more organized and easier to read.
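The difference between the two steps is easier to see on a tiny dataset. The following Python sketch is an illustration only: the records, field names and cleaning rules are all hypothetical, and a real project would more likely use a data-frame library or R. Cleaning fixes stray whitespace and inconsistent casing and drops records whose age is missing, corrupt or duplicated; manipulation then reorganizes the clean records into a per-city summary.

```python
# Hypothetical raw records, invented for this example.
raw_records = [
    {"name": " Alice ", "city": "pune", "age": "34"},
    {"name": "Bob", "city": "Delhi", "age": ""},       # missing age
    {"name": "Alice", "city": "Pune", "age": "34"},    # duplicate once cleaned
    {"name": "Carol", "city": "delhi", "age": "abc"},  # corrupt age
    {"name": "Dev", "city": "Delhi", "age": "29"},
]


def clean(records):
    """Data cleaning: normalise text fields, drop unusable and duplicate rows."""
    cleaned, seen = [], set()
    for rec in records:
        name = rec["name"].strip()
        city = rec["city"].strip().title()
        age_text = rec["age"].strip()
        if not age_text.isdigit():  # missing or corrupt age
            continue
        row = (name, city, int(age_text))
        if row in seen:  # exact duplicate of an earlier row
            continue
        seen.add(row)
        cleaned.append({"name": name, "city": city, "age": int(age_text)})
    return cleaned


def mean_age_by_city(records):
    """Data manipulation: group the records by city and average the ages."""
    groups = {}
    for rec in records:
        groups.setdefault(rec["city"], []).append(rec["age"])
    return {city: sum(ages) / len(ages) for city, ages in sorted(groups.items())}


cleaned = clean(raw_records)
summary = mean_age_by_city(cleaned)

print(cleaned)
print(summary)
```

Whether an unusable record should be dropped, as here, or repaired (for example by filling in a default value) depends on the dataset and is a design choice rather than a fixed rule.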
Data Visualization
Data Visualization is the representation of data in the form of charts, diagrams, etc.
Bar Graph, Scatter Plot, Pie Chart, Box Plot, Line Graph
Machine Learning
Machine Learning is a concept that allows a machine to learn from examples and experience without being explicitly programmed.
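As a minimal illustration of learning from examples, the Python sketch below fits a straight line to a handful of points by ordinary least squares and then predicts an unseen value. The rule relating x to y is never written into the program; the slope and intercept are estimated from the examples. The data points and the function names fit_line and predict are hypothetical, chosen only for this example, and simple linear regression is just one of many learning techniques.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the learned line to a new input."""
    return slope * x + intercept


# Hypothetical training examples (inputs and observed outputs).
train_x = [1, 2, 3, 4]
train_y = [3, 5, 7, 9]

slope, intercept = fit_line(train_x, train_y)
prediction = predict(5, slope, intercept)

print("Learned slope:", slope)
print("Learned intercept:", intercept)
print("Prediction for x = 5:", prediction)
```

Because these examples lie exactly on a line, the fitted model recovers it and extrapolates to the new input; with noisy real-world data the fit would only approximate the underlying relationship.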
Data Analyst: Roles and Responsibilities
Determining Organizational Goals
Mining Data
Data Cleaning
Analyzing Data
Pinpointing Trends and Patterns
Creating Reports with Visualizations
Salary of Data Analyst
Average Salary (US): $83,878
Average Salary (IND): ₹404,660
Need of R
R is open-source and freely available.
R is cross-platform compatible.
R is a powerful scripting language.
R is highly flexible and evolved.
Hands-On
To perform data analysis on the below data set and gather some insights.
Data Analytics @edureka
Timeline: Program Starts, 2nd Week, 7th Week, 11th Week, 15th Week, Graduated as Data Analyst
Delivery modes: Self-Paced, Instructor-Led
01 Statistics Essentials: Probability, Bayesian Inference, Regression, Making Statistics
02 Data Analytics with R: Data Manipulation, Exploratory Analysis, Regression, Data Visualization, Data Mining, Sentiment Analysis
03 SAS Training: Advanced Statistical Techniques, SAS Macros, PROC SQL, SAS ODS, Advanced SAS Procedures
04 Tableau Training: LOD Expressions, Tableau Desktop, Tableau Public, Data Visualization, Integration with R
Data Analytics @edureka
QlikView Certification Training
Advanced MS Excel 2010
R Programming Certification Training
Analytics for Retail Banks
Decision Tree Modelling Using R Certification Training
Machine Learning with Mahout Certification Training
Advanced Predictive Modelling in R Certification Training