Asst. Prof. Dr. Lahieb M. Jawad
lahieb1978@gmail.com
Big Data Analysis Concepts
Lecture Two
2/22/2024 Lecture Two 1
Contents
1. Big Data Analysis vs. Big Data Analytics
2. What is Big Data Analysis?
3. Benefits of Using Big Data Analytics
4. Types of Big Data Analytics
5. Big Data Lifecycle
6. Example of BDA Lifecycle
7. Big Data Analysis Applications
Data Analysis vs. Data Analytics
Data Analysis is the process of examining data to find facts,
relationships, patterns, insights and/or trends.
The overall goal of data analysis is to support better
decision making.
Data Analysis Example
• The analysis of ice cream sales data in order to determine how
the number of ice cream cones sold is related to the daily
temperature.
• The results of such an analysis would support decisions
related to how much ice cream a store should order in relation
to weather forecast information.
• Carrying out data analysis helps establish patterns and
relationships among the data being analyzed.
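The ice cream example can be sketched as a small least-squares fit. This is only an illustration: the observations are invented, and `fit_line` and `expected_cones` are hypothetical helper names, not part of the lecture.

```python
# Hypothetical daily observations: (temperature in degrees C, cones sold).
observations = [(18, 110), (21, 135), (24, 160), (27, 190), (30, 215), (33, 240)]


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def expected_cones(temperature, slope, intercept):
    """Estimate cones sold for a forecast temperature."""
    return slope * temperature + intercept


slope, intercept = fit_line(observations)
forecast = expected_cones(29, slope, intercept)
print(f"Each extra degree adds about {slope:.1f} cones")
print(f"Estimated cones at a forecast of 29 degrees: {forecast:.0f}")
```

A store could use an estimate like this, together with the weather forecast, as one input when deciding how much ice cream to order.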
Data Analytics is a broader term that encompasses data
analysis. Data analytics is a discipline that includes the
management of the complete data lifecycle, which
encompasses collecting, cleansing, organizing, storing,
analyzing and governing data.
◇ The term includes the development of analysis methods,
scientific techniques and automated tools.
◇ Data analytics enables data-driven decision-making with scientific
backing, so that decisions can be based on factual data rather than
on past experience or intuition alone.
What is Big Data Analysis?
Big Data Analysis is a process of extracting meaningful insights
from big data, such as hidden patterns, unknown correlations,
market trends and customer preferences. It involves analyzing
both structured and unstructured data.
Resources of Big Data
Benefits of Using Big Data Analytics
Types of Big Data Analytics
There are four general categories of analytics that are
distinguished by the results they produce:
◇ Descriptive analytics
◇ Diagnostic analytics
◇ Predictive analytics
◇ Prescriptive analytics
The different analytics types varying
data, storage and processing requirements
to facilitate the delivery of multiple types of
analytic results.
Descriptive analytics are carried out to answer questions about
events that have already occurred. This form of analytics
contextualizes data to generate information. It is estimated that
80% of generated analytics results are descriptive in nature.
Sample questions can include:
◇ What was the sales volume over the past 12 months?
◇ What is the number of support calls received as categorized by
severity and geographic location?
◇ What is the monthly commission earned by each sales agent?
The reports are generally
static in nature and display
historical data that is
presented in the form of data
grids or charts.
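The sample questions above amount to simple aggregations over historical records. The sketch below shows this; the sales records, the flat `COMMISSION_RATE` and the function names are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical sales records: (month, sales agent, sale amount).
sales = [
    ("2023-01", "agent_a", 1200.0),
    ("2023-01", "agent_b", 800.0),
    ("2023-02", "agent_a", 500.0),
    ("2023-02", "agent_b", 1500.0),
    ("2023-02", "agent_a", 700.0),
]
COMMISSION_RATE = 0.05  # assumed flat rate, for illustration only


def monthly_sales_volume(records):
    """Total sales amount per month."""
    totals = defaultdict(float)
    for month, _agent, amount in records:
        totals[month] += amount
    return dict(totals)


def monthly_commission(records, rate):
    """Commission earned per (month, agent) pair."""
    totals = defaultdict(float)
    for month, agent, amount in records:
        totals[(month, agent)] += amount * rate
    return dict(totals)


volume = monthly_sales_volume(sales)
commission = monthly_commission(sales, COMMISSION_RATE)
for month in sorted(volume):
    print(month, volume[month])
```

The output is a static summary of what already happened, which is what distinguishes descriptive results from the other three types.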
Diagnostic analytics aim to determine the cause of a phenomenon that occurred in the past
using questions that focus on the reason behind the event. The goal of this type of
analytics is to determine what information is related to the phenomenon in order to
answer questions that seek to determine why something has occurred.
Some questions include:
◇ Why were Q2 sales less than Q1 sales?
◇ Why have there been more support calls originating from the Eastern region than
from the Western region?
◇ Why was there an increase in patient re-admission rates over the past three months?
Diagnostic analytics provide more value than descriptive analytics but require a more advanced skillset.
They usually require collecting data from multiple sources and storing it in
a structure that lends itself to performing drill-down and roll-up analysis.
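Roll-up and drill-down can be sketched as two aggregation levels over the same records. The support-call data and the function names below are hypothetical.

```python
from collections import Counter

# Hypothetical support-call records: (region, severity).
calls = [
    ("East", "high"), ("East", "low"), ("East", "high"), ("East", "medium"),
    ("West", "low"), ("West", "medium"),
]


def roll_up(records):
    """Aggregate to the coarser level: number of calls per region."""
    return Counter(region for region, _severity in records)


def drill_down(records, region):
    """Break one region's total into the finer level: calls per severity."""
    return Counter(severity for reg, severity in records if reg == region)


by_region = roll_up(calls)
east_detail = drill_down(calls, "East")
print(dict(by_region))
print(dict(east_detail))
```

Rolling up shows that one region has more calls; drilling down into that region is a first step toward explaining why.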
Predictive analytics are carried out in an attempt to determine the outcome of an event
that might occur in the future. With predictive analytics, information is enhanced with
meaning to generate knowledge that conveys how that information is related.
This knowledge is used to generate future predictions based upon past events. It is important to
understand that the models used for predictive analytics have implicit dependencies on
the conditions under which the past events occurred.
◇ If these underlying conditions change, then the models that make predictions need
to be updated.
Predictive analytics make use of large datasets comprised of internal and external data and
various data analysis techniques.
Some questions include:
◇ What are the chances that a customer will default on a loan if they have missed
a monthly payment?
◇ What will be the patient survival rate if Drug B is administered instead of Drug A?
◇ If a customer has purchased Products A and B, what are the chances that they will
also purchase Product C?
Predictive analytics try to predict the outcomes of events, and predictions
are made based on patterns, trends and exceptions
found in historical and current data. This can lead to the
identification of both risks and opportunities.
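The Product C question can be sketched as a relative frequency estimated from past purchases. This is a deliberately simple stand-in for a predictive model; the purchase histories and `purchase_probability` are invented for the example.

```python
# Hypothetical purchase histories, one set of products per customer.
baskets = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "C"},
    {"B"},
    {"A", "B", "C"},
]


def purchase_probability(histories, given, target):
    """Estimate P(target | given) as a relative frequency in past data."""
    given = set(given)
    matching = [history for history in histories if given <= history]
    if not matching:
        return None  # no past evidence to base a prediction on
    return sum(1 for history in matching if target in history) / len(matching)


p = purchase_probability(baskets, {"A", "B"}, "C")
print(f"Estimated chance of buying C after A and B: {p:.2f}")
```

The estimate is only as good as the conditions behind the historical data, so it would need to be recomputed if customer behaviour or the product range changed.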
Prescriptive analytics build upon the results of predictive analytics by
prescribing actions that should be taken. The focus is not only on which
prescribed option is best to follow, but why.
Prescriptive analytics provide results that can be reasoned about because they embed elements of
situational understanding. Thus, this kind of analytics can be used to gain an
advantage or mitigate a risk.
Some questions include:
◇ Among three drugs, which one provides the best results?
◇ When is the best time to trade a particular stock?
Prescriptive analytics provide more value than any other type
of analytics and correspondingly require
the most advanced skillset.
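A minimal sketch of the "which option, and why" idea: predicted outcomes for three options are scored, and the recommendation is returned together with its justification. All numbers, the scoring rule and `SIDE_EFFECT_PENALTY` are assumptions for illustration, not clinical guidance or a method from the lecture.

```python
# Hypothetical predicted outcomes for three treatment options
# (the kind of output a predictive model might supply).
predictions = {
    "Drug A": {"recovery_rate": 0.72, "side_effect_rate": 0.10},
    "Drug B": {"recovery_rate": 0.80, "side_effect_rate": 0.25},
    "Drug C": {"recovery_rate": 0.78, "side_effect_rate": 0.08},
}
SIDE_EFFECT_PENALTY = 0.5  # assumed weight; a real one would come from domain experts


def score(outcome):
    """Higher is better: reward recovery, penalize side effects."""
    return outcome["recovery_rate"] - SIDE_EFFECT_PENALTY * outcome["side_effect_rate"]


def prescribe(options):
    """Return the best option and a short justification for the choice."""
    ranked = sorted(options, key=lambda name: score(options[name]), reverse=True)
    best, runner_up = ranked[0], ranked[1]
    reason = (
        f"{best} scores {score(options[best]):.3f} versus "
        f"{score(options[runner_up]):.3f} for {runner_up}"
    )
    return best, reason


best, reason = prescribe(predictions)
print(best)
print(reason)
```

Returning the reason alongside the choice reflects the point that prescriptive results should be open to reasoning, not just a bare recommendation.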
Big Data Analysis Lifecycle
Big Data Analytic Lifecycle Phases
◇ This phase focuses on enterprise requirements related to data.
◇ By the end of the phase, the data's purpose and how to achieve it
are defined.
◇ Critical objectives the business is trying to achieve are
identified by mapping out the data.
◇ It consists of everything that has anything to do with data.
◇ It pays attention to information requirements.
◇ It involves collecting, processing, and cleansing the accumulated data.
◇ It is used to make sure that the data you need is actually available to you for
processing.
◇ To collect valuable information and proceed, data is collected using the following
methods:
• Data Acquisition: Accumulating information from external sources.
• Data Entry: Creating new data points using digital systems or manual
data entry techniques within the enterprise.
• Signal Reception: Capturing information from digital devices, such as control
systems and the Internet of Things.
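Cleansing the accumulated data can be sketched as removing duplicates and unusable rows before processing. The record layout, the `source` labels and the `cleanse` function are hypothetical.

```python
# Hypothetical raw records gathered from the three kinds of sources.
raw_records = [
    {"source": "acquisition", "customer_id": "c1", "amount": "120.5"},
    {"source": "entry", "customer_id": "c2", "amount": ""},  # missing value
    {"source": "signal", "customer_id": "c3", "amount": "80"},
    {"source": "acquisition", "customer_id": "c1", "amount": "120.5"},  # duplicate
    {"source": "entry", "customer_id": "c4", "amount": "n/a"},  # not a number
]


def cleanse(records):
    """Drop duplicates and unusable rows; convert amounts to float."""
    seen = set()
    clean = []
    for record in records:
        key = (record["customer_id"], record["amount"])
        if key in seen:
            continue  # same customer and amount already kept
        seen.add(key)
        try:
            amount = float(record["amount"])
        except ValueError:
            continue  # missing or malformed amount
        clean.append({**record, "amount": amount})
    return clean


clean_records = cleanse(raw_records)
print(f"{len(clean_records)} of {len(raw_records)} records are usable")
```

Counting how many records survive cleansing is a quick way to check whether the data needed for the analysis is actually available.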
◇ The main goal is to choose an analytical technique, or a short list
of candidate techniques, based on the end goal of the project.
◇ To plan a model that utilizes the data to achieve the goal.
◇ To determine the methods, techniques, and workflow to build the
model in the subsequent phase.
◇ Planning the model starts with identifying the relations
between data points in order to select the key variables and
eventually devise a suitable model.
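One simple way to identify relations between data points when selecting key variables is to rank candidate variables by the strength of their correlation with the target. The sketch below assumes Pearson correlation as the measure; the variables and values are invented.

```python
import math

# Hypothetical candidate variables and a target, one value per day.
candidates = {
    "temperature": [18, 21, 24, 27, 30, 33],
    "day_of_month": [3, 17, 9, 25, 1, 12],
}
target = [110, 135, 160, 190, 215, 240]  # cones sold


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_variables(variables, target_values):
    """Sort variables by absolute correlation with the target, strongest first."""
    scores = {name: pearson(values, target_values) for name, values in variables.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


ranking = rank_variables(candidates, target)
for name, r in ranking:
    print(f"{name}: r = {r:.2f}")
```

Correlation only captures linear relations, so a ranking like this is a starting point for choosing variables rather than a final selection.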
◇ Developing datasets for testing, training, and production
purposes.
◇ This phase relies on tools and techniques such as decision trees,
regression techniques, logistic regression, and neural networks
for building and executing the model.
◇ A trial run of the model is performed to observe whether the model
corresponds to the datasets.
◇ Tools for the Model Building Phase:
1. Commercial tools: SAS, SPSS, MATLAB, Alpine, STATISTICA,
Mathematica, and other analytics tools.
2. Free or open source tools: R and PL/R, PostgreSQL, Octave,
WEKA, Python (NumPy, SciPy, pandas), and SQL in-database.
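The sketch below walks through a small version of this phase: partition the data, fit a model on the training set, and do a trial run on the testing set. It assumes a linear regression fitted by least squares and an interleaved split chosen only to keep the example deterministic; the records are synthetic and the function names are hypothetical.

```python
# Hypothetical records (x, y) with a small deterministic disturbance.
records = [(i, 3 * i + 5 + (i % 3) - 1) for i in range(12)]


def partition(rows, test_every=4):
    """Interleaved split: every `test_every`-th row goes to the testing set."""
    training = [row for i, row in enumerate(rows) if (i + 1) % test_every != 0]
    testing = [row for i, row in enumerate(rows) if (i + 1) % test_every == 0]
    return training, testing


def fit_line(points):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_absolute_error(points, slope, intercept):
    """Average absolute gap between predictions and observed values."""
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)


training, testing = partition(records)
slope, intercept = fit_line(training)          # build the model
mae = mean_absolute_error(testing, slope, intercept)  # trial run on unseen rows
print(f"slope={slope:.2f} intercept={intercept:.2f} test MAE={mae:.2f}")
```

Production data would be new records scored by the same fitted model after it passes the trial run.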
◇ To compare the outcomes of the modeling to the criteria
established for success and failure.
◇ To determine whether the model succeeded or failed in its objectives.
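Comparing outcomes against pre-agreed criteria can be sketched as follows. The thresholds, the labelled outcomes and the choice of accuracy and false-negative rate as metrics are all assumptions for the example.

```python
# Hypothetical success criteria agreed on before modeling began.
success_criteria = {"min_accuracy": 0.85, "max_false_negative_rate": 0.10}

# Hypothetical outcomes: (actual label, predicted label), where 1 = positive.
outcomes = [
    (1, 1), (0, 0), (1, 1), (1, 0), (0, 0),
    (0, 1), (1, 1), (0, 0), (1, 1), (0, 0),
]


def evaluate(results):
    """Compute accuracy and false-negative rate from labelled outcomes."""
    correct = sum(1 for actual, predicted in results if actual == predicted)
    positives = [predicted for actual, predicted in results if actual == 1]
    false_negatives = sum(1 for predicted in positives if predicted == 0)
    return {
        "accuracy": correct / len(results),
        "false_negative_rate": false_negatives / len(positives),
    }


def meets_criteria(metrics, criteria):
    """True only if every criterion is satisfied."""
    return (
        metrics["accuracy"] >= criteria["min_accuracy"]
        and metrics["false_negative_rate"] <= criteria["max_false_negative_rate"]
    )


metrics = evaluate(outcomes)
succeeded = meets_criteria(metrics, success_criteria)
print(metrics, "succeeded" if succeeded else "failed")
```

Fixing the criteria before looking at the results keeps the success or failure decision from being adjusted to fit the model.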
◇ To provide the stakeholders with a detailed report containing key findings,
code, briefings, and technical papers and documents.
◇ To measure the analysis’s effectiveness, the data is moved to a
live environment from the sandbox and monitored to observe if
the results match the expected business goal.
◇ If the findings are as per the objective, the reports and the
results are finalized.
Example of Big Data Analytic Lifecycle
Business Understanding and Data Understanding
Data preparation
Data Partitioning
Modeling
Model Evaluation
Final Result
Big Data Analysis Applications
Big Data Analytics Applications
Thank You