Copyright © SAS Institute Inc. All rights reserved.
CSC1202 Fundamentals of Data Science
Lecture 1: Introduction to Data Science
These slides are adapted with permission from SAS Introduction to Data Science Course Materials
Lecture 1: Introduction to Data Science
Big Data Analytics
Data Science Definition
Required Skills of Data Scientists
Lecture 1: Introduction to Data Science
Big Data Analytics
Data Science Definition
Required Skills of Data Scientists
Data Deluge
Consequences of the Data Deluge
• Every problem generates data eventually.
Proactively defining a data collection protocol results in more useful
information. This leads to more useful analytics.
• Every company needs analytics eventually.
Proactively analytical companies compete more effectively.
• Everyone needs analytics eventually.
Proactively analytical people are more marketable and more successful in
their work.
Big Data Explained
“Big data is what happened when the cost of storing information became less than the cost of making the decision to throw it away.”
- George Dyson, Science Historian and TED Speaker
Big Data: What Is It?
The SAS definition of big data:
The point at which the volume, velocity, and variety of data exceed an
organization’s storage or computation capacity for accurate and timely
decision making
Here are some factors associated
with big data:
• data volume
• data velocity
• data variety
• data variability
• data complexity
Data Volume
Data volumes are increasing due to use of the following:
• social media (Facebook, Twitter, Instagram)
• machines talking to machines
• improvements in the manufacturing process (quality control)
• automated tracking devices
• streaming data feeds
Data Velocity
• business processes that are more automated
• mergers and acquisitions
• more use of social media
• more use of self-service applications
• integration of business applications
Data Variety
• structured data
• unstructured data
• business applications
• unstructured text documents
(articles, blogs, and so on)
• emails
• digital images
• video and audio clips
• streaming data
• stock ticker data
• RFID tag data
• sensor data
Data Variability
• The flow of data changes over time (seasonality, peak response, social
media trends, and so on).
• Data values change over time. How much history do you keep?
• Data values are different across data sources.
• Data is stored in different formats.
• Data standards change across time.
What was “valid” five years ago
might not be “valid” today.
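One way to picture the "different formats" point is a field that several systems record differently. The following is a minimal Python sketch under stated assumptions: the three date formats and the function name `normalize_date` are hypothetical and chosen only for illustration; they do not come from the SAS course material.

```python
from datetime import date, datetime

# Hypothetical formats that different source systems might use for the same field.
KNOWN_FORMATS = ("%Y-%m-%d", "%d/%m/%Y", "%Y%m%d")


def normalize_date(raw: str) -> date:
    """Try each known format in turn and return one canonical date object."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f"unrecognized date format: {raw!r}")


# The same day, written three different ways by three imagined sources.
records = ["2021-03-05", "05/03/2021", "20210305"]
normalized = [normalize_date(r) for r in records]
```

When a standard changes over time, a new entry in `KNOWN_FORMATS` is the only change this sketch needs, which is one reason to keep such rules in a single place.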
Data Complexity
Data comes from a variety of systems in a variety of formats. This can make
it difficult to merge, cleanse, and transform data in a uniform manner.
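The merge, cleanse, and transform steps can be sketched in a few lines of standard-library Python. Everything here is assumed for illustration: the two sources (a CSV export and a JSON feed), their field names, and the helper functions are hypothetical.

```python
import csv
import io
import json

# Two imagined source systems describing the same customers in different shapes.
CRM_CSV = "customer_id,name\n 101 ,Ada Lovelace\n102,alan turing\n"
BILLING_JSON = '[{"custId": "101", "balance": "12.50"}, {"custId": "103", "balance": "7"}]'


def load_crm(text):
    """Cleanse CSV rows: trim whitespace, cast the key to int, standardize names."""
    rows = csv.DictReader(io.StringIO(text))
    return {int(r["customer_id"].strip()): {"name": r["name"].strip().title()} for r in rows}


def load_billing(text):
    """Transform JSON records to the same key type and a numeric balance."""
    return {int(r["custId"]): {"balance": float(r["balance"])} for r in json.loads(text)}


def merge(crm, billing):
    """Outer-join on customer id; a field missing from a source becomes None."""
    merged = {}
    for cid in sorted(crm.keys() | billing.keys()):
        merged[cid] = {
            "name": crm.get(cid, {}).get("name"),
            "balance": billing.get(cid, {}).get("balance"),
        }
    return merged


customers = merge(load_crm(CRM_CSV), load_billing(BILLING_JSON))
```

Even in this toy case, each source needs its own cleansing rule before the records can be joined, which is the difficulty the slide describes.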
Reasons for the Big Data Explosion
• increasing “data velocity” due to the following:
• streaming data feeds
• point-of-sale (POS) transactional systems
• radio-frequency identification (RFID) tags
• smart metering
• bigger and cheaper data storage capabilities
• social media
• improved and automated business processes
• mergers and acquisitions, leading to the merge of multiple data sources
• more online self-service applications being used
Factors Driving Demand for Big Data Solutions
In addition to rapidly increasing data growth rates, consider these factors:
• availability of data from social media sources
• in-memory technology
• demand for mobile business intelligence
• increasing requirements around real-time reporting
• desire to mine data from social media sources (sentiment analysis)
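As a rough illustration of what sentiment analysis on social media text involves, here is a deliberately simple word-counting sketch. The word lists, the example posts, and the function `sentiment_score` are all made up for this example; production systems typically rely on much larger lexicons or trained models.

```python
import re

# Tiny hand-made word lists, purely illustrative.
POSITIVE = {"love", "great", "fast", "helpful"}
NEGATIVE = {"hate", "slow", "broken", "rude"}


def sentiment_score(post: str) -> int:
    """Count positive words minus negative words in one post."""
    words = re.findall(r"[a-z']+", post.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


# Invented posts standing in for a social media feed.
posts = [
    "Love the new app, support was great and helpful!",
    "Checkout is slow and the coupon page is broken.",
    "Parcel arrived on Tuesday.",
]
scores = [sentiment_score(p) for p in posts]
```

A positive score suggests a favorable post, a negative score an unfavorable one, and zero a neutral or unrecognized one.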
How do I find the relevant data?
Lecture 1: Introduction to Data Science
Big Data Analytics
Data Science Definition
Required Skills of Data Scientists
Data Science Venn Diagram
(Conway, 2010)
Data Science is a combination of:
- Computer skills
- Mathematical knowledge
- Domain knowledge in the particular
field
Conway (2010) emphasizes the need to
learn a lot!!
Data Science: A Definition According to SAS
"Data Science can be thought of as a multidisciplinary field that combines
skills in software engineering and statistics with domain experience to
support the end-to-end analysis of large and diverse data sets, ultimately
uncovering value for an organization and then communicating that value to
stakeholders as actionable results."
Data Science: A Definition According to SAS
• software engineering
• advanced analytics
• domain experience
• support the end-to-end analysis of large and diverse data sets
• value
• communication to stakeholders as actionable results
"Data Science can be thought of as a multidisciplinary field
that combines skills in software engineering and statistics
with domain experience to support the end-to-end analysis
of large and diverse data sets...
...ultimately uncovering value for an organization and then
communicating that value to stakeholders as actionable
results."
Levels of Analytics
Analytic Methods
Descriptive model
• helps you understand what happened; diagnostic models help you understand key relationships and determine why something happened
Predictive model
• the use of data, statistical algorithms, and machine learning techniques to identify the likelihood of future outcomes based on historical data
• types: classification (predict class membership), regression (predict a number)
• techniques: decision trees, linear/logistic regression, neural networks, gradient boosting, random forests, support vector machines
Prescriptive model
• tells you what to do by providing information about optimal decisions based on the predicted future scenarios
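The three kinds of model can be compared on one small data set. The sketch below is illustrative only: the price and sales figures are invented, the helper names are hypothetical, and the "classification" step simply thresholds the regression output as a simplification rather than using a dedicated classifier.

```python
from statistics import mean

# Made-up history: unit price charged and units sold at that price.
prices = [10.0, 12.0, 14.0, 16.0]
units = [100.0, 90.0, 80.0, 70.0]

# Descriptive: summarize what happened.
average_units = mean(units)


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - slope * x_bar, slope


# Predictive (regression): estimate a number for a price not yet tried.
intercept, slope = fit_line(prices, units)


def predict_units(price):
    return intercept + slope * price


# Predictive (classification): predict class membership ("high" or "low" demand).
def predict_demand_class(price, threshold=85.0):
    return "high" if predict_units(price) >= threshold else "low"


# Prescriptive: choose the candidate price with the best predicted revenue.
candidates = [10.0, 12.0, 14.0, 15.0, 16.0, 18.0]
best_price = max(candidates, key=lambda p: p * predict_units(p))
```

The descriptive step only looks backward, the predictive step estimates an outcome for a new input, and the prescriptive step uses those predictions to recommend an action.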
Glossary of Terms
• Data Analysis
• Machine Learning
• Artificial Intelligence
• Statistics
• Natural Language Processing
• Data Mining
• Predictive Analysis
• Deep Learning
• Computer Vision
• Prescriptive Analysis
• Optimization
Glossary of Terms
• Data Analysis: find meaningful patterns and knowledge in data
• Machine Learning: trains a machine how to learn with minimal human intervention
• Artificial Intelligence: machines learn from experience, adjust to new inputs, and perform human-like tasks
• Statistics: numeric study of data relationships
• Natural Language Processing: enables understanding, interaction, and communication between humans and machines
• Data Mining: … in data, understand what is relevant, assess outcomes, accelerate informed decisions
• Predictive Analysis: identify the likelihood of future outcomes based on historical data
• Deep Learning: trains a machine to perform human-like tasks
• Computer Vision: analyzes/interprets a picture or video
• Prescriptive Analysis: providing information about optimal decisions based on the predicted future scenarios
• Optimization: delivers the best results given resource constraints
Chandana Gopal, IDC, December 2017
Analytics is core to success in the digital economy.
Data- and analytics-driven organizations will thrive.
Organizations That Are Using SAS AI and
Analytics Solutions
• Rogers (Telecom): 53% fewer customer complaints
• Amsterdam UMC (Health Care): improved liver and brain tumor diagnosis with AI and analytics
• WildTrack (Data for Good): 90% accuracy for ID of wildlife using tracks
• Honda (Manufacturing): continuous learning and insight from clients to improve design and quality
• Daiwa (Financial): 2.7x increase in client purchase rates
Lecture 1: Introduction to Data Science
Big Data Analytics
Data Science Definition
Required Skills of Data Scientists
What Is a Data Scientist?
Data scientists are a new breed of analytical data expert
who have the technical skills to solve complex problems
and the curiosity to explore what problems need to be solved.
They are part mathematician, part computer scientist, part trend spotter.
They are a sign of the times. Their popularity reflects how businesses now
think about big data.
That unwieldy mass of unstructured information can no longer be ignored and forgotten.
It is a virtual gold mine that helps boost revenue – as long as there is someone who digs in and unearths business
insights that no one thought to look for before.
Enter the data scientist
Typical Job Duties for a Data Scientist
It is not definitive, but think of …
• Collecting large amounts of
unruly data and transforming it
into a more usable format
• Solving business-related problems
using data-driven techniques
• Working with a variety of
programming languages,
including SAS, R and Python
• Having a solid grasp of statistics,
including statistical tests and
distributions
• Staying on top of analytical
techniques such as machine learning,
deep learning, and text analytics
• Communicating and collaborating
with both IT and business
• Looking for order and patterns
in data, as well as spotting trends
that can help a business’s
bottom line
Typical Job Responsibilities for a Data Scientist
• collect large amounts of unruly data and transform it into
a more usable format
• solve business-related problems using data-driven techniques
• work with a variety of programming languages
(for example, SAS, R, and Python)
• have a solid grasp of statistics, such as statistical tests
and distributions
• stay on top of analytical techniques such as social
network analysis, text analytics, and new methodologies
for predictive modeling
• communicate and collaborate with both IT and business
• look for order and patterns in data
But …
• There just are not enough data scientists in the workforce.
• It is important to realize that one data scientist might not have all the
necessary skills.
• It is important to develop a team of data scientists who are “scattered
across the business.”
• There is a rise of easier-to-use analytics tools.
• Analytics is so important to society that it cannot be something that is only
the domain of experts.
➢So companies rely on Citizen Data Scientists.
(Gartner research director, Alexander Linden, April 2015)
How to Find Citizen Data Scientists?
The demand for citizen data scientists will increase
five times more quickly than the demand
for “traditional,” highly skilled data scientists.
http://www.sas.com/en_us/insights/articles/analytics/how-to-find-and-equip-citizen-data-scientists.html
Characteristics of Citizen Data Scientists
• tired of looking at the same reports
• want to get their hands on all the data themselves
and find new ways to get answers
• willing to learn new methods and use new tools
• analytically minded
I don’t want to
ask a statistician.
I want to try it myself.
How do I get
the answer?
Three Roles Working Together …
… from basic discovery to data science ...
• business analyst
• citizen data scientist
• data scientist
domain expertise
data science expertise
advanced analytics
Data Scientist Skills
Computer Science
Mathematics and Statistics
Domain Knowledge
Communication and Visualization
Machine Learning
Research
Software
Papers and Techniques
Articles and Best Practices
Reports and Tasks
Scores and Insights
Data Scientist Skills
Computer Science
• Programming Language
• Statistical Package
• Scripting Language
• Mathematical Package
• Machine Learning Package
• Deep Learning Package
• Data Cleansing
• Data Preparation
• Visualization Tools
• Databases (SQL, NoSQL, Graph)
• Parallel Database and Parallel Query
• Distributed Computing
• Hadoop and Hive
• MapReduce
• Cloud Computing
• Graphical Processing
Mathematics and Statistics
• Design of Experiments
• Descriptive Statistics
• Statistical Inference
• Supervised Modeling (Regression, Decision Tree, Forest, Gradient Boosting, Neural Networks, Support Vector Machine, Factorization Machine, Ensemble Models, Two-Stage Models)
• Unsupervised Modeling (K-Means, Self-Organizing Maps, Variable Clustering, Principal Components, Association Rules, Sequence, Association, Path Analysis, Link Analysis)
• Optimization
• Forecasting
• Econometrics
• Text Mining
Domain Knowledge
• Business Knowledge
• Data Curiosity
• Analytical Approach
• Problem Solver
• Proactive
• Strategic
• Creative
• Innovative
• Collaborative
Communication and Visualization
• Engagement with Business and Management Levels
• Translation of Insights into Business Decisions and Actions
• Visual Presentation Expertise
• Data Visualization Tools Skills
• Storytelling Capabilities
Data Scientist Approach
Science
• Math
• Statistics
• Computer Science
Art
• Creativity
• Trial and Error
• Invention
Data Scientist
Applied Data Science
Government
Utilities
Retail
Insurance
Banking
Risk Analysis
Fraud Detection
Forecasting
Supply Chain
Bad Debt Prediction
Collecting Prediction
Spending Optimization
Loss Estimation
Customer Transaction Behavior
Money Laundering
Anomaly Detection
Churn
Cross-Sell/Upsell
Segmentation
References
Conway, D. (2010). The Data Science Venn Diagram. Drewconway.com.
http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram
Van Der Velden, J. (2021). Introduction to Data Science Course Notes. SAS
Institute.
