Spotle.ai Study Material
Spotle.ai/Learn
Planning Successful Data Science Projects
What Is Project Management?
Project management is the practice of initiating, planning, executing, controlling, and closing the work of a team to achieve specific goals and meet specific success criteria at the specified time.
- Wikipedia
Why Is It Important?
✓ Strategic Alignment – Aligns all stakeholders on a common goal and roadmap
✓ Manage Budget – Ensures work is delivered within budget
✓ Manage Time – Ensures work is delivered on time
✓ Manage Scope – Controls the work to prevent scope creep
✓ Risk Management –
✓ Reuse Success Stories – Codifies best practices that can be successfully reused
✓ Builds a Clear Go-To-Market Path – Creates a unified plan from requirement gathering to live deployment and user acceptance
Project management gives teams a way to manage chaos and follow a predictable path to success.
In Data Science Projects, Project
Management Becomes Critical:
๏ Large In Scale
๏ Cross Functional Teams
๏ CXO Involvement
๏ High Priority
๏ High Risk
๏ Multi-Agency
๏ High Complexity
๏ Data Issues
๏ Iterative Process
85% Of Big Data Projects Fail
Source: eweek.com
Delivering A Successful Data Science Project Requires Smart Planning And Execution.
The SMART Methodology For Managing Data Science Projects
The Spotle.ai 6-Phase SMART Model For Data Science Projects. This model is based on industry guidelines such as CRISP-DM and the experience of practitioners.
Problem Definition
• Understand the business problem
• Review literature
• Background research
Data Acquisition And Preparation
• Identify data sources
• Define data collection plan
• Data collection – primary and secondary data
• Data cleaning and preparation
Model Selection And Fitment
• Compare and select model
Validation
• Validate model with actual (mock/ scrambled) data
• User acceptance testing
Deployment
• Deploy the model in the live environment
Monitoring
• Monitor accuracy in the live scenario
Specify and Prepare
Model
Analyse
Roll-out
Test and Monitor
Feedback
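The six phases can be sketched as an ordered checklist in which a project moves forward only when the current phase's exit criteria are met. This is an illustrative sketch, not part of the Spotle.ai material: the names `Phase` and `next_phase` are hypothetical, and the rule that feedback from Monitoring returns the project to Problem Definition is an assumption about how the "Feedback" step is meant to work.

```python
from enum import IntEnum


class Phase(IntEnum):
    """The six phases named on the slide, in delivery order."""
    PROBLEM_DEFINITION = 1
    DATA_ACQUISITION_AND_PREPARATION = 2
    MODEL_SELECTION_AND_FITMENT = 3
    VALIDATION = 4
    DEPLOYMENT = 5
    MONITORING = 6


def next_phase(current: Phase, exit_criteria_met: bool) -> Phase:
    """Return the phase the project should be in next.

    The project stays in the current phase until its exit criteria are met.
    Assumption: feedback gathered during Monitoring starts a new iteration
    at Problem Definition.
    """
    if not exit_criteria_met:
        return current
    if current is Phase.MONITORING:
        return Phase.PROBLEM_DEFINITION
    return Phase(current + 1)


phase = Phase.PROBLEM_DEFINITION
phase = next_phase(phase, exit_criteria_met=True)
print(phase.name)  # DATA_ACQUISITION_AND_PREPARATION
```

Modelling the phases as an ordered type makes the iterative nature of the process explicit: a failed gate keeps the team where it is instead of letting unfinished work leak into the next phase.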
Indicative Tools Required At Each Phase
Problem Definition
Data Acquisition And Preparation
Model Selection And Fitment
Validation
Deployment
Monitoring
• Microsoft Project
• Visio
• Word
• PowerPoint
• JIRA/ other project management tool
• Tableau
• Power BI
• Excel
• ggplot2 using R
• Scala/ Python/ TensorFlow
• Hadoop
• MySQL/ other database
• Statistics
• R
• Python
• SQL
• R
• Python
• Scala
• TensorFlow
• SQL
• Azure ML/ IBM Bluemix
• Hadoop
• R/ Python/ Scala
• Server tools
• Monitoring and logging tools
• JIRA to record feedback
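For the Monitoring phase, "monitoring and logging tools" can be as simple as tracking accuracy over the most recent live predictions and logging a warning when it drops. The sketch below uses only Python's standard library; the class name `AccuracyMonitor`, the window size, and the alert threshold are hypothetical choices for illustration, not values from the source.

```python
import logging
from collections import deque

logger = logging.getLogger("model_monitor")


class AccuracyMonitor:
    """Tracks accuracy over the most recent `window` live predictions."""

    def __init__(self, window: int = 100, threshold: float = 0.8):
        # A bounded deque drops the oldest outcome once the window is full.
        self.outcomes = deque(maxlen=window)
        self.threshold = threshold  # assumed alert level, tune per project

    def record(self, predicted, actual) -> float:
        """Store one prediction outcome and return the rolling accuracy."""
        self.outcomes.append(predicted == actual)
        accuracy = sum(self.outcomes) / len(self.outcomes)
        # Only alert once the window is full, to avoid noisy early warnings.
        if len(self.outcomes) == self.outcomes.maxlen and accuracy < self.threshold:
            logger.warning(
                "Rolling accuracy %.2f is below threshold %.2f",
                accuracy,
                self.threshold,
            )
        return accuracy


monitor = AccuracyMonitor(window=4, threshold=0.75)
for predicted, actual in [(1, 1), (0, 1), (1, 1), (1, 1)]:
    rolling = monitor.record(predicted, actual)
print(rolling)  # 0.75
```

A warning logged here is the kind of event that could then be recorded as feedback in the project's issue tracker.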
Key Stakeholders At Each Stage Of A Data Science Project
Problem Definition
Data Acquisition And Preparation
Model Selection And Fitment
Validation
Deployment
Monitoring
• Business User
• CXO/ Project Sponsor
• Business Analyst
• Data Scientist
• Data Analysts
• Data Collection Agencies
• Analyst Relations
• Data Scientist
• Data Scientist
• Software Engineering Team
• Data Scientist
• Business Analyst
• Business User
• Quality Assurance
• Data Scientist
• Software Engineer
• Business User
• Project Sponsor/ CXO
• Data Scientist
The CRISP-DM Methodology For Managing Data Science Projects
CRISP-DM stands for Cross-Industry Standard Process for Data Mining.
The CRISP-DM methodology is a structured approach to planning a data mining, data science, or analytics project.
The CRISP-DM Model
Business Understanding
Data Understanding
Data Preparation
Modelling
Evaluation
Deployment
Why Data Science Projects Fail?
Data Issues
• Lack of access to data
• Faulty data/ data misclassification
Competency Issues
• Lack of cross-functional expertise
• Data scientist lacks experience
Other Issues
• Complex models/ under- or over-fitment
• Lack of firm support from CXOs/ sponsors
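Under- and over-fitment can often be spotted by comparing a model's error on training data with its error on held-out validation data. The following is a rough sketch of that check, not a method taken from the source: the function name `diagnose_fit` and both thresholds are hypothetical and would need to be set per project.

```python
def diagnose_fit(
    train_error: float,
    validation_error: float,
    acceptable_error: float = 0.10,  # assumed target, project-specific
    max_gap: float = 0.05,           # assumed tolerance, project-specific
) -> str:
    """Classify a model as 'underfit', 'overfit', or 'ok'.

    Underfit: the model cannot even fit the training data well.
    Overfit: the model fits the training data but does much worse on
    data it has not seen.
    """
    if train_error > acceptable_error:
        return "underfit"
    if validation_error - train_error > max_gap:
        return "overfit"
    return "ok"


print(diagnose_fit(train_error=0.30, validation_error=0.32))  # underfit
print(diagnose_fit(train_error=0.02, validation_error=0.20))  # overfit
print(diagnose_fit(train_error=0.05, validation_error=0.07))  # ok
```

Running a check like this during the Validation phase gives the team a measurable signal before a model that is too simple or too complex reaches deployment.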
To Be A Successful Data Science Project Manager, You Should Be Able To:
1 Manage Cross-Functional Teams
2 Engage Effectively With Senior Leaders
3 Develop Domain And Technology Understanding
4 Focus On Measurable Results
5 Tolerate Uncertainty
6 Effectively Manage Scope, Budget And Time
Learn Data Science In The Most Intense Course Ever: https://spotle.ai/learn/data-science
