Intro to Machine Learning
By: Hesham Gowaily
Agenda
- Introduction to AI.
- What is Machine Learning.
- Types of Machine Learning.
- ML Case Studies.
- How to study ML.
Introduction to AI
AI Applications
• Face Recognition
• Handwriting Analysis
• Automatic Captioning
• Object Tracking
• Image Styling
• Machine Translation
• Sentiment Analysis
• Chatbots
• Automatic Text Generation
• Virtual Assistants
• Voice Search
• Speech Generation
• Medical Diagnosis
• Personalized Medicine
• Early Detection
• Trading Bots
• Stock Predictions
• Fraud Detection
• Personalized Ads
• Product Recommendations
• Autonomous Cars
• Robotic Surgery
History of AI
The Big Data Era
Data
• Large volumes of data are produced every day.
• Everyone has a phone packed with several sensors.
Infrastructure
• The computing power of GPUs has increased dramatically.
• Cloud providers offer online computing (IaaS).
Services
• User applications: YouTube, Gmail, Facebook, Twitter.
• Online storage is available for free or at low cost.
Notable AI Achievements
ImageNet is a database of 14 million images with over 20,000 categories.
GPT-3 is a language model with 175 billion learnable parameters.
AI Jargon
Overlapping AI-Related Terminology
• Artificial Intelligence (AI)
Trying to simulate human intelligence.
• Machine Learning (ML)
Learns by example from experience and historic data.
• Deep Learning (DL)
Learns patterns using multi-layered neural networks.
• Data Science (DS)
Uses a variety of scientific methods, processes and systems to solve problems involving data.
• Big Data
Analyzes data sets that are too large or complex for traditional data-processing tools.
(Diagram relating Artificial Intelligence, Machine Learning, Deep Learning, Data Science and Big Data.)
What is Machine Learning
What is Machine Learning
The subfield of computer science that “gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959).
Using previous data (training) to answer future questions (prediction).
Historic data may contain answers (labeled) or may not contain answers (unlabeled).
Machine Learning vs. Traditional Programming
Traditional Programming: Data + Rules → Answers
Machine Learning: Data + Answers → Rules
Traditional Programming:
• Business requirements and data are analyzed.
• A set of hard-coded rules is programmed and tested.
• The program processes new data based on the coded rules.
Machine Learning:
• Data and their labels (answers) are fed into a model.
• The model “learns” useful features and frequent patterns to predict answers.
• The trained model is used to “predict” answers for new data.
HOUSING PRICES
Estimate housing prices based on 3 features (properties): Area of the House, Number of Bedrooms, Number of Bathrooms
• Traditional Approach:
Price = 1.2 x Area + 0.7 x # Bedrooms + 0.3 x # Bathrooms
The pricing formula is known beforehand and is explicitly hard-coded. The formula can be deduced by manual analysis or from subject-matter expert (SME) domain experience.
• Machine Learning:
Price = A x Area + B x # Bedrooms + C x # Bathrooms
The pricing formula is unknown at the beginning, and the model needs to be trained to “learn” the formula parameters A, B and C.
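The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the function names `price_traditional` and `price_learned` are hypothetical, and the coefficients 1.2, 0.7 and 0.3 are the ones quoted on the slide.

```python
def price_traditional(area, bedrooms, bathrooms):
    """Traditional approach: coefficients are hard-coded by the programmer."""
    return 1.2 * area + 0.7 * bedrooms + 0.3 * bathrooms


def price_learned(area, bedrooms, bathrooms, params):
    """Machine learning approach: the same formula, but A, B and C are
    supplied by a training step instead of being written into the code."""
    a, b, c = params
    return a * area + b * bedrooms + c * bathrooms


# With identical coefficients both functions agree; the only difference is
# where the numbers come from (a person vs. a training procedure).
same = price_learned(130, 3, 1, (1.2, 0.7, 0.3)) == price_traditional(130, 3, 1)
```

The shape of the formula is still a human choice in both cases; training only fills in the values of A, B and C.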
Housing Prices
Input Dataset: Contains prepared historic data of actual house sales.
Area   #Bedrooms   #Bathrooms   Price
130    3           1            1,200
160    3           2            1,500
90     2           1            900
…
Model (Hyperparameters, Optimization): The model learns appropriate “parameters” to “fit” the input data.
Price = A x Area + B x # Bedrooms + C x # Bathrooms
Output: Parameters that complete the formula and can be generalized to predict prices of unsold houses.
A = 1.247
B = 0.682
C = 0.319
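One possible way to "fit" the parameters is ordinary least squares, sketched below using only the Python standard library. Several things here are assumptions rather than part of the slides: the choice of least squares itself, the helper names `solve_linear_system` and `fit_least_squares`, and the training data, which is synthetic and generated from the hard-coded formula Price = 1.2 x Area + 0.7 x # Bedrooms + 0.3 x # Bathrooms so that the learned values can be checked against known ones.

```python
import random


def solve_linear_system(matrix, vector):
    """Solve matrix * x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        # Move the row with the largest entry in this column to the pivot position.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


def fit_least_squares(features, prices):
    """Return the parameters that minimize squared error by solving
    the normal equations (X^T X) w = X^T y."""
    n = len(features[0])
    xtx = [[sum(row[i] * row[j] for row in features) for j in range(n)]
           for i in range(n)]
    xty = [sum(row[i] * p for row, p in zip(features, prices)) for i in range(n)]
    return solve_linear_system(xtx, xty)


# Synthetic (made-up) houses: area, number of bedrooms, number of bathrooms.
random.seed(0)
houses = [[random.uniform(60, 250), random.randint(1, 5), random.randint(1, 3)]
          for _ in range(50)]
# "Answers" produced by the known formula, so the fit can be verified.
prices = [1.2 * area + 0.7 * beds + 0.3 * baths for area, beds, baths in houses]

A, B, C = fit_least_squares(houses, prices)
```

Because the synthetic prices contain no noise, the fitted A, B and C land very close to 1.2, 0.7 and 0.3. On real sales data the fit would only approximate the underlying relationship.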
Types of Machine Learning
Types of Learning
• Supervised Learning
• Unsupervised Learning
• Semi-supervised Learning
• Reinforcement Learning
Supervised Learning
• Learns through examples collected from historic data.
• Examples contain the desired output (labels) that will be predicted for future data.
Examples:
• Is this a cat or a dog?
• Is this email spam or not?
• What is the market value of a house given its area and number of bedrooms?
Supervised Learning
• Regression
Output is continuous. Predicts numerical values such as prices or temperature.
• Classification
Output is discrete. Predicts categorical labels such as Cat or Dog.
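As a small illustration of classification, the sketch below predicts a discrete label with a 1-nearest-neighbor rule. The technique, the function name `nearest_neighbor_predict`, and the toy measurements (weight in kg, height in cm) are all assumptions chosen for illustration; the slides do not prescribe a specific algorithm.

```python
def nearest_neighbor_predict(train, query):
    """Return the label of the training example closest to `query`."""
    def squared_distance(p, q):
        return sum((x - y) ** 2 for x, y in zip(p, q))

    _features, label = min(train, key=lambda ex: squared_distance(ex[0], query))
    return label


# Hypothetical labeled examples: (weight in kg, height in cm) -> label.
train = [
    ((4.0, 25.0), "cat"),
    ((5.0, 23.0), "cat"),
    ((20.0, 55.0), "dog"),
    ((30.0, 60.0), "dog"),
]

prediction = nearest_neighbor_predict(train, (6.0, 27.0))
```

A regression model would instead return a number, such as a price, rather than one of a fixed set of labels.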
Unsupervised Learning
• Uses historic data that has no labels.
• Discovers the intrinsic links in the data.
Examples:
• Group photos into 20 groups based on their metadata.
• Segment customer profiles based on their demographics and purchase behavior.
• Find an anomaly in credit card usage patterns.
Unsupervised Learning
• Useful for learning structure in the data (clustering) or detecting outliers (anomaly detection).
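Clustering can be sketched with a minimal k-means loop. The choice of k-means is an assumption (the slides only name clustering in general), and the function name `kmeans`, the customer points (age, monthly spend) and the starting centroids are made up for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: (p[0] - centroids[i][0]) ** 2
                + (p[1] - centroids[i][1]) ** 2,
            )
            clusters[nearest].append(p)
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]          # keep an empty cluster's centroid
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers: (age, monthly spend). No labels are provided.
customers = [(22, 30), (25, 35), (24, 28), (51, 120), (55, 130), (53, 125)]
centroids, clusters = kmeans(customers, centroids=[customers[0], customers[3]])
```

No labels are used at any point: the two groups emerge only from how close the points are to each other.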
Semi-Supervised Learning
• Historic data has a small amount of labeled data and a large amount of unlabeled data.
• The cost of manually labeling all data is prohibitive.
• The problem is initially treated as unsupervised, to group the data into clusters.
• After that, the available labels are used to label entire clusters.
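The "label entire clusters" step can be sketched as a majority vote inside each cluster. The function name `label_clusters`, the cluster assignments and the two known labels below are hypothetical; the cluster assignments are assumed to come from an earlier unsupervised step.

```python
from collections import Counter


def label_clusters(cluster_ids, known_labels):
    """Give every sample the majority label among the labeled samples of
    its cluster. Samples in clusters with no labeled member get None.

    cluster_ids:  cluster index for each sample
    known_labels: {sample index: label} for the few labeled samples
    """
    votes = {}
    for index, label in known_labels.items():
        votes.setdefault(cluster_ids[index], Counter())[label] += 1
    cluster_label = {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}
    return [cluster_label.get(cid) for cid in cluster_ids]


# Six samples in two clusters; only samples 0 and 4 were labeled by hand.
cluster_ids = [0, 0, 0, 1, 1, 1]
known_labels = {0: "budget", 4: "premium"}
all_labels = label_clusters(cluster_ids, known_labels)
```

Two manual labels are enough to label all six samples here, which is the cost saving the approach aims for. It only works well when the clusters really correspond to the classes of interest.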
Reinforcement Learning
• An agent interacts with an environment and watches the result of the interaction.
• The environment gives feedback via a positive or negative reward signal.
• The agent learns to optimize its interactions to maximize the reward.
Examples:
• An autonomous vehicle learns to put safety first, minimize ride time, and obey the law.
• A stock trading agent can decide to buy, sell or hold based on market status and transaction profit/loss.
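The agent, environment and reward loop can be sketched with tabular Q-learning on a toy problem. Everything specific here is an assumption made for illustration: the choice of Q-learning, the five-position corridor environment, the reward of 1 for reaching the last position, and the values of `ALPHA` and `GAMMA`.

```python
import random

N_STATES = 5                  # corridor positions 0..4; position 4 is the goal
ACTIONS = ["left", "right"]
ALPHA, GAMMA = 0.5, 0.9       # learning rate and discount factor


def step(state, action):
    """Environment: move one position and return (next_state, reward)."""
    if action == "left":
        nxt = max(0, state - 1)
    else:
        nxt = min(N_STATES - 1, state + 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def train(episodes=500, seed=0):
    """Learn action values from randomly explored episodes."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != N_STATES - 1:
            a = rng.randrange(len(ACTIONS))       # explore uniformly at random
            nxt, reward = step(state, ACTIONS[a])
            # Q-learning update: move the estimate toward
            # reward + discounted best value of the next state.
            q[state][a] += ALPHA * (reward + GAMMA * max(q[nxt]) - q[state][a])
            state = nxt
    return q


q_table = train()
# Greedy policy: in each non-goal position, pick the action with the highest value.
policy = [ACTIONS[max(range(len(ACTIONS)), key=lambda a: q_table[s][a])]
          for s in range(N_STATES - 1)]
```

The agent is never told that "right" is correct. It infers this from the reward signal alone, which is what separates reinforcement learning from learning with labels.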
ML Case Studies.
ML Productizing
• AI-First
AI tech is at the center and is essential to the product function. Examples: virtual assistants, chatbots, self-driving cars.
• AI-Inside
AI adds a useful function that enhances user experience. Examples: recommendation engines, process automation, fraud detection.
• Actionable Insights
AI leverages data that you collect to make informed decisions. Examples: sales forecast, churn analysis.
Case Study: Netflix Recommendation
Personalized recommendation using collaborative filtering.
Scale:
• Volume: 13,612 titles (2019)
• Subs: 159 million (2020)
Results:
• High engagement rate, low churn
• Personalization and recommendations save Netflix more than $1 billion per year.
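The general idea of collaborative filtering can be sketched as follows. This is a toy user-based variant with cosine similarity and is not a description of Netflix's actual system; the users, titles, ratings and the function names `cosine` and `recommend` are all hypothetical.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale.
ratings = {
    "ana": {"Title A": 5, "Title B": 4, "Title C": 1},
    "ben": {"Title A": 4, "Title B": 5, "Title D": 5},
    "cho": {"Title C": 5, "Title D": 2, "Title E": 4},
}


def cosine(u, v):
    """Cosine similarity between two users' rating dictionaries."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[t] * v[t] for t in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)


def recommend(user, ratings):
    """Rank titles the user has not rated by the similarity-weighted
    average of other users' ratings."""
    scores, weights = {}, {}
    for other, their in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], their)
        for title, rating in their.items():
            if title not in ratings[user]:
                scores[title] = scores.get(title, 0.0) + sim * rating
                weights[title] = weights.get(title, 0.0) + sim
    ranked = [(scores[t] / weights[t], t) for t in scores if weights[t] > 0]
    return sorted(ranked, reverse=True)


suggestions = [title for _score, title in recommend("ana", ratings)]
```

For "ana", the title rated highly by the most similar user ("ben") ranks first. Production systems work with far larger and sparser data and typically combine several techniques.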
Case Study: Infervision Cancer Diagnosis
Predominantly used in early-stage lung cancer screening. Employs more than 50 deep learning algorithms to determine each diagnosis.
Scale:
• Trained using over 200,000 scans in trials at 20 hospitals.
Results:
• Helped reduce the rate of missed cancer diagnoses by 50 percent.
Case Study: Amazon Inventory Optimization
ML-powered inventory optimization ensures that inventory is positioned ahead of time to meet forecast demand.
Scale:
• Ships an average of 10 million packages per day.
Results:
• Store 40% more inventory.
• Fulfill 1-day and 2-day shipping on time.
How To Study ML
Study Pre-requisites
Math                         Programming   Tools/Concepts
Probability and Statistics   Python        Modern IDEs
Linear Algebra               R             Data tools
Calculus
Mathematics Study tips
• Book: A First Course in Probability, 9th ed.
• Mathematics for Machine Learning
3Blue1Brown
• Coursera: Mathematics for Machine Learning, by Imperial College London
https://www.coursera.org/specializations/mathematics-machine-learning
• Book: Mathematics for Machine Learning
https://mml-book.github.io/
Machine Learning Study tips
• Coursera: Machine Learning, offered by Stanford
https://www.coursera.org/learn/machine-learning
• YouTube: Stanford CS 229 – Machine Learning (Math focused)
https://www.youtube.com/playlist?list=PLoROMvodv4rMiGQp3WXShtMGgzqpfVfbU
• Book: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd Edition
https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/
QUESTIONS
Website
Datapartz.com
YouTube
Data Partz
LinkedIn
http://linkedin.com/in/heshamgowaily
Twitter
@heshamgowaily
Email
heshamgowaily@gmail.com
Phone
+20 (106) 700-5566
Thank You
