Running
Machine Learning Applications
In Production
Sam BESSALAH
@samklr
Might work well for Kaggle!
But Kaggle isn't real-world machine learning!
In Real Life
- Trade-off: accuracy vs. interpretability vs. speed vs. infrastructure constraints (illustrative sketch below)
- Interpretability and speed often beat accuracy
- Most of the time, Kaggle is a feature engineering contest
- Contest-oriented vs. real product impact
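As a rough illustration of that trade-off (not from the talk), here is a minimal sketch comparing a linear model with a gradient-boosted ensemble; scikit-learn and the synthetic dataset are assumptions made for the example:

```python
# Hypothetical illustration of the accuracy vs. interpretability vs. speed trade-off.
# Synthetic data and scikit-learn are assumptions, not part of the original talk.
import time
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=20_000, n_features=40, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, model in [("logistic_regression", LogisticRegression(max_iter=1000)),
                    ("gradient_boosting", GradientBoostingClassifier())]:
    t0 = time.perf_counter()
    model.fit(X_tr, y_tr)
    fit_s = time.perf_counter() - t0

    t0 = time.perf_counter()
    acc = model.score(X_te, y_te)
    pred_s = time.perf_counter() - t0

    # The linear model is usually faster and easier to explain (coefficients);
    # the ensemble often buys a bit of accuracy at the cost of both.
    print(f"{name}: accuracy={acc:.3f} fit={fit_s:.2f}s predict={pred_s:.3f}s")
```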
But in real life … things are less obvious
[Diagram: Data Engineers (Data Pipeline) → Data Scientists / ML Engineers → Application Developers (App)]
Innovation is often (wrongly?) thought to be here …
http://www.slideshare.net/jssm1th/an-architecture-for-agile-machine-learning-in-realtime-applications
@josh_wills
Production Requirements:
- Flexibility and agility
- Scalability and performance
- Enable real-time decision making, sometimes at huge QPS with sub-second latency
- Security
@josh_wills
Machine Learning as a Software Problem
- Most ML development patterns lead to software design anti-patterns
- Dependencies in code creep into model dependencies on data
- Wasteful use of data, since most ML model selection requires multiple versions of the data; hence the instability of the data, and of the prediction services
- Breaks system isolation, leading to unmaintainable stacks (see the interface sketch below)
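One way to read the isolation point: keep application code coupled to a small, stable prediction interface rather than to a concrete model and its feature code. A minimal sketch, with illustrative names that are not from the talk:

```python
# Hypothetical sketch: hide the concrete model and its feature code behind a
# small, versioned interface so application code never imports model internals.
from dataclasses import dataclass
from typing import Protocol, Sequence


class Scorer(Protocol):
    """The only contract application code is allowed to depend on."""
    version: str
    def score(self, features: Sequence[float]) -> float: ...


@dataclass
class LinearScorer:
    """One concrete implementation; can be swapped without touching callers."""
    version: str
    weights: Sequence[float]
    bias: float = 0.0

    def score(self, features: Sequence[float]) -> float:
        return self.bias + sum(w * x for w, x in zip(self.weights, features))


def handle_request(scorer: Scorer, features: Sequence[float]) -> dict:
    # The application only sees the interface and the model version.
    return {"model_version": scorer.version, "score": scorer.score(features)}


print(handle_request(LinearScorer(version="v2", weights=[0.4, -1.2]), [1.0, 0.5]))
```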
In Production, Machine Learning is a
Software and System Problem.
Treat it accordingly!
Deployment / Model Serving
The Missing Part in ML
- Model serving is often ignored, or left to back-end engineers to implement as they see fit.
- More often it involves serving an API or a service that exposes the predict function. But that is often not enough (minimal example below).
- Scaling the serving software can become problematic for the accuracy of the model.
- How many models are you serving?
- Are you running something else alongside?
- Are you updating your model in real time?
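A minimal sketch of the "serve an API that does predict" pattern, assuming Flask and a pickled scikit-learn model on disk; the path, port and payload format are illustrative assumptions:

```python
# Hypothetical minimal prediction service; model path and field names are assumptions.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Loaded once at startup; real deployments also need versioning, warm-up,
# monitoring and a strategy for updating the model without downtime.
with open("model.pkl", "rb") as f:
    MODEL = pickle.load(f)


@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]   # e.g. [[1.2, 0.4, ...]]
    prediction = MODEL.predict(features).tolist()
    return jsonify({"prediction": prediction})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)
```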
Example: Airbnb
- Trained models are stored as PMML files
- They serve their models via Openscoring (sketch of the flow below)
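A hedged sketch of that flow, assuming sklearn2pmml for the export and Openscoring's default REST layout for deployment and scoring; the model name and the x1..x4 field names are assumptions, so check the respective docs:

```python
# Hypothetical PMML + Openscoring flow; endpoints and field names are assumptions.
import requests
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

# 1. Train and export the model as PMML.
X, y = load_iris(return_X_y=True)
pipeline = PMMLPipeline([("classifier", DecisionTreeClassifier())])
pipeline.fit(X, y)
sklearn2pmml(pipeline, "iris.pmml")

# 2. Deploy it to a running Openscoring server (assumed default REST layout).
with open("iris.pmml", "rb") as f:
    requests.put("http://localhost:8080/openscoring/model/iris",
                 data=f.read(), headers={"Content-Type": "application/xml"})

# 3. Score one record through the served model (argument names depend on the PMML).
resp = requests.post("http://localhost:8080/openscoring/model/iris",
                     json={"arguments": {"x1": 5.1, "x2": 3.5, "x3": 1.4, "x4": 0.2}})
print(resp.json())
```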
PMML?
- Might be the solution for some (most?) cases
- Supports many models, but lacks support for many others
- Fails to capture the evolution of your modeling process: transformations, re-encoding, etc.
- Better suited for exporting models to other systems than for serving user-facing machine learning products
- And … XML?? Really????
Model Versioning - Packaging
- You usually don’t serve only one model, but many more, especially when running experiments.
- You should strive to package your models in a versioned way.
- Git is awesome, but not appropriate for live model serving.
- Build a model repository or a model index (sketch below).
- I usually use a fast KV store or an advanced data store to save my models.
- Build a service to manage your models (a Model Manager) responsible for evaluating and updating them.
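A minimal sketch of such a model index on top of Redis (Redis and the key layout are assumptions; any fast KV store would do):

```python
# Hypothetical versioned model store on Redis; the key layout is an assumption.
import pickle
import time

import redis

r = redis.Redis(host="localhost", port=6379)


def publish_model(name: str, model) -> int:
    """Store a new immutable version and move the 'current' pointer to it."""
    version = r.incr(f"models:{name}:latest_version")
    r.set(f"models:{name}:{version}:blob", pickle.dumps(model))
    r.set(f"models:{name}:{version}:created_at", time.time())
    r.set(f"models:{name}:current", version)   # what the serving layer reads
    return version


def load_current_model(name: str):
    """Fetch whatever version the pointer currently designates."""
    version = int(r.get(f"models:{name}:current"))
    return version, pickle.loads(r.get(f"models:{name}:{version}:blob"))
```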
TensorFlow Serving
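With TensorFlow Serving, the trained model is exported as a SavedModel into a version-numbered directory and served by the TF Serving binary; here is a hedged client sketch against its REST API, where the model name "clicks" and the inputs are assumptions:

```python
# Hypothetical client call against TensorFlow Serving's REST API
# (default port 8501; the model name "clicks" and the inputs are assumptions).
import requests

payload = {"instances": [[0.2, 1.5, 3.1], [0.9, 0.1, 2.2]]}
resp = requests.post(
    "http://localhost:8501/v1/models/clicks:predict",
    json=payload,
    timeout=1.0,   # serving is latency-sensitive; fail fast
)
print(resp.json()["predictions"])
```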
Serialization
- Remember PMML?
- In Big Data, data has a schema and proper schema evolution. Why not models?
- Lots to choose from: Protobuf, Avro
- Use a binary schema to represent and version your models (sketch below)
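A minimal sketch of that idea using an Avro schema for model artifacts (fastavro and the field layout are assumptions; Protobuf would work the same way):

```python
# Hypothetical Avro schema for versioned model artifacts; fastavro is assumed.
import io
import pickle

from fastavro import parse_schema, reader, writer
from sklearn.linear_model import LogisticRegression

schema = parse_schema({
    "name": "ModelArtifact",
    "type": "record",
    "fields": [
        {"name": "model_name", "type": "string"},
        {"name": "version", "type": "int"},
        {"name": "framework", "type": "string"},
        {"name": "blob", "type": "bytes"},   # serialized model payload
    ],
})

model = LogisticRegression().fit([[0.0], [1.0]], [0, 1])
record = {"model_name": "churn", "version": 3,
          "framework": "sklearn", "blob": pickle.dumps(model)}

buf = io.BytesIO()
writer(buf, schema, [record])        # the schema travels with the data
buf.seek(0)
restored = next(reader(buf))
print(restored["model_name"], restored["version"])
```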
Evaluation
- Business metrics often differ from core model metrics: there is a trade-off between long-term and short-term metrics
- Hyperparameters
- A/B testing and the multi-armed bandit problem (sketch below)
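As an illustration of the bandit approach to comparing model versions, here is a minimal epsilon-greedy sketch (the model names, reward signal and click rates are made up):

```python
# Hypothetical epsilon-greedy bandit routing traffic between two model versions.
import random


class EpsilonGreedyRouter:
    def __init__(self, arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = {arm: 0 for arm in arms}
        self.rewards = {arm: 0.0 for arm in arms}

    def choose(self) -> str:
        if random.random() < self.epsilon:              # explore
            return random.choice(list(self.counts))
        # exploit: the arm with the best observed average reward so far
        return max(self.counts, key=lambda a:
                   self.rewards[a] / self.counts[a] if self.counts[a] else 0.0)

    def update(self, arm: str, reward: float) -> None:
        self.counts[arm] += 1
        self.rewards[arm] += reward


router = EpsilonGreedyRouter(["model_v1", "model_v2"])
for _ in range(1000):
    arm = router.choose()
    clicked = random.random() < (0.05 if arm == "model_v1" else 0.08)  # fake feedback
    router.update(arm, 1.0 if clicked else 0.0)
print(router.counts)   # traffic should drift toward the better-performing model
```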
Hyperparameters
Netflix
A/B Testing: Multi-armed Bandit
Dataiku
Experiments
Reproducibility
- How do you keep track of the data used for training? (a minimal metadata sketch below)
- Are notebooks enough?
- Jupyter notebooks, Spark Notebooks, Zeppelin, etc.
- Need for an end-to-end solution. Not perfect, but a workable one.
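One workable, if imperfect, approach is to store a fingerprint of the training data next to the model; a minimal sketch where the paths and fields are illustrative assumptions:

```python
# Hypothetical sketch: record a fingerprint of the training data next to the model
# so a given model version can be traced back to the exact data it was trained on.
import hashlib
import json
import time


def fingerprint_file(path: str) -> str:
    """SHA-256 of a (possibly large) file, read in 1 MB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


run_metadata = {
    "model_name": "churn",
    "trained_at": time.time(),
    "training_data": "data/train_2016_10.csv",                  # illustrative path
    "training_data_sha256": fingerprint_file("data/train_2016_10.csv"),
    "git_commit": "<commit of the training code>",              # left to the build system
    "hyperparameters": {"max_depth": 6, "eta": 0.1},
}

with open("churn_v3.metadata.json", "w") as f:
    json.dump(run_metadata, f, indent=2)
```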
I forgot many things
- Monitoring (drift sketch below)
- Pipeline tuning (one model is often fed into another)
- RPC over REST for fast model serving?
- How to deal with heterogeneous systems?
- Do you really have to distribute your processing?
- Is more data better than smartly tuned algorithms?
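On the monitoring point, a minimal sketch of one common tactic: comparing the live score distribution against what was seen at training time. The thresholds and window size are assumptions:

```python
# Hypothetical monitoring hook: compare the live score distribution against the
# distribution observed at training time to catch drift between model and data.
import statistics
from collections import deque


class ScoreMonitor:
    def __init__(self, train_mean: float, train_stdev: float, window: int = 1000):
        self.train_mean = train_mean
        self.train_stdev = train_stdev
        self.recent = deque(maxlen=window)

    def record(self, score: float) -> None:
        self.recent.append(score)

    def drifted(self, tolerance: float = 3.0) -> bool:
        if len(self.recent) < self.recent.maxlen:
            return False                      # not enough live data yet
        live_mean = statistics.fmean(self.recent)
        # Flag when the live mean wanders too far from what training saw.
        return abs(live_mean - self.train_mean) > tolerance * self.train_stdev
```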