Nick Pinckernell, Comcast Applied AI Research
Utilizing MLflow and Kubernetes to build an Enterprise ML Platform
#UnifiedAnalytics #SparkAISummit
Topics
TOPIC | WHY?
Example of data pipeline abstraction | Modular components and reuse are important for abstracting complex systems
Ways to package and track ML projects and experiments | Consistency and reproducibility are key for scale
How Comcast uses Kubeflow to serve and deploy models and pipelines | A tangible example to help you brainstorm about your organization's requirements
Challenges and motivations
• Before, there was no
– model management or tracking
– standardization for model packaging or deployments
• Cumbersome deployment process
– Deployment required code rewrite from research to operations
– Days or weeks to deploy
• Response and tradeoff: restrict model complexity
Requirements
Minimum requirements from our organization
• Zero code refactoring or rewriting between research-ready models and production
• Easier experiment and model tracking
• Researchers need to deploy their own models
• A/B testing for quick model enhancement testing in production
• Ability to modularize and inject custom metrics and workflows at each step
Solution – existing technologies
• Research
• Model serving
• Images, containers
Data pipeline abstraction
• Determine use cases
• Identify commonalities for modularization
• Abstract interfaces
• Automate configuration
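The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not Comcast's implementation: the `PipelineStep` interface, the `SelectFields` and `Scale` components, the `STEP_TYPES` registry, and the record fields are all hypothetical names chosen for the example. The idea it shows is that every component shares one abstract interface, and a pipeline is assembled from configuration rather than hand-written code.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class PipelineStep(ABC):
    """Common interface every pipeline component implements."""

    def __init__(self, **config: Any) -> None:
        self.config = config

    @abstractmethod
    def process(self, record: Dict[str, Any]) -> Dict[str, Any]:
        """Transform one record and return the result."""


class SelectFields(PipelineStep):
    """Keep only the fields named in the 'fields' config entry."""

    def process(self, record):
        return {k: record[k] for k in self.config["fields"] if k in record}


class Scale(PipelineStep):
    """Multiply one numeric field by a configured factor."""

    def process(self, record):
        out = dict(record)
        out[self.config["field"]] = out[self.config["field"]] * self.config["factor"]
        return out


# Registry that lets configuration refer to steps by name.
STEP_TYPES = {"select": SelectFields, "scale": Scale}


def build_pipeline(spec: List[Dict[str, Any]]) -> List[PipelineStep]:
    """Instantiate steps from a list of {'type': ..., 'config': {...}} entries."""
    return [STEP_TYPES[s["type"]](**s.get("config", {})) for s in spec]


def run_pipeline(steps: List[PipelineStep], record: Dict[str, Any]) -> Dict[str, Any]:
    """Pass a record through each step in order."""
    for step in steps:
        record = step.process(record)
    return record


# Hypothetical configuration: the pipeline is described as data, not code.
spec = [
    {"type": "select", "config": {"fields": ["device_id", "latency_ms"]}},
    {"type": "scale", "config": {"field": "latency_ms", "factor": 0.5}},
]
pipeline = build_pipeline(spec)
result = run_pipeline(pipeline, {"device_id": "a1", "latency_ms": 250, "noise": 1})
```

Adding a new use case then means registering one more step class and editing the configuration, which is what makes the components reusable.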
Pipeline abstraction
Seldon inference graphs
Allows for complex graphs
• A/B testing
• Ensembles
• Multi-armed bandit
• Custom combinations
https://github.com/SeldonIO/seldon-core/blob/release-0.2/notebooks/advanced_graphs.ipynb
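As an illustration of the A/B testing case, the sketch below shows a custom router node. It assumes the convention used by Seldon Core's Python wrapper, where a router class exposes a `route(features, feature_names)` method that returns the index of the child node to call. The class name `ABTestRouter`, the `ratio_a` parameter, and the 80/20 split are hypothetical.

```python
import random


class ABTestRouter:
    """Send a configurable share of requests to child 0 (model A), the rest to child 1 (model B)."""

    def __init__(self, ratio_a: float = 0.5, seed=None):
        if not 0.0 <= ratio_a <= 1.0:
            raise ValueError("ratio_a must be between 0 and 1")
        self.ratio_a = ratio_a
        self._rng = random.Random(seed)

    def route(self, features, feature_names):
        # Return the index of the child node that should handle this request.
        return 0 if self._rng.random() < self.ratio_a else 1


# Route 10,000 simulated requests and measure the share sent to model A.
router = ABTestRouter(ratio_a=0.8, seed=42)
choices = [router.route([[5.1, 3.5, 1.4, 0.2]], None) for _ in range(10_000)]
share_a = choices.count(0) / len(choices)
```

A multi-armed bandit router would keep the same `route` shape but update its choice from feedback instead of using a fixed ratio.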
Packaging and tracking
1. Researchers code and train models with Databricks, Spark
2. Experiments tracked with MLflow
3. Packaging and model tracking with MLflow and Kubeflow
• MLflow standard packaging formats
– scikit-learn
– H2O
– TensorFlow
– more
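The tracking and packaging steps above might look roughly like the following for a scikit-learn model. This is a sketch under assumptions: the experiment name `iris-demo`, the helper names `accuracy` and `log_run`, and the artifact path `model` are hypothetical, and the tracking server is whatever MLflow is configured to use. The MLflow calls themselves (`set_experiment`, `start_run`, `log_param`, `log_metric`, `mlflow.sklearn.log_model`) are part of MLflow's Python API.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def log_run(model, params, y_true, y_pred, experiment="iris-demo"):
    """Record one training run in MLflow: parameters, a metric, and the packaged model."""
    # Imported here so the metric helper above can be used without MLflow installed.
    import mlflow
    import mlflow.sklearn

    mlflow.set_experiment(experiment)  # hypothetical experiment name
    with mlflow.start_run() as run:
        for name, value in params.items():
            mlflow.log_param(name, value)
        mlflow.log_metric("accuracy", accuracy(y_true, y_pred))
        # Stores the model in MLflow's scikit-learn format under the run's artifacts.
        mlflow.sklearn.log_model(model, "model")
        return run.info.run_id
```

Each call to `log_run` produces one run that can be compared with others in the MLflow UI, and the logged model artifact is what later gets packaged into a serving image.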
An MLflow experiment
MLflow – multiple experiments
MLflow packaging
Research and model flow
Research and model flow – at scale
Model serving with Kubeflow
Considerations and requirements
• Resilient
• Highly available
• Rate limiting
• Shadow deployments
• Auto-scaling (WIP)
Ambassador
http://www.getambassador.io
Throughput
Static number of replicas, determined after
• Constant and burst load testing with Locust
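One way to turn load-test results into a static replica count is a simple headroom calculation. The rule below is an assumption for illustration, not the sizing method used in the talk: it divides the peak request rate by the throughput one replica sustains, derated by a target utilization. The function name `replicas_needed` and all of the numbers are hypothetical.

```python
import math


def replicas_needed(peak_rps: float, per_replica_rps: float, target_utilization: float = 0.7) -> int:
    """Replicas required so that peak load keeps each replica at or below the target utilization."""
    if peak_rps < 0 or per_replica_rps <= 0 or not 0 < target_utilization <= 1:
        raise ValueError("invalid sizing inputs")
    return max(1, math.ceil(peak_rps / (per_replica_rps * target_utilization)))


# Hypothetical load-test results: bursts of 900 requests/s, 120 requests/s sustained per replica.
replica_count = replicas_needed(peak_rps=900, per_replica_rps=120)
```

With these example inputs each replica is budgeted for 84 requests/s, so 900 requests/s needs 11 replicas.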
DEMO
A demonstration of
• MLflow experiments
– Serving the chosen model
• Implementation of components
– Consumer pod
– Model pod
– Producer logic (to simulate real requests)
Choosing the run
Choosing the model
Implementing the model
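A model pod built with Seldon's Python s2i image wraps the model in a plain Python class. The sketch below is one plausible shape for it, assuming the wrapper convention of a `predict(X, features_names)` method and a model previously saved in MLflow's scikit-learn format. The class name `IrisClassifier`, the `model_path` value, and the `_StubModel` stand-in are hypothetical; the stub exists only so the wrapper can be exercised without a trained model.

```python
class IrisClassifier:
    """Model class in the shape the Seldon Python wrapper is assumed to expect."""

    def __init__(self, model=None, model_path="model"):
        if model is None:
            # Load a model saved in MLflow's scikit-learn format; the path is hypothetical.
            import mlflow.sklearn
            model = mlflow.sklearn.load_model(model_path)
        self.model = model

    def predict(self, X, features_names=None):
        # Return class probabilities for each row of X.
        return self.model.predict_proba(X)


class _StubModel:
    """Stand-in used only to exercise the wrapper without a trained model."""

    def predict_proba(self, X):
        return [[1.0, 0.0, 0.0] for _ in X]


wrapper = IrisClassifier(model=_StubModel())
probabilities = wrapper.predict([[5.1, 3.5, 1.4, 0.2]], ["sl", "sw", "pl", "pw"])
```

Because the class only loads and calls the packaged model, the research code does not need to be rewritten for serving.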
Implementing the consumer
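A consumer could be as small as a function that posts feature rows to the deployed model's REST endpoint. The sketch below assumes the Seldon-style JSON payload (`{"data": {"ndarray": ...}}`) and a predictions path routed through the gateway. The host name, the deployment name `sklearn-iris`, and the helper names `build_request` and `score` are hypothetical, and how the demo's consumer actually receives its input is not specified here.

```python
import json
import urllib.request

# Hypothetical gateway host and deployment name; adjust to the cluster.
PREDICT_URL = "http://ambassador.example.internal/seldon/sklearn-iris/api/v0.1/predictions"


def build_request(rows, url=PREDICT_URL):
    """Wrap feature rows in the Seldon-style 'ndarray' payload and build a POST request."""
    body = json.dumps({"data": {"ndarray": rows}}).encode("utf-8")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}, method="POST"
    )


def score(rows, url=PREDICT_URL, timeout=5.0):
    """Send the rows to the model endpoint and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(rows, url), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Building the request does not contact the network; only score() does.
request = build_request([[5.1, 3.5, 1.4, 0.2]])
```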
Implementing the producer
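Producer logic that simulates real requests can be approximated by generating synthetic feature rows. The sketch below assumes an iris-style model with four numeric features. The feature ranges are illustrative values, not taken from the demo, and the names `FEATURE_RANGES` and `make_rows` are hypothetical.

```python
import random

# Approximate per-feature ranges for iris-like measurements (illustrative values).
FEATURE_RANGES = {
    "sepal_length": (4.0, 8.0),
    "sepal_width": (2.0, 4.5),
    "petal_length": (1.0, 7.0),
    "petal_width": (0.1, 2.5),
}


def make_rows(n, seed=None):
    """Generate n synthetic feature rows, one value per feature, in a fixed column order."""
    rng = random.Random(seed)
    return [
        [round(rng.uniform(lo, hi), 1) for lo, hi in FEATURE_RANGES.values()]
        for _ in range(n)
    ]


# A fixed seed makes the simulated traffic reproducible between test runs.
rows = make_rows(3, seed=7)
```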
Deploy the model
• Define the YAML / JSON Seldon deployment
• Build the image
– s2i build -E environment_rest . seldonio/seldon-core-s2i-python3:0.6-SNAPSHOT sklearn-iris-mlflow:0.3
• Deploy
– kubectl create -f sklearn_iris_deployment.json -n kubeflow
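The JSON deployment definition can be generated rather than written by hand. The sketch below builds a single-model manifest in the general shape of a `SeldonDeployment` resource. The `apiVersion` value, the field layout, the deployment name `sklearn-iris`, and the container name `classifier` are assumptions to check against the installed Seldon Core version; only the image tag comes from the build step above.

```python
import json


def seldon_deployment(name, image, replicas=1):
    """Build a single-model SeldonDeployment manifest as a Python dict."""
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",  # assumed version
        "kind": "SeldonDeployment",
        "metadata": {"name": name},
        "spec": {
            "name": name,
            "predictors": [
                {
                    "name": "default",
                    "replicas": replicas,
                    "componentSpecs": [
                        {"spec": {"containers": [{"name": "classifier", "image": image}]}}
                    ],
                    # A graph with one MODEL node served over REST and no children.
                    "graph": {
                        "name": "classifier",
                        "type": "MODEL",
                        "endpoint": {"type": "REST"},
                        "children": [],
                    },
                }
            ],
        },
    }


manifest = seldon_deployment("sklearn-iris", "sklearn-iris-mlflow:0.3")
manifest_json = json.dumps(manifest, indent=2)
```

Writing `manifest_json` to a file such as `sklearn_iris_deployment.json` would give the input for the `kubectl create` step. The graph node name has to match the container name so the graph can locate its container.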
Grafana metrics

How to Utilize MLflow and Kubernetes to Build an Enterprise ML Platform