From the course: MLOps with Databricks
Custom models in MLflow
- [Instructor] In the previous video, we trained and logged the scikit-learn pipeline. MLflow supports the scikit-learn flavor, so we could log it using the mlflow.sklearn.log_model function. But what if the model you work on isn't supported? Or what if you want to tweak the default predict function in a way that works for you? Then wrapping the model using pyfunc is the way to go.

Let's go back to our hotel_cancellation_basic model that we trained earlier. When we run the predict function, it outputs an array containing 0 or 1. Let's say we want to output a dictionary instead, which looks like {"prediction": "canceled"} or {"prediction": "not canceled"}. To log the model in MLflow tracking, we need a class that inherits from the mlflow.pyfunc.PythonModel class. So I created a class called hotel_cancellation_wrapper; a sketch of such a wrapper follows after this transcript. It takes the trained pipeline as input and has a predict function that takes context as an argument. If you just want to use the predict function of the wrapped model without calling the load_model function, you have to pass None as context. Here I can show you how that works. You see it outputs {"prediction": "canceled"} here, exactly what we wanted it to be.

It's important to understand why the predict function must have context here. The parent class, mlflow.pyfunc.PythonModel, has a load_context function. When mlflow.pyfunc.load_model is called, the wrapper class gets loaded and load_context gets invoked, which handles the resolution of the context so that all the dependencies become available for the predict function.

Let's run the code that logs the model. We can see that the model is successfully logged and can now be loaded using the mlflow.pyfunc.load_model function. Let's take a look in the MLflow experiment tracking UI and see what gets stored when the model gets logged. Let's go to the model here. In the model folder we see multiple files: an MLmodel file, a conda.yaml file, a model pickle file, a requirements.txt file, and a Python environment file. MLflow uses Conda for environment management: if you use Databricks Model Serving, the conda.yaml file from the model folder will be used to build an environment for the model. It contains the model requirements and the Python version. In the MLmodel file, everything comes together: we can see the model signature, the environment and runtime used to log the model, and the model flavor definition. In the next video, I explain how to register the model we just logged.
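The wrapper class itself is not shown in the transcript, so here is a minimal sketch of what it could look like. The class name HotelCancellationWrapper, the dictionary key, and the label strings are assumptions based on the narration; only the mlflow.pyfunc.PythonModel base class and the predict(self, context, model_input) signature come from the MLflow API.

```python
import mlflow.pyfunc


class HotelCancellationWrapper(mlflow.pyfunc.PythonModel):
    """Wraps a trained pipeline so predict returns a dictionary
    instead of a raw array of 0s and 1s."""

    def __init__(self, pipeline):
        # The trained scikit-learn pipeline is passed in at construction time.
        self.pipeline = pipeline

    def predict(self, context, model_input):
        # context is supplied by MLflow when the model is loaded through
        # mlflow.pyfunc.load_model; pass None when calling predict directly.
        raw_predictions = self.pipeline.predict(model_input)
        return {
            "prediction": [
                "canceled" if p == 1 else "not canceled" for p in raw_predictions
            ]
        }
```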
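Calling the wrapper's predict directly, as demonstrated in the video, could look like this; pipeline and X_test are placeholders standing in for the trained pipeline and test features from the earlier videos.

```python
# No context object exists yet, so we pass None in its place.
wrapped_model = HotelCancellationWrapper(pipeline)
print(wrapped_model.predict(None, X_test))
# e.g. {'prediction': ['canceled', ...]}
```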
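To make the role of load_context concrete: when a wrapped model carries its own artifacts, load_context is where they get resolved. A hedged sketch, assuming a joblib-serialized pipeline was logged under the artifact key "pipeline" (that key and the joblib format are assumptions, not details from the video):

```python
import joblib
import mlflow.pyfunc


class WrapperWithArtifacts(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Invoked automatically by mlflow.pyfunc.load_model.
        # context.artifacts maps artifact names to local file paths, so
        # dependencies loaded here are available to predict afterwards.
        self.pipeline = joblib.load(context.artifacts["pipeline"])

    def predict(self, context, model_input):
        return self.pipeline.predict(model_input)
```

For this variant to work, the model would have to be logged with an artifacts mapping, e.g. artifacts={"pipeline": "<local path to the serialized pipeline>"} passed to mlflow.pyfunc.log_model.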
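The logging and loading step the instructor runs could look roughly like this. The artifact_path value and the signature variable are assumptions; the signature is presumed to have been inferred earlier with mlflow.models.infer_signature.

```python
import mlflow

with mlflow.start_run() as run:
    mlflow.pyfunc.log_model(
        artifact_path="hotel-cancellation-pyfunc",
        python_model=HotelCancellationWrapper(pipeline),
        signature=signature,  # assumed: inferred earlier with infer_signature
    )

# Load the wrapped model back through the pyfunc flavor.
model_uri = f"runs:/{run.info.run_id}/hotel-cancellation-pyfunc"
loaded_model = mlflow.pyfunc.load_model(model_uri)
loaded_model.predict(X_test)
```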