From the course: AWS Certified AI Practitioner (AIF-C01) Cert Prep

Introduction to MLOps

- The term MLOps, or machine learning operations, shares a number of principles with another term, DevOps, although not completely. Let's look through some of these similarities and differences. We'll start with version control, and this is going to be included for data, code, and models; we're going to take a look at this in a little bit more detail coming up in just a few minutes. These repositories should be isolated and independent from each other. Next, we have the concept of automation. Anything that doesn't explicitly require human oversight should at least be under consideration for automation so that you can improve repeatability and quality. Third is CI/CD, continuous integration and continuous deployment, and this is going to be expanded beyond what you would normally see in DevOps to include data as well as models. And then finally, we have an unfamiliar term, model governance. This includes evaluation and transparency in order to check for issues like fairness, bias, and ethics. There are some benefits from implementing MLOps. These include productivity: you can implement automated self-service for faster evaluation. You get reliability, and this is the same as what you'd see with DevOps: consistent, high-quality automation means less troubleshooting. Third, repeatability, and this is also because of automation: there can be an expectation of a finished product that falls within a certain range. Then we have auditability, which means that inputs and outputs are all versioned so that audits can be performed. And finally, data and model quality: by implementing guardrails and model validation using automation, this is something we can check on a frequent basis. Now, if we look at the ML lifecycle, there are a number of phases or components that would be considered part of an MLOps pipeline, including data pre-processing, feature engineering, training, tuning and evaluating models, model deployment, and model monitoring. And so we can break this up specifically as follows. We've got a data pipeline where we do our data preparation, and we have a specific repository that is version controlled for this data. Next, we build the model, followed by evaluating the model, and collectively these results are going to be placed in a separate repository just for the code. We follow that with model selection, and these three steps are now part of a build-and-test pipeline. Next, we have a deployment pipeline for the deployment of the model, but model selection and deployment are going to be captured by the model repository, which is independent from the data repository and the code repository. And finally, we get to the monitoring pipeline, and this is where we perform the ongoing monitoring and evaluation of the quality of the model.
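To make the four pipeline stages concrete, here is a minimal, local-only Python sketch (not from the course) that mirrors the flow described above: a data pipeline, a build-and-test pipeline with model selection, a deployment step, and a monitoring step. It uses scikit-learn and joblib purely as stand-ins; the dataset, candidate models, function names, and file paths are illustrative assumptions, and in an AWS environment each stage would typically be handled by managed workflow and registry services rather than local code.

# A minimal sketch of the MLOps stages described above; all names and paths are illustrative.
import json
import joblib
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def data_pipeline():
    """Data preparation: load, split, and scale (outputs would live in a versioned data repo)."""
    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    scaler = StandardScaler().fit(X_train)
    return scaler.transform(X_train), scaler.transform(X_test), y_train, y_test


def build_and_test_pipeline(X_train, X_test, y_train, y_test):
    """Build, evaluate, and select a model (this code would live in the code repo)."""
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
    }
    results = {}
    for name, model in candidates.items():
        model.fit(X_train, y_train)
        results[name] = accuracy_score(y_test, model.predict(X_test))
    best_name = max(results, key=results.get)  # model selection step
    return candidates[best_name], best_name, results


def deployment_pipeline(model, name):
    """Deployment: persist the selected model (a stand-in for a model repository/registry)."""
    artifact_path = f"{name}.joblib"
    joblib.dump(model, artifact_path)
    return artifact_path


def monitoring_pipeline(model, X_test, y_test):
    """Monitoring: re-check model quality (in production this would run on live traffic)."""
    return {"accuracy": accuracy_score(y_test, model.predict(X_test))}


if __name__ == "__main__":
    X_train, X_test, y_train, y_test = data_pipeline()
    model, name, results = build_and_test_pipeline(X_train, X_test, y_train, y_test)
    artifact = deployment_pipeline(model, name)
    report = {
        "selected_model": name,
        "candidate_scores": results,
        "artifact": artifact,
        "monitoring": monitoring_pipeline(model, X_test, y_test),
    }
    print(json.dumps(report, indent=2))

Running the script trains two candidate models, picks the better one by test accuracy, writes the chosen model to disk as the "deployment," and reports a monitoring metric, which keeps the separation between the data, code, and model artifacts visible even in a toy example.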
