MLOps explained
Outline
- What is MLOps
- Why do we need MLOps
- What are the benefits of MLOps
1. What is MLOps
- Model development
- Continuous integration + deployment (CI/CD)
- Monitoring
- Validation
- Governance
MLOps Goals
- Faster experimentation and model development
- Faster deployment of updated models into production
- Quality Assurance
2. Why do we need MLOps?
- Experimentation
- Track metrics
- Source control for the code
- Checkpoint steps in the ML lifecycle
- Automating proper validation / staged deployment
- Monitoring model performance and efficacy
- Automated retraining
ML pipeline for data scientist teams
- Data
- Train
- Validate
- Deploy
- Monitor
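The five stages above can be sketched as plain functions chained together. This is a minimal illustration, not a real pipeline framework: the toy dataset, the one-parameter model, the in-memory registry, and every function name here are assumptions made for the example.

```python
from statistics import mean

def load_data():
    # Data: toy (feature, label) pairs following y = 2x (illustrative only).
    return [(x, 2.0 * x) for x in range(10)]

def train(rows):
    # Train: fit y = w * x by least squares through the origin.
    num = sum(x * y for x, y in rows)
    den = sum(x * x for x, _ in rows)
    return {"w": num / den}

def validate(model, rows, max_mae=0.1):
    # Validate: mean absolute error on held-out rows gates deployment.
    mae = mean(abs(model["w"] * x - y) for x, y in rows)
    return mae <= max_mae, mae

def deploy(model, registry):
    # Deploy: here this only appends to an in-memory list (hypothetical registry).
    registry.append(model)
    return len(registry)  # version number of the deployed model

def monitor(model, live_rows):
    # Monitor: the same error metric, computed on data seen after deployment.
    return mean(abs(model["w"] * x - y) for x, y in live_rows)

def run_pipeline(registry):
    rows = load_data()
    train_rows, holdout = rows[:7], rows[7:]
    model = train(train_rows)
    ok, mae = validate(model, holdout)
    if not ok:
        return {"deployed": False, "mae": mae}
    version = deploy(model, registry)
    return {"deployed": True, "mae": mae, "version": version}

registry = []
result = run_pipeline(registry)
live_error = monitor(registry[-1], [(10, 20.0), (11, 22.0)])
```

The point of the sketch is the validation gate: a model that fails `validate` never reaches `deploy`, which is the kind of checkpoint the pipeline is meant to automate.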
Version the source data and its attributes
- Data preparation traditionally takes the most time: cleaning the data and getting it into shape.
- Data can come in different formats and from different sources.
- The better the quality of data, the better the quality and efficacy of the model.
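One simple way to version a dataset together with its attributes is to derive an identifier from its content. The sketch below assumes the data fits in memory as a list of dictionaries. The `dataset_version` helper and the sample schema are hypothetical, and dedicated data-versioning tools do considerably more.

```python
import hashlib
import json

def dataset_version(rows, schema):
    # Serialize deterministically so identical content always yields the same id.
    payload = json.dumps({"schema": schema, "rows": rows}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]

# Illustrative schema and rows (assumed for this example).
schema = {"age": "int", "country": "str"}
rows_v1 = [{"age": 31, "country": "DE"}, {"age": 45, "country": "FR"}]
rows_v2 = rows_v1 + [{"age": 28, "country": "ES"}]

v1 = dataset_version(rows_v1, schema)
v1_again = dataset_version(list(rows_v1), schema)  # same content, same id
v2 = dataset_version(rows_v2, schema)              # new row, new id
```

Storing this identifier next to each trained model makes it possible to say exactly which data a model was trained on.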
Build the model
- Feature selection / generation
- Algorithm selection
- Hyperparameter tuning
- Fitting the model
- etc.
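The tuning and fitting steps above, combined with metric tracking, can be sketched as a small grid search. Everything here is an assumption for illustration: the model is a single weight fitted by gradient descent, the grid values are arbitrary, and real projects would normally use a library for both training and search.

```python
from itertools import product

def fit(rows, lr, epochs):
    # Gradient descent for y = w * x, minimizing mean squared error.
    w = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in rows) / n
        w -= lr * grad
    return w

def mse(w, rows):
    return sum((w * x - y) ** 2 for x, y in rows) / len(rows)

# Toy data following y = 3x, split into training and validation rows.
train_rows = [(x, 3.0 * x) for x in (1, 2, 3, 4)]
valid_rows = [(x, 3.0 * x) for x in (5, 6)]

# Hypothetical hyperparameter grid.
grid = {"lr": [0.001, 0.01, 0.05], "epochs": [10, 100]}

# Record every run, not only the winner, so failed combinations stay visible.
results = []
for lr, epochs in product(grid["lr"], grid["epochs"]):
    w = fit(train_rows, lr, epochs)
    results.append(
        {"lr": lr, "epochs": epochs, "w": w, "val_mse": mse(w, valid_rows)}
    )

best = min(results, key=lambda r: r["val_mse"])
```

Keeping the full `results` list, rather than only `best`, is what lets the next round of experiments build on what did not work.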
Learn from mistakes
The data from failed experiments is just as crucial: it informs the next set of combinations to try.
Responsible ML
- Understand: interpretability, fairness
- Protect: differential privacy, confidential machine learning
- Control: audit trail, datasheets
Model Drift
The model was defined by the needs of a business case. If the business case changes, the model needs a rethink or retraining.
Data Drift
Occurs when the model was trained on the demographics of one set of users, but the population it is being used on no longer matches that demographic.
- Seasonality
- Consumer preferences
- New products
Detected drift automatically triggers model retraining, so that the new model can cater to the new requirements.
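A drift check that triggers retraining could look like the sketch below. It uses the population stability index (PSI) as the drift measure, which is one common choice and not something this outline prescribes. The bin edges, the 0.2 threshold (a widely quoted rule of thumb), and the function names are all assumptions.

```python
import math

def histogram(values, edges):
    # Fraction of values per bin [edges[i], edges[i+1]); the last bin also
    # includes its right edge. Values outside the edges are not counted.
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (last and v == edges[-1]):
                counts[i] += 1
                break
    total = len(values)
    # A small floor avoids log(0) for empty bins.
    return [max(c / total, 1e-6) for c in counts]

def psi(expected, actual, edges):
    # Population stability index between training-time and live distributions.
    e = histogram(expected, edges)
    a = histogram(actual, edges)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

def should_retrain(expected, actual, edges, threshold=0.2):
    # Threshold is an assumed rule of thumb; tune it per use case.
    return psi(expected, actual, edges) > threshold

edges = [20, 30, 40, 50, 60]
training_ages = list(range(20, 60))      # ages seen at training time
live_same = list(range(20, 60))          # same demographic in production
live_shifted = list(range(40, 60)) * 2   # production users are all 40+

retrain_same = should_retrain(training_ages, live_same, edges)
retrain_shifted = should_retrain(training_ages, live_shifted, edges)
```

In an automated setup, a scheduled job would run such a check on recent production data and start the training pipeline whenever it returns true.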
3. Quality Assurance
- Create reproducible ML pipelines
- Enable reusable ML environments
- Register, package, and deploy models
- Capture governance data
- Generate alerts
- Monitor ML applications
- Automation
What can MLOps provide?
- Scalability and management
- Reusability and reproducibility
- Effortless CI/CD
- Maintain model health
- Advocate responsible AI practices
Best practices
- Create models with reusable ML pipelines
- Automation is key for robust MLOps
- Monitor performance
- Monitor data drift and utilize the insights to retrain the model
- Enable automatic audit trail creation for all artifacts
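As an illustration of the last practice, the sketch below records each artifact in an append-only trail where every entry carries a hash of the previous one, so later edits are detectable. Hash chaining is one possible design chosen for this example, not a requirement of MLOps tooling, and the `record` and `verify` helpers and the artifact names are hypothetical.

```python
import hashlib
import json

def _digest(body):
    # Deterministic hash of an entry body.
    return hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")
    ).hexdigest()

def record(trail, artifact_type, name, content):
    # Each entry stores a hash of the artifact and of the previous entry.
    prev = trail[-1]["entry_hash"] if trail else ""
    entry = {
        "type": artifact_type,
        "name": name,
        "content_hash": hashlib.sha256(content).hexdigest(),
        "prev_hash": prev,
    }
    entry["entry_hash"] = _digest(entry)  # computed before the key is added
    trail.append(entry)
    return entry

def verify(trail):
    # Recompute every hash and check that the chain links are intact.
    prev = ""
    for entry in trail:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev or entry["entry_hash"] != _digest(body):
            return False
        prev = entry["entry_hash"]
    return True

# Illustrative artifacts (names and contents are made up).
trail = []
record(trail, "dataset", "users.csv", b"age,country\n31,DE\n")
record(trail, "model", "churn-v1", b"weights: [0.4, 1.2]")
trail_ok = verify(trail)
```

Calling `record` from inside the pipeline steps, rather than by hand, is what makes the trail automatic.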