ML Ops Framework Setup: 4-Week Implementation

Tredence Inc

Automated ML model management (MLOps) to generate higher ROI on data science investments and increase business users' confidence in analytical insights.

Objective: Set up an MLOps practice that tracks 2 ML models, and give clients a clear MLOps process for onboarding newer ML models with model performance tracking.

Key Challenges Addressed:

  1. Data and model drift causes models to become ineffective over time
  2. No centralized way to measure model performance
  3. Model performance is hard to improve because predictions lack explanations

Outcome:

  1. A centralized model monitoring system that measures the following:
     a. Model drift, including overall drift, drift trend, and feature-wise drift
     b. Explainable AI, providing global feature importance and local explanations
     c. Persona-based insights for business users and data engineers (top 5 KPIs)
  2. A visual provenance graph to track model execution.
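
The offer does not state which drift metric the monitoring system uses. As one illustration of how overall and feature-wise drift could be computed, the sketch below uses the Population Stability Index (PSI), a common choice for numeric features. The function names `psi` and `drift_report`, the bin count, and the use of a simple mean for overall drift are all assumptions made for this example, not part of the offer.

```python
import math


def psi(baseline, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Bins are equal-width over the baseline range (an assumption of this
    sketch); current values outside that range fall into the edge bins.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # guard against a constant baseline

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / len(values), eps) for c in counts]

    p, q = proportions(baseline), proportions(current)
    return sum((qi - pi) * math.log(qi / pi) for pi, qi in zip(p, q))


def drift_report(baseline_rows, current_rows, features):
    """Feature-wise PSI plus an overall score (here, the plain mean)."""
    per_feature = {
        f: psi([r[f] for r in baseline_rows], [r[f] for r in current_rows])
        for f in features
    }
    overall = sum(per_feature.values()) / len(per_feature)
    return {"feature_drift": per_feature, "overall_drift": overall}
```

Calling `drift_report` once per scoring period and storing the results in time order would give a drift trend. Any alerting threshold on the PSI value is a convention to be agreed per model rather than something fixed by the metric.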

Implementation Plan:

The implementation plan breaks down as follows:

  1. Week 1: Discovery, to understand the business, the ML models, the data sources, and the downstream applications.
  2. Week 2: Integration of model pipelines and drift calculation for two models.
  3. Weeks 3-4: Model drift calculation for two models, along with Explainable AI and activation of the visual provenance graph. Centralized model monitoring, documentation, and an MLOps roadmap for the future.
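
The plan does not say which explainability technique is used. As a hedged illustration of what a global feature importance can mean, the sketch below implements permutation importance: shuffle one feature at a time and measure how much a quality metric drops. The `predict` callable, the row-as-dict layout, and the `accuracy` helper are hypothetical choices made for this example.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the targets."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, rows, targets, features, metric, seed=0):
    """Drop in `metric` when each feature's values are shuffled across rows.

    `predict` maps one row (a dict) to a prediction. A larger drop means
    the model relies more heavily on that feature.
    """
    rng = random.Random(seed)
    base_score = metric(targets, [predict(r) for r in rows])
    importance = {}
    for f in features:
        shuffled = [r[f] for r in rows]
        rng.shuffle(shuffled)
        # Copy each row with only feature f replaced by a shuffled value.
        permuted = [{**r, f: v} for r, v in zip(rows, shuffled)]
        score = metric(targets, [predict(r) for r in permuted])
        importance[f] = base_score - score
    return importance
```

A feature the model never reads gets an importance of exactly zero under this scheme, which makes it a convenient sanity check. Local explanations (why one specific prediction was made) need a different technique and are not covered by this sketch.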

This implementation uses the following native Azure components:

  1. Azure Git: Allows controlled changes to the repository and coordination among many people without accidentally overwriting or corrupting files.
  2. App Services: The monitoring web app and the Python backend code are hosted on Azure Linux App Services. Both apps can be scaled automatically or manually on demand.
  3. Microsoft Azure Data Factory: Used to fetch status information for the Data Factory pipelines being tracked.
  4. Databricks Workspace: The MLflow component of Databricks is used to fetch the data stored by notebooks during execution.
  5. Cosmos DB: A NoSQL store whose flexible schema accommodates the changing nature of the data and evolving requirements.
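
To make the schema-flexibility point concrete, the sketch below builds monitoring records as plain JSON-serializable documents, so a pipeline-status record and a drift record can carry different fields in the same store. Every field name here (`modelName`, `runId`, `recordedAt`, `metrics`) and the `build_monitoring_record` helper are hypothetical; the only Cosmos DB requirement assumed is that each item has a string `id`. The actual write to Cosmos DB is left out.

```python
import json
import uuid
from datetime import datetime, timezone


def build_monitoring_record(model_name, run_id, metrics, extra=None):
    """Assemble one monitoring document; `extra` adds record-specific fields."""
    record = {
        "id": str(uuid.uuid4()),  # Cosmos DB items need a string "id"
        "modelName": model_name,  # hypothetical partition key
        "runId": run_id,
        "recordedAt": datetime.now(timezone.utc).isoformat(),
        "metrics": metrics,
    }
    record.update(extra or {})
    return record


# Two records with different shapes, both valid JSON documents.
drift_record = build_monitoring_record(
    "model-a", "run-001", {"overall_drift": 0.12}
)
status_record = build_monitoring_record(
    "model-a", "run-001", {}, extra={"pipelineStatus": "Succeeded"}
)
payload = json.dumps([drift_record, status_record])
```

Because neither record has to match a fixed table schema, new metrics or status fields can be added later without migrating the records already stored.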