A Guide to drift tracking

Multivariate drift detection helps in detecting changes or drifts in multiple variables or features together. Unlike univariate drift detection, which only focuses on detecting changes in a single variable, multivariate drift detection considers the relationship between multiple features and does not assume all features are independent of each other.

Thus, multivariate drift detection methods can detect changes in the distribution of the data, changes in the relationship between variables, and changes in the functional relationship between variables. These methods are particularly useful in complex systems where changes in one variable can have a significant impact on the behavior of other variables.
So multivariate drift helps users get a better understanding of the changes in the inference data. It is also easier to monitor as only one metric needs to be tracked compared to tracking every feature separately. But at the same time, it is computationally heavy to calculate and might be overkill for simpler systems.

Multivariate drift detection algorithms usually depend on a machine learning model to calculate the drift. So these algorithms can be classified as follows:

Using Supervised Methods:
These typically rely on training a binary-classifier model to guess whether a data point is from the baseline data frame. A higher value of the accuracy of the model indicates higher drift.

To find out which features have drifted, the feature importance of this binary classification model is used.

Unsupervised Learning Methods:
Here are a few methods:
Clustering: Use K-means, DBSCAN, or any-other clustering algorithm to find clusters in the reference dataset and current dataset and then find differences between the clusters to judge if the data has drifted or not.

Gaussian Mixture Models(GMM): GMM represents our data as a mixture of Gaussian distributions. GMM can be used to detect multivariate drift by comparing the parameters of the Gaussian distributions of the current dataset with the reference dataset.

Principal Component Analysis (PCA): Use PCA to reduce the dimensions of the dataset and then use regular univariate drift detection algorithms considering the features to be unique.

A Guide to drift tracking

What is Model drift?

Different types of Model drift:

Different Methods of Drift Monitoring

Statistical Methods

Model Level Drift (Multivariate drift detection)

Conclusion:

Subscribe to our newsletter

SSH Server Containers For Development on Kubernetes

Prompting, RAG or Fine-tuning - the right choice?

Large Language Models for Commercial Use

Adding OAuth2 to Jupyter Notebooks on Kubernetes

Blazingly fast way to build, track and deploy your models!

Product

Resources

Company

Goodreads

A Guide to drift tracking

What is Model drift?

Different types of Model drift:

Different Methods of Drift Monitoring

Statistical Methods

Model Level Drift (Multivariate drift detection)

Conclusion:

Subscribe to our Newsletter

Subscribe to our newsletter

Discover More

SSH Server Containers For Development on Kubernetes

Prompting, RAG or Fine-tuning - the right choice?

Large Language Models for Commercial Use

Adding OAuth2 to Jupyter Notebooks on Kubernetes

Related Blogs

Blazingly fast way to build, track and deploy your models!

Product

Resources

Company

Goodreads

Subscribe to our newsletter