Drift Detection

The Drift Detection Test lets you identify shifts between your training dataset and a field/test dataset.

By setting a threshold on the distance metric, you can pinpoint out-of-distribution data points.

Execute Test:

The snippet below sets up and runs a Drift Detection Test, comparing a baseline (training) dataset against a more recent field dataset to identify drift in the data.

It first defines the drift detection rules and then configures the test with them. Each step is described in the sections that follow.

# Define the drift rule: Mahalanobis distance, all classes, threshold of 2.
rules = DriftDetectionRules()
rules.add(type="drift_detection", dist_metric="Mahalanobis", _class="ALL", threshold=2)

# Configure the test. test_session is assumed to have been created earlier for your project.
edge_case_detection = data_drift_detection(test_session=test_session,
                                           test_name="Drift-Detection-Test",
                                           train_dataset_name="bdd_train_dataset",
                                           field_dataset_name="bdd_field_dataset",
                                           train_embed_col_name="ImageVectorsM1",
                                           field_embed_col_name="ImageVectorsM1",
                                           level="image",
                                           rules=rules)

# Register the test with the session, then run every registered test.
test_session.add(edge_case_detection)

test_session.run()

Rules

The first step is to establish the criteria for detecting drift in your datasets.

DriftDetectionRules(): Initialises the rules for drift detection.
- rules.add(): Adds a new rule for detecting data drift:
  - type: The type of detection to run, "drift_detection" in this case.
  - dist_metric: The distance metric to use for detection. "Mahalanobis" measures the distance between a point and a distribution.
  - _class: Specifies the class(es) these metrics apply to. "ALL" means all classes in the dataset.
  - threshold: The value above which the distance metric indicates drift.
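
To make the dist_metric and threshold parameters concrete, the following is a minimal sketch of how a Mahalanobis distance can be computed between field embeddings and a training distribution. It is not the RagaAI implementation: the function name mahalanobis_distances, the use of NumPy, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def mahalanobis_distances(train: np.ndarray, field: np.ndarray) -> np.ndarray:
    """Return the Mahalanobis distance of each field row from the training distribution.

    Hypothetical helper for illustration only; not part of the RagaAI API.
    """
    mean = train.mean(axis=0)
    cov = np.cov(train, rowvar=False)
    # The pseudo-inverse keeps the sketch usable if the covariance is singular.
    cov_inv = np.linalg.pinv(cov)
    diff = field - mean
    # Squared distance diff^T * cov_inv * diff, evaluated for every row of `field`.
    squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    return np.sqrt(np.maximum(squared, 0.0))


# Synthetic embeddings: 500 training points, then 5 similar and 5 shifted field points.
rng = np.random.default_rng(0)
train = rng.normal(size=(500, 4))
field = np.vstack([
    rng.normal(size=(5, 4)),           # drawn from the training distribution
    rng.normal(loc=6.0, size=(5, 4)),  # shifted away from the training distribution
])

distances = mahalanobis_distances(train, field)
print(distances.round(2))
```

In this sketch the shifted points produce much larger distances than the others, and it is that gap which a threshold is meant to separate.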

Initialise Drift Detection Test

data_drift_detection(): Prepares the drift detection test with the following parameters:
- test_session: The session object linked to your project.
- test_name: A descriptive name for this test.
- train_dataset_name: The name of the baseline or training dataset.
- field_dataset_name: The name of the new or field dataset to compare against the baseline.
- train_embed_col_name: The column (schema mapping) in the training dataset that contains embeddings.
- field_embed_col_name: The column (schema mapping) in the field dataset that contains embeddings.
- level: The level at which to detect drift; "image" means image-level detection.
- rules: The previously defined rules for the drift detection test.

test_session.add(): Registers the drift detection test within the session.
test_session.run(): Executes all tests configured in the session, including your drift detection test.

By completing these steps, you've initiated a Drift Detection Test on the RagaAI Testing Platform to analyse your datasets for any significant changes in data distribution.

Analysing Test Results

Interpreting the Results

In Distribution Data Points: Identified as "in distribution" if they fall within the set threshold, signifying alignment with the training data.
Out of Distribution Data Points: Labelled as "out of distribution" if they exceed the threshold, suggesting potential drift requiring close examination.
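
The in/out-of-distribution split described above can be sketched as a simple comparison against the threshold. This is an illustrative example rather than the platform's own logic: split_by_threshold is a hypothetical helper, the distances are made-up values, and treating a distance exactly equal to the threshold as "in distribution" is an assumption.

```python
from typing import List, Sequence, Tuple


def split_by_threshold(
    distances: Sequence[float], threshold: float
) -> Tuple[List[int], List[int]]:
    """Return (in_distribution, out_of_distribution) index lists.

    Hypothetical helper for illustration only; not part of the RagaAI API.
    """
    # Assumption: a distance equal to the threshold counts as in distribution.
    in_dist = [i for i, d in enumerate(distances) if d <= threshold]
    out_dist = [i for i, d in enumerate(distances) if d > threshold]
    return in_dist, out_dist


# Made-up distances for five field data points, checked against threshold=2.
distances = [0.5, 1.9, 2.0, 2.1, 5.0]
in_dist, out_dist = split_by_threshold(distances, threshold=2)
print("in distribution:", in_dist)
print("out of distribution:", out_dist)
```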

Interactive Embedding View

Visualisation: Use the interactive embedding view to visualise and comprehend the drift between datasets.
Data Selection: Apply the lasso tool within the embedding view to select and scrutinise data points of interest.

Visualising and Assessing Data

Data Grid View: Displays images from the field dataset, grouped into out-of-distribution and in-distribution data points, alongside the training dataset.
Image View: Delve into detailed analyses of mistake scores for each label, with interactive annotation rendering and original image viewing.

Image View

Information Card: Provides the name of the data point.

By adhering to these guidelines, you can effectively utilise Drift Detection in RagaAI to maintain the integrity and relevance of your models over time.
