Object Detection

This page provides examples of how RagaAI's Testing Platform can add value to teams building object detection models. It is a companion piece to the Product Demo available on the RagaAI Platform.

The Object Detection Project on the sample workspace is an example of how the RagaAI Testing Platform can help with the following tasks -

Data Quality Checks before training a new model
Model Quality Checks to identify performance gaps and perform regression analysis

The RagaAI Testing Platform is designed to add science to the art of detecting AI issues, performing root cause analysis, and providing actionable recommendations. This is done through an automated suite of tests on the platform.

An overview of all tests for the sample project is available here -

1. Data Drift Detection

Goal - Identify scenarios in the field data that are drastically different from the training dataset, i.e. out-of-distribution (OOD). The AI model is prone to generating erroneous predictions on such datapoints.

Methodology - RagaAI automatically detects OOD datapoints using the embeddings from the RagaAI DNA technology.

Insight - In this case, the platform correctly identifies data drift for nighttime scenarios, given that the model has only been trained on daytime scenarios.

Impact - This automated test helps users assess whether the data in the production setting has shifted and the model needs to be retrained.
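
As a rough illustration of embedding-based drift detection, the sketch below flags field datapoints whose embeddings lie unusually far from the centroid of the training embeddings. It is not the RagaAI implementation: the function `flag_drift`, the centroid-distance score, the 0.99 quantile threshold, and the synthetic "daytime" and "nighttime" embeddings are all assumptions made for this example.

```python
import math
import random


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def flag_drift(train_emb, field_emb, quantile=0.99):
    """Return one boolean per field embedding: True if it lies farther from
    the training centroid than roughly the given quantile of training distances."""
    center = centroid(train_emb)
    train_dist = sorted(math.dist(v, center) for v in train_emb)
    # Pick the training distance at the requested quantile as the cutoff.
    cutoff_index = min(int(quantile * len(train_dist)), len(train_dist) - 1)
    threshold = train_dist[cutoff_index]
    return [math.dist(v, center) > threshold for v in field_emb]


# Synthetic stand-ins for real embeddings (assumption for this example).
rng = random.Random(0)


def sample(mean, n, dim=8):
    return [[rng.gauss(mean, 1.0) for _ in range(dim)] for _ in range(n)]


train = sample(0.0, 500)                       # "daytime" training data
field = sample(0.0, 100) + sample(6.0, 100)    # daytime-like, then shifted "nighttime"
flags = flag_drift(train, field)

print(sum(flags[:100]), "of 100 daytime-like field points flagged")
print(sum(flags[100:]), "of 100 nighttime-like field points flagged")
```

A centroid distance is only a reasonable score when the training embeddings form a single compact cluster; for multi-modal data a nearest-neighbour or density-based score would be a more natural choice.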

For more details, please refer to the detailed data drift documentation.

2. Failure Mode Analysis

Goal - Identify scenarios where the model performs poorly on the test dataset post training/re-training.

Methodology - RagaAI automatically detects scenarios within the dataset and brings any model vulnerabilities on such scenarios to the fore.

Insight - In this case, we see that the model performs poorly on nighttime scenarios even though the aggregate performance is above the threshold.

Impact - This test helps users identify 90% of the vulnerabilities within a model's Operational Design Domain (ODD) early in the model development lifecycle.
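
The gap between aggregate and per-scenario performance described above can be shown with a small sketch. It assumes that each test image already carries a scenario label and a per-image quality score between 0 and 1; both are hypothetical inputs for this example (on the platform, scenarios are detected automatically), and `failing_scenarios` is an illustrative helper, not a RagaAI API.

```python
from collections import defaultdict
from statistics import mean


def failing_scenarios(records, threshold):
    """records: iterable of (scenario, score) pairs.

    Returns the overall mean score, the mean score per scenario, and the
    sorted names of scenarios whose mean score falls below the threshold.
    """
    records = list(records)
    by_scenario = defaultdict(list)
    for scenario, score in records:
        by_scenario[scenario].append(score)

    per_scenario = {name: mean(scores) for name, scores in by_scenario.items()}
    overall = mean(score for _, score in records)
    failing = sorted(name for name, value in per_scenario.items() if value < threshold)
    return overall, per_scenario, failing


# Hypothetical test set: mostly daytime images, a small nighttime slice.
records = [("day", 0.9)] * 90 + [("night", 0.3)] * 10
overall, per_scenario, failing = failing_scenarios(records, threshold=0.7)

print("overall:", round(overall, 2))
print("per scenario:", per_scenario)
print("below threshold:", failing)
```

Because the nighttime slice is only 10% of the data, its low score barely moves the overall mean, which is why a per-scenario breakdown is needed to surface it.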

For more details, please refer to the detailed failure mode analysis documentation.

3. Outlier Detection

Goal: Find outliers in the training data. Training the model on such datapoints can make it hard to achieve high performance.

Methodology: RagaAI automatically discovers outliers using embeddings obtained from the RagaAI DNA technology. The technique identifies datapoints that differ significantly from the norm established during model training.

Insight: In this case, the system correctly identifies scenarios in the field data that differ significantly from the training dataset.

Impact: The automated outlier detection test helps users monitor changes in data distribution within a production environment. In this case, the presence of outliers in nighttime settings indicates data drift and the need to retrain the model to retain performance across a variety of conditions. This proactive approach helps ensure the model's accuracy and reliability over time.
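
One common way to find outliers in an embedding space is to score each datapoint by its average distance to its nearest neighbours. The sketch below takes that approach as an assumption; it does not reproduce the RagaAI technique. The helper names, the choice of `k=5`, and the median-plus-MAD cutoff are illustrative, and the embeddings are synthetic.

```python
import math
import random
from statistics import median


def knn_scores(embeddings, k=5):
    """Mean Euclidean distance from each point to its k nearest neighbours."""
    scores = []
    for i, v in enumerate(embeddings):
        dists = sorted(math.dist(v, w) for j, w in enumerate(embeddings) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def find_outliers(embeddings, k=5, factor=5.0):
    """Indices whose k-NN score exceeds median + factor * MAD of all scores."""
    scores = knn_scores(embeddings, k)
    med = median(scores)
    mad = median(abs(s - med) for s in scores)  # median absolute deviation
    cutoff = med + factor * mad
    return [i for i, s in enumerate(scores) if s > cutoff]


# Synthetic embeddings: one dense cluster plus three injected far-away points.
rng = random.Random(1)
inliers = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(200)]
injected = [[10.0] * 4, [-10.0] * 4, [10.0, -10.0, 10.0, -10.0]]
outliers = find_outliers(inliers + injected)

print("flagged indices:", outliers)
```

The brute-force neighbour search is quadratic in the number of datapoints, so it only suits small examples; a few points on the fringe of the dense cluster may also be flagged, depending on `factor`.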

For more details, please refer to the detailed outlier detection documentation.

4. Data Leakage Test

Goal: Detect instances of data leakage (non-independence) between the training and test datasets. If there is leakage, the model can artificially show very high performance on the test dataset.

Methodology: RagaAI automatically identifies potential data leakage by analyzing embeddings generated from the RagaAI DNA technology.

Insight: In this case, the system identifies datapoints in the test dataset that have been leaked from the training dataset.

Impact: The detection of leaked data in the test set underscores the importance of addressing potential sources of data contamination. This proactive strategy ensures that the model remains robust and can reliably generalize to new, unseen data without being influenced by leaked information.
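
A simple way to look for leakage with embeddings is to flag test datapoints that are near-duplicates of a training datapoint. The sketch below does this with cosine similarity; the `find_leaks` helper, the 0.99 similarity threshold, and the toy three-dimensional embeddings are assumptions for illustration and do not describe how RagaAI performs the check.

```python
import math


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_leaks(train_emb, test_emb, threshold=0.99):
    """Return (test_index, train_index, similarity) for every test embedding
    whose most similar training embedding meets the threshold."""
    leaks = []
    for t_idx, t in enumerate(test_emb):
        best_idx, best_sim = max(
            ((i, cosine(t, v)) for i, v in enumerate(train_emb)),
            key=lambda pair: pair[1],
        )
        if best_sim >= threshold:
            leaks.append((t_idx, best_idx, best_sim))
    return leaks


# Toy embeddings: the second test vector is a near-copy of the third training vector.
train = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
test = [[0.0, 0.0, 1.0], [0.6, 0.8, 0.001], [1.0, 1.0, 1.0]]
leaks = find_leaks(train, test)

for test_idx, train_idx, sim in leaks:
    print(f"test[{test_idx}] looks like train[{train_idx}] (similarity {sim:.4f})")
```

The threshold controls the trade-off between missing lightly augmented copies and flagging images that are merely similar, so it would need tuning against a labelled sample of known duplicates.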

For more details, please refer to the detailed data leakage documentation.
