Event Detection

This page provides examples of how RagaAI's Testing Platform can add value to teams building end-to-end model pipelines. It is a companion piece to the Product Demo available on the RagaAI Platform.

The Event Detection Project on the sample workspace is an example of how the RagaAI Testing Platform can help with the following tasks -

End-to-end pipeline level tests beyond AI models
A/B Tests for measuring which model performs better in the real world
Model Quality Checks to identify performance gaps and perform regression analysis

The RagaAI Testing Platform is designed to add science to the art of detecting AI issues, performing root cause analysis and providing actionable recommendations. This is done as an automated suite of tests on the platform.

An overview of all tests for the sample project is available here -

1. A/B Test

Goal - A/B testing examines differences in a machine learning pipeline to determine their influence on performance, which informs model development and deployment decisions.

Methodology - RagaAI compares arbitrarily complex pipelines against each other. This can be run as a regression test, in which the pipelines are evaluated to check whether performance has improved in some scenarios without regressing in others.

Insight - The platform makes it easy to find the scenarios where the pipelines diverge the most. Here, for example, under snowy conditions there is a high difference percentage between the stop sign detections of Pipeline A and Pipeline B.

Impact - This automated test helps users assess the performance of the pipelines in production and choose the one that is best suited for deployment.
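The per-scenario comparison described above can be illustrated with a minimal sketch in plain Python. It does not use the RagaAI API. The record layout, the function name `difference_by_scenario`, the toy data, and the definition of "difference percentage" as the share of samples on which the two pipelines disagree are all assumptions made for illustration.

```python
from collections import defaultdict


def difference_by_scenario(records):
    """Return the percentage of samples per scenario where A and B disagree.

    `records` is an iterable of (scenario, detected_by_a, detected_by_b)
    tuples. This layout is a hypothetical one chosen for the example.
    """
    totals = defaultdict(int)
    disagreements = defaultdict(int)
    for scenario, detected_by_a, detected_by_b in records:
        totals[scenario] += 1
        if detected_by_a != detected_by_b:
            disagreements[scenario] += 1
    return {s: 100.0 * disagreements[s] / totals[s] for s in totals}


# Hypothetical toy data: stop sign detections from two pipelines.
records = [
    ("clear", True, True),
    ("clear", True, True),
    ("clear", False, False),
    ("clear", True, False),
    ("snowy", True, False),
    ("snowy", False, True),
    ("snowy", True, False),
    ("snowy", True, True),
]

diff = difference_by_scenario(records)
# Scenario with the largest disagreement between the two pipelines.
most_divergent = max(diff, key=diff.get)
print(diff, most_divergent)
```

With this toy data the pipelines disagree far more often in the "snowy" scenario than in "clear", which is the kind of divergence the Insight above describes.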

For more details, please refer to the detailed A/B Test Documentation.

2. Failure Mode Analysis

Goal - Identify scenarios where the model performs poorly on the test dataset post training/re-training.

Methodology - RagaAI automatically detects scenarios within the dataset and brings any model vulnerabilities in those scenarios to the fore.

Insight - In this case, the model performs poorly in snowy conditions even though the aggregate performance is above the threshold.

Impact - This test helps users identify 90% of the vulnerabilities within a model's Operational Design Domain (ODD) early in the model development lifecycle.
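The idea that a model can pass an aggregate threshold while failing in one scenario can be shown with a small sketch in plain Python. It does not use the RagaAI API, and the platform detects scenarios automatically, whereas here the scenario labels are given. The function name `failure_modes`, the record layout, the accuracy metric, the threshold value, and the toy data are all assumptions made for illustration.

```python
from collections import defaultdict


def failure_modes(records, threshold):
    """Return (aggregate accuracy, {scenario: accuracy} for failing scenarios).

    `records` is an iterable of (scenario, correct) pairs, where `correct`
    is True when the model's prediction matched the ground truth.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for scenario, correct in records:
        totals[scenario] += 1
        hits[scenario] += int(correct)
    aggregate = sum(hits.values()) / sum(totals.values())
    per_scenario = {s: hits[s] / totals[s] for s in totals}
    failing = {s: acc for s, acc in per_scenario.items() if acc < threshold}
    return aggregate, failing


# Hypothetical toy data: 20 test samples across three scenarios.
records = (
    [("clear", True)] * 9 + [("clear", False)] * 1
    + [("rainy", True)] * 4 + [("rainy", False)] * 1
    + [("snowy", True)] * 2 + [("snowy", False)] * 3
)

aggregate, failing = failure_modes(records, threshold=0.7)
print(aggregate, failing)
```

Here the aggregate accuracy clears the assumed 0.7 threshold, yet the "snowy" scenario falls well below it, so only a per-scenario breakdown exposes the weakness.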

For more details, please refer to the detailed Failure Mode Analysis documentation.


