# Executing Evaluations

### Via UI:

<figure><img src="/files/RMR7DqYKKqZmcLQogRMG" alt=""><figcaption><p>Evaluation </p></figcaption></figure>

#### 1. Adding an Evaluation

* Navigate to your dataset and click the **"Evaluate"** button to begin configuring your evaluation.

#### 2. Selecting a Metric

* Choose the **metric** you want to run on the dataset from the available options.

#### 3. Naming the Metric

* Enter a **unique metric name** to identify this evaluation. The name is used for the metric's column, so a distinct name makes the results easier to track.

  <figure><img src="/files/JAW0DdKa1eW73bykqq55" alt=""><figcaption></figcaption></figure>

#### 4. Configuring the Parameters

* Choose the **model** to run the evaluation with. You can select one of the models pre-configured on the platform or use a custom gateway (described [here](https://docs.raga.ai/ragaai-catalyst-1/concepts/enable-custom-gateway)).
* If you have configured your own gateway, a "custom\_gateway" option appears in the model selection dropdown; select it to run the evaluation through your gateway.
* Map the selected metric to the appropriate **column names** in your dataset.
  * Check that each metric is aligned with the corresponding data columns so that the evaluation is accurate.

<figure><img src="/files/ZU6CE8pCKEAXL4XZ53zh" alt=""><figcaption></figcaption></figure>

#### 5. Threshold

* Configure the passing criteria for each metric to define which datapoints pass and which fail. After a metric has been calculated, you can re-configure its threshold from the UI using the ⚙️ icon beside the metric column name.
* Click **"Update Threshold"** to apply the change.

<figure><img src="/files/7t34hYdvjnmSIRrAeWSt" alt=""><figcaption></figcaption></figure>
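To illustrate what a threshold does, the sketch below splits a list of metric scores into passed and failed datapoints. It is not Catalyst code: the function name `split_by_threshold`, the sample scores, and the rule "a score at or above the threshold passes" are assumptions made for illustration, and the actual comparison depends on the metric and on how you configure it in the UI.

```python
def split_by_threshold(scores, threshold, higher_is_better=True):
    """Partition row indices into (passed, failed) for one metric column.

    Hypothetical helper: the comparison direction is an assumption.
    """
    passed, failed = [], []
    for index, score in enumerate(scores):
        ok = score >= threshold if higher_is_better else score <= threshold
        (passed if ok else failed).append(index)
    return passed, failed


# Made-up scores for four datapoints
scores = [0.91, 0.42, 0.75, 0.60]
passed, failed = split_by_threshold(scores, threshold=0.7)
print(passed, failed)  # [0, 2] [1, 3]
```

Changing the threshold only changes which rows land in each group; the underlying scores stay the same.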

#### 6. Applying Filters (Optional)

* Apply **filters** to narrow down the data points included in the evaluation.
  * This is useful if you want to evaluate a specific subset of your dataset.

#### 7. Saving the Configuration

* Once the metric, model, and filters are configured, click **"Save"** to save your evaluation setup.

#### 8. Configuring Multiple Evaluations

* Repeat the steps above to **configure multiple evaluations** if needed. This allows you to run several evaluations on the dataset simultaneously.

#### 9. Running the Evaluations

* Once all evaluations are set up, click **"Evaluate"** to execute the evaluations for all the configured metrics.

### Via SDK:

You can also run metrics using the following commands:

```python
from ragaai_catalyst import Evaluation

evaluation = Evaluation(
    project_name="your-project-name",
    dataset_name="your-dataset-name",
)

# List available metrics
evaluation.list_metrics()

# Define the schema mapping for all metrics to be run
# (dataset column name -> Catalyst schema variable)
schema_mapping = {
    "Query": "prompt",
    "Response": "response",
    "Context": "context",
    "ExpectedResponse": "expected_response",
}

# List the metrics to be run
metrics = [
    {"name": "Hallucination", "config": {"model": "gemini-1.5", "provider": "gemini"}, "column_name": "Hallucination_v1", "schema_mapping": schema_mapping},
    {"name": "Response Correctness", "config": {"model": "gpt-4o-mini", "provider": "openai"}, "column_name": "Response_Correctness_v1", "schema_mapping": schema_mapping},
    {"name": "Toxicity", "config": {"model": "gpt-4o-mini", "provider": "openai"}, "column_name": "Toxicity_v1", "schema_mapping": schema_mapping},
]

# Trigger the listed metrics
evaluation.add_metrics(metrics=metrics)
```

The schema mapping above is a dictionary of "key": "value" pairs, where each key is a column name in your dataset and each value is a pre-defined Catalyst schema variable. The supported schema variables (case-sensitive) are:

* prompt
* context
* response
* expected\_response
* expected\_context
* traceId
* timestamp
* metadata
* pipeline
* cost
* feedBack
* latency
* system\_prompt
* traceUri
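Because the schema variables are case-sensitive, it can help to check a mapping locally before triggering any metrics. The following is a minimal sketch, not part of the SDK: `SUPPORTED_SCHEMA_VARIABLES` is copied from the list above, and `find_invalid_schema_values` is a hypothetical helper name.

```python
# Copied from the list of supported schema variables above (case-sensitive)
SUPPORTED_SCHEMA_VARIABLES = {
    "prompt", "context", "response", "expected_response", "expected_context",
    "traceId", "timestamp", "metadata", "pipeline", "cost", "feedBack",
    "latency", "system_prompt", "traceUri",
}


def find_invalid_schema_values(schema_mapping):
    """Return {column: value} for every value that is not a supported variable.

    Hypothetical helper, not an SDK function.
    """
    return {
        column: value
        for column, value in schema_mapping.items()
        if value not in SUPPORTED_SCHEMA_VARIABLES
    }


schema_mapping = {
    "Query": "prompt",
    "Response": "response",
    "Context": "context",
    "ExpectedResponse": "Expected_Response",  # wrong case on purpose
}
invalid = find_invalid_schema_values(schema_mapping)
print(invalid)  # {'ExpectedResponse': 'Expected_Response'}
```

An empty result means every value in the mapping matches one of the listed variables exactly.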

If you have enabled a custom gateway, replace the model and provider in the metric configuration with your own model's details, for example:

```python
{"name": "Hallucination", "config": {"model": "your-model-name", "provider": "your-model-provider"}, "column_name": "Hallucination_v1", "schema_mapping": schema_mapping}
```
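When several metrics share the same structure, a small helper can build the dictionaries and catch duplicate column names before they are submitted. This is a sketch under assumptions: `build_metric` and `duplicate_column_names` are hypothetical helpers, not SDK functions, and the model and provider strings are the placeholders used above. Only the dictionary layout comes from this page.

```python
def build_metric(name, model, provider, column_name, schema_mapping):
    """Build one metric entry in the dictionary layout shown above."""
    return {
        "name": name,
        "config": {"model": model, "provider": provider},
        "column_name": column_name,
        "schema_mapping": schema_mapping,
    }


def duplicate_column_names(metrics):
    """Return column names that appear more than once, in first-repeat order."""
    seen, duplicates = set(), []
    for metric in metrics:
        column = metric["column_name"]
        if column in seen and column not in duplicates:
            duplicates.append(column)
        seen.add(column)
    return duplicates


schema_mapping = {"Query": "prompt", "Response": "response", "Context": "context"}

metrics = [
    build_metric("Hallucination", "your-model-name", "your-model-provider",
                 "Hallucination_v1", schema_mapping),
    build_metric("Toxicity", "your-model-name", "your-model-provider",
                 "Toxicity_v1", schema_mapping),
]
print(duplicate_column_names(metrics))  # []
```

The resulting `metrics` list has the same shape as the one passed to `evaluation.add_metrics(metrics=metrics)` earlier.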

Once evaluations have been triggered, they can be tracked and accessed as follows:

```python
# Get status
evaluation.get_status()

# View results
df = evaluation.get_results()
df.head()
```
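If you want a script to wait for an evaluation before fetching results, a generic polling loop is one option. The sketch below is an assumption-heavy illustration: this page does not document what `get_status()` returns, so the helper takes the status getter and a "done" predicate as arguments. `wait_until_done` is a hypothetical name, and the `"in_progress"` and `"success"` strings are placeholders rather than confirmed SDK values.

```python
import time


def wait_until_done(get_status, is_done, poll_seconds=10.0, timeout_seconds=600.0):
    """Call get_status() repeatedly until is_done(status) is true.

    Raises TimeoutError if the deadline passes first. Hypothetical helper.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if is_done(status):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"evaluation not finished; last status: {status!r}")
        time.sleep(poll_seconds)


# Stand-in for evaluation.get_status, using placeholder status strings
fake_statuses = iter(["in_progress", "in_progress", "success"])
final = wait_until_done(
    lambda: next(fake_statuses),
    lambda status: status == "success",
    poll_seconds=0.01,
    timeout_seconds=5,
)
print(final)  # success
```

With a real `Evaluation` object you would pass `evaluation.get_status` as the first argument and a predicate that matches whatever value it returns in your SDK version.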


