SQL Prompt Ambiguity

Objective: This metric evaluates the clarity of the SQL prompt given to the model. It assesses how well the prompt defines the task and how likely the prompt is to lead to a correct SQL response. An LLM judges whether the prompt is ambiguous or unclear in a way that could allow multiple valid interpretations or lead to incorrect SQL generation.

Required Columns in Dataset:

  • Prompt: The SQL prompt or task description provided to the model.

  • Context: Additional information or dataset details that clarify the prompt.
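As a rough illustration of these requirements, the sketch below checks that a dataset exposes both columns before it is evaluated. It assumes the dataset is available as CSV text with a header row; the CSV format, the sample row, and the helper name `missing_columns` are illustrative assumptions, not part of the platform's documented interface.

```python
import csv
import io

# Column names taken from the "Required Columns in Dataset" list.
REQUIRED_COLUMNS = {"Prompt", "Context"}


def missing_columns(csv_text: str) -> set[str]:
    """Return the required columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = set(reader.fieldnames or [])
    return REQUIRED_COLUMNS - header


# Hypothetical one-row dataset used only for illustration.
sample = (
    "Prompt,Context\n"
    '"Retrieve the names and ages of students with grade A.",'
    '"Columns: student_name, student_age, is_enrolled (boolean)."\n'
)

print(missing_columns(sample))  # empty set when both columns are present
```

Running a check like this before uploading a dataset catches a missing Context column early, rather than after the metric has been scheduled.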

Interpretation: A higher score indicates that the prompt was clear and unambiguous, leading to a correct or nearly correct SQL response. A lower score suggests that the prompt was unclear or poorly defined, increasing the chances of generating incorrect SQL queries.
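One way to act on this interpretation is to flag low-scoring prompts for rewording. The sketch below assumes scores fall between 0 and 1 and uses a cutoff of 0.5; the cutoff, the helper name `flag_ambiguous`, and the sample scores are all hypothetical and are not defined by the metric itself.

```python
def flag_ambiguous(scores: dict[str, float], threshold: float = 0.5) -> list[str]:
    """Return prompts scoring below the threshold, least clear first."""
    low = [(score, prompt) for prompt, score in scores.items() if score < threshold]
    return [prompt for score, prompt in sorted(low)]


# Hypothetical scores for illustration only; real values come from the metric.
scores = {
    "List the top customers.": 0.3,
    "Return customer_name for the 10 rows with the highest total_spend.": 0.9,
}

for prompt in flag_ambiguous(scores):
    print("Needs rewording:", prompt)
```

The threshold is a judgment call and would normally be tuned against prompts that are already known to be clear or unclear.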

Code Execution:

experiment_manager = Experiment(project_name="project_name",
                                experiment_name="experiment_name",
                                dataset_name="dataset_name")

# Ambiguity Test
response = experiment_manager.add_metrics(
    metrics=[
        {"name": "SQL Prompt Ambiguity", "config": {"reason": True, "model": "gpt-4o-mini", "batch_size": 5, "provider": "OpenAI"}}
    ]
)

Refer to the Executing tests page to learn about metric configurations.

Example:

Prompt: Retrieve the names and ages of students with grade A.

Context: The dataset contains columns for student_name, student_age, and a boolean column is_enrolled that indicates whether a student is currently enrolled.

Metric Score: 0.3/1.0

Reasoning:

  • Ambiguity in Prompt: The prompt does not specify whether to filter for enrolled students or to use specific column names. This leaves the task open to interpretation, which may lead to incorrect SQL queries.

  • Model Confusion: Due to the lack of specificity, the model generated a valid SQL query but missed essential details like the is_enrolled = true condition and the correct column names.

Interpretation: The low score suggests that the prompt was not detailed enough to guide the model to generate the correct SQL query, highlighting the importance of clear and unambiguous task descriptions.
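To make the ambiguity concrete, the sketch below runs two plausible readings of the example prompt against a small in-memory SQLite table. The table name `students`, the `grade` column (which the Context does not list), and the sample rows are assumptions made only so that both queries can run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE students ("
    "student_name TEXT, student_age INTEGER, "
    "grade TEXT, "          # hypothetical column, not listed in the Context
    "is_enrolled INTEGER)"  # SQLite stores booleans as 0/1
)
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?, ?)",
    [
        ("Ana", 20, "A", 1),
        ("Ben", 22, "A", 0),  # grade A but not currently enrolled
        ("Cara", 21, "B", 1),
    ],
)

# Reading 1: every student with grade A.
all_grade_a = conn.execute(
    "SELECT student_name, student_age FROM students "
    "WHERE grade = 'A' ORDER BY student_name"
).fetchall()

# Reading 2: only currently enrolled students with grade A.
enrolled_grade_a = conn.execute(
    "SELECT student_name, student_age FROM students "
    "WHERE grade = 'A' AND is_enrolled = 1 ORDER BY student_name"
).fetchall()

print(all_grade_a)       # [('Ana', 20), ('Ben', 22)]
print(enrolled_grade_a)  # [('Ana', 20)]
```

Both queries are valid readings of the prompt, yet they return different rows. A prompt that names the columns to return and states whether the enrollment filter applies leaves only one reading.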
