SQL Context Sufficiency

Objective: This metric evaluates whether the context provided alongside the SQL prompt is sufficient for the model to generate a correct SQL query. It checks that the context contains the details the query depends on, such as column names, data types, and constraints, so that the generated query can match the expected output. An LLM judges whether the context is adequate or whether additional information is required.
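
The judging step can be pictured with a minimal sketch. The metric's actual judge prompt and response format are not documented here, so the template, `build_judge_prompt`, and `parse_score` below are hypothetical names used only for illustration; the reply format simply mirrors the "0.3/1.0" style of score shown in the example on this page.

```python
import re

# Hypothetical judge template; the metric's real prompt is not documented here.
JUDGE_TEMPLATE = """You are reviewing the context given for a text-to-SQL task.
Task: {prompt}
Context: {context}
Rate from 0.0 to 1.0 how sufficient the context is for writing a correct query.
Answer in the form 'Score: <number>' followed by your reasoning."""


def build_judge_prompt(prompt: str, context: str) -> str:
    """Fill the hypothetical template with one dataset row."""
    return JUDGE_TEMPLATE.format(prompt=prompt, context=context)


def parse_score(reply: str) -> float:
    """Extract the first 'Score: x' value from a judge reply, clamped to [0, 1]."""
    match = re.search(r"Score:\s*([0-9]*\.?[0-9]+)", reply)
    if match is None:
        raise ValueError("no score found in judge reply")
    return min(1.0, max(0.0, float(match.group(1))))
```

The filled-in prompt would be sent to whichever model the metric config names; that call is left out because it depends on the provider's client.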

Required Columns in Dataset:

Prompt: The SQL prompt or task description provided to the model.
Context: Additional information or dataset details that clarify the prompt and guide the SQL generation.
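
Before uploading a dataset, it may help to confirm that both required columns are present. The check below is a sketch using only the standard library; it assumes the dataset is a CSV file whose headers are literally `Prompt` and `Context`. If the platform maps columns by other names, adjust `REQUIRED_COLUMNS` accordingly. `missing_columns` is a hypothetical helper, not part of any SDK.

```python
import csv
import io

# Assumed header names, taken from the list of required columns above.
REQUIRED_COLUMNS = {"Prompt", "Context"}


def missing_columns(csv_text: str) -> set[str]:
    """Return the required columns that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return REQUIRED_COLUMNS - set(reader.fieldnames or [])


sample = (
    "Prompt,Context\n"
    '"Retrieve the names and ages of students with grade A.",'
    '"The dataset contains information about students, including their grades."\n'
)
print(missing_columns(sample))  # an empty set means both columns are present
```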

Interpretation: A higher score indicates that the context is sufficient and provides all the necessary information for the model to generate the correct SQL query. A lower score suggests that the context is insufficient or lacks critical details, which may lead to errors in the generated SQL query.

Metric Execution via UI:

Code Execution:

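# SQL Context Sufficiency: one entry per provider to evaluate with.
# Give each entry a distinct column_name so results from the two providers land in separate columns.
metrics = [
    {"name": "SQL Context Sufficiency", "config": {"model": "gpt-4o-mini", "provider": "azure"}, "column_name": "SQL_Context_Sufficiency_v2_azure"},
    {"name": "SQL Context Sufficiency", "config": {"model": "gpt-4o-mini", "provider": "openai"}, "column_name": "SQL_Context_Sufficiency_v2_openai"},
]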
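
Each entry names the column its result is written to, so two entries in one list that share a `column_name` would presumably collide. That behaviour is an assumption, not something this page states. A small pre-check along these lines can catch the problem before the metrics are submitted; `duplicate_column_names` and the column names in `example` are hypothetical.

```python
from collections import Counter


def duplicate_column_names(metrics: list[dict]) -> list[str]:
    """Return every column_name used by more than one metric entry."""
    counts = Counter(entry["column_name"] for entry in metrics)
    return sorted(name for name, n in counts.items() if n > 1)


# Illustrative entries with made-up column names, one per provider.
example = [
    {"name": "SQL Context Sufficiency", "config": {"model": "gpt-4o-mini", "provider": "azure"}, "column_name": "sufficiency_azure"},
    {"name": "SQL Context Sufficiency", "config": {"model": "gpt-4o-mini", "provider": "openai"}, "column_name": "sufficiency_openai"},
]
assert duplicate_column_names(example) == []
```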

Example:

Prompt: Retrieve the names and ages of students with grade A.

Context: The dataset contains information about students, including their grades.

Metric Score: 0.3/1.0

Reasoning:

Insufficient Context: The context mentions only that the dataset contains information about students and their grades. It does not specify important details such as the exact column names (for example, student_name and student_age), whether an is_enrolled column exists, or whether the data has any constraints.
Missing Critical Details: The lack of specific information about the dataset structure makes it difficult for the model to generate an accurate SQL query, as it may make incorrect assumptions or miss relevant conditions.

Interpretation: The low score reflects that the context was insufficient to guide the model effectively. For accurate SQL query generation, the context should include all necessary details, such as column names, data types, and relevant constraints, to ensure the model has enough information to work with.
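
For contrast, here is a sketch of what a more sufficient context for the same prompt could look like, and the query it makes possible. The column names student_name and student_age come from the reasoning above; the table name `students`, the `grade` column, the data types, and the sample rows are all assumptions made up for illustration. The schema is loaded into an in-memory SQLite database so the query can be run.

```python
import sqlite3

# Hypothetical richer context: table name, grade column, and all types are assumed.
CONTEXT = """
CREATE TABLE students (
    student_name TEXT NOT NULL,
    student_age  INTEGER NOT NULL,
    grade        TEXT NOT NULL CHECK (grade IN ('A', 'B', 'C', 'D', 'F'))
);
"""

# With the schema known, the prompt maps onto a query with no guessed names.
QUERY = (
    "SELECT student_name, student_age FROM students "
    "WHERE grade = 'A' ORDER BY student_name;"
)


def run_example() -> list[tuple[str, int]]:
    """Create the assumed schema in memory, load made-up rows, and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(CONTEXT)
        conn.executemany(
            "INSERT INTO students VALUES (?, ?, ?)",
            [("Ada", 17, "A"), ("Ben", 18, "B"), ("Cy", 16, "A")],
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


print(run_example())
```

A context written this way states the column names, types, and the allowed grade values explicitly, which are the kinds of details the reasoning above found missing.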

