CodeBLEU

Objective:

CodeBLEU is a comprehensive metric for evaluating code generation, integrating BLEU with code-specific aspects such as syntax and dataflow. It measures both linguistic and structural similarity, making it suitable for assessing code accuracy beyond surface-level token matching.

Required Columns in Dataset:

Generated Code, Reference Code

Interpretation:

  • High CodeBLEU Score: Indicates strong alignment with the reference solution in terms of both syntax and logic.

  • Low CodeBLEU Score: Reflects potential deviations in code structure or logic, which may impact functionality.

Execution via UI:

Execution via SDK:

metrics=[
    {"name": "CodeBLEU", "schema_mapping": {"generated_code": "Generated Code", "reference_code": "Reference Code"}}
]

Last updated

Was this helpful?