CodeBLEU
Objective:
CodeBLEU is a metric for evaluating code generation that extends BLEU with code-specific signals. It combines n-gram match and keyword-weighted n-gram match with abstract syntax tree (AST) match and data-flow match, so it measures structural and semantic similarity to the reference as well as surface-level token overlap.
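As a rough illustration of how the component scores are combined, the sketch below computes a weighted sum of four component scores. The function name `combine_codebleu`, the example component values, and the equal weights of 0.25 are assumptions made for illustration; this is not the implementation used by the platform, and computing the components themselves (which requires parsing the code) is out of scope here.

```python
def combine_codebleu(ngram, weighted_ngram, ast_match, dataflow_match,
                     weights=(0.25, 0.25, 0.25, 0.25)):
    """Combine four component scores (each in [0, 1]) into one score.

    Hypothetical helper: the equal weights are an assumed default.
    """
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    components = (ngram, weighted_ngram, ast_match, dataflow_match)
    for value in components:
        if not 0.0 <= value <= 1.0:
            raise ValueError("component scores must lie in [0, 1]")
    # Weighted sum of the four components.
    return sum(w * c for w, c in zip(weights, components))


# Made-up component scores, for illustration only.
score = combine_codebleu(0.60, 0.65, 0.80, 0.75)
print(round(score, 2))
```

With equal weights, a strong AST and data-flow match can lift the overall score even when raw token overlap is modest.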
Required Columns in Dataset:
Generated Code, Reference Code
Interpretation:
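High CodeBLEU Score: Indicates that the generated code closely matches the reference solution in tokens, syntax, and data flow.
Low CodeBLEU Score: Indicates that the generated code deviates from the reference in structure or logic, which may impact functionality. Because the score measures similarity to the reference rather than correctness, a valid but differently written solution can also score low.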
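To see why structural signals matter when reading a score, the sketch below compares two snippets that differ only in variable names. It uses Python's standard `ast` module and a crude whitespace-token overlap; both are simplified stand-ins chosen for illustration, not the AST match or n-gram match that CodeBLEU actually computes. The helper name `node_types` and the example snippets are hypothetical.

```python
import ast


def node_types(source):
    """Return the sequence of AST node type names for a Python snippet."""
    return [type(node).__name__ for node in ast.walk(ast.parse(source))]


reference = "def add(a, b):\n    return a + b\n"
generated = "def add(x, y):\n    return x + y\n"

# Crude surface similarity: share of reference whitespace-tokens
# that also appear in the generated code.
ref_tokens = set(reference.split())
gen_tokens = set(generated.split())
token_overlap = len(ref_tokens & gen_tokens) / len(ref_tokens)

# Crude structural similarity: identical sequence of AST node types.
same_structure = node_types(reference) == node_types(generated)

print(f"token overlap: {token_overlap:.2f}")
print(f"same AST node types: {same_structure}")
```

Here the token overlap is low because every identifier changed, while the syntax trees have the same shape. A metric based on token matching alone would penalize this pair more heavily than one that also considers structure.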
Execution via UI:
Execution via SDK: