ChrF
Objective:
ChrF evaluates code generation models by measuring the overlap of character-level n-grams between generated code and reference solutions, combined into an F-score over n-gram precision and recall. Because it operates on characters rather than tokens, it requires no tokenizer and is sensitive to small differences such as renamed identifiers, punctuation changes, or formatting variations.
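To make the calculation concrete, here is a minimal sketch of a ChrF-style score in plain Python. It is a simplified illustration, not necessarily the exact computation the platform performs: the function names `char_ngrams` and `chrf`, the defaults `max_n=6` and `beta=2.0`, and the choice to keep whitespace characters are all assumptions made for this example.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count every contiguous run of n characters in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def chrf(generated: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified ChrF in [0, 1] (assumed defaults: max_n=6, beta=2.0)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        gen = char_ngrams(generated, n)
        ref = char_ngrams(reference, n)
        if not gen or not ref:
            continue  # one of the strings is shorter than n characters
        # Counter intersection keeps the minimum count per n-gram (clipped matches).
        overlap = sum((gen & ref).values())
        precisions.append(overlap / sum(gen.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)  # average precision over n-gram orders
    r = sum(recalls) / len(recalls)        # average recall over n-gram orders
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)


reference = "def add(a, b):\n    return a + b"
reformatted = "def add(a,b): return a+b"   # same behavior, different formatting
unrelated = "print('hello')"

score_same = chrf(reference, reference)
score_reformatted = chrf(reformatted, reference)
score_unrelated = chrf(unrelated, reference)
```

In this sketch an exact match scores 1.0, the reformatted variant scores below 1.0 even though it behaves identically, and the unrelated snippet scores lower still. This shows why the metric is useful for spotting formatting and surface-level differences, and why it should not be read as a measure of functional correctness.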
Required Columns in Dataset:
Generated Code, Reference Code
Interpretation:
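High ChrF: The generated code closely matches the reference at the character level, indicating consistent syntax and surface structure. A high score alone does not confirm that the two behave identically.
Low ChrF: The generated code differs substantially from the reference text. This may reflect structural differences that affect readability or correctness, although a correct solution written in a different style can also score low.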
Execution via UI:
Execution via SDK: