ChrF
Objective:
ChrF evaluates code generation models by computing an F-score over the character-level n-grams shared between generated code and reference solutions. Because it works on characters rather than tokens, it needs no tokenizer and is sensitive to small differences such as renamed identifiers, changed operators, or formatting changes.
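To make the character n-gram idea concrete, here is a minimal sketch of a ChrF-style score in plain Python. It is an illustration, not the implementation behind this metric: the function names `char_ngrams` and `chrf`, the defaults `max_n=6` and `beta=2.0`, and the choice to count whitespace as ordinary characters are all assumptions. Some implementations strip whitespace before counting.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count every character n-gram of length n (whitespace included)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def chrf(generated: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Illustrative ChrF-style score in [0, 1]; defaults are assumptions."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        gen = char_ngrams(generated, n)
        ref = char_ngrams(reference, n)
        if not gen or not ref:
            # One of the strings is shorter than n characters: skip this order.
            continue
        # Clipped overlap: each n-gram counts at most as often as in both strings.
        overlap = sum((gen & ref).values())
        precisions.append(overlap / sum(gen.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return (1 + beta ** 2) * p * r / (beta ** 2 * p + r)
```

Under these assumptions, identical strings score 1.0, strings with no characters in common score 0.0, and a one-character change such as `a + b` versus `a - b` lands somewhere in between.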
Required Columns in Dataset:
Generated Code, Reference Code
Interpretation:
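High ChrF: The generated code is textually close to the reference, which suggests similar syntax and structure. Textual similarity alone does not confirm that the code behaves the same way.
Low ChrF: The generated code differs noticeably from the reference at the character level, for example in structure, naming, or formatting. Such differences may affect readability or correctness.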
Execution via UI:

Execution via SDK:
metrics=[
{"name": "ChrF", "schema_mapping": {"generated_code": "Generated Code", "reference_code": "Reference Code"}}
]