ChrF

Objective:

ChrF (character n-gram F-score) evaluates code generation models by measuring character-level n-gram overlap between generated code and reference solutions. Because it compares characters rather than tokens, it does not depend on a tokenizer, and it picks up small differences such as typos, renamed identifiers, or formatting issues that token-level metrics can miss.
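To make the idea concrete, here is a minimal sketch of a character n-gram F-score in plain Python. It is not the platform's implementation. The n-gram range (1 to 6), the recall weight beta = 2, the 0 to 100 scale, and the decision to keep whitespace are all assumptions borrowed from common ChrF setups, and the function names are hypothetical.

```python
from collections import Counter


def char_ngrams(text: str, n: int) -> Counter:
    """Count every character n-gram of length n in text (whitespace is kept)."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def chrf(generated: str, reference: str, max_n: int = 6, beta: float = 2.0) -> float:
    """Simplified ChrF on a 0-100 scale (assumed defaults: max_n=6, beta=2)."""
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        gen = char_ngrams(generated, n)
        ref = char_ngrams(reference, n)
        if not gen or not ref:
            continue  # one of the strings is shorter than n characters
        # Clipped overlap: each n-gram counts at most as often as it appears in both.
        overlap = sum((gen & ref).values())
        precisions.append(overlap / sum(gen.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    # F-beta: beta > 1 weights recall more heavily than precision.
    return 100 * (1 + beta**2) * p * r / (beta**2 * p + r)


reference = "def add(a, b):\n    return a + b"
reformatted = "def add(a,b):\n  return a+b"

print(chrf(reference, reference))    # identical strings score 100.0
print(chrf(reformatted, reference))  # same logic, different spacing: below 100
```

Because this sketch keeps whitespace, a purely cosmetic change in spacing lowers the score. Some ChrF implementations strip whitespace before counting n-grams, so scores from different tools are not directly comparable.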

Required Columns in Dataset:

Generated Code, Reference Code
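As an illustration only, a dataset with these two columns could be prepared as a CSV like the one built below. The row contents are made up, and how the file is uploaded to the platform is not covered here.

```python
import csv
import io

# Hypothetical example rows; only the two column names come from this page.
rows = [
    {"Generated Code": "def add(a, b): return a + b",
     "Reference Code": "def add(x, y): return x + y"},
    {"Generated Code": "print('hi')",
     "Reference Code": "print('hello')"},
]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["Generated Code", "Reference Code"])
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```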

Interpretation:

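  • High ChrF: The generated code closely matches the reference at the character level, which suggests consistent syntax and structure. Functional equivalence is likely but not guaranteed, since ChrF does not execute the code.

  • Low ChrF: The generated code differs noticeably from the reference in structure, naming, or formatting. This may point to readability or correctness problems, although a valid alternative solution can also score low.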

Execution via UI:

Execution via SDK:

metrics=[
    {"name": "ChrF", "schema_mapping": {"generated_code": "Generated Code", "reference_code": "Reference Code"}}
]
