Functional Correctness
Test whether generated code runs correctly. Identify functional errors and refine LLM outputs for accuracy.

Last updated
Was this helpful?
Test whether generated code runs correctly. Identify functional errors and refine LLM outputs for accuracy.

Last updated
Was this helpful?
Was this helpful?
metrics=[
{"name": "Functional Correctness", "schema_mapping": {"generated_program": "Generated Program", "test_cases": "Set of Test Cases"}}
]