Robust Pass@k
Measure code generation reliability with Robust Pass@k. Assess how often a working solution appears across multiple attempts.

metrics=[
{"name": "Robust Pass@k", "schema_mapping": {"original_prompt": "Original Prompt", "perturbed_prompts": "Perturbed Prompts", "generated_code": "Generated Code"}}
]
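
This page does not state how the score is computed, so the following is only a sketch of the underlying idea. It uses the widely used unbiased pass@k estimator, 1 - C(n - c, k) / C(n, k), where n is the number of generated samples and c the number that pass the tests. The function names `pass_at_k` and `worst_case_pass_at_k` are hypothetical, and taking the minimum over the original and perturbed prompts is an assumption about what "robust" could mean here, not the documented behavior of the metric.

```python
from math import comb
from typing import Sequence


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for n samples, of which c are correct."""
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k incorrect samples: every draw of k contains a correct one.
    if n - c < k:
        return 1.0
    # 1 minus the probability that all k drawn samples are incorrect.
    return 1.0 - comb(n - c, k) / comb(n, k)


def worst_case_pass_at_k(results_per_prompt: Sequence[Sequence[bool]], k: int) -> float:
    """Assumed aggregation: lowest pass@k over the original and perturbed prompts.

    Each inner sequence holds one pass/fail flag per generated sample
    for a single prompt variant.
    """
    return min(pass_at_k(len(r), sum(r), k) for r in results_per_prompt)


# Example: one original prompt and one perturbed prompt, four samples each.
original = [True, True, False, False]
perturbed = [True, False, False, False]
score = worst_case_pass_at_k([original, perturbed], k=2)
```

A worst-case aggregate drops whenever a single prompt variant causes the model to fail, which is the kind of sensitivity a robustness metric is meant to expose. An average over variants would be a gentler alternative.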