Ruby

Test LLM outputs with the Ruby metric. Assess how closely the logical structure and dependencies of generated code match a reference implementation.

Objective:

Ruby uses program dependency graph (PDG) analysis to evaluate code structure similarity. It compares the statements of the generated code, and the data and control dependencies between them, against those of a reference. Because the comparison targets structure rather than surface text, it is well-suited to complex programming tasks where correct logic flow matters more than matching wording.
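To make the idea of comparing dependencies concrete, here is a minimal sketch in Python. It is an illustration only, not the metric's actual implementation: it tracks data dependencies between top-level statements only (a full PDG also includes control dependencies), and the function names `dependency_edges` and `structure_similarity` are hypothetical.

```python
import ast
from collections import Counter


def dependency_edges(source: str) -> Counter:
    """Collect data-dependency edges between top-level statements.

    An edge (def_kind, use_kind) is counted whenever a statement reads a
    name that an earlier top-level statement assigned. Variable names are
    discarded, so renaming variables does not change the result.
    """
    tree = ast.parse(source)
    last_def = {}      # variable name -> kind of statement that last assigned it
    edges = Counter()  # multiset of (defining kind, using kind) pairs
    for stmt in tree.body:
        kind = type(stmt).__name__
        loads, stores = set(), set()
        for node in ast.walk(stmt):
            if isinstance(node, ast.Name):
                if isinstance(node.ctx, ast.Load):
                    loads.add(node.id)
                elif isinstance(node.ctx, ast.Store):
                    stores.add(node.id)
        # Reads are resolved against earlier definitions before this
        # statement's own assignments are recorded.
        for name in loads:
            if name in last_def:
                edges[(last_def[name], kind)] += 1
        for name in stores:
            last_def[name] = kind
    return edges


def structure_similarity(generated: str, reference: str) -> float:
    """Multiset Jaccard overlap of dependency edges, in the range [0, 1]."""
    gen = dependency_edges(generated)
    ref = dependency_edges(reference)
    if not gen and not ref:
        return 1.0  # assumption: two snippets with no dependencies count as identical
    shared = sum((gen & ref).values())
    total = sum((gen | ref).values())
    return shared / total


reference = "a = 1\nb = a + 2\nprint(b)\n"
renamed = "x = 1\ny = x + 2\nprint(y)\n"    # same structure, different names
shortened = "x = 1\nprint(x)\n"             # one dependency edge is missing

print(structure_similarity(renamed, reference))    # 1.0
print(structure_similarity(shortened, reference))  # 0.5
```

The renamed snippet scores 1.0 because its dependency edges are identical to the reference, while the shortened snippet loses one of the two edges and scores 0.5. A token-overlap metric would rank these two cases very differently, which is the motivation for a structure-based comparison.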

Required Columns in Dataset:

Generated Code, Reference Code

Interpretation:

  • High Ruby Score: The generated code aligns closely with the logical structure of the reference code.

  • Low Ruby Score: The generated code deviates from the reference in logic or structure, which may affect functionality.

Execution via UI:

Execution via SDK:
