POS

Objective: This test evaluates whether each word from the concept set is used in the model-generated response with the part of speech (PoS) specified for it. It assesses the model's ability to apply the designated parts of speech within a contextually appropriate sentence.

Required Parameters:

  • response: The sentence generated by the model using the words in concept_set.

  • concept_set: A list of words in their root form, each suffixed with a tag for its part of speech (e.g., "_V" for verb, "_N" for noun, "_A" for adjective).
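
The tag format described above can be illustrated with a small parser. This is only a sketch: `parse_concept` and `POS_SUFFIXES` are hypothetical names that are not part of the evaluator, and the mapping covers only the three tags mentioned in this section (the test itself may accept others).

```python
# Hypothetical helper for illustration; not part of the evaluator's API.
# Covers only the three tags named in this section.
POS_SUFFIXES = {"N": "noun", "V": "verb", "A": "adjective"}


def parse_concept(tagged: str) -> tuple[str, str]:
    """Split a tagged concept such as 'balloon_N' into ('balloon', 'noun')."""
    # rpartition splits on the LAST underscore, so the tag is always the suffix.
    word, sep, tag = tagged.rpartition("_")
    if not sep or not word or tag not in POS_SUFFIXES:
        raise ValueError(
            f"expected '<word>_<tag>' with tag in {sorted(POS_SUFFIXES)}, got {tagged!r}"
        )
    return word, POS_SUFFIXES[tag]


concepts = [parse_concept(c) for c in ["balloon_N", "soar_V", "red_A"]]
print(concepts)
# [('balloon', 'noun'), ('soar', 'verb'), ('red', 'adjective')]
```

Validating tags this way before running a test can catch entries such as "red" (no tag) or "red_X" (unknown tag) early.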

Interpretation: A higher score indicates that the model's response correctly applies the parts of speech as specified in the concept set. The evaluation focuses on the accurate use of designated parts of speech in context.

  • Example with Higher Score:

    • Concept set: ["balloon_N", "soar_V", "red_A"]

    • Response: "The red balloon soared into the sky." (All concepts are used correctly according to their designated parts of speech.)

  • Example with Lower Score:

    • Concept set: ["balloon_N", "soar_V", "red_N", "sky_A"]

    • Response: "The red balloon soared into the sky." (In this sentence "red" is an adjective and "sky" is a noun, so neither matches the "red_N" and "sky_A" entries in the concept set, leading to a lower score.)

Note: Always list concept-set words in their root form so that the test measures part-of-speech usage rather than word conjugation or inflection.
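
To make the two examples above concrete, here is a minimal sketch of one way such a score could be computed. It is not the evaluator's actual scoring method, which this page does not describe: the sketch assumes the score is the fraction of concepts whose root form appears in the response with the designated part of speech. The function name `pos_match_score` is hypothetical, and the (root form, part of speech) pairs are annotated by hand here instead of coming from a real tagger.

```python
# Illustrative sketch only; the real test may score responses differently.
TAGS = {"N": "noun", "V": "verb", "A": "adjective"}


def pos_match_score(concept_set, tagged_response):
    """Return the fraction of concepts found in the response with the designated PoS.

    tagged_response is a list of (root_form, part_of_speech) pairs, one per word.
    """
    if not concept_set:
        raise ValueError("concept_set must not be empty")
    observed = set(tagged_response)
    hits = 0
    for concept in concept_set:
        word, _, tag = concept.rpartition("_")
        # A concept counts only if the same root form occurs with the same PoS.
        if (word, TAGS[tag]) in observed:
            hits += 1
    return hits / len(concept_set)


# Hand-annotated pairs for "The red balloon soared into the sky."
# "soared" is reduced to its root form "soar", matching the note above.
tagged = [
    ("the", "determiner"), ("red", "adjective"), ("balloon", "noun"),
    ("soar", "verb"), ("into", "preposition"), ("the", "determiner"),
    ("sky", "noun"),
]

print(pos_match_score(["balloon_N", "soar_V", "red_A"], tagged))           # 1.0
print(pos_match_score(["balloon_N", "soar_V", "red_N", "sky_A"], tagged))  # 0.5
```

Under this assumed scoring, the second concept set matches only "balloon" and "soar", which is why it scores lower than the first.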

# Example with Higher Score

evaluator.add_test(
    test_names=["pos_test"],
    data={
        "concept_set": ["balloon_N", "soar_V", "red_A"],
        "response": "The red balloon soared into the sky."
    },
).run()

evaluator.print_results()

# Example with Lower Score

evaluator.add_test(
    test_names=["pos_test"],
    data={
        "concept_set": ["balloon_N", "soar_V", "red_N", "sky_A"],
        "response": "The red balloon soared into the sky."
    },
).run()

evaluator.print_results()
