How to Define Good Criteria

Optimize your AI Assessments with carefully designed criteria

Written by Quentin Veillas
Updated over a month ago

Overview

By analyzing real-time interactions, Wonda’s AI Assessment feature can now provide instant feedback on the language and engagement strategies used in a simulation.

This immediate feedback loop helps learners quickly adjust their approaches during repeated practice, with the goal of building and honing skills under diverse circumstances.

In this guide, you will find suggestions on:

  • When to use the AI-powered Assessment feature

  • How to prepare for writing strong criteria

  • How to write strong criteria

  • How to test your criteria

When should you use Wonda’s AI-powered Assessment feature?

While repeated practice in authentic simulations can promote learner progress, the observed improvements are even greater when that practice is paired with coaching or feedback.

The AI Assessment feature can especially augment learning design in situations that benefit from:

  1. Immediate delivery of learner feedback: for skill-based simulations where learners can rapidly integrate feedback into subsequent practice and real-life scenarios.

  2. Instructor/facilitator pulse check on learner takeaways: to quickly and seamlessly gauge learner understanding of a topic via a debriefing simulation.

  3. Conversational reflection and coaching: to offer feedback on individual reflections completed after an activity, experience, or topic.

  4. Personalized self or formative assessment: instead of having learners rate their own competencies, place them in a simulation and then have them comment on the assessment.

How to get started with drafting criteria:

Before you start defining the criteria for the AI Assessment of a simulation, we highly recommend that you address the following points and questions.

  1. Consider the learning goals of the broader learning experience and the learning objectives of the simulation/exercise in particular.

    1. What are the learning goals of the overall learning experience?

    2. Where does this simulation sit within the larger learning experience (i.e., what is the context around the simulation and the scaffolding leading up to it)?

    3. What are the learning objectives of the simulation?

    4. How does the simulation build skills or understanding toward the overall learning goals?

  2. Break down skill sets or understanding into their components (the level of detail will depend on your answers to Question #1).

    1. How might complex or intricate tasks be split into simpler components?

    2. What components do you want to focus on?

    3. For a skill like conflict resolution, the components of interest might be active listening, emotional regulation, self-awareness, and creative problem solving.

    4. For content or process understanding, which pieces of the content or process do you want to ensure that learners comprehend? For example, is it important that learners be able to describe the non-violent communication framework, or to share their takeaways on how to integrate the framework into their daily lives?

  3. Identify the key behaviors or responses that you want learners to verbally display for each component.

    1. Referring back to the conflict resolution example outlined in Question #2, for the active listening component, you might be interested in gauging whether learners have picked up on a specific piece of information shared by the AI Character or the extent to which the concerns of the AI Character were addressed in the learners’ responses.

    2. If the goal is to evaluate understanding of, for instance, the non-violent communication framework, what about the framework specifically do you want them to mention?

How to write the criteria:

Once you have addressed the questions and considerations above, you are ready to write the criteria.

Keeping the criteria generic gives more agency to the large language model (LLM) to determine the learners’ performance. This approach would likely work well when learners can take the conversation in a direction that best suits their own learning process, such as during reflections and debriefing exercises.

In cases where a specific outcome or set of outcomes is preferred, adding extra context to the criteria description will help tailor the feedback towards the desired solution.

Our team is continuously researching the extent to which criteria can be hyper-specific and will update this guidance as more details come to light. In the meantime, it seems that a certain level of specificity can be achieved in the point allocation for a criterion when point values are assigned to specific expressed behaviors (e.g., always give 5/5 points if Alex raises the idea of a trial or interim period, even if there is no explanation of the trial or interim period).

Overall, here are some ways that you can format the criteria (these are a few samples rather than an exhaustive list of workable options; filled-in examples follow below):

  • Block Questions, Sentences, or Phrases –

    • Example: Did the human trainee propose solutions that would be good for you, as Alex, and for the company? Provide specific examples in your feedback.

    • Example: The human trainee proposed solutions that would be optimal for you and for the company.

    • Example: Proposed solutions that would be good for you and the company.

  • Split sections –

    • What to look for:
      ___

    • How to evaluate it:
      ___

  • Numbered by point allocations –

    • 1 = ___
      2 = ___
      3 = ___
      4 = ___
      5 = ___
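
For instance, assuming the Alex scenario used in the examples above, the "Split sections" and "Numbered by point allocations" formats might be filled in roughly like this (illustrative samples only, not the only workable wording):

  • What to look for:
    Solutions proposed by the human trainee that would be good for you, as Alex, and for the company (e.g., a trial or interim period).
    How to evaluate it:
    Give higher marks when the proposed solutions balance both sides, and cite specific examples from the conversation in your feedback.

  • 1 = The human trainee proposed no solutions, or only solutions that ignore both your needs as Alex and the company’s needs.
    2 = The human trainee proposed solutions that address either your needs as Alex or the company’s needs, but not both.
    3 = The human trainee proposed at least one solution that partially balances your needs as Alex with the company’s needs.
    4 = The human trainee proposed solutions that clearly work for both you, as Alex, and the company.
    5 = The human trainee proposed solutions that clearly work for both you, as Alex, and the company; always give 5/5 points if the idea of a trial or interim period is raised, even if it is not explained.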

Important terminology:

  • Refer to the learner as “the human trainee” and to the AI Character/avatar as “the AI Character”

For detailed examples of how criteria description wording can impact the feedback provided, check out this guide.

How to test the criteria:

Once you have finalized the AI Character prompt, you are ready to put the criteria to the test.

Have a few conversations with the AI Character in which you model examples of strong and weak performance across the criteria (relaunch the simulation experience to create each sample conversation).

Then, go to the AI Assessment dashboard and:

  1. Generate the reports.

  2. Review the grades.

  3. Make edits to the criteria and criteria descriptions accordingly.

  4. Regenerate reports on the same conversations to validate edits.

  5. Repeat the process as needed.
