
Quality criteria for good exam questions

When is a test of good quality? Validity and reliability are the key concepts for judging the quality of a test. A whole world lies behind these terms, however, so in this blog we limit ourselves to the essentials. For more information on the concepts of reliability and validity, see the explanation of Sluiter, Hemker and Eggen (2018).

Reliable and valid test

A test is reliable if its result is not based on chance. Chance creeps in, for example, when there are disruptive factors in the administration or errors in the questions or in their assessment. A valid test measures what it is supposed to measure and is consistent with the purpose for which the test results are used. In practice this means, for example, that test questions align with explicitly stated learning or test objectives and with the level at which testing is done. Complaints about the validity of a test are often related to the meaning stakeholders assign to its results.

Exam questions that meet quality criteria

In short, good-quality testing means that a candidate who actually masters the subject matter passes and that stakeholders perceive the testing as fair. Exam questions that meet quality criteria usually yield great gains here, and meeting these criteria is usually quite easy. What matters is being aware of the criteria and being able to apply them. The criteria relate to:

  1. Relevance;
  2. Objectivity;
  3. Specificity;
  4. Efficiency.

Relevance (1)

Does the question match the learning and testing goals in the examination program? In other words, does it involve relevant knowledge, or is it about details that no one can be expected to know or that no one will ever use? The question must be about subject matter that a professional can benefit from. An example: in an exam on product knowledge in the food trade, it is less important to know the nutritional value of peanut butter by heart. After all, that information can be read on the label.

Objectivity (2)

Is the answer marked as correct always right, or are there conceivable situations in which the “right” answer is not actually correct? Could other answers also be counted as correct? An objective question rarely leads to discussion. See the example below of a non-objective question.


What colors does the Dutch flag have? Tick all the correct answers.

  A. Red
  B. Blue
  C. White
  D. Orange

Correct answer: A, B and C


The question is whether answer D, orange, should not also be counted as correct. Suppose the flag is flown with a pennant: that pennant is orange. Answer D may not be the most relevant answer, but it is not really wrong either. In any case, the question may lead to discussion.

Specificity (3)

A question should be targeted so that someone who has mastered the material can answer it correctly and someone who has not mastered the material cannot. A specific question thus distinguishes between “good” and “bad” candidates. See below for an example of a non-specific (open-ended) question.


Describe the leadership styles of a widely applied management theory.

A: Hersey and Blanchard’s theory describes four styles.

Description:

  • Delegating: leaving tasks to employees, little direction and little support;
  • Supporting, consulting: helping employees, little direction;
  • Persuading, motivating: lots of task-based direction and lots of support;
  • Instructing, directing: lots of direction, little support.

Other answers at the discretion of the assessor.


The problem with this question is that it has little focus: several management theories and models are widely applied. In addition, the question does not state what requirements the description must meet. As a result, a great many answers would have to be counted as correct.
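The idea that a specific question separates candidates who master the material from those who do not can also be checked roughly after an exam. The sketch below is not part of the criteria described in this blog; it assumes you have right/wrong scores per question and have already split candidates into a stronger and a weaker group based on their total exam score. The function name and all scores are hypothetical and invented for illustration.

```python
def discrimination_index(high_group, low_group):
    """Proportion correct in the stronger group minus that in the weaker group.

    Each argument is a list of 1 (correct) / 0 (incorrect) scores
    that the candidates in that group obtained on ONE question.
    """
    p_high = sum(high_group) / len(high_group)
    p_low = sum(low_group) / len(low_group)
    return p_high - p_low


# Hypothetical scores, invented for illustration only.
# Question 1: mostly the stronger candidates answer correctly.
specific_question = discrimination_index([1, 1, 1, 1, 0], [0, 0, 1, 0, 0])

# Question 2: almost everyone gets credit, as with a vague open question
# where many answers have to be counted as correct.
vague_question = discrimination_index([1, 1, 1, 1, 1], [1, 1, 1, 0, 1])

print(round(specific_question, 2), round(vague_question, 2))
```

A value close to zero suggests the question hardly distinguishes between the two groups, which is what one would expect from a question with little focus. This is only a rough indicator and does not replace a review of the question itself.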

Efficiency (4)

To meet the criterion of efficiency, limit the information in the question to what is necessary to answer it. A common example is a case text that includes an entire newspaper article as background information. The advice then is to include such information in the teaching material rather than in the exam. Another example is a double negation, which forces the candidate to read the question several times to understand it correctly. Negations are best set in bold or italics so that they attract attention. Language errors and complicated sentence constructions also fall under the “efficiency” criterion.

Conclusion

In a good test, a candidate who masters the subject matter passes, and everyone involved perceives the test as fair. The quality criteria for exam questions described above help to achieve this goal.
