When is a test considered to be of good quality? Validity and reliability are the key concepts when assessing the quality of a test. However, there is a whole world behind these terms. In this blog, we limit ourselves to the essentials. For more information about the concepts of reliability and validity, please refer to the explanation by Sluiter, Hemker and Eggen (2018).

Reliable and valid test

A test is reliable if its results are not based on chance. Reliability is compromised, for example, if there are disruptive factors during the test or if there are errors in the questions or in their assessment.

A valid test measures what it is supposed to measure and is consistent with the purpose for which the test results are used. For example, the test questions should correspond to specifically formulated learning or assessment objectives and match the level at which the assessment is conducted. Complaints about the validity of an assessment are often related to the significance that stakeholders attach to its results.

Examination questions that meet quality criteria

In short: good quality assessment means that candidates who have truly mastered the subject matter pass, and that stakeholders perceive the assessment to be fair. Exam questions that meet quality criteria generally yield significant benefits. Moreover, meeting these criteria is usually fairly easy to achieve. Awareness of these criteria and the ability to apply them are therefore important.

The criteria used relate to:
Relevance;
Objectivity;
Specificity;
Efficiency.

Relevance (1)

Does the question relate to the learning and assessment objectives in the examination programme? Does it concern relevant knowledge, or does it concern details so minor that no one can be expected to know them or will ever use them? The question must concern subject matter that is useful to a professional.
An example: In an examination on product knowledge in the food trade, it is of little importance to know the nutritional value of peanut butter by heart. After all, this information can be found on the label.

Objectivity (2)

Is the correct answer to the question always correct, or are there situations in which the ‘correct’ answer is actually incorrect? Can other answers also be considered correct? An objective question does not usually lead to discussion. See the example below of a non-objective question.

What colours does the Dutch flag have? Tick all the correct answers.
A. Red
B. Blue
C. White
D. Orange
Correct answer: A, B and C

The question is whether answer D, orange, should also be considered correct. If the flag is flown with a pennant, that pennant is orange. Answer D may not be the most relevant answer, but it is not really wrong either. In any case, the question could lead to discussion.

Specificity (3)

A question should be formulated in such a way that someone who has mastered the subject matter is able to answer it correctly, while someone who has not mastered the subject matter is not. A specific question therefore distinguishes between ‘good’ and ‘bad’ candidates. See below for an example of a non-specific (open) question.

Describe the leadership styles of a widely used management theory.
Answer: Hersey and Blanchard's theory describes four styles.
Description:
Delegating: assigning tasks to employees, with little guidance or support;
Supporting, consulting: helping employees, little guidance;
Persuading, motivating: lots of task-oriented guidance and lots of support;
Assigning tasks, giving instructions: lots of guidance, little support.
Other answers are left to the judgement of the assessor.

The problem with this question is that it is not very specific: there are various management theories and models that are often applied. In addition, it does not specify the requirements that the description must meet.
This means that there are many answers that could be considered correct.

Efficiency (4)

To meet the criterion of efficiency, limit the information in the question to what is necessary to answer it. A common example is a case text that contains an entire article from a daily newspaper as background information. Our advice is not to include such information in the examination, but rather in the course material.

Another example is a question that the candidate has to read several times to understand properly because it contains a double negative. Negatives are best highlighted in bold or italics so that they attract attention. Language errors and complex sentence constructions also fall under the criterion of ‘efficiency’.

Conclusion

In a good test, candidates who have mastered the subject matter pass, and all those involved feel that the test is fair. The above quality criteria for examination questions contribute to achieving this goal.