Objective Test Items
An objective test item is one for which the scoring rules are so exhaustive and specific that they leave scorers no room for subjective inferences or judgments; consequently, any scorer who marks an item according to the rules will assign the same score. Objective tests began to be used early in the twentieth century as a means of evaluating learning outcomes and predicting future achievement, and their high reliability and predictive validity led to the gradual replacement of the essay test.
One common misconception about the objective test item is that it is limited to testing specific, often trivial, factual details, a belief that sometimes leads educators to use an essay or performance test to assess students' comprehension of broader principles or their ability to apply them. However, as Robert Ebel pointed out, well-written objective tests (especially multiple-choice tests) can actually assess such higher-order abilities to some extent. While it is true that some types of knowledge or abilities cannot be assessed by objective tests, educators should also keep in mind that what test items can assess depends largely on the skills and effort of the test constructor, rather than on the test format per se.
A variety of different types of objective test formats can be classified into two categories: a selected response format, in which examinees select the response from a given number of alternatives, including true/false, multiple choice, and matching test items; and a constructed response format, in which examinees are required to produce an entire response, including short answer test items. This distinction is sometimes captured in terms of recognition and recall. These two general categories are further divided into basic types of objective tests, illustrated in the following examples (Figure 1).
The true/false test is the simplest form of selected response formats. True/false tests are those that ask examinees to select one of the two choices given as possible responses to a test question. The choice is between true and false, yes and no, right and wrong, and so on. A major advantage of the true/false test is its efficiency as it yields many independent responses per unit of testing time. Therefore, teachers can cover course material comprehensively in a single test. However, one apparent limitation of the true/false test is its susceptibility to guessing. It should be noted, however, that test givers can attenuate the effects of guessing by increasing the number of items in a test. In addition, some guessing might reflect partial knowledge, which would provide a valid indication of achievement.
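The point about test length can be illustrated with a short calculation. The following is a minimal sketch, not part of the original discussion: it assumes an examinee who guesses blindly on every item, treats items as independent, and uses an illustrative cutoff of 70% correct. The function name `prob_at_least` is hypothetical.

```python
from math import comb

def prob_at_least(n_items: int, min_correct: int, p_guess: float = 0.5) -> float:
    """Probability of getting at least min_correct items right by blind guessing.

    Assumes independent items, each guessed correctly with probability p_guess
    (0.5 for a true/false item).
    """
    return sum(
        comb(n_items, k) * p_guess**k * (1 - p_guess) ** (n_items - k)
        for k in range(min_correct, n_items + 1)
    )

# Chance of reaching the (assumed) 70% cutoff by guessing alone
for n in (10, 50, 100):
    cutoff = -(-7 * n // 10)  # ceil(0.7 * n) using integer arithmetic
    print(n, cutoff, prob_at_least(n, cutoff))
```

Under these assumptions, a blind guesser has roughly a 17% chance of reaching the cutoff on a 10-item test, and that chance shrinks rapidly as items are added. The sketch ignores partial knowledge, which the text notes can make some guessing a valid indicator of achievement.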
Another selected response format type is the multiple-choice test, which has long been the most widely used among the objective test formats. Multiple-choice test items require the examinee to select one or more responses from a set of options (in most cases, 3–7). The correct alternative in each item is called the answer (or the key), and the remaining alternatives are called distracters. Examinees have less chance of guessing the correct answer to a multiple-choice test question compared to a true/false test question. In addition, the distracter an examinee selects may provide useful diagnostic information.
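As a rough illustration of how distracter choices might be examined, the sketch below tallies how often each option of a single item was selected. The response data, the key, and the function name `option_counts` are all hypothetical and serve only to show the idea.

```python
from collections import Counter

def option_counts(responses, options):
    """Count how many examinees chose each option, including unchosen ones."""
    counts = Counter(responses)
    return {opt: counts.get(opt, 0) for opt in options}

# Hypothetical responses of eight examinees to one four-option item
options = "ABCD"
key = "B"
responses = ["B", "C", "B", "A", "C", "C", "B", "C"]

counts = option_counts(responses, options)
proportion_correct = counts[key] / len(responses)
chance_level = 1 / len(options)  # success rate expected from blind guessing

for opt in options:
    label = "key" if opt == key else "distracter"
    print(opt, label, counts[opt])
print("proportion correct:", proportion_correct, "chance level:", chance_level)
```

In this made-up example, distracter C attracts more examinees than the key, which might point to a shared misconception or to a flaw in the item, while distracter D is never chosen and so contributes little.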
Related to the multiple-choice test is the matching test, which consists of a list of premises, a list of responses, and directions for matching the two. Examinees must match each premise with one of the responses on the basis of the criteria described in the directions. A major strength of the matching test is that it is space-saving and, therefore, can be used to assess several important learning targets at once.
A typical example of a constructed-response format is the short-answer test, which asks examinees to supply a word, phrase, or number that answers a question or completes a sentence. Sometimes it is called a completion or fill-in-the-blank test. Although what a short-answer test item can assess is generally more limited to factual information, it does not require the development of plausible distracters. Moreover, short-answer items are much less susceptible to guessing than selected-response format items.
Scoring objective test items is easy: it requires only that one follow the scoring rules. Constructing good objective test items, however, requires much more skill and effort. The first step is to develop a set of test specifications that can serve to guide the selection of test items. A table of specifications (or test blueprint) is a useful tool for this purpose. This tool is usually a two-way grid that lists the content areas to be covered by the test as the row headings and the skills and abilities to be developed (i.e., instructional objectives) as the column headings (Figure 2). After the content and abilities covered by the test have been specified in the table of specifications, the appropriate test item format is selected for each item. At this point, not only objective test items but also other types of test items (essay tests or performance assessments) should be considered, depending on the learning outcomes to be measured.
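A table of specifications can be represented as a simple two-way grid. The sketch below is only an illustration: the content areas, objective labels, and item counts are placeholders and do not reproduce Figure 2.

```python
# Hypothetical table of specifications: rows are content areas,
# columns are instructional objectives, cells are planned item counts.
blueprint = {
    "Content area A": {"Recall": 4, "Understanding": 3, "Application": 1},
    "Content area B": {"Recall": 2, "Understanding": 4, "Application": 2},
    "Content area C": {"Recall": 1, "Understanding": 2, "Application": 3},
}

def row_totals(grid):
    """Number of items planned for each content area."""
    return {area: sum(cells.values()) for area, cells in grid.items()}

def column_totals(grid):
    """Number of items planned for each instructional objective."""
    totals = {}
    for cells in grid.values():
        for objective, n in cells.items():
            totals[objective] = totals.get(objective, 0) + n
    return totals

def total_items(grid):
    """Total number of items on the planned test."""
    return sum(row_totals(grid).values())

print(row_totals(blueprint))
print(column_totals(blueprint))
print(total_items(blueprint))
```

Checking the row and column totals before writing items makes it easier to see whether the planned test overweights one content area or leans too heavily on a single type of objective.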
The next step is to create specific test items. It is particularly important for objective test items to be written in clear and unambiguous language so that examinees can demonstrate their attainment of the learning objectives. If complex wording is used, the item may simply reflect reading comprehension ability. It is also important for each objective test item to focus on an important aspect of the content area rather than on trivial details. Asking about trivial details not only makes the test items unnecessarily difficult but also obscures what the test constructor really wants to measure. Similarly, relatively novel material should be used when creating items that measure understanding or the ability to apply principles. Items created by copying sentences verbatim from a textbook reflect only rote memory, rather than higher-order cognitive skills.
Many other specific rules exist for constructing objective test items. Test constructors must take great care that examinees with little or no content knowledge cannot arrive at the correct answer by exploiting characteristics of the test format that are independent of specific content knowledge. Jason Millman and his colleagues called this skill of the examinees “test-wiseness.” For example, in multiple-choice test items, all options should be grammatically correct with respect to the stem (the question or incomplete statement preceding the options), and key words from the stem, or their synonyms, should not be repeated in the correct option. Any violation of these rules gives an advantage to test-wise examinees. Test constructors should also equalize the length of the options of an item and avoid using specific determiners such as all, always, and never, because some test-wise examinees know that the correct option is frequently long and free of such specific determiners. Robert Thorndike and Anthony Nitko have provided more comprehensive guidelines, with detailed explanations, for constructing objective test items.
One common criticism of objective test items is that they encourage students toward rote learning and other surface-processing strategies. A related criticism is that objective tests, if used to evaluate the educational attainment of schools, encourage teachers to place undue emphasis on factual knowledge and to disregard students' understanding in the classroom. Some evidence supports both criticisms.
Kou Murayama, in a series of studies, investigated the effects of objective test items on the use of learning strategies. In one study, junior high school students participated in a history class for five days and took either an essay or a short-answer test at the end of each day. Results showed that on the last day, those who took the short-answer tests used more rote learning strategies and fewer deep-processing strategies than those who took the essay tests. George Madaus reviewed a large body of literature on the effects of standardized testing on what is taught in schools and found that teachers pay particular attention to the form of the questions and adjust their instruction accordingly, suggesting that objective tests could narrow instruction to the detriment of higher-order skills. Madaus argued that high-stakes tests (tests that are used to make important decisions, such as the ranking of schools) have much more influence on teaching.
However, educators should be reminded that objective test items are not limited to testing for specific factual knowledge. Well-written items may not have such negative effects on students' use of learning strategies or teachers' teaching styles. Thus, it is not the objective test items per se that should be changed. What is important is to change the stereotypical belief that objective test items require only rote learning of factual knowledge and to avoid poorly constructed objective test items.
See also: Standardized Testing
Ebel, R. L., & Frisbie, D. A. (1991). Essentials of educational measurement (5th ed.). Englewood Cliffs, NJ: Prentice Hall.
Madaus, G. F. (1998). “The influence of testing on the curriculum.” In L. N. Tanner (Ed.), Critical issues in curriculum (pp. 83–121). Chicago: University of Chicago Press.
Millman, J., Bishop, C. H., & Ebel, R. L. (1965). “An analysis of test-wiseness.” Educational and Psychological Measurement, 25(3), 707–726.
Murayama, K. (2003). “Test format and learning strategy use.” Japanese Journal of Educational Psychology, 51(1), 1–12.
Nitko, A. J. (2004). Educational assessment of students (4th ed.). Upper Saddle River, NJ: Merrill.
Thorndike, R. M. (1997). Measurement and evaluation in psychology and education (6th ed.). Upper Saddle River, NJ: Merrill.