Guidelines for Writing Test Items
Whether your intention is to construct a formative test or a summative test, you now have topics in mind (and expectations of your students for those topics) and you have constructed a TOS that indicates either the test formats you want to use or perhaps the levels of thinking that you want your students to bring to the assessment exercise. You have also “weighted” the test to reflect the emphasis you have given to the various topics. It is time to start writing items. And so now you need to address the third of the five key elements for a high-quality test: how to go about assessing the students.
A key to gathering data for formative or summative purposes is to select the most appropriate format. For instance, if you want to know whether a student can swim, a multiple-choice test (selected response) would not be the best approach. Instead, the best approach would be a performance test that has the student in the water demonstrating an ability to swim!
On the other hand, it is equally important that you select a testing method that does not impede a student’s ability to demonstrate achievement. This brings us to a key point for all of the test formats: Assessing all students fairly does not necessarily mean that all students are assessed in the same manner. An obvious example would be that of a student who has poor writing skills but is sufficiently verbal to express his or her knowledge. It could be the case that you will use a personal communication format with this student while others use an essay format.
Personal Communication
A personal communication assessment is a circumstance in which you speak directly with the student. You’ve been through at least a few of these situations in your life. If you have ever been interviewed for admission to a program (e.g., entering the teacher-education program at your college or university) or for a job (e.g., with the principal of a school where you wanted to work), then you know firsthand what a personal communication assessment is all about. Keep in mind, however, that simply reading a test aloud to one student (perhaps because of a language problem) while the other students read it to themselves does not constitute a personal communication assessment. Rather, the personal communication technique provides the test taker and the test giver the opportunity to pursue answers beyond the initial response to the question asked.
Advantages and Disadvantages
A personal communication assessment is somewhat of a paradox in that it can be a quick and efficient way of assessing students, but it can also be a very time-consuming approach. If you have just a few questions to ask, a relatively small number of students, and very particular answers that you are looking for, then this is a good approach. On the other hand, if you have a lot of content to address, a large number of students, and you really want to put this approach to use by letting students elaborate on their initial responses, then time may become an issue that prevents using this technique. And keep in mind that as you speak with one student, all of the others will need some work to keep them occupied.
If writing is what you are assessing, then this would not be the most appropriate format. However, in a situation in which writing is not the target of your assessment, personal communication may very well open up an opportunity for success to a student who understands what has been taught but does not have the ability to express it in writing. Another advantage is that it allows you to attend to nonverbal responses such as those that would indicate the student has not understood the question.
Questioning Techniques
It may seem at first that this is a pretty simple approach to testing. We want to make it clear that this is very much an interpersonal dynamic and as such takes not only well-thought-out preparation but also practice in various questioning techniques. Your TOS will guide the writing of the initial prompts. Try to avoid asking questions that can be answered with a simple yes or no. If you must, be prepared to ask a follow-up question such as “Why?” or “Why not?”
When using personal communication, you can draw on a number of questioning techniques to help students go beyond a superficial response.
Probing Questions. These questions offer the student the opportunity to provide more depth to an answer. For instance, as a follow-up to a response you might ask, “What more can you tell me about that?”
Clarification Questions. If a student’s response is not clear to you, ask a clarification question that offers the opportunity for the student to more precisely phrase the answer. These questions, in particular, help to avoid misinterpreting or misunderstanding the student’s meaning. For example, if the student has told you that deserts are hot and dry, you might ask, “Do you mean that they are hot and dry all the time?”
Elaboration Questions. These questions ask for more description or for bringing in more information about something. If you ask a student what she saw at the zoo and she responds by saying “birds,” an elaboration question would ask for more description of the birds: “What can you tell me about the birds you saw?”
Redirection Questions. It will sometimes be obvious that the test taker has misinterpreted your question or has approached it from an inappropriate perspective. A redirection question clarifies your meaning or draws the student back to the appropriate topic without dismissing what he or she has already said. In particular, such a question helps to avoid making the test taker feel flustered because he or she missed the point. For example, you might say, “Yes, those are the types of clouds. Now, how do they differ from one another?”
Supporting Questions. These questions are very much an example of that interpersonal dynamic we referred to previously. There is no anonymity during a personal communication assessment, and so personal vulnerability is very much out there on display. To maintain the flow of such an assessment when an answer is not quite correct or the student has realized that he or she is “in trouble,” a supporting question provides support for what has been said but provides direction for overcoming the difficulty. For instance, a student has listed four of the six main characters in a story you have asked about, and now you see that pleading look in the eyes hoping that they’ve all been covered. You might say, “Those four are correct. Were there any other characters that we discussed?”
If you start paying attention to the questions that people ask, you will begin to notice these various techniques. You will need to practice using them in your conversations with students (and others) before you can use them effectively as part of an assessment.
Establishing Scoring Criteria
Establish appropriate scoring criteria before speaking with the student. Just because this is a conversation does not mean that credit is given for an answer that “sounds good.” Determine in advance what needs to be said to constitute an appropriate response for each question. If multiple elements exist, you can draw those out with the questioning techniques (that’s the beauty of this approach), but only give credit when a student actually offers a correct response. As mentioned previously, this technique—far more so than the other test formats—offers you a way to avoid making inferences about what a student knows or means. Establish in advance what the scoring criteria will be and then use effective questioning to determine whether or not the student has mastered the topic.
Recording the Results
The final concern is with how you will record the assessment. If you spend a lot of time writing down what a student says in response to the questions, the student will likely start saying less and less. People simply get uneasy when the things they say are being written down. The best bet is to record the conversation. In that situation you will be able to attend to the responses rather than being preoccupied with documenting them. In some instances, however, it may be necessary to obtain appropriate permission before recording a conversation with a child. Other possibilities include either taking brief notes or having a checklist available for when you hear the elements of a correct answer.
Essay
You might think of an essay as the written version of the personal communication assessment. There is no back-and-forth discussion, of course, but the test taker does have the opportunity to provide a response that can include great detail and elaboration. Unfortunately, this also means that test takers have a tendency to write down everything they can think of in the hope that something will be correct. That is not an appropriate way to answer a question, and if the essay prompt has been written in a manner that allows it to happen, then the question is at fault as well. Let’s take a closer look.
Things to Think About
Before charging ahead with the writing of essay questions, there are a number of concerns that you should consider. First, are your test takers up to the task of writing responses that demonstrate their achievement? This may sound like a no-brainer, but keep in mind two things: (1) you don’t want students’ writing ability (or lack thereof) to be an obstacle that prevents them from expressing what they know about a particular topic, and (2) not everyone in your class will be on the same level when it comes to writing ability. For example, if you are giving a science test, then it is really science content that you are targeting rather than writing ability. A student who does not write well may still know the science content that you’ve taught.
Essentially going hand in hand with the question of whether your students can write well enough to complete essays is the question of their reading ability. As you prepare essay questions, write them on the lowest reading level represented in your class. Unless you are testing reading comprehension, the essay question itself should not be an obstacle to your students demonstrating their achievement.
Essay assessments are most effective when the questions are sharply focused and narrow in scope. Broadly stated questions (actually, that would be vaguely stated questions) that allow students to respond in a number of ways really indicate that the test designer has not identified the target of the assessment and has not established the scoring criteria for the essay prompts. So, rather than using two or three essay questions that require pages of responses, use four or five questions that call for brief and concise responses. This actually makes things easier on the test taker and on the person who has to grade the tests.
Which brings us to the next point: the number of students to be evaluated. Keep in mind that the evaluator (typically you) is human. If you sit down with a stack of twenty-five essay exams, three essays each, with broadly stated prompts that allow lots of latitude in responses, (a) it will take a very long time to grade, and (b) you will likely find that by the time you reach the seventy-fifth essay your scoring criteria will have changed considerably. We are not saying to just shelve the essay approach and go with selected response if you have lots of students. Rather, we are saying that whether you have many students or just a few, your life will be made much easier if you construct your essay questions well and establish clear scoring criteria that match with the structure of the prompt.
Writing Good Essay Questions
There are three key points to keep in mind when constructing essay questions:
Specify the Content. State plainly in the prompt what information the test taker is supposed to work with in constructing a response. If the question refers to discussions in class, say so in the prompt. On the other hand, if the question refers to something straight from a textbook or other source, state that in the prompt. And yes, if it addresses both of those, you can say that as well. For instance, your prompt may begin,
With regard to our class discussion about the measures of central tendency and the normal curve . . .
We have now established that the only thing a correct response will relate to will be these two topics discussed in class.
Tell Them What to Do With the Information. You have now set the context for the response, and so it is time to tell them what to do once they’ve brought that information back from their long-term memory. You don’t want the respondent to give you a simple definition of the measures of central tendency, right? If a simple definition is all you want, then an essay is not the most appropriate approach for this topic. So what do you want? Do you want an explanation, a description, an opinion, a value judgment? It will make the task of scoring the test much simpler if you tell them what sort of thinking they are supposed to bring to the task. How so? Because if you ask for an explanation and they give you a list, then the answer is incorrect. All you will need to do when scoring the response is go up to the prompt and underline the portion that said they were to explain. Simple. So, our earlier example might continue in this way:
With regard to our class discussion about the measures of central tendency and the normal curve, explain why they are referred to as measures of central tendency and identify where they would be found on the normal curve . . .
Qualify Your Expectations. Your students now know what to discuss and how to discuss it. Add the final element to the question so that you encourage a well-rounded response, though without giving away any answers. That is, indicate what you expect an appropriate answer to include. This one step in the writing of essay questions will make them much more effective as assessment instruments. Our ongoing example may continue like this:
With regard to our class discussion about the measures of central tendency and the normal curve, explain why they are referred to as measures of central tendency and identify where they would be found on the normal curve. An appropriate answer will describe what each of the measures tells us, how they relate to scores distributed along a normal curve, and the underlying relationship between the measures and the curve.
If you now list the points available for this question as 15, then a good test taker should be able to see that there are three components to a correct answer and that each component is likely worth 5 points. If you receive a response that does not include each of the components asked for, those points are deducted and in the prompt you should underline the component the student failed to address.
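For readers who keep grades electronically, the arithmetic above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `Component`, `RUBRIC`, and `score_response` are hypothetical, the equal 5-point split is the one a test taker would likely infer rather than a fixed rule, and deciding whether a component was actually addressed remains the scorer's judgment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """One element the prompt says an appropriate answer will include."""
    label: str
    points: int


# Hypothetical rubric for the 15-point example prompt. The equal 5-point
# split is an assumption for illustration, not a rule from the text.
RUBRIC = [
    Component("what each measure tells us", 5),
    Component("how the measures relate to scores on a normal curve", 5),
    Component("underlying relationship between the measures and the curve", 5),
]


def score_response(rubric, addressed):
    """Return (points earned, labels of components to underline in the prompt).

    `addressed` is the set of component labels the scorer judged to be
    correctly covered in the student's response.
    """
    earned = sum(c.points for c in rubric if c.label in addressed)
    missed = [c.label for c in rubric if c.label not in addressed]
    return earned, missed


# Example: the student covered the first two components but not the third.
earned, missed = score_response(
    RUBRIC,
    addressed={
        "what each measure tells us",
        "how the measures relate to scores on a normal curve",
    },
)
print(f"{earned} / {sum(c.points for c in RUBRIC)} points")
print("Underline in prompt:", missed)
```

The list of missed components corresponds to the portions of the prompt you would underline when returning the test.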
Because essays have become so misused over the years, both in terms of their structure and the responses that we accept from students, switching to the system we describe may take a bit of doing. However, students will quickly come to understand that they need to read the question, answer the question that’s been asked, and address each component that is required. If you follow these suggestions for writing essay questions it will no longer be necessary to search through a response to see whether some aspect of the correct answer is in there somewhere. Write well-structured prompts and teach your students to respond to what is being asked of them. It shouldn’t take more than one or two testing experiences to get the point across, and then life will be better for test taker and test grader alike.
Scoring Essay Questions
There are three final points to keep in mind if you are going to use an essay format for assessment. You may have encountered the opposite of the first point, but we hope you will see why it is a good idea to switch. That is, use enough sharply focused questions to address the content the students are responsible for knowing/understanding and have everyone respond to the same questions. No choosing two out of three or three out of five. The only rationale for that approach is to cover the test writer’s failure to construct good, solid essay prompts. Obviously, if there are questions on a test that students can opt not to respond to, it means that particular information isn’t really important. And if it’s not important, why did you teach it and why is it on the test at all? What’s more, if students are responding to different questions, then the results of the assessment are not giving a true picture of the performance of your class. Instead, construct well-conceived questions that appropriately span the content for which the students are being held responsible. Your TOS will help to ensure that you accomplish this.
The second point to remember is that if someone other than you will be scoring any of the essays, be certain that they know exactly what you expect in an answer. As is the case when students are responding to different questions, if scorers are imposing different criteria when grading the answers, the efficacy of the assessment is lost.
Finally—and here’s the good news—if you’ve followed these guidelines you will find that grading the responses is an easy task, even if you have a lot of them to read. It is the student’s responsibility to answer the question(s) that you have asked, and it is your responsibility to see whether that has been done. This will be simple if you’ve written a good prompt that tells them what content to use, what to do with that content, and what elements should appear in a correct response. It is not your responsibility to try to find a way to give the test takers points for what they’ve written. Whenever you do this—and it happens all the time—you enable a pattern of poor-quality testing in terms of both the students’ performance and your approach to assessment.