Visual, Auditory, and Kinesthetic Learners
The concept of visual, auditory, and kinesthetic learners is probably familiar to you. It states that each person has a preferred way of receiving new information through one of three senses. Vision (seeing) and audition (hearing) are clear enough, but kinesthesia might require an explanation. Kinesthesia is the sensation that tells you where your body parts are. If you closed your eyes and I moved your arm as though you were, say, waving, you would know where your arm was even though you couldn't see it. That information comes from special receptors in your joints, muscles, and skin. That's kinesthesia.
The visual-auditory-kinesthetic theory holds that everyone can take in new information through any of the three senses, but most of us have a preferred sense. When learning something new, visual types like to see diagrams, or even just to see in print the words the teacher is saying. Auditory types prefer to listen to verbal descriptions. Kinesthetic learners like to manipulate objects physically; they move their bodies in order to learn (Figure 3).
To give you a backdrop against which to evaluate this theory, I'll start with a few facts about memory that cognitive scientists have worked out. People do differ in their visual and auditory memory abilities. That is, our memory system can store both what things look like and what they sound like. We use visual memory representations when we create a visual image in our mind's eye. For example, suppose I ask you, "What is the shape of a German shepherd's ears?" or "How many windows are there in your classroom?" Most people say they answer these questions by creating a visual image and inspecting it. A great deal of work by experimental psychologists during the 1970s showed that such images do have a lot of properties in common with vision—that is, there's a lot of overlap between your "mind's eye" and the parts of your brain that allow you to see. We also store some memories as sound, such as Katie Couric's voice, the roar of the MGM lion, or our mobile phone's ringtone. If I ask you, for example, "Who has a deeper voice: your principal or your superintendent?" you will likely try to imagine each person's voice and compare them. We can store both visual and auditory memories, and as with any other cognitive function, each of us varies in how effectively we do so. Some of us have very detailed and vivid visual and auditory memories; others of us do not.
Cognitive scientists have also shown, however, that we don't store all of our memories as sights or sounds. We also store memories in terms of what they mean to us. For example, if a friend tells you a bit of gossip about a coworker (who was seen coming out of an adult bookshop), you might retain the visual and auditory details of the telling (for example, how the person telling the story looked and sounded), or you might remember only the content of the story (adult bookshop) without remembering any of the auditory or visual aspects of being told. Meaning has a life of its own, independent of sensory details (Figure 4).
Now we're getting to the heart of the visual-auditory-kinesthetic theory. It is true that some people have especially good visual or auditory memories. In that sense there are visual learners and auditory learners. But that's not the key prediction of the theory. The key prediction is that students will learn better when instruction matches their cognitive style. That is, suppose Anne is an auditory learner and Victor is a visual learner. Suppose further that I give Anne and Victor two lists of new vocabulary words to learn. To learn the first list, they listen to a tape of the words and definitions several times; to learn the second list, they view a slide show of pictures depicting the words. The theory predicts that Anne should learn more words on the first list than on the second, whereas Victor should learn more words on the second list than on the first. Dozens of studies have been conducted along these general lines, including studies using materials more like those used in classrooms, and overall the theory is not supported. Matching the "preferred" modality of a student doesn't give that student any edge in learning.
How can that be? Why doesn't Anne learn better when the presentation is auditory, given that she's an auditory learner? Because auditory information is not what's being tested! Auditory information would be the particular sound of the voice on the tape. What's being tested is the meaning of the words. Anne's edge in auditory memory doesn't help her in situations where meaning is important. Similarly, Victor might be better at recognizing the visual details of the pictures used to depict the words on the slides, but again, that ability is not being tested.
The situation described in this experiment probably matches most school lessons. Most of the time students need to remember what things mean, not what they sound like or look like. Sure, sometimes that information counts; someone with a good visual memory will have an edge in memorizing the particular shapes of countries on a map, for example, and someone with a good auditory memory will be better at getting the accent right in a foreign language. But the vast majority of schooling is concerned with what things mean, not with what they look like or sound like.
So does that mean that the visual-auditory-kinesthetic theory is correct some small proportion of the time, such as when students are learning foreign language accents or countries on a map? Not really, because the point of the theory is that the same material can be presented in different ways to match each student's strength. So what the teacher ought to do (according to the theory) is this: when learning countries on a map, the visual learners should view the shapes of the countries, but the auditory learners should listen to a description of each country's shape; and when learning a foreign accent, the auditory learners should listen to a native speaker, but the visual learners should view a written representation of the sounds. It seems obvious that this approach won't work.