Sample Surveys Study Guide
Introduction to Sample Surveys
The results of surveys are presented almost daily in newspapers, over the radio, and on television. From surveys, the proportion p of the population with a certain trait or opinion can be estimated. In fact, if the sample size is 1,500, we can be almost sure that our estimate is within 0.03 of the population proportion. Remarkably, the ability to estimate the population proportion with this precision does not depend on the size of the population. A sample of 1,500 people is sufficient whether we are drawing inferences about the people living in a particular state, the people living within the United States, or the people living on Earth, provided the sample is taken properly. Taking a proper sample is challenging. In this lesson, we will learn more about conducting sample surveys.
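The claim that precision does not depend on population size can be checked with a short calculation. The sketch below is not from the lesson itself: it assumes the standard large-sample approximation for a 95% margin of error, uses the worst case p = 0.5, and applies the usual finite population correction. The three population sizes are rough, hypothetical figures chosen only for illustration.

```python
import math

def conservative_margin(n, population_size, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Assumes simple random sampling, the worst case p = 0.5, and the
    finite population correction for a population of the given size.
    """
    standard_error = math.sqrt(0.25 / n)
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return z * standard_error * correction

# Hypothetical population sizes, for illustration only.
populations = {
    "a state": 5_000_000,
    "the United States": 300_000_000,
    "Earth": 8_000_000_000,
}

margins = {name: conservative_margin(1500, size)
           for name, size in populations.items()}

for name, margin in margins.items():
    print(f"n = 1500, population = {name}: margin is about {margin:.4f}")
```

Under these assumptions, all three margins come out at roughly 0.025, comfortably inside 0.03, and they differ from one another only in the later decimal places.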
Margin of Error
From June 24 through 26, 2005, the Gallup Organization contacted 1,009 adults nationally and asked them, "How patriotic are you? Would you say -- extremely patriotic, very patriotic, somewhat patriotic, or not especially patriotic?" Of the respondents, 72% answered "extremely patriotic" or "very patriotic." Thus, p̂ = 0.72 is the estimate of the proportion p of adults in the United States who would state they are extremely or very patriotic. The sample proportion p̂ is a point estimate of the population proportion p. A point estimate of a population parameter is a single number that is based on sample data and represents a plausible value of the parameter.
The Gallup Organization also reported that there was a ±3 percentage point margin of error associated with the survey. The margin of error provided by this and other media descriptions of survey results has two important characteristics. First, the difference between the sample proportion p̂ and the population proportion p is less than the margin of error about 95% of the time; that is, for about 19 of every 20 random samples of the same size from the same population, the sample proportion will be within the margin of error of the population proportion. Second, the sample proportion will differ from the population proportion by more than the margin of error about 5% of the time; that is, for about one in every 20 samples of the same size from the same population, the difference between the sample proportion and the population proportion will be greater than the margin of error.
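The "19 of every 20 samples" idea can be illustrated by simulation. The sketch below makes two assumptions that are not part of the lesson: it computes the margin with the standard large-sample formula 1.96 × sqrt(p̂(1 − p̂)/n), and it pretends the true population proportion equals 0.72, which in practice is unknown. The function names are hypothetical.

```python
import math
import random

def approx_margin(p_hat, n, z=1.96):
    """Large-sample approximation to a 95% margin of error."""
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

def share_within_margin(p, n, margin, num_samples, seed=0):
    """Fraction of simulated random samples whose sample proportion
    falls within `margin` of the assumed population proportion `p`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        # Each respondent independently has the trait with probability p.
        successes = sum(rng.random() < p for _ in range(n))
        if abs(successes / n - p) <= margin:
            hits += 1
    return hits / num_samples

# Assumption: treat 0.72 as if it were the true population proportion.
gallup_margin = approx_margin(0.72, 1009)
coverage = share_within_margin(p=0.72, n=1009, margin=gallup_margin,
                               num_samples=2000)

print(f"approximate margin of error: {gallup_margin:.3f}")
print(f"share of samples within the margin: {coverage:.3f}")
```

The computed margin is a little under 0.03, consistent with the reported ±3 percentage points, and the simulated share of samples within the margin should land near 0.95.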
The margin of error can be used to obtain an interval of plausible values for the parameter of interest. For the survey on patriotism, the point estimate was 0.72, and the margin of error was 0.03. Thus, the interval of plausible values based on this sample is 0.69 to 0.75.
A sample of high school students was randomly selected from a very large city. Each student was asked, "Are you employed either part time or full time during the school year?" Of those sampled, 38% reported that they had a part-time or a full-time job during the school year. The margin of error was reported to be 5 percentage points. Give a point estimate of the proportion of this city's high school students who are employed during the school year, and give an interval of plausible values that, with about 95% confidence, includes the true proportion.
The point estimate of the proportion of the high school students in this city who are employed, either part time or full time, is 0.38. An interval of plausible values for this proportion is between 0.38 – 0.05 = 0.33 and 0.38 + 0.05 = 0.43.
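Both intervals in this section follow the same recipe: subtract the margin of error from the point estimate to get the lower end, and add it to get the upper end. A minimal sketch, with a hypothetical helper name and with results clipped to the range 0 to 1 because a proportion cannot fall outside it:

```python
def plausible_interval(point_estimate, margin_of_error):
    """Interval of plausible values: point estimate plus or minus the margin."""
    lower = max(0.0, point_estimate - margin_of_error)
    upper = min(1.0, point_estimate + margin_of_error)
    # Round to hide floating-point noise in the printed result.
    return round(lower, 4), round(upper, 4)

patriotism = plausible_interval(0.72, 0.03)   # Gallup patriotism survey
employment = plausible_interval(0.38, 0.05)   # student employment example

print(patriotism)  # (0.69, 0.75)
print(employment)  # (0.33, 0.43)
```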
Census versus Sample Survey
In a census, every unit in the population is included in the sample. This is the only way to determine a parameter exactly. If our goal is to determine a parameter's value, why do we usually sample and not take a census? There are various reasons that we must, or want to, sample instead of taking a census.
It may not be feasible to take a census. When a nurse draws blood for a test, you certainly want her to be satisfied with a sample and not to take all of your blood as a census would require. A manufacturer who takes a census to determine the mean lifetime of the batteries the company produces will have nothing left to sell!
Many times, a census takes too long to complete. Suppose we want to know what proportion of the cotton plants in a 160-acre field have at least one insect on them. (The number of plants per acre can vary from 30,000 to 58,000.) It would take days to check each plant. By then, the plants inspected early that had insects may no longer have them, and the plants inspected early that did not have insects might now have them. The U.S. Census, which is completed every ten years, takes years to plan and more time to compile the results after the data are collected; it would not be feasible to conduct a census of the U.S. population each year.
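The scale of the cotton field census is easy to work out from the figures given. In the sketch below, the acreage and plants-per-acre range come from the text; the inspection time per plant and the length of a workday are purely hypothetical values, included only to show how quickly the effort adds up.

```python
ACRES = 160
PLANTS_PER_ACRE_LOW = 30_000
PLANTS_PER_ACRE_HIGH = 58_000

low_count = ACRES * PLANTS_PER_ACRE_LOW     # 4,800,000 plants
high_count = ACRES * PLANTS_PER_ACRE_HIGH   # 9,280,000 plants

# Hypothetical assumptions, not from the text.
SECONDS_PER_PLANT = 5
HOURS_PER_WORKDAY = 8

def person_days(plants):
    """Working days one inspector would need under the assumptions above."""
    return plants * SECONDS_PER_PLANT / 3600 / HOURS_PER_WORKDAY

print(f"plants in the field: {low_count:,} to {high_count:,}")
print(f"person-days to inspect all of them: "
      f"{person_days(low_count):,.0f} to {person_days(high_count):,.0f}")
```

Even with a large crew sharing the work, a full inspection of several million plants cannot be finished before conditions in the field change.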
A census is often not as accurate as a sample survey. A small group of interviewers can be trained more easily than can a large one. Finding a small number of nonrespondents is a much more manageable task than finding a large number of nonrespondents. For the U.S. Census, it is difficult to actually count all residents. Some do not have a home; others do not want to be counted. Various techniques have been used to count these people. This has led some to argue that a more accurate count of the U.S. population would be obtained if it were estimated from a sample; others disagree.
Simple Random Samples
Earlier, we noted that the sample proportion will be within the margin of error of the population proportion about 95% of the time, provided that the sample was properly taken. Before describing some of the methods that can be used to select samples properly, we need to think more carefully about some of the elements of sampling.
In a sample survey, the target population is the set of units that is of interest. The sampled population is the set of units from which the sample is selected. Although we want the target and sampled populations to be the same, this is rarely the case. As an illustration, suppose the target population is every household in the United States. If a telephone survey is conducted using the white pages from phone books across the nation, only households with telephone numbers in the white pages are part of the sampled population. Households without a telephone or with unlisted numbers are not part of the sampled population. The sample frame is a list of all units from which the sample is drawn; it is a list of the units in the sampled population.
Generally, the purpose of a sample survey is to draw inferences about some population characteristic(s). For a relatively small sample to accurately reflect the characteristics of a large population, the sample cannot be drawn haphazardly. Proper sampling methods, specifically probability sampling plans, must be used. A probability sampling plan is one in which every unit of the sampled population has a known probability of being included in the sample. In Lesson 2, we learned that a simple random sample is one in which every set of n units in the sampled population has an equal chance of being selected; a simple random sample is therefore a probability sample.
Suppose we want to take a simple random sample of size 30 from the people who have donated funds to the local public radio station within the past year. Working with the station, we could write each contributor's name on a slip of paper, place it in a bowl, mix the pieces of paper thoroughly, and draw out 30 slips. The names on the 30 slips of paper constitute the people in the sample. This approach becomes impractical as the population of interest becomes large. Writing the names of all residents of a city, much less a state or nation, on slips of paper would take a prohibitive amount of time. Instead, the sample frame (list of names) is usually generated from one or more sources, such as tax rolls or residential addresses, and the computer is used to make selections from the list in a manner that permits every listed unit (person) to have an equal chance of being chosen. Those units (people) selected by the computer constitute the simple random sample.
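The computer-based selection described above can be sketched in a few lines. This is a minimal illustration, not the procedure any particular survey organization uses: the frame of 1,200 contributor names is invented, the function name is hypothetical, and Python's standard `random.sample` stands in for the "bowl of slips" by drawing without replacement so that every set of 30 names is equally likely.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the sample frame, without replacement."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)

# Hypothetical sample frame: a list of 1,200 contributors.
frame = [f"Contributor {i}" for i in range(1, 1201)]

sample = simple_random_sample(frame, 30, seed=2024)
print(sample[:5])
```

Recording the seed is a design choice that lets someone else reproduce and audit the selection; leaving it as `None` gives a fresh draw each time.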
Generating the sample frame is a major challenge, especially if the population is large and/or geographically dispersed. The resources available to create the frame may not be sufficient. Sometimes, even if they are sufficient, it is impossible to create the sample frame, at least within the desired time frame. Other sampling methods have been developed as alternatives to simple random sampling. These tend to be more complicated both in selecting the sample and in obtaining parameter estimates from the sample. Depending on the circumstance, they may have some advantages over simple random sampling. We will consider four such methods: stratified random sampling, cluster sampling, systematic sampling, and multistage sampling.