Sample Surveys Study Guide (page 2)

Updated on Oct 5, 2011


The point estimate of the proportion of the high school students in this city who are employed, either part time or full time, is 0.38. An interval of plausible values for this proportion is between 0.38 – 0.05 = 0.33 and 0.38 + 0.05 = 0.43.
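
The arithmetic behind this interval can be sketched in a few lines of Python. This is only an illustration: the function name plausible_interval is hypothetical, and the values 0.38 and 0.05 are the point estimate and margin of error from the example above.

```python
def plausible_interval(point_estimate, margin_of_error):
    """Return (lower, upper): the estimate minus and plus the margin of error."""
    lower = point_estimate - margin_of_error
    upper = point_estimate + margin_of_error
    return lower, upper


# Values from the example: point estimate 0.38, margin of error 0.05.
lower, upper = plausible_interval(0.38, 0.05)
print(f"Plausible values: {lower:.2f} to {upper:.2f}")
```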

Census versus Sample Survey

In a census, every unit in the population is included in the sample. This is the only way to determine a parameter exactly. If our goal is to determine a parameter's value, why do we usually sample and not take a census? There are various reasons that we must, or want to, sample instead of taking a census.

It may not be feasible to take a census. When a nurse draws blood for a test, you certainly want her to be satisfied with a sample and not to take all of your blood as a census would require. A manufacturer who takes a census to determine the mean lifetime of the batteries the company produces will have nothing left to sell!

Many times, a census takes too long to complete. Suppose we want to know what proportion of the cotton plants in a 160-acre field have at least one insect on them. (The number of plants per acre can vary from 30,000 to 58,000.) It would take days to check each plant. By then, the plants inspected early that had insects may or may not still have insects on them, and the plants inspected early that did not have insects might now have them. The U.S. Census, which is completed every ten years, takes years to plan and more time to compile the results after the data are collected; it would not be feasible to census the U.S. population each year.
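
To get a feel for the size of the cotton-field census, the plant counts can be worked out directly. The sketch below uses the acreage and plants-per-acre figures from the text. The inspection rate of one second per plant is purely an assumption made for illustration, and the function names are hypothetical.

```python
# Figures from the text: a 160-acre field with 30,000 to 58,000 plants per acre.
ACRES = 160
PLANTS_PER_ACRE_LOW = 30_000
PLANTS_PER_ACRE_HIGH = 58_000

# ASSUMPTION (not from the text): one second to inspect each plant.
SECONDS_PER_PLANT = 1.0


def total_plants(acres, plants_per_acre):
    """Number of plants in the whole field."""
    return acres * plants_per_acre


def person_hours(n_plants, seconds_per_plant=SECONDS_PER_PLANT):
    """Hours one inspector would need at the assumed inspection rate."""
    return n_plants * seconds_per_plant / 3600


low_total = total_plants(ACRES, PLANTS_PER_ACRE_LOW)
high_total = total_plants(ACRES, PLANTS_PER_ACRE_HIGH)
print(f"Plants to inspect: {low_total:,} to {high_total:,}")
print(
    "Person-hours at the assumed rate: "
    f"{person_hours(low_total):,.0f} to {person_hours(high_total):,.0f}"
)
```

The field holds between 4.8 million and 9.28 million plants, so even a very fast inspection rate adds up to a large amount of labor.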

A census is often not as accurate as a sample survey. A small group of interviewers can be trained more easily than a large one. Finding a small number of nonrespondents is a much more manageable task than finding a large number of nonrespondents. For the U.S. Census, it is difficult to actually count all residents. Some do not have a home; others do not want to be counted. Various techniques have been used to count these people. This has led some to argue that a more accurate count of the U.S. population would be obtained if it were estimated from a sample; others disagree.

Simple Random Samples

Earlier, we noted that the sample proportion will be within the margin of error of the population proportion provided that the sample was properly taken. Before describing some of the methods that can be used to select samples properly, we need to think more carefully about some of the elements of sampling.

In a sample survey, the target population is the set of units that is of interest. The sampled population is the set of units from which the sample is selected. Although we want the target and sampled populations to be the same, this is rarely the case. As an illustration, suppose the target population is every household in the United States. If a telephone survey is conducted using the white pages from phone books across the nation, only households with telephone numbers in the white pages are part of the sampled population. Households without a telephone or with unlisted numbers are not part of the sampled population. The sample frame is a list of all units from which the sample is drawn; it is a list of the units in the sampled population.

Generally, the purpose of a sample survey is to draw inferences about some population characteristic(s). For a relatively small sample to accurately reflect the characteristics of a large population, the sample cannot be drawn haphazardly. Proper sampling methods, specifically probability sampling plans, must be used. A probability sampling plan is one in which every unit of the sampled population has a known probability of being included in the sample. In Lesson 2, we learned that a simple random sample is one in which every set of units of size n in the sampled population has an equal chance of being selected; a simple random sample is therefore a probability sample.
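
A tiny enumeration can make the link between these two definitions concrete. The sketch below assumes a hypothetical sampled population of five units labeled A through E and a sample size of n = 2. It lists every possible sample, gives each the same probability, and then adds up the probability that each unit is included. Under these assumptions every unit's inclusion probability comes out to n/N, which is a known probability, as a probability sampling plan requires.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical sampled population of N = 5 units and sample size n = 2.
population = ["A", "B", "C", "D", "E"]
n = 2

# Every possible set of n units; under simple random sampling each set
# has the same chance of being the selected sample.
all_samples = list(combinations(population, n))
prob_each_sample = Fraction(1, len(all_samples))

# Inclusion probability of a unit = total probability of the samples containing it.
inclusion = {
    unit: sum(prob_each_sample for sample in all_samples if unit in sample)
    for unit in population
}

print(f"Number of possible samples: {len(all_samples)}")
for unit, prob in inclusion.items():
    print(f"P({unit} is in the sample) = {prob}")
```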

Suppose we want to take a simple random sample of size 30 from the people who have donated funds to the local public radio station within the past year. Working with the station, we could write each contributor's name on a slip of paper, place the slips in a bowl, mix them thoroughly, and draw out 30 slips. The names on the 30 slips of paper identify the people in the sample. This approach becomes impractical as the population of interest becomes large. Writing the names of all residents of a city, much less a state or nation, on slips of paper would take a prohibitive amount of time. Instead, the sample frame (list of names) is usually generated from one or more sources, such as tax rolls or residential addresses, and a computer is used to make selections from the list in a manner that gives every listed unit (person), and every possible group of the required size, an equal chance of being chosen. Those units (people) selected by the computer constitute the simple random sample.
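
The computer-based selection described above can be sketched with Python's standard random module, whose sample method draws without replacement. Everything specific here is an assumption for illustration: the frame of 500 made-up contributor labels, the function name draw_simple_random_sample, and the seed value, which is used only so the draw can be repeated.

```python
import random


def draw_simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame without replacement.

    Passing a seed makes the draw reproducible; leave it as None in practice.
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)


# Hypothetical sample frame: 500 contributors identified by made-up labels.
frame = [f"Contributor {i}" for i in range(1, 501)]

# The electronic equivalent of drawing 30 slips from a well-mixed bowl.
sample = draw_simple_random_sample(frame, 30, seed=2011)
print(sample[:5])
```

The result depends entirely on the frame that is supplied: units missing from the list can never be selected, which is why building the frame matters so much.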

Generating the sample frame is a major challenge, especially if the population is large and/or geographically dispersed. The resources available to create the frame may not be sufficient. Sometimes, even if they are sufficient, it is impossible to create the sample frame, at least within the desired time frame. Other sampling methods have been developed as alternatives to simple random sampling. These tend to be more complicated both in selecting the sample and in obtaining parameter estimates from the sample. Depending on the circumstance, they may have some advantages over simple random sampling. We will consider four such methods: stratified random sampling, cluster sampling, systematic sampling, and multistage sampling.
