Sampling Distributions
Here's a problem we haven't yet considered. Suppose, in the bulb-testing scenario, our sample consists of 1000 randomly selected bulbs, and we get the results illustrated by Figs. 54 and 55. What if we repeat the experiment, again choosing a sample of 1000 randomly selected bulbs? We won't get the same 1000 bulbs as we did the first time, so the results of the experiment will be a little different.
Suppose we do the experiment over and over. We'll get a different set of bulbs every time, so the results of the experiments will be almost, but not exactly, the same. There will be a small variation in the estimate of the mean from one experiment to another, and likewise in the estimate of the standard deviation. This experiment-to-experiment variation will be larger if the sample size is smaller (say 100 bulbs), and smaller if the sample size is larger (say 10,000 bulbs).
Imagine that we repeat the experiment indefinitely, estimating the mean again and again. As we do this and plot the results, we obtain a distribution that shows how the mean varies from sample to sample. Figure 56 shows what this curve might look like. It is a normal distribution, but its values are clustered much more closely around 3.600 amperes than the individual bulb currents are. We might also plot a distribution that shows how the standard deviation varies from sample to sample. Again we get a normal distribution; its values are closely clustered around 0.23, as shown in Fig. 57.
Figures 56 and 57 are examples of what we call a sampling distribution. Figure 56 shows a sampling distribution of means. Figure 57 illustrates a sampling distribution of standard deviations. If our experiments involved the testing of more than 1000 bulbs, these distributions would be more concentrated (more sharply peaked curves), indicating less variability from experiment to experiment. If our experiments involved the testing of fewer than 1000 bulbs, the distributions would be less concentrated (flatter curves), indicating greater variability from experiment to experiment.
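The repeated experiment described above is easy to simulate. The sketch below assumes a hypothetical bulb population with mean current 3.600 A and standard deviation 0.23 A (values taken from the figures described in the text); it draws many samples of each size and reports how tightly the sample means cluster.

```python
import numpy as np

# Hypothetical bulb population (assumed values, per the text's figures):
# currents normally distributed with mean 3.600 A and SD 0.23 A.
POP_MEAN, POP_SD = 3.600, 0.23
rng = np.random.default_rng(seed=42)

def sampling_distribution_of_means(sample_size, num_experiments=2000):
    """Repeat the experiment num_experiments times; return the mean
    current estimated from each sample of sample_size bulbs."""
    samples = rng.normal(POP_MEAN, POP_SD,
                         size=(num_experiments, sample_size))
    return samples.mean(axis=1)

for n in (100, 1000, 10_000):
    means = sampling_distribution_of_means(n)
    print(f"n = {n:>6}: means cluster at {means.mean():.4f} A, "
          f"spread (SD of the means) = {means.std():.5f} A")
```

Running this shows the pattern the text describes: every sampling distribution of means centers near 3.600 A, and the spread shrinks as the sample size grows.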
The Central Limit Theorem
Imagine a population P in which some characteristic x can vary from element (or individual) to element. Suppose P contains p elements, and p is a very large number. The value x is plotted on the horizontal axis of a graph, and the number y of individuals with characteristic value x is plotted on the vertical axis. The result is a statistical distribution. Maybe it's a normal distribution (bell-shaped and symmetrical), and maybe not. The number of elements p in the population P is so large that it's easiest to render the distribution as a smooth curve.
Now imagine that we choose a large number, k, of samples from P. Each sample represents a different random cross-section of P, but all the samples are the same size. Each of the k samples contains n elements, where n < p. We find the mean of each sample and compile all these means into a set {μ_{1}, μ_{2}, μ_{3}, . . ., μ_{k}}. We then plot these means on a graph. We end up with a sampling distribution of means. We've been through this discussion with the example involving the light bulbs; now we're stating it in general terms. We're repeating this concept because it leads to something important known as the Central Limit Theorem.
According to the first part of the Central Limit Theorem, the sampling distribution of means is a normal distribution if the distribution for P is normal. If the distribution for P is not normal, then the sampling distribution of means approaches a normal distribution as the sample size n increases. Even if the distribution for P is highly skewed (asymmetrical), any sampling distribution of means is more nearly normal than the distribution for P. It turns out that if n ≥ 30, then even if the distribution for P is highly skewed and p is gigantic, for all practical purposes the sampling distribution of means is a normal distribution.
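The first part of the theorem can be demonstrated with a simulation. The sketch below starts from a deliberately skewed population (an exponential distribution, chosen here as an illustrative example) and measures the skewness of the sampling distribution of means for growing n; by n = 30 the means are close to symmetrical.

```python
import numpy as np

rng = np.random.default_rng(seed=1)

def skewness(x):
    """Sample skewness: the third standardized moment.
    It is 0 for a perfectly symmetrical distribution."""
    d = x - x.mean()
    return (d**3).mean() / (d**2).mean() ** 1.5

# A highly skewed population: exponential, with skewness about 2.
population = rng.exponential(scale=1.0, size=100_000)
print(f"population skewness: {skewness(population):.2f}")

# Sampling distributions of means for increasing sample sizes.
for n in (2, 10, 30):
    means = rng.exponential(scale=1.0, size=(5000, n)).mean(axis=1)
    print(f"n = {n:>2}: skewness of sampling distribution = "
          f"{skewness(means):.2f}")
```

Each sampling distribution of means is less skewed than the population itself, and the skewness falls toward zero as n grows, which is exactly the behavior the theorem asserts.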
The second part of the Central Limit Theorem concerns the standard deviation of the sampling distribution of means. Let σ be the standard deviation of the distribution for some population P. Let n be the size of the samples of P for which a sampling distribution of means is determined. Then the standard deviation of the sampling distribution of means, more often called the standard error of the mean (SE), can be found with the following formula:
SE ≈ σ/(n^{1/2})
That is, SE is approximately equal to the standard deviation of the distribution for P, divided by the square root of the number of elements in each sample. If the distribution for P is normal, or if n ≥ 30, then we can consider the formula exact:
SE = σ/(n^{1/2})
From this it can be seen that as the value of n increases, the value of SE decreases. This reflects the fact that large samples, in general, produce more accurate estimates of the mean than small samples. Note that n = 30 is only the threshold beyond which the formula (and the normal approximation) can be trusted; SE itself continues to shrink as n grows, in proportion to 1/n^{1/2}.
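The formula for the standard error can be checked empirically. The sketch below (using an assumed population standard deviation of 0.23, echoing the bulb example) compares the observed spread of the sample means against σ/n^{1/2} for several sample sizes.

```python
import numpy as np

SIGMA = 0.23  # assumed population standard deviation
rng = np.random.default_rng(seed=7)

for n in (30, 100, 1000):
    # Observed SE: the SD of many independently computed sample means.
    means = rng.normal(0.0, SIGMA, size=(5000, n)).mean(axis=1)
    observed = means.std()
    # Predicted SE from the formula SE = sigma / sqrt(n).
    predicted = SIGMA / np.sqrt(n)
    print(f"n = {n:>4}: observed SE = {observed:.5f}, "
          f"predicted = {predicted:.5f}")
```

The observed and predicted values agree closely at every sample size, and both keep decreasing well past n = 30.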