
Chi-Square Goodness-of-Fit Test for AP Statistics

By McGraw-Hill Professional
Updated on Feb 4, 2011

Practice problems for these concepts can be found at:

The following are the approximate percentages for the different blood types among white Americans: A: 40%; B: 11%; AB: 4%; O: 45%. A random sample of 1000 black Americans yielded the following blood type data: A: 270; B: 200; AB: 40; O: 490. Does this sample provide evidence that the distribution of blood types among black Americans differs from that of white Americans, or could the sample values simply be due to sampling variation? This is the kind of question we can answer with the chi-square goodness-of-fit test. ("Chi" is the Greek letter χ; chi-square is, logically enough, χ².) The chi-square goodness-of-fit test applies when there is one categorical variable (here, blood type) and one population (here, black Americans). In this chapter we will also encounter a situation in which there is one categorical variable measured across two populations (called a chi-square test for homogeneity of proportions) and a situation in which there are two categorical variables measured across a single population (called a chi-square test for independence).

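To answer this question, we need to compare the observed values in the sample with the expected values we would get if black Americans really had the same distribution of blood types as white Americans. Each expected value is the sample size (1000) times the corresponding proportion for white Americans. The values we need for this are summarized in the following table.

Blood type    Proportion (white Americans)    Expected value    Observed value
A             0.40                            400               270
B             0.11                            110               200
AB            0.04                             40                40
O             0.45                            450               490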
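As a minimal sketch of where the expected values come from, the short Python script below multiplies the sample size by each hypothesized proportion. The variable names are our own, chosen only for illustration; the numbers are the ones given in the problem.

```python
# Reference distribution: percentages for white Americans (from the text).
reference_percent = {"A": 40, "B": 11, "AB": 4, "O": 45}

# Observed counts in the random sample of black Americans (from the text).
observed = {"A": 270, "B": 200, "AB": 40, "O": 490}

# The sample size is the total of the observed counts.
sample_size = sum(observed.values())

# Expected count = sample size * hypothesized proportion for that category.
expected = {
    blood_type: sample_size * pct / 100
    for blood_type, pct in reference_percent.items()
}

for blood_type in observed:
    print(f"{blood_type:>2}: observed {observed[blood_type]:4d}, "
          f"expected {expected[blood_type]:6.1f}")
```

Note that the expected values always add up to the sample size, because the hypothesized percentages add up to 100%.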

The observed and expected values differ noticeably for types A and B, but much less for types AB and O. The table can be rewritten as follows.

Blood type    Observed value    Expected value
A             270               400
B             200               110
AB             40                40
O             490               450

Before working through this problem, a note on symbolism. Often in this book, and in statistics in general, we use English letters for statistics (measurements from data) and Greek letters for parameters (population values). Hence, x̄ is a sample mean and μ is a population mean; s is a sample standard deviation and σ is a population standard deviation, etc. We follow this same convention in this chapter: we will use χ² when referring to a population value or to the name of a test and use X² when referring to the chi-square statistic.

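The chi-square statistic (X²) measures, for each category, the squared difference between the observed and expected values relative to the expected value, and then adds these quantities over all categories. The X² statistic is computed as follows:

$$X^2 = \sum \frac{(O - E)^2}{E}$$

where O is the observed value and E is the expected value for a category, and the sum is taken over all categories.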
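To make the computation concrete, here is a small Python sketch that applies this definition to the blood type data. The function name chi_square_statistic is our own invention for illustration, not something from a calculator or a library.

```python
def chi_square_statistic(observed, expected):
    """Return the sum over all categories of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Blood type data from the text, in the order A, B, AB, O.
observed = [270, 200, 40, 490]
expected = [400, 110, 40, 450]

# Contribution of each category to the statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
x2 = chi_square_statistic(observed, expected)

for label, c in zip(["A", "B", "AB", "O"], contributions):
    print(f"{label:>2}: {c:.2f}")
print(f"X^2 = {x2:.2f}")  # about 119.44
```

Looking at the individual contributions shows which categories are responsible for a large X²: here types A and B contribute almost all of it, and type AB contributes nothing.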

The chi-square distribution depends on the number of degrees of freedom, which for the goodness-of-fit test equals the number of categories minus 1 (df = n – 1, where n is the number of categories). There is a different chi-square distribution for each number of degrees of freedom and, assuming a random sample that is large enough, the X² statistic approximately follows the one with the appropriate df. The probability of getting an X² value at least as large as the one observed can be read from a table of X² critical values, or determined from a calculator. There is an X² table in the back of this book, and you will be supplied with a table like it on the AP exam. We will demonstrate the use of both tables and the calculator in the examples and problems that follow.
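If neither a table nor a calculator is at hand, the upper-tail probability can be computed directly. The sketch below uses only Python's standard math module and a standard recurrence for whole-number degrees of freedom; the function name chi_square_p_value is hypothetical, and this is one possible approach rather than the method a calculator necessarily uses.

```python
import math


def chi_square_p_value(x2, df):
    """Upper-tail area: P(chi-square with df degrees of freedom >= x2).

    Starts from the closed forms for df = 1 and df = 2 and steps up two
    degrees of freedom at a time until df is reached.
    """
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive whole number")
    if x2 < 0:
        raise ValueError("x2 must be non-negative")

    half = x2 / 2.0
    if df % 2 == 1:
        k = 1
        tail = math.erfc(math.sqrt(half))   # exact tail area for df = 1
    else:
        k = 2
        tail = math.exp(-half)              # exact tail area for df = 2

    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * e^(-x/2) / Gamma(k/2 + 1)
    while k < df:
        tail += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return tail


# The 0.05 critical value for df = 3 in a standard table is 7.815.
print(round(chi_square_p_value(7.815, 3), 3))
```

As a check against a printed table, the function should return a tail area of about 0.05 at the tabled 0.05 critical values (for example, 7.815 for df = 3 and 11.07 for df = 5).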

A hypothesis test for χ² goodness-of-fit follows the by-now-familiar pattern. The essential parts of the test are summarized in the following table.

Let's use the four-step hypothesis-testing procedure.

example: The following are the approximate percentages for the different blood types among white Americans: A: 40%; B: 11%; AB: 4%; O: 45%. A random sample of 1000 black Americans yielded the following blood type data: A: 270; B: 200; AB: 40; O: 490. Does this sample indicate that the distribution of blood types among black Americans differs from that of white Americans?

solution:
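One way the four steps might run for this example is sketched below, worked only from the figures given in the problem. The subscripted symbols for the proportions are our own notation, and the 0.05 significance level and the "each expected value is at least 5" check are assumptions (the usual rule of thumb), not something stated in the problem.

```latex
% I. Hypotheses. Let p_A, p_B, p_AB, p_O be the proportions of each
%    blood type among black Americans.
\[
\begin{aligned}
H_0 &: p_A = 0.40,\; p_B = 0.11,\; p_{AB} = 0.04,\; p_O = 0.45 \\
H_A &: \text{at least one of these proportions is different}
\end{aligned}
\]

% II. Test and conditions. Chi-square goodness-of-fit test at alpha = 0.05
%     (assumed). The sample is random, and the expected values
%     400, 110, 40, 450 are all at least 5.

% III. Computation.
\[
X^2 = \frac{(270-400)^2}{400} + \frac{(200-110)^2}{110}
    + \frac{(40-40)^2}{40} + \frac{(490-450)^2}{450}
    \approx 42.25 + 73.64 + 0 + 3.56 = 119.44
\]
\[
df = 4 - 1 = 3, \qquad P\text{-value} < 0.001
\]

% IV. Conclusion. Because the P-value is far below 0.05, reject H_0.
```

In words: an X² value this large would be extremely unlikely if black Americans had the same distribution of blood types as white Americans, so the sample gives strong evidence that the two distributions differ.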

example: The statistics teacher, Mr. Hinders, used his calculator to simulate rolling a die 96 times and storing the results in a list L1. He did this by entering MATH PRB randInt(1,6,96)→(L1). Next he sorted the list (STAT SortA(L1)). He then counted the number of each face value. The results were as follows (this is called a one-way table).

Does it appear that the teacher's calculator is simulating a fair die? (That is, are the observations consistent with what you would expect to get if the die were fair?)
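The same experiment can be imitated in Python. The sketch below does not use Mr. Hinders's counts; it generates its own 96 rolls with an arbitrary seed, so its numbers are illustrative only. The critical value 11.07 is the standard table entry for df = 5 at the 0.05 level, and the choice of 0.05 is our assumption.

```python
import random
from collections import Counter

random.seed(2011)  # arbitrary seed so the run is repeatable

# Analogue of randInt(1,6,96): 96 whole numbers from 1 to 6 inclusive.
rolls = [random.randint(1, 6) for _ in range(96)]

# One-way table: how many times each face came up.
counts = Counter(rolls)
observed = [counts[face] for face in range(1, 7)]

# A fair die gives each face probability 1/6, so we expect 96/6 = 16 of each.
expected = [len(rolls) / 6] * 6

# Chi-square statistic: sum of (O - E)^2 / E over the six faces.
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# df = 6 - 1 = 5; the 0.05 critical value from a chi-square table is 11.07.
CRITICAL_VALUE_DF5_ALPHA_05 = 11.07
consistent_with_fair = x2 < CRITICAL_VALUE_DF5_ALPHA_05

print("observed:", observed)
print(f"X^2 = {x2:.2f}")
print("consistent with a fair die at the 0.05 level:", consistent_with_fair)
```

If X² is below the critical value, the observed counts are consistent with what we would expect from a fair die; that does not prove the die is fair, only that this sample gives no good evidence against it.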

solution:
