**Introduction to Hypothesis Testing for Proportions**

*A confidence interval for the population proportion provides a set of plausible values for that proportion, as we saw in the last lesson. If the proportion is hypothesized to be, say, 0.4, but the interval does not include 0.4, then it would be reasonable to reject that value for the population proportion. However, we will not always want to assess the validity of hypotheses using a confidence interval. In this lesson, the logic of statistical hypothesis testing and the application of this logic to proportions will be presented.*

**Logic of Hypothesis Testing**

To conduct a statistical test of hypotheses, we must first have two hypotheses: the null hypothesis and the alternative hypothesis. The *null hypothesis*, denoted by *H*_{0}, is a statement that nothing is happening. The specific null hypothesis varies depending on the problem. It could be that medication does not alter blood pressure, that no relationship exists between IQ and grades, or that hair grows at the same rate for females and males. The *alternative hypothesis*, denoted by *H*_{a}, is a statement that something is happening. As with the null hypothesis, the alternative hypothesis depends on the problem. It could be that medication changes blood pressure, that a relationship does exist between IQ and grades, or that hair grows at different rates for females and males.

For a moment, consider a criminal trial. The null hypothesis at every trial is *H*_{0}: The defendant is not guilty. That is, she did not do what she is accused of. The alternative hypothesis is always *H*_{a}: The defendant is guilty. That is, she did commit the crime of which she is accused. In the U.S. judicial system, the jury is instructed that the null hypothesis of not guilty can be rejected in favor of the alternative of guilty only if guilt has been established beyond a reasonable doubt; the evidence must be strong enough to convince the jurors that the null is not true. Statistical hypothesis testing is likewise based on the idea of rejecting the null hypothesis only if there is strong evidence against it.

For any test of hypotheses, two types of errors are possible: type I errors and type II errors. A *type I error* occurs if the null hypothesis is rejected when it is true. For the trial example, a type I error is committed if the jury declares an innocent person guilty. A *type II error* occurs if the null hypothesis is not rejected when it is false (the alternative is true). In a jury trial, a type II error is made if a guilty person is declared not guilty.

In most hypothesis testing settings, we can never be absolutely sure whether the null or the alternative is true. Again thinking of the jury trial, we can never be certain whether the defendant is innocent or guilty. (A confession may not be true; a witness may lie.) The null hypothesis that the person is innocent is rejected only if the evidence presented in the case makes jurors believe that having that much or more evidence against her would be extremely unlikely if she were innocent. In *statistical hypothesis testing*, the *p*-value is the probability, computed assuming the null hypothesis is true, of observing an outcome as unusual as or more unusual than the one that was observed. If the *p*-value is small enough, then we reject the null hypothesis in favor of the alternative, just as the jurors would reject the hypothesis of innocence and conclude guilty. However, in the case of the statistical test, we have a number, the *p*-value, which gets smaller as the evidence against the null hypothesis increases.
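As a minimal sketch of this definition, the following Python computes an exact one-sided *p*-value for a proportion. The numbers are hypothetical and are not data from this lesson: the null value 0.4 echoes the introduction, while the sample of 50 with 28 successes, the choice of an upper-tailed alternative (*H*_{a}: *p* > 0.4), and the function names are assumptions made only for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail_p_value(successes, n, p_null):
    """P(X >= successes) when X ~ Binomial(n, p_null), i.e. assuming H0 is true."""
    return sum(binomial_pmf(k, n, p_null) for k in range(successes, n + 1))


# Hypothetical data: 28 successes in 50 trials, testing H0: p = 0.4
# against the upper-tailed alternative Ha: p > 0.4.
p_value = upper_tail_p_value(28, 50, 0.4)
print(f"p-value = {p_value:.4f}")
```

Under this alternative, a larger observed count gives a smaller *p*-value, which matches the statement that the *p*-value shrinks as the evidence against the null hypothesis grows.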

The significance level of a test, or the *α* level, is the borderline for deciding whether the *p*-value is small enough to justify rejecting the null hypothesis in favor of the alternative hypothesis. If the *p*-value is smaller than the significance level, the null hypothesis is rejected; otherwise, it is not. The significance level is the largest acceptable probability of a type I error.
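To illustrate how the significance level bounds the type I error, the sketch below applies the decision rule above to every possible outcome of an exact binomial test and adds up the probability, under a true null hypothesis, of landing in the rejection region. The setup (50 trials, a null proportion of 0.4, *α* = 0.05, an upper-tailed alternative) and all function names are assumptions chosen for illustration.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail_p_value(successes, n, p_null):
    """P(X >= successes) when X ~ Binomial(n, p_null), i.e. assuming H0 is true."""
    return sum(binomial_pmf(k, n, p_null) for k in range(successes, n + 1))


def type_i_error_rate(n, p_null, alpha):
    """Probability, under H0, that the observed count falls in the rejection region."""
    # Decision rule from the text: reject H0 when the p-value is below alpha.
    rejection_region = [
        x for x in range(n + 1) if upper_tail_p_value(x, n, p_null) < alpha
    ]
    return sum(binomial_pmf(x, n, p_null) for x in rejection_region)


alpha = 0.05  # assumed significance level
rate = type_i_error_rate(50, 0.4, alpha)
print(f"type I error probability = {rate:.4f} (alpha = {alpha})")
```

Because the count of successes is discrete, the computed probability can fall below *α* rather than equal it, but it does not exceed *α*.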

The standard for rejecting the null hypothesis is quite high. For a jury to declare the defendant guilty, the evidence must be strong enough to remove any reasonable doubt from the minds of the jurors. As a consequence, some guilty people will be found not guilty only because a reasonable doubt remained. Similarly, if the *p*-value is above the significance level and the null is not rejected, this does *not* mean that the null hypothesis is true; we simply do not have enough evidence to say it is not true. This is why, in statistical hypothesis testing, we say that we have "failed to reject the null hypothesis," but we would *not* conclude that the null hypothesis is true.
