Practice problems for these concepts can be found at:

The steps in the hypothesis-testing procedure are as follows:

- State hypotheses in the context of the problem. The first hypothesis, the
**null hypothesis**, is the hypothesis we are actually testing. The null hypothesis usually states that there is no bias or that there is no distinction between groups. It is symbolized by *H*_{0}.
The second hypothesis, the **alternative hypothesis**, is the theory that the researcher wants to confirm by rejecting the null hypothesis. The alternative hypothesis is symbolized by *H*_{A}. There are three forms for the alternative hypothesis: ≠ , >, or <. That is, if the null is *H*_{0}: μ_{1} – μ_{2} = 0, then *H*_{A} could be:

*H*_{A}: μ_{1} – μ_{2} ≠ 0 (this is called a two-sided alternative)

*H*_{A}: μ_{1} – μ_{2} > 0 (this is a one-sided alternative)

*H*_{A}: μ_{1} – μ_{2} < 0 (also a one-sided alternative)

(In the case of the one-sided alternative *H*_{A}: μ_{1} – μ_{2} > 0, the null hypothesis is sometimes written *H*_{0}: μ_{1} – μ_{2} ≤ 0.)

- Identify which test statistic (so far, that's
*z* or *t*) you intend to use and show that the conditions for its use are satisfied. If you are going to state a significance level, α, it can be done here.
- Compute the value of the test statistic and the
*P*-value.
- Using the value of the test statistic and/or the
*P*-value, give a conclusion in the context of the problem.

If you stated a significance level, the conclusion can be based on a comparison of the *P*-value with α. If you didn't state a significance level, you can argue your conclusion based on the *P*-value alone: if it is small, you have evidence against the null; if it is not small, you do not have evidence against the null.

The conclusion can be (1) that we reject *H*_{0} (because of a sufficiently small *P*-value) or (2) that we do not reject *H*_{0} (because the *P*-value is too large). We never accept the null: we either reject it or fail to reject it. If we reject *H*_{0}, we can say that we accept *H*_{A} or, preferably, that we have evidence in favor of *H*_{A}.

Significance testing involves making a decision about whether or not a finding is statistically significant. That is, is the finding sufficiently unlikely so as to provide good evidence for rejecting the null hypothesis in favor of the alternative? The four steps in the hypothesis testing process outlined above are the four steps that are required on the AP exam when doing inference problems. In brief, every test of a hypothesis should have the following four steps:

- State the null and alternative hypotheses in the context of the problem.
- Identify the appropriate test and check that the conditions for its use are present.
- Do the correct mechanics, including calculating the value of the test statistic and the
*P*-value.
- State a correct conclusion in the context of the problem.
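As a concrete illustration of the mechanics in steps 3 and 4, the sketch below runs a two-sided one-sample *z*-test in Python. All of the numbers (the sample mean, μ_{0}, σ, and *n*) are hypothetical values chosen for the example:

```python
import math

def z_test_two_sided(xbar, mu0, sigma, n):
    """Two-sided one-sample z-test of H0: mu = mu0 vs. HA: mu != mu0.
    Assumes the population standard deviation sigma is known."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Standard normal CDF, Phi(t), computed from the error function
    phi = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))
    p_value = 2 * (1 - phi(abs(z)))   # two-sided: double the tail area
    return z, p_value

# Hypothetical data: n = 36, sample mean 52, testing mu0 = 50 with sigma = 6
z, p = z_test_two_sided(xbar=52, mu0=50, sigma=6, n=36)
print(f"z = {z:.2f}, P-value = {p:.4f}")
# z = 2.00, P ≈ 0.0455, so at alpha = 0.05 we would reject H0
```

Since the *P*-value (about 0.0455) is below α = 0.05, the step-4 conclusion here would be to reject *H*_{0} in the context of the problem.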

### z-Procedures versus t-Procedures

In this chapter, we explore inference for means and proportions. When we deal with means, we may use, depending on the conditions, either *t*-procedures or *z*-procedures. With proportions, assuming the proper conditions are met, we deal only with large samples—that is, with *z*-procedures.

When doing inference for a population mean, or for the difference between two population means, we will usually use *t*-procedures. This is because *z*-procedures assume that we know the population standard deviation (or deviations in the two-sample situation), which we rarely do. We typically use *t*-procedures when doing inference for a population mean or for the difference between two population means when:

- The sample is a simple random sample from the population
and

- The sample size is large (rule of thumb:
*n* ≥ 30) or the population from which the sample is drawn is approximately normally distributed (or, at least, does not depart dramatically from normal)

You can always use *z*-procedures when doing inference for population means when:

- The samples are simple random samples from the population
and

- The population(s) from which the sample(s) is (are) drawn is normally distributed (in this case, the sampling distribution of the sample mean, x̄, will also be normally distributed)
and

- The population standard deviation(s) is (are) known.

Historically, many texts allowed you to use *z*-procedures when doing inference for means if your sample size was large enough to argue, based on the central limit theorem, that the sampling distribution of x̄ is approximately normal. The basic assumption is that, for large samples, the sample standard deviation *s* is a reasonable estimate of the population standard deviation σ. Today, most statisticians would tell you that it's better practice to *always* use *t*-procedures when doing inference for a population mean or for the difference between two population means. You can receive credit on the AP exam for doing a large-sample problem for means using *z*-procedures, but it's definitely better practice to use *t*-procedures.

When using *t*-procedures, it is important to check, in step II of the hypothesis test procedure, that the data could plausibly have come from an approximately normal population. A stemplot, boxplot, etc., can be used to show there are no outliers or extreme skewness in the data. *t*-procedures are **robust** against violations of the normality condition, which means that the procedures still work reasonably well even when that condition is not perfectly met. Some texts use the following guidelines for sample size when deciding whether or not to use *t*-procedures:

*n* < 15. Use *t*-procedures if the data are close to normal (no outliers or skewness).
15 ≤ *n* < 40. Use *t*-procedures unless there are outliers or marked skewness.
*n* ≥ 40. Use *t*-procedures for any distribution.

For the two-sample case discussed later, these guidelines can still be used if you replace *n* with *n*_{1} and *n*_{2}.
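These sample-size guidelines can be summarized in a small decision helper. This is only a sketch: the function name and the boolean flags describing the data's shape are invented for illustration.

```python
def t_procedures_ok(n, has_outliers=False, marked_skewness=False,
                    close_to_normal=True):
    """Apply the rough sample-size guidelines for using t-procedures."""
    if n >= 40:
        # Large samples: t-procedures for any distribution
        return True
    if n >= 15:
        # Moderate samples: OK unless there are outliers or marked skewness
        return not (has_outliers or marked_skewness)
    # Small samples: data should be close to normal, with no outliers
    return close_to_normal and not has_outliers

print(t_procedures_ok(50))                          # True
print(t_procedures_ok(20, has_outliers=True))       # False
print(t_procedures_ok(10, close_to_normal=False))   # False
```

For the two-sample case, the same check would be applied with *n*_{1} and *n*_{2} in place of *n*.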

### Using Confidence Intervals for Two-Sided Alternatives

Consider a two-sided significance test at, say, α = 0.05 and a confidence interval with *C* = 0.95. A sample statistic that would result in a significant finding at the 0.05 level would also generate a 95% confidence interval that would not contain the hypothesized value. Confidence intervals for two-sided hypothesis tests could then be used in place of generating a test statistic and finding a *P*-value. If the sample value generates a *C*-level confidence interval that does not contain the hypothesized value of the parameter, then a significance test based on the same sample value would reject the null hypothesis at α = 1 – *C*.
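This duality between a two-sided test at α = 0.05 and a 95% confidence interval can be checked numerically. The sketch below uses a one-sample *z*-test with hypothetical values; the same sample either rejects *H*_{0} and produces a CI excluding μ_{0}, or does neither.

```python
import math

# Standard normal CDF via the error function
phi = lambda t: 0.5 * (1 + math.erf(t / math.sqrt(2)))

# Hypothetical sample: n = 36, sample mean 52, testing mu0 = 50, sigma = 6
xbar, mu0, sigma, n = 52.0, 50.0, 6.0, 36
se = sigma / math.sqrt(n)

# Two-sided z-test at alpha = 0.05
z = (xbar - mu0) / se
p_value = 2 * (1 - phi(abs(z)))
reject = p_value < 0.05

# 95% confidence interval (critical value z* = 1.96)
lo, hi = xbar - 1.96 * se, xbar + 1.96 * se
excludes_mu0 = not (lo <= mu0 <= hi)

# The two decisions agree: reject H0  <=>  the 95% CI excludes mu0
print(reject, excludes_mu0)  # True True
```

Here *z* = 2.00 gives a *P*-value of about 0.0455 < 0.05, and the 95% interval (50.04, 53.96) does not contain 50, so both approaches lead to rejecting *H*_{0}.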

You should *not* use confidence intervals for hypothesis tests involving one-sided alternative hypotheses. For the purposes of this course, confidence intervals are considered to be two sided (although there are one-sided confidence intervals).
