Sample Size for Estimating a Population Proportion
The confidence interval for a population proportion is given by:
 \[ \hat{p} \pm z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \]
The margin of error is
 \[ z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]
Let M be the desired maximum margin of error. Then,
 \[ M \ge z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]
Solving for n,
 \[ n \ge \left(\frac{z^*}{M}\right)^2 \hat{p}(1-\hat{p}). \]
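The intermediate algebra can be sketched as follows, writing \hat{p} for the sample proportion. Both sides of the inequality are positive, so squaring preserves its direction, and taking reciprocals reverses it:

```latex
\begin{align*}
z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}} &\le M \\
\frac{\hat{p}(1-\hat{p})}{n} &\le \left(\frac{M}{z^*}\right)^2 \\
n &\ge \left(\frac{z^*}{M}\right)^2 \hat{p}(1-\hat{p})
\end{align*}
```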
But we do not have a value of $\hat{p}$ until we collect data, so we need a way to estimate $\hat{p}$. Let P* = estimated value of $\hat{p}$. Then
 \[ n \ge \left(\frac{z^*}{M}\right)^2 P^*(1-P^*). \]
There are two ways to choose a value of P*:
 Use a previously determined value of $\hat{p}$. That is, you may already have an idea, based on historical data, of what the value should be close to.
 Use P* = 0.5. A result from calculus tells us that the expression
 \[ P^*(1-P^*) \]
achieves its maximum value when P* = 0.5. Thus, the required n will be at its maximum if P* = 0.5. If P* = 0.5, the formula for n can more easily be expressed as
 \[ n \ge \left(\frac{z^*}{2M}\right)^2. \]
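The calculus result can be checked directly. In the sketch below, f is a name introduced here only for convenience; it denotes the product P*(1 - P*) as a function of P*:

```latex
\begin{align*}
f(P^*) &= P^*(1 - P^*) = P^* - (P^*)^2 \\
f'(P^*) &= 1 - 2P^* = 0 \quad\Longrightarrow\quad P^* = 0.5 \\
f''(P^*) &= -2 < 0 \quad\text{(so the critical point is a maximum)} \\
f(0.5) &= 0.25 \quad\Longrightarrow\quad n \ge \left(\frac{z^*}{M}\right)^2 (0.25) = \left(\frac{z^*}{2M}\right)^2
\end{align*}
```

Because f is symmetric about 0.5, values of P* equally far above and below 0.5 (for example 0.4 and 0.6) lead to the same required n.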
It is in your interest to choose the smallest value of n that will match your goals, so any value of P* other than 0.5 (whether above or below it) would be preferable if you have some justification for it.
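The procedure above can be sketched in Python. The function name required_sample_size and its parameter names are hypothetical, chosen for this illustration. The sketch assumes z* is the two-sided critical value of the standard normal distribution, computed here with the standard library rather than read from a table, and it rounds n up to the next whole number.

```python
import math
from statistics import NormalDist


def required_sample_size(margin, confidence=0.95, p_star=0.5):
    """Smallest whole n with z* * sqrt(p_star * (1 - p_star) / n) <= margin.

    p_star defaults to 0.5, the most conservative (largest n) choice.
    """
    if not 0 < margin < 1:
        raise ValueError("margin must be strictly between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    if not 0 < p_star < 1:
        raise ValueError("p_star must be strictly between 0 and 1")

    # Two-sided critical value: leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

    # n >= (z* / M)^2 * P*(1 - P*), rounded up to a whole observation.
    n = (z_star / margin) ** 2 * p_star * (1 - p_star)
    return math.ceil(n)


if __name__ == "__main__":
    print(required_sample_size(0.03))               # conservative P* = 0.5
    print(required_sample_size(0.03, p_star=0.6))   # with a prior estimate
```

Leaving p_star at its default gives the worst-case sample size; passing a justified estimate can only keep n the same or reduce it.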
example: Historically, about 60% of a company's products are purchased by people who have purchased products from the company previously. The company is preparing to introduce a new product and wants to generate a 95% confidence interval for the proportion of its current customers who will purchase the new product. They want to be accurate within 3%. How many customers do they need to sample?
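solution: Based on historical data, choose P* = 0.6. For 95% confidence, z* = 1.96, and M = 0.03. Then
 \[ n \ge \left(\frac{1.96}{0.03}\right)^2 (0.6)(0.4) = 1024.4. \]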
The company needs to sample 1025 customers. Had it not had the historical data, it would have had to use P* = 0.5.
If P* = 0.5, $n \ge \left(\frac{1.96}{2(0.03)}\right)^2 = 1067.1$, so the company would need a sample of at least 1068 customers. By using P* = 0.6, the company was able to sample 43 fewer customers.
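The arithmetic in this example can be double-checked with a few lines of Python. This is only a sketch: it assumes the usual tabulated value z* = 1.96 for 95% confidence, and the names Z_STAR, MARGIN, and min_n are made up for the illustration.

```python
import math

Z_STAR = 1.96   # assumed tabulated critical value for 95% confidence
MARGIN = 0.03   # desired maximum margin of error (3 percentage points)


def min_n(p_star):
    """Round (z*/M)^2 * P*(1 - P*) up to the next whole customer."""
    return math.ceil((Z_STAR / MARGIN) ** 2 * p_star * (1 - p_star))


n_hist = min_n(0.6)   # using the historical estimate
n_cons = min_n(0.5)   # conservative choice with no prior information

print(n_hist, n_cons, n_cons - n_hist)   # prints: 1025 1068 43
```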
