Practice problems for these concepts can be found at:

The second major part of a course in statistics involves making *inferences* about populations based on sample data (the first was *exploratory data analysis*). The ability to do this rests on being able to make statements such as, "The probability of getting a finding as different, or more different, from expected as we got by chance alone, under the assumption that the null hypothesis is true, is 0.6." To make sense of this statement, you need an understanding of what is meant by the term "probability" as well as an understanding of some of the basics of probability theory.

An **experiment or chance experiment (random phenomenon)**: An activity whose outcome we can observe or measure but we do not know how it will turn out on any single trial. Note that this is a somewhat different meaning of the word "experiment" than we developed in the last chapter.

**example:** if we roll a die, we know that we will get a 1, 2, 3, 4, 5, or 6, but we don't know *which* one of these we will get on the next trial. Assuming a fair die, however, we *do* have a good idea of approximately what proportion of each possible outcome we will get over a large number of trials.
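This long-run regularity can be seen directly by simulation. The following Python sketch (the seed and trial count are arbitrary choices) rolls a fair die 60,000 times and prints the relative frequency of each face, each of which should land near 1/6 ≈ 0.167:

```python
import random
from collections import Counter

random.seed(1)  # arbitrary seed, for reproducible runs

# Roll a fair die many times and tabulate the relative frequency
# of each face; each should settle near 1/6 over many trials.
rolls = [random.randint(1, 6) for _ in range(60_000)]
freqs = {face: count / len(rolls) for face, count in sorted(Counter(rolls).items())}
for face, rel in freqs.items():
    print(face, round(rel, 3))
```

No single roll is predictable, but the printed proportions all cluster tightly around 1/6.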

**Outcome:** One of the possible results of an experiment (random phenomenon).

**example:** the possible outcomes for the roll of a single die are 1, 2, 3, 4, 5, 6. Individual outcomes are sometimes called **simple events**.

### Sample Spaces and Events

**Sample space:** The set of all possible outcomes, or simple events, of an experiment.

**example:** For the roll of a single die, S = {1, 2, 3, 4, 5, 6}.

**Event:** A collection of outcomes or simple events. That is, an event is a subset of the sample space.

**example:** For the roll of a single die, the sample space (all outcomes or simple events) is S = {1, 2, 3, 4, 5, 6}. Let event A = "the value of the die is 6." Then A = {6}. Let B = "the face value is less than 4." Then B = {1, 2, 3}. Events A and B are subsets of the sample space.

**example:** Consider the experiment of flipping two coins and noting whether each coin lands heads or tails. The sample space is S = {HH, HT, TH, TT}. Let event B = "at least one coin shows a head." Then B = {HH, HT, TH}. Event B is a subset of the sample space S.

**Probability of an event**: the relative frequency of the outcome. That is, it is the fraction of the time that the outcome would occur if the experiment were repeated indefinitely. If we let E = the event in question, *s* = the number of ways an outcome can succeed, and *f* = the number of ways an outcome can fail, then

*P*(E) = *s* / (*s* + *f*).

Note that *s* + *f* equals the number of outcomes in the sample space. Another way to think of this is that the probability of an event is the sum of the probabilities of all outcomes that make up the event.

For any event A, *P*(A) ranges from 0 to 1, inclusive. That is, 0 ≤ *P*(A) ≤ 1. This is an algebraic consequence of the definition of probability: if success is guaranteed (*f* = 0), then *P*(A) = *s*/(*s* + 0) = 1, and if failure is guaranteed (*s* = 0), then *P*(A) = 0/(0 + *f*) = 0.

The sum of the probabilities of all possible outcomes in a sample space is one. That is, if the sample space is composed of *n* possible outcomes,

*P*(O₁) + *P*(O₂) + ··· + *P*(Oₙ) = 1.

**example**: In the experiment of flipping two coins, let the event A = obtain at least one head. The sample space contains four elements ({HH, HT, TH, TT}). Here *s* = 3 because there are three ways for our outcome to be considered a success ({HH, HT, TH}) and *f* = 1.

Thus

*P*(A) = *s* / (*s* + *f*) = 3 / (3 + 1) = 3/4.
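The same counts of *s* and *f* can be recovered by brute-force enumeration. A short Python sketch:

```python
from itertools import product

# Enumerate the sample space for two coin flips: s = 3 outcomes
# contain at least one head, f = 1 (TT) does not.
sample_space = list(product("HT", repeat=2))
successes = [o for o in sample_space if "H" in o]
p_at_least_one_head = len(successes) / len(sample_space)
print(p_at_least_one_head)  # 0.75
```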

**example:** Consider rolling two fair dice and noting their sum. A sample space for this experiment can be given in table form as follows (rows give the first die, columns the second die, and each cell the sum):

| + | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 2 | 3 | 4 | 5 | 6 | 7 | 8 |
| 3 | 4 | 5 | 6 | 7 | 8 | 9 |
| 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| 6 | 7 | 8 | 9 | 10 | 11 | 12 |

Let B = "the sum of the two dice is greater than 4." There are 36 outcomes in the sample space, 30 of which have a sum greater than 4. Thus,

*P*(B) = 30/36 = 5/6.

Furthermore,

*P*(the sum is at most 4) = 1 – *P*(B) = 1 – 5/6 = 1/6.
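The counting argument can be checked by enumerating all 36 equally likely rolls in Python:

```python
from fractions import Fraction
from itertools import product

# Enumerate all 36 equally likely rolls of two dice and count
# those whose sum is greater than 4: 30 of them, so P(B) = 5/6.
outcomes = list(product(range(1, 7), repeat=2))
b = [o for o in outcomes if sum(o) > 4]
p_b = Fraction(len(b), len(outcomes))
print(p_b, 1 - p_b)  # 5/6 1/6
```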

### Probabilities of Combined Events

*P*(A or B): The probability that **either** event A **or** event B occurs. (They can both occur, but only one needs to occur.) Using set notation, *P*(A or B) can be written *P*(A ∪ B). A ∪ B is spoken as, "A union B."

*P*(A and B): The probability that **both** event A **and** event B occur. Using set notation, *P*(A and B) can be written *P*(A ∩ B). A ∩ B is spoken as, "A intersection B."

**example:** Roll two dice and consider the sum (see table). Let A = "one die shows a 3," B = "the sum is greater than 4." Then *P*(A or B) is the probability that *either* one die shows a 3 *or* the sum is greater than 4. Of the 36 possible outcomes in the sample space, 32 are successes [the 30 outcomes with sum greater than 4, as well as (1,3) and (3,1)], so

*P*(A or B) = 32/36 = 8/9.

There are nine ways in which one die shows a 3 and the sum is greater than 4: [(3,2), (3,3), (3,4), (3,5), (3,6), (2,3), (4,3), (5,3), (6,3)], so

*P*(A and B) = 9/36 = 1/4.
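Representing events as Python sets makes the union and intersection counts mechanical:

```python
from fractions import Fraction
from itertools import product

# Two-dice events from the example: A = "one die shows a 3",
# B = "the sum is greater than 4". Set union gives "or",
# set intersection gives "and".
outcomes = set(product(range(1, 7), repeat=2))
a = {o for o in outcomes if 3 in o}
b = {o for o in outcomes if sum(o) > 4}
p_a_or_b = Fraction(len(a | b), len(outcomes))
p_a_and_b = Fraction(len(a & b), len(outcomes))
print(p_a_or_b, p_a_and_b)  # 8/9 1/4
```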

**Complement of an event A:** the set of outcomes in the sample space that are not in event A. The complement of event A is symbolized by Aᶜ (read "A complement"). Furthermore, *P*(Aᶜ) = 1 – *P*(A).

### Mutually Exclusive Events

**Mutually exclusive (disjoint) events**: Two events are said to be *mutually exclusive* (some texts refer to mutually exclusive events as *disjoint*) if and only if they have no outcomes in common. That is, A ∩ B = Ø. If A and B are mutually exclusive, then *P*(A and B) = *P*(A ∩ B) = 0.

**example:** in the two-dice rolling experiment, A = "face shows a 1" and B = "sum of the two dice is 8" are mutually exclusive because there is no way to get a sum of 8 if one die shows a 1. That is, events A and B cannot both occur.

### Conditional Probability

**Conditional Probability**: "The probability of A given B" assumes we have knowledge of an event B having occurred before we compute the probability of event A. This is symbolized by *P*(A|B). Also,

*P*(A|B) = *P*(A and B) / *P*(B).

Although this formula will work, it's often easier to think of a condition as reducing, in some fashion, the original sample space. The following example illustrates this "shrinking sample space."

**example:** Once again consider the possible sums on the roll of two dice. Let A = "the sum is 7," B = "one die shows a 5." We note, by counting outcomes in the table, that *P*(A) = 6/36. Now, consider a slightly different question: what is *P*(A|B) (that is, what is the probability of the sum being 7 *given that* one die shows a 5)?

**solution:** Look again at the table of sums. The condition has effectively reduced the sample space from 36 outcomes to only 11: the six outcomes in which the first die shows a 5 and the six in which the second die shows a 5, with (5,5), whose sum is 10, counted only once. Of those 11 outcomes, two, (2,5) and (5,2), have a sum of 7. Thus, *P*(*the sum is* 7 | *one die shows a* 5) = 2/11.

**alternate solution**: If you insist on using the formula for conditional probability, we note that *P*(A and B) = *P*(*the sum is* 7 and *one die shows a* 5) = 2/36, and *P*(B) = *P*(*one die shows a* 5) = 11/36. By formula,

*P*(A|B) = (2/36) / (11/36) = 2/11.
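The "shrinking sample space" idea translates directly into code: filter the outcomes down to those satisfying the condition, then count within the survivors.

```python
from fractions import Fraction
from itertools import product

# Shrink the sample space to the 11 outcomes in which one die
# shows a 5, then count how many of those survivors sum to 7.
outcomes = list(product(range(1, 7), repeat=2))
given_b = [o for o in outcomes if 5 in o]      # 11 outcomes
a_and_b = [o for o in given_b if sum(o) == 7]  # (2,5) and (5,2)
p_a_given_b = Fraction(len(a_and_b), len(given_b))
print(p_a_given_b)  # 2/11
```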

Some conditional probability problems can be solved by using a **tree diagram**. A tree diagram is a schematic way of looking at all possible outcomes.

**example**: Suppose a computer company has manufacturing plants in three states. 50% of its computers are manufactured in California, and 85% of these are desktops; 30% of computers are manufactured in Washington, and 40% of these are laptops; and 20% of computers are manufactured in Oregon, and 40% of these are desktops. All computers are first shipped to a distribution site in Nebraska before being sent out to stores. If you picked a computer at random from the Nebraska distribution center, what is the probability that it is a laptop?

**solution:** The tree diagram branches first on the manufacturing state and then on the type of computer:

- California (0.50): desktops 0.85, so *P*(CA and desktop) = (0.50)(0.85) = 0.425; laptops 0.15, so *P*(CA and laptop) = (0.50)(0.15) = 0.075
- Washington (0.30): desktops 0.60, so *P*(WA and desktop) = (0.30)(0.60) = 0.18; laptops 0.40, so *P*(WA and laptop) = (0.30)(0.40) = 0.12
- Oregon (0.20): desktops 0.40, so *P*(OR and desktop) = (0.20)(0.40) = 0.08; laptops 0.60, so *P*(OR and laptop) = (0.20)(0.60) = 0.12

Note that the six final probabilities add to 1, so we know we have considered all possible outcomes. Now,

*P*(*laptop*) = 0.075 + 0.12 + 0.12 = 0.315.
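The tree calculation is just a weighted sum over the branches: for each plant, multiply *P*(plant) by *P*(laptop | plant) and add. A Python sketch (the dictionary layout is simply one convenient encoding of the tree):

```python
# For each plant: (P(plant), P(laptop | plant)), taken from the example.
plants = {
    "California": (0.50, 1 - 0.85),  # 85% desktops, so 15% laptops
    "Washington": (0.30, 0.40),      # 40% laptops
    "Oregon":     (0.20, 1 - 0.40),  # 40% desktops, so 60% laptops
}

# Weighted sum over branches: P(laptop) = sum of P(plant) * P(laptop | plant).
p_laptop = sum(p_plant * p_lap for p_plant, p_lap in plants.values())
print(round(p_laptop, 3))  # 0.315
```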

### Independent Events

**Independent Events:** Events A and B are said to be *independent* if and only if *P*(A) = *P*(A|B) or *P*(B) = *P*(B|A). That is, A and B are independent if knowledge of one event having occurred does not change the probability that the other event occurs.

**example:** Consider drawing one card from a standard deck of 52 playing cards.

Let A = "the card drawn is an ace." Then *P*(A) = 4/52 = 1/13.

Let B = "the card drawn is a 10, J, Q, K, or A." Then *P*(B) = 20/52 = 5/13.

Let C = "the card drawn is a diamond." Then *P*(C) = 13/52 = 1/4.

- Are A and B independent?
**solution:** *P*(A|B) = *P*(the card drawn is an ace | the card is a 10, J, Q, K, or A) = 4/20 = 1/5 (there are 20 cards to consider, 4 of which are aces). Since *P*(A) = 1/13, knowledge of B has changed what we know about A. That is, in this case, *P*(A) ≠ *P*(A|B), so events A and B are *not* independent.

- Are A and C independent?
**solution:** *P*(A|C) = *P*(the card drawn is an ace | the card drawn is a diamond) = 1/13 (there are 13 diamonds, one of which is an ace). So, in this case, *P*(A) = *P*(A|C), so that the events "the card drawn is an ace" and "the card drawn is a diamond" are independent.
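Both independence checks can be done by building the deck explicitly and comparing the unconditional and conditional probabilities:

```python
from fractions import Fraction
from itertools import product

# Build a 52-card deck and test independence by comparing
# P(A) with P(A | C) for A = "ace", C = "diamond".
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "diamonds", "clubs"]
deck = list(product(ranks, suits))

aces = [c for c in deck if c[0] == "A"]
diamonds = [c for c in deck if c[1] == "diamonds"]
ace_diamonds = [c for c in deck if c[0] == "A" and c[1] == "diamonds"]

p_a = Fraction(len(aces), len(deck))                      # 1/13
p_a_given_c = Fraction(len(ace_diamonds), len(diamonds))  # 1/13
print(p_a == p_a_given_c)  # True: A and C are independent
```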

### Probability of "A and B" or "A or B"

**The Addition Rule:** *P*(A or B) = *P*(A) + *P*(B) – *P*(A and B).

Special case of *The Addition Rule*: If A and B are *mutually exclusive*,

*P*(A and B) = 0, so *P*(A or B) = *P*(A) + *P*(B).

**The Multiplication Rule**: *P*(A and B) = *P*(A) · *P*(B|A).

Special case of *The Multiplication Rule*: If A and B are *independent*,

*P*(B|A) = *P*(B), so *P*(A and B) = *P*(A) · *P*(B).
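The addition rule can be verified numerically on the two-dice events used earlier:

```python
from fractions import Fraction
from itertools import product

# Check the addition rule, P(A or B) = P(A) + P(B) - P(A and B),
# on the earlier events A = "one die shows a 3" and
# B = "the sum is greater than 4".
outcomes = set(product(range(1, 7), repeat=2))
a = {o for o in outcomes if 3 in o}
b = {o for o in outcomes if sum(o) > 4}
n = len(outcomes)
lhs = Fraction(len(a | b), n)
rhs = Fraction(len(a), n) + Fraction(len(b), n) - Fraction(len(a & b), n)
print(lhs == rhs)  # True
```

Subtracting *P*(A and B) corrects for the outcomes counted twice, once in each event.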

**example:** Let A and B be two mutually exclusive events with *P*(A) = 0.3 and *P*(B) = 0.25. Find *P*(A or B).

**solution:** *P*(A or B) = 0.3 + 0.25 = 0.55.

**example:** A basketball player has a 0.6 probability of making a free throw. What is his probability of making two consecutive free throws if

- he gets very nervous after making the first shot and his probability of making the second shot drops to 0.4.
**solution:** *P*(making the first shot) = 0.6, *P*(making the second shot | he made the first) = 0.4. So, *P*(making both shots) = (0.6)(0.4) = 0.24.

- the events "he makes his first shot" and "he makes the succeeding shot" are independent.

**solution:** Since the events are independent, his probability of making each shot is the same. Thus, *P*(he makes both shots) = (0.6)(0.6) = 0.36.
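Both versions of the free-throw example are one-line applications of the multiplication rule:

```python
# Multiplication rule, P(A and B) = P(A) * P(B | A), for both
# versions of the free-throw example.
p_first = 0.6
p_second_given_first = 0.4  # nervous case: probability drops after a make
p_both_dependent = p_first * p_second_given_first
p_both_independent = p_first * p_first  # independent case: P(B | A) = P(B) = 0.6
print(round(p_both_dependent, 2), round(p_both_independent, 2))  # 0.24 0.36
```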
