Introduction to Discrete Probability Distributions
Because some probability distributions occur frequently in practice, they have been given specific names. In this lesson, we will discuss three discrete probability distributions: the Bernoulli, the binomial, and the geometric distributions.
Bernoulli Distribution
Suppose we flip a fair coin and observe the upper face. The sample space may be represented as S = {Head, Tail}. Suppose we define X=1 if a head is on the upper face and 0 if a tail is on the upper face. X is an example of a random variable. We have used the term random variable somewhat loosely in earlier lessons. Formally, a random variable X assigns a numerical result to each possible outcome of a random experiment. If X can assume a finite or countably infinite number of values, then X is a discrete random variable; otherwise, X is a continuous random variable. In this lesson, we will consider the distributions of some discrete random variables.
The probability function P(X = x), or p(x), assigns a probability to each possible value of X. Because these are probabilities, 0 ≤ p(x) ≤ 1 for every possible value x. Further, if we sum over all possible values of X, we must get one (i.e., $\sum_x p(x) = 1$).
Bernoulli Trial
A Bernoulli trial is any random experiment that has only two possible outcomes.
A discrete probability function is any function that satisfies the following two conditions: (1) each probability is between 0 and 1, and (2) the probabilities sum to one. As an illustration, let X = –1, 0, or 1 if the stock market goes down, stays the same, or goes up, respectively, on a given day. The probabilities associated with the particular outcomes of X change from day to day. However, suppose for a given day, they are as follows:

Each probability is between 0 and 1, and the sum of the probabilities is one. Thus, this is a valid probability function. The graph of the distribution is given in Figure 10.1.
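The two conditions for a valid probability function can also be checked mechanically. The following is a minimal Python sketch; the function name `is_valid_pmf` and the probabilities stored in `market` are hypothetical, chosen only for illustration, and are not the values from the lesson's table.

```python
import math

def is_valid_pmf(pmf, tol=1e-9):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    in_range = all(0.0 <= p <= 1.0 for p in pmf.values())
    sums_to_one = math.isclose(sum(pmf.values()), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Hypothetical probabilities for X = -1, 0, 1 (illustration only).
market = {-1: 0.3, 0: 0.2, 1: 0.5}

print(is_valid_pmf(market))            # True
print(is_valid_pmf({0: 0.6, 1: 0.6}))  # False: the values sum to 1.2
```

A small tolerance is used for the sum because floating-point addition is not exact.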

For the moment, we are going to focus on studies in which each observation may result in one of two possible outcomes. Flipping a coin is one such study, because each flip will result in either a head or a tail. In an orchard, each piece of fruit either has or has not been damaged by insects. The television set tested at the factory either works or it does not. A person has a job or does not have a job. In each case, there are only two possible outcomes; one outcome may be labeled a success and the other a failure. A Bernoulli trial is any random experiment that has only two possible outcomes. For a Bernoulli trial, let X be a random variable defined as follows:

$$X = \begin{cases} 1 & \text{if the outcome is a success} \\ 0 & \text{if the outcome is a failure} \end{cases}$$

The choice of which outcome is considered a success and which is considered a failure is arbitrary. It is only important to clearly state for which outcome X = 1 and for which X = 0. The probability of success is denoted by p where 0 < p < 1. Because there are only two outcomes, the probability of a success and the probability of a failure must sum to 1. Thus, the probability of a failure is 1 – p. We can present the probability distribution of the Bernoulli random variable as shown in Table 10.2.

Table 10.2 Probability distribution of the Bernoulli random variable
x       0        1
p(x)    1 – p    p
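As a rough illustration of the Bernoulli distribution, the sketch below writes the probability function as a small Python function and simulates repeated trials. The names `bernoulli_pmf` and `bernoulli_trial`, the value p = 0.3, and the random seed are all assumptions made for this example.

```python
import random

def bernoulli_pmf(x, p):
    """P(X = x) for a Bernoulli random variable with success probability p."""
    if not 0 < p < 1:
        raise ValueError("p must satisfy 0 < p < 1")
    if x == 1:
        return p
    if x == 0:
        return 1 - p
    return 0.0  # X takes no other values

def bernoulli_trial(p, rng):
    """Simulate one trial: 1 for a success, 0 for a failure."""
    return 1 if rng.random() < p else 0

rng = random.Random(0)  # fixed seed so the run is repeatable
p = 0.3                 # hypothetical success probability
draws = [bernoulli_trial(p, rng) for _ in range(10_000)]

print(bernoulli_pmf(1, p), bernoulli_pmf(0, p))
print(sum(draws) / len(draws))  # observed proportion of successes, close to p
```

With many simulated trials, the observed proportion of successes should settle near p.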

Binomial Distribution
Binomial Distribution
If X is the number of successes in n independent Bernoulli trials, each with probability p of success, then X is a binomial random variable.
Seldom are we satisfied with performing one Bernoulli trial. Instead, we want to conduct multiple Bernoulli trials, observing the outcome of each one. Suppose we have n independent Bernoulli trials, each with the probability p of success. Let X be the number of successes observed in the n trials. Then X is a binomial random variable. The probability of x successes in n trials may be written as:

$$p(x) = \binom{n}{x} p^x (1 - p)^{n - x}, \quad x = 0, 1, 2, \ldots, n$$

In the above equation,

$$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$

is the number of ways to choose x items from n and is called the number of combinations of n things taken x at a time. (Recall that n! = n(n – 1)(n – 2) . . . (1) so that 5! = 5(4)(3)(2)(1) = 120.) We will consider some examples of the binomial distribution and then provide an explanation of why the probabilities are computed as stated here.
Suppose we randomly select 25 apples from the orchard and count how many are damaged. If one apple being damaged has no effect on whether or not the next selected apple is damaged, then the number of damaged apples would have a binomial distribution with n = 25 and p equal to the proportion of damaged apples in the orchard.
In a quality control study, we could randomly choose 50 television sets from the production line and carefully test each to determine how many are defective. If we assume that whether or not one television is defective is independent of the next television being defective, then the number of defective televisions would have a binomial distribution with n = 50 and p equal to the proportion of defective sets produced during the period of the study.
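The binomial probability function can be sketched in Python with the standard library's `math.comb`. The function name `binomial_pmf` and the proportions 0.10 (damaged apples) and 0.02 (defective televisions) are hypothetical values used only to show the calculation; the lesson does not give these proportions.

```python
from math import comb, factorial

def binomial_pmf(x, n, p):
    """P(X = x): x successes in n independent trials with success probability p."""
    if not 0 <= x <= n:
        return 0.0
    return comb(n, x) * p**x * (1 - p)**(n - x)

print(factorial(5))  # 120, as in the text
print(comb(5, 2))    # 10 ways to choose 2 items from 5

# Hypothetical orchard: n = 25 apples, assumed proportion damaged p = 0.10.
print(binomial_pmf(0, 25, 0.10))  # probability that no sampled apple is damaged

# Hypothetical factory: n = 50 sets, assumed proportion defective p = 0.02.
print(binomial_pmf(1, 50, 0.02))  # probability of exactly one defective set

# The probabilities over x = 0, ..., n sum to one (up to rounding).
total = sum(binomial_pmf(x, 25, 0.10) for x in range(26))
print(total)
```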
To understand why the probabilities associated with the binomial random variable are as given in this way, first suppose we flip a coin twice and observe the number of heads. Each flip may be considered to be a Bernoulli trial. Because the outcome on one flip does not affect the outcome on the next flip, the two trials are independent. Let X be the random variable denoting the number of heads observed on the two flips. Then X = 0, 1, or 2. In Figure 10.2, we have a tree diagram representing this experiment. Notice that each flip results in a set of two branches, one representing heads and the other tails. The tree has four terminal branches, representing the outcomes HH, HT, TH, and TT, where H represents heads and T represents tails. The random variable X assigns 2 to HH, 1 to HT, 1 to TH, and 0 to TT. Because all four outcomes are equally likely, we have the following probability distribution for X.

x       0      1      2
p(x)    1/4    1/2    1/4


Notice that because the two flips are independent, the probability of two heads is $p(2) = p^2$. Here, $p = \frac{1}{2}$, so $p(2) = \left(\frac{1}{2}\right)^2 = \frac{1}{4}$. Other outcomes can be computed in the same manner. Also note that the values of X are not equally likely. Because there are two ways to obtain one head (HT and TH) and only one way to obtain either zero heads (TT) or two heads (HH), the probability that X = 1 is twice that of X = 0 or X = 2. Thus, we have 2p(1 – p) as the probability of one head.
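The two-flip distribution can be reproduced by walking every branch of the tree and multiplying probabilities along it. This is a minimal sketch using exact fractions; the function name `two_flip_distribution` is an assumption of this example.

```python
from fractions import Fraction
from itertools import product

def two_flip_distribution(p):
    """Distribution of the number of heads in two independent flips, P(head) = p."""
    dist = {0: Fraction(0), 1: Fraction(0), 2: Fraction(0)}
    for first, second in product("HT", repeat=2):  # HH, HT, TH, TT
        prob = Fraction(1)
        for face in (first, second):
            # Independence lets us multiply the per-flip probabilities.
            prob *= p if face == "H" else 1 - p
        dist[(first + second).count("H")] += prob
    return dist

fair = two_flip_distribution(Fraction(1, 2))
for x in (0, 1, 2):
    print(x, fair[x])  # prints 0 1/4, then 1 1/2, then 2 1/4
```

Passing a different value of p shows the general pattern: the middle entry is always 2p(1 – p).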
When experiments increase in size, as would be the case if we flipped the coin 40 times, it becomes unreasonable to construct a tree diagram or to list all possible outcomes. We need a general way of counting the number of ways to get x successes in n trials. The number of ways to choose x items from n is

$$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$

and is called the number of combinations of n things taken x at a time.
For the coin flipping example, we have n = 2. For X = 0, we have $\binom{2}{0} = \frac{2!}{0!\,2!} = 1$. When X = 1, $\binom{2}{1} = \frac{2!}{1!\,1!} = 2$. When X = 2, this function is again 1. Thus, we have accurately counted the number of ways to get 0, 1, or 2 heads. The probability of any particular sequence of heads and tails with x heads is $p^x (1 - p)^{n - x}$ because we have x successes, each with probability p, and (n – x) failures, each with probability 1 – p. Thus, the probability of x successes in n trials may be written as

$$p(x) = \binom{n}{x} p^x (1 - p)^{n - x}$$

which is the probability function given earlier.
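The counting argument can be checked by brute force for small n: list every sequence of heads and tails, count those with exactly x heads, and compare with the combinations formula. This is an illustrative sketch; `count_sequences` is a hypothetical helper name.

```python
from itertools import product
from math import comb

def count_sequences(n, x):
    """Count head/tail sequences of length n with exactly x heads by listing them all."""
    return sum(1 for seq in product("HT", repeat=n) if seq.count("H") == x)

# The brute-force count agrees with the formula n! / (x! (n - x)!).
for n in (2, 5):
    for x in range(n + 1):
        assert count_sequences(n, x) == comb(n, x)

print([count_sequences(2, x) for x in range(3)])  # [1, 2, 1]
```

Listing all sequences takes 2^n steps, which is why the formula is needed for something like 40 flips.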
Example
A student knows that the test tomorrow will have ten true-false questions on it. She decides to flip a coin and to mark true if the upper face is a head and mark false if the upper face is a tail. She will repeat the process for each question. What is the probability that she will miss every question?
Solution
Because the student flips the coin to determine the response, she has a probability of $p = \frac{1}{2}$ of getting each question correct. The flips are independent, so whether she gets a question correct is independent of whether she gets any other question correct. The probability that the student does not get any question correct is then:

$$p(0) = \binom{10}{0} \left(\frac{1}{2}\right)^0 \left(1 - \frac{1}{2}\right)^{10} = \frac{1}{1024} \approx 0.00098$$
It is very likely that she will get at least some of the questions correct.
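The arithmetic for the true-false example can be verified with exact fractions. This short sketch assumes only what the example states (n = 10 questions, probability 1/2 of a correct answer on each); the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

n = 10              # number of true-false questions
p = Fraction(1, 2)  # probability of a correct answer on each question

# Binomial probability of x = 0 correct answers.
p_none = comb(n, 0) * p**0 * (1 - p)**n
p_at_least_one = 1 - p_none

print(p_none, float(p_none))  # 1/1024 0.0009765625
print(p_at_least_one)         # 1023/1024
```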
Example
In a study, dogs were trained to detect the presence of bladder cancer by smelling urine (see USA Today, September 24, 2004). During training, each dog was presented with urine specimens from healthy people, those from people with bladder cancer, and those from people sick with unrelated diseases. The dog was to lie down by any urine specimen from a person with bladder cancer. After training, each dog was presented with seven urine specimens, only one of which came from a person with cancer. The specimen that the dog lay down beside was recorded. If the dog identified the urine specimen from a person with cancer, the test was considered a success; otherwise, it was a failure. Each dog repeated the test nine times. If a dog cannot detect the presence of bladder cancer by smelling the urine, what is the probability it will identify the specimen with cancer in at least eight of the trials?
Solution
At first glance, this may not seem like a series of Bernoulli trials. However, notice that for each trial, there are two possible outcomes: choosing the specimen associated with cancer or choosing a specimen of an individual without cancer. Because there are six noncancer specimens and only one with cancer, the probability of success is $p = \frac{1}{7}$ if the dog cannot detect bladder cancer in urine and so chooses one at random. The trials are independent. Thus, the probability of the dog making the correct identification in at least eight of the nine trials is:

$$P(X \ge 8) = p(8) + p(9) = \binom{9}{8} \left(\frac{1}{7}\right)^8 \left(\frac{6}{7}\right)^1 + \binom{9}{9} \left(\frac{1}{7}\right)^9 \left(\frac{6}{7}\right)^0 = \frac{55}{40{,}353{,}607} \approx 0.0000014$$
It would be very unlikely for a dog to detect the urine sample from the person with bladder cancer in at least eight of nine trials by chance alone.
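The "at least eight of nine" probability is a sum of two binomial terms, which can be computed exactly. In this sketch, `binomial_tail` is a hypothetical helper name; the values n = 9 and p = 1/7 come from the example.

```python
from fractions import Fraction
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for a binomial random variable with n trials and success probability p."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k, n + 1))

p_chance = Fraction(1, 7)  # one cancer specimen among seven, chosen at random
tail = binomial_tail(8, 9, p_chance)

print(tail)         # 55/40353607
print(float(tail))  # roughly 1.4e-06
```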
Geometric Distribution
For the binomial distribution, we had n independent Bernoulli trials, each with the probability p of success. Suppose now that we have independent Bernoulli trials, each with the probability p of success. However, let X be the number of failures prior to the first success. X is a geometric random variable, and the probability that X = x is $p(x) = (1 - p)^x p$, x = 0, 1, 2, ...
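To understand why the probabilities are computed in this manner, first note that, if p is the probability of success, (1 – p) is the probability of failure. Further, there is only one sequence of outcomes with x failures prior to the first success: the first x trials must all be failures, and the next trial must be a success. Because the trials are independent, the probability of this sequence is the product of x factors of (1 – p) and one factor of p.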
A geometric random variable is sometimes defined as the number of trials needed to obtain the first success instead of as the number of failures prior to the first success. Both definitions are valid, but the probability function differs slightly for the two, so it is important to read the definition carefully when moving from one source to the next.
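To make the difference between the two definitions concrete, the sketch below codes both probability functions side by side. The function names are hypothetical. The second function is the "number of trials" form, which shifts the exponent by one.

```python
def geometric_pmf_failures(x, p):
    """P(X = x): x failures before the first success, x = 0, 1, 2, ..."""
    return (1 - p)**x * p if x >= 0 else 0.0

def geometric_pmf_trials(k, p):
    """P(Y = k): the first success occurs on trial k, k = 1, 2, 3, ..."""
    return (1 - p)**(k - 1) * p if k >= 1 else 0.0

p = 0.5
# Two failures before the first success is the same event as
# the first success arriving on trial three.
print(geometric_pmf_failures(2, p))  # 0.125
print(geometric_pmf_trials(3, p))    # 0.125
```

The two forms describe the same events with Y = X + 1, so a value looked up under one definition must be shifted by one before it is used with the other.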
Example
If we repeatedly flip a fair coin, what is the probability that we will get the first head on the third flip?
Solution
Because we have a fair coin, the probability of a head on each flip is $p = \frac{1}{2}$. If the first head is on the third flip, we had two tails (or failures) prior to this first success. Thus, the probability that the number of failures X = 2 prior to the first success is

$$p(2) = \left(1 - \frac{1}{2}\right)^2 \left(\frac{1}{2}\right) = \frac{1}{8}$$
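A quick simulation offers a sanity check on the coin-flip answer of 1/8. This is only a sketch: the helper name `failures_before_first_success`, the seed, and the number of repetitions are arbitrary choices for this example.

```python
import random

def failures_before_first_success(p, rng):
    """Run Bernoulli trials until the first success; return the number of failures."""
    failures = 0
    while rng.random() >= p:  # a draw at or above p counts as a failure
        failures += 1
    return failures

rng = random.Random(1)  # fixed seed so the run is repeatable
trials = 100_000
hits = sum(failures_before_first_success(0.5, rng) == 2 for _ in range(trials))
estimate = hits / trials

print(estimate)  # should be close to 1/8 = 0.125
```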
Discrete Probability Distributions In Short
Three common discrete distributions are the Bernoulli, the binomial, and the geometric. The Bernoulli arises when a trial in an experiment has two possible outcomes, commonly referred to as a success and a failure. If we conduct n independent Bernoulli trials, each with the probability p of success, the number of successes is a binomial random variable. If we conduct independent Bernoulli trials, each with the probability p of success, the number of failures prior to the first success is a geometric random variable.
Find practice problems and solutions for these concepts at Discrete Probability Distributions Practice Exercises.