
# Evaluating Results: Statistics, Probability, and Proof (page 3)

By John Wiley & Sons, Inc.
Updated on Jan 1, 2011

### Can You Prove It?

Let's say that Alice is doing an experiment as her science project in which she has planted popcorn seeds in two planters to test the value of a fertilizer. She uses the controlled experiment design.

To the experimental group she adds a chemical fertilizer, urea, a nitrogen compound that may be put into the soil or dissolved in the water given to the plants. Her independent variable is the addition of the urea to the experimental group. Her dependent variable, if she observes one, is the difference in growth rate (height or weight) of the plants in her two planters.

At a proper time in her experiment, she measures the heights of the plants with the following results:

We see that there is a difference between the means (commonly called averages) of the two groups. The difference is 1.9 cm in favor of the experimental group: the average height of the plants in that group is 1.9 cm greater than the average height of the plants in the control group. This looks good.

"See!" Alice says. "Adding urea to the experimental planting has made the corn grow faster." Can she be sure of this? No, she cannot. Maybe it was a chance happening that she got five taller-growing plants in the experimental group and five shorter-growing plants in the control group. She should not make any decision just yet. She should have someone carry out a sound statistical analysis (unless she can do it herself) that goes beyond comparing the mean heights of the two groups.

A statistical analysis would show how much the heights vary among themselves. Then it would show how the means compare with a larger "population" of plants like Ken's 100 plants. Where would this larger population be found? It would be imagined, inferred, or hypothetical: it would be created out of the variability, the range, the scatter of her sample and the size of the sample. It would be created by the use of equations in statistics books.

Furthermore, a judgment would be made about the chance, or the probability, that the difference Alice found was or was not simply a chance difference. This, too, would be done by reference to appropriate tables in statistics books. Actually, the number of plants in Alice's experiment is too small (only five in each group) to make it worth all of that analysis. Yet her results are supported by agricultural research by professional scientists and by the experiences of the thousands of farmers who have found it useful to apply urea and other nitrogen compounds to their corn plantings.
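One way to make the idea of a "chance difference" concrete is an exact permutation test: pool all ten heights, try every possible way of splitting them into two groups of five, and count how often a split gives a difference in means at least as large as the one observed. The sketch below is only an illustration. The heights are hypothetical, chosen so that the means differ by 1.9 cm; they are not Alice's measurements, and the function name `permutation_p_value` is made up for this example. A permutation test is also a different technique from looking up values in the printed tables mentioned above.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(experimental, control):
    """One-sided exact permutation test on the difference of group means.

    Returns the observed difference and the fraction of all possible
    regroupings whose difference is at least as large as the observed one.
    """
    observed = mean(experimental) - mean(control)
    pooled = list(experimental) + list(control)
    group_size = len(experimental)
    indices = range(len(pooled))

    at_least_as_large = 0
    total = 0
    for chosen in combinations(indices, group_size):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        # Small tolerance so floating-point rounding does not drop exact ties.
        if mean(group_a) - mean(group_b) >= observed - 1e-9:
            at_least_as_large += 1
        total += 1
    return observed, at_least_as_large / total

# HYPOTHETICAL heights in cm (assumed for illustration, not Alice's data).
experimental_heights = [22.0, 23.5, 21.0, 24.0, 22.0]
control_heights = [20.0, 21.5, 19.0, 22.0, 20.5]

observed_difference, p_value = permutation_p_value(experimental_heights, control_heights)
print(f"Observed difference in means: {observed_difference:.1f} cm")
print(f"Share of regroupings with a difference at least that large: {p_value:.3f}")
```

The smaller that share, the harder it is to explain the difference as the luck of which plants landed in which planter. With only five plants per group there are just 252 possible regroupings, which is one way to see why so small a sample supports only weak conclusions.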

With all of that support, why wouldn't scientists declare that they have proven the value of this treatment of corn? The problem lies partly in this question: How can you know when you have proven a thing to be true? And it lies partly in the way the words "prove" and "true" are used in mathematics and logic as compared to the way they are used in ordinary speech.

First, the mathematics and logic. You and I can agree that this is a true statement in arithmetic: 148 + 293 + 167 = 608. That is, we follow certain rules of mathematics to prove whether the statement is an equality. Mathematicians would not agree, however, that we had proven it by following the rules of addition. They are more concerned about the sources of those rules. In the end, they would show that the statement was proven by agreeing on certain things about arithmetic and its rules.

In logic of the formal sort, proof would be much the same, as in this example:

If all wangtups have gitly speekrongs,
And if Q is a wangtup,
Then Q has gitly speekrongs.

Even though the statements do not mean anything in real life, if we accept the first and second statements as true, then the conclusion, the third statement, is also true. The "proof" is all right there in the statement. It has nothing to do with real people or things and their mixed-up ways.
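The same argument can be written in a proof assistant, which checks the conclusion from the two premises without knowing what any of the words mean. The following is a minimal sketch in Lean 4; the names `Thing`, `Wangtup`, `HasGitlySpeekrongs`, and `q` are placeholders invented for this illustration.

```lean
-- Placeholder names only: nothing here refers to real objects.
example (Thing : Type) (Wangtup HasGitlySpeekrongs : Thing → Prop)
    (h1 : ∀ x, Wangtup x → HasGitlySpeekrongs x)  -- all wangtups have gitly speekrongs
    (q : Thing) (h2 : Wangtup q) :                 -- Q is a wangtup
    HasGitlySpeekrongs q :=                        -- then Q has gitly speekrongs
  h1 q h2
```

The proof term `h1 q h2` simply applies the first premise to Q and to the fact that Q is a wangtup, which is all the argument contains.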

Still, these simple examples do not do justice to mathematics and logic. Both are fascinating and powerful tools of thought or reasoning that humankind has created. The proof or truth of these examples, however, is so very much different from the kinds of proof that scientists are seeking that it becomes awkward to try to use the same language to describe them all. Even though mathematicians and logicians got there first with the terms "prove" and "true," scientists in recent times have pulled away from using these terms.

In ordinary experience, too, there is a problem with these key words. Most people would say, "See, Alice proved it! It is true that urea makes corn grow faster." Or they might say, "That proves it! Hocus is better for a headache than Pocus," even though they may have used the medication only one time and their test has serious weaknesses. Or, again: "That proves it! Dreams do foretell the future. I knew that you were coming because I dreamed about it!"

These difficulties with the language, however, do not provide the main objection to the use of "prove" in scientific work. When we talk about "proving" something in science we are, in effect, predicting the future as well as examining the present. How much can we depend on something happening in the future just because today's scientific findings show it to be probable now?

In Alice's experiment, for example, she used only five plants in each planter. Such a small sample cannot tell us much about the larger population of future corn plantings, no matter how much statistical analysis we apply to it. However, let's do some more analysis of Alice's results to see how this helps us to learn about the predictive value of her findings. Let's rearrange the measurements of the corn plants according to height (see table 13.2).

Does this tell us more than a simple comparison of the means? Suppose her results in the experimental group had been as in table 13.3 (also ranked by height).

Here we see that the difference between the means of the two groups is the same as in table 13.2. But notice the range of heights in table 13.3. The experimental plants are not as uniformly taller than the control plants as they were in table 13.2. There is more variability. These results would provide a less reliable basis for predicting about future plantings.
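The contrast between two sets of results with the same mean but different scatter can be sketched in a few lines of Python. The heights below are hypothetical stand-ins, assumed only for illustration; they are not the values of tables 13.2 and 13.3. The helper names `summarize` and `ranked_pair_differences` are likewise made up for this example.

```python
from statistics import mean, stdev

def summarize(heights):
    """Return the mean, range, and sample standard deviation of a list of heights."""
    return {
        "mean": mean(heights),
        "range": max(heights) - min(heights),
        "stdev": stdev(heights),
    }

def ranked_pair_differences(experimental, control):
    """Rank both groups by height and subtract control from experimental, pair by pair."""
    return [e - c for e, c in zip(sorted(experimental), sorted(control))]

# HYPOTHETICAL heights in cm (assumed for illustration, not the book's tables).
control = [19.0, 20.0, 20.5, 21.5, 22.0]
experimental_uniform = [21.0, 22.0, 22.0, 23.5, 24.0]    # tightly grouped
experimental_scattered = [17.0, 19.5, 22.5, 25.5, 28.0]  # same mean, wider scatter

for name, group in [("uniform", experimental_uniform),
                    ("scattered", experimental_scattered)]:
    stats = summarize(group)
    print(f"{name}: mean={stats['mean']:.1f} "
          f"range={stats['range']:.1f} stdev={stats['stdev']:.2f}")
    print("  ranked pair differences:", ranked_pair_differences(group, control))
```

Both hypothetical experimental groups have the same mean, so the difference from the control mean is identical. In the tightly grouped set every ranked experimental plant is taller than its control partner, while in the scattered set the shortest experimental plants fall below theirs.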

I hope that you begin to agree, if you had not already known, that statistical treatment of data can reveal useful information. Finding the means and their difference is statistical analysis. Ranking the heights and comparing the pairs of plants is statistical analysis. These two ways of analyzing data are very elementary (even antiquated) when compared with the methods used by people with more mathematical and statistical knowledge.

### Replicating and Expanding on Experiments

How could Alice "prove" more, besides just making statistical analyses of her data? She could replicate the experiment. This would raise the predictive power of her test if results were as good as the first test or better, even though it would still not finally prove anything. We must accept this because there is always uncertainty about the future. Some things are more highly probable than others, of course. We are all fairly sure that the sun will come up tomorrow, while we may not be so sure that another planting of corn, treated as Alice's was treated, will turn out the same. So we are always dealing in probabilities.

Scientists like to show that their findings allow them to predict, or generalize, in another way than in the simple replication of an experiment or other investigation. Alice could expand her research in several ways:

Plan A: One experimental level of urea, applied in water (Alice's first plan)

| Planter | Description |
|---|---|
| 1 | Control: no urea |
| 2 | Experimental: 2 g (grams) per liter of water used to water the plantings |

Plan B: Three experimental levels of urea, applied in water

| Planter | Description |
|---|---|
| 1 | Control: no urea |
| 2 | Experimental: 2 g urea per liter of water |
| 3 | Experimental: 4 g urea per liter of water |
| 4 | Experimental: 6 g urea per liter of water |

Plan C: Three experimental levels of urea, applied in soil

| Planter | Description |
|---|---|
| 1 | Control: no urea |
| 2 | Experimental: 10 g urea mixed in the soil |
| 3 | Experimental: 20 g urea mixed in the soil |
| 4 | Experimental: 30 g urea mixed in the soil |

If she were to test both variables (two ways of applying urea and three different levels of urea) at the same time, Alice would need an arrangement of planters (or outdoor plots) as in figure 13.3.
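A combined design of this kind can be enumerated mechanically by crossing the two application methods with the three urea levels. The sketch below rests on several assumptions: the labels "low", "medium", and "high" are invented for this example, the first soil level is read as 10 g, and a single shared control planter is assumed. The actual arrangement in figure 13.3 may differ.

```python
from itertools import product

METHODS = ["water", "soil"]
LEVELS = ["low", "medium", "high"]  # hypothetical labels for the three doses

# Doses taken from Plans B and C; the first soil level is assumed to be 10 g.
DOSES = {
    ("water", "low"): "2 g urea per liter of water",
    ("water", "medium"): "4 g urea per liter of water",
    ("water", "high"): "6 g urea per liter of water",
    ("soil", "low"): "10 g urea mixed in the soil",
    ("soil", "medium"): "20 g urea mixed in the soil",
    ("soil", "high"): "30 g urea mixed in the soil",
}

def build_design():
    """List one planter per method-and-level combination, plus one control."""
    planters = ["control: no urea"]  # assumption: a single shared control
    for method, level in product(METHODS, LEVELS):
        planters.append(f"{method}, {level}: {DOSES[(method, level)]}")
    return planters

design = build_design()
for number, description in enumerate(design, start=1):
    print(number, description)
print("Planters needed:", len(design))
```

Under these assumptions, two methods crossed with three levels give six treated planters, so seven planters in all once the control is counted. Adding another variable, such as a second variety of corn, would multiply the number of treated planters again.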

You may be interested in figuring out how many different experiments would be needed to test each of these plans one at a time against a control and against each different level of urea. Also consider that there are other nitrogen compounds that should be compared with urea; each should be applied in different amounts. Then there are other kinds of soil, other varieties of corn, other planting methods, other methods of applying the fertilizer, and other chemicals that may be as important as nitrogen for promoting healthy growth in the corn. Many of these variables would best be tested in combination with certain other variables. Therefore, the designs in some cases would be more complicated than in the above Plan C. For the most significant results, most of the experiments would be conducted all the way through to the mature stage of the crop. Therefore, the testing would need to be done outdoors in plots of land large enough to accommodate farm equipment.

Surely under these expanded conditions there would be enough "population" to make the results prove something! Well, perhaps not surely, but more so. And yet these methods would create other problems. Rarely would individual plants be measured in order to determine results. Instead, more gross measures, such as weighing the grain from each plot or weighing the grain and other plant matter, would be used. This would increase our confidence in the results in that they would not be affected so much by variations among individual plants as in Alice's small groups.

Nevertheless, the different plots might vary as to quality of soil, drainage conditions, and the like, and so scientists have found that each "treatment" must be used over several smaller plots that are spread in a randomized pattern around an entire field. For example, instead of two larger plots, one experimental and one control, the experimental plot is divided into five smaller experimental plots (each given the same treatment) and the control plot is divided into five smaller control plots. These plots are distributed randomly throughout the entire field. As a consequence, we find that we are dealing with a small number of things (five plots) as in Alice's experiment with five corn plants. While this gives important improvements to the overall plan, it still shows somewhat the same problem of a small sample (small number of plots). As a consequence, the statistical treatment for such a study must be highly developed if you are to squeeze the most meaning out of the results.
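The random distribution of five experimental and five control plots across a field can be sketched as a simple shuffle. This is only an illustration: the function name `randomized_layout`, the two-rows-of-five field shape, and the seed value are assumptions made for the example, and real field trials often use more structured schemes than a plain shuffle.

```python
import random

def randomized_layout(treatments, plots_per_treatment, seed=None):
    """Return a shuffled list of plot labels, one entry per plot in the field."""
    labels = [t for t in treatments for _ in range(plots_per_treatment)]
    rng = random.Random(seed)  # a fixed seed makes the layout reproducible
    rng.shuffle(labels)
    return labels

layout = randomized_layout(["experimental", "control"], 5, seed=13)

# Show the field as two rows of five plots (an assumed shape).
for row_start in range(0, len(layout), 5):
    print(" | ".join(layout[row_start:row_start + 5]))
```

Each treatment still gets exactly five plots, but where those plots sit is left to chance, so a patch of richer soil or poorer drainage is not systematically tied to one treatment.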

All said and done, there is still uncertainty about the evaluation of results, as there is elsewhere in scientific method. We must not be disheartened about this uncertainty, however. Unfortunately, many people have been oversold on science and its powers for finding out the "truth" about things. Others have shown disappointment over the way science has not been able to solve more problems. It is important to understand that scientific methods are the best that have been found so far for learning about many things, and that they are superior to ordinary, everyday, "commonsense" methods. That's why scientific methods are called "scientific": they are better than unscientific methods. Yet, by comparison, humankind has been working with scientific methods only a short time. Not all kinds of human problems can be solved by using scientific kinds of knowledge, but those problems that might be solved by scientific methods seem to be limitless. Nevertheless, in spite of the uncertainty of science and the limited speed with which scientists can move into new areas, we must use scientific methods to find out all we can about the world and the people and things in it. Even with its uncertainty, science is still the best we have for trying to resolve many of the problems of humankind.