Statistics Definitions Help (page 2)
Truncation
Truncation is a method of approximating numbers written as decimal expansions. It deletes all the digits to the right of a chosen position in the decimal part of the number and leaves the remaining digits unchanged. Some electronic calculators use truncation to fit numbers within their displays. For example, the number 3.830175692803 can be shortened one digit at a time, giving 3.83017569280, then 3.8301756928, then 3.830175692, and so on down to 3.
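The digit-by-digit truncation described above can be reproduced with a short Python sketch. It relies on the standard `decimal` module so that the digits are handled exactly; the function name `truncate_steps` is made up for this illustration and does not come from the original text.

```python
from decimal import Decimal, ROUND_DOWN

def truncate_steps(text):
    """Return the successive truncations of a decimal string, one digit at a time."""
    value = Decimal(text)
    places = -value.as_tuple().exponent  # digits after the decimal point
    steps = []
    for p in range(places - 1, -1, -1):
        # ROUND_DOWN drops the last digit and leaves the others unchanged.
        value = value.quantize(Decimal(1).scaleb(-p), rounding=ROUND_DOWN)
        steps.append(str(value))
    return steps

for step in truncate_steps("3.830175692803"):
    print(step)
```

The first line printed is 3.83017569280 and the last is 3.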
Rounding
Rounding is the preferred method of approximating numbers written as decimal expansions. When a digit (call it r) is deleted at the right-hand end of a number, the digit q to its left is left unchanged if 0 ≤ r ≤ 4. If 5 ≤ r ≤ 9, then q is increased by 1 ("rounded up"), with a carry into the next digit to the left if q was 9. After the deletion, q is the new rightmost digit, so the process can be repeated. Most electronic calculators use rounding rather than truncation. If rounding is used, the number 3.830175692803 can be shortened one digit at a time, giving 3.83017569280, then 3.8301756928, then 3.830175693, and so on up to 4.
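The rounding rule above (leave q alone when r is 0 to 4, raise it when r is 5 to 9) matches the `ROUND_HALF_UP` mode of Python's standard `decimal` module. The following is a minimal sketch under that assumption; the function name `round_steps` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_steps(text):
    """Return the successive roundings of a decimal string, one digit at a time."""
    value = Decimal(text)
    places = -value.as_tuple().exponent  # digits after the decimal point
    steps = []
    for p in range(places - 1, -1, -1):
        # Each step rounds the previous result, as in the text,
        # rather than rounding the original number directly.
        value = value.quantize(Decimal(1).scaleb(-p), rounding=ROUND_HALF_UP)
        steps.append(str(value))
    return steps

for step in round_steps("3.830175692803"):
    print(step)
```

Rounding one digit at a time does not always agree with rounding once to the final position. For instance, 0.445 becomes 0.45 and then 0.5 when shortened in steps, whereas rounding it directly to one decimal place gives 0.4.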
Cumulative Absolute Frequency
When data are tabulated, the absolute frequencies are often shown in one or more columns. Look at Table 2-5, for example. This shows the results of the tosses of the blue die in the experiment we looked at a while ago. The first column shows the number on the die face. The second column shows the absolute frequency for each face, or the number of times each face turned up during the experiment. The third column shows the cumulative absolute frequency, which is the sum of all the absolute frequency values in table cells at or above the given position.
The cumulative absolute frequency numbers in a table never decrease as you go down the column, and they increase at every row whose absolute frequency is greater than zero. The final cumulative absolute frequency, at the bottom of the column, should be equal to the sum of all the individual absolute frequency numbers. In this instance, it is 6000, the number of times the blue die was tossed.
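A running total of this kind is easy to sketch in Python. The counts below are invented purely for illustration (they are not the values from Table 2-5); they were chosen only so that they add up to 6000 tosses.

```python
from itertools import accumulate

# Hypothetical absolute frequencies for die faces 1..6 (NOT the Table 2-5 data).
faces = [1, 2, 3, 4, 5, 6]
absolute = [1000, 1005, 990, 1010, 995, 1000]

# Each entry is the sum of the absolute frequencies at or above that row.
cumulative = list(accumulate(absolute))

for face, freq, cum in zip(faces, absolute, cumulative):
    print(f"{face}  {freq:5d}  {cum:5d}")

# Checksum: the last cumulative value equals the total number of tosses.
assert cumulative[-1] == sum(absolute) == 6000
```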
Cumulative Relative Frequency
Relative frequency values can be added up down the columns of a table, in exactly the same way as the absolute frequency values are added up. When this is done, the resulting values, usually expressed as percentages, show the cumulative relative frequency.
Examine Table 2-6. This is a more detailed analysis of what happened with the blue die in the above-mentioned experiment. The first, second, and fourth columns in Table 2-6 are identical with the first, second, and third columns in Table 2-5. The third column in Table 2-6 shows the percentage represented by each absolute frequency number. These percentages are obtained by dividing the number in the second column by 6000, the total number of tosses, and multiplying the result by 100. The fifth column shows the cumulative relative frequency, which is the sum of all the relative frequency values in table cells at or above the given position.
The cumulative relative frequency percentages in a table, like the cumulative absolute frequency numbers, never decrease as you go down the column. The final cumulative relative frequency, at the bottom of the column, should be equal to 100%, apart from small discrepancies caused by rounding the individual percentages. In this sense, the cumulative relative frequency column in a table can serve as a checksum, helping to ensure that the entries have been tabulated correctly.
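The checksum idea can be sketched in Python as follows. The counts are again hypothetical (not the Table 2-6 data). As a design choice, the cumulative percentages are computed from the cumulative counts rather than by adding up already-rounded percentages, which keeps the final value at exactly 100.

```python
from itertools import accumulate

# Hypothetical absolute frequencies for die faces 1..6 (NOT the Table 2-6 data).
absolute = [1000, 1005, 990, 1010, 995, 1000]
total = sum(absolute)  # 6000 tosses

# Relative frequency of each face, as a percentage of all tosses.
relative = [100 * n / total for n in absolute]

# Cumulative relative frequency, derived from the cumulative counts.
cumulative_relative = [100 * c / total for c in accumulate(absolute)]

for rel, cum in zip(relative, cumulative_relative):
    print(f"{rel:6.2f}%  {cum:6.2f}%")

# Checksum: the bottom entry should be 100%.
assert cumulative_relative[-1] == 100.0
```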
Mean
The mean for a discrete variable in a distribution is the arithmetic average of all the values. If the variable is considered over the entire population, the average is called the population mean. If the variable is considered over a particular sample of a population, the average is called the sample mean. There can be only one population mean for a population, but there can be many different sample means. The mean is often denoted by the lowercase Greek letter mu, in italics (μ). Sometimes it is also denoted by an italicized lowercase English letter, usually x, with a bar (vinculum) over it.
Table 2-7 shows the results of a 10-question test, given to a class of 100 students. As you can see, every possible score is accounted for. There are some people who answered all 10 questions correctly; there are some who did not get a single answer right. In order to determine the mean score for the whole class on this test – that is, the population mean, called μp – we must add up the scores of each and every student, and then divide by 100. First, let's sum the products of the numbers in the first and second columns. This will give us 100 times the population mean:
(10 × 5) + (9 × 6) + (8 × 19) + (7 × 17) + (6 × 18) + (5 × 11) + (4 × 6) + (3 × 4) + (2 × 4) + (1 × 7) + (0 × 3)
= 50 + 54 + 152 + 119 + 108 + 55 + 24 + 12 + 8 + 7 + 0
= 589
Dividing this by 100, the total number of test scores (one for each student who turns in a paper), we obtain μp = 589/100 = 5.89.
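The same calculation can be checked with a few lines of Python. The score counts are the ones that appear in the sum of products above; the variable names are chosen for this sketch only.

```python
# Score -> number of students with that score, as used in the sum of products.
frequencies = {10: 5, 9: 6, 8: 19, 7: 17, 6: 18, 5: 11,
               4: 6, 3: 4, 2: 4, 1: 7, 0: 3}

number_of_students = sum(frequencies.values())
# Sum of (score x frequency) over all possible scores.
sum_of_products = sum(score * count for score, count in frequencies.items())
population_mean = sum_of_products / number_of_students

print(number_of_students, sum_of_products, population_mean)
```

This prints 100, 589, and 5.89, in agreement with the hand calculation.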
The teacher in this class has assigned a letter grade to each score. Students who scored 9 or 10 correct received grades of A; students who got scores of 7 or 8 received grades of B; those who got scores of 5 or 6 got grades of C; those who got scores of 3 or 4 got grades of D; those who got fewer than 3 correct answers received grades of F. The assignment of grades, informally known as the "curve," is a matter of teacher temperament and doubtless would seem arbitrary to the students who took this test. (Some people might consider the "curve" in this case to be overly lenient, while a few might think it is too severe.)
Median
If the number of elements in a distribution is even, then the median is the value such that half the elements have values greater than or equal to it, and half the elements have values less than or equal to it. If the number of elements is odd, then the median is the value of the middle element, so that the number of elements having values greater than or equal to it is the same as the number of elements having values less than or equal to it. The word "median" is synonymous with "middle."
Table 2-8 shows the results of the 10-question test described above, but the third column shows the cumulative absolute frequency rather than the letter grades. The tally is begun with the top-scoring papers and proceeds in order downward. (It could just as well be done the other way, starting with the lowest-scoring papers and proceeding upward.) When the scores of all 100 individual papers are tallied this way, so they are in order, the scores of the 50th and 51st papers (the two in the middle) are both found to be 6 correct. Thus, the median score is 6, because half the students scored 6 or above, and the other half scored 6 or below.
It's possible that in another group of 100 students taking this same test, the 50th paper would have a score of 6 while the 51st paper would have a score of 5. When two middle values "compete" in this way, the median is equal to their average. In this case it would be midway between 5 and 6, or 5.5.
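The tally just described can be imitated in Python by listing all 100 papers from the highest score down and looking at the two middle ones. This is a sketch using the score counts from the mean calculation; `statistics.median` is the standard-library function, and the other names are invented for the example.

```python
import statistics

# Score -> number of students with that score (same counts as in the mean calculation).
frequencies = {10: 5, 9: 6, 8: 19, 7: 17, 6: 18, 5: 11,
               4: 6, 3: 4, 2: 4, 1: 7, 0: 3}

# One entry per paper, ordered from the top score downward.
papers = sorted(
    (score for score, count in frequencies.items() for _ in range(count)),
    reverse=True,
)

# The 50th and 51st papers sit at indexes 49 and 50 (counting from 0).
middle_pair = (papers[49], papers[50])
median = sum(middle_pair) / 2

print(middle_pair, median, statistics.median(papers))

# When the two middle papers differ, the median is their average.
print(statistics.median([6, 5]))
```

Here both middle papers score 6, so the median is 6; the last line shows the "competing" case of 6 and 5, which gives 5.5.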
Mode
The mode for a discrete variable is the value that occurs most often. In the test whose results are shown in Table 2-7, the most "popular," or most frequently occurring, score is 8 correct answers. There were 19 papers with this score, and no other score occurred that many times. Therefore, the mode in this case is 8.
Suppose that another group of students took this test, and two scores tied for the highest frequency. For example, suppose 16 students got 8 answers right, 16 students got 6 answers right, and every other score occurred fewer than 16 times. In this case there are two modes: 6 and 8. This sort of distribution is called a bimodal distribution.
Now imagine there are only 99 students in a class, and there are exactly 9 students who get each of the 11 possible scores (from 0 to 10 correct answers). In this distribution, there is no mode. Or, we might say, the mode is not defined.
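The three situations above (one mode, two modes, no mode) can be sketched with a small Python helper. The function name `modes` is hypothetical, and the bimodal table is invented to match the description of 16 papers each at scores 6 and 8; only the first table uses the counts from the mean calculation. Following the convention in the text, the helper reports no mode when every score occurs equally often.

```python
def modes(frequencies):
    """Return the most frequent scores, or [] if every score is equally frequent."""
    highest = max(frequencies.values())
    if len(frequencies) > 1 and all(c == highest for c in frequencies.values()):
        return []  # flat distribution: the mode is not defined
    return sorted(score for score, c in frequencies.items() if c == highest)

# Counts used in the mean calculation.
original = {10: 5, 9: 6, 8: 19, 7: 17, 6: 18, 5: 11,
            4: 6, 3: 4, 2: 4, 1: 7, 0: 3}

# Hypothetical class in which scores 6 and 8 each occur 16 times.
bimodal = {10: 5, 9: 9, 8: 16, 7: 15, 6: 16, 5: 11,
           4: 8, 3: 6, 2: 4, 1: 7, 0: 3}

# 99 students, exactly 9 at each of the 11 possible scores.
flat = {score: 9 for score in range(11)}

print(modes(original), modes(bimodal), modes(flat))
```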
The mean, median, and mode are sometimes called measures of central tendency. This is because they indicate a sort of "center of gravity" for the values in a data set.