Range of Data
The range of a sample (set of data) is simply the maximum possible difference in the data, i.e. the difference between the maximum and the minimum values. A more exact term for it is "range width", and it is usually denoted by the letter R or w. The two individual values (the maximum and the minimum) are called the "range limits". These terms are often confused, and students should be careful to use the correct terminology.
For example, in a sample with values 2, 3, 5, 7, 8, 11, 12, the range is 10 (12 - 2 = 10) and the range limits are 2 and 12.
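As a minimal sketch of the definition given above (the range width as the difference between the maximum and minimum values), the calculation might be written in Python as follows. The function names range_limits and range_width are illustrative choices, not standard library functions.

```python
def range_limits(data):
    """Return the two range limits (minimum, maximum) of a sample."""
    return min(data), max(data)


def range_width(data):
    """Return the range width: the maximum value minus the minimum value."""
    low, high = range_limits(data)
    return high - low


sample = [2, 3, 5, 7, 8, 11, 12]
limits = range_limits(sample)  # the two extreme values of the sample
width = range_width(sample)    # the difference between those two values
print("range limits:", limits)
print("range width:", width)
```

Keeping the limits and the width as separate functions mirrors the distinction in terminology drawn above.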
The range is the simplest and most easily understood measure of the dispersion (spread) of a set of data, and though it is very widely used in everyday life, it is too rough for serious statistical work. It is not a "robust" measure, because clearly the chance of finding the maximum and minimum values in a population depends greatly on the size of the sample we choose to take from it and so its value is likely to vary widely from one sample to another. Furthermore, it is not a satisfactory descriptor of the data because it depends on only two items in the sample and overlooks all the rest. A far better measure of dispersion is the standard deviation (s), which takes into account all the data. It is not only more robust and "efficient" than the range, but is also amenable to far greater statistical manipulation. Nevertheless the range is still much used in simple descriptions of data and also in quality control charts.
The mean range of a set of data is, however, quite an efficient measure (statistic) and can be used as an easy way to estimate s. What we do in such cases is to subdivide the data into groups of a few members each, calculate the average of the group ranges, and divide it by a factor (from tables), which depends on n, the number of members in each group. In chemical laboratories, for example, it is very common to analyse samples in duplicate, and so such laboratories have a large source of ready data from which to estimate s.
(The factor k to use is given under standard deviation.)
For example: if we have a sample of size 40, we can divide it into 10 sub-samples of n=4 each. If we then find their mean range to be, say, 3.1, the standard deviation of the parent sample of 40 items is approximately 3.1/2.059 = 1.506.
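The procedure in this example could be sketched in Python roughly as follows. The factor 2.059 for n=4 is the one used in the text; the other entries in the table are the values commonly tabulated for this purpose (often labelled d2 in quality control references) and are included here as an assumption for illustration. The names FACTOR, mean_range and estimate_sd are hypothetical.

```python
# Divisors for converting a mean range into an estimate of s,
# keyed by sub-sample size n. Only the n=4 value appears in the text;
# the rest are commonly tabulated values, assumed here for illustration.
FACTOR = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326, 6: 2.534}


def mean_range(data, n):
    """Split data into consecutive sub-samples of size n and average their ranges."""
    if len(data) == 0 or len(data) % n != 0:
        raise ValueError("data length must be a positive multiple of n")
    groups = [data[i:i + n] for i in range(0, len(data), n)]
    ranges = [max(g) - min(g) for g in groups]
    return sum(ranges) / len(ranges)


def estimate_sd(mean_r, n):
    """Estimate the standard deviation from a mean range and sub-sample size n."""
    if n not in FACTOR:
        raise ValueError("no factor tabulated here for this sub-sample size")
    return mean_r / FACTOR[n]


# The worked example from the text: mean range 3.1 with sub-samples of n=4.
s_est = estimate_sd(3.1, 4)
print(round(s_est, 3))  # 1.506

# Duplicate analyses (n=2): ranges are 0.2 and 0.4, so the mean range is about 0.3.
duplicates = [10.1, 10.3, 9.8, 10.2]
print(estimate_sd(mean_range(duplicates, 2), 2))
```

The result is only an estimate of s, and it assumes the sub-samples are drawn from roughly normally distributed data, which is the setting in which such tables are usually derived.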
With simple electronic calculators now available, which can calculate s directly at the touch of a key, there is no longer much need for such expedients, though students of statistics should be familiar with them.