Statistics/Displaying Data/Histograms

From Wikibooks, open books for an open world

Histograms

[Figure: Histogram of the Michelson-Morley speed of light data]

It is often useful to look at the distribution of the data, that is, how frequently values fall into each of a set of pre-defined bins of specified widths. The choice of bins is up to you, but they should be selected to illuminate your data, not obfuscate it.

A histogram is similar to a bar chart. However, histograms are used for continuous (as opposed to discrete or qualitative) data. The defining property of a histogram is:

The area of each bar is proportional to the frequency.

If every bin has the same width, this is easily done by plotting frequency on the vertical axis, because the height of each bar is then proportional to its area. Histograms can also be drawn with unequal bin widths, in which case one plots frequency density on the vertical axis instead.

To produce a histogram with equal bin sizes:

  • Select a minimum, a maximum, and a bin size. All three of these are up to you. In the histogram data used above, the minimum is 1, the maximum is 110, and the bin size is 10.
  • Work out your bins and count how many values fall into each of them. For the histogram data, the bins are:
    • 1 ≤ x < 10, 16 values.
    • 10 ≤ x < 20, 4 values.
    • 20 ≤ x < 30, 4 values.
    • 30 ≤ x < 40, 2 values.
    • 40 ≤ x < 50, 2 values.
    • 50 ≤ x < 60, 1 value.
    • 60 ≤ x < 70, 0 values.
    • 70 ≤ x < 80, 0 values.
    • 80 ≤ x < 90, 0 values.
    • 90 ≤ x < 100, 0 values.
    • 100 ≤ x < 110, 0 values.
    • 110 ≤ x < 120, 1 value.
  • Plot the counts from the previous step as a standard bar plot.
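
The steps above can be sketched in Python. This is only a minimal illustration under stated assumptions, not the method used to produce the figure: the function name `bin_counts` and the sample values are hypothetical, the bins start at 0 so that every bin has the same width, and the counts are printed as a simple text bar plot rather than drawn with a plotting library.

```python
def bin_counts(values, minimum, bin_size, n_bins):
    """Count how many values fall into each half-open bin [lo, lo + bin_size)."""
    counts = [0] * n_bins
    for x in values:
        index = int((x - minimum) // bin_size)
        # Values outside the chosen range are ignored.
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical sample values, NOT the data set behind the figure above.
sample = [1, 3, 7, 12, 18, 25, 31, 47, 52, 110]

minimum, bin_size, n_bins = 0, 10, 12  # assumed choices; all three are up to you
counts = bin_counts(sample, minimum, bin_size, n_bins)

# Text bar plot: one row per bin, one '#' per value counted in that bin.
for i, count in enumerate(counts):
    lo = minimum + i * bin_size
    print(f"{lo:>3} <= x < {lo + bin_size:<3} | " + "#" * count)
```

Using half-open bins means a value that lies exactly on a boundary, such as 10, is counted once, in the bin that starts at that boundary.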


Frequency Density

Another way of drawing a histogram is to plot the frequency density of each class rather than its frequency.

Frequency Density
The Frequency Density is the frequency divided by the class width.

The advantage of using frequency density in a histogram is that it doesn't matter if there isn't an obvious standard width to use. For each group, you work out the frequency divided by the class width and plot that value as the height of the bar.
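
As a rough sketch of this calculation, the Python snippet below computes frequency densities for classes of unequal width. The function name `frequency_density`, the class boundaries, and the frequencies are all hypothetical values chosen for illustration. It also checks that bar area (class width times frequency density) recovers the original frequency, which is the defining property of a histogram.

```python
def frequency_density(edges, frequencies):
    """Return frequency / class width for each class.

    edges holds the class boundaries, so class i spans edges[i] to edges[i + 1].
    """
    densities = []
    for lo, hi, freq in zip(edges[:-1], edges[1:], frequencies):
        densities.append(freq / (hi - lo))
    return densities


# Hypothetical classes of unequal width and their frequencies.
edges = [0, 10, 20, 50, 100]
frequencies = [8, 6, 9, 5]

densities = frequency_density(edges, frequencies)

# Area of each bar = class width * frequency density, which equals the frequency.
areas = [d * (hi - lo) for d, lo, hi in zip(densities, edges[:-1], edges[1:])]

for lo, hi, freq, d in zip(edges[:-1], edges[1:], frequencies, densities):
    print(f"{lo} <= x < {hi}: frequency {freq}, frequency density {d:.2f}")
```

In this example the class from 20 to 50 has the largest frequency but not the tallest bar, because its nine values are spread over a width of 30.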
