Statistics/Displaying Data/Box Plots

From Wikibooks, open books for an open world

Box Plots

Figure 1. Box plot of data from the Michelson-Morley Experiment

A box plot (also called a box and whisker diagram) is a simple visual representation of key features of a univariate sample.

The box lies on a vertical axis in the range of the sample. Typically, the top of the box is placed at the third quartile and the bottom at the first quartile, so the box spans the interquartile range. The width of the box is arbitrary, as there is no x-axis (though see Violin Plots, below).

In between the top and bottom of the box is some representation of central tendency. A common version is to place a horizontal line at the median, dividing the box into two. Additionally, a star or asterisk may be placed at the mean value, centered horizontally in the box.
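The quantities that position the box and its central markers can be computed directly. The following is a minimal sketch using only the Python standard library; the function name `box_summary` and the sample values are made up for illustration, and the "inclusive" quartile method is just one of several conventions, so other software may place the box edges slightly differently.

```python
import statistics

def box_summary(sample):
    """Return the values a basic box plot displays (illustrative helper)."""
    # method="inclusive" treats the sample as containing its own minimum and
    # maximum; other quartile conventions give slightly different box edges.
    q1, median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    return {
        "q1": q1,          # bottom of the box
        "median": median,  # horizontal line inside the box
        "q3": q3,          # top of the box
        "mean": statistics.fmean(sample),  # optional star or asterisk
    }

sample = [2, 4, 4, 5, 7, 9, 10, 12, 30]  # made-up illustrative data
summary = box_summary(sample)
print(summary["q1"], summary["median"], summary["q3"])  # 4.0 7.0 10.0
```

Here the mean (about 9.2) lies above the median (7.0) because the single large value 30 pulls it upward, which is the kind of asymmetry that marking both statistics makes visible.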

Another common extension is the 'box-and-whisker' plot. This adds vertical lines (whiskers) extending from the top and bottom of the box to, for example, the maximum and minimum values, or the farthest values within 2 standard deviations above and below the mean. Alternatively, the whiskers could extend to the 2.5 and 97.5 percentiles. Finally, it is common in the box-and-whisker plot to show outliers (however defined) with asterisks at the individual values beyond the ends of the whiskers.
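The three whisker conventions mentioned above can be compared side by side. This is a rough sketch using only the Python standard library; the names `whisker_ends` and `outliers`, the rule labels, and the sample values are hypothetical, and the choice of the sample standard deviation and the "inclusive" percentile method are assumptions, not part of any fixed definition.

```python
import statistics

def whisker_ends(sample, rule="minmax"):
    """Return (low, high) whisker positions under one of three conventions."""
    if rule == "minmax":
        # Whiskers reach the extreme values of the sample.
        return min(sample), max(sample)
    if rule == "sd":
        # Whiskers reach the farthest data values that lie within
        # 2 sample standard deviations of the mean.
        mean = statistics.fmean(sample)
        sd = statistics.stdev(sample)
        inside = [x for x in sample if mean - 2 * sd <= x <= mean + 2 * sd]
        return min(inside), max(inside)
    if rule == "percentile":
        # n=40 yields cut points every 2.5%, so the first and last
        # are the 2.5 and 97.5 percentiles.
        cuts = statistics.quantiles(sample, n=40, method="inclusive")
        return cuts[0], cuts[-1]
    raise ValueError(f"unknown rule: {rule!r}")

def outliers(sample, rule="minmax"):
    """Values beyond the whisker ends, to be drawn as individual asterisks."""
    low, high = whisker_ends(sample, rule)
    return [x for x in sample if x < low or x > high]

sample = [2, 4, 4, 5, 7, 9, 10, 12, 30]  # made-up illustrative data
for rule in ("minmax", "sd", "percentile"):
    print(rule, whisker_ends(sample, rule), outliers(sample, rule))
```

With the min/max rule there are never any outliers to draw, since the whiskers already reach every value; the other two rules leave some values beyond the whiskers, and which values those are depends on the rule chosen.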

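Violin Plots are an extension to box plots that use the horizontal direction to present more information. Instead of a box, they show some estimate of the probability density of the sample (typically a smoothed estimate, mirrored about the vertical axis), though the quantiles of the distribution are still shown.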
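In practice the outline of a violin plot is usually drawn from a kernel density estimate of the sample, with the estimated density at each height setting the half-width of the violin. The sketch below is one possible way to compute such an outline with the Python standard library; the function names, the Gaussian kernel, and the normal-reference bandwidth rule are assumptions chosen for illustration, and plotting libraries may use different defaults.

```python
import math
import statistics

def gaussian_kde(sample, bandwidth=None):
    """Return a function estimating the density of the sample at a point."""
    n = len(sample)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumed default, not the only choice.
        bandwidth = 1.06 * statistics.stdev(sample) * n ** (-1 / 5)
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Average of Gaussian bumps centred on each observation.
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample) / norm

    return density

def violin_outline(sample, points=50):
    """Return (height, half_width) pairs spanning the range of the sample."""
    density = gaussian_kde(sample)
    low, high = min(sample), max(sample)
    step = (high - low) / (points - 1)
    heights = [low + i * step for i in range(points)]
    return [(y, density(y)) for y in heights]

sample = [2, 4, 4, 5, 7, 9, 10, 12, 30]  # made-up illustrative data
outline = violin_outline(sample)
widest = max(outline, key=lambda pair: pair[1])
print("violin is widest near", round(widest[0], 1))
```

Drawing each half-width to the left and to the right of the vertical axis gives the symmetric violin shape; a smaller bandwidth produces a bumpier outline, and a larger one a smoother outline.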