From Wikibooks, open books for an open world
< Statistics‎ | Summary
Jump to: navigation, search


  1. Introduction
    1. What Is Statistics?
    2. Subjects in Modern Statistics
    3. Why Should I Learn Statistics? 0% developed
    4. What Do I Need to Know to Learn Statistics?
  2. Different Types of Data
    1. Primary and Secondary Data
    2. Quantitative and Qualitative Data
  3. Methods of Data Collection
    1. Experiments
    2. Sample Surveys
    3. Observational Studies
  4. Data Analysis
    1. Data Cleaning
    2. Moving Average
  5. Summary Statistics
    1. Measures of center
      1. Mean, Median, and Mode
      2. Geometric Mean
      3. Harmonic Mean
      4. Relationships among Arithmetic, Geometric, and Harmonic Mean
      5. Geometric Median
    2. Measures of dispersion
      1. Range of the Data
      2. Variance and Standard Deviation
      3. Quartiles and Quartile Range
      4. Quantiles
  6. Displaying Data
    1. Bar Charts
    2. Comparative Bar Charts
    3. Histograms
    4. Scatter Plots
    5. Box Plots
    6. Pie Charts
    7. Comparative Pie Charts
    8. Pictograms
    9. Line Graphs
    10. Frequency Polygon
  7. Probability
    1. Combinatorics
    2. Bernoulli Trials
    3. Introductory Bayesian Analysis
  8. Distributions
    1. Discrete Distributions
      1. Uniform Distribution
      2. Bernoulli Distribution
      3. Binomial Distribution
      4. Poisson Distribution
      5. Geometric Distribution
      6. Negative Binomial Distribution
      7. Hypergeometric Distribution
    2. Continuous Distributions
      1. Uniform Distribution
      2. Exponential Distribution
      3. Gamma Distribution
      4. Normal Distribution
      5. Chi-Square Distribution
      6. Student-t Distribution
      7. F Distribution
      8. Beta Distribution
      9. Weibull Distribution
  9. Testing Statistical Hypothesis
    1. Purpose of Statistical Tests
    2. Formalism Used
    3. Different Types of Tests
    4. z Test for a Single Mean
    5. z Test for Two Means
    6. t Test for a single mean
    7. t Test for Two Means
    8. paired t Test for comparing Means
    9. One-Way ANOVA F Test
    10. z Test for a Single Proportion
    11. z Test for Two Proportions
    12. Testing whether Proportion A Is Greater than Proportion B in Microsoft Excel
    13. Spearman's Rank Coefficient
    14. Pearson's Product Moment Correlation Coefficient
    15. Chi-Squared Tests
      1. Chi-Squared Test for Multiple Proportions
      2. Chi-Squared Test for Contingency
    16. Approximations of distributions
  10. Point Estimates100% developed  as of 12:07, 28 March 2007 (UTC) (12:07, 28 March 2007 (UTC))
    1. Unbiasedness
    2. Measures of goodness
    3. UMVUE
    4. Completeness
    5. Sufficiency and Minimal Sufficiency
    6. Ancillarity
  11. Practice Problems
    1. Summary Statistics Problems
    2. Data-Display Problems
    3. Distributions Problems
    4. Data-Testing Problems
  12. Numerical Methods
    1. Basic Linear Algebra and Gram-Schmidt Orthogonalization
    2. Unconstrained Optimization
    3. Quantile Regression
    4. Numerical Comparison of Statistical Software
    5. Numerics in Excel
    6. Statistics/Numerical_Methods/Random Number Generation
  13. Multivariate Data Analysis
    1. Principal Component Analysis
    2. Factor Analysis for metrical data
    3. Factor Analysis for ordinal data
    4. Canonical Correlation Analysis
    5. Discriminant Analysis
  14. Analysis of Specific Datasets
    1. Analysis of Tuberculosis
  15. Appendix
    1. Authors
    2. Glossary
    3. Index
    4. Links

edit this box

Range of Data[edit]

The range of a sample (set of data) is simply the maximum possible difference in the data, i.e. the difference between the maximum and the minimum values. A more exact term for it is "range width" and is usually denoted by the letter R or w. The two individual values (the max. and min.) are called the "range limits". Often these terms are confused and students should be careful to use the correct terminology.

For example, in a sample with values 2 3 5 7 8 11 12, the range is 11 (|12|-|2|+1=11) and the range limits are 2 and 12.

The range is the simplest and most easily understood measure of the dispersion (spread) of a set of data, and though it is very widely used in everyday life, it is too rough for serious statistical work. It is not a "robust" measure, because clearly the chance of finding the maximum and minimum values in a population depends greatly on the size of the sample we choose to take from it and so its value is likely to vary widely from one sample to another. Furthermore, it is not a satisfactory descriptor of the data because it depends on only two items in the sample and overlooks all the rest. A far better measure of dispersion is the standard deviation (s), which takes into account all the data. It is not only more robust and "efficient" than the range, but is also amenable to far greater statistical manipulation. Nevertheless the range is still much used in simple descriptions of data and also in quality control charts.

The mean range of a set of data is however a quite efficient measure (statistic) and can be used as an easy way to calculate s. What we do in such cases is to subdivide the data into groups of a few members, calculate their average range, \bar{R} and divide it by a factor (from tables), which depends on n. In chemical laboratories for example, it is very common to analyse samples in duplicate, and so they have a large source of ready data to calculate s.

s= \frac {\bar{R}}k

(The factor k to use is given under standard deviation.)

For example: If we have a sample of size 40, we can divide it into 10 sub-samples of n=4 each. If we then find their mean range to be, say, 3.1, the standard deviation of the parent sample of 40 items is appoximately 3.1/2.059 = 1.506.

With simple electronic calculators now available, which can calculate s directly at the touch of a key, there is no longer much need for such expedients, though students of statistics should be familiar with them.