Statistics/Testing Statistical Hypothesis

From Wikibooks, open books for an open world
Two examples of how the means of two distributions may be different, leading to two different statistical hypotheses
Equal distributions, but different means.

There are many different tests for the many different kinds of data. A good way to get started is to understand what kind of data you have. Are the variables quantitative or qualitative? Certain tests are appropriate only for certain types of data, depending on the sample size, the distribution, or the scale of measurement. It is also important to understand how samples of data can differ. The three primary characteristics of quantitative data are central tendency, spread, and shape.
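As a minimal sketch of these three characteristics, the following Python snippet summarizes a small sample using only the standard library. The function name `describe`, the sample values, and the choice of moment-based skewness as the measure of shape are assumptions made for illustration; other measures of spread and shape would serve equally well.

```python
import statistics

def describe(sample):
    """Summarize central tendency, spread, and shape of a numeric sample."""
    n = len(sample)
    mean = statistics.fmean(sample)        # central tendency
    median = statistics.median(sample)     # central tendency, robust to outliers
    stdev = statistics.stdev(sample)       # spread: sample standard deviation (n - 1)
    # Shape: moment-based skewness, using the population standard deviation.
    pstdev = statistics.pstdev(sample)
    skewness = sum((x - mean) ** 3 for x in sample) / (n * pstdev ** 3)
    return {"mean": mean, "median": median, "stdev": stdev, "skewness": skewness}

# Hypothetical data with one large value pulling the right tail out.
data = [2.0, 3.0, 3.0, 4.0, 5.0, 5.0, 6.0, 12.0]
summary = describe(data)
print(summary)
```

Here the mean (5.0) sits above the median (4.5) and the skewness is positive, which is what one would expect from a sample with a long right tail.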

When most people "test" quantitative data, they tend to test central tendency. Why? Suppose you had two sets of data and wanted to see whether they differ from each other. One way to do this is to test whether their central tendencies (their means, for example) differ.

Imagine two symmetric, bell-shaped curves with a vertical line drawn directly through the middle of each, as shown here. If one sample were very different from the other (for example, much higher in value), their means would typically differ. So when testing whether two samples are different, one usually compares their means.
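One common way to compare two means is a two-sample t statistic. The sketch below computes Welch's version, which does not assume equal variances, together with its approximate degrees of freedom. This is one possible choice rather than the only one; the function name `welch_t` and the sample values are hypothetical.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical samples: same spread, means two units apart.
t_stat, dof = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [3.0, 4.0, 5.0, 6.0, 7.0])
print(t_stat, dof)
```

The further the t statistic lies from zero, the less compatible the data are with the hypothesis that the two means are equal. Turning the statistic into a p-value requires the t distribution, which the standard library does not provide.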

Two medians (another measure of central tendency) can also be compared. Or one may wish to test whether two samples have the same spread, or variation. Because statistics of central tendency, spread, and so on follow different distributions, different testing procedures must be used.
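To illustrate that two samples can agree in central tendency and still differ in spread, the sketch below computes a difference of medians and a ratio of sample variances. The helper names and data are hypothetical. Under a normality assumption, a variance ratio is typically referred to an F distribution, whereas a difference of means is referred to a t distribution, which is one reason the procedures differ.

```python
import statistics

def median_difference(sample_a, sample_b):
    """Difference between the two sample medians."""
    return statistics.median(sample_a) - statistics.median(sample_b)

def variance_ratio(sample_a, sample_b):
    """Ratio of the two sample variances (n - 1 denominators)."""
    return statistics.variance(sample_a) / statistics.variance(sample_b)

# Hypothetical samples centered on the same value but with different spread.
narrow = [4.0, 5.0, 5.0, 5.0, 6.0]
wide = [1.0, 3.0, 5.0, 7.0, 9.0]

print(median_difference(narrow, wide))  # no difference in central tendency
print(variance_ratio(narrow, wide))     # far from 1: the spreads differ
```

A test of central tendency alone would see nothing unusual in these two samples, while a test of spread would.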

In the end, most people summarize the result of a hypothesis test in a single value: the p-value. The p-value is the probability, computed under the assumption that the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed. If the p-value is smaller than the level of significance (usually \alpha = 0.05, i.e. 5%, although some fields, e.g. medicine, use stricter levels), you reject the null hypothesis, but this does not prove that the alternative hypothesis is true. If the p-value is greater than the level of significance, you fail to reject the null hypothesis, but this does not mean that the null hypothesis is correct.
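The definition of the p-value can be made concrete with an exact permutation test, sketched below under the assumption that, if the null hypothesis is true, the group labels are interchangeable. The code enumerates every way of splitting the pooled data into two groups and counts how often the absolute difference in means is at least as extreme as the observed one. The function names, the sample values, and the use of a permutation test (rather than, say, a t-test) are illustrative choices, and full enumeration is only practical for small samples.

```python
from itertools import combinations
import statistics

def exact_permutation_p_value(sample_a, sample_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    indices = range(len(pooled))
    observed = abs(statistics.fmean(sample_a) - statistics.fmean(sample_b))
    total = 0
    extreme = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        diff = abs(statistics.fmean(group_a) - statistics.fmean(group_b))
        total += 1
        # Small tolerance guards against floating-point rounding in ties.
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / total

def decide(p_value, alpha=0.05):
    """Apply the decision rule at significance level alpha."""
    if p_value < alpha:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

# Hypothetical samples with no overlap at all.
p = exact_permutation_p_value([1, 2, 3, 4], [5, 6, 7, 8])
print(p, decide(p))
```

Of the 70 possible splits of these eight values, only 2 produce a difference as large as the observed one, so the p-value is 2/70, or about 0.029, and the null hypothesis is rejected at the 5% level. At a stricter level such as 1%, the same data would fail to reject it.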