Statistics Ground Zero/Significance
In statistical testing we deal in probabilities. To ask our research question in a statistically testable way is to ask
- If the null hypothesis is true, how likely is it that I would observe the data that I have collected?
Put slightly more technically
- The p value is the probability of seeing data at least this extreme if the null hypothesis were true
We set a confidence level, most commonly 99% or 95%, meaning that we accept that we might be misled into rejecting a true null hypothesis 1% or 5% of the time respectively. The corresponding threshold for p, known as the significance level, is the complement of the confidence level expressed as a decimal: 0.01 or 0.05. The p value must fall below this threshold for us to reject the null hypothesis.
This value, the p value, is said to determine whether the outcome of a test is significant or not. If the outcome is significant then the null hypothesis is rejected.
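To make the idea concrete, here is a minimal sketch in Python using only the standard library. It assumes a made-up experiment (a coin flipped 20 times that lands heads 16 times) and a null hypothesis that the coin is fair. The function name `binomial_p_value` and the data are illustrative assumptions, not part of any particular statistics package.

```python
from math import comb

def binomial_p_value(heads: int, flips: int) -> float:
    """Two-sided exact p value under the null hypothesis of a fair coin.

    Adds up the probability of every outcome that is no more likely
    than the observed one, i.e. data at least as extreme as ours.
    """
    total = 2 ** flips                 # number of equally likely flip sequences
    observed = comb(flips, heads)      # sequences giving the observed count
    extreme = sum(
        comb(flips, k)
        for k in range(flips + 1)
        if comb(flips, k) <= observed  # outcomes at least as unlikely as ours
    )
    return extreme / total

ALPHA = 0.05  # significance level: the complement of 95% confidence

# Hypothetical data: 16 heads in 20 flips.
p = binomial_p_value(heads=16, flips=20)
significant = p < ALPHA
print(f"p = {p:.4f}, reject the null hypothesis: {significant}")
```

With these invented numbers p is about 0.012, so the result is significant at the 0.05 level but would not be at the stricter 0.01 level.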
Choosing a test
Very often people find step three above - choosing the correct test - the most difficult, but if we know what we want to do and something about the nature of our data it is not so very difficult. The following table covers a surprisingly large number of common cases.
|Question||Measure of Dependent Variable||Number of Variables or Groups||Parametric||Non-parametric|
|Is there an association?||Nominal||Two (see note)||-||Chi-square|
|Is there an association?||Ordinal||Two||-||Spearman's correlation coefficient (with an indication of strength)|
|Is there an association?||Scalar||Two||Pearson's correlation coefficient (with an indication of strength)||-|
|Are the means or medians the same?||Scalar||Two||Student's t-test||Mann-Whitney U-test|
|Are the means or medians the same?||Scalar||More than two||Analysis of Variance (ANOVA)||Kruskal-Wallis|
|Can I predict one from another?||Scalar||Two, or more than two independent variables||Regression or multiple regression||-|
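As an illustration of one entry in the table, the following sketch runs a chi-square test of association on a 2x2 table of counts using only the Python standard library. The counts are invented and the function name `chi_square_2x2` is our own. The shortcut used for the p value holds only for one degree of freedom, which is the case for a 2x2 table, and no continuity correction is applied.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square test of association for a 2x2 table of counts.

    Returns (statistic, p_value). No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    # With 1 degree of freedom the chi-square tail probability
    # equals erfc(sqrt(statistic / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value

# Hypothetical counts: rows are two groups, columns are two outcomes.
counts = [[30, 10], [15, 25]]
statistic, p_value = chi_square_2x2(counts)
print(f"chi-square = {statistic:.3f}, p = {p_value:.4f}")
```

For larger tables or other tests in the table, a statistics package would normally be used rather than hand-written code like this.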
One or two tails?
When we formulate a hypothesis that involves comparing values of a parameter or statistic, we choose between two ways of asking the question. We might simply ask whether the values are different, or we might ask whether one value is smaller (or greater) than the other. In the first case we determine the outcome using a two-tailed test, and in the second case using a one-tailed test.
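The difference between the two questions can be sketched with a small Python example. It assumes a hypothetical coin that lands heads 20 times in 30 flips and a null hypothesis that the coin is fair. The function name `tail_probabilities` is our own, and doubling the smaller tail to get the two-tailed value relies on the fair-coin distribution being symmetric.

```python
from math import comb

def tail_probabilities(heads: int, flips: int):
    """Exact one- and two-tailed p values under a fair-coin null hypothesis."""
    total = 2 ** flips
    # One-tailed question: "does the coin give MORE heads than a fair one?"
    upper = sum(comb(flips, k) for k in range(heads, flips + 1)) / total
    # Lower tail, needed to find the smaller of the two tails.
    lower = sum(comb(flips, k) for k in range(0, heads + 1)) / total
    # Two-tailed question: "is the coin DIFFERENT from a fair one?"
    # The null distribution is symmetric, so double the smaller tail (capped at 1).
    two_tailed = min(1.0, 2 * min(upper, lower))
    return upper, two_tailed

# Hypothetical data: 20 heads in 30 flips.
one_tailed, two_tailed = tail_probabilities(heads=20, flips=30)
print(f"one-tailed p = {one_tailed:.4f}, two-tailed p = {two_tailed:.4f}")
```

With these invented counts the one-tailed p value falls just below 0.05, while the two-tailed p value is twice as large and does not, so the same data can be significant under one question and not under the other.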
- Note on the nominal row of the table: restricting the test to two variables is not strictly true. It is possible to test the association of more than two nominal variables, but the design is complicated.