Statistics Ground Zero/Significance


In statistical testing we deal in probabilities. To ask our research question in a statistically testable way is to ask:

If the null hypothesis is true, how likely is it that I would observe the data that I have collected?

Put slightly more technically:

The p value is the probability of seeing data at least this extreme if the null hypothesis were true.

We set a significance threshold, most commonly 0.05 or 0.01 (corresponding to confidence levels of 95% and 99%), meaning that we acknowledge that we might be misled into rejecting a true null hypothesis 5% or 1% of the time respectively. The p value must fall below this threshold for us to reject the null hypothesis. That is to say, p must be less than 0.05 or less than 0.01 (the complements of 95% and 99%, expressed as decimals).

This value, the p value, is said to determine whether the outcome of a test is significant or not. If the outcome is significant then the null hypothesis is rejected.
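As a rough illustration of how a p value is compared with the threshold, the sketch below works through an invented example: a coin is flipped 20 times and lands heads 16 times, and the null hypothesis is that the coin is fair. The data, the function name `two_sided_binomial_p` and the choice of an exact binomial calculation are all assumptions made for this example, not part of the original text.

```python
from math import comb


def two_sided_binomial_p(heads, flips):
    """Probability, under a fair coin, of a result at least as far
    from the expected count (flips / 2) as the observed one."""
    expected = flips / 2
    distance = abs(heads - expected)
    # Every outcome that is at least as extreme as the one observed.
    extreme = [k for k in range(flips + 1) if abs(k - expected) >= distance]
    return sum(comb(flips, k) for k in extreme) / 2 ** flips


# Invented data: 16 heads in 20 flips.
p = two_sided_binomial_p(heads=16, flips=20)

for alpha in (0.05, 0.01):
    verdict = "reject" if p < alpha else "do not reject"
    print(f"p = {p:.4f}, threshold {alpha}: {verdict} the null hypothesis")
```

Here p is roughly 0.012, so the outcome would be called significant at the 0.05 threshold but not at the stricter 0.01 threshold.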

Choosing a test

Very often people find step three above, choosing the correct test, the most difficult, but if we know what we want to do and something about the nature of our data it is not so difficult. The following table covers a surprisingly large number of common cases.

| Question | Measure of Dependent Variable | Two Variables or Groups | More than Two Variables or Groups | Parametric | Non-parametric |
|---|---|---|---|---|---|
| Is there an association? | Nominal | Yes | [1] | | Chi-square |
| Is there an association? | Ordinal | Two | | | Spearman's correlation coefficient (with an indication of strength) |
| Is there an association? | Scalar | Two | | Pearson's correlation coefficient (with an indication of strength) | |
| Are the means or medians the same? | Scalar | Two | | Student's T-test | Mann-Whitney U-test |
| Are the means or medians the same? | Scalar | | More than two | Analysis of Variance (ANOVA) | Kruskal-Wallis |
| Can I predict one from another? | Scalar | Two | More than two independent | Regression or multiple regression | |
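One way to read the table above is as a simple lookup from question, measure and number of groups to a suggested test. The sketch below encodes that reading. The names `TESTS` and `choose_test`, the key labels, and the split of the regression row into "Regression" for two variables and "Multiple regression" for more than two are assumptions made for illustration only.

```python
# Hypothetical lookup mirroring the table; each value is a pair
# (parametric test, non-parametric test), with None where the table is empty.
TESTS = {
    ("association", "nominal", "two"): (None, "Chi-square"),
    ("association", "ordinal", "two"): (None, "Spearman's correlation coefficient"),
    ("association", "scalar", "two"): ("Pearson's correlation coefficient", None),
    ("difference", "scalar", "two"): ("Student's T-test", "Mann-Whitney U-test"),
    ("difference", "scalar", "more than two"): (
        "Analysis of Variance (ANOVA)",
        "Kruskal-Wallis",
    ),
    ("prediction", "scalar", "two"): ("Regression", None),
    ("prediction", "scalar", "more than two"): ("Multiple regression", None),
}


def choose_test(question, measure, groups, parametric=True):
    """Return the test suggested by the table, or None if it has no entry."""
    row = TESTS.get((question, measure, groups))
    if row is None:
        return None
    return row[0] if parametric else row[1]


print(choose_test("difference", "scalar", "two"))
print(choose_test("difference", "scalar", "more than two", parametric=False))
```

A result of `None` only means that the table lists nothing for that combination, not that no suitable test exists.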

One or two tails?

When we formulate a hypothesis that compares the values of a parameter or statistic, we can ask the question in one of two ways. We might simply ask whether the values are different, or we might ask whether one value is smaller (or greater) than the other. In the first case we determine the outcome using a two-tailed test, and in the second case using a one-tailed test.
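The practical effect of this choice can be sketched with a test statistic that follows a standard normal distribution under the null hypothesis. The z value of 1.8 and the function name `z_test_p_values` are invented for illustration, and the one-tailed case here assumes the question was whether the value is greater.

```python
from statistics import NormalDist


def z_test_p_values(z):
    """Return (one-tailed, two-tailed) p values for a z statistic."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    # One tail: is the value greater? Only the upper tail counts.
    one_tailed = 1 - standard_normal.cdf(z)
    # Two tails: are the values different? Both tails count.
    two_tailed = 2 * (1 - standard_normal.cdf(abs(z)))
    return one_tailed, two_tailed


one_tailed, two_tailed = z_test_p_values(1.8)  # invented statistic
print(f"one-tailed p = {one_tailed:.4f}")
print(f"two-tailed p = {two_tailed:.4f}")
```

For a positive statistic the two-tailed p value is twice the one-tailed value, so in this example the same result falls below a 0.05 threshold under the one-tailed question but not under the two-tailed one.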


  1. This is not true: it is possible to test the association of more than two nominal variables, but the design is complicated.

