Statistics/Distributions/F

From Wikibooks, open books for an open world

F Distribution

Fisher-Snedecor
Probability density function (figure: F distributionPDF.png)
Cumulative distribution function (figure: F distributionCDF.png)
Parameters: d_1, d_2 > 0 (degrees of freedom)
Support: x ∈ [0, +∞)
PDF: \frac{\sqrt{\frac{(d_1\,x)^{d_1}\,\,d_2^{d_2}}{(d_1\,x+d_2)^{d_1+d_2}}}}{x\,\mathrm{B}\!\left(\frac{d_1}{2},\frac{d_2}{2}\right)}
CDF: I_{\frac{d_1 x}{d_1 x + d_2}} \left(\tfrac{d_1}{2}, \tfrac{d_2}{2} \right)
Mean: \frac{d_2}{d_2-2} for d_2 > 2
Mode: \frac{d_1-2}{d_1}\;\frac{d_2}{d_2+2} for d_1 > 2
Variance: \frac{2\,d_2^2\,(d_1+d_2-2)}{d_1 (d_2-2)^2 (d_2-4)} for d_2 > 4
Skewness: \frac{(2 d_1 + d_2 - 2) \sqrt{8 (d_2-4)}}{(d_2-6) \sqrt{d_1 (d_1 + d_2 -2)}} for d_2 > 6
Ex. kurtosis: see text
MGF: does not exist; raw moments defined in text and in [1][2]
CF: see text
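
The density, mean and variance formulas in the summary above can be evaluated directly. The following is a minimal sketch in Python using only the standard library. The function names f_pdf, f_mean and f_variance are illustrative choices for this page, not part of any library.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with d1 and d2 degrees of freedom at x."""
    if x <= 0:
        # Simplifying assumption: the boundary x = 0 is reported as 0.0,
        # although the true density there is infinite for d1 < 2 and 1 for d1 = 2.
        return 0.0
    # log B(d1/2, d2/2), computed with log-gamma to avoid overflow
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    # log of sqrt((d1 x)^d1 * d2^d2 / (d1 x + d2)^(d1 + d2))
    log_root = 0.5 * (d1 * math.log(d1 * x) + d2 * math.log(d2)
                      - (d1 + d2) * math.log(d1 * x + d2))
    return math.exp(log_root - math.log(x) - log_beta)


def f_mean(d2):
    """Mean d2 / (d2 - 2); only defined for d2 > 2."""
    if d2 <= 2:
        raise ValueError("the mean requires d2 > 2")
    return d2 / (d2 - 2)


def f_variance(d1, d2):
    """Variance of the F distribution; only defined for d2 > 4."""
    if d2 <= 4:
        raise ValueError("the variance requires d2 > 4")
    return 2 * d2 ** 2 * (d1 + d2 - 2) / (d1 * (d2 - 2) ** 2 * (d2 - 4))


print(f_pdf(1.0, 2, 2), f_mean(4), f_variance(5, 10))
```

As a sanity check, for d_1 = d_2 = 2 the density reduces to 1/(1+x)^2, so f_pdf(1.0, 2, 2) should be approximately 0.25.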

The F distribution is named after Sir Ronald Fisher, who developed it for use in determining ANOVA critical values. The cutoff values in an F table are determined by three quantities: the ANOVA numerator degrees of freedom, the ANOVA denominator degrees of freedom, and the significance level.
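
Instead of reading a cutoff from a printed F table, it can be computed from those three quantities. The sketch below assumes only the Python standard library. It evaluates the F cumulative distribution function through the regularized incomplete beta function, using a standard continued-fraction expansion, and inverts it by bisection. The names reg_inc_beta, f_cdf and f_critical are hypothetical helpers introduced for this illustration.

```python
import math


def _beta_continued_fraction(a, b, x, max_iter=300, eps=1e-14):
    """Continued-fraction factor of the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        # Each iteration applies the even term and then the odd term.
        even = m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m))
        odd = -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0))
        for term in (even, odd):
            d = 1.0 + term * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + term / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for a, b > 0."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) where it converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_continued_fraction(a, b, x) / a
    return 1.0 - front * _beta_continued_fraction(b, a, 1.0 - x) / b


def f_cdf(x, d1, d2):
    """CDF of the F distribution: I_{d1 x / (d1 x + d2)}(d1/2, d2/2)."""
    if x <= 0.0:
        return 0.0
    return reg_inc_beta(d1 / 2.0, d2 / 2.0, d1 * x / (d1 * x + d2))


def f_critical(alpha, d1, d2):
    """Upper-tail critical value c with P(F > c) = alpha, found by bisection."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    target = 1.0 - alpha
    low, high = 0.0, 1.0
    while f_cdf(high, d1, d2) < target:  # expand until the root is bracketed
        high *= 2.0
    for _ in range(100):
        middle = 0.5 * (low + high)
        if f_cdf(middle, d1, d2) < target:
            low = middle
        else:
            high = middle
    return 0.5 * (low + high)


print(round(f_critical(0.05, 4, 3), 2))
```

With a 5% significance level, 4 numerator and 3 denominator degrees of freedom, the result should agree with the tabulated value of 9.12 to two decimals.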

ANOVA is an abbreviation of analysis of variance. The F statistic compares the sizes of two variances, for example the variances of two different samples, by dividing the larger variance by the smaller one. The formula of the F statistic is:


F(r_1, r_2) = \frac{\chi_{r_1}^2 / r_1}{\chi_{r_2}^2 / r_2}

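where \chi_{r_1}^2 and \chi_{r_2}^2 are the independent chi-square statistics of samples one and two respectively, and r_1 and r_2 are their degrees of freedom. For a chi-square statistic built from squared deviations about the sample mean, the degrees of freedom are the number of observations minus one.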
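
The ratio above can be checked by simulation. The sketch below, assuming only the Python standard library, builds each chi-square variable as a sum of squared standard normal draws and forms the ratio of the two, each divided by its degrees of freedom. The function names and the chosen degrees of freedom (5 and 12) are illustrative. For r_2 > 2, the average of many simulated ratios should be close to the theoretical mean r_2/(r_2 - 2).

```python
import random


def chi_square_draw(rng, r):
    """One chi-square draw with r degrees of freedom: a sum of r squared N(0, 1) values."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(r))


def simulate_f(r1, r2, n_draws, seed=0):
    """Simulate n_draws values of (chi2_r1 / r1) / (chi2_r2 / r2)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [(chi_square_draw(rng, r1) / r1) / (chi_square_draw(rng, r2) / r2)
            for _ in range(n_draws)]


draws = simulate_f(r1=5, r2=12, n_draws=20000)
empirical_mean = sum(draws) / len(draws)
theoretical_mean = 12 / (12 - 2)  # mean of F(5, 12) is r2 / (r2 - 2)
print(empirical_mean, theoretical_mean)
```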

For example, suppose you want to compare apples that look alike but come from different trees and differ in size. You want to investigate whether the weights of the apples from the two trees have the same variance.

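There are three apples from the first tree that weigh 110, 121 and 143 grams respectively, and four from the second tree that weigh 88, 93, 105 and 124 grams respectively. The mean and standard deviation of the first sample are 124.67 and 16.80 respectively, and of the second sample 102.50 and 16.01, so the sample variances are 16.80^2 = 282.33 and 16.01^2 = 256.33. Under the null hypothesis both samples share a common, unknown variance \sigma^2. Because each sample mean is estimated from the data, each chi-square statistic has one degree of freedom fewer than the number of observations. The chi-square statistic of the first sample, with 2 degrees of freedom, is


\frac{(110-124.67)^2}{\sigma^2} + \frac{(121-124.67)^2}{\sigma^2} + \frac{(143-124.67)^2}{\sigma^2} = \frac{564.67}{\sigma^2}
,

and for the second sample, with 3 degrees of freedom,


\frac{(88-102.50)^2}{\sigma^2} + \frac{(93-102.50)^2}{\sigma^2} + \frac{(105-102.50)^2}{\sigma^2} + \frac{(124-102.50)^2}{\sigma^2} = \frac{769.00}{\sigma^2}
.

The unknown \sigma^2 cancels in the ratio, so the F statistic is F = \frac{564.67/2}{769.00/3} = \frac{282.33}{256.33} = 1.10, which is simply the ratio of the two sample variances. The chi-square statistic divided by its degrees of freedom appears in the numerator for the first sample because it is larger than that of the second sample.

The critical value of the F distribution for 2 degrees of freedom in the numerator and 3 degrees of freedom in the denominator, i.e. F(f1=2, f2=3), is 9.55 at the 5% significance level. Since the test statistic 1.10 is smaller than the critical value, we cannot reject the null hypothesis that the two samples have the same variance. The conclusion is that the data give no evidence that the variances differ, which is not the same as proving that they are equal.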
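
The apple comparison can be reproduced in a few lines. The sketch below uses only the Python standard library and makes two assumptions that should be stated plainly. First, it uses the usual variance-ratio form of the test, with the larger sample variance divided by the smaller and n - 1 degrees of freedom per sample. Second, it relies on a closed form of the F distribution that holds only when the numerator has exactly 2 degrees of freedom, which happens to be the case for a numerator sample of three apples. The helper names are hypothetical.

```python
import statistics

tree_one = [110, 121, 143]      # weights in grams
tree_two = [88, 93, 105, 124]


def variance_ratio(sample_a, sample_b):
    """Return (F, numerator df, denominator df) with the larger sample variance on top."""
    var_a = statistics.variance(sample_a)  # unbiased sample variance (divides by n - 1)
    var_b = statistics.variance(sample_b)
    df_a, df_b = len(sample_a) - 1, len(sample_b) - 1
    if var_a >= var_b:
        return var_a / var_b, df_a, df_b
    return var_b / var_a, df_b, df_a


def f_upper_tail_num2(x, d2):
    """P(F > x) for 2 numerator and d2 denominator degrees of freedom (d1 = 2 only)."""
    return (d2 / (2.0 * x + d2)) ** (d2 / 2.0)


def f_critical_num2(alpha, d2):
    """Value c with P(F > c) = alpha, obtained by inverting the d1 = 2 closed form."""
    return (d2 / 2.0) * (alpha ** (-2.0 / d2) - 1.0)


f_stat, df_num, df_den = variance_ratio(tree_one, tree_two)
if df_num != 2:
    # Assumption of this sketch: the closed forms above are valid only for d1 = 2.
    raise ValueError("this sketch only handles 2 numerator degrees of freedom")

critical = f_critical_num2(0.05, df_den)
p_value = f_upper_tail_num2(f_stat, df_den)
reject = f_stat > critical
print(f"F = {f_stat:.3f} on ({df_num}, {df_den}) df, "
      f"critical = {critical:.2f}, p = {p_value:.3f}, reject = {reject}")
```

For other degrees of freedom the closed form does not apply, and the critical value has to come from an F table or a general routine for the F cumulative distribution function.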

External links

  • [1] johnson (reference text not provided in the source)
  • [2] abramowitz (reference text not provided in the source)