Hi all, this blog is an English archive of my PhD experience at Imperial College London, mainly logging my research and working process, as well as some visual records.

Friday, 24 August 2007

Chi-Squared Distribution, F-Distribution, t-Distribution

Chi-Squared Distribution


If the $Y_i$ are independent, normally distributed random variables with mean 0 and variance 1, then

$$\chi^2 = \sum_{i=1}^{r} Y_i^2 \tag{1}$$

is distributed as $\chi^2$ with $r$ degrees of freedom. This makes a $\chi^2$ distribution a gamma distribution with scale $\theta=2$ and shape $\alpha=r/2$, where $r$ is the number of degrees of freedom.
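As a rough numerical check of the definition, the sketch below draws chi-squared variates by summing r squared standard normals and compares the sample mean and variance with the gamma values alpha*theta = r and alpha*theta^2 = 2r. The helper names, the seed, and the choice r = 4 are illustrative assumptions, not part of the original notes.

```python
import random
import statistics

def chi_squared_draw(r, rng):
    """One draw of chi^2: the sum of r squared standard-normal variates."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(r))

def simulate_chi_squared(r, n_draws=20000, seed=0):
    """Return n_draws simulated chi^2 values with r degrees of freedom."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [chi_squared_draw(r, rng) for _ in range(n_draws)]

r = 4                        # degrees of freedom (illustrative choice)
alpha, theta = r / 2, 2.0    # gamma shape and scale from the text
draws = simulate_chi_squared(r)

sample_mean = statistics.mean(draws)
sample_var = statistics.variance(draws)

# Gamma(alpha, theta) has mean alpha*theta and variance alpha*theta^2.
print("mean:", sample_mean, "expected:", alpha * theta)
print("variance:", sample_var, "expected:", alpha * theta ** 2)
```

With a few thousand draws the simulated moments should land close to r and 2r, though the exact figures depend on the seed.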


[Figure: chi-squared probability density function]
F-Distribution

The F-distribution is a continuous statistical distribution that arises when testing whether two observed samples have the same variance. Let $\chi_m^2$ and $\chi_n^2$ be independent variates distributed as chi-squared with $m$ and $n$ degrees of freedom.

Define a statistic $F_{n,m}$ as the ratio of the dispersions of the two distributions

$$F_{n,m} = \frac{\chi_n^2/n}{\chi_m^2/m}.$$
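The ratio can be simulated directly from its definition. The sketch below builds each chi-squared variate from squared standard normals, forms F_(n,m), and compares the sample mean with the known mean of the F-distribution, m/(m-2) for m > 2. The function names, the seed, and the degrees of freedom (n = 5, m = 20) are assumptions chosen for illustration.

```python
import random
import statistics

def chi2_draw(df, rng):
    """Chi-squared variate: sum of df squared standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))

def f_statistic(n, m, rng):
    """F_(n,m) = (chi_n^2 / n) / (chi_m^2 / m), from independent draws."""
    return (chi2_draw(n, rng) / n) / (chi2_draw(m, rng) / m)

rng = random.Random(1)       # fixed seed for repeatability
n, m = 5, 20                 # numerator and denominator degrees of freedom
f_draws = [f_statistic(n, m, rng) for _ in range(20000)]

f_mean = statistics.mean(f_draws)
expected_mean = m / (m - 2)  # mean of F_(n,m), valid for m > 2

print("simulated mean:", f_mean, "theoretical mean:", expected_mean)
```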

[Figure: Fisher-Snedecor (F) probability density function]
t-Distribution

[Figure: Student's t-distribution]

A statistical distribution published by William Gosset in 1908. His employer, Guinness Breweries, required him to publish under a pseudonym, so he chose "Student." Given N independent measurements x_i, let

$$t = \frac{\bar{x} - \mu}{s/\sqrt{N}}, \tag{1}$$

where $\mu$ is the population mean, $\bar{x}$ is the sample mean, and $s$ is the estimator for the population standard deviation (i.e., the square root of the sample variance) defined by

$$s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2. \tag{2}$$

Student's t-distribution is defined as the distribution of the random variable $t$, which is (very loosely) the "best" that we can do when the population standard deviation $\sigma$ is unknown.
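To make the two formulas concrete, here is a minimal sketch that computes t for a small sample. The data and the hypothesised population mean mu = 4 are made up for illustration; `statistics.stdev` uses the N-1 divisor, matching the definition of s above.

```python
import math
import statistics

def t_statistic(xs, mu):
    """t = (x_bar - mu) / (s / sqrt(N)), with s the N-1 sample standard deviation."""
    n = len(xs)
    x_bar = statistics.mean(xs)
    s = statistics.stdev(xs)  # sqrt( sum((x - x_bar)^2) / (N - 1) )
    return (x_bar - mu) / (s / math.sqrt(n))

sample = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical measurements
t_value = t_statistic(sample, mu=4.0)

# Here x_bar = 5 and s^2 = 32/7, so t = sqrt(7/4).
print(round(t_value, 4))  # 1.3229
```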


The relation of F to t

Since the F test is just an extension of the t test to more than two groups, the two should be related, and they are. With two groups, F=t^2. For example, consider the critical values for df=(1,15) with alpha=0.05: F(1,15)=4.54=t(15)^2, where t(15)=2.131 is the two-tailed critical value.
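The identity F=t^2 can also be checked on the test statistics themselves. The sketch below computes an equal-variance (pooled) two-sample t and a one-way ANOVA F for the same two groups. The data are invented and the function names are illustrative; the pooled form of the t test is an assumption here, since the identity does not hold for the unequal-variance (Welch) version.

```python
import math
import statistics

def pooled_t(a, b):
    """Two-sample t statistic with a pooled (equal-variance) estimate."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a)
                  + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(
        pooled_var * (1 / na + 1 / nb))

def anova_f(groups):
    """One-way ANOVA F: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    k, n_total = len(groups), len(all_values)
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - statistics.mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

group_a = [5.1, 4.9, 6.2, 5.8, 5.5]        # hypothetical data
group_b = [6.3, 6.8, 5.9, 7.1, 6.6, 6.0]   # hypothetical data

t_two_groups = pooled_t(group_a, group_b)
f_two_groups = anova_f([group_a, group_b])

print("t^2 =", t_two_groups ** 2, " F =", f_two_groups)
```

The two printed numbers should agree up to floating-point rounding, with df=(1, 9) for F and df=9 for t in this example.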

The relation of t to the Gaussian Distribution

The t density curves are symmetric and bell-shaped like the normal distribution and have their peak at 0. However, their spread is greater than that of the standard normal distribution. This is because the denominator in formula (1) for t uses s rather than sigma. Since s is a random quantity that varies from sample to sample, t is more variable, resulting in a larger spread.

The larger the degrees of freedom, the closer the t-density is to the normal density. This reflects the fact that the sample standard deviation s approaches sigma for large sample size N.
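The convergence can be seen numerically without any plotting. The sketch below evaluates the standard closed-form t density (written with `math.lgamma` for numerical stability) and reports its largest gap from the standard normal density on a grid over [-5, 5]. The grid and the degrees of freedom tried are arbitrary illustrative choices.

```python
import math

def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)

grid = [i / 10 for i in range(-50, 51)]   # x from -5.0 to 5.0 in steps of 0.1

def max_gap(df):
    """Largest absolute difference between the two densities on the grid."""
    return max(abs(t_pdf(x, df) - normal_pdf(x)) for x in grid)

gaps = {df: max_gap(df) for df in (1, 5, 30, 200)}
for df, gap in gaps.items():
    print(f"df={df:>3}  max |t_pdf - normal_pdf| = {gap:.5f}")
```

The gap shrinks steadily as the degrees of freedom grow, and for small df the t density stays above the normal density in the tails, which is the larger spread described above.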




