Hi all. This blog is an English archive of my PhD experience at Imperial College London, mainly logging my research and working process, along with some visual records.

Thursday 30 August 2007

Statistical Data Analysis: Elementary Concepts

Understanding Statistical Inference

Statistical inference is based upon mathematical laws of probability. The following example will give you the basic ideas.

Statistical Inference & The Coin Toss

Suppose we want to do a few coin tosses (the sample) so that we can decide whether a particular coin is equally likely to land heads or tails over an infinite number of tosses (the population).

If we toss the coin ten times and get 6 heads and 4 tails, we might suspect the coin is biased towards heads, but we wouldn't be very confident about this, because it's not that unusual (not that improbable) to get 6 heads out of 10.

On the other hand, if we toss the coin ten times and get 10 heads, we would be more confident that the coin is biased towards heads, because it is very unusual (not very probable at all) to get this result from an unbiased coin.
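To put numbers on "not that unusual" and "very unusual", here is a minimal Python sketch. It assumes the tosses are independent and that a fair coin lands heads with probability 0.5; the function name `prob_exactly` is made up for this illustration.

```python
from math import comb

def prob_exactly(heads, tosses, p_heads=0.5):
    """Binomial probability of exactly `heads` heads in `tosses` independent tosses."""
    return comb(tosses, heads) * p_heads**heads * (1 - p_heads)**(tosses - heads)

print(prob_exactly(6, 10))   # 0.205078125
print(prob_exactly(10, 10))  # 0.0009765625
```

So a fair coin gives exactly 6 heads in about one run of ten tosses out of five, but 10 heads in only about one run out of a thousand.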

Statistical Data Analysis: Hypothesis Testing

The most common kind of statistical inference is hypothesis testing. Statistical data analysis allows us to use mathematical principles to decide how well our sample results support our hypothesis about a population. For example, if our research hypothesis is that the coin is not fair but is actually biased towards heads, we can use principles of statistics to tell us how likely it is that we could get our sample results even if the coin were fair after all (the null hypothesis).

If the probability of getting our sample results from a fair coin is very low, we feel confident in rejecting the null hypothesis (that the coin is fair). Even though we can't say for sure (because even a fair coin could produce our sample results), we can say that the results of our study support the hypothesis that the coin is indeed biased.
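One way to get a feel for "the probability of getting our sample results from a fair coin" is to simulate a fair coin many times and count how often it produces a result at least as head-heavy as the sample. The sketch below is only an illustration: the function name, the number of simulated experiments and the fixed seed are all my own choices, and the answer is an estimate rather than an exact probability.

```python
import random

def fraction_at_least(heads_needed, tosses=10, experiments=20_000, seed=0):
    """Estimate how often a fair coin gives at least `heads_needed` heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(experiments):
        # One simulated experiment: count heads in `tosses` fair tosses.
        heads = sum(rng.random() < 0.5 for _ in range(tosses))
        if heads >= heads_needed:
            hits += 1
    return hits / experiments

print(fraction_at_least(6))   # common under a fair coin
print(fraction_at_least(10))  # rare under a fair coin
```

The first estimate should land near 386/1024 (about 0.38) and the second near 1/1024 (about 0.001), which is why 10 heads makes us doubt the null hypothesis while 6 heads does not.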

Making this decision on the basis of statistical data analysis is what we call statistical inference.

Statistical Data Analysis: p-value

In statistical hypothesis testing we use a p-value (probability value) to decide whether we have enough evidence to reject the null hypothesis and say our research hypothesis is supported by the data.

The p-value is a numerical statement of how likely it is that we could have gotten our sample data (e.g., 10 heads), or something at least as extreme, if the null hypothesis is true (e.g., fair coin). By convention, if the p-value is less than 0.05 (p < 0.05), we reject the null hypothesis.
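For the coin example, the p-value can be computed exactly. The sketch below assumes a one-sided test (we only care about bias towards heads) and ten independent tosses; `p_value_heads` and `ALPHA` are names chosen for this illustration.

```python
from math import comb

ALPHA = 0.05  # the conventional threshold

def p_value_heads(heads, tosses):
    """One-sided p-value: P(at least `heads` heads in `tosses` fair tosses)."""
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2**tosses

for heads in (6, 8, 9, 10):
    p = p_value_heads(heads, 10)
    verdict = "reject" if p < ALPHA else "do not reject"
    print(f"{heads} heads: p = {p:.4f} -> {verdict} the null hypothesis")
```

With ten tosses, 8 heads gives p of about 0.055, which is not below 0.05, while 9 heads gives about 0.011 and 10 heads about 0.001, so under these assumptions only 9 or 10 heads would lead us to reject the fair-coin hypothesis.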
