Hi all, this blog is an English archive of my PhD experience at Imperial College London, mainly logging my research and working process, along with some visual records.

Saturday 25 August 2007

Type I Error & Type II Error

Errors & the Power of a Test

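As can be seen, hypothesis testing is just educated guessing, and guesses (educated or not) are sometimes wrong. Consider the possible decisions we can make:

                            Null is true          Null is false
    Reject null             Type I Error          Correct Decision II
    Fail to reject null     Correct Decision I    Type II Error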

Let us now consider each decision in more detail.

A Type I Error is the false rejection of a true null. It has a probability of alpha (α). This error arises because we have to draw a line somewhere between probable and improbable outcomes: even when the null is true, a sample will occasionally fall beyond that line by chance.


Correct Decision I occurs when we fail to reject a true null. It has a probability of 1 − α. From a scientist's perspective this is a "boring" result.
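To make the line between "probable" and "improbable" concrete, here is a minimal sketch of how alpha fixes the rejection cutoff. It assumes a z-test on the standard normal scale, which the post itself does not specify, and the helper name critical_z is made up for illustration.

```python
from statistics import NormalDist

def critical_z(alpha, tails=1):
    """Cutoff on the standard normal scale beyond which the null is rejected.

    Assumes a z-test; with tails=2, alpha is split evenly between both tails.
    """
    return NormalDist().inv_cdf(1 - alpha / tails)

for alpha in (0.05, 0.01):
    z = critical_z(alpha)
    # Area beyond the cutoff when the null is true = chance of a Type I Error.
    p_type_1 = 1 - NormalDist().cdf(z)
    print(f"alpha={alpha}: reject if z > {z:.3f}; "
          f"P(Type I Error)={p_type_1:.3f}, "
          f"P(Correct Decision I)={1 - p_type_1:.3f}")
```

A stricter alpha pushes the cutoff further out, so fewer true nulls are rejected by accident.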



A Type II Error is the false retention of a false null. It has a probability equal to beta (β).


Correct Decision II occurs when we reject a false null. The whole purpose of the experiment is to provide the occasion for this type of decision: we performed the statistical test because we expected the samples to differ. This decision has a probability of 1 − β, which is also known as the power of the statistical test. In other words, power is the ability of a test to find a difference when there really is one.
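The four outcomes can also be estimated by brute force. The sketch below simulates many experiments and counts how often a one-tailed z-test rejects the null. Everything specific here is an assumption chosen for illustration and does not come from the post: the null mean of 0, the known sigma of 1, the sample size of 25, the true mean of 0.5 for the "false null" case, and the function name rejection_rate.

```python
import random
from statistics import NormalDist, fmean

def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05,
                   trials=10000, seed=0):
    """Fraction of simulated experiments in which a one-tailed z-test
    rejects H0: mean = 0 in favour of H1: mean > 0 (sigma assumed known)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = fmean(sample) / (sigma / n ** 0.5)
        if z > z_crit:
            rejections += 1
    return rejections / trials

# Null is true: every rejection is a Type I Error.
type_1 = rejection_rate(true_mean=0.0)
# Null is false (hypothetical true mean of 0.5): every rejection is correct.
power = rejection_rate(true_mean=0.5)

print(f"P(Type I Error)        ~ {type_1:.3f}")
print(f"P(Correct Decision I)  ~ {1 - type_1:.3f}")
print(f"P(Type II Error)       ~ {1 - power:.3f}")
print(f"P(Correct Decision II) ~ {power:.3f}  (power)")
```

Under these assumptions the Type I rate should land near alpha (.05) and the power near .80, with some simulation noise.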


Factors Influencing Power:

  1. Alpha (α). Alpha and beta are inversely related. In other words, with everything else held fixed, as one increases, the other decreases: a stricter alpha moves the rejection cutoff further out, so more false nulls are retained. Thus, all other things being equal, using an alpha of .05 will result in a more powerful test than using an alpha of .01.

  2. Sample Size (N). The bigger the sample (i.e., the more work we do), the more powerful the test.

  3. Type of Test. Parametric tests (as compared to nonparametric tests that we discuss later in the semester) are generally more powerful, provided their more restrictive assumptions are met.

  4. Variability. Generally speaking, the more variability in the sample and/or population, the less powerful the test, because a real difference is harder to tell apart from noise.

  5. Test Directionality. One-tailed tests are more powerful than two-tailed tests when the effect lies in the predicted direction, because all of alpha is placed in that one tail.

  6. Robustness of the Effect. Larger effects are easier to detect: six beers are more likely to influence reaction time than one beer.
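Most of the factors above can be checked numerically. The following is a rough sketch using the textbook power formula for a z-test with known sigma. The baseline numbers (effect 0.5, sigma 1, N = 25) and the name z_test_power are assumptions picked for illustration. Factor 3 (type of test) is left out because it cannot be shown with a single z-test formula.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05, tails=2):
    """Power of a z-test when the true mean is shifted by `effect` (>= 0)
    from the null value, assuming sigma is known."""
    std = NormalDist()
    shift = effect * n ** 0.5 / sigma        # true mean of the z statistic
    z_crit = std.inv_cdf(1 - alpha / tails)
    power = 1 - std.cdf(z_crit - shift)      # upper rejection region
    if tails == 2:
        power += std.cdf(-z_crit - shift)    # lower rejection region
    return power

# Hypothetical baseline; each scenario changes exactly one factor.
base = dict(effect=0.5, sigma=1.0, n=25, alpha=0.05, tails=2)
scenarios = {
    "baseline": {},
    "1. alpha .01 instead of .05": {"alpha": 0.01},
    "2. N = 100 instead of 25": {"n": 100},
    "4. sigma = 2 instead of 1": {"sigma": 2.0},
    "5. one-tailed instead of two": {"tails": 1},
    "6. effect = 1.0 instead of 0.5": {"effect": 1.0},
}
for label, change in scenarios.items():
    print(f"{label:32s} power = {z_test_power(**{**base, **change}):.3f}")
```

Compared with the baseline, the stricter alpha and the larger sigma should lower the power, while the larger N, the one-tailed test, and the larger effect should raise it.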
