Hi all, this blog is an English archive of my PhD experience at Imperial College London, mainly logging my research and working process, as well as some visual records.

Tuesday 28 August 2007

Wilcoxon Signed Rank Test & Wilcoxon Rank Sum Test

1. Basis

Statistical Rank: The ordinal number of a value in a list arranged in a specified order (usually decreasing).

Rank Test: A statistical test making use of the statistical ranks of data points. Examples include the Kolmogorov-Smirnov test and Wilcoxon signed rank test.

Robust Estimate: An estimation technique which is insensitive to small departures from the idealized assumptions which have been used to optimize the algorithm. Classes of such techniques include M-estimates (which follow from maximum likelihood considerations), L-estimates (which are linear combinations of order statistics), and R-estimates (based on statistical rank tests).

Wilcoxon Test Statistic: The Wilcoxon test statistic is equivalent to the T_+ statistic in the Wilcoxon signed rank test (Kanji 1999).

2. Wilcoxon Signed Rank Test (to test the difference between paired data)

The Wilcoxon signed rank test is a nonparametric alternative to the paired t-test. It is similar to the Fisher sign test but much more sensitive. In fact, for large samples it is almost as sensitive as the Student t-test, and for small samples with unknown distributions it can be even more sensitive than the Student t-test. As it is only on rare occasions that we know the values are normally distributed, this test is to be preferred over the Student t-test.

This test assumes that there is information in the magnitudes of the differences between paired observations, as well as in their signs. Take the paired observations, calculate the differences, and rank them from smallest to largest by absolute value. Add all the ranks associated with positive differences, giving the T_+ statistic. Finally, the P-value associated with this statistic is found from an appropriate table. The Wilcoxon test is an R-estimate.

2.1 Steps

STEP 1

  • Exclude any differences which are zero
  • Ignore the signs of the remaining differences
  • Put their absolute values in ascending order
  • Assign them ranks, starting from 1 for the smallest
  • If any absolute differences are equal, average their ranks

STEP 2

  • Add up the ranks of the positive differences to give T+
  • Add up the ranks of the negative differences to give T-

STEP 3

  • If there is no difference between drug (T+) and placebo (T-), then T+ & T- would be similar
  • If there were a difference
    • one sum would be much smaller and
    • the other much larger than expected
  • The smaller sum is denoted as T
  • T = smaller of T+ and T-
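A quick arithmetic check that may help here. It is a standard identity rather than part of the steps themselves: the ranks 1 to N always add up to N(N+1)/2, so the two sums are tied to each other, and N is the number of non-zero differences.

```latex
T_+ + T_- = \frac{N(N+1)}{2},
\qquad
\text{so with no real difference each sum should be near } \frac{N(N+1)}{4}.
```

For N = 10, for instance, the two sums must total 55, and each would be expected to sit near 27.5 if there were no difference.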

STEP 4

  • Compare T with the critical values (5%, 2% and 1%) in the table; the result is significant if T is less than or equal to the critical value
  • N is the number of differences that were ranked (not the total number of differences)
  • So the zero differences are excluded
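The four steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `average_ranks` and `signed_rank` are made up here, and the rounding of differences (`ndigits`) is an assumption added so that values such as 0.9 and -0.9 tie exactly despite floating-point noise. The data are the drug and placebo figures from the example in section 2.2.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest), averaging tied ranks."""
    ranks = []
    for v in values:
        smaller = sum(1 for u in values if u < v)
        equal = sum(1 for u in values if u == v)
        # Tied values occupy ranks smaller+1 .. smaller+equal; use their mean.
        ranks.append(smaller + (equal + 1) / 2)
    return ranks


def signed_rank(treated, control, ndigits=6):
    """Return (T+, T-, T, N) for paired observations."""
    # Step 1: differences, zeros excluded, ranked by absolute value.
    diffs = [round(t - c, ndigits) for t, c in zip(treated, control)]
    diffs = [d for d in diffs if d != 0]
    ranks = average_ranks([abs(d) for d in diffs])
    # Step 2: rank sums for positive and negative differences.
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    t_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    # Step 3: T is the smaller sum; N counts only the ranked differences.
    return t_plus, t_minus, min(t_plus, t_minus), len(diffs)


drug = [6.1, 7.0, 8.2, 7.6, 6.5, 8.4, 6.9, 6.7, 7.4, 5.8]
placebo = [5.2, 7.9, 3.9, 4.7, 5.3, 5.4, 4.2, 6.1, 3.8, 6.3]
result = signed_rank(drug, placebo)
print(result)  # (50.5, 4.5, 4.5, 10)
```

Step 4, the comparison with the critical value, still needs a table (or an exact calculation) for the given N.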

2.2 Example:

To test whether a new drug is significantly more effective than a placebo at increasing patients' hours of sleep.

Patient   Hours of sleep       Difference   Rank
          Drug     Placebo                  (ignoring sign)
 1        6.1      5.2          0.9          3.5*
 2        7.0      7.9         -0.9          3.5*
 3        8.2      3.9          4.3         10
 4        7.6      4.7          2.9          7
 5        6.5      5.3          1.2          5
 6        8.4      5.4          3.0          8
 7        6.9      4.2          2.7          6
 8        6.7      6.1          0.6          2
 9        7.4      3.8          3.6          9
10        5.8      6.3         -0.5          1

* The 3rd and 4th ranks are tied, hence averaged

T = smaller of T+ (50.5) and T- (4.5)

Here T = 4.5 with N = 10, which is significant at the 2% level, indicating that the drug (a hypnotic) is more effective than the placebo.
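As a rough cross-check on the table lookup, the P-value can also be computed by brute force. The sketch below assumes that, under the null hypothesis, each difference is equally likely to be positive or negative, so all 2^N sign patterns over the observed ranks are equally likely. It keeps the tied ranks as they are, whereas printed tables usually assume no ties, so the figure need not match a table exactly.

```python
from itertools import product

# Ranks of the absolute differences and the sign of each difference,
# copied from the example above (patients 1 to 10).
ranks = [3.5, 3.5, 10, 7, 5, 8, 6, 2, 9, 1]
signs = [+1, -1, +1, +1, +1, +1, +1, +1, +1, -1]

total = sum(ranks)
t_minus = sum(r for r, s in zip(ranks, signs) if s < 0)
t_obs = min(t_minus, total - t_minus)  # T = smaller of T+ and T-

# Enumerate every possible assignment of signs to the ranks and count
# those giving a T at least as small as the observed one.
extreme = 0
patterns = 0
for pattern in product((+1, -1), repeat=len(ranks)):
    t_neg = sum(r for r, s in zip(ranks, pattern) if s < 0)
    if min(t_neg, total - t_neg) <= t_obs:
        extreme += 1
    patterns += 1

p_two_sided = extreme / patterns
print(t_obs, extreme, patterns, p_two_sided)  # 4.5 16 1024 0.015625
```

A two-sided P-value of about 0.016 is below 0.02, which agrees with the 2% significance level read from the table.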

3. Wilcoxon Rank Sum Test

(A nonparametric alternative to the two-sample t-test.)

3.1 Mechanism

The Wilcoxon test is based upon ranking the nA + nB observations of the combined sample. Each observation has a rank: the smallest has rank 1, the 2nd smallest rank 2, and so on. The Wilcoxon rank-sum test statistic is the sum of the ranks for observations from one of the samples. Let us use sample A here and use wA to denote the observed rank sum and WA to represent the corresponding random variable.

wA = sum of the ranks for observations from A.

How do we obtain the P-value corresponding to the rank-sum test statistic wA? To answer this question we must first consider how rank sums behave under H0, and how they behave under H1. Fig. 3 depicts two situations using samples of size nA = nB = 5 and plotting sample A observations with a "●" and sample B observations with an "o".

Suppose that H0 : A = B is true. In this case, all n = nA + nB observations are being drawn from the same distribution and we might expect behavior somewhat like Fig. 3(a) in which the pattern of black and white circles is random. The set of ranks for n observations are the numbers 1, 2, ..., n.

When nA of our n observations from a distribution are labeled A and nB observations from the same distribution are labeled B, then as far as the behavior of the ranks (and thus wA) is concerned, it is just as if we randomly labeled nA of the numbers 1, 2, ..., n with A's and the rest with B's. The distribution of a rank sum, WA, under such conditions has been worked out, and computer programs and sets of tables are available for this distribution.

      ●  o  o  ●  o  ●  ●  o  ●  o
Rank  1  2  3  4  5  6  7  8  9  10
(a)

      o  o  o  o  ●  o  ●  ●  ●  ●
Rank  1  2  3  4  5  6  7  8  9  10
(b)

Figure 3: Behaviour of ranks.

Suppose that H1 : A > B is true. In this case we would expect behavior more like that in Fig. 3(b), which results in sample A containing more of the larger ranks. Evidence against H0 which confirms H1 : A > B is thus provided by an observed rank sum wA which is unusually large according to the distribution of rank sums when H0 is true. Thus the P-value for the test is

(H1 : A > B) P-value = pr(WA ≥ wA),

where the probability is calculated using the distribution that WA would have if H0 was true. Suppose, on the other hand, that the alternative H1 : A < B is true. In this case we would expect sample A to contain more of the smaller ranks, so evidence against H0 is provided by an unusually small rank sum wA, and

(H1 : A < B) P-value = pr(WA ≤ wA).

Note that in a one-sided test the direction of the inequality used to compute the P-value is the same as the direction in the alternative hypothesis, e.g. H1 : A > B and WA ≥ wA.

For the two-sided test, i.e. testing H0 : A = B versus the alternative H1 : A ≠ B, a rank sum that is either too big or too small provides evidence against H0. We then calculate the probability of falling into the tail of the distribution closest to wA and double it. Thus if wA is in the lower tail then P-value = 2 pr(WA ≤ wA), whereas if wA is in the upper tail then

P-value = 2 pr(WA ≥ wA).
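For small samples the null distribution of WA can be listed directly, which makes the "random labeling" argument concrete. The following is a minimal sketch, assuming no tied observations (so the ranks are exactly 1 to n); the function name `rank_sum_p_values` is invented for this illustration. It is applied to the ranks of sample A read off Fig. 3(b).

```python
from itertools import combinations


def rank_sum_p_values(ranks_a, n_total):
    """Exact P-values for a rank sum, assuming ranks 1..n_total with no ties."""
    w_obs = sum(ranks_a)
    # Under H0 every choice of nA ranks out of 1..n is equally likely.
    sums = [sum(c) for c in combinations(range(1, n_total + 1), len(ranks_a))]
    upper = sum(1 for s in sums if s >= w_obs) / len(sums)  # pr(WA >= wA)
    lower = sum(1 for s in sums if s <= w_obs) / len(sums)  # pr(WA <= wA)
    # Two-sided: double the probability of the tail closest to wA.
    two_sided = min(1.0, 2 * min(lower, upper))
    return {"w": w_obs, "upper": upper, "lower": lower, "two_sided": two_sided}


# Sample A occupies ranks 5, 7, 8, 9 and 10 in Fig. 3(b).
fig_3b = rank_sum_p_values([5, 7, 8, 9, 10], 10)
print(fig_3b["w"], round(fig_3b["upper"], 4), round(fig_3b["two_sided"], 4))
```

Only 2 of the 252 possible labelings give a rank sum of 39 or more, so the one-sided P-value for H1 : A > B is 2/252, roughly 0.008. Enumeration like this grows quickly with sample size, which is one reason tables and dedicated programs are used instead.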

3.2 Steps

Step 1

  • Pool the data from both groups and rank them in ascending order
  • If any values are equal, average their ranks

Step 2

  • Add up the ranks in the group with the smaller sample size
  • If the two groups are of the same size, either one may be picked
  • T = sum of ranks in the group with the smaller sample size

Step 3

  • Compare this sum with the critical ranges given in the table
  • Look up the row corresponding to the sample sizes of the two groups
  • A range will be shown for the 5% significance level

3.3 Example

Non-smokers (n=15)          Heavy smokers (n=14)
Birth wt (kg)   Rank        Birth wt (kg)   Rank
3.99            27          3.18             7
3.79            24          2.84             5
3.60*           18          2.90             6
3.73            22          3.27            11
3.21             8          3.85            26
3.60*           18          3.52            14
4.08            28          3.23             9
3.61            20          2.76             4
3.83            25          3.60*           18
3.31            12          3.75            23
4.13            29          3.59            16
3.26            10          3.63            21
3.54            15          2.38             2
3.51            13          2.34             1
2.71             3

Sum = 272                   Sum = 163

* Ranks 17, 18 and 19 are tied, hence the ranks are averaged
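The ranks and sums in this example can be reproduced with a short script. This is a sketch only: the helper `average_ranks` is a made-up name, and the birth weights are typed in from the table above.

```python
def average_ranks(values):
    """Rank values in ascending order (1 = smallest), averaging tied ranks."""
    ranks = []
    for v in values:
        smaller = sum(1 for u in values if u < v)
        equal = sum(1 for u in values if u == v)
        # Tied values occupy ranks smaller+1 .. smaller+equal; use their mean.
        ranks.append(smaller + (equal + 1) / 2)
    return ranks


non_smokers = [3.99, 3.79, 3.60, 3.73, 3.21, 3.60, 4.08, 3.61,
               3.83, 3.31, 4.13, 3.26, 3.54, 3.51, 2.71]
heavy_smokers = [3.18, 2.84, 2.90, 3.27, 3.85, 3.52, 3.23,
                 2.76, 3.60, 3.75, 3.59, 3.63, 2.38, 2.34]

# Step 1: rank the pooled data, averaging ties.
ranks = average_ranks(non_smokers + heavy_smokers)

# Step 2: rank sum for each group; T belongs to the smaller group.
sum_non = sum(ranks[:len(non_smokers)])
sum_heavy = sum(ranks[len(non_smokers):])
t = sum_heavy if len(heavy_smokers) < len(non_smokers) else sum_non

print(sum_non, sum_heavy, t)  # 272.0 163.0 163.0
```

The heavy smokers form the smaller group, so T = 163 is the value to take to the table in Step 3.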

When one performs a Wilcoxon test by hand, tables are required to obtain P-values.



1 comment:

Unknown said...

hey,

this was clearly written (not sure if I fully understood the ranksum test since you elaborated a bit less)
I am curious to know the situations under which one is preferred...
any thoughts on that?
ps: there is also a signtest that ONLY looks at the signs