Kruskal-Wallis ANOVA


The Kruskal-Wallis analysis of variance (KW ANOVA) is a non-parametric version of standard one-way ANOVA in which the data are replaced by their ranks and the test compares the mean ranks of the groups. As such, the test is a multi-group extension of the Wilcoxon/Mann-Whitney rank sum test. The test statistic, H, is given by the following formula:

$$H = (n-1)\,\frac{\sum_{i=1}^{k} n_i(\bar{r}_i - \bar{r})^2}{\sum_{i=1}^{k}\sum_{j=1}^{n_i}(r_{ij} - \bar{r})^2}$$

where $n$ is the total number of observations and there are $k$ groups, with $n_i$ members in group $i$ (often all groups are of equal size). The ranked scores are those denoted by the letter $r$: $r_{ij}$ is the rank of observation $j$ in group $i$, $\bar{r}_i$ is the mean rank of group $i$, and $\bar{r} = (n+1)/2$ is the mean of all $n$ ranks. When there are no tied values, the above formula can be rewritten in an alternative form, which is simpler to compute:

$$H = \frac{12}{n(n+1)}\sum_{i=1}^{k}\frac{R_i^2}{n_i} - 3(n+1)$$

where $R_i$ is the sum of the ranks from group $i$. The statistic is approximately distributed as chi-square with $k-1$ degrees of freedom. Tied values present a potential problem, and the formula above can be adjusted for ties, but as Kruskal and Wallis (1952, p. 587 [KRU1]) note, the effect of these adjustments is generally small, particularly if the number of observations is not large and the number of ties is less than 25% of the total. Standard software implementations, such as the MATLAB function kruskalwallis() and the R function kruskal.test(), do adjust for ties, which explains the slight difference between results obtained from the unadjusted formula and package output.
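The equivalence of the two forms of H in the tie-free case can be checked numerically. The following is a minimal sketch, assuming a small invented dataset whose values are simply the ranks 1 to 9 split into three groups; the function names are illustrative and not taken from any package.

```python
# Hypothetical tie-free example: the "data" are just the ranks 1..9,
# split into k = 3 groups of 3 (values invented for illustration).
rank_groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]


def h_mean_rank_form(groups):
    """H = (n - 1) * between-group SS of ranks / total SS of ranks."""
    all_ranks = [r for g in groups for r in g]
    n = len(all_ranks)
    r_bar = sum(all_ranks) / n  # equals (n + 1) / 2
    between = sum(len(g) * (sum(g) / len(g) - r_bar) ** 2 for g in groups)
    total = sum((r - r_bar) ** 2 for r in all_ranks)
    return (n - 1) * between / total


def h_rank_sum_form(groups):
    """H = 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1)."""
    n = sum(len(g) for g in groups)
    s = sum(sum(g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * s - 3 * (n + 1)


h1 = h_mean_rank_form(rank_groups)
h2 = h_rank_sum_form(rank_groups)
print(round(h1, 4), round(h2, 4))  # both forms give 7.2 for these ranks
```

With tied values the two forms no longer coincide: the rank-sum form gives the unadjusted statistic, which is why a separate tie adjustment is applied.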

The chi-square approximation is generally very good but, as with contingency table analysis, it is less reliable if the number of replicates in any group ($n_i$) is less than 5. Kruskal and Wallis (1952, table 6.1 [KRU1]) tabulate comparisons between the exact probabilities and the chi-square approximation for 3-group problems with up to 5 members in each group and a range of computed H values.
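To illustrate why small groups are a problem, the exact null distribution of H can be obtained by brute-force enumeration when the groups are tiny. The sketch below is not the procedure used by Kruskal and Wallis for their table. It assumes three groups of three tie-free observations and an illustrative observed value of H = 7.2 (the largest value possible for that design), and compares the exact tail probability with the chi-square approximation on 2 degrees of freedom.

```python
from itertools import combinations
from math import exp


def h_rank_sum_form(groups):
    """Unadjusted H from groups of ranks (no ties assumed)."""
    n = sum(len(g) for g in groups)
    s = sum(sum(g) ** 2 / len(g) for g in groups)
    return 12.0 / (n * (n + 1)) * s - 3 * (n + 1)


def exact_tail(sizes, h_obs):
    """P(H >= h_obs) when every split of ranks 1..n into 3 groups is equally likely."""
    n1, n2, n3 = sizes
    ranks = set(range(1, n1 + n2 + n3 + 1))
    hits = total = 0
    for g1 in combinations(sorted(ranks), n1):
        rest = ranks - set(g1)
        for g2 in combinations(sorted(rest), n2):
            g3 = rest - set(g2)
            total += 1
            # Small tolerance guards against floating-point rounding.
            if h_rank_sum_form([g1, g2, g3]) >= h_obs - 1e-9:
                hits += 1
    return hits / total


h_obs = 7.2
p_exact = exact_tail((3, 3, 3), h_obs)
p_chi2 = exp(-h_obs / 2)  # chi-square upper tail; this closed form holds for 2 df only
print(round(p_exact, 4), round(p_chi2, 4))  # 0.0036 0.0273
```

Here the approximation overstates the tail probability by a factor of more than seven, which is the kind of discrepancy the exact tables are intended to expose.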

Example: Bacteria counts in milk samples

To illustrate its use, we take the bacteria count data used in the earlier section on one-way ANOVA and replace the entries for each shipment with the ranks of the counts (shown in brackets) rather than the counts themselves:

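Bacteria count data, shipments 1-5, with ranks

| Shipment 1 | Shipment 2 | Shipment 3 | Shipment 4 | Shipment 5 |
|---|---|---|---|---|
| 24 (27.5) | 14 (14.5) | 11 (9) | 7 (4) | 19 (22.5) |
| 15 (16.5) | 7 (4) | 9 (7) | 7 (4) | 24 (27.5) |
| 21 (25) | 12 (11) | 7 (4) | 4 (1) | 19 (22.5) |
| 27 (29) | 17 (19) | 13 (13) | 7 (4) | 15 (16.5) |
| 33 (30) | 14 (14.5) | 12 (11) | 12 (11) | 10 (8) |
| 23 (26) | 16 (18) | 18 (20.5) | 18 (20.5) | 20 (24) |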
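The ranks shown in brackets are midranks: tied counts share the average of the rank positions they occupy. A minimal sketch of that ranking step follows, with the counts entered shipment by shipment; the helper name `midranks` is illustrative only.

```python
# Bacteria counts by shipment, as given in the table above.
counts = {
    1: [24, 15, 21, 27, 33, 23],
    2: [14, 7, 12, 17, 14, 16],
    3: [11, 9, 7, 13, 12, 18],
    4: [7, 7, 4, 7, 12, 18],
    5: [19, 24, 19, 15, 10, 20],
}


def midranks(values):
    """Map each distinct value to the average of the ranks it occupies."""
    ordered = sorted(values)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Sorted positions i..j-1 (0-based) correspond to ranks i+1..j.
        rank_of[ordered[i]] = (i + 1 + j) / 2
        i = j
    return rank_of


pooled = [x for column in counts.values() for x in column]
rank_of = midranks(pooled)
rank_sums = {s: sum(rank_of[x] for x in column) for s, column in counts.items()}

print(rank_of[7], rank_of[24])  # 4.0 27.5
print(rank_sums)  # {1: 154.0, 2: 81.0, 3: 64.5, 4: 44.5, 5: 121.0}
```

The five rank sums total 465, which equals n(n+1)/2 for n = 30 and provides a quick check on the ranking.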

We then carry out a KW analysis of variance, in this case using the MATLAB function kruskalwallis(), which computes the ranks from the source data automatically. The resulting ANOVA table looks very similar to that produced earlier, one obvious difference being that the test statistic is evaluated using the chi-square distribution rather than the F distribution. If a standard ANOVA is carried out on the ranked data above (not a recommended procedure) it produces a rather different result, but still shows the observed differences between columns to be significant. The chi-square value reported is 16.91, compared with 16.80 produced using the formula without adjustment for ties. The correction used within the R function kruskal.test() involves dividing the unadjusted result by (1 - sum(TIES^3 - TIES)/(n^3 - n)), where TIES holds the number of observations sharing each tied value and the sum is over the groups of tied values (in this example, 7 groups of ties).
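The two values quoted above can be reproduced from the rank sums alone. The sketch below assumes the column totals of the bracketed ranks (154, 81, 64.5, 44.5 and 121) and the sizes of the seven tie groups read off the data. The p-value helper uses a closed form that is valid only for an even number of degrees of freedom, which happens to apply here (k - 1 = 4).

```python
from math import exp

rank_sums = [154.0, 81.0, 64.5, 44.5, 121.0]  # column totals of the bracketed ranks
group_sizes = [6, 6, 6, 6, 6]
tie_sizes = [5, 3, 2, 2, 2, 2, 2]  # counts 7, 12, 14, 15, 18, 19 and 24 are tied

n = sum(group_sizes)
k = len(group_sizes)

# Unadjusted statistic from the rank-sum form of the formula.
h_unadjusted = (
    12.0 / (n * (n + 1)) * sum(r * r / m for r, m in zip(rank_sums, group_sizes))
    - 3 * (n + 1)
)

# Tie adjustment: divide by 1 - sum(t^3 - t) / (n^3 - n).
correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
h_adjusted = h_unadjusted / correction


def chi2_upper_tail_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2
    term = total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return exp(-half) * total


p_value = chi2_upper_tail_even_df(h_adjusted, k - 1)
print(round(h_unadjusted, 2), round(h_adjusted, 2), round(p_value, 4))  # 16.8 16.91 0.002
```

For these data the tie term sum(t^3 - t) is 174 and n^3 - n is 26970, so the divisor is about 0.9935 and the adjustment raises H by less than 1%.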

Bacteria count KW ANOVA

| Source | Sums of squares | Degrees of freedom | Mean squares | Chi-square | Prob>Chi-sq |
|---|---|---|---|---|---|
| Shipments | 1302.25 | 4 | 325.5625 | 16.91 | 0.0020 |
| Residual error | 930.75 | 25 | 37.23 | | |
| Totals | 2233.00 | 29 | | | |
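As a check on the table, its entries are consistent with the chi-square statistic being the shipments sum of squares divided by the total mean square of the ranks. This relationship is inferred from the figures shown, not taken from the MATLAB documentation:

$$\chi^2 = \frac{SS_{\text{shipments}}}{SS_{\text{total}}/(n-1)} = \frac{1302.25}{2233.00/29} \approx 16.91$$

The mean squares are the sums of squares divided by their degrees of freedom: 1302.25/4 = 325.5625 and 930.75/25 = 37.23.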

In this example the two methods produce similar results, showing that there is a significant difference between shipments. However, whereas the standard ANOVA procedure assumes that the underlying data are Normally distributed, the KW approach does not, and is not sensitive to non-Normality. For example, suppose that one of the samples in shipment 4 was recorded as having a bacteria count of 70 rather than 7. The standard ANOVA would be greatly affected by this outlier, which greatly increases the residual (error) variance and reduces the between-shipments variance, with the net effect that the F ratio becomes approximately 1 with a probability of 0.43, which is non-significant. KW ANOVA, on the other hand, is much less affected: even though the rankings are altered, it yields a chi-square of 12.14 and a probability of 0.0163, still significant at the 5% level.

Of course this is an extreme example, and the KW test still requires that samples are independent and can be meaningfully ranked, but it is clearly a much more robust approach than the parametric method. On the other hand, where the assumptions behind the parametric approach are met, the parametric test is more sensitive (i.e. it has greater power to detect differences between groups).
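The effect of the outlier on the two test statistics can be reproduced with a short script. This is a sketch under stated assumptions: the function names are illustrative, the first 7 in shipment 4 is the one replaced by 70, and only the statistics (not their p-values) are computed, since the tail probabilities would normally come from a statistics library.

```python
shipments = [
    [24, 15, 21, 27, 33, 23],
    [14, 7, 12, 17, 14, 16],
    [11, 9, 7, 13, 12, 18],
    [7, 7, 4, 7, 12, 18],
    [19, 24, 19, 15, 10, 20],
]


def f_ratio(groups):
    """One-way ANOVA F = between-group mean square / within-group mean square."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand = sum(values) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def kw_h(groups):
    """Tie-adjusted Kruskal-Wallis H computed from midranks."""
    values = sorted(x for g in groups for x in g)
    n = len(values)
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and values[j] == values[i]:
            j += 1
        t = j - i  # number of observations sharing this value
        rank_of[values[i]] = (i + 1 + j) / 2
        tie_term += t ** 3 - t
        i = j
    s = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * s - 3 * (n + 1)
    return h / (1 - tie_term / (n ** 3 - n))


with_outlier = [list(g) for g in shipments]
with_outlier[3][0] = 70  # one of the 7s in shipment 4 recorded as 70

for label, data in (("original", shipments), ("with outlier", with_outlier)):
    print(label, round(f_ratio(data), 2), round(kw_h(data), 2))
# original 9.01 16.91
# with outlier 1.0 12.14
```

The F ratio collapses from about 9 to about 1, whereas H falls from 16.91 to 12.14, because the outlier can move only as far as the top rank (30) however extreme its value.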

References

[KRU1] Kruskal W H, Wallis W A (1952) Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47(260), 583–621