The Kruskal-Wallis analysis of variance (KW ANOVA) is a non-parametric version of standard one-way ANOVA, in which the data are replaced by their ranks and the test compares the mean ranks of the groups. As such the test is a multi-group extension of the Wilcoxon/Mann-Whitney rank sum test. The test uses the following formula:

$$H = (n-1)\,\frac{\sum_{i=1}^{k} n_i\,(\bar{r}_i - \bar{r})^2}{\sum_{i=1}^{k}\sum_{j=1}^{n_i} (r_{ij} - \bar{r})^2}$$
where $n$ is the total number of observations and there are $k$ groups, each with $n_i$ members (often all of equal size). The ranked scores are denoted by the letter $r$: $r_{ij}$ is the rank of observation $j$ in group $i$, $\bar{r}_i$ is the mean rank of group $i$, and $\bar{r} = (n+1)/2$ is the overall mean rank for all the data. When there are no ties, the above formula can be rewritten in an alternative form, which is simpler to compute:

$$H = \frac{12}{n(n+1)}\sum_{i=1}^{k}\frac{R_i^2}{n_i} - 3(n+1)$$
where $R_i$ is the sum of the ranks from group $i$. The statistic is approximately distributed as a chi-square with $k-1$ degrees of freedom. Tied values present a potential problem and adjustments to the formula above for ties can be used, but as Kruskal and Wallis (1952, p. 587 [KRU1]) note, the effect of these adjustments is generally small, particularly if the number of observations is not large and the number of ties is less than 25% of the total. Standard software implementations, such as the MATLAB function kruskalwallis() and the R function kruskal.test(), do perform the adjustment for ties, which explains the slight difference between the results obtained from the formulas above and package output.
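As a minimal sketch of the two forms of the statistic described above, the following Python function computes H both ways from ranks that are assumed to be already assigned. The function name `kw_h_both_forms` and the toy ranks are illustrative assumptions and are not taken from any package.

```python
def kw_h_both_forms(rank_groups):
    """Return (H_general, H_shortcut) for groups of pre-assigned ranks."""
    n = sum(len(g) for g in rank_groups)
    all_ranks = [r for g in rank_groups for r in g]
    grand_mean = sum(all_ranks) / n  # equals (n + 1) / 2

    # General form: H = (n - 1) * between-group SS / total SS of the ranks
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in rank_groups)
    total = sum((r - grand_mean) ** 2 for r in all_ranks)
    h_general = (n - 1) * between / total

    # Shortcut form: H = 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1)
    rank_sum_term = sum(sum(g) ** 2 / len(g) for g in rank_groups)
    h_shortcut = 12.0 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    return h_general, h_shortcut


# Hypothetical ranks 1..9 split across three groups, with no ties
toy_ranks = [[1, 4, 2], [6, 5, 7], [3, 8, 9]]
h_general, h_shortcut = kw_h_both_forms(toy_ranks)
print(round(h_general, 4), round(h_shortcut, 4))
```

Without ties the two values agree. When tied observations are given average ranks the values differ, because the total sum of squares in the general form already reflects the ties while the shortcut form does not.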
The chi-square approximation is generally very good, but as with contingency table analysis the approximation is not so good if the number of replicates in any group ($n_i$) is less than 5. Kruskal and Wallis (1952, table 6.1 [KRU1]) tabulate comparisons between the exact probabilities and the chi-square approximation for three-group problems with up to 5 members in each group and a range of computed H values.
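To see why small groups are a problem, an exact probability can be obtained by enumerating every way of allocating the ranks to the groups. The sketch below does this for three groups, assuming no ties; it is an illustration only and is not a reproduction of how Kruskal and Wallis built their table. The function names are hypothetical. The chi-square tail probability for 2 degrees of freedom has the closed form exp(-H/2), which is used here to avoid any external library.

```python
from itertools import combinations
from math import exp


def h_from_rank_groups(groups, n):
    # Shortcut form of H (no ties): 12 / (n (n + 1)) * sum(R_i^2 / n_i) - 3 (n + 1)
    return 12.0 / (n * (n + 1)) * sum(sum(g) ** 2 / len(g) for g in groups) - 3 * (n + 1)


def exact_p_three_groups(sizes, h_obs):
    """Exact P(H >= h_obs) under the null hypothesis for three groups, no ties."""
    n1, n2, n3 = sizes
    n = n1 + n2 + n3
    ranks = list(range(1, n + 1))
    hits = count = 0
    for g1 in combinations(ranks, n1):
        rest = [r for r in ranks if r not in g1]
        for g2 in combinations(rest, n2):
            g3 = [r for r in rest if r not in g2]
            count += 1
            # small tolerance guards against floating-point round-off
            if h_from_rank_groups((g1, g2, g3), n) >= h_obs - 1e-9:
                hits += 1
    return hits / count, count


h_obs = 7.2  # value of H when three groups of 3 are perfectly separated
p_exact, n_arrangements = exact_p_three_groups((3, 3, 3), h_obs)
p_chi2 = exp(-h_obs / 2)  # chi-square upper tail for 2 degrees of freedom
print(n_arrangements, round(p_exact, 4), round(p_chi2, 4))
```

For three groups of 3 there are 1680 possible allocations, of which 6 give H = 7.2, so the exact probability is about 0.0036, whereas the chi-square approximation gives about 0.0273.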
Example: Bacteria counts in milk samples
To illustrate its use, we take the bacteria count data used in the earlier section on one-way ANOVA and rank the counts across all shipments combined. The ranks, which are used in place of the counts themselves, are shown in brackets:
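Bacteria count data, shipments 1-5, with ranks

| Shipment 1 | Shipment 2 | Shipment 3 | Shipment 4 | Shipment 5 |
|---|---|---|---|---|
| 24 (27.5) | 14 (14.5) | 11 (9) | 7 (4) | 19 (22.5) |
| 15 (16.5) | 7 (4) | 9 (7) | 7 (4) | 24 (27.5) |
| 21 (25) | 12 (11) | 7 (4) | 4 (1) | 19 (22.5) |
| 27 (29) | 17 (19) | 13 (13) | 7 (4) | 15 (16.5) |
| 33 (30) | 14 (14.5) | 12 (11) | 12 (11) | 10 (8) |
| 23 (26) | 16 (18) | 18 (20.5) | 18 (20.5) | 20 (24) |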
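The bracketed ranks can be reproduced by pooling all 30 counts, ranking them, and giving tied counts the average of the ranks they occupy. The following is a minimal sketch of that step in plain Python; the helper name `midranks` is an assumption made for illustration.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the current run of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


# Counts by shipment (the columns of the table above)
shipments = [
    [24, 15, 21, 27, 33, 23],
    [14, 7, 12, 17, 14, 16],
    [11, 9, 7, 13, 12, 18],
    [7, 7, 4, 7, 12, 18],
    [19, 24, 19, 15, 10, 20],
]
pooled = [x for s in shipments for x in s]
ranks = midranks(pooled)

# Split the pooled ranks back into shipments and total them
rank_sums = []
start = 0
for s in shipments:
    rank_sums.append(sum(ranks[start:start + len(s)]))
    start += len(s)
print(rank_sums)
```

The rank sums for shipments 1 to 5 come out as 154, 81, 64.5, 44.5 and 121, which total 465 = 30 × 31 / 2 as expected for 30 observations.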
We then carry out a KW analysis of variance on these data, in this case using the MATLAB function kruskalwallis(), which computes the ranks from the source data automatically. The resulting ANOVA table looks very similar to that produced earlier, with one obvious difference being that the statistics computed are evaluated using the chi-square distribution rather than the F distribution. If a standard ANOVA is carried out on the ranked data above (not a recommended procedure) it produces a rather different result but still shows the observed differences between columns to be significant. The chi-square value reported is 16.91, as compared with 16.80 produced using the formula without adjustment for ties. The correction used within the R library involves dividing the unadjusted result by (1 - sum(TIES^3 - TIES)/(n^3 - n)), where TIES is the number of observations sharing a tied value and the sum is over each group of tied values (in this example, 7 groups of ties).
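The arithmetic behind the 16.80 and 16.91 figures can be sketched as follows, starting from the tabulated ranks. This is a hand calculation under stated assumptions and not the MATLAB or R implementation; in particular, the closed-form tail probability used for the p-value is valid only for 4 degrees of freedom.

```python
from collections import Counter
from math import exp

# Ranks by shipment, copied from the table of counts and ranks
rank_groups = [
    [27.5, 16.5, 25, 29, 30, 26],
    [14.5, 4, 11, 19, 14.5, 18],
    [9, 7, 4, 13, 11, 20.5],
    [4, 4, 1, 4, 11, 20.5],
    [22.5, 27.5, 22.5, 16.5, 8, 24],
]
n = sum(len(g) for g in rank_groups)
k = len(rank_groups)

# Unadjusted H from the rank sums R_i
rank_sum_term = sum(sum(g) ** 2 / len(g) for g in rank_groups)
h_raw = 12.0 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)

# Tied observations share an average rank, so repeated ranks give the tie group sizes
rank_counts = Counter(r for g in rank_groups for r in g)
tie_sizes = [c for c in rank_counts.values() if c > 1]
correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n ** 3 - n)
h_adj = h_raw / correction

# Chi-square upper tail probability; this closed form holds for 4 df only
assert k - 1 == 4
p_value = exp(-h_adj / 2) * (1 + h_adj / 2)
print(round(h_raw, 2), len(tie_sizes), round(h_adj, 2), round(p_value, 4))
```

This gives an unadjusted H of about 16.80, 7 groups of ties, an adjusted H of about 16.91 and a probability of about 0.0020.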
Bacteria count KW ANOVA

| Source | Sums of squares | Degrees of freedom | Mean squares | Chi-square | Prob>Chi-sq |
|---|---|---|---|---|---|
| Shipments | 1302.25 | 4 | 325.5625 | 16.91 | 0.0020 |
| Residual error | 930.75 | 25 | 37.23 | | |
| Totals | 2233.00 | 29 | | | |
In this example the two methods produce similar results, showing that there is a significant difference between shipments. However, whereas the standard ANOVA procedure assumes that the underlying data are Normally distributed, the KW approach does not, and it is not sensitive to non-Normality. For example, suppose that one of the samples in shipment 4 was recorded as having a bacteria count of 70 rather than 7. The standard ANOVA for the data would be greatly affected by this outlier, as it greatly increases the residual or error variance and reduces the between-shipments variance, with the net effect that the F ratio becomes approximately 1 with a probability of 0.43, so non-significant. KW ANOVA, on the other hand, is much less affected: even though the rankings are altered, it yields a chi-square of 12.14 and a probability of 0.0163, still significant at the 5% level. Of course this is an extreme example, and the KW test still requires that samples are independent and can be meaningfully ranked, but it is clearly a much more robust approach than the parametric method. On the other hand, where the assumptions behind the parametric approach are met, it does provide a more sensitive test (i.e. it has greater power to detect real differences between groups).
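The effect of the outlier can be checked with a short hand calculation. The sketch below recomputes the one-way ANOVA F ratio and the tie-adjusted H with and without the altered count. The helper names are hypothetical, the closed-form chi-square tail is valid only for 4 degrees of freedom, and the F-ratio probability is not computed because it needs the F distribution from a statistics library.

```python
from collections import Counter
from math import exp


def midranks(values):
    """Rank values from 1..n, giving tied values the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def kw_h(groups):
    """Tie-adjusted Kruskal-Wallis H for groups of raw values."""
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = midranks(pooled)
    rank_sum_term, start = 0.0, 0
    for g in groups:
        rank_sum_term += sum(ranks[start:start + len(g)]) ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * rank_sum_term - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    return h / (1 - ties / (n ** 3 - n))


def f_ratio(groups):
    """One-way ANOVA F ratio: between-group over within-group mean square."""
    pooled = [x for g in groups for x in g]
    n, k = len(pooled), len(groups)
    grand = sum(pooled) / n
    ss_between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


shipments = [
    [24, 15, 21, 27, 33, 23],
    [14, 7, 12, 17, 14, 16],
    [11, 9, 7, 13, 12, 18],
    [7, 7, 4, 7, 12, 18],
    [19, 24, 19, 15, 10, 20],
]
with_outlier = [list(s) for s in shipments]
with_outlier[3][0] = 70  # one count of 7 in shipment 4 recorded as 70

for label, data in (("original", shipments), ("with outlier", with_outlier)):
    h = kw_h(data)
    p = exp(-h / 2) * (1 + h / 2)  # chi-square upper tail, valid for 4 df only
    print(label, round(f_ratio(data), 2), round(h, 2), round(p, 4))
```

With the outlier the F ratio drops to about 1.00, while H falls from about 16.91 to about 12.14 with a probability of about 0.0163.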
References
[KRU1] Kruskal W H, Wallis W A (1952) Use of ranks in one-criterion variance analysis. Journal of the American Statistical Association, 47(260), 583–621