Bartlett's M test


This test is due to Bartlett (1937, [BAR1]) and is one of the methods used to test whether a set of variance estimates, taken from k>2 independent samples, can be regarded as equal (homogeneous, or homoscedastic). The test assumes that the populations are Normally distributed (a fairly strict assumption) and that the samples are random, independent samples from the respective populations. Because it is sensitive to departures from Normality, Bartlett's test, although widely used, is regarded as non-robust. In the description of the test and the example provided below, we follow Pearson and Hartley (1954, [PEA1]).

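The test statistic is defined as follows:

$$M = \nu \ln\left(\frac{\sum_{i=1}^{k} \nu_i s_i^2}{\nu}\right) - \sum_{i=1}^{k} \nu_i \ln\left(s_i^2\right), \qquad \nu = \sum_{i=1}^{k} \nu_i$$

where s_i² is the variance of the i-th sample and ν_i = n_i − 1 is its degrees of freedom, n_i being the sample size.

Provided the population variances all equal some common (unknown) value, and none of the degrees of freedom is small, M is approximately chi-square distributed with k−1 degrees of freedom.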
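As a minimal sketch, the statistic can be computed directly from the sample variances and their degrees of freedom. The code assumes the form of M used in the children's weights example, namely ν ln(Σ ν_i s_i² / ν) − Σ ν_i ln(s_i²) with ν = Σ ν_i. The function name `bartlett_m` and the input values in the usage lines are hypothetical and chosen only for illustration.

```python
import math


def bartlett_m(variances, dofs):
    """Uncorrected Bartlett M from sample variances s_i^2 and degrees of freedom nu_i.

    Assumes every variance is strictly positive and every nu_i is at least 1.
    """
    if len(variances) != len(dofs) or len(variances) < 2:
        raise ValueError("need at least two samples, one variance per degrees-of-freedom entry")
    nu = sum(dofs)
    # Pooled variance: degrees-of-freedom weighted mean of the sample variances.
    pooled = sum(v * s2 for v, s2 in zip(dofs, variances)) / nu
    # M = nu * ln(pooled) - sum(nu_i * ln(s_i^2)); zero when all variances coincide.
    return nu * math.log(pooled) - sum(v * math.log(s2) for v, s2 in zip(dofs, variances))


# Hypothetical inputs: three samples of sizes 10, 15 and 12.
m = bartlett_m([4.0, 9.0, 6.5], [9, 14, 11])
print(f"M = {m:.3f}, to be compared with chi-square on {3 - 1} degrees of freedom")
```

M is zero when all sample variances are identical and grows as they diverge, which is why only the upper tail of the chi-square distribution is used.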

Example: Children's weights

This example is from Pearson and Hartley (1954, p. 58, [PEA1]). The table shows the variance of the weights of boys of a similar age, calculated from 10 independent samples. The null hypothesis is that the differences between these sample variances can be ascribed to random variation about a common population variance.

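| Sample | No. of boys | Weight variance, s_i² | ν_i | ln(s_i²) | ν_i ln(s_i²) | 1/ν_i | ν_i s_i² |
|---|---|---|---|---|---|---|---|
| 1 | 10 | 51 | 9 | 3.93 | 35.39 | 0.111 | 459 |
| 2 | 15 | 78 | 14 | 4.36 | 60.99 | 0.071 | 1092 |
| 3 | 21 | 91 | 20 | 4.51 | 90.22 | 0.050 | 1820 |
| 4 | 23 | 52 | 22 | 3.95 | 86.93 | 0.045 | 1144 |
| 5 | 15 | 101 | 14 | 4.62 | 64.61 | 0.071 | 1414 |
| 6 | 11 | 36 | 10 | 3.58 | 35.84 | 0.100 | 360 |
| 7 | 31 | 41 | 30 | 3.71 | 111.41 | 0.033 | 1230 |
| 8 | 15 | 76 | 14 | 4.33 | 60.63 | 0.071 | 1064 |
| 9 | 3 | 64 | 2 | 4.16 | 8.32 | 0.500 | 128 |
| 10 | 6 | 93 | 5 | 4.53 | 22.66 | 0.200 | 465 |
| Totals | 150 | | 140 | | 576.99 | 1.254 | 9176 |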

From the totals above, M can be computed as M = 140 ln(9176/140) − 576.99 ≈ 8.59. The critical value of the chi-square distribution with k−1 = 9 degrees of freedom, upper 5% tail, is 16.92. Since this is much larger than the observed figure, we infer that there is no real difference in the variances of the samples, and the assumption of homogeneity is therefore reasonable.

A larger value, for example M = 18.8, would exceed the chi-square figure, but is actually marginal at the 5% level because the chi-square approximation is not quite correct for small values of k. Indeed, it is poor for very small values of k (k<4). Closer approximations to the true distribution of M have been made, and the resulting adjustment factors give a critical figure of just under 18 at the 5% level. The adjusted measure would therefore still suggest that a value of 18.8 is significant and that the homogeneity assumption is risky.
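The arithmetic can be checked with a short script that rebuilds the column totals from the sample sizes and variances in the table and compares M with the 5% critical value of 16.92 quoted in the text. It also evaluates Bartlett's usual correction factor, C = 1 + (Σ 1/ν_i − 1/ν) / (3(k−1)), which is the quantity the 1/ν_i column feeds into. This is a sketch only: the correction is the standard one associated with Bartlett's test and is an assumption here, not necessarily the adjustment that Pearson and Hartley tabulate, and the variable names are illustrative.

```python
import math

# (sample size n_i, sample variance s_i^2) for the 10 samples in the table.
samples = [(10, 51), (15, 78), (21, 91), (23, 52), (15, 101),
           (11, 36), (31, 41), (15, 76), (3, 64), (6, 93)]

k = len(samples)
dofs = [n - 1 for n, _ in samples]          # nu_i = n_i - 1
variances = [s2 for _, s2 in samples]

nu = sum(dofs)                                                          # total degrees of freedom
sum_nu_s2 = sum(v * s2 for v, s2 in zip(dofs, variances))               # total of nu_i * s_i^2
sum_nu_ln = sum(v * math.log(s2) for v, s2 in zip(dofs, variances))     # total of nu_i * ln(s_i^2)
sum_inv_nu = sum(1.0 / v for v in dofs)                                 # total of 1/nu_i

# Uncorrected statistic, built from the column totals.
m = nu * math.log(sum_nu_s2 / nu) - sum_nu_ln

# Assumed standard Bartlett correction; M / C is referred to chi-square(k - 1).
c = 1.0 + (sum_inv_nu - 1.0 / nu) / (3.0 * (k - 1))
m_corrected = m / c

CRITICAL_5PCT = 16.92  # chi-square, 9 degrees of freedom, upper 5% tail (value from the text)

print(f"totals: nu={nu}, sum nu*s2={sum_nu_s2}, sum nu*ln(s2)={sum_nu_ln:.2f}, sum 1/nu={sum_inv_nu:.3f}")
print(f"M={m:.2f}, C={c:.4f}, M/C={m_corrected:.2f}")
print("reject homogeneity at 5%" if m > CRITICAL_5PCT else "no evidence against homogeneity at 5%")
```

Because C is always greater than 1, the corrected statistic is smaller than M, so the correction can only weaken the evidence against homogeneity, never strengthen it.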

References

[BAR1] Bartlett M S (1937) Properties of sufficiency and statistical tests. Proc. of the Royal Society of London, Series A, 160, 268–282

[PEA1] Pearson E S, Hartley H O eds. (1954) Biometrika Tables for Statisticians. 4th edition. Vol. 1, Cambridge University Press, Cambridge, UK