This test is due to Bartlett (1937, [BAR1]) and is one of the methods used to test whether a set of variance estimates, taken from k>2 independent samples, can be regarded as equal (that is, homogeneous or homoscedastic). The test assumes that the population or populations have Normal distributions (a fairly strict assumption) and that the samples are random, independent samples from the respective populations. Because it is sensitive to departures from Normality, Bartlett's test, although widely used, can be regarded as non-robust. In the description of the test and the example provided below, we follow Pearson and Hartley (1954, [PEA1]).
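The test statistic is defined as follows:

$$M = \nu \ln(s^2) - \sum_{i=1}^{k} \nu_i \ln(s_i^2)$$

where $s_i^2$ is the variance of sample $i$, $\nu_i = n_i - 1$ is its degrees of freedom for a sample of size $n_i$, $\nu = \sum_i \nu_i$ is the total degrees of freedom, and $s^2 = \sum_i \nu_i s_i^2 / \nu$ is the pooled variance.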
Provided the samples all come from populations with the same (unknown) variance, and none of the degrees of freedom is small, M is approximately distributed as chi-square with k−1 degrees of freedom.
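The calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual uncorrected form M = ν ln(s²) − Σ νi ln(si²), with s² the pooled variance and ν the total degrees of freedom. The function name `bartlett_m` and the input values are hypothetical, chosen only to show the behaviour.

```python
import math

def bartlett_m(variances, dfs):
    """Uncorrected Bartlett statistic M from sample variances and their degrees of freedom."""
    if len(variances) != len(dfs) or len(variances) < 2:
        raise ValueError("need at least two samples, with one df per variance")
    nu = sum(dfs)  # total degrees of freedom
    # Pooled variance: df-weighted mean of the sample variances
    pooled = sum(v * s2 for v, s2 in zip(dfs, variances)) / nu
    return nu * math.log(pooled) - sum(v * math.log(s2) for v, s2 in zip(dfs, variances))

# Hypothetical inputs: identical variances versus widely differing variances
m_equal = bartlett_m([4.0, 4.0, 4.0], [9, 14, 20])
m_unequal = bartlett_m([1.0, 4.0, 16.0], [9, 14, 20])
print(m_equal, m_unequal)
```

When all sample variances coincide, the pooled variance equals each of them and M is zero (up to floating-point error). M grows as the variances spread apart, which is why only the upper tail of the chi-square distribution is used.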
Example: Children's weights
This example is from Pearson and Hartley (1954, p58, [PEA1]). The table shows the variance of the weights of boys of a similar age calculated from 10 independent samples. The null hypothesis is that these weight variations can be ascribed to random variation from a common population variance.
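| Sample | # of boys | Weight var, $s_i^2$ | $\nu_i$ | $\ln(s_i^2)$ | $\nu_i \ln(s_i^2)$ | $1/\nu_i$ | $\nu_i s_i^2$ |
|---|---|---|---|---|---|---|---|
| 1 | 10 | 51 | 9 | 3.93 | 35.39 | 0.111 | 459 |
| 2 | 15 | 78 | 14 | 4.36 | 60.99 | 0.071 | 1092 |
| 3 | 21 | 91 | 20 | 4.51 | 90.22 | 0.050 | 1820 |
| 4 | 23 | 52 | 22 | 3.95 | 86.93 | 0.045 | 1144 |
| 5 | 15 | 101 | 14 | 4.62 | 64.61 | 0.071 | 1414 |
| 6 | 11 | 36 | 10 | 3.58 | 35.84 | 0.100 | 360 |
| 7 | 31 | 41 | 30 | 3.71 | 111.41 | 0.033 | 1230 |
| 8 | 15 | 76 | 14 | 4.33 | 60.63 | 0.071 | 1064 |
| 9 | 3 | 64 | 2 | 4.16 | 8.32 | 0.500 | 128 |
| 10 | 6 | 93 | 5 | 4.53 | 22.66 | 0.200 | 465 |
| Totals | 150 | | 140 | | 576.99 | 1.254 | 9176 |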
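The derived columns of the table can be recomputed from the two raw columns (number of boys and weight variance). The following sketch assumes νi = ni − 1, as the table implies. The variable names are illustrative only.

```python
import math

# (number of boys, weight variance) for the ten samples in the table
samples = [(10, 51), (15, 78), (21, 91), (23, 52), (15, 101),
           (11, 36), (31, 41), (15, 76), (3, 64), (6, 93)]

rows = []
for n, s2 in samples:
    nu = n - 1  # degrees of freedom, assumed to be n - 1
    rows.append((nu, math.log(s2), nu * math.log(s2), 1 / nu, nu * s2))

# Column totals corresponding to the Totals row
total_n = sum(n for n, _ in samples)
total_nu = sum(r[0] for r in rows)
total_nu_ln = sum(r[2] for r in rows)
total_inv_nu = sum(r[3] for r in rows)
total_nu_s2 = sum(r[4] for r in rows)

print(total_n, total_nu, round(total_nu_ln, 2), round(total_inv_nu, 3), total_nu_s2)
```

Up to rounding, this should reproduce the Totals row (150, 140, 576.99, 1.254 and 9176).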
From the totals above M can be computed as M = 140 ln(9176/140) − 576.99 ≈ 8.6. The upper 5% critical value of the chi-square distribution with k−1 = 9 degrees of freedom is 16.92, which is much larger than the observed figure. We therefore infer that there is no real difference in the variances of the samples, and the assumption of homogeneity is reasonable.

A larger value, for example M = 18.8, would exceed the chi-square figure, but is actually marginal at the 5% level because the chi-square approximation is not quite correct for small values of k. Indeed it is poor for very small values of k (k<4). Closer approximations to the true distribution of M have been made, and adjustment factors result in a critical figure just under 18 at the 5% level. On the adjusted measure a value of 18.8 is therefore still significant, and the homogeneity assumption would be risky.
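The arithmetic can be checked directly from the table totals. The sketch below also applies one common adjustment, Bartlett's correction factor C = 1 + [Σ(1/νi) − 1/ν] / [3(k−1)], under which M/C is referred to the chi-square distribution. This choice is an assumption: the text does not say which adjustment it has in mind, and Pearson and Hartley may use a different approximation.

```python
import math

# Totals taken from the table (rounded as printed there)
k = 10                 # number of samples
nu_total = 140         # sum of the degrees of freedom
sum_nu_s2 = 9176       # sum of nu_i * s_i^2
sum_nu_ln_s2 = 576.99  # sum of nu_i * ln(s_i^2)
sum_inv_nu = 1.254     # sum of 1 / nu_i
chi2_crit = 16.92      # upper 5% point of chi-square with 9 df, as quoted in the text

# Uncorrected Bartlett statistic
m = nu_total * math.log(sum_nu_s2 / nu_total) - sum_nu_ln_s2

# Assumption: Bartlett's correction factor, not stated in the source
c = 1 + (sum_inv_nu - 1 / nu_total) / (3 * (k - 1))

# Comparing M/C with chi2_crit is equivalent to comparing M with c * chi2_crit
adjusted_crit = c * chi2_crit

print(round(m, 2), round(c, 4), round(adjusted_crit, 2))
```

With these rounded totals, M comes out at roughly 8.6 and C at about 1.046, so the adjusted 5% threshold for M is about 17.7. That agrees with the "just under 18" figure quoted above, so a value of 18.8 would still be judged significant after adjustment.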
References
[BAR1] Bartlett M S (1937) Properties of sufficiency and statistical tests. Proc. of the Royal Society of London, Series A, 160, 268–282
[PEA1] Pearson E S, Hartley H O eds. (1954) Biometrika Tables for Statisticians. 4th edition. Vol. 1, Cambridge University Press, Cambridge, UK