The Anderson-Darling (AD) statistic is a goodness-of-fit test used primarily to decide whether a sample of size n is drawn from a specified distribution, most commonly a Normal distribution such as N(0,1). In this context it is widely regarded as one of the best statistics of its type, even with relatively small sample sizes. The statistic requires an ordered set of sample data without duplicates, {Y1<Y2<Y3<...<Yn}, and the cumulative distribution function (cdf), F(), of the distribution being considered. As such it is similar to the Kolmogorov-Smirnov (KS) test (see below) but with improved sensitivity in the tails of the distribution F(). Statistics of this type are sometimes referred to as EDF tests, because they compare the empirical distribution function (EDF) of the sample data with some known distribution function. Like the KS test it uses a form of distance function to measure the discrepancy between the EDF and F(), but whereas KS uses only the largest deviation, the AD statistic considers the full range of data values. The type of function used is based on the Cramér-von Mises goodness-of-fit approach, which computes the (weighted) squared area between the stepped sample cdf, Fn(x), and the hypothesized cdf, F(x) (the diagonal, y=x, once the data have been transformed by F()), using the integral:

$$n\int_{-\infty}^{\infty}\left[F_n(x)-F(x)\right]^2\,w(x)\,dF(x)$$

where w(x) is a weight function: w(x)=1 gives the Cramér-von Mises statistic, while w(x)=1/[F(x)(1-F(x))] gives the AD statistic and is what places extra emphasis on the tails.
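
The contrast between the "largest deviation" used by KS and the "squared area" used by the Cramér-von Mises family can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `ks_and_cvm` is hypothetical, the input is assumed to be values already transformed by the hypothesized cdf (so that they lie in [0,1]), and the unweighted Cramér-von Mises form is shown rather than the AD weighting.

```python
def ks_and_cvm(u):
    """Return (D, W2) for values u = F(Y) already transformed by the hypothesized cdf."""
    u = sorted(u)
    n = len(u)
    # KS: the single largest vertical gap between the stepped EDF and the diagonal y = x,
    # checked just before and just after each step.
    d = max(max(i / n - ui, ui - (i - 1) / n) for i, ui in enumerate(u, start=1))
    # Cramer-von Mises (unweighted): squared area between the EDF and the diagonal,
    # using the standard computational formula.
    w2 = 1.0 / (12 * n) + sum(
        (ui - (2 * i - 1) / (2 * n)) ** 2 for i, ui in enumerate(u, start=1)
    )
    return d, w2


# Hypothetical transformed values, evenly spread over [0, 1].
d, w2 = ks_and_cvm([0.25, 0.5, 0.75])
```

Every observation contributes to `w2`, whereas `d` is determined by one point only, which is the distinction drawn in the text.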

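The AD statistic makes use of the observation that, to obtain a random sample of values from any given probability density function, f(x), one computes the cdf, F(x), takes random samples, u, from a Uniform distribution over the range [0,1], and then looks up the value of x for which F(x)=u. Conversely, if the sample really is drawn from F(), the transformed values F(Yi) behave like an ordered sample from the Uniform distribution on [0,1]. The test statistic, conventionally written A² (and referred to simply as A below), is of the form:

$$A^2 = -n - \frac{1}{n}\sum_{i=1}^{n}(2i-1)\left[\ln F(Y_i) + \ln\left(1-F(Y_{n+1-i})\right)\right]$$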

Looking closely at this expression we see that the summation is a form of weighted cross-product of the sampled values, transformed by the chosen function, F(x): each lower-tail value F(Yi) is paired with the corresponding upper-tail value 1-F(Yn+1-i), and the logarithms make values close to 0 or 1 count heavily. The transformation effectively allows one to compare the sampled data with a Uniform distribution or, seen graphically, with the diagonal (y=x) of a graph whose vertical axis reflects the cdf, F(), for example a Normal probability plot (for an example, see above).
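
The summation can be written out directly. The following Python sketch assumes the standard computational form of the statistic, A² = -n - (1/n) Σ (2i-1)[ln F(Yi) + ln(1-F(Yn+1-i))]; the function name `anderson_darling` and the sample values are hypothetical, and the cdf is supplied by the caller as a fully specified function.

```python
import math
from statistics import NormalDist


def anderson_darling(sample, cdf):
    """Anderson-Darling statistic A^2 of `sample` against a fully specified cdf."""
    y = sorted(sample)                      # ordered set Y1 < Y2 < ... < Yn
    n = len(y)
    f = [cdf(v) for v in y]                 # transformed values F(Yi) in (0, 1)
    # Weighted sum pairing F(Yi) with 1 - F(Y(n+1-i)); index n - i is the
    # 0-based position of Y(n+1-i).
    # Note: math.log fails if the cdf returns exactly 0 or 1 for an extreme value.
    s = sum(
        (2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
        for i in range(1, n + 1)
    )
    return -n - s / n


# Hypothetical data tested against a standard Normal, N(0,1).
a2 = anderson_darling([-1.2, -0.4, 0.1, 0.3, 0.9, 1.7], NormalDist(0.0, 1.0).cdf)
```

Passing the cdf as an argument reflects the point made above: the same calculation applies to any distribution F(), since only the transformed values F(Yi) enter the sum.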

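If the function, F(), is a Normal distribution with known mean and standard deviation, i.e. it is fully specified in advance, and the sample size n>5, then the critical values of the AD statistic at the usual significance levels are as follows: 1%:3.857 (Marsaglia, 2004, states the correct figure is 3.878, [MAR1]); 5%:2.492; 10%:1.933. The hypothesis that the sample is drawn from F() is rejected at a given level if the computed statistic exceeds the corresponding value.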
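
Using these figures is a simple lookup. The sketch below assumes the statistic has already been computed; the names `CRITICAL_VALUES` and `rejected_levels` are hypothetical, and the 1% entry uses the 3.857 figure quoted above rather than Marsaglia's 3.878.

```python
# Critical values quoted in the text for a fully specified Normal, n > 5.
# Keys are significance levels, values are the thresholds for the AD statistic.
CRITICAL_VALUES = {0.10: 1.933, 0.05: 2.492, 0.01: 3.857}


def rejected_levels(a2, critical_values=CRITICAL_VALUES):
    """Return the significance levels at which the fit is rejected, largest level first."""
    return [
        alpha
        for alpha, crit in sorted(critical_values.items(), reverse=True)
        if a2 > crit
    ]


# A statistic of 2.0 exceeds only the 10% critical value.
levels = rejected_levels(2.0)
```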

More typically the parameters of the Normal distribution against which comparison is being made are not known in advance, and are therefore estimated using the mean and standard deviation of the sample dataset in the usual manner. In this instance the approach taken is to form the ordered set {Yi} from the sample data, {xi}, by standardization to give a dataset with zero mean and unit standard deviation for comparison with the Normal, N(0,1):

$$Y_i = \frac{x_i - \bar{x}}{s}$$

where x̄ is the sample mean, s is the sample standard deviation, and the resulting values are sorted into ascending order.

For small sample sizes the computed statistic is often multiplied by an adjustment factor, k, which varies according to the distribution being fitted. For the Normal, with estimated mean and standard deviation, the adjustment widely used is:

$$k = 1 + \frac{4}{n} - \frac{25}{n^2}$$

and the critical values for kA in this case, based on Stephens (1972 and later papers, [STE1], [STE2]), are: 1%:1.092; 2.5%:0.918; 5%:0.787; 10%:0.656.
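
The estimated-parameter procedure can be sketched end to end as follows. Several things here are assumptions rather than details given in the text: the adjustment is taken to be k = 1 + 4/n - 25/n², the form usually paired with the critical values listed above; the standard deviation uses the n-1 divisor; and the names `adjusted_ad_normal` and `STEPHENS_CRITICAL` are hypothetical. The sample is artificial, built from evenly spaced Normal quantiles so that the run is reproducible.

```python
import math
from statistics import NormalDist, mean, stdev

# Critical values for the adjusted statistic kA quoted in the text.
STEPHENS_CRITICAL = {0.10: 0.656, 0.05: 0.787, 0.025: 0.918, 0.01: 1.092}


def adjusted_ad_normal(sample):
    """Return (A2, k, k*A2) for a Normal with mean and sd estimated from the sample."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)           # stdev uses the n - 1 divisor
    y = sorted((x - m) / s for x in sample)      # standardized, ordered values Yi
    f = [NormalDist().cdf(v) for v in y]         # compare with N(0,1)
    a2 = -n - sum(
        (2 * i - 1) * (math.log(f[i - 1]) + math.log(1.0 - f[n - i]))
        for i in range(1, n + 1)
    ) / n
    k = 1 + 4 / n - 25 / n ** 2                  # ASSUMED small-sample adjustment
    return a2, k, k * a2


# Artificial, near-Normal sample: 20 evenly spaced quantiles of N(10, 2).
n = 20
sample = [10 + 2 * NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
a2, k, ka = adjusted_ad_normal(sample)
rejected = [alpha for alpha, crit in STEPHENS_CRITICAL.items() if ka > crit]
```

Because the data are standardized first, the result does not depend on the location or scale of the original measurements, only on their shape.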

In their original 1952 paper [AND1], Anderson & Darling did not derive a closed-form solution for the distribution of A for arbitrary sample size n; their principal result is for the limiting distribution as n→∞. To obtain the best possible estimate of the significance values, p, Marsaglia (2004, [MAR1]) ran large-scale simulations for sample sizes n=8,16,32,64,128 and then fitted a family of polynomial functions to the difference, or error, between the observed values and the target of a Uniform distribution. This is essentially an adjustment to the limiting value for finite sample sizes, based on simulation results. Note that this procedure applies to any fully specified cumulative distribution function, F(), not just the Normal.
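
The idea of estimating significance values by simulation can be illustrated on a very small scale. The sketch below is not Marsaglia's procedure and fits no polynomial corrections; it simply draws repeated Uniform samples of size n, computes the statistic for each, and counts how often a chosen threshold is exceeded. The function names, the seed, and the number of repetitions are arbitrary choices made for illustration.

```python
import math
import random


def ad_uniform(u):
    """AD statistic for values assumed to be Uniform on (0, 1), i.e. F(x) = x."""
    u = sorted(u)
    n = len(u)
    s = sum(
        (2 * i - 1) * (math.log(u[i - 1]) + math.log(1.0 - u[n - i]))
        for i in range(1, n + 1)
    )
    return -n - s / n


def simulate_exceedance(n, threshold, reps=4000, seed=12345):
    """Estimate P(A2 > threshold) for sample size n by Monte Carlo simulation."""
    rng = random.Random(seed)
    count = 0
    for _ in range(reps):
        # random() lies in [0, 1); guard against an exact 0 before taking logs.
        u = [max(rng.random(), 1e-300) for _ in range(n)]
        if ad_uniform(u) > threshold:
            count += 1
    return count / reps


# Estimated tail probability beyond the quoted 5% critical value, for n = 32.
p_hat = simulate_exceedance(n=32, threshold=2.492)
```

Since only the transformed values F(Yi) enter the statistic, simulating from the Uniform distribution covers any fully specified F(), which is why a single set of simulations serves for all such distributions.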

Although software packages typically cite Anderson & Darling (1952) and/or Stephens (1972 and later papers, [STE1], [STE2]), the specific computations are rarely documented. R implementations, by contrast, provide source code and relevant references: for example, the R package ADGofTest by Bellosta uses the computational procedure described in Marsaglia's 2004 paper.

[AND1] Anderson T W, Darling D A (1952) Asymptotic theory of certain "goodness-of-fit" criteria based on stochastic processes. Annals of Mathematical Statistics, 23, 193–212. Available from: http://projecteuclid.org/DPubS/Repository/1.0/Disseminate?view=body&id=pdf_1&handle=euclid.aoms/1177729437

[MAR1] Marsaglia G (2004) Evaluating the Anderson-Darling distribution. J Stat. Software, 9(2). Available from: http://www.jstatsoft.org/v09/i02/paper

[STE1] Stephens M A (1972) EDF tests for goodness of fit: Part I. US Office of Naval Research, Tech. Rpt. 186

[STE2] Stephens M A (1974) EDF tests for goodness of fit and some comparisons. J Amer. Statistical Assoc., 69(347), 730-737

Wikipedia: http://en.wikipedia.org/wiki/Anderson-Darling_test

NIST: http://www.itl.nist.gov/div898/handbook/eda/section3/eda35e.htm