The Shapiro-Wilk (SW) test for Normality was introduced by Shapiro and Wilk [SHA1]. It builds on the observation that a Normal probability plot, which examines the fit of a sample dataset to the Normal, is rather like linear regression: the diagonal line of the graph is the line of perfect fit, and divergence from this line is similar to the residuals in regression. By analyzing the scale of this variation (analysis of variance) the quality of the fit can be examined. The authors recommended their statistic for use with smaller samples (e.g. <20), and empirical tests of its power and sensitivity against a range of other tests and a variety of non-Normal distributions showed that it is indeed an effective and sensitive measure. The test can also be applied to large samples, as suggested by Royston [ROY1], who produced algorithms implementing his extension; these are implemented in the R stats package as shapiro.test().

The test statistic is of the form:

$$W = \frac{\left(\sum_{i=1}^{n} a_i y_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2}$$
where the $y_i$ are the sample data, sorted by size (ordered), and the $a_i$ are constants to be evaluated. The idea behind the SW test is that if the sample data are indeed a random sample from a Normal distribution with unknown mean, $\mu$, and variance, $\sigma^2$, then we should be able to represent the sample data through a simple linear equation:

$$y_i = \mu + \sigma x_i$$
where the $x_i$ are an ordered set of random N(0,1) variates. A least squares fit of the (x,y) pairs provides the means to determine the unknown coefficients $a_i$. The vector of these coefficients is obtained from the matrix expression:

$$a^T = \frac{m^T V^{-1}}{\left(m^T V^{-1} V^{-1} m\right)^{1/2}}$$
where V is the variance-covariance matrix of the elements of the vector x, and the vector m is the expected value of the elements of x, i.e. the mean values of the order statistics for the Normal distribution. The statistic W is scale and origin invariant, with a maximum value of 1 and a minimum of $n a_1^2/(n-1)$; for n>10 the minimum is thus approximately $a_1^2$, the square of the smallest coefficient. Unfortunately the distribution of W for general n is not known and must be obtained by simulation and/or tabulation of the results, or by approximation (as is the case with Royston's approach). The statistic is rather like a squared correlation coefficient (or coefficient of determination), so a high value indicates a closer match to the Normal, but this by itself is not sufficient: high values can often be found using small samples of data that are not Normal. The authors claim that it is particularly sensitive to skewness and long-tailed distributions.
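To make the construction of W concrete, the following is a minimal Python sketch (standard library only) that computes the statistic without tabulated coefficients. It is not the exact matrix solution above: the coefficients are approximated from Normal scores with polynomial corrections for the two extreme values, in the style of Royston's later published algorithm. That choice, the constants C1 and C2, and the function names sw_coefficients and shapiro_wilk_w are assumptions introduced here for illustration. No p-value is attempted.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Polynomial corrections (in u = n**-0.5) for the two extreme coefficients.
# Assumption: these constants follow Royston's later approximation and are
# not given in the text above.
C1 = (0.0, 0.221157, -0.147981, -2.071190, 4.434685, -2.706056)
C2 = (0.0, 0.042981, -0.293762, -1.752461, 5.682633, -3.582633)


def _poly(coeffs, u):
    """Evaluate sum(coeffs[k] * u**k)."""
    return sum(c * u**k for k, c in enumerate(coeffs))


def sw_coefficients(n):
    """Approximate Shapiro-Wilk coefficients a_1..a_n (sketch, n >= 6 only)."""
    if n < 6:
        raise ValueError("this sketch only handles n >= 6")
    nd = NormalDist()
    # Approximate expected values of the N(0,1) order statistics (the vector m).
    m = [nd.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    mm = sum(v * v for v in m)
    u = 1.0 / sqrt(n)
    a = [0.0] * n
    # The two largest coefficients get a polynomial correction.
    a[-1] = m[-1] / sqrt(mm) + _poly(C1, u)
    a[-2] = m[-2] / sqrt(mm) + _poly(C2, u)
    # Rescale the remaining scores so that the squared coefficients sum to 1.
    phi = (mm - 2 * m[-1] ** 2 - 2 * m[-2] ** 2) / (
        1 - 2 * a[-1] ** 2 - 2 * a[-2] ** 2
    )
    for i in range(2, n - 2):
        a[i] = m[i] / sqrt(phi)
    # The coefficients are antisymmetric: a_i = -a_(n+1-i).
    a[0], a[1] = -a[-1], -a[-2]
    return a


def shapiro_wilk_w(sample):
    """W = (sum a_i y_i)^2 / sum (y_i - ybar)^2 for the ordered sample."""
    y = sorted(sample)
    a = sw_coefficients(len(y))
    mean = fmean(y)
    ss = sum((v - mean) ** 2 for v in y)
    b = sum(ai * yi for ai, yi in zip(a, y))
    return b * b / ss


# The 11 weights used in the example that follows.
weights = [148, 154, 158, 160, 161, 162, 166, 170, 182, 195, 236]
w = shapiro_wilk_w(weights)
print(f"W = {w:.4f}")
```

Applied to the 11 weights in the example that follows, this sketch should give a value of W close to the one reported there. For real analyses an established implementation such as shapiro.test() remains the safer choice, since it also supplies the p-value.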

Example: Weights and heights of sampled men

Shapiro and Wilk give the example of the weights of 11 randomly selected men. The ordered data values were: 148 154 158 160 161 162 166 170 182 195 236, and the shapiro.test() function in R gives the result: W = 0.7888, p-value = 0.006704, i.e. these data are very unlikely to have been drawn from a Normal distribution. This is also clear from a Normal QQ plot of the data, shown below together with a line drawn through the first and third quartiles.
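The coordinates behind such a QQ plot, and the reference line through the quartiles, can be sketched numerically in Python as below. The plotting positions (i - 0.5)/n and the "inclusive" quartile rule are assumptions chosen for illustration (other conventions exist and shift the points slightly), and the function names qq_points and quartile_line are hypothetical.

```python
from statistics import NormalDist, quantiles

weights = [148, 154, 158, 160, 161, 162, 166, 170, 182, 195, 236]


def qq_points(sample):
    """Pair each ordered value with a theoretical N(0,1) quantile."""
    y = sorted(sample)
    n = len(y)
    nd = NormalDist()
    # Plotting positions (i - 0.5) / n: one common convention (assumed here).
    x = [nd.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(x, y))


def quartile_line(sample):
    """Slope and intercept of the line through the first and third quartiles."""
    # "inclusive" interpolates between order statistics of the sample itself.
    q1, _, q3 = quantiles(sample, n=4, method="inclusive")
    nd = NormalDist()
    x1, x3 = nd.inv_cdf(0.25), nd.inv_cdf(0.75)
    slope = (q3 - q1) / (x3 - x1)
    intercept = q1 - slope * x1
    return slope, intercept


slope, intercept = quartile_line(weights)
print(f"line: y = {intercept:.1f} + {slope:.2f} x")
print("theoretical  observed  residual")
for x, y in qq_points(weights):
    fitted = intercept + slope * x
    print(f"{x:11.2f}  {y:8.0f}  {y - fitted:8.1f}")
```

Under these assumptions the largest observation (236) lies far above the reference line, which is the long right tail that the low value of W reflects.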

References

[ROY1] Royston P (1982) An extension of Shapiro and Wilk's W test for normality to large samples. Applied Statistics, 31, 115–124

[SHA1] Shapiro S S, Wilk M B (1965) An Analysis of Variance Test for Normality (Complete Samples). Biometrika, 52(3/4), 591–611