Analysis of variance and covariance


Analysis of variance, often abbreviated to ANOVA, is a family of methods for comparing the mean values of three or more sets of data, each of which represents an independent random sample. ANOVA methods can be applied to the comparison of the means of two groups, but in this case the analysis is equivalent to the standard two-mean t-test and is therefore not generally applied in this context. The technique was originally developed by R A Fisher in the 1920s, principally as a way of comparing the results (yields) obtained in agricultural trials. ANOVA compares mean values by separating the total variance of a dataset into distinct components. Typically these are the variance within groups (sometimes referred to as the within-group variance or error variance), which is treated as unexplained, and the between-group variance, which is a function of the size of the differences in the mean values of the various groups and is treated as explained by underlying differences between the groups. A small within-group variance indicates that the groupings are relatively homogeneous internally, whilst a large between-group variance indicates that the groups are relatively distinct with respect to the measured variable (or response variable). If the between-group variance is much greater than the within-group variance, the differences between groups are much more likely to be real than to be the result of chance variation. Under a set of well-defined conditions the statistical significance of these differences can be evaluated.
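The partition of variation described above can be illustrated with a minimal Python sketch. The function name `one_way_anova` and the yield figures are made up for illustration and only the standard library is used. No p-value is computed, since that would require the F-distribution with k - 1 and n - k degrees of freedom.

```python
from statistics import mean


def one_way_anova(groups):
    """Split the total sum of squares into between- and within-group parts."""
    all_values = [y for g in groups for y in g]
    grand_mean = mean(all_values)
    k = len(groups)       # number of groups
    n = len(all_values)   # total number of observations

    # Between-group: spread of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group: spread of observations around their own group mean
    ss_within = sum((y - mean(g)) ** 2 for g in groups for y in g)
    # Total: spread of all observations around the grand mean
    ss_total = sum((y - grand_mean) ** 2 for y in all_values)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "f": ms_between / ms_within,
    }


# Hypothetical yields for three treatments, three plots each
yields = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
result = one_way_anova(yields)
print(result)
```

For any dataset the between-group and within-group sums of squares add up to the total sum of squares, which is the identity on which the method rests.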

ANOVA is always determined with respect to an underlying model, in many cases a model that has been devised as part of the design of experiments (DoE) process. The most basic linear model, which we introduced in the discussion of completely randomized designs, is of the form:

$$y_{ij} = \mu + T_i + e_{ij}$$

This states that the observed or measured value y (observation j in group i) is a linear combination of an overall mean value, μ, plus a treatment or group effect, T, plus some unexplained random variation, or error, e. It is assumed that the error component has a mean value of 0 and a common variance across groups (homoscedasticity, which may be evaluated using Bartlett's or Brown-Forsythe-Levene's test), that these errors are Normally distributed (Normality), that observations are random and independent (independence), and that the means and variances are additive (additivity). Collectively these assumptions enable the ratio of between-group to within-group variances to be evaluated by comparison with the F-distribution (which, as we have seen, can be obtained as the ratio of two independent sums of squared standard Normal variables, each divided by its degrees of freedom). Where the data do not meet these requirements a non-parametric form of ANOVA may be possible, based on the ranks of the data rather than the data themselves. This is known as the Kruskal-Wallis (KW) one-way analysis of variance.
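As a rough illustration of the rank-based alternative, the following sketch computes the Kruskal-Wallis H statistic from the pooled ranks of the data. The helper names and the data are hypothetical, and the usual correction for ties is omitted to keep the example short. For reasonably large samples H is generally compared with a chi-square distribution on k - 1 degrees of freedom.

```python
def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value is tied with this one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of ranks start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def kruskal_wallis_h(groups):
    """H statistic without tie correction (an assumption of this sketch)."""
    pooled = [y for g in groups for y in g]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total = 0.0
    offset = 0
    for g in groups:
        rank_sum = sum(ranks[offset:offset + len(g)])
        total += rank_sum ** 2 / len(g)
        offset += len(g)
    return 12.0 / (n * (n + 1)) * total - 3 * (n + 1)


# Hypothetical measurements for three groups
samples = [[2.1, 3.4, 1.9], [4.4, 5.0, 4.1], [6.3, 7.7, 6.9]]
h = kruskal_wallis_h(samples)
print(h)
```

Because only the ranks enter the calculation, the statistic is unaffected by any monotonic transformation of the data, which is why it does not rely on the Normality assumption.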

In the basic model above, there is a single treatment factor, T, defined at a fixed number of levels or into a number of categories. This kind of model is known as a fixed-effects or Type 1 ANOVA model. For example, the treatment might be the application of a fertilizer and the levels a pre-defined set of rates per square meter; alternatively the 'treatment' might be a set of geographical regions, or a set of different varieties of wheat to be compared. These are examples of fixed-effect one-way ANOVA models (or single factor models). In these models there is one dimension to the analysis. Set up as a table, the rows (or columns) of the table will contain the treatment factor levels or groupings, and the second dimension of the table will contain the replicates or data items applying to that grouping. With a two-way ANOVA the rows and columns of the data table are both groupings or classifications of interest, with cell entries being the data (one or more samples). For example, in the agricultural case, the rows (R) might be the levels of fertilizer treatment applied, with the columns (C) being the varieties being studied. For a two-factor model, without replication, the basic model is of the form:

$$y_{ij} = \mu + R_i + C_j + e_{ij}$$

where the i and j subscripts now refer simply to row and column effects. If replication is provided for in the design, the model becomes a little more complex, and the row and column components are described as main effects, with additional information becoming available on interactions between the row and column effects. The interaction effect, δ, provides a measure of the joint effect of factors 1 and 2, over and above their separate effects. For example, in many industrial experiments, both pressure and temperature are varied, and each may have an impact on the results obtained (the response variable), but their combined effect may also be important. An observed interaction effect may, of course, be the result of random variation in the data or, in some cases, an indication that some other factor of importance has been omitted from the analysis (a confounding effect). The basic model for a two-factor ANOVA with replication, where the subscript k identifies the replicate within each row and column combination, is of the form:

$$y_{ijk} = \mu + R_i + C_j + \delta_{ij} + e_{ijk}$$

thus there is an overall mean effect, a row effect, a column effect and an interaction effect, plus random error. These core models can be extended to more than two independent variables, producing three-way or multi-way analysis of variance. Multivariate analysis of variance (MANOVA) is used where, instead of a single dependent variable, y, there are multiple dependent variables.
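The decomposition into row, column, interaction and error components can be sketched as follows for a balanced design (the same number of replicates in every cell). The function name `two_way_anova`, the nested-list layout and the data are assumptions made for illustration only.

```python
from statistics import mean


def two_way_anova(cells):
    """cells[i][j] holds the replicates for row level i and column level j."""
    r, c = len(cells), len(cells[0])
    k = len(cells[0][0])  # replicates per cell, assumed equal in every cell

    grand = mean(y for row in cells for cell in row for y in cell)
    row_means = [mean(y for cell in row for y in cell) for row in cells]
    col_means = [mean(y for row in cells for y in row[j]) for j in range(c)]
    cell_means = [[mean(cell) for cell in row] for row in cells]

    ss_rows = c * k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * k * sum((m - grand) ** 2 for m in col_means)
    # Interaction: what is left of each cell mean after removing main effects
    ss_inter = k * sum(
        (cell_means[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r) for j in range(c)
    )
    # Error: variation of replicates around their own cell mean
    ss_error = sum(
        (y - cell_means[i][j]) ** 2
        for i in range(r) for j in range(c) for y in cells[i][j]
    )
    ms_error = ss_error / (r * c * (k - 1))
    return {
        "ss_rows": ss_rows,
        "ss_cols": ss_cols,
        "ss_inter": ss_inter,
        "ss_error": ss_error,
        "f_rows": ss_rows / (r - 1) / ms_error,
        "f_cols": ss_cols / (c - 1) / ms_error,
        "f_inter": ss_inter / ((r - 1) * (c - 1)) / ms_error,
    }


# Hypothetical 2 x 2 design with two replicates per cell
cells = [
    [[10.0, 12.0], [20.0, 22.0]],
    [[14.0, 16.0], [30.0, 32.0]],
]
result = two_way_anova(cells)
print(result)
```

Without replication (k = 1) the error term has no degrees of freedom of its own, which is why the interaction effect can only be separated from random error when replicates are available.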

In these examples it has been assumed that the experiments are complete (all factors at all levels have at least one observation). In some instances, notably in factorial experiments, a complete set of trials is not feasible and a fractional factorial experiment is conducted. In these designs the interaction effects are typically sacrificed in order to ensure that the main effects are obtained despite the limitations of the study: some or all of the interaction effects will be confounded with the main effects, but all the main effects will be obtained. It has also been assumed that where replication occurs, the number of replicates is the same for all factor combinations. In practice this may be difficult to achieve, but in cases where the numbers are approximately equal and/or are proportional to the frequency of the individual factors, the theory applied in ANOVA remains valid.

In addition to fixed effects models there are random effects or Type 2 models, and mixtures of both Type 1 and Type 2 cases. Random effects are applicable where the treatments are random samples from a population or populations of treatments. For example, an experiment run on three machines or production lines chosen at random from 40 available machines would comprise a random effects model (or mixed model, depending on the specification of other factors). In general random effects and mixed models are used to examine the variability of the treatments and their use in predictive modeling, rather than to make inferences regarding the treatment levels or groups. Fixed effect models are most commonly applied for designed experiments with pre-determined factor levels or categories. Crawley (2007, p473-4, [CRA1]) provides useful guidance on when to choose fixed, random or mixed effects models, which we have summarized below:

Interested in effect sizes?

Is it reasonable to suppose that the factor levels come from a population of levels?

Are there enough levels of the factor in the data on which to base an estimate of the variance of the population of effects?

Are the factor levels informative?

Are the factor levels just numerical labels?

Am I mostly interested in making inferences about the distribution of effects, based on the random sample of effects represented in the data?

Is it a hierarchical experiment where the factor levels are experimental manipulations? Yes, split-plot

Is it a hierarchical observational study?

Are both fixed and random effects present? If yes, use a linear or non-linear mixed model as appropriate




In some instances ANOVA procedures can be enhanced by considering additional information that is available and is known to be correlated with one or more of the factors under investigation. Such information, which is known to co-vary with factors in the model, could help explain the observed data more fully, thereby reducing the unexplained or error variance. This in turn will increase the sensitivity of the test procedures used, since the ratio of explained to unexplained variance will typically increase because the denominator is now smaller. The inclusion of covariates in ANOVA procedures is known as ANCOVA and, when applied in the more general multivariate case, MANCOVA. Such covariates are continuous as opposed to categorical, and the analytical approach can be seen as a combination of ANOVA and regression methods.
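The reduction in error variance can be sketched under simplifying assumptions: a single continuous covariate x, and a common regression slope in every group. The function name `error_variance_with_covariate` and the data are hypothetical. The adjusted error sum of squares is the within-group sum of squares of y less the part explained by the pooled within-group regression on x, and one further degree of freedom is used up by the slope.

```python
from statistics import mean


def error_variance_with_covariate(groups):
    """groups: list of groups, each a list of (x, y) pairs.

    Returns the pooled within-group slope and the error mean squares
    before and after adjusting for the covariate x.
    """
    sxx = sxy = syy = 0.0
    n = 0
    for g in groups:
        x_bar = mean(x for x, _ in g)
        y_bar = mean(y for _, y in g)
        for x, y in g:
            sxx += (x - x_bar) ** 2
            sxy += (x - x_bar) * (y - y_bar)
            syy += (y - y_bar) ** 2
        n += len(g)
    k = len(groups)
    slope = sxy / sxx                    # common slope, assumed equal across groups
    ss_adjusted = syy - sxy ** 2 / sxx   # error left after the regression on x
    return {
        "slope": slope,
        "ms_error_anova": syy / (n - k),
        "ms_error_ancova": ss_adjusted / (n - k - 1),
    }


# Hypothetical data: two groups, with the response rising with the covariate
data = [
    [(1.0, 3.1), (2.0, 4.9), (3.0, 7.0)],
    [(1.0, 6.0), (2.0, 8.1), (3.0, 9.9)],
]
result = error_variance_with_covariate(data)
print(result)
```

When the covariate is strongly related to the response, as in these made-up data, the adjusted error mean square is far smaller than the unadjusted one, so the same group differences produce a larger F ratio.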

In the sections that follow we start by showing exactly why ANOVA works and produces results that are useful and readily understood. We then provide examples of one-way, two-way and multi-way ANOVA, before proceeding to the more complex case of ANCOVA.


[CRA1] Crawley M J (2007) The R Book. J Wiley & Sons, Chichester, UK (2nd ed., 2015)