# Two factor or two-way and higher-way ANOVA

Two factor or two-way ANOVA is very similar to one-way ANOVA, but instead of the rows of the table being replicates and the columns being treatments, the rows also define a factor, and the values recorded in the cells (individual row-column entries) are the data. If a single entry is made in each cell, each combination of factor levels (treatments) was tested exactly once and the analysis is described as being without replication. If there are n > 1 entries (replicates) in each cell, the combinations are tested multiple times and the analysis is described as two-way with replication. When replicates are included, the separation of the sums of squares provides information on the interaction of the row and column factors, as well as on the separate effects of the row factor and the column factor. Two-way and higher-way ANOVA therefore presupposes a carefully designed experiment, whereas one-way analysis of variance is much more general in its form.

Example 1: Potato yields

As with one-way ANOVA, it is simplest to illustrate the two-way model with a practical example. The table below, from Yule and Kendall (1950, p. 515, [YUL1]), shows the results from a trial of 4 varieties of potato (factor 1, rows Var1-Var4) and 5 types of fertilizer (factor 2, columns F1-F5). Five plots of land of the same size and characteristics were selected for the trial, each treated throughout with a single fertilizer. Each of the 4 varieties of potato was planted on every plot, covering the same area (e.g. in strips), and the yields in tons were recorded following harvesting.

Potato yields, in tons

| | F1 | F2 | F3 | F4 | F5 |
|---|---|---|---|---|---|
| Var1 | 1.9 | 2.2 | 2.6 | 1.8 | 2.1 |
| Var2 | 2.5 | 1.9 | 2.3 | 2.6 | 2.2 |
| Var3 | 1.7 | 1.9 | 2.2 | 2.0 | 2.1 |
| Var4 | 2.1 | 1.8 | 2.5 | 2.3 | 2.4 |

The standard two-way ANOVA table for these data, produced using the MATLAB function anova2(), is shown below. The variance in the data is divided into a column-wise component (between fertilizers), a row-wise component (between varieties) and a residual or error component.

Potato yields — two-way ANOVA

| Source | Sums of squares | Degrees of freedom | Mean squares | F-test | Prob>F |
|---|---|---|---|---|---|
| Between fertilizers | 0.4620 | 4 | 0.1155 | 2.0323 | 0.1537 |
| Between varieties | 0.2855 | 3 | 0.0952 | 1.6745 | 0.2251 |
| Residual | 0.6820 | 12 | 0.0568 | | |
| Total | 1.4295 | 19 | | | |

This analysis indicates that neither variety nor fertilizer has a significant effect on yields: the probabilities of obtaining F-ratios this large by chance (0.2251 and 0.1537 respectively) both exceed 10%. If the differences had been larger and/or the residual had been smaller, the results might have been significant.
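The partition of the sums of squares in the ANOVA table above can be checked by hand. The following is a minimal sketch in plain Python (the source used MATLAB's anova2(); the function name `two_way_anova` and the variable names here are illustrative assumptions, not part of the original analysis). It computes the sums of squares, degrees of freedom, mean squares and F-ratios for a two-way layout with one observation per cell.

```python
# Two-way ANOVA without replication, computed from first principles.
# Rows are varieties Var1-Var4, columns are fertilizers F1-F5 (yields in tons).
potato_yields = [
    [1.9, 2.2, 2.6, 1.8, 2.1],  # Var1
    [2.5, 1.9, 2.3, 2.6, 2.2],  # Var2
    [1.7, 1.9, 2.2, 2.0, 2.1],  # Var3
    [2.1, 1.8, 2.5, 2.3, 2.4],  # Var4
]


def two_way_anova(table):
    """ANOVA for a complete two-way layout with one observation per cell.

    Returns a dict mapping each source of variation to a tuple
    (sum of squares, degrees of freedom, mean square, F-ratio);
    entries that do not apply are None.
    """
    r, c = len(table), len(table[0])
    n = r * c
    grand_mean = sum(map(sum, table)) / n
    row_means = [sum(row) / c for row in table]
    col_means = [sum(col) / r for col in zip(*table)]

    # Total variation, then the parts attributable to rows and to columns.
    ss_total = sum((x - grand_mean) ** 2 for row in table for x in row)
    ss_rows = c * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand_mean) ** 2 for m in col_means)
    ss_resid = ss_total - ss_rows - ss_cols  # what is left is the residual

    df_rows, df_cols = r - 1, c - 1
    df_resid = df_rows * df_cols
    ms_rows, ms_cols = ss_rows / df_rows, ss_cols / df_cols
    ms_resid = ss_resid / df_resid
    return {
        "rows": (ss_rows, df_rows, ms_rows, ms_rows / ms_resid),
        "columns": (ss_cols, df_cols, ms_cols, ms_cols / ms_resid),
        "residual": (ss_resid, df_resid, ms_resid, None),
        "total": (ss_total, n - 1, None, None),
    }


potato_anova = two_way_anova(potato_yields)
for source, (ss, df, ms, f) in potato_anova.items():
    ms_text = "" if ms is None else f"{ms:.4f}"
    f_text = "" if f is None else f"{f:.4f}"
    print(f"{source:<9} {ss:8.4f} {df:3d} {ms_text:>8} {f_text:>8}")
```

Here "rows" corresponds to the between-varieties line and "columns" to the between-fertilizers line, and the printed values should agree with the table to the precision shown. The sketch stops at the F-ratios because the standard library has no F distribution; if SciPy is available, `scipy.stats.f.sf(F, df_effect, df_resid)` gives the corresponding tail probability.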

This analysis corresponds to the two-factor model without replicates we introduced at the start of this topic:

$$x_{ij} = \mu + R_i + C_j + \varepsilon_{ij}$$

where the $R$ and $C$ components are the row and column effects, and because this is a fixed effects or Type 1 model, we have:

$$\sum_i R_i = 0, \qquad \sum_j C_j = 0$$

If the experiment described above had been fully replicated in an adjacent set of plots, or in a similar area, the basic model with two replicates (k = 1, 2) would be of the form:

$$x_{ijk} = \mu + R_i + C_j + I_{ij} + \varepsilon_{ijk}$$

Thus in this case there is an overall mean effect, a row effect, a column effect and a new element, an interaction effect $I_{ij}$, plus random error. The interaction effect is only measurable in this kind of model when there are replicates, and it provides a measure of how fertilizer and variety might interact to result in higher or lower yields, as a separate effect from the individual variety and fertilizer effects. Interaction effects are obtainable in other forms of design, including three-factor or higher-factor (multiple-factor) ANOVA models, partial factorial designs and Latin square designs, for example, without requiring replication.
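For the replicated two-factor model, the separation of the sums of squares can be written out explicitly. The notation below is an assumption made for illustration: $r$ rows, $c$ columns and $n$ replicates per cell, with a dot in a subscript denoting an average over that index, and $SS_I$ denoting the interaction sum of squares.

```latex
\begin{aligned}
SS_T &= \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n} \left(x_{ijk} - \bar{x}_{\cdot\cdot\cdot}\right)^2
      = SS_R + SS_C + SS_I + SS_E \\
SS_R &= cn \sum_{i=1}^{r} \left(\bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot\cdot\cdot}\right)^2
      && \text{d.f.} = r - 1 \\
SS_C &= rn \sum_{j=1}^{c} \left(\bar{x}_{\cdot j\cdot} - \bar{x}_{\cdot\cdot\cdot}\right)^2
      && \text{d.f.} = c - 1 \\
SS_I &= n \sum_{i=1}^{r}\sum_{j=1}^{c}
        \left(\bar{x}_{ij\cdot} - \bar{x}_{i\cdot\cdot} - \bar{x}_{\cdot j\cdot} + \bar{x}_{\cdot\cdot\cdot}\right)^2
      && \text{d.f.} = (r - 1)(c - 1) \\
SS_E &= \sum_{i=1}^{r}\sum_{j=1}^{c}\sum_{k=1}^{n} \left(x_{ijk} - \bar{x}_{ij\cdot}\right)^2
      && \text{d.f.} = rc(n - 1)
\end{aligned}
```

If the potato trial (r = 4, c = 5) had been run with n = 2 replicates, the degrees of freedom would be 3 for varieties, 4 for fertilizers, 12 for the interaction and 20 for error, giving 39 in total. With n = 1 the error term has zero degrees of freedom, which is why the interaction cannot be separated from the residual without replication.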

In the case of Latin square designs, the statistical model is of the form:

$$x_{ijk} = \mu + R_i + C_j + T_k + \varepsilon_{ijk}$$

where $T_k$ is the effect of treatment $k$, and hence it can be seen as a special form of three-way problem when it comes to analysis of variance. However, Latin square designs are not fully balanced (they are effectively fractional factorial designs) and so must be analyzed using software that can handle unbalanced designs or by using a linear modeling option, such as glm in Minitab, aov() or lm() and anova() in R, or anovan() in MATLAB.

Example 2: Bean yields, Latin square model

The following example of an agricultural trial into the yield of beans under a variety of different treatments is from Weatherburn (1968, [WEA1]). The letters indicate the pattern in which five treatments, A to E, were applied and the numbers show the yields recorded. In this case the rows and columns have no real meaning: they reflect the 5x5 pattern of equal sized plots used when the trial was conducted, which is a common practice in such experiments. The layout is designed to remove possible spatial factors such as variation in soil fertility or moisture retention.

Yield data for 5x5 Latin square design

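| | Column 1 | Column 2 | Column 3 | Column 4 | Column 5 |
|---|---|---|---|---|---|
| Row 1 | A 7.4 | D 8.9 | E 5.8 | B 12 | C 14.3 |
| Row 2 | C 11.8 | B 6.5 | A 8.7 | E 7.6 | D 7.9 |
| Row 3 | D 10.1 | C 17.9 | B 9 | A 8.5 | E 7.1 |
| Row 4 | E 8.8 | A 10.1 | C 15.7 | D 11.1 | B 7.4 |
| Row 5 | B 11.8 | E 8.8 | D 14.3 | C 18.4 | A 10.1 |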

The following table analyzes the yield in terms of the row, column and treatment effects. The R code for producing this analysis is provided in the code samples section:

ANOVA for 5x5 Latin square design

| Source | Sums of squares | Degrees of freedom | Mean squares | F-test | Prob>F |
|---|---|---|---|---|---|
| Between rows | 46.948 | 4 | 11.737 | 4.6446 | 0.01697 |
| Between columns | 13.020 | 4 | 3.255 | 1.2881 | 0.32860 |
| Between treatments | 190.888 | 4 | 47.722 | 18.8848 | 4.099e-05 |
| Residual | 30.324 | 12 | 2.527 | | |

The analysis shows that there appears to be a row effect (significant at the 5% level) but no column effect, and that the effect of real interest, the difference between treatments, makes a very large and highly significant contribution to the overall variance.
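The row, column and treatment sums of squares for the bean trial can also be computed directly from the layout. The sketch below is plain Python rather than the R code referred to above; the function `latin_square_anova` and the variable names are illustrative assumptions. It relies on each row, column and treatment occurring exactly p = 5 times, so each between-group sum of squares is the sum of squared group totals divided by p, less the correction term.

```python
# Latin square ANOVA from first principles (standard library only).
# bean_layout[i][j] is the treatment applied to the plot in row i, column j.
bean_layout = ["ADEBC", "CBAED", "DCBAE", "EACDB", "BEDCA"]
bean_yields = [
    [7.4, 8.9, 5.8, 12.0, 14.3],
    [11.8, 6.5, 8.7, 7.6, 7.9],
    [10.1, 17.9, 9.0, 8.5, 7.1],
    [8.8, 10.1, 15.7, 11.1, 7.4],
    [11.8, 8.8, 14.3, 18.4, 10.1],
]


def latin_square_anova(layout, data):
    """Return {source: (SS, df, MS, F)} for a p x p Latin square."""
    p = len(data)
    # One record per plot: (row index, column index, treatment, yield).
    cells = [(i, j, layout[i][j], data[i][j]) for i in range(p) for j in range(p)]
    grand_total = sum(cell[3] for cell in cells)
    correction = grand_total ** 2 / (p * p)

    def between(position):
        """Sum of squares between groups keyed by cell[position]."""
        totals = {}
        for cell in cells:
            totals[cell[position]] = totals.get(cell[position], 0.0) + cell[3]
        return sum(t ** 2 for t in totals.values()) / p - correction

    ss = {"rows": between(0), "columns": between(1), "treatments": between(2)}
    ss_total = sum(cell[3] ** 2 for cell in cells) - correction
    ss_resid = ss_total - sum(ss.values())
    df_resid = (p - 1) * (p - 2)
    ms_resid = ss_resid / df_resid

    table = {}
    for source, value in ss.items():
        ms = value / (p - 1)
        table[source] = (value, p - 1, ms, ms / ms_resid)
    table["residual"] = (ss_resid, df_resid, ms_resid, None)
    return table


bean_anova = latin_square_anova(bean_layout, bean_yields)
for source, (ss, df, ms, f) in bean_anova.items():
    f_text = "" if f is None else f"{f:.4f}"
    print(f"{source:<11} {ss:9.3f} {df:3d} {ms:8.3f} {f_text:>9}")
```

The residual is obtained by subtraction and has (p - 1)(p - 2) = 12 degrees of freedom. As with any F-ratio, the tail probabilities need an F distribution; one option, assuming SciPy is installed, is `scipy.stats.f.sf(F, 4, 12)`.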

References

[WEA1] Weatherburn C E (1968) A First Course in Mathematical Statistics. Cambridge University Press, Cambridge, UK

[YUL1] Yule G U, Kendall M G (1950) An Introduction to the Theory of Statistics. 14th edition, Griffin, London