In the previous subsection we described completely randomized designs. We also noted that in some circumstances a better understanding of the effect of treatments or factors can be obtained by eliminating so-called nuisance factors, that is, components that contribute to variation in the data but are not of direct interest. For example, when testing the efficacy of two alternative procedures for deep cleaning of hospital wards, each hospital participating in the trial could be taken as a separate block, thereby removing (to an extent) variation between hospitals from the analysis of the differences between the alternative procedures applied to wards within these hospitals. In a sense, this approach is equivalent to running a series of separate trials, with each hospital being a distinct unit of study. Differences between hospitals in their age, construction, location, diligence, etc. can be (at least partially) separated from the effect of the alternative procedures. It is clear from this example that in some instances the approach to blocking will be obvious and effective, whilst in others selecting the most appropriate and effective blocking may be difficult. As noted in the introduction to this overall topic, the dictum is: block what you can and randomize what you cannot.
As with completely randomized designs, a simple model can be used to describe the general form of randomized block designs. Let $y_{ij}$ represent the data obtained from the experiment (the measured outcome or result) for the unit in the jth block that receives the ith treatment; let $T_i$ be the effect attributable to the ith treatment, $B_j$ be the effect attributable to the jth block, and let $e_{ij}$ denote the residual error, unexplained by other factors. Then the statistical model for this kind of experiment is of the form:

$$y_{ij} = \mu + T_i + B_j + e_{ij}$$
This model states that the measured response is a simple linear function of the overall mean value, μ, for all the data, plus a treatment effect, plus a block effect, plus some residual error. The overall mean is estimated from the mean of all the sample data, whilst the treatment and block mean values are estimated from the means of the observations for each treatment or block. Each treatment effect is estimated as the difference between the corresponding treatment mean and the overall mean, and each block effect likewise as the difference between the block mean and the overall mean. Analysis of the data is typically carried out using analysis of variance (ANOVA).
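The estimation steps just described can be sketched in a few lines of Python. This is a minimal illustration, not a full analysis: it assumes one observation per treatment-block combination, the function name rcbd_anova and the 3 x 3 data table are hypothetical and invented purely for the example, and only the sums of squares and the F ratio for treatments are computed (the ratio would then be compared with an F distribution on the stated degrees of freedom).

```python
import numpy as np


def rcbd_anova(y):
    """Estimate effects and sums of squares for a randomized block design.

    y is a 2-D array with rows = treatments (i) and columns = blocks (j),
    one observation per cell. (Hypothetical helper for illustration.)
    """
    y = np.asarray(y, dtype=float)
    n_treat, n_block = y.shape

    grand_mean = y.mean()                            # estimate of mu
    treatment_effects = y.mean(axis=1) - grand_mean  # treatment mean minus overall mean
    block_effects = y.mean(axis=0) - grand_mean      # block mean minus overall mean

    # Residuals are what remains after removing mean, treatment and block effects
    fitted = grand_mean + treatment_effects[:, None] + block_effects[None, :]
    residuals = y - fitted

    ss_treat = n_block * np.sum(treatment_effects ** 2)
    ss_block = n_treat * np.sum(block_effects ** 2)
    ss_error = np.sum(residuals ** 2)
    ss_total = np.sum((y - grand_mean) ** 2)

    df_treat = n_treat - 1
    df_block = n_block - 1
    df_error = df_treat * df_block

    f_treat = (ss_treat / df_treat) / (ss_error / df_error)

    return {
        "grand_mean": grand_mean,
        "treatment_effects": treatment_effects,
        "block_effects": block_effects,
        "ss_treat": ss_treat,
        "ss_block": ss_block,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "df": (df_treat, df_block, df_error),
        "f_treat": f_treat,
    }


# Made-up data: 3 treatments (rows) observed in each of 3 blocks (columns)
y = [[10, 12, 14],
     [13, 15, 17],
     [16, 17, 21]]
result = rcbd_anova(y)
print(result["treatment_effects"], result["f_treat"])
```

Because the design is balanced, the total sum of squares splits exactly into treatment, block and error components, which is why removing the block component can sharpen the comparison between treatments.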
In certain special cases it may be possible to devise designs that handle two or three independent blocking factors at once, within a design that requires very few runs or trials. These designs are known as Latin square, Graeco-Latin square and hyper-Graeco-Latin square designs. They are of limited application due to the constraints on their formulation (for example, in a Latin square the number of levels of each blocking factor must equal the number of treatments), but where they can be applied they are very economical in terms of the number of runs required. As such they are a special case of the so-called fractional variants of full factorial designs.
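To make the Latin square idea concrete, the following sketch constructs one for an arbitrary list of treatments. It is an illustration under stated assumptions rather than a recommended procedure: the function name random_latin_square is hypothetical, the construction starts from a simple cyclic square and then shuffles rows, columns and treatment labels, and this does not sample uniformly from all possible Latin squares of a given size.

```python
import random


def random_latin_square(treatments, seed=None):
    """Return an n x n Latin square as a list of rows.

    Rows and columns stand for the two blocking factors; each treatment
    appears exactly once in every row and every column.
    (Hypothetical helper for illustration.)
    """
    rng = random.Random(seed)
    n = len(treatments)

    # Cyclic starting square: cell (r, c) holds treatment index (r + c) mod n
    square = [[(r + c) % n for c in range(n)] for r in range(n)]

    # Permuting whole rows or whole columns preserves the Latin property
    rng.shuffle(square)
    col_order = list(range(n))
    rng.shuffle(col_order)
    square = [[row[c] for c in col_order] for row in square]

    # Randomly assign the treatment labels to the indices
    labels = list(treatments)
    rng.shuffle(labels)
    return [[labels[k] for k in row] for row in square]


square = random_latin_square(["A", "B", "C", "D"], seed=1)
for row in square:
    print(" ".join(row))
```

With four treatments the square uses 16 runs, whereas crossing four treatments with every combination of two four-level blocking factors would need 64.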
The idea of blocking is a powerful one, but it is often the case that blocking of factors in an entirely consistent manner is not possible. For example, in an agricultural trial of six varieties of sugar beet (see Cox, 1958, p.147 [COX1]) each variety is to be sown early or late in the sowing season. A fully randomized block design 'on the ground' would have plots with randomly assigned combinations of variety and early or late sowing. However, from a practical perspective, it may be necessary to modify the design so that, within each full block, all the early sowings are made in one go on a series of strips, and then all the late sowings are made in one go. The full blocks are thus split into two or more sub-plots or units in order to meet the operational requirements of the trial. Such split-plot designs are almost always inherent in industrial trials, even though the design might have been created as an apparently fully randomized design, simply because re-setting experimental equipment or pilot production processes in a fully randomized manner is often impossible. A fuller, more recent discussion of split-plot designs and their merits is provided in Jones and Nachtsheim (2009, [JON1]). They argue that split-plot designs are often cheaper to run, more efficient and can have greater validity than more widely used alternatives, and that explicit support for such designs should be made more readily available in statistical design and analysis software packages.
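The two-stage randomization implied by a split-plot layout can be sketched as follows. This is only an illustration of the structure described for the sugar beet example: the function name split_plot_layout, the choice of four blocks and the variety labels V1 to V6 are assumptions made for the example and are not taken from Cox's trial. Sowing date is treated as the whole-plot factor (randomized to strips within each block) and variety as the sub-plot factor (randomized within each strip).

```python
import random


def split_plot_layout(n_blocks, whole_plot_levels, sub_plot_levels, seed=None):
    """Return a list of (block, strip, whole_level, position, sub_level) tuples.

    Stage 1: within each block, whole-plot levels are randomly assigned to strips.
    Stage 2: within each strip, sub-plot levels are randomly assigned to positions.
    (Hypothetical helper for illustration.)
    """
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        whole = list(whole_plot_levels)
        rng.shuffle(whole)  # stage 1: e.g. which strip is sown early or late
        for strip, whole_level in enumerate(whole, start=1):
            sub = list(sub_plot_levels)
            rng.shuffle(sub)  # stage 2: order of varieties along the strip
            for position, sub_level in enumerate(sub, start=1):
                layout.append((block, strip, whole_level, position, sub_level))
    return layout


varieties = ["V1", "V2", "V3", "V4", "V5", "V6"]  # assumed labels
layout = split_plot_layout(4, ["early", "late"], varieties, seed=2)
for row in layout[:12]:  # show the first block only
    print(row)
```

Since sowing date is randomized only at the strip level, comparisons between early and late sowing rest on fewer independent units than comparisons between varieties, and the analysis needs separate error terms for the whole-plot and sub-plot factors.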