## Full Factorial designs

The simplest type of full factorial design is one in which the k factors of interest have only two levels, for example High and Low, or Present and Absent. As noted in the introduction to this topic, with k factors to examine this would require at least $2^k$ runs. Thus for 3 factors, a total of $2^3 = 8$ runs would be required (assuming no replication). The runs required can be shown in tabular form as follows, where -1 and +1 denote the factor levels:

| RUN | A | B | C |
|---|---|---|---|
| 1 | -1 | -1 | -1 |
| 2 | +1 | -1 | -1 |
| 3 | -1 | +1 | -1 |
| 4 | +1 | +1 | -1 |
| 5 | -1 | -1 | +1 |
| 6 | +1 | -1 | +1 |
| 7 | -1 | +1 | +1 |
| 8 | +1 | +1 | +1 |
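
The run list for any number of two-level factors can be generated mechanically. The following is a minimal sketch, assuming only the Python standard library; the function name `full_factorial_2level` is a hypothetical helper introduced for illustration. It reproduces the run order of the table above, in which factor A changes fastest.

```python
from itertools import product


def full_factorial_2level(k):
    """Return the 2**k runs of a two-level full factorial design.

    Each run is a tuple of -1/+1 levels. The first factor varies fastest,
    which matches the run order used in the table above.
    """
    # itertools.product varies its LAST element fastest, so each tuple is
    # reversed to make the FIRST factor the fastest-changing one.
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]


design = full_factorial_2level(3)
for run, (a, b, c) in enumerate(design, start=1):
    print(f"{run}  {a:+d}  {b:+d}  {c:+d}")
```

Calling `full_factorial_2level(k)` with a larger `k` shows how quickly the run count grows: the list doubles in length with every additional factor.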

If any two columns in this table are multiplied together element by element, the products sum to 0. This has a geometric interpretation: the two factors, when considered as vectors, are at right angles to one another, or orthogonal. As far as possible, designs are sought that ensure all factors are orthogonal, whether the design is complete or incomplete. Orthogonal designs remove correlations between the estimates of main effects and interactions, enabling these components of variation to be analyzed independently. In the case of two or three factor designs this can be displayed diagrammatically: a 3 factor experiment maps to a cube, the vertices of which correspond to the 8 high-low combinations.
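
The orthogonality property is easy to check numerically. The sketch below, which assumes the 8-run design listed above and uses illustrative names (`columns`, `dot`, `pairwise`) that are not part of the original text, builds the three factor columns together with their interaction columns and confirms that every pair has a zero sum of products.

```python
from itertools import combinations, product

# The 2^3 design in the same run order as the table (A varies fastest).
design = [tuple(reversed(r)) for r in product((-1, 1), repeat=3)]

# One column of eight -1/+1 entries per factor.
columns = dict(zip("ABC", zip(*design)))

# Interaction columns are element-wise products of the factor columns.
for name in ("AB", "AC", "BC", "ABC"):
    columns[name] = tuple(
        1 if sum(columns[f][i] == -1 for f in name) % 2 == 0 else -1
        for i in range(len(design))
    )


def dot(u, v):
    """Sum of element-wise products of two equal-length columns."""
    return sum(x * y for x, y in zip(u, v))


# Sum of products for every distinct pair of columns.
pairwise = {(p, q): dot(columns[p], columns[q])
            for p, q in combinations(columns, 2)}

for pair, total in pairwise.items():
    print(pair, total)
```

The sign of each interaction entry is computed from the parity of the number of -1 levels, which is equivalent to multiplying the corresponding factor entries together.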

Although the basic full factorial design above seems very limiting, it has been shown to be an extremely effective way of exploring a range of factors in order to improve the design of many processes. For example, the use of two levels for continuous variables can provide a means for identifying the direction in which further research and experimentation should proceed. And whilst the approach is impractical for large numbers of factors, very efficient fractional versions of such designs have been devised. The design above can be further improved by randomizing the order in which runs are made, by fully replicating the experiment (e.g. two or more times), and by including a center point at the start, middle and end of the sequence of runs (inserted after randomization). Center points are the zero values of the high-low ranges, so are coded as 0,0,0 in the example above. They help provide a measure of the stability of the process and an indication of any simple curvature (i.e. simple non-linearity). If added to the standard 8 run design above replicated twice, this would then require a total of 8 × 2 + 3 = 19 runs. With larger numbers of runs and more complex (surface) designs one might add additional evenly spaced center points.
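
The 19-run sequence described above can be assembled as follows. This is a sketch only: the function name `build_run_plan` and its parameters are hypothetical, and it assumes that all replicated runs are shuffled together (one could equally randomize within each replicate). The fixed `seed` is there only to make the example reproducible.

```python
import random
from itertools import product


def build_run_plan(k=3, replicates=2, seed=0):
    """Randomized, replicated 2**k design with three center points.

    Center points (all factors coded 0) are inserted AFTER randomization,
    at the start, middle and end of the run sequence.
    """
    corner_runs = [tuple(reversed(r)) for r in product((-1, 1), repeat=k)]
    runs = corner_runs * replicates          # full replication

    rng = random.Random(seed)                # assumption: fixed seed for the demo
    rng.shuffle(runs)                        # randomize the run order

    center = (0,) * k
    mid = len(runs) // 2
    return [center] + runs[:mid] + [center] + runs[mid:] + [center]


plan = build_run_plan()
print(len(plan))  # 8 * 2 + 3 = 19
for step, levels in enumerate(plan, start=1):
    print(step, levels)
```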

By running full factorial experiments, all main effects and interaction effects in a linear model can be estimated. The basic model for the 3 factor design above is of the form:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_{12} x_1 x_2 + \beta_{13} x_1 x_3 + \beta_{23} x_2 x_3 + \beta_{123} x_1 x_2 x_3 + \varepsilon$$

and because all 8 runs are included, the 8 coefficients, β, of the model can be determined. For example, to determine the main effect for factor A, or $x_1$, we can use the sum of the results (y-values) obtained from runs 2, 4, 6 and 8 (when this factor is set high) minus the sum of the results obtained when this factor is set low (runs 1, 3, 5 and 7). The difference between the two sums, divided by 4, provides an estimate of the size of the main effect for factor A, i.e. the average change in response when A moves from low to high. With the levels coded as -1 and +1, the corresponding coefficient $\beta_1$ is half of this effect. All main effects and interactions can be estimated in a similar manner.
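
The contrast calculation just described can be sketched in a few lines. The response values here are not experimental data: they are generated from a made-up, noise-free model (`10 + 2A - B + 0.5AB`) purely so that the estimates can be checked against known answers, and the helper name `effect` is hypothetical.

```python
from itertools import product
from math import prod

# The 2^3 design in the same run order as the table (A varies fastest).
design = [tuple(reversed(r)) for r in product((-1, 1), repeat=3)]

# Hypothetical, noise-free responses invented for illustration only.
y = [10 + 2 * a - 1 * b + 0.5 * a * b for a, b, c in design]


def effect(factor_indices):
    """Estimate a main effect or interaction from the 8 responses.

    factor_indices: positions of the factors involved (0 = A, 1 = B, 2 = C).
    Returns (sum of y where the contrast is +1 minus sum where it is -1) / 4.
    """
    contrast = [prod(run[i] for i in factor_indices) for run in design]
    return sum(sign * yi for sign, yi in zip(contrast, y)) / (len(design) / 2)


print("A  :", effect([0]))        # twice the coefficient of A in the toy model
print("B  :", effect([1]))
print("C  :", effect([2]))
print("AB :", effect([0, 1]))
print("ABC:", effect([0, 1, 2]))
```

Because the toy model has coefficient 2 on A, the estimated effect for A comes out as 4, illustrating that the effect is twice the coded-unit regression coefficient.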

The last of the components in the model above is the three factor interaction, and is generally of least interest. The components that involve just one factor are the main effects, and are generally of primary interest, although simple two-way interaction effects (e.g. the behavior of a chemical reaction when both temperature and pressure are high) will also often be of interest. If we are prepared to lose the 3-way interaction information we can devise a design that retains all the remaining information but allows for simple two-level blocking of the data. This could facilitate analysis of morning and afternoon runs, or day and night shift runs, or batch 1 and batch 2 inputs (i.e. blocking out batch variations). To achieve this we take the table above and augment it with a column obtained by multiplying the three column entries together:

| RUN | A | B | C | ABC | BLOCK |
|---|---|---|---|---|---|
| 1 | -1 | -1 | -1 | -1 | 1 |
| 2 | +1 | -1 | -1 | +1 | 2 |
| 3 | -1 | +1 | -1 | +1 | 2 |
| 4 | +1 | +1 | -1 | -1 | 1 |
| 5 | -1 | -1 | +1 | +1 | 2 |
| 6 | +1 | -1 | +1 | -1 | 1 |
| 7 | -1 | +1 | +1 | -1 | 1 |
| 8 | +1 | +1 | +1 | +1 | 2 |

The final column in the table shows which block the various runs are assigned to, prior to randomization (all runs with ABC = -1 are in block 1 and all runs with ABC = +1 are in block 2). In this way all main and two-way interaction effects are retained, blocking is achieved, and the number of runs is unchanged, at the cost of confounding the 3-way interaction with the blocking. Clearly if the number of factors is large, typically greater than 5 or 6, or if more than two factor levels are included, the number of runs required becomes unmanageable (especially if replication is also included), so consideration must be given to running only a subset or fraction of the full factorial design. This is the approach used in fractional factorial designs.
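
The block assignment can be derived directly from the ABC column. The sketch below follows the BLOCK column of the table (runs with ABC = -1 go to block 1, runs with ABC = +1 to block 2); the variable names are illustrative. It also checks that each factor is balanced within each block, which is why the main effects are not disturbed by the block difference.

```python
from itertools import product

# The 2^3 design in the same run order as the table (A varies fastest).
design = [tuple(reversed(r)) for r in product((-1, 1), repeat=3)]

blocks = {1: [], 2: []}
for run_number, (a, b, c) in enumerate(design, start=1):
    abc = a * b * c                      # three-way interaction column
    block = 1 if abc == -1 else 2        # assignment used in the table
    blocks[block].append(run_number)

print(blocks)

# Within each block every factor appears twice at -1 and twice at +1,
# so the factor levels sum to zero block by block.
for block, run_numbers in blocks.items():
    for j, name in enumerate("ABC"):
        total = sum(design[r - 1][j] for r in run_numbers)
        print(f"block {block}, factor {name}: level sum = {total}")
```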