Regression designs and response surfaces


Although the designs discussed in the preceding sections have focused on analysis of the relative importance of individual factors and their interactions, the nature of the response in the measured variable (e.g. fuel consumption, strength or flexibility of fiber) has received little attention. Where such responses are continuous variables, which is often the case in production engineering and in some other fields, the estimation of the parameters of the underlying model becomes of great interest. If non-linearity in the response is suspected (for example, the value at a center point for a factor appears to differ from the average of the high and low value responses) then a quadratic or cubic model, rather than a linear model, might be felt to provide a better representation of the data. The response variable model in such cases would then include additional terms in x² or x³. For example, with two factors a quadratic model would add a squared term for each factor to the usual linear and interaction terms.
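Written out, a second-order model for two factors could take the following form. This is a standard textbook form given here as an illustration; the symbols (β for coefficients, ε for the error term) are assumptions rather than the notation of the original text.

```latex
y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_{12} x_1 x_2
    + \beta_{11} x_1^2 + \beta_{22} x_2^2 + \varepsilon
```

The terms in β11 and β22 carry the curvature; setting both to zero recovers the usual linear model with interaction.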

This kind of model cannot be estimated using a two-level factorial design with added center points: center points allow curvature to be detected, but not attributed to individual factors. It requires a design with at least three levels per factor, such as a 3-level factorial design. Typically such modeling is preceded by careful analysis of the data, including examination of possible constraints on the range of the factors being examined, and data transformation where appropriate. These steps also reduce the likelihood that a higher-order model will be needed to represent the data.
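The estimability point can be checked numerically. The following is a minimal sketch, assuming two factors in coded units and a full quadratic model with six coefficients; the helper names (rank, model_matrix) are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction
from itertools import product

def rank(rows):
    """Rank of a matrix by Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def model_matrix(points):
    """Columns: intercept, x1, x2, x1*x2, x1^2, x2^2 (six coefficients)."""
    return [[1, x1, x2, x1 * x2, x1 ** 2, x2 ** 2] for x1, x2 in points]

# Two-level factorial (4 corner runs) plus 3 replicated center points.
two_level_center = list(product([-1, 1], repeat=2)) + [(0, 0)] * 3
# Full three-level factorial (9 runs).
three_level = list(product([-1, 0, 1], repeat=2))

rank_two_level = rank(model_matrix(two_level_center))
rank_three_level = rank(model_matrix(three_level))

print("two-level + center points: rank", rank_two_level, "of 6")
print("three-level factorial:     rank", rank_three_level, "of 6")
```

In the two-level design the x1² and x2² columns are identical, so the model matrix has rank 5 and the two quadratic coefficients cannot be separated, however many center points are added. The three-level factorial gives full rank 6.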

Having selected a design with a quadratic component, the response data can be plotted for factor pairs as a response surface, and the characteristics of that surface (e.g. peaks, ridges) examined in order to obtain a better understanding of the underlying response behavior and hence facilitate process optimization or improvement. Special designs have been devised to enable the parameters of such models to be readily estimated. For example, Box-Behnken designs "are incomplete three-level factorials where the experimental points are specifically chosen to allow the efficient estimation of the coefficients of a second-order model" (Box et al., 2005, section 11.6, p475 [BOX1]). Box-Wilson or central composite designs (CC designs) provide similar functionality. Factor levels in the design matrix are usually denoted by -1, 0 and 1, or 0, 1 and 2. A summary of the main types of design in this category (from NIST) is provided below:
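As an illustration of what "incomplete three-level factorial" means, the sketch below builds a Box-Behnken design in coded units using the usual pairwise construction: a two-level factorial on each pair of factors with the remaining factors held at 0, plus center points. The function name box_behnken and the default of one center point are assumptions made for this example, and the pairwise construction is assumed to apply only for small numbers of factors.

```python
from itertools import combinations, product

def box_behnken(k, center_points=1):
    """Box-Behnken runs in coded units (-1, 0, 1) for k factors.

    Each pair of factors gets a 2x2 factorial at +/-1 while the other
    factors stay at 0; center runs are appended at the end.
    """
    runs = []
    for i, j in combinations(range(k), 2):
        for a, b in product([-1, 1], repeat=2):
            run = [0] * k
            run[i], run[j] = a, b
            runs.append(tuple(run))
    runs.extend([(0,) * k] * center_points)
    return runs

design = box_behnken(3)
for run in design:
    print(run)
print("number of runs:", len(design))
```

For three factors this gives 12 edge-midpoint runs plus one center run, 13 in total, compared with 27 for the full three-level factorial. Every run has at least one factor at 0, so no run sits at a corner of the design cube.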

Response surface designs

CCC (circumscribed central composite): CCC designs provide high quality predictions over the entire design space, but require factor settings outside the range of the factors in the factorial part. Note: when the possibility of running a CCC design is recognized before starting a factorial experiment, factor spacings can be reduced to ensure that ±α for each coded factor corresponds to feasible (reasonable) levels. Requires 5 levels for each factor.

CCI (inscribed central composite): CCI designs use only points within the factor ranges originally specified, but do not provide the same high quality prediction over the entire space compared to the CCC. Requires 5 levels for each factor.

CCF (face-centered central composite): CCF designs provide relatively high quality predictions over the entire design space and do not require using points outside the original factor range. However, they give poor precision for estimating pure quadratic coefficients. Requires 3 levels for each factor.

Box-Behnken: These designs require fewer treatment combinations than a central composite design in cases involving 3 or 4 factors. The Box-Behnken design is rotatable (or nearly so) but it contains regions of poor prediction quality like the CCI. Its "missing corners" may be useful when the experimenter should avoid combined factor extremes. This property prevents a potential loss of data in those cases. Requires 3 levels for each factor.
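The level counts quoted for the three central composite variants can be reproduced with a short sketch. It assumes coded units, a full two-level factorial core, one center point, and the rotatable choice α = (2^k)^(1/4); the function names central_composite and levels are hypothetical and used only here.

```python
from itertools import product

def central_composite(k, kind="CCC"):
    """Central composite runs in coded units for k factors.

    CCC: corners at +/-1, axial points at +/-alpha (outside the range).
    CCI: the CCC design scaled by 1/alpha, so axial points sit at +/-1.
    CCF: axial points on the faces, i.e. alpha = 1.
    """
    alpha = (2 ** k) ** 0.25  # assumed rotatable value for a full factorial core
    if kind == "CCC":
        corner, axial = 1.0, alpha
    elif kind == "CCI":
        corner, axial = 1.0 / alpha, 1.0
    elif kind == "CCF":
        corner, axial = 1.0, 1.0
    else:
        raise ValueError("kind must be 'CCC', 'CCI' or 'CCF'")

    runs = [tuple(corner * s for s in signs)
            for signs in product([-1, 1], repeat=k)]
    for i in range(k):
        for s in (-1, 1):
            run = [0.0] * k
            run[i] = s * axial
            runs.append(tuple(run))
    runs.append((0.0,) * k)  # single center point
    return runs

def levels(runs, factor=0):
    """Distinct settings used for one factor, rounded for display."""
    return sorted({round(run[factor], 6) for run in runs})

for kind in ("CCC", "CCI", "CCF"):
    print(kind, levels(central_composite(2, kind)))
```

With two factors, CCC and CCI each use five distinct settings per factor and CCF uses three. Only the CCC design has settings beyond ±1, which is the practical cost noted in the summary above.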

In a presentation to the Royal Statistical Society annual conference (2010, [DAV1]) T P Davis extended his 2006 paper on "Science, engineering, and statistics" [DAV2] by arguing for the application of dimensional analysis (where practical) in experimental design. Essentially this involves redefining the design problem in terms of a far smaller number of dimensionless variables, running a much smaller experiment to fit the dimensionless equation, and then converting the resulting fitted expression back into the original units. He provides a worked example based on predicting the flight time for a (paper) helicopter. With a Box-Behnken response surface design a total of 13 experimental runs are required, but with the dimensional analysis approach similar results can be achieved with just 3 runs.

The approach is based on a fundamental theorem known as Buckingham's Pi Theorem. In broad terms this theorem states that a functional relationship in n variables and m fundamental units can be re-expressed as a relationship among n-m dimensionless variables. In the paper helicopter example it is possible to argue that the flight time (T) can be expressed as a function of the form T=F(m,g,r,d,h) where m is a measure of the mass of the helicopter, g is gravitational acceleration, r is the rotor radius, d is the density of air and h is the launch height. This can be re-written as v=F(m,g,r,d) where T=h/v, with v being velocity (thus there are 5 variables: v, m, g, r and d). For the experiment three design factors are varied, these being the rotor radius, tail length and tail width, with all others being held constant. In dimensional terms the 5 variables involve 3 fundamental units (mass, length and time), so Buckingham's theorem states that the relationship can be expressed as an equation in 5-3=2 dimensionless variables that are composites of the original 5. With the expression so formed only a small number of experimental runs are required as there are only 2 variables to fit.

The resulting fitted equation is then back-transformed into the original units to produce what turns out to be a dimensionally consistent non-linear equation with a predictive power that is comparable to that obtained using the Box-Behnken approach.
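The count of dimensionless variables can be verified from the dimensions of the five quantities. The sketch below is an illustration under stated assumptions: the dimension matrix uses mass, length and time as the fundamental units, and the two candidate groups, v/sqrt(g r) and m/(d r³), are one possible choice made for this example and are not necessarily the groups used in Davis's presentation.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix by Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

variables = ["v", "m", "g", "r", "d"]
# Rows: exponents of mass, length, time. Columns follow `variables`.
# v = L/T, m = M, g = L/T^2, r = L, d = M/L^3.
dimension_matrix = [
    [0, 1, 0, 0, 1],    # mass
    [1, 0, 1, 1, -3],   # length
    [-1, 0, -2, 0, 0],  # time
]

n_groups = len(variables) - rank(dimension_matrix)

def dimension_of(exponents):
    """Net (mass, length, time) exponents of a product of powers of the variables."""
    return tuple(sum(Fraction(c) * e for c, e in zip(row, exponents))
                 for row in dimension_matrix)

half = Fraction(1, 2)
pi_1 = [1, 0, -half, -half, 0]  # assumed group: v / sqrt(g * r)
pi_2 = [0, 1, 0, -3, -1]        # assumed group: m / (d * r**3)

print("dimensionless groups required:", n_groups)
print("pi_1 net dimensions:", dimension_of(pi_1))
print("pi_2 net dimensions:", dimension_of(pi_2))
```

The dimension matrix has rank 3, so 5-3=2 dimensionless groups are needed, and both candidate groups have zero net exponents in mass, length and time. Fitting a relationship between two such groups is what allows the experiment to be so much smaller than the response surface design.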

References

[BOX1] Box G E P, Hunter J S, Hunter W G (1978) Statistics for Experimenters: An Introduction to Design, Data Analysis and Model Building. J Wiley & Sons, New York. The second, extended edition was published in 2005.

[DAV1] Davis T P (2010) Statistical Engineering. Presentation to the Royal Stat. Soc. 2010 Annual Conference. Slides available from Tim Davis' website: http://www.timdavis.co.uk/StatisticalEngineering-RSS%2C2010.pdf

[DAV2] Davis T P (2006) Science, engineering, and statistics. Applied Stochastic Models in Business and Industry, 22, 401-430. A longer, unpublished version of this paper is available from: http://www.timdavis.co.uk/ScienceEngineeringandStatistics.PDF

Web site:

NIST/Sematech eHandbook of Engineering Statistics, section 5.3, Response surface designs