As mentioned in the previous section, the discipline that we now know as Statistics developed from early work in a number of applied fields. It was, and is, very much an applied science. Gambling was undoubtedly one of the most important early drivers of research into probability and statistical methods, and Abraham De Moivre's book, "The Doctrine of Chances: A method of calculating the probabilities of events in play" [DEM1], published in 1718, was essential reading for any serious gambler at the time. The book contained an explanation of the basic ideas of probability, including permutations and combinations, together with detailed analysis of a variety of games of chance, including card games with delightful names such as Basette and Pharaon (Faro), games of dice, roulette and lotteries. A typical entry in De Moivre's book is as follows:

"Suppose there is a heap of 13 cards of one color [suit], and another heap of 13 cards of another color; what is the Probability, that taking one Card at a venture [random] out of each heap, I shall take out the two Aces?" He then goes on to explain that since there is only one Ace in each heap, the separate probabilities are 1/13 and 1/13, so the combined probability (since the cards are drawn independently) is simply:

$$\frac{1}{13} \times \frac{1}{13} = \frac{1}{169}$$

Hence the chance of not drawing two Aces is 168/169 or, put another way, the odds against drawing two Aces are 168:1. For both the gambler and the gambling house, setting and estimating such odds is vitally important!

De Moivre's book ran into many editions, and it was in the revised 1738 and 1756 editions that he introduced a series approximation to the Binomial for large n with p and q not small (e.g. not less than 0.3). Under these conditions the approximation is what is now generally known as the Normal distribution. His motivation for developing this approximation was that computing the terms of the Binomial for large values of n (e.g. >50, as illustrated below) was extremely tedious and unrealistic to contemplate at that time.

Furthermore, as n increases the individual events have very small probabilities (with n=500 and p=0.5 the maximum probability for an individual event is 0.036, i.e. there is just under a 4% chance of seeing exactly 250 heads, say, when 500 tosses of an unbiased coin are made). For this reason one tends to be interested in the probability of seeing a group or range of values (e.g. 400 or more heads from 500 tosses), rather than any specific value. In the chart below the vertical bars should really be just vertical lines, and as the number of such lines becomes very large and the interval between events becomes relatively smaller, a continuous smooth bell-like curve approximation (which is what the Normal distribution provides) starts to make sense (see further, the Normal distribution).

Binomial distribution, mean = 25
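
The figures quoted above can be checked with a short calculation. The following is a minimal Python sketch, not De Moivre's own method: it reproduces the two-Aces probability, the peak Binomial probability for n=500 and p=0.5, and compares exact Binomial values with the Normal approximation. The function names and the range 240 to 260 heads are illustrative assumptions, as is the half-unit continuity correction applied to the range.

```python
from fractions import Fraction
from math import comb, erf, exp, pi, sqrt

# De Moivre's card problem: one Ace in each of two independent heaps of 13.
p_two_aces = Fraction(1, 13) * Fraction(1, 13)
odds_against = (1 - p_two_aces) / p_two_aces  # odds of 168:1 against


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of the Normal distribution at x."""
    return exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * sqrt(2 * pi))


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative Normal probability up to x."""
    return 0.5 * (1 + erf((x - mean) / (sd * sqrt(2))))


n, p = 500, 0.5
mean, sd = n * p, sqrt(n * p * (1 - p))

# Probability of exactly 250 heads: exact value and Normal approximation.
exact_peak = binomial_pmf(250, n, p)
approx_peak = normal_pdf(250, mean, sd)

# Probability of a range of values (240 to 260 heads, an illustrative choice),
# with a half-unit continuity correction applied to the Normal approximation.
exact_range = sum(binomial_pmf(k, n, p) for k in range(240, 261))
approx_range = normal_cdf(260.5, mean, sd) - normal_cdf(239.5, mean, sd)

print(p_two_aces, odds_against)
print(round(exact_peak, 4), round(approx_peak, 4))
print(round(exact_range, 4), round(approx_range, 4))
```

With modern integer arithmetic the exact terms are trivial to compute, which is precisely the step that was impractical in De Moivre's day.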

De Moivre also worked extensively on another topic mentioned in the previous section: mortality tables. This work followed the publication by John Graunt in 1662 of figures on births and deaths in London, and similar research by Edmund Halley (the astronomer) on births and deaths data for the City of Breslau (modern-day Wrocław in Poland) between 1687 and 1691 [HAL1]. Halley was interested in using this data in order to "ascertain the price of annuities upon lives", i.e. to determine the level at which life insurance premiums (or annuities) might be set. As an illustration, Halley observed that (based on his data) the odds were 100:1 against a man in Breslau aged 20 dying in the following 12 months (i.e. before reaching 21), but only 38:1 against if the man was 50 years old. De Moivre included Halley's data and sample annuity problems and solutions in the 1756 edition of his "Doctrine of Chances" book, cited above.
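
Odds of this kind convert directly into probabilities. The sketch below assumes that Halley's figures are odds against death within the year (so 100:1 means 100 chances of surviving to 1 of dying); the helper name is hypothetical.

```python
from fractions import Fraction


def odds_against_to_probability(against: int, in_favour: int = 1) -> Fraction:
    """Convert odds of `against`:`in_favour` against an event into its probability."""
    return Fraction(in_favour, against + in_favour)


# Assumption: Halley's 100:1 and 38:1 are odds against dying within 12 months.
p_die_at_20 = odds_against_to_probability(100)  # 1/101
p_die_at_50 = odds_against_to_probability(38)   # 1/39

# Relative risk of the 50-year-old compared with the 20-year-old.
relative_risk = p_die_at_50 / p_die_at_20

print(p_die_at_20, p_die_at_50, float(relative_risk))
```

On this reading, the annual risk of death at age 50 is roughly two and a half times that at age 20, which is the kind of difference an annuity price needs to reflect.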

A very different application of statistics arose in the late 19th and early 20th centuries with the development of new forms of communication, especially telephony and the introduction of manual and then mechanical exchange equipment. A Danish mathematician, Agner Erlang, working for the Copenhagen Telephone Authority (KTAS), addressed the important questions of queuing and congestion. Answers were needed to questions such as "how many operators are needed to service telephone calls for 1000 customers?" and "how many lines are required to ensure that 95% of our customers can call other major towns in the country without finding that the line is busy?". Questions such as these are closely related to problems of queuing and queue management, such as "how many checkouts do I need in a supermarket to ensure customers on a busy Saturday do not have to wait in line more than a certain amount of time?", or "how long should a traffic light stay on red before we allow the traffic to cross an intersection?".

Erlang investigated these questions by assuming that there are a large number of customers but only a small chance that any particular customer would be trying to make a call at any one time. This is rather like the Binomial with n large and p very small, which had been shown by the French mathematician Siméon Poisson (in a work of 1837) to have a simple approximation, now known as the Poisson distribution. Erlang also assumed that call lengths followed an Exponential distribution, so short calls were much more common than very long calls. In fact, this assumption is unnecessary: all that really matters is that the calls are made independently and have a known average duration over an interval of time, e.g. during the peak hour in the morning. The number of calls per hour made to the system multiplied by their average length (in hours) gives the total traffic, in dimensionless units that are now called Erlangs and usually denoted by the letter A or E.
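
As a rough illustration of these two ideas, the Python sketch below computes offered traffic in Erlangs and compares the Binomial with its Poisson approximation. All the numbers are hypothetical and chosen only for convenience: 40 calls per hour averaging 3 minutes each, and 1000 customers who each have a 0.002 chance of calling at a given moment.

```python
from math import comb, exp, factorial


def offered_traffic_erlangs(calls_per_hour: float, mean_call_minutes: float) -> float:
    """Traffic in Erlangs: call rate times mean call length, in matching time units."""
    return calls_per_hour * mean_call_minutes / 60.0


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Exact Binomial probability of k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of k events when the mean number of events is lam."""
    return exp(-lam) * lam**k / factorial(k)


# Hypothetical traffic: 40 calls per hour, 3 minutes each, gives 2 Erlangs.
traffic = offered_traffic_erlangs(40, 3)

# Hypothetical population: n large, p small, mean number of callers n*p = 2.
n_customers, p_calling = 1000, 0.002
lam = n_customers * p_calling

# Largest difference between the two distributions over 0 to 10 simultaneous callers.
max_gap = max(
    abs(binomial_pmf(k, n_customers, p_calling) - poisson_pmf(k, lam))
    for k in range(11)
)

print(traffic, max_gap)
```

The largest gap is well below 0.001 here, which is why the much simpler Poisson form can stand in for the Binomial in problems of this kind.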
Erlang derived a variety of statistical measures based on these assumptions, one of the most important being the so-called Grade of Service (GoS). This states the probability that a call will be rejected because the service is busy, where the traffic offered is E and the number of lines, operators etc. available is m. The formula he derived, generally known as the Erlang B formula, is:

$$\mathrm{GoS} = B(E, m) = \frac{E^{m}/m!}{\sum_{k=0}^{m} E^{k}/k!}$$

Hence, if we have 2 Erlangs of offered traffic (E=2) and m=5 channels to serve the traffic, the probability of congestion is expected to be just under 4%. Put another way, if you are designing facilities to serve a known peak traffic E and a target GoS of 5%, you can apply the formula incrementally (increasing m by 1 progressively) until you reach your target. Note that this very simple example assumes that there is no facility for putting calls into a queuing system, or re-routing them elsewhere, and critically assumes that calls arrive independently.

In practice these assumptions worked very well for many years while telephone call traffic levels were quite low and stable over periods of 0.5-1.0 hours. However, with sudden increases in call rates people started to find lines busy and then called back immediately, with the result that call arrival rates were no longer Poisson-like. This leads to very rapidly degrading service levels and/or growing queuing patterns (familiar problems in physical examples such as supermarket checkouts and busy motorways, but also applicable to telephone and data communications networks).

Erlang, and subsequently others, developed statistical formulas for addressing many questions of this type that are still used today. However, as with some other areas of statistical methods previously described, the rise of computational power has enabled entire systems to be simulated, allowing a range of complex conditions to be modeled and stress-tested, such as varying call arrival rates, allowing buffering (limited or unlimited), handling device failure and similar factors, which would previously have been impossible to model analytically.
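
The Erlang B calculation and the incremental dimensioning procedure described above can be sketched in a few lines of Python. This is an illustrative sketch rather than Erlang's own procedure: the function names are assumptions, and the blocking probability is computed with the standard recurrence B(E, 0) = 1, B(E, m) = E·B(E, m-1) / (m + E·B(E, m-1)), which is algebraically equivalent to the Erlang B formula but avoids large factorials.

```python
from math import factorial


def erlang_b(traffic: float, lines: int) -> float:
    """Blocking probability for offered traffic E (Erlangs) served by m lines.

    Uses the recurrence form of Erlang B, starting from B(E, 0) = 1.
    """
    blocking = 1.0
    for m in range(1, lines + 1):
        blocking = traffic * blocking / (m + traffic * blocking)
    return blocking


def erlang_b_direct(traffic: float, lines: int) -> float:
    """The same quantity from the closed-form expression, as a cross-check."""
    numerator = traffic**lines / factorial(lines)
    denominator = sum(traffic**k / factorial(k) for k in range(lines + 1))
    return numerator / denominator


def lines_needed(traffic: float, target_gos: float) -> int:
    """Smallest number of lines whose blocking probability meets the target GoS."""
    m = 0
    while erlang_b(traffic, m) > target_gos:
        m += 1
    return m


gos = erlang_b(2.0, 5)            # the example from the text: E=2, m=5
required = lines_needed(2.0, 0.05)

print(round(gos, 4), required)
```

For E=2 the blocking probability with four lines is about 9.5%, so a fifth line is needed to bring it under a 5% target, at which point it falls to just under 4%.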

The final area of application we shall discuss is that of experimental design. Research into the best way to examine the effectiveness of different treatments applied to crops led R A Fisher to devise a whole family of scientific methods for addressing such problems. In 1919 Fisher joined the Rothamsted Agricultural Experiment Station and commenced work on the formal methods of randomization and the analysis of variance, which now form the basis for the design of 'controlled' experiments throughout the world. Examples of the kind of problem his procedures address are: "does a new fertilizer treatment X, under a range of different conditions/soils etc, improve the yield of crop Y?" or "a sample of women aged 50-60 are prescribed one of three treatments: hormone replacement therapy (HRT); or a placebo; or no HRT for x years - does the use of HRT significantly increase the risk of breast cancer?".
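
To give a flavor of the two ideas named above, the following Python sketch randomly allocates plots to treatments and then computes the F statistic of a one-way analysis of variance. It is a simplified illustration, not Fisher's full methodology: the plot labels, treatment names and yield figures are invented purely for the example, and the function names are assumptions.

```python
import random
from collections import Counter
from statistics import mean


def randomize_plots(plots, treatments, seed=0):
    """Shuffle the plots, then deal the treatments out in rotation."""
    rng = random.Random(seed)
    shuffled = list(plots)
    rng.shuffle(shuffled)
    return {plot: treatments[i % len(treatments)] for i, plot in enumerate(shuffled)}


def one_way_anova_f(groups):
    """F statistic: between-group mean square divided by within-group mean square."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((value - mean(g)) ** 2 for g in groups for value in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Nine hypothetical plots allocated at random to three hypothetical treatments.
allocation = randomize_plots(range(1, 10), ["control", "fertilizer X", "fertilizer Z"])
plots_per_treatment = Counter(allocation.values())

# Invented yields for each treatment group, for illustration only.
yields = [
    [20.0, 21.0, 22.0],  # control
    [21.0, 22.0, 23.0],  # fertilizer X
    [22.0, 23.0, 24.0],  # fertilizer Z
]
f_statistic = one_way_anova_f(yields)

print(dict(plots_per_treatment), f_statistic)
```

The F statistic would then be compared with the F distribution on 2 and 6 degrees of freedom to judge whether the differences between treatment means are larger than chance variation alone would suggest.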

As can be seen from these varied examples, statistics is a science that has developed from the need to address very specific and practical problems. The methods and measures developed over the last 150-200 years form the basis for many of the standard procedures applied today, and are implemented in the numerous software packages and libraries used by researchers on a daily basis. What has perhaps changed in recent years is the growing use of computational methods to enable a broader range of problems, with more variables and much larger datasets, to be analyzed. The range of applications now embraced by statistics is immense. As an indication of this spread, the following are lists of areas of specialism for consultants, as given on the websites of the UK Royal Statistical Society (RSS) and the American Statistical Association (ASA):

Statistical Consultancy - Areas of specialism - RSS

Applied operational research | Epidemiology | Neural networks and genetic algorithms | Sampling
Bayesian methods | Expert systems | Non-parametric statistics | Simulation
Bioassay | Exploratory data analysis | Numerical analysis and optimization | Spatial statistics
Calibration | Forecasting | Pattern recognition and image analysis | Statistical computing
Censuses and surveys | GLMs and other non-linear models | Quality methodology | Statistical inference
Clinical trials | Graphics | Probability | Survival analysis
Design & analysis of experiments | Multivariate analysis | Reliability | Time Series

Statistical Consultancy - Areas of specialism - ASA

Bayesian Methods | General Advanced Methodological Techniques | Quality Management, 6-Sigma | Statistical Software - SAS
Biometrics & Biostatistics | Graphics | Risk Assessment & Analysis | Statistical Software - SPSS
Construction of Tests & Measurements | Market Research | Sampling & Sample Design | Statistical Training
Data Collection Procedures | Modeling & Forecasting | Segmentation | Survey Design & Analysis
Decision Theory | Non Parametric Statistics | Statistical Organization & Administration | Systems Analysis & Programming
Experimental Design | Operations research | Statistical Process Control | Technical Writing & Editing
Expert Witness | Probability | Statistical Software - other | Temporal & Spatial Statistics

References

[DEM1] De Moivre A (1718) The Doctrine of Chances: A method of calculating the probabilities of events in play. Available as a freely downloadable PDF from http://books.google.com/books?id=3EPac6QpbuMC

[HAL1] Halley E (1693) An Estimate of the Degrees of Mortality of Mankind. Phil. Trans. of the Royal Society, January 1692/3, pp. 596-610. Available online at http://www.pierre-marteau.com/editions/1693-mortality.html. Also available in Newman J R (1960) The World of Mathematics. Vol 3, Part VIII Statistics and the Design of Experiments, pp. 1436-1447. Oliver & Boyd, London