There is a very large number of statistical software packages and routines, ranging from large-scale commercial packages with many optional modules to niche-market and application-specific software that may or may not be fully supported. A major change in recent years has been the rise of open source code, and in the case of statistical software the "R Project for Statistical Computing" (http://www.r-project.org/) has led the way. The ready availability of this software, together with support from a global network of developers and statisticians and some excellent supporting materials (books and online documentation), has meant that for many professional statisticians (academic, government and commercial) R has become a key resource. However, software like R is not always simple to use and is generally not integrated into the working practices and systems of major organizations. Integrated packages, with powerful visual user interfaces, graphical displays and data input/output facilities, have an important role to play. The majority of these are commercial products, many of which have a long and often distinguished history. In general all packages, free and commercial, provide online help in various forms (WinHelp, HTML, PDF etc.), and in most instances this includes the formulas applied in each case and the references on which the procedures are based.

The accuracy of statistical software is typically subject to three qualifications:

• first, the execution of all algorithms depends on the underlying processor architecture. For most statistical calculations 32-bit arithmetic is quite sufficient, but in some circumstances it is not. Errors are always possible, and on occasion will result in a package or toolkit giving incorrect answers even though it is correct on most occasions

• second, the particular implementation of an algorithm and the selection of formulas will tend to differ from package to package, so one should not expect even simple analyses produced by different packages, or by hand/spreadsheet calculation, to yield exactly the same answers. However, the results should be very similar and should not lead to incorrect decision-making on the basis of the computations

• third, many commercial packages have been in the marketplace for many years, some for several decades. They can only survive if they meet client expectations and provide consistent, 'accurate' results. However, no specific release can be guaranteed error free, and users should always be wary of unexpected results. If necessary, output should be cross-checked with other software and/or by manual computation, and the latest updates should be installed where appropriate, as regular, fast bug-fixing is a critical issue in software provision
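The first two points can be illustrated with a short sketch. It is not drawn from any particular package: the data values and the function names (to_float32, two_pass_variance, one_pass_variance) are hypothetical and chosen only for illustration, and the sketch assumes a standard Python 3.8+ installation. It stores the same three numbers in single and double precision, computes the sample variance with two algebraically equivalent formulas, and computes quartiles under two accepted conventions.

```python
import statistics
import struct


def to_float32(x: float) -> float:
    """Round a 64-bit Python float to the nearest 32-bit float value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def two_pass_variance(data):
    """Sample variance: compute the mean first, then squared deviations."""
    n = len(data)
    mean = sum(data) / n
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def one_pass_variance(data):
    """'Calculator' formula: algebraically equivalent, numerically fragile."""
    n = len(data)
    total = sum(data)
    total_sq = sum(x * x for x in data)
    return (total_sq - total * total / n) / (n - 1)


# Hypothetical data: a small spread on a large offset.
# The exact sample variance is 1.
data64 = [1_000_000_001.0, 1_000_000_002.0, 1_000_000_003.0]

# Point 1: stored in single precision, the three values become identical,
# so the spread is lost before any statistic is computed.
data32 = [to_float32(x) for x in data64]
var_from_32bit_data = two_pass_variance(data32)

# Point 2: two formulas for the same quantity, both in double precision.
var_two_pass = two_pass_variance(data64)
var_one_pass = one_pass_variance(data64)

# Point 2 again: two accepted quartile conventions for the same sample.
sample = list(range(1, 11))
q_exclusive = statistics.quantiles(sample, n=4, method="exclusive")
q_inclusive = statistics.quantiles(sample, n=4, method="inclusive")

print("two-pass variance, 64-bit data:", var_two_pass)
print("two-pass variance, 32-bit data:", var_from_32bit_data)
print("one-pass variance, 64-bit data:", var_one_pass)
print("quartiles (exclusive method):  ", q_exclusive)
print("quartiles (inclusive method):  ", q_inclusive)
```

With these inputs the two-pass calculation on the 64-bit data returns 1.0, while the same calculation on data rounded to single precision returns 0.0. The one-pass formula also fails to recover the correct value, because the squared terms exceed the precision available. The two quartile conventions give first quartiles of 2.75 and 3.25 for the same ten values; neither is wrong, they simply follow different definitions.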

The Wikipedia website provides a wide-ranging and useful (though not necessarily entirely independent or up-to-date) comparison of statistical software packages, available at: http://en.wikipedia.org/wiki/Comparison_of_statistical_packages. This includes pricing information, operating system support and functionality summaries (notably for regression and ANOVA facilities) for over 50 products. More specialized and open source software, such as the MRC's BUGS project or the NIST's Dataplot, is not listed in most reviews; although such software is provided without warranties, it is backed by expert development teams from the sponsoring bodies.

In general, current reviews of quality and accuracy are not available, so it is up to users to determine the suitability and effectiveness of the software they choose to use. A detailed study of this type is produced roughly every 5-10 years and draws attention to the problems that may exist. For example, Altman and McDonald (2002, [ALT1]; see also Altman et al., 2004, [ALT2]) conducted benchmark tests on 5 areas of statistics (univariate statistics, ANOVA, linear regression, non-linear regression and distributions) for 4 well-known statistical packages, and reported the results obtained by earlier researchers for a further 7 packages. In all cases the packages produced reliable results for most simple problems, but many produced incorrect results for more complex problems. In some instances this reflected processor-related issues, in others basic design choices or design flaws. Statistical Reference Datasets (StRD) for each of these areas are available from the US NIST. Several products now use these datasets when evaluating their operation, including open source products such as R (base) and the NIST's Dataplot.
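A measure often used in benchmark comparisons of this kind is the log relative error (LRE), which approximates the number of significant digits on which a computed result agrees with a certified value. The following is a minimal sketch of that calculation only, not the procedure of any specific study: the function name log_relative_error, the package labels and all numerical values are hypothetical, and none of the values are taken from the StRD archive.

```python
import math


def log_relative_error(computed: float, certified: float) -> float:
    """Approximate count of correct significant digits in `computed`."""
    if computed == certified:
        return math.inf  # agreement to full working precision
    if certified == 0.0:
        # Relative error is undefined; fall back to log absolute error.
        return -math.log10(abs(computed))
    return -math.log10(abs(computed - certified) / abs(certified))


# Illustrative values only; these are NOT taken from the StRD archive.
certified_value = 1.0
results = {
    "package_a": 1.0000001,  # hypothetical output, good to about 7 digits
    "package_b": 1.02,       # hypothetical output, good to under 2 digits
}

lre = {name: log_relative_error(value, certified_value)
       for name, value in results.items()}

for name, digits in lre.items():
    print(f"{name}: about {digits:.1f} correct significant digits")
```

In practice the certified value would come from a reference dataset, and a user would decide in advance how many digits of agreement are acceptable for the application in hand.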

There is also almost unlimited scope for errors and/or confusion in the data input, processing/programming and output phases when statistical software is used. In many instances non-statisticians are best advised to engage a statistician who is familiar with their application area and with appropriate software (e.g. R, SPSS or SAS/STAT), rather than attempting to undertake the entire task themselves. A professional statistician may well also highlight issues in the design and implementation of a research program, so involving one early is likely to be extremely beneficial to the successful outcome of many projects.

References

[ALT1] Altman M, McDonald M P (2002) Choosing Reliable Statistical Software. Political Science & Politics, 34(3), 681-687

[ALT2] Altman M, Gill J, McDonald M P (2004) Numerical Issues in Statistical Computing for the Social Scientist. Wiley Series in Probability and Statistics, J Wiley & Sons, NJ

Websites:

NIST StRD: http://www.itl.nist.gov/div898/strd/general/dataarchive.html