Bayes' theorem, which we discussed earlier in the section covering Bayesian probability theory, explains how hypotheses and/or probabilities can be adjusted to reflect not only the observed data but also any prior information about the problem under consideration that might be important to include in the analysis. This is particularly common in areas such as risk assessment, and since much of medical research and testing can be seen as a form of risk assessment, Bayesian approaches have been widely adopted in this field. In the context of estimation, a Bayes estimator of a parameter θ is derived from the posterior distribution of the parameter, which is based on the likelihood of the observed data (hence similar to a maximum likelihood approach) multiplied by the prior distribution selected for the parameter. Notice that here the parameter to be estimated is assumed to have a distribution of values rather than a single true value. This can be formally stated as:

$$p(\theta \mid x) \propto L(x \mid \theta)\, p(\theta)$$

where x denotes the observed data, L(x|θ) is the likelihood, p(θ) is the prior distribution and p(θ|x) is the posterior distribution.
This indicates that a Bayesian estimator requires selection of both the data and a probability distribution for the prior, with the result being dependent on these selections. Commonly used analytical priors are the Uniform or constant prior; the Normal distribution (for Normally distributed data), which results in a posterior distribution that is also Normal; and the Beta distribution for Binomial data, which results in a Beta distribution for the posterior. The last two examples are described as conjugate priors, because the posterior distribution is of the same form as the prior. In both instances, as the sample size increases the influence of the prior decreases quite rapidly, demonstrating that there is often little difference between a Bayes estimate and a maximum likelihood estimate (MLE) for larger sample sizes. With smaller sample sizes (from 1 upwards) Bayes estimators tend to pull the estimated set of values of a parameter (the credible set or belief interval) towards the prior, and many researchers believe that this is a more satisfactory way of addressing many such problems than frequentist approaches, and in some instances the only reasonable approach. Bayes estimators are not in general unbiased, but this is not a cause for serious concern, since unbiased ML estimators can actually be poorer than biased Bayes estimators for small samples. It must be emphasized, however, that this is only valid if the priors are indeed informative. Empirical priors (which can be regarded as objective) and analytical priors are widely used in specific applications, with posterior estimates obtained using computationally intensive processing. Such methods, including MCMC procedures, are implemented in software from the BUGS project and in SAS, amongst others.
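The shrinking influence of the prior can be illustrated with the Beta prior for Binomial data. The sketch below is illustrative only: the function names, the Beta(2, 8) prior and the 40% success rate in the data are hypothetical choices, not values taken from the text. It relies on the standard conjugate result that a Beta(a, b) prior combined with k successes in n trials gives a Beta(a + k, b + n - k) posterior.

```python
def beta_binomial_posterior(successes, trials, a=1.0, b=1.0):
    """Conjugate update: Beta(a, b) prior plus Binomial data gives a Beta posterior."""
    return a + successes, b + (trials - successes)


def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)


# Hypothetical informative prior with mean 0.2 (an assumption for illustration).
prior_a, prior_b = 2.0, 8.0

for n in (5, 50, 500):
    k = 2 * n // 5  # hypothetical data: 40% successes in every sample
    a_post, b_post = beta_binomial_posterior(k, n, prior_a, prior_b)
    mle = k / n
    bayes = beta_mean(a_post, b_post)
    print(f"n={n:4d}  MLE={mle:.4f}  posterior mean={bayes:.4f}")
```

With these assumed inputs the posterior mean lies between the prior mean (0.2) and the MLE (0.4) and moves towards the MLE as n grows, which is the behaviour described above.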
Example: Disease prevalence
Mossman and Berger (2001, [MOS1]) provide the following example of the application of Bayesian estimation to determine a confidence set (or interval in classical statistics) for the proportion of a population that has a particular disease given that a diagnostic test for the disease is positive (see also: Yudkowsky's example, discussed earlier in this Handbook). Let D denote the event that a member of the population has the disease, and let the probability of D (the prevalence or pre-test probability) be p0; let the probability that a patient who has the disease receives a positive test result (the test sensitivity) be p1; and let the probability that a patient who does not have the disease nevertheless receives a positive test result be p2 (this is the false positive rate; 1-p2 is referred to as the specificity of the test). Bayes' theorem then states that the post-test probability (a point estimate) is:

$$P(D \mid +) = \frac{p_0\, p_1}{p_0\, p_1 + (1 - p_0)\, p_2}$$

where + denotes a positive test result.
If we apply this rule to Yudkowsky's mammography example, a test with a sensitivity of 80% and a false positive rate of 9.6%, we have p0=0.01 (1% prevalence), p1=0.8, and p2=0.096, giving a post-test estimate of 0.008/0.10304, or 7.76%.
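The arithmetic can be checked with a few lines of code. This is a minimal sketch: the function name and argument names are chosen here for illustration, and the calculation is the usual Bayes' theorem expression p0·p1 / (p0·p1 + (1 - p0)·p2).

```python
def post_test_probability(prevalence, sensitivity, false_positive_rate):
    """Probability of disease given a positive test, by Bayes' theorem."""
    true_positives = prevalence * sensitivity
    false_positives = (1.0 - prevalence) * false_positive_rate
    return true_positives / (true_positives + false_positives)


# Mammography example: p0 = 0.01, p1 = 0.8, p2 = 0.096
estimate = post_test_probability(0.01, 0.8, 0.096)
print(f"{estimate:.2%}")  # prints 7.76%
```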
Mossman and Berger then computed 95% confidence intervals for this form of estimate using Bayesian methods (treating the proportions p0, p1 and p2 as having prior distributions) with simulated data, a range of values for the three proportions, and n=20 to 160 (n indicates the size of sample used to determine the population rates). In theory a two-sided 95% interval should result in approximately 2.5% of the samples falling above and 2.5% falling below the confidence interval for specified values of p0. This proved to be very close to the result obtained using the Bayesian estimator, computed using a Beta(1/2,1/2) prior distribution, whereas for a range of other methods that they analyzed the total proportion falling above and below the 95% confidence intervals was generally smaller, effectively showing that those confidence intervals were wider than necessary for a given probability level. In general the Bayesian procedure produced better results than any other method for smaller sample sizes and for smaller values of p, but with larger sample sizes the differences between alternative estimation procedures are less substantial (note that most analyses of this type of data address odds ratios rather than simple odds).
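An interval of this general kind can be approximated by simulation. The sketch below is not a reproduction of Mossman and Berger's procedure. It assumes independent Beta(1/2, 1/2) priors on each of the three proportions, draws from the resulting Beta posteriors, and takes percentiles of the implied post-test probability. The function names and the sample counts are hypothetical.

```python
import random


def posterior_draws(successes, trials, n_draws, rng, a=0.5, b=0.5):
    """Draws from the Beta posterior of a proportion under a Beta(a, b) prior."""
    return [rng.betavariate(a + successes, b + trials - successes)
            for _ in range(n_draws)]


def post_test_interval(counts, n_draws=20000, level=0.95, seed=1):
    """Equal-tailed Monte Carlo interval for the post-test probability.

    counts holds (successes, trials) pairs for prevalence, sensitivity
    and false positive rate, in that order.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    (k0, n0), (k1, n1), (k2, n2) = counts
    p0 = posterior_draws(k0, n0, n_draws, rng)
    p1 = posterior_draws(k1, n1, n_draws, rng)
    p2 = posterior_draws(k2, n2, n_draws, rng)
    values = sorted(a * b / (a * b + (1.0 - a) * c)
                    for a, b, c in zip(p0, p1, p2))
    lower_index = int((1.0 - level) / 2.0 * n_draws)
    upper_index = int((1.0 + level) / 2.0 * n_draws) - 1
    return values[lower_index], values[upper_index]


# Hypothetical samples: 8 of 80 diseased, 32 of 40 diseased patients test
# positive, 4 of 40 healthy patients test positive.
lower, upper = post_test_interval(((8, 80), (32, 40), (4, 40)))
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

For these assumed counts the point estimate from the sample proportions is 0.08/0.17, or about 0.47, and the simulated interval should bracket it. Smaller samples give wider intervals.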