The Introduction to this Handbook has provided an initial flavor of the ideas that form the basis of statistical methods. However, as with every discipline, there is a whole host of concepts and terminology peculiar to statistics, and it is helpful to identify some of these before proceeding further. As more details are provided on how statistical analyses can and should be conducted, and specific techniques are explained in greater depth, the usage of these discipline-specific terms will become more apparent. It is also worth stating that there is no universal agreement on the usage of the various concepts and terms we discuss, and indeed there are different schools of thought regarding some aspects of the approach that should be taken when undertaking statistical analysis.
We start by reviewing basic probability theory, commencing with the classical notions we have already introduced and then extending into a brief examination of two of the main schools of thought, frequentist and Bayesian. Along the way we explain the terms odds and risk in common statistical usage, and conclude with an explanation of the concept of a probability distribution. The primary purpose of understanding the probabilities associated with different events or outcomes is to assist in decision-making. This may take many forms, from deciding what odds to accept in a game of chance, to helping decide whether a particular drug treatment is effective. In many instances the behavior of samples is used as an indication of how the population as a whole might behave, which involves inferring population results (e.g. the likely range of responses to a program of vaccination) on the basis of samples or trials. This process is an example of statistical inference, to which a whole host of topics relate: for example, understanding how reliable the inferences made might be (e.g. the confidence one might place in an estimated result or value). It also raises the questions of how samples should be taken, how large they should be, and how to avoid a wide range of problems, notably various sources of bias. Each of these areas, and many others, is described in the topics and subsections that follow.
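The relationship between risk and odds, and the sample-based inference just described, can be sketched in a few lines of Python. All numbers here are purely illustrative assumptions, not values from the Handbook; the interval uses the familiar normal approximation to the binomial.

```python
import math
import random

def risk_to_odds(p):
    """Convert a probability (risk) p into odds p / (1 - p)."""
    return p / (1.0 - p)

def odds_to_risk(odds):
    """Convert odds back into a probability."""
    return odds / (1.0 + odds)

# Hypothetical example: a 20% risk corresponds to odds of 0.25, i.e. "1 to 4".
assert math.isclose(risk_to_odds(0.2), 0.25)
assert math.isclose(odds_to_risk(0.25), 0.2)

# Inference sketch: estimate an unknown population proportion from a sample
# and attach an approximate 95% confidence interval (normal approximation).
random.seed(1)
true_p = 0.3            # unknown in practice; fixed here only for illustration
n = 1000
sample = [random.random() < true_p for _ in range(n)]

p_hat = sum(sample) / n                         # sample estimate of the proportion
se = math.sqrt(p_hat * (1 - p_hat) / n)         # estimated standard error
ci = (p_hat - 1.96 * se, p_hat + 1.96 * se)     # approximate 95% interval
print(f"estimate {p_hat:.3f}, 95% CI ({ci[0]:.3f}, {ci[1]:.3f})")
```

The width of the interval shrinks as the sample size grows, which is one concrete sense in which "how large such samples should be" matters for the confidence one can place in an estimate.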
We also provide a brief commentary on statistical modeling. In many situations data is collected according to some pre-defined plan, devised either by the researcher or by third parties, and then analyzed using a variety of statistical methods. Where, as is generally the case, such analysis involves some form of idealized model of how the various components of the data might behave, the aim might be to examine the validity and fit of the model, and/or to use the results to guide subsequent research or predictions. In other instances data is collected according to a well-defined experimental design, controlled and managed by the research team, again with an underlying statistical model in mind.
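As a minimal sketch of examining the fit of an idealized model, the following fits a straight-line model by ordinary least squares and reports R-squared as a crude measure of fit. The data, the linear model, and the noise level are all assumptions made up for this illustration.

```python
import random

# Hypothetical data generated from an assumed linear model y = 2 + 0.5 x
# plus noise; in real research the true mechanism would be unknown.
random.seed(2)
xs = [float(i) for i in range(50)]
ys = [2.0 + 0.5 * x + random.gauss(0, 1) for x in xs]

# Ordinary least-squares fit of the idealized model y = a + b x.
n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
b = sxy / sxx               # fitted slope
a = mean_y - b * mean_x     # fitted intercept

# Goodness of fit: R^2, the proportion of variance explained by the model.
ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
ss_tot = sum((y - mean_y) ** 2 for y in ys)
r2 = 1 - ss_res / ss_tot
print(f"a={a:.2f}, b={b:.2f}, R^2={r2:.3f}")
```

A high R-squared here reflects only how well this particular idealized model tracks this particular data; judging the validity of the model for guiding subsequent research requires the further diagnostic ideas discussed later in the Handbook.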