Cohort studies

A cohort study selects a group of individuals (the cohort) and follows aspects of their development over many years, possibly several decades. In particular, the disease incidence and mortality of the cohort are studied. As with case-control studies and randomized controlled trials, the unit of analysis is the individual: macro-level relationships amongst groups (population correlation or ecological studies) are not the basis for research in any of these methods.

At the start of the process members of the cohort are recruited, generally to a carefully constructed profile designed to cover the study population of interest, and detailed interviews are conducted to obtain relevant information about their background, health history, lifestyle, etc. Further interviews and/or questionnaires are then conducted every few years over the course of the project. The great advantage of this approach, known as a prospective cohort study, is that information about the individuals is well documented and all subsequent disease incidence is recorded, in many cases up to eventual death. However, such a study may take a long time and may yield only a few cases of the specific disease or diseases of interest unless a very large cohort is used, which increases the cost and complexity of the project.

Retrospective or historical cohort studies identify a group or cohort with known exposure to a suspected agent, and then attempt to reconstruct the history of exposure and related data in order to understand current disease incidence and mortality patterns. This approach has the advantage that data are available relatively quickly and without excessive expense, and it may be the only possible approach (for example, if a substance or circumstance no longer applies). It is, however, subject to many practical problems, notably missing data and recall problems.

Prospective cohort studies may be compared with the case-control approach, in which cases are selected from those with a disease of interest, controls are selected, and the two datasets are compared. For example, in the very first major cohort study, that of British doctors commencing in 1951, some 34,440 male doctors were recruited, and after 20 years the total number of lung cancer deaths was 441. This compares with a case-control study that commenced in 1948 and was completed in 1952, which involved 4342 people, of whom 1488 were lung cancer cases. The advantages of case-control studies for identifying possible relationships of importance are clear, and they are much faster and typically less costly to conduct, but they are severely limited in the degree of confidence that can be placed in the nature of the relationship and by their reliance on recalled information. With a cohort study an improved understanding of the relationship between exposure, lifestyle and outcomes is possible, including effects not previously identified. For example, in the British doctors study it was apparent, after 20 years, that the death rate from all causes amongst heavy smokers was twice that of non-smokers. In this example the level of successful long-term follow-up of the cohort was very high, but in other studies this has not been the case. Unless results are available for a very high proportion of the cohort, the validity of the findings may be called into question.
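The difference in case yield between the two designs can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above; the variable and function names are illustrative and are not taken from either study.

```python
# Figures quoted in the text above.
cohort_size = 34_440               # male doctors recruited from 1951
cohort_lung_cancer_deaths = 441    # lung cancer deaths after 20 years
case_control_size = 4_342          # participants in the 1948-1952 study
case_control_cases = 1_488         # lung cancer cases among them


def case_fraction(cases, total):
    """Proportion of study participants who are cases of the disease."""
    return cases / total


cohort_fraction = case_fraction(cohort_lung_cancer_deaths, cohort_size)
case_control_fraction = case_fraction(case_control_cases, case_control_size)

print(f"Cohort study:       {cohort_fraction:.1%} of participants")
print(f"Case-control study: {case_control_fraction:.1%} of participants")
```

On these figures about 1.3% of the cohort had died of lung cancer after 20 years, whereas about 34.3% of the case-control participants were cases. The second figure is high because cases are selected by design, so it says nothing about how common the disease is in the population.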

In summary, Breslow and Day [BRE2] identify the following advantages of cohort studies as compared with case-control studies. Cohort studies:

Give a wider picture of the health hazards associated with a given exposure

Eliminate most forms of selection bias and recall bias

Are often the only practical option where exposure to specific agents (e.g. suspected hazardous industrial chemicals) is rare

Allow medical tests to be carried out at the start of the study, in addition to detailed interviews or questionnaires, which may aid interpretation of outcomes, for example in terms of prior susceptibility to certain conditions

Allow repeated measurements over the lifetime of the study, which may be important, for example measurements of specific chemicals present in blood or urine samples

Yield absolute risk estimates for the cohort, as opposed to the relative risk estimates obtained from case-control studies. However, if the cohort is not representative of the broader population, these risk estimates cannot be extended without reservation
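The distinction between absolute and relative measures in the last point can be illustrated with a short calculation. The following is a minimal sketch using invented counts (they are not from Breslow and Day or any real study), and the function names are hypothetical. It assumes the usual definitions: absolute risk is events divided by people at risk, relative risk is the ratio of two absolute risks, and the odds ratio is the measure normally estimated from case-control data.

```python
def absolute_risk(events, total):
    """Proportion of a followed-up group that experienced the event."""
    return events / total


def relative_risk(exposed_events, exposed_total,
                  unexposed_events, unexposed_total):
    """Ratio of absolute risk in the exposed to that in the unexposed."""
    return (absolute_risk(exposed_events, exposed_total)
            / absolute_risk(unexposed_events, unexposed_total))


def odds_ratio(exposed_cases, unexposed_cases,
               exposed_controls, unexposed_controls):
    """Cross-product ratio from a 2x2 case-control table."""
    return ((exposed_cases * unexposed_controls)
            / (unexposed_cases * exposed_controls))


# Hypothetical cohort: group sizes are known, so absolute risks exist.
risk_exposed = absolute_risk(30, 1000)     # 30 events in 1000 exposed
risk_unexposed = absolute_risk(20, 2000)   # 20 events in 2000 unexposed
cohort_rr = relative_risk(30, 1000, 20, 2000)

# Hypothetical case-control study: the numbers of cases and controls are
# fixed by the investigator, so only a relative measure can be estimated.
case_control_or = odds_ratio(80, 20, 40, 60)

print(f"Absolute risk, exposed:   {risk_exposed:.3f}")
print(f"Absolute risk, unexposed: {risk_unexposed:.3f}")
print(f"Relative risk (cohort):   {cohort_rr:.2f}")
print(f"Odds ratio (case-control): {case_control_or:.2f}")
```

The cohort calculation needs the denominators (1000 and 2000 people followed), which a case-control design does not provide. The odds ratio approximates the relative risk only under additional conditions, such as the disease being rare.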

An additional aspect of cohort studies is that they can examine length of survival, rather than simply the outcome at a single time point (see, for example, Svensson et al. [SVE1], where the authors analyzed the survival of patients with bone metastases after 1, 3 and 5 years).
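Survival at fixed time points is commonly estimated with the Kaplan-Meier product-limit method, which allows for individuals whose follow-up ends before the event is observed (censoring). The sketch below is a plain-Python illustration on invented follow-up data; it is not necessarily the method used in [SVE1], and the function names and data are assumptions made for the example.

```python
def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each individual
    observed:  1 if the event (death) was seen, 0 if follow-up was censored
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group everyone whose follow-up ends at exactly time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def survival_at(curve, t):
    """Estimated probability of surviving beyond time t."""
    probability = 1.0
    for time, value in curve:
        if time > t:
            break
        probability = value
    return probability


# Hypothetical follow-up times in years; 0 marks a censored individual.
years = [0.5, 1.2, 2.0, 2.5, 3.0, 4.0, 5.5, 6.0]
died = [1, 1, 0, 1, 1, 0, 1, 0]

curve = kaplan_meier(years, died)
for horizon in (1, 3, 5):
    print(f"Estimated {horizon}-year survival: "
          f"{survival_at(curve, horizon):.3f}")
```

Censored individuals contribute to the number at risk up to the time they leave the study, which is why a cohort with incomplete follow-up can still yield survival estimates, although heavy loss to follow-up weakens them.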

Historically, most attention has been focused on mortality, whereas more recently interest has been shown in the severity of various conditions (e.g. chronic illnesses) and in the detail of dose-response relationships. Both are more demanding in terms of data (e.g. obtaining an understanding of multi-factor effects) and of analytical methods. For dose-response analysis much emphasis is placed on fitting alternative models to data collected over time, using various forms of time series analysis to help model, and thereby predict, safe exposure or dose levels in absolute and temporal terms.
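The idea of fitting alternative dose-response models and comparing them can be sketched very simply. The example below is an illustration only: the dose and rate values are invented, the two candidate models (linear and log-linear) are chosen for simplicity, the comparison uses a plain residual sum of squares, and the time dimension mentioned above is ignored. A real analysis would use purpose-built regression or time series methods.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rss(observed, predicted):
    """Residual sum of squares between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted))


# Hypothetical data: disease rate per 1000 person-years at each dose level.
doses = [0.0, 1.0, 2.0, 3.0, 4.0]
rates = [1.0, 2.0, 4.0, 8.0, 16.0]

# Model A: rate changes linearly with dose.
slope_a, intercept_a = fit_line(doses, rates)
pred_a = [intercept_a + slope_a * d for d in doses]

# Model B: log(rate) changes linearly with dose (fitted on the log scale).
slope_b, intercept_b = fit_line(doses, [math.log(r) for r in rates])
pred_b = [math.exp(intercept_b + slope_b * d) for d in doses]

# Compare both models on the original rate scale.
rss_a = rss(rates, pred_a)
rss_b = rss(rates, pred_b)

print(f"Linear model RSS:     {rss_a:.3f}")
print(f"Log-linear model RSS: {rss_b:.3f}")
```

With these invented values the rate doubles at each dose step, so the log-linear model fits far better than the linear one. Which model fits better matters in practice because the two can imply very different risks at low doses.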


[BRE1] Breslow N E, Day N E (1980) Statistical Methods in Cancer Research: Volume 1: The Analysis of Case-Control Studies. IARC Scientific Publications No. 32, World Health Organization, IARC, Lyon

[BRE2] Breslow N E, Day N E (1987) Statistical Methods in Cancer Research: Volume 2: The Design and Analysis of Cohort Studies. IARC Scientific Publications No. 82, World Health Organization, IARC, Lyon

[SVE1] Svensson E, Christiansen C F, Ulrichsen S P, Rorth M R, Sorensen H T (2017) Survival after bone metastasis by primary cancer type: a Danish population-based cohort study. BMJ Open 7:e016022