Randomized controlled trials


The randomized controlled trial (RCT) is generally accepted as the preferred approach to conducting a wide variety of medical trials and is a central technique in the broader field of evidence-based medicine. The approach has many advantages; in particular, RCTs have been shown to be very effective in controlling for selection bias and confounding, which other procedures may fail to achieve. To explain the central ideas behind RCTs, we start with a brief description of the steps involved in the very first RCT, which was designed to determine the effectiveness of a new treatment for tuberculosis (TB).

The first step when designing an RCT is to carefully define the problem to be studied. Ideally this definition should be easily understood and as narrow as is practical. In the British Medical Research Council's (MRC) study of tuberculosis treatments in the late 1940s, the trial (of the effectiveness of streptomycin as a treatment) was defined by restricting it to: "acute progressive bilateral pulmonary tuberculosis of presumably recent origin, bacteriologically proved, unsuitable for collapse therapy, age group 15 to 25 (later extended to 30)" ([BH1], Ch. 20, and [MRC1]). This was the first truly randomized clinical trial, and the randomization aspect was devised by Austin Bradford Hill, who had himself been diagnosed with TB in his early 20s and had spent two years in hospital and a further two years convalescing from the disease.

The second step, in the most commonly applied form of RCT, is to define two distinct groups, ideally of approximately equal size and make-up. In the tuberculosis trial 55 patients were given the new treatment (streptomycin plus bed rest) whilst a separate group of 52 patients were given bed rest alone. One group is the treatment group (or intervention group) whilst the other is the non-treatment group or control group. By non-treatment we mean that this group receives a placebo (i.e. a tablet or preparation that is not a drug at all and has no physiological effect on the patient), or no treatment, or continues on an existing, established treatment programme. Whichever approach is chosen must be precisely defined.

The patients selected to be involved in the trial are then randomly allocated to one of the two groups and their progress is monitored over a period of time (usually relatively short). Most RCTs now carried out follow this general approach. There are other variants, notably cross-over RCTs, in which patients are randomly assigned to groups and each group then receives a sequence of treatments or non-treatments, with the order of the sequence determined at random.
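As a minimal sketch of random allocation to two groups, the Python example below uses permuted-block randomization, one common way of keeping the two groups close to equal in size. It is not the procedure used in the 1948 trial. The function name block_randomize, the patient identifiers, the block size and the seed are all illustrative assumptions.

```python
import random


def block_randomize(patient_ids, block_size=4, seed=None):
    """Allocate patients to 'treatment' or 'control' using permuted blocks.

    Each block holds equal numbers of the two arms in a random order, so
    the group sizes never differ by more than block_size / 2.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even for two equal arms")
    rng = random.Random(seed)  # a fixed seed makes the list reproducible
    allocation = {}
    for start in range(0, len(patient_ids), block_size):
        block = ["treatment", "control"] * (block_size // 2)
        rng.shuffle(block)
        # zip stops early if the final block is only partly filled
        for pid, arm in zip(patient_ids[start:start + block_size], block):
            allocation[pid] = arm
    return allocation


# Hypothetical patient identifiers, for illustration only
patients = [f"P{i:03d}" for i in range(1, 21)]
allocation = block_randomize(patients, block_size=4, seed=2024)
counts = {
    arm: sum(1 for a in allocation.values() if a == arm)
    for arm in ("treatment", "control")
}
print(counts)
```

In practice the allocation list would be generated and held by someone who is not involved in recruiting or treating patients, so that the next assignment cannot be foreseen.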

In the case of the 1948 tuberculosis trial the results after 6 months were as follows:




Outcome at 6 months                   Streptomycin plus bed rest (n = 55)   Bed rest only (n = 52)
Considerable improvement              28 (51%)                              4 (8%)
Lesser improvement or deterioration   23 (42%)                              34 (65%)
Deaths                                4 (7%)                                14 (27%)

This finding was regarded as an important breakthrough: the success of the treatment was clear to see, and the result was statistically significant (i.e. extremely unlikely to have occurred by chance). After 12 months a further 8 of the treated patients had died, and a further 10 of the control group, again a significant result, but one that also indicated a reduced efficacy of the treatment over time. None of the patients, however, could be regarded as having been cured. Interestingly, the authors did not report at the time why they chose the sample sizes used. In a remarkable recorded interview in 1990 Bradford Hill, then aged 93, stated that the main reason for choosing 50 or so patients was that this was as much streptomycin as could be obtained from the USA at the time, given its scarcity, its high cost and the considerable problems in obtaining US currency in the UK in the immediate post-WWII period. He also stated that the patients were not informed about the treatment they were to receive (i.e. there was no "informed consent", which is nowadays an absolute requirement for such trials). The authors also do not describe what form of statistical analysis was performed, merely that the results were statistically significant. The impression one has from Bradford Hill's book is that simple chi-square tests were carried out.
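Since the published analysis is not described, the following is only an illustration of the kind of test that might have been used: a Pearson chi-square test of independence applied to the 6-month results (55 streptomycin patients, 52 controls). The function name and the layout of the table are assumptions made for this sketch. The p-value uses the fact that, for exactly 2 degrees of freedom, the upper tail of the chi-square distribution is exp(-x/2).

```python
import math

# Rows: considerable improvement, lesser improvement or deterioration, deaths.
# Columns: streptomycin plus bed rest (n = 55), bed rest only (n = 52).
observed = [
    [28, 4],
    [23, 34],
    [4, 14],
]


def chi_square_independence(table):
    """Return the Pearson chi-square statistic and its degrees of freedom."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if outcome were independent of treatment group
            expected = row_totals[i] * col_totals[j] / total
            statistic += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


statistic, dof = chi_square_independence(observed)

# Closed-form upper tail, valid only when dof == 2 (as it is for a 3 x 2 table)
p_value = math.exp(-statistic / 2) if dof == 2 else None

print(f"chi-square = {statistic:.2f}, dof = {dof}, p = {p_value:.2g}")
```

A very small p-value here is consistent with the authors' statement that the difference between the groups was statistically significant.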

In order to carry out random assignment of patients to groups, the researchers who were treating the patients in the trial allocated patients themselves to the trial groups using random numbers, but this method can lead to problems. An example is given by Cancer Research UK:

"it is possible to be biased without realizing it. For example, if a new treatment has quite bad side effects, the doctors running the trial might subconsciously avoid putting sicker patients into the group having the new treatment. So as the trial went on, the control group would have more and more of the sickest patients in it. The people in the new treatment group would then do better than the control group. So, when the trial results come out, the new treatment would [incorrectly] look as if it works better than the standard treatment."

To avoid such problems a system known as blinding is applied. In blind trials one or more parties are unaware of the assignment of treatments. For example, in a so-called single-blind trial the individual receiving the treatment is not told whether the treatment they are being given is an existing treatment, a new treatment or a placebo. In double-blind trials neither the experimenter nor the patient knows which treatment has been assigned to which patient (treatments are coded), thereby minimizing the risk of the experimenter influencing the experiment. Unfortunately this terminology is not uniformly applied, which has led to current recommendations to describe in detail the kind of blinding applied (if any), including whether it extends to those involved in the analysis and interpretation of the results.

Whilst the RCT procedure described above appears simple and straightforward, it does have difficulties. Determining the appropriate sample size can be problematic, especially for rarer conditions and/or where suitable triallists are simply not available in the location or at the time required (see the earlier discussion on sample size and, in particular, the Salk polio vaccine trials of 1954). Sample size and retention of participants need careful thought, together with prior research to determine the sort of effect that the treatment or intervention might have and the size of effect that would be considered clinically useful (and therefore worth powering the trial to detect). To achieve the required sample size it is quite usual for a trial to be run with several participating centers in different parts of the country, or even to be international, with participating centers from several countries.
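To illustrate how the clinically useful effect size drives the required sample size, the sketch below applies a standard normal-approximation formula for comparing two proportions with equal-sized groups. It is one of several formulas in use and is not taken from any of the trials discussed here. The function name, the default significance level and power, and the outcome proportions are all hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p_control, p_treatment, alpha=0.05, power=0.80):
    """Approximate number of patients needed in EACH group.

    Uses the normal approximation for a two-sided test comparing two
    proportions with equal allocation to the two groups.
    """
    if p_control == p_treatment:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treatment) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(
            p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
        )
    ) ** 2
    return math.ceil(numerator / (p_control - p_treatment) ** 2)


# Hypothetical outcome rates: a larger expected difference needs fewer patients
n_large_effect = sample_size_two_proportions(0.50, 0.30)
n_small_effect = sample_size_two_proportions(0.50, 0.40)
print(n_large_effect, n_small_effect)
```

Halving the difference to be detected roughly quadruples the number of patients required, which is why the choice of a clinically useful effect size needs to be settled before recruitment begins. An allowance for drop-outs would normally be added on top of these figures.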

The cost of conducting RCTs may be high, and if a trial is interrupted during its structured program, or there are problems with adverse effects or triallists dropping out, it may be unclear how the results are to be interpreted. It may also not be possible to obtain the desired numbers in each group and stratum, or to reach the overall desired sample size if it is far larger than is achievable within time and cost constraints. Having an appropriate sample size is an ethical issue: too small a trial is unlikely to give a useful result, so cannot justify the cost and patient time involved, and too large a trial similarly wastes money and patient time. There may also be other ethical problems. For example, should a particular treatment that is thought to be very promising be withheld from very sick patients? (See further the Declaration of Helsinki.) Despite these reservations, RCTs do provide the "gold standard" for conducting many types of medical trial, and have proved to be extremely successful, including in trials involving complex interventions; see for example John et al. [JOH1], Kinmonth et al. [KIN1] and the BUMPES (Birth in Upright Maternal Position) Trial [BUMPES].


[BH1] Bradford Hill A (1937) Principles of Medical Statistics. The Lancet, London (issued in various editions until 1971, then republished as "A Short Textbook of Medical Statistics" in 1977)

[BUMPES] The Epidural and Position Trial Collaborative Group (2017) Upright versus lying down position in second stage of labour in nulliparous women with low dose epidural: BUMPES randomised controlled trial. BMJ; 359:j4471

[JOH1] John J H, Ziebland S, Yudkin P L, Roe L S, Neil H A W for the Oxford Fruit and Vegetable Study Group (2002) A randomised controlled trial of a primary care intervention to increase fruit and vegetable consumption: effects on plasma antioxidant concentrations and blood pressure. Lancet; 359:1969-74

[KIN1] Kinmonth A L, Wareham N J, Hardeman W, Sutton S, Prevost A T, Fanshawe T et al. (2008) Efficacy of a theory-based behavioural intervention to increase physical activity in an at-risk group in primary care (ProActive UK): a randomised trial. Lancet; 371(9606):41-48

[MRC1] Medical Research Council (1948) Streptomycin treatment of pulmonary tuberculosis. BMJ, 4582, 769-782

Wikipedia: Randomized controlled trials: https://en.wikipedia.org/wiki/Randomized_controlled_trial