Why promising results from a large clinical trial into vitamin D and COVID-19 may not be all that they seem

Queen Mary University of London

On 13th February David Davis MP tweeted: “The findings of this large and well conducted study should result in this therapy being administered to every Covid patient in every hospital in the temperate latitudes.”

The therapy in question was vitamin D, and the Spanish study was a randomised trial purporting to show a 60 per cent drop in death rates amongst hospitalised patients. The story was covered in other media, but an avalanche of responses, both on Twitter and on the website where the study details were posted, indicated that not everyone agreed with David Davis’ assessment. The study was eventually removed from the website. So, what was wrong with this study?

Over the past year, everyone has become increasingly familiar with clinical trials. The backbone of clinical research, clinical trials are based on the idea that to see if a treatment works we take a group of individuals who might benefit from the treatment if it does work, give the treatment to half of them and not to the other half and then compare what happens to the individuals in the two halves.

Yes, trials can be more complicated, but this is the essence. Two things are key as we construct this comparison. First, there must be a large enough number of people involved in the comparison. Second, the comparison must be “fair”. Putting all the older people in one group and the younger people in another would obviously invalidate any findings, and if the government had tried to pursue policies for treatment of COVID-19 based on trials of, say, eight patients, no one, no matter what their background, would have trusted them.

Understanding how to get large enough numbers is fairly straightforward. To make the comparison as fair as possible, we use randomisation, a statistical technique for deciding whether individuals get the treatment or not. Though everything is done by computer nowadays, the easiest way to understand randomisation is to think of tossing a coin for every individual – heads you go in the treatment group, tails you go in the non-treatment group. It’s accepted as the best way of making sure the two groups end up being comparable.
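The coin-toss idea can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in any real trial: the function names, the seeds and the simulated ages (drawn uniformly between 30 and 90) are all assumptions made for the example. It tosses a simulated coin for each patient and then checks how far apart the two groups end up on average age.

```python
import random
from statistics import mean

def randomise(patients, seed=0):
    """Toss a simulated fair coin for each patient to pick their group."""
    rng = random.Random(seed)
    groups = {"treatment": [], "control": []}
    for patient in patients:
        arm = "treatment" if rng.random() < 0.5 else "control"
        groups[arm].append(patient)
    return groups

def simulated_ages(n, seed=1):
    """Hypothetical patient ages, drawn uniformly from 30 to 90 (an assumption)."""
    rng = random.Random(seed)
    return [rng.randint(30, 90) for _ in range(n)]

def mean_age_gap(groups):
    """Absolute difference in mean age between groups, or None if a group is empty."""
    if not groups["treatment"] or not groups["control"]:
        return None
    return abs(mean(groups["treatment"]) - mean(groups["control"]))

if __name__ == "__main__":
    # Compare a tiny trial with a large one.
    for n in (8, 900):
        groups = randomise(simulated_ages(n))
        print(n, len(groups["treatment"]), len(groups["control"]), mean_age_gap(groups))
```

With several hundred patients the two groups will usually have very similar average ages. With only eight, the gap can be large purely by chance, which is why randomisation and adequate numbers go hand in hand.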

So, was this Spanish study large enough and did it use randomisation? With over 900 patients and described by the authors as a randomised trial, on the face of it, yes. However, dig a little deeper, as many of those who tweeted in response to David Davis did, and things begin to fall apart. This study has a remarkably large number of flaws. Here I outline two, which, unfortunately, leave the purported results in tatters.

Cluster randomised trials

First, although the study is described by the authors as a randomised controlled trial, it wasn’t the 900-plus individuals in the trial who were randomised; it was hospital wards, eight of them. Five wards were selected randomly to give vitamin D to their patients, and patients on the other three wards did not receive vitamin D. Randomising entities other than actual patients is a perfectly legitimate clinical trial design; in this case, it earns the trial the designation of a “cluster” randomised trial. However, eight is rather a small number.

What if the wards have very different characteristics and, when randomised, just by chance the two groups aren’t very comparable? There might, for example, be one very large pioneering ward amongst the eight; whichever group that ward is randomised into will have an unfair advantage in showing patient benefit. This is not quite the same as having only eight patients in a trial, which most people might consider suspect, but it has similarities. It doesn’t necessarily invalidate the trial, but some account of the variation between wards does need to be made in the trial analysis. This was not done in this trial, which means the results presented appear more precise than they really are: the uncertainty around them is understated.
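One common way statisticians express this loss of precision is the “design effect”, which for equally sized clusters is 1 + (m − 1) × ICC, where m is the cluster size and ICC is the intracluster correlation (how alike patients on the same ward are). The sketch below is a rough illustration only. It assumes a round figure of 900 patients split equally across eight wards, and the ICC values are hypothetical; the study’s actual ward sizes and ICC are not given here.

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation from clustering, assuming equally sized clusters."""
    return 1 + (mean_cluster_size - 1) * icc

def effective_sample_size(n_patients, n_clusters, icc):
    """Number of individually randomised patients carrying the same information."""
    mean_cluster_size = n_patients / n_clusters
    return n_patients / design_effect(mean_cluster_size, icc)

if __name__ == "__main__":
    # The ICC values below are assumptions chosen purely for illustration.
    for icc in (0.0, 0.01, 0.05):
        n_eff = effective_sample_size(900, 8, icc)
        print(f"ICC={icc}: effective sample size about {n_eff:.0f}")
```

Under these assumptions, an ICC of 0.05 would leave 900 patients on eight wards carrying roughly as much information as 137 individually randomised patients. An analysis that ignores the clustering treats them as 900 and so reports confidence intervals that are too narrow.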

So, let’s solve this. Let’s give the trial data to some statisticians who know how to make the appropriate adjustments, and see how the results pan out. Unfortunately, a further fatal flaw in this trial means that that won’t rectify things. This flaw is in the way the patients themselves were assigned to the wards. It appears (though the publication is not very clear on this) that they were recruited to the trial by hospital staff who knew which wards were treatment wards and which were non-treatment wards.

It doesn’t take very much thought to realise that this may have influenced the doctors’ decision on which patients to recruit into the trial, and which ward they were sent to. There were no safeguards to ensure that patients in the treated and non-treated groups were comparable in any way. Though some statistical techniques can go partway to account for this problem, sadly they cannot do so with absolute certainty; this is a case of unknown unknowns.
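To see how much damage this kind of selective allocation can do, consider a small simulation in which the treatment has no effect whatsoever. Everything here is hypothetical: the baseline risks, the 80/20 steering rule and the function name are assumptions for illustration, not a description of what happened in the Spanish study.

```python
import random

def simulate_death_rates(n, staff_choose, seed=0):
    """Return (treated, untreated) death rates when the treatment does nothing.

    Intended for large n, so that both groups are non-empty.
    """
    rng = random.Random(seed)
    deaths = {True: 0, False: 0}
    counts = {True: 0, False: 0}
    for _ in range(n):
        risk = rng.uniform(0.05, 0.45)  # hypothetical baseline risk of death
        if staff_choose:
            # Assumed behaviour: lower-risk patients are mostly sent to treatment wards.
            p_treated = 0.8 if risk < 0.25 else 0.2
        else:
            p_treated = 0.5  # a coin toss for each patient
        treated = rng.random() < p_treated
        died = rng.random() < risk  # the outcome ignores treatment entirely
        counts[treated] += 1
        deaths[treated] += died
    return deaths[True] / counts[True], deaths[False] / counts[False]

if __name__ == "__main__":
    print("staff choose:", simulate_death_rates(20000, staff_choose=True))
    print("coin toss:   ", simulate_death_rates(20000, staff_choose=False))
```

When staff steer healthier patients towards the treatment wards, the treated group shows a clearly lower death rate even though the treatment is useless. With coin-toss allocation the two death rates come out close to each other, as they should.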

With a bit of thought and the expertise of those versed in trial design this trial could have been conducted without these flaws. As it is, David Davis and others have been, sadly, misled.
