Forty years of sports performance research and little insight gained
BMJ 2012; 345 doi: https://doi.org/10.1136/bmj.e4797 (Published 18 July 2012) Cite this as: BMJ 2012;345:e4797
- Carl Heneghan, clinical reader in evidence based medicine,
- Rafael Perera, university lecturer in statistics,
- David Nunan, research fellow,
- Kamal Mahtani, clinical lecturer,
- Peter Gill, DPhil candidate
- Centre for Evidence-Based Medicine, Department of Primary Care Health Sciences, University of Oxford, Oxford OX1 2ET, UK
- Correspondence to: C Heneghan carl.heneghan@phc.ox.ac.uk
Sports drinks manufacturers are keen to emphasise that their products are supported by science, although they are more reticent about the details. As part of the BMJ’s analysis of the evidence underpinning sports performance products, the BMJ asked manufacturers to supply details of the studies. Only one manufacturer, GlaxoSmithKline, complied. It provided us with a comprehensive bibliography of the trials used to underpin its product claims for Lucozade—a carbohydrate-containing sports drink.1 Other manufacturers of leading sports drinks did not supply us with comprehensive bibliographies, and in the absence of systematic reviews we surmise that the methodological issues raised in this article could apply to all other sports drinks.
Of this list of 176 studies, we were able to critically review 106 studies (101 clinical trials) dating from 1971 through to 2012. We did not review posters, supplements, theses, or unavailable articles (see the linked data supplement).
Clinical trials are the best study design we have to evaluate what effect a “treatment”—in this case Lucozade sports drinks—will have on performance. However, not all trials are created equal,2 and the label of randomised controlled trial is no guarantee that a study will provide adequate or useful evidence. As it turns out, when evidence based methods are applied, 40 years of sports drinks research seemingly does not add up to much, particularly when the results are applied to the general public. Below we set out the main problems we identified, together with some examples.
The smaller the sample size, the lower the confidence around the reported effect
Only four studies included a power calculation at the outset,3 4 5 6 and very few studies discussed the importance of statistical power: we identified only one study, in seven moderately trained subjects, that reported the inability to detect significant differences in muscle glycogen use between treatments “may be due to a lack of statistical power given the small number of subjects.”7 Within sports science, given the small sample sizes, it is plausible that any hypothesis could be tested and subsequently reported in a peer reviewed journal as positive.8 Small studies are known to be systematically biased towards finding the interventions they test effective. Researchers have previously defined “small” as fewer than an average of 100 participants in each arm.9 Yet only one of the 106 studies—in 257 marathon runners—exceeded this target.10 The next largest had 52 participants,11 and the median sample size was nine.
There is one caveat to the requirements for larger sample sizes: where the variability in the outcome measure is low. The greater the level of variability, the higher the sample size required to detect the same effect.12 By choosing homogeneous groups of athletes and physiological measures, studies aimed to reduce variability and thus negate the need for larger sample sizes; however, this means that the results apply only to these highly selected groups.13
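The arithmetic behind this trade-off can be sketched with the standard normal-approximation sample size formula for comparing two means. This is an illustrative calculation of our own, not taken from any of the reviewed studies:

```python
import math
from statistics import NormalDist

def sample_size_per_arm(delta, sigma, alpha=0.05, power=0.80):
    """Participants needed per arm to detect a mean difference `delta`
    when the outcome has standard deviation `sigma` (two-sided test,
    normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    return math.ceil(2 * ((z_alpha + z_beta) * sigma / delta) ** 2)

# A modest effect of 0.4 standard deviations needs roughly 100 per arm...
print(sample_size_per_arm(delta=0.4, sigma=1.0))   # 99
# ...but halving the outcome's variability cuts the requirement fourfold.
print(sample_size_per_arm(delta=0.4, sigma=0.5))   # 25
```

The second call illustrates why the reviewed studies favoured homogeneous groups: shrinking the variability shrinks the required sample, at the cost of generalisability.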
Poor quality surrogate outcomes undermine the validity of reported effects
A study of carbohydrate ingestion among 16 university football players with compromised glycogen stores (that is, after overnight fasting) reported a significant difference between treatments in mean points scored per shot on the Loughborough Soccer Shooting Test (LSST).14 Yet this test does not discriminate elite from non-elite players.15 And although plasma glucose concentrations and carbohydrate oxidation rates were significantly increased in 12 football players during football specific exercises, sprint power was not significantly affected by the ingestion of carbohydrate or placebo. Studies that provided no exercise or performance outcomes were even more difficult to interpret.16 17
Moreover, tests with a high coefficient of variation, such as time to exhaustion, are often misleading. Cycling to exhaustion at 75% of maximal oxygen consumption has a coefficient of variation as high as 27%, whereas time trial cycling has been shown to have a coefficient as low as 3.8%.18 Thus time trials are likely to be a more valid measure of cycling performance.19 Also, the poor reproducibility of a test such as time to exhaustion means that psychological factors may contribute significantly to performance, independent of the intervention under scrutiny. Worryingly, most performance tests used to assess sports drinks have never been validated.20 Furthermore, we are not aware of any major sporting event that uses time to exhaustion as the outcome; yet it was used in 17 of the included studies and is often used as a performance measure in sports research.
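Because the sample size required to detect a given relative effect scales with the square of the outcome's variability, the practical difference between these two tests is dramatic. A rough sketch using the coefficients of variation quoted above (our own back-of-envelope calculation):

```python
# Required sample size scales roughly with variability squared, so for
# the same relative effect a time-to-exhaustion test needs many times
# more participants than a time trial.
cv_exhaustion = 27.0   # %, cycling to exhaustion at 75% VO2max
cv_time_trial = 3.8    # %, time trial cycling

inflation = (cv_exhaustion / cv_time_trial) ** 2
print(round(inflation))  # ~50-fold more participants needed
```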
Poorly designed research offers little to instil confidence in product claims
Most studies (76%) were of low quality because of a lack of allocation concealment and blinding, and their findings often contradicted each other. One of the few trials that used a sports related outcome, long distance canoeing, shows some of the design problems.21 The study found that canoeists using a glucose syrup drink maintained a consistent lap time, whereas performance gradually worsened in the placebo group. But the “effect is masked to some extent by the fact that four of the volunteers participated in two-man kayaks.” Each two-man crew thus had one person taking the drink and the other placebo; inevitably, the times recorded for both were virtually identical, despite the difference in drinks being used. The studies also often had substantial problems because of the use of differing protocols, temperatures, work intensities, and outcomes.22
Data dredging leads to spurious statistical results
Data dredging is the inappropriate use of statistics to uncover misleading relations within the data and is common in studies where a clear protocol with a primary outcome of interest has not been defined. When glucose syrup was given to one football team before matches over a season, the number of goals scored in the second half was double that scored by the control team, with 15 of the 20 goals scored in the final half hour of the match. The goals conceded in both halves without glucose, and in the first half with glucose, were almost identical. However, the data showed a marked improvement in defensive performance, with only one goal conceded during the final 30 minutes of the 20 matches.23 Since outcomes were not defined beforehand, it is possible that the researchers also examined the last 20 minutes, the last 10, or even extra time, thus increasing the chance of a type I error (false positive) because of the number of outcome tests.
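The risk compounds quickly. Under the conventional 5% significance threshold, the chance of at least one spurious “significant” result grows with every extra outcome examined, even when no true effect exists. An illustrative calculation (assuming independent tests, not data from the study):

```python
def prob_false_positive(n_tests, alpha=0.05):
    """Chance of at least one spurious 'significant' result across
    n independent tests when no true effect exists."""
    return 1 - (1 - alpha) ** n_tests

for k in (1, 5, 10):
    print(k, f"{prob_false_positive(k):.2f}")
# 1 test   -> 0.05
# 5 tests  -> 0.23
# 10 tests -> 0.40
```

With ten unplanned outcome comparisons, the chance of a false positive is roughly 40%, which is why a pre-specified primary outcome matters.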
Biological outcomes do not necessarily correlate with improved performance
In a study of six trained soccer, hockey, or rugby players, use of muscle glycogen during high intensity running was reduced in those who had a carbohydrate drink compared with those who had a non-carbohydrate drink. Yet the average sprint times were the same for both groups.24 Several of the outcomes used are subject to higher order interactions—that is, they are affected by two or more variables.25 For example, several factors independently influence the oxidation of ingested carbohydrate, including the timing, the exercise intensity, and the type of carbohydrate. These interactions can mask the effectiveness of the intervention, making it hard to separate causation from association. In addition, physiological outcomes such as maximal oxygen consumption, used in many studies, have been shown to be poor predictors of performance, even among elite athletes.26
Inappropriate use of relative measures inflates the outcome and can easily mislead
The finding that consuming carbohydrate containing drinks during high intensity exercise “delayed fatigue and improved endurance running capacity by 33% when compared with the same volume of artificially sweetened water” sounds impressive. However, the crossover study that reported this result included just nine participants, who had fasted overnight for at least 10 hours, completed a 15 minute warm-up followed by 75 minutes of exercise, and then run to exhaustion. In one arm they consumed a carbohydrate drink (concentration 6.9%) and in the other a non-carbohydrate placebo. The absolute difference in run time to exhaustion was 2.2 minutes (8.9 min v 6.7 min). By the time of the exhaustion run, the control group had not eaten for at least 11.5 hours.27 Moreover, exclusion of the initial 75 minutes of exercise from the outcome calculation dramatically inflates the relative effect: if it is included, the difference is 83.9 min v 81.7 min, an improvement of only 3% rather than the claimed 33%. A 33% improvement in running capacity is implausible—it would be the difference between running a marathon in two hours rather than three.
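The gap between the two framings follows directly from the choice of denominator, as the numbers reported above show:

```python
# Relative improvement depends entirely on the chosen denominator.
run_with_carb = 8.9      # min, run to exhaustion, carbohydrate arm
run_with_placebo = 6.7   # min, run to exhaustion, placebo arm
preload = 75.0           # min of preceding exercise, identical in both arms

diff = run_with_carb - run_with_placebo           # 2.2 min absolute

# Denominator excludes the first 75 minutes of exercise:
print(round(100 * diff / run_with_placebo))       # 33 (%)

# Denominator includes it (83.9 v 81.7 min):
print(round(100 * diff / (run_with_placebo + preload)))  # 3 (%)
```

The absolute benefit, 2.2 minutes, is identical in both calculations; only the headline percentage changes.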
Studies that lack blinding are likely to be false
In a study where the control group drank plain water with no artificial sweetener, the endurance running capacity of nine participants was measured after a 12 hour fast, and the 6.9% carbohydrate drink was found to give better performance.28 Yet a study of high intensity running that used correct blinding—“the solutions ingested were of the same colour, texture, and taste, and were administered in a double blind fashion”—found no difference in the run time to exhaustion after 100 minutes of exercise.29 A study using the Loughborough Intermittent Shuttle-running Test (90 minutes in six blocks of 15 minutes separated by 3 minute rest periods) in which the placebo was taste matched found that mean 15 m sprint times were similar in both groups after 90 minutes and there was no effect on muscle glycogen use.30 Of the 106 studies, only 38 (36%) reported blinding, and even these gave poor descriptions of who exactly was blinded, particularly outcome assessors.
Manipulation of nutrition in the run-in phase significantly affects subsequent outcomes
Many studies seemingly starved participants the night before and on the morning of the research study. However, in one study that gave subjects breakfast,31 the high glycogen levels at the outset of exercise negated the effects of carbohydrate ingestion during exercise. This study, in eight cyclists, gave high or low glycogen food before exercise in a factorial crossover design, with either a carbohydrate or non-carbohydrate drink during exercise. Over the final period of the time trial, power output and pace were significantly lower only in the arm in which participants had low glycogen food before exercise and a non-carbohydrate drink during exercise.32 It is important that studies represent real life situations: when would an athlete fast for 12 hours before a major performance event?
Biological gradients are useful to establish the relation between cause and effect
Studying the dose-response relation reveals important differences. For example, a study analysing carbohydrate feedings and exercise performance showed that glucose solutions at a concentration of 4% or more delayed gastric emptying compared with more dilute solutions.33 The 20 g/L glucose solution was emptied at the same rate as water, whereas the 40 and 60 g/L solutions were emptied more slowly. In a crossover study of carbohydrate versus water in marathon running, a lemon drink containing 5.5% carbohydrate was the preferred option among endurance trained runners and performed better than an orange 6.9% carbohydrate drink.28
Changes in environmental factors lead to wide variation in outcomes
Fluid replacement with a dilute carbohydrate drink (2%) was found to be beneficial when exercising in heat (ambient temperature 30.2°C) compared with a 15% carbohydrate electrolyte drink, which was no better than no drink at all. The mechanisms for the reported improvements in exercise capacity were unclear, and many of the benefits suggested across trials seemingly had no plausible or known mechanism.34 Although drinking a carbohydrate electrolyte solution after a 12 hour fast induced greater metabolic changes than flavoured water and placebo solutions, endurance capacity (high intensity intermittent running in hot conditions at 30°C) was not significantly affected.35 In a cold environment (mean temperature 10°C), the effect of carbohydrate drinks (glucose concentrations of 2%, 6%, and 12%) on exercise capacity was no better than that of flavoured water, as measured by cycle rides to exhaustion.36
Studying the general population requires larger samples
Larger sample sizes are required to take account of the substantial variability in sports performance among untrained people. Of note, the study of 257 participants in the 2009 London marathon10 concluded that, in addition to sex, body size, and training, pre-race day carbohydrate intake can significantly and independently influence marathon running performance. Interestingly, there was no mention of carbohydrate feeding during the race as a factor predicting improved performance for marathon runners.
From our analysis of the current evidence, we conclude that over prolonged periods carbohydrate ingestion can improve exercise performance, but consuming large amounts is not a good strategy, particularly at low and moderate exercise intensities and in exercise lasting less than 90 minutes. There was no substantial evidence to suggest that liquid is any better than solid carbohydrate intake, and there were no studies in children. Given the high sugar content of sports drinks and their propensity to cause dental erosion, children should be discouraged from using them.37 Through our analysis of the current sports performance research, we have come to one conclusion: people should develop their own strategies for carbohydrate intake largely by trial and error.
Footnotes
Competing interests: All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf (available on request from the corresponding author) and declare: no support from any organisation for the submitted work; no financial relationships with any organisations that might have an interest in the submitted work in the previous three years, no other relationships or activities that could appear to have influenced the submitted work.
Provenance and peer review: Commissioned; externally peer reviewed.