Contributed equally to this work with: Tianjing Li, Tsung Yu. E-mail: tli19@jhu.edu. Affiliation: Center for Clinical Trials and Evidence Synthesis, Department of Epidemiology, Johns Hopkins Bloomberg School of Public Health, Baltimore, Maryland, United States of America
Randomized crossover trials are clinical experiments in which participants are assigned randomly to a sequence of treatments and each participant serves as his/her own control in estimating treatment effect. We need a better understanding of the validity of their results to enable recommendations as to which crossover trials can be included in meta-analysis and for development of reporting guidelines.
To evaluate the characteristics of the design, analysis, and reporting of crossover trials for inclusion in a meta-analysis of treatment for primary open-angle glaucoma and to provide empirical evidence to inform the development of tools to assess the validity of the results from crossover trials and reporting guidelines.
We searched MEDLINE, EMBASE, and Cochrane’s CENTRAL register for randomized crossover trials for a systematic review and network meta-analysis we are conducting. Two individuals independently screened the search results for eligibility and abstracted data from each included report.
We identified 83 crossover trials eligible for inclusion. Issues affecting the risk of bias in crossover trials, such as carryover, period effects and missing data, were often ignored. Some trials failed to accommodate the within-individual differences in the analysis. For a large proportion of the trials, the authors tabulated the results as if they arose from a parallel design. Precision estimates properly accounting for the paired nature of the design were often unavailable from the study reports; consequently, to include trial findings in a meta-analysis would require further manipulation and assumptions.
The high proportion of poorly reported analyses and results has the potential to affect whether crossover data should or can be included in a meta-analysis. There is pressing need for reporting guidelines for crossover trials.
Citation: Li T, Yu T, Hawkins BS, Dickersin K (2015) Design, Analysis, and Reporting of Crossover Trials for Inclusion in a Meta-Analysis. PLoS ONE 10(8): e0133023. https://doi.org/10.1371/journal.pone.0133023
Editor: Lamberto Manzoli, University of Chieti, ITALY
Received: December 30, 2014; Accepted: June 22, 2015; Published: August 18, 2015
Copyright: © 2015 Li et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper and its Supporting Information files.
Funding: The project was funded by Grant 1 RC1 EY020140 and Grant U01-EY020522, National Eye Institute, National Institutes of Health, United States.
Competing interests: The authors have declared that no competing interests exist.
Randomized crossover trials are clinical experiments in which participants are assigned randomly to a sequence of treatments and each participant serves as his/her own control in estimating treatment effect [1,2]. For example, in an AB/BA design, the simplest form of a randomized crossover trial, participants are assigned randomly to either treatment A followed by treatment B, or treatment B followed by treatment A (Fig 1). Because both treatments are evaluated for the same individual, the treatment effect can be estimated based on an average of within-individual differences (Fig 1, Tables 1 and 2) [1–3]. Given this property, a crossover trial can theoretically achieve the same precision as a parallel group trial with only half the sample size. The required sample size is reduced further because outcomes measured in the same individual generally have a smaller variance than outcomes measured between individuals [1,2].
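The sample-size claim can be made concrete with a short derivation. This is a sketch under simplifying assumptions that are not stated in the original: a continuous outcome with a common variance \sigma^2 under both treatments, a correlation \rho between the two measurements taken on the same participant, and no carryover or period effects. The symbols N_par and N_cross (total participants in a parallel group trial and in a crossover trial) are introduced here for illustration only.

```latex
\begin{align*}
\operatorname{Var}\bigl(\hat{\delta}_{\text{par}}\bigr)
  &= \frac{\sigma^2}{N_{\text{par}}/2} + \frac{\sigma^2}{N_{\text{par}}/2}
   = \frac{4\sigma^2}{N_{\text{par}}}, \\
\operatorname{Var}\bigl(\hat{\delta}_{\text{cross}}\bigr)
  &= \frac{\operatorname{Var}(Y_A - Y_B)}{N_{\text{cross}}}
   = \frac{2\sigma^2(1-\rho)}{N_{\text{cross}}}, \\
\operatorname{Var}\bigl(\hat{\delta}_{\text{cross}}\bigr)
  = \operatorname{Var}\bigl(\hat{\delta}_{\text{par}}\bigr)
  &\iff \frac{N_{\text{cross}}}{N_{\text{par}}} = \frac{1-\rho}{2}.
\end{align*}
```

With \rho = 0 the crossover trial needs half as many participants as the parallel group trial; any positive within-individual correlation reduces the requirement further, which is the second gain described above.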
Fig 1. Illustration of the design and analysis of a crossover trial.
Carryover effect: If A is an active intervention and B is a placebo, then the BA sequence is unlikely to be affected by a carryover effect, but the AB sequence is potentially susceptible. In the AB sequence, when some effect of the active intervention A is carried over to the second period, placebo could demonstrate artificial “effectiveness”. Under this scenario, the treatment effect of A compared to B would be under-estimated for the AB sequence, and so for both sequences combined [7]. Thus, if there are differential carryover effects in the two treatment sequences, the design can yield biased estimates of the treatment effect [1–4].
Washout period: To minimize a possible carryover effect between periods in a crossover trial, investigators use a “washout” phase that is sufficiently long to eliminate the first intervention’s effects [1, 2]. Although some researchers have recommended estimating and testing for the carryover effect, and when the effect is present, analyzing data collected from the first period only, this method has been shown to lead to biased estimates of effect [9]. Senn and others have taken the position that the crossover design should be used only when the assumption that there is a minimal carryover effect is likely to hold [1]. In such cases, instead of testing for carryover effect, one proceeds as if there were none. There also is the ethical consideration with using a washout period in participants with a chronic condition; in such cases, giving no treatment may not be in a participant’s best interests.
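The carryover scenario described in the Fig 1 caption can be illustrated with a small deterministic calculation. This is a minimal sketch, not the authors' analysis: it assumes an additive carryover, equal-sized sequences, no period effect, and no random noise. The function name and the numbers (a true effect of -4.0 and a carryover of -1.0, in arbitrary outcome units) are hypothetical.

```python
def expected_estimates(true_effect, carryover_ab=0.0, carryover_ba=0.0):
    """Expected A-minus-B estimates for an AB/BA crossover trial.

    Outcomes are expressed relative to the placebo (B) level, taken as 0.
    Returns (AB-sequence estimate, BA-sequence estimate, combined estimate),
    assuming equal numbers of participants in the two sequences.
    """
    # AB sequence: A in period 1; B in period 2 plus any effect carried over from A.
    ab_on_a = true_effect
    ab_on_b = 0.0 + carryover_ab
    # BA sequence: B in period 1; A in period 2 plus any effect carried over from B.
    ba_on_b = 0.0
    ba_on_a = true_effect + carryover_ba
    est_ab = ab_on_a - ab_on_b
    est_ba = ba_on_a - ba_on_b
    return est_ab, est_ba, (est_ab + est_ba) / 2


# Hypothetical case: A lowers the outcome by 4 units, and 1 unit of that
# effect persists into period 2 of the AB sequence (placebo has no carryover).
est_ab, est_ba, combined = expected_estimates(-4.0, carryover_ab=-1.0)
print(est_ab, est_ba, combined)
```

In this hypothetical case the AB sequence yields a smaller apparent effect than the BA sequence, so the combined estimate is pulled toward zero, matching the direction of bias described in the caption.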
Table 1. Analysis of a crossover trial–an illustrative example.
Table 2. Results of the illustrative crossover trial presented in Table 1.
Several aspects of crossover trial design are critical to the potential risk of bias in the findings and interpretation. The first design consideration is that treatment from one period may have a residual effect that persists into the subsequent period, particularly when there is no “washout” between periods [1,2]. This is called a carryover effect (Fig 1). The second consideration is a period effect, which can occur when the treatment effect is not constant over time resulting in treatment by period interaction [1,2]. Period effect is more likely to occur when the treatment periods are long and when the underlying medical condition is not stable. Third, dropouts and missing data usually have a larger impact on crossover trials than on parallel group trials because missing data from one period preclude the within-individual comparison for all who enrolled in the trial [4]. Finally, there are situations in which the crossover design is inappropriate; for example, when the treatment in an earlier period (e.g., a vaccine) permanently alters the course of the condition such that, on entry to the next period, the participant characteristics systematically differ from their initial state [1–3].
The analysis and reporting of crossover trials should also account for the paired nature of the design [3] (Tables 1 and 2). This means that the treatment effect and associated precision are calculated based on within-individual treatment comparisons so that the potential gains in precision and statistical efficiency by choosing a crossover design are realized.
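The difference between a paired analysis and a parallel-style analysis of the same data can be sketched as follows. The data are hypothetical (six participants, each measured once on treatment A and once on treatment B), the function names are illustrative, and the sketch deliberately ignores sequence and period; it is not a reconstruction of Tables 1 and 2.

```python
import math
import statistics

# Hypothetical outcomes for six participants, one value per treatment.
on_a = [18.0, 21.0, 17.0, 24.0, 20.0, 22.0]
on_b = [20.0, 22.0, 19.0, 27.0, 21.0, 25.0]


def paired_summary(a, b):
    """Mean within-individual difference (A minus B) and its standard error."""
    diffs = [x - y for x, y in zip(a, b)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs), se


def unpaired_summary(a, b):
    """Difference in treatment means with a standard error that treats the
    two sets of measurements as independent groups (parallel-style)."""
    diff = statistics.mean(a) - statistics.mean(b)
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return diff, se


paired_diff, paired_se = paired_summary(on_a, on_b)
unpaired_diff, unpaired_se = unpaired_summary(on_a, on_b)
print(f"paired:   {paired_diff:.2f} (SE {paired_se:.3f})")
print(f"unpaired: {unpaired_diff:.2f} (SE {unpaired_se:.3f})")
```

With complete data the two point estimates agree, but in this example the paired standard error is much smaller, because between-participant variability cancels out of the within-individual differences.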
We became interested in the problem of crossover trials in the context of a systematic review and network meta-analysis we undertook, which identified a large number of trials that used a crossover design. We did not want to eliminate them from our analysis, as this would mean wasted information. On the other hand, we were faced with the challenge of deciding which of the trials should be included in the network meta-analysis and how. Our objective was to evaluate the design, analysis, and reporting characteristics of these crossover trials and provide empirical evidence to inform the development of tools to assess the validity of the results from crossover trials and reporting guidelines for crossover trials.
We examined randomized crossover trials eligible for a systematic review and network meta-analysis that we are conducting on the comparative effectiveness of medical interventions for ocular hypertension and open-angle glaucoma. The main inclusion criteria of the systematic review were: randomized controlled trials (RCTs) that assigned to each treatment ≥10 participants of any age or gender with physician-diagnosed ocular hypertension or primary open-angle glaucoma; and trials comparing at least one medical intervention with no treatment/placebo or another medical intervention. We set no maximum or minimum limit on the duration of treatment; however, we included only trials in which participants had been followed for ≥28 days after randomization.
We searched the Cochrane Register of Controlled Trials (CENTRAL) in The Cochrane Library, MEDLINE, and EMBASE on November 17, 2009 following a search strategy that was published previously [5]. Two individuals independently assessed titles and abstracts and then full text articles to identify eligible RCTs. Two individuals working independently identified the crossover trials within the total group for this study. We resolved disagreements between the two reviewers through discussion or consultation with a third person.
For each included crossover trial, two individuals (at least one with statistical expertise) independently abstracted data using an electronic data collection form developed, pilot-tested, and maintained in the Systematic Review Data Repository (S1 File), adapting some data items from a previous study of crossover trials [6]. We resolved disagreements between the two reviewers through discussion or consultation with a third person.
For each trial report, we recorded the rationale provided by the authors for using a crossover design, information on number of interventions being compared, sample size calculation, statistical analysis methods stated in the methods section of a report, and whether a washout period was used. We reviewed whether carryover and period effects were mentioned anywhere in the report and how the two effects were addressed in the data analysis of treatment effects. When change from baseline was reported as an outcome metric, we abstracted information on which baseline was used for calculating change (i.e., “before the start of the first treatment” or “after the completion of one treatment and before the start of the next treatment”). We also abstracted information on whether and how the investigator dealt with missing data and how results were reported. We assessed whether it was possible to calculate precision of effect that accounts for the paired nature of the design when not reported by the investigators, so that the study could be included in a meta-analysis.
We tabulated the number and proportion of trials reporting each of these characteristics. All analyses were conducted using STATA 13 (StataCorp. 2013. Stata Statistical Software: Release 13. College Station, TX: StataCorp LP.).
We identified 83 crossover trials (82 publications) eligible for inclusion in our systematic review; these trials constitute 16% of all eligible studies for our planned systematic review and network meta-analysis within this time period.
In terms of design characteristics, only a small fraction of the crossover trial investigators (5%, 4/83) provided a rationale for using a crossover design (Table 3). A large majority of the trials (88%, 73/83) examined two treatments. A pre-planned sample size calculation was reported for about half of the trials (54%, 45/83). Fewer than one half (41%, 34/83) reported using a washout period before the next treatment was started; a further 16% (13/83) stated why a washout period was not needed.
Table 3. Reported design characteristics of included crossover trials (n = 83).
The methods used for data analysis are of critical importance for those interpreting the findings (Table 4). Almost all trials (99%, 82/83) used data from more than one treatment period to estimate the treatment effect. However, only three-quarters of the trials (76%, 63/83) stated that the analysis accounted for the crossover nature of the design, that is, that each participant served as his or her own control. Ten percent (8/83), 17% (14/83), and 14% (12/83) of trials mentioned testing for the presence of carryover effect, attempted to deal with it in the analysis, or commented on it in the discussion section, respectively. Similarly, 18% (15/83), 23% (19/83), and 10% (8/83) of trial reports mentioned testing for the presence of period effect, attempted to deal with it in the analysis, and discussed it, respectively.
Table 4. Reported analysis characteristics of included crossover trials (n = 83).
Of the 54 trials that reported analyzing change in intraocular pressure from baseline, more than half (56%, 30/54) used the value measured before the start of the first treatment for calculating the change; a quarter of them (26%, 14/54) used the value measured after the completion of one treatment but before the start of the next treatment; one trial used both (2%); and the remaining trials (17%, 9/54) did not report clearly what had been used as the baseline value. Of the 62 trials with missing data, a large proportion (84%, 52/62) used complete case analysis (i.e., removed all participants with missing outcome data from the analysis). Only 2% (2/83) of trial reports included a patient flow diagram, which would have clarified questions about missing data.
Almost three-quarters (72%, 60/83) of the trials presented outcome data as if they arose from a parallel group design. That is, instead of reporting the summary statistics (point estimate and precision estimate) of the within-individual difference with respect to an outcome, the investigators summed outcome measurements for a treatment from all participants across sequences [3,4]. For example, outcomes for treatment A were averaged across both sequence periods and outcomes for treatment B were averaged, and then the two averages were compared. An example of this type of inappropriate reporting of outcome data in crossover trials can be found in Table 2 from the publication of Konstas et al. [18]. This way of reporting ignored the paired nature of the design. A point estimate calculated this way is valid only when there are no missing data (i.e., the mean of differences equals the difference in means), but the estimate is less precise than the estimate calculated using the appropriate method. We also came across cases in which the reporting retained the paired nature of the design by examining the within-individual difference with respect to an outcome. However, the reporting was still incomplete because the treatment effect, the average of the outcome data for the two treatment sequences, was not reported (see Table 2 of Harasymowycz et al. [19]).
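The point about missing data can be checked with a small example. This is a minimal sketch using hypothetical values for four participants; `None` marks a missing measurement, and the two function names are illustrative.

```python
import statistics


def difference_in_means(a, b):
    """Parallel-style tabulation: average all available values per treatment."""
    a_obs = [x for x in a if x is not None]
    b_obs = [x for x in b if x is not None]
    return statistics.mean(a_obs) - statistics.mean(b_obs)


def mean_of_differences(a, b):
    """Paired tabulation: average A-minus-B over participants with both values."""
    diffs = [x - y for x, y in zip(a, b) if x is not None and y is not None]
    return statistics.mean(diffs)


# Complete data: the two calculations give the same point estimate.
on_a = [18.0, 21.0, 17.0, 24.0]
on_b = [20.0, 22.0, 19.0, 27.0]
print(difference_in_means(on_a, on_b), mean_of_differences(on_a, on_b))

# The fourth participant is missing the measurement on treatment B:
# the two calculations no longer agree.
on_b_missing = [20.0, 22.0, 19.0, None]
print(difference_in_means(on_a, on_b_missing), mean_of_differences(on_a, on_b_missing))
```

In the second case the parallel-style calculation still uses the fourth participant's value on treatment A, whereas the paired calculation drops that participant entirely, so the two point estimates diverge.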
In our sample, almost all trials (94%, 78/83) reported a point estimate of treatment effect (Table 5), yet only one quarter of them (23%, 19/83) reported a standard deviation, a standard error, or a confidence interval on the estimated treatment effect that accounted for the paired nature of the design; one half of them (51%, 42/83) reported results of a hypothesis test for the treatment effect that accounted for the pairing (a t-statistic or a p-value from a paired sample t-test), and 5% (4/83) reported individual patient data.
Table 5. Reporting of results of included crossover trials (n = 83).
A meta-analyst may decide to include or exclude crossover trials from a meta-analysis depending on the presumed assumptions made and approaches taken (Table 6). To include a study in a meta-analysis, one would need a point estimate (e.g., relative risk, mean difference) and associated precision of the point estimate (e.g., standard error, confidence interval). In our sample, only 60% (50/83) of trials reported these two data elements for inclusion in a meta-analysis without further assumptions and mathematical manipulations; that proportion decreased to 31% (26/83) if only crossover trials that used a washout period were considered appropriate for inclusion. Mathematical manipulation includes calculating precision of the point estimate using individual patient data when available or assuming a certain degree of correlation between the two measurements taken on the same individual [7]. Meta-analysts may also choose to use the data from the first period only; 19% (16/83) of trials in our sample would contribute data to the meta-analysis were we to follow this approach.
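One form of the mathematical manipulation mentioned above, approximating the paired standard error from per-treatment summary statistics, can be sketched as follows. The sketch uses the standard variance formula for a difference of two correlated measurements; the function name, the input values, and the candidate correlations are hypothetical, and the correlation is an assumption that the meta-analyst must supply and justify.

```python
import math


def approximate_paired_se(sd_a, sd_b, n, r):
    """Approximate SE of the mean within-individual difference.

    sd_a, sd_b: reported standard deviations of the outcome on each treatment
    n:          number of participants contributing both measurements
    r:          ASSUMED correlation between the two measurements on one person
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and 1")
    var_diff = sd_a ** 2 + sd_b ** 2 - 2.0 * r * sd_a * sd_b
    # Guard against a tiny negative value caused by floating-point rounding.
    return math.sqrt(max(var_diff, 0.0) / n)


# Sensitivity of the approximated SE to the assumed correlation
# (hypothetical trial: SD of 3.0 on each treatment, 20 participants).
for r in (0.0, 0.3, 0.5, 0.7):
    print(r, round(approximate_paired_se(3.0, 3.0, 20, r), 3))
```

Setting r to 0 reproduces the standard error of a parallel-style analysis, so reporting results across a range of plausible correlations shows how much the conclusions depend on the assumption.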
Table 6. Number of crossover trials that would be included in a meta-analysis, assuming inclusion based on different design and analysis characteristics (n = 83).
Up-to-date systematic reviews and meta-analyses are an important way of summarizing the current status of information about treatment effectiveness and safety. In preparing for a systematic review of medical interventions for ocular hypertension and open-angle glaucoma, we found that a large number of eligible trials are crossover trials. For some disciplines, including ophthalmology subspecialties, crossover trials may be encountered quite often in the literature [8]. We believe that data from these trials are critical to presenting summary information, and that excluding them would waste research information. As far as we know, the topic of inclusion of crossover trials into a meta-analysis has not been addressed with empirical data. In our own review, we struggled to decide which studies sufficiently minimized bias to merit inclusion and which results were based on paired analysis. We examined critical characteristics of the 83 eligible crossover trials and report on them here to facilitate further discussion. Our goal is to contribute to developing guidance and reporting standards for future investigators and systematic reviewers on areas of potential concern.
We found that the crossover design is attractive to investigators but easily can be misused. This has implications for our evidence base as a whole since the results may be of limited value to meta-analysts due to inappropriate analysis and inadequate reporting. In our sample, authors of only a few trials discussed the prerequisites of the crossover design. For example, there was limited information with regard to whether the underlying disease was likely to have a constant intensity during all treatment periods; the authors infrequently explored or discussed whether the effect of the treatment was likely to be restricted to the period in which it was applied (minimal carryover effect). Furthermore, some trials failed to accommodate the within-individual differences in the analysis, losing the statistical efficiency offered by the design. For a large proportion of the trials, the authors tabulated the results as if they arose from a parallel design. The precision estimates that had properly accounted for the paired nature of the design were often unavailable from the study reports; consequently, to include their findings in a meta-analysis would require further manipulation and assumptions.
We provide the following recommendations.
Absence of reporting guidelines may help to explain the inadequate and sometimes misleading reporting we observed in our sample. A CONSORT extension for reporting crossover trials is under development, which will be useful for journal editors as well as investigators. In addition to the above-mentioned issues specific to crossover trials, other elements described in the CONSORT statement for randomized controlled trials should also be carefully described [15]. Adequate reporting is also helpful for assessing the risk of bias of crossover trials [16].
In addition to disseminating possibly misleading information on the effects of interventions, poor reporting of crossover trials has negative downstream consequences. It precludes full use of crossover data in meta-analyses. Methods exist to transform and impute missing information so that crossover trials could be included [7,16]. For example, one can approximate the paired analysis by assuming a certain degree of correlation between two measurements taken on the same individual. When a carryover effect cannot be ruled out, one can use data collected from the first period in a meta-analysis (which might be biased) [9]. Yet as shown in this paper and demonstrated in the literature, most of these methods rely on assumptions and additional data manipulation, unnecessary steps when the reporting is accurate, complete, and appropriate to the design.
This multifaceted project is continuing. A future step will be to evaluate the impact of including different sets of trials in the meta-analysis. We are in the process of publishing our main network meta-analysis on the comparative effectiveness of first-line topical medications for open-angle glaucoma. We will be interested to see whether the relative effect estimates and rankings will change depending on whether trials meeting criteria f, g, or h in Table 6 are included in the network meta-analysis. Criterion h is the most stringent one, and restricting analysis to this set of trials would be the least biased in theory.
Current practice of including crossover trials in a meta-analysis varies. Elbourne and colleagues examined 184 systematic reviews that mentioned including crossover trials. They found that 17% of them excluded crossover trials from the meta-analysis, about half used data from the first period of the trial only, and a third included data from both periods as though a parallel group design had been used; only one review (1%) incorporated the paired data into the meta-analysis [7]. A more recent study by Lathyris and colleagues had similar findings: only 1/33 meta-analyses they examined stated that the paired data had been incorporated into the meta-analysis [17]. Thus, the crossover design is not well understood by most authors conducting systematic reviews. Because the methods for including crossover trials into a meta-analysis may not be familiar to the usual systematic reviewer, we recommend working with a statistician. When the data from crossover trials cannot be incorporated fully into a systematic review and meta-analysis, full benefit of the trial is not realized.
In conclusion, the value of crossover trials to clinicians, their patients, and systematic reviewers depends on the appropriateness of the design and conduct as well as the quality of reporting. There is pressing need for reporting guidelines for crossover trials. Guidance is needed if we are to incorporate crossover trial findings into meta-analyses.