Stroke Treatment Academic Industry Roundtable Recommendations for Individual Data Pooling Analyses in Stroke
Pooled analysis of individual patient data from stroke trials can deliver more precise estimates of treatment effect, enhance power to examine prespecified subgroups, and facilitate exploration of treatment-modifying influences. Analysis plans should be declared, and preferably published, before trial results are known. For pooling trials that used diverse analytic approaches, an ordinal analysis is favored, with justification for considering deaths and severe disability jointly. Because trial pooling is an incremental process, analyses should follow a sequential approach, with statistical adjustment for iterations. Updated analyses should be published when revised conclusions have a clinical implication. However, caution is recommended in declaring pooled findings that may prejudice ongoing trials, unless clinical implications are compelling. All contributing trial teams should contribute to leadership, data verification, and authorship of pooled analyses. Development work is needed to enable reliable inferences to be drawn about individual drug or device effects that contribute to a pooled analysis, versus a class effect, if the treatment strategy combines ≥2 such drugs or devices. Despite the practical challenges, pooled analyses are powerful and essential tools in interpreting clinical trial findings and advancing clinical care.
Scientific advancement is based on hypothesis testing and replication. Clinical trials are interpreted on this basis: a single positive trial may be encouraging for any new therapy, but 2 such trials are typically required for marketing authorization and establishment into clinical practice. For many reasons, which may include insufficient statistical power, suboptimal design, inexperience with treatment delivery, and use of prototype treatment approaches, initial clinical trials of a useful treatment may declare a falsely neutral result; however, publication bias also contributes to the trend for later trials to be positive. It has become recognized practice to pool trials to refine our assessment of the treatment effect, helping to indicate not only whether it is effective but also how effective it may be, and in which circumstances.
Typically, data are pooled at the trial level (ie, meta-analysis) or, occasionally, at the subgroup level. For example, the Cochrane review of thrombolysis for acute ischemic stroke management used trial-level data for its main analysis and considered 2 treatment time windows for subgroup analyses. Although this type of trial-level pooling is useful, it disregards potentially valuable information at the patient level that could prevent false conclusions. Taking the principal results of the Cochrane thrombolysis review, a reader may conclude that the use of intravenous alteplase is justified only if administered within 3 hours of stroke onset because the 3- to 6-hour subgroup analysis showed no significant benefit or that treatment at any time within 6 hours is justified because the primary analysis of the 0- to 6-hour data was positive.1 Conversely, a reader of the pooled analysis of individual patient data (IPD) would likely draw a different conclusion: seeing that treatment benefit is closely dependent on delay since stroke onset, and that benefit remains statistically significant until at least 4.5 hours, the reader may favor treatment beyond 3 hours but only until 4.5 hours.
Pooling of IPD also opens the way for more powerful analyses because results can be adjusted for multiple covariates (ie, variables that may influence the outcome such as time to treatment, age, stroke severity, sex, diabetes, previous stroke, and baseline neuroimaging features in the setting of thrombolysis trial data). Exploration of individual covariates in larger samples allows for a better estimate of treatment effect size in future populations and subgroups, restricts the confidence interval around these estimates, and indicates which are the important factors to consider when selecting patients for treatment. The analysis of pooled IPD releases the restrictions imposed by the individual trial protocols and publications: fresh criteria for defining subgroups and applying a common outcome measure become possible. Furthermore, subgroup analyses that are prespecified (ie, before release of trial results) and adequately powered could go beyond being hypothesis generating to achieving a new level of evidence. Individual trials may be underpowered to assess a given subgroup and, in that circumstance, a pooled analysis might bring key confirmatory data for regulatory considerations. It is acknowledged that in addition to prespecifying subgroups of interest, pooled analyses must still protect against the risks inherent in multiple testing by prespecifying the primary end point and incorporating statistical adjustment where necessary.
These advantages of IPD carry a greater burden, however. Cooperation among trialists is required and needs to be coordinated; the necessary technical skills, time, effort, and costs are increased; it is essential to understand and allow for the varied context and conditions under which data were collected across trials; and the risks of data mining become greater.
This article describes conclusions arising from a workshop held at the ninth Stroke Treatment Academic Industry Roundtable (STAIR) on October 5, 2015, in Bethesda. This workshop was designed to discuss principles that would facilitate and optimize value from pooling of stroke trial data. Participants included academic, industry, and regulatory experts and are listed in the Appendix. The approach taken to develop STAIR guidelines has been described elsewhere.2 Key recommendations are summarized in the Table.
Outcome Measure Selection
Although most acute stroke trials have chosen the modified Rankin Scale (mRS) as their principal outcome measure, and those that instead targeted vessel patency have retained mRS as a secondary measure, several analytic approaches and definitions of good results have been used. For example, these have included dichotomizing mRS score at 0 to 1 versus 2 to 6, at 0 to 2 versus 3 to 6, or even at 0 to 4 versus 5 to 6, examining a shift in distribution of the full scale, examining the distribution after combining category 5 (bedbound) with 6 (dead), or finally examining patient-centered utility of the mRS scores, which has a similar effect as the previous approach (Figure). Each has merit, but a pooled analysis may have different aims from individual randomized controlled trial objectives. A common approach is needed when combining trials if the influence of covariates is to be correctly estimated. Because switching the choice of end point may change the formal interpretation of the trial between neutral and positive, and because a common end point likely will not already exist, the collaborators planning a pooled IPD analysis must take care when prespecifying their end point.
If none of the trials to be included has already been unblinded, then any rational approach to analysis may be justified. If, however, ≥1 trial results were known, then this would influence or be perceived as influencing the choice of common end point. The least restrictive approach is needed (ie, the one that invokes fewest assumptions). It must still be an end point that is rational for the treatment being tested and useful for clinical interpretation. The available choices each have pros and cons.
Dichotomization considers the mRS in only 2 categories, such as mRS score of 0 to 1 as good outcome and mRS score of 2 to 6 as bad outcome. This end point can readily be used to assess statistical significance and to generate a measure of effect size with an associated confidence interval, can be converted easily to a number needed to treat, and is simple to explain to patients and clinicians. However, it also has 3 disadvantages. First, it may conceal harmful effects within the poor outcome stratum: for example, an increase in mortality caused by increased intracranial bleeding. This separates benefit from risk. It may be desirable to do so, particularly if the timescale for these 2 differs, such as when fatal bleeding because of treatment may be somewhat balanced by later survival gains among the less disabled survivors of treatment. Second, for many stroke trial populations, it also conceals benefits among a majority of patients who participated and were destined at best to achieve partly disabled survival (mRS scores, 2, 3, or 4). It is neither ethical to include such patients if they do not contribute usefully to interpretation of the trial nor is it statistically sensible to disregard the richness of the information that they provide; indeed, an ordinal approach to analysis typically contributes 36% more information and thus statistical power than a dichotomized approach.3 Third, dichotomization requires a combination of advance knowledge of the treatment’s effects, the case mix of the trial, and luck. Without these, the chosen cut point for dichotomization may turn out to show a smaller treatment effect than other thresholds that have been disregarded.
Although this has been discussed in the stroke literature, several recent trials retained dichotomization of primary end points and reported neutral results, whereas they would have declared positive results if different cut points or ordinal analyses had been selected as favored by the European Stroke Organisation Outcomes Working Group.4–6
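The dichotomized effect measures described above can be illustrated with a minimal Python sketch. It computes the absolute difference in good-outcome rates, the number needed to treat, and the odds ratio with a Wald confidence interval from a 2×2 table. The function name and the patient counts are hypothetical and chosen only for illustration; they are not taken from any trial.

```python
import math
from statistics import NormalDist


def dichotomized_effect(good_trt, n_trt, good_ctl, n_ctl, level=0.95):
    """Effect measures for a dichotomized outcome (eg, mRS 0-1 vs 2-6).

    Returns the absolute risk difference, the number needed to treat (NNT),
    and the odds ratio with a Wald (Woolf) confidence interval.
    Assumes every cell of the 2x2 table is non-zero.
    """
    p_trt = good_trt / n_trt
    p_ctl = good_ctl / n_ctl
    risk_diff = p_trt - p_ctl
    nnt = math.inf if risk_diff == 0 else 1 / abs(risk_diff)

    # 2x2 cells: a, b = good/bad outcome on treatment; c, d = good/bad on control
    a, b = good_trt, n_trt - good_trt
    c, d = good_ctl, n_ctl - good_ctl
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)
    return {
        "risk_diff": risk_diff,
        "nnt": nnt,
        "odds_ratio": odds_ratio,
        "ci": (ci_low, ci_high),
    }


# Hypothetical counts: 200/500 good outcomes on treatment, 150/500 on control
result = dichotomized_effect(200, 500, 150, 500)
print(result)
```

The same function can be rerun with counts obtained from a different cut point; the estimates may then change, which is the sensitivity to threshold choice that the text warns about.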
An ordinal approach also invokes certain assumptions and requires some choices, however. The first assumption of ordinality is that each step on the scale reflects a genuine improvement from the preceding step, as perceived by all relevant parties. This may not be universally accepted for mRS because in some societies and among certain age groups, survival with severe disability (bedbound, incontinent, and totally dependent; ie, mRS 5) is considered to be as bad as or even worse than death.7 This creates an argument for combining mRS categories 5 and 6 in an ordinal analysis approach.8 The second assumption, which has less importance and which does not compromise statistical analysis but has an impact on presentation of results, is that all steps are of equal value (ie, assumption of proportionality). It is evident that this assumption is violated for mRS: many patients regard the steps between mRS score of 5 and 4 (being released from bed) and from 4 to 3 (recovering independent mobility) as carrying greater value than returning to all usual activities (mRS score, 2–1) or being free from nondisabling symptoms (mRS score, 1–0). Describing a trial result by showing average improvement of a certain proportion for each mRS category is complex. The statistical approaches to ordinal analysis have some disadvantages also. We usually compare overall differences in mRS distributions between 2 treatment groups using the Cochran–Mantel–Haenszel test, which is a nonparametric test and therefore does not assume a normal distribution of the data. We then adjust for imbalances in covariates by using its van Elteren variant, but this approach requires the covariates to be categorical rather than continuous (eg, age and stroke severity must be grouped into strata). It provides a P value but expresses neither the direction of change nor the size of effect.
The test is typically followed by ordinal logistic regression to estimate the odds ratio of the treatment effect and its associated confidence interval. This second step introduces an assumption of proportionality of odds, implying that the treatment has changed the odds of moving from mRS score of 5 to mRS score of 4 by a similar amount to the odds of moving from mRS score of 3 to 2, etc. This assumption has been violated when examining thrombolysis treatment for acute stroke (K.R. Lees, unpublished data, 2016).9 It also creates a second problem: the logistic regression generates its own P value associated with the estimated confidence interval for the odds ratio, and that P value generally differs slightly from the overall P value calculated from the van Elteren test. However, logistic regression permits the use of both continuous and categorical covariates. Even so, there are several approaches to describing the effect size that do not invoke the proportionality assumption10–12 and also several circumstances where any violation of the assumption has limited impact.
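To make the proportional odds assumption concrete, the sketch below computes the treatment odds ratio at every possible cut point of the scale; under proportional odds these values would be roughly equal. It also computes a Mann–Whitney-type probability, offered here as one example of an effect measure that does not rely on that assumption, not necessarily the specific method of the cited papers. The function names and the mRS counts (with categories 5 and 6 already combined) are hypothetical.

```python
def cumulative_odds_ratios(trt, ctl):
    """Odds ratio for achieving mRS <= k (treatment vs control) at each cut point k.

    `trt` and `ctl` are lists of patient counts per ordered category,
    best outcome first. Assumes no cumulative cell is empty.
    """
    n_trt, n_ctl = sum(trt), sum(ctl)
    ratios = {}
    cum_trt = cum_ctl = 0
    for k in range(len(trt) - 1):  # the last category has no cut point above it
        cum_trt += trt[k]
        cum_ctl += ctl[k]
        ratios[k] = (cum_trt * (n_ctl - cum_ctl)) / ((n_trt - cum_trt) * cum_ctl)
    return ratios


def mann_whitney_probability(trt, ctl):
    """P(random treated patient has a better category than a random control)
    plus half the probability of a tie. A value of 0.5 means no difference."""
    wins = ties = 0
    for i, count_trt in enumerate(trt):
        for j, count_ctl in enumerate(ctl):
            if i < j:  # lower index = better outcome
                wins += count_trt * count_ctl
            elif i == j:
                ties += count_trt * count_ctl
    return (wins + 0.5 * ties) / (sum(trt) * sum(ctl))


# Hypothetical counts for mRS 0, 1, 2, 3, 4, and 5-6 combined
treated = [20, 30, 20, 15, 10, 5]
control = [10, 20, 20, 20, 15, 15]

for cut, ratio in cumulative_odds_ratios(treated, control).items():
    print(f"mRS 0-{cut} vs worse: OR = {ratio:.2f}")
print(f"Mann-Whitney probability = {mann_whitney_probability(treated, control):.4f}")
```

In this made-up example the odds ratio at the last cut point is noticeably larger than at the others, which is the kind of pattern that would prompt a closer look at the proportionality assumption before reporting a single common odds ratio.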
A further variation in ordinal approach is to adjust the weight given to mRS categories according to the perceived preference (ie, utility) for each mRS category by multiplying the number of patients within each mRS category by that utility weight, and then statistically comparing the sum of these products from each of 2 treatment arms.13,14 This approach solves several of the weaknesses of the earlier methods, but its main disadvantages are that social, geographical, and demographic factors may influence the weights given to mRS categories, and that some disabilities such as dysphasia cannot be ranked because many stroke survivors with dysphasia cannot respond to such surveys.
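The utility-weighted calculation described above can be sketched as follows. The weights in `ILLUSTRATIVE_WEIGHTS` are assumed values for the purpose of the example and should not be read as a validated set; as the text notes, appropriate weights may differ by population. The function name and patient counts are likewise hypothetical.

```python
# Assumed utility weights for mRS 0..6 (illustration only, not a validated set)
ILLUSTRATIVE_WEIGHTS = [1.0, 0.91, 0.76, 0.65, 0.33, 0.0, 0.0]


def mean_utility(counts, weights=ILLUSTRATIVE_WEIGHTS):
    """Mean utility-weighted mRS for one arm.

    Multiplies the number of patients in each mRS category by that
    category's utility weight, sums the products, and divides by arm size.
    """
    if len(counts) != len(weights):
        raise ValueError("counts and weights must cover the same categories")
    total = sum(counts)
    return sum(n * w for n, w in zip(counts, weights)) / total


# Hypothetical patient counts for mRS 0..6 in each arm
treated = [20, 30, 20, 15, 10, 3, 2]
control = [10, 20, 20, 20, 15, 8, 7]

difference = mean_utility(treated) - mean_utility(control)
print(f"Mean utility, treated: {mean_utility(treated):.4f}")
print(f"Mean utility, control: {mean_utility(control):.4f}")
print(f"Difference: {difference:.4f}")
```

Because categories 5 and 6 carry the same assumed weight here, this calculation treats them as equivalent, which is why the text describes the approach as having a similar effect to combining those categories. A formal comparison between arms would still require a statistical test on the patient-level utilities.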
In considering all of these issues, the STAIR workshop participants concluded that a standard methodology for pooled analyses would be desirable. An ordinal approach should generally be favored for an IPD pooled analysis in which contributory trials have varying end points, because this has greater statistical power and reduced reliance on assumptions around the nature of the treatment effect. The participants also favored collapsing mRS categories 5 and 6 because this better reflected perceived value of the steps. Although there was considerable enthusiasm for the utility-weighted approach, it was considered still to be less validated and subject to geographic or cultural biases. The participants noted that, although an ordinal approach for distribution of mRS (where mRS scores 5 and 6 are combined) should be the primary approach, results should also be converted to the utility-weighted and dichotomized approaches for descriptive purposes. Finally, they also noted that their conclusion should not restrict the analytic approaches of individual trials, where different considerations may apply.
A pooled IPD analysis must also harmonize the timing of final assessment used for its principal analysis although the choice here is less controversial, less under control of the trialists, and possibly may have less impact on interpretation. The latest common assessment that is available in all trial data sets should be used, recognizing the usual convention that recovery is unlikely to have stabilized before 3 months. For example, the Stroke Thrombolysis Trialists’ Collaboration chose to accept outcomes at 3 months for 8 trials, but at 6 months for a ninth trial in their pooled analysis, rather than to describe outcome at 1 month or earlier.15
Just as interim analysis of a single trial for efficacy or futility influences the probability of reaching a final positive result and thus requires reduction of the final P value threshold for significance, assimilation over time of trial data sets to a pooled IPD analysis must be recognized as a sequential approach. Even if the protocol for a pooled analysis were published in advance of unblinding of any of its contributory trials, specifying the number, size, and identity of the trials that will be included before a result will be announced, it is conceivable that a further trial will be created later to extend, confirm, or refine some aspect of its findings. Pooled analysis is a continual process. The participants at STAIR recommend that a sequential analysis approach be taken to control for the potential bias generated when analysis may be undertaken repeatedly, on an expanding sample. The statistical analysis plan for the TREAT (Thrombectomy and tPA) collaborators’ pooling of the thrombectomy trials describes an appropriate approach.8 Bayesian approaches were also suggested, and further work in this area is needed to consider the advantages of one over the other.
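One common way to adjust for repeated looks is an alpha-spending function. The sketch below uses the Lan–DeMets O'Brien–Fleming-type spending function as an illustration; it is not claimed to be the method in the TREAT statistical analysis plan. The information fractions are hypothetical and assume that a planned maximum sample size can be specified, which may be difficult for an open-ended pooling project.

```python
import math
from statistics import NormalDist


def obf_alpha_spent(info_fraction, alpha=0.05):
    """Cumulative two-sided type I error spent at a given information fraction,
    using the Lan-DeMets O'Brien-Fleming-type spending function:
        alpha(t) = 2 * (1 - Phi(z_{1 - alpha/2} / sqrt(t)))
    """
    if not 0 < info_fraction <= 1:
        raise ValueError("info_fraction must be in (0, 1]")
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return 2 * (1 - NormalDist().cdf(z / math.sqrt(info_fraction)))


def incremental_alpha(info_fractions, alpha=0.05):
    """Alpha newly spent at each successive look.

    Note: these increments are not the nominal P value thresholds for each
    look; deriving exact boundaries requires numerical integration over the
    joint distribution of the test statistics.
    """
    increments = []
    spent_before = 0.0
    for t in info_fractions:
        spent = obf_alpha_spent(t, alpha)
        increments.append(spent - spent_before)
        spent_before = spent
    return increments


# Hypothetical looks after 25%, 50%, 75%, and 100% of the planned sample
looks = [0.25, 0.5, 0.75, 1.0]
for t, inc in zip(looks, incremental_alpha(looks)):
    print(f"t = {t:.2f}: cumulative = {obf_alpha_spent(t):.5f}, new = {inc:.5f}")
```

This spending function allocates very little alpha to early looks and preserves most of it for later analyses, so an early pooled result must be very strong to be declared significant.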
This requirement to adjust for potential repeated looks at the data applies to not only an overall result from the pooled analyses (is treatment effective or not?) but also, perhaps more importantly, subgroups, which likely will expand at different rates because trials vary in their case mix. Furthermore, for subgroups, especially, a sample size calculation should be described. This need not restrict analysis before attainment of that sample, but will assist in interpretation of neutral findings for such subgroups. Again, the TREAT statistical analysis plan covers both issues.8
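As a rough illustration of a subgroup sample size calculation, the sketch below applies the standard normal-approximation formula for comparing 2 proportions. It assumes a dichotomized outcome for simplicity, whereas an ordinal primary analysis would need a different calculation; the function name and the good-outcome rates are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_arm_two_proportions(p_ctl, p_trt, alpha=0.05, power=0.80):
    """Patients needed per arm to detect p_trt vs p_ctl (two-sided test).

    Normal approximation with unpooled variances:
        n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    if p_ctl == p_trt:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance_sum = p_ctl * (1 - p_ctl) + p_trt * (1 - p_trt)
    n = (z_alpha + z_power) ** 2 * variance_sum / (p_trt - p_ctl) ** 2
    return math.ceil(n)


# Hypothetical subgroup: 30% good outcome on control, 40% on treatment
required = n_per_arm_two_proportions(0.30, 0.40)
print(f"Required per arm: {required}")
```

Comparing the required number with the subgroup size actually accrued indicates whether a neutral subgroup finding reflects a likely absence of effect or simply insufficient data. If the analysis is also sequential, the significance level passed as `alpha` would need to reflect the adjusted threshold for that look.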
Third, because of variations in trial design, timescale, and geographical location, there will likely be variation in treatment effects that clusters within trials. Pooling IPD allows powerful analyses of individual factors that contribute to variation, but the analysis approach should still stratify by trial to control for possible heterogeneity between trials. This was done by the Stroke Thrombolysis Trialists’ Collaboration and is planned by TREAT investigators.8,9
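Stratification by trial can be illustrated with the Mantel–Haenszel pooled odds ratio, which combines the within-trial 2×2 tables rather than collapsing all patients into one table. This is a simplified sketch for a dichotomized outcome without covariates, not the regression-based approach a full IPD analysis would likely use; the function names and all counts are hypothetical.

```python
def mantel_haenszel_or(tables):
    """Mantel-Haenszel odds ratio stratified by trial.

    Each table is (good_trt, bad_trt, good_ctl, bad_ctl) for one trial:
        OR_MH = sum(a * d / n) / sum(b * c / n)
    """
    numerator = denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


def crude_or(tables):
    """Odds ratio after collapsing all trials into a single 2x2 table
    (ignores trial membership; shown only for comparison)."""
    a = sum(t[0] for t in tables)
    b = sum(t[1] for t in tables)
    c = sum(t[2] for t in tables)
    d = sum(t[3] for t in tables)
    return (a * d) / (b * c)


# Hypothetical trials: (good_trt, bad_trt, good_ctl, bad_ctl)
trials = [
    (40, 60, 30, 70),     # trial 1, 100 patients per arm
    (90, 110, 70, 130),   # trial 2, 200 patients per arm
]
print(f"Stratified (Mantel-Haenszel) OR: {mantel_haenszel_or(trials):.4f}")
print(f"Crude (collapsed) OR:            {crude_or(trials):.4f}")
```

In this balanced example the 2 estimates are close, but they can diverge when allocation ratios or baseline outcome rates differ across trials, which is the situation stratification is meant to guard against.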
The sequential nature of such pooled IPD analyses leads to a question over timing of publication of results. There are arguments in favor of lodging such papers in an accessible online repository each time that they are updated, but the STAIR participants recommend that formal publication in a peer-reviewed journal be considered each time that a fresh analysis produces a finding that may change clinical management. Reporting should follow PRISMA-IPD (Preferred Reporting Items for Systematic Reviews and Meta-Analyses–Individual Patient Data) recommendations.16
There may be conflict about pooled outcomes as a group treatment effect (eg, thrombectomy by any reasonable means improves outcome) versus device- or drug-specific effects (eg, thrombolysis via recombinant tissue-type plasminogen activator, but not streptokinase, is effective). There is a need to define circumstances in which the scientific community, and regulators, should accept a group effect. This may be reasonable if each component drug or device in isolation shows a point estimate for the effect above a certain threshold, but it is uncertain whether the absence of significant heterogeneity is sufficient. In circumstances of a new treatment, a noninferiority analytic approach may be taken. This could be extended to allow for a drug-by-drug (or device-by-device) comparison of individual effects against the pooled effect of the remaining treatments. There is a need for development work in this area, to consider technical aspects and to formulate guidance on managing such exploratory analyses.
It would be ideal if the planned interpretation in this regard were published in advance of any trial result being known, and certainly preferable that it should be decided in advance of any pooled analysis. If plans are not prespecified, then any heterogeneity within the result will need cautious interpretation.
Sharing of Data
Ideally, all data sets would be collected in a common format and would be shared immediately on conclusion of each trial. In practice, neither is realistic. Trials are individually designed and require time to publish individual primary and secondary results. Common data elements for National Institute of Neurological Disorders and Stroke trials have been defined elsewhere,17 but are variably observed. Data are stored and shared in varied formats using diverse definitions for each variable. A substantial part of the work of pooling involves understanding each trial properly. This requires a skilled, stroke-experienced statistician working in close collaboration with the original investigators of each trial. These original investigators should also meet and collaborate actively in the writing of protocols for pooled analysis and in the interpretation of findings. The trial protocols, statistical analysis plans, manuals of procedures, and case report forms should be shared to aid interpretation of the data set. It is not sufficient to send a file with data to the pooling group and hope that they will correctly understand the documents from an individual study.
The timing of data sharing presents another challenge. Trial investigators must have an opportunity to present and publish the primary and planned secondary analyses of their study without compromising this intellectual property by releasing raw data to the public domain or having the research questions answered from a pooled source beforehand. The pooling collaboration should be able to offer firm guarantees that shared data will be used only for the approved pooled analyses and will not be released to a third party without previous agreement, and that pooled analyses that compete with the trialists’ existing plans will not be released in advance of their individual publication. At some later point, these issues become less relevant, particularly for government-sponsored and investigator-initiated trials. For example, National Institutes of Health–funded clinical trials are required to have data-sharing plans on initiation to ensure timely release of data to the public.18 More broadly, the Institute of Medicine recommends public release of data associated with the primary publication of the trial results within 6 months of primary publication, and the full data set no longer than 18 months after study completion (unless the data are part of a regulatory application).19 Even so, the STAIR participants recognized that IPD analyses should be undertaken as a joint, collaborative venture for scientific and political reasons, and these rules about data do not directly guarantee cooperation.
More complex is the situation in which a pooled data set may already answer a question that is being tackled specifically by an ongoing or planned trial, and may thereby compromise completion of that trial. For example, an analysis of IPD from recently published thrombectomy trials may indicate an apparent relation of treatment benefit to time elapsed from stroke onset. At the same time, ongoing trials are examining late time windows. The pooling collaborators must consider the merits of such cases, taking into account the relative size of the data sets, the timescale over which the ongoing trial(s) may be completed, the clinical impact of any early announcement and the ethical dimension. Potential conflicts of interest among investigators must be handled carefully. These questions are similar to issues that regularly face independent data monitoring committees.
A collegiate spirit and recognition of colleagues’ contributions and concerns are also required for leadership and authorship purposes. Pooling projects require representation from every contributing trial. These representatives should ideally be in place even during the planning phase although there must be a mechanism to add contributors when new trials become available. It is a good principle that authorship should also have ≥1 representative from each trial that contributes data to the collaboration, even if a small writing group will draft the articles, and it is desirable that the author byline should refer to each of the component trial groups, with a listing of their steering committees in an appendix. Pooled analyses can have considerable academic impact, and it would be unreasonable for the original authors of the contributing trials not to share in the final reports. Pooled analyses should not be undertaken by independent groups without full participation of the original trialists, for both academic and practical reasons. Some of these, for example, relating to checks of data integrity, are reflected in the PRISMA-IPD statement.16
A further challenge arises from the contribution of funders. Sponsorship of research should merit access to output from pooled analyses at an early stage but should influence neither the design of the analysis, nor the interpretation of findings, nor the timing of publication. Handling of subsets of data and of drug- or device-specific analyses, as discussed earlier, may need cautious consideration if these would have commercial implications.
Pooling of IPD from individual clinical trials provides the power to determine treatment effects with more precision, especially within subgroups, and to explore modifiers of treatment effect. To be unbiased, detailed analysis plans should be declared before trial results are available. An ordinal analysis with a sequential approach, with statistical adjustment for each iteration, is favored. All contributing trial teams should contribute to leadership, data verification, and authorship of pooled analyses. With careful planning and collaborative approaches, pooled analyses can meaningfully and rapidly advance clinical care for our stroke patients by providing supportive data and new observations.
Stroke Treatment Academic Industry Roundtable (STAIR) IX Collaborators: Andrei V. Alexandrov, Andrew Bivard, Johannes Boltze, Joseph P. Broderick, Bruce C.V. Campbell, Francis M. Creighton, David Fiorella, Anthony J. Furlan, Philip B. Gorelick, David C. Hess, Won-Ki Kim, Lawrence L. Latour, David S. Liebeskind, Marie Luby, Patrick Lyden, John Kylan Lynch, Randolph S. Marshall, Bijoy K. Menon, Keith W. Muir, Yuko Palesch, Helen Peng, Kent E. Pryor, J Mocco, Peter Rasmussen, Ralph L. Sacco, Lee H. Schwamm, Eric E. Smith, Yoram Solberg, Achala Vagal, Steven Warach, Lawrence R. Wechsler, Max Wintermark, Albert J. Yoo, Kay M. Zander.
We thank Gary Houser for his invaluable help in organizing the Stroke Treatment Academic Industry Roundtable conference.
K.R. Lees chairs the Virtual International Stroke Trials Archive (VISTA) collaboration and the European Stroke Organisation Outcomes Working Group, is a member of the Stroke Thrombolysis Trialists’ Collaboration, the ThRombEctomy And tPA (TREAT) Collaboration and reports receipt of fees and expenses from American Stroke Association, Applied Clinical Intelligence, Atrium, Boehringer Ingelheim, EVER NeuroPharma, Hilicon, Nestle, Novartis, Stroke Academic Industry Roundtable, University of Lancaster; and research funding to the University of Glasgow and to the Virtual International Stroke Trials Archive from Genentech. Dr Khatri is a member of the TREAT Collaboration within VISTA; reports payment to University of Cincinnati Department of Neurology for her research efforts from National Institutes of Health/National Institute of Neurological Disorders and Stroke (StrokeNET National Coordinating Center Co-PI and Regional Coordinating Center PI), Genentech, Inc (PRISMS Lead PI), and Penumbra, Inc (THERAPY Neurology PI); and receives fees from Grand Rounds Experts, Inc (online clinical consultations), UpToDate, Inc (royalties), and medicolegal consultations.
Guest Editor for this article was Giuseppe Lanzino, MD.
- Received January 29, 2016.
- Revision received May 12, 2016.
- Accepted May 27, 2016.
- © 2016 American Heart Association, Inc.
- 2. STAIR Consensus Process. STAIR Consensus Conferences: Stroke Treatment Academic Industry Roundtable (STAIR) web site. http://www.thestair.org/. Accessed November 23, 2015.
- Bath PM, Lees KR, Schellinger PD, Altman H, Bland M, Hogg C, et al.
- Sandercock P, Wardlaw JM, Lindley RI, Dennis M, Cohen G, Murray G, et al.
- MacIsaac RL, Khatri P, Bendszus M, Bracard S, Broderick J, Campbell B, et al.
- Emberson J, Lees KR, Lyden P, Blackwell L, Albers G, Bluhmki E, et al.
- Howard G, Waller JL, Voeks JH, Howard VJ, Jauch EC, Lees KR, et al.
- Rahlfs VW, Zimmermann H, Lees KR.
- Churilov L, Arnup S, Johns H, Leung T, Roberts S, Campbell BC, et al.
- Hong KS, Saver JL.
- Chaisinanunkul N, Adeoye O, Lewis RJ, Grotta JC, Broderick J, Jovin TG, et al.
- 15. Stroke Thrombolysis Trialists’ Collaboration. Details of a prospective protocol for a collaborative meta-analysis of individual participant data from all randomized trials of intravenous rt-PA vs. control: statistical analysis plan for the Stroke Thrombolysis Trialists’ Collaborative meta-analysis. Int J Stroke. 2013;8:278–283.
- Stewart LA, Clarke M, Rovers M, Riley RD, Simmonds M, Stewart G, et al.
- Saver JL, Warach S, Janis S, Odenkirchen J, Becker K, Benavente O, et al.
- 18. Availability of Research Results: Publications, Intellectual Property Rights, and Sharing Research Resources. NIH Grants Policy Statement web site. https://grants.nih.gov/grants/policy/nihgps_2013/nihgps_ch8.htm#_Toc271264947. Accessed January 1, 2016.
- 19. Institute of Medicine (IOM) Committee on Strategies for Responsible Sharing of Clinical Trial Data. IOM website. http://www.iom.edu/activities/research/sharingclinicaltrialdata.aspx. Accessed January 1, 2016.