In Anticipation of International Stroke Trial-3 (IST-3)
Few medical advances have polarized physicians so powerfully as thrombolytic therapy for stroke.1 After years of delays, this highly efficacious treatment fully entered mainstream clinical use with the publication of the European Cooperative Acute Stroke Study (ECASS)-3 study; annually we see increasing rates of use globally.2 Along the way, many self-appointed “expert critiques” challenged the original trials and sought to insinuate that the therapy was less effective than primary data indicated. Seeking to use data and science to quell some of the hostility toward recombinant tissue-type plasminogen activator (rtPA), a highly dedicated and rigorous group of investigators at Edinburgh organized a trial in the late 1990s named the International Stroke Trial (IST).3 Because 2 previous trials shared the name IST—1 of aspirin for stroke and the other for a neuroprotectant—the trial became known as the IST-3. Originally planned for a sample size of 6000 patients, IST-3 sought to become the largest acute stroke trial ever done.4 After a decade and a half, the IST-3 study group has thrown in the towel approximately halfway short of their finish line, and will soon publish the results of their truncated trial. The investigators have been refreshingly transparent throughout and have posted on their web site their statistical analysis plan, their protocol, and the baseline characteristics of their study sample (www.dcn.ed.ac.uk/ist3/). These documents lead to a number of questions about IST-3 and how—no matter what the results—thoughtful clinicians might anticipate using the IST-3 data in clinical decision-making.
A number of design issues in IST-3 deserve thoughtful review before the results are seen. The study leadership invokes a concept known as “large simple” trials, meaning trials that seek power by simplifying protocols and reducing data collection to a bare minimum in hopes of enhanced recruitment. Large simple trials may be an answer to the continuing difficulty getting clinical trials finished on time and on budget. The first draft of the IST-3 protocol, therefore, included a centralized randomization scheme, masked placebo treatment, and blinded outcomes assessment.4 The trial was indeed simple, but for a variety of reasons, recruitment was slow and only 244 patients were enrolled at 16 centers. Industry support for the trial evaporated, and the investigators were forced to curtail their infrastructure significantly. The authors reformulated the trial and scraped together sources of further funding. The revised study closed in March 2011 after enrolling 3035 patients from 156 hospitals in 12 countries. Although the trial is only half as large as planned, it is in fact the largest acute stroke thrombolysis trial yet completed; the trial investigators believe a revised recruitment target of 3100 would yield 80% power to detect an absolute difference of 4.7% in the primary outcome, a dichotomized modified Rankin score at 6 months. This truncated sample potentially has power to address a few critical subgroup hypotheses such as efficacy in the elderly or in patients presenting later.
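The power statement above can be checked with a rough calculation. The following is a minimal sketch, not the investigators' own method: it uses a normal approximation for comparing two proportions and assumes equal allocation, a 2-sided alpha of 0.05, and a control-arm rate of favorable outcome of 30%. The 30% figure and the function names are illustrative assumptions; the control-arm rate is not stated in this article.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_proportion_power(p_control: float, abs_diff: float,
                         n_per_arm: int, z_alpha: float = 1.959964) -> float:
    """Approximate power of a 2-sided test comparing two proportions.

    Uses the unpooled normal approximation and ignores the negligible
    probability of rejecting in the wrong direction.
    """
    p_treated = p_control + abs_diff
    se = math.sqrt(p_control * (1.0 - p_control) / n_per_arm
                   + p_treated * (1.0 - p_treated) / n_per_arm)
    return normal_cdf(abs_diff / se - z_alpha)


# ASSUMPTION: 30% of control patients reach the favorable outcome.
ASSUMED_CONTROL_RATE = 0.30
ABS_DIFF = 0.047  # absolute difference quoted in the text

# Revised target of 3100 patients, split equally between arms.
power_truncated = two_proportion_power(ASSUMED_CONTROL_RATE, ABS_DIFF, 1550)
# Original plan of 6000 patients, split equally between arms.
power_planned = two_proportion_power(ASSUMED_CONTROL_RATE, ABS_DIFF, 3000)

print(f"power at n=3100: {power_truncated:.3f}")
print(f"power at n=6000: {power_planned:.3f}")
```

Under these assumptions the revised target lands close to the 80% quoted by the investigators, and the originally planned sample would have had considerably more power for the same difference. A different assumed control rate would shift both numbers, so the sketch only illustrates the order of magnitude.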
When reorganized, the trial became an unblinded trial with a design that sounds rigorous: “prospective, randomized, open, blinded end point” (PROBE). The investigators have written that their end point is blinded because a centralized researcher, unaware of the patient's treatment group, contacts the patient or family member 6 months after stroke and attempts to derive a modified Rankin Scale from a questionnaire or telephone call. Within certain boundaries, mail or phone contacts can yield a valid estimate of the modified Rankin Scale that would have been derived during an in-person visit5,6; the IST-3 investigators did an excellent job of rigorously handling these outcome data. The reader should keep in mind, however, that the patient or family member may be aware of the patient's treatment, because there was no masked placebo infusion. A “placebo” effect could contaminate the subject's recall of events as well as their estimation of the patient's abilities. Worse, having been randomized to “no treatment,” the patient's motivation during rehabilitation may change in unpredictable ways. Last, with loss of blinding, the control group changed from “no treatment” to early administration of aspirin, a drug associated with a small early benefit. Reassuringly, all images in IST-3 have been read by a blinded central observer, so outcomes based on imaging will have meaning free from any recall bias.
Another nuance embedded in the IST-3 trial design is the ambiguity of the prespecified inclusion and exclusion criteria. Although the perils of allowing site investigators to “cherry-pick” the best (or worst) candidates for clinical trial enrollment are well known, the IST-3 investigators were forced by events to allow this bias into their trial; in 2003, the drug was approved for use in Europe, mooting the need for their controlled trial. Rather than fold up their tents, the investigators reasoned that not all investigators in all countries were ready to implement aggressive stroke teams and thrombolytic therapy. Again, a design compromise was formulated in rigorous-sounding terms: the “uncertainty principle.” After listing the inclusion and exclusion criteria from the approved label, the textual definition of the IST-3 uncertainty principle is “Further inclusion and exclusion criteria are not specified precisely but are guided by the uncertainty principle (or absence of proof for that particular patient). If, for whatever reason, the clinician is convinced that a patient fulfilling the above criteria should be treated, the patient should be treated with rtPA and NOT randomized. If the clinician is convinced that a patient should not be treated (for whatever reason), the patient should NOT be included in the trial. 
Patients should only be randomized if they fulfill the eligibility criteria AND the clinician is substantially uncertain about the balance of risks and benefits of rtPA for that individual.” Investigator selection bias has doomed many previous trials, for example, the Extracranial–Intracranial (EC-IC) bypass study in which surgeons skimmed off the patients most likely to respond to therapy and operated on them outside the protocol.7 As it turns out, 350 patients (13%) were enrolled into the IST-3 trial who could have been treated with rtPA under currently accepted license criteria, but this figure is a moving target: regulatory bodies in different countries approved the drug at different time points throughout the trial. Because of funding constraints, no data are available regarding the number of patients treated with rtPA outside the study who did not meet the license criteria.
Despite the potentially confounded end point blinding and the squishy patient selection methods, much can be forgiven of a trial that seeks to address a critical public health issue using “brute force,” that is, a very large trial. Although these design problems should be borne in mind at the time of reading the IST-3 results, such flaws do not necessarily render the IST-3 useless. Rather, the astute reader must place the results in context and adapt clinical decision-making with appropriate caution. There were, however, forces at work during the IST-3 of much greater significance.
Fortunately for our patients, but unfortunately for clinical trialists, medical progress continues during our trial recruitment. Secular trends in disease management proceed outside of the clinical trial protocol and site investigators adopt new treatment approaches in an unpredictable—and potentially an undocumented—manner. During the 12 years of IST-3 recruitment, a number of wonderful and powerful changes occurred in stroke management such as the widespread adoption of statin therapy, the routine use of aspirin immediately after stroke, and an appreciation of the need for tight blood pressure control. The use of aggressive rehabilitation after stroke also evolved significantly during IST-3. To some extent, such confounds can be adjusted for statistically, but only if the statistician is aware of and can quantify the confound. In large, simple trials, the investigators cannot collect detailed follow-up information. Therefore, there is no way to ascertain the extent to which IST-3 patients were influenced by great secular changes. Although it is true that such confounds should occur equally in both treatment arms, such balance might not happen given that the treatment arms were not blinded. Patients who got rtPA might be perceived as either more or less in need of aggressive poststroke prevention therapy or rehabilitation. Furthermore, adoption of these new prevention strategies certainly varied among countries and among study sites, introducing considerable “noise” into the follow-up period.
As medical management of patients with stroke evolved during IST-3, so did the acceptance of thrombolytic therapy itself. Indeed, the IST-3 protocol helped bring experience with thrombolytic therapy to many countries and centers for the first time. Of the enrolled patients, 1891 (62%) were enrolled at centers without any previous experience using rtPA, and the investigators are deservedly proud of this public health benefit from their trial. As a large, simple trial, the IST-3 could not formally train and certify sites uniformly in proper use of the drug, but extensive training materials were offered through a web site. The effect of site inexperience on protocol adherence and adverse events cannot be estimated from the available data. Previous publications documented that failure to adhere to the protocol results in higher rates of serious adverse events,8 so the risk profile in IST-3 should be placed in context: some sites may have violated the protocol to an undocumented extent.
Simultaneous with wider use of intravenous rtPA, intra-arterial interventional treatments and application of angiography or perfusion imaging were evolving. The use of interventional techniques at participating sites could not be documented in IST-3, so the reader cannot interpret the effect of this secular trend on the results. For example, in some centers, patients presenting too late for standard therapy, 1 of the key IST-3 target groups, were preferentially diverted to intervention. Removal of these patients from IST-3 could serve to inflate the response rates in both treated and control groups because the diverted patients might be expected to do worse than patients presenting earlier. Selection for treatment by more sophisticated imaging techniques such as perfusion mismatch may also have diminished the pool of “responsive” trial patients. Similar confounds might result from systematically diverting more severe patients or older patients to intervention.
The IST-3 protocol, attempting to generate interest and enthusiasm among prospective and inexperienced trial sites, included several deviations from standard care of the thrombolyzed patient with stroke. For example, based on preclinical data, the National Institute of Neurological Disorders and Stroke (NINDS) investigators limited the blood pressure allowed during thrombolytic therapy to 185 mm Hg systolic and 110 mm Hg diastolic.9,10 Instead of following the NINDS blood pressure limits, the IST-3 protocol allowed site discretion: “The randomization system will only accept patients with systolic BP between 90 to 220 mm Hg and diastolic BP between 40 to 130 mm Hg. Although these data provide some guidance, the decision about whether or not to include a patient with persistently high levels of blood pressure in the trial must rest with the physicians' judgment.” As a large, simple trial, the IST-3 could not collect detailed information on blood pressure during and for 24 hours after rtPA administration. The results of IST-3 safety outcomes should be interpreted accordingly.
Many clinical trialists believe that source verification of some clinical trial data assures safety, accuracy, and validity of the trial data. Authorities do not agree on the minimum quantity of verified data to assure validity (100%, half, 10% sample).11 Although clinical trial fraud occurs rarely, an on-site monitoring visit reduces the likelihood that fraudulent data entered the trial. Far more importantly, however, a monitoring visit allows the sponsor to document protocol adherence in each site; confirm capture of all adverse events; and correct investigator or coordinator misunderstandings. Large simple trials sacrifice monitoring rigor in favor of enhanced trial recruitment and greater trial efficiency. The IST-3 trialists used a sampling technique, the extent of which cannot be determined from available publications. It remains unclear whether the extent and nature of monitoring in IST-3 were sufficient to infer data integrity, but there is no evidence to suggest any problems with the data set due to limited monitoring.
Given these considerations, the data from IST-3 could be interpreted with certain caveats in mind. The main (ie, overall) effect across 0 to 6 hours is of limited interest: we know rtPA is useful when administered earlier and to appropriate patients. Instead, the critical issue is whether there is independent benefit among patients treated after 4.5 hours. The trial includes 1007 (33%) patients treated in this late window, almost as many again as in the existing pooled data set for this epoch. Although perhaps underpowered, these data could provide a critical foundation for mounting further trials in this time cohort. On the other hand, the early closure due to slow enrollment reduces power from planned levels, especially within subgroups; important clinical benefit may be missed. A significant overall benefit in the entire population treated between 0 and 6 hours could be detected, but this should not be interpreted as implying benefit for treatment out to 6 hours without great caution. Instead, the 4.5- to 6-hour subgroup should be examined separately to check for an independent benefit. Furthermore, there is a statistical issue that, although arcane, may prove critical: typically an effect of time delay is tested for using an interaction term combining treatment delay time and treatment. Interpretation of this interaction could be problematic because patients with established indications for thrombolysis were likely excluded from the early-onset subgroups (<3 hours initially, then 3–4.5 hours) but not later (4.5–6 hours), so the estimated slope of the decay in benefit or increase in harm with increasing onset-to-treatment time may be flattened. 
To interpret treatment effects within the 3- to 4.5-hour subgroup, it would seem critical to ensure that patients treated after publication of ECASS-3 and revision of European Stroke Organisation (ESO) treatment guidelines are presented separately in a sensitivity analysis; it is possible that participating centers changed their treatment policies shortly after the revised guidelines appeared in print. Also, the trial was conducted at a time when some centers were using sophisticated mismatch imaging for treatment selection; in such centers, patients with mismatch may have been treated openly and only those without mismatch randomized into IST-3. Thus, data from the 4.5- to 6-hour subgroup in IST-3 may underestimate benefit or overestimate risk of treatment, or both.
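The treatment-by-time interaction discussed above can be written out explicitly. As a sketch only, assuming a logistic model for the dichotomized outcome (the symbols are illustrative and are not taken from the IST-3 statistical analysis plan), let $Y_i = 1$ denote a favorable outcome for patient $i$, $T_i = 1$ allocation to rtPA, and $d_i$ the onset-to-treatment time:

```latex
\mathrm{logit}\,\Pr(Y_i = 1) = \beta_0 + \beta_1 T_i + \beta_2 d_i + \beta_3 \,(T_i \times d_i)
```

In this formulation the log odds ratio for treatment at delay $d$ is $\beta_1 + \beta_3 d$, so a decay in benefit with time corresponds to $\beta_3 < 0$. If patients most likely to benefit are selectively treated outside the trial in the earlier windows, the estimated effect at small $d$ is pulled toward the null while the effect at large $d$ is not, which biases the estimate of $\beta_3$ toward zero. That is the flattened slope described in the text.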
Beneficial treatment effect in the elderly is of interest, but again must be examined carefully. Elderly patients were well represented: 1407 (46%) of the patients were aged 81 to 90 and 210 (7%) were aged >90 years. Previous data from the NINDS trial, which included patients approximately 50 years of age, suggested a diminished but definite treatment benefit.12 Nonrandomized assessments have suggested benefit is maintained in the elderly.13,14 An important check will be whether the elderly patients enrolled in IST-3 represent all (unselected) elderly or whether, like the younger patients, they may represent a group chosen because they fall outside treatment guidelines. Use of alteplase in the elderly is not currently licensed in Europe but this restriction is widely disregarded, as illustrated by the ESO guidelines and Safe Implementation of Treatments in Stroke (SITS) registry; it may be difficult to establish whether the participating IST-3 centers followed the marketing approval conditions or worked to the ESO guidance, but it seems likely that this varied among centers. The true benefits in the elderly will be underestimated if a selection policy was applied that randomized only the patients in whom treatment was considered likely to be less safe or effective.
First and foremost, the IST-3 investigators succeeded in recruiting the largest acute stroke thrombolysis study population ever, and they deserve to be congratulated for accomplishing this milestone despite severe budgetary limitations. IST-3 may confirm a beneficial treatment effect of thrombolytic therapy in elderly patients with stroke, assuming the lack of blood pressure control and onset-to-treatment delay do not confound the results. The IST-3 data may also clarify risk and benefit of other so-called “off-label” use such as use in very mild patients. If confounds such as lack of blinding or selection bias prove overwhelming, at least IST-3 may point the way to more rigorous controlled trials of these interesting subgroups. Significantly, the IST-3 results may provide helpful insight into the effect of site experience on thrombolytic therapy. The IST-3 results will demonstrate through benefit and safety data the effect of inexperience on thrombolytic therapy benefit and risk. Perhaps most interesting, the trial will illuminate the benefit and safety of thrombolytic therapy administered 4.5 to 6 hours after stroke onset.
One thing that the IST-3 results cannot do is reaffirm or refute prior trials of thrombolytic therapy. The protocol—being large, simple, efficient, and streamlined—is just too different from previous work. All prior trials were rigorous, including unbiased patient selection, blinding, and scrupulous follow-up. Furthermore, the use of rtPA in “the real world” is very different from the IST-3 protocol, and so the results do not illustrate the effect of rtPA in a real-world setting. We anticipate the IST-3 results with great interest, and we congratulate the authors on persevering in the face of nearly insurmountable obstacles.
- Received March 28, 2012.
- Revision received April 3, 2012.
- Accepted April 5, 2012.
- © 2012 American Heart Association, Inc.
- Lyden PD
- Yamaguchi T, Mori E, Minematsu K, del Zoppo GJ
- Lyden P, Broderick J, Mascha E, Group Nr-PSS, Yamaguchi T, Mori E, et al
- Katzan I, Hammer M, Furlan A
Office of Communications CfDEaR. Guidance for Industry. Oversight of Clinical Investigations—A Risk-Based Approach to Monitoring. Silver Spring, MD: U.S. Department of Health and Human Services, Food and Drug Administration, Center for Drug Evaluation and Research, Center for Biologics Evaluation and Research, Center for Devices and Radiological Health; 2011. 2012.
The National Institute of Neurological Disorders and Stroke rtPA Stroke Study Group. Generalized efficacy of tPA for acute stroke. Stroke. 1997; 28: 2119.
- Wahlgren N, Ahmed N, Eriksson N, Aichner F, Bluhmki E, Davalos A, et al
- Mishra NK, Ahmed N, Andersen G, Egido JA, Lindsberg PJ, Ringleb PA, et al