(Stroke. 2002;33:1545.)
© 2002 American Heart Association, Inc.
Original Contributions
From the University Department of Neurology, Institute of Neurological Sciences, Southern General Hospital, Glasgow, Scotland.
Correspondence to K.W. Muir, MD, South Glasgow University Hospital, NHS Trust, University Department of Neurology, Institute of Neurological Sciences, 1345 Govan Rd, Glasgow G51 4TF, Scotland. E-mail k.muir{at}clinmed.gla.ac.uk
| Abstract |
|---|
Methods: Stroke subtypes and their individual outcomes in neuroprotective trial control populations were used to derive models incorporating accuracy of clinical classification and probability of an ischemic penumbra. With the use of treatment effect sizes from successful trials (predominantly of reperfusion therapies), sample sizes for neuroprotective trials were calculated. The potential influence of altered recruitment strategies was explored.
Results: The proportion of informative patients in 2 large neuroprotective trials was probably only 27% to 30%. Optimistically, this proportion may be 50%; pessimistically, it may be only 17%. These figures necessitate a sample size of 3700 to 4500 subjects per group; at best, 1800 to 2200 are needed per group with optimistic assumptions about treatment effect. Strategies to enhance the proportion with tissue substrate for neuroprotection could reduce sample size to 500 per group and simultaneously reduce the total number of patients screened compared with inclusive trials.
Conclusions: Population heterogeneity alone may be sufficient to explain negative neuroprotective trials: even in the largest trials to date, sample size is inadequate to detect effect sizes equivalent to those seen with thrombolysis, and the trials may have been severely underpowered. Reliable trials with inclusive entry criteria may be too large to be commercially feasible for novel compounds. Both sample size and total number of patients needing to be screened should be reduced by restricting entry to patients more likely to have a tissue target.
Key Words: controlled clinical trials, neuroprotection
| Introduction |
|---|
Pathophysiological heterogeneity is of particular relevance to neuroprotection. There are 2 principal groups of patients in whom neuroprotective therapies are unlikely to be effective: (1) those lacking a biological substrate relevant to the mode of action of the drug (eg, N-methyl-D-aspartate antagonists influence neuronal cell body survival but probably have no effect on white-matter injury,4 and excess glutamate release may play a role in cortical infarction but not lacunar stroke5) and (2) those lacking an ischemic penumbra, which seems likely to be of restricted volume in most patients but with wide interindividual variation6 and is probably absent in some stroke types such as intracerebral hemorrhage7 and lacunes. It is possible that a third group, those who do not reperfuse, also lacks a biological substrate for neuroprotection. By concentrating on early histological outcome, animal studies have left uncertainty over whether many neuroprotective drugs effect lasting reduction in infarct volume, especially in the absence of reperfusion, or simply prolong viability until delayed reperfusion occurs.
This article explores the potential influence of the inclusion of patients lacking a tissue target for neuroprotective drugs in clinical trials, with implications for trial design.
| Methods |
|---|
Tissue Targets
Imaging data do not support the existence of a penumbra around a primary intracerebral hematoma (PICH).7 There is also no biological target for most neuroprotective agents in lacunar strokes, which are characterized by white-matter ischemia, result mostly from end-artery disease, and by definition have no collateral flow for either penumbral conditions or drug delivery. It is therefore assumed that patients with intracerebral hematoma or lacunar strokes are not amenable to neuroprotection.
Clinical Misclassification
Imaging studies emphasize the limited sensitivity of clinical diagnosis in the acute phase of stroke, particularly in distinguishing lacunar from partial middle cerebral artery (MCA) syndromes.8-10 Models therefore assume that a proportion of clinically diagnosed lacunar syndromes will in fact result from partial MCA occlusion and vice versa. Similar misclassification of large MCA syndromes that are subsequently found to be restricted is assumed on the basis of existing data.9
These assumptions were applied to different stroke subtypes in 3 models called realistic, optimistic, and pessimistic (Table 1). Diffusion-perfusion mismatch on MRI (DWI-PWI mismatch) was assumed to correspond to a penumbra for practical purposes and is documented in 70% of patients with MCA occlusion and 30% of patients with patent MCAs within 6 hours of onset.11 These rates have been applied to the clinical syndromes of complete MCA infarction (total anterior circulation syndromes [TACS] by the Oxfordshire Community Stroke Project [OCSP] classification12) and partial MCA infarction (partial anterior circulation syndromes [PACS] by OCSP).
Effect Size
Estimates of effect size expressed as relative risk reduction (RRR) and 95% confidence intervals (95% CIs) were derived from 3 positive stroke trials published to date: the National Institute of Neurological Disorders and Stroke Recombinant Tissue Plasminogen Activator (NINDS rtPA) trial,13 Prolyse in Acute Cerebral Thromboembolism II (PROACT II) trial,14 and Stroke Treatment With Ancrod Trial (STAT).15 These are detailed in Table 2. Data from NINDS part 2 only were usable for the dichotomous outcome using a Barthel Index (BI) <55/100 to signify poor outcome because part 1 data for this end point were not published. Therefore, CIs for this end point are probably broader in this model than is truly the case. Neuroprotective trials have generally defined death and disability as BI <60/100, whereas the 3 positive trials have defined poor outcome as BI <95/100. Data for both definitions were examined. Effect sizes for pooled data appear to be consistent, with an RRR of 12% (95% CI, 10 to 16) for BI <55 to 65 and 15% (95% CI, 12 to 19) for BI <95. In case of a detrimental influence of the smaller effect size in STAT, pooled effect sizes were also calculated for the 2 thrombolysis trials: these were an RRR of 9% (95% CI, 6 to 12) for BI <55 to 65 and 18% (95% CI, 14 to 24) for BI <95. An RRR of 10% to 20% therefore seems most plausible.
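To illustrate how an RRR and its 95% CI can be obtained from trial counts, the following Python sketch uses the standard log relative risk (Katz) interval. The article does not state which interval method was used, so the method, the function name `rrr_with_ci`, and the example counts are assumptions for illustration and are not data from Table 2.

```python
import math
from statistics import NormalDist


def rrr_with_ci(events_treated, n_treated, events_control, n_control, level=0.95):
    """Return (RRR, lower limit, upper limit) for a poor-outcome event.

    Assumption: the CI is built on the log relative risk (Katz method);
    the source article does not specify its method.
    """
    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control

    # Standard error of log(RR) for two independent binomial samples.
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated
        + 1 / events_control - 1 / n_control
    )
    z = NormalDist().inv_cdf(0.5 + level / 2)

    rr_low = math.exp(math.log(rr) - z * se_log_rr)
    rr_high = math.exp(math.log(rr) + z * se_log_rr)

    # RRR = 1 - RR, so the limits swap places.
    return 1 - rr, 1 - rr_high, 1 - rr_low


# Hypothetical counts: 40/100 poor outcomes on treatment, 50/100 on control.
rrr, lower, upper = rrr_with_ci(40, 100, 50, 100)
print(f"RRR {rrr:.1%} (95% CI, {lower:.1%} to {upper:.1%})")
```

With counts this small the interval is wide and crosses zero, which is why pooling across trials narrows the plausible range.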
Sample Size
Sample size calculations assumed 90% power and 2-tailed significance at P=0.05.
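The stated power and significance level can be turned into a per-group sample size with the usual normal-approximation formula for comparing two proportions. The sketch below shows one common form of that formula; the article does not specify which formula or software was used, and the function name `n_per_group` and the example inputs (a 50% control event rate) are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(control_rate, rrr, power=0.90, alpha=0.05):
    """Per-group sample size to detect a relative risk reduction `rrr`.

    Assumption: two equal groups, 2-tailed test, normal approximation
    to the difference of two binomial proportions.
    """
    p_control = control_rate
    p_treated = control_rate * (1 - rrr)
    p_pooled = (p_control + p_treated) / 2

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 2-tailed critical value
    z_beta = NormalDist().inv_cdf(power)

    numerator = (
        z_alpha * math.sqrt(2 * p_pooled * (1 - p_pooled))
        + z_beta * math.sqrt(
            p_control * (1 - p_control) + p_treated * (1 - p_treated)
        )
    ) ** 2
    return math.ceil(numerator / (p_control - p_treated) ** 2)


# Hypothetical control event rate of 50%; compare two candidate effect sizes.
for effect in (0.20, 0.10):
    print(f"RRR {effect:.0%}: {n_per_group(0.50, effect)} per group")
```

Because the required size grows roughly with the inverse square of the absolute risk difference, halving the RRR roughly quadruples the number of patients needed.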
Control Event Rates
Combined event rates in the control groups are given in Table 2. Event rates in stroke subtypes were derived from those in the Chlomethiazole Acute Stroke Study (CLASS).8 Event rates in patients with confirmed proximal MCA occlusion were derived from the PROACT II trial.14
Alternative Recruitment Strategies
The effect of 4 different strategies was explored: (1) eliminating PICH by pretreatment CT, (2) restricting recruitment to complete MCA syndromes (TACS), (3) restricting recruitment to patients with DWI-PWI mismatch, and (4) restricting recruitment to patients with MCA occlusion confirmed on imaging. Strategy 1 removes 10% to 20% of patients from 6-hour window trials (Figure 1). Strategy 2 restricts recruitment to about one third of current trial populations, but up to 30% of patients may be misclassified and a further 23% may spontaneously recanalize.16 Strategy 3 restricts recruitment to 20% to 56% of patients with anterior circulation stroke confirmed to involve the MCA territory,11,16 probably 70% of those clinically diagnosed as such.16 Strategy 4 may restrict recruitment to 5% of patients if conventional angiography forms the basis for diagnosis but may include 40% to 60% of anterior circulation strokes (about two thirds of current trial populations) if alternative techniques such as MR angiography,16,17 CT angiography, and transcranial Doppler ultrasound are included.18 MCA proximal occlusion correlates closely with the presence of PWI-DWI mismatch and lesion volume expansion.11,17,19,20 Alternative recruitment strategies were explored through the use of the realistic model estimates for tissue targets and event rates modified to correspond to the scenarios described. The sample size for different RRRs was calculated for each, as was the proportion of screened patients likely to be eligible for each scenario using the study flow data from PROACT II (Figure 2).
Two large neuroprotective clinical trial populations were considered in each of the model situations: those in CLASS and in the Glycine Antagonist in Neuroprotection International (GAIN-I) trial.21 The GAIN trial reported proportions of patients with intracerebral hemorrhage (18.6%) and lacunar stroke (18%) but did not specify cortical stroke subtypes by OCSP; therefore, 30% have been assumed to be PACS and 33% to be TACS.
| Results |
|---|
The influence of the proportion of informative patients in a trial on sample size is shown in Figure 3, with calculations for different effect sizes.
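The dilution that drives this relationship can be sketched as follows. The Python example assumes that treatment lowers the poor-outcome rate only in informative patients and leaves uninformative patients unchanged. The function name `diluted_effect` and the event rates are hypothetical; the article's own subtype-specific rates are in tables not reproduced here.

```python
def diluted_effect(informative_fraction, rrr_informative,
                   rate_informative, rate_uninformative):
    """Return (overall control event rate, overall RRR) for a mixed population.

    Assumption: uninformative patients (no tissue target) get no benefit,
    so their event rate is the same in both arms.
    """
    f = informative_fraction
    control = f * rate_informative + (1 - f) * rate_uninformative
    treated = (
        f * rate_informative * (1 - rrr_informative)
        + (1 - f) * rate_uninformative
    )
    return control, 1 - treated / control


# Hypothetical: 20% RRR in informative patients, equal 50% event rates.
for fraction in (0.17, 0.30, 0.50, 1.00):
    control, overall = diluted_effect(fraction, 0.20, 0.50, 0.50)
    print(f"informative {fraction:.0%}: overall RRR {overall:.1%}")
```

When event rates are equal in both subgroups, the overall RRR is simply the informative fraction multiplied by the RRR in informative patients, so a trial with 30% informative patients must detect an effect less than one third the size of the true effect in responders.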
Alternative entry criteria limit eligibility and restrict recruitment to varying proportions of those screened (Table 5). Judging from PROACT II screening figures, up to 30% of patients may be eligible for a 6-hour window neuroprotective trial, but restriction to patients with DWI-PWI mismatch on MRI or MCA occlusion reduces this to 5% to 7% (Figure 2). The 30% figure represents a very optimistic result in the experience of most stroke centers. The number of patients needing to be screened for each patient eligible is therefore 3 for current inclusive entry criteria and increases to 15 to 18 for screening on the basis of imaging correlates of a penumbra. However, because the proportion of informative patients is higher for imaging-based trials, fewer patients must be screened for this type of trial than for conventional trials: 12 600 using conventional inclusion criteria and 20% RRR compared with 5100 for DWI-PWI mismatch.
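The trade-off between a smaller sample size and a lower eligibility rate can be sketched with simple arithmetic. In the Python example below, the function name `total_screened` and the per-group sizes are hypothetical and are not intended to reproduce the totals quoted above, which depend on values in Table 5.

```python
import math


def total_screened(n_per_group, eligible_fraction, groups=2):
    """Patients who must be screened to enroll `groups * n_per_group`.

    Assumption: every eligible patient is enrolled, so the number screened
    per enrolled patient is 1 / eligible_fraction.
    """
    return math.ceil(groups * n_per_group / eligible_fraction)


# Hypothetical per-group sizes; eligibility fractions of 30% and about 6%
# follow the PROACT II-based figures quoted in the text.
inclusive = total_screened(n_per_group=2000, eligible_fraction=0.30)
mismatch = total_screened(n_per_group=150, eligible_fraction=0.06)
print(f"inclusive criteria: screen {inclusive}")
print(f"DWI-PWI mismatch:   screen {mismatch}")
```

The restricted trial screens more patients per enrolled subject, but it needs so many fewer subjects that the total screening burden can still fall.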
When the proportions of patients with BI <95 were used to define poor outcome (using event rates published for CLASS PICH and TACS patients and extrapolations for LACS [35% poor outcome] and PACS [60%]), there was a reduction in sample size requirements of 40%. Further exploration of the possible influence of different end-point dichotomies is shown in Figure 4. Because outcome event rates are similar for different end points (for example, in the placebo group in GAIN-I, 34% had BI >90, 28% had modified Rankin Scale score <2, and 26% had National Institutes of Health Stroke Scale score <2), there were no substantial changes in the results for different scales.
| Discussion |
|---|
Most neuroprotective trials to date have been powered to detect effects on the order of an absolute risk reduction of 10%. This strategy reflected a combination of anticipation that large reductions in histological infarct volume in animal models would translate into large functional effects in humans, extrapolation from small phase II trials that risk biasing effect size estimates upwards,2 and limited recruitment rates resulting from the inherent structural inadequacies of stroke care organization, as well as commercial considerations. The only data that permit estimation of the likely effect size of an efficacious treatment are derived from reperfusion therapies: the NINDS rtPA trial, PROACT II, and STAT. Event rates in the control groups of these trials are comparable to those in large neuroprotective studies, and the entry criteria are comparable, except for excluding hemorrhage by CT. Realistic estimates of potential treatment effect indicate an RRR of 10% to 20% for detrimental outcome (death and dependence), with an upper 95% CI limit of 24% for the 2 thrombolytic trials.
Assuming neuroprotection to have a more restricted target population than reperfusion therapies for the reasons outlined above, the sample size estimates derived from the models in this study indicate a requirement for 4000 patients per treatment arm to detect even large treatment effects (RRR of 20%) using current trial entry criteria and a typical trial end point. Even optimistic assumptions about treatment effect, accuracy of initial diagnosis, and penumbra yield a sample size of 1800 to 2200 per group. To detect a more modest treatment effect (RRR of 10%), the minimum sample size exceeds 5000 per group; more patients may need to be recruited than have taken part in all trials ever undertaken of calcium antagonists (7500) or glutamate antagonists (11 000). At best, even the GAIN trial program may have recruited only three quarters of the necessary subjects. Negative results from trials to date may thus potentially be explained entirely on the basis of stroke heterogeneity.
There may be important statistical advantages in using a definition that increases the proportion of patients categorized as having poor outcome, with reduction of sample size of 40% if BI <95 is used as an end point (Figure 4). Although this implies that GAIN-I may have had adequate sample size to detect a large treatment effect (RRR of 20%) in the realistic model, it would still be only 75% of the required size for a more modest 15% RRR, and CLASS remains too small to detect anything other than a large effect with optimistic assumptions. It remains unclear whether full or nearly full recovery is an appropriate end point for a neuroprotective treatment as opposed to reperfusion therapy. Although this end point would be valid if the assumption of proportional odds reduction is correct (ie, patients move across all categories of outcome in the same proportion in response to treatment), this remains unproven for neuroprotective agents, and there are biological reasons for uncertainty.
The proportion of informative patients should be enhanced by restricting trial time windows further (perhaps to 3 hours), excluding PICH with CT, instituting mandatory clinical criteria such as minimum severity on stroke scales or the presence of cortical features, and using MRI signatures of a penumbra. All of these strategies restrict recruitment rates, but sample size may be reduced to 400 per treatment arm to show benefit using conventional outcome measures. Despite needing to screen 18 patients per eligible subject using MRI criteria (6 times as many as with conventional criteria), the total number needing to be screened for a trial should in fact be far fewer than for inclusive entry criteria. Costs may be increased through the need to screen more ineligible patients with expensive imaging and the likely need to expand the number of participating centers and lengthen study duration. Although there are advantages in conducting very large, inclusive trials with appropriate subgroup analyses rather than very restricted trials, a more restrictive trial model may be more cost effective in early drug development.
There are inevitable caveats about the results of this modeling exercise. First, the assumption of a lack of effect in hemorrhage and lacunar syndromes is based on limited data and may be overly pessimistic. Reperfusion appears beneficial in patients classified as having small-vessel disease despite similar mechanistic arguments against it, and ultimately there remains uncertainty about the pathological basis for many lacunar infarcts. All MRI data are based on very small numbers of patients and almost certainly reflect a biased sample. Second, the numbers of patients with MRI evidence of penumbra may be less than assumed. A further concern is the restricted volume of the penumbra in MRI and PET studies25,26; if anatomically very limited, then even major salvage of target tissue may be without clinically detectable (or relevant) benefit. Third, the clinical classification and scales used are subject to error, and the BI in particular has limited discriminatory utility as an outcome measure. The OCSP classification, although not designed for acute use, categorizes patients in a biologically relevant manner by distinguishing large from small cortical strokes and lacunes, something not possible from total stroke scale scores, for instance. There are no published data with alternative categories, but the same modeling exercise could, for example, be applied on the basis of imaging findings. With respect to outcome, almost all trials have used a responder analysis based on the proportions in favorable versus unfavorable outcome categories, and the results from this model using the BI can be extrapolated to any other dichotomous outcome measure, including handicap scales such as the modified Rankin Scale or imaging-based biomarkers (eg, proportion with expansion of MRI lesion volume27) with little change in the implications for sample size, although there are some advantages in choosing a smaller rather than larger outcome category, as shown in Figure 4. 
Finally, event rates derived from reperfusion trials may be too conservative because the net RRR reflects an aggregate of very favorable outcomes in patients who reperfuse early and less favorable outcomes in those who do not or those who encounter hemorrhagic complications. Reperfusion trials may in fact have a proportion of uninformative patients similar to that of neuroprotective trials. However, most patients in the trials used to estimate effect size were randomized within 3 hours, and effect sizes are likely to be greater than achievable with neuroprotectives, typically delivered 4 to 6 hours after onset. There was no significant heterogeneity between stroke subtypes with respect to benefit. Most importantly, had the true effect size of any of half a dozen neuroprotective agents been as great as or greater than that evident in reperfusion trials, it should have been seen in the trials, many of which had sample sizes many times greater than NINDS. In the absence of any data from neuroprotective trials themselves, effect size estimates represent the best information from which to extrapolate.
In conclusion, heterogeneity of stroke populations recruited to typical neuroprotective trials may reduce substantially the likelihood of showing efficacy, even if a neuroprotective agent were to have an effect size equivalent to thrombolytics. Logical and feasible strategies to limit the variability in the patient populations recruited may enhance the ability of future trials to demonstrate efficacy.
| Acknowledgments |
|---|
Received November 14, 2001; revision received January 7, 2002; accepted January 22, 2002.