(Stroke. 2005;36:2187.)
© 2005 American Heart Association, Inc.
Original Contributions |
From the Department of Cerebrovascular Medicine, Division of Cardiovascular and Medical Sciences (F.B.Y., C.J.W., K.R.L.) and the Robertson Centre for Biostatistics (C.J.W.), University of Glasgow, Glasgow, U.K.
Correspondence to Fiona B. Young, BSc, Gardiner Institute, Western Infirmary, Glasgow, U.K. G11 6NT. E-mail Fby1w{at}clinmed.gla.ac.uk
| Abstract |
|---|
|
|
|---|
Methods We simulated a total of 6000 clinical trials, each with 1400 patients. We estimated statistical power for a range of NIHSS end points, including prognosis-adjusted and fixed dichotomized end points. These end points were compared with the BI and mRS dichotomized at 95 and 1, respectively.
Results The most powerful fixed end point was the NIHSS dichotomized at 1. For prognosis-adjusted outcome, we found greatest power if we defined success as achieving a score of ≤1 or improvement by at least 11 points from baseline. We are more likely to achieve a statistically significant result by using this prognosis-adjusted end point instead of NIHSS ≤1 (odds ratio, 2.8; 95% confidence interval [CI], 2.5 to 3.2). Use of the optimal NIHSS prognosis-adjusted end point rather than BI ≥95 could justify a reduction in sample size of approximately 68% (95% CI, 67% to 69%) without loss of statistical power.
Conclusions The NIHSS neurologic scale appears more sensitive than the BI or mRS, allowing smaller sample sizes or greater statistical power. The use of an NIHSS prognosis-adjusted end point could allow therapeutic effects from drugs to be more easily identified.
Key Words: acute stroke; clinical trial; end point determination
| Introduction |
|---|
|
|
|---|
| The National Institutes of Health Stroke Scale |
|---|
|
|
|---|
The NIHSS is one of the most reliable and valid instruments of clinical measurement in stroke,5,6 and the modified scale was shown to be highly correlated with the BI, mRS, and Glasgow Outcome Score (GOS) at 90 days.7
As with disability scales, analysis of outcome assessed by NIHSS has often relied on simple dichotomization.8 The ATLANTIS trial9 used a "prognosis-adjusted" secondary end point (Table 1). The NIHSS was used as part of a global primary end point in the NINDS trial,10 along with the GOS, BI, and mRS. Global and "prognosis-adjusted" end points are rarely used in acute stroke trials. Global end points allow patients to be assessed simultaneously on many measurement scales. Prognosis-adjusted end points, in turn, allow patients to be assessed against realistic goals while maintaining generalizability. A selection of NIHSS end points that have been used is given in Table 1.
|
| Aims |
|---|
|
|
|---|
| Methods |
|---|
|
|
|---|
We based our work on data from the GAIN International trial.12 The GAIN trial was neutral; however, to avoid any bias, only the placebo patients were used. We generated 6000 clinical trials, each with 1400 patients split equally between active treatment and placebo groups. Patients were simulated by randomly sampling with replacement from the GAIN data. The characteristics of the simulated patients were based on real examples from the GAIN trial, preserving the correlation among the NIHSS, Oxford classification, and final outcome. The simulated placebo and treatment groups were generated slightly differently. The simulated treatment group was forced to have slightly milder stroke (assessed by NIHSS at baseline). The difference between the average baseline NIHSS for the 2 groups was 1, 2, or 3 points ("treatment level"). To avoid confounding by other factors, the simulated treatment group patients were selected from subgroups with clinical characteristics similar to those of the sampled placebo patients. Figure 1 details the simulation technique.
|
In generating a "treatment" effect that separated the groups by a small difference in baseline NIHSS, we were effectively making an assumption about how neuroprotection or reperfusion would translate into neurologic function in the first few hours after treatment. We anticipate that an effective treatment would limit the extent of infarction and thus be equivalent to presentation with a slightly milder event, but that thereafter, the pattern of recovery and associated final outcome at 90 days would relate to initial severity in the same way as for any other patient presenting with a stroke of that milder degree. The extent of the simulated treatment effect was equivalent to generating an improvement of 1, 2, or 3 NIHSS points in stroke severity. We also assumed that all patients benefit from treatment. Different patterns of treatment effect have been explored elsewhere and found to have limited impact.24
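The resampling scheme described above can be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm (Figure 1 is not reproduced here): the record keys `baseline`, `oxford`, and `day90`, the function name `simulate_trial`, and the rule of matching on Oxford class alone are all assumptions made for the example.

```python
import random
from collections import defaultdict


def simulate_trial(placebo_pool, n_per_arm, shift, rng):
    """Resample one simulated trial from a pool of real placebo records.

    placebo_pool: list of dicts with hypothetical keys 'baseline'
        (baseline NIHSS), 'oxford' (Oxford class), 'day90' (90-day NIHSS).
    shift: the "treatment level" (1, 2, or 3 baseline NIHSS points).
    """
    # Index real patients by (Oxford class, baseline NIHSS) so that a
    # clinically similar but milder patient can be looked up.
    strata = defaultdict(list)
    for p in placebo_pool:
        strata[(p["oxford"], p["baseline"])].append(p)

    # Assumption: only patients with at least one milder match are sampled.
    eligible = [p for p in placebo_pool
                if strata.get((p["oxford"], p["baseline"] - shift))]
    if not eligible:
        raise ValueError("no patient has a milder match at this shift")

    placebo, treated = [], []
    for _ in range(n_per_arm):
        p = rng.choice(eligible)  # sampling with replacement
        milder = strata[(p["oxford"], p["baseline"] - shift)]
        placebo.append(p)
        # The "treated" patient inherits the whole record (including the
        # 90-day outcome) of a real patient whose stroke was `shift` points milder.
        treated.append(rng.choice(milder))
    return placebo, treated
```

Because each simulated patient is a complete real record, the correlation between baseline severity and 90-day outcome is carried over without any modelling assumptions.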
We assessed several different types of end point. The 90-day NIHSS was dichotomized at 1, 3, 5, and 7. Prognosis-adjusted end points were also included: each patient was considered to have a favorable outcome if they achieved either a score of NIHSS ≤1 or an improvement from baseline of more than n points, where n ranged from 2 to 15. End points that simply assessed whether a patient had improved from baseline by n points were also included: n ranged from 2 to 15. Finally, a global end point was considered, incorporating the dichotomizations BI ≥95, mRS ≤1, and NIHSS ≤1.
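The three families of single-scale end points can be written as simple predicates. The sketch below is illustrative only; the function names are hypothetical, and it assumes "improvement by at least n points" (the wording used for the 11-point end point), whereas the Methods text also uses "more than n points".

```python
def fixed_dichotomy(day90, cutoff=1):
    """Favorable if the 90-day NIHSS is at or below the cutoff."""
    return day90 <= cutoff


def improvement_only(baseline, day90, n):
    """Favorable if the patient improved by at least n NIHSS points."""
    return (baseline - day90) >= n


def prognosis_adjusted(baseline, day90, n, cutoff=1):
    """Favorable if either the fixed cutoff or the n-point improvement is met."""
    return fixed_dichotomy(day90, cutoff) or improvement_only(baseline, day90, n)
```

For example, a patient who goes from a baseline NIHSS of 20 to 9 at 90 days fails the fixed cutoff of 1 but counts as a success under the prognosis-adjusted end point with n = 11, whereas a patient with a baseline score of 5 can never satisfy an 11-point improvement.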
Each of the clinical trials was assessed by Pearson's chi-square test for dichotomized end points and by generalized estimating equations25 for the global end points. The statistical power and 95% confidence intervals were estimated using a bootstrap approach. Relative sample sizes, showing the trial size that would be required to maintain the statistical power of a reference end point, were assessed using standard sample size equations.26,27 End points were also compared using logistic regression28 controlling for treatment level.
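As a rough illustration of how power can be estimated from simulated trials, the sketch below tests each trial with an uncorrected Pearson chi-square statistic for a 2x2 table and then bootstraps the proportion of significant trials. The percentile interval, the absence of a continuity correction, and the function names are assumptions; the source does not specify these details.

```python
import random

CHI2_CRIT_1DF_05 = 3.841  # upper 5% point of the chi-square distribution, 1 df


def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: an empty row or column
    return n * (a * d - b * c) ** 2 / denom


def trial_is_significant(success_treated, success_placebo, n_per_arm):
    """Two-sided test at the 5% level for one simulated trial."""
    stat = pearson_chi2_2x2(success_treated, n_per_arm - success_treated,
                            success_placebo, n_per_arm - success_placebo)
    return stat > CHI2_CRIT_1DF_05


def power_with_bootstrap_ci(significant_flags, n_boot=1000, seed=0):
    """Power = share of significant trials, with a percentile bootstrap 95% CI."""
    rng = random.Random(seed)
    k = len(significant_flags)
    power = sum(significant_flags) / k
    boots = sorted(sum(rng.choices(significant_flags, k=k)) / k
                   for _ in range(n_boot))
    return power, boots[int(0.025 * n_boot)], boots[int(0.975 * n_boot) - 1]
```

Here `significant_flags` would hold one Boolean per simulated trial for a given end point and treatment level.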
| Results |
|---|
|
|
|---|
≤1 end point (Table 2). The power tended to decrease as cutpoints were moved toward higher NIHSS values. All of the prognosis-adjusted end points were more powerful than the ≤1 dichotomized end point (Table 2). Simply assessing whether the patient improved by at least a certain number of NIHSS points tended to be less powerful than the prognosis-adjusted end points. The inclusion of BI ≥95 and mRS ≤1 with NIHSS ≤1 in a global end point had similar power to the NIHSS ≤1 dichotomized end point.
|
The results in Table 3 are given in terms of a relative sample size. By using any of the prognosis-adjusted end points instead of the ≤1 dichotomy, the sample size can be reduced without a reduction in power.
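To make the idea of a relative sample size concrete, the sketch below uses a common textbook formula for comparing two proportions. It is an assumption that this matches the "standard sample size equations" cited in the Methods, and the response rates in the usage note are hypothetical, not values from the study.

```python
from statistics import NormalDist


def n_per_arm(p_control, p_treated, alpha=0.05, power=0.80):
    """Patients per arm needed to detect p_control vs p_treated (two-sided test)."""
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_treated) / 2
    numerator = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
                 + z_b * (p_control * (1 - p_control)
                          + p_treated * (1 - p_treated)) ** 0.5) ** 2
    return numerator / (p_control - p_treated) ** 2


def relative_sample_size(reference, alternative, **kwargs):
    """Sample size under the alternative end point as a fraction of the reference.

    Each argument is a (p_control, p_treated) pair of success proportions.
    """
    return n_per_arm(*alternative, **kwargs) / n_per_arm(*reference, **kwargs)
```

With hypothetical success rates of 30% versus 40%, `n_per_arm(0.3, 0.4)` gives roughly 356 patients per arm; an end point that separates the same groups more widely yields a relative sample size below 1.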
|
The odds of reaching a statistically significant result were increased by 188% (95% confidence interval [CI], 158% to 221%) if the prognosis-adjusted (NIHSS ≤1 or ≥11-point improvement) end point was used instead of the ≤1 dichotomized end point. The prognosis-adjusted end point of ≤1 or improvement by 11 points or more was clearly the most powerful of all end points tested across all treatment levels (see Figure 2).
|
We have previously shown the BI ≥95 and mRS ≤1 to be the best available disability end points;24 in this study, however, the NIHSS end points tended to be more powerful. If the NIHSS ≤1 dichotomy were to be used instead of BI ≥95, the relative sample size required to maintain the statistical power would be around 62% (see Table 4). The NIHSS end points showed similar advantages over mRS ≤1.
|
| Discussion |
|---|
|
|
|---|
≤1 at 90 days or an improvement from baseline of at least 11 points. However, simple dichotomization at NIHSS ≤1 was also effective, as were many end points that included only an n-point improvement. Useful values of n tended to be small, however, because larger values render the end point subject to a ceiling effect (patients with mild stroke cannot improve beyond a score of 0). Including the NIHSS, BI, and mRS in a global end point also demonstrated reasonable statistical power, similar to that of the single NIHSS component. However, our global end point did not include the GOS. Although a global end point is attractive because it incorporates several important outcome measures, it is likely that the power of the global end point was restricted by the inclusion of the less powerful mRS and BI measures. The benefit of prognosis-adjusted end points has been demonstrated. Assessing whether a patient reaches a score of 0 or 1 on the NIHSS or improves from baseline by at least 11 points allows more patients to contribute to the results, whereas simply dichotomizing the scale results in a loss of meaningful information. This is logical: some patients may never reach a score of 0 or 1 but could still show significant improvement and therefore will contribute to this type of analysis.
In this study, we tested only a fixed treatment effect, assuming that the initial severity of all active treatment group patients improved by the same amount. This may not be the most likely scenario, because patients can benefit from treatments in a variety of ways. The actual treatment effect may only benefit subgroups of patients such as male or young patients or may depend on the severity of the stroke. However, in a previous study,24 we demonstrated consistent performance of outcome measures across a variety of likely treatment effect patterns. Second, our analysis was not adjusted for any factors relating to stroke outcome. If covariates had been included, the statistical power of the end points would likely have been increased.29 Third, we chose as our surrogate treatment effect an "improved" baseline NIHSS score to increase the chance of improved outcome at 90 days. It is possible that this method of applying a simulated treatment effect may have overestimated the statistical power obtained by the NIHSS end points in comparison to the BI or mRS end points through some subtle link. We tested for this by including baseline NIHSS as a covariate. However, when the effect of baseline NIHSS was controlled, the advantage of the NIHSS end points over the BI and mRS end points decreased only minimally, suggesting that any inflation of the power values was small.
All of the end points in our study included dichotomizations of the NIHSS. An alternative would be to use all categories and apply proportional odds (PO) logistic regression.30 However, we found that the use of the PO model did not improve the power compared with the best prognosis-adjusted end point and was similar to the ≤1 dichotomy.
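For reference, one common way of writing the proportional odds model is given below. The notation is an assumption for illustration: Y is the ordinal 90-day NIHSS category, x is the treatment indicator, and sign conventions differ between textbooks and software.

```latex
\operatorname{logit} P(Y \le j \mid x)
  = \log \frac{P(Y \le j \mid x)}{P(Y > j \mid x)}
  = \alpha_j + \beta x, \qquad j = 0, 1, \dots, J - 1
```

The single coefficient \beta is shared by every cutpoint j, so e^{\beta} is a common odds ratio for a better outcome. Dichotomizing at ≤1 amounts to keeping only the j = 1 equation.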
Our study only used data from the GAIN International Trial.12 This population was considered representative of most patients with stroke presenting to hospital acutely. Nevertheless, it may be informative to use different trial data to assess the generalizability of our findings.
Despite the apparent potential improvement in statistical power, neurologic impairment scales have been used infrequently as primary end points.10,15 Instead, trials have tended to favor disability scales, driven in part by regulatory authorities. The European Medicines Evaluation Authority (EMEA)31 accepts that neurologic outcome scales should be supportive as secondary efficacy end points, but also recommends that they should not be dichotomized, because important information may be missed. Impairment scales are more sensitive to change in patient status and may be more relevant for earlier phase trials. Disability or handicap scales are perceived as being more relevant to patients with stroke for phase III trials,1 and the BI has been shown to be reliable even when administered over the telephone.6
In conclusion, we found benefits from using the NIHSS at 90 days as an end point. We speculate that this will translate into more powerful clinical trials. Clinical trials of acute stroke therapies should consider the use of neurologic impairment scales to assess benefit from treatment.
| Acknowledgments |
|---|
Received May 17, 2005; accepted June 29, 2005.
| References |
|---|
|
|
|---|
2. Orgogozo JM. Advantages and disadvantages of neurological scales. Cerebrovasc Dis. 1998; 8: 2–7.
3. Brott T, Adams HP, Olinger CP, Marler JR, Barsan WG, Biller J, Spilker J, Holleran R, Eberle R, Hertzberg V. Measurements of acute cerebral infarction: a clinical examination scale. Stroke. 1989; 20: 864–870.
4. Muir KW, Weir CJ, Murray GD, Povey C, Lees KR. Comparison of neurological scales and scoring systems for acute stroke prognosis. Stroke. 1996; 27: 1817–1820.
5. D'Olhaberriague L, Litvan I, Mitsias P, Mansbach HH. A reappraisal of reliability and validity studies in stroke. Stroke. 1996; 27: 2331–2336.
6. Lyden PD, Lau GT. A critical appraisal of stroke evaluation and rating scales. Stroke. 1991; 22: 1345–1352.
7. Lyden PD, Lu M, Levine SR, Brott TG, Broderick J, NINDS rtPA Stroke Study Group. A modified National Institute of Health Stroke Scale for use in stroke clinical trials: preliminary reliability and validity. Stroke. 2001; 32: 1310–1317.
8. Duncan PW, Jorgensen HS, Wade DT. Outcome measures in acute stroke trials: a systematic review and some recommendations to improve practice. Stroke. 2000; 31: 1429–1438.
9. Clark WM, Wissman S, Albers GW, Jhamandas JH, Madden KP, Hamilton S; for the ATLANTIS Study Investigators. Recombinant tissue-type plasminogen activator (Alteplase) for ischemic stroke 3 to 5 hours after symptom onset: the ATLANTIS study. JAMA. 1999; 282: 2019–2026.
10. The National Institute of Neurological Disorders and Stroke rt-PA Stroke Study Group. Tissue plasminogen activator for acute ischemic stroke. N Engl J Med. 1995; 333: 1581–1587.
11. Sacco RL, DeRossa JT, Haley EC, Levin B, Ordronneau P, Phillips SJ, Rundek T, Snipes RG, Thompson JL; Glycine Antagonist in Neuroprotection Americas Investigators. Glycine antagonist in neuroprotection for patients with acute stroke: a randomized controlled trial (GAIN Americas Study). J Am Med Assoc. 2001; 285: 1719–1761.
12. Lees KR, Asplund K, Carolei A, Davis SM, Diener HD, Kaste M, Orgogozo JM, Whitehead J; for the GAIN International Investigators. Glycine antagonist (gavestinel) in neuroprotection (GAIN International) in patients with acute stroke: a randomised controlled trial. Lancet. 2000; 355: 1949–1954.
13. del Zoppo GJ, Higashida RT, Furlan AJ, Pessin MS, Rowley HA, Gent M; the PROACT Investigators. PROACT: A phase II randomized trial of recombinant Pro-urokinase by direct arterial delivery in acute middle cerebral artery stroke. Stroke. 1998; 29: 4–11.
14. Furlan A, Higashida R, Wechler L, Gent M, Rowley H, Kase C, Pessin M, Ahuja A, Callahan F, Clark WM, Silver F, Rivera F; for the PROACT Investigators. Intra-arterial prourokinase for acute ischemic stroke: the PROACT II study: a randomized controlled trial. J Am Med Assoc. 1999; 282: 2003–2011.
15. Clark WM, Williams BJ, Selzer KA, Zweifler RM, Sabounjian LA, Gammans RE; for the Citicoline Stroke Study Group. A randomized efficacy trial of citicoline in patients with acute ischemic stroke. Stroke. 1999; 30: 2592–2597.
16. Grotta J, for the US and Canadian Lubeluzole Ischemic Stroke Study Group. Lubeluzole treatment of acute ischemic stroke. Stroke. 1997; 28: 2338–2346.
17. The RANTTAS Investigators. A randomized trial of tirilazad mesylate in patients with acute stroke (RANTTAS). Stroke. 1996; 27: 1453–1458.
18. Davis SM, Lees KR, Albers GW, Diener HC, Markabi S, Karlsson G, Norris J; for the ASSIST Investigators. Selfotel in acute ischemic stroke: possible neurotoxic effects of an NMDA antagonist. Stroke. 2000; 31: 347–354.
19. Hacke W, Kaste M, Fieschi C, Toni D, Lesaffre E, Von Kummer R, Boysen G, Bluhmki E, Hoxter G, Mahagne MH; for the ECASS Study Group. Intravenous thrombolysis with recombinant tissue plasminogen activator for acute hemispheric stroke: the European Cooperative Acute Stroke Study (ECASS). J Am Med Assoc. 1995; 274: 1017–1025.
20. Hacke W, Kaste M, Fieschi C, Von Kummer R, Davalos A, Meier D, Larrue V, Bluhmki E, Davis S, Donnan G, Schneider D, Diez-Tejedor E, Trouillas P; for the Second European-Australasian Acute Stroke Study Investigators. Randomised double-blind placebo-controlled trial of thrombolytic therapy with intravenous alteplase in acute ischaemic stroke (ECASS II). Lancet. 1998; 352: 1245–1251.
21. Hacke W, Bluhmki E, Steiner T, Tatlisumak T, Mahagne MH, Sacchetti ML, Meier D; for the ECASS Study Group. Dichotomized efficacy end points and global end-point analysis applied to the ECASS intention-to-treat data set: post hoc analysis of ECASS I. Stroke. 1998; 29: 2073–2075.
22. Bland M. An Introduction to Medical Statistics, 3rd ed. Oxford: Oxford University Press; 2000.
23. Efron B, Tibshirani R. An Introduction to the Bootstrap. New York: Chapman & Hall; 1993.
24. Young FB, Lees KR, Weir CJ. Strengthening acute stroke trials through optimal use of disability endpoints. Stroke. 2003; 34: 2676–2680.
25. Lipsitz SR, Laird NM, Harrington DP. Generalized estimating equations for correlated binary data: using the odds ratio as a measure of association. Biometrika. 1991; 78: 153–160.
26. Wei P. Sample size and power calculations with correlated binary data. Control Clin Trials. 2001; 22: 211–227.
27. Woodward M. Formulae for sample size, power and minimum detectable relative risk in medical studies. Statistician. 1992; 41: 185–196.
28. Hosmer DW, Lemeshow S. Applied Logistic Regression. New York: Wiley; 1989.
29. Pocock SJ, Assmann SE, Enos LE, Kasten LE. Subgroup analysis, covariate adjustment and baseline comparisons in clinical trial reporting: current practice and problems. Stat Med. 2002; 21: 2917–2930.
30. Agresti A. Categorical Data Analysis. New York: Wiley; 1990.
31. Committee for Proprietary Medicinal Products (CPMP). Points to Consider on Clinical Investigation of Medicinal Products for the Treatment of Acute Stroke. 2001.