Seven-Day NIHSS Is a Sensitive Outcome Measure for Exploratory Clinical Trials in Acute Stroke
Evidence From the Virtual International Stroke Trials Archive
Background and Purpose—Clinical trials in stroke typically measure outcome after 90 days. Earlier outcome assessment would reduce costs and may increase power. We aimed to compare the sensitivity of 4 end points (modified Rankin Scale [mRS] at 30 and 90 days, and National Institutes of Health Stroke Scale [NIHSS] at 7 and 90 days, analyzed as ordinal measures) to detect the established treatment effect of recombinant tissue-type plasminogen activator (rtPA).
Methods—Within the Virtual International Stroke Trials Archive, we compared rtPA–treated patients with untreated control subjects using a multiple resampling approach. From our total sample we drew 10 000 random samples of unique patients, constraining the sample sizes in treated and untreated groups to be equal. In each of these samples we tested for the treatment effect of rtPA by each of the 4 studied end points. The percentage of samples yielding significant results approximates the power of each end point at a given sample size. This process was repeated across a range of sample sizes, to determine the relationship between sample size and power for each of the 4 end points.
Results—For our 4 end points of mRS at 30 and 90 days, and NIHSS at 7 and 90 days, the smallest sample sizes required to generate statistical power >80% were 620, 480, 370, and 420, respectively, making 7-day NIHSS the most sensitive end point. These results were supported by dichotomized analyses.
Conclusions—Seven-day NIHSS score appears to be a sensitive end point that should be validated in randomized trial datasets for use in exploratory stroke trials.
Acute stroke trials typically record outcomes after 90 days.1 Prolonged follow-up increases costs, and risks weakening conclusions due to patient attrition. To enhance efficiency of exploratory trials, there may be benefit from reliance on earlier assessment of outcomes, that is, after 7 or 30 days. This may also minimize confounding from unrelated adverse events. With established efficacy of recombinant tissue-type plasminogen activator (rtPA) on the modified Rankin Scale (mRS) at 90 days, we have an opportunity to explore alternative approaches.2,3
The National Institute of Neurological Disorders and Stroke (NINDS) Thrombolysis Trial Group proposed 24-hour change in National Institutes of Health Stroke Scale (NIHSS) as an end point for part 1 of their trial but, for part 2, 90-day global outcome2; however, they concluded in 2000 that early NIHSS measures may be more sensitive than 90-day measures.4 This evolution between the NINDS trials, the conflicting rank order of power in subsequent end point analysis work,4 and a general acceptance that sustained functional benefit should be demonstrated in pivotal trials may together account for the limited implementation of their suggestion.
Nevertheless, if earlier outcome measures are more sensitive to treatment effects than 90-day mRS, they should be implemented in exploratory trials. We investigated the sensitivity of 7-day NIHSS and 30-day mRS to the established treatment effect of rtPA on 90-day mRS, to inform their validity as end points in clinical trials in stroke.
We compared rtPA–treated patients against untreated control subjects, using data from a previously described trials archive, the Virtual International Stroke Trials Archive (VISTA).5 From this large dataset of treated and untreated patients, we repeatedly created random samples containing equal numbers of treated and untreated patients. We did this across a wide range of sample sizes, and for each sample size we repeated the procedure 10 000 times. Each time we created a treated and untreated group, we tested for a difference in outcomes on a series of measures: 7-day NIHSS, 30-day mRS, and 90-day NIHSS or mRS. By counting the number of such “trials” that declared a significant treatment difference, we could assess the power to detect a typical thrombolysis treatment effect across a range of sample sizes.
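The resampling procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the patient pools are synthetic (`make_pool` is a hypothetical helper that generates NIHSS-like integer scores), the number of replicates is reduced from 10 000 to keep it fast, and an unadjusted Wilcoxon rank-sum test with a normal approximation stands in for the covariate-adjusted ordinal logistic regression used in the study.

```python
import math
import random


def rank_sum_p_value(a, b):
    """Two-sided Wilcoxon rank-sum p-value (normal approximation, tie-corrected)."""
    combined = sorted([(v, 0) for v in a] + [(v, 1) for v in b])
    n1, n2 = len(a), len(b)
    n = n1 + n2
    rank_sum_a = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of the 1-based ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_a += midrank * sum(1 for k in range(i, j) if combined[k][1] == 0)
        i = j
    u = rank_sum_a - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u <= 0:
        return 1.0
    z = (u - mean_u) / math.sqrt(var_u)
    return math.erfc(abs(z) / math.sqrt(2.0))


def estimate_power(treated, control, n_per_group, n_samples=1000, alpha=0.05, seed=0):
    """Fraction of resampled 'trials' in which the test is significant at alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        # Sampling without replacement keeps patients unique within a sample,
        # and equal group sizes mirror the constraint described in the text.
        t = rng.sample(treated, n_per_group)
        c = rng.sample(control, n_per_group)
        if rank_sum_p_value(t, c) < alpha:
            hits += 1
    return hits / n_samples


def make_pool(size, shift, rng):
    """Hypothetical NIHSS-like scores (0 to 42); NOT data from VISTA."""
    return [min(42, max(0, round(rng.gauss(10 - shift, 6)))) for _ in range(size)]


if __name__ == "__main__":
    rng = random.Random(42)
    treated = make_pool(1500, 2.0, rng)   # assumed 2-point benefit, illustration only
    control = make_pool(2000, 0.0, rng)
    for n in (25, 100, 200):
        print(2 * n, estimate_power(treated, control, n, n_samples=300))
```

Repeating `estimate_power` over a grid of sample sizes gives one power curve per end point. Here `n_per_group` counts patients in each arm, so the total sample size of each simulated trial is twice that value.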
Our primary analysis used ordinal logistic regression. All analyses were adjusted for age and baseline NIHSS score. Full methodology is described in the online-only Data Supplement material (http://stroke.ahajournals.org).
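For orientation, an adjusted ordinal logistic regression is commonly written in proportional-odds form, as below. This is the textbook form rather than the specification given in the Data Supplement, which may differ; the symbols ($T_i$ for rtPA treatment, $\alpha_k$ for the category cut points, the $\beta$ coefficients) are notation assumed here for illustration.

```latex
\operatorname{logit}\,\Pr(Y_i \le k)
  = \alpha_k - \left( \beta_{\mathrm{rtPA}}\, T_i
      + \beta_{\mathrm{age}}\, \mathrm{age}_i
      + \beta_{\mathrm{NIHSS}}\, \mathrm{NIHSS}_{0,i} \right),
  \qquad k = 1, \dots, K-1
```

Under this form, $Y_i$ is the ordinal outcome (for example, mRS category) for patient $i$, $\mathrm{NIHSS}_{0,i}$ is the baseline NIHSS score, and the test for a treatment effect is a test of $\beta_{\mathrm{rtPA}} = 0$.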
From VISTA, we obtained data on 7886 patients, of whom 4712 met the data requirements for inclusion; 1934 (41.0%) were treated with rtPA. There were important imbalances in baseline demographics (online-only Data Supplement Table 1).
The relationships between sample size and power for each end point are shown in the Figure. Secondary end points based on dichotomization are displayed in online-only Data Supplement Figures 1 and 2. The Table shows the minimum sample sizes required for each end point to achieve statistical power at typical choices of 80% or 90%.
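Reading a minimum sample size off a power curve can be sketched as follows. The function name `min_sample_size` and the power values in the example are hypothetical and are not taken from the Figure or the Table; the sketch also assumes that "achieving" a power level means reaching or exceeding it.

```python
def min_sample_size(power_curve, target=0.80):
    """Return the smallest sample size whose estimated power reaches `target`.

    `power_curve` maps total sample size to estimated power (0 to 1).
    Returns None if no tested sample size reaches the target.
    """
    for n in sorted(power_curve):
        if power_curve[n] >= target:
            return n
    return None


# Hypothetical power curve, for illustration only.
example_curve = {100: 0.35, 200: 0.58, 300: 0.74, 400: 0.83, 500: 0.90}

print(min_sample_size(example_curve, 0.80))  # 400
print(min_sample_size(example_curve, 0.90))  # 500
print(min_sample_size(example_curve, 0.95))  # None
```

Because resampling-based power estimates carry simulation noise, a curve may not be strictly monotone near the threshold; this sketch simply returns the first sample size at which the estimate crosses it.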
Seven-day NIHSS was confirmed to be the most sensitive outcome measure and 30-day mRS the least. For the 90-day outcome measures, the NIHSS performed modestly better than the mRS, though this difference was negligible on dichotomized analysis.
We found NIHSS as measured at 7 days to be the outcome measure most sensitive to the treatment effect of rtPA. This is consistent with conclusions from independent datasets.4 Nevertheless, potential confounding influences should be considered.
Allocation to the rtPA and control groups in our study was nonrandomized and based on a range of clinical factors, some undocumented. Also, outcome assessors, although masked to investigational treatment allocation, were not blinded to use of alteplase. Some reassurance derives from correspondence of the overall treatment effect, as measured by 90-day mRS, with results of the pooled, randomized trials.3 Given these limitations, our results deserve external validation using randomized trial data.
The VISTA analysis presented here complements reports by Young et al.6 and Broderick et al.4 Young et al found dichotomized NIHSS end points more sensitive to a simulated treatment effect than various dichotomized disability end points. Broderick et al reported 24-hour dichotomized NIHSS to be the most powerful end point. Our analysis is based on a sample that is 8-fold larger and could adjust for known baseline prognostic factors such as age and NIHSS. We have considered the recently favored and statistically more powerful ordinal approaches.7
The NIHSS reflects impairment, the clinical domain in which the effects of acute stroke therapies are likely to be most marked.6,8 In contrast, the mRS covers a broader domain and is influenced by extraneous factors8 that acute stroke therapies may not influence. Restricting such background noise may improve sensitivity to acute treatment effects.
This interpretation may also account for the greater sensitivity of the NIHSS as measured at 7 days compared with 90 days. Extraneous factors may have a greater influence on 90-day than 7-day NIHSS, with the latter better reflecting treatment effects in isolation from the myriad factors which come into play after discharge.
For pivotal trials in acute stroke, the European Medicines Agency (EMA) supports neurological scales mainly as a supplement to ordinal analysis of an activity scale (primarily mRS).9 However, for early-phase, exploratory research, 7-day NIHSS score has much to recommend it as an end point. It is sensitive to treatment effects, requiring comparatively low sample sizes to achieve desirable levels of statistical power. Moreover, by recording outcome at 7 days, when most participants will be in hospital, losses to follow-up will be minimized and unrelated adverse events should have less influence on detection of adverse reactions. Trial management decisions, such as dose escalation between patient cohorts, can be expedited and costs contained.
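To put "comparatively low sample sizes" in concrete terms, the minimum sample sizes for >80% power reported in the abstract (370 for 7-day NIHSS, 420 for 90-day NIHSS, 480 for 90-day mRS, 620 for 30-day mRS) imply the following relative reductions when 7-day NIHSS is used instead. The percentages are simple arithmetic on those reported figures, not additional results.

```latex
\frac{480 - 370}{480} \approx 22.9\% \quad \text{(vs 90-day mRS)}, \qquad
\frac{420 - 370}{420} \approx 11.9\% \quad \text{(vs 90-day NIHSS)}, \qquad
\frac{620 - 370}{620} \approx 40.3\% \quad \text{(vs 30-day mRS)}
```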
In summary, 7-day NIHSS score appears to be an ideal end point for the early exploratory testing of novel agents. Promising agents could then be validated in larger phase III trials using the mRS at 90 days to inform licensing and purchasing decisions, with validation of the 7-day NIHSS end point on the same sample to permit use of prior data as the necessary supporting evidence for regulatory submissions.
Sources of Funding
R.L. Fulton is supported by studentships from Wyeth/Pfizer and J&J.
K.R. Lees is Associate Director of the NIHR SRN and chaired the independent data monitoring committee for the ECASS-III trial, chairs the VISTA collaboration, and serves on the Steering Committee for the SITS collaboration. He has received fees and expenses from Boehringer Ingelheim for committee work and lectures.
VISTA Steering Committee members are A. Alexandrov, P.W. Bath, E. Bluhmki, L. Claesson, J. Curram, S.M. Davis, G. Donnan, H.C. Diener, M. Fisher, B. Gregson, J. Grotta, W. Hacke, M.G. Hennerici, M. Hommel, M. Kaste, K.R. Lees, P. Lyden, J. Marler, K. Muir, R. Sacco, A. Shuaib, P. Teal, N.G. Wahlgren, S. Warach, and C. Weimar.
Louis Caplan, MD, was the Guest Editor for this paper.
The online-only Data Supplement is available with this article at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.111.644484/-/DC1.
- Received November 8, 2011.
- Accepted December 12, 2011.
- © 2012 American Heart Association, Inc.
- Broderick JP, Lu M, Kothari R, Levine SR, Lyden PD, Haley C, et al.
- Ali M, Bath P, Brady M, Davis S, Diener HC, Donnan G, et al.
- Young FB, Weir CJ, Lees KR.
- Saver JL.
CPMP. Points to consider on clinical investigation of medicinal products for the treatment of acute stroke, 2001. http://www.ema.europa.eu/docs/en_GB/document_library/Scientific_guideline/2009/09/WC500003342.pdf. Accessed August 24, 2011.