(Stroke. 2003;34:200.)
© 2003 American Heart Association, Inc.
Short Communication
From the Departments of Neurology (K.C.J., E.C.H.) and Health Evaluation Sciences (K.C.J., A.F.C., D.P.W.), University of Virginia, Charlottesville.
Correspondence to Karen C. Johnston, MD, University of Virginia Health System, Department of Neurology, #800394, Charlottesville, VA 22908. E-mail kj4v{at}virginia.edu
Abstract
Background: Six multivariable models predicting 3-month outcome of acute ischemic stroke have been developed and internally validated previously. The purpose of this study was to externally validate those models in an independent data set.
Summary of Report: We predicted outcomes for 299 patients with ischemic stroke who received placebo in the National Institute of Neurological Disorders and Stroke rt-PA trial. The model equations used 6 acute clinical variables and head CT infarct volume at 1 week as independent variables and 3-month National Institutes of Health Stroke Scale, Barthel Index, and Glasgow Outcome Scale as dependent variables. Previously developed model equations were used to forecast excellent and devastating outcome for subjects in the placebo tissue plasminogen activator data set. Area under the receiver operating characteristic curve was used to measure discrimination, and calibration charts were used to measure calibration. The validation data set patients were more severely ill (National Institutes of Health Stroke Scale and infarct volume) than the model development subjects. Areas under the receiver operating characteristic curve showed remarkably little degradation in the validation data set and ranged from 0.75 to 0.89. Calibration curves showed fair to good calibration.
Conclusions: Our models have demonstrated excellent discrimination and acceptable calibration in an external data set. Development and validation of improved models using variables that are all available acutely are necessary.
Key Words: cerebral ischemia; models, statistical; prognosis; stroke outcome
A multivariable model that could predict outcome after stroke would be useful in clinical trials to assess the balance of treatment groups and to predict expected outcomes of patients who are lost to follow-up. We previously developed and internally validated a series of predictive risk models for 229 ischemic stroke patients from the Randomized Trial of Tirilazad Mesylate in Patients with Acute Stroke (RANTTAS).1 However, the models have not been validated in an external data set. The purpose of this study was to assess the validity of those predictive models in an independent data set.
Methods
Study Population
Two hundred ninety-nine patients from the placebo group of the National Institute of Neurological Disorders and Stroke (NINDS) rt-PA trial were used for the validation analysis.2 The NINDS rt-PA trial population has been described in detail previously.2 Briefly, this was an ischemic stroke population treated with intravenous tissue plasminogen activator (tPA) or placebo within 3 hours from symptom onset. Only the placebo group was used for this analysis because intravenous tPA is known to improve clinical outcome, and the predictive model being validated was designed to predict outcome without an intervention.
Three hundred twelve patients were treated with placebo in the NINDS rt-PA trial. Thirteen patients were excluded from our analysis for missing variables: 6 were missing 7- to 10-day CT scan infarct volume, 5 were missing stroke history information, and 2 were missing diabetes history information. The remaining 299 were used for this analysis.
Independent Variables
Baseline clinical information including age, National Institutes of Health Stroke Scale (NIHSS) score,3 history of previous stroke, history of diabetes mellitus, and history of prestroke disability were collected acutely. Infarct volume and stroke subtype were collected at 7 to 10 days after stroke onset.
Outcome Variables
The NIHSS, Barthel Index (BI),4 and Glasgow Outcome Scale (GOS)5 were used as outcome measures 3 months after stroke symptom onset. Each was dichotomized into excellent outcome (NIHSS ≤1, BI ≥95, GOS =1) or devastating outcome (NIHSS ≥20 or death, BI <60 or death, GOS >2) as previously defined.1
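The dichotomization can be sketched in a few lines of Python. This is only an illustration: it assumes the cut-offs read NIHSS ≤1, BI ≥95, GOS =1 for excellent outcome and NIHSS ≥20, BI <60, GOS >2 (or death) for devastating outcome, and the function names, the `died` flag, and the use of `None` for a missing score are hypothetical conventions, not part of the study.

```python
from typing import Optional


def is_excellent(scale: str, score: Optional[int]) -> bool:
    """Return True if a 3-month score meets the assumed 'excellent' cut-off."""
    if score is None:  # no score recorded (for example, the patient died)
        return False
    if scale == "NIHSS":
        return score <= 1
    if scale == "BI":
        return score >= 95
    if scale == "GOS":
        return score == 1
    raise ValueError(f"unknown scale: {scale}")


def is_devastating(scale: str, score: Optional[int], died: bool = False) -> bool:
    """Return True if a 3-month score meets the assumed 'devastating' cut-off.

    Death counts as devastating on every scale (assumption: on the GOS,
    death is coded with a value above 2).
    """
    if died:
        return True
    if score is None:
        raise ValueError("a surviving patient needs a score")
    if scale == "NIHSS":
        return score >= 20
    if scale == "BI":
        return score < 60
    if scale == "GOS":
        return score > 2
    raise ValueError(f"unknown scale: {scale}")
```

Note that the two categories are not complementary: a patient with, say, a BI of 80 is neither excellent nor devastating, which is why each outcome has its own model.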
Statistical Analysis
The previously defined models were applied to the study population using the previously defined weights. Model discrimination was assessed using area under the receiver operating characteristic (ROC) curve, which was computed by a nonparametric method.6 An area under the ROC curve of 0.5 indicates no ability to discriminate and an area of 1.0 indicates perfect discrimination. We prespecified an acceptable area under the ROC curve as ≥0.8. Calibration was assessed using calibration curves.7 The perfect 45° line demonstrates ideal calibration. The closer the model calibration is to the ideal line, the better the calibration. Hosmer-Lemeshow tests were performed to assess whether the models differed significantly from perfect calibration.8
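As an illustration of a nonparametric area under the ROC curve, the sketch below uses the rank-based (Mann-Whitney) formulation: the area equals the probability that a randomly chosen patient with the outcome received a higher predicted probability than a randomly chosen patient without it, with ties counted as one half. This is one common estimator and is assumed here for illustration; the exact computation in reference 6 may differ. The function name and the toy data are hypothetical.

```python
from typing import Sequence


def roc_auc(predicted: Sequence[float], observed: Sequence[int]) -> float:
    """Rank-based area under the ROC curve.

    predicted: model probabilities of the outcome
    observed:  1 if the outcome occurred, 0 otherwise
    """
    with_outcome = [p for p, y in zip(predicted, observed) if y == 1]
    without_outcome = [p for p, y in zip(predicted, observed) if y == 0]
    if not with_outcome or not without_outcome:
        raise ValueError("both outcome classes must be present")

    concordant = 0.0
    for p in with_outcome:
        for q in without_outcome:
            if p > q:
                concordant += 1.0   # correctly ordered pair
            elif p == q:
                concordant += 0.5   # tied pair counts as half
    return concordant / (len(with_outcome) * len(without_outcome))


# Toy data (hypothetical): 8 of the 9 outcome/no-outcome pairs are ordered correctly.
toy_auc = roc_auc([0.9, 0.8, 0.7, 0.4, 0.3, 0.2], [1, 1, 0, 1, 0, 0])
meets_threshold = toy_auc >= 0.8  # the prespecified acceptable level
```

The pairwise loop is quadratic in the number of patients, which is fine for a data set of a few hundred subjects.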
Results
The clinical and imaging characteristics of the validation population are compared with those of the original population from the RANTTAS trial9 in Table 1. There were more blacks and fewer whites in the validation population. The validation population had substantially more severe strokes, as demonstrated by higher NIHSS scores and greater infarct volumes at 1 week, as well as worse outcomes when compared with the original data set.
The models' ability to discriminate outcome, in both the original and validation data sets, is demonstrated in Table 2. In the original data set, 5 of the 6 models had excellent discrimination above the 0.8 level. In the validation data set, all 6 models show very little decline in area under the ROC curve, and 5 of the 6 models retain excellent discrimination.
Model calibration is demonstrated in the Figure. Five of the 6 models have calibration curves that are very similar to the line of identity. The model for excellent outcome as measured by the NIHSS calibrated less well and predicted a greater probability of excellent outcome than was observed in the validation data set for most of the range of predicted probabilities. For example, in the calibration curve "Excellent NIH Outcome," at a model prediction of 70% probability of excellent outcome (fourth point on the curve), only about 40% of patients were actually observed to have excellent outcome by the NIHSS. Hosmer-Lemeshow tests of the calibration accuracy indicate that 5 of the 6 models were detectably (P<0.05) less than perfect, with all of the models tending to predict better outcomes than were observed.
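The calibration assessment can be sketched as follows: patients are sorted by predicted probability and split into groups, each group contributes one point (mean predicted probability versus observed outcome rate) to the calibration curve, and a Hosmer-Lemeshow-type statistic sums the squared gaps between observed and expected counts. The grouping into equal-sized bins, the function names, and the group size of 30 in the example are assumptions for illustration; the study's actual grouping is not stated here.

```python
from typing import List, Sequence, Tuple

Group = Tuple[int, float, float]  # (group size, mean predicted, observed rate)


def calibration_groups(predicted: Sequence[float],
                       observed: Sequence[int],
                       n_groups: int = 10) -> List[Group]:
    """Split patients into risk-ordered groups of (nearly) equal size."""
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    groups: List[Group] = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # fewer patients than groups
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        obs_rate = sum(y for _, y in chunk) / len(chunk)
        groups.append((len(chunk), mean_pred, obs_rate))
    return groups


def hosmer_lemeshow_statistic(groups: Sequence[Group]) -> float:
    """Sum over groups of (observed - expected)^2 / (n * p * (1 - p))."""
    stat = 0.0
    for size, mean_pred, obs_rate in groups:
        variance = size * mean_pred * (1.0 - mean_pred)
        if variance == 0.0:
            continue  # a group predicted at exactly 0 or 1 carries no variance
        expected = size * mean_pred
        observed_count = size * obs_rate
        stat += (observed_count - expected) ** 2 / variance
    return stat


# One group resembling the "Excellent NIH Outcome" example: predicted 70%,
# observed about 40%. The group size of 30 is hypothetical.
example_stat = hosmer_lemeshow_statistic([(30, 0.70, 0.40)])
```

A perfectly calibrated group (observed rate equal to mean prediction) contributes zero, so points far below the 45° line, as in the example, dominate the statistic. The statistic is then compared with a chi-square distribution to obtain a P value.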
Discussion
These 6 models have now been validated in an external data set. Five of the models have excellent discrimination (ROC area ≥0.8). The model for devastating outcome defined by the NIHSS (ROC area=0.75) likely discriminates less well due to the small number of devastating outcomes in the model development data set. Five of the 6 models appear to be reasonably well calibrated, as shown in the calibration curves. The Hosmer-Lemeshow test demonstrates that these are not perfectly calibrated but the calibration appears to be adequate.
The validation population clearly had more severe strokes as demonstrated by both the NIHSS and infarct volume. Although the model discrimination was affected little by this, the poorer calibrations in the 6 models compared with the original models may be related to this difference. The fact that the models had such good discrimination and adequate calibration in 5 of the 6 models, even in such a different population, supports the generalizability of these models.
A predictive tool that would allow an accurate prediction of individual outcome at 3 months could be very useful in clinical research. The great heterogeneity among stroke patients may contribute to difficulty identifying treatment effects in clinical trials.10,11 Heterogeneity in a randomized clinical trial is mathematically expected to result in an underestimate of the treatment effect.11 A predictive model that could adjust for heterogeneity would allow a less biased estimate of the treatment effect and demonstrate a larger treatment effect in the same sized sample.
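One standard way to see why unadjusted heterogeneity can shrink an estimated treatment effect is to work on the odds-ratio scale. The sketch below is an illustration under stated assumptions, not the derivation in reference 11: it assumes two equal-sized prognostic strata, the same treatment odds ratio within each stratum, and hypothetical control-arm risks of 10% and 80%.

```python
from typing import Sequence


def odds(p: float) -> float:
    """Convert a probability to odds."""
    return p / (1.0 - p)


def risk(o: float) -> float:
    """Convert odds back to a probability."""
    return o / (1.0 + o)


def unadjusted_odds_ratio(control_risks: Sequence[float],
                          within_stratum_or: float) -> float:
    """Odds ratio seen when equal-sized strata are pooled without adjustment.

    control_risks:     probability of good outcome without treatment, per stratum
    within_stratum_or: treatment odds ratio, assumed identical in every stratum
    """
    treated_risks = [risk(odds(p) * within_stratum_or) for p in control_risks]
    pooled_control = sum(control_risks) / len(control_risks)
    pooled_treated = sum(treated_risks) / len(treated_risks)
    return odds(pooled_treated) / odds(pooled_control)


# Hypothetical trial: mild and severe strata, true within-stratum odds ratio of 3.
heterogeneous = unadjusted_odds_ratio([0.10, 0.80], 3.0)
# Same overall control risk (45%) but no heterogeneity between strata.
homogeneous = unadjusted_odds_ratio([0.45, 0.45], 3.0)
```

With these hypothetical numbers the pooled odds ratio is about 1.73 even though treatment triples the odds within each stratum, whereas the homogeneous population recovers the full odds ratio of 3. Adjusting for a prognostic score that separates the strata is what restores the within-stratum estimate.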
The small number of subjects and of the least frequent outcomes in both data sets limits how well the models can predict outcome. These models are also limited by the fact that only 5 of the 7 variables used in the prediction were collected acutely. Infarct volume and stroke subtype were collected at 1 week. Although a prediction of 3-month outcome is still valuable to the clinician at 1 week, this limits the use of these models in acute stroke clinical research. The dichotomized outcomes, though identifying the extreme outcomes (excellent outcome suggesting full or nearly full recovery; devastating outcome suggesting nursing home level disability or death), are not designed to predict the clinically relevant recovery levels in between. These extreme outcomes using the NIHSS, BI, and GOS are most useful in the clinical research realm in that they allow more reliable comparisons of standardized outcomes. These models, therefore, function as the proof of concept that predictive models can be developed and then internally and externally validated for the prediction of 3-month outcome. Future models must now be developed using variables that are all available in the acute setting.
Acknowledgments
Dr Johnston is supported by the National Institutes of Health, National Institute of Neurological Disorders and Stroke (grant No. K23NS02168-01).
The NINDS rt-PA Stroke Trial was funded by the National Institutes of Health, National Institute of Neurological Disorders and Stroke through contracts to the participating sites. The RANTTAS study was supported, in part, by the National Institutes of Health, National Institute of Neurological Disorders and Stroke (grant No. R01-NS31554), and Pharmacia and Upjohn Company (Kalamazoo, Mich).
The authors gratefully acknowledge the contribution of the NINDS rt-PA Stroke Trial investigators and the RANTTAS investigators, without whose efforts this work would not have been possible.
Footnotes
This work was presented, in part, at the 27th International Stroke Conference of the American Heart Association, San Antonio, Texas, February 7, 2002.
Received May 28, 2002; revision received July 24, 2002; accepted August 2, 2002.