Letter by Venema et al Regarding Article, “Validating a Predictive Model of Acute Advanced Imaging Biomarkers in Ischemic Stroke”
To the Editor:
We read with great interest the study by Bivard et al,1 in which the authors developed a model that predicts functional outcome in patients with acute ischemic stroke based on clinical and advanced imaging characteristics. Prognostic modeling for acute ischemic stroke is a highly relevant topic and may be useful in identifying individual patients with little or no benefit from reperfusion therapy. Unfortunately, we noted several methodological shortcomings in the model development and validation. We will discuss some of these and propose recommendations for future prediction studies in stroke research.
One of the first steps in prediction modeling is the coding of predictors. Bivard et al1 dichotomized the continuous predictors (age, baseline National Institutes of Health Stroke Scale score, time from symptom onset, and computed tomography perfusion volumes), although it is well known that dichotomization of continuous variables at cutoff values leads to unnecessary loss of information.2 Moreover, the data-driven approach of defining cutoff values in the same cohort in which the model is developed results in overfitting and unstable predictions.3 Overfitting produces models whose performance in new patients is often worse than expected based on the apparent performance in the derivation cohort.
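A minimal sketch of this information loss, on simulated data (not the cohort of Bivard et al; all variable names and parameters are hypothetical), compares a logistic model that keeps a predictor continuous with the same model after a data-driven dichotomization:

    # Hypothetical illustration: continuous vs dichotomized predictor.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(0)
    n = 1000
    age = rng.uniform(40, 90, n)                      # continuous predictor
    p = 1 / (1 + np.exp(-(-6 + 0.08 * age)))          # risk rises smoothly with age
    y = rng.binomial(1, p)                            # simulated poor outcome

    X_cont = age.reshape(-1, 1)
    X_dich = (age > np.median(age)).astype(float).reshape(-1, 1)  # data-driven cutoff

    auc_cont = roc_auc_score(y, LogisticRegression().fit(X_cont, y).predict_proba(X_cont)[:, 1])
    auc_dich = roc_auc_score(y, LogisticRegression().fit(X_dich, y).predict_proba(X_dich)[:, 1])
    print(f"AUC continuous {auc_cont:.3f} vs dichotomized {auc_dich:.3f}")

The dichotomized model discards all risk variation within each half of the age range, which is reflected in a lower area under the curve.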
Another important step is internal validation, which gives insight into the degree of overfitting. All modeling steps that are based on the data should be included in the validation process, that is, the coding of predictors, the selection of predictors, and the fitting of the final regression model. It seems that Bivard et al1 did not validate the model as fitted in the derivation cohort. Instead, they refitted the model in the validation cohort, which yields new regression coefficients that are optimal for the validation data. Furthermore, a split-sample approach to validation is statistically inefficient, given that less than half of the total cohort is then used for model development. Bootstrapping is preferred because it uses all the data, resulting in more stable estimates and lower bias.4
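A minimal sketch of bootstrap internal validation in the style of Harrell's optimism correction (again on hypothetical data; in a real application every data-driven step, including cutoff selection, would be repeated inside the loop):

    # Hypothetical illustration: optimism-corrected AUC via the bootstrap.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    rng = np.random.default_rng(1)
    n = 300
    X = rng.normal(size=(n, 4))                       # hypothetical predictors
    y = rng.binomial(1, 1 / (1 + np.exp(-X[:, 0])))   # simulated outcome

    model = LogisticRegression().fit(X, y)
    apparent = roc_auc_score(y, model.predict_proba(X)[:, 1])

    optimism = []
    for _ in range(200):                              # bootstrap replications
        idx = rng.integers(0, n, n)                   # resample with replacement
        m = LogisticRegression().fit(X[idx], y[idx])  # refit the whole modeling process
        auc_boot = roc_auc_score(y[idx], m.predict_proba(X[idx])[:, 1])  # apparent, in resample
        auc_orig = roc_auc_score(y, m.predict_proba(X)[:, 1])            # tested on full data
        optimism.append(auc_boot - auc_orig)

    print(f"apparent AUC {apparent:.3f}; optimism-corrected {apparent - np.mean(optimism):.3f}")

In this way all patients contribute to both model development and validation, and the expected drop in performance in new patients is estimated directly.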
The validation procedure reveals very large differences in model performance between the derivation and the validation cohort, for example, an area under the curve of 0.894 versus 0.727 in the thrombolysis patients for the outcome modified Rankin Scale score of 5 to 6. The authors state that these areas under the curve do not differ significantly between the derivation and the validation cohort, based on the Hanley test. However, this test was specifically developed to compare areas under the curve in the same sample of patients.5 In our opinion, the large absolute differences in area under the curve actually confirm the suspicion that the model is strongly overfitted, resulting in poor external validity. It therefore seems too early to conclude that the model was successfully replicated in the validation cohort.
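To sketch the distinction (in our notation, not that of the original articles): the Hanley-McNeil statistic for two areas A1 and A2 estimated in the same patients incorporates the correlation r between the two area estimates,

    \[
      z = \frac{A_1 - A_2}{\sqrt{\mathrm{SE}_1^2 + \mathrm{SE}_2^2 - 2\,r\,\mathrm{SE}_1\,\mathrm{SE}_2}},
    \]

whereas areas estimated in independent derivation and validation cohorts have no such pairing, so r is undefined and the paired test does not apply.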
Given the methodological shortcomings in the model development and validation, the findings of this study should be interpreted with care. The proposed thresholds should only be applied in clinical practice or in trials after the model has shown good performance in extensive and correctly performed external validation.
Esmee Venema, MD
Maxim J.H.L. Mulder, MD
Hester F. Lingsma, PhD
Erasmus MC University Medical Center
Rotterdam, the Netherlands