Initial Lesion Volume Is an Independent Predictor of Clinical Stroke Outcome at Day 90
An Analysis of the Virtual International Stroke Trials Archive (VISTA) Database
Background and Purpose—Age and National Institutes of Health Stroke Scale early after stroke onset have been identified as important determinants of final stroke outcome. We analyzed the Virtual International Stroke Trials Archive (VISTA) database to define the influence of infarct or hemorrhagic volume on clinical outcome after stroke.
Methods—All patients were extracted from VISTA where infarct or hemorrhage volume information was available (n=2538; most images obtained by CT within 72 hours after stroke onset with a subset of MRI data included, volumes calculated by the ABC/2 approximation method). We used multivariate regression models to study the influence of age, National Institutes of Health Stroke Scale at baseline, and initial infarct/hemorrhage volume on clinical outcome (modified Rankin Scale, National Institutes of Health Stroke Scale, mortality) at day 90.
Results—We find that in a large cohort of >1800 patients with ischemic stroke, initial lesion size is a strong and independent predictor of stroke outcome in a statistical regression model that also accounts for age and National Institutes of Health Stroke Scale at baseline (P<0.0001). The use of infarct/hemorrhage volume as an additional predictive factor further reduces the fraction of unexplained variance in outcome by approximately 15% (R2 of 0.41 versus 0.26 in a model without lesion volume). The predictive strength of initial lesion size is only marginally influenced by image modality or time point of image acquisition within the first 72 hours. The model was equally valid for both ischemic and hemorrhagic strokes.
Conclusions—Infarct/hemorrhage volume at baseline together with age and National Institutes of Health Stroke Scale at baseline should be used in the effect analysis of future therapeutic stroke trials to improve power.
Apart from all other possible reasons for the failure of a large number of stroke trials, one large problematic area is the statistical analysis of trial outcome. In the majority of past trials, very conservative and demanding approaches to treatment effects have been used, such as responder analyses using a dichotomized modified Rankin Scale (mRS) with little adjustment for the considerable patient heterogeneity. In more recent trials, awareness of approaches with higher power has increased, for example analyzing changes over the whole mRS or using covariate models.
Two main factors influencing outcome have emerged very clearly from analyses of large data sets, age and National Institutes of Health Stroke Scale (NIHSS) at baseline.1 A number of smaller studies have suggested effects of initial infarct size2 on clinical outcome. We sought to explore the influence of infarct and hemorrhage volume at baseline in a large set of patients with stroke, the Virtual International Stroke Trials Archive (VISTA). In addition, we systematically identify all influential baseline variables on final outcome and rate their value in terms of power gain.
VISTA is a large database that has collected data from patients with stroke from 29 anonymized acute stroke trials and 1 acute stroke registry with data on >27 500 patients.3,4 VISTA's aim is to ultimately improve stroke care and trial design by the collection and analysis of very large data sets from clinical trials for the treatment of stroke. A considerable number of studies have been performed already on the VISTA data sets that have yielded several important insights into various questions of outcome.3 We have examined all patients in VISTA with baseline imaging information available.
Materials and Methods
All data sets that had baseline imaging information were extracted from the VISTA database (April 2009 version). Data were imported to and analyzed with JMP 8.01 (SAS Institute). The following groups of parameters were imported from VISTA: medical history (hypertension, ischemic heart disease, atrial fibrillation, congestive heart failure, diabetes, transient ischemic attack, smoker, previous stroke, myocardial infarction), imaging data (type of scan at various time points, time of scan acquisition, lesion volumes, infarct location, hemorrhage location), demographics (gender, age, ethnicity), and stroke specifics (stroke type, NIHSS [baseline and day 90], mRS [prestroke and day 90], Barthel Index at day 90, recombinant tissue-type plasminogen activator [rtPA] administration, onset of rtPA administration, day of death, mortality at day 90). For the analysis, the parameter ethnicity was reduced to the categories “Caucasian,” “black,” “Asian,” and “other” due to the very large number of small categories. A number of dependent variables were studied: mRS at day 90, dichotomized mRS at day 90, NIHSS at day 90, Barthel Index at day 90, and mortality. NIHSS at day 90 was analyzed by multiple linear regression. Mortality was analyzed using ordinal logistic regression. Modeling of the mRS at day 90, our main analysis, was performed as ordinal logistic regression. Additionally, we applied linear regression to better estimate the residual variance of the model. Except for the ordinal or continuous nature of the y-variable and the correspondingly different model, all covariates were identical across the different clinical outcomes at day 90. Stepwise regression was applied in the forward direction with an initial probability to enter of 0.250. A model was then constructed from the selected terms, and all effects with P<0.05 were kept as an extended model.
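For readers less familiar with ordinal logistic regression, the usual cumulative-logit (proportional odds) form is written out below. This is a sketch under the assumption that the standard parameterization was used; the covariate vector $x$, the thresholds $\alpha_j$, and the sign convention are illustrative and not taken from the article. The second line gives the likelihood-ratio form that is commonly reported as R2 (U), with $\ell$ denoting the maximized log-likelihood; whether this matches the software output exactly is likewise an assumption.

```latex
% Cumulative-logit model for an ordinal outcome Y (e.g. mRS 0..6 at day 90)
\operatorname{logit} P(Y \le j \mid x) = \alpha_j + \beta^{\top} x,
\qquad j = 0, \dots, 5

% Likelihood-based coefficient of determination (assumed definition of R^2(U))
R^2_{(U)} = 1 - \frac{\ell_{\text{model}}}{\ell_{\text{null}}}
```

Because this R2 is based on likelihoods rather than on explained variance, it is not directly comparable with the R2 of the linear models.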
Calculation of Lesion Volumes
All volumes (cm3/mL; infarcts and hemorrhages) were calculated using a simplification of the ellipsoid rule, that is, A×B×C/2, in which A, B, and C represent the diameters (cm) of the CT hyperdensity in 3 axes perpendicular to each other. Intraventricular hemorrhage was not included in the intracerebral hemorrhage volume calculation. If >1 infarct/hemorrhage was present, the volumes were summed. Intraventricular hemorrhage was measured as follows: the degree of intraventricular hemorrhage was calculated using a 4-graded scoring system (0–3; categories: no blood [0 points], sedimentation of blood [1 point], partly filled with blood [2 points], completely filled with blood [3 points]) as defined by Hijdra et al5 in which the amount of blood is determined for the third, fourth, and each lateral ventricle and the 4 scores are summed. A total score of 0 indicates no intraventricular blood and the maximum score of 12 indicates a completely blood-filled ventricular system. All measurements were done centrally by neuroradiologists trained in methods for determining infarct and hemorrhage volumes. In the studies incorporated in VISTA and used in our study in which MRI was used, T2-weighted imaging was used for lesion volume determination.
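The volume and intraventricular grading rules described above can be illustrated with a minimal sketch. The function names and the example diameters are hypothetical and chosen only for illustration; they do not come from the VISTA data.

```python
from typing import Iterable, Tuple


def abc2_volume(a_cm: float, b_cm: float, c_cm: float) -> float:
    """Approximate one lesion's volume in cm3 (= mL) as A*B*C/2."""
    if min(a_cm, b_cm, c_cm) < 0:
        raise ValueError("diameters must be non-negative")
    return a_cm * b_cm * c_cm / 2.0


def total_lesion_volume(lesions: Iterable[Tuple[float, float, float]]) -> float:
    """Sum the ABC/2 volumes when more than one lesion is present."""
    return sum(abc2_volume(a, b, c) for a, b, c in lesions)


def ivh_total_score(third: int, fourth: int,
                    left_lateral: int, right_lateral: int) -> int:
    """Sum the 0-3 grades of the four ventricles (total range 0-12)."""
    grades = (third, fourth, left_lateral, right_lateral)
    if any(g not in (0, 1, 2, 3) for g in grades):
        raise ValueError("each ventricle grade must be 0, 1, 2, or 3")
    return sum(grades)


if __name__ == "__main__":
    # Hypothetical patient: two lesions and blood sediment in one ventricle.
    print(total_lesion_volume([(4.0, 3.0, 2.0), (2.0, 2.0, 1.0)]))
    print(ivh_total_score(0, 0, 1, 0))
```

The intraventricular score is kept separate from the volume, mirroring the text, where intraventricular blood is not added to the intracerebral hemorrhage volume.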
Description of the Data Set
At the time of data compilation, data from 28 131 patients were stored in the acute section of VISTA. We screened these patients for the following eligibility criteria: baseline NIHSS, age, baseline CT/MRI scan available, mRS at 90 days, NIHSS score at 90 days, gender, stroke onset to treatment time, type of stroke, medical history, and mortality. This resulted in a data set of 2538 patients with complete data for all variables on which this analysis was based.
Of the 2538 patients, 297 patients had 1 scan in the 0- to 24-hour time window, 1085 patients had 1 scan in the 24- to 72-hour window, and 1156 patients had scans at both time points. Therefore, a total of 1453 scans were available for the 0- to 24-hour time window and 2241 scans for the 24- to 72-hour time window. A total of 1470 patients in the data set had received rtPA; in this subset, 2 scans were available from 278 patients, a 0- to 24-hour scan only for 129 patients, and a 24- to 72-hour scan only for 1063 patients. The vast majority of scans were done by CT; only 6% of the scans were done by MRI. In case of 2 scans, the maximal lesion volume was taken for all main analyses. All main analyses are therefore based predominantly on the later scans.
The majority of patients (97.5%) had a prestroke mRS of 0 (no symptoms at all). Roughly 1% of patients had a prestroke mRS of 1; the rest was ≥2. We restricted our analyses to patients with prestroke mRS of 0 and 1, the selection criterion used in nearly all stroke trials. Twenty-six patients had no information regarding hemispheric location and were excluded from the analysis data set.
A total of 1938 patients had ischemic lesions and 600 had hemorrhages. Distribution of patient characteristics in the data set is given in Table 1. For the whole data set, mean age of the patients was 67.5 years, 58% of the patients were male, and roughly 90% were of Caucasian origin. Mean NIHSS was 14.6 points. For the ischemic stroke population only, mean age was 67.9 years, 56% of the patients were male, 90% were of Caucasian origin, NIHSS (mean) was 14.9, and 76% of the patients had received rtPA treatment. mRS and NIHSS at day 90 were reasonably correlated (Pearson coefficient of 0.79, R2=0.58). Infarct location information in VISTA was available for 548 patients with ischemic strokes out of 1938 patients. Of those 548, 9 (2%) had brain stem or cerebellar infarct location. In addition, only 9 right and 17 left occipital locations were noted. The overwhelming majority of the infarcts therefore are supratentorial and in the middle cerebral artery or anterior cerebral artery territory.
A total of 1938 of the strokes were classified as ischemic. We first applied a core multiple regression model consisting of the factors age, NIHSS at baseline, and infarct volume and studied effects on mRS at day 90. rtPA treatment was added as a proven medical intervention. Infarct volume was log-transformed to approximate normal distribution. If 2 scans were done within the first 72 hours, the maximal volume was taken for the analysis.
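The preparation of the volume covariate described above (maximal volume across scans, then log transformation) can be sketched as follows. The article does not state the base of the logarithm or how zero volumes were handled, so the natural logarithm and the rejection of non-positive volumes are assumptions made here; the function names are hypothetical.

```python
import math
from typing import Optional


def analysis_volume(early_cm3: Optional[float],
                    late_cm3: Optional[float]) -> float:
    """Return the maximal lesion volume over the available scans.

    early_cm3: volume from the 0- to 24-hour scan, or None if not done.
    late_cm3:  volume from the 24- to 72-hour scan, or None if not done.
    """
    volumes = [v for v in (early_cm3, late_cm3) if v is not None]
    if not volumes:
        raise ValueError("at least one scan volume is required")
    return max(volumes)


def log_volume(volume_cm3: float) -> float:
    """Natural log of the volume (assumed base; zero handling is assumed)."""
    if volume_cm3 <= 0:
        raise ValueError("volume must be positive to take a logarithm")
    return math.log(volume_cm3)


if __name__ == "__main__":
    # Hypothetical patient with two scans: the larger volume enters the model.
    v = analysis_volume(12.0, 30.0)
    print(v, log_volume(v))
```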
All 4 factors had a significant influence on mRS at day 90 (ordinal logistic regression model, coefficient of determination R2 [U]=0.12; n=1838). Infarct volume was the variable with the highest significance followed by age and NIHSS at baseline (Table 2). To estimate the relative weight of each factor, we calculated a mean weight estimate ([mean–minimum value]×unit estimate) for each factor based on linear modeling of the mRS. The factor with the highest weight was age followed by infarct size and NIHSS at baseline.
We next studied our base model for the second type of outcome required by the authorities and used in most stroke trials as the coprimary or secondary end point, the NIHSS (Table 2). A total of 1452 patients with ischemic stroke had information on the NIHSS at day 90. The linear model had a coefficient of determination R2 of 0.32, and all factors in our base model highly significantly influenced outcome in the order infarct volume>age>NIHSS at baseline. Figure 1 gives an impression of the dependence of outcome in all 3 clinical outcome scales on initial lesion volume in this linear model in patients at an age of 68 years and an initial NIHSS of 15. It is remarkable that even for the NIHSS outcome at day 90, NIHSS at baseline is the least significant factor. The weight order of the factors was the same as described previously. Although linear regression provides the easiest and most straightforward statistical approach, this does not mean that the linear relationship depicted here is the best possible fit for the data. See “Spline Modeling of Effects” subsequently for further details.
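The mean weight estimate defined above, (mean minus minimum value) times the unit estimate, can be illustrated with a short sketch. The ages and the regression coefficient below are hypothetical placeholders, not values from Table 2.

```python
from statistics import mean
from typing import Sequence


def mean_weight_estimate(values: Sequence[float], unit_estimate: float) -> float:
    """(mean - minimum) of a covariate multiplied by its per-unit estimate.

    values:        observed values of one covariate (e.g. age in years)
    unit_estimate: linear-model coefficient per unit of that covariate
    """
    if not values:
        raise ValueError("values must not be empty")
    return (mean(values) - min(values)) * unit_estimate


if __name__ == "__main__":
    ages = [40, 60, 80]        # hypothetical ages in years
    age_coefficient = 0.03     # hypothetical mRS points per year of age
    print(mean_weight_estimate(ages, age_coefficient))
```

Expressing each factor as the shift in outcome between its minimum and its mean puts covariates measured in different units on a common scale, which is what allows them to be ranked against each other.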
When using other possible endpoints such as the Barthel Index (modeled as linear), or the dichotomized mRS (using either 0 and 1 or 0–2 as a good outcome), the strongly significant influence of those 3 factors plus rtPA treatment remained stable. The model also remained valid when examining mortality until day 90 (n=1774).
Additional Predictors of Outcome
To identify other influential factors, we applied a stepwise model procedure for mRS at day 90 (ordinal logistic model). Five factors were identified that had additional influence on outcome. Diabetes was the strongest of these covariates (P<0.000001), followed by ethnicity (worst outcome Asian), history of previous stroke, gender, and history of transient ischemic attack. Fit of the model, however, improved only marginally over the base model described in the previous paragraph (factors age, NIHSS, rtPA treatment, infarct volume) by adding those factors (coefficient of determination from logistic regression R2 [U]=0.13; n=1729 versus R2 [U]=0.12 in the base model).
We also determined additional influential parameters for the final outcome NIHSS by using stepwise linear regression. Three other parameters were influential: diabetes (yes—worse outcome), history of previous stroke (yes—worse outcome), and ethnicity (Asian—worst outcome). Again, these factors had limited additional value for improving the model (R2=0.34, n=1795 versus the linear regression base model R2=0.32). History of transient ischemic attacks and gender failed significance for NIHSS as the end point.
Using Barthel Index as the end point (modeled as linear) also yielded the same additional minor cofactors: diabetes, previous stroke, and ethnicity.
Using mortality as the outcome, only diabetes remained as an additional significant covariate. In all following analyses, we used the most conservative core model consisting of infarct volume, age, rtPA treatment, NIHSS at baseline, and diabetes.
Influence of Imaging Modality and Scan Time
We tested different input alterations in the model with the end point mRS to ensure stability. Using lesion volume obtained only during the first 0 to 24 hours resulted in comparable model parameters, although the number of input patients was considerably reduced (n=782). R2 (U) was 0.14 by ordinal logistic regression. Infarct volume remained the most influential parameter.
A total of 484 patients with ischemic stroke had undergone repeat CT scanning in the time period of 0 to 24 hours and 24 to 72 hours after stroke onset. We asked whether the early or later scans provided a better prediction of final outcome. Model fit was improved by 16.2% by using the later scan times (mRS day 90 as outcome, ordinal logistic regression, base model). In hemorrhagic strokes, the gain in using later scans was smaller (9.5% based on 472 patients), possibly explained by the smaller difference between the initial and developing hemorrhage in contrast to ischemic lesions.
Infarct volume in most of our patients was derived from CT scans. We therefore examined whether the influence of infarct volume remained valid with MRI scans. Our VISTA selection contained 128 patients with MRI data available; 73 of those had MRI scans performed within 24 hours after stroke onset. The maximal infarct volume from 0 to 72 hours after stroke onset had a significant influence on outcome prediction (mRS; R2 [U]=0.15 by ordinal logistic regression). Here too, infarct volume had a greater influence than baseline NIHSS. In this set of patients, diabetes was not an influential parameter.
Because the proportion of patients receiving rtPA treatment was very high in our population, we also tested the model in non-rtPA patients only. The model was also valid here (ordinal logistic regression R2 [U]=0.20), again with infarct volume as the most influential predictor. Also, there was no interaction between factors rtPA use and initial lesion volume for outcome prediction, implying that lesion volume is of comparable value in both patient populations for predicting outcome if rtPA is accounted for in the model.
For the imaging outcome at day 90, 266 data entries were available. All 3 core parameters had an influence on imaging outcome with lesion volume at baseline carrying the overwhelming majority of the effect. The resulting linear regression model has an excellent R2 of 0.77.
We conclude that the model described here is stable regarding different outcome scales, different imaging subsets, and different input sets of patients.
Spline Modeling of Effects
To get a better appreciation of how the different effectors influenced outcome, we allowed spline fitting of the effector curves for the continuous variables NIHSS at baseline, age, and infarct volume (Figure 2A). The ordinal logistic model on mRS had an R2 (U) of 0.15. Although age and infarct volume displayed a rather continuous relationship to mRS outcome at day 90 with a smooth increase in curve slope toward higher age (steeper increase beyond 75 years) and higher infarct volumes (beyond approximately 90 cm3), the influence of NIHSS appeared discontinuous with an initial linear relationship to mRS until an NIHSS of 11, a flat relationship between 11 and 16, a steep linear part between 16 and 20, and a flat linear response beyond 20. A similar pattern was observed when regarding NIHSS at day 90 as the outcome (Figure 2B).
Effects in Hemorrhages
We finally examined predictivity of our model for hemorrhagic stroke. Using the stepwise fit model procedure, we identified that the following parameters had a significant influence on mRS (model R2 [U]=0.15): age, NIHSS at baseline, hemorrhage volume, diabetes, hemisphere location (right worse outcome), and previous stroke (yes=worse outcome). Hemisphere takes away some of the effect size of the initial hemorrhage volume, because right-sided hemorrhages are significantly larger.
When regarding NIHSS at day 90 as the end point, we discovered significant influences of hemorrhage volume at baseline, age, NIHSS at baseline, and previous stroke (R2=0.35, n=572). For Barthel Index, previous stroke and hemisphere location were additional influential parameters.
Mortality at day 90 was 20% in the hemorrhagic stroke subset. The strongest predictors of mortality (in that order) were hemorrhage volume, age, and previous stroke. NIHSS at baseline had no significant influence on mortality when hemorrhage volume was present as a cofactor but had a strong influence when hemorrhage volume was removed from the list of covariates. This implies that the influence of NIHSS at baseline on mortality is fully taken over by hemorrhage volume and better explained by this variable.
Allowing spline curve fitting in the model increased the overall predictivity of the model (R2 [U]=0.16) and showed the same profile of age, hemorrhage volume, and NIHSS baseline influence as in patients with ischemic stroke. Similar to ischemic strokes, there was a “flat” phase of the NIHSS baseline curve that went from approximately 14 to 20 points. Thus, our core model is also valid for hemorrhagic strokes.
When entering the total intraventricular hemorrhage grade into our model together with the hemorrhage volume, this increased model fit slightly (by approximately 10%). Adding intraventricular hemorrhage grade to hemorrhage volume is therefore of additional value in predicting outcome more exactly.
We finally applied the model to all patients in the database irrespective of stroke type. Again, the model proved highly valid with infarct/hemorrhage volume as the most influential factor followed by age and NIHSS at baseline. Diabetes and history of previous stroke were also significant covariates. Stroke type (ie, ischemic or hemorrhagic stroke) had only a very minor influence on outcome, which was detected only when using spline modeling.
Estimation of Power Gain by Infarct/Hemorrhage Volume as an Explanatory Covariate
To estimate the power gain for stroke trials aimed at detecting therapeutic efficacy, we used linear modeling of the mRS at day 90 in the population of patients with ischemic stroke who did not receive rtPA treatment. We derived the number of patients needed to detect differences of 0.5 or 0.25 points effect size on the mRS with a power of 0.8 from the remaining error in the model (Table 3). The full core model (infarct volume, age, and NIHSS baseline) had a fraction of unexplained variance in outcome of 59%, resulting in 1241 patients needed for detecting a 0.25-point difference, whereas without using any covariates, 2038 patients would be needed. Using NIHSS at baseline as the only covariate leads to a fraction of unexplained variance of 80%, and 1640 patients would be needed for detecting a difference of 0.25 points on the mRS. Adding infarct volume to a model composed of age and NIHSS increases R2 from 0.26 to 0.41, meaning that an additional 15% of outcome variability is explained by baseline parameters alone.
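The link between unexplained variance and required sample size can be illustrated with a textbook normal-approximation formula for comparing two means. This is only a sketch under stated assumptions: two equal arms, a two-sided alpha of 0.05, and a hypothetical unadjusted mRS standard deviation of 2.0, none of which are given in the article. It is not expected to reproduce Table 3 exactly.

```python
import math
from statistics import NormalDist


def total_sample_size(delta: float, sd: float, r_squared: float = 0.0,
                      alpha: float = 0.05, power: float = 0.8) -> int:
    """Total patients (two equal arms) needed to detect a mean difference.

    delta:     difference to detect on the outcome scale (e.g. mRS points)
    sd:        unadjusted standard deviation of the outcome (assumed value)
    r_squared: fraction of outcome variance explained by baseline covariates
    """
    if delta <= 0 or sd <= 0 or not 0 <= r_squared < 1:
        raise ValueError("invalid delta, sd, or r_squared")
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    residual_variance = sd ** 2 * (1 - r_squared)   # variance left unexplained
    n_per_arm = 2 * z ** 2 * residual_variance / delta ** 2
    return 2 * math.ceil(n_per_arm)


if __name__ == "__main__":
    for label, r2 in [("no covariates", 0.0),
                      ("age + NIHSS", 0.26),
                      ("age + NIHSS + lesion volume", 0.41)]:
        print(label, total_sample_size(delta=0.25, sd=2.0, r_squared=r2))
```

In this approximation the required sample size scales with the unexplained fraction 1 − R2, which is why adding a strong baseline covariate translates directly into fewer patients or, at a fixed sample size, into higher power.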
We have shown in a large and heterogeneous data set that infarct or hemorrhage volume at baseline is a strong predictor of clinical outcome at day 90. This factor is more influential than age and NIHSS at baseline and has a more continuous relationship to clinical outcome than NIHSS.
A number of studies have already examined potential predictivity of baseline lesion volume for clinical outcome (13 studies; reviewed in Schiemanck et al2). Most of the studies found a predictive value except for 1.6 However, all these studies were small (patient numbers between 10 and 102) and used varying statistical methods and different outcome scales. Moreover, most did not systematically take into account the other important covariates of age and NIHSS at baseline. The first notion that infarct volume is a predictor of outcome comes from Saunders and colleagues in 1995 in a small study with 21 patients.7 They report that infarct volumes derived from T2-weighted images within 72 hours after stroke onset are correlated to outcome on the Scandinavian Stroke Scale categorized to 3 levels.
Thijs et al in a study on 63 patients are the first researchers who went beyond simple correlations, found a significant influence of all 3 of our core factors, and incorporated them into a statistical model.8 For the first time this work implicitly states that infarct volume is a factor that contributes independently from NIHSS to clinical outcome. Shortcomings of this study are that the authors did not account for rtPA treatment among their patients and that the patient population was not highly representative of the patients typically enrolled in stroke trials.
The study with the most patients examined is the one by Baird and colleagues.9 They identify infarct volume at baseline determined by diffusion-weighted imaging as a predictive factor and verify this in an independent cohort, although the number of patients was still low (n=129 patients total). However, they fail to account for the highly influential factor age in their model and unnecessarily categorize input factors and outcome into 2 or 3 step categories, thereby strongly reducing power of their analysis and making the meaning of the model difficult to understand.
In our study we have confirmed the results of these studies in a large and heterogeneous patient population and established the overwhelming stability of the factors age, NIHSS at baseline, and infarct volume at baseline for outcome prediction. Our core model is stable against all variations of statistical analysis, predictive of mortality and other outcome scales. Also, the effect of infarct volume is seen by both imaging modalities (MRI and CT), although the MRI subset seems to provide a slightly better model fit. Importantly, the model is also valid in hemorrhagic stroke.
A number of caveats and questions remain in the current analysis that should be addressed in future studies. First, infarct/hemorrhage volumes were determined using the less exact ABC/2 method, which introduces measurement error. In addition, data were pooled from different studies, likely introducing additional variance in the lesion volume data. Although an increase in nonsystematic variance does not interfere with the basic finding of an influence of infarct/hemorrhage volume on stroke outcome here, future studies should study lesion volumes obtained by more exact methodology (MRI and computer-assisted volumetry) to allow for a potential refinement of our model. Large-scale high-quality imaging data may soon be available from several ongoing trials. Second, it remains to be further explored which scanning time point after stroke would offer the optimal predictive value. We know from our repeat-scan comparison that, at least for CT, the scan taken on the second or third day allows for a much better prediction of stroke outcome. Also, although we have an indication from our data that MRI appears superior, it needs to be further explored which imaging modality is optimal for the purpose of outcome prediction (CT versus MRI, diffusion-weighted imaging or T2). One testable hypothesis we put forward is that the predictive value of the image increases with correlation to lesion volume at day 90. Finally, caution is needed before full generalization of the model, because only moderate to severe strokes in the supratentorial territory were contained in our data set. Further analyses need to explore how appropriate this model is for infratentorial and mild strokes. However, the population studied here is also the most relevant one for acute stroke trials, and the model should be most useful here.
We have found that simple infarct/hemorrhage volume is a powerful predictor of outcome, although functions in the brain that are being compromised by stroke are located in very specific areas, and the relative importance of 1 function lost is not well correlated to the volume of tissue it resides in. This is different in other organs like the liver where the amount of tissue lost is well linearly reflected in the functional loss (eg, detoxification capacity). Apart from the obvious connection of massive lesions to bad outcome because of edema formation and associated problems, it is less clear why also strokes of medium severity are predicted so well by the factor lesion volume. We propose that other factors that are linearly related to brain tissue play a role, both for the acute phase (one candidate being systemic immunosuppression10) and for regeneration. Functional recovery also of very localized functions in the brain may depend on the relative intactness of far larger networks. Besides the amount of tissue lost, the quality of function(s) lost (the NIHSS at baseline), however, remains as an independent predictor in our model, because this defines the domain range in which recovery is possible.
We conclude from our analysis that for all future stroke trials with therapeutic interventions, a core outcome model should be used incorporating age, infarct or hemorrhage volume at baseline, and NIHSS as predictors. This model has been chosen for end point analyses in the AX200 for the Treatment of Acute Ischemic Stroke (AXIS 2) trial (NCT00927836). Diabetes, previous transient ischemic attacks, and history of previous strokes may be included in a more extended model but are not as robust and influential as the 3 core parameters. The use of lesion volume as an additional predictive factor reduces the fraction of unexplained variability by 15%. Our model should be used to increase power for effect detection in acute clinical stroke trials rather than decrease sample size. It is to be hoped that this addition will increase the chances of finding therapies effective for stroke treatment in future studies.
We thank Dr Myzoon Ali from the VISTA database for her outstanding support during this study and all members of the VISTA steering committee for helpful comments and discussions on the article.
VISTA Steering Committee
A. Alexandrov, P. W. Bath, E. Bluhmki, L. Claesson, J. Curram, S. M. Davis, G. Donnan, H. C. Diener, M. Fisher, B. Gregson, J. Grotta, W. Hacke, M. G. Hennerici, M. Hommel, M. Kaste, K. R. Lees (Chair), P. Lyden, J. Marler, K. Muir, R. Sacco, A. Shuaib, P. Teal, N. G. Wahlgren, S. Warach, and C. Weimar.
Jeffrey L. Saver, MD, was the Guest Editor for this paper.
- Received November 29, 2011.
- Accepted January 17, 2012.
- © 2012 American Heart Association, Inc.
- Weimar C, Konig IR, Kraywinkel K, Ziegler A, Diener HC
- Schiemanck SK, Kwakkel G, Post MW, Prevo AJ
- Ali M, Bath PM, Curram J, Davis SM, Diener HC, Donnan GA, et al
- Hijdra A, Brouwers PJ, Vermeulen M, van Gijn J
- Wardlaw JM, Keir SL, Bastin ME, Armitage PA, Rana AK
- Saunders DE, Clifton AG, Brown MM
- Thijs VN, Lansberg MG, Beaulieu C, Marks MP, Moseley ME, Albers GW