The Role of Bolus Delay and Dispersion in Predictor Models for Stroke
Background and Purpose—In combination with diffusion-weighted imaging, perfusion-weighted imaging parameters are hypothesized to detect tissue at risk of infarction in patients with acute stroke. Recent studies have suggested that in addition to perfusion deficits, vascular flow parameters indicating bolus delay and/or dispersion may also contain important predictive information. This work investigates the infarct risk associated with delay/dispersion using multiparametric predictor models.
Methods—Predictor models were developed using specific combinations of perfusion parameters calculated using global arterial input function deconvolution (where perfusion is biased by dispersion), local arterial input function deconvolution (where perfusion has minimal dispersion bias), and parameters approximating bolus delay/dispersion. We also compare predictor models formed using summary parameters (which primarily reflect delay/dispersion). The models were trained on 15 patients with acute stroke imaged at 3 to 6 hours.
Results—The global arterial input function models performed significantly better than their local arterial input function counterparts. Furthermore, in a paired comparison, the models including the delay/dispersion parameter performed significantly better than those without. There was no significant difference between the best deconvolution model and the best summary parameter model.
Conclusions—Delay and dispersion information is important to achieve accurate infarct prediction in the acute time window.
- acute stroke
- cerebral blood flow
- cerebral hemodynamics
- diffusion-weighted imaging
- infarct prediction
- perfusion-weighted imaging
Early and accurate detection of tissue at risk of infarction is the principal aim of acute stroke imaging. This would provide important information to guide treatment choice and offer a surrogate method for comparing different treatment options. To this end, voxelwise predictor models combining perfusion-weighted imaging (PWI) and diffusion-weighted imaging (DWI) parameters1 have attracted considerable interest. These models may contain various perfusion parameters calculated using various PWI postprocessing methods, each of which offers markedly different information about tissue perfusion and vascular flow. The specific choice and combination of parameters within predictor models can critically affect their performance.
A standard analysis of bolus-tracking PWI data (ie, using the mathematical technique of deconvolution to calculate perfusion from a global arterial input function [AIF] and the concentration time course [CTC] data2) can significantly underestimate perfusion if there is any delay and/or dispersion (D/D) of the bolus as it passes through the supplying vessels.3 This may be caused by, for example, arterial stenosis or collateral flow.4 As such, the “standard” estimates of cerebral blood flow (CBF) and mean transit time (MTT) are actually a coupling of tissue perfusion and vascular flow properties. Although underestimation of perfusion caused by bolus delay can be minimized using delay-insensitive deconvolution such as block-circulant singular value decomposition (oSVD),5 any bolus dispersion between the AIF and the tissue will always bias tissue perfusion estimated using a global AIF (GAIF) analysis.
One approach to minimize dispersion error is to use multiple local AIFs located closer to the tissue and downstream of any arterial abnormalities. Local AIF (LAIF) approaches have been shown to increase the perfusion estimates.6 Furthermore, circumstantial evidence supports the notion that this is due to reduced D/D errors,7 implying more accurate perfusion measurement. Indeed, Lorenz et al8 found that for data acquired within 12 hours of symptom onset, LAIF-derived perfusion was more predictive of tissue outcome than GAIF-derived perfusion, suggesting that unbiased perfusion measurement is important to identify at-risk tissue. On the other hand, Christensen et al found that for data acquired within 6 hours, the first-moment of the CTC (FMCTC), which primarily reflects D/D, was the best single parameter predictor of outcome.9 Furthermore, several major clinical trials, for example, DEFUSE10 and EPITHET,11 have used the time-to-maximum of the deconvolved tissue residue function (Tmax) in their diffusion–perfusion mismatch definition. Similar to FMCTC, Tmax primarily reflects bolus D/D and has only mild dependence on tissue perfusion itself.12
The goals of this study were (1) to investigate the influence of both unbiased perfusion parameters and D/D information in predicting tissue infarction; and (2) to investigate the effectiveness of nondeconvolution or summary parameters in predicting tissue infarction. These investigations were conducted using voxelwise prediction algorithms (generalized linear models [GLM]1) developed using specific combinations of (1) LAIF- and GAIF-derived perfusion parameters with and without D/D parameters; and (2) summary parameters.
Materials and Methods
Tissue outcome prediction using a GLM analysis has been described previously in detail.1 In brief, the probability of future infarction, P, in each voxel is modeled as:

$$P = \frac{1}{1 + \exp\left[-\left(\alpha + \sum_k \beta_k x_k\right)\right]} \quad (1)$$

where $x_k$ is the acute value of parameter type k (DWI, CBF, etc), $\beta_k$ are the GLM coefficients, and $\alpha$ is the offset term. In the training phase, P is set to 1 or 0 for infarcted or noninfarcted tissue, respectively. For a set of acute parameter maps and known follow-up outcome images (known as training data), the GLM offset and coefficients are calculated by regression analysis. These are then used in Eq. 1 to assign infarct probabilities to new acute data.
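As a minimal sketch of how a fitted model of this kind assigns a probability to one voxel, the snippet below assumes the logistic form commonly used for such GLMs. The function name, coefficients, and voxel values are hypothetical and chosen only for illustration; they are not the fitted values from this study.

```python
import math

def infarct_probability(alpha, betas, x):
    """Logistic GLM: P = 1 / (1 + exp(-(alpha + sum_k beta_k * x_k)))."""
    eta = alpha + sum(betas[k] * x[k] for k in betas)
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical offset and coefficients (illustration only).
alpha = -2.0
betas = {"iDWI": 1.5, "CBF": -0.8}

# Hypothetical normalized acute parameter values for a single voxel.
voxel = {"iDWI": 1.4, "CBF": 0.5}

p = infarct_probability(alpha, betas, voxel)
print(f"Predicted infarct probability: {p:.3f}")
```

Applying the same function to every voxel of the acute parameter maps would yield an infarct probability map.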
In this work, the training data were formed from patients with stroke without reperfusion scanned as part of the multisite Echoplanar Imaging Thrombolytic Evaluation Trial (EPITHET).11 Patients with spontaneous reperfusion or reperfusion due to thrombolysis were excluded, because predictor models of acute imaging data cannot model these postimaging events. The EPITHET study protocol and patient consent procedures were approved by the local ethics committees. Details of the imaging protocols have been reported previously.11 In brief, patients were imaged on 1.5 T MRI scanners; a gadolinium-based contrast agent (0.2 mmol/kg) was intravenously injected and gradient-echo PWI images acquired every 1.4 to 2.5 seconds over 10 to 24 axial slices (5–7 mm thickness); isotropic diffusion images (iDWI) were created from DWI images (obtained with diffusion gradients applied in 3 orthogonal planes) with b-values of 0 s/mm² and 1000 s/mm² over 13 to 27 axial slices (5–7 mm). MR angiography (time-of-flight or phase-contrast) was also performed acutely and subacutely. Patient selection was limited to those who had acute (<6 hours) and subacute (Day 3–5) PWI and DWI and follow-up T2-weighted images (Day 90). Nonreperfusing patients were selected using the criterion of <50% reduction in the Tmax abnormality volume between the acute and subacute PWI as defined by Tmax ≥2 seconds.11 Fifteen patients were identified (median National Institutes of Health Stroke Scale (NIHSS), 15 [7–12]; median age 64 years [39–87 years]). The occlusion types were: 10 internal carotid artery, 3 middle cerebral artery, and 2 without occlusions.
GAIF-derived perfusion maps, CBV, Tmax, CBFG, and MTTG (where subscript G indicates the global AIF was used) were calculated using PENGUIN software (www.cfin.au.dk/software/penguin) with delay-insensitive oSVD deconvolution5 and a GAIF measured in a branch of middle cerebral artery. LAIF-derived perfusion maps, CBFL, and MTTL (where subscript L indicates local AIFs were used) were calculated using the automated process detailed in Willats et al.7 D/D was represented by the adjusted-FM parameter: (2) where FMR is the first-moment of the GAIF-deconvolved tissue residue function, R. FMadj is approximately independent of MTT, and approximately proportional to both delay and dispersion.7
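To make the timing parameters concrete, the following is a rough sketch of how a first moment and a time-to-maximum could be computed from a sampled curve such as a deconvolved residue function. The function names and the sample curves are hypothetical, a uniform time grid is assumed, and the adjustment that produces FMadj is deliberately not reproduced here.

```python
def first_moment(times, values):
    """Discrete first moment, sum(t * v) / sum(v), on a uniform time grid."""
    total = sum(values)
    if total == 0:
        raise ValueError("curve has zero area")
    return sum(t * v for t, v in zip(times, values)) / total

def time_to_max(times, values):
    """Time at which the curve reaches its maximum (Tmax-like measure)."""
    peak_index = max(range(len(values)), key=values.__getitem__)
    return times[peak_index]

# Hypothetical residue function sampled once per second.
times = [0, 1, 2, 3, 4, 5, 6]
r_undelayed = [0.0, 1.0, 0.5, 0.25, 0.0, 0.0, 0.0]
# The same curve arriving 2 s later, mimicking a pure bolus delay.
r_delayed = [0.0, 0.0, 0.0, 1.0, 0.5, 0.25, 0.0]

fm_undelayed = first_moment(times, r_undelayed)
fm_delayed = first_moment(times, r_delayed)
print(fm_delayed - fm_undelayed)  # shift in first moment caused by the delay
print(time_to_max(times, r_undelayed), time_to_max(times, r_delayed))
```

In this toy case a pure delay shifts both measures by the same amount, which illustrates why such parameters are described as primarily reflecting delay and dispersion.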
Six predictor models are detailed in Table 1. In the first four models, perfusion parameters were chosen to delineate the contributions of true perfusion and D/D: two of these models (named Local and Local+) contain LAIF-derived perfusion parameters; two models (named Global and Global+) contain GAIF-derived perfusion parameters, wherein models Local+ and Global+ contain the additional FMadj parameter to represent D/D. A summary parameter model (named SP) containing pseudoperfusion parameters was studied to gauge the benefit of deconvolution-based maps for predictor models. The final model (named FM) contains a single parameter, FMCTC, and was only included to provide a comparison with previous threshold-based work, which found FMCTC to give the best prediction of outcome.9 Because combined perfusion/diffusion models have been shown to predict tissue outcome more accurately than using the parameters separately,1 the iDWI intensity was included in all models, except for the single parameter model FM. We did not include the b=0 s/mm² image, because preliminary investigations revealed it to have a negligible effect on model performance (data not shown).
To prepare the training data, the iDWI and follow-up T2 images were coregistered to the PWI images using MINC tools software14 and visually inspected for proper alignment. All subsequent calculations were implemented in MATLAB (Natick, MA). The acute parameter maps were normalized to their mean contralateral white matter value, by division for iDWI, CBV, and CBF, or by difference for MTT, FMCTC, and FMadj. Training regions were defined as “infarcted tissue” (manually delineated on the follow-up T2 images) and “noninfarcted tissue” (the remaining ipsilateral hemisphere). Acute and follow-up cerebrospinal fluid voxels were excluded from the training regions to minimize bias caused by brain shrinkage. Within the coregistered training regions, an iterative reweighted least-squares algorithm was used to calculate the GLM offset/coefficients.1
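The normalization step described above can be sketched as follows. This is only an illustration in Python rather than the MATLAB implementation used in the study; the function name and sample values are hypothetical.

```python
# Parameters normalized by division vs. by difference, as described in the text.
RATIO_PARAMS = {"iDWI", "CBV", "CBF"}
DIFFERENCE_PARAMS = {"MTT", "FMCTC", "FMadj"}

def normalize(name, values, contralateral_wm):
    """Normalize voxel values to the mean contralateral white matter value."""
    reference = sum(contralateral_wm) / len(contralateral_wm)
    if name in RATIO_PARAMS:
        return [v / reference for v in values]
    if name in DIFFERENCE_PARAMS:
        return [v - reference for v in values]
    raise KeyError(f"no normalization rule for parameter {name!r}")

# Hypothetical values: two ipsilateral voxels and two reference voxels.
cbf_normalized = normalize("CBF", [10.0, 20.0], [20.0, 20.0])
mtt_normalized = normalize("MTT", [6.0, 4.0], [4.0, 4.0])
print(cbf_normalized, mtt_normalized)
```

After this step, ratio-normalized parameters equal 1 and difference-normalized parameters equal 0 in tissue that matches contralateral white matter.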
Evaluation of Model Performance
To assess the models, a jackknife approach15 was used, whereby for each patient, the infarct probability map (P-map; associated with a particular model) was calculated using the model trained on the remaining 14 patients.1 The P-maps were subsequently filtered within-slice (using a 3×3 voxel full-width half-maximum Gaussian kernel) to minimize false-positives due to noise. For each model, the GLM offset/coefficients used in the jackknife rounds were the bootstrap mean15 across 20 bootstrap training data subsets. These subsets were created by randomly sampling (with replacement) the infarcting voxels and an equal number of noninfarcting voxels from each patient in the training data.
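The resampling scheme above might be organized along the following lines. This is a sketch under stated assumptions: the helper names are hypothetical, the noninfarcting voxels are assumed to be drawn with replacement as well, and the model-fitting step itself is omitted.

```python
import random

def balanced_bootstrap(infarcted, noninfarcted, rng):
    """Resample one patient's voxels: all draws are with replacement, and the
    number of noninfarcted draws equals the number of infarcted voxels."""
    n = len(infarcted)
    return ([rng.choice(infarcted) for _ in range(n)],
            [rng.choice(noninfarcted) for _ in range(n)])

def leave_one_out(patient_ids):
    """Yield (held-out patient, training patients) for each jackknife round."""
    for held_out in patient_ids:
        yield held_out, [p for p in patient_ids if p != held_out]

def bootstrap_mean(coefficient_sets):
    """Average each GLM coefficient across the bootstrap fits."""
    n = len(coefficient_sets)
    return {k: sum(c[k] for c in coefficient_sets) / n
            for k in coefficient_sets[0]}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
rounds = list(leave_one_out(list(range(15))))
print(len(rounds), len(rounds[0][1]))  # 15 rounds, 14 training patients each
```

In each round, a model would be fitted to each of the 20 balanced subsets drawn from the 14 training patients, and the averaged coefficients applied to the held-out patient.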
The performance of each model was characterized by generating receiver operating characteristic (ROC) curves for each patient's P-map formed by varying the probability threshold for tissue infarction between 0% and 100%, and for each threshold measuring the sensitivity and specificity of the model prediction. For each curve, the area under the curve (AUC) and optimal operating point (OOP) were determined. The performance measure AUC is the probability that the model will predict higher probabilities in areas that do infarct compared with areas that do not. The OOP is the probability threshold that gives the smallest false-positive rate (1-specificity) for the largest true-positive rate (sensitivity).
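As an illustration of these measures, the sketch below computes the AUC directly from its probabilistic definition and picks an operating threshold. The choice of maximizing sensitivity plus specificity (Youden's index) is an assumption made here for simplicity and may differ from the criterion used in the study; all names and values are hypothetical.

```python
def auc(infarcted_probs, noninfarcted_probs):
    """Probability that an infarcted voxel scores higher than a noninfarcted
    one (ties count as one half)."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in infarcted_probs for n in noninfarcted_probs)
    return wins / (len(infarcted_probs) * len(noninfarcted_probs))

def roc_point(infarcted_probs, noninfarcted_probs, threshold):
    """Sensitivity and specificity when voxels with P >= threshold are
    predicted to infarct."""
    sens = sum(p >= threshold for p in infarcted_probs) / len(infarcted_probs)
    spec = sum(n < threshold for n in noninfarcted_probs) / len(noninfarcted_probs)
    return sens, spec

def optimal_threshold(infarcted_probs, noninfarcted_probs, thresholds):
    """Assumed criterion: threshold maximizing sensitivity + specificity."""
    return max(thresholds,
               key=lambda t: sum(roc_point(infarcted_probs, noninfarcted_probs, t)))

# Hypothetical predicted probabilities for a handful of voxels.
infarcted = [0.9, 0.8, 0.4]
noninfarcted = [0.7, 0.3, 0.2, 0.1]
thresholds = [i / 100 for i in range(101)]  # 0% to 100%

area = auc(infarcted, noninfarcted)
best = optimal_threshold(infarcted, noninfarcted, thresholds)
print(area, best, roc_point(infarcted, noninfarcted, best))
```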
The overall performance of each model was determined by pooling all 15 patient predictions (made in the jackknife rounds) into one large data set and running a single ROC analysis.
For each model, the impact of each constituent normalized acute parameter on infarction risk was estimated using odds-ratios (OR):

$$\mathrm{OR}_k = e^{\beta_k} \quad (3)$$

where the coefficients $\beta_k$ are the bootstrap mean from an “aggregate-model” formed using all 15 patients in the training data. OR is the relative amount by which the predicted odds of infarction, odds=P/(1-P), increase (OR >1) or decrease (OR <1) per unit increase in the associated (normalized) parameter, whilst holding all other parameters constant.
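A short numerical check can make the interpretation of the odds-ratio concrete. Assuming a logistic model, increasing one parameter by one unit multiplies the odds by the exponential of its coefficient; the offset, coefficient, and parameter value below are hypothetical.

```python
import math

def logistic(eta):
    """Logistic function mapping the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def odds(p):
    """Odds of infarction for a probability p."""
    return p / (1.0 - p)

def odds_ratio(beta):
    """Odds-ratio per unit increase in a parameter with coefficient beta."""
    return math.exp(beta)

# Hypothetical offset, coefficient, and normalized parameter value.
alpha, beta, x = -1.0, 0.7, 1.2

odds_before = odds(logistic(alpha + beta * x))
odds_after = odds(logistic(alpha + beta * (x + 1.0)))  # one-unit increase
ratio = odds_after / odds_before
print(ratio, odds_ratio(beta))  # the two values agree
```

A negative coefficient gives an odds-ratio below 1, meaning that a rise in that parameter lowers the predicted odds of infarction.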
The median and interquartile range were calculated for the AUC, OOP, sensitivity, and specificity of each model. Nonparametric tests advocated by Demsar16 were used to detect differences between the models' AUC, OOP, sensitivity, and specificity across all patients. The Friedman and post hoc Nemenyi tests were used to make comparisons between all models. The Wilcoxon signed-rank test was used to compare model pairs Global/Global+ and Local/Local+ in order to determine if the inclusion of the D/D parameter improved model performance.
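For readers unfamiliar with the rank-based comparison, the sketch below shows one way the Friedman statistic and the Nemenyi critical difference could be computed from a patients-by-models table of scores. It omits the tie correction and the P-value lookup, the critical value q_alpha must be taken from published tables, and all names and numbers are hypothetical.

```python
import math

def average_ranks(scores):
    """Rank one patient's scores (1 = highest); ties share the mean rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = mean_rank
        i = j + 1
    return ranks

def friedman_statistic(table):
    """Friedman chi-square (no tie correction) and mean rank per model."""
    n, k = len(table), len(table[0])
    rank_rows = [average_ranks(row) for row in table]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4)
    return chi2, mean_ranks

def nemenyi_critical_difference(q_alpha, k, n):
    """Mean-rank difference that two models must exceed to differ."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n))

# Hypothetical AUCs: 4 patients (rows) by 3 models (columns).
table = [[0.90, 0.80, 0.70]] * 4
chi2, mean_ranks = friedman_statistic(table)
print(chi2, mean_ranks)
```

Two models would be judged different in the post hoc step when their mean ranks differ by more than the critical difference.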
Results
Figure 1 shows box plots of AUC, OOP, sensitivity, and specificity for the jackknife patient predictions. The sensitivities and specificities were computed at the OOP of the corresponding pooled ROC curve (Figure 2; Table 2A) but show the same trend at the previously used1 probability threshold of 0.5 (data not shown).
The AUCs in Figure 1A indicate good performance across all models. Model Local, which includes LAIF-derived perfusion parameters that contain no D/D information, performed the worst (as indicated by median AUC), and significantly worse than Global, Global+, and SP. Moreover, the LAIF models (Local and Local+) performed significantly worse than their GAIF counterparts (Global and Global+, respectively). The only difference between these pairs of models (Local versus Global and Local+ versus Global+) is the calculation of the perfusion parameters, implying that the dispersion naturally included with the GAIF-derived parameters improves model performance. Interestingly, there was no significant difference between Global+ (the best performing model as indicated by median AUC) and SP. For the OOPs, sensitivities, and specificities, there was only a significant difference between the sensitivities of Local and Global+ (marked in Figure 1C).
The Wilcoxon signed-rank tests between the AUCs of model pairs, Global/Global+ and Local/Local+, found the inclusion of the D/D parameter, FMadj, significantly improved model performance (P<0.05; not marked in Figure 1A). Furthermore, the inclusion of FMadj narrowed the interquartile range across AUC, OOP, sensitivity, and specificity in both the LAIF and GAIF models (Local+ narrower than Local and Global+ narrower than Global, respectively; Figure 1).
The pooled AUC results given in Table 2A show the same trend as the jackknife results of Figure 1A, with the best predictions achieved with Global+ closely followed by SP. More generally the AUC, sensitivity, and specificity are higher in both the GAIF and LAIF models containing the D/D parameter, FMadj (ie, Global+ higher than Global, Local+ higher than Local). The pooled ROC curves, which summarize the overall performance of the models over all patients, are shown in Figure 2.
The GLM offset and coefficients for the normalized acute parameters of each aggregate-model are shown in Figure 3A and the corresponding odds-ratios (OR) in Figure 3B. Our data show that: a unit rise in iDWI most strongly increased the odds of infarction; a unit rise in CBV increased the odds in models Local and Local+ but decreased it in Global, Global+, and SP; a unit rise in CBF reduced the odds more in models Local and Local+; unit rises in MTT, FMadj, and FMCTC were consistent and increased the odds in their associated models; the summary parameter CBV/FMCTC had a negligible influence in model SP.
As illustrative examples, infarct probability maps (P-maps) for two patients are shown in Figure 4 (Patient A: a 55-year-old man, NIHSS acute/follow-up 21/14, left internal carotid artery occlusion; Patient B: a 43-year-old woman, NIHSS acute/follow-up 17/5, left internal carotid artery occlusion) together with some constituent acute parameter maps and the follow-up lesion. Patient A has a large diffusion–perfusion mismatch acutely, the majority of which progresses to infarction. Patient A's P-maps illustrate the superior predictions obtained using the GAIF models (Global and Global+) over the LAIF models (Local and Local+). Furthermore, the fact that models SP and FM also perform well indicates the predictive value of D/D information in this patient. Patient B shows a smaller diffusion–perfusion mismatch acutely. Only a strip of the mismatch progresses to infarction, and this is most accurately predicted by models Global+ and Local+. Therefore, in this patient, the D/D parameter, FMadj, is important. For both patients, the AUC, sensitivity, and specificity of the LAIF and GAIF models improve with the inclusion of FMadj with best predictions obtained using model Global+ (Table 2B).
Discussion
In this work, we have investigated the role of delay/dispersion in predicting infarction using predictor models trained on data acquired in the 3 to 6 hour time window. Although all models have good performance, prediction accuracy, as measured by AUC, was found to be greatest for Global+, which includes D/D information (contained in the parameter FMadj) and whose constituent GAIF-derived perfusion parameters are naturally coupled with dispersion. Infarct prediction was least accurate for model Local, which contains no additional D/D information and whose LAIF-derived perfusion parameters have minimal D/D bias. For clarity, in this discussion, we refer to the dispersion information that is intrinsic to the GAIF-derived perfusion parameters (but not the LAIF-derived perfusion parameters) as the “GAIF-dispersion information” in order to distinguish it from the dispersion information contained in FMadj.
Looking at the performance over all the models (the Nemenyi test), Local performed significantly worse than Global, Global+, and SP, all of which contain parameters naturally coupled with dispersion. Therefore, perfusion estimates that are not biased by dispersion, such as those in model Local, were not helpful for accurate infarct prediction in these data. Conversely, dispersion information is relevant for accurate infarct prediction.
With the Nemenyi test, no significant difference was found between models Local and Local+, whereas there was a significant difference between Local and Global. Therefore, the D/D information contained in FMadj (included in model Local+ but not Local) has weaker predictive value than the GAIF-dispersion information (in Global but not Local). In addition, model Global+ performed significantly better than Local+, signifying that FMadj (included in both Global+ and Local+) does not fully represent all the GAIF-dispersion information (in Global+ but not Local+). On the other hand, there was no significant difference between models Global and Local+. This indicates that the D/D information contained in FMadj (included in Local+ but not Global) is large enough to partially balance the GAIF-dispersion information (in Global but not Local+) such that the performance difference (between Global and Local+) falls below significance.
Furthermore, with the Nemenyi test, no significant difference was found between models Global and Global+. Assuming that the dispersion information contained in FMadj (included in model Global+ but not Global) is outweighed by the GAIF-dispersion information (ie, see previous paragraph), Global and Global+ differ primarily by the extra delay information contained in FMadj. The fact that this delay information did not result in a significant difference between models Global and Global+ suggests that in these data, this delay information has weaker predictive value than the GAIF-dispersion information, which distinguished the significantly different Local+ and Global+, and Global and Local models.
Note, however, that the more sensitive Wilcoxon test did reveal a difference within model pairs Global/Global+ and Local/Local+, confirming that the D/D parameter, FMadj, does add predictive value in itself (by adding some information from delay and/or additional dispersion). As shown above, though, this effect is weaker than the predictive value of the GAIF-dispersion information.
Due to the coupling of dispersion in the GAIF-derived perfusion parameters, a decrease in CBFG comprises a decrease in true CBF and/or an increase in dispersion. Consequently, for CBFG, the odds-ratio for CBF (OR <1) is weighted by the odds-ratio for dispersion (OR >1). This may explain the weaker weighting of CBFG in the GAIF models compared with CBFL in the LAIF models.
Interestingly, there was no significant difference between the best deconvolution model, Global+, and the summary parameter model, SP, which in addition to iDWI and CBV, had a strong OR from FMCTC. Because FMCTC primarily reflects D/D, this result further indicates the importance of D/D information for predicting infarction in these data. Practically, model SP has the advantage of parameters that are user-independent (unlike the selection of GAIF required for the parameters in model Global+). Such simple postprocessing would make this model attractive for a clinical setting.
The AUC, sensitivity, and specificity obtained for the single parameter model FM are in good agreement with those reported by Christensen et al,9 a study that found the normalized parameter FMCTC to be the best single parameter predictor of infarction. Although model FM performed significantly worse than Global+, there was no significant difference between Global+ and a two-parameter model formed from FMCTC and iDWI, which performed with pooled AUC=0.830 (data not shown). This confirms both the importance of diffusion for infarct prediction and the utility of the summary parameter FMCTC.
Finally, it is important to note that the LAIF method used here7 has been previously validated and was found to reduce D/D error in LAIF-derived perfusion (compared with standard GAIF-derived perfusion). This suggests that perfusion deficits are more accurately characterized with LAIF-derived perfusion. For this reason, the performance difference between the LAIF and GAIF models is attributed to the predictive value of GAIF-dispersion information. However, due to the potential for partial-volume contamination of the local AIF estimates, some of this performance difference may also be attributed to suboptimal LAIF selection.
The sensitivity to detect significant differences between models may be limited by the relatively small training data set (determined by the number of patients satisfying the selection criteria). However, despite limited power, we still found significant benefit of including D/D information in the predictor models for this acute time window. This could have potentially important clinical implications if this result is upheld in larger prospective studies.
Although the GAIF and summary parameter models were found to perform best in these data (acquired within 3–6 hours and without reperfusion), it is possible that at later time windows, unbiased perfusion parameters that are decoupled from D/D (ie, LAIF-derived perfusion) may become more important and D/D may become less predictive. Indeed, it is known that patients can have chronic regional hemodynamic delay without necessarily developing infarction. The predictor models and findings from this study may therefore not be applicable to later time windows such as those of the Extending the time for Thrombolysis in Emergency Neurological Deficits (EXTEND) or Diffusion and Perfusion Imaging Evaluation for Understanding Stroke Evolution study (DEFUSE) 2 trials. However, the GAIF and LAIF predictor models presented here provide the framework for further investigation into the significance of D/D information for infarct prediction at progressive time windows.
Conclusions
A predictor model framework was used to investigate the role of delay and dispersion for infarct prediction. Prediction was most accurate using a model formed from global AIF perfusion parameters (which are naturally coupled with dispersion) and an independent delay/dispersion parameter. A model formed from summary parameters also performed strongly. Our results demonstrate that delay and dispersion have a significant role above and beyond perfusion for predicting infarction in this acute 3 to 6 hour time window.
Sources of Funding
This study was supported by grants from the National Health and Medical Research Council (NHMRC) of Australia and the Victorian Government's Operational Infrastructure Support Program.
Louis Caplan, MD, was the Guest Editor for this paper.
- Received August 22, 2011.
- Revision received December 8, 2011.
- Accepted December 16, 2011.
- © 2012 American Heart Association, Inc.
- Wu O, Koroshetz WJ, Ostergaard L, Buonanno FS, Copen WA, Gonzalez RG, et al
- Christensen S, Mouridsen K, Wu O, Hjort N, Karstoft H, Thomalla G, et al
- Albers GW, Thijs VN, Wechsler L, Kemp S, Schlaug G, Skalabrin E, et al
- Calamante F, Christensen S, Desmond PM, Ostergaard L, Davis SM, Connelly A
- Efron B
- Demsar J