Evaluating Performance of the Spetzler-Martin Supplemented Model in Selecting Patients With Brain Arteriovenous Malformation for Surgery
Background and Purpose—Our recently proposed point scoring model includes the widely-used Spetzler-Martin (SM)-5 variables, along with age, unruptured presentation, and diffuse border (SM-Supp). Here we evaluate the SM-Supp model performance compared with SM-5, SM-3, and Toronto prediction models using net reclassification index, which quantifies the correct movement in risk reclassification, and validate the model in an independent data set.
Methods—Bad outcome was defined as worsening between preoperative and final postoperative modified Rankin Scale score. Point scores for each model were used as predictors in logistic regression and predictions evaluated using net reclassification index at varying thresholds (10%–30%) and any threshold (continuous net reclassification index >0). Performance was validated in an independent data set (n=117).
Results—Net gain in risk reclassification was better using the SM-Supp model over a range of threshold values (net reclassification index=9%–25%) and significantly improved overall predictions for outcomes in the development data set, yielding a continuous net reclassification index of 64% versus SM-5, 67% versus SM-3, and 61% versus Toronto (all P<0.001). In the validation data set, the SM-Supp model again correctly reclassified a greater proportion of patients versus SM-5 (82%), SM-3 (85%), and Toronto models (69%).
Conclusions—The SM-Supp model demonstrated better discrimination and risk reclassification than several existing models and should be considered for clinical practice to estimate surgical risk in patients with brain arteriovenous malformation.
- modified Rankin Scale
- net reclassification
- receiver operator curve
- cerebral arteriovenous malformations
The Spetzler-Martin (SM) 5-point grading scale is the most widely accepted surgical risk prediction tool for brain arteriovenous malformations, although other models have been proposed.1–6 We recently developed a simple point scoring model that incorporates the SM angiographic variables and supplements them with additional clinical factors (SM-Supp) to improve outcome prediction, and we demonstrated improved discrimination over SM-5 using the area under the receiver operating characteristic curve (AUROC).7 Here we extend our previous work by comparing SM-Supp performance with other models using the net reclassification index (NRI) and validating the model in an independent data set.
We included consecutive patients with brain arteriovenous malformation who underwent microsurgical resection between 2000 and 2010 with at least one postoperative visit and no missing outcome data. The development data set consisted of 300 patients with brain arteriovenous malformation treated by a single neurosurgeon (M.T.L.) between 2000 and 2007.7 The primary validation data set consisted of 117 patients (67 new M.T.L. cases between 2007 and 2010; 50 cases from other neurosurgeons between 2000 and 2010) with no missing data. We also included data from a larger validation data set (n=183) for which we multiply imputed missing angiographic data (provided in the online-only Data Supplement).
Outcome was change between preoperative and last postoperative modified Rankin Scale score8 dichotomized into >0 (bad outcome) versus ≤0 (good outcome).7 Predictors included age at surgery, sex, nonhemorrhagic presentation, arteriovenous malformation size, any deep venous drainage, eloquence, diffuse border, and time from surgery to last postoperative modified Rankin Scale assessment (days). SM-5,1 SM-3,6 Toronto,5 and SM-Supp7 scores are defined in online-only Data Supplement Table I.
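The outcome definition above can be sketched in a few lines of Python. This is only an illustration: the function name and argument names are hypothetical, and the range check assumes the usual 0 to 6 modified Rankin Scale.

```python
def mrs_outcome(pre_op_mrs: int, last_post_op_mrs: int) -> str:
    """Classify outcome from the change in modified Rankin Scale (mRS) score.

    Hypothetical helper: change > 0 (worsening) is "bad",
    change <= 0 (unchanged or improved) is "good".
    """
    for score in (pre_op_mrs, last_post_op_mrs):
        if not 0 <= score <= 6:
            raise ValueError("mRS scores are assumed to range from 0 to 6")
    change = last_post_op_mrs - pre_op_mrs
    return "bad" if change > 0 else "good"
```

Because the outcome is a change score, a patient who enters surgery with a high mRS score and stays at that score is counted as a good outcome.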
Model performance was evaluated using the NRI,9,10 which quantifies the correct movement in risk reclassification when comparing predictions between 2 models at various risk thresholds (10%–30%) or at any threshold (continuous NRI [cNRI] >0).10 NRI was compared by combining one-sided McNemar tests across outcomes using the Fisher method.11 We derived bootstrap 95% CIs for cNRI using 1000 replications.
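As a rough illustration of these quantities, the sketch below computes a two-category NRI at one threshold, the continuous NRI, and a bootstrap interval for the cNRI. It is not the authors' code. The function names are hypothetical, the inputs are assumed to be predicted probabilities of a bad outcome from a reference model and a new model, treating a risk at or above the threshold as "high" is an assumed convention, and the percentile bootstrap is an assumption because the text does not say which bootstrap interval was used.

```python
import random


def _net_gains(moves, outcome):
    """moves: +1 (risk moved up), -1 (moved down), 0 (unchanged) per patient.

    outcome: 1 = bad outcome (event), 0 = good outcome (non-event).
    Returns (net gain in events, net gain in non-events, NRI).
    """
    events = [m for m, y in zip(moves, outcome) if y == 1]
    nonevents = [m for m, y in zip(moves, outcome) if y == 0]
    gain_events = sum(events) / len(events)            # P(up) - P(down), bad outcomes
    gain_nonevents = -sum(nonevents) / len(nonevents)  # P(down) - P(up), good outcomes
    return gain_events, gain_nonevents, gain_events + gain_nonevents


def categorical_nri(old_risk, new_risk, outcome, threshold):
    """Two-category NRI: a move is a change of side relative to the threshold."""
    moves = [int(new >= threshold) - int(old >= threshold)
             for old, new in zip(old_risk, new_risk)]
    return _net_gains(moves, outcome)


def continuous_nri(old_risk, new_risk, outcome):
    """Continuous NRI: any increase or decrease in predicted risk is a move."""
    moves = [(new > old) - (new < old) for old, new in zip(old_risk, new_risk)]
    return _net_gains(moves, outcome)


def bootstrap_cnri_ci(old_risk, new_risk, outcome, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the cNRI (assumed method)."""
    rng = random.Random(seed)
    n = len(outcome)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [outcome[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both outcome classes
            stats.append(continuous_nri([old_risk[i] for i in idx],
                                        [new_risk[i] for i in idx], ys)[2])
    stats.sort()
    return stats[int((alpha / 2) * n_boot)], stats[int((1 - alpha / 2) * n_boot) - 1]


# Toy data (made up for illustration, not from the study)
old = [0.10, 0.20, 0.30, 0.40]
new = [0.30, 0.10, 0.10, 0.50]
bad = [1, 1, 0, 0]
print(categorical_nri(old, new, bad, threshold=0.25))
print(continuous_nri(old, new, bad))
```

In the toy data, one of the two bad-outcome patients crosses the 25% threshold upward and one of the two good-outcome patients crosses it downward, so each net gain is 0.5 and the categorical NRI is 1.0. The continuous version counts every change in predicted risk, so the same data give a cNRI of 0.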
Characteristics were similar between development and validation data sets (P>0.05; Table 1; online-only Data Supplement Table II). Outcomes were bad for 73 (24%) and good for 227 (76%) patients in the development data set. In the validation data set, outcomes were bad for 39 (21%) and good for 144 (79%) patients.
In the development data set, NRI showed an improvement in reclassification of 9% to 25% with SM-Supp compared with SM-5 over all threshold values (Table 2). A greater net gain was observed at lower thresholds for good outcomes and at higher thresholds for bad outcomes. For example, at the 15% risk threshold, 85 of 300 (28%) patients were reclassified into different risk categories. Net gain in reclassification was −6.8% for those with bad outcomes and 27% for those with good outcomes (NRI=0.205, P<0.001). Thus, patients with good outcomes were 21% more likely to move down a risk category than up compared with patients with bad outcomes.
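The arithmetic behind the 15% threshold example can be written out using the standard NRI decomposition into net gains for bad and good outcomes. The notation below is ours rather than the article's: "up" and "down" denote movement to a higher or lower risk category under SM-Supp relative to SM-5.

```latex
\begin{align*}
\mathrm{NRI} &= \bigl[P(\text{up}\mid\text{bad}) - P(\text{down}\mid\text{bad})\bigr]
              + \bigl[P(\text{down}\mid\text{good}) - P(\text{up}\mid\text{good})\bigr] \\
             &\approx -0.068 + 0.27 \approx 0.20
\end{align*}
```

The small difference from the reported NRI of 0.205 is presumably due to rounding of the good-outcome net gain. The negative first term shows that, at this threshold, the overall improvement comes entirely from patients with good outcomes moving to the lower risk category.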
Because risk categories for brain arteriovenous malformation surgical outcome are not well established, we also calculated the cNRI comparing SM-Supp to SM-5. The cNRI was 64% (95% CI, 39%–89%; P<0.001) with a net gain of 26% in those with good outcomes and 37% in those with bad outcomes (Table 2). Thus, 64% had predicted risks reclassified in the correct direction with SM-Supp. Results were similar when comparing SM-Supp with SM-3 (cNRI=67%; 95% CI, 41%–93%) and with Toronto (cNRI=61%; 95% CI, 37%–85%). Scatterplots of predicted probabilities (Figure) by good and bad outcomes reflected a greater proportion of patients with correct assignments using the SM-Supp model compared with SM-5 (Figure A), SM-3 (Figure B), or Toronto models (Figure C). In the validation data set, the SM-Supp model again correctly reclassified a greater proportion of patients versus SM-5 (cNRI=82%; 95% CI, 43.6%–121%), SM-3 (cNRI=85%; 95% CI, 44.7%–126%), and Toronto models (cNRI=69%; 95% CI, 26.4%–121%).
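For reference, the continuous NRI has the same two-part form but counts any change in predicted risk as movement. In the sketch below, the symbols for the predicted risks under SM-Supp and under the comparison model are our notation, not the article's.

```latex
\begin{align*}
\mathrm{cNRI} &= \bigl[P(\hat{p}_{\mathrm{new}} > \hat{p}_{\mathrm{old}} \mid \text{bad})
                 - P(\hat{p}_{\mathrm{new}} < \hat{p}_{\mathrm{old}} \mid \text{bad})\bigr] \\
              &\quad + \bigl[P(\hat{p}_{\mathrm{new}} < \hat{p}_{\mathrm{old}} \mid \text{good})
                 - P(\hat{p}_{\mathrm{new}} > \hat{p}_{\mathrm{old}} \mid \text{good})\bigr] \\
              &\approx 0.37 + 0.26 = 0.63, \qquad -2 \le \mathrm{cNRI} \le 2
\end{align*}
```

The reported 64% presumably differs from this sum only through rounding of the two components. Because the cNRI is a sum of two net proportions, it lies on a scale from −200% to 200%, which is why the upper confidence limits in the validation data set can exceed 100%.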
Consistent with NRI results, the SM-Supp model yielded better discrimination and a higher AUROC than all other models (online-only Data Supplement Figure I) in development (AUROC=0.76, P<0.001) and validation (AUROC=0.77, P=0.402) data sets.
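For readers less familiar with the AUROC, it can be read as the probability that a randomly chosen patient with a bad outcome receives a higher predicted risk than a randomly chosen patient with a good outcome. The minimal sketch below computes it by direct pairwise comparison. The function name and the toy data are hypothetical, and this is not the software used in the study.

```python
def auroc(risk, outcome):
    """AUROC by pairwise comparison; tied predictions count as half a win.

    risk: predicted probabilities of a bad outcome.
    outcome: 1 = bad outcome, 0 = good outcome.
    """
    bad = [p for p, y in zip(risk, outcome) if y == 1]
    good = [p for p, y in zip(risk, outcome) if y == 0]
    wins = sum((b > g) + 0.5 * (b == g) for b in bad for g in good)
    return wins / (len(bad) * len(good))


# Toy data (made up): 3 of the 4 bad-vs-good pairs are ordered correctly
print(auroc([0.10, 0.40, 0.35, 0.80], [0, 0, 1, 1]))  # 0.75
```

The pairwise form is quadratic in the number of patients, which is fine for data sets of a few hundred patients such as those described here.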
The SM-Supp model performed equally well in predicting outcomes in an independent data set and consistently showed better risk reclassification and discrimination. For example, >60% of patients were correctly reclassified as having higher risk for those with bad outcomes and lower risk for those with good outcomes compared with each of SM-5, SM-3, or Toronto models.
Direct comparisons with other models2–5 are difficult because outcome measures and time points assessed differ among studies; for example, we examined change in outcome, which takes into account preoperative state. Only Spears et al5 compared performance of their prediction model to SM-5 using modified Rankin Scale and AUROC, showing good discrimination and performance (AUROC=0.80).5 Our model showed equally high discrimination in both development (AUROC=0.76) and validation data sets (AUROC=0.77).
Although the SM-Supp model derives from a single neurosurgeon and referral institution, we provide an independent validation using the NRI and include cases treated by other neurosurgeons in the largest series to date. However, further validation in external settings would be useful to assess generalizability and clinical use. A limitation of all scoring systems is dealing with missing data. In our full validation data set (n=183), 34% were missing angiographic data for SM-Supp, 36% for Toronto, and 13% for SM-5 and SM-3 scores. One way of accommodating missing data is through multiple imputation (see the online-only Data Supplement). Prospective studies planning to use SM-Supp should have minimal issues with missing data: all variables should be available from angiograms and MRI, which are standard for diagnostic evaluation and pretreatment planning, or from records at clinic visits.
In conclusion, the SM-Supp model performs better than current prediction models and should be considered for use in clinical practice. An online calculator is provided to assist clinicians (http://avm.ucsf.edu/healthcare_pro/).
Sources of Funding
Supported by K23NS058357 (H.K.), R01NS034949 (W.L.Y.), P01NS044155 (W.L.Y.), and the Doris Duke Charitable Foundation (E.M.W.).
The online-only Data Supplement is available with this article at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.111.661942/-/DC1.
- Received April 20, 2012.
- Revision received May 25, 2012.
- Accepted June 12, 2012.
- © 2012 American Heart Association, Inc.
- Spears J, Terbrugge KG, Moosavian M, Montanera W, Willinsky RA, Wallace MC, et al
- van Swieten JC, Koudstaal PJ, Visser MC, Schouten HJ, van Gijn J
- Fisher RA