(Stroke. 2003;34:1324.)
© 2003 American Heart Association, Inc.
Comments, Opinions, and Reviews
From the Julius Center for Health Sciences and Primary Care, University Medical Center Utrecht, Utrecht, the Netherlands (P.J.N., Y.v.d.G.), and Program for the Assessment of Radiologic Technology, Department of Epidemiology and Biostatistics and Department of Radiology, Erasmus Medical Center, Rotterdam, the Netherlands, and Department of Health Policy and Management, Harvard School of Public Health, Boston, Mass (M.G.M.H.).
Correspondence to M.G.M. Hunink, MD, PhD, Erasmus Medical Center Rotterdam, Room EE 2140, PO Box 1738, 3000 DR Rotterdam, Netherlands. E-mail Hunink{at}epib.fgg.eur.nl
| Abstract |
|---|
Methods We performed a systematic review of published studies retrieved through PUBMED, from bibliographies of review papers, and from experts. The English-language medical literature was searched for studies that met the selection criteria: (1) The study was published between 1994 and 2001; (2) MRA and/or DUS was performed to estimate the severity of carotid artery stenosis; (3) DSA was used as the standard of reference; and (4) the absolute numbers of true positives, false negatives, true negatives, and false positives were available or derivable for at least one definition of disease (degree of stenosis).
Results Sixty-three publications on duplex, MRA, or both were included in the analysis, yielding the test results of 64 different patient series on DUS and 21 on MRA. For the diagnosis of 70% to 99% versus <70% stenosis, MRA had a pooled sensitivity of 95% (95% CI, 92 to 97) and a pooled specificity of 90% (95% CI, 86 to 93). These numbers were 86% (95% CI, 84 to 89) and 87% (95% CI, 84 to 90) for DUS, respectively. For recognizing occlusion, MRA yielded a sensitivity of 98% (95% CI, 94 to 100) and a specificity of 100% (95% CI, 99 to 100), and DUS had a sensitivity of 96% (95% CI, 94 to 98) and a specificity of 100% (95% CI, 99 to 100). A multivariable summary receiver-operating characteristic curve (ROC) analysis for diagnosing 70% to 99% stenosis demonstrated that the type of MR scanner predicted the performance of MRA, whereas the presence of verification bias predicted the performance of DUS. For diagnosing occlusion, no significant heterogeneity was found for MRA; for DUS, the presence of verification bias and type of DUS scanner were explanatory variables. MRA had a significantly better discriminatory power than DUS in diagnosing 70% to 99% stenosis (regression coefficient, 1.6; 95% CI, 0.37 to 2.77). No significant difference was found in detecting occlusion (regression coefficient, 0.73; 95% CI, -2.06 to 3.51).
Conclusions These results suggest that MRA has a better discriminatory power compared with DUS in diagnosing 70% to 99% stenosis and is a sensitive and specific test compared with DSA in the evaluation of carotid artery stenosis. For detecting occlusion, both DUS and MRA are very accurate.
Key Words: carotid stenosis magnetic resonance angiography meta-analysis ultrasonography, Doppler, duplex
| Introduction |
|---|
MR angiography (MRA) and contrast-enhanced MRA (CE-MRA) are increasingly used as supplements to duplex ultrasonography (DUS) and conventional digital subtraction angiography (DSA) in the diagnosis of carotid artery stenosis.6 Many institutions have published diagnostic studies in which MRA and/or DUS was compared with DSA. The results suggest that the decision to perform carotid endarterectomy could be based on one or a combination of these noninvasive tests. Two meta-analytic reviews have summarized the literature from before 1996 on the diagnostic performance of DUS and MRA: one reported an increasing role for noninvasive testing (DUS and MRA) in carotid artery disease,7 and the other found only moderate test results for MRA.8 More recently, a review of studies on this topic published between 1993 and 1998 criticized the design of the studies and proposed guidelines for diagnostic studies.9
Accordingly, a recent review summarizing publications from 1990 to 1999 concluded that MRA seemed an accurate test in selecting patients for carotid surgery but that the evidence was not yet very robust because of the heterogeneity of the studies included.10 None of these reviews, however, investigated variables that might explain the observed heterogeneity.
The purpose of this study was to systematically review the contemporary literature and to compare the diagnostic performance of DUS, MRA, and CE-MRA. Our aim was to increase precision by combining published studies and to determine variables that might explain part of the difference in outcome across studies. Recently published guidelines for meta-analyses of randomized, controlled trials were followed when applicable to a meta-analysis on diagnostic tests.11
| Methods |
|---|
To find the studies, we performed a PUBMED search using the following keywords and all possible related terms: carotid artery and angiography combined with magnetic resonance and/or duplex or ultrasound. We limited the search to articles published in the English language. Reference lists of original and review publications on this subject were checked, and experts on the subject were consulted to find additional studies.
Studies that met the following criteria were included: (1) The study was published between 1994 and 2001; (2) MRA or CE-MRA and/or DUS was performed to estimate the severity of carotid artery stenosis; (3) DSA was used as the reference standard; and (4) the absolute numbers of true positives, false negatives, true negatives, and false positives were available or derivable from the presented data for at least one cutoff criterion for the degree of stenosis based on DSA. We reconstructed these numbers if sensitivity, specificity, and prevalence of disease were presented.
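The reconstruction step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the review: the function name, the example values, and the assumption that the total number of verified arteries is known are all hypothetical, and rounding means the rebuilt counts are only approximate.

```python
def reconstruct_counts(sensitivity, specificity, prevalence, n_total):
    """Rebuild an approximate 2x2 table from reported summary statistics.

    All inputs are hypothetical: proportions in [0, 1] plus the total number
    of arteries (or patients) verified against the reference standard.
    """
    n_diseased = round(prevalence * n_total)
    n_healthy = n_total - n_diseased
    tp = round(sensitivity * n_diseased)   # diseased, test positive
    fn = n_diseased - tp                   # diseased, test negative
    tn = round(specificity * n_healthy)    # not diseased, test negative
    fp = n_healthy - tn                    # not diseased, test positive
    return {"TP": tp, "FN": fn, "TN": tn, "FP": fp}


# Made-up example: 200 arteries, 30% prevalence, sensitivity 90%, specificity 85%
counts = reconstruct_counts(0.90, 0.85, 0.30, 200)
```

Deriving the false negatives and false positives by subtraction keeps the four cells consistent with the total, even when rounding shifts a count by one.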
Our intent was to collect the absolute numbers for each study as completely as possible for the following categories of carotid artery stenosis: 0% to 29%, 30% to 49%, 50% to 69%, 70% to 99%, and 100%. Authors were contacted for 2 reasons: (1) to give them the opportunity to send us additional data so that we could work with a complete data set for all described categories, and (2) to request the data if neither absolute numbers nor sensitivity/specificity was derivable but the study suggested that these data were available. Studies with occlusion as their main outcome (often describing diagnostic tests only to determine if occlusion was present or not) were excluded if the authors did not respond to our request for more precise specification of the nonocclusion group. We excluded these studies because the main subject of our meta-analysis was treatment decisions based on the category of 70% to 99% stenosis and because the cutoff criteria used in these publications were too diverse to include in the meta-analysis. We also excluded studies with a population of <15 patients. If publications used the same or an overlapping population, we chose the publication from which we could derive the required data in the most straightforward manner. We contacted the authors if it was not clear whether separately published populations were overlapping.
Data Extraction
Two authors (Y.v.d.G. and P.J.N.) independently extracted data from all publications. All abstracts collected from PUBMED through the described search criteria were evaluated. For all studies that could not definitely be excluded on the basis of the abstract, the full text was checked against the inclusion criteria. From the included publications, the absolute numbers of true positives, false negatives, true negatives, and false positives of the described test modalities were extracted as completely as possible for all the different categories of stenosis: 0% to 29%, 30% to 49%, 50% to 69%, 70% to 99%, and 100%. Sensitivity and specificity were extracted or calculated from the data, and absolute numbers were derived if the prevalence was reported. Additionally, the following variables were extracted for each study population: mean age and range, percentage of men and women, percentage of symptomatic patients, and type of symptoms (amaurosis fugax, transient ischemic attack, or stroke), as well as whether the tests were studied in a consecutive patient population. The following test characteristics were determined: method of stenosis measurement used on DSA (according to NASCET or ECST criteria or a different method), type of MR and/or DUS machine, time interval between DUS and/or MRA and DSA, number of visualized carotid arteries, and whether a different cutoff was used to define severe stenosis (eg, 60% or 80% instead of 70%). We converted cutoff values determined according to ECST criteria to their corresponding NASCET criteria. In studies presenting DUS results, often >1 velocity parameter (peak systolic velocity, end-diastolic velocity, mean velocity) or a ratio was used to determine the degree of stenosis. We chose the parameter that the authors considered optimal. Single peak systolic velocity (PSV) values referring to a degree of stenosis of 70% were extracted if available.
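The text does not state which formula was used to convert ECST cutoffs to NASCET cutoffs. As an illustration only, the sketch below uses a commonly cited linear approximation (ECST% = 0.6 × NASCET% + 40); both the choice of this formula and the function names are assumptions, not the authors' documented method.

```python
def nascet_to_ecst(nascet_percent):
    """Approximate ECST stenosis (%) from NASCET stenosis (%).

    Uses the commonly cited linear approximation ECST = 0.6 * NASCET + 40,
    which is an assumption here; the article does not name its conversion.
    """
    return 0.6 * nascet_percent + 40.0


def ecst_to_nascet(ecst_percent):
    """Inverse of the approximation above."""
    return (ecst_percent - 40.0) / 0.6


# Under this approximation, an ECST cutoff of 82% corresponds to about 70% NASCET
nascet_cutoff = ecst_to_nascet(82.0)
```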
We determined whether the DUS thresholds were defined before the study was performed, or if the optimal thresholds had been analyzed afterward on the basis of the results (yielding a higher diagnostic performance). We also noted whether verification bias was present, which may occur if the decision to perform the reference standard procedure depends on the results of the test under investigation.12 (In practice, DUS is often used as the screening test to select patients for DSA.) In the MRA studies, use of a contrast-enhanced protocol versus time-of-flight (TOF) technique was noted. Finally, we determined whether tests were read with the observer blinded to the results of the other test(s).
Discrepancies between the 2 observers in the extracted data were discussed and, in all cases, resolved through consensus.
Analyses
The analyses presented are limited to 2 groups: 70% to 99% (severe stenosis) versus <70% and 100% (occlusion) versus <100%. The data collected from the other subgroups under the 70% threshold (0% to 29%, 30% to 49%, 50% to 69%) were not included in the presented analyses. The data in these groups contained too many missing values for a meaningful analysis, even though some publications presented a very complete description of the data and some of the authors gave enthusiastic replies to our request for additional data. Furthermore, we thought that the data of the 30% to 49% and 50% to 69% stenosis categories might suffer from selection and/or verification bias because many institutions use a threshold close to 50% stenosis on DUS as their inclusion criterion to perform DSA. For both reasons, the analyses in these stenosis groups gave very inconsistent preliminary results.
All analyzed variables are listed in Table 1. Sex and distribution of disease (ie, localization of the event) were often missing, which precluded meaningful analysis.
All probability values are approximate. The software package used was Intercooled Stata version 6.0 for Windows.
Pooled Weighted Analysis
We calculated the pooled weighted results of sensitivity, specificity, and diagnostic performance. The diagnostic performance was defined as the natural logarithm of the diagnostic odds ratio: D = ln[(TP × TN)/(FP × FN)]. Weighting was done with the inverse of the variance. We used a random-effects model to account for the heterogeneity across studies.
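A minimal sketch of inverse-variance, random-effects pooling of D is given below. The article does not name the exact estimator, so the DerSimonian-Laird moment estimator, the 0.5 continuity correction, the function names, and the study counts are all assumptions made for illustration. The pooling function needs at least 2 studies.

```python
import math


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Return D = ln(diagnostic odds ratio) and its approximate variance.

    The 0.5 continuity correction is an assumption of this sketch; it keeps
    D finite when a cell of the 2x2 table is zero.
    """
    tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    d = math.log((tp * tn) / (fp * fn))
    variance = 1 / tp + 1 / fp + 1 / fn + 1 / tn
    return d, variance


def pool_random_effects(effects, variances):
    """Pool effects with DerSimonian-Laird random-effects weights."""
    w = [1 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q and the moment estimate of the between-study variance
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c)
    # Random-effects weights add tau2 to each within-study variance
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), tau2


# Hypothetical (TP, FP, FN, TN) counts for three studies
studies = [(45, 6, 5, 94), (30, 10, 3, 57), (60, 4, 12, 124)]
per_study = [log_dor(*s) for s in studies]
pooled_d, ci, tau2 = pool_random_effects(
    [d for d, _ in per_study], [v for _, v in per_study]
)
```

When the between-study variance estimate is 0, the random-effects weights reduce to the fixed-effects inverse-variance weights.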
Summary Receiver-Operating Characteristic Curve Analysis
To adjust for the heterogeneity in positivity criteria, a summary receiver-operating characteristic (ROC) curve analysis was performed for each test.13 Summary ROC analysis is a meta-analytic method to summarize true- and false-positive rates from different diagnostic studies.14 In this method, the positivity criterion of each study is approximated by calculating S = ln[(TP × FP)/(TN × FN)].
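To make the roles of D and S concrete, the sketch below computes both from a 2x2 table, fits the line D = a + b·S, and converts the fitted line back into a curve in ROC space. It uses ordinary unweighted least squares as a simplification; the analysis described in this article used weighted fixed- and random-effects models with covariates, so this is an illustration of the principle only, and all function names are hypothetical.

```python
import math


def logit(p):
    return math.log(p / (1 - p))


def d_and_s(tp, fp, fn, tn, correction=0.5):
    """Return (D, S) for one study; the 0.5 correction is an assumption."""
    tp, fp, fn, tn = (x + correction for x in (tp, fp, fn, tn))
    tpr = tp / (tp + fn)
    fpr = fp / (fp + tn)
    d = logit(tpr) - logit(fpr)  # equals ln[(TP x TN)/(FP x FN)]
    s = logit(tpr) + logit(fpr)  # equals ln[(TP x FP)/(TN x FN)]
    return d, s


def fit_line(points):
    """Unweighted least squares for D = a + b*S; points are (D, S) pairs."""
    n = len(points)
    mean_d = sum(d for d, _ in points) / n
    mean_s = sum(s for _, s in points) / n
    sxx = sum((s - mean_s) ** 2 for _, s in points)
    sxy = sum((s - mean_s) * (d - mean_d) for d, s in points)
    b = sxy / sxx
    return mean_d - b * mean_s, b


def sroc_tpr(fpr, a, b):
    """Sensitivity on the summary ROC curve at a given false-positive rate.

    Solving D = a + b*S for logit(TPR) gives
    logit(TPR) = [a + (1 + b) * logit(FPR)] / (1 - b).
    """
    u = (a + (1 + b) * logit(fpr)) / (1 - b)
    return 1 / (1 + math.exp(-u))
```

With b = 0 the curve is symmetric and corresponds to a constant diagnostic odds ratio of exp(a) across all thresholds.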
Initially, we applied both a fixed- and a random-effects model in all analyses. The fixed-effects model assumes that the operating points from the individual studies lie on one underlying true ROC curve and that the differences in test results can be explained by differences in positivity criteria and other definable covariates. The random-effects summary ROC model assumes that there is always some residual cross-study heterogeneity even after adjustment for differences in positivity criteria and study characteristics such as population size, age, sex, definition of disease, scanner type, MRA technique used, PSV cutoff criterion, blinded scoring, and verification bias.15 In the remaining sections of this article, we present only the methods and results of the random-effects summary ROC model. In the Discussion, we elaborate on the differences in outcome between the random- and fixed-effects models.
Summary ROC Analysis by Diagnostic Test
Summary ROC models were developed for each diagnostic test separately. In a bivariable analysis, we evaluated each covariate (Table 1) to determine its explanatory value in explaining differences across studies in diagnostic performance (D) after adjustment for the positivity criterion used (S). Variables were considered explanatory in the random-effects analysis if their inclusion decreased the estimate of the between-study variance (τ², calculated with the method of moments) by at least 10%, if they had a regression coefficient of at least 1.0 for dummy variables or 1.0 over the range of the variable, or if they were statistically significant (P<0.05).16
Subsequently, multivariable summary ROC models were developed for each diagnostic test separately. The explanatory variables from the bivariable analysis were evaluated in a stepwise forward-selection regression model including variables one by one, starting with the variable that decreased τ² the most, and keeping the variable in the model using the same selection criteria as above. τ², ie, the residual between-study variance calculated with the method of moments, was used as a measure of model fit. A lower τ² value indicates less residual between-study variance and therefore a better model fit and a better explanatory power by the model of the heterogeneity across studies. S (indicating the positivity criterion) was retained in the model regardless of its significance.
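The three selection rules described above can be written as a small decision function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, `tau2_before` and `tau2_after` stand for the between-study variance estimates without and with the candidate variable, and taking the absolute value of the coefficient is an interpretation that the text does not spell out.

```python
def is_explanatory(tau2_before, tau2_after, coefficient, p_value,
                   variable_range=1.0):
    """Return True if a covariate meets any of the three selection rules.

    variable_range is 1.0 for a dummy variable; for a continuous variable
    it is the observed range, so the effect is judged over that range.
    """
    # Rule 1: between-study variance drops by at least 10%
    reduces_variance = (
        tau2_before > 0
        and (tau2_before - tau2_after) / tau2_before >= 0.10
    )
    # Rule 2: effect of at least 1.0 (absolute value is an assumption)
    large_effect = abs(coefficient) * variable_range >= 1.0
    # Rule 3: conventional statistical significance
    significant = p_value < 0.05
    return reduces_variance or large_effect or significant
```

In a forward-selection loop, a function like this would be applied to the candidate that lowers the between-study variance the most, and selection would stop once no remaining candidate qualifies.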
Summary ROC Analysis Comparing MRA and DUS
Finally, we performed a summary ROC analysis for the comparison between DUS and MRA. The significant variables from the multivariable analysis performed for each test separately were included as test-specific covariates in the multivariable comparative model. A dummy variable was added to compare the 2 tests. The regression coefficient of this dummy variable represents the difference in diagnostic performance (D) of MRA compared with DUS. A positive regression coefficient indicates better discriminatory power of MRA compared with DUS; a negative coefficient indicates reduced discriminatory ability of MRA.
To assess whether any individual study was particularly influential on the final results, we performed a jackknife type of sensitivity analysis of the final comparative model in which the analysis was repeated multiple times, each time excluding one study.
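The leave-one-out procedure can be sketched generically as below. In the analysis described here, the re-estimated quantity was the regression coefficient of the comparative model; the simple mean used in this example is only a stand-in, and all names and values are hypothetical.

```python
def leave_one_out(studies, estimator):
    """Recompute `estimator` once per study, each time omitting that study."""
    results = []
    for i in range(len(studies)):
        subset = studies[:i] + studies[i + 1:]
        results.append(estimator(subset))
    return results


def mean_effect(values):
    """Stand-in estimator; a real analysis would refit the full model."""
    return sum(values) / len(values)


# Made-up study effects; the last one is deliberately extreme
effects = [1.2, 1.5, 1.7, 1.6, 4.0]
estimates = leave_one_out(effects, mean_effect)
spread = max(estimates) - min(estimates)
```

A large spread, or a change in significance when one study is dropped, would flag that study as disproportionately influential.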
| Results |
|---|
Pooled Weighted Analysis
For the diagnosis of 70% to 99% versus <70% stenosis, MRA had a pooled sensitivity of 95% (95% CI, 92 to 97) and a pooled specificity of 90% (95% CI, 86 to 93) (Table 2). These numbers were 86% (95% CI, 84 to 89) and 87% (95% CI, 84 to 90) for DUS, respectively. For diagnosing occlusion (<100% versus 100%), MRA had a sensitivity of 98% (95% CI, 94 to 100) and a specificity of 100% (95% CI, 99 to 100); for DUS, these numbers were 96% (95% CI, 94 to 98) and 100% (95% CI, 99 to 100). These pooled data indicate a better discriminatory power for MRA in diagnosing severe stenosis (70% to 99%), whereas MRA and DUS are equally good in recognizing carotid occlusion (100%). The pooled values of D (the natural logarithm of the diagnostic odds ratio) were very similar between tests: 4.1 (95% CI, 3.5 to 4.8) for MRA versus 4.0 (95% CI, 3.5 to 4.5) for DUS in the 70% to 99% category and 6.5 (95% CI, 5.7 to 7.4) for MRA versus 6.5 (95% CI, 5.9 to 7.0) for DUS in diagnosing occlusion.
Summary ROC Analysis by Diagnostic Test
In the multivariable model for the diagnosis of 70% to 99% stenosis, type of MR scanner was a significant predictor for the diagnostic performance of MRA. Of note, the regression coefficient for CE-MRA versus nonCE-MRA techniques (practically always a combination of 2- and 3-dimensional TOF) was close to 0 and nonsignificant (-0.35; 95% CI, -1.80 to 1.10). The presence of verification bias and choice of a different cutoff to define severe stenosis were associated with better DUS performance. In recognizing occlusion, no significant heterogeneity was demonstrated among the MRA studies. Presence of verification bias and type of DUS scanner were significant predictors for DUS.
Summary ROC Analysis Comparing MRA and DUS
The multivariable comparative model with adjustment for significant predictors demonstrated a regression coefficient for MRA versus DUS of 1.6 (95% CI, 0.37 to 2.77; P=0.01) for 70% to 99% versus <70% stenosis, indicating that MRA discriminates significantly better than DUS. The jackknife sensitivity analyses, in which each study was excluded one by one from the final model, did not demonstrate any disproportionate influences of individual studies (regression coefficient for MRA versus DUS varied from 1.25 to 1.87 and was always significant). The τ² value for the final comparative model for severe stenosis was 1.29, indicating a moderate model fit and some residual between-study variance. The multivariable summary ROC curves for severe stenosis presented in Figures 2 and 3 were based on the final comparative model. The summary ROC curve for DUS was adjusted to reflect a cutoff of 70% for severe stenosis and the absence of verification bias (Figure 2). The summary ROC curve for MRA was adjusted to reflect the most commonly used types of MR scanners (Figure 3). The dots represent the original true-positive and true-negative rates of the individual publications.
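Because D is the natural logarithm of the diagnostic odds ratio, the regression coefficient of 1.6 reported above can be read as a difference in log odds ratios, and exponentiating it gives a relative diagnostic odds ratio for MRA versus DUS. The short calculation below shows this reading; the variable names are illustrative, and the interpretation assumes that the other covariates in the model are held fixed.

```python
import math

coefficient = 1.6            # MRA versus DUS, from the comparative model
ci_low, ci_high = 0.37, 2.77  # reported 95% CI on the log scale

# Back-transform from the log scale to a ratio of diagnostic odds ratios
relative_dor = math.exp(coefficient)
relative_dor_ci = (math.exp(ci_low), math.exp(ci_high))

print(f"relative DOR: {relative_dor:.2f}")
print(f"95% CI: {relative_dor_ci[0]:.2f} to {relative_dor_ci[1]:.2f}")
```

On this scale the point estimate is roughly 5, with an interval from about 1.4 to about 16. The interval excludes 1, which is the same statement as the log-scale interval excluding 0.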
In differentiating occlusion from <100% stenosis, the regression coefficient for MRA versus DUS was 0.73 (95% CI, -2.06 to 3.51; P=0.51), indicating no difference in diagnostic performance. The jackknife sensitivity analysis demonstrated that the results were not influenced by any particular study (range regression coefficient, 0.42 to 1.12; never significant). The τ² value for the final model for occlusions was 0, indicating excellent model fit and no residual between-study variance. The summary ROC curves for the diagnosis of occlusion (100%) were both close to perfect (not shown).
| Discussion |
|---|
An earlier meta-analysis published by Blakeley et al7 in 1995 concluded that DUS and MRA had similar diagnostic performance in predicting carotid artery occlusion and >70% stenosis. In our study, we found a better discriminatory power for MRA compared with their results. In the study by Blakeley et al, the literature from 1977 to 1993 was reviewed, whereas our literature search started with 1994. Improved MRA technology might explain an increase in diagnostic performance of MRA. Kallmes et al8 reviewed MRA studies published between 1990 and 1994 and found lower sensitivity for MRA in recognizing severe carotid artery stenosis than we did.8 Furthermore, they discussed whether the asymptomatic arteries should be excluded from the results. Exclusion of the asymptomatic side in their analyses gave even lower sensitivities. Finally, a recent review by Westwood et al10 summarizing publications from 1990 to 1999 concluded that MRA seemed an accurate test in selecting patients for carotid surgery but that evidence was not yet very robust because of the heterogeneity of the studies included.10 This study also applied a ROC analysis, and their results on MRA are very similar to ours. Our article differs substantially from the literature in that we performed a systematic review of both MRA and DUS and compared them. Despite the introduction of MRA for imaging the carotid artery, DUS remains very important in the initial workup of patients suspected of carotid artery stenosis. Furthermore, our article differs in that we present variables that explain the heterogeneity observed in the literature.9,10
A limitation in the collection of our data was the fact that, to be able to perform a summary ROC analysis, we could include only those studies from which absolute numbers (true positives, false positives, true negatives, false negatives) for at least one defined threshold were available or derivable. Often, only sensitivities and specificities were presented. When we were unable to reconstruct the absolute numbers or when authors did not reply to our request for additional data, we had to exclude the study. Therefore, it was possible to include only a selection of the contemporary literature in this analysis. Furthermore, in the included papers, we often found incomplete or missing data concerning population baseline characteristics, distribution of disease, and technical aspects of the imaging protocols. For some of the variables, these omissions precluded meaningful analysis.
Another limitation of our study is that we used angiography as the reference standard test. Angiography has several limitations related to the fact that it consists of a limited number of projections of the vessel lumen. The true lumen can be more severely stenosed, and information regarding the extent of plaque and plaque morphology is more readily available from ultrasonography and MR imaging.79,80 The vast majority of studies evaluating DUS and MRA, however, used angiography as the reference, and in reviewing these studies, we were therefore compelled to do so also.
In the meta-analysis, we first calculated the pooled sensitivities and specificities. This method provides a relatively crude estimate of the overall diagnostic performance of the different tests. The pooled weighted analysis showed that both DUS and MRA are highly sensitive and specific for diagnosing carotid artery occlusion. For the diagnosis of occlusion, only the absence of signal needs to be determined, whereas judgment of the severity of a stenosis is not necessary. For detecting the 70% cutoff, MRA showed better sensitivity and slightly better specificity than DUS. MRA probably gives a more precise estimate of the degree of stenosis because it provides a direct measurement of the stenotic lumen (NASCET method), whereas DUS allows only an indirect estimate through measurement of parameters based on blood flow velocities. Therefore, DUS gives a wider dispersion in results compared with DSA, resulting in lower sensitivities and specificities. The pooled sensitivity and specificity of 95% and 90%, respectively, for MRA in detecting severe stenosis are very high, and this technique may improve even further in the near future.
In addition to calculating pooled weighted sensitivities and specificities, we performed a summary ROC analysis, which is especially useful to evaluate overall diagnostic performance. The main advantage of this method is that it adjusts for different positivity criteria, which cannot be achieved with pooling sensitivities and specificities. Initially, we studied both a fixed- and a random-effects model in the summary ROC analyses. Although the random-effects model seems more elegant in a meta-analysis combining the results from diverse studies, a fixed-effects model can also be justified in a diagnostic meta-analysis. Study populations selected for a specific test and disease often have comparable baseline characteristics, which supports the assumption of a true underlying ROC curve. Because of the stricter assumptions, the fixed-effects model can potentially identify additional explanatory variables. In our study, the fixed- and random-effects models showed good consistency in finding significant variables that explain part of the heterogeneity of the different publications. For DUS, publication year was found as an additional explanatory variable for the diagnosis of severe stenosis (70% to 99%) when the fixed-effects model was applied. Earlier studies showed better diagnostic performance, indicating a possible effect of publication bias. Alternatively, this finding may indicate the selection bias that commonly occurs in early studies of new diagnostic technologies. In the diagnosis of occlusion, a consecutive study population was found to be an additional significant variable for MRA.
Remarkably, the type of MR scanner but not the type of DUS scanner was shown to be an explanatory variable in the diagnosis of a 70% to 99% stenosis. It is generally known that different DUS machines with different technologists in different institutions show variable test results, even if the same thresholds for the test parameters are used.81 Although we showed a difference across MRA scanners, the results did not indicate a difference in diagnostic performance depending on the imaging technique used. Technically there is a major difference between nonCE-MRA and CE-MRA. In practically all studies reporting on nonCE-MRA, a combination of 2- and 3-dimensional TOF was used. We evaluated the effect of MRA technique in the summary ROC analysis and found the regression coefficient for CE-MRA versus TOF to be close to 0 and nonsignificant. Thus, according to the results published so far, the technical difference between CE-MRA and TOF techniques does not seem to translate into a significant difference in diagnostic performance. A caveat is, however, that only 4 of the 21 MRA series were based on CE-MRA.
The fact that verification bias plays an important role in detecting 70% to 99% stenoses on DUS is a plausible and important finding. Verification bias may exist if the decision to perform the reference standard procedure depends on the results of the test under investigation. In the included studies, DUS has often been used as a screening test to decide whether to perform DSA.
In conclusion, our results suggest that MRA has a better discriminatory power compared with DUS in recognizing 70% to 99% stenosis and is a sensitive and specific test compared with DSA in the evaluation of carotid artery stenosis. For detecting occlusion of the carotid artery, both modalities are very accurate. To determine whether noninvasive tests can replace DSA in clinical practice, however, not only the test results but also the associated costs and effectiveness should be taken into account.
| Acknowledgments |
|---|
Received July 7, 2002; accepted November 27, 2002.