Performance of Carotid Ultrasound in Evaluating Candidates for Carotid Endarterectomy Is Optimized by an Approach Based on Clinical Outcome Rather Than Accuracy
Background and Purpose: The best method of selecting endarterectomy candidates for cerebral angiography is controversial. Carotid duplex ultrasound (CDUS) is widely used, but its performance varies across institutions. The clinical utility of CDUS could be improved with test criteria based on patient outcome rather than test accuracy.
Methods: In 155 carotid bifurcations studied by CDUS and cerebral angiography, the degree of angiographic stenosis was measured by a reader, blinded to CDUS, using the North American Symptomatic Carotid Endarterectomy Trial (NASCET) method. We calculated accuracy, sensitivity, and specificity for predicting ≥70% angiographic carotid stenosis of different peak systolic frequencies (PSF) measured by CDUS and generated a receiver operator characteristic (ROC) curve. We used NASCET outcome data and published data on angiographic complications to define relative “costs” of false-positive and false-negative CDUS, and we determined the point on the ROC curve representing the CDUS criterion with the highest clinical utility. We compared projected morbidity and mortality rates for 1000 hypothetical endarterectomy candidates resulting from the use of the most accurate CDUS criterion versus the CDUS criterion with the highest clinical utility by ROC analysis.
Results: While PSF ≥8 kHz had the highest CDUS accuracy (93%), its projected stroke and death rate due to CDUS error was 10.4/1000. On the other hand, PSF ≥7 kHz, defined by ROC analysis to have the highest clinical utility, had a lower morbidity and mortality rate of 6.8/1000.
Conclusions: The use of ROC analysis and available outcome data can improve the performance of CDUS in selecting endarterectomy candidates for cerebral angiography.
The importance of identifying severe carotid stenosis has been demonstrated by two recent clinical trials, NASCET and ECST, which showed the benefit of carotid endarterectomy in symptomatic patients.1 2 In these trials, severe carotid stenosis was defined by cerebral angiography, the current “gold standard.” However, cerebral angiography has an associated morbidity, mortality, and financial cost. Its unselected use in patients with transient ischemic attack and minor stroke who are potential endarterectomy candidates seems undesirable, since most of these patients do not have an ipsilateral high-grade stenosis.3 4 5 6 Consequently, most clinicians use noninvasive studies to select patients for cerebral angiography. CDUS is generally recommended for this purpose.7
The carotid endarterectomy trials have generated controversy over the role of CDUS in evaluating patients for angiography. Investigators have commented on the “disappointing” performance of CDUS in these trials, in which the overall accuracy approached 65%,8 9 10 compared with previously reported sensitivities and specificities between 80% and 100%.11 This apparent discrepancy may confuse clinicians as to the role of CDUS in selecting patients for cerebral angiography.
The evolving role of CDUS contributes to this confusion. In the “pre-NASCET era,” the information requested from CDUS was relatively nonspecific. Ultrasound criteria were developed to classify arteries into categories of stenosis severity, such as 0% to 15%, 16% to 50%, 50% to 79%, 80% to 99%, and occluded.12 This method of categorizing disease severity does not meet our present needs well. Because it is now known that symptomatic patients benefit from surgery if they have a ≥70% carotid stenosis, clinicians need CDUS to specifically determine whether a patient is likely to have such a stenosis.1 Since this is one of the most important indications for CDUS, the criteria for interpreting CDUS should be refined to better answer this specific question.
In developing criteria for test interpretation, our natural tendency is to maximize accuracy, that is, to minimize the total number of false-positives and false-negatives. However, patient outcome rather than test accuracy is the ideal measure of test performance. The most “accurate” test may not lead to the best clinical outcome, because not all test errors are equal. In a particular clinical situation, a false-negative error may be more harmful than a false-positive error. Therefore, high sensitivity (few false-negatives) would be more important than either high specificity (few false-positives) or overall accuracy. For example, a test that screens newborns for phenylketonuria, a disease in which early intervention is critical to avoid severe neurological sequelae, should be very sensitive. On the other hand, if a false-positive result is more harmful than a false-negative result, then high specificity is more important than high sensitivity. This would apply to a test for brain death in which a positive test would lead to discontinuation of life support. Adjustments of test criteria generally maximize either sensitivity or specificity, one at the expense of the other. Maximizing accuracy may maximize neither sensitivity nor specificity.
We believe that the development of test criteria for CDUS, based on outcome data published in recent clinical trials, to maximize patient outcome (minimize morbidity and mortality) rather than to maximize test accuracy will improve the performance of CDUS in selecting patients for carotid endarterectomy.
Our laboratory maintains a computerized database (Helix Express) of CDUS results, which includes recorded PSF and end-diastolic frequency values (recorded in 0.5-kHz increments) for internal, external, and common carotid arteries. During a 2-year period, 155 carotid bifurcations in 83 patients were also studied by cerebral angiography (not every patient had bilateral cerebral angiography). These angiograms were interpreted by a single investigator (J.L.W.) blinded to CDUS results, using the method of NASCET.1 As shown in Fig 1, the degree of stenosis of the internal carotid artery was measured as follows:

$$\text{percent stenosis} = \left(1 - \frac{N}{D}\right) \times 100$$

where N is the narrowest residual lumen diameter and D is the lumen diameter of the normal artery distal to the stenosis.

The accuracy, sensitivity, and specificity of different PSF values measured by CDUS in predicting ≥70% angiographic internal carotid artery stenosis were calculated and used to generate an ROC curve. ROC analysis applies to tests that report clinical data as a continuous range of values. The ROC is the relationship between sensitivity and specificity for different test “cutoff” values used as criteria for detecting disease. Various criteria for an “abnormal” CDUS test are compared with a gold standard, cerebral angiography. True-positive rates (sensitivity) for the different criteria are plotted against false-positive rates (1−specificity).13
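The two calculations in this section can be sketched in a few lines of Python. This is a minimal illustration, not the software used in the study: the function names are ours, the stenosis function assumes the NASCET diameter ratio (narrowest residual lumen over normal distal lumen), and the paired PSF and angiography values are invented for the example.

```python
from typing import List, Tuple


def nascet_percent_stenosis(narrowest_mm: float, distal_normal_mm: float) -> float:
    """Percent diameter stenosis from the NASCET ratio (assumed form)."""
    return (1.0 - narrowest_mm / distal_normal_mm) * 100.0


def roc_point(psf_khz: List[float], severe: List[bool],
              cutoff: float) -> Tuple[float, float, float]:
    """Sensitivity, specificity, and accuracy of the criterion PSF >= cutoff."""
    tp = fp = tn = fn = 0
    for psf, has_severe_stenosis in zip(psf_khz, severe):
        test_positive = psf >= cutoff
        if test_positive and has_severe_stenosis:
            tp += 1
        elif test_positive:
            fp += 1
        elif has_severe_stenosis:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Hypothetical paired measurements, NOT the study data.
# severe[i] is True when angiography shows >=70% stenosis.
psf = [3.5, 4.0, 5.0, 6.5, 7.0, 7.5, 8.0, 9.0, 10.0, 11.0]
severe = [False, False, False, False, True, False, True, True, True, True]

for cutoff in (3.5, 7.0, 8.0, 11.0):
    sens, spec, acc = roc_point(psf, severe, cutoff)
    # Each cutoff contributes one ROC point: (1 - specificity, sensitivity).
    print(f"PSF >= {cutoff} kHz: FPR={1 - spec:.2f} TPR={sens:.2f} accuracy={acc:.2f}")
```

Plotting the printed true-positive rate against the false-positive rate for every cutoff gives the ROC curve; lowering the cutoff moves the point up and to the right.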
Identifying the PSF criterion with the highest clinical utility requires knowledge of the prevalence of severe carotid disease, the relative “cost” of missing the diagnosis (false-negative), and the relative cost of making an incorrect diagnosis (false-positive) among patients being evaluated. The formula

$$m = \frac{C_{fp}}{C_{fn}} \times \frac{1 - P}{P}$$

where $C_{fp}$ is the cost of a false-positive diagnosis, $C_{fn}$ is the cost of a false-negative diagnosis, and $P$ is the prevalence of the disease in question, gives a slope $m$. The point on the ROC curve that has this slope is the test criterion with the highest clinical utility for identifying disease.13 Here, cost refers to the risk of stroke and death rather than financial expenditure. $C_{fp}$ represents the 1% morbidity and mortality associated with “unnecessary” angiography,14 and $C_{fn}$ is the 16.5% excess morbidity and mortality of a severe stenosis treated medically versus surgically as reported in NASCET.1 The latter is calculated as the difference between stroke and death rates for medically treated disease (32.3%) and stroke and death rates for surgically treated disease (15.8%) (32.3%−15.8%=16.5%). The prevalence of a ≥70% stenosis among patients with carotid territory ischemia was estimated from published data, which report a 20% to 60% prevalence of angiographic high-grade carotid stenosis among patients with carotid territory ischemia.3 4 5 6 We chose 40% as the prevalence of disease. This results in a slope of m=0.09 at the point on the ROC curve representing the test criterion with the highest clinical utility.
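As a check on the arithmetic, the slope can be computed directly. The sketch below assumes the standard cost-weighted form, in which the cost ratio (false-positive over false-negative) is multiplied by the odds against disease, (1−P)/P; this form reproduces the reported value of 0.09. The function and variable names are illustrative only.

```python
def optimal_roc_slope(cost_fp: float, cost_fn: float, prevalence: float) -> float:
    """Slope of the ROC curve at the operating point with the highest clinical utility.

    Assumed form: (cost_fp / cost_fn) * ((1 - prevalence) / prevalence).
    """
    return (cost_fp / cost_fn) * ((1.0 - prevalence) / prevalence)


COST_FP = 0.01            # stroke and death risk of an "unnecessary" angiogram
COST_FN = 0.323 - 0.158   # excess 2-year risk of a missed severe stenosis (NASCET)
PREVALENCE = 0.40         # assumed prevalence of >=70% stenosis

m = optimal_roc_slope(COST_FP, COST_FN, PREVALENCE)
print(round(m, 2))  # 0.09
```

Because a missed stenosis is weighted about 16 times more heavily than an unnecessary angiogram, the slope is shallow, which places the preferred operating point on the high-sensitivity part of the curve.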
We compared the projected morbidity and mortality rates for a hypothetical cohort of 1000 symptomatic carotid endarterectomy candidates who are selected for angiography using two different CDUS interpretation criteria: the most accurate criterion versus that defined by ROC analysis to have the highest clinical utility. A 40% prevalence of high-grade disease among symptomatic patients was assumed. We projected stroke and death rates due to angiographic complications or due to failure to diagnose a high-grade stenosis for the cohort of 1000 patients evaluated by three different methods: angiography without prior ultrasound (algorithm 1), ultrasound selection for angiography using the CDUS criterion with the highest clinical utility (algorithm 2), and ultrasound selection for angiography using the most accurate CDUS criterion (algorithm 3).
The ROC curve generated for our data is presented in Fig 2. Each point on the curve represents the false-positive rate and the true-positive rate for a specific PSF for predicting ≥70% angiographic stenosis.
The line with the slope 0.09 (calculated above) is tangential to the ROC curve at a PSF of 7 kHz, defining the test criterion with the highest clinical utility. Calculated accuracies for each PSF revealed that PSF of ≥8 kHz had the highest accuracy (93%) in predicting ≥70% internal carotid artery stenosis, while PSF of ≥7 kHz had an accuracy of 91%.
The projected clinical outcomes of three algorithms used to evaluate 1000 carotid endarterectomy candidates, with 40% prevalence of true high-grade (≥70%) stenosis, are presented in Table 1. Although the accuracy is higher (93%) for PSF of ≥8 kHz (algorithm 3), the cost in strokes and deaths is lowest (6.8/1000) for PSF of ≥7 kHz (algorithm 2), as predicted from our ROC analysis. If all 1000 symptomatic patients have angiography without CDUS selection (algorithm 1), 539 more cerebral angiograms and 3.2 (10−6.8) more strokes or deaths would occur compared with the use of algorithm 2. If algorithm 3 were used, 63 fewer angiograms would be performed compared with algorithm 2, but 3.6 (10.4−6.8) more strokes or deaths would result.
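The cohort projection can be approximated with a short sketch. It rests on an accounting assumption that is ours rather than stated explicitly in the text: every patient sent to angiography (true-positive or false-positive) carries the 1% angiographic risk, and every missed high-grade stenosis carries the 16.5% excess risk. The function name and parameters are hypothetical. With the rounded sensitivity and specificity reported for PSF ≥7 kHz (97% and 88%), the sketch gives about 460 angiograms and about 6.6 events per 1000, close to the figures quoted above (461 angiograms, 6.8 events); the small gap presumably reflects rounding of the inputs.

```python
from typing import Tuple


def project_cohort(sensitivity: float, specificity: float,
                   prevalence: float = 0.40, n: int = 1000,
                   angio_risk: float = 0.01,
                   missed_risk: float = 0.165) -> Tuple[float, float]:
    """Return (angiograms performed, projected strokes or deaths) for n patients."""
    diseased = n * prevalence
    not_diseased = n - diseased
    true_pos = diseased * sensitivity
    false_neg = diseased - true_pos
    false_pos = not_diseased * (1.0 - specificity)
    angiograms = true_pos + false_pos           # every positive CDUS goes to angiography
    events = angiograms * angio_risk + false_neg * missed_risk
    return angiograms, events


# Algorithm 1: angiography for everyone, modeled as a test that is always positive.
angio_all, events_all = project_cohort(sensitivity=1.0, specificity=0.0)

# Algorithm 2: CDUS selection with PSF >= 7 kHz (rounded values from the text).
angio_7, events_7 = project_cohort(sensitivity=0.97, specificity=0.88)

print(round(angio_all), round(events_all, 1))  # 1000 10.0
print(round(angio_7), round(events_7, 1))      # 460 6.6
```

The same function can be applied to any other criterion once its sensitivity and specificity are known from a local CDUS-angiography correlation.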
Defining the clinically most useful test criterion in a specific situation depends on finding the correct trade-off between sensitivity and specificity. This is facilitated by ROC analysis. For example, in our analysis, a PSF of ≥3.5 kHz (the right-most point on the curve in Fig 2) is a very sensitive criterion in detecting ≥70% stenosis (the true-positive rate approaches 1) but has poor specificity. Using this criterion would result in missing very few high-grade stenoses but would lead to a high number of unnecessary angiograms. Conversely, for a PSF of ≥11 kHz (the left-most point), CDUS is very specific (no false-positives) but relatively insensitive. Therefore, while fewer angiograms would be performed on patients with lower grades of stenoses, an unacceptably high number of high-grade stenoses would be missed.
The natural history of the disease to be identified and the efficacy of treatment play a role in determining the relative importance of sensitivity and specificity. Symptomatic patients with severe carotid stenosis treated medically have a stroke and death rate of 32.3% over 2 years, while those treated surgically have a stroke and death rate of 15.8% over the same time period.1 Therefore, undiagnosed severe carotid disease has an excess 2-year morbidity and mortality of 16.5%. To identify such patients, CDUS criteria should be highly sensitive, with the fewest number of false-negatives, since patients with a false-negative diagnosis carry a high excess morbidity and mortality. False-positive diagnoses carry a significant but substantially lower penalty. A false-positive CDUS diagnosis of ≥70% stenosis leads to unnecessary cerebral angiography, with an estimated stroke and death rate of 1%, as well as a financial cost. These data suggest that the ultrasound criterion that performs best for these patients will allow very few false-negatives and relatively more false-positives. In fact, the optimum point by our analysis has a higher sensitivity (97%) than specificity (88%) (Table 1).
The true disease prevalence in the population being evaluated also plays a critical role in defining the best test criterion. Ideally, if the ROC analysis were performed on an angiographic-CDUS correlation of unselected carotid endarterectomy candidates, the true disease prevalence would be known. This would require that every patient with carotid territory ischemic symptoms undergo both angiography and CDUS examination of the ipsilateral carotid artery. However, it is common practice in our institution and elsewhere to select patients for angiography with the use of CDUS, and it is rare for patients with normal CDUS evaluation to undergo angiography. To obtain a broad range of CDUS and angiographic results for our analysis, we included the results from the contralateral (asymptomatic) carotid arteries in our analysis, assuming that the diagnostic performance of CDUS in asymptomatic arteries is not different from that in symptomatic arteries. These factors required that we estimate the true prevalence of ≥70% stenosis in unselected symptomatic patients from published literature rather than use the prevalence of ≥70% stenosis in our population. In our model, we assumed a prevalence of high-grade carotid disease among symptomatic patients of 40%, on the basis of studies that estimated the prevalence between 20% and 60%.3 4 5 6 If the true prevalence were higher (60%), then the derived slope, m, is lower (0.04), and the optimum point on the curve moves to the right to a PSF of ≥6 or 6.5 kHz. If the prevalence of high-grade stenosis were lower (20%), then the slope (m) is higher (0.24). However, the closest point on the curve corresponding to this slope is still a PSF of ≥7 kHz. This cutoff value appears to optimize our selection of patients for angiography.
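The effect of prevalence on the preferred criterion can be illustrated with a small sketch. It assumes the cost-weighted slope described in Methods and uses the fact that the point where a line of slope m touches the ROC curve is the point that maximizes TPR − m × FPR. The ROC points below are hypothetical, chosen only to resemble the shape described in the text; the single value taken from our data is the 7-kHz point (sensitivity 97%, specificity 88%). All names are illustrative.

```python
from typing import Dict, Tuple

ANGIO_RISK = 0.01             # cost of a false-positive (unnecessary angiogram)
EXCESS_RISK = 0.323 - 0.158   # cost of a false-negative (missed severe stenosis)

# cutoff in kHz -> (false-positive rate, true-positive rate)
# Hypothetical values except for the 7.0-kHz point.
ROC_POINTS: Dict[float, Tuple[float, float]] = {
    3.5: (1.00, 1.00),
    6.0: (0.25, 0.98),
    7.0: (0.12, 0.97),
    8.0: (0.06, 0.90),
    11.0: (0.00, 0.40),
}


def slope(prevalence: float) -> float:
    """Cost-weighted ROC slope for a given disease prevalence."""
    return (ANGIO_RISK / EXCESS_RISK) * ((1.0 - prevalence) / prevalence)


def best_cutoff(prevalence: float,
                points: Dict[float, Tuple[float, float]] = ROC_POINTS) -> float:
    """Cutoff whose ROC point maximizes TPR - m * FPR (the tangent point)."""
    m = slope(prevalence)
    return max(points, key=lambda cutoff: points[cutoff][1] - m * points[cutoff][0])


for prevalence in (0.20, 0.40, 0.60):
    print(prevalence, round(slope(prevalence), 2), best_cutoff(prevalence))
```

With these illustrative points, the slopes come out as 0.24, 0.09, and 0.04, and the selected cutoff is 7 kHz at 20% and 40% prevalence and shifts to the more sensitive 6-kHz point at 60%, mirroring the pattern described above.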
Because patients in our database were selected for angiography, verification bias may have occurred. This results when only a subset of patients screened (with CDUS) undergoes verification (cerebral angiography). While our use of the asymptomatic side could potentially minimize the degree of verification bias, its effect is unknown. Adjustment for verification bias requires that a subset of unselected patients with normal CDUS undergo angiography. The risk of angiography precludes this in our and others’ studies. In general, verification bias may lead to an underestimation of the sensitivity and an overestimation of the specificity of the screening test parameters.15 16
The sensitivity and specificity in our laboratory of PSF of ≥7 kHz for detecting ≥70% angiographic stenosis are 97% and 88%, respectively. This is similar to the sensitivity and specificity of 97% and 87% reported recently for color-assisted CDUS.17 The criterion that has the highest clinical utility may not be the value for which CDUS performs most accurately. The accuracy is highest (93%) for a PSF of ≥8 kHz, while the accuracy is 91% at the clinically optimum point of ≥7 kHz.
The methodology of angiographic interpretation is not consistent among institutions or clinical trials but is also very important in this analysis. The NASCET method uses the ratio of the narrowest residual lumen diameter to the lumen diameter of the “normal” artery distal to the stenosis (Fig 1).18 Other criteria estimate a normal diameter at the site of the stenosis2 and compare this estimate to the residual lumen. Others determine a percent area rather than percent diameter reduction. Still others report a residual lumen diameter. Each method leads to different estimates of disease severity. The term “hemodynamically significant stenosis” has many definitions, such as >50% stenosis, >70% stenosis, or <2 mm residual lumen. It is important to realize that published outcome data are specific to the method of angiographic interpretation. Therefore, internal consistency in the choice of outcome data and angiographic interpretation method is required. We used the outcome data and the method of angiographic interpretation published by NASCET.
Our suggestions for optimal CDUS evaluation of endarterectomy candidates are presented in Table 2. CDUS criteria differ across institutions, reflecting variable methodology, operator skill, and technology. The best use of CDUS as a tool to select patients for angiography requires that criteria be identified and validated at individual institutions. Institutions may use velocity rather than frequency measurements or a velocity ratio between the internal and common carotid arteries.19 An ROC analysis should be performed, ideally with local surgical and angiographic morbidity and mortality data and local measures of disease prevalence. When these are not available, we suggest that NASCET and other published data be used, as illustrated here. This analysis requires CDUS-angiography correlation in each institution. Angiographic interpretation should use measurement criteria for which clinical outcome data exist. For example, in our analysis we used NASCET outcome data and the NASCET method of measuring angiographic stenosis. While such CDUS-angiography correlations are time consuming, they reflect current requirements for accreditation by the Intersocietal Commission for the Accreditation of Vascular Laboratories (ICAVL). The successful application of this approach requires communication between the physician ordering the test and the ultrasonographer. Physicians ordering CDUS need to alert the ultrasonographer to the indication for the test to obtain the most clinically useful information from the result.
Importantly, this analysis is flexible. If future carotid surgery trials show that symptomatic patients with 50% or 60% stenoses benefit from surgery, the criteria can be modified with the use of the same approach. Similarly, criteria for interpreting MR angiography may be developed with the use of ROC and efficacy analysis as outlined here.
Our analysis applies only to a single indication for CDUS, evaluating symptomatic patients for carotid endarterectomy. While it is the most important clinical indication for CDUS at present, it is not the only reason for which CDUS is performed. Specific outcome and treatment data for other indications are not currently available. For these situations, the traditional approach of estimating disease severity with the use of criteria that maximize accuracy is appropriate. Because recent clinical trials have demonstrated a surgical benefit for asymptomatic stenoses, a separate ROC analysis should be performed and separate ultrasound criteria developed for asymptomatic patients, using the natural history of asymptomatic disease and the degree of stenosis that is shown to benefit from surgery.20
In conclusion, the concepts outlined here promote interpretation of CDUS based on patient outcome rather than test accuracy. We believe that this approach will maximize the clinical utility of CDUS. Proof of the clinical benefit of this approach requires a prospective clinical trial.
Selected Abbreviations and Acronyms
CDUS = carotid duplex ultrasound
ECST = European Carotid Surgery Trial
NASCET = North American Symptomatic Carotid Endarterectomy Trial
PSF = peak systolic frequency
ROC = receiver operator characteristic
- Received February 2, 1996.
- Revision received February 23, 1996.
- Accepted February 23, 1996.
- Copyright © 1996 by American Heart Association
Hankey G, Warlow C. Symptomatic carotid ischaemic events: safest and most cost effective way of selecting patients for angiography, before carotid endarterectomy. BMJ. 1990;300:1485-1491.
Ueda K, Toole J, McHenry L. Carotid and vertebrobasilar transient ischemic attacks: clinical and angiographic correlation. Neurology. 1979;29:1094-1101.
Health and Public Policy Committee, American College of Physicians. Diagnostic evaluation of the carotid arteries. Ann Intern Med. 1988;109:835-837.
Riles T, Eidelman E, Litt A, Pinto R, Oldford F, Schwartzenberg G. Comparison of magnetic resonance angiography, conventional angiography, and duplex scanning. Stroke. 1992;23:341-346.
Feussner J, Matchar D. When and how to study the carotid arteries. Ann Intern Med. 1988;109:805-818.
Dion J, Gates P, Fox A, Barnett H, Blom R. Clinical events following neuroangiography: a prospective study. Stroke. 1987;18:997-1004.
Sitzer M, Furst G, Fischer H, Siebler M, Fehlings T, Kleinschmidt A, Kahn T, Steinmetz H. Between-method correlation in quantifying internal carotid stenosis. Stroke. 1993;24:1513-1518.
Barnett H, Peerless S, Fox A, Haynes R, Taylor D. North American Symptomatic Carotid Endarterectomy Trial: methods, patient characteristics, and progress. Stroke. 1991;22:711-720.
Moneta GL, Edwards JM, Chitwood RW, Taylor LM, Lee RW, Cummings CA, Porter JM. Correlation of North American Symptomatic Carotid Endarterectomy Trial (NASCET) angiographic definition of 70% to 99% internal carotid artery stenosis with duplex scanning. J Vasc Surg. 1993;17:152-159.
Wilterdink JL, Feldmann E, Easton JD, Ward R. Carotid duplex ultrasound (CDUS) interpretation in asymptomatic carotid endarterectomy candidates: a patient-outcome rather than accuracy-based approach. Neurology. 1995;45:A224. Abstract.