Interobserver Agreement for 10% Categories of Angiographic Carotid Stenosis
Background and Purpose Although the reliability of the assessment of severe 70% to 99% carotid stenosis by carotid angiography has been proven excellent, this may not necessarily be the case for a more detailed classification of carotid stenoses by 10% categories.
Methods Angiograms of the carotid arteries were assessed pairwise by three independent, experienced observers. The degree of stenosis of both the carotid bifurcation and the internal carotid artery was measured according to the European Carotid Surgery Trial method. Kappa statistics were used to assess the agreement beyond chance for severe (70% to 99%) carotid stenosis (κ1) and for 10% categories of carotid stenosis (κ2). The penalty scores were then weighted by the relative difference in risk (RDR) of stroke in the ipsilateral carotid distribution between the 10% categories (κ3). The RDR method was adjusted by assuming that only patients with a severe carotid stenosis would undergo surgery, so that the penalty was 0 if there was no disagreement about the indication for surgery (κ4). A further adjustment (κ5) was made by assuming that the observers' assessments of the degree of carotid stenosis would lead to different treatment recommendations in 50% of the cases, and accordingly the penalty for disagreement (RDR) was halved.
Results One hundred twenty-one carotid bifurcations in 65 patients with a transient ischemic attack or nondisabling stroke were assessed. The intraclass correlation between the exact estimates of carotid stenosis was .90 (95% confidence interval, .85 to .92). The mean difference in stenosis between the two raters was 0.8% (95% confidence interval, −2.1% to 3.7%). κ1 to κ5 equaled 0.80, 0.40, 0.79, 0.91, and 0.92, respectively.
Conclusions Interobserver agreement for distinct 10% categories of angiographic carotid stenosis is moderate, but when realistic risk- and decision-based weights are used, agreement between experienced observers can be almost perfect.
Carotid angiography has been proven a reliable screening technique for distinguishing between presence or absence of a severe carotid stenosis.1 2 3 This is clinically important because carotid endarterectomy has been proven effective in patients with a severe (70% to 99%) symptomatic carotid stenosis.4 5 Interobserver agreement may not necessarily be as good for a more detailed classification of carotid stenoses, for example by 10% categories. It may be clinically relevant to make such a distinction because the risk of stroke in the ipsilateral carotid distribution and the absolute risk reduction by endarterectomy increase with the degree of carotid stenosis. Some patients with increased risks of death or stroke within 30 days of surgery and a 70% to 80% carotid stenosis could therefore be in a more advantageous position without an operation, whereas patients with low surgical risks, many other risk factors, and a 60% to 70% carotid stenosis may actually benefit from endarterectomy. A beneficial effect of endarterectomy in the 30% to 69% category of carotid stenosis has not been proven and is at best small, although it probably depends on the presence of other risk factors for surgical complications and stroke.6 We therefore assessed interobserver agreement for 10% categories of carotid stenosis.
Subjects and Methods
We randomly selected 65 patients (by assigning random numbers) from the 164 patients with a transient ischemic attack (TIA), amaurosis fugax, retinal infarction, or nondisabling stroke who were admitted to our department between May 1990 and October 1995 and underwent carotid angiography.
Selective angiography of the carotid arteries was performed by means of the Seldinger arterial catheterization technique, with the use of the femoral approach. In all patients the aortic arch and the intracranial vasculature were visualized. The common, internal, and external carotid arteries were visualized in at least two directions. Visualization of the asymptomatic carotid artery was omitted only when the symptomatic carotid artery appeared not to be stenosed at all or seemed occluded. The degree of stenosis of the common and internal carotid arteries was measured at the most severe site, according to the European Carotid Surgery Trial (ECST) method,7 with the help of a small scale graduated in millimeters. A carotid stenosis of 40%, 70%, 80%, and 90% measured according to the ECST method would be rated on average 0%, 50%, 67%, and 84%, respectively, according to the North American Symptomatic Carotid Endarterectomy Trial criteria.8 Each angiogram was assessed by two of three experienced clinicians, who were blinded to each other’s assessments. Clinical information, other than that the patient had had a recent TIA or nondisabling stroke in the anterior circulation, was not provided.
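The correspondence between the two measurement methods quoted above can be sketched as a lookup with linear interpolation between the four quoted pairs. This is an illustration only: the function name `ecst_to_nascet` is hypothetical, the interpolation between the quoted points is an assumption rather than the published conversion, and values outside the quoted 40% to 90% ECST range are rejected instead of extrapolated.

```python
from bisect import bisect_right

# Average correspondence quoted in the text: ECST reading -> NASCET reading (%).
ECST_POINTS = [40.0, 70.0, 80.0, 90.0]
NASCET_POINTS = [0.0, 50.0, 67.0, 84.0]


def ecst_to_nascet(ecst):
    """Interpolate linearly between the quoted pairs (an assumption)."""
    if not ECST_POINTS[0] <= ecst <= ECST_POINTS[-1]:
        raise ValueError("outside the range of the quoted ECST values")
    # Index of the upper bracketing point, capped at the last quoted pair.
    upper = min(bisect_right(ECST_POINTS, ecst), len(ECST_POINTS) - 1)
    lower = upper - 1
    span = ECST_POINTS[upper] - ECST_POINTS[lower]
    fraction = (ecst - ECST_POINTS[lower]) / span
    return NASCET_POINTS[lower] + fraction * (
        NASCET_POINTS[upper] - NASCET_POINTS[lower]
    )


print(ecst_to_nascet(70.0))  # a quoted pair, returned unchanged
print(ecst_to_nascet(75.0))  # interpolated between the 70% and 80% pairs
```

The quoted pairs are averages, so a single angiogram may deviate from any such conversion.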
We computed the intraclass correlation between the observers’ estimates, which were randomly divided into two groups,9 and the mean difference in percent stenosis between the two assessments with a 95% confidence interval (CI) for the whole study group and for each of the three pairs of observers separately to identify any systematic deviations.10 The agreement between the observers was also computed after adjustment for the effects of chance with the use of the κ statistic.11 An advantage of the κ statistic is that it can accommodate weights that reflect the severity or importance of a disagreement. First, the agreement for presence of a severe (70% to 99%) carotid stenosis was assessed (κ1). After that, the agreement for specific 10% categories (ie, 0% to 9%, 10% to 19%, . . ., 90% to 99%, 100%) of carotid stenosis was computed (κ2). In this way, however, each disagreement between observers would be penalized equally, irrespective of the extent of the difference between the two observers and the consequences of the disagreement for the decision to recommend endarterectomy. We therefore constructed a third statistic (κ3) based on the difference in risk of stroke in the ipsilateral carotid distribution as a function of the degree of carotid stenosis. For example, a 70% to 79% symptomatic carotid stenosis carries a 24.7% risk of stroke within 3 years, whereas a 30% to 39% stenosis carries an 8.8% risk of stroke within 3 years (Fig 1) (J. Slattery, personal communication; data presented at the final ECST investigators meeting in Munich, Germany, September 1996).
If the first observer rated the stenosis at 75% and the second observer rated it at 35%, the penalty for this disagreement would be the difference in risk divided by the maximal possible risk difference, ie, the difference between the risk of stroke associated with a tight 90% to 99% symptomatic carotid stenosis (39.4%) and the risk of stroke associated with a minimal carotid stenosis of 0% to 9% (3.8%): (24.7%−8.8%)/(39.4%−3.8%)=15.9%/35.6%=0.45. Although this statistic considers the magnitude of the disagreement between the observers, it does not take into account the consequences of such a disagreement for the recommendation for endarterectomy. The fourth statistic (κ4) therefore applies the same weights as the third, but only when the recommendations for endarterectomy would differ, assuming that surgery would be recommended only for patients with a 70% to 99% carotid stenosis. The penalty for the disagreement in the previous example would remain unchanged, but the penalty for a disagreement in which the first observer assessed the stenosis at 65% and the second observer assessed it at 55% would be zero because the therapeutic choice would not change. The fifth statistic (κ5) is the same as the fourth, but disagreements in this range of stenosis were assumed to lead to different treatment recommendations in only 50% of the cases, and thus the penalty score (relative risk difference) was halved. All analyses were performed with Stata statistical software.12
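The penalty schemes behind κ3 to κ5 can be sketched in a few lines. The block below is an illustration under stated assumptions, not the Stata analysis itself: all function names are hypothetical, only the four risks quoted in the text are included (the risks for the remaining categories are not given here), the halving for κ5 is applied when a moderate (30% to 69%) rating is involved, which is one possible reading, and `weighted_kappa` uses the standard disagreement-weight form of weighted κ, which the text does not spell out.

```python
# Three-year risks (%) of ipsilateral stroke quoted in the text, keyed by
# 10% stenosis category. Risks for the other categories are not given in
# this section, so they are intentionally left out rather than guessed.
RISK = {"0-9": 3.8, "30-39": 8.8, "70-79": 24.7, "90-99": 39.4}
MAX_RISK_DIFF = RISK["90-99"] - RISK["0-9"]  # largest possible difference


def lower_bound(category):
    """Lower bound of a category label such as '70-79' or '100'."""
    return int(category.split("-")[0])


def is_severe(category):
    """Surgery is assumed to be recommended only for 70% to 99% stenosis."""
    return 70 <= lower_bound(category) <= 99


def is_moderate(category):
    """Moderate stenosis, taken here as 30% to 69%."""
    return 30 <= lower_bound(category) <= 69


def penalty_k3(a, b):
    """Relative difference in risk (RDR) between two categories."""
    return abs(RISK[a] - RISK[b]) / MAX_RISK_DIFF


def penalty_k4(a, b):
    """RDR penalty, applied only when the surgical recommendation differs."""
    if is_severe(a) == is_severe(b):
        return 0.0
    return penalty_k3(a, b)


def penalty_k5(a, b):
    """Kappa-4 penalty, halved when a moderate rating is involved
    (an assumed reading of the text)."""
    penalty = penalty_k4(a, b)
    if is_moderate(a) or is_moderate(b):
        penalty *= 0.5
    return penalty


def weighted_kappa(table, penalty):
    """Weighted kappa from a square table of counts and a penalty matrix.

    table[i][j] counts cases rated category i by one observer and j by the
    other; penalty[i][j] is the disagreement weight (0 on the diagonal).
    """
    size = len(table)
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(size)) for j in range(size)]
    observed = sum(penalty[i][j] * table[i][j]
                   for i in range(size) for j in range(size)) / total
    expected = sum(penalty[i][j] * row_totals[i] * col_totals[j]
                   for i in range(size) for j in range(size)) / total ** 2
    return 1.0 - observed / expected


print(round(penalty_k3("70-79", "30-39"), 2))  # 0.45, the worked example
print(penalty_k4("60-69", "50-59"))            # 0.0, same recommendation
```

With a penalty of 1 for every off-diagonal cell, `weighted_kappa` reduces to the unweighted κ used for κ1 and κ2.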
Results
Sixty-five patients with either amaurosis fugax (n=8), TIA (n=19), or nondisabling stroke (n=38) were included in this study. Forty-three patients were male, and the mean age was 53 years (range, 19 to 77 years). The number of patients and angiograms in our study was limited, but the stenosis grading was quite evenly distributed over the study population (Fig 2). The intraclass correlation coefficient between the two assessments was .90 (95% CI, .85 to .92). The mean difference in percent stenosis between the two assessments was 0.8% (95% CI, −2.1% to 3.7%). For each observer pair separately, the 95% CI for the mean difference always contained 0, and the point estimate of the difference was always less than 5%. Of the 121 carotid arteries assessed, 15 were classified by both observers as having a severe carotid stenosis, and six were classified as such by only one observer. The overall agreement on the presence or absence of a severe carotid stenosis was 95%, and κ1 was 0.80. When 10% categories were considered as the diagnostic criterion, the overall agreement dropped to 54%, and the chance-adjusted agreement (κ2) dropped to 0.40 (Table). When the “penalty” for a disagreement was weighted with the relative difference in risk of stroke in the ipsilateral carotid distribution, the agreement between observers (κ3) improved considerably (Table). Agreement improved further when disagreement about the exact degree of severe carotid stenosis was not penalized (κ4). Only a small further improvement in the κ statistic was obtained by assuming that in only half of the cases with a moderate (30% to 69%) carotid stenosis interobserver disagreement would lead to conflicting treatment recommendations (κ5).
Discussion
We show that although interobserver agreement for distinct 10% categories of angiographic carotid stenosis was moderate, the use of appropriate and realistic risk- and decision-based weights improves agreement between experienced observers to a high level.
Optimal treatment decisions for patients with symptomatic carotid stenosis depend on an accurate and reliable assessment of the degree of carotid stenosis as well as on other risk factors for ischemic stroke. It is therefore reassuring that interobserver agreement for a detailed, clinically relevant categorization of carotid stenoses on angiography is almost perfect when realistic “penalties” for disagreement between observers are used.
No other study has assessed observer agreement for detailed categories of carotid stenosis, by angiography or any other imaging method, while also taking into account the size and importance of the disagreements,1 2 3 13 14 although others have used intraclass correlations with good results.2 We propose that a validated method for detailed assessment of interobserver agreement, with adjustment for chance and for the extent of disagreement, be incorporated in the evaluation of any diagnostic procedure for carotid artery stenosis.
This study was supported in part by the Stichting Neurovasculair Onderzoek Rotterdam. We thank the main investigators of the ECST for allowing us the use of their estimates of the risk of stroke in the ipsilateral carotid distribution and for their helpful comments.
- Received May 5, 1997.
- Revision received August 5, 1997.
- Accepted August 28, 1997.
- Copyright © 1997 by American Heart Association
Young GR, Sandercock PAG, Slattery J, Humphrey PRD, Smith ETS, Brock L. Observer variation in the interpretation of intra-arterial angiograms and the risk of inappropriate decisions about carotid endarterectomy. J Neurol Neurosurg Psychiatry. 1996;60:152–157.
Vanninen R, Manninen H, Koivisto K, Tulla H, Partanen K, Puranen M. Carotid stenosis by digital subtraction angiography: reproducibility of the European Carotid Surgery Trial and the North American Symptomatic Carotid Endarterectomy Trial measurement methods and visual interpretation. AJNR Am J Neuroradiol. 1994;15:1635–1641.
Rothwell PM, Gibson RJ, Slattery J, Warlow CP. Prognostic value and reproducibility of measurements of carotid stenosis: a comparison of three methods on 1001 angiograms. Stroke. 1994;25:2440–2444.
Rothwell PM, Gibson RJ, Slattery J, Sellar RJ, Warlow CP. Equivalence of measurements of carotid stenosis: a comparison of three methods on 1001 angiograms. Stroke. 1994;25:2435–2439.
Stata Corporation. Stata 4.0. College Station, Tex: Stata Press; 1995.
Carpenter JP, Lexa FJ, Davis JT. Determination of duplex Doppler ultrasound criteria appropriate to the North American Symptomatic Carotid Endarterectomy Trial. Stroke. 1996;27:695–699.
Young GR, Humphrey PR, Nixon TE, Smith ET. Variability in measurement of extracranial internal carotid artery stenosis as displayed by both digital subtraction and magnetic resonance angiography: an assessment of three caliper techniques and visual impression of stenosis. Stroke. 1996;27:467–473.