Interobserver Variation of ASPECTS in Real Time
Background— The Alberta Stroke Program Early CT Score (ASPECTS) has been used to quantify early ischemic changes on computed tomography (CT) brain scans of acute stroke patients. We sought to assess the reliability of the score when performed in real time as compared with an expert rating performed at a later time point.
Methods— Two hundred fourteen patients presenting with acute ischemic stroke or transient ischemic attack were prospectively recruited if they had a brain CT scan performed within 12 hours of symptom onset. Each scan was read for ASPECTS prospectively by the treating physician and later by 1 expert reader. A weighted kappa statistic was used to determine the interobserver agreement.
Results— The median baseline National Institutes of Health Stroke Scale score was 5 (range: 0 to 32) and the median time to CT scan was 152 minutes (range: 22 to 769). The interobserver agreement between ASPECTS performed in real time and expert ASPECTS was substantial (κw=0.69). The mean difference between real-time ASPECTS and expert ASPECTS was 0 (SD: 1.1).
Conclusions— ASPECTS is a reliable clinical scale for rating early ischemic changes on CT when performed in real time.
Thrombolysis for stroke1 within 3 hours of symptom onset remains the only approved acute treatment for ischemic stroke. Appropriate patient selection for thrombolysis in acute stroke is important and continues to be refined. Computed tomographic (CT) imaging may be central to this selection2,3 by identifying ischemic changes in the hyperacute phase.
The Alberta Stroke Program Early CT Score (ASPECTS) was developed to offer a reproducible grading system to assess early ischemic changes on pretreatment CT studies in patients with acute ischemic stroke. It is a 10-point scale that grades the extent of ischemic change within the territory of the middle cerebral artery.4,5 With good training and experience, the early changes of acute ischemia in the middle cerebral artery can be detected reliably. However, previous studies of ASPECTS have been performed with consensus assessment3 or not prospectively.4 In this article, we describe the reliability of ASPECTS rated prospectively by the treating physician as compared with the ASPECTS performed at a later date by an expert rater.
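As a minimal sketch of how the 10-point score is tallied: ASPECTS starts at 10, and 1 point is subtracted for each predefined middle cerebral artery region that shows early ischemic change. The region labels below follow the standard ASPECTS template (caudate, lentiform nucleus, internal capsule, insular ribbon, and cortical regions M1 to M6); they are not listed in this article, and the function name is illustrative only.

```python
# Standard ASPECTS regions (assumed from the published template,
# not spelled out in this article).
ASPECTS_REGIONS = ("C", "L", "IC", "I", "M1", "M2", "M3", "M4", "M5", "M6")


def aspects_score(affected_regions):
    """Return 10 minus the number of distinct regions with early ischemic change."""
    affected = set(affected_regions)
    unknown = affected - set(ASPECTS_REGIONS)
    if unknown:
        raise ValueError(f"Unknown ASPECTS region(s): {sorted(unknown)}")
    return len(ASPECTS_REGIONS) - len(affected)


# Illustrative inputs, not study data.
normal_scan = aspects_score([])              # no region affected
insula_and_m2 = aspects_score(["I", "M2"])   # two regions affected
```

A scan with no early ischemic change therefore scores 10, and a scan with change in every region scores 0.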
Patients and Methods
Two hundred fourteen patients were recruited prospectively. Patients were eligible if they presented with stroke or transient ischemic attack (consisting of hemiparesis or aphasia lasting >5 minutes), were scanned within 12 hours of when they were last seen well, were older than 18 years, and were functionally independent on the modified Rankin scale (score ≤2). Patient demographics were recorded at the time of admission to the emergency department. The protocol was approved by the local institutional ethics review board.
Standard noncontrast CT was performed with a fourth-generation multi-slice CT scanner (GE Medical Systems) in the emergency room. The noncontrast CT scanning technique was as follows: 120 kV, 170 mA, 2-second scan time, and 5-mm slice thickness. Coverage was from skull base to vertex with contiguous axial slices parallel to the inferior orbitomeatal line. A window width of 75 to 80 Hounsfield units (HU) and window level of 30 to 40 HU were used to maximize tissue contrast. Physicians were able to alter the window width and leveling as appropriate to maximize the appearance of ischemic changes.6
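To illustrate why a narrow window helps, the sketch below applies a simple linear window to Hounsfield values. It assumes an 8-bit display and a plain linear mapping; the function name, the wide comparison window, and the two tissue values (30 and 36 HU) are illustrative assumptions, not settings or measurements from this study.

```python
def apply_window(hu, width=80.0, level=40.0):
    """Map a Hounsfield value to a 0-255 gray level with a linear window.

    Values below (level - width/2) display as black, values above
    (level + width/2) display as white.
    """
    lower = level - width / 2.0
    fraction = (hu - lower) / width
    fraction = min(1.0, max(0.0, fraction))  # clip to the window
    return round(fraction * 255)


# Two hypothetical tissue densities that differ by only 6 HU.
tissue_a_hu, tissue_b_hu = 30.0, 36.0

# Narrow window in the range described above (width 80 HU, level 40 HU).
narrow_gap = apply_window(tissue_b_hu, 80, 40) - apply_window(tissue_a_hu, 80, 40)

# Hypothetical wide window (width 400 HU) for comparison.
wide_gap = apply_window(tissue_b_hu, 400, 40) - apply_window(tissue_a_hu, 400, 40)
```

Under these assumptions the same 6 HU difference is spread over several times more gray levels with the narrow window, which is why narrow settings make subtle hypoattenuation easier to see.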
The ASPECTS was recorded prospectively by the treating physician at the time of the CT scan. The treating physician (a stroke fellow or stroke neurologist experienced in rating ASPECTS) scored the scan without knowledge of any other imaging modalities, but with knowledge of clinical symptoms. At a later point, 1 of 4 expert readers (1 neuroradiologist, 2 stroke neurologists, and 1 stroke fellow), always a different individual from the treating physician, rated the baseline CT scan using ASPECTS. The expert reader was blinded to all clinical information except symptom side. Each scan was therefore read twice: once by the treating physician in real time and once later by a single expert reader.
Descriptive statistics were used to evaluate the study population. Interrater reliability between experts was not measured but has previously been shown to be equivalent.7 Agreement between the real-time and expert ASPECTS rating was assessed using weighted kappa (κw) scores. The weighting was designed to heavily penalize any difference >1 ASPECTS point. The κw values were interpreted as: slight agreement, 0.00 to 0.20; fair agreement, 0.21 to 0.40; moderate agreement, 0.41 to 0.60; substantial agreement, 0.61 to 0.80; or almost perfect agreement, 0.81 to 1.00.8
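A weighted kappa of this kind can be sketched as follows. The article does not report the exact weights, so the disagreement weights used here (0 for identical scores, 0.25 for a 1-point difference, 1 for anything larger) are a hypothetical scheme chosen only to illustrate heavy penalization of differences >1 point. The function names and the toy ratings are likewise illustrative.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, disagreement_weight, categories=range(11)):
    """Cohen's weighted kappa: 1 - (observed / expected weighted disagreement)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("Need two non-empty rating lists of equal length")
    n = len(ratings_a)
    observed = Counter(zip(ratings_a, ratings_b))
    marginal_a, marginal_b = Counter(ratings_a), Counter(ratings_b)
    observed_dis = sum(
        disagreement_weight(a, b) * count for (a, b), count in observed.items()
    ) / n
    # Expected disagreement if the two raters scored independently.
    expected_dis = sum(
        disagreement_weight(i, j) * marginal_a[i] * marginal_b[j]
        for i in categories
        for j in categories
    ) / (n * n)
    if expected_dis == 0:
        raise ValueError("Kappa is undefined when expected disagreement is zero")
    return 1.0 - observed_dis / expected_dis


def threshold_weight(a, b):
    """Hypothetical weights: mild penalty for 1 point, full penalty beyond."""
    diff = abs(a - b)
    if diff == 0:
        return 0.0
    return 0.25 if diff == 1 else 1.0


def interpret_kappa(kappa):
    """Label a kappa value using the agreement bands quoted in the text."""
    if kappa < 0:
        return "worse than chance"
    bands = ((0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect"))
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("Kappa cannot exceed 1")


# Toy ratings, not study data.
real_time = [10, 9, 8, 7]
expert = [9, 9, 8, 5]
toy_kappa = weighted_kappa(real_time, expert, threshold_weight)
```

With these bands, the reported κw of 0.69 is labeled substantial by `interpret_kappa`.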
The distribution of ASPECTS scores (real-time and expert) was skewed, but the difference between the real-time and expert ASPECTS scores was normally distributed. The unit of analysis was therefore the difference between the real-time and expert ASPECTS. Analysis of variance was used to compare this difference across categories of real-time ASPECTS values. Hypothesis testing between groups was adjusted using the Bonferroni method. Linear regression was used to plot the relationship between the real-time ASPECTS and the difference.
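The pieces of this analysis that are simple to reproduce can be sketched as below: the paired differences, their mean and SD, a least-squares line of the difference against the real-time score, and a Bonferroni adjustment. The sign convention (real-time minus expert), the function names, and the toy scores are assumptions for illustration; the analysis of variance itself is not reproduced here.

```python
from statistics import mean, stdev


def score_differences(real_time, expert):
    """Paired differences, assumed here to be real-time minus expert."""
    return [r - e for r, e in zip(real_time, expert)]


def least_squares_line(x, y):
    """Return (slope, intercept) of the ordinary least-squares fit of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def bonferroni_adjust(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy scores, not study data.
real_time = [2, 4, 6, 8, 10]
expert = [4, 5, 6, 8, 9]

diffs = score_differences(real_time, expert)
mean_diff, sd_diff = mean(diffs), stdev(diffs)
slope, intercept = least_squares_line(real_time, diffs)
```

A positive slope under this sign convention means the real-time score falls further below the expert score as the real-time score decreases.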
Results
There were 214 patients included in this study; 88 were female. The median baseline National Institutes of Health Stroke Scale (NIHSS) score was 5 (range: 0 to 32). Median age was 72.5 years (range: 27 to 91). Fifty-four patients had a transient ischemic attack by the current definition9 (NIHSS=0 at 24 hours). Median time from symptom onset to CT scan was 152 minutes (range: 22 to 769 minutes).
Interobserver agreement between real-time and expert ASPECTS was substantial (κw=0.69; 95% confidence interval [CI]: 0.59 to 0.79). The mean difference was 0 (SD: 1.1). Agreement remained substantial when transient ischemic attack patients were excluded (κw=0.66; n=160), when only patients scanned at <6 hours were considered (κw=0.68; n=183), and when only stroke patients scanned at <6 hours were considered (κw=0.65; n=140).
Trichotomizing the ASPECTS scale into <3 (unfavorable), 3 to 7 (neutral), and 8 to 10 (favorable) had no impact on the mean differences between real-time and expert ASPECTS, except when the real-time ASPECTS was <3. In this situation, the expert reader tended to score the scan ≈2 ASPECTS points higher than the treating physician (P=0.007 for 3 to 7; P<0.001 for 8 to 10). There was a trend toward the treating physician undercalling the ischemic change in the 8-to-10 group, ie, the treating physician gave a clinically more favorable score than did the expert. However, this did not reach statistical significance (P=0.064). This relationship was not affected by age, blood glucose, or the baseline NIHSS score.
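The trichotomized comparison can be sketched as follows. The cut points are those given above; the grouping by real-time score, the sign convention (real-time minus expert), the function names, and the toy scores are illustrative assumptions rather than the study's code or data.

```python
from collections import defaultdict
from statistics import mean


def aspects_category(score):
    """Trichotomize an ASPECTS value: <3, 3 to 7, or 8 to 10."""
    if not 0 <= score <= 10:
        raise ValueError("ASPECTS must be between 0 and 10")
    if score < 3:
        return "unfavorable"
    if score <= 7:
        return "neutral"
    return "favorable"


def mean_difference_by_category(real_time, expert):
    """Mean (real-time minus expert) difference within each real-time category."""
    groups = defaultdict(list)
    for r, e in zip(real_time, expert):
        groups[aspects_category(r)].append(r - e)
    return {category: mean(values) for category, values in groups.items()}


# Toy scores, not study data.
real_time = [1, 2, 5, 6, 9, 10]
expert = [3, 4, 5, 6, 9, 9]
by_category = mean_difference_by_category(real_time, expert)
```

In this toy example, a negative mean in the unfavorable group corresponds to the expert scoring higher than the treating physician, which is the direction of the effect described above.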
Figure 1 gives a graphic representation of the correlation between the real-time ASPECTS and the difference between the real-time ASPECTS and the expert ASPECTS. The slope of the line suggests that, at lower real-time ASPECTS ratings, there is a trend for the treating physician to overinterpret the presence of ischemic change (Figure 2).
Discussion
To our knowledge, the reliability of rating ischemic change on the acute stroke CT in routine clinical practice compared with expert rating has not previously been evaluated. We have found that ASPECTS is reliable between real-time and expert ratings. This is important because the true value of a clinical scale lies in its usability in routine clinical practice. However, it must be emphasized that this scale is useful only for strokes in the middle cerebral artery territory, not for those in the posterior or anterior cerebral artery territories.
At higher ASPECTS scores (>7; favorable scan appearance), the real-time observer tends to undercall ischemic change, but the effect is small and likely to be clinically insignificant. At lower ASPECTS scores (<3; unfavorable scan appearance), the real-time observer tends to overcall ischemic change by nearly 2 points. This probably reflects a combination of factors, including the tendency of the human visual system to overestimate boundaries.10 Human factors, such as a desire not to administer thrombolysis, may also lead the real-time observer to err toward lower scores. There was no relationship between NIHSS and the difference between the real-time and expert ASPECTS, which suggests that the overcalling is not a bias introduced by knowledge of stroke severity. At the high end of the scale, the physician may be keen to offer thrombolysis and may therefore underinterpret the ischemic change. Finally, the treating physician often does not have ideal conditions for reading brain CT scans.
A weighted kappa was used because artificially dichotomizing the results into >7 and ≤7 would overly penalize a scan that an expert called a 7 and that the treating physician called a 6. A difference of ±1 point on the ASPECTS scale has been shown to be the expected margin of error,7 is felt not to be clinically significant, and would not deter thrombolytic treatment for any given patient. Hence, the kappa score was weighted to penalize any difference of more than 1 point. Our population included patients with transient ischemic attack and patients who were imaged between 6 and 12 hours after symptom onset. However, removing these patients from the analysis made no difference to agreement.
The results in our study are limited to stroke neurologists and neuroradiologists. We did not assess the ability of emergency physicians or trainees to rate ASPECTS. This is an important limitation and future work is needed in this area.
In conclusion, we have found that ASPECTS performed in real time is a reliable method of quantifying early ischemic changes in the middle cerebral artery territory in acute stroke.
Acknowledgments
This project received funding support from the Alberta Foundation for Health Research, Alberta Heritage Foundation for Medical Research, Heart and Stroke Foundation of Alberta, NWT and Nunavut, and Canadian Institutes of Health Research. Dr Barber was supported by Heart and Stroke Foundation of Canada (HSFC), Canadian Institutes for Health Research, and Alberta Heritage. Dr Buchan was supported by the Heart and Stroke Foundation of Canada (HSFC), Canadian Institutes for Health Research, and The Canadian Stroke Network. Dr Coutts was supported by Heart and Stroke Foundation of Canada (HSFC) Fellowship and Alberta Heritage Foundation for Medical Research Fellowship. Dr Demchuk was supported by the Alberta Heritage Foundation for Medical Research and the Canadian Institutes for Health Research. Dr Hill was supported by Heart and Stroke Foundation of Alberta/NWT and Nunavut, and the Canadian Institutes for Health Research. Dr Simon was supported by an Alberta Heritage Foundation for Medical Research Fellowship. We acknowledge the following people within the VISION study group for their help with this study: Dr James Kennedy, Dr Alexis Gagnon, Dr Vanessa Palumbo, Dr Jayanta Roy, Dr Timothy Watson, Karyn Fischer, Linda Anderson-Armitage, Andrea Cole-Haskayne, Carol Kenney, Karla Ryckborst, Lisa Sinclair, Nancy Newcommon, and Marie McClelland.
- Received November 20, 2003.
- Accepted January 21, 2004.
References
Hill MD, Rowley HA, Adler F, Eliasziw M, Furlan A, Higashida RT, Wechsler LR, Roberts HC, Dillon WP, Fischbein NJ, Firszt CM, Schulz GA, Buchan AM; PROACT-II Investigators. Selection of acute ischemic stroke patients for intra-arterial thrombolysis with pro-urokinase by using ASPECTS. Stroke. 2003; 34: 1925–1931.
Pexman JH, Barber PA, Hill MD, Sevick RJ, Demchuk AM, Hudon ME, Hu WY, Buchan AM. Use of the Alberta Stroke Program Early CT Score (ASPECTS) for assessing CT scans in patients with acute stroke. AJNR Am J Neuroradiol. 2001; 22: 1534–1542.
Hill MD, Barber PA, Hudon ME, Pexman JHW. Inter- and intra-observer reliability of ASPECTS. Stroke. 2003; 34: 281. Abstract.