Variable Agreement Between Visual Rating Scales for White Matter Hyperintensities on MRI
Comparison of 13 Rating Scales in a Poststroke Cohort
Background and Purpose Previous reports on the frequency, extent, and clinical correlates of white matter hyperintensities (WMHIs) have been contradictory. The purpose of this study was to test whether part of this variation could be explained by the different properties of the visual WMHI rating scales used.
Methods The periventricular (PVHIs) and deep white matter (DWMHIs) hyperintensities of 395 poststroke patients were systematically analyzed and transformed to correspond to 13 different rating scales. The scales were compared with the use of Goodman-Kruskal measures of association. The relative frequencies, means, and medians of PVHI and DWMHI grades as well as Spearman rank correlations between WMHI grade and hypertension were calculated.
Results At best more than 80% of the patients received an equivalent WMHI grade by different scales, but at worst the corresponding values were only 0.4% for PVHI and 18% for DWMHI. At best different scales categorized patients similarly in regard to WMHI grade, but at worst the corresponding values were 8% for PVHI and 57% for DWMHI ratings. The distribution of WMHI grades also varied, and when the effect of age on WMHI was assessed, some of the scales had a ceiling effect and some had a floor effect. Only 1 of the 7 PVHI, 5 of the 9 DWMHI, and 1 of the 3 combined rating scales showed a significant correlation with arterial hypertension, a putative risk factor for WMHIs.
Conclusions Some of the inconsistencies in previous studies of WMHIs are due to differences in visual rating scales. Our findings may warrant international debate regarding harmonization of WMHI ratings.
The advent of new brain imaging methods, particularly MRI, has revealed a multitude of focal, patchy, and diffuse signal changes in the cerebral WM.1 2 These changes are hyperintense compared with normal WM on T2- and PD-weighted SE or FLAIR sequences, with or without only minor corresponding hypointensity on T1-weighted images or low attenuation on x-ray CT.3 4 5 Despite intensive research, the pathogenesis, clinical significance, and morphological substrate of these changes are still incompletely understood.6 7
On MRI the frequency of WMHIs increases with advancing age.2 8 9 10 11 12 13 14 In addition to cerebrovascular disorders,2 10 12 15 they have been related to various risk factors such as arterial hypertension,2 13 14 16 cardiac disorders,12 13 diabetes,11 13 and cigarette smoking.17 The clinical conditions related to WMHIs include gait disorders,14 18 cognitive impairment,19 20 21 22 and mood disorders.21 23 24
However, the results have often been conflicting. Since some nonspecific WMHIs may exist in clinically “healthy” individuals as well,9 13 25 26 27 28 29 the distinction between normal physiological variation, “normal WM aging,” and a pathological WM change is essential.
One source of controversy in the results of previous studies could be differences in the rating scales used to assess WMHIs.2 9 11 12 13 14 19 25 30 31 32 33 34 35 36 37 38 39 40 41 42 The aim of the present study was to investigate this hypothesis by comparing 13 commonly used MRI rating scales for WMHIs in a defined population to elucidate whether the scales differ. If differences exist, we sought to determine which of the scales are alike, which differ the most, and how this difference influences the assessment of frequency and degree of WM change, as well as correlations between WMHIs and putative risk factors.
Subjects and Methods
Based on a computer-aided literature search, review of the reference lists of published studies, and discussions with other investigators active in the field, we identified 13 MRI visual rating scales for WMHIs published between 1986 and 1994. Of these scales, 2 assessed only PVHIs,32 35 3 assessed only DWMHIs,9 36 37 5 included distinct scales for PVHIs and DWMHIs,13 19 33 38 39 43 and 3 assessed PVHIs and DWMHIs combined.11 12 25 See the “Appendix” for details of these scales.
PVHI Rating Scales
(1) The Gerard and Weisberg scale (1986)32 focuses on PVHIs on PD-weighted images and rates the changes on a 5-grade scale.
(2) The Shimada scale (1990)35 rates PVHIs on T2/PD-weighted images on a 4-grade scale.
PVHI and DWMHI Distinct Rating Scales
(1) The Erkinjuntti scale (1994)13 rates HIs on T2/PD-weighted images in the deep gray matter (vascular centrencephalon), as well as in the periventricular, centrum semiovale, watershed, and subcortical WM areas. PVHIs are rated on a 4-grade scale and DWMHIs on a 5-grade scale.
(2) The Fazekas scale (1987)33 rates PVHIs and DWMHIs on T2/PD-weighted images on a 4-grade scale.
(3) The Mirsen scale (1991)38 rates PVHIs and DWMHIs on T2/PD-weighted images. PVHI is rated as absent or present, and small HIs around frontal or occipital horns are disregarded. DWMHI is rated on a 5-grade scale.
(4) The Scheltens scale (1992)39 43 rates HIs on T2/PD-weighted images in periventricular and deep WM (frontal, parietal, occipital, and temporal) areas. It also includes ratings for basal ganglia and infratentorial areas, which were not used in this report. The PVHI score rates caps or bands on a 3-grade scale. PVHIs >10 mm are rated as DWMHIs. DWMHIs are rated on a 6-grade scale on the basis of the size and number of the lesions.
(5) The Ylikoski scale (1993)19 rates HIs on T2-weighted images in the periventricular and deep WM and watershed on a 4-grade scale on both sides in four areas: around the frontal horns, ventricular body, trigones, and occipital horns. A total score reflecting WMHI is calculated by summing all the scores in different areas (PVHI score 0 to 24 + DWMHI score 0 to 24 = total WMHI score 0 to 48).
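The summation described for the Ylikoski scale can be sketched as follows. This is only an illustration: the function and key names are hypothetical, and it assumes that each of the four areas receives a grade of 0 to 3 on each side for both the periventricular and the deep WM ratings, which is what the stated ranges of 0 to 24 imply.

```python
# Hypothetical names; areas and sides follow the description of the Ylikoski scale.
AREAS = ("frontal_horns", "ventricular_body", "trigones", "occipital_horns")
SIDES = ("left", "right")


def region_score(ratings):
    """Sum the 0-3 grades over four areas and both sides (range 0 to 24)."""
    total = 0
    for area in AREAS:
        for side in SIDES:
            grade = ratings[(area, side)]
            if not 0 <= grade <= 3:
                raise ValueError(f"grade out of range for {area}/{side}: {grade}")
            total += grade
    return total


def total_wmhi_score(pvhi_ratings, dwmhi_ratings):
    """Total score = PVHI score (0-24) + DWMHI score (0-24), range 0 to 48."""
    return region_score(pvhi_ratings) + region_score(dwmhi_ratings)


# Made-up example: grade 2 everywhere for PVHI, grade 1 everywhere for DWMHI.
pvhi = {(a, s): 2 for a in AREAS for s in SIDES}
dwmhi = {(a, s): 1 for a in AREAS for s in SIDES}
print(total_wmhi_score(pvhi, dwmhi))  # 16 + 8 = 24
```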
PVHI and DWMHI Combined Rating Scales
(1) The Breteler scale (1994)12 (attributed to Caplan and van Swieten) rates PVHIs and DWMHIs jointly on T2-weighted images on a 3-grade scale.
(2) The Schmidt scale (1992)11 is a modification of the Fazekas scale in rating DWMHI on T2/PD-weighted images on a 4-grade scale. Irregular PVHIs are considered part of grade 3 DWMHIs based on assumed ischemic origin.44
(3) The Wahlund scale (1990)25 rates PVHIs and DWMHIs combined on a 3-grade scale on T2-weighted images.
DWMHI Rating Scales
(1) The Herholz scale (1990)36 rates DWMHIs on T2-weighted images on a 4-grade scale taking into account the size and number of HIs.
(2) The Hunt scale (1989)9 assesses DWMHIs on T2-weighted images in five different anatomic regions: frontal, parietal, temporal, occipital, and internal capsule. The size of WMHIs is rated on a 3-point scale and the number on a 4-point scale. A severity score for WMHI within each anatomic region is determined by multiplying size by number, and an overall measure is determined by summing severity scores across regions. In addition, the Hunt scale includes a 4-grade overall severity score from normal to severe.
(3) The van Swieten scale (1990)37 rates WMHIs on T2-weighted images on a 3-grade scale with assessments of three subsequent slices around the anterior horns of the lateral ventricles, the posterior part of cella media, and the posterior part of the centrum semiovale.
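For the Hunt scale described in item (2) above, the regional and overall scores can be illustrated with a short sketch. The names are hypothetical, and the coding of an unaffected region as size 0 and number 0 is an assumption made here for illustration, not a detail taken from the original scale.

```python
# The five anatomic regions named for the Hunt scale.
REGIONS = ("frontal", "parietal", "temporal", "occipital", "internal_capsule")


def hunt_overall_score(size_by_region, number_by_region):
    """Overall measure: sum over regions of (size rating x number rating)."""
    return sum(size_by_region[r] * number_by_region[r] for r in REGIONS)


# Made-up ratings; 0 is assumed to mean "no lesion in this region".
size = {"frontal": 2, "parietal": 1, "temporal": 0,
        "occipital": 1, "internal_capsule": 0}
number = {"frontal": 3, "parietal": 2, "temporal": 0,
          "occipital": 1, "internal_capsule": 0}
print(hunt_overall_score(size, number))  # 2*3 + 1*2 + 0 + 1*1 + 0 = 9
```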
The patients in the present study originate from the Helsinki Stroke Aging Memory Study (SAM Study), which included consecutive persons with any ischemic stroke aged 55 to 85 years, speaking the Finnish language, and living in Helsinki, Finland. The detailed protocol has been described elsewhere.45 Three months after the index stroke the subjects underwent a structured medical and neurological history based on review of all available hospital charts, an interview of the subject and knowledgeable informant, and a structured clinical and neurological examination by a board-certified neurologist (T. Pohjasvaara). In addition, all the cases were reviewed by a senior neurologist (T.E.). The examination included basic laboratory examinations and MRI of the head. For the present study we included 395 patients who had a technically reliable set of MRI scans. The mean age of the patients was 70.8 years (SD, 7.7; range, 55 to 85 years), and 51.6% were women.
For risk factor analysis, each patient’s case history was obtained regarding (among other factors) arterial hypertension, which was defined as systolic blood pressure >160 and diastolic blood pressure >95 mm Hg. One hundred ninety-one (48.4%) of the 395 patients had a history of arterial hypertension (52.9%, 49.1%, and 44.7% in those aged 55 to 64, 65 to 74, and 75 to 85 years, respectively).
MRI was performed with a superconducting MRI system operating at 1.0 T (Siemens Magnetom). Transaxial T2-, PD-, and T1-weighted images were obtained with the SE technique. The TR for T2- and PD-weighted images was 3000 ms, TE was 15 to 90 ms, and NEX was 1. The corresponding parameters for T1-weighted images were as follows: TR, 400 ms; TE, 15 ms; and NEX, 2. The slice thickness was 5 mm, gap 0, field of view 230 mm, matrix size 256×256 pixels, and number of slices 26 on every pulse sequence. In addition, a three-dimensional gradient echo sequence (TR, 30 ms; TE, 5 ms; alpha, 40; NEX, 1) with 64 3-mm-thick coronal sections was used.
Visual Rating System
All MR images were reviewed by the same neuroradiologist (R.M.) blinded to the clinical data. WMHIs in contact with the ventricular wall were defined as PVHIs, and they were rated separately from DWMHIs. DWMHIs were separated from the ventricular wall by a strip of normal-appearing WM and were situated in the deep WM sparing the subcortical U-fiber region.
PVHI (Fig 1) was analyzed in six slices (L1, L2, L3, H1, H2, H3) covering the periventricular area13; L1 was the lowest slice where the frontal horns of lateral ventricles could be seen. The size of the WMHIs around the frontal and occipital horns (“caps”) was measured separately on T2- and PD-weighted images. PVHI along the bodies of lateral ventricles was analyzed with both T2- and PD-weighted images, since the border between the body of caudate nucleus and periventricular signal change was difficult to judge when only PD-weighted images were used.
PVHIs around the frontal and occipital horns were classified based on size and shape into small cap, large cap, and extending cap (small cap, ≤5 mm in diameter, rounded with regular margins; large cap, 6 to 10 mm in diameter, mostly with regular margins; extending cap, >10 mm in diameter, irregular margins). The size of the cap was measured in a direction parallel to the axis of the ventricular horn.
PVHIs along the bodies of lateral ventricles were classified based on thickness and shape into thin lining, smooth halo, and irregular halo (thin lining, ≤5 mm, regular margins; smooth halo, 6 to 10 mm, smooth and mostly regular margins; irregular halo, >10 mm, irregular margins and extending into the deep WM).
DWMHI (Fig 2) was analyzed on three selected slices (S1, S2, S3),13 separately on T2- and PD-weighted images. The defined slices were chosen to represent the area of centrum semiovale; S1 was the first slice just above the level of the roof of lateral ventricles. HIs equivalent to cerebrospinal fluid on T1- and PD-weighted images were regarded as old infarcts or perivascular (Virchow-Robin) spaces and were excluded from the WMHI count. In addition, WMHI around a cystic or a wedge-shaped cortico-subcortical infarct was not included in the WMHI count.
DWMHIs were classified based on size (greatest diameter) and shape into small focal, large focal, focal confluent, diffusely confluent, and extensive WM change. The number of each type of DWMHIs was counted, and extensive WM change was rated as absent or present (small focal, punctate hyperintensities ≤5 mm, mostly rounded; large focal, 6 to 10 mm, mostly rounded; focal confluent, 11 to 25 mm, often various shapes and may have irregular borders; diffusely confluent, >25 mm, mostly irregular borders; and extensive WM change, diffuse HI without distinct focal lesions affecting the majority of deep WM area).
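The size cutoffs given above for caps and for DWMHIs can be written as a small lookup. This is a simplified sketch with hypothetical names: it uses the diameter only and ignores the shape and margin criteria, and it treats extensive WM change as a separate absent/present rating that is not derived from size.

```python
from bisect import bisect_left
from collections import Counter

# Upper limits (mm, inclusive) for each size class, taken from the text.
DWMHI_LIMITS = (5, 10, 25)
DWMHI_LABELS = ("small focal", "large focal",
                "focal confluent", "diffusely confluent")
CAP_LIMITS = (5, 10)
CAP_LABELS = ("small cap", "large cap", "extending cap")


def classify_by_size(diameter_mm, limits, labels):
    """Return the label of the first class whose upper limit is >= diameter."""
    return labels[bisect_left(limits, diameter_mm)]


# Made-up lesion diameters for one patient.
diameters = [3, 4, 8, 12, 30]
counts = Counter(classify_by_size(d, DWMHI_LIMITS, DWMHI_LABELS)
                 for d in diameters)
print(counts["small focal"], counts["large focal"],
      counts["focal confluent"], counts["diffusely confluent"])  # 2 1 1 1
```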
Reliability of the visual ratings was tested by having 60 MRI scans reviewed independently by the same rater (R.M.), a board-certified neuroradiologist (O.S.), and a general radiologist (H.J.A.). The weighted κ values46 for intraobserver agreement were 0.90 for periventricular caps, 0.93 for linings and halos, and 0.95 for DWMHIs. The corresponding κ values for interobserver agreement were 0.84, 0.82, and 0.84 between R.M. and O.S., and 0.82, 0.72, and 0.77 between R.M. and H.J.A., respectively. All values are >0.61, indicating good intrarater and interrater agreement.46
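A weighted κ of the kind reported above can be computed as sketched below. The weighting scheme used in the study is not stated, so linear disagreement weights are an assumption here, and the function name and the example ratings are hypothetical.

```python
def weighted_kappa(rater_a, rater_b, n_categories):
    """Weighted kappa with linear disagreement weights |i - j| / (k - 1).

    Ratings must be integers in 0 .. n_categories - 1.
    The linear weighting is an assumption, not the study's documented choice.
    """
    n = len(rater_a)
    k = n_categories
    if n == 0 or n != len(rater_b) or k < 2:
        raise ValueError("need two equally long, non-empty rating lists")
    # Cross-tabulate the two raters.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        table[a][b] += 1
    row = [sum(table[i]) for i in range(k)]
    col = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = expected = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            observed += w * table[i][j] / n          # observed disagreement
            expected += w * row[i] * col[j] / n**2   # chance disagreement
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings are identical")
    return 1.0 - observed / expected


print(weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], 4))  # 1.0 (perfect agreement)
print(weighted_kappa([0, 0, 1, 1], [0, 1, 0, 1], 2))  # 0.0 (chance level)
```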
Reconstruction of Different Rating Scales
The original data were transformed to WMHI grades to correspond as closely as possible to the given criteria of the 13 rating scales studied. The same kind of pulse sequence (T2- or PD-weighted) as in the original article was used. Small and large caps were used as synonyms for “caps,” whereas extending caps were regarded as “irregular PVHI extending into the deep WM.”33 In the Scheltens scale, the extending caps were scored as focal confluent HIs and irregular halo as diffusely confluent HIs as part of the DWMHIs.39 Thin lining was used as a synonym for “thin band”32 and “rims,”35 and irregular halo was used as a synonym for “irregular PVHI.”11 33
In rating DWMHIs, small and large focal lesions were used as synonyms for “focal lesions”37 38 and “punctate foci,”9 33 large focal lesions for “patchy white matter HIs,”9 and focal confluent lesions for “beginning confluent”11 33 and “partly confluent lesions.”25 Confluent lesions and extensive WM change were regarded as the most advanced state of WMHI and were rated as the most severe grade. When the scans were read it was very difficult to make a definite distinction between focal lesions with a diameter of 3 and 5 mm, and therefore the original cutoff point of 3 mm in the Scheltens scale was changed to 5 mm in the present study.
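As one illustration of such a transformation, the lesion counts could be mapped to the Mirsen grades for hyperintensity distant from the ventricles (see the Appendix) roughly as follows. The function and parameter names are hypothetical, and treating any focal confluent, diffusely confluent, or extensive change as "confluent" is an assumption made for this sketch.

```python
def mirsen_la_grade(n_small_focal, n_large_focal, has_confluent):
    """Map lesion counts to a Mirsen grade (0-4) for deep WM hyperintensity.

    Small and large focal lesions are pooled as "focal lesions".
    has_confluent: assumed True if any confluent or extensive change is present.
    """
    if has_confluent:
        return 4            # confluent lesions
    n_focal = n_small_focal + n_large_focal
    if n_focal == 0:
        return 0            # absence of lesions
    if n_focal <= 2:
        return 1            # 1 or 2 focal lesions
    if n_focal <= 5:
        return 2            # 3 to 5 lesions
    return 3                # more than 5 lesions


print(mirsen_la_grade(2, 1, False))  # 3 focal lesions in total -> grade 2
```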
The comparison of the results of different ratings was done in pairs with the use of Goodman-Kruskal measures of association.47 The probability value (P) was used to measure how much more probable it is to get an equivalent, rather than a nonequivalent, randomly chosen pair of ratings x and y from rating systems X and Y, which can be symmetrical or asymmetrical. The estimator of P is the product of vertical and horizontal Somers’ D. A value of P=0 means that nonequivalent ratings are as probable as equivalent ones, and the systems are probably unrelated. If P=1, the two systems are totally equivalent but not necessarily the same.
The measure gamma (γ) was used to define how much more probable it is to get like, rather than unlike, orders for a randomly chosen pair of ratings x and y from rating systems X and Y, which are symmetrically related. If γ=1, some classes of the system X may consist of two or more classes of system Y, as long as the order of the ratings remains the same. If γ=0, the systems are independent.
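Both measures can be illustrated by counting concordant, discordant, and tied pairs of patients. The sketch below uses the standard definitions of Goodman-Kruskal γ and Somers’ D; taking the product of the two Somers’ D values as the estimate of P follows the description above, but how the statistical software actually computed it is not known, so that step is an assumption. All names and the example grades are hypothetical.

```python
from itertools import combinations


def pair_counts(x, y):
    """Count concordant, discordant, x-only-tied and y-only-tied pairs."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx * dy > 0:
            concordant += 1
        elif dx * dy < 0:
            discordant += 1
        elif dx == 0 and dy != 0:
            tied_x_only += 1
        elif dy == 0 and dx != 0:
            tied_y_only += 1
    return concordant, discordant, tied_x_only, tied_y_only


def goodman_kruskal_gamma(x, y):
    """gamma = (C - D) / (C + D); undefined if every pair is tied."""
    c, d, _, _ = pair_counts(x, y)
    return (c - d) / (c + d)


def somers_d(x, y):
    """Somers' D of y given x: pairs tied on y only stay in the denominator."""
    c, d, _, tied_y = pair_counts(x, y)
    return (c - d) / (c + d + tied_y)


# Made-up grades: scale B merges grades 1 and 2 of scale A but keeps the order.
scale_a = [0, 1, 2, 3, 4]
scale_b = [0, 1, 1, 2, 3]
print(goodman_kruskal_gamma(scale_a, scale_b))                    # 1.0
print(somers_d(scale_a, scale_b) * somers_d(scale_b, scale_a))    # 0.9
```

In this example γ stays at 1 because merging two adjacent grades does not reverse the order of any pair, whereas the product of the two Somers’ D values drops below 1 because the merged grades are no longer equivalent.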
The different rating scales applied had different numbers of grades. To make the grades comparable, the range of each scale was adjusted to a range of 0 to 100 by linear transformation: y=100(x−Min)/(Max−Min), where y is the new rating, x is the original rating, Min is the lower boundary of the original range, and Max is the upper boundary of the original range. The lowest grade in the original rating scale was marked 0, and the remaining grades were divided into three classes (I, II, and III), indicating WMHI grades “mild,” “moderate,” and “severe.” The classification was made with the assumption that the steps between different grades are equal in each scale. Rating scales with four or fewer original grades remained unchanged. The statistical tests were performed with BMDP New System 1.1 and BMDP Classic 7.0.48
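The linear transformation and the regrouping into classes 0, I, II, and III can be sketched as follows. The rescaling follows the formula above; the way grades are assigned to the three classes (equal thirds of the original range above the lowest grade) is one possible reading of the text and should be taken as an assumption, as should the function names.

```python
def rescale(x, lo, hi):
    """y = 100 (x - Min) / (Max - Min), mapping the scale range onto 0..100."""
    return 100.0 * (x - lo) / (hi - lo)


def severity_class(x, lo, hi):
    """Return "0" for the lowest grade, otherwise class "I", "II" or "III".

    Assumes integer grades with equal steps; the range above the lowest
    grade is split into three equal parts (an interpretation of the text).
    """
    if x == lo:
        return "0"
    third = -(-3 * (x - lo) // (hi - lo))  # ceiling of 3 * (x - lo) / (hi - lo)
    return ("I", "II", "III")[third - 1]


print(rescale(2, 1, 3))                              # 50.0 (scale running 1 to 3)
print([severity_class(g, 0, 3) for g in range(4)])   # ['0', 'I', 'II', 'III']
print([severity_class(g, 0, 6) for g in range(7)])   # ['0', 'I', 'I', 'II', 'II', 'III', 'III']
```

With this reading, a scale that already has four grades keeps its grades unchanged, in line with the statement above.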
Probability of Obtaining Equivalent WMHI Grades and Same Order of Ratings
Comparisons of the rating scales are shown in Tables 1 and 2. The probability value indicates the equivalence of the WMHI grades between different scales. Scales with the highest probability values indicate WMHI grades that were closest to each other, and low probability values indicate that the ratings differed the most. The corresponding γ values express how well the different rating scales classify the patients in the same order with respect to WMHI grades. A low γ value indicates that the ratings are independent from each other.
In the PVHI ratings the probability values between two rating scales varied from .004 to .88. Thus, at worst 4 of 1000 patients were likely to receive an equivalent PVHI grade; at best 88 of 100 were likely to receive an equivalent PVHI grade. The corresponding γ values varied from 0.08 to 1.00, indicating that at best the order of ratings was equivalent, but at worst the order of ratings was equivalent in only 8% of cases. In the DWMHI ratings the probability value varied from .18 to .81 and the corresponding γ from 0.57 to 1.00.
Distribution of WMHI Grades
The relative frequencies of PVHI and DWMHI grades are shown in Fig 3. According to the different rating scales, 1.3% to 26.3% of the patients were classified as having no abnormal PVHI, 6.8% to 40.5% with mild, 4.8% to 41.0% with moderate, and 7.8% to 58.2% with severe PVHI. When assessed by Mirsen’s dichotomous rating scale (absent or present), 84.3% of our patients had PVHI. For DWMHI the corresponding percentages were 7.3% to 14.2% for no abnormal DWMHI, 16.7% to 67.9% for mild, 4.8% to 71.1% for moderate, and 11.1% to 56.7% for severe DWMHI. If PVHI and DWMHI were rated jointly, 20.3% to 59.0% of our patients had no or mild, 2.3% to 27.1% moderate, and 13.9% to 56.7% severe WMHI.
Effect of Age on WMHI
The influence of age on WMHI was assessed by calculating the medians of WMHI grades (adjusted to a range from 0 to 100) in three age groups: 55 to 64 (group I), 65 to 74 (group II), and 75 to 85 years (group III) (Fig 4). When assessed by the Mirsen scale as absent or present, the median of PVHI grades reached the maximal value in all age groups. In four other PVHI rating scales, the median of the age group III was the same as the maximum of the rating scale, and among these in one scale even the median of the age group II reached the maximal value of the scale.
In the DWMHI ratings the median was unchanged in four of 12 scales, and in two rating scales the median of the age group III reached the upper limit of the scale. The differences of medians between age groups I and III varied in different ratings from 0 to 66.7 for both PVHI and DWMHI.
Furthermore, the relative means of WMHI grades were calculated in the corresponding age groups (Fig 5). The means of PVHI varied from 28.2 to 65.9 in group I, 36.7 to 82.8 in group II, and 45.2 to 97.2 in group III depending on the scale applied. The difference in the means of PVHI grade between the youngest and oldest age groups varied from 11.3 to 31.3. Correspondingly, the means of DWMHI grades varied from 27.0 to 51.2 in group I, from 31.5 to 65.4 in group II, and from 37.9 to 81.9 in group III. The difference in the means of DWMHI grade between the youngest and oldest age groups varied from 10.9 to 30.7.
Effect of Different Rating Scales on Risk Factor Analysis
The influence of the rating scale on the results of risk factor analysis was studied with arterial hypertension used as an example (Table 3). In those aged 55 to 64 years, none of the rating scales showed a statistically significant correlation (P<.05) between WMHI grade and arterial hypertension. In the group aged 65 to 74 years, one of the rating scales showed a significant correlation between PVHI grade and hypertension, and another between DWMHI grade and hypertension. In the group aged 75 to 85 years, none of the scales found a statistically significant correlation between PVHI grade and arterial hypertension, but two showed a significant and four a highly significant (P<.01) correlation between DWMHI grade and arterial hypertension. Altogether three ratings did not find a significant correlation. When PVHI and DWMHI combined were rated, one of three rating scales showed a highly significant correlation between arterial hypertension and WMHIs.
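The Spearman rank correlation between an ordinal WMHI grade and a dichotomous hypertension variable can be illustrated with the sketch below. It computes only the coefficient, using average ranks for ties, and does not reproduce the significance testing; the names and the example data are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the average ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Made-up data: WMHI grade per patient and hypertension coded 0/1.
grades = [0, 1, 2, 3]
hypertension = [0, 0, 1, 1]
print(round(spearman_rho(grades, hypertension), 3))  # 0.894
```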
To the best of our knowledge, this is the first study comparing commonly used visual MRI rating scales for WMHIs. We studied 13 scales using reliable ratings in a well-documented poststroke cohort. The results indicate that the probabilities of obtaining equivalent WMHI grades, the same order of ratings, and the same distribution of different WMHI grades varied considerably. Thus, in addition to factors such as patient selection, sample size, age distribution, and MRI technology used, the inconsistencies regarding the frequency and extent of WMHIs, as well as diverse results in risk factor studies, are due to differences between the WM rating scales used. We emphasize that we are not attempting to judge the superiority or goodness of the scales in assessing WMHIs; this can only be examined with the use of a distinct standard.
When PVHIs were rated by different scales, the probabilities of obtaining equivalent grades varied; at best, 88% of the cases received an equivalent grade, but at worst only 0.4% of the cases received an equivalent grade. The Fazekas and Erkinjuntti scales showed the best correspondence, which can be explained by the minimal difference between them; the Fazekas grade 1 (caps or pencil-thin lining) includes the Erkinjuntti grades 1 and 2, which keeps the order of ratings similar, as expressed by the γ value of 1. On the other hand, a slightly different modification of the Fazekas scale by Ylikoski (moving caps or pencil-thin lining into grade 2 and omitting smooth halo) changes the probability value to .50, which still keeps the order of ratings similar (γ=0.94). The best mean correspondence with other scales was the Erkinjuntti scale (mean P=.50, γ=0.84), and the worst was the Scheltens scale (mean P=.06, γ=0.31). This, however, only reflects the fact that the Scheltens scale differs the most from the others. This difference is mainly explained by rating PVHIs >10 mm as DWMHIs in the Scheltens scale, unlike the other scales.
In the deep WM, the probabilities of obtaining equivalent HI grades (P=.18 to P=.81) or the same order of HI ratings (γ=0.57 to γ=1.0) varied less between the scales than in the periventricular area (P=.004 to P=.88, γ=0.08 to γ=1.0). The Hunt severity scale and Wahlund scale showed the best correspondence with each other. The Ylikoski scale applied the Fazekas method; the difference between their ratings (P=.80) can be explained by the areas rated and pulse sequence used (PD- versus T2-weighted).19 33 In terms of obtaining equivalent DWMHI grades, the Breteler, Schmidt, and Scheltens scales showed good correspondence with each other. The lowest mean level of correspondence with other scales was the Mirsen scale and the highest were the Hunt and Fazekas scales.
The distribution of WMHI grades varied considerably if different rating scales were used; 1.3% to 26.3% of the patients had no WMHI, 6.8% to 67.9% had mild, 2.3% to 71.1% had moderate, and 7.8% to 58.2% had severe WMHI. Thus, in the present series, depending on the scale applied, anywhere from approximately one of 13 persons to every second person showed severe WMHI. This apparently leads to confusion in the frequency estimates of WMHI, as repeatedly reported in previous studies.12 25 27 34 49 50 51 52
In assessment of the effect of age on the relative degrees of WMHI, some of the scales showed a ceiling effect and some a floor effect. The present series consisted of poststroke patients, who are expected to have a high frequency of WMHIs. One may argue that some of the scales were originally designed to study a different type of population: healthy elderly or younger people. However, a practical rating scale should be sensitive enough to detect the mildest abnormal WMHI and at the same time show a spectrum of different degrees of WM pathology.
Previous reports on the clinical correlates of WMHIs have been contradictory; for example, a number of studies have found a correlation between arterial hypertension and WMHIs,2 13 14 16 but some have not.50 53 We examined the possible effect of different rating scales on this discrepancy. In the present series only 1 of the 7 PVHI, 5 of the 9 DWMHI, and 1 of the 3 combined ratings showed a statistically significant correlation between arterial hypertension and WMHIs in the whole group. Since the patient material was identical for every scale, the difference must be due to the scale adopted. If the clinical correlates differ from one scale to another, the scales are probably not reflecting the same pathophysiological substrate.
Constructions of the rating scales have been diverse. Some of the scales are simple (eg, the van Swieten scale or the Fazekas scale), and some are more time-consuming, including counts of different types of lesions (eg, the Erkinjuntti and Scheltens scales). Most of the scales make a distinction between periventricular and deep WM areas, which is supported by their different vascular vulnerability,54 and some apply combined ratings. Definitions of the type and extent are also diverse, and unfortunately too often the lesions to be rated are vaguely described. We converted the MRI readings based on our best interpretation to correspond to the different rating scales, but distinction between expressions such as “small” or “large,” “punctate” or “patchy,” “some” or “several,” and “several” or “widely distributed” creates difficulty in adopting some of the scales and may have caused minor inaccuracies. Furthermore, the types of lesions regarded as normal vary; the Mirsen scale ignores small caps around frontal and occipital horns, the Shimada scale rates small caps and thin lining as normal, and the Schmidt scale ignores all other types of PVHI except irregular changes. Finally, selection of the pulse sequence used also varies: T2-, PD-weighted, or a combination of both.
Our results highlight the question of the validity of WMHI rating. WMHIs are related to different etiologic factors (eg, small-vessel disease, hypoperfusion) and different pathological changes (eg, araiosis, demyelination, état criblé, focal cavitated and noncavitated infarcts).4 44 55 56 57 58 59 60 Some of the WMHIs are likely to be physiological61 rather than pathological changes, and some are of nonischemic origin.44 The important questions are as follows: Does the rating scale measure the essential pathophysiological features of WMHIs (construct validity)? How well does the rating scale reflect the different types, extent, and sites of WMHIs (content validity)? How accurate is the scale compared with a “gold standard” (criterion validity)?
The problems with construct and content validity might be solved by large enough prospective premortem and postmortem MRI studies compared with neuropathological changes. If we accept that the extent of WMHIs in defined WM areas is the criterion standard, comparison of visual ratings with planimetric and volumetric assessments could guide new discoveries.
Our results show that some of the inconsistencies in the previous studies on the frequency and extent of WM changes, as well as diverse results in risk factor studies, are due to differences in the WM rating scales used. Our results call for prospective clinical MRI studies in different populations with neuropathological correlates to clarify the validity of the rating scales, as well as for debate regarding the international harmonization of WM rating scales.
Selected Abbreviations and Acronyms
DWMHI = deep white matter hyperintensity
FLAIR = fluid-attenuated inversion recovery
NEX = number of excitations
WMHI = white matter hyperintensity
Rating Scales for White Matter Hyperintensities on MRI
Gerard and Weisberg Scale (1986)
(1) Punctate discrete round regions contiguous with the anterior frontal horn, with no other abnormal findings seen in the periventricular region
(2) Dense bands surrounding the frontal horns and sometimes with distinct high-signal intensity surrounding the occipital region
(3) Thick caps surrounding the frontal horns with thin bands surrounding the bodies of the lateral ventricles
(4) Thick caps surrounding the frontal and occipital horns and thick bands surrounding the bodies of the lateral ventricles
(5) Thick irregular or smooth caps or bands surrounding the entire ventricular system
Shimada Scale (1990)
(1) No abnormality or minimal periventricular signal hyperintensities in the form of caps only in the anterior horns or rims lining the ventricle
(2) Caps in both anterior and posterior horns of lateral ventricles or periventricular unifocal patches
(3) Multifocal periventricular hyperintense punctate lesions and their early confluent stages
(4) Multiple high signal intensity areas that reach confluence in the periventricular region
Periventricular and Deep White Matter Hyperintensity (Distinct)
Erkinjuntti Scale (1994)
Periventricular hyperintensities (HIs):
(0) Lesions absent
(2) Pencil-thin lining
(3) Smooth halo
(4) Irregular HIs extending to the deep white matter
Other white matter hyperintensities:
(1) <5 small focal and/or <2 large focal lesions
(2) 5 to 12 small and/or 2 to 4 large focal lesions
(3) >12 small focal and/or >4 large focal or some confluent lesions
(4) Predominantly confluent lesions
Fazekas Scale (1987)
Periventricular hyperintensities (PVH):
(1) “Caps” or pencil-thin lining
(2) Smooth “halo”
(3) Irregular PVH extending into the deep white matter
Deep white matter hyperintense signals (DWMH):
(1) Punctate foci
(2) Beginning confluence of foci
(3) Large confluent areas
Mirsen Scale (1991)
Periventricular hyperintensities (PVH):
(1) Absent (includes small areas of hyperintensity at the frontal or occipital horns of the lateral ventricles)
Hyperintensity distant from ventricles (L-A, leukoaraiosis):
(0) Absence of lesions
(1) 1 or 2 focal lesions
(2) 3 to 5 lesions
(3) >5 lesions
(4) Confluent lesions
Scheltens Scale (1992-1993)
Periventricular hyperintensities (PVH score 0-6):
Caps: occipital and frontal
Bands: lateral ventricles
(1) ≤5 mm
(2) ≥6 mm and ≤10 mm
Periventricular hyperintensities >10 mm are scored as lobar white matter changes (WMH).
White matter hyperintensities (WMH score 0-24):
Frontal, parietal, occipital, temporal
Basal ganglia hyperintensities (BGH score 0-30):
Caudate nucleus, putamen, globus pallidus, thalamus
Infratentorial foci of hyperintensity (ITFH score 0-24):
Cerebellum, mesencephalon, pons, medulla
(0) No abnormalities
(1) ≤3 mm: n≤5
(2) ≤3 mm: n≥6
(3) 4-10 mm: n≤5
(4) 4-10 mm: n≥6
(5) ≥11 mm: n≥1
Ylikoski Scale (1993)
Periventricular hyperintensities/leukoaraiosis (PV-LA):
(0) No hyperintensity
(1) Punctate, small foci (mild)
(2) Cap, pencil-thin lining (moderate)
(3) Nodular band, extending hyperintensity (severe)
Centrum semiovale leukoaraiosis (CS-LA, including watershed=Fazekas rating scale):
(0) No hyperintensity
(1) Punctate, small foci (mild)
(2) Beginning confluent (moderate)
(3) Large confluent areas (severe)
Total LA score:
PV-LA 0-24 + CS-LA 0-24=0-48
Periventricular and Deep White Matter Hyperintensity (Combined)
Breteler Scale (1994)
Periventricular and focal white matter lesions:
(0) No change or slight periventricular hyperintensity (small caps or pencil-thin lining), <5 focal lesions, and no confluent lesions
(1) Moderate periventricular hyperintensity (caps on both anterior and posterior horns of the lateral ventricles, corpus only partly involved, not irregularly extending into the deep white matter) or ≥5 focal lesions or both, but no confluent lesions
(2) Severe periventricular hyperintensity (irregularly extending into the deep white matter or marked areas of hyperintensity completely surrounding the lateral ventricles) or confluent lesions (regardless of the presence of focal lesions)
Schmidt Scale (1992)
White matter hyperintensity (WMH):
(2) Beginning confluent
(3) Confluent or irregular periventricular hyperintensities
Wahlund Scale (1990)
Deep white matter lesions (WMLs) and periventricular hyperintensity combined and called WMLs:
(1) No white matter changes
(1.5) Small solitary white matter changes
(2) Multiple discrete or large solitary white matter changes
(2.5) Multiple, partly confluent white matter changes
(3) Multiple, large confluent white matter changes
Deep White Matter Hyperintensity
Herholz Scale (1990)
White matter lucencies/lesions (WMLs):
(0) No lesions
(1) ≤3 lesions <5 mm and/or 1 lesion ≥5 mm
(2) ≤10 lesions <5 mm and/or ≤3 lesions ≥5 mm
(3) >10 lesions <5 mm and/or ≤10 lesions ≥5 mm
(4) >10 single lesions >5 mm and/or ≥1 large confluent lesion
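The "and/or" wording of the Herholz grades leaves room for interpretation. One possible reading, sketched below, grades the small (<5 mm) and large (≥5 mm) lesion counts separately and takes the higher of the two. This is an assumption made for illustration, not a rule stated in the scale: the function name `herholz_grade` is hypothetical, the small-lesion criterion of grade 2 is read as lesions <5 mm, and the ">5 mm" of grade 4 is treated as ≥5 mm.

```python
def herholz_grade(n_small: int, n_large: int, confluent: bool = False) -> int:
    """Grade 0-4 from lesion counts (one possible reading of the scale).

    n_small: number of lesions <5 mm; n_large: number of lesions >=5 mm;
    confluent: True if at least one large confluent lesion is present.
    """
    if confluent:
        return 4
    # Grade implied by the small-lesion count alone.
    if n_small == 0:
        small = 0
    elif n_small <= 3:
        small = 1
    elif n_small <= 10:
        small = 2
    else:
        small = 3
    # Grade implied by the large-lesion count alone.
    if n_large == 0:
        large = 0
    elif n_large == 1:
        large = 1
    elif n_large <= 3:
        large = 2
    elif n_large <= 10:
        large = 3
    else:
        large = 4
    # Assumption: the patient receives the higher of the two grades.
    return max(small, large)
```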
Hunt Scale (1989)
White matter hyperintensities (WMH):
(4) Widely distributed
A severity score for WMH within each anatomic region (frontal, parietal, temporal, occipital, and internal capsule) is determined by multiplying size by number. An overall measure of WMH is determined by summing severity ratings across regions.
In addition, a 4-level grading score ranges from normal to severe:
Normal: no lesions or 1 punctate lesion
Mild: patchy or ≥2 punctate lesions
Moderate: 1 large or ≥2 patchy lesions
Severe: ≥2 large, confluent lesions
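The two Hunt measures described above, the summed regional severity and the 4-level grade, can be sketched as follows. The function names and input encodings are hypothetical. The sketch assumes that size and number have already been coded as integers for each region, and it reads "patchy" in the mild grade as a single patchy lesion and the severe grade as 2 or more large lesions.

```python
from typing import Mapping, Tuple


def hunt_overall(regions: Mapping[str, Tuple[int, int]]) -> int:
    """Sum of (size code x number code) across anatomic regions."""
    return sum(size * number for size, number in regions.values())


def hunt_four_level(punctate: int, patchy: int, large: int) -> str:
    """Four-level grade from counts of punctate, patchy, and large lesions."""
    if large >= 2:
        return "severe"
    if large == 1 or patchy >= 2:
        return "moderate"
    if patchy == 1 or punctate >= 2:
        return "mild"
    return "normal"  # no lesions or a single punctate lesion
```

With illustrative codes, `hunt_overall({"frontal": (2, 3), "parietal": (1, 4)})` multiplies size by number within each region and sums the products.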
van Swieten Scale (1990)
White matter lesions on three subsequent slices:
1. Around the anterior horns of the lateral ventricles
2. Around the posterior part of cella media and the posterior part of the centrum semiovale
(0) No lesion or only a single lesion
(1) Multiple focal lesions
(2) Multiple confluent lesions scattered throughout the white matter
This study was supported in part by grants from the Medical Council of the Academy of Finland, Helsinki, Finland; the Clinical Research Institute of the Helsinki University Central Hospital, Helsinki, Finland; and the Finnish Alzheimer Foundation for Research, Helsinki, Finland. Dr R. Mäntylä is supported by the Academy of Finland and the Clinical Research Institute of the Helsinki University Central Hospital. Dr T. Erkinjuntti is supported by the Medical Council of the Academy of Finland and the Finnish Alzheimer Foundation for Research. Dr H.J. Aronen is supported by the Academy of Finland and the Paulo Foundation, Helsinki, Finland. Teemu Peltonen, BA, is supported by the Paulo Foundation and the Pehr Oscar Klingendahl Foundation, Helsinki, Finland. Dr T. Pohjasvaara is supported by the Clinical Research Institute of the Helsinki University Central Hospital. We thank Mauno Korpelainen, ScM, for statistical support and Seppo Sarna, PhD, Department of Public Health, University of Helsinki, Finland, for statistical review.
Reprint requests to Dr R. Mäntylä, Department of Radiology, University of Helsinki, Haartmaninkatu 4, FIN-00290 Helsinki, Finland.
- Received February 13, 1997.
- Revision received May 2, 1997.
- Accepted May 2, 1997.
- Copyright © 1997 by American Heart Association
Bradley WG, Waluch V, Brant-Zawadzki M, Yadley RA, Wycoff RR. Patchy, periventricular white matter lesions in the elderly: a common observation during NMR imaging. Noninvas Med Imaging. 1984;1:35-41.
Awad IA, Spetzler RF, Hodak JA, Awad CA, Carey R. Incidental subcortical lesions identified on magnetic resonance imaging in the elderly, I: correlation with age and cerebrovascular risk factors. Stroke. 1986;17:1084-1089.
Erkinjuntti T, Ketonen L, Sulkava R, Sipponen J, Vuorialho M, Iivanainen M. Do white matter changes on MRI and CT differentiate vascular dementia from Alzheimer’s disease? J Neurol Neurosurg Psychiatry. 1987;50:37-42.
Braffman BH, Zimmerman RA, Trojanowski JQ, Gonatas NK, Hickey WF, Schlaepfer WW. Brain MR: pathologic correlation with gross and histopathology, II: hyperintense white-matter foci in the elderly. AJNR Am J Neuroradiol. 1988;9:629-636.
Fazekas F, Alavi A, Chawluk JB, Zimmerman RA, Hackney D, Bilaniuk L, Rosen M, Alves WM, Hurtig HI, Jamieson DG, et al. Comparison of CT, MR, and PET in Alzheimer’s dementia and normal aging. J Nucl Med. 1989;30:1607-1615.
Erkinjuntti T, Hachinski VC. Rethinking vascular dementia. Cerebrovasc Dis. 1993;3:3-23.
Pantoni L, Garcia JH. The significance of cerebral white matter abnormalities 100 years after Binswanger’s report: a review. Stroke. 1995;26:1293-1301.
Hunt AL, Orrison WW, Yeo RA, Haaland KY, Rhyne RL, Garry PJ, Rosenberg GA. Clinical significance of MRI white matter lesions in the elderly. Neurology. 1989;39:1470-1474.
Sullivan P, Pary R, Telang F, Rifai AH, Zubenko GS. Risk factors for white matter changes detected by magnetic resonance imaging in the elderly. Stroke. 1990;21:1424-1428.
Schmidt R, Fazekas F, Kleinert G, Offenbacher H, Gindl K, Payer F, Freidl W, Niederkorn K, Lechner H. Magnetic resonance imaging signal hyperintensities in the deep and subcortical white matter: a comparative study between stroke patients and normal volunteers. Arch Neurol. 1992;49:825-827.
Breteler MM, van Swieten JC, Bots ML, Grobbee DE, Claus JJ, van den Hout JH, van Harskamp F, Tanghe HL, de Jong PT, van Gijn J, Hofman A. Cerebral white matter lesions, vascular risk factors, and cognitive function in a population-based study: the Rotterdam Study. Neurology. 1994;44:1246-1252.
Longstreth WT, Manolio TA, Arnold A, Burke GL, Bryan N, Jungreis CA, Enright PL, O’Leary D, Fried L. Clinical correlates of white matter findings on cranial magnetic resonance imaging of 3301 elderly people: the Cardiovascular Health Study. Stroke. 1996;27:1274-1282.
Fukuda H, Kitani M. Cigarette smoking is correlated with the periventricular hyperintensity grade on brain magnetic resonance imaging. Stroke. 1996;27:645-649.
Breteler MM, van Amerongen NM, van Swieten JC, Claus JJ, Grobbee DE, van Gijn J, Hofman A, van Harskamp F. Cognitive correlates of ventricular enlargement and cerebral white matter lesions on magnetic resonance: the Rotterdam Study. Stroke. 1994;25:1109-1115.
Ylikoski A, Erkinjuntti T, Raininko R, Sarna S, Sulkava R, Tilvis R. White matter hyperintensities on MRI in the neurologically nondiseased elderly: analysis of cohorts of consecutive subjects aged 55 to 85 years living at home. Stroke. 1995;26:1171-1177.
Schmidt R, Hayn M, Fazekas F, Kapeller P, Esterbauer H. Magnetic resonance imaging white matter hyperintensities in clinically normal elderly individuals: correlations with plasma concentrations of naturally occurring antioxidants. Stroke. 1996;27:2043-2044.
Brant-Zawadzki M, Fein G, Van Dyke C, Kiernan R, Davenport L, de Groot J. MR imaging of the aging brain: patchy white-matter lesions and dementia. AJNR Am J Neuroradiol. 1985;6:675-682.
Gerard G, Weisberg LA. MRI periventricular lesions in adults. Neurology. 1986;36:998-1001.
Shimada K, Kawamoto A, Matsubayashi K, Ozawa T. Silent cerebrovascular disease in the elderly: correlation with ambulatory pressure. Hypertension. 1990;16:692-699.
van Swieten JC, Hijdra A, Koudstaal PJ, van Gijn J. Grading white matter lesions on CT and MRI: a simple scale. J Neurol Neurosurg Psychiatry. 1990;53:1080-1083.
Fukuda H, Kitani M. Differences between treated and untreated hypertensive subjects in the extent of periventricular hyperintensities observed on brain MRI. Stroke. 1995;26:1593-1597.
Victoroff J, Mack WJ, Grafton ST, Schreiber SS, Chui HC. A method to improve interrater reliability of visual inspection of brain MRI scans in dementia. Neurology. 1994;44:2267-2276.
Scheltens P, Barkhof F, Valk J, Algra PR, van der Hoop RG, Nauta J, Wolters EC. White matter lesions on magnetic resonance imaging in clinically diagnosed Alzheimer’s disease: evidence for heterogeneity. Brain. 1992;115:735-748.
Fazekas F, Kleinert R, Offenbacher H, Schmidt R, Kleinert G, Payer F, Radner H, Lechner H. Pathologic correlates of incidental MRI white matter signal hyperintensities. Neurology. 1993;43:1683-1689.
Pohjasvaara T, Erkinjuntti T, Vataja R, Kaste M. Comparison of stroke features and disability in daily life in patients with ischemic stroke aged 55 to 70 and 71 to 85 years. Stroke. 1997;28:729-735.
Altman DG. Practical Statistics for Medical Research. London, UK: Chapman & Hall; 1995:403-409.
Afifi AA, Azen SP. Statistical Analysis: A Computer Oriented Approach. 2nd ed. New York, NY: Academic Press; 1979:106-113.
Dixon WJ, ed. BMDP Statistical Software Manual: BMDP Release 7 Edition. Los Angeles, Calif: University of California Press; 1992.
Fazekas F, Niederkorn K, Schmidt R, Offenbacher H, Horner S, Bertha G, Lechner H. White matter signal abnormalities in normal individuals: correlation with carotid ultrasonography, cerebral blood flow measurements, and cerebrovascular risk factors. Stroke. 1988;19:1285-1288.
Hendrie HC, Farlow MR, Guerriero Austrom M, Edwards MK, Williams MA. Foci of increased T2 signal intensity on brain MR scans of healthy elderly subjects. AJNR Am J Neuroradiol. 1989;10:703-707.
Matsubayashi K, Shimada K, Kawamoto A, Ozawa T. Incidental brain lesions on magnetic resonance imaging and neurobehavioral functions in the apparently healthy elderly. Stroke. 1992;23:175-180.
Schmidt R, Fazekas F, Offenbacher H, Dusek T, Zach E, Reinhart B, Grieshofer P, Freidl W, Eber B, Schumacher M, Koch M, Lechner H. Neuropsychologic correlates of MRI white matter hyperintensities: a study of 150 normal volunteers. Neurology. 1993;43:2490-2494.
Lopez OL, Becker JT, Jungreis CA, Rezek D, Estol C, Boller F, DeKosky ST. Computed tomography– but not magnetic resonance imaging–identified periventricular white-matter lesions predict symptomatic cerebrovascular disease in probable Alzheimer’s disease. Arch Neurol. 1995;52:659-664.
Moody DM, Bell MA, Challa VR. Features of the cerebral vascular pattern that predict vulnerability to perfusion or oxygenation deficiency: an anatomic study. AJNR Am J Neuroradiol. 1990;11:431-439.
Awad IA, Johnson PC, Spetzler RF, Hodak JA. Incidental subcortical lesions identified on magnetic resonance imaging in the elderly, II: postmortem pathological correlations. Stroke. 1986;17:1090-1097.
Leifer D, Buonanno FS, Richardson EP Jr. Clinicopathologic correlations of cranial magnetic resonance imaging of periventricular white matter. Neurology. 1990;40:911-918.
Fazekas F, Kleinert R, Offenbacher H, Payer F, Schmidt R, Kleinert G, Radner H, Lechner H. The morphologic correlate of incidental punctate white matter hyperintensities on MR images. AJNR Am J Neuroradiol. 1991;12:915-921.