Optimizing Cutoff Scores for the Barthel Index and the Modified Rankin Scale for Defining Outcome in Acute Stroke Trials
Background and Purpose— There is little agreement on how to assess outcome in acute stroke trials. Cutoff scores for the Barthel Index (BI) and modified Rankin Scale (mRS) are frequently arbitrarily chosen to dichotomize favorable and unfavorable outcome. We investigated sensitivity and specificity of BI cutoff scores in relation to the mRS to obtain the optimal corresponding BI and mRS scores.
Methods— BI and mRS scores were collected from 1034 ischemic stroke patients. Sensitivity and specificity were calculated for BI cutoff scores from 45 to 100 in mRS score 1, 2, and 3 and were plotted in receiver operator characteristic (ROC) curves.
Results— The cutoff scores for the BI with the highest sum of sensitivity and specificity were 95 (sensitivity 85.6%; specificity 91.7%), 90 (sensitivity 90.7%; specificity 88.1%), and 75 (sensitivity 95.7%; specificity, 88.5%) for, respectively, mRS 1, 2, and 3. The area under the ROC curve was 0.933 in mRS 1, 0.960 in mRS 2, and 0.979 in mRS 3.
Conclusions— The optimal cutoff scores for the BI were 95 for mRS 1, 90 for mRS 2, and 75 for mRS 3. For future acute stroke trials that assess stroke outcome with the BI and mRS, we recommend the use of these BI cutoff score(s) with the corresponding mRS cutoff score(s), to ensure the use of consistent and uniform end points.
Several randomized controlled acute stroke trials have been designed to investigate effectiveness of therapeutic interventions. A major point of discussion is how to define outcome in acute stroke trials with disability and handicap scales.1–6 The most widely used scales are the modified Rankin Scale (mRS) and the Barthel Index (BI).
The mRS has proved to be valid and reliable for defining outcome in stroke patients.7,8 Although the mRS was designed as a handicap scale,9 it should be considered a disability scale.10 The mRS defines 6 different grades of disability, from 0 for “no symptoms at all” to 5 for “severe disability or bedridden, incontinent, and requiring constant nursing care and attention,” and grade 6 for death.
The BI has also been shown to be valid and reliable for assessing disability in stroke patients.8,11 It contains 10 items with varying weights that score activities of daily living (ADL). The items bathing and grooming are scored 0 or 5; the items feeding, dressing, controlling bladder, controlling bowel, getting onto and off the toilet, and ascending and descending stairs are scored 0, 5, or 10. The items moving from wheelchair to bed and walking on a level surface are scored 0, 5, 10, or 15. The total BI is the sum of the 10 item scores, with a maximum score of 100 corresponding with complete independence, and a minimum score of 0 corresponding with total dependence.
There is little consensus on the optimal implementation of the BI and mRS as outcome measures in acute stroke trials. It is unclear which outcome scale is preferable. Moreover, the cutoff scores distinguishing favorable and unfavorable outcome vary widely between acute stroke trials.4 This issue has great consequences for the design and interpretation of acute stroke trials. The BI has a larger score range and therefore more possible cutoff scores than the mRS. Little is known about which BI scores correspond with the different mRS scores. Only a few studies have determined pivotal BI cutoff scores, and none of them related these scores to the mRS.12–14 In this article, we investigated which cutoff scores on the BI corresponded to mRS grades 1, 2, and 3.
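The item weights described above can be sketched in a few lines of Python. This is only an illustration of the scoring rule as stated in the text; the item keys (`"toilet_use"`, `"transfer"`, and so on) and the function name `barthel_total` are hypothetical labels chosen for the example, not an official coding scheme.

```python
# Allowed scores per BI item, following the description in the text.
# The dictionary keys are illustrative names (an assumption of this sketch).
BI_ITEMS = {
    "bathing": (0, 5),
    "grooming": (0, 5),
    "feeding": (0, 5, 10),
    "dressing": (0, 5, 10),
    "bladder": (0, 5, 10),
    "bowel": (0, 5, 10),
    "toilet_use": (0, 5, 10),
    "stairs": (0, 5, 10),
    "transfer": (0, 5, 10, 15),  # moving from wheelchair to bed
    "walking": (0, 5, 10, 15),   # walking on a level surface
}


def barthel_total(scores):
    """Return the total BI after checking each item against its allowed values."""
    if set(scores) != set(BI_ITEMS):
        raise ValueError("scores must contain exactly the 10 BI items")
    for item, value in scores.items():
        if value not in BI_ITEMS[item]:
            raise ValueError(f"{item}: {value} is not one of {BI_ITEMS[item]}")
    return sum(scores.values())


# A completely independent patient scores the maximum on every item.
independent = {item: max(allowed) for item, allowed in BI_ITEMS.items()}
print(barthel_total(independent))  # 100
```

Because the item maxima are 2 × 5, 6 × 10, and 2 × 15, the total can only take values from 0 to 100 in steps of 5.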
Subjects and Methods
Population and Data Collection
Data were obtained from the United States and Canadian Lubeluzole Ischemic Stroke Study (INT-LUB-9) and the European and Australian Lubeluzole Ischemic Stroke Study (INT-LUB-5),15,16 provided by the Janssen Research Foundation (Beerse, Belgium). These trials were published in 1997 and 1998, respectively. In summary, they studied the neuroprotective effect of lubeluzole in acute ischemic stroke. In both trials, there was no significant difference in mortality rate (primary end point) between lubeluzole-treated patients and placebo-treated patients.
The INT-LUB-5 study included 725 stroke patients (675 ischemic and 50 hemorrhagic), and the INT-LUB-9 study included 721 patients (700 ischemic stroke and 21 nonischemic stroke or other causes). BI and mRS scores from ischemic stroke patients at 12 weeks after stroke onset were analyzed. Patients who had died were excluded because our analysis focused on disability scores of the BI and mRS. We did not make a distinction between lubeluzole-treated and placebo-treated patients because the intention was only to study the relationship between BI and mRS scores. At 12 weeks, 519 corresponding BI and mRS scores were present in INT-LUB-9 and 515 in INT-LUB-5, forming a total of 1034 paired BI and mRS scores.
Outcome was dichotomized into favorable and unfavorable using 3 different mRS scores to obtain the corresponding BI score for each mRS score. An mRS score ≤1, ≤2, or ≤3 reflected favorable outcome, and an mRS score >1, >2, or >3 reflected unfavorable outcome. Each BI cutoff score from 45 to 100 was evaluated, with a BI at or above the cutoff defined as favorable outcome and a BI below the cutoff defined as unfavorable outcome. Sensitivity was expressed as the proportion of patients with unfavorable outcome according to the mRS who also had unfavorable outcome according to the BI. Specificity was expressed as the proportion of patients with favorable outcome according to the mRS who also had favorable outcome according to the BI (Table 1).
Because false favorable and false unfavorable outcome rates were considered to be equally important, the maximal distinction between favorable and unfavorable outcome defined by the mRS is reached at the BI score with the highest sum of sensitivity and specificity.17 To investigate the relationship between sensitivity and specificity, receiver operator characteristic (ROC) curves were obtained and the areas under the curve (AUCs) were calculated. ROC curves plot sensitivity versus 1 − specificity and visualize the optimal cutoff scores for the BI in each mRS grade. The AUC indicates how well the BI cutoff scores discriminate between favorable and unfavorable outcome for each of the 3 mRS scores.
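As an illustration of these definitions, the following Python sketch computes sensitivity and specificity of one BI cutoff against one mRS cutoff. The function name `sens_spec` and the small `toy` data set are hypothetical and are not taken from the trial data; the sketch assumes that both outcome groups are non-empty.

```python
def sens_spec(pairs, mrs_cutoff, bi_cutoff):
    """Return (sensitivity, specificity) of a BI cutoff, with the mRS as reference.

    pairs: iterable of (bi, mrs) scores for surviving patients.
    Unfavorable by mRS: mrs > mrs_cutoff. Unfavorable by BI: bi < bi_cutoff.
    Assumes at least one favorable and one unfavorable patient by mRS.
    """
    tp = fn = tn = fp = 0
    for bi, mrs in pairs:
        mrs_unfavorable = mrs > mrs_cutoff
        bi_unfavorable = bi < bi_cutoff
        if mrs_unfavorable and bi_unfavorable:
            tp += 1  # unfavorable on both scales
        elif mrs_unfavorable:
            fn += 1  # false favorable: BI favorable, mRS unfavorable
        elif bi_unfavorable:
            fp += 1  # false unfavorable: BI unfavorable, mRS favorable
        else:
            tn += 1  # favorable on both scales
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical (BI, mRS) pairs, for illustration only.
toy = [(100, 0), (100, 1), (95, 1), (90, 1), (95, 2),
       (85, 2), (70, 3), (50, 4), (20, 5), (95, 3)]

sensitivity, specificity = sens_spec(toy, mrs_cutoff=1, bi_cutoff=95)
print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")
```

In this toy set, 4 of the 6 patients with mRS >1 have BI <95, and 3 of the 4 patients with mRS ≤1 have BI ≥95.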
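The cutoff selection and AUC described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the text does not state how the AUC was computed, so the trapezoidal rule (with the curve closed at the corners (0, 0) and (1, 1)) is an assumption of the sketch, and the function names and `toy` data are hypothetical.

```python
def roc_points(pairs, mrs_cutoff, bi_cutoffs):
    """Return a list of (bi_cutoff, sensitivity, specificity) tuples."""
    unfavorable = [bi for bi, mrs in pairs if mrs > mrs_cutoff]
    favorable = [bi for bi, mrs in pairs if mrs <= mrs_cutoff]
    points = []
    for cutoff in bi_cutoffs:
        sens = sum(bi < cutoff for bi in unfavorable) / len(unfavorable)
        spec = sum(bi >= cutoff for bi in favorable) / len(favorable)
        points.append((cutoff, sens, spec))
    return points


def best_cutoff(points):
    """Pick the cutoff with the highest sum of sensitivity and specificity.

    On ties, the lowest cutoff in the list is returned.
    """
    return max(points, key=lambda p: p[1] + p[2])


def auc_trapezoid(points):
    """Area under the ROC curve (sensitivity vs 1 - specificity), trapezoidal rule."""
    xy = {(1 - spec, sens) for _, sens, spec in points}
    xy |= {(0.0, 0.0), (1.0, 1.0)}  # close the curve at both corners
    xy = sorted(xy)
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(xy, xy[1:]))


# Hypothetical (BI, mRS) pairs, for illustration only.
toy = [(100, 0), (100, 1), (95, 1), (90, 1), (95, 2),
       (85, 2), (70, 3), (50, 4), (20, 5), (95, 3)]

points = roc_points(toy, mrs_cutoff=2, bi_cutoffs=range(45, 105, 5))
cutoff, sens, spec = best_cutoff(points)
print(cutoff, round(sens + spec, 3), round(auc_trapezoid(points), 3))
```

Maximizing the sum of sensitivity and specificity is equivalent to maximizing sensitivity + specificity − 1, so the same cutoff is selected under either formulation.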
Results
Of the 1034 patients, 547 (52.9%) were female. The mean age was 69.1 years (SD 12.8 years). The median BI score was 80, with an interquartile range from 40 to 100. The mRS score distribution was mRS 0, 9.1%; mRS 1, 17.8%; mRS 2, 13.1%; mRS 3, 19.1%; mRS 4, 29.7%; and mRS 5, 11.2%.
Sensitivity and Specificity of BI and mRS Cutoff Scores
For mRS 1, the optimal cutoff score on the BI was 95, with a sensitivity of 85.6% (95% CI, 82.9% to 87.9%) and a specificity of 91.7% (95% CI, 87.8% to 94.5%). For mRS 2, the BI score with the highest sum of sensitivity and specificity was 90, with a sensitivity of 90.7% (95% CI, 88.1% to 92.7%) and a specificity of 88.1% (95% CI, 84.6% to 90.9%). mRS 3 agreed most with a BI score of 75, with a sensitivity of 95.7% (95% CI, 93.3% to 97.5%) and a specificity of 88.5% (95% CI, 85.8% to 90.8%). For all 3 mRS cutoff scores, sensitivity (rate of true unfavorable outcome) increased and specificity (rate of true favorable outcome) decreased when BI cutoff scores increased.
Subsequently, AUCs were calculated. The AUC for the BI cutoff scores was 0.932 (95% CI, 0.916 to 0.949) in mRS 1, 0.960 (95% CI, 0.949 to 0.971) in mRS 2, and 0.979 (95% CI, 0.972 to 0.985) in mRS 3.
Discussion
In this study, we analyzed the optimal cutoff scores for the BI and the mRS. These were found to be BI 95 for mRS 1, BI 90 for mRS 2, and BI 75 for mRS 3. This finding may have consequences for the definition of outcome in acute stroke trials.
A recent acute stroke trial defined favorable outcome with an mRS ≤2 and BI ≥75.18 According to our results, these cutoff scores could be suboptimal. The sensitivity (75.0%) and specificity (97.8%) of these cutoff scores imply that 25% of the patients with an unfavorable outcome according to the mRS would have a favorable outcome according to the BI. With regard to the specificity, 2.2% of the patients with a favorable outcome according to the mRS would have an unfavorable outcome according to the BI. By choosing a BI cutoff score of ≥90 (sensitivity 90.7%; specificity 88.1%), the false favorable outcome rate could be reduced to 9.3%, whereas the false unfavorable outcome rate would increase to 11.9%. Minimizing false favorable and false unfavorable outcome rates could decrease unnecessary heterogeneity of outcome in acute stroke trials.
Compared with our results, Celani et al found that BI >90 (sensitivity 98%; specificity 97%) was a pivotal score for which patients did not require help from another person for everyday activities.14 Kay et al concluded that BI ≤80 (sensitivity 94%; specificity 80%) was the optimal cutoff score for self-reported dependency.12 These cutoff scores differed from those of our study when dependency is considered to be mRS >2, for which the optimal BI score was <90. These differences may be explained by the subjectivity of self-reported dependency, which will be influenced by personal circumstances such as socioeconomic status and psychological factors.
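The arithmetic behind this comparison is simply that the false favorable rate equals 100% minus sensitivity and the false unfavorable rate equals 100% minus specificity. A short Python sketch using the percentages quoted above (the function name `misclassification_rates` is a hypothetical label for this example):

```python
def misclassification_rates(sensitivity_pct, specificity_pct):
    """Return (false favorable %, false unfavorable %), rounded to 1 decimal."""
    false_favorable = round(100.0 - sensitivity_pct, 1)
    false_unfavorable = round(100.0 - specificity_pct, 1)
    return false_favorable, false_unfavorable


# Sensitivity and specificity for mRS <=2, as reported in the text.
for label, sens, spec in [("BI >=75", 75.0, 97.8), ("BI >=90", 90.7, 88.1)]:
    ff, fu = misclassification_rates(sens, spec)
    total = round(sens + spec, 1)
    print(f"{label}: false favorable {ff}%, false unfavorable {fu}%, "
          f"sensitivity + specificity = {total}")
```

With these figures, the sum of sensitivity and specificity is 172.8 for BI ≥75 and 178.8 for BI ≥90, which is why the higher cutoff is preferred under the criterion used in this study.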
The mRS cutoff scores were used as a reference to distinguish favorable from unfavorable outcome. Although this is actually not a “gold standard” for dichotomizing outcome, we think that the mRS is suitable for this purpose. First, the mRS is a clinically relevant scale, with 6 different easily understandable and well-defined grades. Second, the BI is highly correlated with the mRS;19 therefore, we can compare BI cutoff scores with the mRS. Third, the mRS measures global disability, whereas the BI scores only ADL.
A point of criticism is that there is only a 5-point difference between the optimal BI cutoff scores in mRS 1 and mRS 2. These BI scores are near the maximum score of the BI. This can be explained by the frequently observed ceiling effects of the BI.5,6,20 Weimar et al concluded that because of the ceiling effect, the mRS is preferable to the BI for defining outcome.5 Kwon et al showed that there was no significant difference in BI scores between mRS 0, mRS 1, and mRS 2 because of ceiling effects of the BI.19
If the intention of a therapeutic intervention is to obtain excellent recovery after stroke, which could be defined as mRS ≤1, the corresponding BI cutoff score was ≥95, according to our results. There is consensus that mRS ≤2 reflects independence and mRS >2 implies dependence.21 Our study showed that a BI score ≥90 is the optimal cutoff score in relation to mRS ≤2. In severe strokes, one could decide to choose mRS ≤3 and BI ≥75 as cutoff scores for favorable outcome. An example of stroke severity–related outcome has been mentioned by Adams et al.22 They used the mRS as primary end point, where mRS cutoff scores 0, ≤1, or ≤2 reflected favorable outcome, depending on the baseline National Institutes of Health Stroke Scale (NIHSS) score.
In conclusion, we determined the optimal corresponding BI and mRS cutoff scores: BI 95 for mRS 1, BI 90 for mRS 2, and BI 75 for mRS 3. We recommend the use of these BI cutoff scores with the corresponding mRS scores for future acute stroke trials in which BI and mRS scores dichotomize favorable and unfavorable outcome.
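For a trial that applies paired cutoffs, the correspondence found here can be written as a small lookup table. The sketch below is illustrative only: the names `OPTIMAL_BI_CUTOFF` and `dichotomize` are hypothetical, and the sketch reports the two dichotomizations side by side rather than assuming any rule for combining them, because the text does not prescribe one.

```python
# Optimal BI cutoff for each mRS cutoff, as reported in this study.
OPTIMAL_BI_CUTOFF = {1: 95, 2: 90, 3: 75}


def dichotomize(mrs, bi, mrs_cutoff):
    """Classify one surviving patient on both scales with paired cutoffs.

    Favorable by mRS: mrs <= mrs_cutoff.
    Favorable by BI: bi >= the BI cutoff paired with that mRS cutoff.
    """
    bi_cutoff = OPTIMAL_BI_CUTOFF[mrs_cutoff]
    return {"mrs_favorable": mrs <= mrs_cutoff, "bi_favorable": bi >= bi_cutoff}


# The same hypothetical patient (mRS 3, BI 80) under two definitions of
# favorable outcome: independence (mRS <=2) and the severe-stroke option (mRS <=3).
print(dichotomize(mrs=3, bi=80, mrs_cutoff=2))
print(dichotomize(mrs=3, bi=80, mrs_cutoff=3))
```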
This study was supported by a grant from the Catharina Heerdt Foundation.
- Received April 13, 2005.
- Revision received May 27, 2005.
- Accepted June 20, 2005.
Duncan P, Jorgensen HS, Wade DT. Outcome measures in acute stroke trials: a systematic review and some recommendations to improve practice. Stroke. 2000; 31: 1429–1438.
Sulter G, Steen C, de Keyser J. Use of the Barthel Index and modified Rankin Scale in acute stroke trials. Stroke. 1999; 30: 1538–1541.
Weimar C, Kurth T, Kraywinkel K, Wagner M, Busse O, Haberl R, Diener H; for the German Stroke Data Bank Collaborators. Assessment of functioning and disability after ischemic stroke. Stroke. 2002; 33: 2053–2059.
Young F, Lees K, Weir C. Strengthening acute stroke trials through optimal use of disability end points. Stroke. 2003; 34: 2676–2680.
van Swieten J, Koudstaal P, Visser M, Schouten H, van Gijn J. Interobserver agreement for the assessment of handicap in stroke patients. Stroke. 1988; 19: 604–607.
D’Olhaberriague L, Litvan I, Mitsias P, Mansbach HH. A reappraisal of reliability and validity studies in stroke. Stroke. 1996; 27: 2331–2336.
de Haan R, Limburg M, Bossuyt P, van der Meulen J, Aaronson N. The clinical meaning of Rankin “handicap” grades after stroke. Stroke. 1995; 26: 2027–2030.
Mahoney F, Barthel D. Functional evaluation: the Barthel Index. Md State Med J. 1965; 14: 56–61.
Kay R, Wong KS, Perez G, Woo J. Dichotomizing stroke outcomes based on self-reported dependency. Neurology. 1997; 49: 1694–1696.
Celani M, Cantisani T, Righetti E, Spizzichino L, Ricci S. Different measures for assessing stroke outcome: an analysis from the international stroke trial in Italy. Stroke. 2002; 33: 218–223.
Grotta J. Lubeluzole treatment of acute ischemic stroke. The US and Canadian Lubeluzole Ischemic Stroke Study Group. Stroke. 1997; 28: 2338–2346.
Connell FA, Koepsell TD. Measures of gain in certainty from a diagnostic test. Am J Epidemiol. 1985; 121: 744–753.
Hacke W, Albers G, Al Rawi Y, Bogousslavsky J, Davalos A, Eliasziw M, Fischer M, Furlan A, Kaste M, Lees KR, Soehngen M, Warach S; for the DIAS Study Group. The desmoteplase in acute ischemic stroke trial (DIAS): a phase II MRI-based 9-hour window acute stroke thrombolysis trial with intravenous desmoteplase. Stroke. 2005; 36: 66–73.
Kwon S, Hartzema A, Min Lai S, Duncan P. Disability measures in stroke: relationship among the Barthel Index, the Functional Independence Measure, and the modified Rankin Scale. Stroke. 2004; 35: 918–923.
Wellwood I, Dennis MS, Warlow CP. A comparison of the Barthel Index and the OPCS disability instrument used to measure outcome after acute stroke. Age Ageing. 1995; 24: 54–57.
Warlow C, Dennis M, van Gijn J, Hankey G, Sandercock P, Bamford J, Wardlaw J. The organization of stroke services: outcome. In: Stroke: A Practical Guide to Management. Malden, Mass: Blackwell Sciences; 1996: 746–753.