Comparison of Psychometric Properties of Three Mobility Measures for Patients With Stroke
Background and Purpose— This study compared the validity, responsiveness, and interrater reliability of 3 mobility measures in stroke patients from the acute stage up to 180 days after stroke onset. The 3 measures were the Rivermead Mobility Index (RMI), a modified RMI (MRMI), and the Mobility Subscale of the Stroke Rehabilitation Assessment of Movement (STREAM).
Methods— The validity and responsiveness of the 3 mobility measures were prospectively examined by monitoring 57 stroke patients with the measures and the Barthel Index at 14, 30, 90, and 180 days after stroke onset. Two individual raters used the 3 measures to evaluate a different sample of 40 patients on 2 separate occasions to determine the interrater reliability.
Results— The Spearman ρ between STREAM and MRMI was ≥0.92; the intraclass correlation coefficient (ICC, a measure of agreement) between them was ≥0.89, indicating high concurrent validity of both measures. RMI showed a moderate to high relationship and agreement with STREAM and MRMI (ρ≥0.78, ICC≥0.5). Responsiveness of the 3 measures was high before 90 days after stroke onset (standardized response mean ≥0.83) and low at 90 to 180 days after stroke onset (0.2≤standardized response mean≤0.4). The score changes of the 3 measures at each stage were significant (P≤0.05), except for RMI and MRMI at 90 to 180 days after stroke onset. The interrater agreement of the 3 measures was high (ICC≥0.92).
Conclusions— All 3 measures examined showed acceptable levels of reliability, validity, and responsiveness in stroke patients. The psychometric characteristics of STREAM were slightly superior to those of the other 2 measures among our patients. We prefer and recommend STREAM for measuring mobility disability in stroke patients.
Mobility is essential to developing an independent lifestyle after stroke and is perhaps the particular ability that patients consider the most important.1 Therefore, improving mobility is 1 of the major goals of stroke rehabilitation.2 To identify mobility disabilities and manage their associated problems, both clinicians and researchers need a measure of mobility that is simple to administer and has sound psychometric properties (eg, reliability, validity, and responsiveness).3
The Rivermead Mobility Index (RMI), one of the few clinical mobility measures specifically designed for stroke patients, is the only mobility measure endorsed by the US Agency for Health Care Policy and Research.4 RMI has been used in several studies to examine treatment effect on mobility.5 According to Lennon and Hastings,6 the responsiveness of RMI is poor because its items are scored on a dichotomous (yes/no) basis. Therefore, the modified RMI (MRMI)2 was developed to increase the responsiveness of the measure by extending the scoring of each item to 6 levels. However, the psychometric properties of the MRMI have yet to be systematically explored. The Mobility Subscale of the Stroke Rehabilitation Assessment of Movement (STREAM),7 which assesses mobility after stroke, is a more recently developed measure. STREAM is simple to administer and is reliable and valid in stroke patients.7,8 Although the 3 measures appear to be accepted by both clinicians and researchers for measuring mobility disability after stroke, to the best of our knowledge, no empirical data exist to determine which measure is most psychometrically sound.
Comparison of the psychometric properties of clinical mobility measures can provide useful guidance for both clinicians and researchers in selecting an objective and scientifically sound measure.9 The purpose of this study was to compare the reliability, validity, and responsiveness of RMI, MRMI, and STREAM in stroke patients.
Subjects were recruited from stroke patients consecutively admitted to National Taiwan University Hospital between April 1 and December 31, 2001. Patients were included in the study if they met the following criteria: (1) diagnosis (International Classification of Diseases, ninth revision, clinical modification [ICD-9] codes) of cerebral hemorrhage (ICD-9, 431) or cerebral infarction (ICD-9, 434), (2) first onset of stroke without other major disease and the absence of a preexisting disability, (3) stroke onset within 14 days before admission, (4) ability to follow instructions, (5) willingness to participate in this study, and (6) residence in the greater Taipei area. The clinical diagnosis of stroke was confirmed by neuroimaging examination. Patients who suffered another stroke or had another major disease during the follow-up period were excluded.
The study protocol consisted of 2 parts. The first part was a validity and responsiveness study. The 3 mobility measures and the Barthel Activities of Daily Living Index (BI)10 were administered to patients at 14, 30, 90, and 180 days after stroke onset. The BI was used as the external criterion to examine convergent and predictive validity. Initial stroke severity was ascertained with the Canadian Neurological Scale11 applied retrospectively to medical records. Degrees of responsiveness of the mobility measures were calculated from the changes occurring between 14 to 30, 30 to 90, 90 to 180, 14 to 90, and 14 to 180 days after stroke onset.
When necessary, patients were allowed to rest during the testing protocol, which lasted ≈1 hour. All of the above assessments were made by occupational therapist A, according to previously published standardized methods of administration.2,11–14
The second part of the protocol was an interrater reliability study. Forty stroke patients at a rehabilitation unit participated in this part of the study. The 3 mobility measures were administered separately by 2 occupational therapists (A and B). To minimize the effects of possible recovery, assessments were administered within a 24-hour period according to a counterbalanced sequence. The therapists were blinded to the results of each other’s assessments during the study period.
RMI, which covers a range of hierarchical activities from turning over in bed to running, comprises 14 questions and 1 direct observation.13 Each patient’s mobility performance is rated primarily by interviewing the patients and/or their primary caregiver. The highest score, 15, indicates the highest mobility status. Although previous studies3,13,15 found that RMI had good psychometric properties in stroke patients, sample sizes in 2 of these studies were modest (≤23),13,15 limiting generalization of their results.
MRMI has 8 test items: turning over, changing from lying to sitting, maintaining sitting balance, going from sitting to standing, standing, transferring, walking indoors, and climbing stairs. Scores of the MRMI range from 0 to 40. One main characteristic of the MRMI is that patients are scored by direct observation of their performance on the items. Lennon and Johnson2 have proposed that the psychometric characteristics of the MRMI be examined further.
STREAM7 measures mobility after stroke by direct observation. It contains 10 four-point items: rolling, bridging, going from supine to sitting, changing from sitting to standing, standing, placing affected foot onto first step, stepping backward, stepping to affected side, walking 10 m, and walking down stairs. Scores of STREAM range from 0 to 30. Although the reliability and validity of STREAM are high in stroke patients,8,12 its responsiveness has not been reported.
The BI is a weighted scale of 10 items of basic activities of daily living (ADL).10 The highest score, 100, indicates that the patient is fully independent in physical function; the lowest score, 0, represents a totally dependent, bedridden state. The reliability, validity, and responsiveness of the BI are well established in stroke patients.16,17
Stroke severity at admission was determined by the Canadian Neurological Scale as described by Goldstein and Chilukuri.11 The score ranges from 0 to 11.5. This instrument has been shown to be valid and reliable in assessing stroke severity.11,18
A mobility measure should be able to reflect the whole range of mobility disability after stroke. We calculated the floor and ceiling effects, representing the percentage of subjects achieving the lowest and highest scores possible, respectively. Floor and ceiling effects exceeding 20% of sample size are considered to be significant,19 indicating that the measure can represent only a limited range of mobility disability.
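As an illustration only, the floor and ceiling calculation described above can be sketched in Python. The function names, the default threshold argument, and the example scores are hypothetical; they are not the study data and do not represent the software used in this study.

```python
def floor_ceiling_effects(scores, min_score, max_score):
    """Return (floor %, ceiling %): the share of subjects at the lowest
    and highest possible total score of a measure."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    floor_pct = 100.0 * sum(1 for s in scores if s == min_score) / n
    ceiling_pct = 100.0 * sum(1 for s in scores if s == max_score) / n
    return floor_pct, ceiling_pct


def exceeds_criterion(pct, threshold=20.0):
    """True when an effect exceeds the 20% criterion cited in the text."""
    return pct > threshold


# Hypothetical RMI total scores (possible range 0 to 15) for 10 patients.
example_scores = [0, 0, 0, 1, 2, 3, 5, 7, 10, 15]
floor_pct, ceiling_pct = floor_ceiling_effects(example_scores, 0, 15)
print(floor_pct, exceeds_criterion(floor_pct))      # floor effect
print(ceiling_pct, exceeds_criterion(ceiling_pct))  # ceiling effect
```

In this made-up sample, 3 of 10 patients score 0, so the floor effect (30%) exceeds the 20% criterion, whereas the ceiling effect (10%) does not.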
Concurrent validity is usually established by demonstrating a high correlation or agreement between the measure and a gold standard.16 Because each of the 3 measures used has a different score range, the scores from each measure were converted to a range of 0 to 100.16 The relationship and agreement between the 3 mobility measures at 4 time points were examined by use of the Spearman correlation coefficient (ρ) and the intraclass correlation coefficient (ICC), respectively.
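The rescaling and rank correlation steps can be sketched as follows. This is a minimal standard-library illustration, not the analysis code of the study: the helper names and the example scores are hypothetical, and Spearman's ρ is computed here as the Pearson correlation of average ranks, which is one common way of handling tied scores.

```python
from statistics import mean


def rescale_to_100(score, min_score, max_score):
    """Linearly map a raw total score onto a 0 to 100 range."""
    return 100.0 * (score - min_score) / (max_score - min_score)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0  # mean of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical total scores for 5 patients (MRMI 0 to 40, STREAM 0 to 30).
mrmi = [rescale_to_100(s, 0, 40) for s in [5, 12, 20, 33, 40]]
stream = [rescale_to_100(s, 0, 30) for s in [3, 10, 14, 27, 30]]
rho = spearman_rho(mrmi, stream)
```

Rescaling does not change ρ, because a linear transformation preserves ranks; it matters for the ICC, which compares the absolute values of the scores.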
The convergent validity of the mobility measures was assessed by examining the relationships between the total scores of the mobility measures and those of the BI at all 4 time points after stroke using Spearman’s ρ. The predictive validity of the mobility measures was assessed by examining the associations between results of the mobility measures at 3 time points (14, 30, and 90 days after stroke onset) and those of the BI at 180 days after stroke onset using the Spearman ρ.
Responsiveness was examined with the standardized response mean (SRM), 1 type of effect size. SRM was calculated by dividing the mean change scores by the SD of the change score in the same subjects. An effect size >0.8 is usually considered large; 0.5 to 0.8, moderate; and 0.2 to 0.5, small.20 Wilcoxon matched-pairs signed-rank tests were performed to determine the statistical significance of the change scores.
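The SRM calculation can be illustrated with a short Python sketch. The function names and example scores are hypothetical. Two choices are assumptions made for this illustration rather than details reported in the study: the sample SD (n−1 denominator) is used, and the "negligible" label for values below 0.2 is not defined in the text.

```python
from statistics import mean, stdev


def standardized_response_mean(before, after):
    """SRM = mean of the paired change scores / SD of the change scores."""
    if len(before) != len(after):
        raise ValueError("before and after must be paired")
    changes = [a - b for b, a in zip(before, after)]
    return mean(changes) / stdev(changes)  # stdev uses n-1 (assumption)


def effect_size_label(srm):
    """Apply the thresholds cited in the text; 'negligible' is our label."""
    size = abs(srm)
    if size > 0.8:
        return "large"
    if size >= 0.5:
        return "moderate"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical total scores for 5 patients at two assessments.
scores_day14 = [10, 12, 8, 15, 11]
scores_day30 = [14, 15, 13, 16, 15]
srm = standardized_response_mean(scores_day14, scores_day30)
label = effect_size_label(srm)
```

Here the change scores are 4, 3, 5, 1, and 4 (mean 3.4, sample SD about 1.52), which gives an SRM of about 2.24, a large effect. If SciPy is available, `scipy.stats.wilcoxon` could be applied to the same paired scores for the signed-rank test; that step is outside this sketch.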
The interrater agreement on individual items of the mobility measures was analyzed with the quadratic weighted κ statistic. The interrater agreement of the total score of the mobility measures was analyzed with the ICC statistic. The fixed-effects ICC (model 3)21 was used to compute the ICC value for interrater reliability. Both weighted κ and ICC values ≥0.80 indicate very good agreement; 0.60 to 0.79, good agreement; 0.40 to 0.59, moderate agreement; 0.20 to 0.39, fair agreement; and 0 to 0.2, poor agreement.18
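Both agreement statistics can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's analysis code: the function names and example ratings are hypothetical, item scores are assumed to be coded as integers from 0 to K−1, and the single-measure form of the model 3 ICC, commonly written ICC(3,1), is assumed because the text does not state whether single or average measures were used.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """Weighted kappa with disagreement weights (i - j)^2 / (K - 1)^2."""
    n = len(rater_a)
    observed = [[0.0] * n_categories for _ in range(n_categories)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n  # observed proportion in cell (a, b)
    marg_a = [sum(row) for row in observed]
    marg_b = [sum(observed[i][j] for i in range(n_categories))
              for j in range(n_categories)]
    num = den = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            w = (i - j) ** 2 / (n_categories - 1) ** 2
            num += w * observed[i][j]          # observed disagreement
            den += w * marg_a[i] * marg_b[j]   # chance-expected disagreement
    if den == 0.0:
        raise ValueError("kappa is undefined when ratings do not vary")
    return 1.0 - num / den


def icc_3_1(ratings):
    """ICC(3,1) from a subjects x raters table via two-way ANOVA."""
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_subjects = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_subjects - ss_raters
    bms = ss_subjects / (n - 1)              # between-subjects mean square
    ems = ss_error / ((n - 1) * (k - 1))     # residual mean square
    return (bms - ems) / (bms + (k - 1) * ems)


# Hypothetical item ratings (4 levels, coded 0 to 3) from two raters.
kappa = quadratic_weighted_kappa([0, 1, 2, 3], [0, 1, 2, 2], 4)
# Hypothetical total scores: one row per patient, one column per rater.
icc = icc_3_1([[1, 2], [3, 3], [5, 6]])
```

Because raters are treated as fixed in this model, a constant offset between the 2 raters does not lower the ICC; only disagreement in how patients are ordered and spaced does.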
A total of 59 patients met the selection criteria. We excluded 157 patients for 1 or more of the following reasons: stroke onset >14 days before admission, residence outside the greater Taipei area, recurrent stroke, and communication difficulties. Two patients who met the selection criteria declined to participate in the validity and responsiveness study. Of the remaining 57 patients, 43 completed the follow-up at 180 days after stroke. The Canadian Neurological Scale scores showed that patients had a broad range of severity (from mild to severe stroke) at admission. In addition, the BI scores indicated that patients were severely disabled at baseline. Nonetheless, 42% of the patients were nearly independent at 180 days after stroke onset (BI ≥95). Characteristics of the study sample are presented in Table 1.
The interquartile score range of the RMI at baseline was quite limited (Table 1). Furthermore, Table 2 shows that, except for RMI, none of the mobility measures exhibited significant floor or ceiling effects at the 4 time points after stroke. These results indicate that MRMI and STREAM demonstrated acceptable distribution from the acute stage up to 180 days after stroke onset.
Table 3 shows that the correlation (ρ≥0.92) and agreement (ICC≥0.89) between STREAM and MRMI were high, indicating that both measures had high concurrent validity. RMI showed moderate to high concurrent validity (ρ≥0.78, ICC≥0.50) when evaluated against STREAM and MRMI. Table 4 shows that the 3 mobility measures had high convergent validity (ρ≥0.72) and acceptable predictive validity (ρ≥0.5).
The 3 mobility measures were highly responsive in detecting changes before 90 days after stroke onset (14 to 30 days, SRM≥1.14; 30 to 90 days, SRM≥0.83; Table 5). At 90 to 180 days after stroke onset, the levels of responsiveness of these measures, as expected, were low (0.2≤SRM≤0.4). The changes shown by the 3 measures at each stage were all significant (P<0.05), except for those shown by RMI and MRMI at 90 to 180 days after stroke onset (P>0.14).
Forty patients participated in the interrater reliability study. This sample consisted of 19 men and 21 women with a mean age of 63 years (SD, 10.2 years). The medians of the weighted κ statistic for each item of RMI, MRMI, and STREAM were 0.71 (range, 0.37 to 0.94), 0.72 (range, 0.47 to 0.9), and 0.81 (range, 0.55 to 0.89), respectively, indicating generally acceptable interrater agreement on the item level. Four RMI items, 2 MRMI items, and 1 STREAM item had fair to moderate agreement (0.37≤κ≤0.6). The ICCs for the total scores of RMI, MRMI, and STREAM were 0.92 (95% confidence interval [CI], 0.84 to 0.96), 0.95 (95% CI, 0.90 to 0.97), and 0.97 (95% CI, 0.95 to 0.99), respectively, indicating excellent total score agreement.
A simple and psychometrically sound mobility measure enables both clinicians and researchers to identify, monitor, and manage mobility disability after stroke.3,9 This study is the first to compare the psychometric properties of the RMI, MRMI, and STREAM mobility measures in stroke patients concurrently and systematically. In addition, patients in the study were monitored at 4 specific time points after stroke for an extended period (up to 180 days after stroke onset) to evaluate how appropriate these measures were for use at different recovery stages after stroke. The findings of this study can provide information useful for the selection of mobility measures for both clinicians and researchers.
The score distribution of a measure in the study sample should not exhibit severe ceiling or floor effects. We found that the 3 mobility measures demonstrated acceptable distributions from the acute stage up to 180 days after stroke onset, with 1 exception: at 14 days after stroke onset, RMI showed a limited score range and a notable floor effect. These results indicate that the RMI might not adequately characterize patients’ mobility functions in the early stages of stroke, especially for patients with severe disabilities.
The validity of the 3 mobility measures has been reported in previous studies.2,3,8,12,13 However, given the notable heterogeneity of stroke effects, comparison with previous results was difficult because the measures were not examined in the same group of patients. This study is the first to compare the validity of the 3 measures in a single cohort of patients. We found that STREAM and MRMI had high concurrent, convergent, and predictive validity, whereas RMI was less strongly correlated with both STREAM and MRMI. It is of note that the RMI score was obtained mainly via interview, whereas the STREAM and MRMI scores were based on direct observation of a patient’s performance. Patients may overstate their functional abilities.22 Our results suggest that among our patients, STREAM and MRMI were more valid measures of mobility after stroke than RMI. However, the differences in validity among the 3 measures may not be statistically significant.
Responsiveness is important for any measurement tool designed to measure change over time.23 Nonetheless, the responsiveness of most mobility measures has yet to be examined. The 3 mobility measures were highly responsive in detecting changes before 90 days after stroke onset. All changes in the 3 measures at each stage were significant, except for the RMI and MRMI at 90 to 180 days after stroke onset. According to the present results, STREAM was slightly more responsive than the other 2 measures. Responsiveness of the 3 measures was, as expected, low at later stages of recovery (90 to 180 days after stroke onset). One possible explanation is that patients’ improvements in mobility had reached a plateau by 90 days after stroke onset. Moreover, balance, motor, and ADL functions showed only minor improvement beyond 90 days after stroke onset.24,25 Another possible explanation is that these 3 mobility measures lack items sensitive enough to detect change >90 days after stroke onset.
Interestingly, MRMI was no more responsive than RMI, even though MRMI was revised from RMI with more scoring levels to make it more responsive.2 Some recent studies have demonstrated that increasing the number of items or grading levels does not necessarily improve the responsiveness of a measure or its ability to detect differences between patient groups; this has been shown for mobility measures,26 balance measures,24 and ADL measures.16,27,28 These results support the argument that selection of a measurement tool should be based on empirical evidence, not on clinical opinion.27
Interrater agreement on individual items of mobility measures has rarely been examined. The interrater agreement of STREAM was high for individual items and the total scores. Although the total score interrater agreement of the RMI and MRMI was high, at least 2 items of both measures demonstrated only fair to moderate agreement between raters. These findings indicate that the interrater reliability of STREAM is slightly higher than that of RMI and MRMI.
Some mobility measures were not selected for comparison in this study. For example, gait speed tests (eg, the 10-m walking speed test and the 6-minute walking distance) are commonly used to measure mobility after stroke in both clinical and research settings. However, gait speed tests are not relevant for all patients with stroke. Mobility, by nature, is complex and multifactorial, whereas a gait speed test reflects only 1 specific dimension of mobility. Furthermore, gait speed tests cannot be used for patients who are unable to walk. In contrast, the 3 mobility measures used in this study assess patients’ performance on a range of tasks that reflect the multifactorial nature of mobility. They are also feasible for assessing most stroke patients, including those with very poor mobility.
A limitation of the present study is that the intrarater reliability of the measures was not examined. However, because we found high interrater reliability for the measures, their intrarater reliability might not be an issue of great concern. Another limitation is that the sample size in this study was not large enough to further analyze the data according to type or severity of stroke. Because the type of stroke or level of severity could affect the results of these measures, further studies with larger sample sizes are necessary to analyze these effects on the psychometric characteristics of the measures. Finally, although the psychometric properties of STREAM appeared to be slightly better than those of the other 2 measures (eg, RMI showed a notable floor effect in the early stages of stroke; the score changes of RMI and MRMI at 90 to 180 days after stroke onset were not significant), the psychometric differences among the 3 measures may not be statistically significant.
In summary, the 3 mobility measures examined demonstrated acceptable levels of reliability, validity, and responsiveness among our stroke patients. The psychometric characteristics of STREAM were slightly superior to those of RMI and MRMI. We prefer and recommend STREAM for assessing mobility disability after stroke in both clinical and research settings.
This study was supported by research grants from the National Science Council (NSC 91-2314-B-002-353 and NSC 91-2314-B-002-354).
- Received October 29, 2002.
- Revision received January 17, 2003.
- Accepted January 28, 2003.
Chiou I, Burnett CN. Values of activities of daily living: a survey of stroke patients and their home therapists. Phys Ther. 1985; 65: 901–906.
Gresham G, Duncan P, Stason W. Post-Stroke Rehabilitation: Assessment, Referral, and Patient Management: Quick Reference Guide Number 16. Rockville, Md: US Agency for Health Care Policy and Research. AHCPR publication No. 95-0663, 1995.
Forlander DA, Bohannon RW. Rivermead Mobility Index: a brief review of research to date. Clin Rehabil. 1999; 13: 97–100.
Daley K, Mayo N, Danys I, Cabot R, Wood-Dauphinee S. The Stroke Rehabilitation Assessment of Movement (STREAM): refining and validating the content. Physiother Can. 1997; 49: 269–278.
Spilg EG, Martin BJ, Mitchell SL, Aitchison TC. A comparison of mobility assessments in a geriatric day hospital. Clin Rehabil. 2001; 15: 296–300.
Goldstein LB, Chilukuri V. Retrospective assessment of initial stroke severity with the Canadian Neurological Scale. Stroke. 1997; 28: 1181–1184.
Daley K, Mayo N, Wood-Dauphinee S. Reliability of scores on the Stroke Rehabilitation Assessment of Movement (STREAM) measure. Phys Ther. 1999; 79: 8–19.
Hsueh IP, Lin JH, Jeng JS, Hsieh CL. Comparison of the psychometric characteristics of the Functional Independence Measure, 5 item Barthel Index, and 10 item Barthel Index in patients with stroke. J Neurol Neurosurg Psychiatry. 2002; 73: 188–190.
Bushnell CD, Johnston DC, Goldstein LB. Retrospective assessment of initial stroke severity: comparison of the NIH Stroke Scale and the Canadian Neurological Scale. Stroke. 2001; 32: 656–660.
Cohen J. Statistical Power Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates; 1988.
Dorman PJ, Waddell F, Slattery J, Dennis M, Sandercock P. Are proxy assessments of health status after stroke with the EuroQol questionnaire feasible, accurate, and unbiased? Stroke. 1997; 28: 1883–1887.
Mao HF, Hsueh IP, Tang PF, Sheu CF, Hsieh CL. Analysis and comparison of the psychometric properties of three balance measures for stroke patients. Stroke. 2002; 33: 1022–1027.
Hobart JC, Thompson AJ. The five item Barthel Index. J Neurol Neurosurg Psychiatry. 2001; 71: 225–230.