Refining 3 Measures to Construct an Efficient Functional Assessment of Stroke
Background and Purpose—The Fugl-Meyer Assessment motor scale, Postural Assessment Scale for Stroke patients, and Barthel Index are widely used to assess patients’ upper extremity and lower extremity motor function, balance, and basic activities of daily living after stroke, respectively. However, these 3 measures (72 items) require a great amount of time for assessment. Therefore, we aimed to develop an efficient test, the Functional Assessment of Stroke (FAS).
Methods—The FAS was constructed from 4 short-form tests of the Fugl-Meyer Assessment-upper extremity, Fugl-Meyer Assessment-lower extremity, Postural Assessment Scale for Stroke patients, and Barthel Index based on the results of Rasch analyses and the items’ content. We examined the psychometric properties of the FAS, including Rasch reliability, concurrent validity, convergent validity, known-group validity, and responsiveness.
Results—The FAS contained 29 items (10, 6, 8, and 5 items for the 4 short-form tests, respectively). The FAS demonstrated high Rasch reliability (0.92–0.94), concurrent validity (r=0.90–0.97 with the original tests), convergent validity (r=0.62–0.94 with the 5-scale Fugl-Meyer Assessment), and known-group validity (significant difference in the FAS scores among 3 groups of disability levels; P<0.001). In addition, the responsiveness of the FAS (standardized response mean=0.55–1.93) was similar or significantly superior to those of the original tests (standardized response mean=0.46–1.39).
Conclusions—The FAS contains 29 items and has sufficient Rasch reliability, validities, and responsiveness. These findings support that the FAS is efficient for reliably and validly assessing upper extremity/lower extremity motor function, balance, and basic activities of daily living and for sensitively detecting change in those functions in patients with stroke.
The Fugl-Meyer Assessment (FM) motor scale, Postural Assessment Scale for Stroke patients (PASS), and Barthel Index (BI) are widely used in both clinical and research settings. The FM motor scale contains 2 subscales, the upper extremity (UE) subscale (FM-UE) and lower extremity (LE) subscale (FM-LE), for assessing patients’ UE and LE motor functions, respectively.1 The PASS assesses balance function,2 and the BI assesses basic activities of daily living (BADL).3 Particularly, both the FM motor scale and PASS are stroke-specific and performance-based measures. These 3 measures (the FM motor scale, PASS, and BI) have sound psychometric properties in patients with stroke.2,4,5 However, administering all 3 measures to a patient would be time consuming and pose a heavy administrative burden on raters and patients. That is, the 50-item FM motor scale, 12-item PASS, and 10-item BI can hardly be completed in 1 session, especially in a time-limited clinical setting. To overcome the prohibitive assessment time and great administrative burden, a brief measure with sound psychometric properties is required to improve the efficiency of assessing patients’ UE/LE motor functions, balance, and BADL.
This study had 2 purposes. The first was to develop the Functional Assessment of Stroke (FAS) based on the FM-UE, FM-LE, PASS, and BI for efficiently assessing UE/LE motor function, balance, and BADL. The second was to examine the psychometric properties of the FAS (including Rasch reliability, concurrent validity, convergent validity, known-group validity, and responsiveness) in patients with stroke.
The data were extracted from a prospective study,6 wherein the patients were consecutively recruited from a medical center in Taiwan. The study recruited participants who had (1) a diagnosis of stroke, (2) first onset of stroke, (3) onset of stroke within 14 days before hospitalization, (4) the ability to follow commands, and (5) the ability to give informed consent personally or by proxy. Patients who had other major diseases (eg, arthritis and Parkinson disease), which may have affected their functional abilities, were excluded from the study. The participants were assessed using the FM, PASS, and BI at 3 time points (ie, 14, 30, and 90 days after stroke).
Development of the FAS
The FAS included 4 short-form tests: the short-form FM-UE (SF-FM-UE), short-form FM-LE (SF-FM-LE), short-form PASS (SF-PASS), and short-form BI (SF-BI). We designed the short-form tests based on their original tests (the FM-UE, FM-LE, PASS, and BI) by examining the unidimensionality of the 4 original tests and selecting items to construct the 4 short-form tests of the FAS.
Validation of the FAS
The Rasch reliability of the FAS was obtained from the 4-dimensional Rasch analysis. We examined the concurrent validity, convergent validity, known-group validity, and responsiveness of the FAS across the 3 time points.
The FM1 contains 5 scales: motor (including the UE subscale [score range, 0–66 points] and LE subscale [0–34]), sensory (0–24), balance (0–14), joint range of motion (0–44), and joint pain (0–44). The total score of the 5-scale FM (0–226) represents overall sensorimotor function after stroke. The FM has good interrater reliability and concurrent validity in patients with stroke.4,7–9
Postural Assessment Scale for Stroke Patients
The PASS2 was developed to evaluate balance function in patients with stroke. The PASS has 12 items, and the score range of the PASS is 0 to 36 points. The reliability, validity, and responsiveness of the PASS are good in patients with stroke.2,6
Development of the FAS
To examine the unidimensionality of each original test (ie, FM-UE, FM-LE, PASS, and BI), we conducted unidimensional Rasch analyses using the partial credit model.11 We used the partial credit model instead of other item response theory models (eg, 2- or 3-parameter logistic models) because the partial credit model has advantages, such as specific objectivity (ie, the comparisons of individuals’ abilities are independent of which particular set of items is administered) and invariance (ie, the hierarchy of item difficulties is identical for patients with different abilities).12–14 We examined the item-model fit and deleted the items that did not fit the assumptions of the Rasch model. The item-model fit was investigated using infit and outfit mean-squares. Infit and outfit mean-square values of ≈1.00 represent good item-model fit, whereas an item with both infit and outfit mean-square values of >1.40 was considered to have poor item-model fit. In each original test, we deleted 1 misfit item at a time and conducted another unidimensional Rasch analysis until all remaining items had good item-model fit.
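The fit check described above can be illustrated with a minimal sketch. It assumes that person abilities and item step difficulties have already been estimated by Rasch software; the function names (`pcm_probs`, `item_fit`, `is_misfit`) and the input values are hypothetical and are not taken from the study. Infit is computed as the information-weighted mean-square and outfit as the unweighted mean of squared standardized residuals, which are the standard definitions rather than a description of the authors' software.

```python
import math

def pcm_probs(theta, step_difficulties):
    """Category probabilities (0..m) for one item under the partial credit model."""
    # Cumulative sums of (theta - delta_k); category 0 has an empty sum (0.0).
    logits = [0.0]
    for delta in step_difficulties:
        logits.append(logits[-1] + (theta - delta))
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(v - peak) for v in logits]
    total = sum(weights)
    return [w / total for w in weights]

def item_fit(responses, thetas, step_difficulties):
    """Return (infit, outfit) mean-squares for one item across persons."""
    squared_residual_sum = 0.0
    variance_sum = 0.0
    standardized_sum = 0.0
    for score, theta in zip(responses, thetas):
        probs = pcm_probs(theta, step_difficulties)
        expected = sum(k * p for k, p in enumerate(probs))
        variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        squared_residual_sum += (score - expected) ** 2
        variance_sum += variance
        standardized_sum += (score - expected) ** 2 / variance
    infit = squared_residual_sum / variance_sum   # information-weighted
    outfit = standardized_sum / len(responses)    # unweighted
    return infit, outfit

def is_misfit(infit, outfit, cutoff=1.40):
    """Flag an item only when BOTH statistics exceed the cutoff used in the text."""
    return infit > cutoff and outfit > cutoff

# Hypothetical data: six persons answering one 3-category item.
infit, outfit = item_fit([0, 1, 1, 2, 2, 2],
                         [-1.5, -0.5, 0.0, 0.5, 1.0, 2.0],
                         [-0.5, 0.5])
print(round(infit, 2), round(outfit, 2), is_misfit(infit, outfit))
```

In the iterative procedure described above, the Rasch model would be re-estimated after each deletion, so the abilities and step difficulties passed to `item_fit` would change from one round to the next.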
To construct the FAS from a set of important items covering a wide range of item difficulties, we selected items according to the content they assessed and their difficulty parameters (estimated using a 4-dimensional Rasch analysis).
Validation of the FAS
We examined the Rasch reliability of the FAS at both group and individual levels from the previous 4-dimensional Rasch analysis. The group-level Rasch reliability was investigated using an overall reliability coefficient (possible values ranging from 0 to 1). A value of ≥0.70 was acceptable, and one of ≥0.90 was excellent.15 The individual-level Rasch reliability was represented using the percentages of the participants whose individual reliability coefficients were ≥0.70 and ≥0.90, respectively.
We investigated the extent of correlation between the scores of the short-form tests of the FAS and those of the corresponding original tests using Pearson correlation coefficient (r). An r<0.40 was considered poor concurrent validity; 0.40 to 0.74, acceptable; and ≥0.75, high.2 In addition, we estimated the 95% confidence interval (CI) of the r using a bootstrap analysis with 10 000 bootstrap samples.
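As an illustration of the correlation and bootstrap steps, the sketch below computes Pearson r and a percentile bootstrap CI by resampling participants with replacement. The percentile method, the fixed seed, and the function names are assumptions made for this example; the text does not state which bootstrap interval was used.

```python
import random
import statistics

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def bootstrap_r_ci(xs, ys, n_boot=10_000, level=0.95, seed=0):
    """Percentile bootstrap CI for r, resampling participants (pairs) together."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        try:
            estimates.append(pearson_r([xs[i] for i in idx], [ys[i] for i in idx]))
        except ZeroDivisionError:
            continue  # degenerate resample with no variance; skip it
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * len(estimates))]
    upper = estimates[int((1 + level) / 2 * len(estimates)) - 1]
    return lower, upper

def classify_concurrent_validity(r):
    """Cutoffs quoted in the text: <0.40 poor, 0.40 to 0.74 acceptable, >=0.75 high."""
    if r >= 0.75:
        return "high"
    if r >= 0.40:
        return "acceptable"
    return "poor"

# Hypothetical short-form and original-test scores for eight participants.
short_form = [1, 2, 3, 4, 5, 6, 7, 8]
original = [1.1, 1.9, 3.2, 3.8, 5.1, 6.3, 6.8, 8.2]
r = pearson_r(short_form, original)
print(classify_concurrent_validity(r), bootstrap_r_ci(short_form, original, n_boot=1000))
```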
We used Pearson r to examine the correlations between the scores of the FAS and those of the FM. A Pearson r>0.60 was considered good convergent validity.16
First, we used the BI scores to classify the participants into 3 groups of disability levels: mild (≥10 points), moderate (3–9), and severe (≤2) disability.17 Next, we examined whether the mean scores of the FAS among these 3 groups were significantly different using ANOVA and post hoc analysis. Cohen d was calculated to examine the extent to which the FAS can discriminate patients with different levels of disability. A value of Cohen d≥0.50 indicates moderate known-group validity, and d≥0.80 indicates good known-group validity.18 In addition, the overlapping coefficient (OVL) was calculated to investigate the overlap of the FAS scores among the 3 groups of disability levels.19 The OVL ranges from 0% to 100%, and a smaller OVL represents smaller overlap between a pair of groups.
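The relationship between Cohen d and the OVL can be sketched as follows. Two assumptions are made here that the text does not confirm: d is computed with a pooled standard deviation, and the OVL is derived from d for two normal distributions with equal variance, OVL = 2Φ(−|d|/2). Under that formula, a d of 0.87 gives an OVL of about 66.4% and a d of 4.26 gives about 3.3%, which agrees with the paired values reported in the Results. The function names are illustrative.

```python
import math
import statistics

def cohen_d(group_a, group_b):
    """Cohen d using a pooled standard deviation (an assumption of this sketch)."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / pooled_sd

def ovl_from_d(d):
    """Overlap of two equal-variance normal distributions separated by d SDs."""
    return 2.0 * statistics.NormalDist().cdf(-abs(d) / 2.0)

# Smallest and largest effect sizes reported in the Results.
for d in (0.87, 4.26):
    print(d, f"{ovl_from_d(d) * 100:.1f}%")
```

When d is 0, the two distributions coincide and the OVL is 100%; the OVL approaches 0% as the groups separate.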
Paired t test and standardized response mean (SRM) were adopted to examine the responsiveness of the FAS and original tests during 14 to 30 and 14 to 90 days after stroke (ie, 14–30 days’ responsiveness and 14–90 days’ responsiveness). The 95% CIs of the 14 to 30 days’ SRM and 14 to 90 days’ SRM were estimated using a bootstrap analysis. For the UE and LE motor tests, an SRM≥0.20 was considered to indicate good responsiveness during 14 to 30 days after stroke, and an SRM≥0.50 was considered to indicate good responsiveness during 14 to 90 days after stroke.20 For the balance and BADL tests, an SRM≥0.50 was considered to indicate good responsiveness during 14 to 30 days after stroke, and an SRM≥0.80 was considered to indicate good responsiveness during 14 to 90 days after stroke.20 The criteria for the UE and LE motor tests were looser than those for the balance and BADL tests because progress in UE and LE motor function has been shown to be less than that in balance and BADL during the first 3 months after stroke.20
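A minimal sketch of the SRM and of the criteria above: the SRM is the mean change score divided by the standard deviation of the change scores. The dictionary keys and function names are labels invented for this example, and the scores are hypothetical.

```python
import statistics

# Thresholds quoted in the text, keyed by (domain, period after stroke).
CRITERIA = {
    ("motor", "14-30"): 0.20,
    ("motor", "14-90"): 0.50,
    ("balance_badl", "14-30"): 0.50,
    ("balance_badl", "14-90"): 0.80,
}

def srm(baseline, follow_up):
    """Standardized response mean: mean change / SD of change (paired scores)."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return statistics.fmean(changes) / statistics.stdev(changes)

def is_good_responsiveness(srm_value, domain, period):
    """Compare an SRM with the criterion for its domain and period."""
    return srm_value >= CRITERIA[(domain, period)]

# Hypothetical scores for five patients at 14 and 30 days after stroke.
day14 = [-1.2, -0.8, 0.1, 0.4, -2.0]
day30 = [-0.9, -0.1, 0.3, 1.0, -1.8]
value = srm(day14, day30)
print(round(value, 2), is_good_responsiveness(value, "motor", "14-30"))
```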
To further compare the responsiveness of the FAS with that of the original tests in all participants and in each of the 3 groups of disability levels, we conducted a bootstrap analysis to examine whether the SRM of each short-form test in the FAS was significantly different from that of its corresponding original test. The differences in SRM between the FAS and the original tests were calculated in each of the 10 000 bootstrap samples, and the 97.5% CI of those SRM differences was estimated. If a 97.5% CI lay entirely above 0, the responsiveness of the short-form test in the FAS was considered significantly better than that of its corresponding original test; if it included 0, not significantly different; and if it lay entirely below 0, significantly worse. The CI was set at 97.5% (calculated by the formula: 100%−5%/2) because we used the Bonferroni method to adjust the level of significance (0.05) for 2 sets of comparisons (14–30 and 14–90 days’ responsiveness of the FAS versus those of the original tests).
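The comparison described above can be sketched as a paired bootstrap: the same resampled patients are used for the short-form and original change scores, and the SRM difference is recorded in each sample. The percentile interval, the seed, and the function names are assumptions for this illustration. The confidence level follows the Bonferroni adjustment in the text, 1 − 0.05/2 = 0.975.

```python
import random
import statistics

def srm(changes):
    """Standardized response mean of a list of change scores."""
    return statistics.fmean(changes) / statistics.stdev(changes)

def srm_difference_ci(short_changes, original_changes,
                      n_boot=10_000, level=1 - 0.05 / 2, seed=0):
    """Percentile bootstrap CI for SRM(short form) - SRM(original test)."""
    rng = random.Random(seed)
    n = len(short_changes)
    differences = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients
        try:
            diff = (srm([short_changes[i] for i in idx])
                    - srm([original_changes[i] for i in idx]))
        except ZeroDivisionError:
            continue  # resample with zero variance in change scores; skip it
        differences.append(diff)
    differences.sort()
    lower = differences[int((1 - level) / 2 * len(differences))]
    upper = differences[int((1 + level) / 2 * len(differences)) - 1]
    return lower, upper

def verdict(lower, upper):
    """Decision rule from the text, applied to the CI of the SRM difference."""
    if lower > 0:
        return "significantly better"
    if upper < 0:
        return "significantly worse"
    return "not significantly different"

# Hypothetical change scores for eight patients (short form in logits,
# original test in raw points; the SRM is unitless, so they can be compared).
short = [0.5, 1.0, 1.5, 0.8, 1.2, 0.3, 0.9, 1.1]
original = [2, 3, 5, 1, 4, 0, 3, 2]
print(verdict(*srm_difference_ci(short, original, n_boot=2000)))
```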
Demographics of the Participants
About one tenth (9.5%) of the eligible patients did not agree to participate in this study. The data of the 301 participants obtained at 14 days after stroke were used for developing the FAS (Table 1). The participants’ UE/LE motor function, balance, and BADL were scattered across the entire ranges of the original tests. A total of 293, 258, and 209 patients were assessed at 14, 30, and 90 days after stroke, respectively, and had complete data that were used for examining validities and responsiveness.
Development of the FAS
Unidimensional Rasch analyses of the 4 original tests revealed that 7 items of the FM-UE, 7 items of the FM-LE, and 1 item of the BI failed to conform with the assumptions of the Rasch model (both infit and outfit mean-square values >1.40). However, the item, speed of heel to opposite knee, of the FM-LE had item fit ≈1.40 and was the most difficult item assessing LE motor function, so we retained this item for further analyses to cover a wider range of patients’ function. Thus, 14 items were removed and the 58 fitting items were retained. The Rasch parameters of the fitting items are shown in the online-only Data Supplement I.
We then analyzed the 58 fitting items with a 4-dimensional Rasch analysis. We selected a total of 29 items, which were scattered evenly over the difficulty continuum (online-only Data Supplement I), to construct the FAS: 10 items for the SF-FM-UE, 6 items for the SF-FM-LE, 8 items for the SF-PASS, and 5 items for the SF-BI. The mean item difficulties of the 4 short-form tests of the FAS were 0.0.
Validation of the FAS
With respect to the FAS scores of the participants, the mean scores of the short-form tests were in the range of −1.0 to 0.9 (Table 1), except for that of the SF-BI (−2.1). Each short-form test of the FAS contained 264 scoring points representing the participants’ function of UE/LE motor status, balance, or BADL, respectively.
Table 2 shows the group- and individual-level Rasch reliabilities of the FAS and the fitting items of the original tests. The group-level Rasch reliabilities of the FAS and the fitting items were >0.90. For individual-level Rasch reliabilities of the FAS, >94% of the participants had individual reliability coefficients ≥0.90, except for the SF-BI (66%). For the fitting items, all the participants’ Rasch reliability coefficients were ≥0.90, except for the 9-item BI (53%).
The scores of the FAS were highly correlated (r≥0.90) with the corresponding original tests across the 3 time points. The bootstrap analysis showed that the lower bounds of the 95% CIs of the rs were ≥0.90 (online-only Data Supplement II).
The correlation between the FAS scores and the FM total scores ranged from 0.62 to 0.94 across the 3 time points (online-only Data Supplement III).
The post hoc analysis showed that the mean scores of the FAS were significantly different among the 3 groups of disability levels (P<0.001; online-only Data Supplement IV), and Cohen d’s ranged from 0.87 (OVL=66.4%) to 4.26 (OVL=3.3%).
The 14 to 30 days’ SRMs of the SF-FM-UE and SF-FM-LE were ≥0.50 (95% CIs=0.42–0.81; Table 3) and those of the SF-PASS and SF-BI were ≥0.80 (95% CIs=0.56–1.49). The 14 to 90 days’ SRMs of the FAS ranged from 0.90 to 1.93 (95% CIs=0.78–2.25; Table 3).
The comparisons between the SRMs of the FAS and those of the original tests showed that the 97.5% CIs of the SRM differences within both 14 to 30 and 14 to 90 days were >0 (Table 3), except that of the SF-FM-UE versus FM-UE within 14 to 30 days, which included 0. In addition, regardless of the disability levels of the patients, the 97.5% CIs of the SRM differences between the FAS and the original tests were >0 or included 0 (online-only Data Supplement V).
To the best of our knowledge, the FAS is the first measure to use a multidimensional approach to assess the 4 important functions (UE and LE motor function, balance, and BADL) in patients with stroke. The FAS, as compared with the fitting items of the original tests, has half the number of items (29 items in the FAS versus 58 fitting items) and comparable psychometric properties. Although we did not record exactly how much time was saved by administering the FAS, it seems reasonable to suppose that the FAS would need about half the time required for administering the original tests because only half the number of items is needed. These features of the FAS can reduce the administrative time and burden for users and thereby improve the efficiency of functional assessments.
We found that the group- and individual-level Rasch reliabilities of the FAS were as high as those of the corresponding fitting items, even though the FAS has only half as many items. These features can be ascribed to the estimation of the FAS scores using multidimensional Rasch analysis. In multidimensional Rasch analysis, a patient’s response on each short-form test can provide information for simultaneously estimating the patient’s functional level on all 4 short-form tests. Therefore, although the FAS contains fewer items than the original 4 tests, it can precisely estimate UE/LE motor, balance, and BADL functions in a group of patients with stroke (eg, a group of participants in a research setting) or in an individual patient (eg, a patient in a clinical setting).
In the FAS, the individual-level Rasch reliability of the SF-BI (66% of the participants) was lower than that of the other short-form tests (94%–97%). This phenomenon may be ascribed to the greater difficulty of the SF-BI items for patients with subacute stroke (mean score of the participants=−2.1 versus mean item difficulty=0.0). Because of the mismatch between patients’ function and item difficulty, the SF-BI may not be able to estimate BADL function with excellent precision, especially that of patients with lower BADL function. Notably, >90% of the patients’ individual-level Rasch reliability coefficients were >0.70. This finding indicates that the SF-BI has acceptable individual-level Rasch reliability for assessing BADL in patients with stroke.
The results on concurrent validity showed that the scores of the short-form tests of the FAS were highly correlated with those of their corresponding original tests. In addition, the results on convergent validity demonstrated that the FAS scores were moderately to highly correlated with the total scores of the FM. These results indicate that the FAS has good concurrent validity and convergent validity, supporting that the constructs assessed by the FAS are UE/LE motor function, balance, and BADL.
The results of the known-group validity showed that the mean scores of the FAS were significantly different among the 3 groups of disability levels, and the differences were large. These findings support the good known-group validity of the FAS and indicate that the FAS can be used to discriminate patients with different levels of disability after stroke.
The 14 to 30 days’ SRMs of the FAS ranged from 0.46 to 0.76 and the 14 to 90 days’ SRMs ranged from 0.90 to 1.93, indicating good responsiveness of the FAS. We also found that the 97.5% CIs of the SRM differences between the FAS and the original tests were >0 or included 0 across different periods after stroke (14–30 and 14–90 days) and disability levels (mild, moderate, and severe) of the patients. These findings indicate that the responsiveness of the FAS is similar or superior to those of the original tests, regardless of the participants’ severity of disability or duration after stroke onset. The similar or even better responsiveness of the FAS may result from the fact that the FAS, using multidimensional Rasch estimates, provides more scoring points (ie, 264 points) than the original tests (ie, 21–66 points) to represent patients’ motor function, balance, and BADL.21–23 The number of scoring points of the FAS is large because the multidimensional Rasch model used all the testing information from the 4 short-form tests. The additional scoring points of the FAS are likely able to sensitively reflect minor progress of the patients and may have improved the responsiveness of the FAS. These findings indicate that the FAS is at least as sensitive as the original tests for detecting patients’ progress in UE/LE motor function, balance, and BADL. In addition, for the FAS and the original tests, their 14 to 90 days’ responsiveness was higher than their 14 to 30 days’ responsiveness. This finding supports that both the FAS and original tests can sensitively detect the different levels of recovery (ie, progress within 14–90 days is greater than that within 14–30 days) in patients with stroke.
An advantage of the FAS score is that it is on an interval scale, whereas the raw scores of the original tests are on ordinal scales.24 The interval-level FAS score, with equal differences between any 2 adjacent points,25 seems to be more useful for quantifying the extent of change in UE/LE motor function, balance, or BADL and for comparisons within or between patients with stroke. For example, if a patient’s SF-BI score improves twice over time (eg, from 1 to 2 points and then from 2 to 3 points), the 2 improvements represent the same extent of BADL progress. In contrast, the same difference scores of the original tests (on an ordinal scale) may not represent identical extents of real change because of the unequal scale units.
It should be noted that all 4 short-form tests of the FAS have to be administered together to obtain the aforementioned psychometric properties and advantages because comprehensive information from all 4 tests (dimensions) is needed to simultaneously estimate the FAS scores of the UE/LE motor function, balance, and BADL. In addition, prospective users may find it difficult to conduct multidimensional Rasch analysis to obtain the FAS score. To solve this problem, users can contact the authors to request assistance in obtaining the FAS scores.
This study had 4 limitations. First, the participants were recruited from only 1 medical center, and all of them were in the subacute stage. Such sampling bias might have compromised the generalizability of our findings. Second, we used the same data set to develop and validate the FAS. This approach might have caused overestimation of the psychometric properties of the FAS. Third, the FAS, having fewer items than the original tests, may compromise the comprehensiveness of the contents and provide less practical information for clinicians to set treatment goals or plans. For example, the SF-BI of the FAS does not contain the item of feeding, so clinicians might not be certain whether patients can feed themselves or should receive feeding training. Finally, although we provide preliminary support for the use of the FAS, further examination of the test–retest reliability and feasibility of the FAS is needed to fully demonstrate the psychometric properties and use of the FAS.
The FAS contains 29 items (≈40% of the total number of items in their original tests) and has good Rasch reliability, concurrent validity, convergent validity, known-group validity, and responsiveness. These findings indicate that the FAS can be used to efficiently, reliably, and validly assess those important functions and sensitively detect the change in those functions in patients with stroke.
Sources of Funding
This study was supported by research grants from Chung Shan Medical University and Chi Mei Medical Center (CSMU-CMMC-103-05 and CMCSMU10305).
The online-only Data Supplement is available with this article at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.116.015516/-/DC1.
- Received September 20, 2016.
- Revision received February 17, 2017.
- Accepted March 14, 2017.
- © 2017 American Heart Association, Inc.
- Sanford J, Moreland J, Swanson LR, Stratford PW, Gowland C
- Hsueh IP, Lin JH, Jeng JS, Hsieh CL
- Mao HF, Hsueh IP, Tang PF, Sheu CF, Hsieh CL
- Duncan PW, Propst M, Nelson SG
- Boomsma A, van Duijn MAJ, Snijders TAB
- Rost J
- Reise SP, Revicki DA
- Ostini R, Finkelman M, Nering M
- Govan L, Langhorne P, Weir CJ
- Cohen J
- Lee KB, Lim SH, Kim KH, Kim KJ, Kim YR, Chang WN, et al
- Hsueh IP, Wang WC, Wang CH, Sheu CF, Lo SK, Lin JH, et al