(Stroke. 2001;32:681.)
© 2001 American Heart Association, Inc.
Original Contributions
From the Department of Public Health Sciences, Guy's, King's and St Thomas' School of Medicine, King's College London (UK).
Correspondence to Christopher McKevitt, PhD, Department of Public Health Sciences, Guy's, King's and St Thomas' School of Medicine, King's College London, Capital House, 42 Weston Street, London SE1 3QD, UK. E-mail christopher.mckevitt@kcl.ac.uk
| Abstract |
|---|
Methods: Data were taken from the Biomed II prospective study of stroke care and outcomes. Three-month poststroke data from 8 European centers were analyzed. Sensitivity and specificity were assessed by comparing responses to the 2 simple questions with Barthel Index and modified Rankin scale scores. Adjusting for case mix, logistic regression was used to compare patients in each center with "good" outcome (not dependent and fully recovered) at 3 months.
Results: Data for 793 patients were analyzed. For the total sample, the dependency question had a sensitivity of 88% and a specificity of 77%; the recovery question had a sensitivity of 78% and a specificity of 90%. Dependency data from Riga had much lower sensitivity. There was variation in good outcome between centers (P=0.0015). Compared with the reference center (Kaunas), patients in Dijon, Florence, and Menorca were more likely to have good outcome, after adjusting for case mix.
Conclusions: Dependency and recovery questions showed generally high sensitivity and specificity. There were significant differences across centers in outcome, but reasons for these are unclear. Such differences raise particular questions about how patients interpreted and answered the simple questions and the extent to which expectations of recovery and perceived needs for assistance vary cross-culturally.
Key Words: disability, outcome assessment, recovery
| Introduction |
|---|
There is a drive toward comparing outcomes cross-nationally. This can be related to several factors, including the World Health Organization's goal of "Health for All,"2 the globalization of markets targeted by multinational pharmaceutical companies, and, in the European context, moves toward harmonization within the region. This drive has led to a need for indicators that are equally meaningful across countries, so that questions may be expected to elicit the same types of responses.
Many measures are lengthy, making them unsuitable for large-scale studies. Two simple questions were developed by Lindley et al3 to meet the need for a simple, inexpensive, and quick way of assessing functional outcome in large numbers of stroke patients recruited to an open, randomized, controlled trial. It was suggested that the tool would also be useful to monitor patient outcome in routine practice and clinical audit.4 5 The questions were developed to assess dependency and recovery.
The 2 simple questions were developed in the United Kingdom and were used in the pilot and subsequent International Stroke Trial (IST).6 The IST authors accepted the validity and reliability of the 2 simple questions, although studies testing the measures were conducted only in the United Kingdom, whereas the IST involved centers in many non-English-speaking countries. Because IST data are not presented in disaggregated form, we do not have information about any possible intercountry differences in responses to the 2 simple questions. In this study we aimed to investigate the sensitivity and specificity of the 2 simple questions used in a pan-European prospective study of stroke care, resource use, and outcomes. If possible, we then aimed to compare outcomes across centers using this tool.
| Subjects and Methods |
|---|
Sociodemographic data collected included age, gender, and place of residence (at home alone, at home with carer, or institution such as nursing home). Clinical data collected included measures of stroke severity (urinary incontinence, level of consciousness, and any limb weakness at time of maximum impairment).11 At the 3-month follow-up, the patients were reassessed. In addition to clinical data collected at onset, resource use data were collected (including use of clinical services, use of social services, and assistance from informal carers), as were outcomes including death, disability measured by the Barthel Index (BI)12 and the modified Rankin scale (RS),13 and responses to the 2 simple questions.
The 2 simple questions assess (1) patient dependency through the question, "In the last 2 weeks did you require help from another person for everyday activities?" and (2) recovery through the question, "Do you feel that you have made a complete recovery from your stroke?" The questions were included in the Biomed II questionnaire as an adjunct to the agreed functional outcome measures. Following discussion at an early planning meeting of participants to introduce the questions and clarify the concepts, the questions were then translated into the relevant language by each local study coordinator.
For the present study, data from 8 centers were analyzed (ie, all centers at which the 2 simple questions were asked at 3-month follow-up). Five centers were excluded from this analysis because they did not collect either the 2-simple-questions data or 3-month follow-up data.
Testing the Validity of the 2 Simple Questions
To investigate how successfully the 2 simple questions captured dependency and recovery, sensitivity and specificity were calculated. Sensitivity is defined as how accurately a question identifies positive cases. Specificity is defined as how well a question identifies negative cases.14 The standard way of calculating sensitivity and specificity of a new measure is to compare responses to the new measure with responses to a gold standard or usual measure. Following Lindley et al,3 we defined dependence in 2 ways: BI score of <20 and modified RS score of 3, 4, or 5. Recovery was defined as modified RS score 0 (a BI score of 20 indicates independence in functional abilities).15 The modified RS is scored as follows: 0, no symptoms; 1 or 2, functionally independent; 3, moderate handicap; and 5, moderate to severe handicap.13 We further hypothesized that recovery would be defined by responders as a return to prestroke ability. Therefore, we also defined recovery as a 3-month modified RS score the same as or better than the prestroke modified RS, and 3-month BI same as or better than prestroke BI.
To test the validity of the 2 simple questions, we compared responses to these questions with BI and modified RS scores. One center, London, did not collect the modified RS score and is therefore not included in this part of the analysis. For both the dependency and the recovery questions, sensitivity and specificity were calculated for each center individually as well as for the total sample.
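The sensitivity and specificity calculation can be sketched in a few lines of Python. This is an illustration only, not the analysis code used in the study: the function name, the variable names, and the 8 example records are hypothetical. The sketch takes BI <20 as the reference definition of dependence and a "yes" to the dependency question as the test result.

```python
def sensitivity_specificity(test_positive, reference_positive):
    """Return (sensitivity, specificity) for paired yes/no classifications."""
    tp = fn = fp = tn = 0
    for test, reference in zip(test_positive, reference_positive):
        if reference and test:
            tp += 1  # reference positive, question positive
        elif reference:
            fn += 1  # reference positive, question negative
        elif test:
            fp += 1  # reference negative, question positive
        else:
            tn += 1  # reference negative, question negative
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical 3-month records (not study data).
barthel_scores = [20, 14, 20, 9, 18, 20, 20, 12]
needs_help = [False, True, True, True, False, False, False, True]

# Reference standard: dependence defined as BI <20.
dependent_by_bi = [score < 20 for score in barthel_scores]
sens, spec = sensitivity_specificity(needs_help, dependent_by_bi)
```

The same function could be applied center by center, or with modified RS score of 3, 4, or 5 as the reference definition, by changing only the reference list.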
The algorithm developed by Lindley et al3 was used to identify 3 outcome groups. Bad outcome was defined as a positive response to the dependency question; indifferent outcome was defined as a negative response to both dependency and recovery questions; and good outcome was defined as a positive response to the recovery question.
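As a rough sketch, the 3-group classification might be coded as follows. The function and variable names are hypothetical, and one detail is an assumption: a respondent answering "yes" to both questions is placed in the bad outcome group, which matches the description of good outcome as "not dependent and fully recovered" but is not spelled out in the algorithm as summarized here.

```python
from collections import Counter


def classify_outcome(needs_help, fully_recovered):
    """Map answers to the 2 simple questions onto bad/indifferent/good."""
    if needs_help:
        # Assumption: dependency takes precedence over reported recovery.
        return "bad"
    if fully_recovered:
        return "good"
    return "indifferent"


# Hypothetical (needs_help, fully_recovered) answers, not study data.
responses = [(True, False), (False, False), (False, True),
             (True, True), (False, True)]
counts = Counter(classify_outcome(h, r) for h, r in responses)
```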
McNemar's test14 was used to compare the proportions classified with bad outcome defined as (1) BI <20 and (2) a positive response to the simple question, "In the last 2 weeks did you require help from another person for everyday activities?" This is a test of proportions for paired data and as such does not require adjustment for confounding variables. The analysis was repeated to compare good outcome using the 2 measures, ie, modified RS score of 0 and, on the 2 simple questions, a negative response to the dependency question with a positive response to the recovery question.
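A minimal sketch of a paired comparison of this kind is given below. It is not the study's code, and several choices are assumptions: the uncorrected chi-square form of McNemar's test is used (the text does not say whether a continuity correction or an exact test was applied), the confidence interval for the difference in paired proportions is a simple Wald interval, and the counts are hypothetical.

```python
import math


def mcnemar(only_first, only_second, n_pairs):
    """Compare 2 paired yes/no classifications of the same respondents.

    only_first:  respondents positive on measure 1 only (eg, BI <20 only)
    only_second: respondents positive on measure 2 only (eg, question only)
    n_pairs:     total number of respondents classified by both measures
    """
    b, c = only_first, only_second
    statistic = (b - c) ** 2 / (b + c)  # chi-square with 1 df, no correction
    # Upper tail of a chi-square(1) variable equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    diff = (b - c) / n_pairs  # difference in paired proportions
    se = math.sqrt(b + c - (b - c) ** 2 / n_pairs) / n_pairs
    return statistic, p_value, diff, (diff - 1.96 * se, diff + 1.96 * se)


# Hypothetical counts: 100 respondents, 15 bad by BI only, 5 by question only.
statistic, p_value, diff, ci = mcnemar(15, 5, 100)
```

Only the discordant pairs enter the test statistic; respondents classified the same way by both measures affect the difference and its interval solely through the total number of pairs.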
To compare outcome across centers, logistic regression was used. The relationship between good outcome and center was investigated, with adjustment for age group, sex, consciousness level, any limb weakness, and incontinence.
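To illustrate how an adjusted odds ratio for center is obtained, the sketch below fits a small logistic regression by Newton-Raphson using only the Python standard library. It is a teaching example under stated assumptions, not the study's analysis: the data are hypothetical, only 1 comparison center and 1 case-mix variable (incontinence) are included rather than the full set of adjustment variables, and all function names are invented for illustration. In practice a statistical package would normally be used.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_logistic(rows, outcomes, iterations=25):
    """Return [intercept, coefficients...] maximizing the log-likelihood."""
    xs = [[1.0] + list(r) for r in rows]  # prepend the intercept column
    k = len(xs[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y in zip(xs, outcomes):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    return beta


# Hypothetical cells: (comparison_center, incontinent, n_good, n_not_good).
cells = [(0, 0, 8, 12), (0, 1, 2, 18), (1, 0, 15, 10), (1, 1, 5, 10)]
rows, outcomes = [], []
for center, incontinent, good, not_good in cells:
    rows += [[center, incontinent]] * (good + not_good)
    outcomes += [1] * good + [0] * not_good

beta = fit_logistic(rows, outcomes)
# Odds ratios for good outcome: comparison vs reference center, then
# incontinent vs continent, each adjusted for the other variable.
odds_ratios = [math.exp(b) for b in beta[1:]]
```

With a single binary predictor and no adjustment variables, the fitted odds ratio reduces to the familiar cross-product ratio of the 2×2 table, which is a convenient check on the fitting routine.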
| Results |
|---|
The sensitivity and specificity of the dependency and recovery questions, using a BI score of <20 (dependency) and a modified RS score of 0 (recovery), are reported in Table 2. For the total sample, the dependency question had a sensitivity of 88% and a specificity of 77%. In individual centers, apart from Riga, sensitivity ranged from 83% in Dijon to 100% in Menorca; specificity ranged from 67% in Menorca to 94% in Warsaw. Riga was exceptional, with both sensitivity and specificity being low. For the total sample, the recovery question had a sensitivity of 78% and a specificity of 90%. In individual centers, sensitivity ranged from 63% in Riga to 100% in Florence and Menorca; specificity ranged from 80% in Menorca to 100% in Warsaw. There were significant differences in sensitivity and specificity between the centers for the dependency question and also in sensitivity for the recovery question.
Using the modified RS to examine dependency (modified RS score of 3, 4, or 5), the overall sensitivity was 83% and overall specificity 90%. When recovery was defined as 3-month modified RS score equal to or better than prestroke modified RS score, the overall sensitivity was 41% and specificity 84%. Similarly, using the BI, the overall sensitivity was 29% and specificity 95%.
Table 3 reports the difference between the 2 methods of measuring bad and good outcomes. Overall, the BI classified a larger proportion of respondents as having bad outcome than did the dependency question, although the upper limit of the confidence interval was only 9.2%. There was no significant difference between the 2 methods for most centers. For Kaunas and Menorca, however, the BI classified a larger proportion of respondents as having a bad outcome than the dependency question. Overall, the modified RS classified a larger proportion of respondents as having good outcome than the 2 simple questions, although the upper limit of the confidence interval is only 9.4%. In Almada, Riga, and Warsaw, there was no significant difference between the 2 methods for identifying good outcome. For the remaining centers, the modified RS classified a larger proportion of respondents as having good outcome than the 2 simple questions.
Table 4 reports the distributions of the BI for those who reported dependency using the 2 simple questions and also for those who did not report dependency. With the exception of respondents in Riga, high proportions of patients who were not dependent reported a BI score of 20. In all centers, only negligible numbers who were not dependent had BI score ≤9. In Riga, a large proportion (43%) of respondents who reported dependency had BI=20. In all centers, the proportion of respondents who reported that they had not made a complete recovery but were scored RS=0 was small (3%). However, there were high proportions of respondents who reported having made a complete recovery while scoring RS=1 or 2, especially in Dijon (71%) and Kaunas (79%).
Table 5 reports the unadjusted proportions classified into each outcome category (bad, indifferent, good) by using the 2 simple questions. After adjusting for case mix, the centers with the least likelihood of good outcome were in Eastern Europe. The odds ratios, compared with Kaunas, were highest in Menorca and Florence. London and Dijon also had higher odds of a good outcome compared with Kaunas.
| Discussion |
|---|
Overall, there was a significant difference between the proportion classified as having a bad outcome using the definition of BI<20 and the proportion classified as bad using the simple question. Considering the 95% confidence interval, the BI<20 definition will classify at most 9.2% more respondents as having a bad outcome than the 2-simple-questions method. However, a wider confidence interval in the estimate might represent more of a problem in choosing between the 2 methods when the sample size is small, as in the case of Menorca in this study, or for other unknown reasons, as in the case of Kaunas (upper limit of the 95% CI, 19.2%).
There was a similar pattern for the proportions classified as having a good outcome. Overall, the upper limit of the 95% CI for the difference was 9.4%, although the upper limits of the 95% CI for the differences in Dijon and Florence were high: 21.9% and 23.1%, respectively. This means that the modified RS score of 0 may classify considerably more respondents as having good outcome than the 2-simple-questions method. Nevertheless, the patterns of outcome identified in this way are broadly similar to previous cross-national studies of stroke10 11 in which poorer outcomes have been reported in Eastern European countries, and in the United Kingdom, compared with other Western European countries.
There are a number of potential limitations to the study, which must be taken into consideration when interpreting the outcomes presented here. First, variations in data collection methods adopted across centers for 3-month follow-up data (face-to-face interviews, telephone interviews, and postal survey) might constitute a methodological limitation. On the one hand, data collection of BI scores by telephone interview has been shown to have high validity compared with face-to-face assessment.16 On the other hand, reporting of subjective health status using the Short Form (SF)-36 is affected by the data collection method. A study comparing postal and telephone administration of the SF-36 found that health ratings were poorer and chronic illness more frequently reported in postal responses.17 Thus, we acknowledge the possibility that different data collection methods might produce different results, with more subjective assessments (general health status rather than ability to perform specific tasks) perhaps more liable to collection method influence. Allowing centers to use their own preferred method of data collection in this study was not ideal but was necessary to encourage the participation of all centers, including some centers in Eastern Europe with limited resources. It is also a pragmatic approach to data collection that will be required should such outcome measures be used in routine practice in the future. A further limitation, as acknowledged above, relates to the small sample size of 1 center in particular, Menorca, which resulted in wide confidence intervals.
Other factors should be considered when differences in outcome across centers are interpreted. Because variations in case mix have an impact on outcomes and their interpretation, statistical correction for confounding variables is essential.18 In this study, outcome as measured by the 2 simple questions was adjusted using variables appropriate to the clinical condition being investigated. Nevertheless, some case-mix variation might remain unmeasured, which could account for different outcomes across centers.19 Social class data were not collected in the study, because this information is difficult to standardize across countries, especially in the elderly. However, social class might be associated with expectations of recovery20 and uptake of support services.21
Another caveat relates to the quality of the questionnaire translations from English into the local languages. In the field of cross-national quality-of-life measurement, it has been proposed that, at the very least, correct instrument translation requires forward-backward translation, as well as a test of psychometric criteria on appropriate subjects.22 Such development work was beyond the resources of this project, as well as those of the IST. The 2 simple questions were therefore translated by local study coordinators following discussions at initial project meetings attended by all participants. We were unable to monitor the quality of the translation used in each center, although it was assumed that the questions were conceptually and semantically unproblematic. This assumption may well be unfounded: the meaning of "complete recovery" in particular may be open to different interpretations. This should also be taken into consideration when interpreting differences between the 2-simple-questions scores and modified RS scores, as well as differences in 2-simple-questions outcome across centers.
The importance of our study, however, lies not so much in the outcome data themselves but in the questions raised about the kind of outcome the 2 simple questions capture. They were devised as functional outcomes. Unlike the modified RS, which was developed for assessment by an observer, the 2-simple-questions method asks the patient (or proxy) for information and therefore requires the patient to interpret the question and to decide what information to divulge. One question might be assumed to ask only for factual information: "Did you require help from another person for everyday activities?" However, even this raised problems in the original study, with some subjects being unclear about the question's meaning or intention.3 Researchers participating in this study agreed to a definition of everyday activities that encompassed basic functional tasks (ie, feeding, dressing, personal hygiene). Nevertheless, the question is open to differing interpretations, because the concept of "everyday" activities may well vary from one respondent to another, as might expectations about activities for which it is legitimate to receive assistance from another person. Factors that might influence views of what constitutes an everyday activity and views of legitimate assistance include age, prestroke activities, family situation, and local culture. A striking example is provided by Ali and Mulley23 in their study of the use of the BI in rural Pakistan, in which they concluded that the measure was not appropriate, given local customs and lifestyle.
The second question asks subjects to comment on their own feeling of recovery and therefore invites an entirely subjective assessment. Consequently, the issue of how respondents interpret the question posed is likewise of crucial importance. To interpret better the responses elicited across different age groups (younger stroke patients compared with older stroke patients) and across cultures (a stroke patient in post-Soviet Latvia compared with an Italian living in Florence or a first-generation African-Caribbean living in inner-city London), we need to know more about the meanings attached to "recovery." These will surely be influenced by different expectations of recovery, disability, and well-being in different age groups, cultures, and perhaps also social class/wealth. The investigators of an American qualitative study of 102 stroke patients have reported that none of the patients interviewed, "even those who had visibly and tangibly regained lost function," considered themselves to have recovered.24 25 However, it is difficult to generalize about stroke patients' concepts of recovery, because few studies have investigated their expectations, much less how these may vary in different contexts.
In conclusion, this study has tested the feasibility of using simple questions rather than longer outcome instruments in a cross-national investigation. Both simple questions showed high sensitivity and specificity, demonstrating their theoretical validity. There were significant differences across centers in outcome, but reasons for these differences remain unclear. They may reflect inadequate translation of the questions, residual case-mix variation, or different cultural expectations of recovery and needs for assistance. The 2 simple questions were proposed as a resource-efficient method of collecting data on physical outcome after stroke. The need to collect data efficiently in large-scale studies can lead to a trade-off between sophistication and feasibility. Because the 2 simple questions have been used in a well-known international study that recruited over 19 000 subjects, their feasibility has been demonstrated. However, our study suggests that this method of outcome assessment should be adopted with caution. Although designed to assess functional outcome, respondents may interpret the questions with reference to a number of criteria. Some of these, such as culturally specific views of what constitutes recovery, have not yet been investigated. However, their importance in interpreting responses to questions such as those considered here should not be discounted.
| Acknowledgments |
|---|
Received July 25, 2000;
revision received November 17, 2000;