Different Measures for Assessing Stroke Outcome
An Analysis From the International Stroke Trial in Italy
Background and Purpose— We sought to assess the relationship between 2 simple questions on recovery (question 1: do you feel that you have made a complete recovery from your stroke?) and dependency (question 2: do you require help from another person for everyday activities?) and the Barthel Index (BI) and Oxford Handicap Scale (OHS), as well as the relationship between BI and OHS, in a large number of Italian stroke survivors who participated in the International Stroke Trial (IST).
Methods— We used data from 2423 patients interviewed by telephone at 6 months after the event. The κ statistic, sensitivity, and specificity were calculated for several comparisons. Internal consistency for BI was calculated.
Results— The reliability of the dependency question compared with BI=20 (κ=0.93) and the reliability of the recovery question compared with OHS=0 (κ=0.89) were good. Sensitivity of the dependency question in predicting whether patients scored BI >18 was 0.98; sensitivity of the recovery question in predicting whether patients scored OHS=0 was 0.99. The reliability of BI=20 compared with OHS <3 was good (κ=0.87). Internal consistency of BI was very high (Cronbach’s α=0.96).
Conclusions— The 2 simple questions are a good means of evaluating outcome from a patient’s view and of dichotomizing the stroke survivor in a time-effective and reliable way.
It is becoming increasingly important for those involved in the management of stroke patients to have a consistent tool capable of providing a realistic evaluation of a patient’s life in a simple, reliable way. Indeed, this assessment of outcome is essential in clinical trials of intervention and in epidemiological studies that seek to define the natural history of the disease.
In a review of acute stroke trials from 1955 to 1995,1 24% of the trials failed to report data on deaths, and a significantly higher number of trials considered impairment (76%) rather than disability (42%) or, to an even lesser extent, handicap or quality of life (2%) as a measure of outcome. In the same review, the Barthel Index (BI)2 was the most common measure for disability (21%), and the Rankin3 or modified Rankin Scale, also referred to as the Oxford Handicap Scale (OHS),4 was the most common measure for disability/handicap (9%).
Recently, 2 simple questions have been proposed to dichotomize stroke survivors into those who are dependent or those who are independent and, for those who are independent, into those who have a good or a fair recovery.5–7 This novel measure has been used with ease in a large number of countries with different cultures and varying medical practices, as seen in 2 large pragmatic stroke trials, the International Stroke Trial (IST)8 and the Chinese Acute Stroke Trial,9 in a smaller trial on low-molecular-weight heparin in acute ischemic stroke,10 and in a replication study with a larger sample.11
The reliability of the BI and OHS by telephone and postal questionnaire compared with the “gold standard” of a clinical visit has been established,5,12,13 and the 2 simple questions have also been validated.5,6
The aim of this study is to assess, by telephone interview in a large number of stroke survivors, the relationship between the new measure and the BI and OHS and subsequently between the BI and the OHS. The internal consistency of the BI will also be evaluated.
Subjects and Methods
In Italy, 3437 patients entered into the pilot and main phases of the IST from July 1, 1992, to May 31, 1996. The 3113 patients participating in the main phase were included in the present study and were recruited from the 77 centers spread throughout the country, with a higher density in the central and northern regions.
The IST scheduled a postal or telephone follow-up at 6 months after stroke, in which recovery was assessed by the question, “Do you feel that you have made a complete recovery from your stroke?” (question 1 [Q1]); dependency was assessed by the question, “Do you require help from another person for everyday activities?” (question 2 [Q2]). Information on usual residence and current medications or death and possible cause of death was also collected. Additionally, in the present study, responses for the BI and OHS were elicited.
Since the most common means of communication in Italy is by telephone, 1 physician (T.A.C.) with experience in stroke cases was assigned to personally contact each patient for follow-up. Whenever the patient could not be interviewed because of cognitive or communication problems or because the patient was not at home, the caregiver or closest relative (who is often the same person in Italy) was interviewed. The general practitioner was contacted only when it was impossible to trace the patient, because general practitioners would not have recorded detailed information on their patients’ disabilities or handicaps.
The telephone calls were performed as follows: the interviewer began with the 2 questions, then a brief conversation followed, in which the information required to complete the items of the 2 scales was elicited. All responses were recorded immediately to avoid any interference from the interviewer. When possible, the conversation was conducted as a friendly long chat to optimize the quality of the answers, to minimize the anxiety that could arise when talking about daily limitations, and to respect the patient’s privacy. The BI (0 to 20) was used to measure daily activities, where BI=20 indicates “not disabled”; the OHS (0 to 5) was used to determine disability/handicap, where 0 indicates “fully recovered” and <3 “good outcome.”
The analysis was performed as follows: The sensitivity, specificity, and accuracy of the dependency question (Q2) and the recovery question (Q1) were calculated against the most common BI “good outcome” cutoffs reported in previous studies.6,14 The same evaluations were also performed to compare Q1 with OHS=0. Then a BI score of 20 was used as the gold standard versus OHS=0 and OHS <3.
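As a minimal illustration of how these indices can be derived from a 2×2 cross-tabulation of a simple question against a dichotomized scale, the following Python sketch may help. The counts and the names `TwoByTwo`, `sensitivity`, `specificity`, and `accuracy` are hypothetical and are not taken from the study tables.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cross-tabulation of a simple question against a reference scale."""
    tp: int  # question positive, reference positive
    fp: int  # question positive, reference negative
    fn: int  # question negative, reference positive
    tn: int  # question negative, reference negative


def sensitivity(t: TwoByTwo) -> float:
    # Proportion of reference-positive patients identified by the question.
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    # Proportion of reference-negative patients identified by the question.
    return t.tn / (t.tn + t.fp)


def accuracy(t: TwoByTwo) -> float:
    # Proportion of all patients classified the same way by both measures.
    return (t.tp + t.tn) / (t.tp + t.fp + t.fn + t.tn)


# Hypothetical counts, for illustration only.
table = TwoByTwo(tp=90, fp=5, fn=10, tn=95)
print(f"sensitivity={sensitivity(table):.2f}, "
      f"specificity={specificity(table):.2f}, "
      f"accuracy={accuracy(table):.3f}")
```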
Unweighted κ15 with 95% CI was used to assess agreement without the play of chance between the 2 questions and each of the complex scales and between the 2 scales themselves. The internal consistency of the BI was assessed by Cronbach’s α.16,17 The relation between total BI score and individual BI tasks was analyzed by means of the Spearman rank correlation test.
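Unweighted κ corrects the observed agreement for the agreement expected by chance from the marginal totals. A minimal Python sketch for a 2×2 table, using hypothetical counts and a hypothetical function name `cohen_kappa`, might look as follows; it does not compute the confidence interval, since the formula used for it is not stated here.

```python
def cohen_kappa(both_yes: int, only_a_yes: int,
                only_b_yes: int, both_no: int) -> float:
    """Unweighted kappa for two dichotomous measures A and B."""
    n = both_yes + only_a_yes + only_b_yes + both_no
    # Observed agreement: both measures give the same classification.
    observed = (both_yes + both_no) / n
    # Marginal totals of "yes" for each measure.
    a_yes = both_yes + only_a_yes
    b_yes = both_yes + only_b_yes
    # Agreement expected by chance if A and B were independent.
    expected = (a_yes * b_yes + (n - a_yes) * (n - b_yes)) / n ** 2
    return (observed - expected) / (1 - expected)


# Hypothetical counts, for illustration only.
kappa = cohen_kappa(both_yes=90, only_a_yes=5, only_b_yes=10, both_no=95)
print(f"kappa = {kappa:.2f}")
```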
Results
Of 3113 patients included in the main phase of the IST in Italy, 622 (20%) died within 6 months. IST follow-up at 6 months after stroke, consisting of Q1 and Q2 answers, was available for all of the 2491 survivors. However, information on 68 people (2.7%) was lacking to complete the BI and OHS items included in the present study. In 34 cases the interviewees failed to describe the detailed situation after answering the 2 simple questions; in 22 cases general practitioners were contacted for Q1 and Q2 answers, but no further information was available; and in 12 cases the interviewer did not insist on asking detailed questions because the patients were totally dependent, at times as a result of other causes (eg, terminal cancer), and the interview itself proved to be a source of grief for the caregiver.
Therefore, the results of Q1, Q2, BI, and OHS for a total of 2423 patients were considered in the present study. The mean age was 70.5 years (range, 20 to 99 years). There were 1016 women (42%; mean age, 73.3 years) and 1407 men (58%; mean age, 68.5 years). In 35.7% of cases, patients were contacted personally.
When we compared Q2 with previously reported BI cutoffs, BI >18 produced the best agreement (κ=0.95, sensitivity=0.98, specificity=0.97, accuracy=0.98). BI=20 (κ=0.93, sensitivity=0.99, specificity=0.93, accuracy=0.96) and BI >17 (κ=0.94, sensitivity=0.96, specificity=0.99, accuracy=0.98) gave similar results, albeit with a slightly lower κ (Table 1).
Similarly, Q1 compared with OHS=0 resulted in sensitivity=0.99, specificity=0.96, accuracy=0.96, and κ=0.89 (Table 2). To ascertain whether there was any agreement between “fully recovered” and “not disabled,” Q1 was compared with the best BI score (20), resulting in a poor agreement (κ=0.35) and relatively low accuracy (accuracy=0.66) (Table 3). There were differences in the judgment regarding complete recovery (Q1) and independence (BI=20) between patient and caregiver; in fact, when the patients self-evaluated their own outcome, specificity and agreement between the 2 measures were poor (specificity=0.36, κ=0.20), while when the caregivers answered, the measures improved (specificity=0.73, κ=0.43) (Table 4).
To better understand the meaning of the 2 different OHS cutoffs, OHS=0 and OHS <3 were compared with the best score of BI (20). The former comparison gave a very low sensitivity (sensitivity=0.36) as well as agreement (κ=0.36), while in the latter sensitivity (sensitivity=0.94), specificity (specificity=0.93), and agreement (κ=0.87) were very high (Table 5). Similar results were obtained when we compared OHS <3 and the dependency question (sensitivity=0.94, specificity=0.98, κ=0.91) (Table 6). This confirms that stroke survivors with an OHS score of <3 are to be considered functionally recovered.
Internal consistency was evaluated to determine whether any single item in the BI was able to predict the total score. The correlation index and dissimilarities index were calculated for each ordinal variable. To remove the effect of an item’s own score on the total, each item was correlated with the total score calculated without that item. Cronbach’s α showed that the BI had a high internal consistency (α=0.96). All 10 items were strongly correlated with the total score, with values of R=0.937 for dressing and R=0.929 for climbing stairs (Table 7).
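For readers unfamiliar with these indices, the sketch below shows one common way to compute Cronbach’s α and a Spearman correlation between each item and the total of the remaining items. It is an illustration under assumptions: the score matrix is invented, the function names are hypothetical, and the item-versus-rest reading of the correction is our interpretation rather than a documented detail of the analysis.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """rows: one list of item scores per patient."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def item_rest_spearman(rows, j):
    """Spearman correlation of item j with the total of the other items."""
    item = [row[j] for row in rows]
    rest = [sum(row) - row[j] for row in rows]
    return pearson(average_ranks(item), average_ranks(rest))


# Invented scores for 5 patients on 3 items, for illustration only.
scores = [[0, 0, 0], [1, 1, 0], [1, 2, 1], [2, 2, 1], [2, 3, 2]]
alpha = cronbach_alpha(scores)
rho_first = item_rest_spearman(scores, 0)
print(f"alpha={alpha:.3f}, item 1 vs rest rho={rho_first:.3f}")
```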
Discussion
The quality of outcome data is determined by the quality of the measures used to produce them. Using poorly evaluated measures may yield misleading results and may thereby affect important clinical decisions, especially for the individual patient. The main purpose of conducting this study on previously validated scales and simple questions was to obtain more information about these measures through a very large case series and, additionally, to validate the simple questions against the complex measures through a telephone follow-up and to confirm their applicability in different cultures.
The strong correlation between Q2 and BI >18 and between Q1 and OHS=0 indicates that the 2 questions actually measure what they were intended to, ie, independence and full recovery, in a quick, simple, and direct manner. Q1 mainly identifies patients who regain a good functional, cognitive, and psychological state (436/520 true positive in our series) and also identifies those with health limitations before the stroke (84/520 false-positive [16.2%]; 95% CI, 13 to 19.3). Indeed, the main difference between Q1 and OHS=0 is that the latter indicates those patients with a complete absence of symptoms due to any cause. Furthermore, agreement between Q1 and the maximum BI is poor, reinforcing the well-known concept that independent patients do not consider themselves healthy if they suffer from even minor residual stroke symptoms and perceive that this affects their life.
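How the confidence interval for the false-positive proportion was obtained is not stated. As an assumption, a normal-approximation (Wald) interval applied to 84 of 520 appears to agree with the reported figures after rounding; the function name `wald_interval` in this Python sketch is hypothetical.

```python
from statistics import NormalDist


def wald_interval(events: int, total: int, level: float = 0.95):
    """Proportion with a normal-approximation (Wald) confidence interval."""
    p = events / total
    # Two-sided critical value, about 1.96 for a 95% interval.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * (p * (1 - p) / total) ** 0.5
    return p, p - half_width, p + half_width


# Counts reported in the text: 84 false-positive answers among 520.
p, low, high = wald_interval(84, 520)
print(f"{100 * p:.1f}% (95% CI, {100 * low:.1f} to {100 * high:.1f})")
```

Other interval methods (eg, exact or Wilson intervals) give slightly different limits, so small discrepancies with published values are to be expected.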
This becomes more evident when answers from patients are compared with those from caregivers. Agreement between Q1 and absence of disability (BI=20) is very poor in the first case, in which a self-evaluation is made (κ=0.20), while it improves to moderate (κ=0.43) when the caregiver attempts to evaluate the patient’s condition. From the patient’s point of view, this could mean that a feeling of lost previous life exists and that “there is more to life than getting into the bath on one’s own.”18 However, this consideration must be taken cautiously; in fact, it is highly probable that the caregiver answered more frequently when the outcome was poor, and thus the better agreement may be due to a greater proportion of patients being definitely disabled and not recovered.
When the 2 complex scales are compared, there is a strong correlation between no disability (BI=20) and absence of handicap (OHS <3). This is crucial because the latter has been widely used in many recent clinical trials on both secondary prevention and acute treatment. Our data confirm the suggestion that OHS is basically a disability scale when the cutoff between 2 and 3 is used to separate surviving stroke patients into categories. However, this would exclude those independent patients who have psychological and/or cognitive problems or undergo a change of lifestyle, which in our series amounted only to 70 of 1223 (5.7%; 95% CI, 4.5 to 7). These findings are similar to those reported by de Haan et al,19 in which mobility, disability in daily living, and instrumental activity scales showed a stronger association with OHS than cognitive and social functioning scales.
A BI score of 20, an OHS score <3, and a negative answer to Q2 all indicate independence. In a multicenter study or in cases of local monitoring involving >1 clinician, the simplest and most easily reproducible measure should be used, which is BI or, optimally, Q2. However, whenever assessment is performed by a single expert, OHS is recommended because, to a certain extent, it affords the distinction between disability and handicap, which in our series concerned 5.7% of independent patients.
If the cutoff is set between OHS=0 (ie, no symptoms) and OHS >0, the scale becomes a handicap scale because symptoms, whether related to stroke or not, interfere with social life. Thus, if a handicap scale is desired, it would be advisable to use the OHS=0 cutoff as well.
Furthermore, our results show that each item of the BI measures the same concept; in fact, each predicts the main result equally well. This reflects the homogeneity of the scale and its inter-item consistency and explains its wide international application. However, in Italy, the most indicative item reported by telephone was dressing. In a similar study in Britain, bathing was found to be the most indicative item,5 while in Japan, feeding was less indicative,20 suggesting that when even a simple parameter such as disability is measured, cultural differences must be taken into account. Indeed, the most indicative item should be investigated to the fullest during the interview to better evaluate disability and focus on the most sensitive tasks in rehabilitation.
The findings of the present study support the use of the 2 questions when evaluating the outcome of stroke patients. They are easy to administer (even on the telephone), brief, cost effective, and valid in measuring disability and complete recovery to prestroke status. Any differences between these 2 questions and the more complex measures, ie, BI <20 and OHS=0, are negligible, especially in large stroke series, allowing large, pragmatic stroke trials to be completed. On the other hand, they are not able to describe handicap or quality of life restrictions if they are not directly or indirectly related to the stroke event.
The OHS maintains its function as a handicap scale only if the dichotomy is between 0 and >0. In fact, scores 1 and 2 refer to quality of life (symptoms without impairment or disability) and social roles (lifestyle). The more frequent use of the OHS with a cutoff between 2 and 3 describes the functional status of the patient, as does the BI. A balance should be considered between the ease in administering the BI, including the easiest dependency question, and the comprehensiveness of the OHS, which is able to identify those people suffering from a decline in social well-being, even if they are few in number. In stroke rehabilitation clinics, where a precise estimate of stroke outcome in an individual patient is mandatory, the 2 questions and OHS and BI could be used to obtain a global view of a patient’s life and to monitor clinical improvement. These scales and the 2 questions could also be used in small analytical treatment studies, in which follow-up could be accomplished at a higher cost and with more time by means of a local visit, bearing in mind that detailed measures can be unreliable if studies include only a few subjects, since the effect of random variation is likely to be larger than the effect of the treatment.5,21 However, the 2 questions may be used alone in large, pragmatic trials because it is far better to use a simpler (but valid) outcome measure in thousands of patients than a more complex (and therefore more expensive and possibly unaffordable) measure in a few hundred patients.
Measurement of health-related outcomes should be a compromise between simplicity and patient perspective, as these 2 questions are. These questions may help to solve the problem of dichotomizing the outcome of stroke survivors in a simple manner from the beginning of the trial and not with post hoc analyses.
Italian IST Collaborators
(Acquapendente) Ospedale di Acquapendente: Pisanti P., Rollo F.; (Ancona) Ospedale Geriatrico: Del Gobbo M., Guidi M., Pelliccioni G., Scarpino O.; Ospedale Torrette: Ceravolo M.G., Pelonara S., Provinciali L., Reginelli R.; (Assisi) Ospedale di Assisi: Bondi L.; (Bari) Policlinico di Bari: Federico F., Inchingolo V., Insabato R., Laddomeda G., Lucivero V.; (Belluno) Ospedale di Belluno: Fassetta G., Gentile M., Giuseppe G., Tournier B.; (Bergamo) Ospedali Riuniti Neurologia I: Defanti C.A., De Marco R.; Neurologia 2: Belloni G., Camerlingo M., Casto L., Censori B., Mamoli A.; (Bologna) Ospedale S. Orsola e Malpighi: Azzimondi G., Bacci M., D’Alessandro R., Fiorani L., Naldi S., Nonino F., Peta G., Pugliese S.; (Brescia) Ospedale di Brescia: Anzola P., Mangoni; (Cagliari) Ospedale San Michele: Melis M., Spissu A.; (Camposampiero) Ospedale Civile P Cosma: Chiavinato G.L.; (Carpi) Ospedale Civile: Lolli V., Lugli M.L., Miele V., Santangelo M.; (Cascia) Ospedale di Cascia Norcia: Buccolieri A., Cozzari M.; (Catania) Policlinico Università: Giammona G., Giuffrida S., Le Pira F., Nicoletti F., Saponara R.; (Cento) Ospedale di Cento, USL 30: Sarti G.; (Cesena) Ospedale Bufalini: Mazzini G., Pagliarani G., Pretolani E., Pretolani M., Rasi F., Tonti D.; (Chiaravalle) Ospedale Civile: Lopresti; (Chioggia) Ospedale Civile di Chioggia: Zotti S.; (Città di Castello) Ospedale Civile: Arcelli G.; (Como) Ospedale Valduce: Guidotti M.; (Cortona) Ospedale di Cortona: Aimi M., Conti G., Corbacelli C., Migliacci R., Mollaioli M.; (Firenze) Ospedale S.M. Annunziata Medicina 2: Landini G., Manetti F.; Medicina 3: Bartolozzi A., Bellesi R.; (Foligno) Ospedale di Foligno, Medicina: Massi Benedetti M., Maremmani A.M.; Neurologia: Bacchi O., Brustengi P., Stefanucci S.; (Forlì) Ospedale di Forlì: Cirrillo G., Pedone V.; (Galatina) Azienda USL LE/1 Galatina: Marzo A.; (Genova) Dipartimento di Scienze Neurologiche: Bruzzone G., Del Sette M., Finocchi C., Gandolfo C.; (Imola) Ospedale di Imola: Ballotta A., Bertuzzi D., Chioma V., Fini M., Matacena C., Marzara G., Michelini M., Pirazzoli G., Sacchet C.; (Isernia) Instituto Sanatrix: Aloj F., Buzzi M.G., Castellano A.E., Gatta A., Minotta S., Rossi F.; (L’Aquila) Ospedale Collemaggio: Carolei A., Marini C.; (Latisana) Ospedale di Latisana Medicina: Gavardi M.; (Lavagna) Ospedale di Lavagna: Caneva E., Canevari E., Colombo R., Giunchedi M., Ratto S., Rocca I., Sivori D.; (Mede) Ospedale San Martino: Gallotti P., Garbagnoli P., Rossanigo P.L., Tardani F., Zaccone M.T.; (Messina) Ospedale Piemonte: Arena A.; Policlinico Universitario: Musolino P., Rosario G.; (Mestre-Venezia) Ospedale “Umberto I” Mestre: Haefele M., Pistollato G.; (Milano) Ospedale Niguarda: Bottini G., Brucato A., Juli E., Ferraro G., Thiella G., Rinaldi M., Santilli I., Sterzi R.; Ospedale San Raffaele: Comola L.M., Francesci M., Volonte L.M.A.; Ospedale Sesto San Giovanni: Cavestri R., Longhini E., Mazza P.; (Modena) Ospedale di Modena: Bernardi C., Malferrari G.; (Moncalieri) Moncalieri Santa Croce: Curti A., Fogliati M., Frediani R., Pecorari L.; (Monselice) Ospedale di Monselice: Conforto L., Turrin M.; (Negrar) Ospedale Don Calabria: Cacace C., Rimondi B.; (Nuoro) Ospedale S. Francesco: Murgia S.B.; (Offida) Ospedale di Offida: Cipollini F.; (Olbia) Ospedale Civile di Olbia-San Giovanni di Dio: Mura G., Pirisi A., Secchi G.; (Orvieto) Ospedale di Orvieto: Franciosini M.F.; (Osimo) Ospedale di Osimo S. Benvenuto E Rocco: Pellegrini F.; (Padova) Università di Padova: Meneghetti G.; (Parma) Ospedale Maggiore: Catahmo A., Finzi G., Ponari O., Tonelli C.; (Pavia) Fondazione Mondino: Bosone D., Cavallini A., Micieli G., Nappi G., Poli M., Zappoli F.; (Perugia) Istituto di Gerontologia e Geriatria: Aisa G., Cherubini A., Polidori M.C., Romano G., Savastano V., Senin U.; Ospedale Silvestrini: Cantisani T.A., Caselli P., Floridi P., Tiacci C.; Policlinico: Benemio C., Celani M.G., Ciorba E., Comparato E., Duca E., Ricci S., Righetti E., Zampolini M.; (Piacenza) Ospedale Civile Piacenza: Bionda E., Cammarata S., Debenedictis M., Gala B., Poli V., Vignola A.; (Pistoia) Ospedale di Pistoia: Sita D., Volpi G.; (Potenza) Ospedale di Potenza, San Carlo: Paciello, Peluso D., Sica U.; (Putignano) San Michele in Monte Laureto: Dellarosa A.; (Salerno) Ospedale Riuniti: Iuliano G.; (Sarnico) Ospedale di Sarnico, P.A. Faccanoni: Casella G., Mascaretti L., Scatena L., Spadaro C.; (Sassari) Ospedale di Sassari: Casu G., Marras F.A., Pirisi A., Spanu M.A., Zuddas M.; (Spoleto) Ospedale di Spoleto: Cenciarelli S., Grasselli S., Miele N.; (Terni) Ospedale di Terni Geriatria: Carnevali P., Consalvi G., Finistauri D., Grilli G., Maragoni M.; Neurologia: Bartocci A., Costantini F., De Santis L., Iannone G., Moschini E., Paci A., Sensidoni A., Trenta A.; (Todi) Ospedale di Todi: Alunni G., Biscottini B., Boccali A., Cruciani M., Ibba R., Pacini M.; (Tredabissi) Ospedale di Melegnano: Amodeo M., Colombo A., Marsile C., Pontrelli V., Sasanelli F.; (Trieste) Ospedale Maggiore: Antonutti L., Boniccioli B., Chiarandini G., Chiodo-Grandi F., Gregori M., Guerrini N., Koscica N., Musco G., Nider G., Polo S., Relta G., Valli R.; (San Benedetto del Tronto) Ospedale S. Benedetto del Tronto: Carboni R., Coccia G., Curatola L., Gobbato R., Infriccioli P., Sabatini D., Sfrappini M.; (S. Giovanni Valdarno) Ospedale di San Giovanni Valdarno: Cuccuini A.; (Vibo Valentia) Presidio Ospedaliero “G. Iazzolino”: Consoli D., Vecchio A.; (Vicenza) Ospedale San Bortolo: Dudine P., Morra M., Toso V.; (Vimercate) Ospedale di Vimercate: Casati G., Ciccone A., Marmiroli P.; (Zingonia) Policlinico San Marco: Chia F., Mauro A., Munari L., Perretti A.
Telephone costs of IST follow-ups were covered by a grant from the IST trial office, Neurosciences Trial Unit, Western General Hospital, Edinburgh, Scotland. We wish to thank Michele Kildea for her invaluable help in translating the manuscript.
A complete list of the Italian IST Collaborators is provided in the Appendix.
- Received May 31, 2001.
- Revision received September 18, 2001.
- Accepted September 25, 2001.
Roberts L, Counsell C. Assessment of clinical outcomes in acute stroke trials. Stroke. 1998; 29: 986–991.
Dennis M, Wellwood I, Warlow C. Are simple questions a valid measure of outcome after stroke? Cerebrovasc Dis. 1997; 7: 22–27.
Dennis M, Wellwood I, O’Rourke S, MacHale S, Warlow C. How reliable are simple questions in assessing outcome after stroke? Cerebrovasc Dis. 1997; 7: 19–21.
Hommel M, for the FISS bis Investigators Group. Fraxiparine in Ischaemic Stroke Study (FISS bis). Cerebrovasc Dis. 1998; 8 (suppl 4): 19.
Candelise L, Pinardi G, Aritzu E, Musicco M. Telephone interview for stroke outcome assessment. Cerebrovasc Dis. 1994; 4: 341–343.
Sulter G, Steen C, DeKeyser J. Use of the Barthel Index and modified Rankin Scale in acute stroke trials. Stroke. 1999; 30: 1538–1541.
Duncan PV, Jorgensen HS, Wade DT. Outcome measures in acute stroke trials: a systematic review and some recommendations to improve practice. Stroke. 2000; 31: 1429–1438.
Van Gijn J. Measurement of outcome in stroke prevention trials. Cerebrovasc Dis. 1992; 2 (suppl 1): 23–34.
de Haan R, Limburg M, Bossuyt P, van der Meulen J, Aaronson N. The clinical meaning of Rankin “handicap” grades after stroke. Stroke. 1995; 26: 2027–2030.
Chino N. Efficacy of Barthel Index in evaluating activities of daily living in Japan, the USA and UK. Stroke. 1990; 21 (suppl II): II-264–II-265.
Peto R. Monitoring cancer patients in clinical trials need not be precise. In: Symington T, Williams AE, McVie JG, eds. Cancer: Assessment and Monitoring: Tenth Pfizer International Symposium. Edinburgh, Scotland: Churchill Livingstone; 1980: 377–381.