Self- and Proxy-Report Agreement on the Stroke Impact Scale
Background and Purpose: The purpose of this study was to examine proxy-patient agreement on the domains of the Stroke Impact Scale (SIS), as per the proxy-proxy perspective.
Methods: Stroke patients were prospectively assessed by means of the NIH Stroke Scale, Barthel index, and modified Rankin scale. Proxies and patients answered the Hospital Anxiety and Depression Scale and the SIS 3.0. Comparisons of patient-proxy mean scores (paired t test), effect size, and intraclass correlation coefficients (ICC) were calculated for each of the SIS domains, and weighted kappa for individual items.
Results: A total of 180 proxy-patient pairs were assessed. Proxies were younger (mean age: 43.1 versus 57.9 years) and had a higher education level (P<0.0001). The bias between patient-proxy mean differences was low (from 5.3, Strength, to 0.1, Communication). Proxies scored patients as significantly more severely affected in the Strength (41.7 versus 36.6; P<0.0001) and ADL (46.2 versus 43.1; P=0.01) domains, and in the Composite Physical Domain (CPD; 39.7 versus 34.9; P<0.0001). The magnitude of the difference was small (effect size: 0.21). ICC values for the SIS domains ranged from 0.17 (Emotion) to 0.79 (Hand function). The ICC value for the CPD was 0.83. The Memory, Communication, Emotion, and Social Participation domains had lower ICC values. The weighted kappa values for the SIS items ranged from 0.09 (item 4e) to 0.80 (item 7d). The highest values (moderate/high agreement) were observed for the SIS-16 and CPD (kappa values: 0.31 to 0.80).
Conclusions: Agreement between stroke patients and proxies was acceptable for most SIS domains and the SIS-16. Proxy assessment of the subjective SIS domains should be interpreted with caution.
- interrater agreement
- health related quality of life
- Stroke Impact Scale
- stroke outcome
Health-related quality of life (HRQoL) assessment has been increasingly used in stroke research, as well as in public health, over the last decade. HRQoL is a highly subjective concept, and stroke patients should be the primary informants because self-report is more valid than any proxy report.1
Nevertheless, around 25% of stroke patients are excluded in HRQoL studies because of aphasia or dementia.2 In addition, many stroke patients are unable to complete HRQoL questionnaires by themselves. Missing data can result in biased estimates of stroke treatment effect, diminish the power of the study to detect responsiveness, and limit the generalizability of the results in rehabilitation clinical trials. In these cases, a proxy such as a health care professional or a family caregiver may help to evaluate the patient’s HRQoL.3
A few studies have assessed the proxy-version of some HRQoL measures in stroke. The reliability of proxy raters has been examined in generic (Sickness Impact Profile,4 EQ-5D,5 Health Utility Index6,7) and specific HRQoL measures (Stroke Impact Scale,8 Stroke Specific Quality of Life Scale,9 Stroke and Aphasia Quality of Life Scale-39).10 However, the proxy viewpoint elicited is often ambiguous. It is not clear whether proxy assessments were elicited by asking the proxy to rate the stroke patient as the caregiver thinks the patient would respond (proxy-patient perspective) or by asking the proxy to provide their own perspective on the stroke survivor’s HRQoL (proxy-proxy perspective).3
The quality of HRQoL assessment by stroke patients and proxies based on specific HRQoL measures has not been previously studied in Brazil. The aim of this study was to assess the agreement on HRQoL between stroke patients and proxies, as per the proxy-proxy perspective, using the Stroke Impact Scale (SIS).11 The proxy-proxy perspective was chosen because it may be more objective and reliable than the proxy-patient perspective. We hypothesized that proxies may assess the HRQoL of stroke patients reliably, and that the agreement between stroke survivors and their caregivers would be satisfactory for observable functioning.
Patients and Methods
Eligible subjects were patients with a clinical diagnosis of stroke and a stable caregiver. They were consecutively admitted to the outpatient Stroke Rehabilitation Clinics between July 2007 and April 2008. Stroke was defined as a focal deficit of sudden onset that lasted at least 24 hours with no known cause other than a vascular one.12 Stroke was confirmed by clinical examination and neuroimaging findings. Both ischemic and hemorrhagic strokes were included in the study, as well as first and recurrent strokes. Exclusion criteria were: (1) patients with transient ischemic attack; (2) patients with subdural hematoma or brain injury; (3) patients who were not able to fill out the questionnaires because of severe aphasia or dementia; (4) absence of a stable caregiver.
Stroke patients who had a proxy or caregiver were candidates to participate in the study. The proxy was a family member such as a spouse or partner, sibling, or offspring, or, if unavailable, a close friend. Proxies had to be ≥18 years old. The institutional review board approved the study protocol. All patients and proxies included in the study gave their informed consent.
Stroke patient data were prospectively collected on age, sex, education level, occupation, marital status, stroke etiology, and vascular risk factors. Sociodemographic information of proxies included age, sex, education, and relationship to the patient.
Patients were assessed by means of the National Institutes of Health Stroke Scale (NIHSS),13 the modified version14 of the Rankin scale (m-RS),15 the Barthel index (BI),16 and the Mini-Mental State Examination.17 Both patients and proxies answered the Hospital Anxiety and Depression Scale (HADS)18 and the SIS11 by themselves during their visits to the clinic. Patients and proxies were blinded to each other’s responses. Proxies were instructed to provide their own perspective on the stroke patient’s HRQoL.
The SIS 3.0 is a 59-item self-report assessment of stroke outcome used to assess HRQoL.19 The SIS has 8 domains: Strength, Hand function, Mobility, Physical and instrumental activities of daily living (ADL/IADL), Memory and thinking, Communication, Emotion, and Social participation. Scores for each domain range from 0 to 100, and higher scores indicate better HRQoL. Four of the subscales (Strength, Hand function, ADL/IADL, and Mobility) can be combined into a Composite Physical Domain (CPD). The SIS 3.0 also includes a question (item 50) to assess the patient’s global perception of recovery. The same group of investigators developed the SIS-16, a short Composite Physical Domain.20 The SIS Brazilian version was used for the purpose of the study.21
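As a rough illustration of the 0 to 100 domain scoring described above, the sketch below assumes that each item is rated on a 1 to 5 scale and that a domain score is the mean item rating rescaled linearly. The transformation, the rating range, and the function name `sis_domain_score` are assumptions made for illustration; they are not details given in this paper, and reverse-scored items (if any) would need recoding first.

```python
def sis_domain_score(item_ratings, low=1, high=5):
    """Rescale the mean of the item ratings to a 0-100 domain score.

    Assumption: ratings run from `low` (worst) to `high` (best) and the
    domain score is a linear rescaling of their mean.
    """
    if not item_ratings:
        raise ValueError("a domain needs at least one item rating")
    if any(not (low <= r <= high) for r in item_ratings):
        raise ValueError("ratings must lie between low and high")
    mean_rating = sum(item_ratings) / len(item_ratings)
    return (mean_rating - low) / (high - low) * 100


# Hypothetical ratings for a four-item domain (not study data).
example_score = sis_domain_score([2, 4, 3, 3])
```

Under these assumptions, a respondent who rates every item at the midpoint of the scale receives a domain score of 50, and higher scores indicate better HRQoL, as stated in the text.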
Acceptability and scaling assumptions, convergent validity, reliability, and precision of the SIS proxy-version were explored. Cronbach’s alpha was used to measure internal consistency of the SIS proxy-version. An alpha value ≥0.70 was considered acceptable.
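The internal consistency check can be sketched as follows. This is a minimal illustration of the usual Cronbach’s alpha formula, not the SPSS routine used in the study; the function name and the demo responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """item_scores: one list per respondent, each holding k item scores."""
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in item_scores]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: 4 respondents x 3 items (not study data).
demo = [[1, 2, 1], [2, 3, 2], [4, 4, 5], [5, 5, 4]]
alpha = cronbach_alpha(demo)
is_acceptable = alpha >= 0.70  # criterion stated in the text
```

Alpha rises when the items vary together, so a domain whose items tap different constructs can fall below the 0.70 criterion even if each item is answered carefully.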
The following analyses were performed to assess SIS patient-proxy agreement7,22:
Comparison of patient-proxy mean scores for each of the SIS domains by means of a paired-sample Student t test.
The magnitude of the systematic bias between patient-proxy mean scores was calculated by means of the effect size (ES) and the standardized response mean (SRM). An ES or an SRM of 0.2 was considered a small bias, and a value between 0.2 and 0.5 a moderate bias.
Agreement and concordance on SIS individual items between the proxy and patient versions were evaluated by means of the weighted kappa with quadratic weights. Values <0.2 imply poor agreement, 0.21 to 0.40 fair agreement, 0.41 to 0.60 moderate agreement, 0.61 to 0.80 good agreement, and 0.81 to 1.0 excellent agreement.23 Percentage agreement (exact patient-proxy score matches) for items with poor or fair kappa values (<0.4) was also calculated to determine whether the low values represented lack of agreement or lack of variability in the data.
Calculation of a 2-way mixed (3, 1 model) intraclass correlation coefficient (ICC) between proxies and stroke patients for each of the individual SIS domains was performed. An ICC <0.40 was considered as poor agreement, 0.40 to 0.70 good agreement, and ICC higher than 0.70 excellent.24
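The ICC step above can be sketched from the two-way ANOVA mean squares. This is a minimal illustration of a single-measure, consistency-type ICC(3,1) for two raters (patient and proxy); the function names and example pairs are hypothetical, and the code is not the SPSS procedure the authors ran.

```python
def icc_3_1(pairs):
    """pairs: list of (patient_score, proxy_score), one tuple per dyad."""
    n = len(pairs)  # subjects (dyads)
    k = 2           # raters: patient and proxy
    grand = sum(a + b for a, b in pairs) / (n * k)
    row_means = [(a + b) / k for a, b in pairs]
    col_means = [sum(p[j] for p in pairs) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for p in pairs for x in p)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def label_icc(icc):
    """Cut-offs as stated in the text: <0.40 poor, 0.40-0.70 good, >0.70 excellent."""
    if icc < 0.40:
        return "poor"
    if icc <= 0.70:
        return "good"
    return "excellent"


# Hypothetical dyads (not study data).
demo_icc = icc_3_1([(1, 2), (2, 1), (3, 4), (4, 3)])
```

Because this form of the ICC measures consistency, a proxy who rates every patient a fixed number of points lower would still yield an ICC of 1. That is one reason to read the ICC alongside the paired comparison of means, which captures systematic bias.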
Association between stroke functional measures and proxy/patients SIS domain mean scores was tested using Spearman rank correlation coefficients. Values lower than 0.30 were considered indicative of weak correlation; 0.30 to 0.59, moderate; and ≥0.60, strong.
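For readers who want to reproduce the correlation step, the sketch below computes a Spearman coefficient as the Pearson correlation of average ranks and applies the strength cut-offs stated above. The function names are hypothetical, and the example omits significance testing.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def correlation_strength(rho):
    """Cut-offs from the text, applied to the absolute value of rho."""
    r = abs(rho)
    if r < 0.30:
        return "weak"
    if r < 0.60:
        return "moderate"
    return "strong"
```

The sign of the coefficient depends on the direction of each scale: higher m-RS scores mean worse function while higher SIS scores mean better HRQoL, so a strong association between them appears as a negative value.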
To assess the discriminative validity of the SIS proxy-version, the Kruskal-Wallis nonparametric test was performed to assess whether mean differences between patients and proxies on the SIS domains varied across disability level (BI), functional status (m-RS), and presence of depression (HADS-Depression mean score ≥11).
Multiple regression analysis was performed to assess differences between SIS patient-and-proxy mean scores (dependent variables). Independent variables were stroke severity (NIHSS), disability (BI), depression (HADS-Depression subscale), and sociodemographic variables. SPSS version 13.0 was used for analysis.
Results
Of the 267 stroke patients consecutively recruited, 180 (67.4%; 55.6% males) had an identifiable caregiver who could participate as a proxy. Stroke patients who had a caregiver were significantly older (58.2 versus 51.9 years) and more disabled (BI: 67.4 versus 85.9, P<0.0001). Demographic characteristics of stroke patients and their proxies are shown in Tables 1 and 2, respectively. Proxies were younger (mean age: 43.1 versus 57.9 years) and had a higher education level (9.9 versus 7.9 years, P<0.0001). Most proxies were females (81.1%), and 37.8% were spouses. Significant differences were observed in the HADS-Depression subscale between patients and proxies (7.2 versus 5.4; P<0.0001).
Metric Attributes of the SIS Proxy-Version
The metric attributes of the proxy-version are shown in Table 3. Mean scores of the proxy-version SIS domains ranged from 20.5 (Hand function) to 76.9 (Communication). A prominent floor effect was observed in the Hand function domain (52.5%). Internal consistency of the SIS domains (Cronbach’s alpha from 0.82, Strength, to 0.95, Hand function) was satisfactory; only the Emotion domain (Cronbach’s alpha=0.49) did not attain the alpha value of 0.70. Most corrected item-domain correlation coefficients attained values higher than the criterion (0.40). Standard error of measurement (SEM) values for each SIS domain ranged from 6.5 (Mobility) to 10.1 (Strength), and attained the criterion (SEM≤SD/2).
Proxy- and Self-Report Agreement
Comparison of SIS patient and proxy mean scores appears in Table 4. No significant differences between proxy and patient scores were observed in 6 of the 8 domains. Proxies scored significantly lower (worse) on the Strength (41.7 versus 36.6; P<0.0001) and ADL/IADL domains (46.2 versus 43.1; P=0.01), the CPD (39.7 versus 34.9; P<0.0001), and the SIS-16 (47.1 versus 44.7; P=0.03). Nevertheless, the estimated effect size was small (0.21 for the Strength domain). A nonsignificant trend was observed for the Memory and thinking domain (P=0.07). The standardized response mean for the CPD was 0.37 (small/moderate effect). The bias between patient-proxy mean differences was low (from 5.3, Strength, to 0.1, Communication).
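The bias statistics reported above can be illustrated with a small sketch. It assumes that the effect size divides the mean patient-minus-proxy difference by the SD of the patient scores and that the SRM divides it by the SD of the paired differences; the paper does not spell out its denominators, so both are assumptions, as are the function name and the made-up scores.

```python
from math import sqrt
from statistics import mean, stdev


def bias_magnitude(patient, proxy):
    """Return mean difference, effect size, SRM, and paired t statistic."""
    diffs = [p - q for p, q in zip(patient, proxy)]
    mean_diff = mean(diffs)
    effect_size = mean_diff / stdev(patient)  # assumed denominator: patient SD
    srm = mean_diff / stdev(diffs)            # SD of the paired differences
    t_stat = srm * sqrt(len(diffs))           # paired t = mean_diff / (SD_diff / sqrt(n))
    return mean_diff, effect_size, srm, t_stat


# Hypothetical domain scores for four dyads (not study data).
patient_scores = [40, 50, 60, 70]
proxy_scores = [35, 50, 50, 65]
mean_diff, es, srm, t_stat = bias_magnitude(patient_scores, proxy_scores)
```

The last line of the function shows why a small bias can still be statistically significant: the paired t statistic is the SRM multiplied by the square root of the number of pairs, so with 180 dyads even a modest standardized difference yields a large t value.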
The weighted kappa values for the SIS individual items ranged from 0.09 (item 4e) to 0.80 (item 7d). The highest weighted kappa values (moderate/high agreement) were observed for the Physical domains (Strength, Hand function, Mobility and ADL/IADL; kappa values: 0.31 to 0.80). Poor agreement (weighted kappa value <0.2) was observed in the following items: 2c (remember to do things; 0.19), 3h (feel that life is worth living; 0.12), 4a (say the name of someone; 0.17), 4e (participate in a conversation; 0.09), 8b (social activities; 0.13), 8c (quiet recreation; 0.12), 8d (active recreation; 0.11), 8e (role as a family member; 0.13), and 8f (participation in religious activities; 0.16). Of the items with poor/fair kappa values, 32.3% had an exact patient-proxy score agreement higher than 40%.
Intraclass correlation coefficient values for the SIS domains ranged from 0.17 (Emotion) to 0.79 (Hand function). The ICC value for the CPD was 0.83. Less observable (subjective) domains of the SIS (Memory, Communication, Emotion, and Social Participation) had lower ICC values.
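To make the item-level statistics above concrete, the sketch below computes a quadratic-weighted kappa and the exact percentage agreement for one item. It assumes five ordered response categories; the function names and ratings are hypothetical and do not come from the study data.

```python
def quadratic_weighted_kappa(rater_a, rater_b, categories):
    """Weighted kappa with quadratic disagreement weights."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    # Observed joint proportions of (rater_a, rater_b) category pairs.
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    weighted_obs = weighted_exp = 0.0
    for i in range(k):
        for j in range(k):
            w = ((i - j) / (k - 1)) ** 2  # 0 on the diagonal, 1 at the extremes
            weighted_obs += w * observed[i][j]
            weighted_exp += w * row[i] * col[j]
    # weighted_exp is 0 when both raters use a single category (no variability).
    return 1 - weighted_obs / weighted_exp


def exact_agreement(rater_a, rater_b):
    """Share of dyads giving exactly the same response."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


# Hypothetical item responses for five dyads (not study data).
patient_item = [1, 2, 3, 4, 5]
proxy_item = [1, 2, 3, 4, 4]
kappa = quadratic_weighted_kappa(patient_item, proxy_item, [1, 2, 3, 4, 5])
pct_exact = exact_agreement(patient_item, proxy_item)
```

Kappa depends on the spread of responses as well as on how often the raters match, which is why the authors also report exact percentage agreement for the items with low kappa values.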
Validity of the Proxy’s Assessment
Correlation between proxy ratings and stroke functional measures tended to be slightly lower than for patient-based self-assessment. Significant correlations (P<0.0001) were observed between functional status (as measured by the m-RS) and the following SIS proxy-version domains: Mobility (rS=−0.73), ADL/IADL (rS=−0.69), Strength (rS=−0.44), and Hand function (rS=−0.44). Concerning disability, as measured by the BI, the highest association (P<0.0001) was obtained between BI and the following SIS proxy-version domains: Mobility (rS=0.80), ADL/IADL (rS=0.74), Strength (rS=0.52), and Hand Function (rS=0.52). A weak but significant correlation (P=0.01) was observed between the HADS-Depression subscale and the SIS Emotion domain (rS=−0.20), and between the HADS-Depression subscale and the Memory domain (rS=−0.26). The CPD correlated significantly (P<0.0001) with NIHSS (rS=−0.68), BI (rS=0.86), m-RS (rS=−0.77), and the HADS-Depression subscale (rS=−0.32) in stroke patients.
Patient-proxy mean differences broken down by functional status and disability are shown in Table 5. Mean differences on the SIS domains remained similar across levels of disability and functional status; mean differences were significantly higher in m-RS 1 patients on the Strength and Social participation domains. SIS Composite Physical Domain mean scores decreased significantly as the severity of the disease, based on m-RS, increased (Kruskal-Wallis test, P<0.0001), in both patient and proxy assessments (Table 6). Concerning the SIS proxy-version domains, the discriminant validity was poor for the Memory, Communication, and Emotion domains. Strength, Hand function, Mobility, and ADL/IADL mean scores significantly decreased as disability worsened (P<0.0001).
Predictors of Agreement Between Patients and Proxies
Proxies rated female stroke patients significantly worse (lower scores) on the Emotion domain (51.3 versus 55.3; P=0.03). Mean differences between proxies and patients were significantly higher when assessing the ADL (5.6 versus 0.4; P=0.04) and Memory (8.4 versus −3.3; P=0.001) domains in male stroke patients. No association between mean score differences and proxy gender was found. Neither proxy’s age nor patient’s age influenced SIS mean scores.
Proxies rated depressed stroke patients significantly worse. Significant differences in the mean differences were observed for the following SIS domains: Strength (6.9 versus −4.5; P=0.01), Emotion (0.7 versus −10.0; P=0.001), Communication (1.3 versus −10.0; P=0.01), ADL (4.5 versus −3.9; P=0.01), and Social Participation (−0.3 versus −13.5; P=0.02). No association between proxy’s HADS-Depression mean score and proxy-patient mean differences was found.
No variables were identified that influenced the CPD mean difference between patients and proxies in the stepwise multivariate regression analysis. Patient’s age, disability (BI), proxy’s depression (HADS-Depression), and patient’s motor impairment (NIHSS) were independent predictors (adjusted R2=0.65; P<0.0001) for the proxy’s CPD. Patient’s age, stroke severity (NIHSS), and disability (BI) were independent predictors (adjusted R2=0.66; P<0.0001) for the proxy’s SIS-16 mean scores. Disability (BI), education level, and depression (HADS-Depression) were independent predictors (adjusted R2=0.79; P<0.0001) for the patient’s SIS-16 mean scores.
Discussion
Several studies have assessed the validity of proxies, as a substitute for stroke patients, for the assessment of ADL,25 instrumental ADL,26,27 social participation,28 and HRQoL.4–10,29 The results of these studies suggest that the level of agreement between proxies and stroke patients may differ depending on the type of construct measured. Whereas adequate agreement is reported for the assessment of physical abilities and ADL, poor agreement may be observed for the psychosocial domains of the HRQoL measures.
This discrepancy and the systematic variance between raters of HRQoL in stroke have been attributed to specific rater characteristics such as the presence of mood disorders, age, and sex.7,8,30 Other factors that may contribute to disagreement are the specific HRQoL domain under study, random error, and the choice of statistical method to measure agreement.5 Among different types of proxy raters, agreement and reliability of proxy respondents for people with disabilities tend to be best for relatives and lower for health care providers. Some studies have reported that living together and a close relationship appear to increase agreement30; we did not find any significant differences in proxy mean scores by relationship, marital status, or occupation. Caregiver depression has been associated with proxy rating bias.7 This association could not be confirmed in Brazilian stroke proxies, although proxy’s HADS-Depression score was an independent factor for the proxy’s CPD score.
Patient-proxy agreement may be stronger for the more concrete and observable domains than for the subjective and emotional domains of HRQoL measures. Duncan et al8 also reported a tendency of proxies to allocate worse ratings than did the stroke patients themselves on most SIS domains. Sneeuw et al proposed a proxy-patient U-shaped relationship; agreement would be better for very good or very poor health status and worse for moderately impaired stroke survivors.29
Brazilian proxies scored stroke patients as significantly more severely affected than patients scored themselves in the Strength domain and the CPD. Nevertheless, proxy bias toward overrating physical domains of the SIS decreased as the severity of the disease increased, and its magnitude was small and not clinically meaningful. No bias was observed for the SIS-16 scores between patients and proxies. Other studies have shown that the strength of agreement was less among severely affected patients.5,8 Discrepancies with our study may be partly explained by cultural and geographic factors and differences in sample composition.
Proxies overestimated patient’s disability, as measured by the ADL domain of the SIS, whereas patients themselves tended to overestimate their functional state. Significant differences were observed in the CPD proxy-patient mean differences between severely disabled stroke patients (BI<60) and those who were independent in their ADL. Brazilian proxies did not report a lower level of participation in social roles than did stroke patients.
The strength of the agreement was higher for the physical domains of the SIS and lower for the more subjective domains of the measure. ICC-based agreement was good for the more observable attributes of the SIS (Hand Function, Mobility, ADL) and for the SIS-16. ICC-based agreement was poor for the Emotion and Social participation domains. The low weighted kappa values observed in some psychosocial domain items may have been influenced by lack of variability in the scores as well as by lack of agreement.
This proxy-patient agreement study was performed within the proxy-proxy perspective. Many of the studies in the literature are ambiguous about the proxy viewpoint elicited. The extent to which the proxy-proxy perspective is informative may depend on the proxy’s ability to provide complementary information on the HRQoL of stroke patients.3 The validity of the proxy-proxy perspective-based assessment was examined. Correlations of SIS proxy-version mean scores with external anchors such as functional stroke measures (NIHSS, BI, HADS-Depression subscale) provided evidence of proxy-assessment validity for each SIS domain.
Generalizability of the results may be limited by the exclusion of stroke patients who lacked informal caregivers. However, patients without caregivers were significantly more independent in their ADL. It is also possible that some proxies were selected because they were more available and might not have known the patient well enough to accurately rate the patient’s HRQoL. Nevertheless, most proxies were close relatives who were usually taking care of the patient. In addition, some of the findings could be significant by chance alone because of the large number of statistical comparisons performed.
In conclusion, patient and proxy ratings on the SIS Brazilian version are valid; agreement between stroke patients and proxies was adequate for most SIS domains. Proxy raters tended to report more HRQoL problems than patients themselves on the SIS physical domains. Proxy assessment of the subjective SIS domains should be evaluated with caution because the strength of the agreement was low. The use of the SIS-16 proxy-version is also encouraged, although bias may weaken the reliability of proxy reports.
- Received May 15, 2009.
- Accepted May 20, 2009.
References
Sneeuw KCA, Aaronson NK, de Haan RJ, Limburg M. Assessing the quality of life after stroke: the value and limitations of proxy ratings. Stroke. 1997; 28: 1541–1549.
Dorman PJ, Waddell F, Slattery J, Dennis M, Sandercock P. Are proxy assessments of health status after stroke with the EuroQol questionnaire feasible, accurate, and unbiased? Stroke. 1997; 28: 1883–1887.
Pickard AS, Johnson JA, Feeny DH, Shuaib A, Carriere KC, Nasser AM. Agreement between patient and proxy assessments of health-related quality of life after stroke using the EQ-5D and Health Utilities Index. Stroke. 2004; 35: 607–612.
Duncan PW, Lai SM, Tyler D, Perera S, Reker DM, Studenski S. Evaluation of proxy responses to the Stroke Impact Scale. Stroke. 2002; 33: 2593–2599.
Williams LS, Bakas T, Brizendine E, Plue L, Tu W, Hendrie H, Kroenke K. How valid are family proxy assessments of stroke patients’ health-related quality of life? Stroke. 2006; 37: 2081–2085.
Hilari K, Owen S, Farrelly SJ. Proxy and self-report agreement on the Stroke and Aphasia Quality of Life Scale-39. J Neurol Neurosurg Psychiatry. 2007; 78: 1072–1075.
Foulkes MA, Wolf PA, Price TR, Mohr JP, Hier DB. The Stroke Data Bank: design, methods, and baseline characteristics. Stroke. 1988; 19: 547–554.
Brott T, Adams HP Jr, Olinger CP, Marler JR, Barsan WG, Biller J, Spilker J, Holleran J, Eberle R, Hertzberg V. Measurements of acute cerebral infarction: a clinical examination scale. Stroke. 1989; 20: 864–870.
Bonita R, Beaglehole R. Modification of Rankin scale: recovery of motor function after stroke. Stroke. 1988; 19: 1497–1500.
Lai S, Studenski S, Duncan PW, Perera S. Persisting consequences of stroke measured by the Stroke Impact Scale. Stroke. 2002; 33: 1840–1844.
Duncan PW, Lai SM, Bode RK, Perera S, DeRosa J. Stroke Impact Scale-16: a brief assessment of physical function. Neurology. 2003; 60: 291–296.
Carod-Artal FJ, Ferreira Coral L, Stieven Trizotto D, Menezes Moreira C. The Stroke Impact Scale 3.0: evaluation of acceptability, reliability and validity of the Brazilian version. Stroke. 2008; 39: 2477–2484.
Nunnally JC, Bernstein IH. Psychometric theory. New York: McGraw-Hill Inc, 1994.
Tooth LR, McKenna KT, Smith M. Further evidence for the agreement between patients with stroke and their proxies on the Frenchay Activities Index. Clin Rehab. 2003; 17: 656–665.
Chen MH, Hsieh CL, Mao HF, Huang SL. Differences between patient and proxy reports in the assessment of disability after stroke. Clin Rehab. 2007; 21: 351–356.
Horowitz A, Goodman CR, Reinhardt JP. Congruence between disabled elders and their primary caregivers. Gerontologist. 2004; 44: 532–542.