(Stroke. 2004;35:1692.)
© 2004 American Heart Association, Inc.
Original Contributions |
From the Service de Reeducation Neurologique (C.B., J.P.), Centre Hospitalo-Universitaire de Nîmes, Le Grau du Roi; Service de Psychiatrie A (B.C.), Centre Hospitalo-Universitaire de Nîmes, Hôpital Caremeau, Nîmes; and Service de Reeducation Neurologique (D.P.), Centre Hospitalo-Universitaire de Reeducation, Dijon, France.
Correspondence to Dr Charles Benaim, Service de Reeducation Neurologique, Centre Hospitalo-Universitaire de Nimes, Centre Helio-Marin, 30240 Le Grau du Roi, France. E-mail charles.benaim{at}chu-nimes.fr
| Abstract |
|---|
Methods: Six experts selected an initial sample of behavioral items from existing depression rating scales. Stroke patients (aphasic and nonaphasic) were rated on these items by the rehabilitation staff, assessed with the Hamilton Depression Rating Scale (HDRS; nonaphasic patients only), and rated for depression severity on Visual Analog Scales (VAS) by a psychiatrist and by the rehabilitation staff. A second item selection was conducted after a regression algorithm was run with the VAS as dependent variables (criterion validity) and after the factorial structure of the items was analyzed with a principal component analysis (factorial validity). Construct validity was evaluated with respect to the other depression assessments. A threshold for the diagnosis of depression was computed with respect to the psychiatrist's diagnosis. Interrater and test-retest reliability were assessed in 2 additional groups of aphasic patients.
Results: Eighty patients participated in the study (59 aphasic). Fifteen behavioral items from existing depression rating scales were selected, and 9 were retained after the validation process. ADRS correlated highly with VAS and HDRS (r=0.60 to 0.78, P=10⁻⁴ to 10⁻⁶). With respect to the psychiatrist's diagnosis, the sensitivity and specificity of ADRS were 0.83 and 0.71, respectively, when the threshold was set at 9/32. Its factorial structure was comparable to the HDRS structure. Interrater and test-retest reliability were high (average κ coefficient of the 9 items=0.69).
Conclusions: ADRS is a valid, reliable, sensitive, and specific tool for the evaluation of depression in aphasic patients during the subacute phase of stroke.
Key Words: depression aphasia stroke assessment reproducibility of results
| Introduction |
|---|
To develop the Aphasic Depression Rating Scale (ADRS), we performed 3 consecutive studies corresponding to the usual steps of clinical scale validation.7
| Study 1: Item Preselection |
|---|
Methods
Eighteen members of the neurorehabilitation team (1 psychiatrist, 1 neurologist, 1 physiatrist, 1 psychologist, 3 speech therapists, 5 physiotherapists, 2 hand therapists, 2 nurses, 2 auxiliary nurses) were interviewed concerning typical depressed behavior in aphasic patients, and the most frequently reported behaviors were noted. Six experts (1 psychiatrist, 1 neurologist, 1 physiatrist, 1 psychologist, 2 speech therapists) analyzed 3 existing depression scales that contain items describing observable behavior: the Hamilton Depression Rating Scale (HDRS),1 the Montgomery & Asberg Depression Rating Scale (MADRS),2 and the Salpetriere Retardation Rating Scale (SRRS).8 Only items that (1) could be completed without interviewing patients, (2) described depressed behavior reported by the team, and (3) were selected by at least 4 experts were retained.
Results
Fifteen items were selected by experts. Minor modifications were made for adaptation to hemiplegic patients. For instance, "slowness and paucity of movements: trunk and limbs" was replaced by "slowness and paucity of movements: trunk and nonaffected limbs."
| Study 2: Final Item Selection |
|---|
Methods
After having given informed consent in accordance with the guidelines of the local ethics committee, every stroke patient admitted to our neurorehabilitation unit for a given period of time was examined by the same psychiatrist, who graded the severity of depression symptoms from 0 (no symptom of depression) to 100 (extremely severe depression); we termed this score dep-psy. Patients were also assessed during the weekly rehabilitation staff meeting. For each patient, the members of the rehabilitation team (1 physiatrist, 1 resident, 1 psychologist, 1 speech therapist, 1 physiotherapist, 1 hand therapist, 1 nurse, and 1 auxiliary nurse) who were best acquainted with him or her (1) completed the 15 items selected in study 1 and (2) assessed the severity of depression on a basic visual analog scale of 0 to 100 (1 for each member). An average score was then calculated for the team, termed dep-rehab. Members of the team were unaware of the psychiatrist's rating.
Criterion Validity
In the process of validating the clinical scale, the correlation of each item with a gold standard, if any, is used to select the best items and thereby refine draft versions of the scale.7 There is no gold standard for assessing depression in an aphasic patient. In our study the psychiatrist's assessment was important because the psychiatrist is an expert in mood disorders; the rehabilitation team's assessment was also important because members of the team are well acquainted with the patient. We therefore chose both dep-psy and dep-rehab as gold standards, and we used the McHenry variable selection algorithm, which allows specification of 2 dependent variables (instead of 1 in classic regression methods) and usually yields the same subset as the "all possible regressions" procedure.9 The algorithm searches for the best 1-variable model and produces an index, the Wilks Λ, which is the multivariate extension of R² and behaves like 1−R². It then searches for the best 2-variable model and calculates the decrease in Wilks Λ. It subsequently searches for the best 3-variable model, and so on. If the decrease in Wilks Λ is small when 1 variable is added to the n-variable model, then the n-variable model is selected as the final model.
Factorial Validity
A principal component analysis (PCA) was performed to analyze the structure of the 15 items selected in Study 1; subsequently, some items were eliminated to avoid redundancies. For each set of redundant items, we retained the item (1) that was selected in the criterion validity section, or (2) that correlated with the greatest number of PCA factors, or (3) whose completion by the staff was the easiest. From that point on, the questionnaire was called ADRS.
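As a rough illustration of the factorial step, the sketch below runs a PCA on the correlation matrix of an item-score array and returns item loadings and the share of variance carried by each factor. The source does not state whether the correlation or covariance matrix was used, nor whether any rotation was applied, so those choices, the function name, and the simulated scores are all assumptions.

```python
import numpy as np

def pca_loadings(scores, n_factors):
    """PCA on the correlation matrix of an (n_patients x n_items) array.

    Returns the loadings (items x factors) and the fraction of total
    variance carried by each retained factor.
    """
    corr = np.corrcoef(scores, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # eigenvalues ascending
    order = np.argsort(eigvals)[::-1][:n_factors]    # keep the largest ones
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(eigvals)            # item-factor correlations
    explained = eigvals / corr.shape[0]
    return loadings, explained

# Simulated scores (hypothetical): 50 patients, 15 items sharing one latent factor.
rng = np.random.default_rng(1)
latent = rng.normal(size=(50, 1))
scores = latent + 0.8 * rng.normal(size=(50, 15))
loadings, explained = pca_loadings(scores, n_factors=4)
```

Items that load strongly on the same factor and correlate highly with one another are candidates for the redundancy rules listed above.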
Construct Validity
The correlation between ADRS and other constructs was measured with the product-moment correlation coefficient. We considered 3 constructs: dep-psy, dep-rehab (for all patients), and HDRS (for patients able to answer all HDRS questions, ie, all but severely aphasic patients).
Estimating a Threshold for the Diagnosis of Depression
A threshold was established after comparison of ADRS scores with the diagnosis made by the psychiatrist (depression or no depression) in the patients.
Results
Of the 52 patients admitted to the unit during the study period, 2 refused to take part; no patient was excluded. Fifty patients participated in the study (20 women and 30 men); mean±SD age was 60±13 years (range, 28 to 80 years); 35 were left-hemisphere stroke patients (LHS), and 15 were right-hemisphere stroke patients (RHS); 29 LHS were aphasic, and 6 LHS were not. Twenty-five patients could not complete nonbehavioral items of HDRS: 21 LHS had severe aphasia, and 3 LHS and 1 RHS had other cognitive deficiencies. Mean dep-psy was 29.6/100 (±17.4), mean dep-rehab was 37.1/100 (±17.7), and mean HDRS was 10.8/52 (±6.6). Depression was diagnosed in 29 of 50 patients (58%) by the psychiatrist and in 17 of 25 patients (68%) by the HDRS (score >7/52). The time since admission was 60±45 days (range, 4 to 174 days).
Criterion Validity
The Wilks Λ value was 0.121 for the 7-item model and 0.116 for the 8-item model, which is a minor decrease (0.005). We therefore selected the 7-item model: Apparent Sadness; Insomnia: Middle; Anxiety: Psychic; Somatic Symptoms: Gastrointestinal; Mimic: Slowness of Facial Mobility; Loss of Weight; and Anxiety: Somatic.
Factorial Validity
Four axes accounted for 71% of the information (total variance). The 4-dimensional structure can be described as follows. For factor 1 (F1), all factor loadings were >0.50, except for Insomnia: Early (0.37), Insomnia: Middle (0.39), and Agitation (0.32). For factor 2 (F2), factor loadings of Mimic: Slowness of Facial Mobility, Nonaffected Side and Slowness and Paucity of Movements: Trunk, Nonaffected Limbs were the highest (0.59, 0.58). The correlation between these items was very high (r=0.77, P<10⁻⁶). Factor loadings of Anxiety: Psychic and Anxiety: Somatic were also relatively high (0.49, 0.45, respectively) on F2. For factor 3 (F3), the 3 insomnia items had the highest factor loadings (0.59 to 0.68) and were closely correlated with one another (r=0.49 to 0.84, P<10⁻³ to P<10⁻⁶). For factor 4 (F4), factor loadings of Somatic Symptoms: General and Somatic Symptoms: Gastrointestinal were the highest (0.46, 0.40, respectively). The correlation between these items was very high (r=0.63, P=10⁻⁶). Compared with the other items, Anxiety: Somatic also had a relatively high factor loading on F4 (0.34).
The other item redundancies were Agitation and Hypochondriasis (r=0.53, P<10⁻⁴) and Fatigability and Lassitude (r=0.74, P<10⁻⁶).
These results suggest that F1 is a general factor of depressive illness, measuring the severity of the symptoms, F3 is an insomnia factor, and F2 and F4 are both anxiety axes, with F2 being a factor of retardation and F4 representing somatic symptoms.
To reduce redundancies among items, 6 items were eliminated according to predefined rules (see Methods, Factorial Validity). The 9 remaining items made up the ADRS (see Appendix).
Because the 3 depression rating scales from which ADRS items are derived have similar score ranges (HDRS: 0 to 52; MADRS: 0 to 60; SRRS: 0 to 60), each of the 9 items has approximately the same weight in the total score with respect to the 8 others as in its original scale. For example, Apparent Sadness is a 0 to 6 ordinal item of MADRS, and therefore its weight is 0.10 (6/60); Mimic: Slowness of Facial Mobility is a 0 to 4 ordinal item of SRRS, and therefore its weight is 0.07 (4/60); the weight ratio for these 2 items is 1.5 (6/4). The weight of an item reflects the importance it represents for the assessment of depression according to the authors of the original scale. In ADRS, which is a scale from 0 to 32, the weights of all items are increased by 60/32 (MADRS and SRRS items) or 52/32 (HDRS items). This is due to the smaller number of items in ADRS, which contains only 9 behavioral items. The preceding weights then become 0.19 (6/32) and 0.13 (4/32), but the ratio between the 2 items, which is the most important, remains the same. As a result, the weights of HDRS items are slightly underestimated in ADRS because their weights are increased by 52/32, whereas items from MADRS and SRRS are increased by 60/32. Because the ratio 60/52 is close to 1 (1.15), we decided not to change HDRS item scales.
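The weight arithmetic in the preceding paragraph can be checked with a short script. The scale maxima and item ranges are taken from the text; the variable and function names are illustrative only.

```python
# Maximum total score of each source scale and of ADRS (values from the text).
SOURCE_MAX = {"MADRS": 60, "SRRS": 60, "HDRS": 52}
ADRS_MAX = 32

def weight(item_max, scale_max):
    """Share of the total score that a single item can contribute."""
    return item_max / scale_max

# Apparent Sadness is scored 0 to 6 (MADRS); the Mimic item 0 to 4 (SRRS).
sadness_orig = weight(6, SOURCE_MAX["MADRS"])    # 6/60
mimic_orig = weight(4, SOURCE_MAX["SRRS"])       # 4/60
sadness_adrs = weight(6, ADRS_MAX)               # 6/32
mimic_adrs = weight(4, ADRS_MAX)                 # 4/32

# The ratio between the two items is unchanged by the move to ADRS.
ratio_orig = sadness_orig / mimic_orig
ratio_adrs = sadness_adrs / mimic_adrs

# Factor by which each source scale's item weights grow in ADRS.
inflation = {name: top / ADRS_MAX for name, top in SOURCE_MAX.items()}
relative_gap = inflation["MADRS"] / inflation["HDRS"]   # equals 60/52
```

The last quantity is the 60/52 ratio that the authors judged close enough to 1 to leave the HDRS item scales unchanged.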
Construct Validity of ADRS
The correlations between ADRS and dep-psy, dep-rehab (50 patients), and HDRS (25 patients) were high (r=0.60, P<10⁻⁴; r=0.78, P<10⁻⁶; r=0.77, P<10⁻⁵, respectively). Correlation coefficients in RHS only and in LHS only were also high (r=0.58, P<0.03; r=0.70, P<10⁻²; r=0.84, P<10⁻³ for RHS; and r=0.60, P<10⁻³; r=0.86, P<10⁻⁶; r=0.64, P<0.04 for LHS).
When only the 25 patients able to complete HDRS were considered, ADRS correlated better with dep-psy and dep-rehab (r=0.59, P<10⁻²; r=0.85, P<10⁻⁶, respectively) than did HDRS (r=0.40, P<0.05; r=0.59, P<10⁻², respectively).
Estimating a Threshold for the Diagnosis of Depression
Mean ADRS was 10.1/32 (±5.6). Thirty patients (60%) had a score ≥9/32. With this value considered as a threshold, the sensitivity of ADRS compared with the diagnosis made by the psychiatrist was 0.83, and the specificity was 0.71. The value 9/32 was preferred to 8/32, which provided a poor specificity (0.52), and to 10/32, which provided a poor sensitivity (0.72). The sensitivity and the specificity of HDRS were poor (0.59 and 0.63, respectively).
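The threshold comparison can be illustrated with a small function that computes sensitivity and specificity for a given cutoff. The individual scores below are hypothetical: they were chosen only so that the counts reproduce the figures reported above (29 depressed and 21 nondepressed patients according to the psychiatrist) and are not the study data.

```python
def sensitivity_specificity(scores, depressed, threshold):
    """Treat score >= threshold as a positive screen and compare it with
    the reference diagnosis (True = depressed)."""
    pairs = list(zip(scores, depressed))
    tp = sum(s >= threshold and d for s, d in pairs)
    fn = sum(s < threshold and d for s, d in pairs)
    tn = sum(s < threshold and not d for s, d in pairs)
    fp = sum(s >= threshold and not d for s, d in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical ADRS scores, built to match the reported proportions.
depressed_scores = [12] * 21 + [9] * 3 + [5] * 5       # 29 depressed patients
healthy_scores = [11] * 6 + [8] * 4 + [4] * 11         # 21 nondepressed patients
scores = depressed_scores + healthy_scores
diagnosis = [True] * 29 + [False] * 21

sens_9, spec_9 = sensitivity_specificity(scores, diagnosis, threshold=9)
_, spec_8 = sensitivity_specificity(scores, diagnosis, threshold=8)
sens_10, _ = sensitivity_specificity(scores, diagnosis, threshold=10)
```

Under these assumed scores, the cutoff of 9 corresponds to 24 of 29 depressed patients and 15 of 21 nondepressed patients being classified correctly, which matches the reported 0.83 and 0.71 after rounding.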
| Study 3: Reliability |
|---|
Methods
The interrater reliability was assessed by comparing the ADRS scores of 15 new aphasic stroke patients, completed by 2 different rehabilitation teams within 24 hours. The test-retest reliability was assessed by comparing the ADRS scores of 15 additional patients assessed at a 2-week interval by the same team.
For each item, agreement between the teams' ratings was assessed with κ coefficients. For ADRS global scores, the association between the teams' ratings was assessed with correlation coefficients. Because of the relatively small number of observations at this stage, we used Spearman rank correlation coefficients.
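For readers unfamiliar with the agreement statistic, the following is a minimal sketch of an unweighted Cohen κ for two sets of ratings of the same item. The source does not say whether a weighted or unweighted κ was used, so the unweighted form is an assumption, as are the function name and the example ratings.

```python
from collections import Counter

def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa: agreement between two raters beyond chance."""
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from the product of each rater's marginal frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0          # both raters used one identical category throughout
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of one item for 6 patients by two teams.
team_1 = [0, 1, 2, 2, 1, 0]
team_2 = [0, 1, 2, 1, 1, 0]
kappa = cohen_kappa(team_1, team_2)
```

A value of 1 indicates perfect agreement, and a value of 0 indicates agreement no better than chance.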
Results
The average κ coefficient over the 9 items was 0.69 (range, 0.37 to 1) for the interrater reliability and 0.58 (range, 0.33 to 1) for the test-retest reliability. Items whose completion does not require communicating with the patient at all (either verbal or nonverbal communication) were the most reliable (κ>0.80): Loss of Weight, Apparent Sadness, and Insomnia: Middle. On the other hand, the item Hypochondriasis was the least reliable (κ=0.35, 0.37).
When the global scores of ADRS were calculated, interrater and test-retest reliabilities were very high (both r=0.89, P<10⁻⁴).
| Discussion |
|---|
The preceding results suggest that ADRS is a valid and reliable instrument. It has shown good content, criterion, factorial, and construct validity; high test-retest and interrater reliability; and high sensitivity and specificity. A score ≥9/32 strongly suggests the presence of depression. The factorial structure of ADRS is similar to the structure of HDRS.1 In a group of 172 healthy subjects, Hamilton1 described 3 main factors. The first measured the severity of the depression symptoms, which corresponds to F1 in the present study. The second was a bipolar factor with the symptoms of anxiety (psychic and somatic) and agitation counterbalancing retardation, suicide, depression, and loss of insight, which corresponds to F2 and F4. The third factor contained many items, including insomnia items, which correspond to F3. This provides support for the factorial validity of ADRS.
Despite these encouraging results, some remaining issues require attention. First, even if ADRS has high construct validity, the best way to measure internal mood state in patients is to ask them.10 The Visual Analog Mood Scales (VAMS) have been validated for self-administration in stroke patients.3 In their validation study, however, patients had to be able to complete the Profile of Mood States, and those with severely impaired language comprehension were thus excluded. The authors of VAMS indicated that in their clinical experience VAMS can be completed by most patients, even those with very limited communication and cognitive abilities. They also wrote that VAMS cannot be used to diagnose mood disorders but may be beneficial as part of a more extensive clinical evaluation. ADRS can be completed for all patients and may be used to diagnose depression. The combination of ADRS and VAMS could therefore be very useful for assessing mood disorders in aphasic patients.
A second issue relates to the possibility of using ADRS in nonaphasic stroke patients. Fifteen RHS and 10 LHS without severe aphasia participated in the study because of the need to include the HDRS to assess the construct validity of ADRS. ADRS may then theoretically be used in all stroke patients. We should, however, closely examine the validity in RHS patients, in whom reduced facial emotion expression or euphoria may influence an observer's interpretation of the patient's internal mood state.6 In our study, the correlations between ADRS and dep-psy, dep-rehab, and HDRS were very high in RHS, providing support for high validity in those patients, but this must be confirmed in a more extensive RHS population.
The fact that ADRS correlated better with dep-psy and dep-rehab than did HDRS in communicant stroke patients suggests that it is a better tool for examining these patients. This is probably due to the fact that ADRS items have been selected to mathematically explain dep-psy and dep-rehab. Nevertheless, it is evident that among the HDRS items that are not included in ADRS, Loss of Libido and Work and Interests are not suitable for subacute stroke inpatients. On the other hand, the items Depression, Guilt, and Loss of Insight should be pertinent in communicant stroke patients. A comparison of classic depression rating scales and ADRS remains to be made in these patients. Another explanation for this better correlation could be that both assessments (dep-rehab and ADRS) are based on patient observation, while HDRS relies on interviews.
Another important issue concerns the interpretation of symptoms described in ADRS items. First, it is difficult for the clinician to determine whether somatic symptoms (including insomnia, loss of appetite, constipation, fatigability) are due to depression or to stroke. It is probable that in many cases both depression and somatic symptoms are direct consequences of the stroke. In this study we have only demonstrated that the presence of these symptoms is often associated with depression (whatever the reason) and that ADRS displays a good sensitivity and a good specificity, which is most important. Second, although items of ADRS are completed without interviewing patients, it is of course easier with patients who are able to describe their symptoms or express their feelings with words. Although in our study this phenomenon had no negative effect on ADRS properties (the construct validity was good in both RHS and LHS patients), it should be studied in an extensive group of profoundly dysphasic patients.
A final issue pertains to the psychometric qualities of ADRS. The sensitivity of ADRS to fine clinical changes, as detected by a psychiatrist or the rehabilitation staff, should be assessed in a new group of poststroke patients. In conclusion, this preliminary study suggests that ADRS is a valid, reliable, sensitive, and specific rating scale for the evaluation of depression in aphasic poststroke inpatients hospitalized in a neurorehabilitation unit. It can be used for the diagnosis and the follow-up of depression, but this should be confirmed by other studies. A few methodological issues remain to be addressed, particularly the sensitivity to change.
| Appendix |
|---|
Item 2. Anxiety: Psychic
Item 3. Anxiety: Somatic
Symptoms can be gastrointestinal (dry mouth, flatulence, indigestion, diarrhea, cramps, belching); cardiovascular (palpitations, headaches); respiratory (hyperventilation, sighing). Other symptoms include urinary frequency, sweating.
Item 4. Somatic Symptoms: Gastrointestinal
Item 5. Hypochondriasis
Item 6. Loss of Weight
Rating is by measurement and does not take into account loss of weight that occurred during the acute poststroke phase.
Item 7. Apparent Sadness
Item represents despondency, gloom, and despair (more than just ordinary transient low spirits) reflected in speech (not available in case of severe aphasia), facial expression, and posture. Rating is by depth and inability to brighten mood.
Item 8. Mimic: Slowness of Facial Mobility (concerns only the nonaffected side)
Item 9. Fatigability (takes into account motor deficiency, if any)
Received October 7, 2003; revision received February 24, 2004; accepted March 25, 2004.
| References |
|---|
2. Montgomery SA, Asberg M. A new depression scale designed to be sensitive to change. Br J Psychiatry. 1979; 134: 382–389.
3. Arruda JE, Stern RA, Somerville JA. Measurement of mood states in stroke patients: validation of the visual analog mood scales. Arch Phys Med Rehabil. 1999; 80: 676–680.
4. Price CIM, Curless RH, Rodgers H. Can stroke patients use visual analogue scales? Stroke. 1999; 30: 1357–1361.
5. Lincoln NB, Sutcliffe LM, Unsworth G. Validation of the Stroke Aphasic Depression Questionnaire (SADQ) for use with patients in hospital. Clin Neuropsychol Assess. 2000; 1: 88–96.
6. Gainotti G. Emotional behavior and hemispheric side of the lesion. Cortex. 1972; 8: 41–55.
7. McDowell I, Newell C. Measuring Health: A Guide to Rating Scales and Questionnaires. New York, NY: Oxford University Press; 1987.
8. Dantchev N, Widlocher D. The measurement of retardation in depression. J Clin Psychiatry. 1998; 59: 19–25.
9. McHenry C. Multivariate subset selection. J R Stat Soc C. 1978; 27: 291–296.
10. Stern RA. Assessment of mood states in aphasia. Semin Speech Lang. 1999; 20: 33–49.