Stroke. 1997;28:1174-1180

© 1997 American Heart Association, Inc.


Articles

Measurement Properties of the NIH Stroke Scale During Acute Rehabilitation

Allen W. Heinemann, PhD; Richard L. Harvey, MD; John R. McGuire, MD; Dinora Ingberman, MD; Linda Lovell, BA; Patrick Semik, BA; Elliot J. Roth, MD

From the Rehabilitation Institute of Chicago (A.W.H., R.L.H., J.R.M., D.I., L.L., P.S., E.J.R.) and the Department of Physical Medicine and Rehabilitation, Northwestern University Medical School (A.W.H., R.L.H., J.R.M., D.I., E.J.R.), Chicago, Ill.


*    Abstract
 
Background and Purpose The scale of stroke impairment characteristics by Brott and associates, the National Institutes of Health (NIH) Stroke Scale, has been used widely in various studies of stroke outcome; however, the measurement properties of the items applied to patients during medical rehabilitation have not been evaluated thoroughly. This study evaluated the extent to which scale items cohere to define a unidimensional construct and have a useful range for application to patients during medical rehabilitation.

Methods Rating scale (or Rasch) analysis of the 15 NIH Stroke Scale items was conducted using the BIGSTEPS computer program to evaluate (1) the range of impairment assessed by the items, (2) the items' coherence with an underlying construct of impairment, and (3) range of impairment measured in rehabilitation patients. We sought to maximize the range of impairment measured by conducting analyses recursively; at each subsequent step, the worst fitting item was deleted or rescored. The sample comprised 1291 admission and discharge records from 693 rehabilitation inpatients with stroke.

Results Thirteen items arrayed the sample across a sufficient range of impairment. The limb ataxia item fit poorly and was deleted; lower ratings for this item were associated with higher scores on the total scale. Pupillary response was also deleted because ratings reflected poor congruence with the total score. Best language was rescored because intermediate ratings were inconsistently related to the total score. Patients with hemorrhagic strokes had poorer fitting measures than did patients with ischemic strokes.

Conclusions The items in a revised NIH Stroke Scale worked well together to define the severity of impairment resulting from stroke that is observed during medical rehabilitation. Directions regarding limb ataxia should be modified to indicate untestability due to hemiplegia.


Key Words: disability evaluation • rehabilitation • stroke assessment


*    Introduction
 
A reliable and valid measure of the impairment resulting from stroke is needed in acute care and medical rehabilitation to describe the consequences of neurological injury, to monitor the effects of treatment and natural recovery, and to understand how reductions in disability are related to improvements in impairment. One instrument developed for this purpose is the NIH Stroke Scale, described by Brott and associates.1 This stroke scale has clinical appeal and some evidence of reliability and validity when used in acute medical settings. Brott and associates described the rationale for defining the items, reported the extent to which the items were scoreable during acute care, and provided evidence of construct validity by reporting correlations between summed scores and lesion volume. Modest interrater reliability was reported among a neurologist, neurology house officer, neurology nurse, or emergency department nurse (κ=.69); limited evidence of interrater reliability was reported by Goldstein and associates,2 who found moderate to substantial agreement for 9 of 13 items in 20 cases. Construct validity was supported by modest correlations between raw score and volume of lesion at 1 week from CT scan (r=.68) and clinical outcome at 3 months (r=.79).1 Sensitivity to change was documented by Rothrock et al3; they reported a 24% rate of spontaneous improvement to the point of no or only mild deficits in patients with ischemic stroke who did not receive rehabilitation, although they did not describe item-specific rates of improvement. Lyden and associates4 found that video training of raters was effective in achieving moderate to excellent agreement on all but two items, facial paresis and ataxia, for which they recommended item revisions. Scale limitations are reflected in the finding that many patients could not be tested on some items and that normal scores were obtained on admission by many patients.

A valid measure of impairment severity resulting from stroke would allow clinicians and researchers to quantify both the extent of neurological recovery that occurs during acute management and medical rehabilitation and the relationships between disease, impairment, disability, and handicap. The WHO5 described a model of disablement that draws distinctions between various aspects of disease consequences, including effects on individuals, their families, and society. Four illness realms defined in this model are (1) underlying diagnosis or disease, (2) loss or abnormality of physical or psychological capabilities or impairment, (3) restriction in activities of daily living or disability, and (4) social disadvantage due to limited ability to fulfill a role that is normal for that person or handicap.6 The NIH Stroke Scale focuses on impairment severity. A scale with good measurement properties should contain items that cover a wide range of impairment levels, which when combined are sensitive to improvements over time or due to treatment. The clinical basis for item definition appears clear and is based on neurological examination: sensory, motor, reflex, and language functions are often disrupted after stroke. However, because certain impairments have a greater impact on disability than others, patients selected for inpatient rehabilitation may require a different or modified measurement tool than that used for patients in acute settings.

While the NIH Stroke Scale appears to be useful for constructing a measure of impairment, its utility and validity in medical rehabilitation have not been evaluated extensively. The high proportion of acute-care cases that could not be scored on some items and the apparent insensitivity of the items to neurological recovery in the original sample of Brott et al1 require that the measurement properties be evaluated more thoroughly. Also, the sum of item scores is not an interval-level measure, although the items may form the basis of such a measure. The goal of this study was to evaluate the clinical utility and validity of the scale for patients during medical rehabilitation using a psychometric approach called rating scale (or Rasch) analysis.


*    Subjects and Methods
 
Sample
The sample was composed of all patients with a primary diagnosis of stroke who were admitted consecutively to a freestanding, medical school–affiliated, urban rehabilitation hospital between December 1993 and December 1995. The hospital is designated a Rehabilitation Research and Training Center on Enhancing the Quality of Life of Stroke Survivors by the National Institute on Disability and Rehabilitation Research. As part of the center's activities, patients are routinely assessed on a variety of medical, functional, psychological, and social scales during and after their rehabilitation. Ratings of medical conditions and impairment were made at admission and discharge by physicians responsible for patients' care. The physicians viewed a videotape demonstrating instrument administration procedures before providing patient ratings.

Instrument
The NIH Stroke Scale1 consists of 15 items that assess the severity of impairment in LOC, ability to respond to questions and to obey simple commands, pupillary response, deviation of gaze, extent of hemianopsia, facial palsy, resistance to gravity in the weaker limb, plantar reflex, limb ataxia, sensory loss, visual neglect, dysarthria, and aphasia severity. Each item is rated on a 3- or 4-point ordinal scale. It was intended to assist in the examination of patients with acute cerebral infarction. Table 1 lists the items and reproduces Brott's original summary of patient testability and incidence of impairment for each item. Many of the items appear to have limited utility, given the high proportion of subjects who were rated as normal at admission (LOC), who were untestable (limb ataxia), or who were rated more poorly 1 week later (sensory).


Table 1. NIH Stroke Scale Items (n=693)

Statistical Analysis
The raw score obtained by summing NIH Stroke Scale item responses is ordinal in nature, which precludes its use in parametric statistical comparisons because these raw data only allow rank ordering of scores. A measurement procedure that can be used to develop reliable, valid, and interval-scaled measures from ordinal scores is rating scale or Rasch analysis,7 named after the Danish mathematician whose work in the 1950s and 1960s has been widely applied in educational testing and more recently in rehabilitation outcome measurement. Interval (also called linear) measures possess the advantage of having equal intervals between units of the scale. When distributed in a reasonably normal fashion, measures from an interval scale can be confidently subjected to parametric statistical analyses that relate independent and dependent variables. Transforming ordinal raw scores to interval measures allows one to quantify individuals' impairments along an equal-interval continuum of severity, make quantitative comparisons within an individual across time, or compare severity levels between individuals or across groups of individuals.

Rasch analysis8 helps evaluate to what extent patients' responses to NIH Stroke Scale items are dominated by the one dimension it purports to measure (severity of stroke impairment). This procedure requires viewing impairment items as forming a continuum of tasks that range from easy to perform to difficult to perform. If the tasks form a single construct, one would expect patients of any impairment level to be more able to perform easy tasks and less able to perform difficult tasks. Thus, when patients are able (or unable) to perform a task that they would (or would not) have been expected to perform on the basis of their overall impairment level, their responses misfit the model. While individual patients do not always respond as expected on a given series of tasks, the finding that a substantial proportion respond unexpectedly to a task provides an indication that the task does not "fit" with the remaining tasks in forming a unidimensional construct. It is often the case that an item set contains "noisy" items or that the definitions of scale categories are vague, which contributes to imprecise measurement. Strategies for fine-tuning item sets include rewording or deleting items, rescoring rating scales, or segregating patients into more homogeneous subgroups. The derived interval measure becomes useful in the description and evaluation of patients' stroke severity when the evidence for unidimensionality is compelling. Attaining a fine-tuned item set allows each patient to be characterized by a single interval-level impairment measure and each item by an estimate of its difficulty (called its calibration).
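The expectation described above can be illustrated with the dichotomous form of the Rasch model, in which the probability of showing a sign of impairment depends only on the difference between the patient's measure and the item's calibration, both expressed in logits. The following is a minimal sketch under that simplification, not the BIGSTEPS implementation; the function names and the example values are assumptions chosen for illustration.

```python
import math

def rasch_probability(person_measure, item_difficulty):
    """Probability that a patient is rated as impaired on a dichotomous item.

    Both arguments are in logits; higher person measures mean more impairment,
    and higher item difficulties mean the sign is rarer.
    """
    return 1.0 / (1.0 + math.exp(-(person_measure - item_difficulty)))

def standardized_residual(observed, person_measure, item_difficulty):
    """How surprising an observed 0/1 rating is, in standard deviation units."""
    p = rasch_probability(person_measure, item_difficulty)
    return (observed - p) / math.sqrt(p * (1.0 - p))

# Hypothetical values: a mildly impaired patient and a sign that is rare.
mild_patient = -2.0
rare_sign = 2.0

p_rare = rasch_probability(mild_patient, rare_sign)
z_surprise = standardized_residual(1, mild_patient, rare_sign)

# p_rare is about 0.018, so a rating of "impaired" here is unexpected
# (standardized residual of about 7.4).
print(round(p_rare, 3), round(z_surprise, 2))
```

A single large residual of this kind describes one patient; an item misfits when many patients produce such residuals on it.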

Several criteria are used to judge and improve the adequacy of a measure. These criteria include (1) person separation (the range of impairments represented by the patients in the sample) and item separation (the range of impairments covered by the measure), (2) item fit (the extent to which the sample as a whole responds unexpectedly to specific items) and person fit (the extent to which individuals or diagnostic subgroups respond idiosyncratically to the item set), and (3) scale structure (the extent to which raters are using the steps in the scale correctly and consistently). The BIGSTEPS program9 provides these statistics. The range of impairment represented in a given sample is summarized with a person separation index, defined as the ratio of the true spread of the measures to their measurement error. The index indicates the spread of a given sample of patients in units of the test error in their measures. A clinically useful set of items should define at least three strata of patients (eg, "high," "moderate," and "low" levels of impairment), which are reflected in a separation index of 2.0. A related statistic, called item separation, indicates the item set's potential range of measurement; larger values indicate a potentially greater range of impairment that the item set can measure.
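The link between a separation index of 2.0 and three strata can be sketched numerically. The conversion used below, strata = (4G + 1)/3, is the one commonly given in the Rasch literature; the article does not state the formula, so it is an assumption here, as are the function names. The separation values 1.88 and 2.05 are the person separations reported in the Results.

```python
import math

def true_sd(observed_sd, rmse):
    """Spread of the measures after removing measurement error."""
    return math.sqrt(max(observed_sd ** 2 - rmse ** 2, 0.0))

def separation_index(observed_sd, rmse):
    """Ratio of the true spread of the measures to their measurement error."""
    return true_sd(observed_sd, rmse) / rmse

def strata(separation):
    """Number of statistically distinct levels (assumed formula: (4G + 1) / 3)."""
    return (4.0 * separation + 1.0) / 3.0

def reliability(separation):
    """Separation reliability implied by a separation index."""
    return separation ** 2 / (1.0 + separation ** 2)

for g in (1.88, 2.0, 2.05):
    print(g, round(strata(g), 2), round(reliability(g), 2))
```

Under this conversion a separation of 2.0 gives exactly three strata, and 1.88 gives slightly fewer than three, which is consistent with how the Results describe the initial 15-item analysis.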

Two indicators of fit are defined: infit and outfit. Infit is sensitive to irregular patterns of responses for items that are close to patients' impairment levels. Outfit is sensitive to extremely unexpected or rare responses. For example, a problem with large outfit would occur when a score associated with an overall severe level of impairment reflected a pattern of impairment in which relatively common signs (eg, facial palsy) are absent while signs indicative of severe impairment are present (eg, reduced LOC). While both infit and outfit are useful indicators of noise, large outfit usually reflects gross anomalies that might reflect the presence of patients with unique patterns of impairment. However, large infit usually reflects more serious problems in the item's coherence with the measure's underlying construct. A pattern of poorly fitting items should give users pause to consider whether their construct of impairment should be defined differently from what they hypothesized. Other reasons for poor item fit include ambiguous wording of an item, a misordered scale structure, and peculiar item use by a subgroup of patients. Small fit statistics are generally not a concern, although they provide insights into how an item set might be shortened by deleting redundant items.
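The difference between the two statistics can be sketched for a single dichotomous item. Outfit is the unweighted mean of the squared standardized residuals, whereas infit weights each squared residual by its model variance, so responses from patients far from the item's difficulty count for less. This is an illustrative sketch, not the BIGSTEPS code; the patient measures and ratings are hypothetical.

```python
import math

def rasch_probability(person_measure, item_difficulty):
    return 1.0 / (1.0 + math.exp(-(person_measure - item_difficulty)))

def item_fit(responses, person_measures, item_difficulty):
    """Return (infit, outfit) mean squares for one dichotomous item."""
    squared_residuals = []
    variances = []
    for x, b in zip(responses, person_measures):
        p = rasch_probability(b, item_difficulty)
        squared_residuals.append((x - p) ** 2)
        variances.append(p * (1.0 - p))
    # Outfit: plain average of squared standardized residuals.
    outfit = sum(r / w for r, w in zip(squared_residuals, variances)) / len(responses)
    # Infit: squared residuals weighted by the information each response carries.
    infit = sum(squared_residuals) / sum(variances)
    return infit, outfit

# Hypothetical patient measures (logits) and an item of difficulty 0.
persons = [-3.0, -1.0, 0.0, 1.0, 3.0]
orderly = [0, 0, 1, 1, 1]    # impairment appears only in more impaired patients
surprise = [1, 0, 1, 1, 1]   # the least impaired patient is rated impaired

infit_ok, outfit_ok = item_fit(orderly, persons, 0.0)
infit_odd, outfit_odd = item_fit(surprise, persons, 0.0)
print(round(infit_ok, 2), round(outfit_ok, 2))
print(round(infit_odd, 2), round(outfit_odd, 2))
```

One unexpected rating from a patient far from the item's difficulty raises outfit much more than infit, which is why large outfit tends to flag isolated anomalies.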

Items in a set may be rated using a common scale that ranges from devastating impairment (eg, rated as 3) to no impairment (eg, rated as 0). When items do not share a single rating scale (eg, a consistently defined 0 to 3 scale), a model is used in which each item is allowed to define a unique scale structure. While interpretation of such a model is cumbersome, this approach is necessary given the varying number of categories defined for each NIH Stroke Scale item (ranging from 3 to 5) and various definitions of the scale across items. Accordingly, we used what is called a partial credit approach. A desirable scale characteristic is that the average measure of impairment across all items should increase with each step on each individual item. Our analytic strategy was to maximize person separation (ie, the range of impairment represented in the sample) while minimizing problems with inconsistently used scale structures and poorly fitting items.
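A partial credit model lets each item carry its own set of step difficulties, so items with three, four, or five categories can be analyzed together. The sketch below computes category probabilities and the expected rating for one item under that model; the step values are hypothetical and the function names are assumptions, not BIGSTEPS output.

```python
import math

def partial_credit_probabilities(person_measure, step_difficulties):
    """Category probabilities (rating 0, 1, ..., m) for one item.

    step_difficulties[j] is the difficulty of moving from rating j to j + 1,
    in logits. Each item may supply a different number of steps.
    """
    cumulative = [0.0]
    for step in step_difficulties:
        cumulative.append(cumulative[-1] + (person_measure - step))
    weights = [math.exp(c) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]

def expected_rating(person_measure, step_difficulties):
    probs = partial_credit_probabilities(person_measure, step_difficulties)
    return sum(k * p for k, p in enumerate(probs))

# Hypothetical items: one with three categories, one with four.
three_category_item = [-1.0, 1.0]
four_category_item = [-2.0, 0.0, 1.5]

for measure in (-3.0, 0.0, 3.0):
    print(
        measure,
        round(expected_rating(measure, three_category_item), 2),
        round(expected_rating(measure, four_category_item), 2),
    )
```

With ordered steps, the expected rating on each item rises as the patient's measure rises, which is the scale characteristic the analysis looks for.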


*    Results
 
Descriptive Characteristics
A total of 1291 records from 693 patient hospitalizations were available for analysis, 598 of which had both admission and discharge ratings. The median±SD age was 66±16.9 years. Women constituted 53% of the sample; 71% of the sample incurred ischemic strokes, and 29% incurred hemorrhagic strokes. The proportions of left- (45%) and right-sided strokes (43%) were nearly equivalent; the remainder incurred bilateral strokes (12%). A history of prior stroke was present in 24%. The median number of days from stroke onset to rehabilitation admission was 13±128.5; the median rehabilitation length of stay was 28±15.3 days.

Rasch Analysis
The initial Rasch analysis of all 15 items (summarized on row 1 in Table 2) yielded a person separation of 1.88 and an item separation of 15.33. While the item separation indicates that the potential breadth of the measure was large (2.0 indicates adequate spread), the items were able to distinguish slightly fewer than three strata of rehabilitation patients (person separation, 1.88). This initial analysis revealed problems with several items that contributed to poor person separation. We used a strategy of rescoring those items that had poor scale structure (ie, items with rating steps that are associated inconsistently with better total scores) and deleting the worst fitting items sequentially to improve the measure. Reducing problems of fit with item or scale structures has the effect of improving person separation by deleting the sources of error (noise) and thus increasing precision (the ratio of signal to noise). The most notable problem was with limb ataxia; better scores on this item were associated with poorer scores on the remaining items. The large item infit (1.81) and outfit (8.43) revealed a considerable number of unexpected responses in ataxia ratings. The negative correlation (-.22) between the item and the total score also illustrates this problem. The directions to score ataxia as "normal" (0) in patients with hemiplegia who were rated as more impaired on other items account for this problem.


Table 2. Summary of BIGSTEPS Analyses of NIH Stroke Scale, All Admission and Discharge Records (n=1268)

Deletion of limb ataxia (row 2 in Table 2) yielded an improved person separation (2.04) and item separation of 16.79. The language item had a confusing step structure in that the two intermediate steps (mild to moderate versus severe aphasia) were used inconsistently; patients rated as having severe aphasia had total measures that indicated less impairment overall than did patients with mild to moderate aphasia. Consequently, we rescored the item by combining the middle two levels of this item. The results of rescoring this item are summarized in row 3 of Table 2. The separation statistics were essentially unchanged, and pupillary response still was the poorest fitting item based on an item outfit of 2.58. Deletion of pupillary response (shown in row 4 of Table 2) yielded a 13-item solution with slightly improved person separation (2.05).

Final Statistics
Table 3 shows the item fit statistics for the 13 retained items in order of increasing difficulty. All items had excellent fit statistics. The "noisiest" item based on infit was best visual function; its infit of 1.12 means that it contains 12% more information than does the average item. In contrast, the best motor arm item with an infit of .85 provides redundant information because it contains only 85% of the expected amount of information. Inclusion of two similar items (best motor arm and leg) probably accounts for this finding. Several items with large outfits remained. However, these items tended to be relatively difficult (best gaze, best visual function) or easy (LOC–questions). Relatively difficult or easy items with large outfits are of less concern because they provide diagnostic information about individual patients.


Table 3. Item Statistics in Measure Order

Not shown in Table 3 was the finding that all items have increasing average measures across the scale steps and increasing step measures. Another index of good step structure is the ratio of the observed to expected outfit at each step for each item; a ratio of 1.0 is desirable, with values greater than 1.6 providing cause for concern. The worst ratio of observed to expected fit at any step for any item was 3.27 for best visual at step 2. While relatively few patients were rated with bilateral hemianopsia, some of them performed unexpectedly well on the other items, revealing that visual function can be disrupted in isolation from other functions.

Table 3 shows that LOC was the rarest characteristic of stroke rehabilitation patients (with a difficulty of 2.73 logits, only the most impaired patients showed impairment on it), followed by best gaze, best visual, and best language. The item that reflected the most common characteristic of stroke patients was facial palsy (with a difficulty of -1.19 logits, even patients with the least impairment were affected), followed by plantar reflex, dysarthria, LOC–questions, best motor arm, sensory, and LOC–commands. Fig 1 illustrates the range of person impairment and item difficulties for these 13 items. The left-hand column shows the distribution of patients' measures (under the "Persons" heading); the right-hand column shows the distribution of item difficulties. The mean±SD person measure of -1.84±1.45 logits is considerably below the mean item difficulty (fixed at 0.00 logits), indicating that the items are targeted above the average impairment level of this sample, that is, to a more impaired sample of patients. Although no patients were scored as completely impaired by these 13 items, 25 patients were scored as having no impairment. The capacity of the items to reflect more severe impairment may be useful in acute medical settings but not in a rehabilitation setting.
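The targeting described above can be made concrete by converting logit differences into probabilities. The sketch below treats each item as dichotomous (impaired versus not impaired), which is a simplifying assumption because the scale items are polytomous; the resulting figures are therefore indicative only. The measures and difficulties are those quoted in the text, and the function name is hypothetical.

```python
import math

def probability_of_impairment(person_measure, item_difficulty):
    """Dichotomous Rasch probability of showing impairment on an item."""
    return 1.0 / (1.0 + math.exp(item_difficulty - person_measure))

MEAN_PERSON_MEASURE = -1.84  # logits, from the text

# Item difficulties in logits, from the text.
ITEM_DIFFICULTIES = {"LOC": 2.73, "facial palsy": -1.19}

probabilities = {
    name: probability_of_impairment(MEAN_PERSON_MEASURE, difficulty)
    for name, difficulty in ITEM_DIFFICULTIES.items()
}

for name, p in probabilities.items():
    print(name, round(p, 3))
```

Under this simplification a patient at the sample mean has roughly a 1% chance of showing impairment on LOC and roughly a one-in-three chance on facial palsy, which illustrates why items pitched above the sample's average level contribute little to distinguishing most rehabilitation patients.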



Figure 1. Map of persons and items. The distribution of patient measures (in logits) is shown in the left histogram. The distribution of item difficulties is illustrated in the right histogram.

Person Fit Analysis
We investigated the nature of misfitting ratings of patient impairments by examining relationships between patient outfits (outfit selected because it is more sensitive to anomalous patterns than infit) and stroke characteristics. Multivariate ANOVA was used to examine differences in person outfits at admission and discharge (time period) for four stroke categories (intracerebral hemorrhage, subarachnoid hemorrhage, and thrombotic and embolic stroke) and for patients with left- and right-sided strokes. A significant main effect was found for stroke category (F[3,371]=4.92, P<.01) and time period (F[1,371]=6.14, P<.02); the interaction between stroke category and side approached significance (F[3,371]=2.26, P=.08). Fig 2 shows that patients with left intracerebral and subarachnoid hemorrhage strokes had larger outfits (noisier measures) than did patients with thrombotic or embolic strokes, that admission outfits generally were greater than discharge outfits, and that patients with left hemorrhagic strokes tended to have larger outfits than did patients with right hemorrhagic strokes. This item set reflects the idiosyncratic ways in which hemorrhagic stroke impairments are manifested. "Noisier" measurement of impairment occurs at admission among patients with hemorrhagic strokes, and particularly left hemorrhagic strokes.



Figure 2. Patient mean square outfits at admission and discharge. Outfit values greater than 1.2 indicate idiosyncratic use of the items with patients who have left (L) hemorrhagic strokes at rehabilitation admission and discharge and right (R) subarachnoid hemorrhagic strokes at rehabilitation admission.

Improvement in Impairment
Admission and discharge NIH Stroke Scale measures for the 598 patients with ratings at both time points were correlated significantly (r=.82, P<.001); the majority of patients had a statistically significant reduction in impairment by discharge (admission mean, -1.65 logits; discharge mean, -2.20 logits; t[df=597]=14.5, P<.001).

Clinical Applications
The conversion between raw scores and linear measures for patients with complete items is listed in Table 4; a plot of the raw scores against the linear measure would reveal an ogival (S-shaped) relationship in which the raw scores and interval measures are related linearly in the mid range of values even though the relationship becomes curvilinear toward the top and bottom ends of the range. This curve illustrates the mathematically necessary relationship between the finite range of impairment measured by the raw scores and the infinite range of impairment implied by the interval measure. In the middle ranges of impairment, the raw NIH Stroke Scale scores provide a reasonably linear estimate of impairment. The consequence of using raw scores for patients with very low ratings is to overestimate their actual impairment and for patients with very high ratings, to underestimate their actual impairment.
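The ogival relationship can be reproduced with a small sketch that computes the expected raw score for a given measure and then inverts it to build a conversion table. The example assumes 13 dichotomous items with evenly spaced, hypothetical difficulties; it does not use the calibrations behind Table 4, and the function names are assumptions.

```python
import math

def expected_raw_score(measure, item_difficulties):
    """Expected sum of 0/1 ratings for a patient at the given measure (logits)."""
    return sum(1.0 / (1.0 + math.exp(d - measure)) for d in item_difficulties)

def measure_for_raw_score(raw_score, item_difficulties, low=-10.0, high=10.0):
    """Invert the expected-score curve by bisection.

    Extreme scores (0 or the maximum) correspond to no finite measure,
    so they are rejected.
    """
    if not 0 < raw_score < len(item_difficulties):
        raise ValueError("extreme raw scores have no finite measure")
    for _ in range(100):
        mid = (low + high) / 2.0
        if expected_raw_score(mid, item_difficulties) < raw_score:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0

# Hypothetical calibrations: 13 items spaced 0.5 logits apart from -3 to +3.
difficulties = [-3.0 + 0.5 * i for i in range(13)]

conversion = {raw: measure_for_raw_score(raw, difficulties) for raw in range(1, 13)}
for raw, measure in conversion.items():
    print(raw, round(measure, 2))
```

In the resulting table, one raw-score point near the middle of the range spans fewer logits than one point near either end, which is the curvature the text describes.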


Table 4. Conversion Table From Sum of 13 Raw Rescored Items to Linear Measures

Fig 3 provides a self-scoring key for the 13-item measure. For a given patient, clinicians can circle responses to the 13 items and then mark a vertical line that passes through the midpoint of the ratings; the point where this line intersects the horizontal axis is the estimated measure for that person. In this hypothetical example, the patient's average measure is about .5 logits, indicating moderate impairment. Unusual responses should be immediately evident, giving cause to reconsider the ratings or to explore further the idiosyncrasies of impairment in a specific patient. The rating of no sensory impairment fits poorly with the patient's overall level of impairment and should be investigated further. Estimates of impairment can be derived when fewer than 13 items are rated as well, although with less precision.



Figure 3. Self-scoring key for NIH Stroke Scale items. Q indicates quartile; S, 1 standard deviation; and M, mean. Distance between scale points is equal-intervaled. Scale at top and bottom of key (ranging from -6 to +8) is in log-odds units (logits), centered at the mean item difficulty. For a given patient, clinicians can circle responses to the 13 items and then mark a vertical line that passes through the midpoint of the ratings; the point where this line intersects the horizontal axis is the estimated measure for that person. In this hypothetical example, the patient's average measure is about 0.5 logits, indicating moderate impairment.


*    Discussion
 
A useful measure of impairment resulting from stroke was derived from the NIH Stroke Scale for patients undergoing medical rehabilitation, with two items omitted. Limb ataxia fit poorly because the directions confuse hemiplegia and normal function; pupillary response ratings reflect idiosyncratic ratings for patients with relatively minimal impairment overall. One item, best language, was rescored because distinctions among the intermediate ratings of aphasia severity were not consistently related to the measure. In summary, 13 of the original 15 items are useful and sufficient in describing the impairment severity experienced by patients during medical rehabilitation; the other items add too much noise given the original directions.

Improvements to the measure could be made by revising the directions for ataxia; providing a "not testable" option would eliminate confusion between hemiplegia resulting in untestability and unimpaired walking. Aphasia assessments may have been difficult to make given the presence of tracheostomy in some patients, particularly those with reduced LOC; provision of an "untestable" option may also be useful for this item. The need to rescore the aphasia rating may reflect patient selection criteria for rehabilitation. Patients with less motor disability and severe aphasia are often admitted to rehabilitation, whereas patients with severe motor disability and severe aphasia are often poor rehabilitation candidates. While the potential range of the measure is large, this sample was arrayed across a relatively small range of the items. In contrast, our work with the Functional Independence Measure instrument,10 an ordinal measure of disability, used with a similar sample, produced a measure that differentiated patients on a wide range of tasks and yielded a person separation of 3.74 for the motor items and 2.17 for the cognitive items. The items may be targeted better in patients in acute medical settings because the average item severity was targeted above the average patient measure of this sample. In contrast, relatively little variance may be seen in samples comprising patients drawn from outpatient and community settings. Additional items that reflect more subtle impairment for patients during medical rehabilitation would better target the scale in this setting. Separate ratings for sitting and standing balance might help distinguish impairment severity better. Finally, the utility of this revision in acute medical settings should be evaluated also.

Measures for patients with hemorrhagic strokes had poorer fits than for patients with thrombotic or embolic stroke; a tendency for patients with left-sided hemorrhagic strokes to have poorer fit was also found. Patients with hemorrhage are of two types: intracerebral and subarachnoid. These are quite distinct conditions and are likely to influence person fit in a heterogeneous sample such as this. Patients with left-sided strokes, especially hemorrhagic strokes, typically perform poorly on LOC questions and commands, which reflect LOC, language, or both. Hemorrhagic stroke causes variable impairments, unlike the predictable impairments that follow vascular anatomy in ischemic stroke. It is also difficult to assess sensation in patients with aphasia. Reasons for these differences need to be explored further.

The NIH Stroke Scale is used widely to evaluate neurological change in pharmaceutical trials. Use of an interval measure with known reliability derived from this scale should provide greater sensitivity for these studies. Rehabilitation studies that wish to distinguish functional recovery and handicap reduction attributable to clinical interventions from neurological recovery will also benefit from this linear measure of impairment.

Limitations of this study include selection of patients from only one rehabilitation hospital and ratings provided by a small number of physician raters. It is possible that a narrower or broader range of impairment is seen in patients referred for medical rehabilitation in other settings. Additional training of raters might enhance item fit. However, the extent of training provided to physicians in this study is apt to reflect real world situations, thus enhancing the generalizability of these findings.

In summary, clinicians and researchers now have a valid method of describing severity of impairment found in patients undergoing stroke rehabilitation. Such a measure complements the available variety of linear disability (Functional Independence Measure,10 Patient Evaluation Conference System,11 LORS III12) and handicap measures (Craig Hospital Assessment and Reporting Technique13), thus realizing the assessment of the WHO model of impairment, disability, and handicap.


*    Selected Abbreviations and Acronyms
 
NIH = National Institutes of Health
WHO = World Health Organization
LOC = level of consciousness


*    Acknowledgments
 
Funding was provided by the National Institute on Disability and Rehabilitation Research through the Rehabilitation Research and Training Center on Enhancing the Quality of Life of Stroke Survivors (H133B30024) and the Rehabilitation Research and Training Centers on Functional Assessment and Evaluation of Rehabilitation Outcomes (H133B30041). The authors extend their appreciation to Benjamin D. Wright, PhD, for his assistance with data analysis and interpretation and to Rita Bode, PhD, postdoctoral fellow at the Rehabilitation Institute of Chicago, for her psychometric review and editorial comments.


*    Footnotes
 
Reprint requests to Allen W. Heinemann, PhD, Rehabilitation Institute of Chicago, 345 E Superior St, Chicago, IL 60611-3015.

An earlier version was presented at the Mid-Western Educational Research Association Meeting, Chicago, Ill, October 13, 1995.

Received August 7, 1996; revision received February 18, 1997; accepted March 14, 1997.


*    References
 

  1. Brott T, Adams HP, Olinger CP, Marler JR, Barsan WG, Biller J, Spilker J, Holleran R, Eberle R, Hertzberg V, Rorick M, Moomaw CJ, Walker M. Measurements of acute cerebral infarction: a clinical examination scale. Stroke. 1989;20:864-870.
  2. Goldstein LB, Bertels C, Davis JN. Inter-rater reliability of the NIH stroke scale. Arch Neurol. 1989;46:660-662.
  3. Rothrock JF, Clark WM, Lyden PD. Spontaneous early improvement following ischemic stroke. Stroke. 1995;26:1358-1360.
  4. Lyden P, Brott T, Tilley B, Welch KMA, Mascha EJ, Levine S, Haley EC, Grotta J, Marler J, for the National Institute of Neurological Disorders and Stroke TPA Stroke Study Group. Improved reliability of the NIH Stroke Scale using video training. Stroke. 1994;25:2220-2226.
  5. World Health Organization. Classification of impairments, disabilities, and handicaps. Geneva, Switzerland: World Health Organization; 1980.
  6. US Department of Health and Human Services. Post-Stroke Rehabilitation. AHCPR publication 95-0662, May 1995.
  7. Rasch G. Probabilistic Models for Some Intelligence and Attainment Tests. Copenhagen, Denmark: Danmarks Paedagogiske Institut; 1960 (Chicago, Ill: University of Chicago Press; 1980).
  8. Wright BD, Masters G. Rating scale analysis: Rasch measurement. Chicago, Ill: MESA Press; 1982.
  9. Linacre JM. BIGSTEPS for PC Compatibles. Chicago, Ill: MESA Press; 1995.
  10. Linacre JM, Heinemann AW, Wright BD, Granger C, Hamilton BB. The structure and stability of the Functional Independence Measure. Arch Phys Med Rehabil. 1994;75:127-132.
  11. Fisher WP, Harvey RF, Taylor P, Kilgore KM, Kelly CK. Rehabits: a common language of functional assessment. Arch Phys Med Rehabil. 1995;76:113-122.
  12. Velozo CA, Magalhaes LC, Pan AW, Leiter P. Functional scale discrimination at admission and discharge: Rasch analysis of the Level of Rehabilitation Scale III. Arch Phys Med Rehabil. 1995;76:705-712.
  13. Whiteneck GG, Charlifue SW, Gerhart KA, Richardson GN. Quantifying handicap: a new measure of long-term rehabilitation outcomes. Arch Phys Med Rehabil. 1992;73:519-526.


