Validating Administrative Data in Stroke Research
Background and Purpose— Research based on administrative data has advantages, including large numbers, consistent data, and low cost. This study was designed to compare different methods of stroke classification using administrative data.
Methods— Administrative hospital discharge data and medical record review of 206 patients were used to evaluate 3 algorithms for classifying stroke patients. These algorithms were based on all (algorithm 1), the first 2 (algorithm 2), or the primary (algorithm 3) administrative discharge diagnosis code(s). The diagnoses after review of medical record data were considered the gold standard. Then, using a large administrative data set, we compared patients with a primary discharge diagnosis of stroke with patients with their stroke discharge diagnosis code in a nonprimary position.
Results— Compared with the gold standard, algorithm 1 had the highest κ for classifying ischemic stroke, with a sensitivity of 86%, specificity of 95%, positive predictive value of 90%, and κ=0.82. Algorithm 3 had the highest κ values for intracerebral hemorrhage and subarachnoid hemorrhage. For intracerebral hemorrhage, the sensitivity was 85%, specificity was 96%, positive predictive value was 89%, and κ=0.82. For subarachnoid hemorrhage, those values were 90%, 97%, 94%, and 0.88, respectively. Nonprimary position ischemic stroke patients had significantly greater comorbidity and 30-day mortality (odds ratio, 3.2) than primary position ischemic stroke patients.
Conclusions— Stroke classification in these administrative data was optimal using all discharge diagnoses for ischemic stroke and primary discharge diagnosis only for intracerebral and subarachnoid hemorrhage. Selecting ischemic stroke patients on the basis of primary discharge diagnosis may bias administrative samples toward more benign, unrepresentative outcomes and should be avoided.
Administrative data are those data routinely collected, often for billing purposes, on all patients in a given population. Advantages to research based on administrative data include the large number of cases, consistent data across patients, and little expense necessary to obtain and analyze data.1 Disadvantages include the limited range of data elements and inaccuracies of coding.2,3
As expected, and as has been shown in other studies of administrative data and stroke, classifying patients on the basis of International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes leads to a tradeoff between sensitivity and specificity3–6; similar data are unavailable for ICD-10. This study was designed to assess the validity of stroke classification using administrative data. Our objectives were (1) to see how different algorithms, using different numbers of diagnoses from hospital discharge administrative data, perform for the classification of stroke types, including ischemic stroke, intracerebral hemorrhage (ICH), and subarachnoid hemorrhage (SAH), and (2) to explore the possibility of bias being introduced into the selected stroke type samples as a result of the different algorithms.
The Comprehensive Hospital Abstract Reporting System (CHARS) is an administrative database that includes the discharge (or death) diagnoses, age, sex, home zip code, and other variables on all patients hospitalized in Washington State (excluding Veterans Administration hospitals). The discharge diagnoses (up to 9 per hospitalization) are coded in CHARS with ICD-9-CM codes. We obtained a subset of CHARS, which included all hospitalizations for all patients ≥20 years of age in Seattle (Wash) hospitals from 1990 through 1996 with any stroke-related discharge diagnosis (ICD-9-CM 430 to 438). In an attempt to make the study population based, non-Seattle residents were excluded by selecting only those hospitalizations for patients with a home zip code within Seattle city limits. However, some Seattle residents with stroke may have been admitted to hospitals outside of Seattle.
The CHARS data were merged with Washington State death certificates using the first 2 letters of the last name, first 2 letters of the first name, and date of birth. Thus, if a hospital discharge record matched with a death record (from the time of the admission to the end of 1996), we obtained additional information, including death date. Administrative data 30-day case fatality was defined as the CHARS data reporting death at discharge if ≤30 days or, for deaths occurring after hospital discharge, a match found from the merge with death certificates with a date of death ≤30 days from the date of hospital admission.
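The linkage key and the 30-day case-fatality definition above can be sketched as follows. This is a minimal illustration, not the study's actual Stata code: the function names, the case and whitespace normalization, the reading of "≤30 days" as days from admission, and the rule that a certificate dated before admission does not count are all assumptions.

```python
from datetime import date
from typing import Optional, Tuple


def link_key(last_name: str, first_name: str, birth_date: date) -> Tuple[str, str, date]:
    """Matching key: first 2 letters of last name, first 2 letters of first name, date of birth."""
    # Upper-casing and stripping whitespace are assumptions, not stated in the text.
    return (last_name.strip().upper()[:2], first_name.strip().upper()[:2], birth_date)


def died_within_30_days(admit: date, discharge: date, died_at_discharge: bool,
                        certificate_death: Optional[date]) -> bool:
    """Administrative 30-day case fatality, following the definition in the text."""
    # Death reported at discharge, with discharge no more than 30 days after admission.
    if died_at_discharge and (discharge - admit).days <= 30:
        return True
    # Otherwise use a linked death certificate dated within 30 days of admission.
    # Certificates dated before admission are treated as erroneous matches (assumption).
    if certificate_death is not None:
        return 0 <= (certificate_death - admit).days <= 30
    return False
```

A key this coarse can produce false matches between different people who share initials and a birth date, so some check on the plausibility of the matched death date is worthwhile.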
Classification of Stroke Patients in CHARS
Strokes were identified and classified with specific ICD-9-CM discharge diagnosis codes from the CHARS database. Four previous studies that used either a registry or chart review as the gold standard demonstrated that ICD-9-CM codes 434 and 436 were the most predictive of ischemic strokes.3–6 In October 1992, a fifth digit was added to ICD-9-CM codes 433 and 434: 0=without cerebral infarction and 1=with cerebral infarction. Making use of this fifth digit further increases specificity and positive predictive value (PPV).2 Thus, ischemic strokes were identified by ICD-9-CM code 433.x1 (“x,” the fourth digit, can vary to indicate a specific arterial distribution), 434 (excluding 434.x0), or 436. The assumptions made in the 4-digit era that code 433 was not an ischemic stroke and that 434 was an ischemic stroke were reasonable (data not shown) but did include some miscoding that would be corrected by use of the fifth digit. ICD-9-CM code 435 identified transient ischemic attacks (TIAs). ICD-9-CM code 430 identified SAH; code 431 identified ICH. If only ICD-9-CM code 432 (epidural and subdural hematoma), 437 (other and ill-defined cerebrovascular disease), or 438 (late effects of cerebrovascular disease) was present, the case was classified as “not a stroke.” For each stroke type, the case was excluded (classified as not a stroke) if any “traumatic brain injury” ICD-9-CM code (800 to 804, 850 to 854) or “rehabilitation care” as the primary ICD-9-CM code (V57) was present.
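The code-to-category rules above can be sketched as a small lookup. This is an illustration under stated assumptions: codes are taken to be dotted strings such as "434.11", the category labels and the function names `code_category` and `is_excluded` are hypothetical, and the first element of the code list is taken to be the primary diagnosis.

```python
from typing import Optional, Sequence


def code_category(code: str) -> Optional[str]:
    """Map one ICD-9-CM code (assumed format '434.11') to a stroke category, or None."""
    stem, _, decimals = code.strip().upper().partition(".")
    # The fifth digit is the second character after the decimal point, if present.
    fifth = decimals[1] if len(decimals) >= 2 else None
    if stem == "430":
        return "SAH"
    if stem == "431":
        return "ICH"
    if stem == "433":
        return "ISCHEMIC" if fifth == "1" else None   # only 433.x1 counts
    if stem == "434":
        return None if fifth == "0" else "ISCHEMIC"   # 434 counts unless 434.x0
    if stem == "436":
        return "ISCHEMIC"
    if stem == "435":
        return "TIA"
    return None  # 432, 437, 438, and all other codes


def is_excluded(codes: Sequence[str]) -> bool:
    """True if any traumatic brain injury code is present or V57 is the primary code."""
    stems = [c.strip().upper().partition(".")[0] for c in codes]
    trauma = any(s.isdigit() and (800 <= int(s) <= 804 or 850 <= int(s) <= 854)
                 for s in stems)
    rehab_primary = bool(stems) and stems[0] == "V57"
    return trauma or rehab_primary
```

Under these rules a 4-digit-era code such as "434.9" is still counted as ischemic stroke, whereas "433.10" and "434.10" are not.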
We tested 3 algorithms for stroke classification. Algorithm 1 used all available discharge diagnosis codes (up to 9) to identify stroke cases; for each patient, a dichotomous variable was created to indicate the presence or absence of a valid TIA, ischemic stroke, ICH, or SAH code. The overall stroke classification thus included the categories TIA, ischemic stroke, ICH, SAH, and not a stroke. The administrative data classification began with the assignment of all patients to the category of not a stroke. If a TIA code was present, the patient was reclassified; then, if an ischemic stroke code was present, the patient was further reclassified (from not a stroke or TIA to ischemic stroke); then, if an ICH code was present, the patient was further reclassified (from not a stroke, TIA, or ischemic stroke to ICH); then, if an SAH code was present, the patient was further reclassified (from not a stroke, TIA, ischemic stroke, or ICH to SAH). Algorithm 2 restricted the search for stroke ICD-9-CM codes to just the first 2 discharge diagnoses; otherwise, it was identical. Algorithm 3 restricted the search to just the primary diagnosis. The hierarchy in these algorithms was based on the assumption that hemorrhagic strokes (ICH and SAH) are coded more accurately than ischemic strokes; thus, hemorrhagic diagnoses take precedence. Also, if both a stroke and a TIA had occurred, it seemed reasonable to assume that a stroke had occurred. Finally, cases with both ICH and SAH coded were thought, on the basis of the clinical experience of the authors, to be more likely to represent primary SAH. The hierarchy was not adjusted to optimize classification in these data. The hierarchy is not relevant when just the primary discharge diagnosis is used.
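The hierarchy and the 3 position restrictions described above can be sketched as follows. The input is assumed to be a list of per-position category labels (one per discharge diagnosis, in coded order, with None for non-stroke codes); the label strings and the names `HIERARCHY`, `POSITIONS`, and `classify` are hypothetical.

```python
from typing import Optional, Sequence

# Lowest to highest precedence: each later category overrides the earlier ones.
HIERARCHY = ["TIA", "ISCHEMIC", "ICH", "SAH"]

# Algorithm number -> how many leading diagnosis positions are searched.
POSITIONS = {1: 9, 2: 2, 3: 1}


def classify(categories: Sequence[Optional[str]], algorithm: int) -> str:
    """Classify one hospitalization from its per-position category labels."""
    searched = categories[:POSITIONS[algorithm]]
    result = "NOT_A_STROKE"          # every patient starts here
    for label in HIERARCHY:          # reclassify upward through the hierarchy
        if label in searched:
            result = label
    return result
```

For a record coded TIA, non-stroke, ischemic stroke, SAH (in that order), algorithm 1 yields SAH, whereas algorithms 2 and 3 yield TIA because they never reach the later positions.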
Because each patient may have had multiple hospitalizations, we hypothesized that restriction to the first hospitalization during the study interval for any given patient would improve the performance of a stroke classification algorithm and increase the proportion of incident events (defined as no evidence of a history of previous stroke after review of medical records). If a patient had been admitted to the hospital for a stroke before our study interval, the first hospitalization in our data would not represent the incident stroke.
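Restriction to the first hospitalization per patient can be sketched as below. The patient identifier is a hypothetical field (the text does not describe how CHARS identifies repeat admissions), and records are assumed to be (identifier, admission date) pairs.

```python
from datetime import date
from typing import Dict, Iterable, Tuple


def first_admissions(records: Iterable[Tuple[str, date]]) -> Dict[str, date]:
    """Return the earliest admission date seen for each patient identifier."""
    first: Dict[str, date] = {}
    for patient_id, admit_date in records:
        # Keep this admission only if it is the earliest one seen so far.
        if patient_id not in first or admit_date < first[patient_id]:
            first[patient_id] = admit_date
    return first
```

As the text notes, this selects the first admission within the study interval only; an admission before the interval is invisible to it.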
Selection of Cases for Chart Review
A subset of the original 1990 to 1996 data set was identified for medical record review. Only patients with the ICD-9-CM codes most predictive of ischemic stroke, ICH, or SAH (using algorithm 1 as defined above) were selected for review. We deliberately oversampled hemorrhagic stroke to ensure fairly equal numbers of the 3 main stroke types (ischemic, ICH, SAH). After these stipulations were applied, medical records were randomly selected from 5 convenient hospitals, including 2 community hospitals, 1 health maintenance organization hospital, and 2 university-associated hospitals; abstracted (by a nurse with training in the abstraction of stroke records); and reviewed and classified by a stroke neurologist (D.L.T.) who was blinded to the ICD-9-CM codes.
The performance of the stroke classification algorithms was compared with results from chart review using standard definitions of sensitivity, specificity, and positive predictive value (PPV). For these calculations, the diagnoses of the stroke neurologist after review of the medical record data were considered the gold standard. Percentages are presented with exact binomial 95% confidence intervals (CIs). Alternatively, one might not want to assume that a diagnostic gold standard is present; thus, agreement between administrative stroke classification and reviewing stroke neurologist diagnosis was assessed with unweighted κ statistics. The estimates of κ, sensitivity, specificity, and PPV for each of the stroke types were based on a 2×2 table [administrative diagnosis of stroke type yes/no versus chart review diagnosis (gold standard) stroke type yes/no], whereas the overall κ estimates were based on a 5×5 table (an administrative diagnosis axis and a chart review diagnosis axis, each with the 5 categories of ischemic stroke, ICH, SAH, TIA, and not a stroke).
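The stroke type-specific measures from a 2×2 table can be sketched as follows, using the standard definitions and the unweighted (Cohen) κ. The function name and the cell labels are illustrative, and the counts in the usage note are made up, not study data.

```python
from typing import Dict


def two_by_two_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and unweighted kappa from a 2x2 table.

    tp: administrative yes, chart review yes    fp: administrative yes, chart review no
    fn: administrative no,  chart review yes    tn: administrative no,  chart review no
    """
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "kappa": (observed - expected) / (1 - expected),
    }
```

With the made-up counts tp=40, fp=10, fn=10, tn=40, observed agreement is 0.8 and chance agreement is 0.5, so κ is approximately 0.6 while sensitivity, specificity, and PPV are each 0.8.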
Using the first admission for each patient during the study interval and just the administrative data, we compared patients with their stroke discharge diagnosis codes in the primary position with patients whose codes were in a nonprimary position. Nonparametric tests were used to compare age (Wilcoxon rank sum) and sex (χ2). Using administrative 30-day case fatality as the outcome of interest, we compared patients with primary position and those with nonprimary position stroke discharge diagnosis codes. Unadjusted risk estimates are presented as both relative risks (RRs) and odds ratios (ORs), followed by 95% CIs; the ORs are presented in addition to the preferred RRs to facilitate comparison with multivariate-adjusted ORs derived from logistic regression models. Logistic regression was used to control for age, sex, and multiple comorbid discharge diagnoses.
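The unadjusted RR and OR can be sketched from group counts as below. The text does not state which interval method was used; the log-scale Wald interval here is an assumption, and the counts in the usage note are hypothetical.

```python
import math
from typing import Tuple


def relative_risk(a: int, n1: int, c: int, n0: int) -> Tuple[float, float, float]:
    """RR with 95% CI: a deaths among n1 nonprimary, c deaths among n0 primary patients."""
    rr = (a / n1) / (c / n0)
    se = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)   # SE of log(RR)
    return rr, math.exp(math.log(rr) - 1.96 * se), math.exp(math.log(rr) + 1.96 * se)


def odds_ratio(a: int, n1: int, c: int, n0: int) -> Tuple[float, float, float]:
    """OR with 95% CI from the same counts."""
    b, d = n1 - a, n0 - c                             # survivors in each group
    odds = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)     # SE of log(OR)
    return odds, math.exp(math.log(odds) - 1.96 * se), math.exp(math.log(odds) + 1.96 * se)
```

With hypothetical counts of 30 deaths per 100 nonprimary patients and 13 per 100 primary patients, the RR is about 2.3 and the OR about 2.9. The OR lies further from 1 than the RR whenever the outcome is common, which is why the two are reported side by side.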
All analyses were performed with the STATA statistical program (version 6, Stata Corp). The Human Subjects Review Committee of the University of Washington and the Washington State Department of Health approved this study.
The original hospital discharge administrative data set contained 20 703 patient discharges, each having a stroke-related discharge diagnosis (ICD-9-CM 430 to 438). Of these, 13 210 (63.8%) were first admissions for any given patient. We reviewed 206 medical records for this validation study. Characteristics of each data set are shown in Table 1. Restricting from all hospitalizations for each patient (n=206) to the first hospital admission (n=147) raised the proportion of incident strokes from 77% to 88%. The stroke type–specific percentages of incident cases in first admissions were 76% for ischemic stroke, 89% for ICH, and 96% for SAH. This higher proportion of incident cases was the rationale for using just the first admission for each patient in the subsequent analyses.
Merging Hospital Discharge Data and Death Certificates
Of the 34 of 206 patients for whom medical record review data indicated a death during hospitalization, all 34 were coded as dead on discharge in the CHARS data, and 31 of 34 matched with death certificate data. One patient was discharged alive with an incorrect CHARS code for death at discharge. Three patients matched death certificates dated before admission, indicating erroneous matches. Therefore, with the medical record review as the gold standard for in-hospital death identification, using CHARS data to identify in-hospital deaths had a sensitivity of 34 of 34=100%, specificity of 171 of 172=99%, and PPV of 34 of 35=97%; these parameters reflect the matching for deaths occurring during hospitalization (73% of 30-day case fatalities). Again, with medical record review as the gold standard, using death certificate data to identify in-hospital deaths had a sensitivity of 31 of 34=91%, specificity of 169 of 172=98%, and PPV of 31 of 34=91%; these parameters likely reflect the matching for deaths occurring after discharge (27% of 30-day case fatalities). For deaths that occurred after hospital discharge, there was no way to directly validate the matching with these data.
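The percentages above follow directly from the reported counts, as this short check shows. The 2×2 cell assignments are inferred from the fractions given in the text (for example, 3 unmatched in-hospital deaths and 3 erroneous matches for the death certificate data); the function name is illustrative.

```python
def percentages(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity, specificity, and PPV as whole-number percentages."""
    return {
        "sensitivity": round(100 * tp / (tp + fn)),
        "specificity": round(100 * tn / (tn + fp)),
        "ppv": round(100 * tp / (tp + fp)),
    }


# CHARS death-at-discharge code versus medical record review:
# 34 of 34 deaths detected; 1 of 172 survivors miscoded as dead.
chars = percentages(tp=34, fp=1, fn=0, tn=171)

# Death certificate match versus medical record review:
# 31 of 34 deaths matched; 3 of 172 survivors erroneously matched.
certificates = percentages(tp=31, fp=3, fn=3, tn=169)

print(chars)         # sensitivity 100, specificity 99, ppv 97
print(certificates)  # sensitivity 91, specificity 98, ppv 91
```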
Performance of Algorithms for Stroke Classification
Three different algorithms, as described in Methods, were used for stroke type classification of first admissions for both the administrative data and the validation subset. The performance characteristics of each of the 3 algorithms based on 147 first admission medical records reviewed are shown in Table 2. Ischemic stroke classification agreement with the gold standard was maximal with algorithm 1 (κ=0.82). ICH and SAH classification agreement was maximal with algorithm 3 (κ=0.82 for ICH and 0.88 for SAH). Overall stroke classification agreement of the administrative data with reviewing stroke neurologist diagnosis was maximal with algorithm 1 (κ=0.79).
Comparing Patients With Primary Position Versus Nonprimary Position Stroke Discharge Diagnoses in the Large Administrative Data Set
There were 993 patients in the first admissions administrative data set with their stroke discharge diagnosis code in a nonprimary position; the proportions by stroke type were as follows: ischemic stroke, 845 of 5992=14%; ICH, 111 of 949=12%; and SAH, 37 of 348=11% (denominators for each stroke type from Table 1, middle). These patients were compared with patients with a primary discharge diagnosis of stroke; the results are shown in Table 3. Sex proportions did not differ between nonprimary position and primary position stroke patients. Age did not differ between nonprimary position and primary position ischemic stroke patients (median, 77 years) and SAH patients (median, 59 years) but was significantly greater for ICH primary position versus nonprimary position patients (median, 73.3 versus 68.2 years; P=0.004). Nonprimary position ischemic stroke patients were significantly more likely than primary position ischemic stroke patients to have died by 30 days after their hospital admission (30% versus 13% in primary position patients; RR=2.4; 95% CI, 2.1 to 2.7). When age and sex were controlled for, having an ischemic stroke diagnosis code in a nonprimary position versus primary position was associated with an OR of 3.2 (95% CI, 2.7 to 3.8) for 30-day case-fatality; this was reduced to an OR of 2.0 (95% CI, 1.7 to 2.5) after controlling further for comorbid discharge diagnoses (see footnotes in Table 3). When logistic regression was used to control for age and sex, having a nonprimary position ICH diagnosis code was associated with an OR of 1.6 (95% CI, 1.0 to 2.4) for 30-day case-fatality; this was reduced to an OR of 1.0 (95% CI, 0.6 to 1.5) after controlling for comorbid discharge diagnoses (see footnotes in Table 3). Compared with primary position SAH patients, nonprimary position SAH patients did not have an increased risk of 30-day case-fatality (Table 3).
This validation study demonstrates that stroke patients (ischemic, ICH, and SAH) can be identified from administrative data. The algorithm that maximized sensitivity and overall classification agreement corrected for chance (κ) used all discharge diagnoses (algorithm 1, Table 2). The algorithm that most consistently maximized specificity and PPV was the one based on the primary diagnosis only (algorithm 3, Table 2). These findings were not consistent across stroke subtypes.
With the diagnosis from review of medical record data used for comparison, each stroke type classification was nearly perfect (κ>0.8) using at least 1 of the 3 prespecified algorithms (Table 2). Ischemic stroke showed best agreement with algorithm 1, for which κ=0.82. Sensitivity was maximal with algorithm 1 and then steadily decreased with algorithms 2 and 3 (sensitivity, 86%, 80%, and 74%, respectively). At the same time, specificity (95%, 96%, and 95%) and PPV (90%, 91%, and 88%) remained stable. Another argument supporting algorithm 1 as the best classification for ischemic stroke comes from the analyses on the large administrative data set comparing patients with the ischemic stroke discharge diagnosis code in the primary position with those with the diagnosis in a nonprimary position. The nonprimary position patients have a larger comorbidity burden and an elevated risk of 30-day case-fatality (age- and sex-adjusted OR, 3.2). That the increased comorbidity is responsible for much of this increased mortality is suggested by a reduction in OR to 2.0 after adjustment for comorbid discharge diagnoses (Table 3). If one were to use only the patients with an appropriate ICD-9-CM ischemic stroke code in the primary discharge diagnosis position, patients with higher levels of comorbidity and related short-term mortality would be excluded, and estimates related to outcomes in these patients would be overly optimistic. Previous reports using ICD-9-CM discharge data in ischemic stroke have suggested using only the primary discharge diagnosis as a way to optimize the PPV3 or have empirically done so,2,7,8 and thus any outcomes reported may have been optimistic.
ICH diagnoses showed best agreement with diagnosis from review of medical record data with algorithm 3, for which κ=0.82. The PPV steadily increased from 80% to 83% to 89% going from algorithms 1 to 3, whereas there were lesser gains in sensitivity (82%, 85%, and 85%, respectively) and specificity (93%, 94%, and 96%, respectively) (Table 2). Nonprimary position ICH patients were associated with an age- and sex-adjusted OR of 1.6 for 30-day case-fatality; this excess risk disappears (OR=1.0) with adjustment for comorbid discharge diagnoses (Table 3). Much of the increased risk in those removed patients was due to the comorbid, primary discharge diagnoses of malignant brain tumor and acute myocardial infarction. Malignant brain tumor (either primary or secondary) was the primary discharge diagnosis in 16% of ICH patients. Acute myocardial infarction, with associated thrombolytic therapy the likely cause of the ICH, was the primary discharge diagnosis in 15% of ICH patients. In an effort to identify a “pure” ICH population, these diagnoses might be excluded; the estimated OR for 30-day case fatality in nonprimary versus primary position ICH patients then decreases from 1.6 with them included to a nonsignificant 1.2 with them excluded.
SAH showed nearly perfect agreement with diagnosis from review of medical record data using all 3 algorithms; sensitivity was maximal with algorithm 1, whereas specificity, PPV, and (just barely) κ were maximal with algorithm 3 (κ=0.88) (Table 2). There was no difference in the risk of 30-day case fatality for patients with primary position SAH diagnosis codes compared with those with nonprimary position diagnoses; therefore, excluding those patients would not bias estimates of this outcome. The 30-day case-fatality rate for primary position SAH patients was 31.2%, a figure similar to the 32% reported from a population-based study performed in a similar geographic region in a just-earlier time frame (1987 to 1989).9 This high level of diagnostic agreement (κ) and the similarity in 30-day case-fatality rates raise the possibility that, at least for SAH, administrative data may be sufficient for descriptive epidemiological research.
The study has a number of potential limitations. The medical records reviewed were chosen from high-probability stroke ICD-9-CM codes; therefore, true stroke patients coded with reportedly less predictive ICD-9-CM codes (ie, 432, 435, 437, or 438) are likely to have been missed, and thus our estimates of sensitivity are optimistic. By excluding most patients who did not have a stroke (those without ICD-9-CM code 430 to 438), we may have made pessimistic estimates of specificity. Avoiding the less predictive ICD-9-CM codes should not affect the estimates of PPV. The number of medical records abstracted (n=206) was small compared with the number of patients in the original administrative data set (n=20 703). Moreover, records were reviewed only from a subset of all possible hospitals, so the records reviewed were not population based; they were a convenience sample rather than a random sample and possibly an unrepresentative one. If the hospitals sampled did not code discharge diagnoses like the rest of the hospitals in Seattle or throughout the United States, our conclusions might not be generalizable; we have no reason to believe this was the case. The time frame of this validation study was 1990 to 1996, and although these results are directly applicable to our research with the same data set, extrapolation of these results to other time periods and other administrative data sets remains uncertain.
The performance of these algorithms for classifying stroke patients in administrative data is based on the premise that the reviewing stroke neurologist’s diagnosis (heavily influenced by the diagnosis of the treating physician) represents the gold standard, an assumption that may not be completely accurate. If we assume that errors in this gold standard may exist, then the κ statistics, used to compare agreement between 2 measures, neither of which is superior, are of value. The medical record sampling scheme used, not reviewing charts for which ICD-9-CM codes (eg, 433 or 435) have been shown to represent few true strokes,3–6 may have led to optimistic κ estimates. The amount of bias introduced would depend on the proportion of disagreement, and if the reviewing stroke neurologist’s diagnoses were also largely “not a stroke,” it is possible that the bias introduced might be quite small. Also, the κ estimates of overall stroke classification agreement between administrative and medical record sources are artificially heavily weighted by the agreement within the hemorrhagic stroke types. We oversampled these stroke types to have a reasonable number of patients to evaluate. The stroke type–specific κ estimates may therefore be more accurate reflections of the agreement.
In conclusion, patients with stroke can be identified from administrative data. The best algorithm to use to classify the administratively identified potential stroke patients differed by stroke subtype and may differ depending on the research question of interest. For ischemic stroke, the algorithm that considered all possible discharge diagnoses seemed to maximize stroke classification agreement (κ) and may give less biased estimates of 30-day survival. In ICH and SAH, agreement was maximal with the algorithm that used only the primary discharge diagnosis, and these samples seemed unbiased. There are limitations to the conclusions that can be drawn from research results based on administrative data sets. That being said, after careful examination and validation of such sources, the advantages of research based on such sources (large numbers of population-based patients, little cost in obtaining data) justify its continued pursuit. Such research may typically be used to find preliminary answers to complex questions and may help make empirically based decisions about whether such questions are worthy of more labor-intensive and costly additional research. Efforts to improve the validity and increase the breadth of information available in these administrative data sources would allow them to serve as even better sources of hypothesis generation and testing. This inexpensive approach to research may then become increasingly useful.
Dr Tirschwell’s efforts were supported in part by NINDS grant 1 K23 NS02119-01 and by the Medic One Research Foundation, Seattle, Wash.
- Received April 4, 2002.
- Revision received May 22, 2002.
- Accepted June 3, 2002.
- Goldstein LB. Accuracy of ICD-9-CM coding for the identification of patients with acute ischemic stroke: effect of modifier codes. Stroke. 1998; 29: 1602–1604.
- Benesch C, Witter DM Jr, Wilder AL, Duncan PW, Samsa GP, Matchar DB. Inaccuracy of the International Classification of Diseases (ICD-9-CM) in identifying the diagnosis of ischemic cerebrovascular disease. Neurology. 1997; 49: 660–664.
- Leibson CL, Naessens JM, Brown RD, Whisnant JP. Accuracy of hospital discharge abstracts for identifying stroke. Stroke. 1994; 25: 2348–2355.
- Ellekjaer H, Holmen J, Kruger O, Terent A. Identification of incident stroke in Norway: hospital discharge data compared with a population-based stroke register. Stroke. 1999; 30: 56–60.
- Mayo NE, Neville D, Kirkland S, Ostbye T, Mustard CA, Reeder B, Joffres M, Brauer G, Levy AR. Hospitalization and case-fatality rates for stroke in Canada from 1982 through 1991: the Canadian Collaborative Study Group of Stroke Hospitalizations. Stroke. 1996; 27: 1215–1220.
- Davenport RJ, Dennis MS, Warlow CP. The accuracy of Scottish Morbidity Record (SMR1) data for identifying hospitalised stroke patients. Health Bull (Edinb). 1996; 54: 402–405.
- Longstreth WT Jr, Nelson LM, Koepsell TD, van Belle G. Clinical course of spontaneous subarachnoid hemorrhage: a population-based study in King County, Washington. Neurology. 1993; 43: 712–718.
- Lyden PD, Lau GT. A critical appraisal of stroke evaluation and rating scales. Stroke. 1991; 22: 1345–1352.