Classification of Stroke Cases in Administrative Data Using ICD-9 Codes
Background: Administrative data allow inexpensive evaluation of large patient datasets. A common objection to such analyses is that ICD-9 discharge codes are unreliable, which makes case ascertainment inaccurate. Objective: To identify reliable algorithms for classifying stroke cases using ICD-9 discharge codes. Methods: From a population-based hospital discharge database of 20,803 cases, the medical records of 206 probable stroke cases from 3 community and 2 university hospitals were randomly selected and abstracted. The gold standard was the attending physician's diagnosis in the medical record, corroborated by a stroke neurologist (DLT). Results: Final gold standard diagnoses were transient ischemic attack (TIA) in 8 cases (4%), ischemic stroke in 76 (37%), intracerebral hemorrhage (ICH) in 47 (23%), subarachnoid hemorrhage (SAH) in 58 (28%), and no cerebrovascular event in 17 (8%). Compared with algorithms that used either all available ICD-9 discharge codes or only the first 2, an algorithm restricted to the first (principal) discharge code optimized specificity and positive predictive value (Table). Overall stroke classification based on the first discharge code showed substantial agreement with the gold standard (kappa = 0.72, P < 0.0005). In the original 20,803-case database, the prevalence of ischemic stroke was 30%, ICH 4.6%, and SAH 1.6%. Conclusions: An algorithm based on the principal ICD-9 discharge code can reliably identify and classify stroke cases. Optimizing specificity and positive predictive value comes at the cost of sensitivity. Positive predictive value is influenced by prevalence.
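Illustration: The dependence of positive predictive value on prevalence follows from Bayes' rule. The sketch below is an illustration only, not the study's analysis. The function name and the sensitivity and specificity values (0.80 and 0.95) are hypothetical assumptions chosen for the example; only the three prevalence figures are taken from the Results.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Return PPV via Bayes' rule: P(true case | classified as a case)."""
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical accuracy values, NOT results reported in the abstract.
ASSUMED_SENSITIVITY = 0.80
ASSUMED_SPECIFICITY = 0.95

# Prevalences reported for the original 20,803-case database.
PREVALENCE = {"ischemic stroke": 0.30, "ICH": 0.046, "SAH": 0.016}

# Same sensitivity and specificity, different prevalence per stroke type.
ppv_by_type = {
    name: positive_predictive_value(
        ASSUMED_SENSITIVITY, ASSUMED_SPECIFICITY, prevalence
    )
    for name, prevalence in PREVALENCE.items()
}

for name, ppv in ppv_by_type.items():
    print(f"{name}: prevalence = {PREVALENCE[name]:.3f}, PPV = {ppv:.2f}")
```

Under these assumed values, PPV falls from roughly 0.87 for ischemic stroke to about 0.44 for ICH and 0.21 for SAH, although sensitivity and specificity are held fixed. A PPV estimated in a sample enriched for stroke cases therefore need not carry over to a population with a different case mix.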