Comparison of Medicare Claims Versus Physician Adjudication for Identifying Stroke Outcomes in the Women’s Health Initiative
Background and Purpose—Many studies use medical record review for ascertaining outcomes. One large, longitudinal study, the Women’s Health Initiative (WHI), ascertains strokes using participant self-report and subsequent physician review of medical records. This is resource-intensive. Herein, we assess whether Medicare data can reliably assess stroke events in the WHI.
Methods—Subjects were WHI participants with fee-for-service Medicare. Four stroke definitions were created for Medicare data using discharge diagnoses in hospitalization claims: definition 1, stroke codes in any position; definition 2, primary position stroke codes; and definitions 3 and 4, hemorrhagic and ischemic stroke codes, respectively. WHI data were randomly split into training (50%) and test sets. A concordance matrix was used to examine the agreement between WHI and Medicare stroke diagnosis. A WHI stroke and a Medicare stroke were considered a match if they occurred within ±7 days of each other. Refined analyses excluded Medicare events when medical records were unavailable for comparison.
Results—Training data consisted of 24 428 randomly selected participants. There were 577 WHI strokes and 557 Medicare strokes using definition 1. Of these, 478 were a match. With regard to algorithm performance, specificity was 99.7%, negative predictive value was 99.7%, sensitivity was 82.8%, positive predictive value was 85.8%, and κ=0.84. Performance was similar for test data. Whereas specificity and negative predictive value exceeded 99%, sensitivity ranged from 75% to 88% and positive predictive value ranged from 80% to 90% across stroke definitions.
Conclusions—Medicare data seem useful for population-based stroke research; however, performance characteristics depend on the definition selected.
Medicare claims provide a nationwide data source for individuals aged ≥65 years, which is reasonably representative of the population. Ongoing collection of medical data allows for a variety of secondary uses in public health and health services research, including disease surveillance, tracking of patient outcomes, and health care utilization. Randomized clinical trials and major prospective observational cohort studies have traditionally relied on intensive data collection processes, including medical record review, to ascertain outcomes. These rigorous approaches are generally considered necessary to yield accurate outcome information. The extent to which Medicare administrative data may be useful to ascertain outcomes in this context is unclear but is important to evaluate because of cost-efficiencies of secondary data use and augmented research potential from the comprehensive data collected by Medicare regarding health care utilization and expenditures.
The Women’s Health Initiative (WHI) is a national longitudinal study of 161 808 women aged 50 to 79 years that evaluates strategies for preventing major causes of morbidity and mortality, including cardiovascular disease and stroke.1 Similar to other large clinical studies, the WHI outcomes adjudication involves medical record review by physicians, which is resource-intensive. The incidence of chronic disease outcomes increases substantially with age, necessitating escalating resources for outcome ascertainment. Most current WHI participants were enrolled in Medicare at baseline or enrolled subsequently during follow-up. To evaluate whether Medicare data may be useful for outcome ascertainment, the WHI program initiated a validation effort to assess agreement between Medicare and WHI data for cardiovascular disease outcomes. This report focuses on stroke.
Diagnostic codes pertaining to stroke in health care administrative databases have been reported to have variable performance when compared with clinical definitions of stroke.2–9 A recent systematic review found that some algorithms for stroke and intracranial bleeds had positive predictive values (PPVs) >80%. Other metrics including sensitivity were less frequently reported.2 Confirmation criteria for stroke events varied substantially across studies. Although occasional studies involved neurologists to confirm strokes,2,5 they were typically geographically limited or included a relatively small number of events. Despite the incidence of stroke being greatest in the elderly population, there is limited published information comparing Medicare data with neurologist-adjudicated strokes.6 The WHI uses vascular neurologists to perform stroke adjudication. Medical record review by these specialists who have substantial expertise regarding diagnostic nuances of stroke and relevant neuroimaging is intended to identify strokes with a high level of accuracy. Also, cases that were not clear-cut were discussed by the committee of vascular neurologist adjudicators to develop consensus. The WHI is also geographically diverse, and the number of neurologist-confirmed adult strokes is larger than that used in any previous report for validation studies of administrative data. Hence, the WHI–Medicare linkage is a valuable data source for evaluating the performance of Medicare data.
The goal of this study was to assess whether Medicare data can be reliably used to assess stroke events in the WHI. We developed algorithms that used variables in Medicare claims data to define stroke and examined algorithm performance for accurate detection of stroke hospitalizations using clinically defined strokes in the WHI. Results presented herein have implications for future large clinical trials as exemplified by the WHI and also for the broader agenda of population science and health services research.
The WHI enrolled 161 808 women aged 50 to 79 years from 1993 until 1998 in a set of randomized clinical trials (68 132 participants) as well as an observational study (93 676 participants). The observational and clinical trial studies were performed until 2005, at which time the women were invited to participate in the WHI extension study through 2010 (first extension). Longitudinal follow-up continues, and current participants have 14 to 19 years of follow-up. The WHI was conducted in 24 states across the United States using 40 field centers. Field center locations are indicated at https://cleo.whi.org/about/SitePages/Recruitment.aspx. The WHI database has been linked to Medicare data from the Centers for Medicare and Medicaid Services (CMS).
We used incident stroke events from the start of WHI in 1993 through 2007. We included women from the observational study who had Medicare Parts A and B, fee-for-service coverage at the time of WHI enrollment (1993–1998), or later met age criterion for Medicare enrollment through 2007. Women were excluded if they were enrolled in a Medicare-managed care plan at the time of WHI enrollment and were censored at the time of entry to managed care and when they lost Medicare fee-for-service eligibility. Women who experienced a WHI-adjudicated stroke outcome before Medicare eligibility were excluded because the WHI adjudicated only the first stroke outcome. In the event-based analysis, participants were censored from observation 7 days after the WHI stroke, allowing for a 7-day match window between WHI and Medicare. The study population was randomly split into a training data set and a test data set (50% each) to replicate the results of the algorithm.
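The random 50/50 split described above can be sketched as follows. This is a minimal illustration, not the study's actual procedure: the function name, the seed, and the integer participant IDs are all hypothetical.

```python
import random

def split_participants(participant_ids, train_fraction=0.5, seed=0):
    """Randomly partition participant IDs into nonoverlapping training and test sets."""
    ids = sorted(participant_ids)   # fix the order so the seed fully determines the split
    rng = random.Random(seed)       # hypothetical seed, chosen only for reproducibility
    rng.shuffle(ids)
    cut = round(len(ids) * train_fraction)
    return set(ids[:cut]), set(ids[cut:])

# Toy cohort of 1000 hypothetical participant IDs.
train_ids, test_ids = split_participants(range(1, 1001))
```

Splitting at the participant level, rather than the event level, keeps all hospitalizations for one woman in the same set, so the two sets do not overlap.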
Stroke Events in the WHI
The WHI process uses self-report via annual questionnaires completed by participants (or their proxies) who are asked if they were hospitalized overnight since their last report. Using hospitalization information in the participant self-report, medical records were obtained and subsequently adjudicated by trained vascular neurologists using detailed standards.10 Details of the record request process are in the online-only Data Supplement. Medical records were typically inpatient hospitalization records and included admission and discharge notes, emergency room notes, neurology consultations, therapy (physical, occupational, speech) evaluations, and imaging results (eg, brain, cerebrovascular, and cardiac studies). The WHI-adjudicated events included hospitalization with overnight inpatient stays. Hence, emergency room visits that did not lead to a hospital admission were excluded. For the WHI, stroke was defined as follows: “the rapid onset of a persistent neurologic deficit attributed to an obstruction or rupture of the brain arterial system (including stroke occurring during or resulting from a procedure). The deficit was not known to be secondary to brain trauma, tumor, infection, or other cause. The deficit had to last more than 24 hours unless death supervened or there was a demonstrable lesion compatible with an acute stroke on CT or MRI. A stroke was defined as procedure-related if it occurred within 24 hours after any procedure or within 30 days after a cardio-version or invasive cardiovascular procedure” (see the online-only Data Supplement for reference). Venous infarcts, traumatic brain injury, and subdural and epidural hemorrhages were specifically not considered strokes. The WHI collected data regarding only the first stroke event for each participant (ie, once a participant had confirmed stroke outcome, further stroke events were not adjudicated).
Defining Stroke in Medicare Data
Stroke hospitalization in Medicare data was defined using variables in the MedPAR (Medicare Provider Analysis and Review) files. We used MedPAR because our analysis was focused on hospitalized stroke, and MedPAR files have claims data regarding inpatient hospital stays. Each MedPAR record has claims from a single hospital stay and has up to 10 International Classification of Diseases (Ninth Revision; ICD-9 CM) discharge diagnosis codes indicating principal diagnosis in the primary position or coexisting conditions in subsequent or secondary positions. Four stroke definitions were created using a preliminary set of these codes. The preliminary set of ICD-9 CM codes was based on a review of published studies that identified acute stroke in administrative data as well as our previous experience with population-based stroke surveillance.2,5,11 The most comprehensive definition was for all stroke, which included ICD-9 430.xx, 431.xx, 433.x1, 434.x1, 436.xx, 437.1x, and 437.9x in any diagnostic position, whereas the definition for primary position stroke (indicating stroke as the principal diagnosis or reason for hospitalization) included the same codes but was limited to the first position. Ischemic stroke was defined as ICD-9 433.x1, 434.x1, 436.xx, 437.1x, and 437.9x in any diagnostic position, and hemorrhagic stroke was defined as ICD-9 430.xx and 431.xx in any diagnostic position. Claims not meeting these stroke definitions were classified as nonstrokes. Claims with ICD-9 CM codes pertaining to transient ischemic attacks (TIAs) were not considered strokes. Using the training data set, we ultimately refined the definitions by omitting the codes 437.1x and 437.9x, and all tables in this article use the final definitions. We also evaluated whether the coding algorithms could be further optimized by using rehabilitative therapy charges (ie, nonzero charge for physical, occupational, or speech therapy during the same hospital stay) as an added criterion to define stroke. Preliminary analyses suggested that this would likely increase specificity at the expense of a decrease in sensitivity, which was nontrivial because 20% of matched events did not have these charges. Therefore, these therapy charges were not used further for defining stroke. We explored the use of codes indicating iatrogenic stroke, and this is described in the online-only Data Supplement.
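The final code-based definitions can be illustrated with a short sketch. It assumes that diagnosis codes arrive as strings with or without a decimal point and that the principal diagnosis is listed first. The function and key names are hypothetical, and the code sets are the final ones described above (437.1x and 437.9x omitted).

```python
import re

# Final code sets from the text, after dropping 437.1x and 437.9x.
# Assumption: codes are strings such as "434.91" or "43491".
HEMORRHAGIC = re.compile(r"^43[01]\d{0,2}$")           # 430.xx, 431.xx
ISCHEMIC = re.compile(r"^(43[34]\d1|436\d{0,2})$")     # 433.x1, 434.x1, 436.xx

def normalize(code):
    """Strip whitespace and the decimal point so both code formats compare equally."""
    return code.strip().replace(".", "")

def classify_claim(diagnosis_codes):
    """Return which Medicare stroke definitions a claim meets.
    diagnosis_codes: list of up to 10 codes, principal diagnosis first."""
    codes = [normalize(c) for c in diagnosis_codes]
    hemorrhagic = any(HEMORRHAGIC.match(c) for c in codes)
    ischemic = any(ISCHEMIC.match(c) for c in codes)
    primary = bool(codes) and bool(HEMORRHAGIC.match(codes[0]) or ISCHEMIC.match(codes[0]))
    return {
        "any_position": hemorrhagic or ischemic,
        "primary_position": primary,
        "hemorrhagic": hemorrhagic,
        "ischemic": ischemic,
    }

example = classify_claim(["401.9", "431"])  # hypertension first, hemorrhage second
```

In this example the claim meets the any-position and hemorrhagic definitions but not the primary-position definition, because the stroke code is not listed first. A TIA code such as 435.9 meets none of the definitions.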
For the main analysis, the analytic unit was any hospitalization event. Included were confirmed WHI strokes after neurologist adjudication and all hospitalization claims from the Medicare MedPAR file, including stroke and nonstroke hospitalizations. Women who did not have any hospitalization recorded in the Medicare data and who did not have a WHI-confirmed stroke were excluded from this analysis. WHI strokes that were based on death certificate only (no hospitalization data) were excluded. The event-based analysis was especially intended to inform whether a hospitalization with stroke diagnosis codes in Medicare data was likely to represent a true stroke as well as the completeness of ascertainment. For each participant, Medicare stroke events within a 7-day time window (±7 days using admission dates) were regarded as a single stroke hospitalization. The first analytic step matched WHI and Medicare events, populating a 2×2 concordance matrix (Table 1). A Medicare stroke and a WHI stroke were considered a match if they occurred within ±7 days of each other. The 7-day match window was selected based on previous work showing that postdischarge readmission rate due to recurrent stroke at 7 days was low at 0.3% (95% CI, 0–0.7).12 For both Medicare and WHI data, we used admission date as event date. We defined a WHI diagnosis of stroke as reference standard, and WHI versus Medicare concordance was evaluated using κ-statistic, as well as by PPV, negative predictive value (NPV), sensitivity, and specificity. Discordant cells were evaluated in greater detail by examining reasons for a nonmatch. To understand the performance of Medicare data when adjudicated hospital medical records were available for comparison, we analyzed concordance–discordance after excluding events that were not informative (eg, no medical records were received and therefore the Medicare event could not be judged to be true or false).
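The two date rules in this paragraph are collapsing Medicare stroke admissions that fall within 7 days of each other into one event, and matching a WHI stroke to a Medicare stroke within ±7 days of admission. The sketch below illustrates both. It assumes that each admission is compared with the first admission of the current cluster, a detail the text does not specify. The function names are hypothetical.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=7)

def collapse_events(admission_dates, window=WINDOW):
    """Keep one admission per cluster; later admissions within `window`
    of the cluster's first admission are treated as the same event (assumption)."""
    kept = []
    for d in sorted(admission_dates):
        if not kept or d - kept[-1] > window:
            kept.append(d)
    return kept

def is_match(whi_admission, medicare_admissions, window=WINDOW):
    """True if any Medicare stroke admission is within +/- window of the WHI admission."""
    return any(abs(whi_admission - m) <= window for m in medicare_admissions)

# Toy dates for one hypothetical participant.
medicare = collapse_events([date(2005, 3, 1), date(2005, 3, 5), date(2005, 6, 10)])
matched = is_match(date(2005, 3, 8), medicare)
```

Each WHI stroke and each collapsed Medicare event then contributes to one cell of the 2×2 concordance matrix according to whether a match was found.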
Although our main analysis was event-based, we also performed a person-based analysis to evaluate the usefulness of Medicare data for ascertainment of incident strokes, because this is often of interest in clinical trials and cohort studies. For person-based analysis, we assessed whether each participant had a WHI stroke and, similarly, a Medicare stroke at any time during the overlapping follow-up period. Concordance was examined on Medicare versus WHI stroke status for each participant. If a participant had multiple hospitalizations with stroke codes in Medicare data, then the first hospitalization was used for any analyses requiring dates.
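A person-based tabulation along these lines might look like the following sketch. It assumes that each participant's data have already been restricted to the overlapping follow-up period. The participant IDs, dates, and function names are hypothetical.

```python
from collections import Counter
from datetime import date

def first_stroke_date(stroke_admissions):
    """Earliest stroke admission date for a participant, or None if there is none."""
    return min(stroke_admissions) if stroke_admissions else None

def person_level_table(whi_first, medicare_admissions):
    """Cross-tabulate ever/never stroke status per participant.
    whi_first: {participant_id: date or None}
    medicare_admissions: {participant_id: [stroke admission dates]}"""
    table = Counter()
    for pid in whi_first.keys() | medicare_admissions.keys():
        whi_yes = whi_first.get(pid) is not None
        cms_yes = first_stroke_date(medicare_admissions.get(pid, [])) is not None
        table[(whi_yes, cms_yes)] += 1
    return table

# Toy data with hypothetical participant IDs.
whi_first = {"A": date(2004, 5, 2), "B": None, "C": date(2006, 1, 9), "D": None}
medicare_admissions = {
    "A": [date(2005, 2, 1), date(2004, 5, 20)],
    "B": [date(2003, 7, 7)],
    "C": [],
    "D": [],
}
table = person_level_table(whi_first, medicare_admissions)
```

Participant A counts as concordant here even though the two dates are 18 days apart, which is how the person-based framework relaxes the ±7-day requirement of the event-based analysis.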
The analysis was first performed on the training set to allow potential fine-tuning of the coding algorithms and then repeated on the test (validation) data set.
A total of 48 850 WHI observational study participants met inclusion criteria for the stroke validation study. Of these, 24 428 randomly selected participants were used for algorithm development, and the remaining 24 422 formed the test/validation set. There were no significant differences in demographic characteristics between training and validation data sets.
A total of 31 399 Medicare hospitalizations among 24 428 participants were analyzed in event-based analysis (Table 1). Using WHI data, there were 582 strokes. Using Medicare data and the most general stroke definition (stroke diagnosis code in any position), there were 796 strokes. Of these, 478 were found in both WHI and Medicare databases with a date within ±7 days of each other.
Among 104 WHI strokes that were not found in Medicare data, 5 women were not hospitalized, that is, they were managed in an outpatient setting only. Hence, they were not identifiable in Medicare data of inpatient stays. For the remaining 99 WHI strokes, a key reason for discordance was that WHI picked up many patients with stroke who were discharged with nonstroke diagnosis codes in Medicare claims. We found a total of 198 distinct Medicare nonstroke discharge codes among hospitalizations within ±7 days of 99 WHI stroke events (because there could be multiple discharge codes and multiple nonstroke events in that ±7-day period). Review of all diagnosis codes in these hospital claims revealed a wide variety of conditions. The most frequent codes included diagnoses of essential hypertension (ICD-9 401.9; 33 events), atrial fibrillation (ICD-9 427.31; 21 events), diseases of the urinary system (ICD-9 599.0; 14 events), unspecified TIAs (ICD-9 435.9; 12 events), and heart failure (ICD-9 428.0; 11 events). We did not find any predominant code that could have been added to our stroke definition to improve the overall performance.
Among the 318 events found in Medicare data but not WHI data, 182 (57%) had no record of a corresponding hospitalization (for any medical condition) reported to the WHI program. Of these, 62 (34%) participants had died ≤365 days after Medicare hospitalization. In contrast, only 18.6% (89 of 478) of participants who had matching WHI and Medicare stroke events had died ≤365 days of hospitalization. Of the 136 Medicare events that had a corresponding WHI hospitalization reported, medical records were adjudicated by the stroke committee for only 79. Reasons for not being adjudicated included no medical records were received (11 events) due to administrative reasons (eg, no signed release of records or no documents received) and no stroke adjudication attempted (46 events; eg, reason reported by the participant for hospitalization was not suggestive of a WHI outcome of interest). Among the 79 records adjudicated by the WHI stroke committee and not found to be stroke, 27 (34%) were found to be TIAs.
Specificity (99.0%) and NPV (99.7%) were high for Medicare ascertainment of stroke. However, sensitivity was more modest at 82.1%. PPV was low at 60.1% in the initial analysis. When events that were discordant due to outpatient strokes or lack of adjudicated medical records were excluded, PPV increased to 85.8%, whereas sensitivity remained largely unchanged (82.8%). The exclusion of events for which no WHI adjudication was performed improved the WHI No/CMS Yes discordant cell (Table 1), and algorithm performance as measured by the κ-statistic increased from 0.69, indicating moderate agreement, to 0.84, indicating high agreement. The pattern of results was similar for all stroke definitions.
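The reported statistics follow from the cells of the 2×2 concordance matrix. The sketch below recomputes them from the counts given in the text (478 matches, 318 Medicare-only events, 104 WHI-only events in the initial analysis; 79 and 99 after exclusions). The WHI No/CMS No count is not stated in the text, so it is approximated here as total hospitalizations minus Medicare strokes (31 399 − 796). That value is an assumption, and under it the results come out close to the reported figures.

```python
def concordance_metrics(tp, fp, fn, tn):
    """Agreement statistics for a 2x2 table with WHI adjudication as the reference.
    tp: WHI Yes/CMS Yes, fp: WHI No/CMS Yes, fn: WHI Yes/CMS No, tn: WHI No/CMS No."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "kappa": (observed - expected) / (1 - expected),
    }

# Assumption: the true-negative count is not given in the text and is approximated.
TN_ASSUMED = 31399 - 796
initial = concordance_metrics(tp=478, fp=318, fn=104, tn=TN_ASSUMED)
refined = concordance_metrics(tp=478, fp=79, fn=99, tn=TN_ASSUMED)
```

Sensitivity and PPV depend only on counts stated in the text, so they do not rely on the assumed true-negative count. Specificity, NPV, and κ do.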
Compared with using diagnosis codes in any position to define stroke, the use of only principal diagnosis increased PPV from 60.1% to 64.4% when all records were included and remained unchanged at 85.8% when only events with adjudicated medical records were included. Sensitivity of ascertaining inpatient strokes decreased from 82.8% to 74.4%.
Among the 582 WHI-confirmed strokes, 109 (18.7%) were hemorrhagic, 453 (77.8%) were ischemic, and 20 (3.4%) were of unknown type. Hemorrhagic stroke coding algorithm had the best PPV among all stroke definitions (91.1% when limited to events with adjudicated medical records), although the sensitivity of this algorithm was lower (75.9%). In comparison, ischemic stroke coding algorithm had a PPV of 79.4% and a sensitivity of 82.2%.
Person-based analyses (Table 2) showed modestly improved concordance compared with event-based results. Using the most general stroke definition (codes in any diagnosis position), 505 participants had a stroke in both WHI and Medicare data during the overlapping follow-up period. For most of these (83%), the date matched exactly in both data systems: 88% were within ±1 day; 91% were within ±3 days; 95% were within ±7 days; and 96% were within ±30 days. Among 240 participants who had a stroke hospitalization identified in Medicare data only, 29 had WHI strokes based on death certificates only.
We examined the robustness of person-based results by expanding the match window for discordant events to ±30 days (Table I in the online-only Data Supplement) as done in other WHI–Medicare validation studies. Specifically, among the 240 participants with a Medicare stroke diagnosis without a WHI stroke, 125 had a WHI-reported hospitalization within ±30 days of Medicare admission date, and 65 of these had hospital records adjudicated. Other cells in the concordance matrix remained unchanged. Sensitivity was unchanged between the 7-day and 30-day match windows at 87.4%. PPV was 88.6% when using adjudicated WHI hospitalization in a 30-day time window compared with a PPV of 88.9% when using the ±7-day time window.
Optimizing Training Algorithm Performance
We examined the contribution of each stroke-related ICD-9 CM code used in the preliminary CMS stroke definition to concordance and discordance between WHI and CMS stroke events. Specifically, we examined the distribution of codes for WHI Yes/CMS Yes and WHI No/CMS Yes. Of all cerebrovascular disease codes, we found that codes 437.1 and 437.9 (defined as generalized and unspecified cerebrovascular disease in ICD-9 CM manuals) had a disproportionately high number of events for WHI No/CMS Yes (14% of events for the most comprehensive stroke definition) and low numbers for WHI Yes/CMS Yes (<1% of events for the most comprehensive stroke definition). Therefore, we eliminated these codes from the final stroke definitions; there was no change in sensitivity, whereas PPV improved slightly. For example, in the event-based analysis of the most general CMS stroke definition, PPV improved from 56.4% (data not shown) to 60.1% when the codes 437.1x and 437.9x were eliminated.
Algorithm performance on the test data set was similar to the performance on the training data set. In event-based analyses (Table II in the online-only Data Supplement) using adjudicated WHI hospitalizations for the most comprehensive stroke definition, sensitivity was 82.8% for the training data set versus 82.0% for the test data set, and PPV was 85.8% for the training data set versus 84.6% for the test data set. Specificity and NPV were 99.7% for both data sets. Results for the stroke definition using only principal diagnosis (definition 2) and ischemic strokes (definition 3) showed that algorithm performance on the test data set was similar (and frequently identical) to that on the training data set. For hemorrhagic strokes (definition 4), sensitivity (75.9% for training data set versus 75.3% for test data set), specificity (>99.9% for both), and NPV (99.9% for both) were similar. PPV diminished from 91.1% for the training data set to 84.3% for the test data set, but this difference was not statistically significant.
In person-based analysis, a similar pattern was found (Table III in the online-only Data Supplement). Also, test data set results were robust when expanding the match window for discordant events to ±30 days (Table IV in the online-only Data Supplement). Specifically, among the 277 participants with a Medicare stroke diagnosis without a WHI stroke, 145 had a WHI-reported hospitalization within ±30 days of Medicare admission date, and 77 of these had their hospital records adjudicated. PPV was 86.9% when using adjudicated WHI hospitalizations compared with a PPV of 87.5% when using the ±7-day time window.
A comparison of performance across all Medicare stroke definitions using event-based and person-based frameworks in test data set for events when medical records were available is shown in Table 3.
These analyses of linked WHI–Medicare data showed that a claims-based algorithm for identifying acute strokes had good sensitivity and excellent specificity and NPVs when compared with patient self-report followed by expert adjudication. Because strokes were infrequent overall, PPV was easily influenced by small changes in specificity. Nevertheless, PPV was fairly high (85% in event-based and 88% in person-based analyses in test data set; definition 1) when only Medicare stroke events with WHI-adjudicated medical records were included. A comparison of performance across all Medicare stroke definitions in the more generalizable test data set, restricted to events for which WHI medical records were available (Table 3), showed that, overall, the person-based approach had higher sensitivity and PPV than the event-based approach. The definitions using stroke codes in any position or ischemic stroke codes had the highest sensitivity, ranging from 82% in event-based to 88% in person-based analyses for both definitions. The more stringent Medicare definition using only primary position stroke codes and the hemorrhagic stroke definition had lower sensitivities in the range of 75% for event-based and 81% for person-based analyses (Table 3). The definition using primary position stroke code had the best PPV at 88% in event-based and 90% in person-based analyses. Based on our results, we conclude that Medicare data seem useful for population-based stroke research and that performance characteristics of the Medicare data depend on the definition selected. The choice of stroke definition may be guided by the sensitivity and PPV desired in the study. One consideration is that Medicare claims data are typically available many months after the end of a calendar year due to the time needed for claims accrual and processing, especially for hospitalizations occurring late in the calendar year. For example, data on Medicare claims for 2011 may not be available until approximately mid to late 2012. This has to be factored into research planning. There is also some lag in event ascertainment using the WHI approach due to the time needed for obtaining self-report data and adjudicating medical records.
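The remark that PPV is easily influenced by small changes in specificity when events are infrequent follows from the standard relationship between PPV, sensitivity, specificity, and prevalence. The sketch below illustrates this with hypothetical inputs (2% prevalence, 80% sensitivity) that are not taken from the study tables.

```python
def ppv_from_rates(sensitivity, specificity, prevalence):
    """Positive predictive value implied by sensitivity, specificity, and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Hypothetical inputs: a rare outcome and two nearby specificity values.
for spec in (0.990, 0.997):
    ppv = ppv_from_rates(sensitivity=0.80, specificity=spec, prevalence=0.02)
    print(f"specificity={spec:.3f} -> PPV={ppv:.3f}")
```

With a rare outcome, nearly all records are true negatives, so a change of well under 1 percentage point in specificity moves PPV by many percentage points.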
Our results are consistent with other published reports. A recent systematic review of 26 articles comparing administrative codes versus medical record abstraction found PPVs of ≥80% when stroke-specific ICD-9 CM codes such as 430, 431, and 434 were used. Although sample sizes ranged from 50 to 4000, many of the studies had a small number of cases, were from smaller geographic areas, and did not examine training versus test data validation.2 Our study is unique in its timespan of 10 to 14 years, its use of neurologist review, and its cohort, which was drawn from across the United States Medicare population, a high-risk population for stroke. We recognize that our training and test data sets were both derived from the same larger data sample and thus were not completely independent. Nevertheless, because our training and test samples were nonoverlapping, we think that this study provides some validation of algorithm performance beyond that typically reported.
Although the terms sensitivity and specificity were used in the traditional sense, we note that our application was not typical in that the reference standard (WHI) was imperfect. There were some hospitalizations with stroke diagnosis codes recorded in Medicare data for which no hospitalization was reported by WHI participants or their proxies. More complete ascertainment could potentially be achieved using a multipronged approach such as adding Medicare data or, for studies that were geographically limited (unlike WHI), using discharge codes from hospitals within a certain geographic area and surveillance of administrative claims.
This study validates concordance between WHI and Medicare stroke events using 2 approaches. The first is the more stringent event-based analysis, which required that the WHI and Medicare strokes had to be within ±7 days of each other to be declared a match. The person-based approach relaxed this requirement, and this may have increased the matches slightly as well as decreased discordant events for individuals with multiple stroke hospitalizations; hence, sensitivity and PPV were slightly higher in person-based analysis. The majority of matches, however, occurred within a 7-day interval. The results of event-based analysis presented here are relevant when any event (initial or recurrent) in Medicare hospital data is the focus, whereas the results of person-based analysis apply to incident events in trials and cohort studies.
When Medicare events were limited to those for which medical records were available for review by the stroke adjudication committee, we found generally high, although imperfect, concordance between Medicare data and medical records. We also note that there were WHI participants who had confirmed strokes based on hospital record adjudication, for whom Medicare diagnosis codes were not stroke-related. These Medicare hospitalizations, which corresponded temporally with WHI hospitalizations, did not include stroke-related codes but instead included codes for hypertension, atrial fibrillation, diseases of the urinary system, heart failure, and TIAs, among others. Because the hospital record clearly supported a stroke diagnosis, this undercoding could reflect a coding of stroke-related complications or comorbid conditions rather than stroke. However, we did not feel that we could add any of these ICD-9 CM codes to our Medicare stroke definition, because any gain in sensitivity would come at the expense of specificity and PPV. Furthermore, a wide variety of ICD-9 CM codes, and not just a few predominant ones, accounted for this discordance. Conversely, there were many cases with a principal diagnosis of stroke in Medicare data, which did not pass adjudication in the WHI. Approximately one-third of these were TIAs per neurologist adjudication in the WHI. Most, however, appeared to be other conditions. Thus, we conclude that although errors in the assignment of hospital codes would result in both overascertainment and underascertainment of stroke in Medicare data, there was moderate agreement in the initial analysis with an overall κ of 0.69, which increased to 0.84 when discordance was limited to adjudicated records.
The process of bringing the WHI cohort and the Medicare data in line for an event-based comparison revealed important insights about the strengths and weaknesses of each data source for the purpose of identifying incident strokes. As noted previously, using the Medicare data involves reliance on hospital coding. In some cases, strokes may be considered among differential diagnoses and ruled out by physicians. These cases should not get coded as strokes; however, they may get coded as strokes if the conclusion is not clearly documented in the medical record by the treating clinician. In other cases, undercoding may occur, for example, when an actual stroke is one among multiple medical problems and comorbidities and is not identified in the coding process.
We note that Medicare identified more stroke events compared with WHI. Some of these hospitalizations were not identified or adjudicated by the WHI, and these constituted the majority of discordant events. For example, in Table 1, the WHI No, CMS Yes data declined by 75% (from 318 to 79) when discordance was limited to only those with adjudicated records. The main reason for this is that WHI case ascertainment relies on report by a participant (or proxy) before a record can be adjudicated. For a minority of these participants, although WHI did not receive a report of hospitalization, stroke was ascertained via death certificate data.
Strengths of our work include a large nationwide cohort of WHI participants linked to Medicare, availability of neurologist-adjudicated stroke events in the WHI, and examination of both event-based and person-based frameworks for evaluating concordance. A major limitation arises from the fact that we lacked Medicare data for participants enrolled in managed care plans. These participants were excluded from our analyses. In general, for studies such as the WHI that enroll persons from a broad United States population, the extent to which this may cause a problem depends on the degree of managed care enrollment and the extent to which managed care enrollees are dissimilar to those with traditional fee-for-service Medicare. Another limitation is that our analysis predominantly included persons aged ≥65 years because that is the age of Medicare eligibility, although younger persons with disability or end-stage renal disease are also eligible. A different consideration is that using the adjudicated subset as the best estimate of PPV (ie, third row for each stroke definition in Tables 1 and 2 and in Tables II and III in the online-only Data Supplement) implicitly assumes that nonreported and nonadjudicated events in the WHI No/CMS Yes cell would yield a similar PPV if all their data had been available for adjudication. However, if these events differ systematically from adjudicated events, then the true PPV may lie between the PPV value that includes all events (ie, first row) and the value that includes only events for which records were available for adjudication (ie, third row).
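The range within which the true PPV may lie can be illustrated with the training-set event counts for the most general definition (478 matches, 79 adjudicated nonstrokes, and 318 − 79 = 239 Medicare-only events without adjudication). The sketch below is a hypothetical sensitivity analysis, not one reported in the study: it varies the unknown fraction of nonadjudicated events that were true strokes.

```python
def ppv_if_fraction_true(tp, fp_adjudicated, unadjudicated, fraction_true):
    """PPV when a hypothetical fraction of nonadjudicated Medicare-only events are true strokes."""
    total_flagged = tp + fp_adjudicated + unadjudicated
    return (tp + fraction_true * unadjudicated) / total_flagged

TP, FP_ADJ, UNADJ = 478, 79, 318 - 79

# If none of the nonadjudicated events were strokes, PPV equals the all-events value.
lower = ppv_if_fraction_true(TP, FP_ADJ, UNADJ, 0.0)
# If they were strokes at the same rate as adjudicated events, PPV equals the adjudicated-only value.
same_as_adjudicated = ppv_if_fraction_true(TP, FP_ADJ, UNADJ, TP / (TP + FP_ADJ))
```

The two endpoints correspond to the first-row and third-row PPVs (about 60% and 86%). Intermediate fractions give values in between.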
This study compares 2 case ascertainment approaches for acute stroke hospitalization events: patient self-report of stroke followed by neurologist adjudication (WHI stroke) versus ascertainment in Medicare data using 4 Medicare stroke definitions in an event-based or person-based framework. Both approaches were imperfect, and although Medicare data claims helped capture some strokes not identified as such by the WHI, Medicare data also missed some WHI-adjudicated stroke events. We report the following insights into the performance characteristics of Medicare data. All Medicare stroke definitions had high specificity and NPVs when compared with the WHI approach. Medicare stroke definitions, which used stroke codes in any diagnosis position or ischemic stroke codes, had high sensitivities in a person-level framework. The definition using stroke codes in any diagnosis position also had a high PPV. Medicare definitions, which used hemorrhagic stroke codes or only primary position stroke codes, were less sensitive, especially in the event-based framework, although using a stroke code in the primary position had the highest PPV. The results of this study can assist researchers in selecting the best Medicare definition that suits their purpose. Our findings pave the way for informed use of Medicare data to ascertain strokes, for example, during long-term follow-up or in large resource-efficient clinical trials, with the caveat that outcome ascertainment in managed care enrollees may have to rely on other data sources.
The authors thank the WHI investigators and staff for their dedication and the study participants for making the program possible. A full listing of WHI investigators can be found at https://cleo.whi.org/researchers/Documents%20%20Write%20a%20Paper/WHI%20Investigator%20Short%20List.pdf.
Sources of Funding
The WHI program is funded by the National Heart, Lung, and Blood Institute/National Institutes of Health, US Department of Health and Human Services, through contracts HHSN268201100046C, HHSN268201100001C, HHSN268201100002C, HHSN268201100003C, HHSN268201100004C, and HHSN271201100004C. Dr Safford was supported by K24 HL111154 for this work.
The online-only Data Supplement is available with this article at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.113.003408/-/DC1.
- Received September 4, 2013.
- Revision received December 21, 2013.
- Accepted January 8, 2014.
- © 2014 American Heart Association, Inc.
- 1. The Women’s Health Initiative. https://www.whi.org. Accessed April 10, 2013.
- Tirschwell DL,
- Longstreth WT Jr.
- Tirschwell D,
- Kukull WA,
- Longstreth WT Jr..
- Goldstein LB
- Leibson CL,
- Naessens JM,
- Brown RD,
- Whisnant JP
- Lakshminarayan K,
- Anderson DC,
- Jacobs DR Jr.,
- Barber CA,
- Luepker RV
- Lakshminarayan K,
- Schissel C,
- Anderson DC,
- Vazquez G,
- Jacobs DR Jr.,
- Ezzeddine M,
- et al