Optimizing Stroke Clinical Trial Design
Estimating the Proportion of Eligible Patients
Background and Purpose—Clinical trial planning and site selection require an accurate estimate of the number of eligible patients at each site. In this study, we developed a tool to calculate the proportion of patients who would meet a specific trial’s age, baseline severity, and time to treatment inclusion criteria.
Methods—From a sample of 1322 consecutive patients with acute ischemic cerebrovascular syndromes, we developed regression curves relating the proportion of patients within each range of the 3 variables. We used half the patients to develop the model and the other half to validate it by comparing predicted vs actual proportions who met the criteria for 4 current stroke trials.
Results—The predicted proportion of patients meeting inclusion criteria ranged from 6% to 28% among the different trials. The proportion of trial-eligible patients predicted from the first half of the data was within 0.4% to 1.4% of the actual proportion of eligible patients. This proportion increased logarithmically with National Institutes of Health Stroke Scale score and time from onset; lowering the baseline limits of the National Institutes of Health Stroke Scale score and extending the treatment window would have the greatest impact on the proportion of patients eligible for a stroke trial.
Conclusions—This model helps estimate the proportion of stroke patients eligible for a study based on different upper and lower limits for age, stroke severity, and time to treatment, and it may be a useful tool in clinical trial planning.
Clinical trial planning and site selection depend on an accurate estimate of eligible patients at each site. Overestimates may lead to slower than expected recruitment rates.1 The purpose of this study was to develop a tool to calculate the proportion of patients who would meet a specific trial’s age, baseline severity, and time to treatment inclusion criteria.
Patients and Methods
This is a retrospective analysis of data collected prospectively for quality-improvement purposes at Suburban Hospital in Bethesda, Maryland, and Washington Hospital Center in Washington, DC. This analysis includes data from all patients with acute ischemic cerebrovascular syndrome2 seen by the National Institutes of Health stroke team at both hospitals between September 30, 2000 and June 30, 2006, whose age, baseline National Institutes of Health Stroke Scale (NIHSS) score, and onset to triage time (OTT) were known. We abstracted patient data (age, NIHSS score, time last seen normal, and triage time) from the stroke team’s clinical database. For the analysis, we used the first NIHSS score recorded by the stroke team. We calculated the OTT by subtracting the time last seen normal from the triage time as documented in the emergency department log. The stroke code paging time was used as the triage time for all inpatient stroke cases. For estimating the proportion of patients presenting within a target treatment time window, we used the OTT plus 60 minutes. When the NIHSS score was missing but the hospital chart documented resolution of symptoms at the time of the evaluation, a score of 0 was given. Patients who had missing data, were younger than 18 years, or who died before hospital admission were excluded. Patient identifiers were removed before the final analyses.
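The time calculation described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code; the function names and the example timestamps are hypothetical, and only the rule itself (OTT equals triage time minus time last seen normal, with 60 minutes added to estimate time to treatment) comes from the text.

```python
from datetime import datetime, timedelta

# Allowance added to OTT to estimate time to treatment (60 minutes, per the text).
TRIAGE_TO_TREATMENT = timedelta(minutes=60)


def onset_to_triage(last_seen_normal: datetime, triage: datetime) -> timedelta:
    """OTT: triage time minus time last seen normal."""
    return triage - last_seen_normal


def estimated_time_to_treatment(last_seen_normal: datetime, triage: datetime) -> timedelta:
    """Estimated onset-to-treatment time: OTT plus 60 minutes."""
    return onset_to_triage(last_seen_normal, triage) + TRIAGE_TO_TREATMENT


def within_window(last_seen_normal: datetime, triage: datetime, window_hours: float) -> bool:
    """True if the estimated time to treatment falls inside the treatment window."""
    estimate = estimated_time_to_treatment(last_seen_normal, triage)
    return estimate <= timedelta(hours=window_hours)


# Hypothetical patient: last seen normal at 08:00, triaged at 09:30.
lsn = datetime(2005, 3, 14, 8, 0)
tri = datetime(2005, 3, 14, 9, 30)
ott = onset_to_triage(lsn, tri)
```

For this hypothetical patient the OTT is 90 minutes, so the estimated time to treatment is 150 minutes: inside a 3-hour window but outside a 2-hour one.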
To fit the data to a regression equation, we created a cumulative frequency table describing, for each nonzero value of the 3 variables of interest (age, NIHSS score, and OTT), the proportion of patients with values less than or equal to it. We removed the outlier values (approximately the highest and lowest 2.5% of the sample for age and OTT, and the highest 2.5% for NIHSS score) and fit regression curves using DataFit 9.0.59 (Oakdale Engineering). The best-fitting curve was defined as the lowest-order function that explained >99.5% of the variance and conformed to the shape of the data.
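As an illustration of this step, the sketch below builds a cumulative frequency table and fits a first-order logarithmic curve, p = a + b·ln(x), by ordinary least squares. It is not the DataFit procedure used in the study: the scores are hypothetical, outlier trimming is omitted, and the function names are our own. Patients with a value of 0 count toward the cumulative proportion but get no row of their own, which is one reading of "nonzero values" in the text.

```python
import math
from collections import Counter


def cumulative_frequency(values):
    """Return (value, proportion of patients with a value <= it) for each nonzero value."""
    n = len(values)
    counts = Counter(values)
    table, running = [], 0
    for v in sorted(counts):
        running += counts[v]          # zeros are counted here ...
        if v > 0:                     # ... but only nonzero values get a row
            table.append((v, running / n))
    return table


def fit_first_order_log(table):
    """Least-squares fit of p = a + b*ln(x); returns (a, b, r_squared)."""
    xs = [math.log(v) for v, _ in table]
    ys = [p for _, p in table]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot


# Hypothetical NIHSS scores, not study data.
scores = [0, 1, 1, 2, 2, 3, 4, 5, 7, 10, 14, 20]
table = cumulative_frequency(scores)
a, b, r_squared = fit_first_order_log(table)
```

The same table could be passed to any other candidate curve (for example a Weibull function for age); the selection rule in the text would then keep the lowest-order function whose explained variance exceeds 99.5%.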
To validate the model, we divided the sample into 2 groups, A and B, randomly assigning patients into one group or the other group. Using the regression equations obtained from sample A, we calculated the probability that patients would meet the age, NIHSS, and time to treatment (estimated as OTT plus 60 minutes) criteria for 4 recently published or ongoing stroke clinical trials (DIAS-2, MR RESCUE, ROSIE, and SAINT)3–6 and compared these predicted proportions with the actual proportions of trial-eligible patients in group B. A deviation between predicted and actual of >5% was considered significant. After confirming the predictive validity of the model, we fitted regression curves for the entire sample for further use.
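The split-sample check can be sketched as follows. This is a simplified illustration under stated assumptions: the patient records and the eligibility limits are hypothetical (they are not the criteria of any of the 4 trials), the field names are invented, and the 5% threshold is interpreted as an absolute difference in percentage points.

```python
import random


def split_sample(patients, seed=0):
    """Randomly assign each patient to group A or group B."""
    rng = random.Random(seed)
    group_a, group_b = [], []
    for patient in patients:
        (group_a if rng.random() < 0.5 else group_b).append(patient)
    return group_a, group_b


def meets_criteria(patient, criteria):
    """True if every variable lies within its (lower, upper) limits."""
    return all(low <= patient[key] <= high for key, (low, high) in criteria.items())


def actual_proportion(patients, criteria):
    """Observed proportion of patients meeting all criteria."""
    return sum(meets_criteria(p, criteria) for p in patients) / len(patients)


def deviation_is_significant(predicted, actual, threshold=0.05):
    """Flag a predicted-vs-actual gap larger than the threshold (assumed absolute)."""
    return abs(predicted - actual) > threshold


# Hypothetical records; "ttt_hours" stands for OTT plus 60 minutes, in hours.
patients = [
    {"age": 72, "nihss": 6, "ttt_hours": 4.0},
    {"age": 55, "nihss": 2, "ttt_hours": 2.5},
    {"age": 81, "nihss": 12, "ttt_hours": 9.0},
    {"age": 64, "nihss": 8, "ttt_hours": 5.5},
]
criteria = {"age": (18, 85), "nihss": (4, 20), "ttt_hours": (0, 6)}

group_a, group_b = split_sample(patients, seed=1)
# In the study design, the observed proportion would be computed in group B;
# the full list is used here only to keep the toy example well defined.
observed = actual_proportion(patients, criteria)
```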
We calculated the proportion of patients whose age, NIHSS score, and OTT were within a range by subtracting the proportion of patients whose values were below the lower limit from the proportion of patients whose values were below the upper limit. The proportion of patients within range for all 3 variables was calculated as the product of the 3 individual proportions, assuming that correlations among the 3 variables were negligible. To confirm that assumption, we calculated pair-wise correlations among the 3 variables.
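The calculation in this paragraph amounts to taking F(upper) − F(lower) for each variable's cumulative curve F and multiplying the 3 results. The sketch below shows that arithmetic. The 3 cumulative functions are deliberately simple placeholders, not the fitted equations in Table 3, and the limits are hypothetical.

```python
import math


def clip01(p):
    """Keep a fitted proportion inside [0, 1]."""
    return min(1.0, max(0.0, p))


# Placeholder cumulative curves (assumptions for illustration only).
cdfs = {
    "age": lambda years: clip01((years - 20) / 80),
    "nihss": lambda score: clip01(0.25 + 0.25 * math.log(score)) if score > 0 else 0.0,
    "ttt_hours": lambda hours: clip01(hours / 24),
}


def proportion_in_range(cdf, lower, upper):
    """Proportion within the limits: F(upper) minus F(lower)."""
    return max(0.0, cdf(upper) - cdf(lower))


def eligible_proportion(cdfs, limits):
    """Product of the per-variable proportions, assuming negligible correlation."""
    result = 1.0
    for name, (lower, upper) in limits.items():
        result *= proportion_in_range(cdfs[name], lower, upper)
    return result


# Hypothetical trial limits.
limits = {"age": (40, 80), "nihss": (4, 20), "ttt_hours": (0, 6)}
total = eligible_proportion(cdfs, limits)
```

With these placeholder curves the per-variable proportions are 0.5 for age, about 0.40 for NIHSS score, and 0.25 for time, giving a combined estimate of about 5%. The product is only valid while the variables are close to independent, which is why the pair-wise correlations are checked.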
Results
A total of 1322 patients met the inclusion criteria; their summary statistics are listed in Table 1. Of these, 651 were randomized to group A and 671 to group B. In group A, the frequency distribution for age was best fit by a Weibull function (r2=0.997 and 0.997, respectively); for OTT, it was best fit by a third-order logarithmic function (r2=0.998 and 0.994); and for NIHSS score, it was best fit by a first-order logarithmic function (r2=0.997 and 0.997). The proportions (95% CI) of trial-eligible patients in group B predicted from group A were 6.2% (4.6%–8.2%), 8.2% (6.2%–10.6%), 28.9% (25.5%–32.5%), and 20.1% (17.2%–23.4%), respectively. The actual proportions of patients in group B who met these criteria were similar (Table 2). Because in all cases the deviation between the predicted and actual proportions of patients who met all criteria was within 5%, we proceeded to create final regression equations using the entire dataset. These equations are shown in Table 3. Graphs of the cumulative frequency distribution for the actual and fitted data for each of the 3 variables are shown in the Figure. The pair-wise correlation coefficients for the 3 variables were between 0.05 and 0.13, confirming poor correlation among the variables.
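The pair-wise correlation check could be reproduced along the following lines. The text does not state which correlation coefficient was used; this sketch assumes Pearson's r, and the data columns are hypothetical values, not study data.

```python
import math
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def pairwise_correlations(columns):
    """Correlation for every pair of named columns."""
    return {
        (a, b): pearson(columns[a], columns[b])
        for a, b in combinations(columns, 2)
    }


# Hypothetical values for 6 patients.
columns = {
    "age": [72, 55, 81, 64, 47, 90],
    "nihss": [6, 2, 12, 8, 3, 1],
    "ott_hours": [3.0, 1.5, 8.0, 4.5, 20.0, 2.0],
}
correlations = pairwise_correlations(columns)
```

Coefficients near 0, such as the 0.05 to 0.13 reported here, support multiplying the 3 proportions; larger values would call for a joint model instead.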
Discussion
We developed a model to estimate the proportion of stroke patients who meet eligibility requirements for a combination of common clinical trial selection variables. With this model, it is possible to estimate the impact on recruitment rate of different cut-off points for 3 of the most influential entry criteria. Because the variables considered in this article did not correlate with each other, the simple arithmetic product of the proportions for each variable was a satisfactory predictor. This approach to estimating the proportion of patients eligible for trials could potentially accommodate additional criteria, eg, imaging features. If the additional variables correlate with the others, however, then the calculation of the proportion meeting all criteria would need to account for that.
Our study has some limitations. We excluded 256 patients from the final dataset because they were missing data, most commonly time last known well. These patients, however, would also not be eligible for acute therapies. Because of this selection bias, the model may overestimate the proportion of patients eligible for a specific clinical trial. In addition, many of our patients would have been excluded because they had a mild stroke (the median baseline NIHSS score was 3). Although our sample is large, was collected prospectively by several physicians, and combines data from 2 stroke centers serving a multi-ethnic and socioeconomically varied population in inner city and suburban settings, our results would have to be replicated with data from other stroke centers. Despite these limitations, we believe that the statistical functions describe the relationship of baseline features to the proportion of patients, although the parameters of the regression equations we defined may vary by stroke center or geographical regions because of different demographic and organizational characteristics.
This model may be useful in clinical trial planning. Because the proportion of patients increased logarithmically with NIHSS score and time from onset (Figure), allowing the inclusion of patients with milder strokes and earlier treatment (ie, not limiting enrollment to patients beyond the standard thrombolytic time window) will have a greater impact on the proportion of eligible patients than extending the time window or allowing older patients or patients with more severe disease to enroll. The design of clinical trials strikes a balance among several important, often competing, features, including sample size, recruitment rate, years required to complete the trial, generalizability, and optimal patient selection to maximize effect size. Although lowering the minimum NIHSS requirement would markedly increase the pool of eligible patients, these patients tend to recover spontaneously and may be less likely to demonstrate a treatment effect. Enrolling only patients outside the time window for intravenous thrombolytic therapy may be desirable when investigating the effects of a new treatment as monotherapy, but that decision would exclude the nearly 50% of otherwise eligible patients who present in time for alteplase therapy. Balancing these design factors is difficult and often depends on expert opinion. The model we developed adds a quantitative dimension to this decision-making process that has not been previously available.
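The asymmetry described above follows directly from the shape of a logarithmic curve. For a first-order fit p(x) = a + b·ln(x), moving a limit from x1 to x2 changes the proportion by b·ln(x2/x1), so the same absolute step is worth far more at the low end of the scale. The sketch below illustrates this with a hypothetical slope b; it is not one of the fitted coefficients in Table 3.

```python
import math


def gain_from_lowering_minimum(b, lower, step):
    """Added proportion when the lower limit drops by `step` (requires lower > step)."""
    return b * math.log(lower / (lower - step))


def gain_from_raising_maximum(b, upper, step):
    """Added proportion when the upper limit rises by `step`."""
    return b * math.log((upper + step) / upper)


B = 0.25  # hypothetical slope of the logarithmic curve

# Same 2-point change in NIHSS score applied at each end of a 4-20 range.
low_end_gain = gain_from_lowering_minimum(B, lower=4, step=2)
high_end_gain = gain_from_raising_maximum(B, upper=20, step=2)
```

With this slope, lowering the minimum from 4 to 2 adds about 17 percentage points, whereas raising the maximum from 20 to 22 adds about 2. The ratio between the two gains does not depend on the value chosen for b.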
The authors thank Dr Lawrence Latour and the members of the National Institutes of Health Stroke Teams at Suburban Hospital and Washington Hospital Center who assisted with data collection and patient care.
Sources of Funding
This research was supported by the Division of Intramural Research of the National Institute of Neurological Disorders and Stroke, National Institutes of Health.
- Received January 6, 2010.
- Accepted January 11, 2010.
Elkins JS, Khatabi T, Fung L, Rootenberg J, Johnston SC. Recruiting subjects for acute stroke trials: a meta-analysis. Stroke. 2006; 37: 123–128.
Kidwell CS, Warach S. Acute ischemic cerebrovascular syndrome: diagnostic criteria. Stroke. 2003; 34: 2995–2998.
Hacke W, Furlan AJ, Al-Rawi Y, Davalos A, Fiebach JB, Gruber F, Kaste M, Lipka LJ, Pedraza S, Ringleb PA, Rowley HA, Schneider D, Schwamm LH, Leal JS, Söhngen M, Teal PA, Wilhelm-Ogunbiyi K, Wintermark M, Warach S. Intravenous desmoteplase in patients with acute ischaemic stroke selected by MRI perfusion-diffusion weighted imaging or perfusion CT (DIAS-2): a prospective, randomised, double-blind, placebo-controlled study. Lancet Neurol. 2009; 8: 141–150.
Diener HC, Lees KR, Lyden P, Grotta J, Davalos A, Davis SM, Shuaib A, Ashwood T, Wasiewski W, Alderfer V, Hårdemark HG, Rodichok L; SAINT I and II Investigators. NXY-059 for the treatment of acute stroke: pooled analysis of the SAINT I and II Trials. Stroke. 2008; 39: 1751–1758.