Scope of Preclinical Testing Versus Quality Control Within Experiments
To the Editor:
We would like to congratulate Drs Philip, Benatar, Fisher and Savitz1 on their recent evaluation of quality in preclinical stroke studies, but also wish to clarify a reference they made to our own observations, which we believe are consistent with their findings.
Dr Philip et al stated that “O'Collins et al using their own set of guidelines concluded that NXY-059 nearly achieved all their quality criteria (9/10), but subsequently one of the authors provided a reanalysis in a subsequent publication (reporting that the drug achieved only 4.5/10 criteria).” The criteria used by O'Collins et al2 were based on the STAIR recommendations for preclinical testing,3 and, as detailed in our methods, they relate specifically to the subset of the STAIR recommendations that reflects the range or scope of testing across studies, not the scientific validity within individual studies. This evaluation is thus very similar to what Dr Philip et al describe as “adequacy of preclinical literature” and, like theirs, shows that the range or scope of testing for NXY-059 was high as defined by this scale.
Dr Philip et al also cite Dr Donnan's Feinberg Lecture,4 which referred to a separate analysis of NXY-059 indicating that the quality of individual studies was low. That analysis has been published in full by Dr Macleod et al5 and relates specifically to the reported quality within individual experiments and their internal scientific validity, based on measures taken to avoid the introduction of bias. This evaluation was very similar to Dr Philip et al’s “methodological quality score” and, like theirs, found that experimental quality tended to be low for NXY-059 using these criteria. Importantly, this pattern is not unique to NXY-059. For instance, the range of testing for magnesium, melatonin and minocycline was 8 to 9 of 10, while the average intraexperimental quality was 4 of 10 (O'Collins and Howells, unpublished data, 2009).
Clearly, both the range of testing and the quality within individual experiments are important, but they address different parts of the problem of translational stroke research. If individual experiments are biased toward a particular outcome, their value is reduced. However, if we wish to develop new treatments for stroke, then testing across a range of circumstances is useful because stroke is both a heterogeneous disease and a moving target. A patient’s genetic background, their predisposing risk factors and the unique pathology with which they present to hospital will all contribute to their outcome. No single model of stroke is likely to fully encompass this heterogeneity. Therefore, in preclinical drug evaluation we need a well-conducted set of experiments performed under the range of conditions that best mimic the circumstances likely to be experienced by the cohort of stroke patients targeted by the treatment.
We, Dr Philip et al and others have used simple scoring systems based on the original STAIR recommendations for range of testing and individual experimental quality as tools to explore the limitations of the existing stroke literature.1,6 These tools have provided some insight into past research, and we hope they can help increase the chance of successful translation of scientific findings. Nevertheless, evaluation of the stroke literature is an ongoing, fluid process. These tools and the models they assess need to be used wisely and should be updated in a timely manner in response to new developments in stroke science. Checklists should enable, not constrain.
Philip M, Benatar M, Fisher M, Savitz SI. Methodological quality of animal studies of neuroprotective agents currently in phase II/III acute ischemic stroke trials. Stroke. 2009; 40: 577–581.
STAIR. Recommendations for standards regarding preclinical neuroprotective and restorative drug development. Stroke. 1999; 30: 2752–2758.
Donnan GA. The 2007 Feinberg lecture: a new road map for neuroprotection. Stroke. 2008; 39: 242.
Macleod MR, van der Worp HB, Sena ES, Howells DW, Dirnagl U, Donnan GA. Evidence for the efficacy of NXY-059 in experimental focal cerebral ischaemia is confounded by study quality. Stroke. 2008; 39: 2824–2829.
Macleod MM, Fisher M, O'Collins V, Sena ES, Dirnagl U, Bath PM, Buchan A, van der Worp HB, Traystman R, Minematsu K, Donnan GA, Howells DW. Good laboratory practice: preventing introduction of bias at the bench. Stroke. 2009; 40: e50–e52.