Response to Letter by Cloft and Kallmes
We appreciate the comment by Drs Cloft and Kallmes on our recent article regarding the matched-pair evaluation of a new bioactive coil.1 It provides us with the opportunity to briefly clarify the statistical analysis and the rationale behind it. This will illustrate why the statistical analysis of the data presented in this article is accurate and not erroneous. We did indeed use a 1-sided Fisher exact test for comparison of the 6-month follow-up occlusion rates. This matters for statistical inference, ie, for the confidence that our reported results do in fact detect a true-positive increase of follow-up occlusion rates with Cerecyte compared with bare platinum coils, especially because a 2-sided test was performed for comparison of the initial treatment results.
However, we do not agree that this corresponds to a “basic statistical error” and that the use of a 1-sided test for comparison of the follow-up occlusion rates would be “inappropriate”, as claimed by Drs Cloft and Kallmes. Without wanting to delve too deeply into the philosophy of significance testing (see reference 2 for a more in-depth discussion), it appears well established that false-positive probability rates, ie, probability values and significance levels, have no interpretable “true value” without a specific null-hypothesis and specific assumptions about how the data deviate from it. It is largely misleading to claim that 2-sided testing is simply “more conservative” compared with 1-sided statistical inference: both use the same null-hypothesis and evaluate a given test statistic under different assumptions about the alternative hypothesis. Neither one is “incorrect” in the way claimed by Drs Cloft and Kallmes. Instead, each tests for a different type of deviation from the null-hypothesis and thus answers a different question asked about the data.
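The relation between the two tests can be made concrete with a minimal sketch of the Fisher exact test for a 2×2 table. The function names and the counts below are hypothetical and chosen for illustration only; they are not the data of our study. Both P values are computed from the same null distribution (the hypergeometric distribution with fixed margins) and differ only in which tables are counted as at least as extreme as the observed one.

```python
from math import comb


def hypergeom_pmf(k, row1, row2, col1):
    """Null probability of k successes in row 1, given fixed table margins."""
    return comb(row1, k) * comb(row2, col1 - k) / comb(row1 + row2, col1)


def fisher_exact(a, b, c, d):
    """Return (one_sided_p, two_sided_p) for the 2x2 table [[a, b], [c, d]].

    one_sided_p: P(X >= a), ie, alternative "row 1 has the higher success rate".
    two_sided_p: total probability of all tables that are no more likely
                 than the observed one (deviation in either direction).
    """
    row1, row2, col1 = a + b, c + d, a + c
    k_min = max(0, col1 - row2)
    k_max = min(row1, col1)
    probs = {k: hypergeom_pmf(k, row1, row2, col1)
             for k in range(k_min, k_max + 1)}
    p_obs = probs[a]
    one_sided = sum(p for k, p in probs.items() if k >= a)
    # Small relative tolerance guards against floating-point ties.
    two_sided = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7))
    return one_sided, min(1.0, two_sided)


# Hypothetical counts (NOT study data): rows are coil types,
# columns are complete vs incomplete occlusion at follow-up.
one_sided_p, two_sided_p = fisher_exact(16, 4, 10, 10)
print(f"1-sided P = {one_sided_p:.4f}, 2-sided P = {two_sided_p:.4f}")
```

When the observed table lies in the hypothesized direction, the 1-sided P value cannot exceed the 2-sided one, because the 2-sided sum additionally includes equally unlikely tables in the opposite direction.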
In our case, we specifically decided to test the null-hypothesis that Cerecyte occlusion rates are similar, rather than superior, to those of bare platinum coils over a 6-month follow-up period against the alternative hypothesis of a clear increase in the occlusion rate for Cerecyte. This very null-hypothesis is rejected by our data at a type-1 error probability of <5% against the alternative. When asking whether Cerecyte does in fact seem to achieve higher occlusion rates than bare coils, our evidence indicates such behavior with confidence at the commonly accepted significance level.
But is it the right question to ask, or should we have also been interested in the possibility that Cerecyte may result in lower occlusion rates on follow-up? In that case, we should definitely have tested the null-hypothesis that Cerecyte occlusion rates are equal (rather than superior or inferior) to those of bare coils not only initially (as we assumed) but also on follow-up, against the alternative of increased follow-up occlusions for either Cerecyte or bare platinum coils. However, Drs Cloft and Kallmes themselves are “not saying that there is any evidence that Cerecyte causes a higher recurrence rate.” On the contrary, our own clinical data3 and experimental data provided prior evidence that these particular, recently developed bioactive coils may result in a lower long-term recanalization rate than bare platinum (which is in fact the incentive for the further improvement of bioactive coils after the experiences with first-generation Matrix and for the eventual marketing of Cerecyte). As such, there was a clear directionality in the underlying hypothesis so that a 2-sided test (which does not incorporate any assumptions about the sign of the assumed change under the alternative hypothesis) would have appeared quite inappropriate.
This prior evidence was the rationale for the 1-sided testing on follow-up, and we maintain that the statistical results are valid as originally reported by us. First-generation Matrix experiences are different in that regard because there were quite early indicators for occasional coil compaction and subsequent aneurysm recurrence. In addition, we wish to point out that neglecting the above-mentioned prior evidence in formulating the exact testing procedure against the null-hypothesis will inevitably lower the sensitivity to detect a possible advantage of Cerecyte over bare platinum coils. Sensitivity for this putative advantage, ie, minimizing the false-negative probability of such a conclusion, is crucial when deciding whether a new product like Cerecyte is good enough to be introduced for further evaluation based on the limited initial experience available with its use. Specificity, ie, minimizing the false-positives, on the other hand, will decide whether the product can live up to its claim to further improve the outcome of an established treatment like aneurysm coiling with bare platinum coils in the long run. In that regard, we do agree with Drs Cloft and Kallmes that further randomized and preferably multicenter trials will be necessary for the definitive assessment of Cerecyte’s performance. Given that these are difficult to conduct and that data of comparable quality are sparse, however, we consider our own prospective matched-pair analysis of substantial value. When we started to analyze the data of our study, there was no evidence that initial treatment results might be different between Cerecyte and bare platinum coils (2-sided testing). There was, however, arguable evidence that follow-up results might improve with Cerecyte (1-sided testing).
Thus, our approach to testing was accurate and, by substantiating the confidence that Cerecyte may achieve higher occlusion rates on follow-up, illustrates the strength rather than weakness of incorporating prior knowledge into the statistics. Nevertheless, we have pointed out ourselves that more clinical data are essential to validate the beneficial effect of this new bioactive coil.
Despite this and the fact that occlusion rates were transparent (and therefore accessible to 2-sided statistical inference for attentive readers in disagreement such as Drs Cloft and Kallmes), we do acknowledge that the 1-sided assessment should have been emphasized in the article. We wish to point out that it was explicitly stated in the original version of the article. There, we submitted the following explanation: “For comparison of initial treatment results between Cerecyte and bare coils, 2-sided tests were performed. Based on the directional assumption of an increased chance of complete occlusion with Cerecyte, 1-sided alternative hypotheses were specified for control results after 6 months.” For the third revision, we were then asked to cut the article to a short communication, and in order to comply with the word count we had to omit large passages of the article, including the detailed description of the statistics applied.
With regard to the power analysis, we agree with Drs Kallmes and Cloft: considering that we detected a significant effect and that sufficient testing power needs to be assured to protect against a type-II error (ie, false-negatives), its results are not only of very limited and questionable information but dispensable for our purpose. Therefore, the “power” values were not part of our original submission and were only included on the specific request of one of the reviewers. Despite our frank objections, we did eventually add these simply because their rather low estimates, irrespective of the fact that their retrospective calculation may not be particularly accurate, primarily indicate a high false-negative risk. (Again, the relevant statistical explanation that they were calculated using the power.prop.test function within R’s stats package had to be omitted to comply with the limits of a short communication.) In any case, the substantial risk of facing a false-negative as opposed to a false-positive inference at our limited sample size emphasizes that 1-sided testing based on prior pieces of information can be quite powerful and must not be dismissed.
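The gain in power from a directional alternative can be illustrated with a short sketch based on the usual normal approximation for comparing two proportions. The function name, the assumed occlusion rates, and the group size below are hypothetical and are not taken from our study; the sketch is also not claimed to reproduce the output of power.prop.test exactly.

```python
from statistics import NormalDist


def approx_power_two_proportions(p1, p2, n_per_group, alpha=0.05,
                                 one_sided=False):
    """Approximate power to detect a difference between two proportions.

    Uses the normal approximation with equal group sizes. For the 2-sided
    case, the negligible probability of rejecting in the wrong direction
    is ignored.
    """
    z = NormalDist()
    tails = 1 if one_sided else 2
    z_crit = z.inv_cdf(1 - alpha / tails)
    p_bar = (p1 + p2) / 2
    sd_null = (2 * p_bar * (1 - p_bar)) ** 0.5           # under equal rates
    sd_alt = (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5      # under assumed rates
    shift = n_per_group ** 0.5 * abs(p1 - p2)
    return z.cdf((shift - z_crit * sd_null) / sd_alt)


# Hypothetical inputs (NOT study data): assumed occlusion rates and group size.
power_two_sided = approx_power_two_proportions(0.5, 0.8, 20)
power_one_sided = approx_power_two_proportions(0.5, 0.8, 20, one_sided=True)
print(f"2-sided power = {power_two_sided:.2f}, "
      f"1-sided power = {power_one_sided:.2f}")
```

At the same significance level and sample size, the 1-sided critical value is smaller than the 2-sided one, so the approximate power under a directional alternative is always the larger of the two.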
In summary, it is beyond any doubt and certainly compatible with the opinion of Drs Cloft and Kallmes that our preliminary findings (1) at least favor improved follow-up occlusion rates with Cerecyte compared with bare platinum coils and (2) require validation in larger cohorts of treated patients in prospective series or in future retrospective meta-analyses that are not yet available.
Bendszus M, Bartsch AJ, Solymosi L. Endovascular occlusion of aneurysms using a new bioactive coil: a matched pair analysis with bare platinum coils. Stroke. 2007; 38: 2855–2857.
Bendszus M, Solymosi L. Cerecyte coils in the treatment of intracranial aneurysms: a preliminary clinical study. AJNR Am J Neuroradiol. 2006; 27: 2053–2057.