Acupuncture and Stroke Rehabilitation
To the Editor:
In a carefully controlled, randomized clinical trial, Johansson et al1 concluded that acupuncture had no effect on functional improvement in stroke, thus contradicting most of the previous randomized clinical trials studying this relationship, including their own groundbreaking research.2,3 However, inspection of their Figure 1 indicates that the acupuncture group improved over a 12-month period by approximately 61 Barthel points, while the transcutaneous electrical nerve stimulation group improved by only 46 points and the subliminal stimulation group by 49 points. In other words, the median improvement of the acupuncture group, over and above the improvement seen in the control groups, was 12 to 15 points. This is not a trivial difference, despite the authors’ claim that this is a small difference in outcomes. Additionally, the interquartile range at 12 months clearly showed a lower boundary much closer to the median than in either of the 2 control conditions, which suggests that a greater proportion of the subjects received some benefit from the intervention when compared with the control conditions. In their original study,2 Johansson et al reported that the acupuncture group improved by 46.9 points while the standard-of-care control group improved by 26.2 points, producing a difference of 20.7 points in favor of the acupuncture group over a no-intervention control group. While the maximum relative improvement in the current study (15 points) is somewhat smaller than the 20.7 points in the original study, the absolute improvement of 61 points in this study certainly compares favorably to the 26.2-point improvement of the original standard-of-care group and the 46.9 points of the acupuncture group in that study.
A similar inspection of their Table 2 shows that after 12 months, the acupuncture-treated subjects had a median walking speed of 0.56 m/s versus 0.23 m/s and 0.38 m/s for the control subjects. Certainly a >100% increase in walking speed would be considered a large improvement by almost any criterion of clinical relevance. Yet this result does not achieve significance. In addition, despite the claim of no statistically significant differences, there is a clear and consistent pattern for the acupuncture group to have better scores (in some cases, much better scores) on most of the Nottingham quality-of-life domains, especially at 12 months.
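As a rough check on the arithmetic, the relative differences can be computed directly from the three medians quoted above. The sketch below uses only those figures; the function name is illustrative. It suggests that the ">100%" figure holds against the 0.23 m/s control group (about 143% higher) but not against the 0.38 m/s group (about 47% higher).

```python
# Median walking speeds at 12 months (m/s), as quoted in the letter from Table 2.
acupuncture = 0.56
controls = [0.23, 0.38]  # the letter does not say which control group is which


def percent_higher(treated: float, control: float) -> float:
    """Relative difference of treated over control, in percent."""
    return (treated - control) / control * 100.0


relative_differences = [percent_higher(acupuncture, c) for c in controls]
for control, diff in zip(controls, relative_differences):
    print(f"vs {control:.2f} m/s: {diff:.0f}% higher")
```

These are comparisons of medians only and say nothing about the spread within each group.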
How could what appear to be large and clinically meaningful results not have even approached statistical significance, given what appears to be adequate power and appropriate methods? My fear is that the choice of nonparametric, categorical statistics, the forcing of a range of scores into 2 categories, and treating interval values as ordinal scores drastically reduced power. For example, why is a clearly equal-interval outcome such as walking speed treated as an ordinal variable and submitted to a less-sensitive statistic than necessary? Failure to use the most powerful statistic appropriate for the data, such as analysis of variance, substantially increases the probability of a type II error.
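The concern about forcing a range of scores into 2 categories can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a reanalysis of the trial: the outcomes are simulated as normally distributed, and the group size (50), effect size (0.5 SD), cutoff, and all function names are hypothetical choices made for illustration. It compares how often a large-sample test on the means and a test on the dichotomized outcome reach two-sided significance at the 5% level.

```python
import math
import random
from statistics import NormalDist, mean, stdev

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 5% level


def z_means(a, b):
    """Large-sample z statistic for a difference in means."""
    se = math.sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return (mean(a) - mean(b)) / se


def z_proportions(a, b, cutoff):
    """Pooled z statistic for the share of each group above a cutoff."""
    pa = sum(x > cutoff for x in a) / len(a)
    pb = sum(x > cutoff for x in b) / len(b)
    pooled = (pa * len(a) + pb * len(b)) / (len(a) + len(b))
    se = math.sqrt(pooled * (1 - pooled) * (1 / len(a) + 1 / len(b)))
    return (pa - pb) / se if se > 0 else 0.0


def simulate_power(n=50, shift=0.5, cutoff=0.25, trials=2000, seed=1):
    """Share of simulated trials in which each test rejects the null."""
    rng = random.Random(seed)
    hits_continuous = hits_dichotomized = 0
    for _ in range(trials):
        treated = [rng.gauss(shift, 1.0) for _ in range(n)]  # assumed normal
        control = [rng.gauss(0.0, 1.0) for _ in range(n)]
        hits_continuous += abs(z_means(treated, control)) > Z_CRIT
        hits_dichotomized += abs(z_proportions(treated, control, cutoff)) > Z_CRIT
    return hits_continuous / trials, hits_dichotomized / trials


power_continuous, power_dichotomized = simulate_power()
print(f"continuous: {power_continuous:.2f}, dichotomized: {power_dichotomized:.2f}")
```

Under these assumptions the dichotomized analysis rejects noticeably less often. Whether the same holds for skewed or bounded outcomes such as the Barthel Index depends on their actual distribution.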
Why would one want to use less powerful statistics, ones which are less likely to detect an effect, if one exists? My guess is that the use of these techniques stems from the decision to use intention-to-treat (ITT) analysis. While ITT analysis obviously is intended to reflect typical clinical experience, in some situations it can create distortions in the statistical analysis and may not fairly represent the potential benefit of the intervention being studied. To include a functional independence score of zero for subjects who died certainly reflects their functioning but wreaks havoc with the statistics, in terms of both central tendency and variability, and unfairly represents what is happening with the survivors. Since premature death following stroke is unfortunately an inevitable outcome for a small percentage of those who survive the acute stage, and since acupuncture was not hypothesized to increase survival in these individuals, the use of ITT analysis may be an overly conservative way of analyzing the data in this situation. Even though the authors report removing deceased subjects from post hoc confirmatory analyses, the initial choice of low-power, nonparametric statistics remained. At the very least, a robust and more sensitive statistic such as analysis of variance, or its equivalent, should be performed as a supplement to ITT analysis, to determine whether there is any reason to believe that, for survivors, acupuncture may be helpful in restoring function in stroke patients.
When I inspect the data of Johansson et al,1 using an admittedly subjective criterion of clinical importance, it appears that despite the conclusions of the authors, their data actually suggest that acupuncture may be helpful in restoring function in subacute stroke patients, and the results are easily as strong as those reported in their seminal study2 nearly 10 years ago. Failure to detect these important effects in the more recent study may be due to the less-than-optimal choice of statistical techniques.
Johansson BB, Haker E, von Arbin M, Britton M, Langstrom G, Terent A, Ursing D, Asplund K. Acupuncture and transcutaneous nerve stimulation in stroke rehabilitation: a randomized, controlled trial. Stroke. 2001; 32: 707–713.
Johansson K, Lindgren I, Widner H, Wiklund I, Johansson B. Can sensory stimulation improve the functional outcome in stroke patients? Neurology. 1993; 43: 2189–2192.
Magnusson M, Johansson K, Johansson BB. Sensory stimulation promotes normalization of postural control after stroke. Stroke. 1994; 25: 1176–1180.
Response:
The Swedish multicenter trial on sensory stimulation after stroke was designed to detect clear differences in functional outcome between patients treated with acupuncture, high-intensity transcutaneous nerve stimulation, or subliminal transcutaneous nerve stimulation.1 Thus, it had an 80% power to detect a 20% difference in the proportion of patients with severe activities of daily living (ADL) dependency (Barthel Index score of ≤70 points) at 3 months’ follow-up.
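To give a sense of what such a power statement implies, the sketch below applies the standard normal-approximation formula for comparing two proportions. The inputs are assumptions, not figures from the trial: the "20% difference" is read as 20 percentage points, the proportions with severe ADL dependency (50% versus 30%) are hypothetical, and a two-sided 5% significance level is assumed. The function name is illustrative.

```python
import math
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for comparing two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return math.ceil(numerator / (p1 - p2) ** 2)


# Hypothetical proportions with Barthel Index <= 70 at 3 months; not trial data.
required = n_per_group(0.50, 0.30)
print(required)
```

The required number changes with the assumed baseline proportion, so the result is only indicative of the order of magnitude.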
In trial design, analyses, and reporting on the results, we were orthodox, adhering strictly to generally accepted principles for randomized controlled trials.2–4 This includes (1) intention-to-treat analyses, (2) nonparametric statistical tests of ordinal data (such as the Barthel Index) and of data that did not show normal distribution, and (3) reliance on predefined outcome measures (which, in the present trial, were differences in functional outcome measures at follow-up) and not on post hoc analyses (such as calculations of degree of improvement).
Dr Shiflett proposes that we should abandon the intention-to-treat principle, because acupuncture is not supposed to affect case fatality. This concerns a more general problem in stroke intervention trials. The case fatality rate is high in patients who have poor ADL function. If there are differences in case fatality rates between groups, the mean functional outcome in survivors would seem to be better in the group with the highest case fatality when, in fact, no treatment effect exists. It therefore seems sound to apply the intention-to-treat principle in stroke trials.
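The survivor effect described above can be made concrete with a deliberately simplified example. All numbers are hypothetical: both groups start from the same set of Barthel-like scores, so there is no treatment effect by construction, and deaths are assumed to occur strictly among the lowest-scoring patients. The function name is illustrative.

```python
from statistics import mean

# Hypothetical Barthel-like scores (0-100); both groups start from the same cohort,
# so by construction there is no treatment effect on function.
cohort = list(range(0, 101, 5))


def outcomes(scores, deaths):
    """Assume, for illustration, that deaths occur in the lowest-scoring patients."""
    ordered = sorted(scores)
    survivors = ordered[deaths:]
    survivors_only = mean(survivors)
    intention_to_treat = mean([0] * deaths + survivors)  # deceased scored as 0
    return survivors_only, intention_to_treat


low_fatality = outcomes(cohort, deaths=2)
high_fatality = outcomes(cohort, deaths=6)
print("survivors only:", low_fatality[0], high_fatality[0])
print("intention to treat:", round(low_fatality[1], 1), round(high_fatality[1], 1))
```

In this toy setting the survivors-only mean is higher in the group with more deaths, whereas the intention-to-treat mean is lower. Real case fatality is not so cleanly tied to baseline function, so the size of the distortion would differ in practice.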
Nevertheless, in the article we reported on sensitivity analyses in which fatal cases were excluded. The differences in outcome between the acupuncture and subliminal groups were still far from statistically significant after such exclusion. Analysis of variance, the statistical procedure proposed by Dr Shiflett for our data set, assumes normal distribution of data. This requirement was not met for the present outcome variables. Analysis of variance is certainly not always the most powerful or statistically appropriate method. Even when the normality assumptions are met, the Wilcoxon test is nearly as powerful as parametric tests.4 It is difficult to envisage which statistical tests would be more appropriate and robust and much more powerful and sensitive than the tests used in our analyses.
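The statement that the Wilcoxon test is nearly as powerful as parametric tests under normality can be quantified with a standard textbook result: for normally distributed data with a pure location shift, the asymptotic relative efficiency of the Wilcoxon rank-sum test relative to the t test is 3/π, or about 0.955. The sketch below converts that into an approximate sample-size penalty. The group size of 50 is a hypothetical value, and the function name is illustrative.

```python
import math

# Asymptotic relative efficiency of the Wilcoxon rank-sum test versus the t test
# when the data really are normal with a pure location shift: 3 / pi.
ARE_NORMAL = 3 / math.pi


def wilcoxon_equivalent_n(n_t_test: int) -> int:
    """Approximate per-group n the Wilcoxon test needs to match a t test's power."""
    return math.ceil(n_t_test / ARE_NORMAL)


n_hypothetical = 50  # illustrative group size, not taken from the trial
print(round(ARE_NORMAL, 3), wilcoxon_equivalent_n(n_hypothetical))
```

This is a large-sample result. For non-normal data the comparison can favor either test, depending on the shape of the distribution.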
One point raised by Dr Shiflett is open to debate. Different approaches may be used when expressing treatment effects. In his recalculations of our data, Dr Shiflett has used incremental changes in the Barthel ADL score between baseline and 3 and 12 months’ follow-up rather than comparisons of absolute scores at follow-up, as we have done. Admittedly, our approach is more conservative. When, instead, we are comparing the degree of improvement from baseline to 12 months, as Dr Shiflett suggests, the difference in Barthel Index between the acupuncture group and the group receiving subliminal stimulation is still far from statistically significant. It is not enough to look at point estimates of the improvements. The variations in treatment effects must also be taken into account, and they were large (Figure 2 of reference 1). The same applies to differences in walking speed. Because walking speed did not show normal distribution even after various transformations, we used nonparametric rather than the parametric tests suggested by Dr Shiflett. It is also problematic that Dr Shiflett concentrates on single outcome variables that tend to favor acupuncture and ignores others that show tendencies in the opposite direction.
There are now 2 meta-analyses of acupuncture after stroke being conducted, one of them within the framework of the Stroke Module of the Cochrane Collaboration.5 Hopefully, the statistical power of the combined randomized trials will be sufficient to confirm or refute with greater precision the hypothesis that acupuncture improves functional outcome after stroke. Until then, it seems that the scientific support for acupuncture being used as standard treatment in the subacute phase of stroke is too weak.
Johansson BB, Haker E, von Arbin M, Britton M, Langstrom G, Terent A, Ursing D, Asplund K, for the Swedish Collaboration on Sensory Stimulation After Stroke. Acupuncture and transcutaneous nerve stimulation in stroke rehabilitation: a randomized, controlled trial. Stroke. 2001; 32: 707–713.
Dorman PJ, Sandercock PA. Considerations in the design of clinical trials of neuroprotective therapy in acute stroke. Stroke. 1996; 27: 1507–1515.
Lehmann EL. Nonparametrics: Statistical Methods Based on Ranks. San Francisco, Calif: Holden-Day; 1975.
Counsell C, Warlow C, Sandercock P, Fraser H, vanGijn J, for the Cochrane Collaboration Stroke Review Group. Meeting the need for systematic reviews in stroke care. Stroke. 1995; 26: 498–502.