Self-Administered Computer Therapy for Apraxia of Speech
Two-Period Randomized Control Trial With Crossover
Background and Purpose—There is currently little evidence on effective interventions for poststroke apraxia of speech. We report outcomes of a trial of self-administered computer therapy for apraxia of speech.
Methods—Effects of speech intervention on naming and repetition of treated and untreated words were compared with those of a visuospatial sham program. The study used a parallel-group, 2-period, crossover design, with participants receiving 2 interventions. Fifty participants with chronic and stable apraxia of speech were randomly allocated to 1 of 2 order conditions: speech-first condition versus sham-first condition. Period 1 design was equivalent to a randomized controlled trial. We report results for this period and profile the effect of the period 2 crossover.
Results—Period 1 results revealed significant improvement in naming and repetition only in the speech-first group. The sham-first group displayed improvement in speech production after speech intervention in period 2. Significant improvement of treated words was found in both naming and repetition, with little generalization to structurally similar and dissimilar untreated words. Speech gains were largely maintained after withdrawal of intervention. There was a significant relationship between treatment dose and response. However, average self-administered dose was modest for both groups. Future software design would benefit from incorporation of social and gaming components to boost motivation.
Conclusions—Single-word production can be improved in chronic apraxia of speech with behavioral intervention. Self-administered computerized therapy is a promising method for delivering high-intensity speech/language rehabilitation.
Speech/language impairments after stroke are subcategorized into aphasia, dysarthria, and apraxia of speech (AOS). AOS is a disorder at the interface of language and speech production, involving breakdown in mapping from abstract linguistic representations to motor plans [1]. Typical behaviors include speech errors, loss of automaticity and fluency, and altered timing parameters [2]. In severe cases, patients may be nonverbal. Lesions causing AOS usually occur within the left cortical motor or somatosensory areas [3]. Because of the proximity of speech control regions to left perisylvian cortex, AOS often co-occurs with aphasia.
Behavioral interventions for AOS involve 2 broad classes of therapies: bottom-up articulatory-kinematic therapies focus on individual speech sounds [4], and top-down interventions aim to reestablish fluent production of larger linguistic units [5]. Comparisons of outcomes for the 2 approaches are not conclusive [6]. Intervention research has largely used quasi-experimental designs with nonrandom assignment. A meta-analysis and systematic review conclude that there is no randomized controlled trial evidence in support of intervention for AOS [4,7].
We report outcomes of an intervention for AOS combining these 2 therapeutic traditions. The intervention aimed to improve word production, with target forms ultimately placed in sentence frames. This approach acknowledges the common comorbidity of AOS with aphasia, allowing both linguistic and phonetic processes to be targeted. Trials of aphasia therapies indicate that lower intensity interventions have limited outcomes [8]. Attempts to increase face-to-face therapy dose can result in high attrition rates because attending multiple appointments can challenge participants [9]. The use of software programs, allowing participants to self-administer intervention, may circumvent this difficulty. A feasibility study reported that computer therapy is cost-effective and acceptable to patients with poststroke anomia [10].
We used a software therapy for AOS. It involved a perceptual stage (spoken word-picture matching, auditory-written word matching, and auditory lexical decision), followed by a production stage. The perceptual component aimed to consolidate form-meaning representations of target vocabulary and facilitate feedforward input to motor representations [11]. The production stage consisted of hierarchical speech activities. First, participants observed videos of word production, followed by blocks of trials requiring imagined production. The program then moved to overt word repetition with increasing delays between stimulus and response. Responses were audiorecorded by the software. The final stages involved more autonomous word production. Participants used trained words in sentence frames, followed by independent word retrieval/picture naming (program detail [12]).
We explored the effectiveness of this intervention in a randomized controlled trial with a subsequent crossover period. Participants were randomly assigned to 1 of 2 order conditions: speech-first condition or sham-first condition. The sham intervention was another self-administered software program with identical interfaces but minimal speech/language content, involving visuospatial activities, for example, pattern matching and timed jigsaw completion. We report outcomes for the first intervention period and descriptively profile effects of the crossover. Power calculations based on an initial pilot study [13] indicated that, for medium-to-small effect sizes (eg, 0.5–0.33) and α=0.05 (2-tailed), a sample size of 50 pairs of cases was required to ensure sufficient power for repeated measures comparisons (>95% for medium effects and 80% for small effects).
Our objective was to determine effectiveness of the speech intervention in improving communicative/functional adequacy of word production in comparison with a sham intervention for chronic AOS. Two baseline measures of speech before intervention were recorded to evaluate behavioral stability. Participants were profiled on a range of measures to establish AOS severity and the presence of comorbidities. The primary outcome measure was communicative adequacy of spoken naming. The secondary outcome measure was phonetic accuracy of words in repetition. Three word sets were developed, each containing 35 items. One set appeared in the intervention (treated). Two untreated sets consisted of matched items (phonetically similar to treated words) or control items (phonetically dissimilar). They allowed identification of generalization of treatment to similar or remote forms. Other outcome measures were collected but not reported here (repetition word duration, health economic analysis, and connected speech). The primary hypothesis was that speech intervention would result in significantly greater improvement in naming adequacy than sham. Secondary research predictions were as follows: (1) speech intervention would result in improved repetition accuracy; (2) effects of speech intervention would generalize to phonetically related untreated forms but not unrelated control words; (3) speech improvements would be maintained through a no-intervention period to final assessment (18 weeks post intervention for the speech-first group and 8 weeks for the sham-first group).
This study was granted ethical approval by a National Health Service panel (08/H1308/14). Volunteers gave consent to participation. Some deception was involved because participants were blinded to the sham nature of the visuospatial program. Participants were told that the program aimed to improve attention and memory. Participants were offered debriefing on completion of the study. It was a single-center, community-based trial (Sheffield, United Kingdom). Participants self-administered interventions, supported by speech and language therapists (SLTs), in their homes.
Participants were recruited from community SLT services across the South Yorkshire region over a 25-month period. The inclusion/exclusion criteria were adults with chronic AOS (at least 5 months post onset of apraxic stroke); unilateral left hemisphere lesion(s); the absence of neurodegenerative condition; premorbid competence in English; sufficient auditory/visual acuity to interact with a laptop; and not receiving impairment SLT. AOS diagnosis was independently confirmed by 2 SLTs using standard diagnostic criteria [2]: disrupted speech intelligibility (distortions/substitutions) with intact gross oral movements and reduced speed/fluency and effortful speech (hesitations, groping, and prosodic disruption). In cases of uncertainty, a third assessor evaluated behavior. All assessors were registered SLTs.
Fifty participants were recruited (29 men and 21 women). Figure 1 displays progression through the study. After baseline evaluation, participants were randomly allocated to speech-first/sham-first conditions by a researcher blind to case via block randomization (block sizes, 20-20-10). Assessors were aware of block sizes. An unpredictable allocation sequence was generated via computer randomizer. The sequence was transferred to opaque numbered envelopes, and consecutive referrals were allocated to condition via these envelopes. A subsequent allocation check revealed that 1 participant, allocated to the sham-first group, did not receive interventions in the planned order. An intention-to-treat criterion was used, and data from this participant were analyzed as per initial randomization. No stratification/minimization was used. Subsequent comparison of baseline AOS severity, aphasia severity, age, years of education, time post onset, and laterality using independent-samples t tests (2-tailed; α=0.05) revealed no significant differences across the 2 order conditions (Table 1). There was a sex imbalance in the speech-first condition, with more men than women (17 versus 8).
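The block randomization described above can be illustrated with a minimal sketch. It is not the randomizer used in the trial: the function name `block_randomize`, the seed, and the assumption of equal allocation to the 2 order conditions within each block are introduced here for illustration only.

```python
import random

def block_randomize(block_sizes, arms=("speech-first", "sham-first"), seed=None):
    """Return an allocation sequence built from shuffled, balanced blocks.

    Assumption (not stated in the source): each block contains equal
    numbers of each order condition.
    """
    rng = random.Random(seed)
    sequence = []
    for size in block_sizes:
        if size % len(arms) != 0:
            raise ValueError("block size must be divisible by the number of arms")
        # Build a balanced block, then shuffle it so order is unpredictable.
        block = list(arms) * (size // len(arms))
        rng.shuffle(block)
        sequence.extend(block)
    return sequence

# Block sizes reported in the text: 20-20-10 (50 participants in total).
allocation = block_randomize([20, 20, 10], seed=2015)  # seed is arbitrary
```

In practice, each entry of `allocation` would correspond to one numbered opaque envelope, opened in order as consecutive referrals completed baseline evaluation.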
Before randomization, there were 2 baseline evaluation sessions (B1 and B2) to assess stability of naming and repetition behavior. The gap between baselines was 7 to 34 days (M=18). There were 3 word sets, each containing 35 items (Table I in the online-only Data Supplement). One word set (treated) appeared in intervention. Treated words, and nonmatched controls, represented vocabulary of high functional value and were roughly matched on word frequency and imageability. The 2 control sets did not appear in treatment and were either phonetically matched or phonetically dissimilar to treated forms (eg, treated: night; matched: white; control: house). All sets were roughly matched on word length and syllable structure. In repetition, participants repeated items after live presentation by an experimenter. Words were presented in a fixed pseudorandom order, with no phonetically similar items appearing in sequence. Only first responses were scored. The repetition task included all 35 items from each set. Naming performance was scored on 23 word triplets (only triplets with good name agreement by healthy speakers were included to avoid treatment effects being inflated by disambiguation of images during therapy). No cues were given in either task other than orientation cues to key elements of photographs in naming. Speech data were audiorecorded for subsequent analysis by an assessor who had no participant contact and was blind to allocation and period.
Naming responses were scored as correct/incorrect (1/0). Correct responses were target words or appropriate synonyms (eg, children-kids). Problematic responses were scored in a consensus fashion by a group of 5 to 6 raters, the majority of whom were blind to allocation and period. Phonetic errors were not penalized if a listener could unambiguously identify the intended target. Repetition responses were coded on a 0- to 7-scale (eg, 0=no/entirely off-target response; 6=accurate but slow latency or lengthened duration; 7=fast, accurate response; Table II in the online-only Data Supplement, full scale). Responses scored at 6/7 were recorded as correct. An inter-rater reliability check on a subset of 558 samples was performed by a further member of the research team who was blind to period, allocation, and assessor 1’s ratings. The reliability sample was drawn from 16 participants with different levels of AOS severity, randomly selected across assessment points, and with equal numbers from both order conditions. Spearman’s rank correlation indicated a high level of inter-rater reliability (n=558; ρ=0.895; P<0.0001).
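The repetition scoring rule and the inter-rater check can be sketched as follows. The ratings below are hypothetical and are not trial data; the helper names (`is_correct`, `average_ranks`, `spearman_rho`) are illustrative. Spearman's ρ is computed here as the Pearson correlation of tie-averaged ranks, which is the standard definition when ties are present, as they would be on a 0 to 7 scale.

```python
def is_correct(score):
    """Binarize a 0-7 repetition rating: 6 or 7 counts as correct."""
    return 1 if score >= 6 else 0

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are tied
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman_rho(x, y):
    """Spearman's rank correlation with tie correction via average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical ratings of the same 8 responses by two raters.
rater_1 = [7, 6, 3, 0, 5, 7, 2, 6]
rater_2 = [7, 5, 3, 1, 5, 6, 2, 6]
rho = spearman_rho(rater_1, rater_2)
correct_counts = (sum(map(is_correct, rater_1)), sum(map(is_correct, rater_2)))
```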
Immediately after the second baseline, participants were loaned a laptop for ≈6 weeks (speech-first range: 36–64 days, M=45 [SD, 5.1] and sham-first range: 42–50 days, M=44 [SD, 1.97]). Participants could access only their allocated program. An SLT researcher assisted participants with program use for initial sessions, followed by phone contact to check progress. Further support visits were arranged as needed (face-to-face visits in period 1: speech-first range, 1–6; M=4 [SD, 1.45] and sham-first range, 1–7; M=3 [SD, 1.17]). The regular use of software was encouraged (once or twice a day for at least 20 minutes). The actual intensity of treatment was determined by the participant. The program recorded interactions, so compliance with recommendations could be tracked. After ≈6 weeks, the laptop was withdrawn, and speech was reevaluated (outcome 1 [O1]).
After a 4-week rest phase, the crossover period began. The speech-first group received sham intervention, and the sham-first group received the speech program. Programs were again available for ≈6 weeks. Laptops were then withdrawn, and further reassessment was completed (outcome 2 [O2]). Final reassessment (Maintenance [M]) took place after an 8-week no-treatment period.
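The retention intervals quoted for the 2 groups follow from this schedule. The short sketch below works through the arithmetic using nominal phase lengths (the intervention periods were only approximately 6 weeks); the names `SCHEDULE_WEEKS` and `retention_weeks` are illustrative.

```python
# Nominal study phases after the second baseline, in order, with lengths in weeks.
SCHEDULE_WEEKS = [
    ("intervention period 1", 6),
    ("rest", 4),
    ("intervention period 2", 6),
    ("no-treatment period", 8),
]

def retention_weeks(speech_period):
    """Weeks between the end of the speech program and the maintenance assessment.

    speech_period is 1 for the speech-first group and 2 for the sham-first group.
    """
    labels = [label for label, _ in SCHEDULE_WEEKS]
    idx = labels.index(f"intervention period {speech_period}")
    # Sum every phase that follows the speech program.
    return sum(weeks for _, weeks in SCHEDULE_WEEKS[idx + 1:])

speech_first_retention = retention_weeks(1)  # 4 + 6 + 8 = 18 weeks
sham_first_retention = retention_weeks(2)    # 8 weeks
```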
There was no significant difference in program usage across the 2 groups in period 1: speech-first range, 355 to 1888 minutes and M=1142 (SD, 439.54); sham-first range, 137 to 3129 minutes; M=1026 (SD, 726.17); t(46)=−0.66; P=0.512. The use of the first program tended to be higher than the second (period 2: speech-first range, 0–2322 minutes and M=832 [SD, 677.55]; sham-first range, 103–2106 minutes and M=996 [SD, 529.06]).
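As a worked check, the period 1 usage comparison can be recomputed from the reported means and SDs with a pooled-variance independent-samples t test. The group sizes (23 speech-first, 25 sham-first) are not stated alongside these figures; they are inferred here from the degrees of freedom reported for the period 1 analyses and should be treated as an assumption, as should the direction of subtraction (sham-first minus speech-first).

```python
from math import sqrt

def pooled_t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t (equal variances assumed) from summary statistics."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    se = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df

# Minutes of program use in period 1 (means and SDs from the text).
# n = 25 and n = 23 are inferred group sizes, not values given with these data.
t_stat, df = pooled_t_from_summary(1026, 726.17, 25, 1142, 439.54, 23)
```

Under these assumptions the statistic rounds to −0.66 on 46 degrees of freedom, in line with the reported t(46)=−0.66.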
Statistical analyses are reported for period 1, with naming accuracy (Table 2) and repetition accuracy (Table 3) as dependent measures. Period 2 results are profiled for treated items in Figure 2 (naming) and Figure I in the online-only Data Supplement (repetition) (Table III in the online-only Data Supplement, statistical analysis). Comparisons explored baseline (B) stability (B1–B2), period 1 effects (B2–O1), and maintenance (speech-first: O1–M; sham-first: O2–M).
Means and SEs for correctly named items are presented in Table 2. Baseline stability was investigated by ANOVA with assessment point (B1 and B2), item type (treated, matched, and control) as the repeated measures, and treatment (sham-first; speech-first) as the between-group factor. Main effects of item type were significant (F=3.35; df=2, 92; P<0.05) with more treated items correctly named than control items (t=3.02; df=47; P<0.01; Bonferroni corrected, α=0.0167 for this and subsequent post hoc analyses of item effects). There were no other significant effects. Naming accuracy was stable at baseline and comparable across treatment groups.
Period 1 treatment effects were investigated using ANOVA with assessment point (B2 and O1) and item type (treated, matched, and control) as repeated measures and treatment (sham-first and speech-first group) as the between-group factor. Results revealed a main effect for assessment point (F=18.82; df=1, 46; P<0.001) and a significant interaction between assessment point and treatment group (F=5.66; df=1, 46; P<0.05). The assessment point effect was because of better naming at O1 collapsed across item type and treatment group (estimated marginal mean±SE: B2=12.71±1.20 and O1=14.07±1.18). Assessment point interacted with treatment group with greater improvement in naming for the speech-first (estimated marginal mean±SE: B2=13.54±1.73 and O1=15.64±1.70; t=3.68; df=22; P<0.01) than the sham-first group (estimated marginal mean±SE: B2=11.88±1.66 and O1=12.49±1.63; t=2.10; df=24; P<0.05). The main effect of item type (F=7.68; df=2, 92; P<0.01) and the interaction between item type and treatment group (F=3.12; df=2, 92; P<0.05) were also significant. Overall, treated items were named more accurately than matched (t=2.59; df=47; P<0.014) and control items (t=3.99; df=47; P<0.001). However, only the difference between treated and control items was significant for both the sham-first (t=2.60; df=24; P<0.0167) and the speech-first groups (t=3.01; df=22; P<0.01). Post hoc analysis of the significant interaction between assessment point×item type×treatment group (F=6.82; df=2, 92; P<0.01) showed little change between B2 and O1 for the treated, matched, or control items for the sham-first group (differences of estimated marginal means between B2 and O1 for treated: d=0.12; matched: d=0.76; control: d=0.96; Bonferroni corrected α=0.008 for 6 post hoc comparisons). 
By contrast, increases in accuracy between B2 and O1 for the speech-first group were larger for treated and control items (differences of estimated marginal means between B2 and O1 for treated: d=3.30; t=3.71, df=22; P<0.005; matched: d=1.04; control: d=1.96; t=3.35, df=22, P<0.005). Figure 2 shows the effect of crossover, with increased naming accuracy for the sham-first group after exposure to the speech program (statistical analysis is given in Table III in the online-only Data Supplement).
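The Bonferroni-corrected thresholds used for the post hoc comparisons follow directly from dividing the family-wise α by the number of comparisons. A minimal sketch (function names are illustrative, and the P values in the usage line are hypothetical):

```python
def bonferroni_alpha(family_alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return family_alpha / n_comparisons

def significant_after_bonferroni(p_values, family_alpha=0.05):
    """Flag which P values fall below the corrected threshold."""
    threshold = bonferroni_alpha(family_alpha, len(p_values))
    return [p < threshold for p in p_values]

item_effect_alpha = bonferroni_alpha(0.05, 3)  # 3 pairwise item-type comparisons
interaction_alpha = bonferroni_alpha(0.05, 6)  # 6 post hoc comparisons

# Hypothetical P values, for illustration only.
flags = significant_after_bonferroni([0.004, 0.02, 0.2])
```

With 3 comparisons the threshold is 0.05/3, which rounds to 0.0167; with 6 comparisons it is 0.05/6, reported to 3 decimal places as 0.008.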
Table 3 presents means and SEs for the number of treated, matched, and control items with accuracy ratings of 6 or 7. Baseline stability was investigated by ANOVA with assessment point (B1 and B2) and item type (treated, matched, and control) as the repeated measures and treatment (sham-first and speech-first groups) as the between-group factor. Main effect of item type was significant (F=17.96; df=2, 92; P<0.001) with greater accuracy for treated compared with matched and control items (treated versus matched: t=5.73, df=47, P<0.001; treated versus control: t=3.82, df=47, P<0.001). There were no other significant effects, indicating that repetition accuracy was stable across baselines and comparable across treatment groups.
Period 1 treatment effects were investigated by ANOVA with assessment point (B2 and O1) and item type (treated, matched, and control) as repeated measures, and treatment (sham-first and speech-first group) as the between-group factor. Main effects for assessment point (F=15.18; df=1, 46; P<0.001) and item type (F=25.32; df=2, 92; P<0.001) were significant. Assessment point effects were because of higher accuracy after intervention, collapsed across item type and treatment group (estimated marginal mean±SE: B2=16.85±1.50 and O1=18.51±1.45). Item type effects resulted from significant differences among all 3 word sets, with the highest accuracy for treated, followed by control, and then by matched items (treated versus matched: t=6.35, df=47, P<0.001; treated versus control: t=4.05, df=47, P<0.001; matched versus control: t=3.17, df=47, P<0.001). Post hoc analysis of the significant interaction between assessment point×item type×treatment group (F=3.98; df=2, 92; P<0.05) showed relatively little change between B2 and O1 for treated, matched, and control items in the sham-first group (differences of estimated marginal means between B2 and O1 for treated: d=0.36; matched: d=1.60; control: d=1.32). For the speech-first group, there were significant increases in accuracy for treated and matched items (differences of estimated marginal means between B2 and O1 for treated: d=3.13, t=3.22, df=22, P<0.005; matched: d=1.96, t=2.96, df=22, P<0.008; control: d=1.57) (Figure I in the online-only Data Supplement displays the crossover effects showing an increase in repetition accuracy for the sham-first group after exposure to the speech program; Table III in the online-only Data Supplement, statistical analysis).
Maintenance of gains in naming and repetition of treated items was examined with paired t tests, comparing immediate postspeech intervention performance with the maintenance assessment (speech-first group: O1 versus M; sham-first group: O2 versus M). For naming, there were no significant changes in the speech-first group (t=1.61; df=19; nonsignificant) or the sham-first group (t=1.49; df=23; nonsignificant), indicating maintenance of treatment gains. For repetition, there were no changes in the speech-first group (t=0.75; df=19; nonsignificant) but a significant decrease in performance in the sham-first group (t=3.06; df=23; P<0.01).
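The maintenance comparison is a paired t test on each participant's score immediately after the speech program and at the maintenance assessment. The sketch below shows the computation on hypothetical scores (not trial data); the function name `paired_t` is illustrative. Degrees of freedom equal the number of pairs minus 1, so the reported df=19 and df=23 would correspond to 20 and 24 participants with data at both points.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """Paired t statistic for second minus first; returns (t, df).

    Raises ZeroDivisionError if all differences are identical.
    """
    diffs = [b - a for a, b in zip(first, second)]
    n = len(diffs)
    se = stdev(diffs) / sqrt(n)  # standard error of the mean difference
    return mean(diffs) / se, n - 1

# Hypothetical treated-item naming scores for 6 participants.
post_speech = [15, 12, 18, 9, 20, 14]
maintenance = [14, 12, 17, 10, 19, 13]
t_stat, df = paired_t(post_speech, maintenance)
```

A negative t here indicates lower scores at maintenance than immediately after intervention; whether the change is significant depends on comparing t with the critical value for the given df.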
Dose–response correlations were computed (Figure 3). Response was measured as the difference in naming of treated items between B2 and maintenance. Dose was measured in terms of minutes of speech program use. The correlation for both groups was positive, indicating an increase in correctly named items as a function of increased time using the speech intervention (speech-first group: r=0.45, P<0.05; sham-first group: r=0.42, P<0.05).
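The dose-response analysis amounts to a Pearson correlation between minutes of speech program use and change in treated-item naming. The sketch below uses hypothetical data; `pearson_r` and `r_to_t` are illustrative names. The conversion of r to a t statistic uses the standard relation t = r√((n−2)/(1−r²)) on n−2 degrees of freedom; the n=20 used in the final line is inferred from the df reported for the speech-first maintenance analysis and is an assumption.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def r_to_t(r, n):
    """Convert a correlation to its t statistic; returns (t, df)."""
    return r * sqrt((n - 2) / (1 - r * r)), n - 2

# Hypothetical dose (minutes of use) and response (gain in items named).
minutes = [300, 600, 900, 1200, 1500, 1800]
gain = [0, 2, 1, 4, 3, 6]
r = pearson_r(minutes, gain)

# Reported speech-first correlation, with n = 20 as an inferred assumption.
t_reported, df_reported = r_to_t(0.45, 20)
```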
In this 2-period, crossover study, we observed improvements in both naming and repetition in speakers with chronic and stable AOS impairments. Treatment effects were generally specific to trained vocabulary, with only limited transfer to phonetically similar words in repetition accuracy. The effects of intervention were largely maintained when interventions were withdrawn, and in the speech-first group, this retention period was 18 weeks. There was some loss of gains in repetition accuracy in the sham-first group, which might reflect lower baseline performance or the lower use of the period 2 program, which for this group was the speech program. Treatment effects were specific to the speech program. The period 1 results, equivalent to a randomized controlled trial design, revealed no significant speech change in response to the sham program. The period 2 profiles reflect the manipulation of the crossover, with increased scores on treated items in the sham-first group for naming and repetition. Furthermore, the significant relationship between speech treatment dose and response is an indicator that behavioral change might be linked to the speech intervention.
The item-specific improvement in naming is similar to that found in successful therapies for anomic aphasia [14]. One possibility is that the effects we observed resulted from lexical facilitation rather than enhancement specifically at the phonetic level. It is evident from the aphasia severity scores (Table 1) that most, but not all, participants had significant accompanying aphasic impairment. Given the strong interconnectivity between lexical and phonetic levels, top-down activation from the lexical level may enable access to motor plans. Importantly, in the face of item-specific effects, the use of functionally relevant vocabulary in therapy is essential.
The results provide evidence that computer therapy and development of programs enabling patients to self-administer interventions are important directions in rehabilitation of poststroke speech and language disorders. This model of intervention may allow administration of high-intensity therapies in a cost-effective manner. Some participants had little or no previous experience in using computers; however, design of programs with simple interfaces enabled computer novices to access interventions with SLT support. Family members were largely positive about the intervention, some reporting reduced burden of care in that they felt able to pursue their own activities, knowing that the participant was engaged in purposeful activity. Participants were also generally positive about the software, although many commented on the repetitive nature of stimulation. The dose levels administered by participants were varied and sometimes modest. An important future direction for software design is to incorporate game and social elements to maximize motivation and achieve higher usage levels. This refinement would benefit engagement with the later stages of the program in particular, which focus on the use of trained words in sentence frames. Practice at this level is likely to be crucial in achieving transfer to spontaneous speech.
We thank Andrew Hardbottle and Jennifer Ryder for contributions to data collection and Rotherham NHS Trust for assistance in participant recruitment.
Sources of Funding
This research was funded by the Bupa UK Foundation specialist grant programme. The funders had no role in design or conduct of the study.
Drs Varley and Whiteside are coauthors of a commercially available program used to treat speech impairments. They and the University of Sheffield (employers of Cowell, Dyson, and Whiteside) receive royalties from sales of software. The software used in this study was a pilot version of this program. The other authors report no conflicts.
The online-only Data Supplement is available with this article at http://stroke.ahajournals.org/lookup/suppl/doi:10.1161/STROKEAHA.115.011939/-/DC1.
- Received October 23, 2015.
- Revision received December 7, 2015.
- Accepted December 11, 2015.
- © 2016 American Heart Association, Inc.
- Gonzalez-Rothi LJ, Crosson L, Nadeau S.
- McNeil MR, Doyle PJ, Wambaugh JL.
- Basilakos A, Rorden C, Bonilha L, Moser D, Fridriksson J.
- West C, Hesketh A, Vail A, Bowen A.
- Bhogal SK, Teasell R, Speechley M.
- Brady MC, Kelly H, Godwin J, Enderby P.
- Palmer R, Enderby P, Cooper C, Latimer N, Julious S, Paterson G, et al.