Volume 45, Issue 6 p. 1107-1114

Speech-in-noise perception in high-functioning individuals with autism or Asperger's syndrome

José I. Alcántara
Department of Experimental Psychology, University of Cambridge, UK

Emma J.L. Weisblatt
Department of Psychiatry, Developmental Psychiatry Section, University of Cambridge, UK

Brian C.J. Moore
Department of Experimental Psychology, University of Cambridge, UK

Patrick F. Bolton
Department of Psychiatry, Developmental Psychiatry Section, University of Cambridge, UK
José I. Alcántara, Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, UK; Tel: +44-01223-333569; Fax: +44-01223-333564; Email: [email protected]

Abstract

Background: High-functioning individuals with autism (HFA) or Asperger's syndrome (AS) commonly report difficulties understanding speech in situations where there is background speech or noise. The objective of this study was threefold: (1) to verify the validity of these reports; (2) to quantify the difficulties experienced; and (3) to propose possible mechanisms to explain the perceptual deficits described.

Method: Speech-in-noise perception abilities were measured using speech reception thresholds (SRTs), defined as the speech-to-noise ratio (SNR) at which approximately 50% of the speech is correctly identified. SRTs were measured for 11 individuals with HFA/AS and 9 age/IQ-matched normal-hearing control subjects, using an adaptive procedure, in a non-reverberant sound-attenuating chamber. The speech materials were standardised lists of everyday sentences spoken by a British male speaker. The background sounds were: (1) a single female talker; (2) a steady speech-shaped noise; (3) a speech-shaped noise with temporal dips; (4) a steady speech-shaped noise with regularly spaced spectral dips; and (5) a speech-shaped noise with temporal and spectral dips.

Results: SRTs for the HFA/AS group were generally higher (worse) than those for the controls across the five background sounds. A statistically significant difference in SRTs between the subject groups was found only for those background sounds that contained temporal or spectro-temporal dips. SRTs for the HFA/AS individuals were 2 to 3.5 dB higher than for the controls, equivalent to a substantial decrease in speech recognition. Expressed another way, whenever there were temporal dips in the background sound, the HFA/AS individuals required a higher SNR to perform at the same level as the controls.

Conclusions: The results suggest that the speech-in-noise perception difficulties experienced by individuals with autism may be due, in part, to a reduced ability to integrate information from glimpses present in the temporal dips in the noise.

Abbreviations:

  • HFA/AS: high-functioning individuals with autism or Asperger's syndrome
  • SRT: speech reception threshold

Autism is a behaviourally defined syndrome characterised by abnormalities in reciprocal social interaction, verbal and nonverbal communication, and a restricted repertoire of interests and behaviours, all present from early childhood (American Psychiatric Association, 1994). Conspicuously absent from the diagnostic criteria is any reference to sensory interests or abnormalities, although atypical reactions to the sensory environment are often reported in children with autism spectrum disorder. Indeed, clinicians view these sensory reactions as a prominent part of the clinical picture of autism. This is in keeping with recent findings from a clinical database, currently being analysed, that the prevalence of abnormal sensory phenomena may be over 60% for individuals with autism in the United Kingdom. While all modalities appear to be affected, these abnormalities are particularly evident in hearing, and are often described as very distressing by high-functioning individuals with autism spectrum disorder (ASD) (Goldfarb, 1963; Ornitz, 1974; Grandin & Scariano, 1986). Among the auditory symptoms reported is hyper-reactivity to auditory stimulation (i.e., hyperacusis) (Grandin & Scariano, 1986; Rosenhall, Nordin, Sandström, Ahlsén, & Gillberg, 1999), although cases of hypo-reactivity to sound have also been noted (Grandin & Scariano, 1986). An increased awareness of environmental noises and difficulty in hearing speech in background noise are also prominent auditory features in autism (Grandin & Scariano, 1986; Boatman, Alidoost, Gordon, Lipsky, & Zimmerman, 2001). There have been very few psychophysical studies of auditory discrimination in autism (e.g., Franca et al., 2001; Bonnel et al., 2002), making it difficult to identify the processes involved.
Difficulties in auditory perception can be disabling, especially for children, who may become distressed and unable to understand speech in a noisy classroom environment, and therefore find themselves at an educational disadvantage relative to their peers. The development of language and use of language in a social context may also be adversely affected.

    The process of detecting speech in competing speech or everyday background sounds may be viewed as an example of ‘auditory scene analysis’, whereby information arising from several simultaneous sources is perceptually grouped into separate ‘auditory objects’ or perceptual streams (Bregman, 1990). The auditory system may make use of a variety of physical cues to derive separate perceptual streams, including the perceived similarity or dissimilarity between different sound sources. Specifically, differences in spectrum and fundamental frequency (F0), disparities in onset time, correlated changes in amplitude or frequency, and location of the auditory signals may all assist in the scene analysis of complex auditory inputs and the formation of auditory streams. Although no single cue is effective all the time, when used together they generally provide a very effective basis for auditory grouping.

    Other potential explanations for the reported difficulties experienced by individuals with autism when listening to speech in a competing background are a failure of selective attention, deficits in working memory, and reduced ability to integrate visual and auditory speech cues. There could also be a failure of the normal use of contextual cues to facilitate speech perception. For example, previous studies have demonstrated that individuals with autism are insensitive to contextual cues in linguistic tasks, in that they are less able than controls to use the context of the preceding sentence to disambiguate the pronunciation of a homograph (Frith & Snowling, 1983; Snowling & Frith, 1986).

    In summary, the analysis of complex auditory signals, such as speech-in-noise, may be affected by deficits in both peripheral and central processing. In addition, the role of the visual modality and attention may also be significant. At present, it is not known to what extent any or all of these deficits operate in individuals with autism.

    The process of speech-in-noise recognition has been studied extensively for normally hearing individuals and people with sensorineural hearing loss. Relative to normally hearing people, individuals with cochlear hearing impairment experience particular difficulty understanding speech in the presence of background sounds, even when the effects of reduced audibility are accounted for by amplification of the speech signal so that it is well above their absolute hearing thresholds (Plomp, 1978; Duquesnoy, 1983a; Laurence, Moore, & Glasberg, 1983; Glasberg & Moore, 1989; Moore, Glasberg, & Vickers, 1995). This problem has been quantified in the laboratory by estimating the speech reception threshold (SRT), defined as the signal-to-noise ratio (SNR, expressed in dB) corresponding to the 50% correct point on the psychometric function relating percent speech recognition to SNR. It is important to note that the slope of the psychometric function is steepest at this point, so even relatively small differences in SNR may indicate functional differences in speech recognition: for sentences in noise, a 1-dB change in SNR may correspond to as much as a 19% change in speech recognition for a steady background noise (Duquesnoy, 1983b), and about a 12% change when there are temporal dips in the background noise (Duquesnoy, 1983a).
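    The practical weight of an SRT shift can be illustrated with a short back-of-the-envelope calculation (not from the paper) that applies the psychometric-function slopes quoted above; the linear approximation holds only near the 50% point.

```python
# Illustrative arithmetic: converting an SRT shift into an approximate
# change in percent-correct speech recognition, using the slope values
# quoted in the text (Duquesnoy, 1983a, 1983b). Valid only near the SRT.

def recognition_change(srt_shift_db, slope_pct_per_db):
    """Percentage-point change in recognition for a given shift in SNR."""
    return srt_shift_db * slope_pct_per_db

# A 2-dB SRT elevation in steady noise (slope ~19%/dB):
print(recognition_change(2.0, 19.0))   # → 38.0
# A 3.5-dB SRT elevation in noise with temporal dips (slope ~12%/dB):
print(recognition_change(3.5, 12.0))   # → 42.0
```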

    SRTs for people with cochlear hearing impairment are typically higher than for normally hearing people. In other words, hearing-impaired individuals require a higher SNR to achieve the same level of performance. The SRT difference between normal and hearing-impaired individuals varies markedly depending on the type of background sound: normally hearing people achieve markedly lower (better) SRTs in a background of a single talker than in a steady background noise with a spectrum that has the same shape as the speech spectrum (‘speech-shaped’), while hearing-impaired individuals do not show this effect (Duquesnoy, 1983a; Festen, 1987b; Moore et al., 1995; Peters, Moore, & Baer, 1998). The difference is thought to arise from a failure on the part of the hearing-impaired individuals to take advantage of ‘dips’ present in the background noise or competing voices. These dips take two forms: temporal and spectral. The temporal dips arise because there are moments, during brief pauses, for example, when the overall level of the competing speech is low. During these moments, the signal-to-background ratio is relatively high, allowing ‘glimpses’ of the target speech to be obtained, a process called ‘dip listening’. The spectral dips arise because the spectrum of the target speech is often quite different from that of the background speech, at least over the short term. Therefore, there may be some frequencies of the target speech that are not masked at all by the competing speech, resulting in a high SNR for those frequencies. This allows regions of the target speech spectrum to be glimpsed, which can then be used to infer the structure of the complete target speech sound.

    The objective of this study was threefold: (1) to verify the validity of the commonly reported speech-in-noise problems; (2) to quantify the difficulties experienced; and (3) to propose possible mechanisms to explain the perceptual deficits described. Specifically, we wanted to determine whether the problems reported by HFA/AS individuals in understanding speech in noise might be due, at least in part, to a failure to exploit spectral and/or temporal dips present in background sounds. We measured the SRTs of a group of high-functioning normal-hearing individuals with autism (HFA) or Asperger's syndrome (AS) and also of a group of normal-hearing age/IQ-matched control subjects. The speech materials were lists of everyday sentences spoken by a British male talker, presented in five different types of background sounds, designed to determine the relative importance of spectral and temporal processing for speech-in-noise recognition.

    Method

    Subjects

    Two groups of subjects participated in the study: 11 high-functioning normal-hearing adults and adolescents with confirmed diagnoses of autism or Asperger's syndrome (HFA/AS); and 9 normal-hearing age/IQ-matched adult and adolescent control subjects, with no previously reported speech-in-noise perception problems. All had normal hearing thresholds (<20 dB HL) across the audiometric frequencies (0.25 to 8 kHz) and middle ear function within normal limits, and were paid for their services. All HFA/AS subjects had previously reported speech-in-noise problems during ADI/ADOS sessions (Lord et al., 2000; Lord, Rutter, & Le Couteur, 1994). However, this did not form part of our selection criteria. Indeed, it was only some time after the end of the study that the ADI/ADOS data were inspected and this fact became clear.

    All the HFA/AS individuals were diagnosed by clinicians according to the criteria specified by ICD-10 (WHO, 1992). The mean ages of the HFA/AS and control groups were 21 and 19 years, respectively. Subjects were assessed and matched for verbal, performance and full-scale IQ using the Wechsler Abbreviated Scale of Intelligence (WASI) (Psychological Corporation, 1999). All four subtests in this scale were used to calculate the IQ values (i.e., vocabulary, block design, similarities and matrix reasoning). Table 1 summarises the subject characteristics. Independent samples t-tests (2-tailed) were carried out on the matching data. No significant differences between the two subject groups were found on the age (t (18) = .42, p = .68), verbal IQ (t (18) = .57, p = .58), performance IQ (t (18) = −1.14, p = .27), or full-scale IQ (t (18) = −.09, p = .93) measures.

    Table 1. Subject matching data (mean ± SD)
    Group              Age           Verbal IQ      Performance IQ   Full-scale IQ
    HFA/AS (n = 11)    20.9 ± 11.1   107.6 ± 17.1   102.1 ± 12.8     105.6 ± 14.8
    Control (n = 9)    19.3 ± 4.8    103.0 ± 17.8   108.0 ± 10.4     106.0 ± 14.0

    Stimuli and apparatus

    Speech perception testing was carried out in a double-walled non-reverberant sound-attenuating chamber. Subjects were seated directly in front of and facing a loudspeaker, at a distance of about 1 metre. The speech materials used were lists of everyday English sentences, the ASL sentence lists (MacLeod & Summerfield, 1990), spoken by a male native British speaker with a Received Pronunciation (‘RP’) English accent. Each sentence list contained 15 sentences, and each sentence contained three ‘key words’. The following five background sounds were used:

    1. A steady speech-shaped noise with a long-term average spectrum similar to that of the target speech. This noise was used to provide a base level of performance against which SRTs in other noise conditions could be compared. The spectrum of this noise is shown in Figure 1.

    2. A single competing female talker, digitally filtered so as to have the same spectrum as the noise in (1) (see Figure 1). This sound contained both spectral and temporal dips.

    3. A noise with the same spectrum as in (1), but amplitude modulated with the temporal envelope of the single talker. This noise had only temporal dips.

    4. A noise with the same overall spectral shape as in (1), but digitally filtered to have spectral dips in several frequency regions. The digital filter used to produce this noise is shown in Figure 2. The filtering was based on the equivalent-rectangular-bandwidth (ERB) scale derived from the auditory filter bandwidths of normally hearing subjects (Glasberg & Moore, 1990). Each ERB represents one auditory filter bandwidth. The noise was filtered with an alternating pattern of four ERBs present and four ERBs removed. Thus, this noise contained only spectral dips, four ERBs wide.

    5. A noise with both spectral and temporal dips, obtained by applying the temporal envelope of the single talker used in (2) to the speech-shaped noise used in (1), and then filtering that noise as in (4). Thus, this noise contained temporal dips and four-ERB-wide spectral dips.
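    The layout of alternating four-ERB bands can be sketched using the ERB-number scale of Glasberg and Moore (1990). The exact band edges used in the study are not given in the text, so the following is illustrative only.

```python
import math

# Sketch of alternating four-ERB pass/stop band edges on the ERB-number
# scale (Glasberg & Moore, 1990). Illustrative: the study's actual band
# edges are not reported in this text.

def hz_to_erb_number(f_hz):
    """Frequency (Hz) to ERB-number (Glasberg & Moore, 1990)."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_number_to_hz(n):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (n / 21.4) - 1.0) * 1000.0 / 4.37

# Band edges every four ERBs, up to the 8-kHz Nyquist frequency implied by
# the 16-kHz sampling rate mentioned in the Method.
top_erb = int(hz_to_erb_number(8000.0))          # ~33 ERBs at 8 kHz
edges_hz = [erb_number_to_hz(n) for n in range(0, top_erb + 4, 4)]

for lo, hi in zip(edges_hz, edges_hz[1:]):
    print(f"{lo:7.1f} - {hi:7.1f} Hz")
# Alternate bands would be passed and removed, giving dips four ERBs wide.
```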

    Figure 1. Long-term average spectra of the speech-shaped noise (dashed line) and the female competing talker (solid line) after filtering to match the spectrum of the noise

    Figure 2. Characteristics of the digital filter used to produce the noise with four-ERB-wide spectral notches in alternating frequency regions

    The level of the background sounds was either 60 or 75 dB SPL. For the background sounds with spectral dips, the overall level of the noise was slightly reduced by the filtering, by about 3 dB. The spectrum level of the noise in the filter passbands was the same as in the speech-shaped noise in (1). In this way, it was possible to investigate the effect of introducing spectral dips without the confounding effect of increasing the spectrum level in the remaining parts of the speech spectrum.
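    The roughly 3-dB reduction follows from the filtering removing about half of the noise power, a quick approximation (the exact figure depends on the speech-shaped spectrum):

```python
import math

# Removing alternate bands that together carry roughly half the noise power
# lowers the overall level by 10*log10(0.5) ≈ -3 dB, consistent with the
# reduction described in the text. Approximate: assumes equal power halves.
level_change_db = 10.0 * math.log10(0.5)
print(round(level_change_db, 2))   # → -3.01
```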

    The background sounds in (2) to (5) were digitally filtered using a Silicon Graphics UNIX workstation, using a 16-kHz sampling rate. When processing was complete, they were transferred digitally to a recordable compact disc (CDR) for use in the study. The speech was played back from digital audiotape (DAT). The level of the background sound was adjusted using the calibrated volume control of a Quad 306 amplifier. The speech level was controlled by means of a Tucker-Davis Technologies programmable attenuator (PA4), adjusted manually by the tester. The background sound and speech stimuli were mixed and played back through a Monitor Audio MA4 loudspeaker.

    Procedure

    SRTs were measured using an adaptive procedure. The level of the speech was initially set to give a +5 dB SNR (i.e., the speech level was 5 dB higher than the noise level). If the subject scored two or more key words correct out of three, the level of the speech was decreased by 5 dB; if the subject scored fewer than two key words correct, the speech level was increased by 5 dB. After two ‘turnpoints’, that is, changes from increasing to decreasing level, or vice versa, the step size was decreased to 3 dB. Testing then continued until the sentence list was completed. Two complete sentence lists were used for each background sound.
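    The tracking rule above can be sketched in a short simulation. The listener model (score_sentence) is entirely hypothetical and stands in for a real subject's verbatim responses; everything else follows the rule as described.

```python
import random

# Sketch of the adaptive tracking rule: start at +5 dB SNR, speech down
# 5 dB after 2+ of 3 key words correct, up 5 dB otherwise; step shrinks to
# 3 dB after two turnpoints. score_sentence() is a hypothetical listener.

def score_sentence(snr_db, rng):
    """Hypothetical listener: each key word more likely correct at higher SNR."""
    p = min(max(0.5 + 0.1 * snr_db, 0.0), 1.0)
    return sum(rng.random() < p for _ in range(3))   # key words correct (0-3)

def run_track(n_sentences=15, seed=0):
    rng = random.Random(seed)
    snr, step = 5.0, 5.0             # start at +5 dB SNR with a 5-dB step
    turnpoints, last_direction = 0, None
    history = []
    for _ in range(n_sentences):
        correct = score_sentence(snr, rng)
        history.append((snr, correct))
        direction = -1 if correct >= 2 else +1   # 2+ correct: speech level down
        if last_direction is not None and direction != last_direction:
            turnpoints += 1
            if turnpoints == 2:
                step = 3.0                       # smaller step after 2 turnpoints
        last_direction = direction
        snr += direction * step
    return history

track = run_track()
print(len(track))   # one (SNR, key-words-correct) pair per sentence
```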

    Subjects were instructed to ignore the noise and concentrate on the talker, and to repeat back, verbatim, what they heard. They were encouraged to guess and to make full use of any contextual cues available.

    The tester did not provide any feedback at any time.

    The numbers of key words correct for each speech level, for a given background, were totalled. A probit analysis (Finney, 1971) was then carried out to estimate the 50% correct point on the psychometric function, that is, the SRT for the specific background sound; the more negative the SRT, the better the performance of the subject. For example, an SRT of −10 dB would indicate that 50% of the speech could be correctly identified when the level of the speech was 10 dB below that of the background sound.
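    The estimation step can be illustrated as follows. True probit analysis (Finney, 1971) is a maximum-likelihood fit of a cumulative normal; this sketch substitutes a coarse least-squares grid search, and the scores below are made up for illustration.

```python
from statistics import NormalDist

# Sketch of estimating the SRT (the 50%-correct point) from key-word scores
# pooled across speech levels. Illustrative least-squares stand-in for
# probit analysis; data are invented.

snrs    = [-16, -13, -10, -7, -4]   # dB SNR at which sentences were scored
correct = [3, 9, 16, 25, 28]        # key words correct at each SNR
total   = [30, 30, 30, 30, 30]      # key words presented at each SNR

def sse(mu, sigma):
    """Squared error between observed proportions and a cumulative normal."""
    cdf = NormalDist(mu, sigma).cdf
    return sum((c / t - cdf(s)) ** 2 for s, c, t in zip(snrs, correct, total))

best_mu, best_sigma = min(
    ((mu / 10.0, sigma / 10.0)
     for mu in range(-200, 1)          # candidate SRTs: -20.0 to 0.0 dB
     for sigma in range(5, 101, 5)),   # candidate spreads: 0.5 to 10.0 dB
    key=lambda p: sse(*p))

srt_db = best_mu
print(f"Estimated SRT: {srt_db:.1f} dB")   # more negative = better performance
```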

    All data for a given subject were collected in one two-hour session by the first author. The order of testing of the background sounds and presentation levels was counterbalanced across subjects to control for test order effects. The second author administered the WASI during a break between the speech testing.

    Results

    Figure 3 shows the mean SRTs for the HFA/AS and control subject groups, for the 60- and 75-dB SPL background sounds in the top and bottom panels, respectively. The SRTs are plotted in dB, with more negative numbers (indicating better speech recognition performance) proceeding downwards. Error bars denote ± one standard deviation (SD).

    Figure 3. Mean speech reception thresholds (SRTs), in dB, for the HFA/AS group (squares) and control group (circles) for the 60-dB SPL background noise (top panel) and the 75-dB SPL background noise (bottom panel). More negative SRT scores indicate better performance. SRTs are presented for each of the five background sounds. The shaded symbols indicate statistically significant (p < .05) differences between HFA/AS and control subjects

    An analysis of variance (ANOVA) with repeated measures was carried out on the data with the following three factors: (1) subject group; (2) background sound; and (3) background sound level. In this design, the individual subjects may be viewed as defining a fourth factor. The ‘subject’ factor is crossed with the ‘background sound’ and ‘background sound level’ factors, but nested under the ‘subject group’ factor.

    Table 2 shows a summary of the analysis of variance results. Briefly, all three main factors were found to be statistically significant (p < .05). Of main interest here is the significant ‘subject group’ factor, indicating that SRTs for the HFA/AS group were significantly higher (worse) than for the control group (−16.4 dB and −18.2 dB, respectively). Inspection of Figure 3 indicates that SRT differences between the two groups varied between about −.5 and 3 dB, across background sound and level. Although the magnitude of these differences may seem small, recall that small differences in SNR indicate functional differences in speech recognition: for sentences in noise, a 1-dB change in SNR may correspond to as much as a 19% change in speech recognition, for a steady background noise (Duquesnoy, 1983b), and about a 12% change when there are temporal dips in the background noise (Duquesnoy, 1983a).

    Table 2. Summary of ANOVA results
    Effect                                          d.f.    F-ratio   p-value
    Subject group                                   1,18    5.13      .036
    Background level                                1,18    5.71      .028
    Background type                                 4,72    547.6     <.001
    Group × background level                        1,18    1.55      .229
    Group × background type                         4,72    2.86      .030
    Background level × background type              4,72    2.79      .032
    Group × background level × background type      4,72    .85       .500

    Table 2 indicates that the ‘background sound’ factor was statistically significant, but more interestingly, that the interaction of ‘background sound’ and ‘subject group’ was also significant. Post-hoc tests showed that the SRTs for the female talker, the noise with temporal dips, and the noise with temporal and spectral dips, were significantly worse (p < .05) for the HFA/AS group than for the control group. This suggests that the HFA/AS individuals were not able to exploit temporal dips in the background sounds as well as the controls.

    In summary, although individuals with autism or Asperger's syndrome appear to make use of temporal and spectral dips to improve the detection of speech in background sounds, they do not derive as much benefit from the temporal dips as the control subjects. This can be seen more clearly in Figure 4, which shows the SRT improvements measured for the background sounds containing spectral and/or temporal dips, relative to the SRT for the steady noise containing no dips, which resulted in the highest (worst) SRT values. The squares indicate the SRT improvements for the HFA/AS group, and the open circles show the SRT improvements for the control group. The SRT improvements for the HFA/AS group were about 2 to 3 dB lower than for the control group when the background sound contained temporal dips alone or temporal and spectral dips. In contrast, when only spectral dips were present, the SRT improvements for the two groups were very similar for the 60-dB noise level, and slightly larger for the HFA/AS group for the 75-dB noise level.

    Figure 4. SRT improvements relative to the SRT for the steady noise, for the HFA/AS group (squares) and control group (circles) for the 60-dB SPL background noise (top panel) and the 75-dB SPL background noise (bottom panel), for the four background sounds with dips

    Discussion

    The current study was carried out to determine if the problems commonly reported by individuals with HFA/AS in understanding speech when background noise is present could be verified and quantified using a standardised speech recognition test. Significant differences between SRTs for the HFA/AS and control groups were found, however, only for those background sounds containing temporal dips. That is, the HFA/AS group performed significantly worse than the controls when the following three background sounds were present: (1) the competing female talker; (2) the noise with temporal dips; and (3) the noise with temporal and spectral dips. When the background sound contained only spectral dips, or no dips at all, SRTs for the HFA/AS group were not significantly different from those for the control subjects.

    Normal-hearing subjects commonly use temporal and spectral dips present in background sounds to improve the intelligibility of the perceived speech. In the case of temporal dips, they are able to make use of the momentary fluctuations in the level of the background sound to improve detection of the speech signal. This is most likely due to the operation of a process termed ‘dip listening’ whereby subjects glimpse the speech signal during those times when there is little background sound energy present (Miller & Licklider, 1950; Howard-Jones & Rosen, 1993; Alcántara & Moore, 1995; Alcántara, Holube, & Moore, 1996). In the case of spectral dips, normal-hearing subjects combine information from glimpses in different frequency regions (Meddis & Hewitt, 1992; Howard-Jones & Rosen, 1993).

    In attempting to understand why the HFA/AS individuals failed to take full advantage of the temporal dips in the background sound, it is useful to review the speech-in-noise abilities of hearing-impaired people, to identify those psychoacoustic abilities important for good speech-in-noise perception. It was stated in the introduction that whereas normal-hearing people achieve markedly lower SRTs in a background of a single talker than in a background of speech-shaped noise, hearing-impaired people do not (Duquesnoy, 1983a; Festen, 1987a; Festen & Plomp, 1990; Hygge, Rönnberg, Larsby, & Arlinger, 1992; Moore, 1995; Peters et al., 1998; Moore, Peters, & Stone, 1999), even when the speech is presented at a high level so that audibility is not a factor. The failure to take advantage of the temporal dips in the background sounds might be due to a deficit in temporal resolution: people with cochlear hearing loss generally show impaired temporal resolution for stimuli with slowly changing envelopes, and this would lead to a reduced ability to take advantage of temporal dips (Festen, 1987a, 1987b; Glasberg, Moore, & Bacon, 1987; Moore & Glasberg, 1988; Festen & Plomp, 1990; Glasberg & Moore, 1992; Festen, 1993; Moore, 1995). Hearing-impaired individuals also show reduced frequency selectivity, and this would lead to a reduced ability to take advantage of spectral dips (Glasberg & Moore, 1986; Tyler, 1986; Moore, 1995). Frequency selectivity refers to our ability to separate or resolve, at least to a limited extent, the components in a complex sound, such as speech.

    As none of our HFA/AS subjects had a hearing loss, it seems unlikely that the speech in the dips of the background sound was below their absolute hearing thresholds. However, supra-threshold deficits in temporal resolution and/or frequency selectivity might be responsible, in part, for the differences in speech-in-noise perception between the HFA/AS and control groups. Dip listening is, however, a two-stage process, requiring first the perception of isolated speech segments that are either very brief or have narrow frequency bandwidths. This is followed by the attempted reconstruction of the intended message from the ‘glimpsed’ speech segments. The first stage requires good temporal resolution and/or frequency selectivity (depending mainly on low-level or ‘peripheral’ processing), as described previously, while the second stage requires good top-down (higher-level or ‘central’) processing, in the use of contextual cues and knowledge of the syntactic structure of language. The inability of individuals with autism to take advantage of dips in the noise to improve speech communication might therefore be due to peripheral and/or central processing deficits.

    Unfortunately, it was not possible to assess the influence of ‘top-down processing’ on the results, because the study lacked a test condition without sentence material, such as a word or nonsense-syllable task. We have, however, performed a pilot study on the frequency selectivity abilities of a small group of normal-hearing individuals with autism, and found that their abilities were significantly worse than those of normal-hearing controls (Plaisted, Saksida, Alcántara, & Weisblatt, 2003). This suggests that at least part of the speech-in-noise difficulties experienced by our subjects could have been due to abnormal peripheral auditory processing. However, more studies are required on the psychophysical abilities of individuals with autism, in both frequency selectivity and temporal resolution, before this can be generalised to the wider population with autism.

    Although significant differences in perceiving speech in noise were found between the HFA/AS and control subjects, anecdotal reports from the HFA/AS individuals indicated that they did not experience as severe a difficulty in detecting the speech signal during the laboratory testing sessions as in their everyday lives. They reported that the ‘laboratory’ testing of speech-in-noise was not very realistic for them, as there were no echoes and the sounds always came from one location in space. Indeed, the process of perceiving speech in background sound in everyday life is complex and requires not only the detection of the auditory signal, but also the successful integration of visual and auditory speech cues and the ability to make use of contextual cues. In addition, competing noises in ‘real-life’ situations are often multiple and spatially separated. In contrast, testing in the current study was performed without lip-reading in a near-anechoic (i.e., echo-free) sound-attenuating chamber, and without overall contextual cues. Moreover, the speech and background sounds were presented from the same loudspeaker, which was always in front of the subject. The speech-in-noise perception abilities measured in the current study may therefore have underestimated the extent of the real problem experienced by the HFA/AS group who may have a decreased ability to use contextual cues, or suffer from a greater than normal interfering effect of room echoes or multiple background sound sources. Future studies are needed to investigate the contribution of these factors to perception of speech in noise by autistic individuals.

    In summary, the results of the current study suggest that the problems commonly reported by autistic individuals of understanding speech when there is background noise are real and quantifiable. They may be due, at least in part, to abnormal peripheral processing, specifically, to a reduced ability to exploit information about the target speech present during the spectral and temporal dips in the background. Due to the relatively small number of subjects used in the present study, the results should be viewed as preliminary, to be confirmed by further testing on a larger group of HFA/AS individuals. Further speech perception studies are planned to test additional conditions that simulate real-room acoustics and require audio-visual integration of speech information, to determine the contribution of these variables to the speech recognition abilities of autistic individuals. A series of psychophysical studies is also planned to investigate possible abnormal processing in auditory peripheral pathways in this population.

    Acknowledgements

    J.I. Alcántara was supported by a Medical Training Fellowship from the Royal National Institute for Deaf People, London. Additional support was provided by a grant from the Department of Psychiatry, University of Cambridge.