Volume 49, Issue 3 p. 557-572
Original Article

Native English Speakers’ Perception of Arabic Emphatic Consonants and the Influence of Vowel Context

Rachel Hayes-Harb

University of Utah

Kristie Durham

University of Utah

First published: 31 August 2016
Rachel Hayes-Harb (PhD, University of Arizona) is Associate Professor of Linguistics, University of Utah, Salt Lake City.
Kristie Durham (MPhil, University of Utah) is a Doctoral Student in Linguistics, University of Utah, Salt Lake City.

Abstract

Native English speakers experience difficulty acquiring Arabic emphatic consonants. Arabic language textbooks have suggested that learners focus on adjacent vowels for cues to these consonants; however, the utility of such a strategy has not been empirically tested. This study investigated the perception of Arabic emphatic-plain contrasts by means of cross-language vowel identification and perceptual discrimination tasks. It was found that native English speakers relied more on following vowels than on the consonants themselves when discriminating Arabic emphatic and plain consonants, and that their accuracy was greatest when the following vowel was /æ/, followed by /u/ and /i/. The cross-language vowel identification task revealed that Arabic /æ/ was identified as a systematically different English vowel (/ɑ/) when following emphatics than when following plain consonants (/æ/), while Arabic /i/ and /u/ failed to exhibit a differential identification pattern. Together these findings indicate that following vowel quality moderates Arabic emphatic consonant perception by English speakers, who may exploit their sensitivity to the English /ɑ/-/æ/ contrast to perceive emphatic-plain contrasts in the context of Arabic /æ/. However, such a strategy is less likely to be successful when the following vowel is /i/ or /u/. Implications and pedagogical suggestions for the teaching of Arabic to native speakers of English are offered.

Introduction

Despite the geopolitical significance of Arabic and its classification as a “critical foreign language” by the U.S. government (Critical Language Scholarship Program, 2016), Arabic remains among the less commonly studied languages in the United States (Taha, 2007). The acquisition of Arabic by native English speakers has accordingly received relatively little attention in the second language acquisition (SLA) literature (but see, e.g., Brosh, 2015; Hansen, 2010; Raish, 2015; Shiri, 2015; Showalter & Hayes-Harb, 2015). This study adds to the research base by focusing specifically on native English speakers’ ability to perceive notoriously difficult Arabic consonants.

Native English speakers learning Arabic as a second language are faced with a rich consonant inventory involving unfamiliar places of articulation for consonants (i.e., uvular and pharyngeal), in addition to a secondary articulation associated with the so-called emphatics (see Table 1). On the other hand, Arabic has a smaller inventory of vowels relative to that of English (see Table 2): Modern Standard Arabic is typically seen as having three vowel qualities (/æ/, /i/, and /u/), in addition to a contrast between short and long versions of each, although regional varieties of Arabic may have additional vowel qualities (Munro, 1993).

Table 1. Consonant Inventory of Modern Standard Arabic (adapted from Amayreh, 2003, and Ryding, 2014)
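| | Bilabial | Labio-dental | Inter-dental | Alveo-dental | Palatal | Velar | Uvular | Pharyngeal | Glottal |
|---|---|---|---|---|---|---|---|---|---|
| Stop | b | | | d t ḍ ṭ | | k | q | | |
| Fricative | | f | ð θ ð̣ | z s ṣ | ʃ | ɣ x | ʁ χ | ʕ ħ | ʔ |
| Affricate | | | | | | | | | |
| Nasal | m | | | n | | | | | |
| Liquid | | | | l | | | | | |
| Tap/trill | | | | ɾ/r | | | | | |
| Glide | w | | | | j | | | | |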
  • Notes: Voiced consonants are presented on the left and voiceless consonants on the right; emphatics are indicated with a dot under the symbol.
Table 2. Vowel Inventory of Modern Standard Arabic
Front Back
High ↑ i/i: u/u:
Low ↓ æ/æ:

Given the small size of the Arabic vowel inventory, there is substantial variation in the production of Arabic vowels in different phonological contexts. As noted, e.g., by Amayreh (2003), following an emphatic consonant, vowels are produced “farther back in the mouth” (p. 518). This study focuses on the relative perceptibility of Arabic emphatic-plain consonant contrasts in various vowel contexts and offers a number of pedagogical suggestions based on the findings.

Background

Emphasis in Arabic

Emphasis is a feature of Semitic languages involving a secondary constriction in the posterior vocal tract. While both phonological and instrumental studies of emphasis have predominantly described the articulatory correlates of emphasis as involving either uvularization or pharyngealization (e.g., Davis, 1995; Laufer & Baer, 1988; McCarthy, 1994; Zawaydeh, 1999), questions about the exact articulatory correlates of emphasis for various Arabic dialects remain. In addition to methodological differences in investigations of emphasis, possible reasons for this lack of consensus may be related to dialectal variation in the realization of Arabic emphatics, not only with respect to their acoustic effects, but also in the directionality and perseverance of emphatic effects, also called “emphasis spread” or “emphasis harmony” (see, e.g., Watson, 1999). As Laufer and Baer (1988) summarized, such dialectal variation may involve the degree to which emphatic and plain consonants differ, the impact of coarticulatory effects on adjacent sounds, and the specific emphatic consonants realized within a given dialect.

Given the wide dialectal variation in Arabic, this discussion is focused on Arabic spoken in Jordan, as it has been the subject of recent and thorough investigation concerning the acoustics and perception of emphatic consonants. For discussion of the dialects of Jordanian Arabic, see, e.g., Abd-El-Jawad (1986), Al-Wer (2007), and Cleveland (1963). While Arabic distinguishes “plain” /d t s ð/ from “emphatic” /ḍ ṭ ṣ ð̣/, for the purpose of further limiting the scope of the study, the focus here is primarily on the /d/-/ḍ/ contrast. As part of a comprehensive acoustic and perceptual investigation of emphasis in Urban Jordanian Arabic (spoken in the Ammani and Irbid regions of Jordan), Jongman, Herd, Al-Masri, Sereno, and Combest (2011) compared the spectral mean of emphatic and plain consonants as produced in several nonwords by male and female speakers. Both emphatic stops and adjacent vowels exhibited significant acoustic effects. The spectral mean of emphatic stops in word-initial and word-final positions was significantly lower than that of their plain counterparts in these positions, while the second formant (F2) frequency in vowels adjacent to emphatic stops was lower than that of vowels adjacent to their plain counterparts. This acoustic manifestation of emphasis on adjacent vowels is consistent with earlier studies of Urban Jordanian Arabic reporting an F2 lowering in vowels following emphatic consonants (see Al-Masri & Jongman, 2004; Jongman et al., 2011; Khattab, Al-Tamimi, & Heselwood, 2006; Zawaydeh, 1999). More specifically, Al-Masri and Jongman (2004) found that a significant F2 lowering in vowels in the context of emphatic consonants provided the most reliable acoustic cue to emphasis.

Although F2 lowering of vowels in emphatic environments appears to be a robust indicator of emphasis, it is not equivalent for all three Arabic vowel phonemes. Jongman et al. (2011) separately analyzed the acoustic effects of emphasis on /æ/, /u/, and /i/. When following emphatic onset consonants, F2 was shown to be significantly lower throughout all three vowels, i.e., at vowel onset, vowel midpoint, and vowel offset, than in vowels following plain counterparts, but a significant vowel quality by emphasis interaction for the three measurement points was also found, with the F2 lowering effect of emphasis strongest and most enduring throughout the vowel for /æ/, followed by /i/ and /u/. These findings are consistent with Al-Masri and Jongman's (2004) observation that, while the latter two vowels do not resist the effects of emphasis, they do limit these effects to the emphatic consonant-containing syllable. Finally, in addition to detailing the F2 lowering previously observed in emphatic contexts, Jongman et al. (2011) reported that, compared to vowels preceded by plain consonants, those preceded by emphatic consonants displayed a higher first formant (F1) frequency. As with F2 lowering, the effect of emphasis on F1 was greatest for /æ/, followed by /i/ and /u/.

In order to determine the relationship between these acoustic correlates of emphasis and the perception of emphasis by native speakers of Urban Jordanian Arabic, Jongman et al. (2011) further conducted a perceptual experiment using manipulated auditory stimuli. The stimuli consisted of consonant-vowel-consonant (CVC) minimal pairs with emphatic-plain target onsets, produced by two speakers (one male, one female), from each of whom four versions were created using same- and cross-splicing. The cross-spliced stimuli were of two types. The first was created by taking the rime from a production with an emphatic onset (ṾC̣), where subscript dots indicate the original emphatic context, and attaching it to the onset consonant from a plain production (C) to form CṾC̣ stimuli (in a CVC syllable, the first C is the onset, and the VC sequence is the rime). The second type was produced by taking the rime of a plain production (VC) and attaching it to the onset of an emphatic production (C̣) to form C̣VC stimuli, in which the subscript dot indicates that the onset is an emphatic consonant. The two resulting stimuli, as represented by CṾC̣ and C̣VC, allowed for the isolation of the vocalic and consonantal contributions to the perception of emphasis. Note that since target consonants and vowels were segmented at the onset of F1, formant transitions were included in the VC portion.

The same-spliced stimuli were also of two types, which were produced by taking the rime of either an originally emphatic or plain production and attaching it to the onset of a different production of the same word by the same speaker. The two resulting stimulus types were represented as C̣ṾC̣ and CVC. In sum, the four stimulus types in Jongman et al. (2011) included the spliced forms C̣ṾC̣, CṾC̣, C̣VC, and CVC. Thirty native Urban Jordanian Arabic speakers heard multiple repetitions of each of the same- and cross-spliced stimuli and indicated for each which of two written words matched the auditory form. The two orthographically represented words were the corresponding minimal pairs. For example, if participants heard one of the four possible stimulus types spliced from the word /ṭu:b/, the possible responses were /tu:b/ and /ṭu:b/ in Arabic script.

The percentage of emphatic responses was calculated for each of the four splice types in Jongman et al. (2011). Same-spliced items with both onset and rime originating in emphatic productions (C̣ṾC̣) received 90% emphatic responses, while same-spliced items with onset and rime originating in plain productions (CVC) received only 3% emphatic responses. When the onset originated in an emphatic production but the rime originated in a plain production (C̣VC), 15% were judged to be emphatic. In contrast, when the onset originated in a plain production but the rime originated in an emphatic production (CṾC̣), 69% were judged emphatic (p. 92). In sum, stimuli with the rime originating in an emphatic production were more likely to be identified as having an emphatic onset, regardless of the actual status of the target onset, than those stimulus types with the rime originating in a plain production. This response pattern suggests that in perceiving emphatic onsets, native speakers of Urban Jordanian Arabic appear to rely more heavily on information contained in the rime portion of the signal than on information contained in the onset consonant itself.

The phonetic manifestation of emphasis in Urban Jordanian Arabic involves F1 raising and F2 lowering of adjacent vowels, but with differential acoustic effects depending on the quality of the vowel—the strongest and most enduring effects have been found for /æ/, followed by /i/ and /u/. In addition, studies have shown that native speakers of Arabic exhibit reliance on information contained in the rime portion of CVC syllables when determining the status of an onset consonant with respect to emphasis.

Perception of Arabic Emphasis by Native Speakers of English

Despite the recently growing number of studies of Arabic language acquisition by native English speakers (e.g., Brosh, 2015; Hansen, 2010; Raish, 2015; Shiri, 2015; Showalter & Hayes-Harb, 2015), only a limited number of studies have investigated the perception of Arabic emphasis by native English speakers (Al Mahmoud, 2013; Lababidi & Park, 2014; Zaba, 2007). Al Mahmoud (2013) investigated the discrimination of a variety of Arabic consonant contrasts produced by a native Arabic speaker (dialect unspecified) by native English–speaking learners of Arabic, including the emphatic-plain contrasts /ṭ/-/t/ and /ð̣/-/ð/. Al Mahmoud found that the learners experienced significantly more difficulty discriminating the emphatic-plain contrasts than contrasts common to English and Arabic (e.g., /t/-/d/ and /ð/-/θ/). Lababidi and Park (2014) asked naïve native English speakers to label aurally presented Arabic consonants (speaker's dialect unspecified) by identifying the closest corresponding English phoneme using sample English words containing the phonemes as response options and then to rate their certainty about these cross-language consonant identifications. A measure of the perceived similarity between the Arabic and English phonemes was then calculated by multiplying the proportion of responses for the most commonly identified English phoneme for a given Arabic phoneme by the mean certainty rating for that Arabic-to-English pairing. The resulting fit index was qualitatively described as “good,” “fair,” or “poor” based on whether it fell within one, two, or three standard deviations of a control fit index created from a separate task using English word stimuli.

While Lababidi and Park's (2014) materials involved a variety of Arabic consonants, the data discussed here focus on the results for /ḍ/ and /d/, as they are most relevant to the present investigation. Arabic /d/ to English /d/ was found to be a “good” fit in Lababidi and Park, while emphatic Arabic /ḍ/ to English /d/ was a “fair” fit. Thus, while not equally good, both Arabic /ḍ/ and Arabic /d/ were rated as at least fair exemplars of English /d/, providing evidence that Arabic /ḍ/ and /d/ may be perceived as exemplars of English /d/ by native English speakers following the “category goodness” assimilation type (Best, 1995). To the extent that native English speakers perceive Arabic plain /d/ and emphatic /ḍ/ as members of the English category /d/, their ability to perceive the contrast should be hindered by their native phonology.

It is further possible that the relationship between the English and Arabic vowel inventories will result in the moderation of native English speakers’ ability to discriminate Arabic emphatic-plain contrasts in different vowel environments. As noted above, Arabic has a limited vowel quality inventory relative to English. Given that listeners rely heavily on acoustic information contained in adjacent vowels to identify stop consonant place (Walley & Carrell, 1983), it is possible that native English speakers, like the Arabic speakers in Jongman et al. (2011), also rely heavily on the following vowel and coda in detecting emphasis in onset consonants. To the extent that native English speakers do so, the ease or difficulty that they experience in discriminating Arabic emphatic-plain contrasts may result from an interaction between the acoustic effects of emphasis on adjacent vowels and the relationship between the English and Arabic vowel inventories.

As outlined above, two acoustic cues to emphasis are F1 raising and F2 lowering in emphatic-adjacent vowels. The arrows in Figure 1(a) provide a stylized representation of this acoustic effect for the three Arabic vowel phonemes, while Figure 1(b) shows the same arrows overlaid on the English vowel space. Within the English vowel space, the potential for allophonic variation of Arabic /æ/ to overlap acoustically with English /æ/ and /ɑ/ is evident. In contrast, the allophonic variation of Arabic /u/ and /i/ in plain and emphatic contexts does not indicate this differential mapping to English vowels. Using a cross-language vowel identification task, Zaba (2007), with a native Arabic speaker from Saudi Arabia, found exactly this Arabic-to-English mapping pattern: While native English speakers identified Arabic /æ/ as English /æ/ following a plain consonant and as English /ɑ/ following an emphatic consonant, Arabic /i/ was identified as mostly /i/ in both plain and emphatic contexts, while /u/ was identified as predominantly /u/ or /ʌ/ in both contexts.

Figure 1. Stylized Representation of the Effect of Emphasis on Vowel Quality

Note: Arrows represent a stylization of the acoustic effect of emphasis on the three Arabic vowel phonemes (a) and the same arrows overlaid on English vowel space (b).

It appears, then, that the difficulty that native English speakers experience in discriminating Arabic emphatic-plain contrasts may relate to the effect of emphasis on the quality of following vowels and how those vowel allophones map to the English vowel inventory. Indeed, such an assumption is implicit in some Arabic language teaching materials aimed at native speakers of English (e.g., Brustad, Al-Batal, & Al-Tonsi, 1995; Odisho, 1981, 2005). According to the widely used Arabic language textbook Alif Baa: Introduction to Arabic Letters and Sounds, e.g., differences in vowel sounds in emphatic and plain contexts provide the clearest indication of emphasis, particularly for vowels /æ/ and /u/ (Brustad et al., 1995). The aim of the present investigation was to empirically test the hypothesis that the difficulty native English speakers experience in perceiving Arabic emphasis may be moderated by vowel context.

In sum, studies have shown that native Arabic speakers rely more on rimes than onsets to perceive onset consonant emphasis in Arabic (Jongman et al., 2011). The research question is whether this reliance on the rime to perceive Arabic emphasis extends to native English speakers. To the extent that native English speakers rely on rimes to identify Arabic onset consonants, one might further expect their cross-language perception of vowel quality in plain vs. emphatic contexts to influence their ability to discriminate plain and emphatic onset consonants. It is therefore appropriate to ask also whether native English speakers exhibit greater sensitivity to Arabic emphatic-plain onset consonant contrasts when they perceive the rimes as containing distinct English vowel phonemes. To address these research questions, discrimination and cross-language vowel identification experiments were conducted.

Method

The purpose of the cross-language vowel identification task was to investigate the cross-language mapping of Arabic vowels in plain and emphatic contexts to English vowel phonemes. The purpose of the discrimination task was twofold: First, the use of cross-spliced stimuli made it possible to determine whether native English speakers rely more heavily on rimes or onsets in perceiving onset emphatic-plain consonants, and second, through comparison of the discrimination task results to those of the cross-language vowel identification task, it was possible to determine whether native English speakers show greater sensitivity to Arabic emphatic-plain consonant contrasts when they perceive the rimes as containing different English vowel phonemes.

Participants

Forty native English speakers ages 18–47, recruited from the University of Utah community, completed the cross-language vowel identification task followed by the discrimination task. None reported any exposure to Arabic or any other language exhibiting emphatic consonants, or any speech or hearing impairments.

Auditory Stimuli

A male native speaker of Jordanian Arabic (32 years old) who had lived in the United States for approximately 4 years was recruited from the University of Utah to produce the Arabic stimuli. The speaker, who is also an Arabic language instructor at the University of Utah, reported being from northern Jordan, speaking the Jordanian Fallahi dialect, and having no speech or hearing impairments. The Arabic stimuli consisted of three pairs of Arabic nonwords with a CVC syllable structure. The onset consonant was either plain /d/ or emphatic /ḍ/; the vowel was /æ/, /u/, or /i/; and the final consonant was always /k/, resulting in the following: /ḍæk/-/dæk/, /ḍuk/-/duk/, and /ḍik/-/dik/. The nonwords were recorded at a sampling rate of 44100 Hz in a sound-attenuated booth using a Marantz PMD660 recording device. The target nonwords were embedded in an Arabic script version of the carrier sentence uħibu kalimata ___ [I like the word ___] and presented once in each of five blocks. The carrier phrase evidenced features of Modern Standard Arabic for the purpose of eliciting speech similar to that produced in the Arabic language classroom. Three tokens of each of the six nonwords were selected for presentation in the study, for a total of 18 unique auditory stimuli.

Table 3 presents relevant acoustic properties of the stimuli. Praat speech analysis software (Boersma & Weenink, 2007) was used to measure F1 and F2 frequency from linear predictive coding spectra calculated over a 20 ms Hamming window. Measurements were taken at the beginning, middle, and end of each vowel. The vowel beginning was defined as the clear emergence of F1, and the vowel end was defined as the point where F2 noticeably weakened in the spectrogram. When averaged across vowel quality, vowels following initial emphatic consonants had a lower F2 at onset (F(1,8) = 48.194, p < 0.0005, ηp2 = 0.858), middle (F(1,8) = 63.531, p < 0.0005, ηp2 = 0.888), and offset (F(1,8) = 18.283, p = 0.003, ηp2 = 0.696) than vowels following plain consonants. Emphatic F2 lowering was greatest for /æ/, followed by /i/ and /u/. F1 raising in emphatic contexts was also observed at onset (F(1,8) = 5.783, p = 0.043, ηp2 = 0.420), middle (F(1,8) = 10.330, p = 0.012, ηp2 = 0.564), and offset (F(1,8) = 6.164, p = 0.038, ηp2 = 0.435). At vowel middle and offset, F1 raising was greatest for /æ/, followed by /i/ and /u/.

Table 3. Mean F1 and F2 Values (Hz), by Preceding Consonant (Plain vs. Emphatic) and Position in Vowel (Onset, Middle, Offset)
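| Vowel | Nonword | F1 Onset | F1 Middle | F1 Offset | F2 Onset | F2 Middle | F2 Offset |
|---|---|---|---|---|---|---|---|
| æ | /dæk/ | 512 | 567 | 531 | 1570 | 1601 | 1742 |
| | /ḍæk/ | 501 | 657 | 620 | 1020 | 1134 | 1235 |
| u | /duk/ | 404 | 403 | 386 | 1318 | 1035 | 915 |
| | /ḍuk/ | 470 | 427 | 404 | 1045 | 797 | 830 |
| i | /dik/ | 375 | 385 | 355 | 1765 | 1881 | 1991 |
| | /ḍik/ | 438 | 422 | 374 | 1347 | 1591 | 1735 |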
  • Note: Each value represents a mean across three tokens.
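The vowel-by-vowel ordering of the emphasis effects can be checked against the means in Table 3. The following is a minimal Python sketch rather than the original analysis: the dictionary layout and the `emphatic_shift` helper are illustrative assumptions, and the numbers are simply the Table 3 means.

```python
# Mean formant values (Hz) copied from Table 3; each tuple is (onset, middle, offset).
# The data layout and helper name are illustrative, not the authors' analysis code.
TABLE3 = {
    "æ": {"plain": {"F1": (512, 567, 531), "F2": (1570, 1601, 1742)},
          "emphatic": {"F1": (501, 657, 620), "F2": (1020, 1134, 1235)}},
    "u": {"plain": {"F1": (404, 403, 386), "F2": (1318, 1035, 915)},
          "emphatic": {"F1": (470, 427, 404), "F2": (1045, 797, 830)}},
    "i": {"plain": {"F1": (375, 385, 355), "F2": (1765, 1881, 1991)},
          "emphatic": {"F1": (438, 422, 374), "F2": (1347, 1591, 1735)}},
}
POSITIONS = ("onset", "middle", "offset")


def emphatic_shift(vowel, formant):
    """Return emphatic-minus-plain differences (Hz) at each position in the vowel."""
    plain = TABLE3[vowel]["plain"][formant]
    emphatic = TABLE3[vowel]["emphatic"][formant]
    return {pos: e - p for pos, p, e in zip(POSITIONS, plain, emphatic)}


for vowel in TABLE3:
    print(vowel, "F1:", emphatic_shift(vowel, "F1"), "F2:", emphatic_shift(vowel, "F2"))
```

At the vowel midpoint, for instance, F2 is lower after the emphatic onset by 467 Hz for /æ/, 290 Hz for /i/, and 238 Hz for /u/, while F1 is higher by 90 Hz, 37 Hz, and 24 Hz, respectively.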

Procedures

Cross-Language Vowel Identification Task

The three tokens of each of the six Arabic nonwords were presented aurally over headphones using DMDX experiment presentation software (Forster & Forster, 2003). The native English–speaking participants were told that they would be hearing some unfamiliar words and that their task was to indicate which English vowel sound each word contained by pressing the corresponding key on a computer keyboard. The English vowel sounds were represented orthographically on the keyboard in the carrier words top, cat, seat, pit, pet, food, look, mud, wait, bite, cow, boy, and coat, representing /ɑ/, /æ/, /i/, /ɪ/, /ɛ/, /u/, /ʊ/, /ʌ/, /eɪ/, /ɑɪ/, /aʊ/, /oɪ/, and /o/, respectively. Before beginning the main task, participants completed two practice tests, using English filler words such as hip, hop, and skate, in order to familiarize themselves with the vowel sounds represented orthographically on the keyboard and the amount of response time available. During the first practice test, which consisted of four items, participants had 7 seconds to respond before the program automatically advanced to the next item. Before beginning the second practice test, which allowed 5 seconds for response, participants were warned that they would need to respond more quickly and that the actual test would proceed at this new speed. In the main task, each of the three tokens of each of the six nonwords was presented in random order for a total of 18 trials. If participants did not respond within the 5-second time period, the program would automatically continue to the next item. Participants were instructed in advance to focus on the next item in such cases. After completing the vowel identification task, native English speakers were permitted a rest before proceeding to the discrimination task.

Discrimination Task

The same auditory stimuli used in the cross-language vowel identification task were cross- and same-spliced in the manner of Jongman et al. (2011; as summarized above). Onset consonants were extracted from each token at the onset of F1 of the following vowel at the nearest zero-crossing according to the waveform, and the resulting onset and rime portions were then recombined to form four different splice types, abbreviated as CṾC̣, C̣VC, C̣ṾC̣, and CVC (see Table 4). Three unique spliced stimuli were made for each of the splice types described in Table 4 for a total of 36 items (four splice types × three vowels × three uniquely combined repetitions of each); all stimuli were amplitude-normalized to 70 dB.

Table 4. Descriptions of Four Cross- and Same-Spliced Stimulus Types
CṾC̣ onset from an original plain production; rime from an original emphatic production (dæ̣ḳ, dụḳ, dịḳ)
C̣VC onset from an original emphatic production; rime from an original plain production (ḍæk, ḍuk, ḍik)
C̣ṾC̣ onset and rime segments from two separate emphatic productions (ḍæ̣ḳ, ḍụḳ, ḍịḳ)
CVC onset and rime segments from two separate plain productions (dæk, duk, dik)
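The splicing step can be pictured as cutting each waveform at the zero-crossing nearest the F1 onset and then recombining onset and rime portions from different productions. The sketch below is a hypothetical illustration in plain Python operating on lists of samples. The function names and the toy waveforms are assumptions introduced here, and it does not reproduce the software or exact procedure the authors used.

```python
def nearest_zero_crossing(samples, index):
    """Return the position closest to `index` where the signal changes sign.

    Position i counts as a crossing when samples[i] is zero or when
    samples[i - 1] and samples[i] lie on opposite sides of zero.
    """
    crossings = [
        i for i in range(1, len(samples))
        if samples[i] == 0 or (samples[i - 1] < 0) != (samples[i] < 0)
    ]
    if not crossings:
        return index  # no crossing found; fall back to the requested cut point
    return min(crossings, key=lambda i: abs(i - index))


def split_onset_rime(samples, f1_onset_index):
    """Split a token into (onset, rime) at the zero-crossing nearest F1 onset."""
    cut = nearest_zero_crossing(samples, f1_onset_index)
    return samples[:cut], samples[cut:]


def splice(onset_token, onset_f1_index, rime_token, rime_f1_index):
    """Join the onset of one production to the rime of another."""
    onset, _ = split_onset_rime(onset_token, onset_f1_index)
    _, rime = split_onset_rime(rime_token, rime_f1_index)
    return onset + rime


# Toy "waveforms" standing in for a plain and an emphatic production.
plain_token = [1, 2, -1, -2, 3, 4, 5]
emphatic_token = [-1, -3, 2, 6, 7]

# Plain onset + emphatic rime corresponds to the cross-spliced CṾC̣ type.
cross_spliced = splice(plain_token, 4, emphatic_token, 2)
print(cross_spliced)
```

Cutting at a zero-crossing avoids an abrupt jump in amplitude at the join, which would otherwise be audible as a click.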

DMDX experiment presentation software (Forster & Forster, 2003) was used to present the stimuli in triads (AXB) with a one-second pause between each. The same-spliced items (C̣ṾC̣ and CVC) served as A and B in the AXB task, while all four splice types (CṾC̣, C̣VC, C̣ṾC̣, and CVC) served as X. The items were counterbalanced so that the C̣ṾC̣ and CVC splice types for each vowel served an equal number of times as A and B. Each of the 36 items described above served twice as X. All were arranged within the AXB task such that no identical speech portions, i.e., those speech portions originating in the same production, were ever combined within an AXB trial. After completing a practice task using English words, participants listened to the target triads and indicated whether A or B sounded most similar to X.

Results

Cross-Language Vowel Identification Task

For each participant individually, a vowel identification matrix was created, indicating the percentage of identifications of each English vowel by Arabic vowel quality and plain vs. emphatic context. Next, overlap scores were computed; that is, the proportion of the time that the Arabic vowels in the plain and emphatic contexts were identified as the same English vowel. To illustrate: Participant 1 identified Arabic /æ/ as English /ɛ/ 100% of the time in the plain context but as /ɑ/ 100% of the time in the emphatic context, resulting in an overlap score of 0% for Arabic /æ/. On the other hand, in the plain context this subject identified Arabic /i/ as English /ɪ/ 100% of the time, while in the emphatic context the subject identified Arabic /i/ as English /i/ 33% and English /ɪ/ 67% of the time, resulting in a 67% overlap score for /i/. That is, 67% of the time emphatic Arabic /i/ was identified as the same English vowel as plain Arabic /i/. Finally, this subject identified Arabic /u/ as 33% /ʊ/ and 67% /ʌ/ in the plain context, and 67% /ʊ/ and 33% /ɑ/ in the emphatic context, for an overlap score of 33% for Arabic /u/. Table 5 shows the mean overlap scores for all 40 participants combined.
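The three worked values for Participant 1 are consistent with computing overlap as the sum, across English vowel labels, of the smaller of the two response percentages. The text does not state the formula explicitly, so the Python sketch below rests on that assumption; the function name and dictionary representation are likewise illustrative.

```python
def overlap_score(plain, emphatic):
    """Percent overlap between two response distributions.

    `plain` and `emphatic` map English vowel labels to the percentage of
    trials on which that label was chosen in each context. The overlap is
    taken to be the sum, over labels, of the smaller of the two percentages
    (an assumed formula that matches the worked examples in the text).
    """
    labels = set(plain) | set(emphatic)
    return sum(min(plain.get(v, 0), emphatic.get(v, 0)) for v in labels)


# Participant 1's responses as described in the text.
print(overlap_score({"ɛ": 100}, {"ɑ": 100}))                  # Arabic /æ/
print(overlap_score({"ɪ": 100}, {"i": 33, "ɪ": 67}))          # Arabic /i/
print(overlap_score({"ʊ": 33, "ʌ": 67}, {"ʊ": 67, "ɑ": 33}))  # Arabic /u/
```

The three calls return 0, 67, and 33, matching the overlap scores reported for Participant 1.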

Table 5. Cross-Language Vowel Identification Task Results
/æ/ /i/ /u/
Mean 3.725 71.300 69.250
Std. Deviation 11.560 33.358 28.435
  • Note: mean percent vowel identification overlap by vowel

For the purpose of the statistical analysis, percent overlap scores were converted to rationalized arcsine units (RAUs; Studebaker, 1985). The RAU-transformed percent overlap scores were submitted to a one-way ANOVA with vowel as a within-subjects variable (three levels: æ, u, i), revealing a significant main effect of vowel (F(2,78) = 98.137, p < 0.0005, ηp2 = 0.716). Follow-up analyses indicated that the overlap scores for both /i/ words (F(1,39) = 127.316, p < 0.0005, ηp2 = 0.766) and /u/ words (F(1,39) = 172.868, p < 0.0005, ηp2 = 0.816) were significantly higher than for /æ/ words, while there was no significant difference in percent overlap scores between /i/ and /u/ words (F<1).
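For readers unfamiliar with the transform, the rationalized arcsine unit of Studebaker (1985) can be sketched in a few lines of Python. The formula takes a count of target responses out of a number of trials. The text does not say how the percent overlap scores were mapped onto counts, so the trial counts used below are hypothetical.

```python
import math


def rau(count, total):
    """Rationalized arcsine units for `count` responses out of `total` trials.

    theta = asin(sqrt(X / (N + 1))) + asin(sqrt((X + 1) / (N + 1)))
    RAU   = (146 / pi) * theta - 23
    """
    theta = (math.asin(math.sqrt(count / (total + 1)))
             + math.asin(math.sqrt((count + 1) / (total + 1))))
    return (146.0 / math.pi) * theta - 23.0


# Hypothetical counts out of 100 trials, for illustration only.
for x in (0, 50, 100):
    print(x, round(rau(x, 100), 2))
```

The transform leaves mid-range scores close to their percentage values (50 of 100 maps to 50 RAU) while stretching values near 0% and 100%, which is why it is often applied before an ANOVA on proportion data.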

AXB Discrimination Task

The AXB task results for native English speakers are shown in Figure 2, with percent emphatic responses by splice type and vowel. When the rime portion originated from an emphatic production (two left sets of bars), participants recorded higher rates of emphatic responses (C̣ṾC̣: 61%; CṾC̣: 54%) than when the rime originated from a plain production (two right sets of bars; C̣VC: 7%; CVC: 7%). When the rime originated from an emphatic production, the rate of emphatic responses was highest for /æ/ (79%), followed by /u/ (60%), then /i/ (34%). When the rime originated from a plain production, the rate of emphatic responses was lowest for /æ/ (2%), followed by /i/ (9%) and /u/ (10%).

Figure 2. AXB Discrimination Task Results

Notes: Mean percent emphatic responses by splice type and vowel context; whiskers indicate the 95% confidence interval; 40 native speakers of English.

The percent emphatic scores were converted to RAUs. An ANOVA was conducted with splice type (four levels: C̣ṾC̣, CṾC̣, C̣VC, and CVC) and vowel (three levels: æ, u, i) as within-subjects variables. The main effects of splice type (F(3,117) = 202.263, p < 0.0005, ηp2 = 0.838) and vowel (F(2,78) = 54.352, p < 0.0005, ηp2 = 0.582) were significant, as was the interaction of the two (F(6,234) = 66.027, p < 0.0005, ηp2 = 0.629). Following up on the significant interaction, analyses of the effect of vowel were performed for each splice type separately, revealing that the effect of vowel was significant for all four splice types (p < 0.002 for all). Pairwise comparisons of vowels for each splice type separately revealed that in the C̣ṾC̣ and CṾC̣ conditions—those in which the rime originated from an emphatic production—the percent emphatic responses was significantly higher for /æ/ than for both /i/ (C̣ṾC̣: F(1,39) = 13.489, p = 0.001, ηp2 = 0.257; CṾC̣: F(1,39) = 76.172, p < 0.0005, ηp2 = 0.661) and /u/ (C̣ṾC̣: F(1,39) = 328.254, p < 0.0005, ηp2 = 0.894; CṾC̣: F(1,39) = 19.692, p < 0.0005, ηp2 = 0.336). When the rime originated from a plain production, however, percent emphatic responses was significantly lower for /æ/ than for /i/ (C̣VC: F(1,39) = 14.787, p < 0.0005, ηp2 = 0.275; CVC: F(1,39) = 19.880, p < 0.0005, ηp2 = 0.338) or /u/ (C̣VC: F(1,39) = 14.924, p < 0.0005, ηp2 = 0.277; CVC: F(1,39) = 11.586, p = 0.002, ηp2 = 0.229). With respect to the comparison between /u/ and /i/, the percent emphatic responses was significantly higher for /u/ than for /i/ for the C̣ṾC̣ (F(1,39) = 213.403, p < 0.0005, ηp2 = 0.845) and CṾC̣ (F(1,39) = 18.802, p < 0.0005, ηp2 = 0.325) splice types, but there was no significant difference between /u/ and /i/ for the C̣VC or CVC splice types (F<1 for both).
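The partial eta squared values reported alongside each F statistic can be recovered from the F value and its degrees of freedom, using the standard identity ηp2 = (F × df_effect) / (F × df_effect + df_error). The short Python sketch below applies it to the three omnibus tests above; the function name is an illustrative choice, and the check is offered as a reader aid rather than as part of the original analysis.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics and degrees of freedom as reported for the AXB ANOVA.
reported = {
    "splice type": (202.263, 3, 117),
    "vowel": (54.352, 2, 78),
    "splice type x vowel": (66.027, 6, 234),
}
for effect, (f_value, df1, df2) in reported.items():
    print(effect, round(partial_eta_squared(f_value, df1, df2), 3))
```

The computed values (0.838, 0.582, and 0.629) agree with the effect sizes reported in the text.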

To determine whether this pattern of results was influenced by the native language background of the English speakers, a small follow-up study was conducted involving four native Arabic speakers who completed the same AXB task. The native Arabic speakers (three Jordanians from Amman: two males and one female, and one male Qatari; all ages 19–29) were also recruited from the University of Utah community and did not report any speech or hearing impairments. The native Arabic speakers exhibited the same response pattern as the native English participants with respect to their reliance on the rime over the onset portion of the syllable in discriminating the stimuli. However, it is important to note that their pattern of sensitivity to the emphatic-plain contrast by vowel differed from that of the English speakers (see Figure 3). For these native Arabic speakers, the proportion of emphatic responses for C̣ṾC̣ and CṾC̣ splice types was greatest for /u/, followed by /æ/ and /i/, while the proportion of emphatic responses for C̣VC and CVC splice types was lower for /æ/ than for /u/ or /i/. A comparison between the two groups of speakers thus indicated that while native English speakers registered the most emphatic responses to stimuli with emphatic rimes in the context of /æ/, native Arabic speakers registered the most emphatic responses to such stimuli in the context of /u/. This differential pattern of perceptual sensitivity by vowel between the two native language groups indicates that these patterns were not due solely to the acoustic properties of the contrast as manifested in the various vowel contexts, but rather that native Arabic and native English participants were influenced by their language-specific perceptual systems.

Figure 3. AXB Discrimination Task Results

Notes: Mean percent emphatic responses by splice type and vowel context; four native speakers of Arabic.

Discussion

The first research question was whether native English speakers, like native Arabic speakers (Jongman et al., 2011), rely more heavily on rimes than onsets when detecting emphasis in onset position. This question was addressed by examining native English speakers’ AXB discrimination performance using cross-spliced stimuli similar to those used by Jongman et al. (2011). Monosyllabic Arabic nonwords with /d/ or /ḍ/ in onset position; /æ/, /u/, or /i/ as the nucleus; and /k/ in coda position were produced by a native Arabic speaker. These syllables were then cross-spliced so that emphatic onset consonants were spliced onto rimes taken from originally plain productions (C̣VC), and plain onset consonants were spliced onto rimes from originally emphatic productions (CṾC̣). Native English listeners recorded significantly more emphatic responses for those words with an emphatic rime (C̣ṾC̣ and CṾC̣) than for those words with a plain rime (CVC and C̣VC). That is, words with an emphatic consonant and a plain rime (C̣VC) received significantly fewer emphatic responses than words with a plain consonant and an emphatic rime (CṾC̣). These results suggest that for both native English speakers and native Arabic speakers (Jongman et al., 2011), the rime contributes more to the perception of onset emphasis than does the onset itself. These findings align with literature documenting listener reliance on formant transitions in the perception of stop consonant place: Here it was found that formant transitions also provide cues to the secondary articulation emphasis in Arabic.
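The cross-splicing design summarized above can be illustrated schematically. The sketch below is not the procedure used to prepare the actual stimuli: it assumes that each recording has already been segmented into an onset and a rime, represents each segment as a short list of sample values, and uses hypothetical names and toy numbers throughout.

```python
from typing import Dict, List, Tuple

Samples = List[float]  # stand-in for a segment of an audio waveform


def cross_splice(onset: Samples, rime: Samples) -> Samples:
    """Join an onset segment to a rime segment from any production."""
    return onset + rime


def build_splice_types(
    plain: Tuple[Samples, Samples],
    emphatic: Tuple[Samples, Samples],
) -> Dict[Tuple[str, str], Samples]:
    """Return the four stimulus types keyed by (onset source, rime source)."""
    sources = {"plain": plain, "emphatic": emphatic}
    return {
        (onset_src, rime_src): cross_splice(
            sources[onset_src][0], sources[rime_src][1]
        )
        for onset_src in sources
        for rime_src in sources
    }


# Hypothetical toy segments: (onset samples, rime samples) per production.
plain_token = ([0.1, 0.2], [0.3, 0.4, 0.5])
emphatic_token = ([0.9, 0.8], [0.7, 0.6, 0.5])

stimuli = build_splice_types(plain_token, emphatic_token)
# ("emphatic", "plain") corresponds to the C̣VC splice type,
# ("plain", "emphatic") to the CṾC̣ splice type.
```

Keying each stimulus by the source of its onset and the source of its rime keeps the two factors separable, so that emphatic responses can be compared by rime source independently of onset source.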

Given this reliance on the rime, the next research question concerned whether English speakers exhibit greater sensitivity to Arabic emphatic-plain consonant contrasts when they perceive rimes as containing different English vowel phonemes. The two parts of this question were addressed through the two perception tasks. The AXB discrimination task revealed vowel environments in which English speakers exhibit a greater sensitivity to emphasis, while the cross-language vowel identification task revealed which English vowel phonemes listeners perceive in these environments. English speakers’ performance on the AXB discrimination task revealed a significant effect of vowel for each of the four splice types. The proportion of emphatic responses for those stimulus types with rimes originating in an emphatic production (C̣ṾC̣ and CṾC̣) was significantly higher for stimuli containing /æ/ than for those containing /u/ or /i/. For the stimulus types in which the rime originated in a plain production (C̣VC and CVC), the percentage of emphatic responses was significantly lower for /æ/ than for /u/ or /i/. In addition, the percentage of emphatic responses for /u/ was significantly higher than for /i/ for the C̣ṾC̣ and CṾC̣ stimulus types, but there was no significant difference between /u/ and /i/ stimuli for the C̣VC or CVC stimulus types. Thus, overall, native English speakers were most sensitive to the Arabic emphatic-plain onset contrast in the context of /æ/, followed by /u/ and /i/.

A small-scale follow-up study involving four native speakers of Arabic confirmed that the moderating effect of following vowel phoneme on consonant perception by native English speakers cannot be attributed to the acoustic properties of the contrast alone, as the pattern of following vowel effects was different for native Arabic participants. As noted above, both Arabic and English speakers were more likely to select an emphatic response when the rime portion of the stimuli originated from an emphatic production than when the rime originated from a plain production. Likewise, when the rime originated from an emphatic production, both English and Arabic speakers recorded higher rates of emphatic responses for /u/ than for /i/. However, English speakers recorded higher rates of emphatic responses for /æ/ than for /u/, while Arabic speakers recorded lower rates of emphatic responses for /æ/ than for /u/. When the rime originated from a plain production, both English and Arabic speakers recorded higher rates of emphatic responses for /u/ and /i/ than for /æ/, but neither Arabic nor English speakers recorded a substantial difference in rates of emphatic responses between /u/ and /i/.

In the vowel identification task, native English speakers were asked to identify which English vowel sound corresponded to each Arabic vowel they heard. The auditory stimuli were the same unmodified Arabic nonwords that were used to develop the spliced stimuli for the AXB discrimination task. To the extent that cross-language vowel mappings influence the AXB performance described above, it was expected that Arabic vowels in the plain and emphatic contexts would be identified as the same English vowels less often for /æ/ than for /i/ or /u/. Indeed, the percentage of identification overlap in plain and emphatic contexts for /æ/ was only 3.7%, while the percentage of identification overlap for /i/ and /u/ was significantly higher at 71.3% and 69.2%, respectively. Thus, the greater sensitivity to the emphatic-plain contrast in the context of /æ/ than /u/ or /i/ may be attributed to the perception of different English vowel phonemes in these environments. Although the native English speakers and the native Arabic speakers differed with respect to which vowel context resulted in the greatest AXB discrimination (/æ/ for native English speakers; /u/ for native Arabic speakers), both groups displayed somewhat more accurate discrimination in the context of /u/ than of /i/. The cross-language vowel identification results were sufficiently similar for /i/ and /u/ (71.3% and 69.2%, respectively) that they do not offer an explanation for this finding in the native English speakers. Given this particular similarity between the native English and native Arabic participants, it may be the case that inherent acoustic properties of the /u/ and /i/ stimuli are driving the common perceptual pattern, although this possibility requires further investigation.

Second Language Arabic Teaching

As noted earlier, the teaching of Arabic as a second or other language has not been guided by extensive research that informs a particular pedagogy for native English speakers, although there has been a notable increase in research on Arabic SLA in recent years (e.g., Al Mahmoud, 2013; Brosh, 2015; Hansen, 2010; Lababidi & Park, 2014; Raish, 2015; Shiri, 2015; Showalter & Hayes-Harb, 2015; Zaba, 2007). The present investigation contributes to this growing research base by examining native English speakers’ perception of the notoriously difficult emphatic consonants.

It has previously been observed that native English speakers exhibit difficulty acquiring Arabic emphatic-plain consonant contrasts (e.g., Al Mahmoud, 2013; Odisho, 1981). However, as noted by Odisho (2005), “without a reasonable mastery of such consonant contrasts, the learner will not be able to distinguish the meaning of thousands of Arabic words” (p. 55), including, for example, /ḍællæ/ [to wander] and /dællæ/ [to demonstrate]. For this reason, Arabic language instructors and their students will benefit from understanding the nature of the difficulty posed by Arabic emphatics, in addition to variables such as following vowel quality that moderate this difficulty.

The present study's findings reinforce recommendations found in existing instructional materials that direct learners’ attention to adjacent vowels in perceiving and producing Arabic emphatics (e.g., Brustad et al., 1995; Odisho, 1981, 2005). Particularly in the context of the Arabic vowel /æ/, such attention may have positive consequences for learners’ acquisition of Arabic emphasis and allow them to exploit their sensitivity to the English /ɑ/-/æ/ contrast so as to more easily distinguish emphatic-plain onset consonant contrasts. Indeed, Odisho (1981) noted that “the existence of the two vowel qualities /æ/ and /ɑ/ in English is of great help” in teaching the emphatics to native English speakers (p. 278), and Odisho (2005) presented explicit instructions for teaching the perception and production of emphatics with a particular focus on those preceding /æ/. However, and crucially, the study reported here also reveals that a strategy where learners rely on native English vowel contrasts to perceive Arabic emphatic-plain consonant contrasts should be less effective when the following vowel is /i/ or /u/, where the vowel allophones in emphatic and plain contexts do not map differentially to the English vowel inventory. To the extent that learners are instructed to use their knowledge of English vowel categories when perceiving emphasis, they should also be informed of the role of the vowel phoneme itself in moderating the robustness of the vowel-related cues to emphasis.

How, then, might Arabic language instructors facilitate the acquisition of emphatic-plain consonant contrasts in the context of /i/ and /u/? The results of the present study may provide useful insights. Note that the present findings indicate that, like native Arabic speakers (Jongman et al., 2011), native English speakers indeed do rely more on rimes than onsets when discriminating emphatic-plain onsets, regardless of vowel quality. It is thus apparent that all three Arabic vowel phonemes contain robust cues to onset emphasis. What makes the case of /i/ and /u/ so difficult relative to /æ/ is that learners must attend to subphonemic vowel differences (i.e., variants of English /i/ or /u/) rather than a native phonemic contrast (i.e., English /æ/ vs. /ɑ/). A vast second language phonology literature has indicated that learners can indeed develop sensitivity to second language phonological contrasts that are subphonemic in the native language (see, e.g., Colantoni, Steele, & Escudero, 2015, for discussion). It is recommended that instructors support students’ acquisition of emphatic-plain contrasts followed by /i/ and /u/ as they would any other type of novel phonological contrast, focusing their learning on the perception of the following vowel quality contrast, which will be new to them, as a proxy for the consonantal contrast.

It is important to recall that the present study has focused exclusively on Jordanian Arabic and that the effect of emphasis on vowels varies by Arabic dialect. Thus, the cues that are available to learners will depend on the particular dialect to which learners are exposed. Individual instructors are urged to consider the effect of emphasis on vowels in the input they provide to students and to expose their students to the ways in which emphasis is manifested across the dialectal spectrum of Arabic. Finally, it is worth noting, especially for K–12 instructors, that emphatic consonants emerge quite late even in native Arabic–speaking children. While stop consonants are acquired “early” (by age 2:0 to 3:10), the emphatic stop consonants are not normally acquired until at least age 6:4 by native speakers of Arabic (Amayreh & Dyson, 1998, p. 647). Instructors should similarly expect these consonants to be late-acquired by native English-speaking learners, who must contend not only with consonants that are difficult even for native Arabic speakers but also with the novelty of emphatic consonants relative to their native English phonological system.

Conclusion

This study has provided evidence that native English speakers use following vowel quality as a cue to Arabic emphatic-plain consonant contrasts. When presented with stimuli modified to isolate acoustic cues to emphasis, native English speakers were better able to determine onset emphasis status based on information contained in rimes than on information contained in the onsets themselves. In addition, they were able to more accurately determine onset emphasis when the stimuli contained Arabic /æ/ than /u/ or /i/. These discrimination patterns may be understood as resulting from the mapping of Arabic vowel allophones to English vowel phonemes: Native English speakers were shown to perceive Arabic /æ/ in plain vs. emphatic contexts as predominantly distinct English vowels, while there was a significantly greater overlap in the vowel identification in the two contexts for /u/ and /i/. These findings provide empirical support for a pedagogical approach for native English-speaking learners of Arabic that emphasizes the importance of attending to vowel sounds in order to detect emphatic-plain contrasts, as some textbooks have done (e.g., Brustad et al., 1995), but with the new understanding that this strategy may be more successful in the context of Arabic /æ/ than in the contexts of /u/ and /i/.

Further investigation is needed to determine how the perception of Arabic emphasis changes over time in learners, and how explicit instruction concerning the role that adjacent vowels may play in perceiving emphasis influences learners’ performance. It also remains to be seen whether such findings are generalizable beyond these segments and this dialect.

Acknowledgments

We gratefully acknowledge the contributions of Aleksandra Zaba to this project. We also thank Abdulaziz Alzoubi for his help with stimulus development, and the members of the Speech Acquisition Lab at the University of Utah for their help with data collection. Finally, the manuscript has benefited greatly from the comments and suggestions of the editor and two anonymous reviewers.
