
1. ABSTRACT
This study examines the acoustic properties of the nasal sounds [m] and [n] from a cross-linguistic perspective. Two experiments were conducted. The first experiment briefly examines the parameters (formant frequency, bandwidth, and duration) with which to define the acoustic properties of [m] and [n] as found within both English and Japanese through analysis of American and Japanese production of the target sounds. The second experiment briefly examines the acoustic properties (formant frequency and duration) of English and Japanese nasal sounds as spoken by native Japanese speakers. A main purpose of the study, through examining the acoustic characteristics of the target sounds as found in both English and Japanese, is to help gain a better understanding of why Japanese speakers have difficulty in pronouncing English nasals, particularly in word-final positions, which is believed to be partly a result of L1 transfer.

2. INTRODUCTION
Nasals are found in most languages of the world. The most common nasal sounds are [m] and [n]. Nasals are similar to oral stop consonants in that both are produced with an obstruction somewhere within the oral cavity. Nasal sounds differ, however, in that they are produced with the entire vocal tract, including both the nasal cavity and nasopharynx. Also, airflow through the nasal cavity is not interrupted, whereas oral stop consonants are characterized by an obstructed nasal passage.

Nasal consonants are produced by a coupling of the nasal and oropharyngeal resonators. The closed oral cavity and the complex sinus structure of the nose are joined together to form cavities to the main passage (pharynx and nasal tract). By lowering the velum, a passage is opened from the pharynx to the nasal cavity, allowing air to escape through the nasal cavity. The nasal resonator is smaller yet more complex than the oropharyngeal system. In cavity coupling, the soft palate is relaxed to enable both the nasal and oropharyngeal resonators to work together. This connection between the oropharyngeal and nasal resonators is controlled by the palatopharyngeal or velopharyngeal mechanism, a muscle complex that contracts to reduce or close the opening located behind the soft palate (or velum). If the palatopharyngeal musculature is relaxed, the nasal port opens, permitting the nasal and oropharyngeal resonators to work together (Tiffany and Carrell, 1977). The degree of coupling through the velopharyngeal port changes, however, according to the different sounds formed in the supralaryngeal vocal tract.

Fujimura (1962) claims the primary frequency range of interest for nasal consonants is between 200 and 2500 Hz. Nasals are characterized by a stable concentration of energy in the lower frequency regions with a first formant near 300 Hz. Due to the presence of an antiformant there is little energy in the areas around 600 Hz. Nasal sounds in general are highly damped and their presence weakens the upper formants of neighboring vowel sounds. This is caused by the broader band frequency response in the vocal tract, since broadly-tuned resonators fall away more rapidly than narrowly-tuned ones. During nasal production, the nasal and oral cavities resonate together resulting in a loss of amplitude (or antiresonance) at certain frequencies. These two cavities affect each other and, at times, even cancel each other out if both resonate at similar frequencies. A lessening of broad-band resonances and an absorption of acoustic energy in the oral cavity and nasal walls will also result in antiresonances. The high density of formants in the frequency range with the existence of antiformants causes the sound energy of nasals to be spread evenly throughout the central frequency range (800-2300 Hz). Although the shape of the antiformant will vary depending on the place of articulation, the overall spectral shape of the nasal consonants remains basically the same (see Fujimura, 1962).
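As a rough illustration of why the antiformant depends on place of articulation, the closed oral cavity can be idealized as a uniform tube of length L_oral that is closed at the point of constriction and opens into the pharynx. This is a textbook simplification and not a model used in this study. The symbols c (speed of sound), L_oral, and f_z are introduced here for illustration only, and because the real cavity is neither uniform nor hard-walled, measured antiformant values will differ from this estimate.

```latex
f_{z,k} = \frac{(2k-1)\,c}{4\,L_{\mathrm{oral}}}, \qquad k = 1, 2, \ldots
```

Under this idealization a longer side branch gives a lower antiformant, so a bilabial closure at the lips would be expected to produce a lower antiformant than an alveolar closure.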

A significant amount of research on nasals has focused on the acoustic cues that are required for nasal perception. Nakata (1959) found that the nasal murmur makes a significant contribution to the perception of place of articulation. Malecot (1956) and Mermelstein (1977), however, found that place of articulation of [m], [n], and [ŋ] is largely perceived through the transitions of the adjoining vowel formants. The current consensus among researchers is that the acoustic cues necessary for identifying nasal place of articulation are found in both the nasal murmur and the formant transitions (see Kurowski and Blumstein, 1984; Repp and Svastikula, 1988; Ohde, 1994; Harrington, 1994).

3. PHONETIC PROPERTIES
English contains the three nasal sounds [m], [n], and [ŋ]. Bilabial [m] is produced with an open velopharyngeal port and a labial closure. [n] is a voiced alveolar nasal. It is articulated with a tongue position similar to that of the alveolar stop [d]. Unlike [m] and [n], the English velar [ŋ] never appears at the beginning of a word. Its articulation varies according to the quality of its neighboring vowels. For example, [ŋ] is more forward in [siŋ] than in [soŋ]. [ŋ] is an allophone of [n] when it occurs before [g] and [k], e.g., [siŋ] becomes [siŋg].

Figure 2. Vocal tract configuration of English [m]. From Ladefoged (1993), A course in phonetics (3rd ed.).

Japanese contains two contrasting nasals, [m] and [n], before the vowels /a, i, u, e, o/. Japanese also contains the velar [ŋ] in non-initial positions, although linguists are not certain whether [ŋ] is a separate phoneme or just an allophone of [g]. Japanese [m] and [n] are phonetically similar to their English counterparts. Initial [n], in particular, is articulated with a tongue position similar to that of the alveolar stops [t] and [d] before the vowel sounds /e, a, o, u/, but is more palatal before /i, y/ (Vance, 1987).

Word-final [n] in Japanese differs phonetically from initial [n] since it is articulated in varying positions depending on the context. Final [n] is also distinguishable by its unreleased quality. Vance (1987) refers to final [n] in Japanese as the "mora nasal" and transcribes it as an unreleased uvular nasal [N]. Sakuma (1929) calls final [n] an unreleased velar nasal when it occurs in words like onsen (see Figure 2). Jones (1967) claims Japanese final [n] is articulated somewhere between a typical alveolar nasal and a nasalized fricative sound. Final [n] in [onsen] possesses a different acoustic quality depending on the neighboring consonant. For example, final [n] in onsen ka 'onsen' is more velar than in onsen ni 'in onsen' (see Vance, 1987).

4. EXPERIMENT 1
Speech samples were collected from four Japanese university students and four American speakers of English (a total of four men and four women) who produced the nasal sounds [m] and [n] in English and Japanese word-onset and coda positions within a varied vowel environment. Critical-band analysis was performed on a 30 ms segment centered within each of the target sounds, and the resulting spectral characteristics of the Japanese and American production of the nasal sounds are compared. Recordings were made with a Kay Computerized Sound Laboratory (CSL) model 4300B. A Shure SM 48 microphone was positioned 4 centimeters from each speaker's mouth. Both frequency and energy spectra were calculated from an LPC frequency response and low-pass filtered using a Blackman window. A sampling rate of 20 kHz was used with a frequency range between 0 Hz and 5,000 Hz.
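The LPC-based measurement described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the procedure implemented by the Kay CSL: the function names, the LPC order of 12, and the use of root-solving to read off formant frequencies and bandwidths are all choices made here for illustration. Only the 30 ms centered segment, the Blackman window, and the 20 kHz sampling rate come from the text.

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns the polynomial [1, a1, ..., a_order] of the inverse filter A(z).
    """
    n = len(frame)
    r = np.correlate(frame, frame, mode="full")[n - 1 : n + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0:  # degenerate frame (e.g. silence); stop early
            break
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        for j in range(1, i):
            a[j] = a_prev[j] + k * a_prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def formants_from_lpc(a, fs):
    """Convert LPC poles to (frequency, bandwidth) pairs in Hz, sorted by frequency."""
    roots = np.roots(a)
    roots = roots[np.imag(roots) > 0]  # keep one pole of each conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bandwidths[idx]

def analyze_segment(signal, fs, center_s, dur_s=0.030, order=12):
    """Blackman-window a segment centered at center_s seconds and run LPC on it.

    order=12 is an assumed default; the paper does not state the LPC order.
    The caller must ensure the segment lies fully inside the signal.
    """
    half = int(round(dur_s * fs / 2))
    mid = int(round(center_s * fs))
    frame = np.asarray(signal[mid - half : mid + half], dtype=float)
    frame = frame * np.blackman(len(frame))
    return formants_from_lpc(lpc_coefficients(frame, order), fs)
```

For a recording `x` sampled at 20 kHz with a nasal centered at, say, 0.5 s, `analyze_segment(x, 20000, 0.5)` returns candidate pole frequencies and bandwidths. Which pole corresponds to F1 still has to be decided by inspection, since some LPC poles model spectral tilt rather than resonances.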

5. RESULTS

First Formant (F1) Frequency

Spectral differences between nasal sounds are due to modifications made within the oral resonator. The first formant (F1) is lower for [m] than it is for [n] since the vocal tract is longer for a bilabial than for an alveolar sound. Formant frequencies and their corresponding bandwidths are listed in Table 1. The F1 of [m] and [n] in both English and Japanese averaged between 250 and 300 Hz. Acoustic energy is concentrated in a lower frequency region, as evidenced by the narrower bandwidths. Notice the lower formant emphasis of [n] in both onset and coda positions in the spectrogram of the word "none" (Figure 1).

Figure 1. A spectrogram of the word "none" as spoken by a native speaker of English.

From the data, the average F1 of [m] was lower than the F1 of [n] within both languages. The F1 of initial [m] was 265 Hz in Japanese and 232 Hz in English, a difference in frequency slightly greater than that between the first formants of initial [n] in the two languages.

The F1 of word-final [n] varied significantly in both languages. The F1 of English [n] was lower in final positions (263 Hz) than in initial positions (274 Hz). The opposite result occurred in Japanese, where the F1 of [n] was 300 Hz in final positions compared with 285 Hz in initial positions (see Table 1).

Table 1. Parameter values for the nasal consonants. The unit for the formant frequency (F) and its bandwidth (B) is Hz.

Duration

The durations of the target sounds were measured from the English and Japanese word tokens. Results show that the average duration of [n] is slightly longer than that of [m] within both languages. As expected, nasals are longer in English than in Japanese, particularly in word-final positions, where the average duration was two times greater (see Figure 3).

It is not surprising that the target sounds are longer in English than in Japanese. Japanese is a syllable-timed language. Each syllable (or mora) in Japanese typically consists of a single consonant followed by a vowel and is pronounced with basically the same duration. English is a stress-timed language, in which the duration of syllables differs depending on the context.

Figure 3. Comparison of mean durations (in ms) of English (E) and Japanese (J) nasals in onset and coda positions as spoken by the Japanese and American subjects.

6. DISCUSSION
The spectral characteristics of Japanese and English word-final [n], in particular, are worth examining more closely. The F1 frequency of [n] is naturally higher than that of its bilabial counterpart since the vocal tract configuration of final [n] is shorter than that of initial [m]. It is interesting, however, that the F1 of Japanese [n] is much higher in coda positions than in onset positions. This can be attributed to the varying articulation of final [n] in Japanese, since it is articulated as an alveolar, velar, or uvular nasal depending on the context. Further research is necessary to examine the phonetic peculiarity of word-final [n] in Japanese.

The next experiment examines and compares the acoustic properties (F1 frequency and duration) of the nasals [m] and [n] in English and Japanese as produced exclusively by native speakers of Japanese.

7. EXPERIMENT 2
The subjects were one hundred Japanese freshman students taking an introductory course in English pronunciation at the University of Aizu. Each student recorded both English and Japanese word tokens containing the target sounds [m] and [n] in word-initial and word-coda positions within a varied vowel environment. Critical-band analysis was performed on a short segment of the target sounds, and the resulting F1 frequency and duration properties of the Japanese subjects' production of the English and Japanese nasal sounds are compared. First formant (F1) frequencies of the target sounds were calculated from an FFT frequency response with preemphasis and a low-pass filter. A sampling frequency of 19 kHz was used with a frequency range between 0 Hz and 5,000 Hz.
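An FFT-based estimate with preemphasis, of the kind mentioned above, might look roughly like the following sketch. The preemphasis coefficient (0.97), the Hamming window, the FFT length, the 150 to 1000 Hz search band, and the simple peak-picking rule are all assumptions made for illustration; the paper does not specify them, and the function names are hypothetical.

```python
import numpy as np

def preemphasize(x, coeff=0.97):
    """First-order preemphasis: y[n] = x[n] - coeff * x[n-1] (coeff is an assumption)."""
    x = np.asarray(x, dtype=float)
    return np.append(x[0], x[1:] - coeff * x[:-1])

def low_band_peak(frame, fs, fmin=150.0, fmax=1000.0, nfft=4096):
    """Return the frequency (Hz) of the largest FFT magnitude between fmin and fmax.

    This is a crude stand-in for an F1 estimate: on voiced speech the peak
    may land on a harmonic of the fundamental rather than on the formant itself.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))
    spectrum = np.abs(np.fft.rfft(windowed, n=nfft))
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    band = (freqs >= fmin) & (freqs <= fmax)
    return freqs[band][np.argmax(spectrum[band])]

def estimate_f1(frame, fs):
    """Preemphasize a short segment, then pick the strongest low-band peak."""
    return low_band_peak(preemphasize(frame), fs)
```

With a 19 kHz sampling rate and a 4096-point FFT the bin spacing is about 4.6 Hz, so differences of a few hertz between conditions are below what this sketch can resolve.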

8. RESULTS
First Formant (F1) Frequency

F1 frequencies of [m] and [n] in initial and final positions were measured from the Japanese subjects' recorded production of the English and Japanese word tokens. Spectral differences between nasal sounds are due to modifications made within the oral resonator, and adding the nasal cavity to the vocal tract increases the size of the resonator, which greatly affects the frequencies of the sounds. The first formant is typically lower for [m] than for [n] since the vocal tract is longer during production of [m].

Results indicate that the average F1 frequencies of initial [m] in English and Japanese were comparable, at 313 Hz and 316 Hz, respectively. The average F1 of final [m] in English is slightly lower at 307 Hz.

Significant differences are observed in the F1 frequencies of Japanese [n] in initial and coda positions. The F1 of [n] averaged 320 Hz in initial positions and 345 Hz in word-coda positions (see Figure 5). The LPC results from the first experiment were lower (272 Hz for initial [n] and 300 Hz for final [n]); however, they also revealed a higher F1 for final [n].

The average F1 of English [n] as recorded by the Japanese subjects also varied between initial and final positions, as did Japanese [n], but in the reverse direction. The subjects produced initial [n] in English with an average F1 of 345 Hz, while producing final [n] with an average F1 of only 288 Hz.

Figure 5. A comparison of F1 frequencies of target sounds in English and Japanese words as spoken by Japanese subjects.

Duration

The durational differences between initial [m] and [n] within Japanese word tokens are minimal. Both sounds averaged 68 ms. The durational differences of the target sounds are greater in English as initial [m] is 110 ms and initial [n] is 103 ms; English [m] and [n] in final positions averaged 163 ms and 147 ms, respectively. These results differ from the results of the first experiment which revealed that English [n] is slightly longer than [m] within both positions.

The Japanese subjects produced nasals longer in English than in Japanese. Final [n], in particular, is much longer in English (147 ms) than in Japanese (99 ms), but is still significantly shorter than the final [n] as recorded by native English speakers in the first experiment (see Figure 6).
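The size of the cross-language difference can be made concrete by taking ratios of the mean durations reported above. The snippet below only restates those reported means; the dictionary layout and function name are illustrative and not part of the original analysis.

```python
# Mean durations (ms) reported above for the Japanese subjects in Experiment 2.
mean_duration_ms = {
    ("Japanese", "initial", "m"): 68,
    ("Japanese", "initial", "n"): 68,
    ("Japanese", "final", "n"): 99,
    ("English", "initial", "m"): 110,
    ("English", "initial", "n"): 103,
    ("English", "final", "m"): 163,
    ("English", "final", "n"): 147,
}

def english_to_japanese_ratio(position, nasal):
    """Ratio of the English mean duration to the Japanese mean duration."""
    english = mean_duration_ms[("English", position, nasal)]
    japanese = mean_duration_ms[("Japanese", position, nasal)]
    return english / japanese

for position, nasal in [("initial", "m"), ("initial", "n"), ("final", "n")]:
    ratio = english_to_japanese_ratio(position, nasal)
    print(f"{position} [{nasal}]: English/Japanese = {ratio:.2f}")
```

By these figures the subjects' English nasals are roughly 1.5 to 1.6 times as long as their Japanese nasals in the same position.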

Figure 6. The mean durations of nasal sounds in Japanese and English words as spoken by Japanese subjects. The unit of measurement is in milliseconds (ms).

9. DISCUSSION
The durations of English and Japanese nasals as produced by the Japanese subjects were measured. Since Japanese is a syllable-timed language, one would expect Japanese speakers to pronounce English nasals with less duration than native speakers. Although the Japanese subjects pronounced the target sounds longer in English than in Japanese, the results would indicate they have some difficulty in distinguishing the correct duration of English sounds, which the author of this paper believes to be a result of L1 transfer.

The F1 frequencies of the target sounds within both languages were also measured as recorded by the Japanese subjects. Results reveal that the F1 of [m] was similar within both languages. The F1 of final [n] differed across languages, however. Japanese [n] has a much higher F1 emphasis in final positions than in onset positions. One would expect this considering the phonetic nature of final [n] in Japanese, since it can be articulated as an alveolar, velar, or uvular sound depending on the context. The fact that Japanese word-final [n] is an unreleased sound also contributes to its unpredictable acoustic nature. Another factor that is perhaps worthy of consideration is that final [m] is not contained in the Japanese sound system. This would tend to place additional responsibility on [n] to behave more flexibly in word-final positions, i.e., to vary in place of articulation depending on the context.

The data from the Japanese subjects' production of English [n] are puzzling. It is unclear why subjects produced English [n] in word-initial positions with a much higher F1 frequency (345 Hz) than in final positions (288 Hz). This could be a result of subjects producing final [n] more forward in the vocal tract, possibly with a configuration more closely resembling bilabial [m]. This misarticulation could also be a result of the unreleased nature of final [n] in Japanese, and a further indication of L1 interference.

By comparing the spectral differences of Japanese recorded production of [m] and [n] in Japanese and English word-initial and word-coda positions, this paper posits that these production differences may at least partly explain the significant acoustic differences of these sounds as produced by Japanese speakers of English. Further research is necessary to examine the acoustic and phonetic uniqueness of word-final [n] in Japanese to see how it interferes with Japanese production and perception of English nasals. A cross-language perception study of nasal sounds, particularly in word-final positions, would also greatly assist our further understanding of the discriminational difficulties of these sounds as experienced by native speakers of Japanese.

10. REFERENCES
Fujimura, O. (1962). Analysis of nasal consonants. The Journal of the Acoustical Society of America, 34(12), 1865-1875.

Harrington, J. (1994). The contribution of the murmur and vowel to the place of articulation distinction in nasal consonants. The Journal of the Acoustical Society of America, 96(1), 19-32.

Jones, D. (1967). The phoneme: Its nature and use. Cambridge: Cambridge University Press. (Quoted in Vance, 1987.)

Kurowski, K. and Blumstein, S. (1984). Perceptual integration of the murmur and formant transitions for place of articulation in nasal consonants. The Journal of the Acoustical Society of America, 76(2), 383-390.

Ladefoged, P. (1993). A course in phonetics (3rd ed.). New York: Harcourt Brace Jovanovich College Publishers.

Malecot, A. (1956). Acoustic cues for nasal consonants: An experimental study involving a tape-splicing technique. The Journal of the Acoustical Society of America, 32, 274-284.

Mermelstein, P. (1977). On detecting nasals in continuous speech. The Journal of the Acoustical Society of America, 61(2), 581-587.

Nakata, K. (1959). Synthesis and perception of nasal consonants. The Journal of the Acoustical Society of America, 31(6), 661-666.

Ohde, R. N. (1994). The development of the perception of cues to the [m]-[n] distinction in CV syllables. The Journal of the Acoustical Society of America, 96(2), 1-12.

Repp, B. H. and Svastikula, K. (1988). Perception of the [m]-[n] distinction in VC syllables. The Journal of the Acoustical Society of America, 83(1), 237-247.

Sakuma, K. (1929). Nihon Onseigaku. Tokyo: Kazama Shobo. (Quoted in Vance, 1987.)

Singh, S. and Singh, K. (1982). Phonetics: Principles and practices (2nd ed.). Austin, Texas: Pro-ed.

Tiffany, W. and Carrell, J. (1977). Phonetics: Theory and application. New York: McGraw-Hill Publishing Company.

Vance, T. J. (1987). An introduction to Japanese phonology. New York: State University of New York Press.








Stephen G. Lambacher
Wed Aug 23 15:57:05 JST 1995