—> To Continue with Chapter 4

Formant Synthesis

Formant synthesis is a special but important case of additive synthesis. Part of what makes the timbre (there, we've used that word again!) of a voice or instrument consistent over a wide range of frequencies is the presence of fixed frequency peaks, called formants, in the sound’s spectrum.

These peaks stay in the same frequency range, independent of the actual (fundamental) pitch being produced by the voice or instrument. While there are many other factors that go into synthesizing a realistic timbre, the use of formants is one way to get reasonably accurate results.
Figure .x

A trumpet playing two different notes, a perfect fourth apart. The pitch changes, but the formants (fixed resonances) stay in the same places.
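To make the idea concrete, here is a minimal sketch of formants in additive synthesis, using only Python's standard library. The formant centers, bandwidths, and gains in `FORMANTS` are made-up illustrative values, not measurements of a real trumpet, and the bell-shaped `envelope` function is just one simple way to model a resonance peak. The point is only that when the fundamental moves up a perfect fourth, the loudest harmonic still lands near the same fixed peak.

```python
import math

# Hypothetical formant set: (center frequency in Hz, bandwidth in Hz, peak gain).
# These numbers are illustrative only, not measurements of a real instrument.
FORMANTS = [(1200.0, 300.0, 1.0), (2500.0, 400.0, 0.5)]


def envelope(freq_hz, formants=FORMANTS):
    """Spectral envelope: a sum of bell-shaped resonance peaks at fixed frequencies."""
    return sum(gain * math.exp(-0.5 * ((freq_hz - center) / bandwidth) ** 2)
               for center, bandwidth, gain in formants)


def harmonic_amplitudes(f0_hz, max_freq_hz=5000.0):
    """Return (frequency, amplitude) for each harmonic of f0 up to max_freq_hz."""
    count = int(max_freq_hz // f0_hz)
    return [(k * f0_hz, envelope(k * f0_hz)) for k in range(1, count + 1)]


def strongest_harmonic(f0_hz):
    """Frequency of the loudest harmonic for a given fundamental."""
    return max(harmonic_amplitudes(f0_hz), key=lambda pair: pair[1])[0]


def synthesize(f0_hz, duration_s=0.5, sample_rate=8000):
    """Additive synthesis: sum sine-wave harmonics weighted by the envelope."""
    # Keep every partial below the Nyquist frequency to avoid aliasing.
    partials = harmonic_amplitudes(f0_hz, max_freq_hz=sample_rate / 2 - 1)
    n = int(duration_s * sample_rate)
    return [sum(a * math.sin(2 * math.pi * f * t / sample_rate) for f, a in partials)
            for t in range(n)]


low = 233.08          # roughly the B-flat below middle C
high = low * 4 / 3    # a perfect fourth higher (just-intonation ratio)
for f0 in (low, high):
    print(f"f0 = {f0:7.2f} Hz, strongest harmonic near {strongest_harmonic(f0):7.2f} Hz")
```

The two notes use different harmonic numbers to reach the peak, but the peak itself does not move. The samples returned by `synthesize` are unnormalized floats; they would need scaling before being written to a sound file.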

The Resonant Structure

The location of formants is determined by the resonant physical structure of the sound-producing medium. For example, the body of a given violin exhibits a particular set of formants, depending on how it is constructed. Since most violins share a similar shape and internal construction, they share a similar set of formants, and thus sound alike.

In the human voice, the vocal tract and nasal cavity act as the resonating body. By manipulating the shape and size of that resonant space (that is, by changing the shape of the mouth and throat), we change the location of the formants in our voice. We recognize different vowel sounds mainly by their formant placement. Knowing that, we can generate some fairly convincing synthetic vowels by manipulating formants in a synthesized set of tones. A number of books list actual formant frequency values for various voices and vowels, including Charles Dodge's highly recommended standard text, Computer Music (Dodge is a great pioneer in computer music voice synthesis).
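As a rough sketch of how vowel formants can drive a synthesized set of tones, the Python below builds three vowels from the same 110 Hz fundamental by changing only the formant frequencies. The values in `VOWEL_FORMANTS` are approximate adult-male averages of the kind found in standard phonetics tables; they are not taken from the Dodge text, and the single shared `BANDWIDTH_HZ` and the halving of gain per formant are simplifying assumptions made for this example.

```python
import math

# Approximate average formant frequencies (Hz) for an adult male voice,
# as commonly cited in phonetics tables. Treat them as illustrative values.
VOWEL_FORMANTS = {
    "ee": (270.0, 2290.0, 3010.0),   # as in "beet"
    "ah": (730.0, 1090.0, 2440.0),   # as in "father"
    "oo": (300.0, 870.0, 2240.0),    # as in "boot"
}

BANDWIDTH_HZ = 100.0  # assumed width shared by every resonance peak


def resonance_gain(freq_hz, formants):
    """Sum of bell-shaped peaks, one per formant; each higher formant is half as loud."""
    return sum((0.5 ** i) * math.exp(-0.5 * ((freq_hz - fc) / BANDWIDTH_HZ) ** 2)
               for i, fc in enumerate(formants))


def vowel_partials(vowel, f0_hz=110.0, max_freq_hz=3500.0):
    """Return (frequency, amplitude) pairs for the harmonics of f0, shaped by the vowel."""
    formants = VOWEL_FORMANTS[vowel]
    count = int(max_freq_hz // f0_hz)
    return [(k * f0_hz, resonance_gain(k * f0_hz, formants))
            for k in range(1, count + 1)]


def synthesize_vowel(vowel, f0_hz=110.0, duration_s=0.25, sample_rate=8000):
    """Additive synthesis of one vowel; samples are normalized to the range -1..1."""
    partials = vowel_partials(vowel, f0_hz)
    n = int(duration_s * sample_rate)
    raw = [sum(a * math.sin(2 * math.pi * f * t / sample_rate) for f, a in partials)
           for t in range(n)]
    peak = max(abs(s) for s in raw) or 1.0
    return [s / peak for s in raw]


for vowel in VOWEL_FORMANTS:
    loudest = max(vowel_partials(vowel), key=lambda pair: pair[1])[0]
    print(f"{vowel}: loudest harmonic at {loudest:.0f} Hz")
```

All three vowels share the same pitch and the same harmonic frequencies. Only the amplitudes differ, and that difference in spectral shape is what the ear hears as a change of vowel.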

change of resonance

Composing with Synthetic Speech

Generating really good and convincing synthetic speech and singing voices is more complex than simply moving around a set of formants. We haven’t mentioned anything about generating consonants, for example, and no speech synthesis system relies purely on formant synthesis. But, as these examples illustrate, even very basic formant manipulation can generate sounds that are undoubtedly "vocal" in nature.
Figure .x

A spectral picture of the voice, showing formants. Graphic courtesy of the alt.usage.english newsgroup.

Composer Paul Lansky.

Soundfile .x

Notjustmoreidlechatter by Paul Lansky.

Notjustmoreidlechatter was made on a DEC MicroVAX II computer in 1988. All the "chatter" pieces (there are three in the set) use linear predictive coding, granular synthesis, and a variety of stochastic mixing techniques.

Paul Lansky is a well-known composer and researcher of computer music who teaches at Princeton University. He has been a leading pioneer in software design, voice synthesis, and compositional techniques.

Used with permission from Paul Lansky

Soundfile .x

idlechatterjunior by Paul Lansky, from 1999.

Paul Lansky writes:

"Over ten years ago I wrote three "chatter" pieces, and then decided to quit while I was ahead. The urge to strike again recently overtook me, however, and after my lawyer assured me that the statute of limitations had run out on this particular offence, I once again leapt into the fray. My hope is that the seasoning provided by my labors in the intervening years results in something new and different. If not, then look out for Idle Chatter III..."

Used with permission from Paul Lansky

Soundfile .x

Composition by composer Sarah Myers entitled Trajectory of Her Voice.

The composer used an interview with her friend Gili Rei as the source material for her composition. Trajectory of Her Voice is a ten-part canon that explores the musical qualities of speech. As the verbal content becomes progressively less comprehensible as language, the focus turns instead to the sonorities inherent in her voice.

This piece was composed using the Cmix computer music language in 1998 (Cmix was written by Paul Lansky, the composer of the examples above).

Soundfile .x

A synthetic speech example: the "Fred" voice from the Macintosh computer. Over the years, computer voice simulations have become better and better. They still sound a bit robotic, but advances in voice synthesis and acoustic technology make them more and more realistic. Bell Telephone Laboratories has been one of the leading research facilities for this work, which is expected to become extremely important in the near future.

formant manipulations

Soundfile .x

Carter Scholz's one-minute piece Mannagram, based on a reading by Australian sound-poet Chris Mann.

In this piece the composer tries to separate the vowels from the consonants, sending each to a different speaker. This was inspired by an idea of Mann's, who always wanted to do a "headphone piece" in which he spoke and the consonants appeared in one ear, the vowels in the other.

Soundfile .x

One of the most interesting examples of formant usage is in playing the trump, sometimes called the jaw harp. Here, a metal tine is plucked, and the player changes the shape of the vocal cavity to bring out different harmonics of the tine's fixed note, which we hear as different pitches.
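The jaw harp turns the usual situation around: the fundamental stays put and the resonance moves. A small sketch of that idea follows, with an assumed tine pitch of 80 Hz and an assumed bell-shaped mouth resonance (both hypothetical values chosen for the example). It reports which harmonic of the tine a given resonance frequency would boost the most.

```python
import math

TINE_F0_HZ = 80.0  # assumed fixed pitch of the plucked tine (hypothetical value)


def emphasized_harmonic(resonance_hz, f0_hz=TINE_F0_HZ, bandwidth_hz=60.0, count=30):
    """Return the harmonic number that a mouth resonance at resonance_hz boosts most."""
    def gain(k):
        # Bell-shaped resonance centered on resonance_hz, evaluated at harmonic k.
        return math.exp(-0.5 * ((k * f0_hz - resonance_hz) / bandwidth_hz) ** 2)
    return max(range(1, count + 1), key=gain)


for resonance in (640.0, 800.0, 960.0, 1200.0):
    k = emphasized_harmonic(resonance)
    print(f"mouth resonance {resonance:6.1f} Hz -> harmonic {k} at {k * TINE_F0_HZ:6.1f} Hz")
```

Because the available notes are harmonics of one fundamental, the melody the player can pick out is limited to that harmonic series.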
