Seamless Speech Communications Technology


It is highly desirable in the multimedia age to realize a seamless acoustic space, one that allows people to converse as if they were in the same room without being overly conscious of where the microphones and loudspeakers are placed. Creating such an acoustic environment requires three capabilities: picking up (i.e., collecting) sound from anywhere in a room, reproducing sound at sufficient volume anywhere in the room, and letting people converse without artificial disruptions in the flow of speech or changes in speech quality. In conventional speech systems, however, simply increasing the volume causes howling and echoes, and the limited performance of present-day loudspeakers and microphones leaves such systems very far from the seamless acoustic space that we envision.

Recently, NTT researchers have made remarkable progress on the key enabling technologies for a seamless acoustic space: sound signal processing, sound-field modeling, sound-field measurement, and the development of a robust electroacoustic transducer. Excellent progress has also been achieved in combining an acoustic echo canceler, a microphone array, and other basic elements into a fully integrated system. One recent breakthrough was the development of a novel ES* projection algorithm that speeds up the adaptive tracking of echo cancelers, improving speech performance when two or more people are talking at the same time, which has been a long-standing problem. Two other noteworthy projects led to a high-performance acoustic echo canceler that uses a voice-switching compatibility scheme to implement an automatic learning function, and to a conference microphone that provides excellent sound quality (i.e., flat frequency characteristics) and signal-to-noise ratio. These technologies not only provide sufficient volume for conferences but also make it possible to converse with a participant who is some distance away from a microphone. They will see extensive use in telephones, of course, and also in multipoint conferencing systems, large-screen communications systems, and many other applications.
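To give a rough feel for what an exponentially weighted step size means in an adaptive echo canceler, the following is a minimal sketch and not the ES projection algorithm itself. It assumes a simplified first-order case: a normalized LMS filter in which tap i is updated with its own step size mu0 * decay**i, so that early taps (where a room echo is usually strongest) adapt fastest. The function names, the parameter values, and the synthetic room response are all hypothetical choices made for illustration.

```python
import random


def simulate_echo(far_end, room):
    """Echo at the microphone: far-end signal filtered by a room response."""
    echo = []
    for k in range(len(far_end)):
        echo.append(sum(h * far_end[k - i]
                        for i, h in enumerate(room) if k - i >= 0))
    return echo


def es_nlms(far_end, mic, taps, mu0=1.0, decay=0.97, eps=1e-8):
    """Normalized LMS with a per-tap step size mu0 * decay**i.

    Assumption: this is only a first-order illustration of exponential
    step-size weighting, not the published ES projection algorithm.
    """
    steps = [mu0 * decay ** i for i in range(taps)]  # one step size per tap
    w = [0.0] * taps   # estimated echo path
    x = [0.0] * taps   # recent far-end samples, newest first
    errors = []
    for s, d in zip(far_end, mic):
        x = [s] + x[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x))     # predicted echo
        e = d - y                                    # residual after cancellation
        norm = sum(xi * xi for xi in x) + eps        # input power (regularized)
        w = [wi + a * e * xi / norm for wi, a, xi in zip(w, steps, x)]
        errors.append(e)
    return w, errors


# Hypothetical test signal: white noise through a decaying 16-tap "room".
rng = random.Random(0)
room = [0.5 * (0.8 ** i) * (1 if i % 2 == 0 else -1) for i in range(16)]
far_end = [rng.gauss(0.0, 1.0) for _ in range(4000)]
mic = simulate_echo(far_end, room)
weights, errors = es_nlms(far_end, mic, taps=16)
```

In this noise-free setting the estimated weights approach the synthetic room response and the residual shrinks over time. A real system would also have to cope with near-end speech and background noise, which this sketch does not model.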

(Human Interface Laboratories)

* ES: Exponentially weighted Stepsize

