IA-SIG Newsletter
"The Interactive Audio Journal"


Vol. 1 No. 3, November 1, 1999
Editor: Alexander Brandon


In This Issue:

Section I: From the Chairman: Some opening comments from IA-SIG Chairman, Mark Miller

Section II: Official IA-SIG Announcements: All official announcements regarding IA-SIG members, activities, and special events. In this issue, announcements from the 3DWG, the MFWG and an Event Write Up from Project BBQ.

Section III: Working Group Reports: Status Reports from the 3D Working Group, the Interactive Composition Working Group, the Intellectual Property Working Group, the DSP Working Group, and the Multi Format Working Group.

Section IV: Features: "A Look Back"… ever wonder just how music and sound were implemented in games such as "Asteroids", "Road Blasters", and "Super Mario Bros."? Even if you haven't, here are the answers… a pleasant stroll down memory lane with the former director of technology development at Atari Games, Brad Fuller.

Section V: Industry Corner: This section contains news, interviews and information from our member companies. This issue features a multi channel audio article with John Loose from Dolby as well as an interview with Guy Whitmore about Microsoft's "DirectMusic."

<Note: Items printed and reprinted in this section do not necessarily represent the views of the IA-SIG or its management, although we do try to ensure that there is useful content in everything that we publish.>

Section VI: Developer's Corner: "A View From The Bridge"… Quotes on the future of Interactive Audio from Brian Schmidt, Jeff Roberts, Tommy Tallarico, The Fat Man, Andrew Barnabas, Colin Anderson, and more…

If you are interested in contributing to "The Interactive Audio Journal" please contact Mark Miller (mark@groupprocess.com)


Please Join the IA-SIG!

The Interactive Audio Special Interest Group (IA-SIG) exists to allow developers of audio software, hardware, and content to freely exchange ideas about "interactive audio". The goal of the group is to improve the performance of interactive applications by influencing hardware and software design, as well as leveraging the combined skills of the audio community to make better tools.

The IA-SIG has been influential in the development of audio standards, features, and APIs for Microsoft Windows and other platforms, and has helped numerous hardware companies define their directions for the future. Anyone with a commercial interest in multimedia audio is encouraged to become a member of the IA-SIG and participate in IA-SIG discussions.


IA-SIG Steering Committee

Chairman: Mark Miller (mark@groupprocess.com)

Steering Committee:

Rob Hubbard (EA), Monty Schmidt (Sonic Foundry), Danny Petkevich (Staccato Systems), Brian Schmidt (Microsoft), Alexander Brandon (Straylight Productions), Tom White (MIDI Manufacturers Association).

IA-SIG Advisory Board

Thomas Dolby Robertson (Beatnik), David Mash (Berklee College of Music), Craig Anderton (EQ Magazine), Gordon Currie (Portal Productions), Dale Gulick (AMD), Rudy Helm (At the Helm Productions)


Section I: From the Chairman

A message from IA-SIG chairman, Mark Steven Miller (Mark@GroupProcess.com)


I'm just back from BBQ 99, and as I said to the assembled attendees at Saturday night's final dinner, it was the best BBQ ever. To all of those who were not able to attend: try harder, it's worth it. This year, the discussions were more focused and directed than in years past. Chris Grigg presented 'The Roadmap, Part 1' that we published here in July. I was quite pleased to find that there was strong interest in the idea of standardizing this 'component based, plug in style' architecture for Interactive Audio. We look forward both to publishing the second part of this article and to starting a Working Group in the near future to address this 'big picture' idea.

As always, I left feeling reinvigorated and ready to take the seeds of inspiration that were produced at this event and grow them into IA-SIG Working Groups and initiatives. You can read more about this in Alex Brandon's Event Write Up, but for now, let me just say a heartfelt thanks to George Sanger, Linda Law and Spanki for all of the work that goes into BBQ. Again, it's worth it.

As to IA-SIG business, there are some great achievements to report.

The 3D Working Group (3DWG) has completed and published the official I3DL2 document. You can find it on the Web site, under Working Groups. I cannot say strongly enough that the I3DL2 effort represents the best that the IA-SIG has to offer. In the process, many companies put aside the desire to 'own the market' and agreed to a standard for virtual acoustics modeling extensions to real-time 3D audio that would grow the entire industry and universally benefit consumers.

The Multi-Format Working Group (MFWG) has come to consensus and released its report on the handling of multi-format audio and its translation to the variety of speaker systems (mono, stereo, 4.1, 5.1, 7.1, etc.) to the entire SIG for comment and review. We expect to publish this report within the month.

In an effort to bring more content creators and composers back into the SIG, we have instituted a plan to offer free one-year trial memberships to select participants of the Video Game Musician's List and the GameAudioPro List. The response so far has been extremely enthusiastic, with new members joining the Interactive Composition Working Group (ICWG) and the Intellectual Property Working Group (IPWG).

Looking forward, the Audio Track courses for GDC2000 have been finalized. I will be publishing the entire set of course descriptions and instructor bios in the next issue. I can say that I am quite confident that we will take the best of last year and improve upon it in some exciting and surprising ways.

In closing, I would like to take this opportunity to thank and recognize our Working Group Chairmen, Conrad Maxwell (3DWG), Michael Land (MFWG), Brad Fuller, Jim Hedges, and David Javelosa (ICWG), and new chairman, Keith Weiner (DSPWG) for all of their hard work and dedication. Without dedicated volunteers like these, there would be no SIG.

Until January,

Mark Steven Miller
IA-SIG Chairman
GroupProcess Consulting


Section II: Official Announcements

I3DL2 is now available at www.iasig.org

The I3DL2 document details the 3DWG's recommended approach to providing state-of-the-art Interactive 3D audio using a pre-set reverberation to simulate a common environment for sound source and listener.

You can download the document by clicking here 'I3DL2'

Official Comment Period for the Multi Format Audio Working Group Report Begins 11/5/99:

No major objections were raised during the general membership review of the MFWG report. As a result, the report will now be sent back to the Working Group one last time for the 'Official Comment Period' or OCP.

What is 'the official comment period', you may ask? The OCP is an essential part of our document publication process that has been underutilized of late...

More specifically, the OCP is a fixed-length period of time after the WG chair announces that the group has reached a consensus. During this time period, WG members may submit official written comments that, by rule, must accompany the published document in the form of an appendix. This rule was created to prevent individual WG members from dragging down an otherwise united group while still allowing their opinions to be recorded. It also provides insurance that a WG chairman does not run rogue with the process.

Project BBQ a success… again!

Project BBQ, a Texas-style think tank in which experts across the Interactive Audio industry gather to brainstorm improvements to current methodologies and invent new ones, was held October 14-17 at the Guadalupe River Ranch, Boerne, Texas.

To have a convention is one thing, but a gathering such as BBQ, with its uncanny combination of style, elegance, luxury, and most importantly cooperative discussion, goes beyond the conventional convention. Giving the Game Developers Conference a run for its money, one might even go as far as to say it is the event of the year for Interactive Audio discussion as well as being a howling good time.

BBQ this year had representatives from such companies as Microsoft, IBM, Texas Instruments, Intel, Beatnik, Apple, Be, Ion Storm, Dolby, and AMD. A virtual "who's who" of the interactive audio community (though mostly on the hardware front) sat at roundtables, put aside their corporate hats, and discussed numerous issues and proposed new technologies. Final group reports will be forthcoming, but here is a brief sample of some of this year's discussion groups.

The Angry Frogs:

The Angry Frogs (so named in honor of the recent riots at McDonalds restaurants in Paris) decided to explore the business implications of MP3 and other recent audio technologies for our industry. The group defined the following mission statement:

"Examine sustainable business models enabled by new digital music technology - that grow the market overall and enable new, expanded, extended, or unique commerce opportunities. (i.e. we get to sell more stuff….)"

Among the most interesting ideas to emerge was the notion that music distribution must evolve into a service model rather than a 'sell the plastic' model. To exemplify this, imagine a service that allowed you to access and listen to the music that you wanted to hear wherever you were, on whatever device was handy. It is the revenue potential of this kind of convenience that will eventually win out over restricted access and its associated piracy concerns.

The 'Q' group:

The 'Q' group had the unenviable task of trying to make progress on Chris Grigg's 'big idea' of 'component based, plug in style' architecture for Interactive Audio (presented in part in last issue's 'The Roadmap, Part 1'). The resulting discussion diverged into two relevant threads. The first took the 'We have problems on the ground right now in this area. What can we do about them?' approach. The second attempted to take the top-level view of 'If we could start all over again, how would we do it better?'

The first uncovered a particular type of abstraction that could assist in defining both a syntax and process for game programmer / sound designer communication. The group postulated that the correct place to draw the line between what the game programmer needs to do, and what the sound designer needs to do lies in the notion of an 'audio cue'. More specifically, the nature and function of the 'cue' should be jointly defined in pre-production. Subsequently, the sound designer should create the elements and logic that will combine to create the sound track that will play when the 'cue is called'. During run time, the game programmer determines simply when the 'cue' begins and ends.

This is a very interesting idea and I would expect to see either a sub group of the ICWG or an independent group form within the IA-SIG to explore this idea further.
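To make the 'audio cue' split concrete, here is a minimal sketch in Python. It is only an illustration of the idea as summarized above, not anything the 'Q' group specified: the names `Cue`, `CueEngine`, `begin` and `end`, and the sound file names, are all hypothetical. The sound designer owns the logic that decides what plays; the game programmer only says when a cue begins and ends.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Cue:
    # Name agreed on by programmer and sound designer in pre-production.
    name: str
    # Designer-owned logic: returns the sound elements to play for this cue.
    build: Callable[[], List[str]]


class CueEngine:
    """Hypothetical run-time layer sitting between game code and audio content."""

    def __init__(self) -> None:
        self._cues: Dict[str, Cue] = {}
        self.playing: Dict[str, List[str]] = {}

    def register(self, cue: Cue) -> None:
        self._cues[cue.name] = cue

    def begin(self, name: str) -> List[str]:
        # The programmer only says *when*; the cue itself decides *what*.
        elements = self._cues[name].build()
        self.playing[name] = elements
        return elements

    def end(self, name: str) -> None:
        self.playing.pop(name, None)


engine = CueEngine()
# Designer side: content and logic for the cue (file names are made up).
engine.register(Cue("door_open", lambda: ["creak.wav", "latch.wav"]))

# Programmer side: two calls, no knowledge of the content behind them.
started = engine.begin("door_open")
engine.end("door_open")
```

In this arrangement the designer can change what "door_open" plays, or make it depend on game state, without the game code changing at all.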

The second thread remained quite high level, and I would simply refer you to 'The Roadmap, Part 1' for more details. Suffice it to say for now that this is a really interesting concept; expect to see some WG action in this area within the next few months.

More on this in the next issue… or check out http://www.fatman.com for the complete BBQ report


Section III: Working Group Reports

The Working Group is the main functional aspect of the IA-SIG. Working Groups generally form around issues of current concern for the industry. Once formed, they meet either in person or via the Internet and develop standards and recommended practices documents. These documents represent industry consensus and are published and made available to all interested parties. This is where the Working Groups (WGs) report their quarterly progress.
click here for a description of the Working Group Process

3D Working Group (3DWG)
Chairman: Conrad Maxwell, Conexant <conrad.maxwell@conexant.com>


The group focuses on creating 3D Audio rendering guidelines to define more realistic audio environments. This effort has led to extensions to the Microsoft DirectSound 3.0 API to enable hardware acceleration, and to the publication in 1998 of the IA-SIG Interactive 3D Audio Rendering and Evaluation Guidelines (Level 1), describing "minimal acceptable" 3D audio features for all platforms. This year the group documented the recommended enhancements to current 3D audio technology, such as reverb parameters, object reflections and occlusions, and more, published as the IA-SIG Interactive 3D Audio Rendering Guidelines (Level 2).

The I3DL2 document details the 3DWG's recommended approach to providing state-of-the-art Interactive 3D audio using a pre-set reverberation to simulate a common environment for sound source and listener. Occlusion and obstruction effects are also specified. An example API implementation (two 'C' header files) is included in the specification, enumerating a recommended "DirectSound Property Set" implementation of the low-level I3DL2 source and listener controls. The Property Set also includes a number of useful presets for rooms and for material types for occlusions and obstructions.

Current Status:

It is expected that the 3D Working Group will take up a 3D MIDI specification as a future (and possibly its next) work item. The issue of effects beyond 3D has also been discussed as a possible next item. Using the I3DL2 specification for environment definitions, it could be possible to apply similar control paradigms to flange, chorus and other studio effects. Currently this is beyond the charter of this working group; however, many members expressed interest in this topic.

Submitted by Brian Schmidt <bschmidt@microsoft.com>
3DWG Steering Committee Representative

Chairman: Scott McNeese, VLSI <scott.mcneese@tempe.vlsi.com>

A new group has been proposed to supersede the AAWG's efforts as a separate functioning body. The details and specific goals of this group have not been finalized and until the matter is resolved the AAWG will remain in place as an advisory group to the new organization.

Submitted by Alexander Brandon

Interactive Composition Working Group (ICWG)
Chairman: Brad Fuller <bfuller@pacbell.net>


Q: What is Adaptive Audio?
A: Adaptive Audio is audio that is delivered via a system that allows for direct or indirect control of the data and/or the data stream.

Q: What is the ICWG's definition of an "Architecture"?
A: Architecture: a collection of components and interactions among those components. A description of elements from which systems are built, the interactions among those elements, the patterns that guide their composition, and the constraints on these patterns.

Q: What is the desired output or result of the ICWG?
A: We hope to build a common lexicon of adaptive and interactive music terms by the end of the year. This is an important step as it will give us a foundation to move discussions and decisions along. More importantly, we hope to have the first draft of the Adaptive Audio Architecture completed by CGDC 2000.

Current Status

We've had a hot summer of extremely provocative ideas for Adaptive Audio Systems, including discussions outlining high-level architectures. Unfortunately, the heat has cooled in the last few months.

To get the discussions moving forward again we have compiled the discussion threads for your review and we are at work building a matrix of terms for interactive music. You have undoubtedly been asked to contribute information regarding interactive audio systems that you are familiar with. If not, please visit the ICWG and ask to be counted. These initiatives will provide a good history of our discussions and the state of the art. We will make this information conveniently available to members of the IA-SIG.

The ICWG provides a forum to discuss technologies to build Adaptive Audio Systems, but it can be only as good as the contributions. We want to get the ICWG hopping again. If you have ideas, contributions or just want to be informed on how interactive audio can be part of your products, we invite you to join the ICWG and help us create technologies that will further propel audio in the interactive entertainment industry.

Brad Fuller <bfuller@pacbell.net>
Chairman, ICWG

Intellectual Property Working Group (IPWG)
Chairman: Mark Miller, Mark@GroupProcess.com


PURPOSE: The purpose of the IPWG is to facilitate and ensure the continued development, availability, growth, and profitability associated with the marketing, distribution, and licensing of sounds and sound sets to the interactive audio community.

GOAL: The immediate goals of the IPWG are to facilitate improvements in interactive audio and to make use and licensing term recommendations regarding reasonable and appropriate distribution of sounds and sound sets into interactive media and playback environments. The recommendations will be based on broad industry input and be consistent with the IPWG Purpose above and copyright-owner rights as intended under existing and evolving copyright laws.

For additional information or to join this group, please contact Mark at Mark@GroupProcess.com

Current Status

The IPWG has successfully launched and discussion is underway. Roughly 20 members have joined, and the mix between Soundware vendors, game sound developers, and MI hardware vendors is looking just about right.

Currently, the group is close to completion of Phase 1a of the Working Group Goals and is moving rapidly on Phase 2a. Please click here for more information on the IPWG goals.

Submitted by Mark Miller <Mark@GroupProcess.com>

Multi-Format Working Group (MFWG)
Chairman: Michael Land, LucasArts <mland@lucasarts.com>

The MFWG was formed to address the problem of taking audio authored in a variety of channel formats and playing it back on all the possible types of speaker configurations out there. We started out with a chart developed at BBQ '98, in which several input formats of audio were matrixed to several output formats, with appropriate channel mapping instructions at each point in the grid. We began by examining and evaluating this chart, as well as some general comments that accompanied it. In the course of discussion, we explored some of the principles and common elements inherent in the problem being addressed by the chart. As a result, the chart itself was distilled into a set of much more readable and understandable rules and procedures. Some other key points were also identified, such as that low frequency content would automatically be routed to the subwoofer by the speaker system, and that total signal power needed to be conserved during format conversion. The result is that the guidelines are now much simpler and more useful, without having lost any of the original problem-solving value of the chart.
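The power-conservation point can be illustrated with a small sketch. This is not the MFWG's rule set, which is not reproduced here; the routing table, channel names and function names below are assumptions made purely for illustration. The idea shown is one common way to keep an input channel's power constant when it is spread across several output speakers: scale each copy by 1/sqrt(n), so the squared gains sum to 1.

```python
import math
from typing import Dict, List

# Hypothetical routing of a five-channel source onto stereo speakers.
# No subwoofer channel is routed, in line with the note above that low
# frequencies are handled by the speaker system itself.
STEREO_ROUTES: Dict[str, List[str]] = {
    "L": ["L"],
    "R": ["R"],
    "C": ["L", "R"],   # center is shared between left and right
    "Ls": ["L"],
    "Rs": ["R"],
}


def power_preserving_gains(routes: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Give each input channel gains whose squares sum to 1 across its outputs."""
    gains: Dict[str, Dict[str, float]] = {}
    for src, dests in routes.items():
        g = 1.0 / math.sqrt(len(dests))
        gains[src] = {dest: g for dest in dests}
    return gains


def downmix(frame: Dict[str, float], routes: Dict[str, List[str]]) -> Dict[str, float]:
    """Mix one sample frame (input channel -> sample value) to the output layout."""
    gains = power_preserving_gains(routes)
    out: Dict[str, float] = {}
    for src, sample in frame.items():
        for dest, g in gains[src].items():
            out[dest] = out.get(dest, 0.0) + g * sample
    return out


# A full-scale center-only signal ends up at about 0.707 in each speaker.
center_only = downmix({"L": 0.0, "R": 0.0, "C": 1.0, "Ls": 0.0, "Rs": 0.0}, STEREO_ROUTES)
```

Note that this only preserves the power of each input channel taken on its own; correlated material arriving from several inputs at the same speaker can still add up to more, which is one reason real guidelines need more than a single gain rule.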

Current Status

The MFWG Report has passed the General Membership Review and is currently in the Official Comment Period. It is expected that the report will be published by 12/1/99.

Submitted by Michael Land.
MFWG Chairman

Updated by Mark Miller

DSP Working Group (DSPWG)
Chairman: Keith Weiner, DiamondWare <keith@dw.com>

The group's mission is to develop and recommend standard methodologies for applying real-time DSP to sound streams.

Current Status
Following the Texas BBQ Conference, the WG is actually getting down to the preliminary business of defining terms and specifying goals. No concrete work products (i.e. papers) have been proposed yet.


Section IV: Features

This section contains features and columns on varied topics of IA-SIG interest.

"A Look Back"
by Alexander Brandon and Brad Fuller

Interactive Audio has been with us for a long time, but it is only recently that it has achieved 'buzzword' status. Composers and tool developers alike are frantically trying to find ways to give audio new meaning on computers and other platforms, and to use computers to even extend music to a completely new artistic medium beyond the traditional monologue between the composer and the listener. Some of us, however, have been dabbling in the art for quite a while.

Brad Fuller, for instance… composer, sound designer, programmer… (now working at OpenTV), was quite notable in this respect during his term as the head of the Technology Department at Atari Games. I had the opportunity to talk with Brad about the projects he directed and the technology he used for the Atari arcade games from 1982 to 1996. What I learned was extremely interesting, not only about the level of interactivity in these games but also about a development process that lent a huge amount of freedom to the composer when compared to PC development. Read on!

(For more information on Brad Fuller and his work go to www.bradfuller.com)

The Old Days

Before 1976, Atari would develop audio hardware by creative uses of standard electrical components, the less expensive units that were rapidly being superseded by the infamous microprocessor. As a result, sound was recreated using extremely primitive methods (by today's standards) but the techniques were very innovative for the time, especially for commercial products.

In Like Flint… FM sound!

The microprocessor revolutionized games just as it revolutionized every other aspect of the computer industry. Atari furthered the arcade experience by using 8-bit microprocessors such as MOS Technology's 6502 and the Motorola 6809. These microprocessors handled everything - including audio. Atari created a sound chip called POKEY - a 4-bit square-wave generator with a noise circuit. Later, Atari would add the TI 5220 speech chip, made famous by TI in such commercial products as "Speak 'n' Spell." The 5220 was a Linear Predictive Coding (LPC) IC: it modeled various parts of the throat and mouth to emulate human speech. A good example of the 5220 can be found in the vector graphics hit "Star Wars," which featured sound clips from the original motion picture.

Also introduced as a music and sound processor was the Yamaha YM2151, which offered 8 channels of FM sound with 4 operators per channel and was first used in Marble Madness (1984). The YM2151 was even more advanced than the 2-operator, 9-channel OPL chip found on the AdLib card (the first sound card for IBM PCs and compatibles) in 1987. Later the YM2151 would also be able to handle speech independent of the 5220 thanks to a Sinusoidal Modeling algorithm developed by Earl Vickers.

Development problems and development solutions

With the audio hardware system being a very capable one for the times, and most pleasantly, standardized throughout the entire product line, Brad Fuller should be happy to develop the products and be on his merry way, yes? Well, no. As idyllic as the system might sound, the development aspect of it wasn't exactly a cup of tea.

For starters, Atari began with a single development system that used VAX / VMS systems exclusively. Then a new system dubbed "blue box" was used. Developers used VAX 780 machines connected to the blue boxes through a 1200-baud serial cable, the blue boxes running FORTH on a 6502 processor. The VAX would run EDT, a text editor, through dumb terminals. Audio was written in EDT using an interpreted language called RPM, converted into pseudo code, assembled and linked with the entire audio program, sent to the blue box through the 1200-baud serial line, and then called in a game simulation. Bugs and other problems would be found and the process would be repeated.

While everyone had a terminal, the first blue box unit had to be shared by Brad and his audio team as well as the programmers. Everything was done in one place: no labs, no graphics studios, until 1985. You can imagine the time spent waiting for the single unit to become available. Thankfully no compiling time was needed back then because everything was written in assembly. Coupled with additional minor hardware problems on various blue box units (such as having to lift the keyboard up two inches and then drop it to resume a stuck download), development was certainly not very enjoyable, especially when compared to the techniques employed today.

So, Brad Fuller and Pat McCarthy set about building a board, separate from the main board, to isolate audio development. The new board was called the 'SA board', which stood for "Stand Alone Audio" (Ed. Shouldn't that be SAA? Hmmm…). The idea was to separate audio development completely from the rest of the game team, which, contrary to the first impressions of others at Atari, was a very good idea. Now the audio could be developed easily without as much reliance on the other members of the game team. This also saved a great deal in manufacturing costs, which for hardware were, and still are, sky high compared to software.

The SA boards consisted of a 68000 CP/M (Digital Research's competitor to MS-DOS) box connected to any or all of the audio processors needed for a particular product (starting with the 6502 processor and including the TI 5220, Yamaha YM2151, and POKEY chip; the SAII board, however, removed the POKEY chip entirely). SA boards made their way into development after the introduction of Atari's new "System 1" and "System 2" main boards (which still had audio hardware on the main boards). These boards were used with 68000 boxes running CP/M that could be bought inexpensively. The SAII board added an inexpensive DAC that could play 4 channels of digital sfx at much better quality.

The 68000 boxes contained an RPM compiler that could compile one RPM file and quickly load it over to the SA hardware. Instead of the need for VAX machines Brad and his group could use the 68000 boxes with a text editor, compile the file on the box, download it to the SA hardware, listen, and repeat the process.

With this system in place, the SA boards combined with the 68000 box sped up development by 100 times and contributed to the creation of the soundtracks of Atari products from 1987 to 1992.

One subsequent title "Cyberball" (1989) used the Motorola 68000 chipset for audio. This chipset featured wavetable playback technology which enabled sampling and other techniques not available prior to its introduction. Unfortunately, use of these components was discontinued after this product due to their high cost.

After the 68000 came another collaborative effort from Brad and Chuck Peplinski (who arrived at Atari from Hybrid Arts). The new system was called the CAGE development system. Named for the recently deceased composer John Cage, the system's acronym spelled out Configurable Audio Generation Engine. This new system contained a DSP board that had everything from a music sequencer to a speech sequencer to a real-time OS of its own! Tools for creating great audio, another quantum leap faster than the original methods, had become a reality.

An interesting side tale of the development of this board was that Brad and Chuck, over a period of 5 months, developed the code independently of each other (although the architecture was developed together). This included such improvements to Atari audio boards as a connection between the CAGE hardware and PCs for even faster development (RPM was converted to C as well). The first time they began to play music, the team expected the usual torrent of bugs, fixes, and compiles; this time, however, was different. The system worked perfectly with two completely separate batches of code. Music played for hours on end without a single glitch, something that, to the best of their knowledge, neither developer had experienced before or since.

CAGE was used for "T-Mek" (1994) and the first generation "San Francisco Rush" titles but by 1996 Brad had moved on to his privately owned operation Sonaurel, formerly Matter to Magic.

Where's the Interactivity?

RPM left a lot to be desired as an artistic rendering medium. But its lines of text (as opposed to notes on staves found in programs such as "Finale") gave a great deal of flexibility in the actual creation of music. Since it was a form of scripting language, variables could be assigned anywhere to almost any aspect of the game: graphics, controls, any events that could be linked to. The playback system, as with all audio chips, contained "n" channels that could be defined as logical channels and which played via a priority system that corresponded to the actual physical channels. Theoretically one could have 100 channels playing simultaneously and call any one of them at any time using any number of globally or locally defined variables. This kind of interactivity, mind you, was going on in the 1980s.
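For readers who have not worked with such a scheme, the sketch below shows one way logical channels can be mapped onto a smaller number of physical channels by priority. It is a generic illustration in Python, not a reconstruction of RPM or of Atari's playback code; the class name, the channel names and the rule that a newcomer must strictly outrank the channel it replaces are all assumptions.

```python
from typing import Dict, List, Optional


class ChannelAllocator:
    """Maps any number of named logical channels onto a few physical ones."""

    def __init__(self, physical_channels: int) -> None:
        # Each slot holds the name of the logical channel using it, or None.
        self.slots: List[Optional[str]] = [None] * physical_channels
        self.priority: Dict[str, int] = {}

    def play(self, logical: str, priority: int) -> Optional[int]:
        """Return the physical channel index granted, or None if refused."""
        self.priority[logical] = priority
        if logical in self.slots:
            return self.slots.index(logical)      # already sounding
        if None in self.slots:
            idx = self.slots.index(None)          # a free physical channel
        else:
            # All busy: find the lowest-priority occupant and steal its
            # slot only if the newcomer strictly outranks it.
            idx = min(range(len(self.slots)),
                      key=lambda i: self.priority[self.slots[i]])
            if self.priority[self.slots[idx]] >= priority:
                return None
        self.slots[idx] = logical
        return idx

    def stop(self, logical: str) -> None:
        if logical in self.slots:
            self.slots[self.slots.index(logical)] = None


alloc = ChannelAllocator(physical_channels=2)
music = alloc.play("music", priority=5)          # takes physical channel 0
engine_hum = alloc.play("engine", priority=1)    # takes physical channel 1
explosion = alloc.play("explosion", priority=9)  # steals channel 1 from "engine"
footstep = alloc.play("footstep", priority=0)    # refused: nothing it outranks
```

Game code refers only to the logical names, so a script can define far more channels than the hardware has and let the priorities decide what is actually heard.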

While the PC has made great strides forward as a gaming platform in general and for audio technology in particular, it is not without its disadvantages when compared to the one company / one development system / one delivery platform approach. For instance, once the hardware at Atari or Sega is locked down for a new system, it sounds exactly the same and acts exactly the same on every distributed unit, whereas on a PC there are vast differences in both hardware and software implementation due to the many available configurations. Still, I believe a great deal can be learned from Brad's innovations at Atari, which during his tenure reigned as one of the greatest arcade game companies of the time.



Section V: Industry Corner

This section contains news, interviews and information from our member companies. This issue features a multi channel audio article with John Loose from Dolby as well as an interview with Guy Whitmore about Microsoft's "DirectMusic."

<Note: Items printed and reprinted in this section do not necessarily represent the views of the IA-SIG or its management, although we do try to ensure that there is useful content in everything that we publish.>

Composer Profile: Guy Whitmore, "Whitmoreland Productions"
Interviewed by Chanel Summers, Audio Technical Evangelist, Microsoft


Guy Whitmore first discussed his experiences using the precursor to DirectMusic -- the Microsoft Interactive Music Architecture (IMA) -- at the 1998 Game Developers Conference. Since then he's become one of the most accomplished DirectMusic composers working today. We had the opportunity to talk recently about his thoughts on DirectMusic and his plans for the future.

Chanel Summers: Tell me a little bit about your background and some of the projects you've worked on.

Guy Whitmore: I picked up the guitar at 10 and have played in rock bands ever since. I was a drummer/percussionist in high school band. After high school, I went to Northwestern University where I got a Bachelor of Music in Guitar Performance, and then received two Masters Degrees in Composition and Guitar Performance from Southern Methodist University in Dallas.

My first professional gigs were in theater as a composer/sound designer. I've done shows at theaters in Dallas, New York, and L.A. I still try to score one show a year. This year, I'm collaborating with an actor and sculptor, to create a contemporary theater piece that uses interactive/non-linear music techniques. "Reaching She" is still a "work in progress". I have MIDI triggers and sensors from Infusion Systems that we're starting to implement. The long term goal is to utilize DirectMusic in the production.

My first game music job was at Sierra On-Line. I was hired as a staff composer in '94. My first game was Mixed Up Mother Goose Deluxe (a classic). Shivers, Shivers II, and Power Chess were other Sierra titles I've done. Next I was hired by Monolith Productions and wrote music for Claw, Blood, Blood2, Shogo, and others.

In June of this year I set out on my own, and "Whitmoreland Productions" came into being. My first gigs have been from former employers: Sanity and No One Lives Forever (working title) for Monolith, and Word Games for Sierra Attractions. I also do some commercial work on the side, and will be scoring a film this winter called "Out Of The Blue."

Chanel: Tell me about the equipment you have in your studio.

Guy: Right now, I have the following pieces in my studio:

Computers: Mac G3 running Digital Performer, Peak, Infinity, MetaSynth (amazing program!). MOTU 2408 for digital I/O and a Windows laptop (Pentium II/366, 128MB RAM, 8MB ATI-Rage, 15" screen) running DirectMusic Producer, various audio utilities, and games!

Recording: AKG SolidTube mic, to a Focusrite VoiceMaster, to Lucid 9624 AtoD, to the computer via the MOTU 2408. I also have a Yamaha 01V mixer.

Samplers/Synths: EMU E4 ultra, EMU e6400, Korg Z1, Roland Jupiter-6, Roland JD-990, Kurzweil PC-88 controller.

Guitars, Etc.: Strat Copy Custom Frankenstein thing, Guild Acoustic, Godin Multiac Classical (on its way), Reverend bass, Blackshire hand built nylon string acoustic, and a TopHat-King Royale amp (I'll never record "direct" again!).

Chanel: When did you first start working with the DirectMusic technology?

Guy: Wow, let's see. I think it was late '97 when Dan Bernstein (my boss at the time), threw a beta version of the Microsoft Interactive Music Architecture -- the predecessor to DirectMusic -- on my desk and said, "Here. See what you can do with this." A few weeks later I had a little demo that I showed the Shogo team. It went over well enough that we decided to integrate IMA into the LithTech engine, since DirectMusic had not yet shipped. We're now upgrading the LithTech engine to use DirectMusic.

Chanel: What first attracted you to DirectMusic?

Guy: Two things initially hooked me on DirectMusic: the ability to do seamless transitions between sections or pieces of music, and the DLS wavetable standard, which gave us the ability to create custom sound banks while still getting consistent sound across all platforms. Once I discovered that DirectMusic/IMA could do these things, I was determined to utilize it.

Chanel: What have been your general experiences working with DirectMusic? What do you like about it?

Guy: Working with DirectMusic is a constant process of discovery. The interactive concepts themselves are still evolving, and there are no rules yet. You have to make your own rules, otherwise you'll drown in the possibilities. With each piece of music I create with DirectMusic I try to add one or two new techniques to my tool box.

Chanel: How do you generally approach scoring a game using DirectMusic?

Guy: With Shogo and Blood2 I'd typically start by composing a complete linear -- Red Book -- score using my main sequencer (Digital Performer) and all the instruments in my studio. Then I would break the piece of music down into its various sections in DirectMusic. Now I try to conceptualize the music "non-linearly" from the get-go. I still start in Performer, but I use it more as a sketch pad for getting themes, sounds, and ideas into the computer. Based on those themes, I create entirely new sections of music in Producer. That way I can work with the interactivity and variations as I arrange the music. One interesting technique I've been using is improvisational and non-linear in nature. I'll improvise and record guitar parts, then cut up the file into short samples and create a unique DLS bank from them. Then, in Producer, I can recreate the parts I played, or compose completely new guitar parts.

Chanel: What things in DirectMusic have you found most frustrating?

Guy: I use sequencing software (Performer) that has been in development for well over a decade. It has had a lot of time to mature and develop. DirectMusic Producer sometimes feels awkward, particularly in terms of its sequencing functionality, because it doesn't have all the years of development behind it. Moreover, it hasn't been tested extensively in real world products and games... yet. That's where Microsoft will get very crucial feedback, and the program can mature and begin to have a more elegant interface. And beyond all that, the very concepts that DirectMusic is based on are evolving. Microsoft will have to respond quickly to user input as the program evolves. So far I can attest that Microsoft is indeed listening.

Chanel: What one thing do you wish someone would have told you about DirectMusic when you first started to use it?

Guy: Be patient, and learn one thing at a time.

Chanel: What about DirectMusic has most confused you?

Guy: Right off, some of the terminology was confusing (Style, for instance). But once you realize they're just labels, and shouldn't be taken literally, it opens things up.

Chanel: What about DirectMusic has most surprised you?

Guy: The depth of the program. It may only take weeks to learn, but these techniques will take years to master.

Chanel: What features of DirectMusic do you hope to explore in your next

Guy: I'm already experimenting with all aspects of DirectMusic. My first goal was to get the music to sound as rich as the CD audio music I've done because, in my opinion, interactive music that still sounds like General MIDI is pointless. That's why I've explored DLS techniques so extensively. I've recently done a lot of work with motifs and secondary segments, so that I can work in layers. And I'm beginning to explore the harmonic functionality of DirectMusic, such as chordmaps. The type of music I'm writing determines what DirectMusic techniques I utilize.

Chanel: Do you have any words of advice for composers who might be considering DirectMusic?

Guy: My advice to composers new to DirectMusic is to keep it simple at first. A few effective interactive techniques can go a long way. Also, keep in mind the style of music you're writing, and what type of interactivity the game or program calls for. My first step to creating game music is sitting down with the game designer, and programmer, and determining what type of musical interactivity we'd like to hear in the game. Everything else should follow those goals.

Guy can be contacted at guy@speakeasy.org. Chanel can be contacted at chanels@microsoft.com

An interview with John Loose of Dolby Laboratories

In today's world of extreme audio expansion, Dolby Digital and DTS are becoming vitally important modern sound design techniques. DVD movies, fast replacing VHS tapes, are furthering the ubiquity of Surround Sound hardware in the home. Most recently multichannel audio has begun to work its way into games. In our last issue we explored the many facets of 3D audio and its use in interactive applications. Now John Loose from Dolby gives us his insights on where the medium of multichannel sound is going.

AB) If we could begin with a brief summary of your position at Dolby and what projects you've contributed to and / or some sort of career history to give our readers a background.

JL) For 5 years, before I joined Dolby Laboratories, I was an independent sound designer and composer for the games industry, producing under the moniker of Game Sound Development. I created game audio for Accolade, Interplay, THQ, Acclaim, U.S. Gold, Virgin, Malibu, Electronic Arts, Williams, Black Pearl, Maxis, Brøderbund, Mindscape, Houghton Mifflin, Viacom, Purple Moon and Living Books. Now part technical guru, part technology evangelist at Dolby, I assist in all facets of multi-channel sound development for licensees that are producing games using Dolby Surround and Dolby Digital technologies on CD-ROM, DVD-ROM, N64, PSX, and next-generation platforms.

AB) Give a brief history of multichannel audio and how it enhances a listener's experience compared to stereo and binaural 3d sound.

JL) Binaural 3D sound is a singular experience, where the listener is required, by design, to sit in the sweet spot to get any feeling that there are things happening behind him/her. True multichannel sound puts the listener and all of his/her friends in the same theater-like place with sounds physically behind them.

AB) Dolby has been pushing multichannel audio in games for the last several years. Roughly how many games have featured Pro Logic and Dolby Digital since 1996?

JL) There are currently over 250 games featuring Dolby Surround, and about 35 games and multimedia DVD-ROMs that feature some kind of Dolby Digital. Some are DVD-Video compatible games that use mono, stereo, or Dolby Surround. Others, like Lander (Psygnosis), feature 5.1 channel Dolby Digital.

AB) Have you noticed any measurable response from the public to multichannel audio?

JL) Multichannel audio is something that needs to be heard to be appreciated. Most people take surround sound for granted when they go to the movie theater, but when they hear the same quality surround sound out of their home systems, it blows them away.

AB) What is the most impressive example of multichannel audio you've seen yet in a game? In a sense, have you seen multichannel audio used as an integral part of the game as opposed to a fancy feature?

JL) Clearly, Lander from Psygnosis is the benchmark for in-game 5.1 mixed with sound effects in real time. Other games, like Zork, have had great sounding cut-scenes but Lander is the only game released retail that has 5.1 channel music during gameplay.

AB) Do you think Dolby Digital encoding and decoding will work its way into software, if it isn't already?

JL) For now, material is pre-encoded with Dolby Digital. If there are sounds that need to be mixed in real-time with the background audio, that's done in software post-decode as PCM streams.
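As a rough illustration of the post-decode mixing John describes, the sketch below sums a decoded background stream with a real-time sound effect as plain PCM samples. It assumes mono 16-bit samples held in Python lists; the function name mix_pcm16 and the effect_gain parameter are hypothetical and not part of any Dolby toolkit.

```python
def mix_pcm16(background, effect, effect_gain=1.0):
    """Mix two mono 16-bit PCM streams sample by sample.

    Assumption: both streams are already decoded to PCM and share
    the same sample rate. The shorter stream is padded with silence.
    """
    length = max(len(background), len(effect))
    mixed = []
    for i in range(length):
        b = background[i] if i < len(background) else 0
        e = effect[i] if i < len(effect) else 0
        sample = int(round(b + e * effect_gain))
        # Clamp to the signed 16-bit range so loud passages do not wrap around.
        mixed.append(max(-32768, min(32767, sample)))
    return mixed
```

A real mixer would do this per channel (six times over for 5.1) and usually in fixed-size buffers, but the sum-and-clamp step is the same.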

AB) Do you think multichannel audio will work its way into soundcards and other PC / console based hardware?

JL) We're already seeing a number of sound cards that can handle Dolby Digital decode, and some that can pass the decoded PCM out to be mixed with other material in software. This can be done with soft DVD players and DirectShow compatible 4-channel sound cards.

AB) Are there any multichannel panning, effect, and encoding tools available to game developers specifically? Which tools have you seen that are being used most widely and effectively?

JL) For Dolby Surround, there have been good tools around for a while, like Dolby Surround Tools for Pro Tools. Now, there are new panning systems like SmartPan Pro from Kind of Loud, that allow you to work in 5.1 in Pro Tools. There are also new encoders like Soft Encode from Sonic Foundry for the PC, and A.Pack for the Macintosh. Also on the PC is the excellent MX51 software for 5.1. This plus the new SurCode Dolby Digital encoder gives you a good all around PC solution using the Yamaha DSP factory.
<Ed. These products are professional tools that can be used for any 5.1 project but are not specifically tailored for interactive applications.>

AB) This may be a long answer, but do you see multichannel overtaking stereo as the primary delivery method for interactive games and other applications?

JL) There will always be a place for stereo, but as the game community tries more and more to emulate Hollywood, multichannel will be the norm, not the exception. With new console systems like the PSX2 and Dolphin coming out as DVD-based, you'll see more 5.1 channel and Dolby Surround enhanced titles on every platform. Look for some news on gaming for DVD-ROM computers soon. Too hot to talk about now!

Thanks John!


Section VI: Developer's Corner

A View From The Bridge…

Edited and compiled by Mark Miller

I recently completed work on a Supplement to Game Developer Magazine entitled 'Game Audio 2000, The Sound Of The Future'. In the course of my research (which consisted of dozens of email based interviews with developers and technologists), I came upon some extremely intriguing visions of the future of Interactive Audio. While these found no home in the actual Supplement, I am publishing them here as I believe that they are of great value to the community. Enjoy.

Microsoft's DirectMusic and DirectSound program manager, Brian Schmidt:

"Our holy grail: Enable developers to create content with high production quality that gives the impression that the music and sound was scored post-production, even though the content is being rendered in real-time in response to player actions.

It's a long term goal that requires combinations of clever software, changes in development procedures, availability and usability of 3rd party creativity and, in some cases, raw, unmitigated processing power. We're about 150 miles into our 1000 mile journey.

Video games sound like video games: less than CD quality, TONS of repetition in music and sound effects, music that has only a vague correlation with the visual actions. Using technologies like DirectMusic, it is possible to make video games sound more like (and eventually 'just like') other mainstream media like movies. That sounds like a plug, but "DirectMusic" is not meant to mean "current DirectMusic" but rather where we expect it and DirectSound to go and what we expect it to evolve into.

Getting more power into the "person with the ears" is critical to the Holy Grail goal listed above. The current process (composer creates content, composer gives content to programmer, etc..) is broken. Audio for games will be only as good as the weakest link in the chain. While I will concur that very good audio implementations can come from the current process, it's an extremely rare occasion where this is so. It requires a programmer of such dedication (with a good ear) to audio that they essentially become a co-audio developer. I have worked with two such developers in the approx. 150 games I've ever done."

Tommy Tallarico of Tommy Tallarico Studios on the future of interactive music:

"I think there are two types of interactive music… The easy way (streaming, which I prefer to do), and the more complex way (MIDI, which I prefer not to do). Let me quickly explain the differences…

The complex way.
I think when most people think of "Interactive Music" this is what they think of. Everything is MIDI based and depending on what is happening in the code the MIDI file is changing "on-the-fly". That could mean the muting or unmuting of tracks, volume, tempo, MIDI branching, etc. This is an amazingly complex and time consuming way to write music. I bow down to the numerous audio guys who have to go through this hell for months and sometimes years! I wouldn't want to do it!! …

The easy way.
Let's say your character is in a cave so I'm playing a two-minute looping ambient audio file. Then I find a switch which opens a secret door. I then lower my ambient looping piece and play a dramatic 5 second sting, and raise the ambient loop back up to where it was. While inside this area I come across a gazillion demons who want to eat my butt. I then switch to my 30 second looping "holy crap these demons want to eat my butt" battle tune. And why stop at music?!?! Why not have one and two minute looping ambiences such as waterfalls, streams, jungles, winds, etc. Let's see ya do that in MIDI!!! Guess what!!! All of this is Interactive Music!! And I didn't have to spend a year to get it to work correctly, and because I'm just using audio files I can have real instruments galore playing!! Oh yeah, and don't forget… Eventide effects!! Lexicon reverbs!! Mastering!!

Now I know that every game on the market (due to streaming restrictions) can't be done like this. That is why the more complex way is sometimes chosen. You have all that stuff downloaded and you never have to hit the disk! But as hard-drives become bigger, processors become faster & bandwidth becomes immense all of those worries will completely go away!!!! And I'm not talking 3 or 5 years from now!! I'm talking this Christmas!!!… Right now in the PlayStation (4 year old machine) you can have up to 16 .XA tracks all playing at once (check out PaRappa the Rapper!!) and mute and un-mute on the fly to get interactive music!! Wait until DVD hits hard and the streaming bandwidth hits the ceiling!!"
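The streaming approach Tommy describes can be sketched as a small cue manager: one loop plays at a time, a sting briefly ducks it, and a set of parallel tracks can be muted on the fly. This is only an illustration under stated assumptions. The class StreamedScore, the function audible_tracks, the cue names, and the 0.3 duck level are all hypothetical, and the code logs what a real mixer would do rather than playing audio.

```python
class StreamedScore:
    """Tracks the current looping stream and logs the mixer actions."""

    def __init__(self):
        self.current_loop = None
        self.events = []

    def play_loop(self, name):
        # Switch loops only when the game state actually changes.
        if name != self.current_loop:
            self.current_loop = name
            self.events.append(("loop", name))

    def play_sting(self, name, duck_to=0.3):
        # Lower the loop, play the one-shot sting, then restore the loop.
        self.events.append(("volume", self.current_loop, duck_to))
        self.events.append(("sting", name))
        self.events.append(("volume", self.current_loop, 1.0))


def audible_tracks(muted, track_count=16):
    """Return the parallel stream tracks that are currently un-muted."""
    return [t for t in range(track_count) if t not in muted]


# The cave scenario from the quote, expressed as game events.
score = StreamedScore()
score.play_loop("cave_ambience")
score.play_sting("secret_door")
score.play_loop("demon_battle")
```

The game code only reports state changes such as "door opened" or "battle started"; all musical content lives in the pre-mixed audio files, which is why this route is so much less work than rewriting MIDI data on the fly.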

Miles Sound System Programmer, Jeff Roberts on the future of 3D audio hardware:

"Currently, the DSPs in the sound cards are abstracted through high-level, specific-purpose APIs like EAX and A3D2. If the DSP functionality itself was exposed, all kinds of new audio technology (not just 3D audio) could be created, and we wouldn't be forced to access the chipset's capabilities via the sometimes-questionable interfaces designed by particular hardware companies with an API axe to grind. Unless and until that happens, 3D audio is as much about politics as it is about technology."

Colin Anderson of DMA Design on Physical Modeling and DirectMusic:

"Physical Modeling of sound effects has the potential to be the next big revolution in interactive audio, hopefully in the next 18 months but maybe longer. The technology is showing a lot of promise with companies like Staccato Systems already having some very impressive demos available but there are still many hurdles to be crossed before it becomes a viable alternative to the current sample based technology.

Microsoft's DirectMusic also has the potential to become very important for interactive audio in general, not just music, once they get the playback latency down to around the same level as DirectSound. There are some other system resource issues which will need to be sorted out too, but the potential is definitely there.

From the punter's point of view Physical Modeling of sounds could create the same leap in the quality of audio content as 3D rendered graphics did for visuals a few years back. DirectMusic on the other hand is less likely to make such a huge difference in the long term, but short term it could significantly improve the quality of audio heard in games by allowing the average developer much greater control over the interactivity of their audio.

Both technologies are able to produce better quality results than normal but the extra effort required on the part of the content developers to take advantage of them will be considerable too. To get the most from them developers will be forced to consider audio as an integral part of their development process right from the outset of a project, which can only be a good thing. It's a bad idea to add an engine sample in during the last few months of a game's development when you're using samples but it would just be impossible if you were using a physical modeling algorithm that needed to be fed all sorts of parameters from the game engine each frame."

Whitmoreland Productions' Guy Whitmore on the 'Downloadable Studio'

"Currently we have Downloadable Sounds (DLS). For the future, I have a 'Downloadable Studio' theory. A piece of interactive music data will be wrapped in a virtual studio consisting of DLS banks, software synthesizers (FM, analog models, subtractive, etc.), Physical Modeling synths, and DSP effect plug-ins (i.e. reverb, pitch shifting, compression, chorus, etc.), all centered around a virtual mixing environment. This 'studio' will be downloaded to the gamer/end user's machine, but will be invisible to him/her. They will simply hear richly interactive music being controlled by the game or software it's embedded within."

Andrew Barnabas of Sony Computer Entertainment's Cambridge studio describes what he expects to see in the coming 18 months:

"From a PS2 perspective, <I see> sound API's that allow realtime mixing in a 5.1 environment (but also supporting mono and stereo), ability to stream off the DVD, 5.1 mixed music, and the ability to switch smoothly or cross fade between tracks. I don't personally know why console manufacturers try to make onboard sound chips better for music, 'cos there's no point. No one chip is going to replace a $500,000 studio with 100's of megs of sample RAM etc. etc. blah blah blah. Concentrate on doing excellent digital audio reproduction (a la mp3) and giving sufficient RAM for sound effects, and the realtime DSP functionality to make those sound effects go that little bit further.

A higher quality experience (hopefully) is my simplistic goal. The current popular trend in game music is for pseudo orchestral music. This is once again knocking on the film industry's door. The music takes far longer to produce as a result, but also gives far higher quality results. Sound designers are starting to look away from library CD's and creating their own sounds by going around with a portable DAT and mixing it together in Pro Tools, again knocking on the film industry's door. It takes longer, but sounds loads better. People keep telling me that we're constantly playing catch up to the graphics guys whose target machines suddenly quadruple in power overnight. That may be true, but since the advent of CD, it leveled the playing field. DVD is the most significant change for a musician / sound designer for the last 7 years, and I personally look forward to getting my teeth into it.

As far as content consumption, higher quality audio will start to become the norm. The sophistication and expectations of the audience demand it. Jurassic Park did dino's in '93 and wow'd the world. The Lost World in '97 didn't really cover new ground, and as such didn't do so well. Episode 1 has dino's loafing around in the background. My point is, it's not new, it's not ground breaking, it's EXPECTED! Game audio may not have had such defining moments, but when CD-ROM first became available, all game musicians went 'F&%K!', and have spent the last few years playing catch-up. Most of us are now at the stage where we're quite happy with what we can produce on CD, let's move onto something else."

Sensaura's Peter Clair:

"We're going to see improved and increased support for multi-channel formats. This will be driven by the take up of DVD in the home and the increasing number of consumers who have four or more speakers connected to their PC. Changes to 3D positional audio are likely to be evolutionary (e.g. increased number of sound sources, improved audio quality, etc.) rather than revolutionary.

DSP effects such as reverb, chorus, distortion, flange etc. will increasingly be used at runtime for game effects and ambient sounds (in addition to their existing use for game music). The first step is to get these effects built into the basic sound hardware/driver. This stage is near completion and the next phase will be to create some sort of architecture where a developer will be able to create their own plug-in filter effects.

Interactive composition (e.g. using DirectMusic) will be embraced by some. But there will always be those game musicians and composers who wish to continue recording and mixing their music in a studio. For them, there is likely to be a move away from delivery of this music using Red Book CD audio to one of the compression schemes (i.e. MP3 now, likely to be supplanted by MPEG 4 Structured Audio in the near future)."

EA Seattle's Alistair Hirst provides his list of future developments:

"1) More penetration of DVD ROM drives into the mainstream market.
Those DVD disks have to be filled with something, and high quality streamed music tracks (perhaps multi-channel) is a good way to do it. The days of shipping a game with a lot of gratuitous video seem to be on the wane, so music with high production values is a good alternative. Also, sound effects will be able to stay at high sample rates and bit depths. This doesn't affect game audio creation, since source is always kept at as high a quality as possible, but it won't have to be butchered as badly for the delivery format.
2) Continuing trend of consumer machines having faster processors, more RAM, and increasing use of 3D Accelerator cards.
Faster processors, and increased use of 3D Graphic Accelerator cards will allow for more host based processing of audio on machines that don't have high end soundcards, including features like reverbs and filters.
3) Continuing evolution of DirectMusic
New skills and techniques will need to be learned to make it work effectively for games that decide to go that route. If done well, most consumers won't even realize that the music is interactive, but will find the game more compelling somehow.
4) Trend towards more Internet based games
The days of having to really watch size of samples, and the associated quality hit are back for games that will be delivered over the Internet. However, technologies like MPEG-4 will allow the sound quality to be better than the 11kHz, 8 bit sound of previous generations of PC products."
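To put the "11kHz, 8 bit" figure in context, the uncompressed data rates can be worked out directly. The sketch below assumes the older format was mono at 11,025 Hz (the quote does not give a channel count or exact rate) and compares it with Red Book CD audio at 44,100 Hz, 16 bit, stereo. The function name pcm_bytes_per_second is hypothetical.

```python
def pcm_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


# Assumed legacy game format: 11,025 Hz, 8 bit, mono.
legacy_rate = pcm_bytes_per_second(11025, 8, 1)

# Red Book CD audio: 44,100 Hz, 16 bit, stereo.
cd_rate = pcm_bytes_per_second(44100, 16, 2)

# How many times more data CD-quality audio needs per second.
ratio = cd_rate // legacy_rate
```

Under these assumptions CD-quality audio needs sixteen times the bandwidth of the legacy format, which is the gap that compression schemes are expected to close for Internet delivery.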

Activision/Raven's Chia Chin Lee:

"The bottom line is that we, as content providers, need to fully take advantage of current technologies and develop better techniques before entrenching ourselves in the thoughts of the future. For example, let's get some decent tools for 3D Audio implementation before adding more features that sound designers can't control. If we keep looking at the future without reference to the present, we will lose all focus of our potentials at this moment."

And finally, what would a view of the future of game music be without something off the wall from George Sanger A.K.A. the FatMan:

"No one person can see the future, or even the present, of this very complex field. It requires the help of the eyes, brains, and hearts of the entire community, just to get one's head around all of the marketing, artistic, hardware, software, interactivity, personality, and Internet issues. That being said, here is the certain, inevitable future of Game Music. :-)
001--The user buys the game and installs it.
002--If he is happy with the music that shipped with the game, he is done. If not,
003--He clicks on "change music."
004--He and/or his program now have the opportunity to select a trustworthy GamePlay DJ from the Internet. They choose somebody.
005--If that DJ has posted a mapping file for the game in question, GOTO 07, else,
006--Another DJ is selected until a mapping file is found
007--Any Amount of Music is now mapped from the vast number of musical works on the Web to the various gamestates of the game by the expert (GamePlay DJ) of his (the User's or the Developer's) choice.
008--The music is either streamed in realtime, or downloaded and changed out invisibly in background.
009--The musician gets paid automatically, proportionally to the number of people he has entertained, the number of minutes he entertained them, and the amount of income generated by the game.

It's that simple, and I'll bet on it. The most fitting model for Game Music is not movies. It is radio."
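The Fat Man's numbered steps read almost like a program, so here is a loose sketch of them. Everything in it is hypothetical: the function names choose_music and royalty_shares, the shape of the DJ list, and in particular the payment rule, which simply splits a pool in proportion to listeners multiplied by minutes as one possible reading of step 009.

```python
def choose_music(happy_with_shipped, djs, game):
    """Steps 002 to 007: pick the first DJ who has a mapping file for the game.

    djs is a list of (dj_name, {game_title: mapping}) pairs in the
    order the user would try them. Returns None if the user keeps the
    shipped music or no DJ has posted a mapping.
    """
    if happy_with_shipped:
        return None
    for dj_name, mappings in djs:
        if game in mappings:
            return dj_name, mappings[game]
    return None


def royalty_shares(plays, pool):
    """Step 009: split a payment pool across musicians.

    plays maps musician -> (listeners, minutes). The weighting used
    here (listeners * minutes) is an assumption, not the author's rule.
    """
    weights = {m: listeners * minutes for m, (listeners, minutes) in plays.items()}
    total = sum(weights.values())
    if total == 0:
        return {m: 0.0 for m in plays}
    return {m: pool * w / total for m, w in weights.items()}
```

For example, with two DJs where only the second has mapped the game, choose_music skips the first and returns the second DJ's mapping, mirroring the loop between steps 005 and 006.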




