Since the inception of broadcasting, controlling audio level has been a difficult challenge. The designers of the ATSC DTV broadcast standard attempted to improve the consistency of audio dialog levels by implementing dialnorm. To date, that goal has yet to be accomplished. In many markets, anecdotal evidence and level measurements suggest that DTV audio levels are less consistent than NTSC levels, not more. Understanding dialnorm is key to defining the problem, resolving it, increasing consumer satisfaction, and preventing potential FCC action.
NTSC Audio Levels
Understanding dialnorm requires understanding NTSC audio levels. FCC rules produced an NTSC audio system that provides fairly consistent peak audio levels and reasonably consistent dialog levels from show to show and between broadcast stations. The unfortunate side effects of this system include severe loss of high frequencies, compression artifacts and a dramatic loss of dynamic range, especially for symphonic and theatrical content. See the detailed section: NTSC Audio Levels.
When the FCC adopted the ATSC system for DTV broadcasting, Dolby Laboratories' AC-3 digital audio was specified as the audio standard. Dolby now calls AC-3 "Dolby Digital", the same standard in common use for surround sound in theatrical release prints and home DVDs.
Dolby Digital includes a number of key technologies to eliminate the side effects of NTSC audio level controls. Digital perceptual coding eliminates the need for high-frequency limiting and dramatically increases the dynamic range. The inclusion of Dynamic Range Control (DRC) allows the consumer to choose the desired amount of compression. See a full description of DRC.
The most dramatic operational change was the requirement for dialnorm, one of the small group of items actually mandated in the FCC DTV proceeding. Dialnorm replaces FCC modulation regulations as the means to ensure consistent loudness. The precise requirement was adopted by the FCC on April 3, 1997 in A/53 Annex B 5.5 Dialogue Level: "The value of the dialnorm parameter in the AC-3 elementary bit stream shall indicate the level of average spoken dialogue within the encoded audio program."
Dolby had two choices to ensure consistent loudness with their system. One would be to define a fixed digital dialog level and the means for its measurement. If Dolby had taken this approach, dialnorm would not be needed. (This would have been similar to the approach taken by SMPTE in standardizing -20 dBFS as the "operating level" for digital audio systems. Doing so establishes VU zero as -20 dBFS and produces typical PPM peaks of about -10 dBFS for VU peaks of 0.)
Instead, Dolby recognized the difficulty in achieving consensus on a fixed dialog level and made it variable within a range from -31 dBFS to -1 dBFS, as measured during dialog with an A-weighted long-term average meter (LAeq). The system works well as long as the correct value of dialnorm for each program is delivered to every decoder. This is a minimal challenge to the DVD-mastering industry and to digital satellite and digital cable movie providers. But it is a giant hurdle for local broadcast stations.
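To make the arithmetic concrete, here is a minimal sketch of how a decoder uses dialnorm. It assumes the commonly described decoder behavior, in which dialog is normalized to a -31 dBFS reference by attenuating the program by (31 + dialnorm) dB. The function names are hypothetical and are not part of any Dolby or ATSC API.

```python
# Assumption: the decoder normalizes dialog to a -31 dBFS reference level.
DECODER_TARGET_DBFS = -31


def decoder_attenuation_db(dialnorm_db: int) -> int:
    """Attenuation (positive dB) a decoder applies for a given dialnorm."""
    if not -31 <= dialnorm_db <= -1:
        raise ValueError("dialnorm must be between -31 and -1")
    # Example: dialnorm -27 gives -27 - (-31) = 4 dB of attenuation.
    return dialnorm_db - DECODER_TARGET_DBFS


def reproduced_dialog_level(actual_dialog_dbfs: float, dialnorm_db: int) -> float:
    """Dialog level at the decoder output after dialnorm attenuation."""
    return actual_dialog_dbfs - decoder_attenuation_db(dialnorm_db)


if __name__ == "__main__":
    # Correct metadata: dialog at -27 dBFS, dialnorm -27.
    print(reproduced_dialog_level(-27.0, -27))
    # Wrong metadata: dialog at -20 dBFS, but the encoder is left at -27.
    print(reproduced_dialog_level(-20.0, -27))
```

When dialnorm matches the real dialog level, every program lands at the same output level. When it does not, the mismatch passes straight through: in the second case the program plays 7 dB louder than a correctly labeled one.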
Why Dialnorm Implementation is Difficult
"Fixed" Dialnorm - A Simpler Approach
To implement "fixed" dialnorm within a station, establish dialnorm at a single fixed "plant" level and adjust audio levels to match (instead of the other way around). This eliminates the need to carry metadata through the plant. When encoded for distribution, the dialnorm will be correct, because the mix was performed to produce a dialnorm value equal to the house standard.
Using a "plant dialnorm" within a station dramatically simplifies implementation of dialnorm. Incoming sources with proper dialnorm are normalized to the plant level, allowing them to be switched, stored, routed and intermixed without the need for associated metadata.
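Normalizing an incoming source to the plant level is a simple gain change. The sketch below is illustrative only: the function names are hypothetical, the -27 plant value is just an example, and samples are assumed to be floating-point values with 1.0 as full scale.

```python
def plant_gain_db(source_dialnorm_db: float, plant_dialnorm_db: float) -> float:
    """Gain in dB that moves a source's dialog level to the plant level."""
    return plant_dialnorm_db - source_dialnorm_db


def db_to_linear(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (gain_db / 20)


def normalize_to_plant(samples, source_dialnorm_db, plant_dialnorm_db=-27):
    """Scale samples so their dialog level matches the plant dialnorm."""
    gain = db_to_linear(plant_gain_db(source_dialnorm_db, plant_dialnorm_db))
    return [s * gain for s in samples]


if __name__ == "__main__":
    # A source labeled dialnorm -24 entering a -27 plant is turned down 3 dB.
    print(plant_gain_db(-24, -27))
    # A source labeled -31 is turned up 4 dB; peaks may then need headroom.
    print(plant_gain_db(-31, -27))
```

Once the gain is applied, the source's metadata can be discarded inside the plant, and the encoder's fixed dialnorm setting describes every source.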
The fixed dialnorm value must be chosen to match average dialog levels on library tapes. If dialog levels on archive tapes are inconsistent, compression can be applied to those sources. (This approach is advocated by Orban.)
Do not feed a DTV encoder audio from a standard NTSC broadcast limiter. Doing so applies needless and detrimental pre-emphasis limiting. It also eliminates the option for consumers to "turn down the compression".
The vast majority of DTV stations have implemented "fixed" dialnorm without knowing it, and may be using the wrong value. They do so by turning on their new DTV encoder and leaving it set to the manufacturer default. Or they change the value without realizing the implication. Dolby has suggested -27 as a default setting, and at least some encoders use this value. But there is no assurance that this is correct for any given station. (PBS measurements suggest that -27 may be a proper value for programs mastered to their specifications: reference tone at -20 dBFS, with peaks reaching but not exceeding -10 dBFS.)
At minimum, stations should borrow a dialnorm measurement tool and set their DTV encoder dialnorm parameter to the average measured value at the encoder input.
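As a rough illustration of what such a measurement involves, the sketch below computes a long-term average level and rounds it to a legal dialnorm value. It is a simplification under stated assumptions: it is unweighted (a real LAeq meter applies A-weighting), it does not isolate dialog from other program material, and it treats a sample value of 1.0 as full scale, so a full-scale sine wave reads about -3 dB. The function names are hypothetical.

```python
import math


def average_level_db(samples) -> float:
    """Unweighted long-term mean-square level in dB relative to 1.0."""
    if not samples:
        raise ValueError("no samples to measure")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def suggested_dialnorm(level_db: float) -> int:
    """Round a measured level and clamp it to the -31..-1 dialnorm range."""
    if math.isinf(level_db) or math.isnan(level_db):
        raise ValueError("cannot derive dialnorm from silence or invalid data")
    return max(-31, min(-1, round(level_db)))


if __name__ == "__main__":
    # Hypothetical test signal: one second of a 1 kHz tone at 48 kHz,
    # with peak amplitude 0.1 (mean-square level of about -23 dB).
    tone = [0.1 * math.sin(2 * math.pi * 1000 * n / 48000) for n in range(48000)]
    level = average_level_db(tone)
    print(round(level, 2), suggested_dialnorm(level))
```

In practice the measurement should cover representative dialog passages taken at the encoder input, not a test tone.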
Revised Thursday, 09-Feb-2006 17:23:38 CST - © 2000 - 2003 Local Enhancement Collaborative &