Perceptual Audio Coding: MP3 is a perceptual audio coding algorithm. Perceptual coding is a lossy compression technique, i.e. the decoded data is not an exact replica of the original audio data. Instead, the digital audio data is compressed so that, despite the high compression ratio, the decoded audio sounds exactly like the original, or as close to it as possible. This is achieved by adapting the encoding process to the characteristics of human sound perception: the parts of the audio signal that humans perceive distinctly are coded with high accuracy, the less distinctive parts are coded less accurately, and the parts that we do not hear at all are mostly discarded or replaced by quantization noise.
Parts of the MP3 Perceptual Audio Encoder:
- Perceptual model: An estimate of the actual (time and frequency dependent) masking threshold is computed by using rules known from psychoacoustics.
- Quantization and coding: The spectral components are quantized and coded with the aim of keeping the noise introduced by the quantization below the masking threshold.
- Encoding of the bitstream: A bitstream formatter is used to assemble the bitstream, which consists of the quantized and coded spectral coefficients and some side information, e.g. bit allocation information.
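The quantization step in the list above can be illustrated with a toy model. The sketch below is not the MP3 quantizer (MP3 uses a non-uniform quantizer followed by Huffman coding); it assumes a plain uniform quantizer whose noise power is modelled as step²/12, and it treats the masking threshold as a given noise power per band. The function names and the sample values are hypothetical.

```python
import math

def step_size_for_threshold(mask_power):
    """Largest uniform step whose modelled noise power (step**2 / 12)
    does not exceed the masking threshold power."""
    return math.sqrt(12.0 * mask_power)

def quantize_band(coeffs, mask_power):
    """Quantize one band; return (step, integer indices, decoded values)."""
    step = step_size_for_threshold(mask_power)
    indices = [round(c / step) for c in coeffs]
    decoded = [i * step for i in indices]
    return step, indices, decoded

if __name__ == "__main__":
    band = [0.82, -0.40, 0.13, 0.05]   # made-up spectral coefficients
    for mask_power in (1e-6, 1e-3):    # made-up masking thresholds
        step, indices, _ = quantize_band(band, mask_power)
        print(f"threshold={mask_power:g} step={step:.4f} indices={indices}")
```

A higher masking threshold permits a coarser step, which yields smaller integer indices that need fewer bits to code. This is the basic trade the encoder exploits.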
Mono and Stereo: MP3 works on both mono and stereo audio signals. A technique called joint stereo coding can be used to achieve a more efficient combined coding of the left and right channels of a stereophonic audio signal. MP3 allows both mid/side stereo coding and intensity stereo coding. The latter method is especially helpful at lower bitrates, but bears the risk of changing the sound image.
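As a rough illustration of mid/side stereo coding, the sketch below converts a left/right pair into a mid (sum) and side (difference) signal and back. The 1/√2 scaling is an assumption chosen here so that the transform preserves energy and is its own inverse; the function names are hypothetical, and a real encoder applies the transform to spectral values rather than to raw samples.

```python
import math

SQRT2 = math.sqrt(2.0)

def ms_encode(left, right):
    """Convert left/right sequences into mid (sum) and side (difference)."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Invert ms_encode and recover left/right."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right

if __name__ == "__main__":
    left = [0.50, 0.25, -0.10]
    right = [0.48, 0.27, -0.10]        # nearly identical to the left channel
    mid, side = ms_encode(left, right)
    print("mid: ", [round(v, 4) for v in mid])
    print("side:", [round(v, 4) for v in side])
```

When the two channels are highly correlated, the side signal is close to zero and can be coded with very few bits, which is where the efficiency gain comes from.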
Multi-Channel (Surround) Audio: Conventional MP3 can encode audio signals with no more than two channels. In 2004, however, Fraunhofer IIS introduced a surround extension for 5.1-channel audio, MP3 Surround (see below).
Sampling Frequencies: MP3 supports several sampling frequencies. MPEG-1 defines audio compression at 32 kHz, 44.1 kHz and 48 kHz. MPEG-2 adds the rates 16 kHz, 22.05 kHz and 24 kHz. “MPEG-2.5” is the name of a proprietary extension developed by Fraunhofer IIS. It enables MP3 to work satisfactorily at very low bitrates and introduces the additional sampling frequencies 8 kHz, 11.025 kHz and 12 kHz.
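The sampling frequencies listed above can be collected in a small lookup table. This is only a convenience sketch; the dictionary and the helper name are hypothetical and not part of any MP3 library.

```python
# Sampling frequencies in Hz, grouped by the variant that defines them.
SAMPLE_RATES_HZ = {
    "MPEG-1": (32000, 44100, 48000),
    "MPEG-2": (16000, 22050, 24000),
    "MPEG-2.5": (8000, 11025, 12000),
}

def versions_supporting(rate_hz):
    """Return the variants that define the given sampling frequency."""
    return [name for name, rates in SAMPLE_RATES_HZ.items() if rate_hz in rates]

if __name__ == "__main__":
    for rate in (44100, 22050, 11025, 96000):
        print(rate, versions_supporting(rate) or "not supported")
```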
Bitrate: MP3 is not tied to a single fixed compression ratio or bitrate. The choice of the target data rate is, within some limits, left to the implementer or the user of an MP3 encoder. The standard defines a set of bitrates between 8 kbit/s and 320 kbit/s. Furthermore, MP3 decoders must support switching bitrates from one audio frame to the next. Combined with the so-called bit reservoir technique, this allows both variable bitrate coding and constant bitrate coding at any fixed value within the limits set by the standard.
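To make the frame-by-frame view of bitrate concrete, the sketch below computes the size of a single frame and the average bitrate of a variable bitrate stream. It assumes MPEG-1 Layer III, where each frame carries 1152 samples per channel, so a frame occupies 144 × bitrate / sampling frequency bytes plus an optional padding byte. The function names are hypothetical, and the bit reservoir is not modelled.

```python
SAMPLES_PER_FRAME = 1152  # assumption: MPEG-1 Layer III

def frame_bytes(bitrate_bps, sample_rate_hz, padding=False):
    """Frame length in bytes (header included) for one MPEG-1 Layer III frame."""
    base = (SAMPLES_PER_FRAME // 8) * bitrate_bps // sample_rate_hz
    return base + (1 if padding else 0)

def average_bitrate_bps(frame_bitrates_bps):
    """Average bitrate of a stream whose frames may use different bitrates.

    Every frame covers the same duration, so the plain mean is sufficient."""
    return sum(frame_bitrates_bps) / len(frame_bitrates_bps)

if __name__ == "__main__":
    print(frame_bytes(128000, 44100))                    # unpadded frame
    print(frame_bytes(128000, 44100, padding=True))      # padded frame
    print(average_bitrate_bps([128000, 192000, 64000]))  # toy VBR stream
```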
Bitrate vs. Audio Quality: In the MP3 encoder, the bits are allocated such that the quantization noise is kept inaudible. The lower the data rate, the fewer bits can be allocated to individual signal components. Therefore, at low bitrates, quantization noise can become audible. In other words, the audio quality of MP3-encoded material scales with the bitrate. Although MP3 can be used in the range between 8 kbit/s and 320 kbit/s, it is recommended not to use data rates below 80 kbit/s for mono or 160 kbit/s for stereo audio. For very low data rate applications, MPEG-4 audio codecs are more suitable than MP3.
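The bitrates above are easier to judge when expressed as a compression ratio relative to uncompressed audio. The sketch below assumes a 16-bit PCM source sampled at 44.1 kHz (CD-style audio), which the text does not specify; the function names are hypothetical.

```python
def pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels):
    """Data rate of uncompressed PCM audio in bit/s."""
    return sample_rate_hz * bits_per_sample * channels

def compression_ratio(mp3_bitrate_bps, sample_rate_hz=44100,
                      bits_per_sample=16, channels=2):
    """Uncompressed PCM rate divided by the MP3 bitrate."""
    return pcm_bitrate_bps(sample_rate_hz, bits_per_sample, channels) / mp3_bitrate_bps

if __name__ == "__main__":
    # The recommended minimums from the text, plus the upper limit.
    print(f"stereo 160 kbit/s: {compression_ratio(160000):.2f}:1")
    print(f"mono    80 kbit/s: {compression_ratio(80000, channels=1):.2f}:1")
    print(f"stereo 320 kbit/s: {compression_ratio(320000):.2f}:1")
```

Under these assumptions, the recommended minimums for mono and stereo correspond to the same compression ratio, since halving the channel count halves both the PCM rate and the recommended bitrate.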