Just recently, there was a surge in traffic to my legacy dial-up sounds pages. While I did my best with the equipment I had at the time to capture the sound of dial-up modems, I always knew that my job was only half done. As we approach the consumer POTS-line apocalypse, and ISP after ISP starts to disable its modem banks, the V.90/V.92 handshake sound is seriously under threat. Anything from V.34bis and down is still safe, since placing two modems “back to back” will allow this type of connection to occur, but since V.90/V.92 use digital modulation on the downstream, they need digital modems, which are rare, expensive and require special digital (ISDN or better) connectivity to function. There’s no way most home users can afford to have them. As a result, in this post, I finish the job I started (before it’s too late) and present the definitive collection of V.90/V.92 modem sounds at a quality never presented before.
A Two-Paragraph Introduction to Modem History
Early modems used simple FSK/PSK modulations and separate carrier frequencies for receive and transmit. When that became spectrally impractical, interim half-duplex protocols featuring “turn-around” and various non-standard optimizations started to appear (e.g. Trailblazer, USRobotics HST, Hayes Ping-Pong/Express96). Over time, data rates increased through more sophisticated trellis-coded modulation and full echo cancellation, which allowed the full voice band to be used in both directions simultaneously. This increased the complexity of the modems, which also increased their handshaking times due to the requirement for training, a process whereby the modems assess the qualities of the phone line and set optimal parameters to maximise the transmission rate within a given set of line conditions.
This image, however, depicts a V.34-type connection, as it has no high speed sound.
The High Speed Sound
This is not the correct technical term for it, but it’s a term which I’ve heard amongst my less technically proficient group of friends. They realized that by listening to the modem’s handshake, there was a particular sound towards the end of the handshake which signalled a high speed connection. They were right.
This sound is of the Digital Impairment Learning (DIL) sequence. This is only applicable for V.90/V.92 connections, as V.34-type (analog, up to 33.6k) connections do not have it. If you heard this “sound”, you would be fairly certain that you were attempting a V.90 (up to 56k) connection.
For those who want a headache, they can go and try to digest the ITU-T’s V.90 recommendation here, specifically the section about DIL (8.4.1) and the section about Signal Ja (8.3.1).
While the document is technical, I will try to explain the concept in progressively more detail. Basically, the standard dictates that the analog modem (customer side) sends a Ja sequence which tells the digital modem (ISP side) what signals to generate for the DIL segment and send back. This allows the analog modem to understand the line parameters better: namely, whether there is robbed bit signalling (i.e. where the least significant bit of every sixth 8-bit sample, or more in the case of multiple RBS links, is overwritten by the signalling equipment), or whether there are digital pads (i.e. a DSP-based scaler which reduces the maximum level on the phone line). The modem can then adapt and compensate for these impairments.
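To make the robbed bit idea a little more concrete, here is a minimal Python sketch. It is a toy model rather than anything from the V.90 recommendation: it assumes a single RBS link that forces the least significant bit of every sixth 8-bit code to a fixed value, and it assumes the receiver can compare received codes directly against a known training sequence (a real analog modem only sees analog levels and has to infer this). The function names `apply_rbs` and `affected_phases` are made up for illustration.

```python
def apply_rbs(codes, phase=0, forced_lsb=1):
    """Toy model of one robbed-bit-signalling link.

    Every sixth 8-bit code, starting at index `phase`, has its least
    significant bit overwritten with `forced_lsb`.
    """
    out = []
    for i, code in enumerate(codes):
        if i % 6 == phase:
            code = (code & 0xFE) | forced_lsb
        out.append(code)
    return out


def affected_phases(sent, received):
    """Return which of the six symbol phases show altered codes."""
    phases = set()
    for i, (s, r) in enumerate(zip(sent, received)):
        if s != r:
            phases.add(i % 6)
    return sorted(phases)


# A known training sequence with all LSBs clear, passed through one RBS link.
training = [0x40] * 12
damaged = apply_rbs(training, phase=2)
print(affected_phases(training, damaged))
```

Once the affected phase is known, a modem can simply avoid relying on the bottom bit in that position, which is the sort of compensation the DIL makes possible.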
The signal itself is rather complicated in specifications, consisting of up to 255 segments, 8 codes per segment, with the length in 6-symbol blocks determined by the associated H-value. Further to this, the specifications also define a sign-pattern and training-pattern, which are up to 128 bits long, and specify whether a symbol transmitted is a training symbol or a reference symbol, and whether the symbol has positive amplitude or negative amplitude. This sequence is repeated completely until it is aborted by the analog modem, or the sending modem times out. Further to this, throughout the DIL procedure, the analog modem is also permitted to send scrambled data (SCR), but is not required to do so.
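As a rough illustration of how those numbers translate into time, the sketch below estimates the length of one pass of a DIL. It assumes the 8000 symbols per second PCM rate of the V.90 downstream, and it treats each segment as a whole number of 6-symbol blocks. The exact mapping from H-value to block count is defined in the recommendation and is not reproduced here, so `blocks_per_segment` is a hypothetical input, as are the function names.

```python
SYMBOL_RATE = 8000   # V.90 downstream PCM symbols per second
BLOCK_SYMBOLS = 6    # segment lengths are counted in 6-symbol blocks
MAX_SEGMENTS = 255   # upper limit on segments quoted in the text


def dil_pass_symbols(blocks_per_segment):
    """Total symbols in one pass, given each segment's block count.

    `blocks_per_segment` is an assumed stand-in for whatever the
    H-values in the descriptor work out to.
    """
    if len(blocks_per_segment) > MAX_SEGMENTS:
        raise ValueError("a DIL has at most 255 segments")
    return sum(BLOCK_SYMBOLS * blocks for blocks in blocks_per_segment)


def dil_pass_seconds(blocks_per_segment):
    """Duration of one pass before the sequence repeats."""
    return dil_pass_symbols(blocks_per_segment) / SYMBOL_RATE


# Hypothetical descriptor: 100 segments of 10 blocks each.
print(dil_pass_seconds([10] * 100))
```

Since the sequence repeats until aborted, a modem that takes longer than one pass to finish learning will be heard playing the same pattern again.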
Because the DIL signal (i.e. the high speed sound) is defined by the analog modem, different modem manufacturers were at liberty to use a different DIL training pattern optimized for their algorithms. This resulted in people noticing that “my modem didn’t quite sound like that recording!”
As to why this flexibility is in the V.90 standard (and not found in earlier standards), it may have to do with the compromise nature of V.90, which came about to end the war between K56flex and X2 by producing an incompatible but similar modulation that modems from both camps could be upgraded to. This flexibility may have been necessary to appease both parties and reduce the engineering effort required to redesign their digital impairment learning algorithms, but that’s just a hunch.
In order to capture the best quality handshakes, I had to resort to the necessary evil of VoIP. While VoIP is generally not optimized for data, I was careful to configure the transmit/receive amplitudes, disable echo cancellation, reduce packetization to 10ms increments, use the G.711a codec, disable fax pass-through detection and lock the jitter buffer at its minimum value of 30ms. This made it much more suitable. I terminated my test calls through the Linksys/Sipura PAP2T ATA to a local VSP which has digital terminations I know are capable of returning the exact digital codes and permitting a V.90 connection, which is very difficult to achieve over VoIP. Additional V.92 testing was performed using an overseas VSP termination, as I did not know of any local V.92 modem banks reachable through VoIP terminations.
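For a sense of what those settings mean on the wire, here is a small back-of-the-envelope calculation in Python. It assumes plain RTP over UDP over IPv4 with no header extensions (12 + 8 + 20 bytes of headers) and ignores Ethernet framing. The function names are my own.

```python
SAMPLE_RATE_HZ = 8000        # G.711 sample rate
BYTES_PER_SAMPLE = 1         # one 8-bit a-law code per sample
HEADER_BYTES = 12 + 8 + 20   # RTP + UDP + IPv4, assuming no extensions


def payload_bytes(ptime_ms):
    """Audio bytes carried in each packet."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * ptime_ms // 1000


def packets_per_second(ptime_ms):
    return 1000 // ptime_ms


def ip_bitrate(ptime_ms):
    """One-way bit rate at the IP layer, in bits per second."""
    packet = payload_bytes(ptime_ms) + HEADER_BYTES
    return packet * packets_per_second(ptime_ms) * 8


def packets_in_jitter_buffer(buffer_ms, ptime_ms):
    return buffer_ms // ptime_ms


print(payload_bytes(10), packets_per_second(10), ip_bitrate(10))
print(packets_in_jitter_buffer(30, 10))
```

The smaller packets cost more bandwidth in headers, but each lost or late packet takes out only 10ms of signal, and the 30ms jitter buffer holds just three of them.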
The VoIP ATA was connected through an Ethernet bridge where all the packets in both directions were collected and reconstructed into call audio using Wireshark (as PCM 16-bit samples, rather than the native a-law 8-bits). This allowed me to separate the digital and analog modems into separate channels. Some echo is heard due to impedance mismatches at the analog end, which is expected behaviour.
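Wireshark did this conversion for me, but the step itself is simple enough to sketch in Python. The example below assumes you already have the raw a-law payload bytes of each direction as two byte strings (extracting them from a capture is not shown), expands each 8-bit code to a 16-bit sample using the usual G.711 a-law expansion, and writes the two directions as the left and right channels of a WAV file. The function names and the output path are hypothetical.

```python
import struct
import wave


def alaw_to_pcm16(code):
    """Expand one 8-bit G.711 a-law code to a signed 16-bit sample."""
    code ^= 0x55                      # undo the even-bit inversion
    magnitude = (code & 0x0F) << 4
    segment = (code & 0x70) >> 4
    if segment == 0:
        magnitude += 8
    else:
        magnitude = (magnitude + 0x108) << (segment - 1)
    return magnitude if code & 0x80 else -magnitude


def interleave(left, right):
    """Pack two sample lists as little-endian 16-bit stereo frames."""
    frames = bytearray()
    for l, r in zip(left, right):     # stops at the shorter stream
        frames += struct.pack("<hh", l, r)
    return bytes(frames)


def write_stereo_wav(path, analog_alaw, digital_alaw, rate=8000):
    """Write analog-side audio on the left, digital-side on the right."""
    left = [alaw_to_pcm16(b) for b in analog_alaw]
    right = [alaw_to_pcm16(b) for b in digital_alaw]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(interleave(left, right))


# Example call with placeholder data (0xD5 is a-law near-silence):
# write_stereo_wav("handshake.wav", bytes([0xD5] * 8000), bytes([0xD5] * 8000))
```

Keeping the two directions on separate channels is what makes it possible to listen to just the analog or just the digital modem afterwards.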
This method is much better than recording the audio output from the modem’s speaker jack (as I did previously) as it eliminates power supply and digital hash noise from the recordings, and allows for separation of the two ends of the call. It’s also obviously miles better than recording the audio using a microphone, as the tinny piezo buzzers on many internal modems do a pretty bad rendition of the actual call audio.
Of course, getting the drivers and getting the modems installed was not necessarily trivial especially with some of the less popular and less supported winmodems. This required computers spanning Windows 7 x64, down to Windows 98SE to ensure the functionality of the collection of modems.
This is the section you’ve all been waiting for. In this section, we will look at the actual recorded sounds. Before we begin, here are a few things to look out for (spectrograms generated with Spek):
Some modems perform the V.8bis and escape to V.8 at the beginning of the call, with some of them getting the timings wrong and mucking that up. Others don’t bother at all, and just respond only to V.8, which is possibly slightly faster. Some modems send a guard tone with the data sequences before and after the line probing signal, while others do not, resulting in a slightly brighter sound. The DIL sequences vary between the chipsets, but some have scrambled data being sent back and others do not – so keep an ear out for that as well.
These will be presented, grouped by chipset. Click on the link text for the audio. Waveform images shown, with green indicating analog modem, and blue indicating digital modem. If you would like to hear just the DILs cut-together into one audio file, here it is. It probably makes an ideal nerdy ringtone, for those who still use a ringtone.
Rockwell/Conexant was the most popular chipset, as its modems were relatively low-cost, solid performers. These could be found in various brands of external and early internal modems. The sound that these produce is arguably the most typical V.90 sound that most people remember, having a smooth crescendo as the DIL. It does not perform V.8bis.
By no means was this the case with all ACF2 modems, but my Netcomm Roadster II 56k UltraSVD (AM5690) and Roadster II USB (AM5050R3) both make this two-toned crescendo, which might easily be mistaken for the first. It also has some difficulty during the INFO sequence which causes it to be prolonged. It performs V.8bis.
These were fairly popular internal soft modems and instead use a stuttering crescendo rather than a smooth one. It seems to get the V.8bis timing wrong, and misses the first capabilities request message consistently, which may be a driver bug.
Arguably one of the best softmodems I have used, and one that had kept me online for almost half of my dial-up career. This one uses a “bipping” noise with scrambled data sent during DIL. No V.8bis negotiation takes place. This used the Netcomm IN5699_5 with the latest V8.36 driver, although many products used the 1648C chipset, the last hardware-DSP based generation.
This one was recorded using the 1646 HV90 chipset modem running (deliberately) the V5.44 driver under Windows 98SE (the lengths I go to for this). This validated the observation of DIL changes made by Richard Gamburg of Modemsite. The differences can be seen in the tone pattern and scrambled data during the DIL.
A relatively unrelated cousin of the LT winmodem, this one also went through a name change due to a series of acquisitions. This modem does not do V.8bis negotiation and does not send scrambled data during the DIL, with a DIL that sounds like an echoing blip. The Broadcom modems also sound identical.
This particular chipset was considered a premium chipset and was part of most of USRobotics’ extremely reliable range of modems, such as the Sportster Flash, Courier, Message Modem etc. I also found the same chipset in an Aztech EM6800U/A and the same line frequency diagnostic command also works (ATY11). The DIL is known affectionately as the “bong”.
This chipset was found in the Swann Speed Demon as well as the Amigo Intel Host-Accelerated Modem. This one makes a crackly buzzing noise, which on tinny piezo speakers, is hard to distinguish from the regular noise of scrambled data. The modem also does not participate in V.8bis negotiations. Not very commonly encountered.
This sample was recorded from an Acer Acermodem Surf 56, the only Motorola hardware external modem I have come across. The modem has a very particular DIL sound which I have dubbed the laserbeam and has the honour of being the shortest DIL sequence recorded.
The SM56 soft-modem, while also a Motorola chipset, differs from their hardware chipsets in the pace of the DIL. For their soft modems, the DIL sequence is slower and more prolonged. A subtle, but noteworthy difference.
A relatively uncommon, low cost soft-modem which had some good opinions from time to time. I’ve never used it, however, its DIL is distinctive with a staccato bipping that makes it sound like someone’s having a seizure or something. It was nice to hear this one as I had never heard this one prior to this investigation.
This modem was an SL2800 modem, but it sounds practically identical to every Smartlink/Modio product I have used. At one time, these were quite popular as a softmodem driver for AMR/CNR based modems in cheaper computers and laptops, and they worked with many integrated audio chipsets. Their crescendo is easily mistaken for a Rockwell/Conexant but is slower.
V.92 was the last dial-up modem standard, and brought along with it several features: quick connect, which at least in theory reduced handshaking times by remembering line performance parameters; modem-on-hold, which used call waiting to suspend and resume data connections (although this was rarely allowed by ISPs); and PCM upstream, which increased upload rates up to 48k at the cost of limiting the downstream rate to 48k as well (again, rarely used). It was accompanied by V.44, an improved data compression standard, which was commonly used to improve throughput.
In the course of testing, I also made calls overseas, terminating via a u-law gateway in the USA to their V.92 modem banks. The increased latency makes the connection both difficult, and prolonged, with the handshake signals almost artificially prolonged to lengths not often heard. However, this gave me a chance to capture the quick-connect behaviour of the V.92 capable modems that did successfully exhibit quick connect within the 10-30 test calls I made per modem.
There’s nothing too flashy about this one, and it doesn’t differ very much from the V.90 connection reported earlier, with the exception of increased latency and the ulaw/alaw conversion or mismatch. Regardless, the arrangement was sufficient to demonstrate a V.92 connection in normal mode.
This is where things get interesting. The quick connect recognizes V.92 upfront and skips the line probing sequence entirely. The DIL is still used, but in an accelerated version. This is the first time I have observed one modem using two different DIL sequences. The connect is indeed quicker by a few seconds, although its robustness may be affected.
Because of the latency, the DIL has “looped” around resulting in two-and-a-half bongs in the DIL segment. Normally, hearing four bongs indicates a time-out of the DIL and a retrain occurs, but in this case, the connection is successful as V.92 and the extra partial bong is due to latency of the “abort sending DIL” command.
Again, the line-probing sequence is skipped in the quick connect, as V.92 capabilities are quickly identified, but the DIL in this case remains the normal regular DIL as this modem only has the one DIL sequence.
In the case of the Motorola, the increased latency of the trans-pacific connection gives us an unusual glimpse into its extended DIL sequence. In the case of the USR above, its termination was merely delayed by latency, but in the Motorola, it was delayed more than just by latency revealing more of the DIL which is not normally heard.
The extended DIL is also used on the quick connect with the high latency, but the line probing phase was skipped. I suspect there may be different levels of quick connect, but achieving the fastest quick connect, especially under VoIP conditions where the impairments are different to POTS, is rather difficult.
A lot of calls were made, and some ISPs were probably annoyed. Variations on the V.90/V.92 handshake were examined, with differences in V.8bis negotiation, guard tone, DIL, scrambled data and INFO sequence length shown. It seems that certain chipsets (e.g. Rockwell/Conexant) have a habit of not sending scrambled data during DIL, whereas some others do.
With this posting, I think I am ready to farewell V.90 and V.92, and I think I won’t be too sad when the last V.90/V.92 modem banks are turned off and unplugged from the network for good. I’ll always be able to hear the comforting high-speed sound, and remember the various chipsets that were part of making internet connectivity faster and more affordable to the world.
While I wasn’t able to test every variation of V.90/V.92 chipset, the DILs recorded cover every chipset I know of (as most vendors use the same DIL for their whole family of chipsets), possibly with the exception of the IC-Plus/Topic chipset which I haven’t seen for sale or encountered personally.
I hope this post is enjoyed by those who enjoyed my legacy pages, and I hope it goes some way to helping you find your dial-up modem sound.