End-to-End VoIP Product Comparison Testing
Overview
Summary
How We Tested
Table 1: Voice Source, Cisco-7905-to-Cisco-7905
Table 2: Voice Source, Cisco-7905-to-Cisco-7960
Table 3: Voice Source, Cisco-7960-to-Cisco-7935-ConfPhone
Table 4: Voice Source, Cisco-7960-to-Cisco-ATA186
Table 5: 1KHz Test Tone Source, Cisco 7960 to Cisco 7905
Table 6: 1KHz Test Tone Source, Cisco 7960-to-Cisco 7935-ConfPhone
Table 7 - Cisco ATA186 POTS-Adapter
Table 8 - Pingtel xpressa PX-1
Table 9 - Pingtel xpressa Software Phones
Appendix A - Notes
Overview
This report summarizes the results of comparison testing between six Voice-over-IP
products:
Pingtel Xpressa PX-1 (a "hard" phone)
Pingtel Instant Xpress (a software phone on a Dell PC)
Cisco 7960 (a full-featured VoIP phone)
Cisco 7905 (a streamlined-functionality phone)
Cisco 7935 "ConfPhone" (a conferencing phone)
Cisco ATA186 "POTS Adapter" (a device that allows a plain ordinary telephone
to be connected to the Internet for VoIP use).
All of these phones are real products being marketed today. All were tested
under a variety of network conditions that are nearly identical to those they
might encounter when operated on the open Internet. Although it is possible to
upgrade network infrastructure to provide QoS guarantees, this will not help
in most cases. In the general case of "anyone-talking-to-anyone" communications,
product performance can be affected by network disturbances. These disturbances
include jitter, packet drop, reordering, duplication, and others. The report
includes sound clips so you can hear and judge for yourself the voice quality
under representative conditions. We used our Maxwell[tm] (see
sidebar) to impose these real-world network conditions. The test methodology
is also described.
Summary:
Maxwell makes it very easy to perform side-by-side product, product-version,
regression, and interoperability comparisons[1] under
controlled and realistic network conditions.
Some VoIP phone manufacturers' documentation recommends that the user's network
be designed within certain quality-of-service conditions, such as jitter not
to exceed 30ms, so we used those as a starting point in these measurements. We
picked 25ms and 30ms. The Pingtels performed well at jitter levels
far in excess of these, as you can hear from the sound clips.
The sound clips also demonstrate how combinations of network disturbances
or impairments affect the phones. An individual impairment may not affect
voice quality on its own; in combination with other impairments, however,
voice quality is degraded. For example, at an average jitter of 25ms, we found
no audible distortion in the Cisco 7960 phones unless we also added reordering.
The sound clips are in WAV format, which most desktop computers
can play. All are digitized at 8KHz, 16-bit resolution, monophonic. For reference, CD-quality sound is 44.1KHz, 16-bit
resolution, stereo. You can listen to the recordings and judge for yourself. Reference
recordings are also included for `best case' network conditions. You will
need a PC with a sound card. It is best to listen with good headphones,
rather than typical desktop PC speakers. Actual sound quality effects are
more accurate if the sound comes from a source near your ears, just as it does
with a regular phone. Good headphones also block ambient noise, allowing
you to hear just the recording.
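To put the two audio formats side by side, the arithmetic can be sketched as
follows. This is only an illustration and assumes the clips are stored as
uncompressed PCM, which the report does not state; the function name
pcm_bit_rate is ours.

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Return the uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels


# Format of the sound clips in this report: 8 kHz, 16-bit, mono.
clip_rate = pcm_bit_rate(8_000, 16, 1)

# CD-quality reference: 44.1 kHz, 16-bit, stereo.
cd_rate = pcm_bit_rate(44_100, 16, 2)

print(clip_rate)            # 128000 bits per second
print(cd_rate)              # 1411200 bits per second
print(cd_rate / clip_rate)  # roughly 11 times more data for CD audio
```

Under that assumption, CD audio carries about eleven times as much data per
second as the clips.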
The following tables show network conditions at which the phones were
tested. Only three of the many kinds of possible network impairments were
tested. These three are:
Jitter: uniformly-distributed random amounts of delay
is added to voice data packets. Maxwell keeps track of the arrival time
and the exit time of each packet, automatically calculating an average delay
which is displayed and updated in real-time by the graphical user interface and
also shown in the tables below.
Drops: voice data packets are randomly selected to be dropped. The distribution
function is uniform. The mean is given; e.g., at 1% drop, 1 out of
a hundred packets is dropped. This number applies to both directions, which
means that the effective packet loss in each direction is about half that number
(e.g., when the drop rate was set to 3%, each direction showed a drop rate
of 1.5%). The table column for drops has been adjusted for this fact.
Reorder: the order in which packets arrive can be
changed. The higher the number, the more reordering takes place. In
real networks, packet-reordering can take place occasionally when routes are
adjusted, and consistently over tandem links (a commonplace solution when a quick
bandwidth-upgrade is needed). You can think of the reorder-number as being
the number of extra data links: e.g., reorder 0 -> one data link, reorder
1-> two tandem data links, reorder 2 -> three tandem data links, etc.
We did not duplicate, modify or corrupt packets, though Maxwell
can do those things too.
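The three impairments above can be approximated with a small simulation. This
is a rough sketch under our own assumptions, not a description of how Maxwell
works internally: delay is drawn uniformly from 0 to twice the average jitter,
reordering is modelled as each packet taking one of (reorder + 1) tandem links,
and the 20 ms packet spacing and 30 ms link skew are hypothetical values chosen
for illustration.

```python
import random


def per_direction_drop(configured_pct):
    """Per the report, the configured drop rate is shared by both directions."""
    return configured_pct / 2.0


def impair(n_packets, spacing_ms=20.0, avg_jitter_ms=0.0, drop_pct=0.0,
           reorder=0, link_skew_ms=30.0, seed=0):
    """Simulate one direction of a packet stream.

    Returns (order, avg_delay_ms): sequence numbers in arrival order and
    the mean added delay of the packets that were delivered.
    """
    rng = random.Random(seed)
    arrivals = []
    total_delay = 0.0
    for seq in range(n_packets):
        # Uniform random drop with the given mean percentage.
        if rng.random() * 100.0 < drop_pct:
            continue
        # Assumption: uniform delay on [0, 2 * avg], so the mean is avg.
        delay = rng.uniform(0.0, 2.0 * avg_jitter_ms)
        # Assumption: each tandem link adds a fixed, hypothetical skew.
        delay += rng.randint(0, reorder) * link_skew_ms
        total_delay += delay
        arrivals.append((seq * spacing_ms + delay, seq))
    arrivals.sort()
    order = [seq for _, seq in arrivals]
    avg = total_delay / len(order) if order else 0.0
    return order, avg


def count_out_of_order(order):
    """Count packets that arrive after a higher-numbered packet."""
    highest, late = -1, 0
    for seq in order:
        if seq < highest:
            late += 1
        else:
            highest = seq
    return late
```

For example, impair(5000, avg_jitter_ms=25, drop_pct=per_direction_drop(3.0))
models one direction of a run set to 25ms average jitter and 3% drop, and
count_out_of_order() on its result shows how many packets arrived late.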
The tables below provide a sound clip for each phone under each
condition, along with a text notation of the voice quality. Click on the sound
clip to hear for yourself exactly what the indicated network conditions do to
the tested equipment (i.e., "what that sounds like"). These clips were digitized
at 8KHz, 16-bit monophonic.
How We Tested
Other than the effects introduced by the Maxwell, the network was a quiet
internal 10/100 switched LAN, i.e., almost perfect.
Two kinds of audio source material were used: a snippet from a local
radio station's news reporting[2],
and a 1000 Hz test tone. Both were recorded onto CD-R media and played
using an RCA portable CD player. The headphone jack was connected via an adapter
to the RJ11 connector on the phone. The CD player's volume control was adjusted
so that with no impairments from the Maxwell, the signal was loud, clear and
undistorted.
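For readers who want a comparable test tone, a 1000 Hz sine wave can be
generated with Python's standard library. This is a minimal sketch, not the
procedure we used: the function name write_test_tone, the 0.8 amplitude and the
8 kHz default rate (chosen to match the clip format) are our assumptions. For a
CD source you would pass rate_hz=44100.

```python
import math
import struct
import wave


def write_test_tone(path, freq_hz=1000, seconds=1.0, rate_hz=8000,
                    amplitude=0.8):
    """Write a mono, 16-bit PCM sine tone to a WAV file.

    Returns the number of audio frames written.
    """
    n_frames = int(seconds * rate_hz)
    frames = bytearray()
    for n in range(n_frames):
        sample = amplitude * math.sin(2.0 * math.pi * freq_hz * n / rate_hz)
        # Scale to the signed 16-bit range and pack as little-endian.
        frames += struct.pack("<h", int(round(sample * 32767)))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)      # monophonic
        wav.setsampwidth(2)      # 16-bit samples
        wav.setframerate(rate_hz)
        wav.writeframes(bytes(frames))
    return n_frames
```

Calling write_test_tone("tone.wav", seconds=10) produces a ten-second tone
that can be burned to CD-R or played directly.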
For all measurements except the ones to the Cisco 7935 ConfPhone
(which has no handset), the receiving phone's handset cord was connected through
an RJ11 adapter to a mini-stereo plug, and fed directly into a PC sound card. The
purpose in doing so was to avoid speaker-to-microphone distortion and background
noise pickup. The recording volume control was adjusted for maximum clarity
and volume without distortion when the Maxwell was set to no impairments.
Since the Cisco 7935 ConfPhone has no handset and no way to directly
record the output signal, for these measurements, an AudioTechnica ATR20 cardioid
low-impedance microphone was suspended one inch above the Cisco 7935 ConfPhone
speaker. These measurements were taken in a separate and quiet (though
not anechoic) room, away from our lab's equipment, RF emissions, and fan noises.
For each set of tests, a reference recording was made. Listen [3] to
the reference recording to hear what "best case" sounds like.
For reference recordings, Maxwell was set to 0 ms jitter, 0% drop, no reordering. In
other words, Maxwell did not impair any of the traffic. These reference files
contain some noise picked up by the sound card and cabling, not introduced by
either the Maxwell or its effect on VoIP traffic. It is recognizable as
60-Hz "hum" and also hiss. You hear it in all samples. Network effects
on VoIP tend to be heard as gaps and dropouts, or in some cases as if the person
is gargling or talking underwater. For the test-tone measurements, instead
of a steady tone, it sounds more like you're listening to Morse Code.
Tables 1 thru 4 below show results when the audio sources are
human voices, a woman's and a man's, speaking clearly.
Tables 5 and 6 show results for a 1000-Hz sine wave test tone.
Table 7 shows the POTS Adapter.
Table 8 shows the Pingtel "hard" phone.
Table 9 shows the Pingtel "software" phone.
Table 1: Voice Source, Cisco-7905-to-Cisco-7905
Jitter (in ms) | Drop (in %) | Reorder | Recording | Comments |
0 | 0 | 0 | | Reference file |
25 | 0 | 0 | | clear |
25 | 1 | 0 | | clear |
25 | 1 | 1 | | Gargling/"underwater sound". Annoying. |
25 | 2 | 0 | | Slight distortion |
25 | 2 | 1 | | Gargling/"underwater sound". Max Headroom Sound. Annoying. |
25 | 3 | 0 | | Slight echo sound |
25 | 3 | 1 | | Gargling, echoing, extreme distortion. Very annoying, intelligible only for slowly speaking talkers |
25 | 4 | 0 | | A little bit echo-sounding |
25 | 4 | 1 | | Very distorted. Unacceptable. |
25 | 5 | 0 | | Distorted but intelligible. |
25 | 5 | 1 | | Distorted and barely intelligible. Unacceptable. |
25 | 5 | 2 | | " |
30 | 0 | 0 | | Clear |
30 | 1 | 0 | | Slightly noticeable distortion but clear enough to understand |
30 | 1 | 1 | | Very distorted |
30 | 2 | 0 | | Slightly noticeable distortion but clear enough to understand |
30 | 2 | 1 | | Gargling/"underwater sound" |
30 | 3 | 0 | | " |
30 | 3 | 1 | | Distorted |
30 | 4 | 0 | | " |
30 | 4 | 1 | | Very distorted |
30 | 5 | 0 | | Distorted |
30 | 5 | 1 | | Very distorted |
30 | 5 | 2 | | Very distorted |
Table 2: Voice Source, Cisco-7905-to-Cisco-7960
Jitter | Drop | Reorder | Recording | Comments |
0 | 0 | 0 | | Reference file |
25 | 1 | 1 | | Slightly distorted |
25 | 2 | 0 | | Very slight distortion |
25 | 2 | 1 | | Distorted but understandable |
37 | 3 | 0 | | Distorted but understandable |
37 | 3 | 1 | | Distorted but understandable |
37 | 5 | 0 | | Distorted, noise pops |
37 | 7 | 1 | | Very distorted |
Table 3: Voice Source, Cisco-7960-to-Cisco-7935-ConfPhone
Jitter | Drop | Reorder | Recording | Comments | Notes |
0 | 0 | 0 | | Reference | These measurements were taken using a microphone and thus subject to some speaker-to-microphone distortion |
30 | 0 | 0 | | clear | |
30 | 1 | 0 | | Some distortion | |
30 | 1 | 1 | | Some distortion | |
30 | 2 | 0 | | Distortion | |
30 | 2 | 1 | | Distortion | |
30 | 3 | 0 | | Distortion | |
30 | 3 | 1 | | Distortion | |
30 | 4 | 0 | | Very distorted | |
30 | 4 | 1 | | Very distorted | |
30 | 5 | 0 | | Very distorted | |
30 | 5 | 1 | | Very distorted | |
Table 4: Voice Source, Cisco-7960-to-Cisco-ATA186
Jitter | Drop | Reorder | Recording | Comments | Notes |
0 | 0 | 0 | | Reference file | |
30 | 0 | 0 | | | |
30 | 2 | 1 | | Slightly-noticeable gargling sound | |
30 | 4 | 0 | | Clear | |
30 | 4 | 1 | | Very distorted | |
Table 5: 1KHz Test Tone Source, Cisco 7960 to Cisco 7905
Jitter (in ms) | Drop (in %) | Reorder | Recording | Comments |
0 | 0 | 0 | | Reference |
25 | 0 | 0 | | clear |
25 | 1 | 0 | | clear |
25 | 1 | 1 | | Sounds like Morse Code |
25 | 2 | 0 | | Sounds like Morse Code |
25 | 3 | 0 | | Sounds like Morse Code |
25 | 3 | 1 | | Sounds like Morse Code |
25 | 4 | 0 | | Sounds like Morse Code |
25 | 4 | 1 | | Sounds like Morse Code |
25 | 5 | 0 | | Sounds like Morse Code |
25 | 5 | 1 | | Sounds like Morse Code |
25 | 6 | 0 | | Sounds like Morse Code |
25 | 6 | 1 | | Sounds like Morse Code |
25 | 7 | 1 | | Sounds like Morse Code |
25 | 7 | 2 | | Sounds like Morse Code |
Table 6: 1KHz Test Tone Source, Cisco 7960-to-Cisco 7935-ConfPhone
Jitter (in ms) | Drop (in %) | Reorder | Recording | Comments |
0 | 0 | 0 | | Reference |
25 | 0 | 0 | | Clear |
25 | 2 | 0 | | clear |
25 | 2 | 1 | | Morse code, lots of dropouts |
25 | 2 | 0 | | Morse code |
30 | 0 | 0 | | Clear |
30 | 2 | 0 | | Morse code |
30 | 2 | 1 | | Morse code |
30 | 5 | 0 | | Some dropouts can be heard |
30 | 5 | 1 | | Morse code; dropouts |
Tests started from no impairments, then increased the drop percentage at
one-percentage-point intervals. At each interval, jitter started at 0 ms (the
"set-point") and was then increased; reordering was handled the same way. This
order may be meaningful depending upon how the receiving units compensated for
these impairments.
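One possible reading of this ordering, with drop as the slowest-changing
setting and jitter and reordering each restarting from their lowest value, can
be sketched as a nested sweep. The function name sweep and the value lists in
the example are hypothetical; the actual step values used in each table differ.

```python
from itertools import product


def sweep(drops_pct, jitters_ms, reorders):
    """Yield (jitter_ms, drop_pct, reorder) with drop changing slowest."""
    for drop, jitter, reorder in product(drops_pct, jitters_ms, reorders):
        yield jitter, drop, reorder


# Example plan: drop 0..2% in one-point steps, three jitter levels,
# reorder 0 or 1. These values are illustrative only.
plan = list(sweep(range(0, 3), [0, 25, 30], [0, 1]))

for jitter, drop, reorder in plan[:4]:
    print(f"jitter={jitter}ms drop={drop}% reorder={reorder}")
```

Because a receiving phone's compensation may adapt to recent conditions,
running the same settings in a different order could plausibly give different
results.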
Table 7 - Cisco ATA186 POTS-Adapter
Jitter (in ms) | Drop (in %) | Reorder | Recording | Comments |
0 | 0 | 0 | | Reference file, no impairments |
30 | 0 | 0 | | |
30 | 2 | 1 | | |
30 | 4 | 0 | | |
30 | 4 | 1 | | |
Table 8 - Pingtel xpressa PX-1
Jitter (in ms) | Drop (in %) | Reorder | Recording | Comments[4] |
0 | 0 | 0 | | Reference, no impairments |
155 | 5 | 0 | | Clear |
155 | 5 | 1 | | Clear |
155 | 5 | 1 | | Toggles between "no impairments" and 155ms/10%/reorder1. You can hear some slight distortion in the transitions, but the compensation adapts quickly and you don't hear this in the steady-state condition. Ignore the buzz that you hear right after the phrase "six years". This is created by the CD player when it goes back to repeat the track; it is not caused by the phone. |
155 | 10 | 0 | | Some slight distortion |
155 | 12.5 | 0 | | Slight distortion |
155 | 12.5 | 1 | | Some warbling |
155 | 15 | 0 | | Some distortion |
155 | 15 | 1 | | Warbling |
200 | 5 | 0 | | Clear |
200 | 5 | 0 | | Toggles between this condition and no impairments. Some slightly noticeable distortion occurring at the transitions. |
200 | 5 | 1 | | A little warbling |
200 | 5 | 1 | | Toggles between this condition and no impairments. A little warbling and some echo at the transitions |
200 | 5 | 2 | | Clear |
200 | 5 | 2 | | Toggles between this condition and no impairments. Can hear some distortion, echo and warbling at the transitions. Very garbled in spots. |
247 | 0 | 0 | | Clear |
247 | 2.5 | 0 | | Clear, though slight warbling is audible |
247 | 2.5 | 0 | | Toggles between this condition and no impairments. Slight gargling type effect at transitions. |
247 | 2.5 | 1 | | Clear |
247 | 2.5 | 1 | | Toggles between this condition and no impairments. Can hear a little gargling effect at transitions. |
247 | 2.5 | 2 | | Definite warbling, unacceptable quality |
Table 9 - Pingtel xpressa Software Phones
Jitter (in ms) | Drop (in %) | Reorder | Recording | Comments[4] |
0 | 0 | 0 | | Reference, no impairments |
200 | 0 | 0 | | Clear |
200 | 0 | 0 | | Toggles between no impairments and 200ms jitter. Some warbling audible at the transitions. |
200 | 5 | 0 | | Clear |
200 | 5 | 0 | | Toggles between this condition and no impairments. No discernible changes at the transitions. |
200 | 10 | 0 | | Clear |
200 | 10 | 0 | | Toggles between this condition and no impairments. Some echo audible at transitions |
200 | 10 | 1 | | Very bad, unacceptable. Warbling. |
300 | 0 | 0 | | Slight warbling, but generally clear |
300 | 1 | 0 | | A little warbling |
300 | 1 | 1 | | A little warbling, some pops |
300 | 1 | 1 | | Toggles between this condition and no impairments. Noticeable warbling at transitions |
Appendix A - Notes
The Cisco 7905-to-Cisco 7960 jitter25msdrop3pctreorder1 measurement had to be
redone. The phone had dropped its connection and no audio was present in
the recorded file. Cause unknown.
When the Cisco 7935 ConfPhone is taken off HOLD (un-muting the speaker), even
with no impairments set by the Maxwell, the resulting voice quality is sporadic
for a period of time (~30s).
During most Cisco VoIP phone tests, the connection to the call set-up director
software would be lost, but audio data continued to be sent and the source audio
was still audible. Hanging up the phone in some cases did not restore this
connection. Reducing the impairment parameter "reorder" back down was not
enough to make the phones work: I could dial, but the call was not
completed when I picked up. The called phone kept ringing even after its
handset was picked up.
Why this matters
This would be a security vulnerability, at the very least in the sense of a
denial-of-service attack: by manufacturing "bad" network conditions, it would be
possible to prevent the phones from switching between calls or placing another
call. This occurs under conditions where the voice quality is bad but still
intelligible. In principle, the network could be installed such that the
phones are on a separate physical network from PCs, which as we know tend to
be vulnerable to email-based and other forms of virus. However, since the Cisco
7960s have a PC LAN connector, in practice this might not be so easy to enforce.
References:
RTP: RFC1889
Footnotes