WINDOWS CHARACTERS AND HTML
Preventing Display Problems on Web Pages
by Jukka Korpela and Durant Imboden
Over the last decade, the World Wide Web and Microsoft Windows have grown up side by side. Today, both dominate their respective spheres: the Web as a means of publishing information on the Internet and Windows as a computer operating system for consumers and large organizations.
Market estimates suggest that 95 percent of the Web audience is running Microsoft Windows — which means it’s all too easy to forget the other 5 percent of users who view Web pages on the Macintosh, Unix workstations or other non-Windows devices. Yet with the Web audience totaling more than 130 million according to Nielsen/NetRatings, that modest “other” group represents some 6.5 million users, many of whom encounter problems when viewing Web pages that were created on Windows PCs without regard for cross-platform compatibility.
The most glaring design oversights involve the use of character codes that are found only in Microsoft Windows. The Windows character set extends the ISO Latin 1 (ISO 8859-1) character set with a number of special characters that ISO Latin 1 does not define.
WYSIWYG ISN’T WHAT OTHERS GET
Most problems with character display stem from the fact that Web authors don’t see the anomalies on their own computers. “What you see is what you get” may be true of your system, but it doesn’t necessarily apply to others.
Common examples include the em dash, trademark symbol and left and right quotation marks. A Web author who works in a Windows environment may not realize that, by using such characters, he creates problems for many users who don’t use Windows. For example, a trademark symbol that works fine in Windows may display as a blank space (or worse) on a Unix or other non-Windows system.
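One way to spot such characters before publishing is to scan the raw bytes of a page. The sketch below assumes the page was saved in the Windows-1252 encoding, where the Windows-only characters occupy byte values 0x80 through 0x9F (a range that ISO 8859-1 reserves for control codes). The function name and sample text are hypothetical, chosen only for illustration.

```python
from typing import List, Tuple


def find_windows_only_bytes(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, byte value) for every byte in the range 0x80-0x9F."""
    return [(i, b) for i, b in enumerate(data) if 0x80 <= b <= 0x9F]


# Sample text with a trademark symbol, an em dash and curly quotation marks,
# encoded the way a Windows editor of the period would typically save it.
page = "Acme\u2122 \u2014 \u201cquality\u201d".encode("cp1252")

hits = find_windows_only_bytes(page)
for offset, value in hits:
    print(f"offset {offset}: byte 0x{value:02X}")
```

Any offset reported here marks a character that may not display as intended outside Windows.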
There’s nothing wrong with the characters discussed here. They have legitimate uses, and they are included in other standard character repertoires such as Unicode and ISO 10646. The problem is that, when they are entered as Windows-specific character codes, they cannot be presented in HTML reliably enough. On the World Wide Web, one shouldn’t expect vendor-specific, system-dependent encodings to work consistently with all systems and browsers.
Let’s imagine that you’re creating an HTML document that includes the Windows trademark symbol. When you preview the document in your browser, the character will display correctly — as it will on other Windows systems. And unless you have a way to preview the document on a non-Windows computer, you’ll never know that other users will see a blank space or even have their displays messed up by a control function.
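The effect can be reproduced with a short sketch. It assumes the author's system stores the trademark symbol using the Windows-1252 encoding and the reader's system interprets the same byte as ISO 8859-1; both the helper name and the pairing of encodings are illustrative assumptions, not a description of any particular browser.

```python
import unicodedata


def as_seen_by_latin1(text: str) -> str:
    """Encode text as Windows-1252, then decode the same bytes as ISO 8859-1."""
    return text.encode("cp1252").decode("latin-1")


trademark = "\u2122"                  # TRADE MARK SIGN
raw = trademark.encode("cp1252")      # the single byte a Windows system stores
seen = as_seen_by_latin1(trademark)   # what an ISO 8859-1 system decodes

print(raw.hex())                      # byte value on the wire
print(unicodedata.category(seen))     # "Cc" means a control character
```

The byte that Windows shows as a trademark symbol falls in the control-code range of ISO 8859-1, which is why a non-Windows display may show nothing or behave unpredictably.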
Although the Windows typographical trademark symbol probably looks better than the superscripted “(TM)” generated by <SUP>(TM)</SUP>, the gain is small compared to the damage caused if your vendor-specific method doesn’t work at all.
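A substitution pass along these lines can be sketched as follows. Only the <SUP>(TM)</SUP> fallback comes from the text above; the other replacements (two hyphens for an em dash, straight quotes for curly ones) and the names FALLBACKS and to_portable are assumptions made for illustration.

```python
# Hypothetical fallback table: each Windows-specific character maps to a
# plain ASCII substitute that any system can display.
FALLBACKS = {
    "\u2122": "<SUP>(TM)</SUP>",  # trademark symbol (fallback from the article)
    "\u2014": "--",               # em dash (assumed substitute)
    "\u201c": '"',                # left double quotation mark
    "\u201d": '"',                # right double quotation mark
    "\u2018": "'",                # left single quotation mark
    "\u2019": "'",                # right single quotation mark
}


def to_portable(text: str) -> str:
    """Replace each character listed in FALLBACKS with its ASCII substitute."""
    for char, substitute in FALLBACKS.items():
        text = text.replace(char, substitute)
    return text


sample = "Acme\u2122 \u2014 \u201cquality\u201d"
portable = to_portable(sample)
print(portable)
```

The result contains only ASCII characters, so it no longer depends on which character set the reader's system uses, at the cost of some typographic polish.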
One might argue that such concerns apply to all cross-platform transfers of data. However, when data is transferred between known systems, a suitable character-code conversion program can be used. The situation is more complicated with a Web page: The characters must be readable on any system, not just on a specific target system with a known character set.
 Copyright 1999-2000 Penton Media Inc.
All Rights Reserved.