As IT professionals try to reduce the cost of operating their Web sites, they should consider reducing the amount of bandwidth usage. Current compression technologies can do this; however, their implementation is limited by current bugs in both the browser and the server. Learn how to successfully compress your HTML output and save money on your monthly bandwidth.
Benefits
Depending on the file type, compression can shrink a file by up to 90%, and the overall average bandwidth saving is a factor of two to three. This means that if you are paying $300 per megabyte per second for your bandwidth with a monthly bill of $1200, a threefold saving reduces your bill to $400 per month. Interestingly enough, Internet Service Providers can provide the same service while reducing their overhead by using HTTP compression.
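The arithmetic behind that estimate can be sketched in a few lines. This is only an illustration: the function name reduced_bill is hypothetical, and it assumes the monthly bill scales in direct proportion to the bandwidth used, which may not match every provider's pricing.

```python
def reduced_bill(current_bill, saving_factor):
    """Estimate the monthly bill after compression.

    Assumption: the bill is directly proportional to bandwidth used.
    """
    if saving_factor <= 0:
        raise ValueError("saving_factor must be positive")
    return current_bill / saving_factor


current_bill = 1200.0  # dollars per month, taken from the example in the text

# The text quotes an average saving of two to three times.
for factor in (2.0, 3.0):
    estimate = reduced_bill(current_bill, factor)
    print(f"{factor:.0f}x saving: ${estimate:.2f} per month")
```

At the lower end of the quoted range (a factor of two), the same bill would drop to $600 rather than $400.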
How It Works
Internet Explorer 4.0 and above and Netscape 3.0 and above have the ability to decompress responses from Web servers that send a compressed response. This decompression feature is built directly into the browser, doesn't require a plug-in, and is enabled by default. Browsers with this feature signal the Web server by sending a request header called "Accept-Encoding:". Internet Explorer sends the header as "Accept-Encoding: gzip, deflate" and Netscape sends the header as "Accept-Encoding: deflate". Each header lists the compression formats that the browser can decompress.
Deflate is a data format that represents the compressed page; it is not a compression technique. Different compression techniques can create different outputs; however, they usually produce those outputs in the deflate format. GZip is a format for wrapping the compressed page for transmission. It includes a ten-byte header, followed by the compressed bytes in the deflate format, followed by a checksum and the original file size. Interestingly enough, the first two bytes of any gzip file are the same: 0x1f followed by 0x8b. The third byte identifies the compression method (0x08 is deflate).
Figure 1 : GZip Format
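The layout described above can be checked with a short Python sketch using the standard gzip, zlib, and struct modules. The sample page is made up for illustration, and the sketch relies on gzip.compress writing the plain ten-byte header with no optional fields, which is its behavior when no file name is involved.

```python
import gzip
import struct
import zlib

# Hypothetical sample page, used only for illustration.
page = b"<html><body>" + b"<p>Hello, compression!</p>" * 50 + b"</body></html>"
compressed = gzip.compress(page)

# Header: two fixed identification bytes, then the compression method.
id1, id2, method = compressed[0], compressed[1], compressed[2]
print(hex(id1), hex(id2), hex(method))

# Trailer: CRC-32 of the original data and the original size,
# both stored as little-endian 32-bit integers.
crc32, original_size = struct.unpack("<II", compressed[-8:])
print(crc32 == zlib.crc32(page), original_size == len(page))

# Everything between the ten-byte header and the eight-byte trailer
# is a raw deflate stream (wbits=-15 tells zlib to expect no wrapper).
body = zlib.decompress(compressed[10:-8], wbits=-15)
print(body == page)
```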
When the Web server handles a request with a request header of "Accept-Encoding:", it has a choice: send a compressed page or send a standard non-compressed page. If the Web server decides to send a compressed page, it responds with a response header called "Content-Encoding:" with the type of encoding as the value, for example "Content-Encoding: gzip". The browser uses this header to determine whether the content is encoded and decompresses it before rendering it to the user.
Figure 2 : Compressed HTTP Request/Response
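The server's side of this exchange can be sketched as follows. This is a simplified illustration, not how IIS implements it: the function names choose_encoding and build_response are hypothetical, quality values such as "gzip;q=0" are ignored, and the response header uses the standard name Content-Encoding. For "deflate" the sketch uses zlib.compress, which emits the zlib-wrapped stream that the HTTP specification defines for that token.

```python
import gzip
import zlib


def choose_encoding(accept_encoding):
    """Pick an encoding from an Accept-Encoding header value, or None."""
    if not accept_encoding:
        return None
    # Split "gzip, deflate" into tokens and drop any ";q=..." parameters.
    offered = {
        token.split(";")[0].strip().lower()
        for token in accept_encoding.split(",")
    }
    for candidate in ("gzip", "deflate"):
        if candidate in offered:
            return candidate
    return None


def build_response(accept_encoding, body):
    """Return (headers, payload) for the given request header and page body."""
    encoding = choose_encoding(accept_encoding)
    if encoding == "gzip":
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    if encoding == "deflate":
        return {"Content-Encoding": "deflate"}, zlib.compress(body)
    # No supported encoding was offered: send the page unchanged.
    return {}, body


page = b"<html><body>" + b"<p>row</p>" * 100 + b"</body></html>"
headers, payload = build_response("gzip, deflate", page)
print(headers, len(page), len(payload))
```

A browser that sends no "Accept-Encoding:" header gets the original bytes back with no encoding header, so older clients keep working.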
Web Server Load
In order for the server to respond with a compressed page, the page needs to be compressed. Compression happens in two distinct ways. For static requests, the page can be compressed ahead of time and served to multiple requests. For dynamic pages, like Active Server Pages or Cold Fusion pages, the response has to be compressed on every request, since the output is different for every request.
Since compression takes time, roughly 100 to 1000 milliseconds for every page depending on size and compression quality, compressing static pages ahead of time and serving them to multiple requests saves CPU cycles and makes for faster responses. Compressing dynamic pages is harder on the server, since every request needs compression and none of it can be done ahead of time.
Different types of files compress differently. Files that are already compressed or that contain a random set of bits do not compress well, while files that contain a lot of white space or text compress very well.
For example, JPGs and GIFs are already compressed, so compressing those files does not save any space. However, HTML can be compressed by up to 90%. You can't compress PDF files, because the Adobe Acrobat reader cannot handle compressed files; see "The Bad News" below.
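The difference between file types is easy to see in a small experiment. The sketch below is only an illustration: the function name saving is hypothetical, the HTML sample is artificially repetitive (so it compresses better than a typical page would), and random bytes stand in for data that is already compressed, such as a JPG or GIF.

```python
import os
import zlib


def saving(data):
    """Fraction of the original size removed by compression."""
    return 1 - len(zlib.compress(data)) / len(data)


# Repetitive markup with plenty of white space, standing in for HTML.
html = b"<table>" + b"<tr><td class='cell'>   value   </td></tr>\n" * 500 + b"</table>"

# Random bytes, standing in for an already compressed image.
image_like = os.urandom(len(html))

print(f"HTML-like data saved:  {saving(html):.1%}")
print(f"Image-like data saved: {saving(image_like):.1%}")
```

The random data typically comes out slightly larger than it went in, because the compressed format adds a few bytes of overhead and finds nothing to remove.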
IIS 5.0 allows you to compress static pages, dynamic pages, or both. The compression option is turned off by default. To turn it on, follow these instructions:
- Open the IIS Manager.
- Navigate to the computer node.
- Right-click and choose Properties.
- Click the Edit button for Master Properties.
- Choose the Service tab.
- At the bottom there is an HTTP Compression group.
- Check "Compress Application Files" to have Active Server Pages or Cold Fusion pages compressed.
- Check "Compress Static Files" to have your HTML files compressed.
Figure 3 : Setting Up IIS Compression
IIS 5.0 implements its compression via an ISAPI filter, C:\WINNT\System32\inetsrv\compfilt.dll, that is installed at the computer level. This means compression is activated for all of the Web sites that the machine is hosting. If the HTTP Compression group in this tab is grayed out, you might be missing the compression filter. To reinstall it see: Manual Installation of Compression Filter Fails (Q223176).
By default, IIS 5.0 Compression will only compress files with an extension of "htm," "html," and "txt," since these files have the most potential for compression. However, you can tell IIS to compress requests with other extensions.
Here is how:
- Open a Command Prompt session.
- Change the directory to your \InetPub\AdminScripts folder.
- Enter the following commands:
CSCRIPT.EXE ADSUTIL.VBS SET W3Svc/Filters/Compression/GZIP/HcFileExtensions "htm" "html" "txt" "doc" "ppt" "xls"
CSCRIPT.EXE ADSUTIL.VBS SET W3Svc/Filters/Compression/DEFLATE/HcFileExtensions "htm" "html" "txt" "doc" "ppt" "xls"
- Close the Command Prompt session.
For more information read: How to Specify Additional Document Types for HTTP Compression (Q234497).
When you turn on static compression, the IIS filter writes all the compressed files to a temporary location, which by default is C:\WINNT\IIS Temporary Compressed Files. These files are reused until the original file changes. This prevents static files from having to be recompressed for every request and saves CPU cycles. However, the default option is to use unlimited space on your C: drive. This is not recommended, since it can run your drive out of space. Make sure to change this setting to something reasonable for your system.
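The idea of reusing a compressed copy until the original file changes can be sketched as below. This is not the IIS implementation: the class name StaticCompressionCache and its details are hypothetical, a change is detected by comparing the file's modification time and size, and the sketch does nothing to cap disk usage.

```python
import gzip
import os


class StaticCompressionCache:
    """Hypothetical on-disk cache of gzip-compressed static files."""

    def __init__(self, cache_dir):
        self.cache_dir = cache_dir
        self.signatures = {}     # source path -> (mtime_ns, size) when compressed
        self.compress_count = 0  # how many times a file was actually compressed

    def get(self, source_path):
        """Return the compressed bytes, recompressing only if the file changed."""
        info = os.stat(source_path)
        signature = (info.st_mtime_ns, info.st_size)
        cached_path = os.path.join(
            self.cache_dir, os.path.basename(source_path) + ".gz"
        )
        stale = self.signatures.get(source_path) != signature
        if stale or not os.path.exists(cached_path):
            with open(source_path, "rb") as src:
                data = gzip.compress(src.read())
            with open(cached_path, "wb") as dst:
                dst.write(data)
            self.signatures[source_path] = signature
            self.compress_count += 1
        with open(cached_path, "rb") as cached:
            return cached.read()
```

Calling get twice for an unchanged file compresses it only once; the second call reads the stored copy. A production cache would also need a size limit and eviction policy, which is exactly the setting the paragraph above warns about.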
The Bad News
Both Internet Explorer 5.5 and Internet Explorer 6.0 have a bug with decompression that affects some users. This bug is documented in two Microsoft Knowledge Base articles: Q312496 for IE 6.0 (not yet published) and Q313712 for IE 5.5. Basically, Internet Explorer doesn't decompress the response before it sends it to plug-ins like Adobe Photoshop. This means that Adobe Photoshop sometimes crashes Internet Explorer if it gets a compressed page. Microsoft is offering a fix for this problem; to get it, you need to call Microsoft product support and reference Q313712.
Even though Microsoft has documented the problem and issued a fix, this is of very little help, since millions of people are running browsers with this problem and are unwilling to install the fix. It would really be nice if Microsoft issued a fix for the server side.
The Good News
Though this is a bug in the browser, there is a server-side solution that works around the browser problem. If the server passes the HTTP response headers in a different way, the browser plug-ins will not get the compressed data, and the browser will decompress and render the pages successfully. However, there is currently no fix for the IIS 5.0 compression filter.
There are other compression filters, made by third parties, that perform HTTP compression. These filters have more features and work around the IE browser problem. XCache http://www.xcache.com is one of these. XCache can be applied to single or multiple Web sites, and it caches the static files in RAM as well as on disk without the worry of running out of disk space. With XCache you can choose which files and which file types you want to compress.
Deflate Compression Format: ftp://ftp.isi.edu/in-notes/rfc1951.txt
GZip File Format: ftp://ftp.isi.edu/in-notes/rfc1952.txt
Using HTTP Compression On Your IIS 5.0 Web Site: http://www.microsoft.com/technet/treeview/default.asp?url=/TechNet/prodtechnol/iis/maintain/featusability/httpcomp.asp
HTTP Compression Speeds up the Web: http://www.webreference.com/internet/software/servers/http/compression/3.html
About the Author
Wayne Berry is the architect of XCache Technologies' XCache and XTune. He is a former Microsoft design engineer, founder of 15 Seconds, and one of the top Active Server Page developers in the country. Berry's expertise includes software design, performance consulting, development, marketing, and online business. Berry served as a software development engineer at Microsoft and as editor of 15 Seconds prior to founding XCache Technologies. A popular speaker, Berry has been invited to speak to international ASP developers' conferences, BackOffice Conference, and Internet World. He has authored several technical books, including ActiveX Programming Unleashed, Windows NT Registry Guide, and Special Edition Using Microsoft Internet Information Server 4.0, as well as many articles in print and online trade publications. Berry holds a B.S. in computer science from Western Washington University. Email him at email@example.com.