March 10, 2008

Transparent PNGs Can Deadlock IE6

http://www.georgevreilly.com/blog/content/binary/deadlock_thumb.jpg

Summary: Internet Explorer 6 does not natively support alpha transparency in PNG images. The best-known solution is to use the DirectX AlphaImageLoader CSS filter. It's less well known that using AlphaImageLoader sometimes leads to a deadlock in IE6. There are two workarounds. Either wait until after the image has been downloaded to apply the filter to the image's style, or use the little-known transparent PNG8 format instead of the filter.

The First Set of Hangs

Back in September 2007, some of my colleagues at Cozi complained that IE6 would occasionally hang while loading the homepage of our web application. I took a look at some of these IE6 processes in Process Explorer and I could see that thread 0 (the UI thread) was stuck in MsgWaitForMultipleObjects.

I attached WinDbg to these hung IE6 processes and found a callstack like this:

 0:020> .symfix+
No downstream store given, using C:\Program Files\Debugging Tools for Windows\sym
0:020> .reload
................................................................................
Loading unloaded module list
.............
0:020> ~0k
ChildEBP RetAddr 
0013bfec 7c90e9ab ntdll!KiFastSystemCallRet
0013bff0 7c8094e2 ntdll!ZwWaitForMultipleObjects+0xc
0013c08c 7e4195f9 kernel32!WaitForMultipleObjectsEx+0x12c
0013c0e8 7e4196a8 user32!RealMsgWaitForMultipleObjectsEx+0x13e
0013c104 7e211e76 user32!MsgWaitForMultipleObjects+0x1f
0013c168 7e200b6b urlmon!CTransaction::CompleteOperation+0x140
0013c1a4 7e1ef557 urlmon!CTransaction::Start+0x52c
0013c228 7e1ef1b4 urlmon!CBinding::StartBinding+0x4d8     <<<<---
0013d2c0 7e1ef07e urlmon!CUrlMon::StartBinding+0x1d8
0013d2f8 6bdd9abe urlmon!CUrlMon::BindToStorage+0x67
0013d55c 6be43a78 dxtrans!CDXTransformFactory::LoadImageWrapW+0xdc
0013e600 77135d81 dxtmsft!CDXTAlphaImageLoader::put_Src+0x1b5
0013e61c 77136390 oleaut32!DispCallFunc+0x16a
0013e6ac 6be29b0b oleaut32!CTypeInfo2::Invoke+0x234
0013e6dc 6be29c9a dxtmsft!ATL::CComTypeInfoHolder::Invoke+0x45
0013e724 6be16448 dxtmsft!ATL::CComDispatchDriver::PutProperty+0x78
0013e75c 6be210eb dxtmsft!CComPropBase::IPersistPropertyBag_Load+0x9b
0013e770 6bde9513 dxtmsft!ATL::IPersistPropertyBagImpl<CDXTAlphaImageLoader>::Load+0x2a
0013e788 6bde51a3 dxtrans!CDXTFilter::Load+0x1f
0013e834 6bdea876 dxtrans!CDXTFilterBehavior::AddFilterFromBSTR+0x424
0:020> du (0013c228 + 48)
0013c270  "http://cozicentral.cozi.com/imag"
0013c2b0  "es/clock/clock_2.png"

(0013c228 is the ChildEBP address beside urlmon!CBinding::StartBinding+0x4d8. I was eventually to learn that du (<StartBinding's ChildEBP> + 48) would give me the URL that was being requested. It took probing with dds to figure this out. Spending 10 years in the Windows Division at Microsoft left me with advanced skills in debugging weird failures.)

http://www.georgevreilly.com/blog/content/binary/transparent-png/clock-digits.png

This happened intermittently, but it was more likely when the browser's cache had been flushed. The exact URL would vary, but it was always one of the PNG images used to draw the digital clock. We had a lot of transparent PNGs, but it always seemed to happen to the clock digits, which are loaded on demand. This image was always available: it was not a server problem. The hang would eventually time out, after 20 or 30 minutes. Until it did, you could do nothing with IE. It wouldn't repaint. You couldn't close the app. All you could do was kill it in Task Manager. When the UrlMon call finally did time out, the requested image would not have shown up, but otherwise the browser was fine. This happened only with IE6, never with IE7 or Firefox. This was very strange, and we really didn't know what to make of it.

It recurred sporadically, and not having any better ideas, we decided to replace the PNGs for the clock digits with transparent GIFs in the IE6-specific CSS. The PNGs use alpha transparency, to get antialiasing and the translucent reflection below the baseline. We were using filter:progid:DXImageTransform.Microsoft.AlphaImageLoader in IE6-specific CSS to get alpha transparency. Unlike modern browsers, IE6 does not support alpha transparency. Fully transparent pixels are rendered as an opaque blueish-gray. Hence, our use of the well-known AlphaImageLoader hack.

We couldn't bake the background into the clock digit images: we have cobranded editions with different backgrounds, so the background underlying the digits can vary with both cobrand and window size. GIFs may be transparent, but they have no notion of alpha transparency, so we had to remove the translucent reflection in the GIFs. Not as pretty, but acceptable.

Hangs, Redux

The problem went away for several weeks, but in late October, it came back with a vengeance, recurring multiple times a day. Some machines seemed immune, some could repro it with almost 100% success. There was no obvious commonality; it reproed on several builds of IE6. It had to be fixed: we could not ship a web application that sometimes hung IE6 hard.

I formed the hypothesis that IE6's limit of two simultaneous connections per server was contributing to the problem, since it looked like DirectX was bypassing WinInet and using UrlMon to fetch the image. At the time, we were making a lot of Ajax requests to fetch data for different portions of the homepage, and I suspected that this was somehow tying up WinInet.

Sure enough, raising IE's connection limit made the problem go away -- as far as we could tell. This wasn't an acceptable workaround. We could hardly tell our users, "By the way, our app tends to hang on Internet Explorer 6. Please make the following changes in your registry."

Commenting out the AlphaImageLoader filter from the CSS also made the problem go away, but the results looked horrible.

We weren't enthusiastic about converting all of our transparent PNGs to GIFs. Our design relies heavily on alpha blending and many of the softly blended border images, for example, would look bad without alpha transparency. GIFs are limited to a 256-color palette and a few of our images are very colorful. Moreover, there are a lot of images and maintaining two sets would be a pain. At that point, we were still trying to make our web application look just as good on IE6 as in IE7 and Firefox.

I had, of course, Googled extensively to see if anyone else knew what was going on. I didn't find anything that was of any use. I'm amazed that so few people seem to have encountered this. I don't have a really simple repro case, but what we were doing does not seem especially odd for a Web 2.0 application.

Enter Microsoft

So we contacted Microsoft.

Several of us are Microsoft veterans, so initially we made direct contact with the Internet Explorer team. They were able to reproduce the problem intermittently and asked us to open an official support incident.

Fairly quickly, they came up with a workaround of loading the images first and then applying the filter. This ensures that the image is in the browser's cache:

 <img src="…" onload="iePNGLoader.loadThis(this);" alt="…" width="…" height="…" />

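var iePNGLoader =
{
   // Major IE version parsed from the user agent, or 0 if this is not IE.
   // (parseInt(navigator.appVersion) yields 4 on IE6 and IE7 alike, so it
   // cannot tell them apart.)
   ieVersion: function()
   {
     var m = /MSIE (\d+)/.exec(navigator.userAgent);
     return m ? parseInt(m[1], 10) : 0;
   },

   // Called from the IMG's onload, so the PNG has already been downloaded.
   loadThis: function(img)
   {
     var version = this.ieVersion();
     if (version > 0 && version <= 6)
     {
       var pSrc = img.src;
       img.onload = null;        // changing src below must not re-enter this handler
       img.src = "/…/shim.gif";  // transparent 1x1 placeholder
       img.style.filter = "progid:DXImageTransform.Microsoft.AlphaImageLoader(" +
               "enabled=true, src='" + pSrc + "')";
     }
   }
};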

Unfortunately, that's not adequate for our web application. We make heavy use of background-images in CSS, and onload events fire only for IMG tags. Also, I could never get img.onload events to fire after window.onload, and we insert a number of images into the DOM as a result of Ajax calls.

We needed a JavaScript workaround, because even if Microsoft were to hotfix the bug, there's no way that we could ensure that all of our users installed it.

To make a long story slightly shorter, I spent a lot of time in November, with help from the IE CPR engineer, trying to preload images for IE6. To be completely safe, it was necessary to preload all of our images, even though most of them would never be used. I ended up loading them into 1x1 divs with a negative z-index, hidden behind our toolbar. When window.onload fired, I walked through all the transparent PNGs, setting their style.backgroundImage properties (or src, if node.tagName == 'IMG') to a transparent 1x1 GIF (shim.gif, above), and applying the AlphaImageLoader filter. Inserting all of these images into the DOM added a very noticeable overhead, but we were prepared to live with it, if it gave us a reliable workaround.

Satzansatz has a comprehensive list of the shortcomings of AlphaImageLoader. For example, we use the background-repeat attribute to vertically or horizontally tile border images, allowing our bordered regions to resize themselves to the page's dimensions. Filtered images do not tile. At best, you can stretch them with sizingMethod='scale'. This happened to look okay with our border tiles, but it certainly wouldn't in general.

Anyway, preloading worked flawlessly on our intranet. Then we deployed to a test server in the datacenter, and it all fell apart. Apparently, the increased network latency exposed a bug in my preloading implementation. It turns out that if you dynamically insert images into the DOM, they are loaded after window.onload fires -- even if the script itself executes before window.onload. (Statically declared images are loaded before window.onload fires, of course.) Hence, I was applying the filter before the image had necessarily loaded.
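The ordering problem suggests tying the filter to each image's own load callback rather than to the page-level load event. What follows is only a minimal sketch of that ordering. It is not the code from our application and it has not been verified against the IE6 hang. The names alphaFilterValue, applyFilterWhenLoaded, and createImage are hypothetical, as are the paths in the usage comment. The createImage parameter exists only so that the ordering can be exercised outside a browser.

```javascript
// Builds the CSS filter value for AlphaImageLoader.
// sizingMethod may be 'crop', 'image', or 'scale'; this helper falls back
// to 'image' when none is given.
function alphaFilterValue(src, sizingMethod) {
  return "progid:DXImageTransform.Microsoft.AlphaImageLoader(" +
         "enabled=true, sizingMethod='" + (sizingMethod || "image") +
         "', src='" + src + "')";
}

// Preloads pngSrc and touches node only inside the preloader's own onload
// callback, i.e. once the PNG should already be in the browser cache.
// createImage is an assumption made for testability; in a page it can be
// omitted and new Image() is used.
function applyFilterWhenLoaded(node, pngSrc, shimSrc, createImage) {
  var preloader = createImage ? createImage() : new Image();
  preloader.onload = function () {
    preloader.onload = null;
    if (node.tagName === "IMG") {
      node.src = shimSrc;                                  // swap in the 1x1 GIF
    } else {
      node.style.backgroundImage = "url(" + shimSrc + ")";
    }
    node.style.filter = alphaFilterValue(pngSrc, "scale");
  };
  preloader.src = pngSrc;   // assign src only after onload is attached
  return preloader;
}

// Hypothetical usage in a page:
// applyFilterWhenLoaded(document.getElementById("border-top"),
//                       "/images/border-top.png", "/images/shim.gif");
```

The key design point is that nothing sets style.filter until the load callback has run, so the order no longer depends on when window.onload happens to fire.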

PNG8 to the Rescue

We were on the verge of switching over to GIF images, when one of my colleagues pointed out the PNG8 format to us, which we had never heard of before.

The SitePoint article on PNG8 contains a good explanation and examples. Briefly, however, typical PNG images contain arbitrary RGBA values, while PNG8 images contain an indexed palette of at most 256 colors, each of which can have 8-bit Red, Green, Blue, and Alpha values. This is similar to GIF's indexed palette, but PNG8 allows 256 degrees of opacity in each palette color, while GIF's transparency is all or nothing.
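To check whether a given file really is an indexed-color PNG, it is enough to read two bytes of its IHDR header. The sketch below assumes the file's bytes are already in an array-like object (a Node.js Buffer, a Uint8Array, or a plain array); readPngHeader and isIndexedPng are hypothetical names, and the file name in the usage comment is invented.

```javascript
// PNG layout: 8-byte signature, then the IHDR chunk: 4-byte length,
// 4-byte type "IHDR", width (4), height (4), bit depth (1), color type (1).
var PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
var IHDR_TYPE = [0x49, 0x48, 0x44, 0x52];   // "IHDR"

function readPngHeader(bytes) {
  var i;
  if (bytes.length < 26) {
    throw new Error("too short to be a PNG");
  }
  for (i = 0; i < PNG_SIGNATURE.length; i++) {
    if (bytes[i] !== PNG_SIGNATURE[i]) {
      throw new Error("not a PNG file");
    }
  }
  for (i = 0; i < IHDR_TYPE.length; i++) {
    if (bytes[12 + i] !== IHDR_TYPE[i]) {
      throw new Error("first chunk is not IHDR");
    }
  }
  return { bitDepth: bytes[24], colorType: bytes[25] };
}

// Color type 3 means indexed color (a palette), which is what PNG8 uses.
// Color type 6 is truecolor with alpha, i.e. PNG32 at bit depth 8.
function isIndexedPng(bytes) {
  return readPngHeader(bytes).colorType === 3;
}

// Hypothetical usage under Node.js:
// isIndexedPng(require("fs").readFileSync("logo.png"));
```

This only inspects the header. The per-entry alpha values of an indexed PNG live in a separate tRNS chunk, which this sketch does not parse.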

IE6, it turns out, does support transparency in PNG8 images, but it's a binary transparency, just like GIFs.

Here, then, are three different logos, each rendered three different ways.

Here are the logos as PNG32s in IE6, without the AlphaImageLoader filter. Transparent pixels are opaque. Evidently unacceptable.

http://www.georgevreilly.com/blog/content/binary/transparent-png/ie6-png32.png

Here are the PNG8s in Firefox. Note the alpha transparency around the Cozi logo on the left. Look at the smooth edges in the logos. You can see the subtle alpha blending around the bordered box above the left Cozi logo and Mr Potato Head.

http://www.georgevreilly.com/blog/content/binary/transparent-png/ff-png.png

And here are the PNG8s in IE6. All alpha transparency is rendered as complete transparency. Note the jaggies in the logos, and the unsubtle borders around the box.

http://www.georgevreilly.com/blog/content/binary/transparent-png/ie6-png8.png

Clearly, PNG8 gives visually inferior results on IE6, but it looks great on modern browsers.
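The jaggies follow directly from how the alpha value of each palette entry is used. The following is a rough model of the two behaviors, based only on what the screenshots show (any alpha below fully opaque ends up completely transparent in IE6), not on knowledge of IE6's internals. The function names and the sample pixel are invented for illustration.

```javascript
// Standard alpha blend of one 8-bit channel over a background channel.
function blendChannel(fg, bg, alpha) {
  return Math.round((fg * alpha + bg * (255 - alpha)) / 255);
}

// Model of IE6 with PNG8: anything not fully opaque becomes fully transparent.
function ie6EffectiveAlpha(alpha) {
  return alpha === 255 ? 255 : 0;
}

// Composites a palette entry {r, g, b, a} over an opaque background {r, g, b}.
// alphaModel is optional; pass ie6EffectiveAlpha to mimic IE6.
function compositePixel(entry, background, alphaModel) {
  var a = alphaModel ? alphaModel(entry.a) : entry.a;
  return {
    r: blendChannel(entry.r, background.r, a),
    g: blendChannel(entry.g, background.g, a),
    b: blendChannel(entry.b, background.b, a)
  };
}

// An antialiased edge pixel: black at roughly 50% opacity, over white.
var edge = { r: 0, g: 0, b: 0, a: 128 };
var white = { r: 255, g: 255, b: 255 };
var modern = compositePixel(edge, white);                    // a mid gray
var ie6 = compositePixel(edge, white, ie6EffectiveAlpha);    // plain white
```

In the modern case the edge pixel becomes an intermediate gray, which is what smooths the outline. In the IE6 model it simply disappears into the background, so the outline falls back to the hard boundary of the fully opaque pixels.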

Kevin Freitas also covers the PNG8 technique. The comments in his article indicate that both the GIMP and Photoshop are capable of producing indexed PNG images.

The Aftermath

Googling again in early 2008, I did find an explanation from mid-2006. (I don't know why I never found this before. My google-fu is excellent.)

   
   

    To understand the problem, here is an explanation from the IE team (thanks Peter Gurevich!):

  • Each IE 6 window is a UI thread.

  • The HTML page and any script you wrote for the page run in the UI thread. Therefore filters in your page or in script will download on the UI thread.

  • IE’s implementation of the AlphaImageLoader filter downloads images synchronously.

  • Synchronous loading of an image or successive images on the UI thread has the potential to hang the browser and adversely affect the user experience.

They recommend using transparent GIFs as a workaround. We recommend using PNG8s.

I don't know why the AlphaImageLoader deadlock hasn't been a problem for more people. If they're out there, they haven't been vocal. The AlphaImageLoader hack is well known. The IE CPR engineer knew of only one other case like ours.

We have since decided that we will no longer make heroic efforts to get our web application looking just as good on IE6 as it does in modern browsers. Quite apart from the extraordinary amount of pain we endured in the Affair of the Transparent PNGs, fighting with IE6 has been a huge timesink for us, especially when it comes to working around its CSS bugs.

A significant fraction of our users are still using IE6, so we have no choice but to support it. However, we no longer aim to achieve full parity.

Alas, the recent announcement that Microsoft will be making IE7 an auto-update turns out not to be a panacea. IE7 will replace IE6 on intranets, at the discretion of IT organizations, but not on the Internet at large.

Acknowledgements. Despite my well-founded distaste for IE6, I'd like to thank the IE team members who helped us out. The amazing deadlock photo at the top of this article is taken from Ned Batchelder, who apparently got it from Reddit.

Comments

I talked about this problem here on my blog[1] back in May 2007 and gave a similar "creative" work around. The biggest difference is, my images aren't static and must be loaded dynamically.

Cheers,
Drew

[1] http://blog.hackedbrain.com/archive/2007/05/21/6110.aspx

A great read. I've never liked the built-in complexity of the PNG32 hacks but I never realized how much was going on under the IE surface. If you guys couldn't adequately solve those issues, I think everyone should be scared.

Glad that PNG8 worked out to be a good fill-in.
