The Importance of Implementing APIC-Based Interrupt Subsystems on Uniprocessor PCs

Updated: January 7, 2003
On This Page
Introduction
Technical Differences
Why Interrupts Should Not Be Shared
Historical Factors
Industry Development Issues
Call To Action
Other Resources

It is widely understood in the PC industry that for multiprocessor systems, 8259 Programmable Interrupt Controllers (PICs) are insufficient and Advanced Programmable Interrupt Controllers (APICs) are needed. However, the importance of APICs for uniprocessor PC platforms (especially mobile systems) is not as well understood.

This article explains why APICs are important for uniprocessor systems. The Microsoft® Windows® Logo Program currently requires APICs in all non-mobile systems. At some point in the future, the Windows Logo Program will require APICs in all systems, including mobile systems.

Introduction

APICs are beneficial for the following reasons:

APICs can contribute to resolving resource conflicts in the PC platform.

Windows operating systems have been designed with APICs in mind.

APICs are necessary for enabling new features in the PCI specification.

These issues are further discussed in the following sections.

Technical Differences

The traditional 8259 interrupt controller is subject to significant legacy issues. IRQs 0, 1, 2, 6, 8, 12, 13, 14, and 15 are consumed by legacy devices. Furthermore, even when legacy devices are not present, these IRQs are often claimed by legacy software or firmware. IRQs 3 and 4 sometimes fall into this category as well.

The result of these legacy issues is that only IRQs 5, 7, 9, 10, and 11 are available for general use on a typical machine. Audio hardware is almost always programmed to use IRQ 5. That leaves only four IRQs available for other devices to use. Most machines today have far more than four devices that are programmed to interrupt. For example, the system that was used to write this article includes:

One video device

One extra, non-legacy IDE controller (in the docking station)

One audio device (which uses an IRQ in addition to IRQ 5)

One USB controller

One modem

One 1394 controller

Four CardBus controllers (two in the docking station)

One Ethernet controller

And in the legacy, non-shareable category:

System timer

PS/2 keyboard

PS/2 mouse

Infrared

COM1

Floppy

RTC

Floating-point processor

Primary IDE

Secondary IDE

This means that, at best, this system will have an average of about three devices per available IRQ, assuming that all the devices in the system can share IRQs. Keep in mind that this machine has four CardBus slots, which means that, at any time, a user might plug in a PCMCIA device requiring two non-shareable IRQs for any one of those slots. Merely plugging cards into two of the four available slots could bring the machine to a state where it could not simultaneously operate all its devices.
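The arithmetic behind that estimate can be written out as a short sketch. The device counts and IRQ numbers are taken from the lists above; the variable names are illustrative only.

```python
# IRQs not consumed by legacy devices on a typical 8259-based machine.
general_use_irqs = {5, 7, 9, 10, 11}
# IRQ 5 is almost always taken by audio hardware.
available_irqs = general_use_irqs - {5}

# Non-legacy devices from the example system described above.
pci_devices = {
    "video": 1,
    "non-legacy IDE controller": 1,
    "audio": 1,
    "USB controller": 1,
    "modem": 1,
    "1394 controller": 1,
    "CardBus controller": 4,
    "Ethernet controller": 1,
}

device_count = sum(pci_devices.values())
devices_per_irq = device_count / len(available_irqs)
print(f"{device_count} devices over {len(available_irqs)} IRQs "
      f"= {devices_per_irq:.2f} devices per IRQ")
```

Eleven devices spread over four IRQs gives 2.75 devices per IRQ, which is the "about three" quoted above, and that figure holds only if every device tolerates sharing.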

Note also that the PCI devices in this machine are not directly connected to the interrupt controller. There are four IRQ steering devices within the south bridge, each capable of directing a PCI interrupt to a number of IRQs. This forces the machine designer to take the 11 PCI devices in the preceding example and wire-OR some of them together before they reach the interrupt controller, thereby decreasing the number of IRQs that will serve them to four.

APIC interrupt subsystems can have as many IRQs as are required in a specific machine. Commonly, chipset vendors design I/O APICs to have 24 IRQs each, and a client machine almost always contains only one I/O APIC. This is enough to guarantee a dedicated IRQ for each PCI device, which would make sharing necessary only when the user installs many PCMCIA devices.

In an APIC-based system, each PCI device can be routed directly to an interrupt controller input on an I/O APIC. Alternatively, some can be routed directly to the I/O APIC, and some can be routed through the IRQ steering devices. Ideally, the chipset could include more steering devices. (No OEM has ever taken on the extra cost of providing steering devices outside the chipset, at least not in single-processor systems.)

Many laptops are equipped with so few IRQs that they ship with the COM port or other internal devices disabled to ensure that IRQs remain available for PCMCIA devices. The situation is worse on machines using docking stations. Laptops typically ship with elaborate and confusing utilities that allow the end user to disable the modem so they can enable the COM port, and so on. Attempting to compensate for the lack of IRQs in this way degrades the usability of the system by making users do what the software should do, and what the software would do, if the hardware made it possible.

Finally, the 8259 interrupt controller can actually drop interrupts, because of how it handles spurious interrupts. The APIC is less likely to have this problem.

Why Interrupts Should Not Be Shared

On PIC-based systems, sharing interrupts is the only way to allow all or even most of the devices in the system to function. Microsoft has provided much information in sources such as white papers and the DDK to help vendors design hardware and drivers that can successfully share interrupts. However, interrupt sharing cannot be considered a sufficient solution to the interrupt problem on today's PIC-based PCs. Interrupt sharing has been required on many PC platforms, but it must be viewed as a necessary evil. The real solution to interrupt problems is to move to APIC-based systems.

The problem with the lack of IRQs is not solved even when Windows can attach all the PCI devices in a given system to one or just a few IRQs so that IRQs remain to serve other devices. A quick review of driver development newsgroups, for example, makes it clear that many hardware designs are very sensitive to interrupt latency. To work around this sensitivity, hardware vendors often want to know how to ensure that their device never shares an interrupt. For these devices, running on an APIC system is the only option.

The following examples should help to clarify the specific technical problems that are associated with interrupt sharing. When devices are forced to share interrupts, triggering an IRQ causes the following sequence of events:

Level-Triggered (PCI) Case:

1. The processor retrieves an IDT vector from the interrupt controller and does a look-up in the IDT to find a dispatch address.

2. The code at the dispatch address examines the interrupt service routines that were registered by device drivers for that IRQ. It then calls them in the order in which they are listed.

3. Each interrupt service routine (ISR) probes the hardware for the device and determines if the device is, in fact, interrupting.

If the device is interrupting, the ISR queues any work to be done and causes the hardware to stop interrupting. (At this point, the device hardware has been accessed at least twice.) The ISR then returns a value indicating that the interrupt has been handled. The operating system then acknowledges the interrupt.

If the device is not interrupting, the ISR returns a value indicating that the interrupt was not caused by its device. The operating system then goes on to the next ISR in the chain and returns to step 3.
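The level-triggered sequence can be modeled with a minimal sketch. The `Device` class and `dispatch_level_triggered` function are hypothetical names invented for this illustration; they are not Windows kernel interfaces, and the access counter only approximates the "accessed at least twice" remark in step 3.

```python
class Device:
    """Hypothetical model of a device with a level-triggered interrupt line."""

    def __init__(self, name):
        self.name = name
        self.asserting = False   # True while the device holds its line active
        self.accesses = 0        # simulated hardware register accesses

    def isr(self):
        """Return True if this device caused the interrupt and was serviced."""
        self.accesses += 1       # first access: probe the status register
        if not self.asserting:
            return False         # not ours; the dispatcher tries the next ISR
        self.accesses += 1       # second access: make the device stop interrupting
        self.asserting = False
        return True


def dispatch_level_triggered(isr_chain):
    """Walk the chain in registration order until one ISR claims the interrupt."""
    calls = 0
    for device in isr_chain:
        calls += 1
        if device.isr():
            return device.name, calls   # handled; the OS acknowledges the interrupt
    return None, calls                  # no registered ISR claimed the interrupt


chain = [Device("usb"), Device("ethernet"), Device("audio")]
chain[2].asserting = True               # the last device in the chain interrupts
handled_by, isr_calls = dispatch_level_triggered(chain)
print(handled_by, isr_calls)
```

In this run the audio device is last in the chain, so both ISRs ahead of it must probe their hardware before the interrupt is handled.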

Edge-Triggered Case:

1. The processor retrieves an IDT vector from the interrupt controller and does a look-up in the IDT to find a dispatch address. (Same as level-triggered.)

2. The code at the dispatch address examines the interrupt service routines that were registered by device drivers for that IRQ. It then calls them in the order in which they are listed. (Same as level-triggered.)

3. The operating system acknowledges the interrupt, so that any future edge-triggered interrupts are not lost.

4. The operating system calls the first ISR in the chain.

5. The ISR probes the hardware and attempts to handle the interrupt.

6. If there are additional ISRs, the operating system goes on to the next one and returns to step 5.

Note that with edge-triggered devices, it is necessary to iterate through the entire chain of ISRs each time any device on that IRQ interrupts, because there is no guarantee that a device will reassert an unhandled interrupt after it is acknowledged at the interrupt controller.
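A comparable sketch for the edge-triggered case shows why the whole chain must be walked. `EdgeDevice` and `dispatch_edge_triggered` are hypothetical names used only for illustration, and the acknowledgment step is represented by a comment rather than modeled.

```python
class EdgeDevice:
    """Hypothetical device that signals with an edge and does not reassert."""

    def __init__(self, name):
        self.name = name
        self.pending = False

    def isr(self):
        """Probe the device and service it if it has work pending."""
        if self.pending:
            self.pending = False
            return True
        return False


def dispatch_edge_triggered(isr_chain):
    """Call every ISR in the chain; stopping early could strand a device."""
    # The OS acknowledges the interrupt before this loop so that a new edge
    # arriving while the ISRs run is not lost.
    serviced = []
    for device in isr_chain:
        if device.isr():
            serviced.append(device.name)
    return serviced, len(isr_chain)


chain = [EdgeDevice("com1"), EdgeDevice("infrared"), EdgeDevice("modem")]
chain[0].pending = True
chain[2].pending = True   # fired at nearly the same time as com1
serviced, isr_calls = dispatch_edge_triggered(chain)
print(serviced, isr_calls)
```

Had the dispatcher stopped after the first ISR claimed the interrupt, the modem in this example would have stayed pending with no further edge to announce it.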

On the machine being used to write this paper, there are 11 devices sharing IRQ 11. If any one of them interrupts, the operating system begins at the head of the ISR chain and examines each device until it finds the one that interrupted. Each hardware access can take several microseconds, and each ISR may access its device several (or even many) times. The result is that hardware accesses cause delays in the processing of all other tasks on the machine. Each interrupt can potentially cause a delay of most of a millisecond during which no other work can be done. This makes it very difficult to guarantee that time-sensitive devices, such as audio devices, function correctly. (This machine actually glitches its audio whenever there is a large amount of other activity in the machine.)
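A rough upper bound on that delay can be estimated as follows. Only the device count comes from the text; the per-access cost and the number of accesses per ISR are assumed values chosen to fall within "several microseconds" and "several (or even many)", and they are not measurements.

```python
devices_on_irq = 11      # from the text: 11 devices share IRQ 11
access_cost_us = 3.0     # assumption: cost of one hardware access, in microseconds
accesses_per_isr = 20    # assumption: hardware accesses made by each ISR


def worst_case_delay_us(devices, accesses, cost_us):
    """Delay if every ISR in the chain runs before the interrupt is handled."""
    return devices * accesses * cost_us


delay = worst_case_delay_us(devices_on_irq, accesses_per_isr, access_cost_us)
print(f"worst-case delay: {delay:.0f} microseconds")
```

With these assumed figures the bound is 660 microseconds per interrupt, which is consistent with the "most of a millisecond" figure above.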

These interrupt latency problems are, of course, not the only issues in today's machines that have to be addressed before real-time behavior can be convincingly achieved. But these problems are on the critical path.

Other architectural problems can arise when you cause PCI devices to share interrupts. For example: A machine contains a sound device and a USB controller, and both of these are connected to the same IRQ. The BIOS may try to use both devices during boot. The BIOS may attempt to access the sound device in order to play a welcome sound on startup. The BIOS may also attempt to access the USB controller on startup to determine whether the system uses a USB keyboard or mouse.

As of PCI 2.0, the PCI specification does not provide a generic way to stop a device from interrupting. The interrupt disable bit in PCI 2.3 addresses this problem, but will not impact the installed base of machines for some time. (In contrast, PCI 2.0 does provide a way to stop a device from decoding I/O and memory resources, and stop bus-master transactions, by clearing the Command register.) This means that the BIOS could leave both the USB controller and the sound device in an interrupting state. (This also is quite common.)

The operating system has to load either the USB driver or the sound driver first. (Some might argue that these could be simultaneously loaded, but this is not possible if the system uses USB 2.0 and you are booting off of a USB-connected disk. And, it is certainly not possible to retrofit every existing Microsoft operating system to force drivers to enable interrupts simultaneously.) If you load USB before sound, you enable the IRQ with a sound interrupt pending. This causes an interrupt to be delivered, but with no ISR for the sound device in the ISR chain. The operating system calls the USB ISR and it returns with a value indicating that the interrupt was not caused by USB. The operating system then acknowledges the interrupt. However, because the interrupt is level-triggered, it is immediately reasserted and the operating system jumps right back into the interrupt-handling code. The result is that the machine is hung, endlessly dismissing interrupts (in other words, the machine is hung in an interrupt storm).

Similar cases can occur when a machine is brought out of a suspended or hibernating state.
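The boot-time storm can be reproduced in a small simulation. The function name and the dispatch cap are hypothetical; a real machine has no such cap and simply hangs.

```python
def run_level_triggered_irq(line_sources, registered_isrs, max_dispatches=1000):
    """Simulate one shared level-triggered IRQ.

    line_sources: names of devices currently asserting the line.
    registered_isrs: names of devices whose drivers have registered an ISR,
        in the order the OS calls them.
    Returns the number of dispatches before the line went quiet, or None if
    the cap was reached, which this sketch treats as an interrupt storm.
    """
    asserting = set(line_sources)
    dispatches = 0
    while asserting:                      # line still active: the IRQ fires again
        if dispatches == max_dispatches:
            return None                   # never quieted: interrupt storm
        dispatches += 1
        for name in registered_isrs:
            if name in asserting:
                asserting.discard(name)   # the ISR quiets its own device
                break                     # claimed; the OS acknowledges
    return dispatches


# USB driver loaded first while the BIOS left the sound device interrupting.
print(run_level_triggered_irq({"sound"}, ["usb"]))
# Both drivers loaded: the sound ISR clears its device on the first dispatch.
print(run_level_triggered_irq({"sound"}, ["usb", "sound"]))
```

The first call hits the cap and returns `None` because no registered ISR can quiet the sound device; the second returns after a single dispatch.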

So far, the discussion assumes that everything is working perfectly, that all hardware is perfectly well behaved, and that all device drivers are perfectly written. However, this is not always the case.

Consider a case where a certain driver A is poorly written and always indicates that its ISR has just handled an interrupt. Driver A operates a device that uses level-triggered interrupts. (This is true for all new devices, because everything either is PCI or looks like PCI these days.) Imagine also that there is another driver B with an ISR that is farther down the chain. If the device associated with driver B interrupts, the operating system will never be able to call its ISR, because driver A will always claim the interrupt. In this case, the machine also hangs because of an interrupt storm. However, if it is able to get its own IRQ, driver A functions without a problem.
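The effect of a driver that always claims the interrupt can be sketched the same way. The `(name, always_claims)` pairs and the `dispatch_count` function are hypothetical constructs for this illustration.

```python
def dispatch_count(isr_chain, asserting, max_dispatches=1000):
    """Count dispatches until a level-triggered line goes quiet.

    isr_chain: list of (device_name, always_claims) pairs in call order.
    asserting: names of devices currently holding the line active.
    Returns None if the line never goes quiet within the cap.
    """
    asserting = set(asserting)
    dispatches = 0
    while asserting:
        if dispatches == max_dispatches:
            return None
        dispatches += 1
        for name, always_claims in isr_chain:
            really_ours = name in asserting
            if really_ours:
                asserting.discard(name)   # a genuine interrupt gets serviced
            if really_ours or always_claims:
                break                     # the OS stops at the first claim


    return dispatches


# Driver A sits ahead of driver B on a shared IRQ and claims everything.
print(dispatch_count([("A", True), ("B", False)], {"B"}))
# Driver B's device on an IRQ of its own.
print(dispatch_count([("B", False)], {"B"}))
```

On the shared IRQ, B's ISR is never reached and the simulation hits its cap; with separate IRQs, A's bug is harmless to B.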

In another case, two devices, a modem and a CardBus controller, share an IRQ. The machine is mobile and the user is not making any phone calls at the moment, so the operating system puts the modem in the D3 (powered-off) state. The driver for the modem unregisters its ISR and powers off its hardware. But because of either a hardware or a software bug, the modem delivers an interrupt if the phone rings. If the modem had its own IRQ, the operating system would mask that IRQ when the driver unregistered its ISR. However, because these two devices share an IRQ, the operating system must leave the IRQ unmasked so the CardBus controller can function. If the phone rings at this time, an interrupt is delivered on the unmasked IRQ. There is no ISR registered for the modem hardware, so only the CardBus ISR is called, after which the operating system acknowledges the interrupt. Because the interrupt is still pending, the result is another interrupt storm.

The preceding example is actually very common, because many hardware designers confuse the concept of an "interrupt" with that of a "wake signal," or PME. Hardware designers often err in thinking that a device interrupt should be triggered to cause a device to wake up. (For more information on this, please refer to GPE Routing for Microsoft Windows.)

All these scenarios are avoided by putting an APIC in the system, which allows most or all devices to get their own IRQ.

Finally, the native interrupt mechanism for PCI Express is MSI (message-signaled interrupts). This is also true for PCI-X. You cannot use MSI without APIC.

Historical Factors

In 1997, while Windows engineers were designing the subsystem in Windows 2000 that chooses IRQs for devices (the IRQ arbiter), a large number of machines with numerous ISA and/or PCMCIA slots remained in use. This meant that IRQs were even more constrained than they are today. There was one laptop in Windows test labs that shipped with enough devices in it to consume 19 IRQs. There was another that required 18 interrupts. Furthermore, this was before the machines were switched into ACPI mode, which consumed another IRQ.

In 1996, Microsoft was told by Intel to expect all its machines to have I/O APICs by 1998. So, the IRQ arbiter was designed to cause every PCI device to share its IRQ with ACPI, when ACPI was enabled. This solved most of the interrupt constraint problems, but caused a huge amount of IRQ sharing. Many users did not like the implementation and switched off ACPI. But for other people, their machines functioned using the default behavior, even though the machines were subject to the common problems associated with interrupt sharing as explained earlier.

In machines with an I/O APIC, Windows 2000 spreads out IRQs among the devices.

The IRQ arbiter was altered in Windows XP to spread interrupts out more. However, operating system testing revealed an interesting issue. Because Windows 98 does not often use ACPI to configure the IRQ steering devices, and because Windows 2000 always put every PCI device on the same IRQ, many ACPI BIOSes were found to contain bugs when the operating system started to use them differently. This finding required the development of heuristics to determine whether an operating system should spread out interrupts. The operating system, therefore, forces interrupt sharing (or stacking, the opposite of spreading) on any machine that uses a CardBus controller, if it has no I/O APIC.

Aside from this type of BIOS issue, there are also device driver problems. In order to change the IRQ that a device uses at run time, the driver has to support the WDM action, IRP_MN_STOP_DEVICE. However, many drivers do not successfully implement it. When a machine is booting, the operating system must determine whether each newly discovered device gets its own IRQ. If it does, and if the driver does not support IRP_MN_STOP_DEVICE, the operating system can never reallocate that IRQ to another device. This means that the operating system must often take a conservative stacking approach to ensure that every device in the machine can be started, including ones that may be hot-inserted into CardBus slots after the machine has been running for a while.

Industry Development Issues

In addition to solving the architectural problems described earlier in this article, PCs must become less expensive and more reliable. The following list describes how Windows and existing hardware platforms can make these changes:

The PC platform should stop using physical wires on the motherboard to send interrupts from PCI devices (and the code in the firmware that describes them). This can be done using message-signaled interrupts (MSI). However, it will not be possible to make this change until all existing operating systems use MSI. And, operating systems cannot implement MSI until hardware vendors enable it on a widespread basis. Hardware vendors need to act now to enable MSI on the hardware level so operating systems can implement MSI in the years to come. (This can be achieved in part by moving to serial approximations of PCI. However, we can still eliminate more complexity for a better implementation.)

Adopting MSI today would allow each MSI-enabled device to get its own IRQ without consuming any of the inputs on the I/O APIC. For example, manufacturers would be able to continue shipping machines containing a single I/O APIC with only 24 inputs, without constraining the machine to 24 IRQs. This is only true in systems that have an I/O APIC, however, because you cannot activate MSI without moving to an APIC interrupt delivery model.

If we do not stop IRQ sharing, real-time performance on the PC platform will be negatively impacted, particularly where audio is concerned.

Both on the hardware and the software level, the industry must move to resolve these problems.
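The point about MSI and I/O APIC inputs can be illustrated with a small count. The 24-input figure comes from this article; the device mix is a hypothetical machine invented for the example.

```python
IOAPIC_INPUTS = 24   # from the text: a single I/O APIC with 24 inputs


def ioapic_inputs_needed(devices):
    """Count devices that still need a wired I/O APIC input.

    devices: list of (name, uses_msi) pairs. MSI devices signal by writing a
    message instead of driving a wire, so they do not occupy an input.
    """
    return sum(1 for _name, uses_msi in devices if not uses_msi)


# Hypothetical machine: 30 interrupting devices, 12 of them MSI-enabled.
devices = [(f"line-device-{i}", False) for i in range(18)]
devices += [(f"msi-device-{i}", True) for i in range(12)]

needed = ioapic_inputs_needed(devices)
print(f"{len(devices)} devices, {needed} I/O APIC inputs used, "
      f"fits in one I/O APIC: {needed <= IOAPIC_INPUTS}")
```

In this hypothetical mix, 30 devices each get their own IRQ while only 18 of the 24 inputs are consumed.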

Call To Action

Follow new Windows Logo program requirements for providing APIC-based interrupt subsystems.

Implement APIC-based interrupt subsystems in all PC platforms: uniprocessor and multiprocessor; and mobiles, desktops and servers.

Other Resources

See recent Engineering Change Notifications (ECNs) to the PCI Specification, version 2.3, for more information about MSI.



©2003 Microsoft Corporation. All rights reserved.