The Importance of Implementing APIC-Based Interrupt Subsystems on Uniprocessor PCs
Updated: January 7, 2003
It is widely understood in the PC industry that for multiprocessor systems, 8259 Programmable Interrupt Controllers (PICs) are insufficient and Advanced Programmable Interrupt Controllers (APICs) are needed. However, the importance of APICs for uniprocessor PC platforms (especially mobile systems) is not as well understood. This article explains why APICs are important for uniprocessor systems. The Microsoft® Windows® Logo Program currently requires APICs in all non-mobile systems. At some point in the future, the Windows Logo Program will require APICs in all systems, including mobile systems.
Introduction
APICs are beneficial for the following reasons:
These issues are further discussed in the following sections.
Technical Differences
The traditional 8259 interrupt controller is subject to significant legacy issues. IRQs 0, 1, 2, 6, 8, 12, 13, 14, and 15 are consumed by legacy devices. Furthermore, even when legacy devices are not present, these IRQs are often claimed by legacy software or firmware. IRQs 3 and 4 sometimes fall into this category as well. The result of these legacy issues is that only IRQs 5, 7, 9, 10, and 11 are available for general use on a typical machine. Audio hardware is almost always programmed to use IRQ 5. That leaves only four IRQs available for other devices to use.
Most machines today have far more than four devices that are programmed to interrupt. For example, the system that was used to write this article includes:
And in the legacy, non-shareable category:
This means that, at best, this system will have an average of about three devices per available IRQ, assuming that all the devices in the system can share IRQs. Keep in mind that this machine has four CardBus slots, which means that, at any time, a user might plug in a PCMCIA device requiring two non-shareable IRQs for any one of those slots. Merely plugging cards into two of the four available slots could bring the machine to a state where it could not simultaneously operate all its devices.
Note also that the PCI devices in this machine are not directly connected to the interrupt controller. There are four IRQ steering devices within the south bridge, each capable of directing a PCI interrupt to a number of IRQs. This forces the machine designer to take the 11 PCI devices in the preceding example and wire-OR some of them together before they reach the interrupt controller, thereby decreasing the number of IRQs that will serve them to four.
APIC interrupt subsystems can have as many IRQs as are required in a specific machine. Commonly, chipset vendors design I/O APICs to have 24 IRQs each, and a client machine almost always contains only one I/O APIC. This is enough to guarantee a dedicated IRQ for each PCI device, which would make sharing necessary only when the user installs many PCMCIA devices. In an APIC-based system, each PCI device can be routed directly to an interrupt controller input on an I/O APIC. Alternatively, some can be routed directly to the I/O APIC, and some can be routed through the IRQ steering devices. Ideally, the chipset could include more steering devices. (No OEM has ever taken on the extra cost of providing steering devices outside the chipset, at least not in single-processor systems.)
Many laptops are equipped with so few IRQs that they ship with the COM port or other internal devices disabled to ensure that IRQs remain available for PCMCIA devices. The situation is worse on machines using docking stations.
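The IRQ arithmetic in this section can be checked with a short sketch. It only restates figures given in the text (the legacy IRQ sets, the 11 PCI devices, the 24-input I/O APIC). The assumption that the same legacy claims carry over unchanged to the I/O APIC inputs is a conservative simplification made for this illustration, not something the article states.

```python
# IRQ claims as described in this article; treat the sets as illustrative.
PIC_IRQS = set(range(16))                  # the PC's 8259 pair provides IRQ 0-15
LEGACY = {0, 1, 2, 6, 8, 12, 13, 14, 15}   # consumed by legacy devices
SOMETIMES_LEGACY = {3, 4}                  # sometimes claimed as well
AUDIO = {5}                                # audio hardware almost always uses IRQ 5
PCI_DEVICES = 11                           # PCI devices in the example machine
IOAPIC_INPUTS = 24                         # common I/O APIC size

# IRQs available for general use on a typical PIC-based machine.
general_use = sorted(PIC_IRQS - LEGACY - SOMETIMES_LEGACY)

# What remains once audio has taken IRQ 5.
free_for_others = sorted(set(general_use) - AUDIO)

# Average sharing load if all PCI devices must fit on the remaining IRQs.
devices_per_irq = PCI_DEVICES / len(free_for_others)

# Assumption for this sketch: the same legacy claims persist on the I/O APIC.
apic_free = IOAPIC_INPUTS - len(LEGACY | SOMETIMES_LEGACY)
dedicated_possible = apic_free >= PCI_DEVICES

print(general_use, free_for_others, devices_per_irq, apic_free, dedicated_possible)
```

Under these assumptions, the PIC case leaves four IRQs for 11 PCI devices, which is consistent with the "about three devices per available IRQ" figure, while a single 24-input I/O APIC leaves enough free inputs for each PCI device to have its own.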
Laptops typically ship with elaborate and confusing utilities that allow the end user to disable the modem so they can enable the COM port, and so on. Attempting to compensate for the lack of IRQs in this way degrades the usability of the system by making users do what the software should do, and what the software would do, if the hardware made it possible.
Finally, the 8259 interrupt controller can actually drop interrupts, because of how it handles spurious interrupts. The APIC is less likely to have this problem.
Why Interrupts Should Not Be Shared
On PIC-based systems, sharing interrupts is the only way to allow all or even most of the devices in the system to function. Microsoft has provided much information in sources such as white papers and the DDK to help vendors design hardware and drivers that can successfully share interrupts. However, interrupt sharing cannot be considered a sufficient solution to the interrupt problem on today's PIC-based PCs. Interrupt sharing has been required on many PC platforms, but it must be viewed as a necessary evil. The real solution to interrupt problems is to move to APIC-based systems.
The problem with the lack of IRQs is not solved even when Windows can attach all the PCI devices in a given system to one or just a few IRQs so that IRQs remain to serve other devices. A quick review of driver development newsgroups, for example, makes it clear that many hardware designs are very sensitive to interrupt latency. To work around this sensitivity, hardware vendors often want to know how to ensure that their device never shares an interrupt. For these devices, running on an APIC system is the only option.
The following examples should help to clarify the specific technical problems that are associated with interrupt sharing. When devices are forced to share interrupts, triggering an IRQ causes the following sequence of events:
Level-Triggered (PCI) Case:
Edge-Triggered Case:
Note that with edge-triggered devices, it is necessary to iterate through the entire chain of ISRs each time any device on that IRQ interrupts, because there is no guarantee that a device will reassert an unhandled interrupt after it is acknowledged at the interrupt controller.
On the machine being used to write this paper, there are 11 devices sharing IRQ 11. If any one of them interrupts, the operating system begins at the head of the ISR chain and examines each device until it finds the one that interrupted. Each hardware access can take several microseconds, and each ISR may access its device several (or even many) times. The result is that hardware accesses cause delays in the processing of all other tasks on the machine. Each interrupt can potentially cause a delay of most of a millisecond during which no other work can be done. This makes it very difficult to guarantee that time-sensitive devices, such as audio devices, function correctly. (This machine actually glitches its audio whenever there is a large amount of other activity in the machine.)
These problems of interrupt latency are, of course, not the only issues in today's machines that have to be addressed before real-time behavior can be convincingly achieved. But these problems are on the critical path.
Other architectural problems can arise when you cause PCI devices to share interrupts. For example:
A machine contains a sound device and a USB controller, and both of these are connected to the same IRQ. The BIOS may try to use both devices during boot. The BIOS may attempt to access the sound device in order to play a welcome sound on startup. The BIOS may also attempt to access the USB controller on startup to determine whether the system uses a USB keyboard or mouse. As of PCI 2.0, the PCI specification does not provide a generic way to stop a device from interrupting. The interrupt disable bit in PCI 2.3 addresses this problem, but will not impact the installed base of machines for some time.
(In contrast, PCI 2.0 does provide a way to stop a device from decoding I/O and memory resources, and stop bus-master transactions, by clearing the Command register.) This means that the BIOS could leave both the USB controller and the sound device in an interrupting state. (This also is quite common.) The operating system has to load either the USB driver or the sound driver first. (Some might argue that these could be simultaneously loaded, but this is not possible if the system uses USB 2.0 and you are booting from a USB-connected disk. And it is certainly not possible to retrofit every existing Microsoft operating system to force drivers to enable interrupts simultaneously.) If you load USB before sound, you enable the IRQ with a sound interrupt pending. This causes an interrupt to be delivered, but with no ISR for the sound device in the ISR chain. The operating system calls the USB ISR and it returns with a value indicating that the interrupt was not caused by USB. The operating system then acknowledges the interrupt. However, because the interrupt is level-triggered, it is immediately reasserted and the operating system jumps right back into the interrupt-handling code. The result is that the machine is hung, endlessly dismissing interrupts (in other words, the machine is hung in an interrupt storm). Similar cases can occur when a machine is brought out of a suspended or hibernating state.
So far, the discussion assumes that everything is working perfectly, that all hardware is perfectly well behaved, and that all device drivers are perfectly written. However, this is not always the case. Consider a case where a certain driver A is poorly written and always indicates that its ISR has just handled an interrupt. Driver A operates a device that uses level-triggered interrupts. (This is true for all new devices, because everything either is PCI or looks like PCI these days.)
Imagine also that there is another driver B with an ISR that is farther down the chain. If the device associated with driver B interrupts, the operating system will never be able to call its ISR, because driver A will always claim the interrupt. In this case, the machine also hangs because of an interrupt storm. However, if it is able to get its own IRQ, driver A functions without a problem.
In another case, two devices, a modem and a CardBus controller, share an IRQ. The machine is mobile and the user is not making any phone calls at the moment, so the operating system puts the modem in the D3 (powered-off) state. The driver for the modem unregisters its ISR and powers off its hardware. But because of either a hardware or a software bug, the modem delivers an interrupt if the phone rings. If the modem had its own IRQ, the operating system would mask that IRQ when the driver unregistered its ISR. However, because these two devices share an IRQ, the operating system must leave the IRQ unmasked so the CardBus controller can function. If the phone rings at this time, an interrupt is delivered on the unmasked IRQ. There is no ISR registered for the modem hardware, so only the CardBus ISR is called, after which the operating system acknowledges the interrupt. Because the interrupt is still pending, the result is another interrupt storm.
The preceding example is actually very common, because many hardware designers confuse the concept of an "interrupt" with that of a "wake signal," or PME. Hardware designers often err in thinking that a device interrupt should be triggered to cause a device to wake up. (For more information on this, please refer to GPE Routing for Microsoft Windows.)
All these scenarios are avoided by putting an APIC in the system, which allows most or all devices to get their own IRQ.
Finally, the native interrupt mechanism for PCI Express is MSI (message-signaled interrupts). This is also true for PCI-X. You cannot use MSI without an APIC.
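The interrupt-storm scenarios in this section can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not a model of the actual Windows dispatcher: the names `Device`, `honest_isr`, `greedy_isr`, `dispatch`, and `STORM_LIMIT` are hypothetical, the shared line is modeled as a simple wire-OR of device flags, and the dispatcher is assumed to stop walking the chain at the first ISR that claims a level-triggered interrupt.

```python
from dataclasses import dataclass
from typing import Callable, List

STORM_LIMIT = 1000  # hypothetical cutoff; a real machine would simply hang


@dataclass
class Device:
    name: str
    asserting: bool = False  # True while the device holds the shared line active


def honest_isr(dev: Device) -> Callable[[], bool]:
    """ISR that claims the interrupt only if its own device is asserting."""
    def isr() -> bool:
        if dev.asserting:
            dev.asserting = False  # servicing the device deasserts the line
            return True
        return False
    return isr


def greedy_isr() -> Callable[[], bool]:
    """Buggy ISR ("driver A") that always claims the interrupt."""
    return lambda: True


def dispatch(devices: List[Device], isr_chain: List[Callable[[], bool]]) -> int:
    """Deliver a level-triggered shared IRQ until the line goes quiet.

    Returns the number of deliveries, or -1 if the line never deasserts
    (an interrupt storm).
    """
    deliveries = 0
    while any(d.asserting for d in devices):  # wire-OR of devices on the line
        deliveries += 1
        if deliveries > STORM_LIMIT:
            return -1
        for isr in isr_chain:
            if isr():  # stop at the first ISR that claims the interrupt
                break
        # The OS acknowledges here; a line that is still asserted is
        # redelivered on the next loop iteration.
    return deliveries


# USB driver loaded first while the sound device is already asserting.
usb, sound = Device("usb"), Device("sound", asserting=True)
boot_storm = dispatch([usb, sound], [honest_isr(usb)])

# Both ISRs registered: the chain reaches and services the sound device.
usb, sound = Device("usb"), Device("sound", asserting=True)
both_loaded = dispatch([usb, sound], [honest_isr(usb), honest_isr(sound)])

# Driver A always claims, so driver B's ISR is never reached.
a, b = Device("A"), Device("B", asserting=True)
greedy_storm = dispatch([a, b], [greedy_isr(), honest_isr(b)])

# Dedicated IRQ: the sound device is on another (masked) line, so the
# USB line never sees its pending interrupt.
usb = Device("usb")
dedicated = dispatch([usb], [honest_isr(usb)])

print(boot_storm, both_loaded, greedy_storm, dedicated)
```

In this model, the shared-line cases with a missing or misbehaving ISR never quiesce, while the dedicated-line case never sees the stray interrupt at all, which is the behavior the text attributes to giving each device its own IRQ.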
Historical Factors
In 1997, while Windows engineers were designing the subsystem in Windows 2000 that chooses IRQs for devices (the IRQ arbiter), a large number of machines with numerous ISA and/or PCMCIA slots remained in use. This meant that IRQs were even more constrained than they are today. There was one laptop in Windows test labs that shipped with enough devices in it to consume 19 IRQs. There was another that required 18 interrupts. Furthermore, this was before the machines were switched into ACPI mode, which consumed another IRQ.
In 1996, Microsoft was told by Intel to expect all its machines to have I/O APICs by 1998. So, the IRQ arbiter was designed to cause every PCI device to share its IRQ with ACPI, when ACPI was enabled. This solved most of the interrupt constraint problems, but caused a huge amount of IRQ sharing. Many users did not like the implementation and switched off ACPI. But for other people, their machines functioned using the default behavior, even though the machines were subject to the common problems associated with interrupt sharing as explained earlier. In machines with an I/O APIC, Windows 2000 spreads out IRQs among the devices.
The IRQ arbiter was altered in Windows XP to spread interrupts out more. However, operating system testing revealed an interesting issue. Because Windows 98 does not often use ACPI to configure the IRQ steering devices, and because Windows 2000 always put each device on its own IRQ, many ACPI BIOSes were found to contain bugs when the operating system started to use them differently. This finding required the development of heuristics to determine whether an operating system should spread out interrupts. The operating system, therefore, forces interrupt sharing (or stacking, the opposite of spreading) on any machine that uses a CardBus controller, if it has no I/O APIC.
Aside from this type of BIOS issue, there are also device driver problems.
In order to change the IRQ that a device uses at run time, the driver has to support the WDM action, IRP_MN_STOP_DEVICE. However, many drivers do not successfully implement it. When a machine is booting, the operating system must determine whether each newly discovered device gets its own IRQ. If it does, and if the driver does not support IRP_MN_STOP_DEVICE, the operating system can never reallocate that IRQ to another device. This means that the operating system must often take a conservative stacking approach to ensure that every device in the machine can be started, including ones that may be hot-inserted into CardBus slots after the machine has been running for a while.
Industry Development Issues
In addition to solving the architectural problems described earlier in this article, PCs must become less expensive and more reliable. The following list describes how Windows and existing hardware platforms can make these changes:
Call To Action
Other Resources
See recent Engineering Change Notifications (ECNs) to the PCI Specification, version 2.3, for more information about MSI.