Friday, December 22, 2006

BH #2 - IRQ's and Interrupts, More Than You Ever Wanted to Know

If you tinker with computers long enough, you will eventually hear about IRQ's and interrupts. Yes, more acronyms. But what the heck are they? Well, if you're an old schooler from way back in the DOS days, when your parents had to walk five miles to school and back in the snow uphill both ways (yea yea, lame old age joke), you probably have an idea of what they do, or at least know they are of some importance. This article will attempt to demystify them.



The best place to begin our story is at the beginning, in the early days of PC's. I'm talking real early, when PC's had Intel 8085, 8086, and 8088 processors. These machines also had Programmable Interrupt Controllers (PIC's). These handy dandy chips acted as a traffic cop of sorts. They handled input from the various other hardware devices sitting on the system bus, and decided who got the processor's attention and when. These original PIC's were made by Intel, dubbed the 8259 family. There were several types of these made, but I won't go into the differences between them, as they are minor. This chip acts as a multiplexer. It combines multiple hardware interrupt inputs into a single interrupt output to interrupt one device. The device that gets interrupted in your computer is the processor. (Today these chips are built in as part of the Southbridge chipset on x86 compatible motherboards.) These chips were created to take care of a performance bottleneck in early computing design.



Before these chips were created, computer processors had to poll each hardware device to see if it needed something processed. This resulted in a lot of wasted processing power. PIC's allow the processor to continue number crunching until a hardware interrupt is encountered. The processor will then take care of the request, then go about its business. The main connectors on an 8259 chip consisted of 8 interrupt request input lines labeled IRQ0 through IRQ7, an interrupt request output line labeled INT (which feeds the processor's INTR input), an interrupt acknowledgment line labeled INTA, and D0 through D7 for communicating the interrupt level or vector offset. (An interrupt vector is the memory address of an interrupt handler, or an index into an array called an interrupt vector table or dispatch table. Interrupt vector tables contain the memory addresses of interrupt handlers. When an interrupt is generated, the processor saves its execution state via a context switch, and begins execution of the interrupt handler at the interrupt vector.) It also had CAS0 through CAS2 for cascading, that is, connecting multiple 8259's together to obtain more interrupts. Up to eight slave 8259s may be cascaded to a master 8259 to provide up to 64 IRQs. They are cascaded by connecting the INT line of each slave 8259 to one of the IRQ lines of the master 8259.
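
To make the vector table idea a bit more concrete, here is a minimal Python sketch. It is only a model, not real interrupt code: the names (vector_table, install_handler, dispatch) are made up for illustration, and the base offset of 0x08 is the traditional real-mode BIOS mapping for IRQ0 through IRQ7, which I'm assuming here.

```python
from typing import Callable, Dict

# Assumption for this sketch: the master PIC is programmed with a vector
# offset of 0x08, the traditional real-mode BIOS setting for IRQ0-IRQ7.
MASTER_BASE = 0x08

# The "interrupt vector table": vector number -> handler function.
vector_table: Dict[int, Callable[[], str]] = {}


def install_handler(vector: int, handler: Callable[[], str]) -> None:
    """Store a handler in the table slot for the given vector."""
    vector_table[vector] = handler


def irq_to_vector(irq: int, base: int = MASTER_BASE) -> int:
    """Translate an IRQ line (0-7) into a vector using the PIC's offset."""
    if not 0 <= irq <= 7:
        raise ValueError("a single 8259 only has IRQ0 through IRQ7")
    return base + irq


def dispatch(vector: int) -> str:
    """Look up the handler for a vector and run it."""
    handler = vector_table.get(vector)
    if handler is None:
        return "unhandled"
    return handler()


# Hypothetical keyboard handler on IRQ1.
install_handler(irq_to_vector(1), lambda: "keyboard handler ran")
```

Calling dispatch(irq_to_vector(1)) runs the keyboard handler, while a vector with nothing installed falls through to "unhandled".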



The 8259A's Registers

A register is a small area of static RAM that the chip is able to access very quickly, as well as manipulate. Registers come in different sizes: 8-bit, 16-bit, 32-bit, 64-bit, and so on. The 8259A has three 8-bit registers that determine its behavior: the IMR (Interrupt Mask Register), the ISR (In-Service Register), and the IRR (Interrupt Request Register). The bits of these registers are numbered 0 through 7, where 0 is the least significant and 7 is the most significant bit. Each bit of each of these registers corresponds to the respective interrupt pin on the PIC. That is, bit 7 corresponds to IRQ 7, bit 6 corresponds to IRQ 6, and so on.



The IMR. This register lets the programmer disable or "mask" individual interrupts so that the PIC doesn't interrupt the processor when the corresponding interrupt is signaled. For an interrupt to be disabled, its corresponding bit in the IMR must be 1. To be enabled, its bit must be 0. Interrupts can be enabled or disabled by the programmer by reading the IMR, setting or clearing the appropriate bits, then writing the new value back to the IMR.
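
The read, modify, write sequence described above can be sketched like this. FakePIC is a stand-in I made up so the example runs anywhere. On real hardware the read and write would be port I/O, which this sketch doesn't attempt.

```python
class FakePIC:
    """Hypothetical stand-in for a PIC; it only models the 8-bit IMR."""

    def __init__(self) -> None:
        self.imr = 0xFF  # start with every interrupt masked (disabled)

    def read_imr(self) -> int:
        return self.imr

    def write_imr(self, value: int) -> None:
        self.imr = value & 0xFF  # the register is only 8 bits wide


def mask_irq(pic: FakePIC, irq: int) -> None:
    """Disable an IRQ: read the IMR, set its bit to 1, write it back."""
    pic.write_imr(pic.read_imr() | (1 << irq))


def unmask_irq(pic: FakePIC, irq: int) -> None:
    """Enable an IRQ: read the IMR, clear its bit to 0, write it back."""
    pic.write_imr(pic.read_imr() & ~(1 << irq))
```

For example, unmasking IRQ 1 on a fully masked PIC takes the IMR from 0xFF to 0xFD, and masking it again puts it back to 0xFF.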



The IRR. This register indicates when an interrupt has been signaled by a device. As soon as a device signals an interrupt, the corresponding bit in the IRR is set to a 1. This register can only be modified by the PIC and its contents usually aren't important to the programmer. It can be used to tell which interrupts are waiting to be serviced.



The ISR. This register indicates which interrupts are currently being serviced (i.e., which ISRs have begun execution and have not yet finished). A 1 bit indicates that the corresponding ISR is currently in-service. Several interrupts can be in-service at the same time because of interrupt nesting. The PIC uses this register to determine the highest priority of the interrupts currently being serviced. With this information, the PIC will only interrupt the processor if the highest priority set bit in the IRR has a higher priority than the highest priority set bit in the ISR. In other words, the PIC will never interrupt an in-service interrupt in order to service another interrupt of the same or lower priority. Before an ISR finishes executing, it must send to the PIC the end of interrupt command (EOI) so that the PIC knows that it can safely clear the highest priority bit in the ISR and signal any other pending interrupts. Be careful not to confuse "In-Service Register" with "Interrupt Service Routine". Both of these use the "ISR" acronym.
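
Here is a rough sketch of that decision, using plain integers for the three registers. It assumes a single PIC in its default priority arrangement (IRQ 0 highest, IRQ 7 lowest) and a simple EOI that clears the highest priority in-service bit. The function names are mine, not anything from Intel.

```python
from typing import Optional


def highest_priority(bits: int) -> Optional[int]:
    """Return the lowest-numbered set bit (the highest priority), or None."""
    for irq in range(8):
        if bits & (1 << irq):
            return irq
    return None


def next_interrupt(irr: int, isr: int, imr: int) -> Optional[int]:
    """Decide which IRQ, if any, the PIC would signal to the processor."""
    pending = irr & ~imr & 0xFF          # requested and not masked
    candidate = highest_priority(pending)
    if candidate is None:
        return None
    in_service = highest_priority(isr)
    # Never interrupt a handler of the same or higher priority.
    if in_service is not None and candidate >= in_service:
        return None
    return candidate


def end_of_interrupt(isr: int) -> int:
    """Model an EOI: clear the highest priority in-service bit."""
    irq = highest_priority(isr)
    if irq is None:
        return isr
    return isr & ~(1 << irq)
```

With IRQ 1 in service, a pending IRQ 4 has to wait, but a pending IRQ 0 gets through right away.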


The first PC's only contained one PIC, but later on two were cascaded together to allow more devices to be attached to the system. The following chart lists the IRQ's and what they are normally used for.



  • IRQ 0 - System timer. Reserved for the system. Cannot be changed by a user.
  • IRQ 1 - Keyboard. Reserved for the system. Cannot be altered even if no keyboard is present or needed.
  • IRQ 2 - Second IRQ controller. See below for explanation.
  • IRQ 3 - COM 2(Default) COM 4(User)
  • IRQ 4 - COM 1(Default) COM 3(User)
  • IRQ 5 - Sound card (Sound Blaster Pro or later) or LPT2(User)
  • IRQ 6 - Floppy disk controller
  • IRQ 7 - LPT1(Parallel port) or sound card (8-bit Sound Blaster and compatibles)
  • IRQ 8 - Real time clock
  • IRQ 9 - ACPI SCI or ISA MPU-401
  • IRQ 10 - Free / Open interrupt / Available / SCSI
  • IRQ 11 - Free / Open interrupt / Available / SCSI
  • IRQ 12 - PS/2 connector Mouse / If no PS/2 connector mouse is used, this can be used for other peripherals
  • IRQ 13 - Math co-processor. Cannot be changed
  • IRQ 14 - Primary IDE. If no Primary IDE this can be changed
  • IRQ 15 - Secondary IDE


IRQs 0 to 7 are managed by one Intel 8259 PIC, and IRQs 8 to 15 by a second Intel 8259 PIC. The first PIC, the master, is the only one that directly signals the CPU. The second PIC, the slave, instead signals to the master on its IRQ2 line, and the master passes the signal on. Because of this, there are only 15 interrupt request lines available for hardware. Lower IRQ's have a higher priority. From looking at the list, you can definitely see why. The system timer is the most important. It is essentially the heartbeat of the system. Next is the keyboard. Generally, if everything is connected properly and your keyboard's caps lock key won't even change the state of its status LED, then you're pretty much screwed, and will have to reboot.



Now IRQ 2 presents an interesting case. It's cascaded. This means everything on the slave PIC gets priority over everything else below it on the master PIC. So next in the pecking order we have the real time clock, which keeps your date and time correct. Next up we have the ACPI (Advanced Configuration and Power Interface), which controls various power saving features for the system, or the ISA MPU-401, which is a MIDI interface that allows you to connect musical instruments to your computer. It continues down the list until IRQ 15, then goes back to IRQ 3 and continues to IRQ 7. IRQ's 0, 1, 2, and 13 can't be changed. However, the other ones can. Usually it's easier to use the settings listed in the chart, as these have become the standard.
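
If the pecking order is hard to picture, this little sketch spits it out. It assumes the standard two-PIC arrangement described above, with the slave hanging off the master's IRQ 2. The function name is just something I picked.

```python
from typing import List


def cascaded_priority_order(cascade_irq: int = 2, lines_per_pic: int = 8) -> List[int]:
    """List the IRQs from highest to lowest priority for two cascaded PICs."""
    order: List[int] = []
    for irq in range(lines_per_pic):
        if irq == cascade_irq:
            # The slave's lines (IRQ 8-15) all slot in where IRQ 2 sits.
            order.extend(range(lines_per_pic, 2 * lines_per_pic))
        else:
            order.append(irq)
    return order


print(cascaded_priority_order())
```

The result has 15 entries, not 16, because IRQ 2 itself is used up by the cascade.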



Back in the days of DOS and early Windows (pre-Windows 95), when you bought an add-in card for your computer, you had to configure the card to use a free IRQ. This was typically done by placing a jumper on the card onto the appropriate connectors. If two cards in your system were not designed to share an IRQ and you attempted to share one anyway, the result would usually be a system crash, or possibly a machine that wouldn't even boot.



Various Types of Interrupts

Hardware interrupts are not the only type of interrupts available in the computing world. Interrupts can be divided into two categories.

  • Synchronous interrupts are predictable interrupts that occur at known times, such as the execution of software interrupt instructions.
  • Asynchronous interrupts are unpredictable interrupts that may occur at any time, such as the generation of an interrupt by a hardware device when it needs servicing.


Interrupts may be implemented in hardware as a distinct system with control lines, or they may be integrated into the memory subsystem. If implemented in hardware, a PIC, as we discussed earlier, or an APIC (Advanced Programmable Interrupt Controller) may be used. (We'll get to APIC's in a bit.) If implemented as part of the memory controller, interrupts are mapped into the system's memory address space. These interrupts can also be divided further as follows:



  • A software interrupt is an interrupt generated within a processor by executing an instruction. Examples include system calls, or a program using a subroutine stored in the BIOS to display a character on the screen.
  • A maskable interrupt is essentially a hardware interrupt which may be ignored by setting a bit in an interrupt mask register's (IMR) bit-mask.
  • A non-maskable interrupt is a hardware interrupt which typically does not have a bit-mask associated with it allowing it to be ignored.
  • An interprocessor interrupt is a special type of interrupt which is generated by one processor to interrupt another processor in a multiprocessor system.
  • A spurious interrupt is a hardware interrupt that is generated by system errors, such as electrical noise on one of the PIC's interrupt lines.



Processors also have an internal interrupt mask that allows them to ignore all maskable hardware interrupts while it is set. This is typically used in programs that require specific timing and the fastest execution possible. However, misuse of this mask can slow down the system.




Level Triggered vs. Edge Triggered



Level Triggered


A level-triggered interrupt is a class of interrupts where the presence of an unserviced interrupt is indicated by a high level (1), or low level (0), of the interrupt request line. A device that wants to signal an interrupt changes the line to its active level, and then holds it at that level until serviced. It stops asserting the line when the CPU commands it to or otherwise handles the condition that caused it to signal the interrupt. Normally, the processor samples the interrupt input at predefined times during each bus cycle. If the interrupt isn't active when the processor samples it, the CPU doesn't see it. This helps to prevent erroneous interrupts that may be caused by line noise (electromagnetic interference). Multiple devices may share a level-triggered interrupt line if they are designed to. The interrupt line must have a pull-down or pull-up resistor so that when not active, it settles to its inactive state. Devices actively assert the line to indicate an outstanding interrupt, but let the line float (do not actively drive it) when not signaling an interrupt. The line is then in its asserted state when any (one or more than one) of the sharing devices is signaling an outstanding interrupt.



This class of interrupts is favored by some because of a convenient behavior when the line is shared. When the interrupt line is activated, the CPU must search through the devices sharing it until the one who activated it is detected. After servicing that one, the CPU may recheck the interrupt line status to see if any other devices need servicing. If the line is no longer asserted, then the CPU avoids the need to check all the remaining devices on the line. Where some devices interrupt much more than others, or where some devices are particularly expensive to check for interrupt status, a careful ordering of device checks brings some efficiency gain.



The protocol of a level-triggered interrupt goes like this:


  • When the interrupt signal is held low (for active-low interrupts, which are the most common), the interrupt controller will generate an interrupt.
  • If the interrupt is acknowledged and the signal is still low, then the interrupt controller will generate another interrupt.


    This is good for sharing, because it confirms that the operating system handled the interrupting device. The algorithm is:



    1. Take an interrupt.
    2. Run the first ISR in the chain.
    3. If that ISR returns TRUE (it handled an interrupt), then the operating system will ACK the interrupt and quit.
    4. If that ISR returns FALSE, then run the next ISR in the chain and go to Step 3.



    If multiple devices interrupt, the operating system will find the first one, run its ISR, and then ACK the interrupt. Then the operating system will immediately take another interrupt on that vector. At this point, the operating system runs through all the ISRs until it finds the second device, runs its ISR, and ACKs the interrupt again.
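
The walk through the ISR chain can be sketched as a small simulation. Everything here (the Device class, service_shared_line, the max_interrupts safety cap) is made up for illustration and assumes well-behaved devices that drop the line once their ISR has run.

```python
from typing import List, Tuple


class Device:
    """Hypothetical device sharing one level-triggered interrupt line."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.asserting = False  # True while the device holds the line active

    def isr(self) -> bool:
        """Return True if this device was interrupting (and service it)."""
        if self.asserting:
            self.asserting = False  # servicing the device releases the line
            return True
        return False


def service_shared_line(devices: List[Device],
                        max_interrupts: int = 100) -> Tuple[List[str], int, int]:
    """Run the ISR chain until the line goes quiet.

    Returns (names serviced in order, total ISR calls, interrupts taken).
    """
    serviced: List[str] = []
    isr_calls = 0
    interrupts = 0
    # The line is active while any device is asserting it.
    while any(d.asserting for d in devices) and interrupts < max_interrupts:
        interrupts += 1                  # step 1: take an interrupt
        for dev in devices:              # steps 2-4: walk the chain
            isr_calls += 1
            if dev.isr():
                serviced.append(dev.name)
                break                    # ACK and quit; the line re-fires if still active
    return serviced, isr_calls, interrupts
```

With three devices A, B, and C in the chain and both B and C interrupting, the OS takes two interrupts and makes five ISR calls in total: A and B on the first pass, then A, B, and C on the second.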



    There are some serious problems with sharing level-triggered interrupts. As long as any device on the line is requesting service, the line remains active, so it is not possible to detect a change in the status of any other device. Delaying the servicing of a low-priority device is not an option, because this would prevent detection of service requests from higher-priority devices. If there is a device on the line that the CPU does not know how to service, then any interrupt from that device permanently blocks all interrupts from the other devices.



    For example, let's say we're a suit, working for a Fortune 500 company. We have our own top of the line laptop, and do a lot of work with it. We have docking stations both at home and at work. We've got a printer, network card, DVD+R drive, maybe a USB camera, throw in a gamepad for the docking station at home, a PDA docking station connected as well, and oh, maybe a video capture device, scanner, etc. connected to the docking station. Let's say we have twelve devices all chained on one vector on this docked laptop. Every time the operating system takes an interrupt, it must run as many as twelve ISRs before it begins to handle the condition that caused the interrupt. I'm sure you can see where this is going.



    (Just in case you don't remember, an ISR is an Interrupt Service Routine. Basically it's a bit of code that does what it's supposed to do for a particular interrupt. For all you die hard programmers out there, think of it as a function call for hardware.)



    Even assuming that every device in the chain is well designed and can determine within a few I/O cycles whether it is generating an interrupt, a huge interrupt latency (delay) occurs, as each interrupt causes the operating system to touch up to twelve different pieces of hardware to find the correct one.



    Unfortunately, the reality of the modern PC market is that many devices are poorly designed and misbehave. Most hardware is designed with the mindset that the ISR will handle all the work of the interrupt. In other words, the hardware designers are passing the buck to the programmers writing the device driver. So now you have a lazy engineer ignoring the need for good hardware synchronization in their chip designs. This forces the driver programmer to write larger, slower code to compensate. The result is device drivers that do most of their real work in their ISR. This greatly lengthens the time the OS spends running the ISR chain, causing it to run for relatively long periods of time with interrupts off and no threads or deferred procedure calls (DPCs) running. Imagine trying to do real-time video manipulation when every interrupt causes a 500 µs delay in processing. (*click* *go make coffee* *click* *order out for pizza* *click* *go take a nap*...)



    The previous examples assume that all device drivers are well behaved. However, if even one device driver returns FALSE from its ISR when its device is actually interrupting, then a system will hang as a result. The ACK causes the interrupt controller to generate another interrupt. The operating system runs the ISR chain forever.



    APICs provide more interrupt resources, thus greatly reducing, if not removing, the need to share interrupts among hardware devices. The original PCI standard mandated shareable level-triggered interrupts. The rationale for this was the efficiency gain discussed above. (Newer versions of PCI allow, and PCI Express requires, the use of message-signaled interrupts.)



    Edge-triggered

    An edge-triggered interrupt is a class of interrupts that are signaled by a level transition on the interrupt line, either a falling edge (1 to 0) or (usually) a rising edge (0 to 1). A device that wants to signal an interrupt drives a pulse onto the line, then returns the line to its normal state. If the pulse is too short to be detected by polled I/O, then special hardware may be required to detect the edge. Multiple devices can share an edge-triggered interrupt line if they are designed to. The interrupt line must have a pull-down or pull-up resistor so that when not actively driven, it settles to one particular state. Devices signal an interrupt by briefly driving the line to its non-default state, and let the line float (do not actively drive it) when not signaling an interrupt. The line then carries all the pulses created by all the devices. However, interrupt pulses from different devices may merge if they occur too close to each other. To avoid losing interrupts the CPU must trigger on the trailing edge of the pulse (the rising edge if the line is pulled up and driven low).
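
To see how pulses can merge, here's a toy model of a shared line that idles at 0 and gets pulsed to 1. Time is chopped into ticks, and each pulse is a (start, width) pair. The helper names are hypothetical, and real hardware is obviously not this tidy.

```python
from typing import List, Tuple


def shared_line(pulses: List[Tuple[int, int]], length: int) -> List[int]:
    """OR together every device's pulse into one list of 0/1 line levels."""
    line = [0] * length
    for start, width in pulses:
        for tick in range(start, min(start + width, length)):
            line[tick] = 1
    return line


def count_rising_edges(line: List[int]) -> int:
    """Count 0-to-1 transitions, which is all an edge detector can see."""
    edges = 0
    previous = 0
    for level in line:
        if level and not previous:
            edges += 1
        previous = level
    return edges


# Two pulses far apart: both are seen.
separate = count_rising_edges(shared_line([(2, 3), (8, 3)], 12))
# Two pulses that overlap: they merge into a single edge.
merged = count_rising_edges(shared_line([(2, 3), (4, 3)], 12))
```

Two devices pulsed the line in both cases, but in the overlapping case the detector only ever sees one edge. That is why the handler has to check every device on the line.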



    After detecting an interrupt the CPU must check all the devices for service requirements. Edge-triggered interrupts don't suffer the problems that level-triggered interrupts have with sharing. Servicing a low-priority device can be delayed, and interrupts will continue to be received from the high-priority devices that are being serviced. If there's a device that the CPU doesn't know how to service, it may cause a spurious interrupt, or even periodic spurious interrupts, but it doesn't interfere with the interrupt signaling of the other devices.



    With any interrupt controller, an edge-triggered interrupt is a one-time event. There isn't any feedback to the OS. The OS never really knows when it has handled the situation that caused the interrupt. It can only know that an event happened recently. The only rational response to an edge-triggered interrupt is to run all the Interrupt Service Routines (ISRs) associated with that vector once, with the hope that this will resolve it. (kinda like killing a mouse with an elephant gun) This situation is especially sketchy when dealing with hardware that doesn't give any real indication of why it interrupted. This is a common occurrence among today's edge-triggering devices. The result is that the OS can miss interrupts delivered in the interval between when an interrupt is first taken and when it is acknowledged. With an 8259 PIC, the situation is even worse. The 8259 is inherently unreliable, particularly when coupled with an actual ISA bus. The old ISA bus uses edge-triggered interrupts, but doesn't mandate that devices be able to share them. Many older devices assume that they have exclusive use of their interrupt line, making it electrically unsafe to share them. The operating system software will see a number of spurious interrupts, some of which show up on different vectors than the original signal. However, ISA motherboards include pull-up resistors on the IRQ lines, so well-behaved devices share ISA interrupts just fine. When dealing with this older hardware, it's usually safer not to assume anything though. Always check your documentation.



    Combined Method


    Some systems use a hybrid of level-triggered and edge-triggered signaling. The hardware looks for an edge and verifies that the interrupt signal stays active for a certain period of time. A common hybrid interrupt is the NMI (non-maskable interrupt) input. Because NMIs generally signal major, or even catastrophic system events, a good implementation of this signal tries to ensure that the interrupt is valid by verifying that it remains active for a period of time. This two step approach helps to prevent false interrupts from screwing up the system. There are also some BIOS implementations that allow you to configure which method you wish to use as well as using the combined method.
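
A rough way to picture the two step check: look for the line going active, then only count it as a real interrupt if it stays active for a minimum number of samples. The hold_ticks threshold below is a made-up parameter for this sketch, not a value from any particular chipset.

```python
from typing import List


def qualified_interrupts(samples: List[int], hold_ticks: int) -> int:
    """Count edges that are followed by at least hold_ticks active samples."""
    if hold_ticks < 1:
        raise ValueError("hold_ticks must be at least 1")
    count = 0
    run = 0  # how many consecutive samples the line has been active
    for level in samples:
        if level:
            run += 1
            if run == hold_ticks:
                count += 1  # the signal held long enough: accept it once
        else:
            run = 0         # the line dropped: forget the unconfirmed edge
    return count


# A one-tick glitch followed by a pulse that holds for four ticks.
samples = [0, 1, 0, 0, 1, 1, 1, 1, 0]
```

With hold_ticks set to 3, only the longer pulse counts and the glitch is thrown away. With hold_ticks set to 1, it behaves like plain edge triggering and both get through.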



    Message Signaled


    A message-signaled interrupt doesn't use a physical interrupt line. Instead, a device signals its request for service by sending a short message over some communications medium, typically a bus. The message might be of a type reserved for interrupts, or it might be of some pre-existing type such as a memory write. Message-signaled interrupts behave very much like edge-triggered interrupts. The interrupt is a momentary signal instead of a continuous condition. Interrupt handling software treats the two in almost the same manner. Usually, multiple pending message signaled interrupts with the same message (the same virtual interrupt line) are allowed to merge, just as closely spaced edge-triggered interrupts can merge. Message-signaled interrupt vectors can share the same communications medium (the same bus) without any extra effort. The identity of the interrupt is indicated by a pattern of data bits. It doesn't require a separate physical conductor. More separate interrupts can be handled, reducing the need for sharing. Interrupt messages can also be passed over a serial bus, not requiring any additional lines. PCI Express is a serial computer bus, and uses message-signaled interrupts exclusively.



    Why Sharing Interrupts Is Bad


    Regardless of the triggering style, multiple devices sharing an interrupt line act as spurious interrupt sources to each other. With many devices on one line, the workload in servicing interrupts grows as the square of the number of devices. It's preferred to spread devices evenly across the available interrupt lines. Shortage of interrupt lines is a problem in older system designs where the interrupt lines are actual physical conductors. Message-signaled interrupts, where the interrupt line is virtual, are favored in new system architectures (such as PCI Express) and go a long way towards fixing this problem. Some devices with poorly designed programming interfaces provide no way to determine if they have requested service. They may lock up or otherwise misbehave if serviced when they don't want it. Such devices can't tolerate spurious interrupts, and are useless when it comes to interrupt sharing. ISA cards are usually cheaply designed and constructed, and are notorious for this problem. Luckily, these are rarer today thanks to cheaper hardware logic, and newer specs that mandate interrupt sharing. On PIC-based systems, sharing interrupts is the only way to allow all or even most of the devices in the system to function. OS vendors have provided a lot of information to help hardware vendors design hardware and drivers that can successfully share interrupts. However, interrupt sharing cannot be considered a sufficient solution to the interrupt problem on today's PIC-based PCs. Interrupt sharing has been required on many PC platforms, but it is viewed as a necessary evil. The real solution to interrupt problems is to move to APIC-based systems.
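
The "square of the number of devices" bit is easy to check with some back of the envelope arithmetic. This sketch assumes the worst case: every device interrupts equally often, and every interrupt makes the OS check every device on the line.

```python
def isr_checks(num_devices: int, interrupts_per_device: int = 1) -> int:
    """Worst-case number of device checks on one shared line."""
    total_interrupts = num_devices * interrupts_per_device
    # Each interrupt may force a check of every device sharing the line.
    return total_interrupts * num_devices


for n in (1, 2, 4, 12):
    print(n, "devices ->", isr_checks(n), "checks")
```

One device needs one check, but the twelve device docking station from earlier needs 144 checks for a single round of interrupts.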




    The problem with the lack of IRQs is not solved even when the OS can attach all the PCI devices in a given system to one or just a few IRQs so that IRQs remain to serve other devices. A quick review of driver development newsgroups, for example, makes it clear that a lot of hardware designs are very sensitive to interrupt latency. To work around this sensitivity, hardware vendors often want to know how to make sure their device never shares an interrupt. For these devices, running on an APIC system is the only option.




    These problems of interrupt latency aren't the only issues in today's machines that have to be addressed before real-time behavior can be convincingly achieved. But, these problems are on the critical path.



    Other architectural problems can arise when you cause PCI devices to share interrupts. For example, a machine contains a sound device and a USB controller. Let's say both of these are connected to the same IRQ. The BIOS might try to use both devices during boot up. The BIOS might try to access the sound device in order to play a welcome sound on startup. The BIOS might also try to access the USB controller on startup to determine whether the system uses a USB keyboard or mouse.



    As of PCI 2.0, the PCI specification doesn't provide a generic way to stop a device from interrupting. The interrupt disable bit in PCI 2.3 addresses this problem, but won't impact the older machines already out there. (In contrast, PCI 2.0 does provide a way to stop a device from decoding I/O and memory resources, and stop bus-master transactions, by clearing the Command register.) This means that the BIOS could leave both the USB controller and the sound device in an interrupting state. (This also is quite common.)



    The operating system has to load either the USB driver or the sound driver first. (Some might argue that these could be simultaneously loaded, but this is not possible if the system uses USB 2.0 and you are booting off of a USB-connected disk. And, it is certainly not possible to retrofit every existing Microsoft operating system to force drivers to enable interrupts simultaneously.) If you load USB before sound, you enable the IRQ with a sound interrupt pending. This causes an interrupt to be delivered, but with no ISR for the sound device in the ISR chain. The OS calls the USB ISR and it returns with a value indicating that the interrupt was not caused by USB. The OS then acknowledges the interrupt. However, because the interrupt is level-triggered, it's immediately reasserted and the OS jumps right back into the interrupt-handling code. The result is that the machine is hung, endlessly dismissing interrupts (in other words, the machine is hung in an interrupt storm). Similar cases can occur when a machine is brought out of a suspended or hibernating state.
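
Here is a toy simulation of that interrupt storm. The function name and the cap of 1000 are inventions of mine so the loop can't actually hang your Python session; a real machine has no such safety net.

```python
from typing import Iterable, List, Tuple


def run_until_quiet(pending: Iterable[str], isr_chain: List[str],
                    cap: int = 1000) -> Tuple[int, bool]:
    """Simulate a shared level-triggered IRQ.

    pending:   names of devices currently holding the line active.
    isr_chain: names of devices whose drivers have registered an ISR.
    Returns (interrupts taken, True if we gave up because of a storm).
    """
    still_pending = set(pending)
    interrupts = 0
    while still_pending:                 # line stays active while anyone asserts it
        if interrupts >= cap:
            return interrupts, True      # storm: nobody can quiet the device
        interrupts += 1
        for name in isr_chain:
            if name in still_pending:
                still_pending.discard(name)  # this ISR handled its device
                break
        # The OS acknowledges here; if the line is still active it fires again.
    return interrupts, False
```

With the sound device pending and only the USB ISR registered, the simulation hits the cap and reports a storm. With both ISRs registered, things settle down after two interrupts.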



    So far, the discussion assumes that everything is working perfectly, that all hardware is perfectly well behaved, and that all device drivers are perfectly written. However, this is not always the case.



    Consider a case where driver A is poorly written and always indicates that its ISR has just handled an interrupt. Driver A operates a device that uses level-triggered interrupts. (This is true for all new devices, because everything either is PCI or looks like PCI these days.) Imagine also that driver B exists with an ISR that is farther down the chain. If the device associated with driver B interrupts, the OS will never be able to call its ISR, because driver A will always claim the interrupt. In this case, the machine also hangs because of an interrupt storm. However, if it is able to get its own IRQ, driver A functions without a problem.



    Here's another example. Two devices, a modem and a CardBus controller, share an IRQ. The machine is a laptop, and the user is not making any phone calls at the moment. The OS puts the modem in the D3 (powered-off) state. The driver for the modem unregisters its ISR and powers off its hardware. But, because of either a hardware or a software bug, the modem delivers an interrupt if the phone rings. If the modem had its own IRQ, the operating system would mask that IRQ when the driver unregistered its ISR. However, because these two devices share an IRQ, the operating system must leave the IRQ unmasked so the CardBus controller can function. If the phone rings at this time, an interrupt is delivered on the unmasked IRQ. There is no ISR registered for the modem hardware, so only the CardBus ISR is called, after which the operating system acknowledges the interrupt. Because the interrupt is still pending, the result is another interrupt storm.



    That example is actually very common, because many hardware designers confuse the concept of an "interrupt" with that of a "wake signal", or PME. Hardware designers are often wrong in thinking that a device interrupt should be triggered to cause a device to wake up. These scenarios are avoided by putting an APIC in the system. (We'll get to APIC's in a moment.) This allows most, or all, devices to get their own IRQ.



    Also worth noting, the native interrupt mechanism for PCI Express is MSI (message-signaled interrupts). This is also true for PCI-X. You cannot use MSI without APIC.



    Typical Uses of Interrupts


    Ok, so now you have a good idea of what an interrupt is, what it does, and the different types. But what the heck is it used for!?!? Well, typical interrupt uses include the following:



      • system timers
      • disk I/O
      • power-off signals and traps
      • transferring data bytes
      • UARTs (Universal Asynchronous Receiver/Transmitters)
      • Ethernet
      • sensing key presses
      • controlling motors
      • etc.



      A classic system timer interrupt fires periodically, driven by a counter or by the power line. The interrupt handler counts the interrupts to keep time. The timer interrupt may also be used by the OS's task scheduler to reschedule the priorities of running processes. Counters are very popular, but some older computers did use the power-line frequency instead. This was because power companies in most Western countries control the power-line frequency with an atomic clock. In the U.S.A. the frequency is 60 Hz. A disk interrupt signals the completion of a data transfer from or to the disk drive. This interrupt lets a program know when to continue or wait while reading or writing data to the drive. A power-off interrupt predicts or requests a loss of power. It allows the computer equipment to perform an orderly shutdown. In newer computers, this is sometimes called intelligent off, where pressing the power button effectively makes the Start, Shut Down, turn off power selections for you on a Windows 9x or later computer. (Why you have to click Start to stop your computer still boggles my mind to this day...) Interrupts are also used in type-ahead features for buffering events like keystrokes. I haven't used this type of software on a desktop computer, but if you have a newer cell phone you most likely have used this. Though now it's given flashy titles like predictive text, etc.
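
The "count the interrupts to keep time" idea looks something like this. I'm assuming a 60 Hz tick to match the U.S. power-line example. TickClock and its methods are names I made up, and a real handler would be wired to hardware, not called from a loop.

```python
from typing import Tuple

TICKS_PER_SECOND = 60  # assumption: one interrupt per cycle of 60 Hz power


class TickClock:
    """Hypothetical clock that keeps time by counting timer interrupts."""

    def __init__(self) -> None:
        self.ticks = 0

    def on_timer_interrupt(self) -> None:
        """The 'interrupt handler': all it does is bump a counter."""
        self.ticks += 1

    def elapsed(self) -> Tuple[int, int, int]:
        """Convert the tick count into (hours, minutes, seconds)."""
        seconds = self.ticks // TICKS_PER_SECOND
        return seconds // 3600, (seconds % 3600) // 60, seconds % 60


clock = TickClock()
for _ in range(TICKS_PER_SECOND * 90):  # pretend 90 seconds of ticks arrived
    clock.on_timer_interrupt()
```

After those 5400 simulated ticks, clock.elapsed() reports zero hours, one minute, and thirty seconds.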



      APIC


      Now that we've covered the basics of interrupts, we can move on to the APIC, or the Advanced Programmable Interrupt Controller. The Intel APIC Architecture is a system of APICs designed by Intel for use in Symmetric Multi-Processor (SMP) computer systems. It was originally implemented by the Intel 82093AA and 82489DX, and is found in most x86 SMP motherboards. It's one of several attempts to solve interrupt routing efficiency issues in multiprocessor computer systems. There are two components in the Intel APIC system, the Local APIC (LAPIC) and the I/O APIC. The LAPIC is integrated into each CPU in the system, and the I/O APIC is used throughout the system's peripheral buses. There is typically one I/O APIC for each peripheral bus in the system. In original system designs, LAPICs and I/O APICs were connected by a dedicated APIC bus. Newer systems use the system bus for communication between all APIC components. In systems containing an 8259 PIC, the 8259 may be connected to the LAPIC in the system's bootstrap processor (BSP), or to one of the system's I/O APICs.



      Local APICs


      Each LAPIC manages all external interrupts for the processor that it's part of. It's also able to send and receive inter-processor interrupts (IPIs) to and from other LAPICs. A LAPIC can support up to 224 usable IRQ vectors from an I/O APIC. Vector numbers 0 to 31, out of 0 to 255, are reserved for exception handling by x86 processors.
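
      The 224 figure is just arithmetic on the vector range, which a few lines of Python can check. The function name is_usable_irq_vector is made up for this sketch.

```python
TOTAL_VECTORS = 256            # vector numbers 0 to 255
RESERVED_FOR_EXCEPTIONS = 32   # vector numbers 0 to 31


def is_usable_irq_vector(vector):
    """True if the vector is outside the range reserved for exceptions."""
    if not 0 <= vector < TOTAL_VECTORS:
        raise ValueError("vector must be in the range 0 to 255")
    return vector >= RESERVED_FOR_EXCEPTIONS


usable = [v for v in range(TOTAL_VECTORS) if is_usable_irq_vector(v)]
print(len(usable), usable[0], usable[-1])  # 224 32 255
```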



      I/O APICs


      Each I/O APIC contains a redirection table that is used to route the interrupts it receives from peripheral buses to one or more Local APICs.
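
      Here is a rough Python model of what a redirection table does. It is a toy, not the real register layout: the names RedirectionEntry and IoApic, the field names, and the vector numbers are all assumptions chosen for illustration. The default of 24 inputs matches the typical chipset figure quoted later in this article.

```python
from dataclasses import dataclass


@dataclass
class RedirectionEntry:
    vector: int            # vector number handed to the Local APIC(s)
    destinations: tuple    # IDs of the Local APICs that should receive it
    masked: bool = False   # a masked input delivers nothing


class IoApic:
    def __init__(self, num_inputs=24):
        # Every input starts out masked until software programs it.
        self.table = [RedirectionEntry(0, (), masked=True)
                      for _ in range(num_inputs)]

    def program(self, pin, vector, destinations):
        self.table[pin] = RedirectionEntry(vector, tuple(destinations))

    def raise_irq(self, pin):
        """Return (lapic_id, vector) pairs that this input would deliver."""
        entry = self.table[pin]
        if entry.masked:
            return []
        return [(lapic_id, entry.vector) for lapic_id in entry.destinations]


io_apic = IoApic()
io_apic.program(pin=9, vector=0x40, destinations=[0, 1])
print(io_apic.raise_irq(9))  # [(0, 64), (1, 64)]
print(io_apic.raise_irq(5))  # [] because input 5 was never programmed
```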



      Design Issues


      The Intel APIC architecture is well known for having a large amount of jitter (unwanted variation) in its interrupt latency.



      Hardware Bugs


      There are a number of known bugs in implementations of APIC systems, especially when it comes to how the 8259 is connected. There are defective BIOS implementations which don't set up interrupt routing properly. This includes errors in the implementation of ACPI tables and Intel Multiprocessor Specification tables.



      Operating System Issues


      The APIC can be a cause of system failure, as some versions of some operating systems don't support it properly. If this is the case, disabling the I/O APIC may cure the problem. For Linux, try the 'noapic nolapic' kernel parameters; for FreeBSD, the 'hint.apic.0.disabled' kernel environment variable. In Linux, problems with the I/O APIC are one of several causes of error messages concerning "spurious 8259A interrupt: IRQ7". It's also possible that the I/O APIC can cause problems with network interfaces based on the via-rhine driver, causing a transmission time out. Uniprocessor kernels with APIC enabled can cause spurious interrupts to be generated.



      PIC (8259) VS. APIC



      PIC (8259) Hardware Is Slow

      Let's stop for a moment and look at things from a Windows point of view. The PIC interrupt controller has a built-in hardware priority scheme that is not appropriate for machines running operating systems based on Windows NT Technology. To address this problem, a different hardware priority scheme is used by the OS.



      When the OS raises or lowers IRQL, a new mask is written into the 8259 that enables only the interrupts allowed at this IRQL. Raising or lowering IRQL causes either two "out" instructions or software simulation of one sort or another. Each of these I/O instructions must make it all the way to the South Bridge and back.
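
      As a rough sketch of the mask dance described above, the Python below turns an IRQL into a 16-bit mask and splits it into the two "out" writes, one per 8259. Ports 0x21 and 0xA1 are the usual mask-register ports of the master and slave 8259 on a PC. Everything else is an assumption for illustration: the IRQ-to-IRQL mapping is invented, the function names are made up, and details such as the cascade line on IRQ2 are ignored.

```python
MASTER_PIC_DATA = 0x21   # mask register port of the master 8259
SLAVE_PIC_DATA = 0xA1    # mask register port of the slave 8259

# Assumption: an illustrative IRQ-to-IRQL mapping, not taken from the article.
IRQ_TO_IRQL = {irq: 27 - irq for irq in range(16)}


def mask_for_irql(irql):
    """Build a 16-bit mask; a set bit disables that IRQ line."""
    mask = 0
    for irq, level in IRQ_TO_IRQL.items():
        if level <= irql:        # at or below the current IRQL: blocked
            mask |= 1 << irq
    return mask


def set_irql(irql, out):
    """Write the new mask with two port writes, one per 8259."""
    mask = mask_for_irql(irql)
    out(MASTER_PIC_DATA, mask & 0xFF)        # IRQ0 through IRQ7
    out(SLAVE_PIC_DATA, (mask >> 8) & 0xFF)  # IRQ8 through IRQ15


writes = []
set_irql(20, lambda port, value: writes.append((port, value)))
print([(hex(port), hex(value)) for port, value in writes])
# [('0x21', '0x80'), ('0xa1', '0xff')]
```

      The point to notice is that every IRQL change costs two separate port writes, which is exactly the overhead the article is complaining about.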



      The Compaq Alpha interrupt controller has software levels that are used in Windows NT and Windows 2000 to cause DPC and APC interrupts. On Intel Architecture platforms using a PIC, this has to be simulated because there isn't any hardware to cause the interrupts. When the OS drops below DISPATCH_LEVEL, it must check to see whether a DPC has been queued. If it has, it must simulate an interrupt.



      With an APIC device, the OS can queue a DPC at any time by sending itself an interrupt at a priority that matches DISPATCH_LEVEL. Then, whenever it lowers the IRQL below DISPATCH_LEVEL, the DPC fires in hardware with no software intervention at all.



      Raising or lowering IRQL on an APIC is just a matter of adjusting the Task Priority Register in the local APIC. This is just a "mov" instruction that adjusts a register inside the processor. It can happen much more quickly than multiple writes to the South Bridge. Keep in mind that every time anything is synchronized using any of the Win32 or Windows NT native synchronization primitives, the IRQL is changed at least twice. APICs, therefore, provide better speed in interrupt handling.
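
      The deferred DPC behaviour can be modeled in a few lines of Python. On x86, an interrupt's priority class is the upper four bits of its vector, and the local APIC holds back any interrupt whose class is not above the class in the Task Priority Register. The LocalApic class, the DPC_VECTOR value, and the TPR values below are assumptions for illustration; real Windows vector assignments differ.

```python
def priority_class(value):
    """Upper four bits of a vector (or of the TPR) give the priority class."""
    return (value & 0xFF) >> 4


class LocalApic:
    def __init__(self):
        self.tpr = 0          # Task Priority Register
        self.pending = set()  # vectors requested but not yet delivered

    def request(self, vector):
        self.pending.add(vector)
        return self._deliver()

    def write_tpr(self, value):
        self.tpr = value & 0xFF
        return self._deliver()

    def _deliver(self):
        """Deliver every pending vector whose class is above the TPR class."""
        threshold = priority_class(self.tpr)
        ready = sorted((v for v in self.pending
                        if priority_class(v) > threshold), reverse=True)
        self.pending.difference_update(ready)
        return ready


DPC_VECTOR = 0x41                  # hypothetical vector in priority class 4

lapic = LocalApic()
lapic.write_tpr(0x40)              # "raise IRQL": block class 4 and below
held = lapic.request(DPC_VECTOR)   # the OS sends itself the DPC interrupt
fired = lapic.write_tpr(0x00)      # "lower IRQL": the DPC is delivered
print(held, fired)                 # [] [65]
```

      Notice that nothing in the "lower IRQL" step has to check a queue in software. Writing the register is what releases the held interrupt.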



      Windows 95/98 both have a design requirement to support DOS device drivers. Because DOS drivers may assume that they can write directly to the 8259 PIC and its associated IDT entries, the APIC is unsupportable on these OS's. The 8259 PIC cannot be used in machines with multiple processors either.



      APICs - Not Just For Multi-Processor Systems




      The traditional 8259 PIC is subject to significant legacy issues. IRQs 0, 1, 2, 6, 8, 12, 13, 14, and 15 are consumed by legacy devices. Even when legacy devices are not present, these IRQs are often claimed by legacy software or firmware. IRQs 3 and 4 (COM2 and COM1) sometimes fall into this category as well.



      This leaves IRQs 5, 7, 9, 10, and 11 available for general use on a typical machine. Audio hardware is almost always programmed to use IRQ 5. That leaves us with only four IRQs available for other devices to use. Most machines today have far more than four devices that are programmed to interrupt. APIC interrupt subsystems can have as many IRQs as are required in a specific machine. Chipset vendors usually design I/O APICs to have 24 IRQs each, and a client machine almost always contains only one I/O APIC. This is enough to guarantee a dedicated IRQ for each PCI device, which would make sharing necessary only when the user installs many devices.
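
      The IRQ bookkeeping above is easy to check with a bit of set arithmetic in Python. The variable names are made up for this sketch; the numbers come straight from the text.

```python
ALL_IRQS = set(range(16))                     # IRQ0 through IRQ15
LEGACY = {0, 1, 2, 6, 8, 12, 13, 14, 15}      # consumed by legacy devices
SERIAL = {3, 4}                               # the COM ports
AUDIO = {5}                                   # audio hardware's usual pick

general_use = sorted(ALL_IRQS - LEGACY - SERIAL)
left_over = sorted(set(general_use) - AUDIO)

print(general_use)  # [5, 7, 9, 10, 11]
print(left_over)    # [7, 9, 10, 11]
```

      Four free lines out of sixteen, compared with the 24 inputs on a typical I/O APIC, is the whole argument in two numbers.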



      In an APIC-based system, PCI devices can be routed to interrupt controller inputs on an I/O APIC. Some can be routed directly to the I/O APIC, and some can be routed through the IRQ steering devices. Ideally, the chipset could include more steering devices. (No OEM has ever taken on the extra cost of providing steering devices outside the chipset, on single-processor systems.)



      Most laptops are equipped with so few IRQs that they ship with the COM port or other internal devices disabled to ensure that IRQs remain available for PCMCIA devices. It gets worse on machines using docking stations. Laptops usually ship with confusing utilities that allow the end user to disable the modem, just so they can enable the COM port, and so on. Compensating for the lack of IRQs in this way degrades the usability of the system by making users do what the software should do, and what it would do, if the hardware made it possible. The 8259 interrupt controller can actually drop interrupts, because of how it handles spurious interrupts. The APIC is less likely to have this problem.



      In conclusion, the industry is constantly pushing forward to fix a major flaw in the original interrupt design. They are using APIC style designs along with message signaled interrupts to accomplish this goal. We've gone over the many types of interrupt detection and resolution, the good, the bad, and the ugly of it all. By now you should also have a deeper understanding of what goes on under the hood, as well as how even the smallest delay can become quite noticeable over time.



      -=databat=-
