
Commit f036d4e

Andrew Morton authored and Linus Torvalds committed
[PATCH] ia32 Message Signalled Interrupt support
From: long <[email protected]>

Add support for Message Signalled Interrupt delivery on ia32.
With a fix from Zwane Mwaikambo <[email protected]>.
1 parent 82b699a commit f036d4e

File tree

18 files changed: +1837 −55 lines

Documentation/MSI-HOWTO.txt

Lines changed: 321 additions & 0 deletions
@@ -0,0 +1,321 @@
		The MSI Driver Guide HOWTO
	Tom L Nguyen [email protected]
			10/03/2003

1. About this guide

This guide describes the basics of Message Signaled Interrupts (MSI),
the advantages of using MSI over traditional interrupt mechanisms, and
how to enable your driver to use MSI or MSI-X. Also included is a
Frequently Asked Questions section.

2. Copyright 2003 Intel Corporation

3. What is MSI/MSI-X?

Message Signaled Interrupt (MSI), as described in the PCI Local Bus
Specification Revision 2.3 or later, is an optional feature for PCI
devices and a required feature for PCI Express devices. MSI enables a
device function to request service by sending an inbound Memory Write
on its PCI bus to the FSB as a Message Signaled Interrupt transaction.
Because an MSI is generated in the form of a Memory Write, all
transaction termination conditions, such as a Retry, Master-Abort,
Target-Abort or normal completion, are supported.

A PCI device that supports MSI must also support the pin-IRQ assertion
interrupt mechanism to provide backward compatibility for systems that
do not support MSI. In systems that support MSI, the bus driver is
responsible for initializing the message address and message data of
the device function's MSI/MSI-X capability structure during device
initial configuration.

An MSI-capable device function indicates MSI support by implementing
the MSI/MSI-X capability structure in its PCI capability list. The
device function may implement both the MSI capability structure and
the MSI-X capability structure; however, the bus driver should not
enable both, but instead enable only the MSI-X capability structure.

The MSI capability structure contains the Message Control register,
the Message Address register and the Message Data register. These
registers provide the bus driver control over MSI. The Message Control
register indicates the MSI capability supported by the device. The
Message Address register specifies the target address and the Message
Data register specifies the characteristics of the message. To request
service, the device function writes the content of the Message Data
register to the target address. The device and its software driver
are prohibited from writing to these registers.
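
As a purely illustrative view of that layout, the sketch below shows
how software could locate the MSI capability and read the Message
Control register through the standard config-space accessors. This is
not code from the patch; it only uses the long-standing
pci_find_capability()/pci_read_config_word() interfaces and the
PCI_MSI_* constants from <linux/pci.h>:

	#include <linux/pci.h>

	static void show_msi_capability(struct pci_dev *dev)
	{
		int pos = pci_find_capability(dev, PCI_CAP_ID_MSI);
		u16 control;

		if (!pos)
			return;		/* no MSI capability structure */

		/* The Message Control register lives at offset
		 * PCI_MSI_FLAGS within the capability; bits 3:1 encode
		 * the number of requestable messages as a power of two. */
		pci_read_config_word(dev, pos + PCI_MSI_FLAGS, &control);
		printk(KERN_INFO "%s: device can request %d message(s)\n",
		       dev->slot_name,
		       1 << ((control & PCI_MSI_FLAGS_QMASK) >> 1));
	}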

The MSI-X capability structure is an optional extension to MSI. It
uses an independent and separate capability structure. There are
some key advantages to implementing the MSI-X capability structure
over the MSI capability structure, as described below:

	- It supports a larger maximum number of vectors per function.

	- It provides the ability for system software to configure
	  each vector with an independent message address and message
	  data, specified by a table that resides in Memory Space.

	- MSI and MSI-X both support per-vector masking. Per-vector
	  masking is an optional extension of MSI but a required
	  feature for MSI-X. Per-vector masking provides the kernel
	  the ability to mask/unmask an MSI when servicing its
	  interrupt service routine. If per-vector masking is not
	  supported, then the device driver should provide the
	  hardware/software synchronization to ensure that the device
	  generates an MSI only when the driver wants it to do so.

4. Why use MSI?

By simplifying board design, MSI allows board designers to remove
out-of-band interrupt routing. MSI is another step towards a
legacy-free environment.

Due to increasing pressure on chipset and processor packages to
reduce pin count, the need for interrupt pins is expected to
diminish over time. Devices, due to pin constraints, may implement
messages to increase performance.

PCI Express endpoints use INTx emulation (in-band messages) instead
of IRQ pin assertion. Using INTx emulation requires interrupt
sharing among devices connected to the same node (PCI bridge), while
an MSI is unique (non-shared) and does not require BIOS configuration
support. As a result, the PCI Express technology requires MSI
support for better interrupt performance.

Using MSI enables the device functions to support two or more
vectors, which can be configured to target different CPUs to
increase scalability.

5. Configuring a driver to use MSI/MSI-X

Once the patch is installed, the kernel will not, by default, enable
MSI/MSI-X on every device that supports this capability. A kernel
configuration option must be selected to enable MSI/MSI-X support.

5.1 Including MSI support into the kernel

Including MSI support in the kernel requires users to apply the
VECTOR-base patch first and then the MSI patch, because the MSI
support needs the VECTOR-based scheme. Once these patches are
installed, setting CONFIG_PCI_USE_VECTOR enables the VECTOR-based
scheme and the option for MSI-capable device drivers to selectively
enable MSI (using pci_enable_msi as described below).

Since the target of the inbound message is the local APIC, the
CONFIG_PCI_USE_VECTOR option depends on CONFIG_X86_LOCAL_APIC
being enabled.

int pci_enable_msi(struct pci_dev *)

With this new API, any existing device driver that would like to have
MSI enabled on its device function must call it explicitly. A
successful call initializes the MSI/MSI-X capability structure with
ONE vector, regardless of whether the device function is capable of
supporting multiple messages. This vector replaces the pre-assigned
dev->irq with a new MSI vector. To avoid a conflict between the newly
assigned vector and the existing pre-assigned vector, the device
driver must call this API before calling request_irq(...).

The diagram below shows the events that switch an MSI-capable device
function between MSI mode and PIN-IRQ assertion mode:

	 ------------   pci_enable_msi    --------------------------
	 |          | <================   |                        |
	 | MSI MODE |                     | PIN-IRQ ASSERTION MODE |
	 |          |  ================>  |                        |
	 ------------      free_irq       --------------------------
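
To make the required ordering concrete, here is a minimal sketch of a
probe routine for a hypothetical "foo" driver (the driver name, its
handler, and the printk are made up for illustration; only
pci_enable_msi and its ordering relative to request_irq come from
this patch):

	#include <linux/pci.h>
	#include <linux/interrupt.h>

	static irqreturn_t foo_interrupt(int irq, void *dev_id,
					 struct pt_regs *regs)
	{
		/* ... service the device ... */
		return IRQ_HANDLED;
	}

	static int foo_probe(struct pci_dev *dev,
			     const struct pci_device_id *id)
	{
		if (pci_enable_device(dev))
			return -EIO;

		/* Must precede request_irq(): on success dev->irq is
		 * replaced with the new MSI vector; on failure the
		 * device simply stays in PIN-IRQ assertion mode. */
		if (pci_enable_msi(dev) == 0)
			printk(KERN_INFO "foo: running in MSI mode\n");

		/* dev->irq now holds either the MSI vector or the
		 * original pre-assigned pin IRQ. */
		return request_irq(dev->irq, foo_interrupt, SA_SHIRQ,
				   "foo", dev);
	}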

5.2 Configuring for MSI support

Due to the non-contiguous fashion of vector assignment in the
existing Linux kernel, this patch does not support multiple messages,
regardless of whether the device function is capable of supporting
more than one vector. The bus driver initializes only entry 0 of
this capability if pci_enable_msi(...) is called successfully by
the device driver.

5.3 Configuring for MSI-X support

Both the MSI capability structure and the MSI-X capability structure
share the semantics described above; however, because system software
can configure each vector of the MSI-X capability structure with an
independent message address and message data, the non-contiguous
fashion of vector assignment in the existing Linux kernel has no
impact on supporting multiple messages on an MSI-X capable device
function. By default, as mentioned above, ONE vector is always
allocated to the MSI-X capability structure at entry 0. The bus
driver does not initialize other entries of the MSI-X table.

Note that the PCI subsystem should have full control of an MSI-X
table that resides in Memory Space. The software device driver
should not access this table.

To request additional vectors, the device's software driver should
call the function msi_alloc_vectors(). It is recommended that the
driver call this function once, during the initialization phase of
the device driver.

The function msi_alloc_vectors(), once invoked, enables either
all or nothing, depending on the current availability of vector
resources. If no vector resources are available, the device function
still works with ONE vector. If the vector resources are available
for the number of vectors requested by the driver, this function
will reconfigure the MSI-X capability structure of the device with
additional messages, starting from entry 1. To put this in
perspective, a device may be capable of supporting a maximum of 32
vectors while its software driver may typically request only 4.

For each vector, after this successful call, the device driver is
responsible for calling other functions like request_irq(),
enable_irq(), etc. to enable that vector with its corresponding
interrupt service handler. It is the device driver's choice whether
all vectors share the same interrupt service handler or each vector
has a unique interrupt service handler.

In addition to the function msi_alloc_vectors(), another function,
msi_free_vectors(), is provided to allow the software driver to
release a number of vectors back to the vector resources. Once
invoked, the PCI subsystem disables (masks) each vector released.
These vectors are no longer valid for the hardware device and its
software driver to use. As with free_irq(), it is recommended that
the device driver call msi_free_vectors() to release all additional
vectors previously requested.

int msi_alloc_vectors(struct pci_dev *dev, int *vector, int nvec)

This API enables the software driver to ask the PCI subsystem
for additional messages. Depending on the number of vectors
available, the PCI subsystem enables either all or none of them.

Argument dev points to the device (pci_dev) structure.
Argument vector is a pointer of integer type. The number of
elements is indicated in argument nvec.
Argument nvec is an integer indicating the number of messages
requested.
A return of zero indicates that the requested number of vectors
was successfully allocated. Otherwise, it indicates that the
resources are not available.

int msi_free_vectors(struct pci_dev* dev, int *vector, int nvec)

This API enables the software driver to inform the PCI subsystem
that it is willing to release a number of vectors back to the
MSI resource pool. Once invoked, the PCI subsystem disables each
MSI-X entry associated with each vector stored in argument vector.
These vectors are no longer valid for the hardware device and
its software driver to use.

Argument dev points to the device (pci_dev) structure.
Argument vector is a pointer of integer type. The number of
elements is indicated in argument nvec.
Argument nvec is an integer indicating the number of messages
released.
A return of zero indicates that the given number of vectors
was successfully released. Otherwise, it indicates a failure.
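
Pulling the two calls together, here is a hypothetical usage sketch.
The FOO_NVEC constant, foo_vectors[] array, "foo" name, and the
foo_interrupt handler (from the earlier sketch) are made up for
illustration; the two MSI calls follow the descriptions above:

	#define FOO_NVEC 4

	static int foo_vectors[FOO_NVEC];

	/* Called once from the driver's initialization path. */
	static void foo_setup_vectors(struct pci_dev *dev)
	{
		int i;

		/* All-or-nothing: if this fails, the device keeps
		 * running with the single vector already in dev->irq. */
		if (msi_alloc_vectors(dev, foo_vectors, FOO_NVEC))
			return;

		for (i = 0; i < FOO_NVEC; i++)
			request_irq(foo_vectors[i], foo_interrupt, 0,
				    "foo", dev);
	}

	/* Called from the driver's teardown path: free_irq() each
	 * vector first, then return the vectors to the MSI pool. */
	static void foo_teardown_vectors(struct pci_dev *dev)
	{
		int i;

		for (i = 0; i < FOO_NVEC; i++)
			free_irq(foo_vectors[i], dev);
		msi_free_vectors(dev, foo_vectors, FOO_NVEC);
	}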

5.4 Hardware requirements for MSI support

MSI support requires support from both system hardware and
individual hardware device functions.

5.4.1 System hardware support

Since the target of an MSI address is the CPU's local APIC, enabling
MSI support in the Linux kernel depends on whether the existing
system hardware supports a local APIC. Users should verify that
their system runs when CONFIG_X86_LOCAL_APIC=y.

In an SMP environment, CONFIG_X86_LOCAL_APIC is automatically set;
however, in a UP environment, users must manually set
CONFIG_X86_LOCAL_APIC. Once CONFIG_X86_LOCAL_APIC=y, setting
CONFIG_PCI_USE_VECTOR enables the VECTOR-based scheme and
the option for MSI-capable device drivers to selectively enable
MSI (using pci_enable_msi as described below).

Note that the CONFIG_X86_IO_APIC setting is irrelevant because each
MSI vector is newly allocated at runtime and MSI support does not
depend on BIOS support. This key independence enables MSI support
on future IOxAPIC-free platforms.

5.4.2 Device hardware support

A hardware device function indicates MSI support through the
MSI/MSI-X capability structure on its PCI capability list. By
default, this capability structure will not be initialized by
the kernel to enable MSI during the system boot; in other words,
the device function runs in its default pin assertion mode. Note
that in many cases hardware supporting MSI has bugs, which may
result in a system hang. The software driver of specific
MSI-capable hardware is responsible for deciding whether to call
pci_enable_msi or not. A return of zero indicates that the kernel
successfully initialized the MSI/MSI-X capability structure of the
device function. The device function is now running in MSI mode.

5.5 How to tell whether MSI is enabled on a device function

At the driver level, a return of zero from pci_enable_msi(...)
indicates to the device driver that its device function is
initialized successfully and ready to run in MSI mode.

At the user level, users can run 'cat /proc/interrupts' to display
the vector allocated for each device and its interrupt mode, as
shown below.

	           CPU0       CPU1
	  0:     324639          0    IO-APIC-edge   timer
	  1:       1186          0    IO-APIC-edge   i8042
	  2:          0          0    XT-PIC         cascade
	 12:       2797          0    IO-APIC-edge   i8042
	 14:       6543          0    IO-APIC-edge   ide0
	 15:          1          0    IO-APIC-edge   ide1
	169:          0          0    IO-APIC-level  uhci-hcd
	185:          0          0    IO-APIC-level  uhci-hcd
	193:        138         10    PCI MSI        aic79xx
	201:         30          0    PCI MSI        aic79xx
	225:         30          0    IO-APIC-level  aic7xxx
	233:         30          0    IO-APIC-level  aic7xxx
	NMI:          0          0
	LOC:     324553     325068
	ERR:          0
	MIS:          0

6. FAQ

Q1. Are there any limitations on using MSI?

A1. If the PCI device supports MSI and conforms to the
specification, and the platform supports the APIC local bus,
then using MSI should work.

Q2. Will it work on all the Pentium processors (P3, P4, Xeon,
AMD processors)? On P3, IPIs are transmitted on the APIC local
bus, and on P4 and Xeon they are transmitted on the system
bus. Are there any implications with this?

A2. MSI support enables a PCI device to send an inbound memory
write (with 0xfeexxxxx as the target address) on its PCI bus
directly to the FSB. Since the message address has the
redirection hint bit cleared, it should work.
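
For reference (this decoding comes from the Intel architecture
documentation, not from this patch), the 0xfeexxxxx message address
breaks down roughly as sketched below, which is why a cleared
redirection hint (RH) bit pins delivery to the local APIC named by
the destination ID:

	/* Illustrative layout of an ia32 MSI message address:
	 *   bits 31:20  0xFEE (fixed)
	 *   bits 19:12  destination APIC ID
	 *   bit  3      redirection hint (RH), cleared here
	 *   bit  2      destination mode (DM)
	 */
	u32 msi_addr = 0xfee00000 | (dest_apic_id << 12);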

Q3. The target address 0xfeexxxxx will be translated by the
Host Bridge into an interrupt message. Are there any
limitations on the chipsets such as Intel 8xx, Intel e7xxx,
or VIA?

A3. If these chipsets support an inbound memory write with the
target address set to 0xfeexxxxx, conforming to the PCI
specification revision 2.3 or later, then it should work.

Q4. From the driver's point of view, if an MSI is lost because
of errors that occur during the inbound memory write, then the
driver may wait forever. Is there a mechanism for it to recover?

A4. Since the target of the transaction is an inbound memory
write, all transaction termination conditions (Retry,
Master-Abort, Target-Abort, or normal completion) are
supported. A device sending an MSI must abide by all the PCI
rules and conditions regarding that inbound memory write. So,
if a retry is signaled it must retry, etc. We believe that
the recommendation for an Abort is also a retry (refer to the
PCI specification revision 2.3 or later).

arch/i386/Kconfig

Lines changed: 19 additions & 0 deletions
@@ -1030,6 +1030,25 @@ config PCI_DIRECT
 	depends on PCI && ((PCI_GODIRECT || PCI_GOANY) || X86_VISWS)
 	default y
 
+config PCI_USE_VECTOR
+	bool "Vector-based interrupt indexing"
+	depends on X86_LOCAL_APIC
+	default n
+	help
+	   This replaces the current existing IRQ-based index interrupt scheme
+	   with the vector-base index scheme. The advantages of vector base
+	   over IRQ base are listed below:
+	   1) Support MSI implementation.
+	   2) Support future IOxAPIC hotplug
+
+	   Note that this enables MSI, Message Signaled Interrupt, on all
+	   MSI capable device functions detected if users also install the
+	   MSI patch. Message Signal Interrupt enables an MSI-capable
+	   hardware device to send an inbound Memory Write on its PCI bus
+	   instead of asserting IRQ signal on device IRQ pin.
+
+	   If you don't know what to do here, say N.
+
 source "drivers/pci/Kconfig"
 
 config ISA

arch/i386/kernel/i8259.c

Lines changed: 3 additions & 1 deletion
@@ -419,8 +419,10 @@ void __init init_IRQ(void)
 	 * us. (some of these will be overridden and become
 	 * 'special' SMP interrupts)
 	 */
-	for (i = 0; i < NR_IRQS; i++) {
+	for (i = 0; i < (NR_VECTORS - FIRST_EXTERNAL_VECTOR); i++) {
 		int vector = FIRST_EXTERNAL_VECTOR + i;
+		if (i >= NR_IRQS)
+			break;
 		if (vector != SYSCALL_VECTOR)
 			set_intr_gate(vector, interrupt[i]);
 	}

0 commit comments