Last updated July 30, 1998
Writing
a high-performance server application requires implementing
an efficient threading model. Having either too
few or too many server threads to process client
requests can lead to performance problems. For example,
if a server creates a single thread to handle all
requests, clients can become starved since the server
will be tied up processing one request at a time.
Of course, a single thread could simultaneously
process multiple requests, switching from one to
another as I/O operations are started, but this
architecture introduces significant complexity and
cannot take advantage of multiprocessor systems.
At the other extreme, a server could create a big
pool of threads so that virtually every client request
is processed by a dedicated thread. This scenario
usually leads to thread thrashing, where many
threads wake up, perform some CPU processing, block
waiting for I/O and then, after request processing
is completed, block again waiting for a new request.
If nothing else, context switches are caused by
the scheduler having to divide processor time among
multiple active threads.
The goal of a server is to
incur as few context switches as possible by having
its threads avoid unnecessary blocking, while at
the same time maximizing parallelism by using multiple
threads. The ideal is for there to be a thread actively
servicing a client request on every processor and
for those threads not to block if there are additional
requests waiting when they complete a request. For
this to work correctly, however, there must be a
way for the application to activate another thread
when one processing a client request blocks on I/O
(like when it reads from a file as part of the processing).
Windows NT 3.5 introduced
a set of APIs that make this goal relatively easy
to achieve. The APIs are centered on an object called
a completion port. In this article I'm going
to provide an overview of how completion ports are
used and then go inside them to show you how Windows
NT implements them.
Applications use completion
ports as the focal point for the completion
of I/O associated with multiple file handles. Once
a file is associated with a completion port, any
asynchronous I/O operations that complete on the
file result in a completion packet being queued
to the port. A thread can wait for any outstanding
I/Os to complete on multiple files simply by waiting
for a completion packet to be queued on the completion
port. The Win32 API provides similar functionality
with the WaitForMultipleObjects API, but
the advantage that completion ports have is that
concurrency, or the number of threads that an application
has actively servicing client requests, is controlled
with the aid of the system.
When an application creates a completion port, it
specifies a concurrency value. This value indicates
the maximum number of threads associated with the
port that should be running at any given point in
time. As I stated earlier, the ideal is to have
one thread active at any given point in time for
every processor in the system. The concurrency value
associated with a port is used by NT to control
how many threads an application has active. If
the number of active threads associated with a port
equals the concurrency value, then a thread that
is waiting on the completion port will not be allowed
to run. Instead, it is expected that one of the
active threads will finish processing its current
request and check to see if there's another packet
waiting at the port. If there is, it simply
grabs it and goes off to process it. When this happens
there is no context switch, and the CPUs are utilized
to near their full capacity.
Figure 1 below shows a high-level picture of completion
port operation. Incoming client requests cause completion
packets to be queued at the port. A number of threads,
up to the concurrency limit for the port, are allowed
by NT to process client requests. Any additional
threads associated with the port are blocked until
the number of active threads drops, as can happen
when an active thread blocks on file I/O. I'll discuss
this further a little later.
A completion port is created with a call to the
Win32 API CreateIoCompletionPort:
HANDLE CreateIoCompletionPort(
    HANDLE FileHandle,
    HANDLE ExistingCompletionPort,
    DWORD CompletionKey,
    DWORD NumberOfConcurrentThreads
);
To create the port, an application passes in NULL
for the ExistingCompletionPort parameter
and indicates the concurrency value with the NumberOfConcurrentThreads
parameter. If a FileHandle parameter is specified,
the file handle becomes associated with the
port. When an I/O request that has been issued on
the file handle completes, a completion packet is
queued to the completion port. To retrieve a completion
packet, possibly blocking while waiting for one to arrive,
a thread calls the GetQueuedCompletionStatus
API:
BOOL GetQueuedCompletionStatus(
    HANDLE CompletionPort,
    LPDWORD lpNumberOfBytesTransferred,
    LPDWORD CompletionKey,
    LPOVERLAPPED *lpOverlapped,
    DWORD dwMillisecondTimeout
);
Threads that block on a completion port become associated
with the port and are woken in LIFO order so that
the thread that blocked most recently is the one
that is given the next packet. Threads that block
for long periods of time can have their stacks swapped
out to disk, so if there are more threads associated
with a port than there is work to process, the in-memory
footprints of the threads blocked the longest are minimized.
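The LIFO wake order can be pictured as a stack of waiting threads. The following is a minimal sketch in portable C, not NT's code: `waiter_stack`, `waiter_push`, and `waiter_pop` are hypothetical names, and thread IDs are plain integers chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 64

/* Toy stand-in for the list of threads blocked on a completion port. */
typedef struct {
    int ids[MAX_WAITERS]; /* thread IDs, most recent blocker on top */
    size_t count;
} waiter_stack;

/* A thread blocks on the port: push it. Returns 0 on success, -1 if full. */
static int waiter_push(waiter_stack *s, int thread_id)
{
    assert(s != NULL);
    if (s->count == MAX_WAITERS)
        return -1;
    s->ids[s->count++] = thread_id;
    return 0;
}

/* A packet arrives: wake the most recent blocker. Returns -1 if none wait. */
static int waiter_pop(waiter_stack *s)
{
    assert(s != NULL);
    if (s->count == 0)
        return -1;
    return s->ids[--s->count];
}
```

If threads 1, 2, and 3 block in that order, the next packet goes to thread 3. Threads 1 and 2 stay at the bottom of the stack, which is why the longest-idle threads are the ones whose stacks can be swapped out.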
A server application will usually receive client
requests via network endpoints that are represented
as file handles. Examples include Winsock2 sockets
or named pipes. As the server creates its communications
endpoints it associates them with a completion port
and its threads wait for incoming requests by calling
GetQueuedCompletionStatus on the port. When
a thread is given a packet from the completion port
it will go off and start processing the request,
becoming an active thread. Many times a thread will
block during its processing, like when it needs
to read or write data to a file on disk, or when
it synchronizes with other threads. Windows NT is
clever enough to detect this and recognize that
the completion port has one fewer active thread.
Therefore, when a thread becomes inactive because
it blocks, a thread waiting on the completion port
will be woken if there is a packet in the queue.
Microsoft's guidelines are to set the concurrency
value roughly equal to the number of processors
in a system. Note that it is possible for the number
of active threads for a completion port to exceed
the concurrency limit. Consider a case where the
limit is specified as 1. A client request comes
in and a thread is dispatched to process the request,
becoming active. A second request comes in, but
a second thread waiting on the port is not allowed
to proceed because the concurrency limit has been
reached. Then the first thread blocks waiting for
a file I/O, so it becomes inactive. The second thread
is then released, and while it is still active the
first thread's file I/O completes, making it
active again. At that point in time, and until one
of the threads blocks, the active count is
2, which is higher than the limit of 1. Most of
the time the active count will remain at or just
above the concurrency limit.
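The overshoot scenario can be replayed with simple bookkeeping. This is a toy model that tracks only a limit and an active count, with no real threads or I/O; `port_model` and the functions below are hypothetical names, not Windows APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: only the counters, no real threads or I/O. */
typedef struct {
    int limit;   /* concurrency value given at port creation */
    int active;  /* threads currently processing requests */
} port_model;

/* A waiting thread may be released only while active < limit. */
static bool port_try_release(port_model *p)
{
    assert(p != NULL);
    if (p->active >= p->limit)
        return false;
    p->active++;
    return true;
}

/* An active thread blocks on something other than the port. */
static void port_thread_blocked(port_model *p)
{
    assert(p != NULL && p->active > 0);
    p->active--;
}

/* A blocked thread's wait is satisfied; it resumes regardless of the limit. */
static void port_thread_resumed(port_model *p)
{
    assert(p != NULL);
    p->active++;
}

/* Replays the scenario from the text; returns the final active count,
   or -1 if any step did not behave as the text describes. */
static int replay_overshoot(void)
{
    port_model p = { 1, 0 };
    bool first = port_try_release(&p);   /* thread 1 dispatched, active = 1 */
    bool second = port_try_release(&p);  /* thread 2 held back, limit reached */
    port_thread_blocked(&p);             /* thread 1 waits on file I/O, active = 0 */
    bool retry = port_try_release(&p);   /* thread 2 released, active = 1 */
    port_thread_resumed(&p);             /* thread 1's I/O completes, active = 2 */
    return (first && !second && retry) ? p.active : -1;
}
```

The limit is checked only when a waiting thread asks to be released, never when a blocked thread resumes, which is how the active count in this model ends at 2 against a limit of 1.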
The completion port API also makes it possible for
a server application to queue privately defined
completion packets to a completion port using PostQueuedCompletionStatus.
Servers typically use this function to inform their
threads of external events such as the need to shut
down gracefully.
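One way to picture the shutdown case is a reserved key that worker threads treat as a stop signal. The sketch below is a portable model rather than Win32 code: `queue_post` and `queue_take` merely stand in for PostQueuedCompletionStatus and GetQueuedCompletionStatus, and `SHUTDOWN_KEY`, `packet`, `packet_queue`, and `worker_run` are hypothetical names. Unlike a real worker, this loop returns on an empty queue instead of blocking.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16
#define SHUTDOWN_KEY 0xFFFFFFFFu /* hypothetical reserved key value */

typedef struct {
    unsigned long key;    /* plays the role of CompletionKey */
    unsigned long bytes;  /* plays the role of the byte count */
} packet;

typedef struct {
    packet slots[QUEUE_CAP];
    size_t head, count;
} packet_queue;

/* Stand-in for posting a packet; returns 0 on success, -1 if full. */
static int queue_post(packet_queue *q, unsigned long key, unsigned long bytes)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAP)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_CAP;
    q->slots[tail].key = key;
    q->slots[tail].bytes = bytes;
    q->count++;
    return 0;
}

/* Stand-in for retrieving the oldest packet; 0 on success, -1 if empty. */
static int queue_take(packet_queue *q, packet *out)
{
    assert(q != NULL && out != NULL);
    if (q->count == 0)
        return -1;
    *out = q->slots[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return 0;
}

/* Worker loop: handle packets until the shutdown key is seen.
   Returns how many ordinary requests were handled. */
static int worker_run(packet_queue *q)
{
    packet p;
    int handled = 0;
    while (queue_take(q, &p) == 0) {
        if (p.key == SHUTDOWN_KEY)
            break;      /* private packet: stop taking work */
        handled++;      /* a real server would process the request here */
    }
    return handled;
}
```

Because each packet is delivered to exactly one thread, a server with several workers would presumably post one shutdown packet per thread.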
A call to the Win32 API
CreateIoCompletionPort with a NULL completion
port handle results in the execution of the native
API function NtCreateIoCompletion, which
invokes the corresponding kernel-mode system service
of the same name. Internally, completion ports
are based on an undocumented executive synchronization
object called a Queue. Thus, the system
service creates a completion port object and initializes
a queue object in the port's allocated memory
(a pointer to the port also points to the queue
object since the queue is at the start of the
port memory). A queue object has (coincidentally)
a concurrency value that is specified when a thread
initializes one, and in this case the value that
is used is the one that was passed to CreateIoCompletionPort.
KeInitializeQueue is the function that
NtCreateIoCompletion calls to initialize
a port's queue object.
When an application calls
CreateIoCompletionPort to associate a file
handle with a port, the Win32 API invokes the native
function NtSetInformationFile with the
file handle as the primary parameter. The information
class that is set is FileCompletionInformation
and the completion port's handle and the CompletionKey
parameter from CreateIoCompletionPort are
the data values. NtSetInformationFile dereferences
the file handle to obtain the file object and
allocates a completion context data structure,
which is defined in NTDDK.H as:
typedef struct _IO_COMPLETION_CONTEXT {
    PVOID Port;
    ULONG Key;
} IO_COMPLETION_CONTEXT, *PIO_COMPLETION_CONTEXT;
Finally, NtSetInformationFile
sets the CompletionContext field in the file
object to point at the context structure. When an
I/O operation completes on a file object the internal
I/O manager function IopCompleteRequest executes
and, if the I/O was asynchronous, checks to see
if the CompletionContext field in the file
object is non-NULL. If it's non-NULL, the I/O Manager
allocates a completion packet and queues it to the
completion port by calling KeInsertQueue
with the port as the queue on which to insert the
packet (remember that the completion port object
and queue object are synonymous).
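The completion-time check described above can be sketched with simplified stand-in structures. This is an illustrative model, not the I/O Manager's code: every `toy_` name is hypothetical, and the port is reduced to a packet counter plus the key of the last packet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel structures carry far more state. */
typedef struct {
    int queued;              /* number of packets queued to this port */
    unsigned long last_key;  /* key carried by the most recent packet */
} toy_port;

typedef struct {
    toy_port *port;          /* mirrors IO_COMPLETION_CONTEXT.Port */
    unsigned long key;       /* mirrors IO_COMPLETION_CONTEXT.Key */
} toy_completion_context;

typedef struct {
    toy_completion_context *completion_context; /* NULL: no port associated */
} toy_file_object;

/* Models the check made when an I/O finishes on a file object.
   Returns true if a completion packet was queued to the port. */
static bool toy_complete_request(toy_file_object *file, bool was_async)
{
    assert(file != NULL);
    if (!was_async || file->completion_context == NULL)
        return false;   /* synchronous I/O or no port: nothing is queued */
    toy_port *port = file->completion_context->port;
    port->queued++;
    port->last_key = file->completion_context->key;
    return true;
}
```

The key stored at association time travels with every packet, which lets a worker identify which file a completion belongs to.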
When GetQueuedCompletionStatus
is invoked by a server thread, it calls the native
API function NtRemoveIoCompletion, which
transfers control to the NtRemoveIoCompletion
system service. After validating parameters and
translating the completion port handle to a pointer
to the port, NtRemoveIoCompletion calls KeRemoveQueue.
As you can see, KeRemoveQueue
and KeInsertQueue are the engine behind completion
ports and are the functions that determine whether
a thread waiting for an I/O completion packet should
be activated or not. Internally, a queue object
maintains a count of the current number of active
threads and the maximum active threads. If the current
number equals or exceeds the maximum when a thread
calls KeRemoveQueue, the thread will be put
(in LIFO order) onto a list of threads waiting for
a turn to process a completion packet. The list
of threads hangs off the queue object. A thread's
control block data structure has a pointer in it
that references the queue object of a queue that
it is associated with; if the pointer is NULL then
the thread is not associated with a queue.
So how does NT keep track
of threads that become inactive because they block
on something other than the completion port? The
answer lies in the queue pointer in a thread's control
block. The scheduler routines that are executed
in response to a thread blocking (KeWaitForSingleObject,
KeDelayExecutionThread, etc.) check the thread's
queue pointer and, if it's not NULL, they call
KiActivateWaiterQueue, a queue-related function.
KiActivateWaiterQueue decrements the count
of active threads associated with the queue, and
if the result is less than the maximum and there
is at least one completion packet in the queue then
the thread at the front of the queue's thread list
is woken and given the oldest packet. Conversely,
whenever a thread that is associated with a queue
wakes up after blocking, the scheduler executes the
function KiUnwaitThread, which increments
the queue's active count.
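The role of the per-thread queue pointer can be sketched as a pair of hooks. This is a toy model of the accounting only, not the kernel routines: the `toy_` names are hypothetical, packets and waiters are reduced to counts, and counting the woken waiter as active straight away is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    int current;  /* active threads associated with the queue */
    int maximum;  /* concurrency value */
    int packets;  /* completion packets waiting in the queue */
    int waiters;  /* threads blocked on the queue itself */
} toy_queue;

typedef struct {
    toy_queue *queue;  /* NULL: thread is not associated with any queue */
} toy_thread;

/* Called when a thread blocks on something other than its queue.
   Returns true if a waiter would be woken and handed a packet. */
static bool toy_on_block(toy_thread *t)
{
    assert(t != NULL);
    toy_queue *q = t->queue;
    if (q == NULL)
        return false;       /* unassociated thread: no bookkeeping */
    q->current--;
    if (q->current < q->maximum && q->packets > 0 && q->waiters > 0) {
        q->packets--;
        q->waiters--;
        q->current++;       /* assumed: the woken waiter counts as active */
        return true;
    }
    return false;
}

/* Called when an associated thread's wait is satisfied. */
static void toy_on_unwait(toy_thread *t)
{
    assert(t != NULL);
    if (t->queue != NULL)
        t->queue->current++;
}
```

A thread whose queue pointer is NULL passes through both hooks without touching any counters, so only threads that have waited on a port take part in its concurrency accounting.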
Finally, the PostQueuedCompletionStatus Win32
API calls upon the native function NtSetIoCompletion.
As with the other native APIs in the completion
port group, this one invokes a system service bearing
the same name, which simply inserts the packet
onto the completion port's queue using KeInsertQueue.
Windows
NT's completion port API provides an easy-to-use
and efficient way to maximize a server's performance
by minimizing context switches while obtaining high degrees
of parallelism. The API is made possible with support
in the I/O Manager, Kernel, and system services.
While the Queue object is exported for use by device
drivers (it is undocumented but its interfaces are
relatively easy to figure out), the completion port
APIs are not. However, if the queue interfaces are
derived, it is possible to mimic the completion
port interfaces by simply using the queue routines
and manually associating file objects with queues
by setting the CompletionContext entry.