24/05/2018, 22:46

Cancelling I/O Requests

Just as happens with people in real life, programs sometimes change their mind about the I/O requests they’ve asked you to perform for them. We’re not talking about simple fickleness here. Applications might terminate after issuing requests that will take a long time to complete, leaving requests outstanding. Such an occurrence is especially likely in the WDM world, where the insertion of new hardware might require you to stall requests while the Configuration Manager rebalances resources or where you might be told at any moment to power down your device.

To cancel a request in kernel mode, someone calls IoCancelIrp. The operating system automatically calls IoCancelIrp for every IRP that belongs to a thread that’s terminating with requests still outstanding. A user-mode application can call CancelIo to cancel all outstanding asynchronous operations issued by a given thread on a file handle. IoCancelIrp would like to simply complete the IRP it’s given with STATUS_CANCELLED, but there’s a hitch: IoCancelIrp doesn’t know where you have salted away pointers to the IRP, and it doesn’t know for sure whether you’re currently processing the IRP. So it relies on a cancel routine you provide to do most of the work of cancelling an IRP.

It turns out that a call to IoCancelIrp is more of a suggestion than a demand. It would be nice if every IRP that somebody tried to cancel really got completed with STATUS_CANCELLED. But it’s OK if a driver wants to go ahead and finish the IRP normally if that can be done relatively quickly. You should provide a way to cancel I/O requests that might spend significant time waiting in a queue between a dispatch routine and a StartIo routine. How long is significant is a matter for your own sound judgment; my advice is to err on the side of providing for cancellation because it’s not that hard to do and makes your driver fit better into the operating system.

If It Weren’t for Multitasking…

An intricate synchronization problem is associated with cancelling IRPs. Before I explain the problem and the solution, I want to describe the way cancellation would work in a world where there was no multitasking and no concern with multiprocessor computers. In that utopia, several pieces of the I/O Manager would fit together with your StartIo routine and with a cancel routine you’d provide, as follows:

  • When you queue an IRP, you set the CancelRoutine pointer in the IRP to the address of your cancel routine. When you dequeue the IRP, you set CancelRoutine to NULL.
  • IoCancelIrp unconditionally sets the Cancel flag in the IRP. Then it checks to see whether the CancelRoutine pointer in the IRP is NULL. While the IRP is in your queue, CancelRoutine will be non-NULL. In this case, IoCancelIrp calls your cancel routine. Your cancel routine removes the IRP from the queue where it currently resides and completes the IRP with STATUS_CANCELLED.
  • Once you dequeue the IRP, IoCancelIrp finds the CancelRoutine pointer set to NULL, so it doesn’t call your cancel routine. You process the IRP to completion with reasonable promptness (a concept that calls for engineering judgment), and it doesn’t matter to anyone that you didn’t actually cancel the IRP.
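
The three rules above can be modeled in a few lines of ordinary user-mode C. This is only a sketch of the race-free scenario, not kernel code: ModelIrp, enqueue, dequeue, on_cancel, and model_cancel are hypothetical stand-ins invented for illustration, and no locking is shown because this imaginary world has no concurrency.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-mode stand-ins; this is not the real IRP structure. */
enum { ST_PENDING, ST_CANCELLED, ST_SUCCESS };

typedef struct ModelIrp ModelIrp;
typedef void (*CancelFn)(ModelIrp *irp);

struct ModelIrp {
    int       cancel;          /* models Irp->Cancel */
    CancelFn  cancel_routine;  /* models Irp->CancelRoutine */
    int       status;          /* final status once completed */
    ModelIrp *next;            /* singly linked queue link */
};

static ModelIrp *queue_head = NULL;

/* Cancel routine: unlink the IRP from the queue and complete it as cancelled. */
static void on_cancel(ModelIrp *irp)
{
    ModelIrp **pp = &queue_head;
    while (*pp && *pp != irp)
        pp = &(*pp)->next;
    if (*pp)
        *pp = irp->next;
    irp->next = NULL;
    irp->status = ST_CANCELLED;
}

/* Queue insertion installs the cancel routine. */
static void enqueue(ModelIrp *irp)
{
    assert(irp != NULL);
    irp->cancel = 0;
    irp->status = ST_PENDING;
    irp->cancel_routine = on_cancel;
    irp->next = NULL;
    ModelIrp **pp = &queue_head;
    while (*pp)
        pp = &(*pp)->next;
    *pp = irp;
}

/* Queue removal clears the cancel routine. */
static ModelIrp *dequeue(void)
{
    ModelIrp *irp = queue_head;
    if (irp) {
        queue_head = irp->next;
        irp->next = NULL;
        irp->cancel_routine = NULL;
    }
    return irp;
}

/* Models IoCancelIrp without races: set the flag, call the routine if any. */
static int model_cancel(ModelIrp *irp)
{
    irp->cancel = 1;
    if (irp->cancel_routine) {
        CancelFn fn = irp->cancel_routine;
        irp->cancel_routine = NULL;
        fn(irp);
        return 1;
    }
    return 0;
}
```

Cancelling a queued request in this model ends with ST_CANCELLED, while cancelling a request that has already been dequeued only sets the flag and leaves completion to whoever is processing it.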

Synchronizing Cancellation

Unfortunately for us as programmers, we write code for a multiprocessing, multitasking environment in which effects can sometimes appear to precede causes. There are many possible race conditions between the queue insertion, queue removal, and cancel routines in the naive scenario I just described. For example, what would happen if IoCancelIrp called your cancel routine to cancel an IRP that happened to be at the head of your queue? If you were simultaneously removing an IRP from the queue on another CPU, you can see that your cancel routine would probably conflict with your queue removal logic. But this is just the simplest of the possible races.

In earlier times, driver programmers dealt with the cancel races by using a global spin lock—the cancel spin lock. Because you shouldn’t use this spin lock for synchronization in your own driver, I’ve explained it briefly in the sidebar. Read the sidebar for its historical perspective, but don’t plan to use this lock.

Here is a sketch of IoCancelIrp. You need to know this to correctly write IRP-handling code. (This isn’t a copy of the Windows XP source code—it’s an abridged excerpt.)

BOOLEAN IoCancelIrp(PIRP Irp)
  {
  IoAcquireCancelSpinLock(&Irp->CancelIrql);                      // 1
  Irp->Cancel = TRUE;                                             // 2
  PDRIVER_CANCEL CancelRoutine = IoSetCancelRoutine(Irp, NULL);   // 3
  if (CancelRoutine)
    {
    PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
    (*CancelRoutine)(stack->DeviceObject, Irp);                   // 4
    return TRUE;
    }
  else
    {
    IoReleaseCancelSpinLock(Irp->CancelIrql);                     // 5
    return FALSE;
    }
  }

  1. IoCancelIrp first acquires the global cancel spin lock. As you know if you read the sidebar earlier, lots of old drivers contend for the use of this lock in their normal IRP-handling path. New drivers hold this lock only briefly while handling the cancellation of an IRP.
  2. Setting the Cancel flag to TRUE alerts any interested party that IoCancelIrp has been called for this IRP.
  3. IoSetCancelRoutine performs an interlocked exchange to simultaneously retrieve the existing CancelRoutine pointer and set the field to NULL in one atomic operation.
  4. IoCancelIrp calls the cancel routine, if there is one, without first releasing the global cancel spin lock. The cancel routine must release the lock! Note also that the device object argument to the cancel routine comes from the current stack location, where IoCallDriver is supposed to have left it.
  5. If there is no cancel routine, IoCancelIrp itself releases the global cancel spin lock. Good idea, huh?
  • Could someone call IoCancelIrp twice? The thing to think about is that the Cancel flag might be set in an IRP because of some number of primeval calls to IoCancelIrp and that someone might call IoCancelIrp one more time (getting a little impatient, are we?) while StartPacket is active. This wouldn’t matter because our first test of the Cancel flag occurs after we install our cancel pointer. We would find the flag set to TRUE in this hypothetical situation and would therefore execute the second call to IoSetCancelRoutine. Either IoCancelIrp or we win the race to reset the cancel pointer to NULL, and whoever wins ends up completing the IRP. The residue from the primeval calls is simply irrelevant.
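
The rule that whoever swaps the cancel pointer to NULL first gets to complete the IRP can be illustrated with a C11 atomic exchange. This is a user-mode sketch under stated assumptions: ModelIrp, irp_init, set_cancel_routine, and try_claim_and_complete are hypothetical names, and atomic_exchange merely stands in for the interlocked exchange that IoSetCancelRoutine performs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef void (*CancelFn)(void *irp);

/* Hypothetical model of the IRP's CancelRoutine field. */
typedef struct {
    _Atomic(CancelFn) cancel_routine;
    int completions;            /* how many times the IRP was completed */
} ModelIrp;

static void cancel_fn(void *irp) { (void)irp; }

static void irp_init(ModelIrp *irp)
{
    assert(irp != NULL);
    atomic_init(&irp->cancel_routine, cancel_fn);
    irp->completions = 0;
}

/* Models IoSetCancelRoutine: atomically store fn and return the old pointer. */
static CancelFn set_cancel_routine(ModelIrp *irp, CancelFn fn)
{
    return atomic_exchange(&irp->cancel_routine, fn);
}

/* Each contender tries to take ownership. Only the caller that gets back a
   non-NULL pointer has won and may complete the IRP. */
static int try_claim_and_complete(ModelIrp *irp)
{
    if (set_cancel_routine(irp, NULL) != NULL) {
        irp->completions++;
        return 1;
    }
    return 0;
}
```

However many contenders call try_claim_and_complete, the exchange hands the non-NULL pointer to exactly one of them, so the model never completes the request twice.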

Cancelling IRPs You Create or Handle

Sometimes you’ll want to cancel an IRP that you’ve created or passed to another driver. Great care is required to avoid an obscure, low-probability problem. Just for the sake of illustration, suppose you want to impose an overall 5-second timeout on a synchronous I/O operation. If the time period elapses, you want to cancel the operation. Here is some naive code that, you might suppose, would execute this plan:

SomeFunction()
  {
  KEVENT event;
  IO_STATUS_BLOCK iosb;
  KeInitializeEvent(&event, ...);
  PIRP Irp = IoBuildSynchronousFsdRequest(..., &event, &iosb);
  NTSTATUS status = IoCallDriver(DeviceObject, Irp);
  if (status == STATUS_PENDING)
    {
    LARGE_INTEGER timeout;
    timeout.QuadPart = -5 * 10000000;   // 5 seconds, relative, in 100-ns units
    if (KeWaitForSingleObject(&event, Executive, KernelMode,   // A
      FALSE, &timeout) == STATUS_TIMEOUT)
      {
      IoCancelIrp(Irp);  // <== don't do this!
      KeWaitForSingleObject(&event, Executive, KernelMode,     // B
        FALSE, NULL);
      }
    }
  }

The first call (A) to KeWaitForSingleObject waits until one of two things happens. First, someone might complete the IRP, and the I/O Manager’s cleanup code will then run and signal event.

Alternatively, the timeout might expire before anyone completes the IRP. In this case, KeWaitForSingleObject will return STATUS_TIMEOUT. The IRP should now be completed quite soon in one of two paths. The first completion path is taken when whoever was processing the IRP was really just about done when the timeout happened and has, therefore, already called (or will shortly call) IoCompleteRequest. The other completion path is through the cancel routine that, we must assume, the lower driver has installed. That cancel routine should complete the IRP. Recall that we have to trust other kernel-mode components to do their jobs, so we have to rely on whomever we sent the IRP to complete it soon. Whichever path is taken, the I/O Manager’s completion logic will set event and store the IRP’s ending status in iosb. The second call (B) to KeWaitForSingleObject makes sure that the event and iosb objects don’t pass out of scope too soon. Without that second call, we might return from this function, thereby effectively deleting event and iosb. The I/O Manager might then end up walking on memory that belongs to some other subroutine.
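
The 5-second limit is written as -5 * 10000000 because kernel wait timeouts count 100-nanosecond units, and a negative value means an interval relative to the current time. The helper below is a hypothetical convenience (relative_timeout_100ns is not a system routine) that does the multiplication in 64 bits. That matters for longer intervals: an expression such as -300 * 10000000 would overflow 32-bit int arithmetic before being assigned to QuadPart.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a whole number of seconds into a relative timeout value:
   negative, in units of 100 nanoseconds. Hypothetical helper. */
static int64_t relative_timeout_100ns(int64_t seconds)
{
    const int64_t units_per_second = 10000000;   /* 10^7 units of 100 ns */
    assert(seconds >= 0);
    return -(seconds * units_per_second);
}
```

With this helper, the assignment in the listing would read timeout.QuadPart = relative_timeout_100ns(5).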

The problem with the preceding code is truly minuscule. Imagine that someone manages to call IoCompleteRequest for this IRP right around the same time we decide to cancel it by calling IoCancelIrp. Maybe the operation finishes shortly after the 5‐second timeout terminates the first KeWaitForSingleObject, for example. IoCompleteRequest initiates a process that finishes with a call to IoFreeIrp. If the call to IoFreeIrp were to happen before IoCancelIrp was done mucking about with the IRP, you can see that IoCancelIrp could inadvertently corrupt memory when it touched the CancelIrql, Cancel, and CancelRoutine fields of the IRP. It’s also possible, depending on the exact sequence of events, for IoCancelIrp to call a cancel routine, just before someone clears the CancelRoutine pointer in preparation for completing the IRP, and for the cancel routine to be in a race with the completion process.

It’s very unlikely that the scenario I just described will happen. But, as someone (James Thurber?) once said in connection with the chances of being eaten by a tiger on Main Street (one in a million, as I recall), “Once is enough.” This kind of bug is almost impossible to find, so you want to prevent it if you can. I’ll show you two ways of cancelling your own IRPs. One way is appropriate for synchronous IRPs, the other for asynchronous IRPs.

Don’t Do This…

A once common but now deprecated technique for avoiding the tiger-on-main-street bug described in the text relies on the fact that, in earlier versions of Windows, the call to IoFreeIrp happened in the context of an APC in the thread that originates the IRP. You could make sure you were in that same thread, raise IRQL to APC_LEVEL, check whether the IRP had been completed yet, and (if not) call IoCancelIrp. You could be sure of blocking the APC and the problematic call to IoFreeIrp.

You shouldn’t rely on future releases of Windows always using an APC to perform the cleanup for a synchronous IRP. Consequently, you shouldn’t rely on boosting IRQL to APC_LEVEL as a way to avoid a race between IoCancelIrp and IoFreeIrp.

Cancelling Your Own Synchronous IRP

Refer to the example in the preceding section, which illustrates a function that creates a synchronous IRP, sends it to another driver, and then wants to wait no longer than 5 seconds for the IRP to complete. The key thing we need to accomplish in a solution to the race between IoFreeIrp and IoCancelIrp is to prevent the call to IoFreeIrp from happening until after any possible call to IoCancelIrp. We do this by means of a completion routine that returns STATUS_MORE_PROCESSING_REQUIRED, as follows:

SomeFunction()
  {
  KEVENT event;
  IO_STATUS_BLOCK iosb;
  KeInitializeEvent(&event, ...);
  PIRP Irp = IoBuildSynchronousFsdRequest(..., &event, &iosb);
  IoSetCompletionRoutine(Irp, OnComplete, (PVOID) &event,   // new
    TRUE, TRUE, TRUE);
  NTSTATUS status = IoCallDriver(...);
  if (status == STATUS_PENDING)
    {
    LARGE_INTEGER timeout;
    timeout.QuadPart = -5 * 10000000;
    if (KeWaitForSingleObject(&event, Executive, KernelMode,   // A
      FALSE, &timeout) == STATUS_TIMEOUT)
      {
      IoCancelIrp(Irp);  // <== okay in this context
      KeWaitForSingleObject(&event, Executive, KernelMode,     // B
        FALSE, NULL);
      }
    }
  IoCompleteRequest(Irp, IO_NO_INCREMENT);                  // new
  }

NTSTATUS OnComplete(PDEVICE_OBJECT junk, PIRP Irp, PVOID pev)   // new
  {
  if (Irp->PendingReturned)
    KeSetEvent((PKEVENT) pev, IO_NO_INCREMENT, FALSE);
  return STATUS_MORE_PROCESSING_REQUIRED;
  }

The new code (the call to IoSetCompletionRoutine, the final call to IoCompleteRequest in SomeFunction, and the OnComplete routine) prevents the race. Suppose IoCallDriver returns STATUS_PENDING. In a normal case, the operation will complete normally, and a lower-level driver will call IoCompleteRequest. Our completion routine gains control and signals the event on which our mainline is waiting. Because the completion routine returns STATUS_MORE_PROCESSING_REQUIRED, IoCompleteRequest will then stop working on this IRP. We eventually regain control in our SomeFunction and notice that our wait (the first one, labeled A) terminated normally. The IRP hasn’t yet been cleaned up, though, so we need to call IoCompleteRequest a second time to trigger the normal cleanup mechanism.

Now suppose we decide we want to cancel the IRP and that Thurber’s tiger is loose so we have to worry about a call to IoFreeIrp releasing the IRP out from under us. Our first wait (labeled A) finishes with STATUS_TIMEOUT, so we perform a second wait (labeled B). Our completion routine sets the event on which we’re waiting. It will also prevent the cleanup mechanism from running by returning STATUS_MORE_PROCESSING_REQUIRED. IoCancelIrp can stomp away to its heart’s content on our hapless IRP without causing any harm. The IRP can’t be released until the second call to IoCompleteRequest from our mainline, and that can’t happen until IoCancelIrp has safely returned.

Notice that the completion routine in this example calls KeSetEvent only when the IRP’s PendingReturned flag is set to indicate that the lower driver’s dispatch routine returned STATUS_PENDING. Making this step conditional is an optimization that avoids the potentially expensive step of setting the event when SomeFunction won’t be waiting on the event in the first place.

I want to mention one last fine point in connection with the preceding code. The call to IoCompleteRequest at the very end of the subroutine will trigger a process that includes setting event and iosb so long as the IRP originally completed with a success status. In the first edition, I had an additional call to KeWaitForSingleObject at this point to make sure that event and iosb could not pass out of scope before the I/O Manager was done touching them. A reviewer pointed out that the routine that references event and iosb will already have run by the time IoCompleteRequest returns; consequently, the additional wait is not needed.
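
The two-stage completion that this technique relies on can be modeled in user-mode C. This is a deliberately simplified sketch, not the real I/O Manager logic: ModelIrp, model_complete, and on_complete are hypothetical, there is a single completion routine instead of a stack of them, and an int flag stands in for the event.

```c
#include <assert.h>
#include <stddef.h>

enum { CONTINUE_COMPLETION, MORE_PROCESSING_REQUIRED };

typedef struct ModelIrp ModelIrp;
typedef int (*CompletionFn)(ModelIrp *irp, void *context);

struct ModelIrp {
    CompletionFn completion;   /* one-shot completion routine */
    void        *context;
    int          freed;        /* set when the model "frees" the IRP */
};

/* Models the part of IoCompleteRequest that matters here: call the
   completion routine once, stop if it asks for more processing, and
   otherwise run the final cleanup. */
static void model_complete(ModelIrp *irp)
{
    assert(!irp->freed);
    CompletionFn fn = irp->completion;
    irp->completion = NULL;            /* the routine runs at most once */
    if (fn && fn(irp, irp->context) == MORE_PROCESSING_REQUIRED)
        return;                        /* the owner must complete again later */
    irp->freed = 1;                    /* stands in for the cleanup and IoFreeIrp */
}

/* Completion routine in the style of OnComplete: signal, then hold the IRP. */
static int on_complete(ModelIrp *irp, void *context)
{
    (void)irp;
    *(int *)context = 1;               /* stands in for KeSetEvent */
    return MORE_PROCESSING_REQUIRED;
}
```

Between the first and second calls to model_complete the request is signalled but not freed, which is the window in which the real code can call IoCancelIrp safely.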

Cancelling Your Own Asynchronous IRP

To safely cancel an IRP that you’ve created with IoAllocateIrp or IoBuildAsynchronousFsdRequest, you can follow this general plan. First define a couple of extra fields in your device extension structure:

typedef struct _DEVICE_EXTENSION {
  PIRP TheIrp;
  LONG CancelFlag;    // LONG, because InterlockedExchange operates on a LONG
  } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

Initialize these fields just before you call IoCallDriver to launch the IRP:

pdx->TheIrp = Irp;
pdx->CancelFlag = 0;
IoSetCompletionRoutine(Irp,
  (PIO_COMPLETION_ROUTINE) CompletionRoutine,
  (PVOID) pdx, TRUE, TRUE, TRUE);
IoCallDriver(..., Irp);

If you decide later on that you want to cancel this IRP, do something like the following:

VOID CancelTheIrp(PDEVICE_EXTENSION pdx)
  {
  PIRP Irp =
    (PIRP) InterlockedExchangePointer((PVOID*)&pdx->TheIrp, NULL);   // 1
  if (Irp)
    {
    IoCancelIrp(Irp);
    if (InterlockedExchange(&pdx->CancelFlag, 1))                    // 2
      IoFreeIrp(Irp);                                                // 3
    }
  }

This function dovetails with the completion routine you install for the IRP:

NTSTATUS CompletionRoutine(PDEVICE_OBJECT junk, PIRP Irp,
  PDEVICE_EXTENSION pdx)
  {
  if (InterlockedExchangePointer((PVOID*) &pdx->TheIrp, NULL)   // 4
    || InterlockedExchange(&pdx->CancelFlag, 1))                // 5
    IoFreeIrp(Irp);                                             // 6
  return STATUS_MORE_PROCESSING_REQUIRED;
  }

The basic idea underlying this deceptively simple code is that whichever routine sees the IRP last (either CompletionRoutine or CancelTheIrp) will make the requisite call to IoFreeIrp, at point 3 or 6. Here’s how it works:

  • The normal case occurs when you don’t ever try to cancel the IRP. Whoever you sent the IRP to eventually completes it, and your completion routine gets control. The first InterlockedExchangePointer (point 4) returns the non-NULL address of the IRP. Since this is not 0, the compiler short-circuits the evaluation of the Boolean expression and executes the call to IoFreeIrp. Any subsequent call to CancelTheIrp will find the IRP pointer set to NULL at point 1 and won’t do anything else.
  • Another easy case to analyze occurs when CancelTheIrp is called long before anyone gets around to completing this IRP, which means that we don’t have any actual race. At point 1, we nullify the TheIrp pointer. Because the IRP pointer was previously not NULL, we go ahead and call IoCancelIrp. In this situation, our call to IoCancelIrp will cause somebody to complete the IRP reasonably soon, and our completion routine runs. It sees TheIrp as NULL and goes on to evaluate the second half of the Boolean expression. Whoever executes the InterlockedExchange on CancelFlag first will get back 0 and skip calling IoFreeIrp. Whoever executes it second will get back 1 and will call IoFreeIrp.
  • Now for the case we were worried about: suppose someone is completing the IRP right about the time CancelTheIrp wants to cancel it. The worst that can happen is that our completion routine runs before we manage to call IoCancelIrp. The completion routine sees TheIrp as NULL and therefore exchanges CancelFlag with 1. Just as in the previous case, the routine will get 0 as the return value and skip the IoFreeIrp call. IoCancelIrp can safely operate on the IRP. (It will presumably just return without calling a cancel routine because whoever completed this IRP will undoubtedly have set the CancelRoutine pointer to NULL first.) When CancelTheIrp then performs its own exchange on CancelFlag, it gets back 1 and makes the call to IoFreeIrp.

The appealing thing about the technique I just showed you is its elegance: we rely solely on interlocked operations and therefore don’t need any potentially expensive synchronization primitives.
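
To check the three cases by hand, the protocol can be modeled with C11 atomics in user mode. This is a sketch under stated assumptions: Model, model_init, cancel_claim, cancel_finish, and completion are hypothetical names, a counter stands in for IoFreeIrp, and the canceller is split into two steps so that a test can place the completion routine between them.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the two device-extension fields. */
typedef struct {
    _Atomic(void *) the_irp;      /* models pdx->TheIrp */
    atomic_long     cancel_flag;  /* models pdx->CancelFlag */
    int             frees;        /* counts simulated IoFreeIrp calls */
} Model;

static void model_init(Model *m, void *irp)
{
    assert(irp != NULL);
    atomic_init(&m->the_irp, irp);
    atomic_init(&m->cancel_flag, 0);
    m->frees = 0;
}

/* Canceller, first step: claim the pointer. Returns the IRP or NULL. */
static void *cancel_claim(Model *m)
{
    return atomic_exchange(&m->the_irp, NULL);
}

/* Canceller, second step: runs after the simulated IoCancelIrp returns.
   Frees only if the completion routine has already set the flag. */
static void cancel_finish(Model *m)
{
    if (atomic_exchange(&m->cancel_flag, 1))
        m->frees++;
}

/* Completion routine: frees if it claimed the pointer itself, or if the
   canceller has already set the flag. */
static void completion(Model *m)
{
    if (atomic_exchange(&m->the_irp, NULL) != NULL
        || atomic_exchange(&m->cancel_flag, 1))
        m->frees++;
}
```

In every ordering of these steps the counter ends at exactly one, and the increment always comes from whichever side touches the request last.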

Cancelling Someone Else’s IRP

To round out our discussion of IRP cancellation, suppose someone sends you an IRP that you then forward to another driver. Situations might arise where you’d like to cancel that IRP. For example, perhaps you need that IRP out of the way so you can proceed with a power-down operation. Or perhaps you’re waiting synchronously for the IRP to finish and you’d like to impose a timeout as in the first example of this section.

To avoid the IoCancelIrp/IoFreeIrp race, you need to have your own completion routine in place. The details of the coding then depend on whether you’re waiting for the IRP.

Cancelling Someone Else’s IRP on Which You’re Waiting

Suppose your dispatch function passes down an IRP and waits synchronously for it to complete. (See usage scenario 7 at the end of this chapter for the cookbook version.) Use code like this to cancel the IRP if it doesn’t finish quickly enough to suit you:

NTSTATUS DispatchSomething(PDEVICE_OBJECT fdo, PIRP Irp)
  {
  PDEVICE_EXTENSION pdx =
    (PDEVICE_EXTENSION) fdo->DeviceExtension;
  KEVENT event;
  KeInitializeEvent(&event, NotificationEvent, FALSE);
  IoCopyCurrentIrpStackLocationToNext(Irp);   // set up the next stack location before forwarding
  IoSetCompletionRoutine(Irp, OnComplete, (PVOID) &event,
    TRUE, TRUE, TRUE);
  NTSTATUS status = IoCallDriver(...);
  if (status == STATUS_PENDING)
    {
    LARGE_INTEGER timeout;
    timeout.QuadPart = -5 * 10000000;
    if (KeWaitForSingleObject(&event, Executive, KernelMode,
      FALSE, &timeout) == STATUS_TIMEOUT)
      {
      IoCancelIrp(Irp);
      KeWaitForSingleObject(&event, Executive, KernelMode,
        FALSE, NULL);
      }
    }
  status = Irp->IoStatus.Status;
  IoCompleteRequest(Irp, IO_NO_INCREMENT);
  return status;
  }

NTSTATUS OnComplete(PDEVICE_OBJECT junk, PIRP Irp, PVOID pev)
  {
  if (Irp->PendingReturned)
    KeSetEvent((PKEVENT) pev, IO_NO_INCREMENT, FALSE);
  return STATUS_MORE_PROCESSING_REQUIRED;
  }

This code is almost the same as what I showed earlier for cancelling your own synchronous IRP. The only difference is that this example involves a dispatch routine, which must return a status code. As in the earlier example, we install our own completion routine to prevent the completion process from running to its ultimate conclusion before we get past the point where we might call IoCancelIrp.

You might notice that I didn’t say anything about whether the IRP itself was synchronous or asynchronous. This is because the difference between the two types of IRP only matters to the driver that creates them in the first place. File system drivers must make distinctions between synchronous and asynchronous IRPs with respect to how they call the system cache manager, but device drivers don’t typically have this complication. What matters to a lower-level driver is whether it’s appropriate to block a thread in order to handle an IRP synchronously, and that depends on the current IRQL and whether you’re in an arbitrary or a nonarbitrary thread.

Cancelling Someone Else’s IRP on Which You’re Not Waiting

Suppose you’ve forwarded somebody else’s IRP to another driver, but you weren’t planning to wait for it to complete. For whatever reason, you decide later on that you’d like to cancel that IRP.

typedef struct _DEVICE_EXTENSION {
  PIRP TheIrp;
  LONG CancelFlag;    // LONG, because InterlockedExchange operates on a LONG
  } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

NTSTATUS DispatchSomething(PDEVICE_OBJECT fdo, PIRP Irp)
  {
  PDEVICE_EXTENSION pdx =
    (PDEVICE_EXTENSION) fdo->DeviceExtension;
  IoCopyCurrentIrpStackLocationToNext(Irp);
  IoSetCompletionRoutine(Irp, (PIO_COMPLETION_ROUTINE) OnComplete,
    (PVOID) pdx,
    TRUE, TRUE, TRUE);
  pdx->CancelFlag = 0;
  pdx->TheIrp = Irp;
  IoMarkIrpPending(Irp);
  IoCallDriver(pdx->LowerDeviceObject, Irp);
  return STATUS_PENDING;
  }

VOID CancelTheIrp(PDEVICE_EXTENSION pdx)
  {
  PIRP Irp = (PIRP) InterlockedExchangePointer(
    (PVOID*) &pdx->TheIrp, NULL);
  if (Irp)
    {
    IoCancelIrp(Irp);
    if (InterlockedExchange(&pdx->CancelFlag, 1))
      IoCompleteRequest(Irp, IO_NO_INCREMENT);
    }
  }

NTSTATUS OnComplete(PDEVICE_OBJECT fdo, PIRP Irp,
  PDEVICE_EXTENSION pdx)
  {
  if (InterlockedExchangePointer((PVOID*) &pdx->TheIrp, NULL)
    || InterlockedExchange(&pdx->CancelFlag, 1))
    return STATUS_SUCCESS;
  return STATUS_MORE_PROCESSING_REQUIRED;
  }

This code is similar to the code I showed earlier for cancelling your own asynchronous IRP. Here, however, allowing IoCompleteRequest to finish completing the IRP takes the place of the call to IoFreeIrp we made when we were dealing with our own IRP. If the completion routine is last on the scene, it returns STATUS_SUCCESS to allow IoCompleteRequest to finish completing the IRP. If CancelTheIrp is last on the scene, it calls IoCompleteRequest to resume the completion processing that the completion routine short-circuited by returning STATUS_MORE_PROCESSING_REQUIRED.

One extremely subtle point regarding this example is the call to IoMarkIrpPending in the dispatch routine. Ordinarily, it would be safe to just do this step conditionally in the completion routine, but not this time. If we should happen to call CancelTheIrp in the context of some thread other than the one in which the dispatch routine runs, the pending flag is needed so that IoCompleteRequest will schedule an APC to clean up the IRP in the proper thread. The easiest way to make that true is simple: always mark the IRP pending.

Handling IRP_MJ_CLEANUP

Closely allied to the subject of IRP cancellation is the I/O request with the major function code IRP_MJ_CLEANUP. To explain how you should process this request, I need to give you a little additional background.

When applications and other drivers want to access your device, they first open a handle to the device. Applications call CreateFile to do this; drivers call ZwCreateFile. Internally, these functions create a kernel file object and send it to your driver in an IRP_MJ_CREATE request. When the entity that opened the handle is done accessing your driver, it will call another function, such as CloseHandle or ZwClose. Internally, these functions send your driver an IRP_MJ_CLOSE request. Just before sending you the IRP_MJ_CLOSE, however, the I/O Manager sends you an IRP_MJ_CLEANUP so that you can cancel any IRPs that belong to the same file object but that are still sitting in one of your queues. From the perspective of your driver, the one thing all the requests have in common is that the stack location you receive points to the same file object in every instance.

Figure 5-10 illustrates your responsibility when you receive IRP_MJ_CLEANUP. You should run through your queues of IRPs, removing those that are tagged as belonging to the same file object. You should complete those IRPs with STATUS_CANCELLED.

Figure 5-10. Driver responsibility for IRP_MJ_CLEANUP.

File Objects

Ordinarily, just one driver (the function driver, in fact) in a device stack implements all three of the following requests: IRP_MJ_CREATE, IRP_MJ_CLOSE, and IRP_MJ_CLEANUP. The I/O Manager creates a file object (a regular kernel object) and passes it in the I/O stack to the dispatch routines for all three of these IRPs. Anybody who sends an IRP to a device should have a pointer to the same file object and should insert that pointer into the I/O stack as well. The driver that handles these three IRPs acts as the owner of the file object in some sense, in that it’s the driver that’s entitled to use the FsContext and FsContext2 fields of the object. So your DispatchCreate routine can put something into one of these context fields for use by other dispatch routines and for eventual cleanup by your DispatchClose routine.

It’s easy to get confused about IRP_MJ_CLEANUP. In fact, programmers who have a hard time understanding IRP cancellation sometimes decide (incorrectly) to just ignore this IRP. You need both cancel and cleanup logic in your driver, though:

  • IRP_MJ_CLEANUP means a handle is being closed. You should purge all the IRPs that pertain to that handle.
  • The I/O Manager and other drivers cancel individual IRPs for a variety of reasons that have nothing to do with closing handles.
  • One of the times the I/O Manager cancels IRPs is when a thread terminates. Threads often terminate because their parent process is terminating, and the I/O Manager will also automatically close all handles that are still open when a process terminates. The coincidence between this kind of cancellation and the automatic handle closing contributes to the incorrect idea that a driver can get by with support for just one concept.

In this book, I’ll show you two ways of painlessly implementing support for IRP_MJ_CLEANUP, depending on whether you’re using one of my DEVQUEUE objects or one of Microsoft’s cancel-safe queues.

Cleanup with a DEVQUEUE

If you’ve used a DEVQUEUE to queue IRPs, your IRP_MJ_CLEANUP dispatch routine will be astonishingly simple:

NTSTATUS DispatchCleanup(PDEVICE_OBJECT fdo, PIRP Irp)
  {
  PDEVICE_EXTENSION pdx =
    (PDEVICE_EXTENSION) fdo->DeviceExtension;
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  PFILE_OBJECT fop = stack->FileObject;
  CleanupRequests(&pdx->dqReadWrite, fop,
    STATUS_CANCELLED);
  return CompleteRequest(Irp, STATUS_SUCCESS, 0);
  }

CleanupRequests will remove all IRPs from the queue that belong to the same file object and will complete those IRPs with STATUS_CANCELLED. Note that you complete the IRP_MJ_CLEANUP request itself with STATUS_SUCCESS.

CleanupRequests contains a wealth of detail:

VOID CleanupRequests(PDEVQUEUE pdq, PFILE_OBJECT fop,
  NTSTATUS status)
  {
  LIST_ENTRY cancellist;
  InitializeListHead(&cancellist);                                // 1
  KIRQL oldirql;
  KeAcquireSpinLock(&pdq->lock, &oldirql);
  PLIST_ENTRY first = &pdq->head;
  PLIST_ENTRY next;
  for (next = first->Flink; next != first; )                      // 2
    {
    PIRP Irp = CONTAINING_RECORD(next, IRP,
      Tail.Overlay.ListEntry);
    PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp); // 3
    PLIST_ENTRY current = next;
    next = next->Flink;                                           // 4
    if (fop && stack->FileObject != fop)
      continue;
    if (!IoSetCancelRoutine(Irp, NULL))                           // 5
      continue;
    RemoveEntryList(current);                                     // 6
    InsertTailList(&cancellist, current);
    }
  KeReleaseSpinLock(&pdq->lock, oldirql);                         // 7
  while (!IsListEmpty(&cancellist))
    {
    next = RemoveHeadList(&cancellist);
    PIRP Irp = CONTAINING_RECORD(next, IRP,
      Tail.Overlay.ListEntry);
    Irp->IoStatus.Status = status;
    IoCompleteRequest(Irp, IO_NO_INCREMENT);
    }
  }

  1. Our strategy will be to move the IRPs that need to be cancelled into a private queue under protection of the queue’s spin lock. Hence, we initialize the private queue and acquire the spin lock before doing anything else.
  2. This loop traverses the entire queue until we return to the list head. Notice the absence of a loop increment step—the third clause in the for statement. I’ll explain in a moment why it’s desirable to have no loop increment.
  3. If we’re being called to help out with IRP_MJ_CLEANUP, the fop argument is the address of a file object that’s about to be closed. We’re supposed to isolate the IRPs that pertain to the same file object, which requires us to first find the stack location.
  4. If we decide to remove this IRP from the queue, we won’t thereafter have an easy way to find the next IRP in the main queue. We therefore perform the loop increment step here.
  5. This especially clever statement comes to us courtesy of Jamie Hanrahan. We need to worry that someone might be trying to cancel the IRP that we’re currently looking at during this iteration. They could get only as far as the point where CancelRequest tries to acquire the spin lock. Before getting that far, however, they necessarily had to execute the statement inside IoCancelIrp that nullifies the cancel routine pointer. If we find that pointer set to NULL when we call IoSetCancelRoutine, therefore, we can be sure that someone really is trying to cancel this IRP. By simply skipping the IRP during this iteration, we allow the cancel routine to complete it later on.
  6. Here’s where we take the IRP out of the main queue and put it in the private queue instead.
  7. Once we finish moving IRPs into the private queue, we can release our spin lock. Then we cancel all the IRPs we moved.
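
The list handling in steps 2, 4, 6, and 7 can be tried out in user-mode C. The sketch below is an illustration only: Link, Request, and the list_ helpers are hypothetical stand-ins for LIST_ENTRY and the IRP, and the spin lock and the cancel-routine test of step 5 are left out because nothing here runs concurrently.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list in the style of LIST_ENTRY. */
typedef struct Link { struct Link *flink, *blink; } Link;

typedef struct {
    Link  link;     /* first member, so a Link* converts back to Request* */
    void *owner;    /* models the file object in the stack location */
    int   status;   /* 0 while queued, otherwise the completion status */
} Request;

static void list_init(Link *h) { h->flink = h->blink = h; }
static int  list_empty(const Link *h) { return h->flink == h; }
static void list_remove(Link *e)
{
    e->blink->flink = e->flink;
    e->flink->blink = e->blink;
}
static void list_push_tail(Link *h, Link *e)
{
    e->flink = h;
    e->blink = h->blink;
    h->blink->flink = e;
    h->blink = e;
}

/* Move every request owned by `owner` (or all of them if owner is NULL)
   to a private list, then complete them. A real driver would hold the
   queue lock around the first loop only. Returns the number completed. */
static int cleanup_requests(Link *queue, void *owner, int status)
{
    Link cancelled;
    int count = 0;
    assert(queue != NULL);
    list_init(&cancelled);

    for (Link *next = queue->flink; next != queue; ) {
        Link *current = next;
        next = next->flink;             /* advance before unlinking current */
        if (owner && ((Request *)current)->owner != owner)
            continue;
        list_remove(current);
        list_push_tail(&cancelled, current);
    }

    while (!list_empty(&cancelled)) {
        Link *e = cancelled.flink;
        list_remove(e);
        ((Request *)e)->status = status;
        count++;
    }
    return count;
}
```

Advancing next before unlinking current is what lets the loop survive removal of the entry it is standing on, which is why the for statement has no increment clause.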

Cleanup with a Cancel-Safe Queue

To easily clean up IRPs that you’ve queued by calling IoCsqInsertIrp, simply adopt the convention that the peek context parameter you use with IoCsqRemoveNextIrp, if not NULL, will be the address of a FILE_OBJECT. Your IRP_MJ_CLEANUP routine will look like this (compare with the Cancel sample in the DDK):

NTSTATUS DispatchCleanup(PDEVICE_OBJECT fdo, PIRP Irp)
  {
  PDEVICE_EXTENSION pdx =
    (PDEVICE_EXTENSION) fdo->DeviceExtension;
  PIO_STACK_LOCATION stack = IoGetCurrentIrpStackLocation(Irp);
  PFILE_OBJECT fop = stack->FileObject;
  PIRP qirp;
  while ((qirp = IoCsqRemoveNextIrp(&pdx->csq, fop)))
    CompleteRequest(qirp, STATUS_CANCELLED, 0);
  return CompleteRequest(Irp, STATUS_SUCCESS, 0);
  }

Implement your PeekNextIrp callback routine this way:

PIRP PeekNextIrp(PIO_CSQ csq, PIRP Irp, PVOID PeekContext)
  {
  PDEVICE_EXTENSION pdx = GET_DEVICE_EXTENSION(csq);
  PLIST_ENTRY next = Irp ? Irp->Tail.Overlay.ListEntry.Flink
    : pdx->IrpQueueAnchor.Flink;
  while (next != &pdx->IrpQueueAnchor)
    {
    PIRP NextIrp = CONTAINING_RECORD(next, IRP,
      Tail.Overlay.ListEntry);
    PIO_STACK_LOCATION stack =
      IoGetCurrentIrpStackLocation(NextIrp);
    if (!PeekContext || (PFILE_OBJECT) PeekContext
      == stack->FileObject)
      return NextIrp;
    next = next->Flink;
    }
  return NULL;
  }
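
The peek convention used here (start after a given entry, or at the front if none is given, and treat a NULL context as matching everything) can be exercised in a small user-mode model. Node and peek_next are hypothetical names, and a singly linked circular list with an anchor node stands in for the driver's IRP queue.

```c
#include <assert.h>
#include <stddef.h>

/* Circular list node. The anchor is a Node too, and its owner is unused. */
typedef struct Node {
    struct Node *flink;
    void        *owner;   /* models the file object of a queued IRP */
} Node;

/* Return the first node after `after` (or after the anchor when `after`
   is NULL) whose owner equals `context`. A NULL context matches any node. */
static Node *peek_next(Node *anchor, Node *after, void *context)
{
    assert(anchor != NULL);
    Node *next = after ? after->flink : anchor->flink;
    while (next != anchor) {
        if (!context || next->owner == context)
            return next;
        next = next->flink;
    }
    return NULL;
}
```

Because a NULL context matches every entry, the same callback serves both ordinary dequeuing and the per-file-object filtering that the cleanup routine needs.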
