Technical Blog for Jim Beveridge: Understanding ReadDirectoryChangesW

Wednesday, May 19, 2010

Understanding ReadDirectoryChangesW - Part 1

The longest, most detailed description in the world of how to successfully use ReadDirectoryChangesW.

This is Part 1 of 2. This part describes the theory and Part 2 describes the implementation.

Go to the GitHub repo for this article or just download the sample code.

I have spent this week digging into the barely-documented world of ReadDirectoryChangesW and I hope this article saves someone else some time. I believe I've read every article I could find on the subject, as well as numerous code samples. Almost all of the examples, including the one from Microsoft, either have significant shortcomings or contain outright mistakes.

You'd think that this problem would have been a piece of cake for me, having been the author of Multithreading Applications in Win32, where I wrote a chapter about the differences between synchronous I/O, signaled handles, overlapped I/O, and I/O completion ports. Except that I only write overlapped I/O code about once every five years, which is just about long enough for me to forget how painful it was the last time. This endeavor was no exception.

Four Ways to Monitor Files and Directories

First, a brief overview of monitoring directories and files. In the beginning there was SHChangeNotifyRegister. It was implemented using Windows messages and so required a window handle. It was driven by notifications from the shell (Explorer), so your application was only notified about things that the shell cared about - which almost never aligned with what you cared about. It was useful for monitoring things that the user did in Explorer, but not much else.

SHChangeNotifyRegister was fixed in Windows Vista so it could report all changes to all files, but it was too late - there are still several hundred million Windows XP users and that's not going to change any time soon.

SHChangeNotifyRegister also had a performance problem, since it was based on Windows messages. If there were too many changes, your application would start receiving roll-up messages that just said "something changed" and you had to figure out for yourself what had really happened. Fine for some applications, rather painful for others.

Windows NT brought two more interfaces, FindFirstChangeNotification and ReadDirectoryChangesW. (Both predate Windows 2000; ReadDirectoryChangesW is only available on the NT line.) FindFirstChangeNotification is fairly easy to use but doesn't give any information about what changed. Even so, it can be useful for applications such as fax servers and SMTP servers that can accept queue submissions by dropping a file in a directory. ReadDirectoryChangesW does tell you what changed and how, at the cost of additional complexity.

Similar to SHChangeNotifyRegister, both of these new functions suffer from a performance problem. They can run significantly faster than shell notifications, but moving a thousand files from one directory to another will still cause you to lose some (or many) notifications. The exact cause of the missing notifications is complicated. Surprisingly, it apparently has little to do with how fast you process notifications.

Note that FindFirstChangeNotification and ReadDirectoryChangesW are mutually exclusive. You would use one or the other, but not both.

Windows 2000 brought the ultimate solution, the Change Journal, which can track in detail every single change on an NTFS volume, even if your software isn't running. Great technology, but equally complicated to use.

The fourth and final solution is to install a File System Filter, which was used in the popular SysInternals FileMon tool. There is a sample of this in the Windows Driver Kit (WDK). However, this solution is essentially a device driver and so potentially can cause system-wide stability problems if not implemented exactly correctly.

For my needs, ReadDirectoryChangesW was a good balance of performance versus complexity.

The Puzzle

The biggest challenge to using ReadDirectoryChangesW is that there are several hundred possibilities for combinations of I/O mode, handle signaling, waiting methods, and threading models. Unless you're an expert on Win32 I/O, it's extremely unlikely that you'll get it right, even in the simplest of scenarios. (In the list below, when I say "call", I mean a call to ReadDirectoryChangesW.)

A. First, here are the I/O modes:

1. Blocking synchronous
2. Signaled synchronous
3. Overlapped asynchronous
4. Completion Routine (aka Asynchronous Procedure Call or APC)

B. When calling the WaitForXxx functions, you can:

1. Wait on the directory handle.
2. Wait on an event object in the OVERLAPPED structure.
3. Wait on nothing (for APCs).

C. To handle notifications, you can use:

1. Blocking
2. WaitForSingleObject
3. WaitForMultipleObjects
4. WaitForMultipleObjectsEx
5. MsgWaitForMultipleObjectsEx
6. I/O Completion Ports

D. For threading models, you can use:

1. One call per worker thread.
2. Multiple calls per worker thread.
3. Multiple calls on the primary thread.
4. Multiple threads for multiple calls. (I/O Completion Ports)

Finally, when calling ReadDirectoryChangesW, you specify flags to choose what you want to monitor, including file creation, last-modification date changes, attribute changes, and others. You can use one flag per call and issue multiple calls, or you can use multiple flags in one call. Using multiple flags in one call is always the right solution. If you think you need to use multiple calls with one flag per call to make it easier to figure out what to do, then you need to read more about the data contained in the notification buffer returned by ReadDirectoryChangesW.
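To make that last point a little more concrete, here is a minimal, portable sketch of walking a notification buffer. It does not call ReadDirectoryChangesW. It only assumes a buffer laid out like the documented FILE_NOTIFY_INFORMATION records: a 32-bit NextEntryOffset, a 32-bit Action, a 32-bit FileNameLength in bytes, then a UTF-16 name that is not null-terminated. The names kWatchFilter, Change, and ParseNotifications are hypothetical, and the constants are redefined locally (mirroring the Windows SDK values) only so the sketch compiles without <windows.h>. Real code would use the SDK definitions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Local copies of FILE_NOTIFY_CHANGE_* values from the Windows SDK.
constexpr std::uint32_t kNotifyChangeFileName  = 0x00000001;
constexpr std::uint32_t kNotifyChangeDirName   = 0x00000002;
constexpr std::uint32_t kNotifyChangeLastWrite = 0x00000010;
constexpr std::uint32_t kNotifyChangeCreation  = 0x00000040;

// One call, several flags OR'ed together (hypothetical choice of flags).
constexpr std::uint32_t kWatchFilter = kNotifyChangeFileName | kNotifyChangeDirName |
                                       kNotifyChangeLastWrite | kNotifyChangeCreation;

// Local copies of FILE_ACTION_* values from the Windows SDK.
constexpr std::uint32_t kActionAdded          = 1;
constexpr std::uint32_t kActionRemoved        = 2;
constexpr std::uint32_t kActionModified       = 3;
constexpr std::uint32_t kActionRenamedOldName = 4;
constexpr std::uint32_t kActionRenamedNewName = 5;

struct Change {
    std::uint32_t action;   // one of the kAction* values
    std::u16string name;    // path relative to the watched directory
};

// Walks the chain of records; each record says what happened and to which
// file, so a single call with several flags loses no information.
std::vector<Change> ParseNotifications(const unsigned char* buffer, std::size_t bytesReturned)
{
    assert(buffer != nullptr || bytesReturned == 0);
    std::vector<Change> changes;
    constexpr std::size_t kHeaderSize = 3 * sizeof(std::uint32_t);
    std::size_t offset = 0;
    while (offset + kHeaderSize <= bytesReturned) {
        std::uint32_t header[3];  // NextEntryOffset, Action, FileNameLength
        std::memcpy(header, buffer + offset, kHeaderSize);
        const std::uint32_t nameBytes = header[2];
        if (nameBytes > bytesReturned - offset - kHeaderSize) break;  // malformed record

        std::u16string name(nameBytes / sizeof(char16_t), u'\0');
        if (!name.empty()) {
            std::memcpy(&name[0], buffer + offset + kHeaderSize,
                        name.size() * sizeof(char16_t));
        }
        changes.push_back(Change{header[1], std::move(name)});

        if (header[0] == 0) break;  // NextEntryOffset of zero marks the last record
        offset += header[0];
    }
    return changes;
}
```

Because every record carries its own action code, the caller can dispatch on `action` after a single multi-flag call instead of issuing one call per flag.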

If your head is now swimming in information overload, you can easily see why so many people have trouble getting this right.

Wrong Techniques

As I was researching this solution, I saw a lot of recommendations that ranged from dubious to wrong to really, really wrong. Here's some commentary on what I saw.

If you are using the Simplicity solution above, don't use blocking calls, because the only ways to cancel one are the undocumented technique of closing the handle or CancelSynchronousIo, which is only available on Vista and later. Instead, use the Signaled Synchronous I/O mode by waiting on the directory handle. Also, don't terminate threads with TerminateThread, because it doesn't clean up resources and can cause all sorts of problems. Instead, create a manual-reset event object and pass it as the second handle in the call to WaitForMultipleObjects. When the event is set, exit the thread.

If you have dozens or hundreds of directories to monitor, don't use the Simplicity solution. Switch to the Balanced solution. Alternatively, monitor a root common directory and ignore files you don't care about.
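The "monitor a common root and ignore the rest" idea could look roughly like the sketch below. It is portable C++ and makes several assumptions: names arrive relative to the watched root with backslash separators, which is how ReadDirectoryChangesW reports them; the comparison is a simple per-character case-insensitive match, which is cruder than real NTFS case folding; and IsUnderWatchedSubdir and StartsWithNoCase are hypothetical helper names, not part of any API.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>
#include <vector>

// Case-insensitive "does path start with prefix", one character at a time.
static bool StartsWithNoCase(const std::wstring& path, const std::wstring& prefix)
{
    if (prefix.size() > path.size()) return false;
    for (std::size_t i = 0; i < prefix.size(); ++i) {
        if (std::towlower(path[i]) != std::towlower(prefix[i])) return false;
    }
    return true;
}

// Returns true if relativePath names one of the watched subdirectories or
// something beneath it. Subdirectory names must be non-empty and must not
// end with a backslash.
bool IsUnderWatchedSubdir(const std::wstring& relativePath,
                          const std::vector<std::wstring>& watchedSubdirs)
{
    for (const std::wstring& dir : watchedSubdirs) {
        assert(!dir.empty());
        if (!StartsWithNoCase(relativePath, dir)) continue;
        // Require a full component match so "Inbox" does not match "InboxOld".
        if (relativePath.size() == dir.size() || relativePath[dir.size()] == L'\\') {
            return true;
        }
    }
    return false;
}
```

A notification handler would call this for each reported name and drop anything that returns false, so one watch on the root can stand in for many per-directory watches.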

If you have to monitor a whole drive, think twice (or three times) about this idea. You'll be notified about every single temporary file, every Internet cache file, every Application Data change - in short, you'll be getting an enormous number of notifications that could slow down the entire system. If you need to monitor an entire drive, you should probably use the Change Journal instead. This will also allow you to track changes even if your app is not running. Don't even think about monitoring the whole drive with FILE_NOTIFY_CHANGE_LAST_ACCESS.

If you are using overlapped I/O without using an I/O completion port, don't wait on handles. Use Completion Routines instead. This removes the 64-handle limitation, allows the operating system to handle call dispatch, and allows you to embed a pointer to your object in the OVERLAPPED structure. My example in a moment will show all of this.

If you are using worker threads, don't send results back to the primary thread with SendMessage. Use PostMessage instead. SendMessage is synchronous and will not return until the primary thread has processed the message, so a busy primary thread stalls the worker. This would defeat the purpose of using a worker thread in the first place.

It's tempting to try to solve the problem of lost notifications by providing a huge buffer. However, this may not be the wisest course of action. For any given buffer size, a similarly sized buffer has to be allocated from the kernel non-paged memory pool. If you allocate too many large buffers, this can lead to serious problems, including a Blue Screen of Death. (Thanks to an anonymous contributor in the MSDN Community Content for this tip.)

Jump to Part 2 of this article.

Go to the GitHub repo for this article or just download the sample code.

21 comments:

Anonymous, March 30, 2011 at 10:54 AM
Hi Jim,

thank you very much for your detailed explanation. After searching a while in internet I can say your description helped me alot and it is the most complete one.

Cheers
Anonymous, August 9, 2011 at 5:46 AM
Thx for sharing. Great explanation of ReadDirectoryChangesW!
Anonymous, August 9, 2011 at 7:50 PM
Very thanks
Anonymous, February 17, 2012 at 3:54 AM
Thanks for your article, and the source code, I found it very useful, saved a lot of time!

I found one thing in CReadChangesRequest::ProcessNotification():

if (wstrFilename.Right(1) != L"\\")

Shouldn't this better be:

if (m_wstrDirectory.Right(1) != L"\\")

Regards,

Jost

...
Anonymous, March 17, 2012 at 6:05 PM
Thank You!!! This helps
Anonymous, March 27, 2012 at 7:13 AM
Hi Jim,

Thanks for the great article. Any idea how .NET System.IO.FileSystemWatcher implements its functionality. Would you recommend its use with a timer for watching files dropped via FTP?

Dave W
Rezatash, February 25, 2013 at 1:26 AM
Hi jim. Your article is great and useful. I have a question I hope you can help me. Anti virus softwares offer some feature they call Real-time protection or on-access scan. as wikipedia says:
'real-time'means while data loaded into the computer's active memory: when inserting a CD, opening an email, or browsing the web, or when a file already on the computer is opened or executed.
I'm interested in writing some code to implement this on-access or real-time functionality.
do you have any suggestions to write a code which can monitor active memory changes and retrieve file address responsible for that change to trigger a scan by some tool.
thank you very much.
Anonymous, April 26, 2013 at 1:15 AM
Very helpful. But a probrem can be found at ThreadSafeQueue.h. CThreadSafeQueue::pop() never calls WaitForSingleObject() when the list is not empty.
Jim Beveridge, June 30, 2013 at 12:30 PM
polyvertex and Anonymous,

Thanks so much for your feedback on this article. My schedule at the moment is completely overwhelmed and I don't expect to have time to dig into this for at least several weeks. Your comments definitely point out the complexity of using these APIs. The good news is that this code has been in production use for several years on systems that log all crashes, and we haven't seen any related crashes.
Tom K., October 30, 2014 at 9:36 AM
I have been searching for some help on 'tail -f' like solution. Finally, I came across your blog and solved my problem. Thanks!
Anonymous, March 4, 2016 at 4:43 AM
Hi,
Great job, great description and sample.
Do you already have a sample for the "A4C6D4" solution?
Thx
Fred
Thomas Hruska, June 11, 2016 at 10:19 PM
As far as directory monitoring goes, a much more reliable alternative to ReadDirectoryChangesW() is to use a little-known feature of NTFS known as the NTFS USN Journal. Before you say, "That requires admin rights", accessing the journal actually only requires admin rights on the OS volume (i.e. C:). Other volumes are freely available without having to elevate to Administrator.

The far more difficult issue is that accessing the USN Journal itself is quite complex - the MSDN Library documentation on USN Journal operations is quite sparse and mostly focuses on v2 Journal records. If you want to be ReFS ready (assuming that ever happens), you have to also handle v3 records, which gets quite tricky. The much more difficult issue to deal with is that the USN Journal itself only provides a "file reference number" of the parent. To determine the full path and filename, you have to use OpenFileById() - a function only available on Vista and later even though the USN Journal was around long before that. If that call fails (e.g. the path no longer exists), you are hosed unless you have a reference to the ID/path saved somewhere. The other alternative to OpenFileById() is to read the $MFT and parse it...an esoteric exercise at best that few people have ever accomplished. VoidTools' "Everything" software, some Python scripts, and couple of forensic toolkits are about all I ran into when I was looking into reading the $MFT. However, with the USN Journal, you can monitor the entire file system quite efficiently. If you use the Overlapped I/O option on the relevant DeviceIoControl() calls, you can even get extremely close to real-time results - as soon as the kernel filesystem records the USN Journal record, your Overlapped I/O completes. The default USN Journal is rather large, so loss of information is pretty rare. Even on a fairly active system it doesn't roll over too frequently, so, generally-speaking, getting behind is only possible if your process exits and doesn't run for several hours. It's also configurable when it is created, so you can make it even bigger if you do find you get behind.

The only downside I ran into that you have to consider is removable storage. The requisite CreateFile() call to open the device most likely locks the device so it can't be ejected properly if it is a USB thumbdrive or external hard drive or something like that. I didn't test it, but it's not a good idea to be using the USN Journal constantly even if it is the most efficient means to accessing directory changes on full NTFS volumes.

A couple of other thoughts: Instead of using directory/journal monitoring as an IPC mechanism, if you control the source code to all of the applications involved, a mutex and a named pipe is probably a better solution.

Another option could be to use a driver that creates a fake drive letter. Anything written to that "disk" would cause the driver to notify the application that it has something to do. Look up "Dokany" on GitHub. Dokany even has a FUSE wrapper, so you could write your application as a FUSE app and have it work on Linux and other OSes too. Then whatever it was that you were wanting to monitor, you just have it write to your fake drive letter/volume instead of a normal NTFS volume. The only downside is that this type of solution uses one of the rather limited 23 available DOS drive letters and Dokany or an application could accidentally trigger a BSOD. Playing with fire in a production environment is fun!
Jim Beveridge, June 12, 2016 at 9:42 PM
Thanks Thomas. I mentioned the Change Journal partway into this article, but it's not something I've spent much time on. You make good points in your comments.
Unknown, August 6, 2017 at 3:08 PM
Good article.

> Windows 2000 brought two new interfaces

Small point on the history though, from the 2000 MSDN Library:

FindFirstChangeNotification:
Windows NT/2000: Requires Windows NT 3.1 or later.
Windows 95/98: Requires Windows 95 or later

ReadDirectoryChangesW:
Windows NT/2000: Requires Windows NT 3.51 SP3 or later.
Windows 95/98: Unsupported.
Anonymous, April 21, 2018 at 9:28 PM
Good article. Thanks!