
NT Handle-to-Path Conversion


July 2000/NT Handle-to-Path Conversion



This may not be a frequently asked question on the Windows programming newsgroups, but it’s certainly a recurring one: given a handle for an open file, how can you get the file’s name? Since Win32 doesn’t provide any API function for that, the expert answer is usually “you can’t.” To my surprise, it turns out that you can get a valid filename string (including the device and path information) for most files from their handle on NT. Even more surprising, it’s possible using only documented NT API functions — no undocumented functions, no kernel-mode DDK programming.

gfnfhnt.c (Listing 1) contains GetFileNameFromHandleNT(), the function I created to produce a filename from a handle. The journey from the file handle to the name string is a convoluted one, with stops at some esoteric and obscure parts of the SDK. Unfortunately, the process requires Win32 functions that are not present on all Win32 platforms; only NT 4 and later have all the necessary parts of the puzzle. Also, although the Win32 functions used here are “documented” in the Platform SDK, much of their true behavior is not spelled out.

The steps of the process used by GetFileNameFromHandleNT() are:

  • memory-map the file
  • get the file’s kernel object name from GetMappedFileName()
  • convert the device part of the kernel object name to a drive letter, or
  • convert the kernel object name to a UNC (Universal Naming Convention) path.

The rest of this article explains each of these steps, in the sequence they’re performed.

Memory-Mapping

Memory-mapping files is the least exotic part of the code. Memory-mapping files for random access or to share data between different processes is fairly common. The SDK documents accurately describe the process, and sample code and tutorial articles are available. If you’re already familiar with memory-mapping of files, you’ll want to skip this section.

Memory-mapping a file is an elegant way of simplifying high-performance, random-access I/O by letting the operating system take care of all the little details of buffering, caching, file seeks, read-ahead/write-behind scheduling, etc.

In this case, I’m not mapping the file to read or write its data. I don’t care what the file contains, and a function to retrieve the file’s name definitely should not alter the file’s contents. My only reason for mapping the file is to obtain an address I can pass to GetMappedFileName(). Therefore, there’s no need to map the entire file into memory. I’ve set an arbitrary limit of the first 1,024 bytes of the file (or the entire file, whichever is smaller).
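The size decision described above can be sketched in portable C. This is a minimal illustration, not the code from Listing 1: the names view_length and MAX_VIEW_BYTES are assumptions made for the example, and the Win32 mapping calls themselves are left out so that the sketch compiles anywhere.

```c
#include <stddef.h>

/* Hypothetical cap on the mapped view; the article uses 1,024 bytes. */
#define MAX_VIEW_BYTES 1024u

/*
 * Returns how many bytes to map for a file of the given size, or 0 if
 * the file is empty and therefore cannot be memory-mapped at all.
 */
size_t view_length(unsigned long long file_size)
{
    if (file_size == 0)
        return 0;                   /* empty file: nothing to map */
    if (file_size < MAX_VIEW_BYTES)
        return (size_t)file_size;   /* small file: map all of it */
    return MAX_VIEW_BYTES;          /* large file: map only the head */
}
```

A return value of zero tells the caller to give up early, which corresponds to the empty-file limitation discussed in the next paragraph.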

Of course, not every file handle can be memory-mapped. An open file handle could refer to a file, a pipe, a mailslot, a console, a “communication resource,” a directory, or even a raw disk device. In general, plain old files can be memory-mapped, with the big exception of empty files (of length zero bytes). The more exotic objects can’t be memory-mapped. The limitations of GetFileNameFromHandleNT() shown in Table 1 are essentially the limitations of what types of file handles can be memory-mapped.

GetMappedFileName()

After mapping the file, I have an address I can pass to GetMappedFileName(), an exotic function exported by psapi.dll. psapi.dll is only available on NT 4 and later (you can get psapi.h and psapi.lib from the Platform SDK at msdn.microsoft.com) and won’t be present on Win95 or Win98. In order to fail gracefully on systems where psapi.dll is not present, I use LoadLibrary() and GetProcAddress() instead of implicit linking via an export library. The weakness of this technique is that it isn’t type-safe. You must be very careful in translating the API function’s prototype from the SDK documentation or psapi.h into a pointer-to-function type definition, getting all of the argument types and the return type exactly right.

The inputs to GetMappedFileName() are straightforward. The first argument is a process handle — almost always the constant value returned by GetCurrentProcess(). The second argument is an arbitrary virtual address within the memory-mapped region; the base address is convenient. If you have a handle to another process (e.g., from CreateProcess()) and a memory location that is valid in the other process, GetMappedFileName() should accept them as well, but that’s unlikely unless you’re writing a debugger.

The third and fourth arguments are the output string buffer and its size. What GetMappedFileName() returns in this buffer, as shown in Table 2, might surprise you. According to the Platform SDK documentation, if you give it a memory address that corresponds to memory-mapped data, it “returns the name of the memory-mapped file.” However, if you do a thorough search of the documentation, you’ll find that GetMappedFileName() returns the name in “device format.”

GetMappedFileName() does not return the original path of the file. Nor could the returned string be passed to CreateFile() to open the same file. Nor, as far as I know, could the string be usefully passed to any other Win32 API.

The Unix-like string returned by GetMappedFileName() is the file’s “kernel object name.” This namespace is barely mentioned in the Win32 SDK documentation, for the very good reason that this namespace should be hidden from application programmers. You certainly don’t want to be bothered trying to figure out drive mappings; you want the operating system to take care of those details and just provide the drive letter. GetMappedFileName() is providing a fascinating — but entirely unwanted — glimpse into the kernel-level structure of NT.

At a glance, you can tell that NT organizes all of the physical devices in a hierarchical name tree, starting at \device and branching downward, very much like the /dev tree on a Unix system. When you get to the branch that specifies the storage device, the rest of the string is the file’s path information from the device’s root directory. Or rather, the rest of the string is interpreted in the context of the device’s namespace — which is in theory completely arbitrary. The rules for that namespace can be whatever the device driver decides to implement.

The important thing to notice is that there is no way of telling how many branches specify the device. You can’t do it just by counting backslash characters. As you can see from the examples in Table 2, sometimes the device-name part will have only two links (\Device\Floppy0), and sometimes it will have three (\Device\Harddisk0\Partition1). In theory, there could be any number of links in the device’s kernel object name. Nor can you decode the strings with logic. There’s nothing special about a link name like “LanmanRedirector” that you can depend on. That string will vary from system to system depending on the type of network provider and could be totally changed by a simple driver update.

Further discussion of kernel objects and names is outside the scope of this article. I would highly recommend Chapter 3 of Inside Windows NT or Inside Windows NT, Second Edition. Another good source of information is the Systems Internals website, www.sysinternals.com. All these provide excellent background material, but unless you’re willing to use undocumented NT internal API functions that could evaporate with the next release, their information isn’t of practical use. My object is to parse these strings and translate the device portion into a drive letter using only documented NT API functions.

QueryDosDevice()

QueryDosDevice() doesn’t parse kernel object names, but it does do part of the reverse: given a drive letter or other DOS device name (like PRN or AUX), it provides the kernel object name of the device. The code in gfnfhnt.c (Listing 1) works by iterating across all the DOS device names and testing each of them for a match with the leading portion of the string returned by GetMappedFileName().
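The "leading portion" test can be sketched as a plain string routine. This is an illustration only: the function name replace_device_prefix is hypothetical, the comparison is case-sensitive by assumption, and the requirement that the match end on a link boundary is a precaution added for this sketch rather than something the article states.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * If device_name (e.g. "\Device\Harddisk0\Partition1") is a leading
 * portion of kernel_name, writes dos_name (e.g. "C:") followed by the
 * rest of kernel_name into out and returns 1. Returns 0 otherwise,
 * including when out is too small.
 */
int replace_device_prefix(const char *kernel_name, const char *device_name,
                          const char *dos_name, char *out, size_t out_size)
{
    size_t dev_len = strlen(device_name);
    int written;

    if (dev_len == 0 || strncmp(kernel_name, device_name, dev_len) != 0)
        return 0;

    /* The match must end on a link boundary, so that
       \Device\Harddisk1 does not match \Device\Harddisk10\... */
    if (kernel_name[dev_len] != '\\' && kernel_name[dev_len] != '\0')
        return 0;

    written = snprintf(out, out_size, "%s%s", dos_name,
                       kernel_name + dev_len);
    return written >= 0 && (size_t)written < out_size;
}
```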

QueryDosDevice() is, like GetMappedFileName(), not available on all Win32 platforms. (It’s not supported by Windows 95.) So again, I access it with LoadLibrary() and GetProcAddress().

The first call to QueryDosDevice() passes a NULL pointer for the DOS device-name parameter, causing QueryDosDevice() to return a buffer containing a list of the device names available. The code scans that string list, passing each device name back to QueryDosDevice() to get the corresponding kernel device name. If this device name matches the leading portion of the file’s kernel object name, I can just substitute the drive letter, and I’m done.
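The list that comes back from the first call is a sequence of NUL-terminated strings, with one extra NUL after the last entry. Walking such a list might look like the sketch below. The names for_each_name, name_visitor, and count_drive_letters are made up for this example, and the visitor only counts drive-letter entries where real code would pass each name back to QueryDosDevice().

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: return nonzero to stop the walk early. */
typedef int (*name_visitor)(const char *name, void *context);

/*
 * Walks a double-NUL-terminated string list, calling visit() once per
 * entry. Returns 1 if a visitor stopped the walk, 0 if the list was
 * exhausted.
 */
int for_each_name(const char *list, name_visitor visit, void *context)
{
    const char *p = list;

    while (*p != '\0') {
        if (visit(p, context))
            return 1;
        p += strlen(p) + 1;     /* step over this entry and its NUL */
    }
    return 0;
}

/* Example visitor: counts entries that look like a drive letter ("C:"). */
int count_drive_letters(const char *name, void *context)
{
    if (isalpha((unsigned char)name[0]) && name[1] == ':' && name[2] == '\0')
        ++*(int *)context;
    return 0;                   /* keep going */
}
```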

This works fine for local disk drives, but QueryDosDevice() has some undocumented quirks when the device is a mapped network drive. In that case, the returned string has an extra link in it, inserted between the network driver name and the server name. For example, on my NT 4.0 development system, the kernel object name for my I: drive is \Device\LanmanRedirector\PSERVER\DevTools. (This drive is mapped to the share name “DevTools” on the server PSERVER using NT networking.) However, when the string “I:” is passed to QueryDosDevice(), it returns “\Device\LanmanRedirector\I:\Pserver\DevTools”. On a Win2000 RC2 system, the same call would return “\Device\LanmanRedirector\;I:0\Pserver\DevTools”. This wouldn’t be a problem if the strings returned by GetMappedFileName() and QueryDosDevice() were consistent, but they’re not! gfnfhnt.c (Listing 1) attempts to work around this by looking for the drive letter and colon anywhere in the device-name string. If it finds it, it deletes everything from the preceding backslash to the following backslash. This should remove the extra substring, so the result will match the convention used by GetMappedFileName().
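The workaround can be sketched as an in-place string edit. The function below is an illustration written for this discussion, not the code from Listing 1. The name strip_drive_link is hypothetical, and matching the drive letter without regard to case is an assumption.

```c
#include <ctype.h>
#include <string.h>

/*
 * Removes the link that contains "<drive>:" from device_name, in place.
 * Everything from the backslash before the drive letter up to (but not
 * including) the next backslash is deleted. Returns 1 if a link was
 * removed, 0 if the drive letter and colon were not found.
 */
int strip_drive_link(char *device_name, char drive)
{
    char *p;

    for (p = device_name; p[0] != '\0'; ++p) {
        if (toupper((unsigned char)p[0]) == toupper((unsigned char)drive) &&
            p[1] == ':') {
            char *start = p;
            char *end = p + 2;

            /* Back up to the backslash that begins this link. */
            while (start > device_name && *start != '\\')
                --start;
            if (*start != '\\')
                return 0;

            /* Advance to the backslash that begins the next link. */
            while (*end != '\0' && *end != '\\')
                ++end;

            memmove(start, end, strlen(end) + 1);
            return 1;
        }
    }
    return 0;
}
```

Applied to either of the two strings quoted above, with the drive letter I, this leaves \Device\LanmanRedirector\Pserver\DevTools.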

Also, the QueryDosDevice() documentation says: “If the function succeeds, the return value is the number of characters stored into the buffer....” In fact, this count is usually off by a couple of characters, so if you need the count, strlen() it yourself.

At this point, the task should be complete for any file on a local storage device or a mapped network drive. However, the behavior of QueryDosDevice() inserting the extra text is undocumented and therefore could change at any time. The next Windows update could break my logic, and I would exit the QueryDosDevice() loop without having found a match. Also, the file could be a network file that was opened without drive mapping by a UNC path string. For example, take “\\PSERVER\DevTools\temp\foo.txt”. Your user could have entered such a pathname or used an Open common dialog to select a file in the “Network Neighborhood” (or “My Network Places” on Win2000).

If the file is on a network share that is mapped to a drive letter, there is no way to tell which path string was originally used to open the file handle: the drive letter version or the UNC version. Both refer to the same object, and GetMappedFileName() will return the same kernel object name string in both cases. My code prefers the drive-letter version and will return it if it finds a match in the QueryDosDevice() loop.

If a match is not found in the loop, I need to convert the kernel object name into a UNC string. At first glance, that looks easy: the kernel object name string contains the server name and share name, so all I have to do is replace the network driver prefix with a backslash character. The problem is that I really don’t know how much of the kernel object name string is the device driver. Where does it end and the server name begin?

WNetEnumResource()

If the input file handle is valid, there must be a current network connection to it. By enumerating the \\server\share names of all of the current network connections, I should be able to recognize that substring in the file’s kernel object name without any special knowledge of device names. Once I’ve identified the \server\share links, the string transformation to produce a UNC path is trivial.

WNetEnumResource() lets you enumerate network resources. It provides the power to browse the network for shared printers, folders, or “containers.” A container could be a server or a domain, which you can enumerate recursively. WNetEnumResource() provides most of the behind-the-scenes functionality you see when you open the Network Neighborhood icon on your desktop. (See Mitch Stuart’s February 1997 WDJ article, “A Reusable Network Enumeration Class.”)

WNetEnumResource() must be preceded by a call to WNetOpenEnum(), where I specify what types of network resources I want to enumerate (in this case the currently open connections). Afterwards, I call WNetCloseEnum() to close the handle returned by WNetOpenEnum(). In between, I call WNetEnumResource() as many times as necessary to enumerate all the open connections. On each call, WNetEnumResource() writes a number of NETRESOURCE blocks into the caller’s buffer. The code in this loop is more complex than you might expect, because WNetEnumResource() could return any number of NETRESOURCEs, even none if it decides your buffer isn’t big enough to hold the next one. The loop in the code will process (at most) one NETRESOURCE on each iteration, reallocating a larger buffer when necessary.

In practice, the loop in gfnfhnt.c (Listing 1) starts out with a 4KB buffer and should never actually reallocate a larger one, but the logic does support the possibility, just in case. There is a subtle point here: WNetEnumResource() needs more space in the buffer you pass it than just sizeof (NETRESOURCE). If you look at the NETRESOURCE block’s definition, you’ll see it contains a number of string pointers:

typedef struct _NETRESOURCEA {
    ... <code omitted>
    LPSTR    lpLocalName;
    LPSTR    lpRemoteName;
    LPSTR    lpComment;
    LPSTR    lpProvider;
} NETRESOURCEA, *LPNETRESOURCEA;

Where are those character strings located in memory, and who is responsible for freeing them when they’re no longer needed? If WNetEnumResource() had allocated those strings elsewhere and returned their pointers, it would have a problem knowing when they could be deleted. The answer is that while WNetEnumResource() is building the array of NETRESOURCE blocks at the top of your buffer, it is also copying those strings into the bottom of that buffer. The pointers you see in the NETRESOURCE records are set to those copies, not to the original strings that the OS keeps internally. This very neatly solves the problem of who allocates and deletes those strings without any memory leakage, but it takes up more space in the buffer. Another factor is Unicode. Unicode is the native character set for NT. If yours is an ANSI application, the ANSI version of each API function (in this case WNetEnumResourceA()) takes care of the character-set conversions. This needs more memory space for temporary Unicode strings that get converted into ANSI. Instead of allocating those temporary strings on the stack or heap, WNetEnumResourceA() allocates them inside the same buffer. All in all, a surprising amount of buffer space could be required for each NETRESOURCE.
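To make the buffer layout concrete, here is a small portable sketch of the same packing idea. It is not how Windows implements WNetEnumResource(). The Record type is a made-up stand-in for NETRESOURCE, pack_records is a hypothetical name, and the caller is assumed to supply a buffer that is suitably aligned for Record.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record type standing in for NETRESOURCE. */
typedef struct {
    char *local_name;
    char *remote_name;
} Record;

/*
 * Packs up to count (local, remote) name pairs into buffer. Record
 * structures grow upward from the start of the buffer; string copies
 * grow downward from the end. Returns the number of records that fit
 * completely.
 */
size_t pack_records(void *buffer, size_t buffer_size,
                    const char *const local[], const char *const remote[],
                    size_t count)
{
    Record *records = (Record *)buffer;
    char *strings_bottom = (char *)buffer + buffer_size;
    size_t n;

    for (n = 0; n < count; ++n) {
        size_t local_len = strlen(local[n]) + 1;
        size_t remote_len = strlen(remote[n]) + 1;
        size_t used_top = (n + 1) * sizeof(Record);
        size_t free_bytes = (size_t)(strings_bottom - (char *)buffer);

        /* Stop when the next record plus its strings would collide. */
        if (used_top > free_bytes ||
            local_len + remote_len > free_bytes - used_top)
            break;

        strings_bottom -= local_len;
        memcpy(strings_bottom, local[n], local_len);
        records[n].local_name = strings_bottom;

        strings_bottom -= remote_len;
        memcpy(strings_bottom, remote[n], remote_len);
        records[n].remote_name = strings_bottom;
    }
    return n;
}
```

Because each record costs sizeof(Record) plus the length of its strings, a buffer sized for exactly one structure holds nothing at all, which is the subtle point made above.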

If I were enumerating servers in a domain or shared printers on a remote server, a WNetEnumResource() call might have to wait for network communications and might be slow to execute. In this case, however, I’m only scanning a list of current connections in the network redirector, so speed of execution isn’t an issue.

For any network share mapped as a local drive, WNetEnumResource() will return a NETRESOURCE with the drive letter in the lpLocalName string. It would be tempting to just skip any resource with a local name string on the theory that I’ve already determined that the file isn’t on a mapped drive. The code doesn’t do that; it will check for a match whether the service is mapped to a local drive or not, as insurance against a future Windows upgrade breaking the logic in the QueryDosDevice() loop for network drives.

The NETRESOURCE’s lpRemoteName string will contain the resource’s network name, in the form \\servername\servicename. I try to find a match for this within the file’s kernel object name. The Platform SDK doesn’t promise much about the syntax of the remote name, saying only that “it must follow the network provider’s naming conventions.” If this is the name of a shared folder, could it have a trailing backslash appended? In my testing, I have never seen one, but the code is properly paranoid on this point: it will work either way.

To do the string matching, I’ve used the Visual C++ library routine _tcsstr(), which performs a case-sensitive match. This really should be a case-insensitive match instead. In my experience, network names are always treated as case insensitive, but technically, this is up to the network provider and can’t be guaranteed by the Win32 API. I am not aware of any API function that lets you ask the network provider for its convention. Certainly, Windows networking treats those names as case insensitive, and I suspect that any network provider that treated such names as case sensitive would break a lot of code. So, the string matching should be case insensitive.

However, Visual C++ doesn’t supply a case-insensitive version of _tcsstr(). Since gfnfhnt.c (Listing 1) is already too long, I decided not to append a _tcsistr() string-matching subroutine. The elite developers who read WDJ don’t need to be shown how to do this, but proper paranoia would require this change.
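For readers who would like a starting point anyway, a case-insensitive substring search might look like the sketch below. It is an assumption-laden illustration rather than part of Listing 1: the name str_find_nocase is made up, it handles only single-byte (ANSI) strings rather than TCHAR, and its idea of letter case comes from the C library's tolower() in the current locale.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Case-insensitive counterpart to strstr(). Returns a pointer to the
 * first occurrence of needle in haystack, or NULL if there is none.
 * An empty needle matches at the start of haystack.
 */
const char *str_find_nocase(const char *haystack, const char *needle)
{
    if (*needle == '\0')
        return haystack;

    for (; *haystack != '\0'; ++haystack) {
        const char *h = haystack;
        const char *n = needle;

        /* Compare character by character, ignoring case. */
        while (*n != '\0' &&
               tolower((unsigned char)*h) == tolower((unsigned char)*n)) {
            ++h;
            ++n;
        }
        if (*n == '\0')
            return haystack;    /* the whole needle matched here */
    }
    return NULL;
}
```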

In my experience, the case-sensitive matching has never failed. My hypothesis is that both GetMappedFileName() and WNetEnumResource() are ultimately copying those strings from the same source, so they will never vary by case. I must emphasize that this is not guaranteed by the Platform SDK documentation.

When a match is found, I need to ensure that the entire share name string is matched. That is, I don’t want \\server\foo to match \device\networkprovider\server\fooBAR\readme.txt. Therefore, I require that either the last character matched from lpRemoteName or the next character in the kernel object name must be a backslash.
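A sketch of the whole-link test, together with the string transformation into a UNC path, follows. It is written for illustration and is not taken from Listing 1: kernel_name_to_unc is a hypothetical name, the comparison is case-sensitive, and a trailing backslash on the remote name is dropped up front so that a single boundary check covers both spellings.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Looks for remote_name ("\\server\share", optionally with a trailing
 * backslash) inside kernel_name. On a whole-link match, writes the UNC
 * path into out and returns 1; otherwise returns 0.
 */
int kernel_name_to_unc(const char *kernel_name, const char *remote_name,
                       char *out, size_t out_size)
{
    const char *share;      /* remote_name minus one leading backslash */
    const char *hit;
    size_t share_len;
    int written;

    if (remote_name[0] != '\\' || remote_name[1] != '\\')
        return 0;
    share = remote_name + 1;            /* "\server\share" */
    share_len = strlen(share);

    /* Ignore one trailing backslash so both spellings behave alike. */
    if (share_len > 1 && share[share_len - 1] == '\\')
        --share_len;

    for (hit = strchr(kernel_name, '\\'); hit != NULL;
         hit = strchr(hit + 1, '\\')) {
        if (strncmp(hit, share, share_len) != 0)
            continue;
        /* Reject partial links: \server\foo inside \server\fooBAR. */
        if (hit[share_len] != '\\' && hit[share_len] != '\0')
            continue;
        /* One extra leading backslash turns the tail into a UNC path. */
        written = snprintf(out, out_size, "\\%s", hit);
        return written >= 0 && (size_t)written < out_size;
    }
    return 0;
}
```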

The last complexity is that I want to reject spurious matches. Suppose the target file is \device\networkprovider\servername\sharename\temp\foo\bar\readme.txt, and by a bizarre coincidence, there happens to be a current network connection to a share named bar on a server named foo (\\foo\bar). I want to reject this and similar spurious matches, so two more features are added to the code. First, I verify that the UNC filename implied by the match refers to a real, existing file by calling GetFileAttributes(). This may cause some network communications traffic and slow down execution slightly, but it’s necessary in order to bulletproof the code. Since spurious matches will be very rare in the first place, the burden isn’t oppressive. Second, when I have a good match, I continue enumerating connections to see if an even better match is possible. The best match is defined as the one that starts closest to the root of the kernel object name.
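The two safeguards can be sketched as a selection routine over candidate matches. Everything here is illustrative: Candidate, exists_fn, pick_best_match, and demo_exists are hypothetical names, and demo_exists is a stand-in that never touches the network. In real code the callback would presumably wrap a GetFileAttributes() call.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if the path names a real file. */
typedef int (*exists_fn)(const char *unc_path);

typedef struct {
    const char *unc_path;   /* UNC path implied by this match */
    size_t offset;          /* where the match starts in the kernel name */
} Candidate;

/*
 * Returns the index of the verified candidate whose match starts
 * closest to the root of the kernel object name, or -1 if none of the
 * candidates refers to an existing file.
 */
int pick_best_match(const Candidate *candidates, size_t count,
                    exists_fn exists)
{
    int best = -1;
    size_t i;

    for (i = 0; i < count; ++i) {
        if (best >= 0 &&
            candidates[i].offset >= candidates[(size_t)best].offset)
            continue;       /* no closer to the root than the best so far */
        if (!exists(candidates[i].unc_path))
            continue;       /* spurious: the implied file does not exist */
        best = (int)i;
    }
    return best;
}

/* Stand-in for the existence check: only \\servername\... "exists". */
int demo_exists(const char *unc_path)
{
    static const char prefix[] = "\\\\servername\\";
    return strncmp(unc_path, prefix, sizeof prefix - 1) == 0;
}
```

Checking the offset before calling the callback keeps the potentially slow existence test away from candidates that could not win anyway.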

The To-Do List

I wrote the code in gfnfhnt.c (Listing 1) for portability and simplicity. Everything is in one function with a dead-simple argument list. While it works as is, you’ll probably want to modify it for your own application.

As discussed before, the case-sensitive string match routine _tcsstr() should be replaced by a case-insensitive version to ensure this continues to work in the future.

If the function is going to be called repeatedly, you’ll probably want to optimize the LoadLibrary()/GetProcAddress()/FreeLibrary() calls so that your application loads and unloads psapi.dll only once. Additional optimization would be difficult. You would not want to cache the drive letter assignments returned from QueryDosDevice() or the network connections returned by WNetEnumResource() because either of those could change while your program is running.

As written, GetFileNameFromHandleNT() simply returns FALSE if it fails. You certainly would want to install more erudite error reporting.

The code doesn’t support file paths longer than MAX_PATH, which require the “\\?\” prefix convention.

Finally, I don’t know whether this code will work correctly inside an NT service.

When Should You Use This Code

Never, if you can avoid it.

On the plus side, I’ve tested this code on NT 4.0 (SP3 through SP5) and Windows 2000 RC2, and it works fine. This file is a rewrite for WDJ of similar code I am using in a shipping application [5]. It can be compiled for ANSI or Unicode applications. It only uses published API functions documented in the Platform SDK, which should be available on future OS releases.

On the other hand, it’s inefficient code bloat; it won’t work on NT 3, Win95, or Win98; it won’t work for empty files; it won’t work for pipes, comm ports, etc.; it depends on under-documented behavior that could change in the next version of the OS; it could be slow to execute where dial-up networking is used. In my opinion, a good software engineer would never want to have this function in his code. You would be far, far better off rewriting your application to save the filename string at the point where the handle was opened and pass it to whatever code might need the filename.

However, programming is often a compromise with the devil. In my case, I had to deal with a file handle that was passed to me from the Windows debugging API — without a corresponding path string. Since my application didn’t open the handle, and I needed the path string to work around another bug in NT 4, I had no choice. If you find yourself trapped in a similar corner, I hope this article can help.

Acknowledgements

Thanks to Dan Chou at Microsoft Developer Support for his help with QueryDosDevice(). I could not have solved this puzzle without his assistance.

References

1. Helen Custer. Inside Windows NT (Microsoft Press, 1993).

2. David A. Solomon. Inside Windows NT, Second Edition (Microsoft Press, 1998).

3. Mark Russinovich and Bryce Cogswell, www.sysinternals.com.

4. Mitch Stuart. “A Reusable Network Enumeration Class,” Windows Developers Journal (February 1997).

5. Advanced Micro Devices. The CodeAnalyst application is available as part of the Athlon SDK from www.amd.com/swdev/swdev.html.

About the Author

Jim Conyngham has a BS degree in Computer Engineering and an MS in Computer Science. He has been developing software for over 25 years. He is currently a Member of Technical Staff at Advanced Micro Devices, working on development tools.
