In our article, "Examining VxD Service Hooking" (Dr. Dobb's Journal, May 1996), we focused on how Windows 95 device drivers can take advantage of built-in support for monitoring or altering system calls inside the Windows 95 kernel (Virtual Machine Manager). This ability is extremely powerful because it enables the development of applications that can see events and actions occurring inside a system at a level of detail not otherwise possible. Windows NT, besides having a less documented internal architecture, does not provide any support for hooking system calls. In this article, we describe the architecture of NT so as to clearly define what is meant by the term "system call" when it is used in the context of Windows NT. We also go inside the Windows NT kernel (NTOSKRNL.EXE) to expose the mechanism by which a system-call request gets routed to the kernel routine that services it. We then show how device drivers can hook system calls, allowing them to see system requests both before and after they've been serviced.
To demonstrate the types of monitoring that can be achieved with system-call hooking, we present the design and implementation of an application, NTRegmon. NTRegmon uses hooking to show detailed information about every registry access that occurs on an NT system. It is also useful for studying NT registry usage, finding undocumented application-registry settings, and debugging your own registry-enabled programs.
The Windows NT System-Call Architecture
Most Windows NT developers think of the Win32 interface when they hear the term "system call." Calls like CreateFile, ShowWindow, PeekMessage, and others are what make up the operating system that is exported to programmers by the NT architecture. However, beneath this layer, which shares most of its definition with the Windows 95 implementation of Win32, is the real Windows NT operating system. The core components of NT are NTOSKRNL and WIN32K, named after the files they are loaded from, NTOSKRNL.EXE and WIN32K.SYS (in the system32 directory under the system root directory). When an NT system is diagrammed from the point of view of request paths, Win32 is actually a layer that runs on top of the kernel; see Figure 1.
Figure 1: NT architecture.
If you are familiar with NT's architecture from Helen Custer's Inside Windows NT (Microsoft Press, 1993), the sight of the WIN32K kernel-mode component may be surprising. In fact, WIN32K made its debut with NT 4.0 when much of Win32's graphics engine, previously implemented in user mode by GDI32.DLL and USER32.DLL, was moved into kernel mode to boost performance.
When an application makes a call to a Win32 function, the call is handled by a routine in one of the Win32 DLLs that make up the Win32 subsystem. In most cases, the routine performs operations that are specific to Win32, such as validating parameters, updating internal Win32 data structures, and breaking the request up into subrequests. Ultimately, though, the DLL usually calls upon native NT services provided by NTOSKRNL or WIN32K to carry out the system-related parts of the request. NTOSKRNL is invoked by a call to another DLL, NTDLL.DLL, which exports NTOSKRNL services to user-level subsystems like Win32, POSIX, and OS/2.
For example, a call to the Win32 function CreateFile is serviced by KERNEL32.DLL, which validates parameters and then, depending on the flags that were passed, makes one or more calls to NTOSKRNL via NTDLL wrappers. For instance, if the flags indicate that the CreateFile call should fail if a file of the same name already exists, a call is made to the kernel to see if that condition is true. Then the kernel is called to actually create or open the specified file, and finally, a third call to the kernel might be made to set some of the file's attributes.
One of NTDLL's primary jobs is to initialize a register with a system-call number that identifies the service in the kernel being called, and to execute a system-call trap. For the CreateFile example, the NTDLL function ZwCreateFile is one of the routines invoked. Example 1, ZwCreateFile's disassembly for x86 processors, demonstrates how thin a wrapper NTDLL provides for kernel services. NTDLL contains many snippets of code that look almost exactly like the example. What makes each routine unique is its system-call number and the number of parameters that are popped off the stack when the routine is finished. Calls to WIN32K also look like the ZwCreateFile example, but instead of being placed in a separate DLL, they are located within USER32 and GDI32.
Example 1: ZwCreateFile disassembly.
ZwCreateFile:
    mov  eax, 17h        ; system call number
    lea  edx, [esp+4]    ; pointer to params
    int  2Eh             ; NT x86 syscall trap
    ret  2Ch             ; pop params

"Zw"-prefixed calls, like ZwCreateFile, have alias names that are identical except that "Zw" is replaced with "Nt" (NtCreateFile); kernel services corresponding to "Zw" calls use the "Nt" prefix. Thus, an application linking with NTDLL can use ZwCreateFile or NtCreateFile to access the kernel service NtCreateFile.
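The operand of the final ret in Example 1 can be related to the routine's parameter count with a little arithmetic. The sketch below assumes that every parameter occupies one 4-byte stack slot, which is the usual convention for 32-bit x86 routines that clean up their own stack; the function names are ours and purely illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one 4-byte stack slot per parameter on 32-bit x86. */
enum { X86_STACK_SLOT_BYTES = 4 };

/* Bytes a wrapper must pop for a given number of parameters. */
uint32_t ret_operand_for(uint32_t param_count)
{
    return param_count * X86_STACK_SLOT_BYTES;
}

/* Inverse: recover the parameter count from a "ret N" operand. */
uint32_t param_count_from_ret(uint32_t ret_bytes)
{
    /* A byte count that is not a multiple of the slot size breaks the assumption. */
    assert(ret_bytes % X86_STACK_SLOT_BYTES == 0);
    return ret_bytes / X86_STACK_SLOT_BYTES;
}
```

Under that assumption, the ret 2Ch in Example 1 pops 0x2C = 44 bytes, which corresponds to 11 parameters.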
Nothing prevents applications from accessing NTDLL functions without going through Win32, but Win32 generally provides a more friendly interface than the native NT interface. In addition, since NTDLL's interface is undocumented, applications that access it directly run the risk, albeit small, that Microsoft may change it without notice.
The system-call trap is how NT changes gears from user-mode to kernel-mode execution to enter the privileged world of the operating-system kernel. When a trap occurs, the processor's execution mode changes and it begins executing on a kernel-mode stack. The kernel finds the address of the service that will handle the request by looking up a data structure referenced by a field in the executing thread's Thread Environment Block (TEB). The TEB contains all the information necessary for operation of a thread, such as its registers, its priority level, a pointer to its process, and so on. The data structure in question, which we'll call the Service Table List, is shown in Figure 2.
Figure 2: System-call data structures.
In the current implementation of NT 4.0, the list contains two entries that define system-call tables for NTOSKRNL and WIN32K calls. NT 3.51 and its predecessors have only one entry in the list: that for NTOSKRNL calls. Each entry is made up of four fields: The first is a pointer to an array of function addresses called a "service table"; the second field is 0 and is never referenced; the third field contains the number of system calls in the service table referenced by the first field; and the fourth field points to an argument table.
First, the kernel's system-call trap handler uses the system-call number passed to it via a register (EAX in Example 1) to determine which entry in the Service Table List it should access. WIN32K system calls have system-call numbers that start at 0x1000, whereas NTOSKRNL system calls begin at 0. Next, the handler ensures that it is dealing with a valid system-call number. It compares the number to the third field of the appropriate entry in the Service Table List, which contains the number of system calls in the table. (The handler subtracts 0x1000 from WIN32K call numbers before the comparison.) If the call number is less than that count, the handler obtains a pointer to the service table and indexes into it with the system-call number to obtain the address of the service it must call. It then indexes into the argument table and reads the number of bytes it must copy from the caller's stack onto its own stack as it calls the service. After the service returns, the handler performs some cleanup and returns from the trap into (usually) NTDLL.
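The dispatch steps just described can be modeled in ordinary user-mode C. This is only a sketch of the logic, not the kernel's code: the type and function names, the simplified service signature, the 64-byte argument limit, and the demo tables are all our own assumptions. It treats the third field as a count, as in the field description above, so valid indices run from 0 to count - 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified service signature (assumption): arguments arrive as 32-bit slots. */
typedef long (*service_fn)(const uint32_t *args);

typedef struct {
    const service_fn *service_table;   /* field 1: array of service addresses  */
    void             *unused;          /* field 2: 0, never referenced         */
    uint32_t          service_count;   /* field 3: number of calls in table    */
    const uint8_t    *argument_table;  /* field 4: argument bytes for each call */
} service_table_entry;

#define WIN32K_BASE    0x1000u
#define MAX_ARG_BYTES  64u      /* arbitrary limit for this model */
#define STATUS_INVALID (-1L)    /* placeholder, not a real NT status code */

long dispatch_system_call(const service_table_entry list[2],
                          uint32_t call_number,
                          const uint32_t *caller_stack)
{
    assert(list != NULL);

    /* Step 1: choose the list entry; WIN32K numbers start at 0x1000. */
    const service_table_entry *entry = &list[0];
    uint32_t index = call_number;
    if (call_number >= WIN32K_BASE) {
        entry = &list[1];
        index = call_number - WIN32K_BASE;
    }

    /* Step 2: validate. Some system threads have NULL WIN32K fields. */
    if (entry->service_table == NULL || index >= entry->service_count)
        return STATUS_INVALID;

    /* Step 3: copy the call's argument bytes from the caller's stack. */
    uint32_t kernel_stack[MAX_ARG_BYTES / 4] = {0};
    uint8_t arg_bytes = entry->argument_table[index];
    if (arg_bytes > MAX_ARG_BYTES)
        return STATUS_INVALID;
    if (arg_bytes != 0)
        memcpy(kernel_stack, caller_stack, arg_bytes);

    /* Step 4: call the service through the table. */
    return entry->service_table[index](kernel_stack);
}

/* Demo services and tables, invented for illustration only. */
long demo_sum(const uint32_t *a)  { return (long)(a[0] + a[1]); }
long demo_echo(const uint32_t *a) { return (long)a[0]; }

const service_fn demo_kernel_services[] = { demo_sum, demo_echo };
const uint8_t    demo_kernel_args[]     = { 8, 4 };
const service_fn demo_win32k_services[] = { demo_echo };
const uint8_t    demo_win32k_args[]     = { 4 };

const service_table_entry demo_list[2] = {
    { demo_kernel_services, NULL, 2, demo_kernel_args },
    { demo_win32k_services, NULL, 1, demo_win32k_args },
};
```

With the demo tables, call number 0 reaches demo_sum, call number 0x1000 reaches the single WIN32K demo service, and call numbers 2 and 0x1001 are rejected because they fall outside their tables.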
Each thread contains a pointer to a potentially unique copy of a Service Table List. However, all the Service Table Lists point to shared service and argument tables, with the exception of some system threads that have NULL fields for the WIN32K entry in their lists. The reason for having multiple identical Service Table Lists is unclear, but it seems to be the result of forward-thinking design. In the future, different threads may be presented with different versions of the NT kernel. Perhaps just as NT supports different user-mode personalities (POSIX, Win32, OS/2), it will also support different kernel personalities.
Hooking System Calls
Since each thread's TEB has its own Service Table List pointer, every thread could, in principle, have its own unique table of OS services. In practice, however, the service tables are globally shared, so changing an entry in either the NTOSKRNL or WIN32K service table to point to a hook routine in a device driver is all that is needed. Changing an entry hooks the call across all threads in the system, including any new threads that are created. Unfortunately, because NT does not provide a service-hooking function, NT version-dependent code must be written to hook specific services.
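The idea of hooking by replacing a table entry can be illustrated with a small user-mode model. Everything here is hypothetical: the names, the one-argument service signature, and the demo service are ours, and the sketch ignores issues a real driver would face, such as locating the table and coordinating with other threads while the entry is changed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified one-argument service signature (assumption). */
typedef long (*service_fn)(uint32_t arg);

typedef struct {
    service_fn *table;   /* writable array of service addresses */
    uint32_t    count;   /* number of entries in the array      */
} service_table;

typedef struct {
    uint32_t   index;     /* which entry was replaced            */
    service_fn original;  /* saved address of the real service   */
} hook_record;

/* Replace one entry with a hook routine, saving the original. Returns 1 on success. */
int hook_service(service_table *t, uint32_t index, service_fn hook, hook_record *rec)
{
    if (t == NULL || hook == NULL || rec == NULL || index >= t->count)
        return 0;
    rec->index = index;
    rec->original = t->table[index];
    t->table[index] = hook;
    return 1;
}

/* Put the saved original address back into the table. */
void unhook_service(service_table *t, const hook_record *rec)
{
    assert(t != NULL && rec != NULL && rec->index < t->count);
    t->table[rec->index] = rec->original;
}

/* Demo: a stand-in service and a hook that observes it before and after. */
long demo_service(uint32_t arg) { return (long)arg + 1; }

hook_record g_demo_hook;
unsigned g_before_count, g_after_count;
long g_last_status;

long demo_hook(uint32_t arg)
{
    g_before_count++;                          /* seen before servicing   */
    long status = g_demo_hook.original(arg);   /* pass to the real service */
    g_last_status = status;                    /* seen after servicing    */
    g_after_count++;
    return status;
}
```

Because callers always go through the table, every caller reaches demo_hook once the entry is replaced, and the hook can inspect both the incoming argument and the returned status before handing the result back.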
Two variables tie system-call hooking to a particular implementation of NT. The first is the offset in the TEB where the Service Table List pointer is stored, and the second is the set of system-call numbers that identify services. Since there is no published definition for the TEB, locating the Service Table List requires manually indexing into it by a fixed number of bytes and extracting the pointer. While the offset has not changed from NT 3.51 to NT 4.0, surprisingly, it is different across hardware platforms. System-call numbers, on the other hand, while constant across hardware platforms, changed between NT 3.51 and NT 4.0. For example, under NT 3.51, the system-call number for RegOpenKey is 0x4D, but under NT 4.0, it is 0x51.
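One way to keep this version dependence in a single place is a lookup table keyed by NT version. The sketch below is our own illustration: the enum, structure, and function names are hypothetical, and the only numbers it contains are the two quoted above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical version tags for the two releases discussed in the text. */
typedef enum { NT_351, NT_40, NT_VERSION_COUNT } nt_version;

#define SYSCALL_UNKNOWN 0xFFFFFFFFu   /* sentinel: no number recorded */

typedef struct {
    const char *name;                       /* service name as used in the text */
    uint32_t    number[NT_VERSION_COUNT];   /* call number for each NT version  */
} syscall_numbers;

/* Only the values given in the article; further rows would be filled in per release. */
const syscall_numbers k_known_calls[] = {
    { "RegOpenKey", { 0x4D, 0x51 } },
};

/* Look up the system-call number for a service under a given NT version. */
uint32_t syscall_number_for(const char *name, nt_version version)
{
    assert(name != NULL);
    if ((unsigned)version >= (unsigned)NT_VERSION_COUNT)
        return SYSCALL_UNKNOWN;
    for (size_t i = 0; i < sizeof k_known_calls / sizeof k_known_calls[0]; i++) {
        if (strcmp(k_known_calls[i].name, name) == 0)
            return k_known_calls[i].number[version];
    }
    return SYSCALL_UNKNOWN;
}
```

A driver built this way would determine the running NT version once at load time and pass it to the lookup, refusing to hook any service for which the table returns SYSCALL_UNKNOWN.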