x). To access this data, the instruction encodes the 4 bytes that represent the data's address. The instruction is not position-independent, because it always accesses address 0x00403374. If this file were loaded at a different location, the instruction would need to be modified so that the mov instruction accessed the correct address. The following listing shows the instruction and its bytes.
00401004 A1 ❶74 ❷33 ❸40 ❹00 mov eax, dword_403374
You’ll notice that the bytes of the address are stored with the instruction at ❶, ❷, ❸, and ❹. Remember that the bytes are stored with the least significant byte first. The bytes 74, 33, 40, and 00 correspond to the address 0x00403374.
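The byte ordering described above can be checked with a short Python sketch. It decodes the operand of the x86 instruction and shows the kind of patch a loader would have to apply after a rebase. The helper names and the 0x00400000 to 0x00500000 rebase are assumptions chosen for illustration; they do not come from the sample itself.

```python
import struct

# Bytes of the x86 instruction from the listing: opcode A1, then a
# 4-byte absolute address stored least significant byte first.
insn = bytes.fromhex("A1 74 33 40 00")

def absolute_operand(code: bytes) -> int:
    """Decode the little-endian 32-bit address that follows the A1 opcode."""
    (address,) = struct.unpack_from("<I", code, 1)
    return address

def relocate(code: bytes, delta: int) -> bytes:
    """Return a copy of the instruction with its embedded address shifted by delta."""
    patched = (absolute_operand(code) + delta) & 0xFFFFFFFF
    return code[:1] + struct.pack("<I", patched)

print(hex(absolute_operand(insn)))  # 0x403374

# Hypothetical rebase: the image moves from 0x00400000 to 0x00500000,
# so the address inside the instruction must move by the same amount.
moved = relocate(insn, 0x00500000 - 0x00400000)
print(moved.hex(" "))  # a1 74 33 50 00
```

The point of the sketch is that the instruction bytes themselves change when the image moves, which is why the 32-bit instruction is not position-independent.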
After recompiling for x64, the following listing shows the same mov instruction that appears in the x86 listing above.
0000000140001058 8B 05 ❶A2 ❷D3 ❸00 ❹00 mov eax, dword_14000E400
At the assembly level, nothing appears to have changed. The instruction is still mov eax, dword_address, and IDA Pro automatically calculates the address that the instruction references. However, the differences at the opcode level allow this code to be position-independent on x64 but not on x86.
In the 64-bit version of the code, the instruction bytes do not contain the fixed address of the data. The address of the data is 0x14000E400, but the instruction bytes are A2 ❶, D3 ❷, 00 ❸, and 00 ❹, which correspond to the value 0x0000D3A2.
The 64-bit instruction stores the address of the data as an offset from the instruction pointer, which holds the address of the next instruction while the current one executes, rather than as an absolute address as in the 32-bit version. If this file were loaded at a different location, the instruction would still point to the correct address. In the 32-bit version, by contrast, the reference must be changed whenever the file is loaded at a different address.
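The arithmetic behind the x64 listing can be reproduced with a few lines of Python. This is a minimal sketch that assumes the 6-byte `8B 05 disp32` encoding shown in the listing; the function name and the rebase delta are hypothetical.

```python
import struct

def rip_relative_target(insn_addr: int, code: bytes) -> int:
    """Resolve the data address referenced by an `8B 05 disp32` instruction."""
    (disp,) = struct.unpack_from("<i", code, 2)  # signed 32-bit displacement
    next_insn = insn_addr + len(code)            # RIP points past this instruction
    return next_insn + disp

# Bytes and address taken from the x64 listing.
insn = bytes.fromhex("8B 05 A2 D3 00 00")
target = rip_relative_target(0x140001058, insn)
print(hex(target))  # 0x14000e400

# Hypothetical rebase: if the whole image moves by delta, the same
# unmodified bytes still resolve to the moved data.
delta = 0x10000
print(rip_relative_target(0x140001058 + delta, insn) == target + delta)  # True
```

The instruction starts at 0x140001058 and is 6 bytes long, so the next instruction is at 0x14000105E, and 0x14000105E + 0xD3A2 = 0x14000E400, which is the address IDA Pro displays.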
Instruction pointer–relative addressing is a powerful addition to the x64 instruction set that significantly decreases the number of addresses that must be relocated when a DLL is loaded. It also makes shellcode much easier to write, because it eliminates the need to obtain a pointer to EIP in order to access data. Unfortunately, this addition also makes shellcode more difficult to detect, because it eliminates the need for the call/pop sequence that 32-bit shellcode typically uses to find its own address. Many of those common shellcode techniques are unnecessary or irrelevant when working with malware written to run on the x64 architecture.
Example 21-6. Prologue code for a small function
00000001400010A0 mov [rsp+arg_8], rdx
00000001400010A5 mov [rsp+arg_0], ecx
00000001400010A9 push rdi
00000001400010AA sub rsp, 20h
Here, we see that this function has two parameters: one 32-bit (saved from ecx) and one 64-bit (saved from rdx). The function allocates 0x20 bytes on the stack, as required of all nonleaf functions, to provide storage for the parameters of the functions it calls. If a function has any local stack variables, it will allocate space for them in addition to the 0x20 bytes. In this case, we can tell that there are no local stack variables because only 0x20 bytes are allocated.
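The reasoning about the prologue can be expressed as a small calculation. The sketch below assumes the Microsoft x64 calling convention (a 0x20-byte register home area and a return address at the top of the stack on entry) and IDA's usual convention that arg_0 sits 8 bytes above RSP at function entry. The function name `frame_layout` is hypothetical, and the local-variable estimate is only an upper bound, since a compiler may also reserve padding or space for extra outgoing arguments.

```python
SHADOW_SPACE = 0x20  # four 8-byte home slots reserved for a callee's register parameters

def frame_layout(pushes: int, sub_rsp: int) -> dict:
    """Describe the frame after `pushes` register pushes and `sub rsp, sub_rsp`."""
    moved = pushes * 8 + sub_rsp  # how far RSP has dropped since function entry
    return {
        "local_bytes": sub_rsp - SHADOW_SPACE,  # space left over for locals
        "return_address": moved,                # offsets are from RSP after the prologue
        "arg_0_home": moved + 0x08,
        "arg_8_home": moved + 0x10,
        # RSP is 8 bytes off 16-byte alignment at entry because of the return address.
        "aligned": (8 + moved) % 16 == 0,
    }

# Prologue from Example 21-6: one push (rdi), then sub rsp, 20h.
layout = frame_layout(pushes=1, sub_rsp=0x20)
for name, value in layout.items():
    print(name, hex(value) if not isinstance(value, bool) else value)
```

For this prologue the estimate is zero bytes of locals, and the two saved parameters end up at rsp+0x30 and rsp+0x38 once the prologue has finished.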
Unlike exception handling in 32-bit systems, structured exception handling in x64 does not use the stack. In 32-bit code, fs:[0] is used as a pointer to the current exception handler frame, which is stored on the stack so that each function can define its own exception handler. As a result, you will often find instructions modifying fs:[0] at the beginning of a function. You will also find exploit code that overwrites the exception information on the stack in order to get control of the code executed during an exception.
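As a rough illustration of what those 32-bit prologues look like at the byte level, the following sketch searches a buffer for two common encodings that read and write fs:[0]. The sample bytes are a hypothetical hand-assembled prologue, not taken from any real binary, and a plain byte search like this can produce false positives on real code.

```python
# Encodings of two common 32-bit instructions that touch fs:[0].
READ_FS0 = bytes.fromhex("64 A1 00 00 00 00")      # mov eax, fs:[0]
WRITE_FS0 = bytes.fromhex("64 89 25 00 00 00 00")  # mov fs:[0], esp

def find_seh_setup(code: bytes) -> list:
    """Return sorted (offset, kind) pairs for every fs:[0] access found in code."""
    hits = []
    for kind, pattern in (("read", READ_FS0), ("write", WRITE_FS0)):
        start = code.find(pattern)
        while start != -1:
            hits.append((start, kind))
            start = code.find(pattern, start + 1)
    return sorted(hits)

# Hypothetical prologue: push ebp / mov ebp, esp / push -1 / push handler /
# mov eax, fs:[0] / push eax / mov fs:[0], esp
sample = bytes.fromhex(
    "55 8B EC 6A FF 68 00 10 40 00 64 A1 00 00 00 00 50 64 89 25 00 00 00 00"
)
print(find_seh_setup(sample))  # [(10, 'read'), (17, 'write')]
```

The same search over x64 code would normally find nothing, which matches the point that x64 exception handling does not register handlers on the stack.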
Structured exception handling in x64 uses a static exception information table stored in the PE file and does not store any data on the stack. Also, there is an _IMAGE_RUNTIME_FUNCTION_ENTRY structure in the .pdata section for every function in the executable that stores the beginning and ending address of the function, as well as a pointer to exception-handling information for that function.
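To make the table layout concrete, here is a minimal sketch that parses raw .pdata bytes into entries and looks up the function covering a given address. It assumes each entry is three little-endian 32-bit relative virtual addresses (begin, end, unwind information), for 12 bytes per entry. The names `RuntimeFunction`, `parse_pdata`, and `find_function`, the zero-entry stop condition, and the sample values are all assumptions made for illustration.

```python
import struct
from typing import List, NamedTuple, Optional

class RuntimeFunction(NamedTuple):
    begin_rva: int        # start of the function
    end_rva: int          # first byte past the function
    unwind_info_rva: int  # pointer to the exception-handling information

ENTRY = struct.Struct("<III")  # three little-endian DWORDs, 12 bytes per entry

def parse_pdata(section: bytes) -> List[RuntimeFunction]:
    """Parse raw .pdata bytes, stopping at the first all-zero (padding) entry."""
    entries = []
    for offset in range(0, len(section) - ENTRY.size + 1, ENTRY.size):
        entry = RuntimeFunction(*ENTRY.unpack_from(section, offset))
        if entry.begin_rva == 0:
            break
        entries.append(entry)
    return entries

def find_function(entries: List[RuntimeFunction], rva: int) -> Optional[RuntimeFunction]:
    """Return the entry whose [begin, end) range contains rva, if any."""
    for entry in entries:
        if entry.begin_rva <= rva < entry.end_rva:
            return entry
    return None

# Hypothetical section contents: two functions followed by zero padding.
raw = ENTRY.pack(0x10A0, 0x10F0, 0xF000) + ENTRY.pack(0x1100, 0x1180, 0xF010) + bytes(12)
table = parse_pdata(raw)
print(find_function(table, 0x10AA))
```

Because the table gives a start and end for each listed function, it is also a convenient source of function boundaries when disassembling a 64-bit binary.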