Differences in x64 Architecture - Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software - Michael Sikorski and Andrew Honig - RutLib.com

shows a simple C program that accesses a memory address.

references global data (the variable x). In order to access this data, the instruction encodes the 4 bytes representing the data’s address. This instruction is not position independent, because it will always access address 0x00403374, but if this file were to be loaded at a different location, the instruction would need to be modified so that the mov instruction accessed the correct address, as shown in .

00401004 A1 ❶74 ❷33 ❸40 ❹00 mov     eax, dword_403374

You’ll notice that the bytes of the address are stored with the instruction at ❶, ❷, ❸, and ❹. Remember that the bytes are stored with the least significant byte first. The bytes 74, 33, 40, and 00 correspond to the address 0x00403374.

After recompiling for x64, shows the same mov instruction that appears in .

0000000140001058 8B 05 ❶A2 ❷D3 ❸00 ❹00 mov     eax, dword_14000E400

At the assembly level, there doesn’t appear to be any change. The instruction is still mov eax, dword_address, and IDA Pro automatically calculates the instruction’s address. However, the differences at the opcode level allow this code to be position-independent on x64, but not x86.

In the 64-bit version of the code, the instruction bytes do not contain the fixed address of the data. The address of the data is 14000E400, but the instruction bytes are A2 ❶, D3 ❷, 00 ❸, and 00 ❹, which correspond to the value 0x0000D3A2.

The 64-bit instruction stores the address of the data as an offset from the current instruction pointer, rather than as an absolute address, as stored in the 32-bit version. If this file were loaded at a different location, the instruction would still point to the correct address, unlike in the 32-bit version. In that case, if the file is loaded at a different address, the reference must be changed.

Instruction pointer–relative addressing is a powerful addition to the x64 instruction set that significantly decreases the number of addresses that must be relocated when a DLL is loaded. Instruction pointer–relative addressing also makes it much easier to write shellcode because it eliminates the need to obtain a pointer to EIP in order to access data. Unfortunately, this addition also makes it more difficult to detect shellcode, because it eliminates the need for a call/pop as discussed in . Many of those common shellcode techniques are unnecessary or irrelevant when working with malware written to run on the x64 architecture.

. The first four parameters of the call are passed in the RCX, RDX, R8, and R9 registers; additional ones are stored on the stack.

compares the stack management of 32-bit and 64-bit code. Notice in the graph for a 32-bit function that the stack size grows as arguments are pushed on the stack, and then falls when the stack is cleaned up. Stack space is allocated at the beginning of the function, and moves up and down during the function call. When calling a function, the stack size grows; when the function returns, the stack size returns to normal. In contrast, the graph for a 64-bit function shows that the stack grows at the start of the function and remains at that level until the end of the function.

shows an example of the disassembly for a function call compiled for a 32-bit processor.

shows the disassembly for the same function call compiled for a 64-bit processor:

shows an example of a prologue for a small function.

Example 21-6. Prologue code for a small function

00000001400010A0  mov     [rsp+arg_8], rdx 00000001400010A5  mov     [rsp+arg_0], ecx 00000001400010A9  push    rdi 00000001400010AA  sub     rsp, 20h

Here, we see that this function has two parameters: one 32-bit and one 64-bit. This function allocates 0x20 bytes from the stack, as required by all nonleaf functions as a place to provide storage for parameters. If a function has any local stack variables, it will allocate space for them in addition to the 0x20 bytes. In this case, we can tell that there are no local stack variables because only 0x20 bytes are allocated.

64-Bit Exception Handling

Unlike exception handling in 32-bit systems, structured exception handling in x64 does not use the stack. In 32-bit code, the fs:[0] is used as a pointer to the current exception handler frame, which is stored on the stack so that each function can define its own exception handler. As a result, you will often find instructions modifying fs:[0] at the beginning of a function. You will also find exploit code that overwrites the exception information on the stack in order to get control of the code executed during an exception.

Structured exception handling in x64 uses a static exception information table stored in the PE file and does not store any data on the stack. Also, there is an _IMAGE_RUNTIME_FUNCTION_ENTRY structure in the .pdata section for every function in the executable that stores the beginning and ending address of the function, as well as a pointer to exception-handling information for that function.

Назад: Why 64-Bit Malware?

Дальше: Windows 32-Bit on Windows 64-Bit