Analyzing switch Statements - Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software - Michael Sikorski and Andrew Honig

Consider a simple switch statement that uses the variable i. Depending on the value of i, the code under the corresponding case value is executed.
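A minimal sketch of such a switch statement, written to be consistent with the disassembly in this section (each case prints i plus the case value using the "i = %d" format string). The function name switch_example and its return value are assumptions added for illustration so that the result can be checked; the disassembly itself only shows the printf calls.

```c
#include <stdio.h>

/* Hypothetical wrapper: the name and the return value are assumptions,
   not part of the disassembled program. */
int switch_example(int i)
{
    int printed = i;  /* left unchanged when no case matches */

    switch (i) {
    case 1:
        printed = i + 1;
        printf("i = %d", printed);
        break;
    case 2:
        printed = i + 2;
        printf("i = %d", printed);
        break;
    case 3:
        printed = i + 3;
        printf("i = %d", printed);
        break;
    default:
        break;        /* nothing is printed for any other value */
    }
    return printed;
}
```

Calling switch_example(2), for instance, takes the second case and prints "i = 4".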

The compiled assembly for this switch statement, shown in the listing below, contains a series of conditional jumps between ❶ and ❷. Each conditional jump is decided by the comparison that occurs directly before it.

The switch statement has three options, shown at ❸, ❹, and ❺. These code sections are independent of each other because each one ends with an unconditional jump to the end of the listing. (You’ll probably find that switch statements are easier to understand in the disassembly graph view.)

00401013        cmp     [ebp+var_8], 1
00401017        jz      short loc_401027 ❶
00401019        cmp     [ebp+var_8], 2
0040101D        jz      short loc_40103D
0040101F        cmp     [ebp+var_8], 3
00401023        jz      short loc_401053
00401025        jmp     short loc_401067 ❷
00401027 loc_401027:
00401027        mov     ecx, [ebp+var_4] ❸
0040102A        add     ecx, 1
0040102D        push    ecx
0040102E        push    offset unk_40C000 ; i = %d
00401033        call    printf
00401038        add     esp, 8
0040103B        jmp     short loc_401067
0040103D loc_40103D:
0040103D        mov     edx, [ebp+var_4] ❹
00401040        add     edx, 2
00401043        push    edx
00401044        push    offset unk_40C004 ; i = %d
00401049        call    printf
0040104E        add     esp, 8
00401051        jmp     short loc_401067
00401053 loc_401053:
00401053        mov     eax, [ebp+var_4] ❺
00401056        add     eax, 3
00401059        push    eax
0040105A        push    offset unk_40C008 ; i = %d
0040105F        call    printf
00401064        add     esp, 8

The disassembly graph for this code breaks down each of the switch options by separating the code to be executed from the next decision to be made. Three of the boxes in the figure, labeled ❶, ❷, and ❸, correspond directly to the case statement’s three different options. Notice that all of these boxes terminate at the bottom box, which is the end of the function. You should be able to use this graph to see the three checks the code must go through when var_8 is greater than 3.

From this disassembly, it is difficult, if not impossible, to know whether the original code was a switch statement or a sequence of if statements, because a compiled switch statement can look just like a group of if statements: both can contain a series of cmp and Jcc instructions. When analyzing a disassembly, you may not always be able to get back to the original source code, because there may be multiple ways to represent the same code constructs in assembly, all of which are valid and equivalent.
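To make the ambiguity concrete, the sketch below writes the same three-way decision once as a switch statement and once as a chain of if statements. The function names and the -1 "no match" return value are assumptions for illustration. The two functions behave identically, and a compiler is free to emit a cmp/Jcc sequence for either one, which is why the source form cannot be recovered with certainty.

```c
/* Hypothetical helpers: each returns the value the matching case would
   print, or -1 when no case matches. */
int with_switch(int i)
{
    switch (i) {
    case 1:  return i + 1;
    case 2:  return i + 2;
    case 3:  return i + 3;
    default: return -1;
    }
}

int with_if_chain(int i)
{
    if (i == 1)
        return i + 1;
    else if (i == 2)
        return i + 2;
    else if (i == 3)
        return i + 3;
    return -1;
}
```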

In the listing above, if the value of i were 3, three different comparisons would take place before the third case was executed. Adding one case to the switch statement, for a total of four, changes the generated assembly code drastically.

The assembly code for this larger switch statement uses a jump table, shown at ❷ in the listing below, which defines offsets to additional memory locations. The switch variable is used as an index into the jump table.

In this example, ecx contains the switch variable, and 1 is subtracted from it in the first line. In the C code, the case values range from 1 through 4, and the assembly code must adjust them to 0 through 3 so that the jump table can be properly indexed. The cmp and ja instructions that follow send any adjusted value above 3 straight to loc_401082, skipping all of the cases. The jump instruction at ❶ takes its target from the jump table.
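The same dispatch logic can be approximated in C with an array of function pointers. This is only a sketch: the handler functions and their return values are hypothetical stand-ins for the case bodies that the listing elides, and compiled code jumps to labels inside one function rather than calling separate functions.

```c
/* Hypothetical handlers standing in for the four elided case bodies. */
static int case_1(void) { return 1; }
static int case_2(void) { return 2; }
static int case_3(void) { return 3; }
static int case_4(void) { return 4; }

typedef int (*handler_fn)(void);

/* Plays the role of the jump table at off_401088: one entry per case. */
static const handler_fn jump_table[4] = { case_1, case_2, case_3, case_4 };

int dispatch(int i)
{
    /* sub ecx, 1: shift the case range 1..4 down to the index range 0..3.
       Unsigned arithmetic makes i == 0 wrap to a huge value, so a single
       "above" test rejects values on both sides of the range. */
    unsigned int index = (unsigned int)i - 1u;

    if (index > 3u)               /* cmp [ebp+var_8], 3 / ja loc_401082 */
        return 0;                 /* no case runs (assumed return value) */

    return jump_table[index]();   /* jmp ds:off_401088[edx*4] */
}
```

With this sketch, dispatch(4) selects the last table entry, while dispatch(0) and dispatch(5) both fall through to the default path without touching the table.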

In this jump instruction, edx is multiplied by 4 and added to the base of the jump table (0x401088) to determine which case code block to jump to. It is multiplied by 4 because each entry in the jump table is an address that is 4 bytes in size.
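The address arithmetic can be checked with a small helper. The function and macro names are assumptions for illustration; the base address 0x401088 and the 4-byte entry size come from the text and listing.

```c
#include <stdint.h>

#define JUMP_TABLE_BASE 0x401088u  /* off_401088 in the listing */
#define ENTRY_SIZE      4u         /* each entry is a 4-byte address */

/* Hypothetical helper: address of the jump table entry that is read
   for a given zero-based index (the value held in edx). */
uint32_t jump_table_entry_address(uint32_t index)
{
    return JUMP_TABLE_BASE + index * ENTRY_SIZE;
}
```

For example, when the switch variable is 3, the adjusted index is 2, so the entry is read from 0x401088 + 2 * 4 = 0x401090, which the listing shows holding the offset of loc_401058.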

00401016        sub     ecx, 1
00401019        mov     [ebp+var_8], ecx
0040101C        cmp     [ebp+var_8], 3
00401020        ja      short loc_401082
00401022        mov     edx, [ebp+var_8]
00401025        jmp     ds:off_401088[edx*4] ❶
0040102C   loc_40102C:
                ...
00401040        jmp     short loc_401082
00401042   loc_401042:
                ...
00401056        jmp     short loc_401082
00401058   loc_401058:
                ...
0040106C        jmp     short loc_401082
0040106E   loc_40106E:
                ...
00401082   loc_401082:
00401082        xor     eax, eax
00401084        mov     esp, ebp
00401086        pop     ebp
00401087        retn
00401087   _main   endp
00401088  ❷off_401088  dd offset loc_40102C
0040108C               dd offset loc_401042
00401090               dd offset loc_401058
00401094               dd offset loc_40106E

The graph for this type of switch statement, shown in Figure 6-4, is clearer than the standard disassembly view.

Figure 6-4. Disassembly graph of jump table switch statement example

As you can see, each of the four cases is broken down clearly into separate assembly code chunks. These chunks appear one after another in a column after the jump table determines which one to use. Notice that all of these boxes and the initial box terminate at the right box, which is the end of the function.
