Book: Practical Malware Analysis: The Hands-On Guide to Dissecting Malicious Software

The figure shows how the message ATTACK AT NOON would be encoded using an XOR with the byte 0x3C. Each character is represented by a cell, with the ASCII character (or control code) at the top and the hex value of the character at the bottom.
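The encoding described above can be reproduced with a few lines of Python. This is a minimal sketch, not code from the book; the function name `xor_encode` is an illustrative assumption.

```python
def xor_encode(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


message = b"ATTACK AT NOON"
encoded = xor_encode(message, 0x3C)

# Print each encoded byte as hex, one cell per character of the message.
print(encoded.hex(" "))

# XOR is its own inverse, so applying the same key again restores the text.
assert xor_encode(encoded, 0x3C) == message
```

Because the same operation both encodes and decodes, an analyst who recovers the key can reuse the malware's own routine (or a script like this one) to read the hidden content.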


The table shows what the output of such a script might reveal: the first few bytes of the a.gif file encoded with different XOR keys. The goal of brute-forcing here is to try several different values for the XOR key until you see output that you recognize, in this case an MZ header. The first column lists the value being used as the XOR key, the second column shows the initial bytes of content as they are transformed, and the last column shows whether the suspected content has been found.
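A brute-force pass of this kind might look like the following sketch. The helper names and the sample bytes are hypothetical: the sample is just the start of a typical MZ header encoded with 0x12 for illustration, not the real contents of a.gif.

```python
def xor_bytes(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def brute_force_xor(data: bytes, marker: bytes = b"MZ") -> list[int]:
    """Return every single-byte key whose decoding starts with marker."""
    head = data[:len(marker)]
    return [key for key in range(1, 256) if xor_bytes(head, key) == marker]


# Hypothetical sample: the first bytes of an MZ header, XOR-encoded with 0x12.
sample = xor_bytes(b"MZ\x90\x00\x03\x00\x00\x00", 0x12)

for key in brute_force_xor(sample):
    decoded = xor_bytes(sample, key)
    print(f"key 0x{key:02x}: {decoded.hex(' ')}")
```

Key 0x00 is skipped because it leaves the content unchanged. Matching only on the two-byte MZ marker is a quick filter; a more careful check would also look for other recognizable header content.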


Notice how blatant the XOR key of 0x12 is, even at just a glance. Most of the bytes in the initial part of the header are 0x12! This demonstrates a particular weakness of single-byte encoding: it cannot effectively hide from a user manually scanning encoded content with a hex editor. If the encoded content has a large number of NULL bytes, the single-byte “key” becomes obvious, because any NULL byte XORed with the key yields the key itself.

The code for this modified XOR is not much more complicated than the original.

In the side-by-side comparison, the C code for the original XOR function is shown at left, and the NULL-preserving XOR function is on the right. If the key is 0x12, then any 0x00 or 0x12 byte will not be transformed, but any other byte will be transformed via an XOR with 0x12. When a PE file is encoded in this fashion, the key with which it is encoded is much less visually apparent.
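The book presents these functions in C; the sketch below restates the same two rules in Python for comparison. The function names are illustrative assumptions.

```python
def plain_xor(data: bytes, key: int) -> bytes:
    """Original scheme: XOR every byte with the key."""
    return bytes(b ^ key for b in data)


def null_preserving_xor(data: bytes, key: int) -> bytes:
    """Leave 0x00 and the key byte untouched; XOR everything else."""
    return bytes(b if b in (0x00, key) else b ^ key for b in data)


header = b"MZ\x90\x00\x12\x00"

print(plain_xor(header, 0x12).hex(" "))            # NULLs turn into the key
print(null_preserving_xor(header, 0x12).hex(" "))  # NULLs stay NULL

# The NULL-preserving variant is still its own inverse.
assert null_preserving_xor(null_preserving_xor(header, 0x12), 0x12) == header
```

The scheme stays reversible because a byte that is neither 0x00 nor the key can never encode to 0x00 or the key, so the decoder never confuses a transformed byte with a preserved one.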

Now compare the earlier hex dump (with the obvious 0x12 key) with the NULL-preserving version. The latter represents the same PE file, encoded again with 0x12, but this time using the NULL-preserving single-byte XOR encoding. As you can see, with the NULL-preserving encoding, it is more difficult to identify the XOR encoding, and there is no evidence of the key.


In the search results, most of the listed instructions are an XOR of a register with itself (such as xor edx, edx).

An XOR encoding loop may use either of the other two forms: an XOR of a register with a constant or an XOR of a register with a different register. If you are lucky, the XOR will be of a register with a constant, because that will confirm that you are probably seeing encoding, and you will know the key. The instruction xor edx, 12h is an example of an XOR of a register with a constant.

One of the signs of encoding is a small loop that contains the XOR function. Let’s look at the instruction we identified. As the IDA Pro flowchart shows, the XOR with 0x12 instruction does appear to be part of a small loop.

The table briefly describes some other simple encoding schemes. We won’t delve into the specifics of each of these techniques, but you should be aware of them so that you can recognize them if you see them.

If you have ever looked at a raw email message, you have seen Base64 encoding. Here, the top few lines show email headers followed by a blank line, with the Base64-encoded data at the bottom.

The figure shows how the transformation happens. The top line is the original string (ATT). The second line is the hex representation of ATT at the nibble level (a nibble is 4 bits). The third line shows the actual bits used to represent ATT. The fourth line is the value of the bits in each 6-bit section as a decimal number. Finally, the last line is the character used to represent each decimal number via an index into a reference string.
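The same steps can be traced in code. The sketch below is an illustration only; `encode_three_bytes` is a hypothetical helper that handles exactly one 3-byte group and ignores padding, and the reference string is the standard Base64 alphabet.

```python
import base64
import string

# Standard Base64 reference string: A-Z, a-z, 0-9, +, /
B64_ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_three_bytes(chunk: bytes) -> str:
    """Encode exactly three bytes into four Base64 characters."""
    if len(chunk) != 3:
        raise ValueError("this sketch handles exactly three bytes")
    bits = "".join(f"{b:08b}" for b in chunk)                    # 24 bits
    indexes = [int(bits[i:i + 6], 2) for i in range(0, 24, 6)]   # four 6-bit values
    return "".join(B64_ALPHABET[i] for i in indexes)


print(encode_three_bytes(b"ATT"))  # → QVRU

# Cross-check against the standard library implementation.
assert encode_three_bytes(b"ATT") == base64.b64encode(b"ATT").decode()
```

For ATT, the 24 bits split into the 6-bit values 16, 21, 17, and 20, which index the reference string at Q, V, R, and U.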


In the captured traffic, it appears at first as if both the URL path and the Cookie are Base64-encoded values. While the Cookie value appears to remain constant, it looks like the attacker is sending two different encoded messages in the two GET requests.

A quick way to encode or decode using the Base64 standard is with an online decoder. In the tool used here, enter the Base64-encoded content into the top window and click the button labeled Decode Safely As Text. For example, we can run the Cookie value through a Base64 decoder and examine the result.

We add the padding and try again:
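Restoring the padding is easy to script. The following is a minimal sketch using Python's standard `base64` module; the helper name `b64decode_unpadded` and the sample message are assumptions for illustration, not values from the captured traffic.

```python
import base64


def b64decode_unpadded(text: str) -> bytes:
    """Append the '=' padding that was stripped, then decode."""
    padded = text + "=" * (-len(text) % 4)
    return base64.b64decode(padded)


# Simulate a sender that strips the padding before transmitting.
unpadded = base64.b64encode(b"ATTACK AT NOON").decode().rstrip("=")

print(len(unpadded) % 4)             # nonzero: padding is missing
print(b64decode_unpadded(unpadded))  # decodes once padding is restored
```

A valid Base64 string always has a length divisible by four, so the number of missing `=` characters can be computed from the length alone.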

Both strings have all the characteristics of Base64 encoding, including a restricted, random-looking character set. The next step is to see what we find when we run them through a Base64 decoder.

The malware sample used this custom substitution cipher. Looking again at the strings output, we see that we mistook the custom string for the standard one, since it looked so similar. The actual indexing string was the preceding one, with the a character moved to the front of the string. The attacker simply used the standard algorithm and changed the encoding string. We try the decryption again, but this time with the new string.
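Decoding with a modified indexing string can be done by translating each character back to the standard alphabet and then calling a normal Base64 decoder. The sketch below assumes the custom string is the standard one with the a character moved to the front, as described above; the function names are hypothetical.

```python
import base64
import string

STANDARD = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
# Assumed custom indexing string: standard alphabet with 'a' moved to the front.
CUSTOM = "a" + STANDARD.replace("a", "")


def custom_b64decode(text: str, alphabet: str = CUSTOM) -> bytes:
    """Map custom-alphabet characters to standard ones, then decode."""
    translated = text.translate(str.maketrans(alphabet, STANDARD))
    return base64.b64decode(translated + "=" * (-len(translated) % 4))


def custom_b64encode(data: bytes, alphabet: str = CUSTOM) -> str:
    """Encode with standard Base64, then map into the custom alphabet."""
    return base64.b64encode(data).decode().translate(str.maketrans(STANDARD, alphabet))


secret = custom_b64encode(b"ATTACK AT NOON")
print(secret)                    # looks like Base64 but decodes to junk normally
print(custom_b64decode(secret))  # recovers the original message
```

Because only the indexing string changes, the output keeps the look of ordinary Base64, which is why a standard decoder produces plausible-looking but meaningless bytes.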

