Monday, 4 December 2017

Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 7

Exercise 7 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery7:

01:             mystery7
02: 02 46         MOV      R2, R0
03: 08 B9         CBNZ     R0, loc_100E1D8
04: 00 20         MOVS     R0, #0
05: 70 47         BX       LR 

06:      loc_100E1D8
07: 90 F9 00 30   LDRSB.W  R3, [R0]
08: 02 E0         B        loc_100E1E4

09:      loc_100E1DE
10: 01 32         ADDS     R2, #1
11: 92 F9 00 30   LDRSB.W  R3, [R2]

12:      loc_100E1E4
13: 00 2B         CMP      R3, #0
14: FA D1         BNE      loc_100E1DE
15: 10 1A         SUBS     R0, R2, R0
16: 6F F3 9F 70   BFC.W    R0, #0x1E, #2
17: 70 47         BX       LR
18:             ; End of function mystery7

Again, the function provided executes in Thumb mode: it mixes 16-bit and 32-bit encodings and uses features specific to Thumb, such as the CBNZ instruction and the .W width suffix in lines 7, 11 and 16.

mystery7() takes one argument, which is a pointer to a structure in memory. Every load through this pointer reads a single byte, and the address read from is only ever advanced by 1. Thus, the function argument is probably a pointer to a character array, i.e. a string. The return value is yet to be determined, but it is computed from the address passed in, so it will be a 32-bit value. Our preliminary function prototype:

uint32_t mystery7(char *arg);

There is also an interesting instruction in line 16: BFC. BFC stands for Bit Field Clear and clears a contiguous run of bits in the destination register. The instruction at hand, BFC.W R0, #0x1E, #2, clears a field of width 2 whose least significant bit is bit 0x1E (= 30), i.e. bits 30 and 31, the two most significant bits of R0.
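To make the effect of BFC concrete, here is a minimal C sketch of the same operation on a 32-bit value. The helper name bfc32 is hypothetical (it is not from the book), and the sketch assumes lsb < 32 and lsb + width <= 32.

```c
#include <stdint.h>

/* Hypothetical helper: emulates BFC Rd, #lsb, #width on a 32-bit value.
   Assumes lsb < 32 and lsb + width <= 32. */
uint32_t bfc32(uint32_t value, unsigned lsb, unsigned width)
{
    /* `width` one-bits; width == 32 is special-cased to avoid an undefined shift. */
    uint32_t ones = (width >= 32) ? UINT32_MAX : ((UINT32_C(1) << width) - 1u);

    /* Move the ones to the field position and clear exactly those bits. */
    return value & ~(ones << lsb);
}
```

With lsb = 0x1E and width = 2, the cleared field is 0xC0000000, so the operation is equivalent to masking the value with 0x3FFFFFFF.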

In line 3, the input argument is checked against null; if it is null, the function returns 0 (lines 4 and 5).

If it is not null, the character stored at the address pointed to by the input argument is loaded (line 7) and compared to 0 (line 13). As long as it is not 0, the pointer is advanced by one, the next character is loaded (lines 10 and 11) and compared to 0 again, and this process repeats in a loop. Presumably we are searching for the end of a string, which is terminated by the null character 0.

When the loaded character value matches 0, the address of the start of the array is subtracted from the address where the first 0-character was found (line 15). This yields the length of the string, which is stored in register R0. The last instruction before leaving mystery7() clears the two most significant bits of that length. It is not immediately clear why these bits are cleared; maybe there is a maximum string length imposed by the compiler.
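The walk described above can be sketched instruction by instruction in C. This is only an illustration: the function name mystery7_sketch is hypothetical, and the variables r0 and r2 are named after the registers they stand in for.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register-level sketch of mystery7. */
uint32_t mystery7_sketch(const char *r0)
{
    const char *r2 = r0;                    /* MOV   R2, R0              */

    if (r0 == NULL) {                       /* CBNZ  R0, loc_100E1D8     */
        return 0;                           /* MOVS  R0, #0 ; BX LR      */
    }

    while (*r2 != 0) {                      /* LDRSB.W ; CMP R3, #0 ; BNE */
        r2++;                               /* ADDS  R2, #1              */
    }

    uint32_t len = (uint32_t)(r2 - r0);     /* SUBS  R0, R2, R0          */
    return len & UINT32_C(0x3FFFFFFF);      /* BFC.W R0, #0x1E, #2       */
}
```

For any string shorter than 2^30 bytes the final mask changes nothing, so the result matches what the standard strlen would report.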

We are now ready to fully decompile mystery7 to:

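uint32_t calcStringLength(char *arg) {
    if (arg == NULL) {
        return 0;
    }

    uint32_t counter = 0;

    // advance until the terminating null character is found
    while (arg[counter] != 0) {
        counter++;
    }

    // address of the terminating 0 minus the start address
    uint32_t length = (uint32_t)(&arg[counter] - arg);
    length = length & 0x3FFFFFFF; // clear two most significant bits

    return length;
}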
