Sunday, 3 December 2017

Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 5

Exercise 5 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery5:

01:   mystery5
02: 03 46    MOV   R3, R0
03: 06 2B    CMP   R3, #6
04: 0D D0    BEQ   loc_1032596
05: 07 2B    CMP   R3, #7
06: 09 D0    BEQ   loc_1032592
07: 08 2B    CMP   R3, #8
08: 05 D0    BEQ   loc_103258E
09: 09 2B    CMP   R3, #9
10: 01 D0    BEQ   loc_103258A
11: 09 48    LDR   R0, =aA ; "A"
12: 70 47    BX    LR

13:   loc_103258A
14: 07 48    LDR   R0, =aB ; "B"
15: 70 47    BX    LR

16:   loc_103258E
17: 05 48    LDR   R0, =aC ; "C"
18: 70 47    BX    LR

19:   loc_1032592
20: 03 48    LDR   R0, =aD ; "D"
21: 70 47    BX    LR

22:   loc_1032596
23: 01 48    LDR   R0, =aE ; "E"
24: 70 47    BX    LR
25:   ; End of function mystery5

All instructions have a width of 16 bits, so we are dealing with code in Thumb state.
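The Thumb reading can be cross-checked by decoding the four BEQ instructions by hand. The sketch below is only an illustration: the helper names are hypothetical, and the instruction addresses are not printed in the listing, so they are inferred here by counting back two bytes per instruction from loc_103258A (which would put the function start at 0x1032574). The byte pairs are listed in little-endian order, so "0D D0" is the halfword 0xD00D.

```c
#include <assert.h>
#include <stdint.h>

/* Condition field (bits 11..8) of a 16-bit Thumb conditional branch.
 * A value of 0 means EQ. */
unsigned thumb_bcond_cond(uint16_t hw)
{
    return (hw >> 8) & 0xFu;
}

/* Target of a 16-bit Thumb conditional branch (encoding 1101 cond imm8)
 * located at address `addr`. In Thumb state the PC reads as the
 * instruction address plus 4, and the signed 8-bit immediate counts
 * halfwords. `addr` is an assumption inferred from the labels. */
uint32_t thumb_bcond_target(uint32_t addr, uint16_t hw)
{
    assert((hw & 0xF000u) == 0xD000u);   /* must be a B<cond> halfword */
    assert(thumb_bcond_cond(hw) < 0xEu); /* 0xE and 0xF are not branches */

    int32_t imm8 = (int32_t)(hw & 0xFFu);
    if (imm8 & 0x80)                     /* sign-extend the 8-bit offset */
        imm8 -= 0x100;

    return addr + 4u + (uint32_t)(imm8 * 2);
}
```

With the inferred addresses, thumb_bcond_target(0x1032578, 0xD00D) yields 0x1032596, and the remaining three branches resolve to 0x1032592, 0x103258E and 0x103258A, which matches the labels in the listing.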

One argument is passed to the function in register R0. Because it is only ever compared against small immediate constants, it is presumably a 32-bit integer.

The function has several exit points, each marked by a Branch and Exchange instruction (BX LR). Immediately before each BX LR, an LDR pseudo-instruction uses PC-relative addressing to load the address of a constant string into R0, which therefore holds the return value.
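To make the PC-relative addressing concrete, here is a minimal sketch that computes which literal-pool word each LDR reads. The helper names are hypothetical, and the instruction addresses are again an assumption, inferred from the labels by counting two bytes per instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Destination register (bits 10..8) of a 16-bit Thumb literal load. */
unsigned thumb_ldr_literal_rt(uint16_t hw)
{
    return (hw >> 8) & 0x7u;
}

/* Address read by a 16-bit Thumb "LDR Rt, [PC, #imm8*4]"
 * (encoding 01001 Rt imm8) located at address `addr`.
 * The PC reads as addr + 4 and is rounded down to a multiple of 4
 * before the word offset is added. */
uint32_t thumb_ldr_literal_addr(uint32_t addr, uint16_t hw)
{
    assert((hw & 0xF800u) == 0x4800u);   /* must be LDR (literal) */

    uint32_t base = (addr + 4u) & ~3u;   /* Align(PC, 4) */
    return base + (uint32_t)(hw & 0xFFu) * 4u;
}
```

Under those assumptions, the load on line 11 ("09 48", halfword 0x4809, at 0x1032586) reads the word at 0x10325AC, and the load on line 23 ("01 48", at 0x1032596) reads the word at 0x103259C. The five loads would therefore read five consecutive words directly after the function, which is what one would expect from a literal pool holding the string addresses.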

We arrive at the following function prototype:

char* mystery5 (int arg);

The pattern in lines 03 to 12 strongly suggests that the original program uses a switch statement, since the input value is compared against a run of consecutive constants. For the input 6 the string "E" is returned, for 7 the string "D", for 8 the string "C", and for 9 the string "B". Any other input falls through to lines 11 and 12 and returns "A".

Our proposed C code for mystery5 is as follows:

char* mystery5 (int arg) {
    switch (arg) {
        case 6:  return "E";
        case 7:  return "D";
        case 8:  return "C";
        case 9:  return "B";
        default: return "A";
    }
}
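As a sanity check, the switch can be compared against a literal, branch-by-branch transcription of the disassembly. This is only a sketch: the function names mystery5_switch, mystery5_chain and mystery5_forms_agree are hypothetical, and the return type is written as const char* here (a deliberate deviation from the prototype above) because string literals must not be modified.

```c
#include <assert.h>
#include <string.h>

/* Switch form of the proposed solution. */
const char *mystery5_switch(int arg)
{
    switch (arg) {
        case 6:  return "E";
        case 7:  return "D";
        case 8:  return "C";
        case 9:  return "B";
        default: return "A";
    }
}

/* Compare-and-branch chain in the same order as lines 03 to 12
 * of the listing: test 6, 7, 8, 9, then fall through to "A". */
const char *mystery5_chain(int arg)
{
    if (arg == 6) return "E";   /* CMP R3, #6 / BEQ loc_1032596 */
    if (arg == 7) return "D";   /* CMP R3, #7 / BEQ loc_1032592 */
    if (arg == 8) return "C";   /* CMP R3, #8 / BEQ loc_103258E */
    if (arg == 9) return "B";   /* CMP R3, #9 / BEQ loc_103258A */
    return "A";                 /* fall-through on lines 11 and 12 */
}

/* Returns 1 if both forms return equal strings for every input
 * in the inclusive range [lo, hi], otherwise 0. */
int mystery5_forms_agree(int lo, int hi)
{
    assert(lo <= hi);
    for (int i = lo; i <= hi; i++) {
        if (strcmp(mystery5_switch(i), mystery5_chain(i)) != 0)
            return 0;
    }
    return 1;
}
```

Calling mystery5_forms_agree(-10, 20) covers all four cases plus default values on both sides of the range. Agreement does not prove that the original source used a switch, only that both forms are consistent with the disassembly.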