## Monday, 4 December 2017

### Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 6

Exercise 6 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery6:

01:             mystery6
02: 2D E9 18 48   PUSH.W   {R3,R4,R11,LR}
03: 0D F2 08 0B   ADDW     R11, SP, #8
04: 04 68         LDR      R4, [R0]
05: 00 22         MOVS     R2, #0
06: 00 2C         CMP      R4, #0
07: 06 DD         BLE      loc_103B3B6

08:      loc_103B3A8
09: 50 F8 04 3F   LDR.W    R3, [R0,#4]!
10: 8B 42         CMP      R3, R1
11: 06 D0         BEQ      loc_103B3BE
12: 01 32         ADDS     R2, #1
13: A2 42         CMP      R2, R4
14: F8 DB         BLT      loc_103B3A8

15:      loc_103B3B6
16: 00 20         MOVS     R0, #0
17: 00 21         MOVS     R1, #0

18:      locret_103B3BA
19: BD E8 18 88   POP.W    {R3,R4,R11,PC}

20:      loc_103B3BE
21: B2 F1 20 03   SUBS.W   R3, R2, #0X20
22: 01 21         MOVS     R1, #1
23: 99 40         LSLS     R1, R3
24: 01 23         MOVS     R3, #1
25: 13 FA 02 F0   LSLS.W   R0, R3, R2
26: F5 E7         B        locret_103B3BA
27:             ; End of function mystery6

Due to the presence of 16 bit instructions, instructions having the .W suffix and function prologue and epilogue with PUSH and POP respectively, we are dealing with code in Thumb state.

The function apparently takes two arguments in R0 and R1, where R0 is a pointer to an unknown structure (line 4, line 9) and R1 is a value which is compared to elements of the unknown structure (line 10).
For the return values, it seems both R0 and R1 contain in this case return values, as both are set prior to exiting the function. For this purpose, we consult the official ARM documentation for the Procedure Call Standard http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf.

The relevant section for us is the following:

A double-word sized Fundamental Data Type (e.g., long long, double and 64-bit containerized vectors) is returned in r0 and r1.

Thus, the function seems to return a 64 bit value. Our preliminary function prototype is as follows:

long long mystery6 (unknownStruct* arg1, int arg2);

Lines 1-7 set register R2 to zero and examine the value stored at offset 0x0 of the unknown structure. When it is less or equal to zero, the function returns 0. Otherwise, the second block beginning with line 9 is executed.
There, the value stored af R0+0x4 of the unknown structure is loaded and the value of R0 increased by 0x4 afterwards (pre-indexed address mode). This value is compared to the second argument arg2.

In case of inequality, the register value of R2 is incremented by 1. Furthermore, R2 is compared to the value stored at offset 0x0 of the unknown structure. If it greater than or equal to this value, the function returns 0. If it is less than the value at offset 0x0, the control flow begins again at line 8.
In case of equality, the control flow continues at line 20.

Consequently, the program executes a loop and uses register R2 as a counter variable, while the value of the unknown structure at offset 0x0 serves as a limit parameter. At offset 0x4 of the unknown structure, there seems to be a sequence of values that each require 4 bytes of storage. We conclude that it probably is an array of type int. The preliminary structure is as follows:

unknownStruct:
field00_i; // limit or size parameter
field00_array_i; // array of integers

With this knowledge in mind, it is highly likely that the element field00_i stores the number of elements in the array. Thus, we give the variables more descriptive names:

unknownStruct:
arraySize_i; // number of integers in the variable arrayInts_p
arrayInts_p; // array of integers

Moreover, the second argument arg2 is compared to the array values and if it cannot be matched to any of the array values, the function returns 0. Thus, the function seems to search after the second argument in the array.

Beginning with line 20, we know that the counter variable R2 contains the array index where the second argument can be found. In line 21, the counter variable R2 is reduced by 0x20 (decimal 32) and stored in R3. Afterwards, a left shift of the value 0x1 by the value stored in R2 is performed and stored in R1. In mathematical terms, we have: R1 = 0x1 << (R2-0x20).
Afterwards, a left shift of the value 0x1 by the counter variable R2 is performed and stored in R0.
Lastly, the function epilogue is executed and the values from R3, R4 and R11 are restored.

At first glance, the left shifting operations might confuse the uninitiated eye. So what is actually happening here? It might help to recall that the function returns a 64-bit value and the lower 32 bits are stored in R0 and the upper 32 bits are stored in R1.
Rather than outlining the theoretical possibilities, it may make more sense to take some example values for the counter variable.
Let's consider three different cases:

1. 0 <= R2 <= 31
2. 32 <= R2 <= 63
3. 64 <= R2

In the first case, the first shift operation will simply store 0 in R1, as the substraction of 32 from a value in [0;31] will result in a shift operation that moves 0x1 beyond what can be represented with a 32 bit value. R0, however, will contain a single "1" bit exactly at the position defined by R2.

In the second case, the first operation will store a single "1" bit exactly at the position defined by (R2-32), whereas R0 only contains zeroes.

In the third case, both R0 and R1 will only contain zeroes as both shift operations try to shift the least significant bit 0x1 to beyond what can be represented with a 32 bit value.

In a nutshell, the result value R0:R1 is a bitmap or bitfield, which contains a single "1" bit at the position indicates by the counter variable R2, which in turn indicates where in the array of the first argument the second argument can be found.
There is, however, an important caveat: When the second argument is located after the 64th element of the input array, the function will return 0.

Lastly, we provide a decompiled function mystery6:

struct {
int arraySize_i;
int arrayInts[];
} unknownStruct;

long long getBitmapLocationOfValue (unknownStruct* arg1, int searchValue) {
int size = arg1->arraySize_i;
if (size <= 0) {
return 0;
}

int arrayIndex = 0;
while(arg1->arrayInts[arrayIndex] != searchValue) {
arrayIndex += 1;
if (arrayIndex >= size) {
return 0;
}
}

long long result = 0x1 << arrayIndex;
return result;

}