# Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 6

Contents

Exercise 6 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called `mystery6`:

 `````` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 `````` ``````01: mystery6 02: 2D E9 18 48 PUSH.W {R3,R4,R11,LR} 03: 0D F2 08 0B ADDW R11, SP, #8 04: 04 68 LDR R4, [R0] 05: 00 22 MOVS R2, #0 06: 00 2C CMP R4, #0 07: 06 DD BLE loc_103B3B6 08: loc_103B3A8 09: 50 F8 04 3F LDR.W R3, [R0,#4]! 10: 8B 42 CMP R3, R1 11: 06 D0 BEQ loc_103B3BE 12: 01 32 ADDS R2, #1 13: A2 42 CMP R2, R4 14: F8 DB BLT loc_103B3A8 15: loc_103B3B6 16: 00 20 MOVS R0, #0 17: 00 21 MOVS R1, #0 18: locret_103B3BA 19: BD E8 18 88 POP.W {R3,R4,R11,PC} 20: loc_103B3BE 21: B2 F1 20 03 SUBS.W R3, R2, #0X20 22: 01 21 MOVS R1, #1 23: 99 40 LSLS R1, R3 24: 01 23 MOVS R3, #1 25: 13 FA 02 F0 LSLS.W R0, R3, R2 26: F5 E7 B locret_103B3BA 27: ; End of function mystery6 ``````

Due to the presence of 16 bit instructions, instructions having the `.W` suffix and function prologue and epilogue with `PUSH` and `POP` respectively, we are dealing with code in Thumb state.

The function apparently takes two arguments in `R0` and `R1`, where `R0` is a pointer to an unknown structure (line 4, line 9) and R1 is a value which is compared to elements of the unknown structure (line 10). For the return values, it seems both `R0` and `R1` contain in this case return values, as both are set prior to exiting the function. For this purpose, we consult the official ARM documentation for the Procedure Call Standard http://infocenter.arm.com/help/topic/com.arm.doc.ihi0042f/IHI0042F_aapcs.pdf.

The relevant section for us is the following:

A double-word sized Fundamental Data Type (e.g., long long, double and 64-bit containerized vectors) is returned in r0 and r1.

Thus, the function seems to return a 64 bit value. Our preliminary function prototype is as follows:

 ``````1 `````` ``````long long mystery6 (unknownStruct* arg1, int arg2); ``````

Lines 1-7 set register `R2` to zero and examine the value stored at offset `0x0` of the unknown structure. When it is less or equal to zero, the function returns 0. Otherwise, the second block beginning with line 9 is executed.

There, the value stored af `R0+0x4` of the unknown structure is loaded and the value of `R0` increased by `0x4` afterwards (pre-indexed address mode). This value is compared to the second argument arg2.

In case of inequality, the register value of `R2` is incremented by 1. Furthermore, `R2` is compared to the value stored at offset 0x0 of the unknown structure. If it greater than or equal to this value, the function returns 0. If it is less than the value at offset 0x0, the control flow begins again at line 8.

In case of equality, the control flow continues at line 20.

Consequently, the program executes a loop and uses register `R2` as a counter variable, while the value of the unknown structure at offset `0x0` serves as a limit parameter. At offset `0x4` of the unknown structure, there seems to be a sequence of values that each require 4 bytes of storage. We conclude that it probably is an array of type int. The preliminary structure is as follows:

 ``````1 2 3 `````` ``````unknownStruct: field00_i; // limit or size parameter field00_array_i; // array of integers ``````

With this knowledge in mind, it is highly likely that the element field00_i stores the number of elements in the array. Thus, we give the variables more descriptive names:

 ``````1 2 3 `````` ``````unknownStruct: arraySize_i; // number of integers in the variable arrayInts_p arrayInts_p; // array of integers ``````

Moreover, the second argument arg2 is compared to the array values and if it cannot be matched to any of the array values, the function returns 0. Thus, the function seems to search after the second argument in the array.

Beginning with line 20, we know that the counter variable `R2` contains the array index where the second argument can be found. In line 21, the counter variable `R2` is reduced by 0x20 (decimal 32) and stored in R3. Afterwards, a left shift of the value 0x1 by the value stored in `R2` is performed and stored in `R1`. In mathematical terms, we have: `R1 = 0x1 << (R2-0x20)`.

Afterwards, a left shift of the value `0x1` by the counter variable `R2` is performed and stored in `R0`.

Lastly, the function epilogue is executed and the values from `R3, R4 and R11` are restored.

At first glance, the left shifting operations might confuse the uninitiated eye. So what is actually happening here? It might help to recall that the function returns a 64-bit value and the lower 32 bits are stored in `R0` and the upper 32 bits are stored in `R1`.

Ratherthan outlining the theoretical possibilities, it may make more sense to take some example values for the counter variable. Let’s consider three different cases:

1. 0 <= R2 <= 31
2. 32 <= R2 <= 63
3. 64 <= R2

In the first case, the first shift operation will simply store 0 in `R1`, as the substraction of 32 from a value in `[0;31]` will result in a shift operation that moves `0x1` beyond what can be represented with a 32 bit value. `R0`, however, will contain a single “1” bit exactly at the position defined by R2.

In the second case, the first operation will store a single “1” bit exactly at the position defined by (`R2-32`), whereas `R0` only contains zeroes.

In the third case, both `R0` and `R1` will only contain zeroes as both shift operations try to shift the least significant bit 0x1 to beyond what can be represented with a 32 bit value.

In a nutshell, the result value `R0:R1` is a bitmap or bitfield, which contains a single “1” bit at the position indicates by the counter variable `R2`, which in turn indicates where in the array of the first argument the second argument can be found.

There is, however, an important caveat: When the second argument is located after the 64th element of the input array, the function will return 0.

Lastly, we provide a decompiled function `mystery6`:

 `````` 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 `````` ``````struct { int arraySize_i; int arrayInts[]; } unknownStruct; long long getBitmapLocationOfValue (unknownStruct* arg1, int searchValue) { int size = arg1->arraySize_i; if (size <= 0) { return 0; } int arrayIndex = 0; while(arg1->arrayInts[arrayIndex] != searchValue) { arrayIndex += 1; if (arrayIndex >= size) { return 0; } } long long result = 0x1 << arrayIndex; return result; } ``````