Thursday, 7 December 2017

Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 11

Exercise 11 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery11 - the last exercise of the ARM chapter:

01: 010185B0             mystery11
02: 010185B0 2D E9 F8 4F   PUSH.W   {R3–R11,LR}
03: 010185B4 0D F2 20 0B   ADDW     R11, SP, #0x20
04: 010185B8 B0 F9 5A 30   LDRSH.W  R3, [R0,#0x5A]
05: 010185BC 07 46         MOV      R7, R0
06: 010185BE 90 46         MOV      R8, R2
07: 010185C0 00 EB 83 03   ADD.W    R3, R0, R3,LSL#2
08: 010185C4 D3 F8 84 A0   LDR.W    R10, [R3,#0x84]
09: 010185C8 7B 8F         LDRH     R3, [R7,#0x3A]
10: 010185CA 89 46         MOV      R9, R1
11: 010185CC CB B9         CBNZ     R3, loc_1018602
12: 010185CE B0 F9 5A 40   LDRSH.W  R4, [R0,#0x5A]
13: 010185D2 17 F1 20 02   ADDS.W   R2, R7, #0x20
14: 010185D6 00 EB 44 03   ADD.W    R3, R0, R4,LSL#1
15: 010185DA B3 F8 5C 50   LDRH.W   R5, [R3,#0x5C]
16: 010185DE 00 EB 84 03   ADD.W    R3, R0, R4,LSL#2
17: 010185E2 D3 F8 84 00   LDR.W    R0, [R3,#0x84]
18: 010185E6 83 89         LDRH     R3, [R0,#0xC]
19: 010185E8 06 6C         LDR      R6, [R0,#0x40]
20: 010185EA 03 EB 45 03   ADD.W    R3, R3, R5,LSL#1
21: 010185EE 9B 19         ADDS     R3, R3, R6
22: 010185F0 1C 78         LDRB     R4, [R3]
23: 010185F2 5B 78         LDRB     R3, [R3,#1]
24: 010185F4 43 EA 04 24   ORR.W    R4, R3, R4,LSL#8
25: 010185F8 43 8A         LDRH     R3, [R0,#0x12]
26: 010185FA 23 40         ANDS     R3, R4
27: 010185FC 99 19         ADDS     R1, R3, R6
28: 010185FE FD F7 8D FF   BL       sub_101651C

29: 01018602      loc_1018602
30: 01018602 BA 8E         LDRH     R2, [R7,#0x34]
31: 01018604 BB 6A         LDR      R3, [R7,#0x28]
32: 01018606 D0 18         ADDS     R0, R2, R3
33: 01018608 9A F8 02 30   LDRB.W   R3, [R10,#2]
34: 0101860C 0B B1         CBZ      R3, loc_1018612
35: 0101860E 00 22         MOVS     R2, #0
36: 01018610 00 E0         B        loc_1018614

37: 01018612      loc_1018612
38: 01018612 3A 6A         LDR      R2, [R7,#0x20]

39: 01018614      loc_1018614
40: 01018614 FB 8E         LDRH     R3, [R7,#0x36]
41: 01018616 B8 F1 00 0F   CMP.W    R8, #0
42: 0101861A 01 D0         BEQ      loc_1018620
43: 0101861C 80 18         ADDS     R0, R0, R2
44: 0101861E 9B 1A         SUBS     R3, R3, R2

45: 01018620      loc_1018620
46: 01018620 C9 F8 00 30   STR.W    R3, [R9]
47: 01018624 BD E8 F8 8F   POP.W    {R3–R11,PC}
48: 01018624             ; End of function mystery11

According to the exercise description, the called subroutine sub_101651C in line 28 takes three arguments and does not return anything. Thus, we know the registers R0, R1 and R2 are prepared and passed to the function.

The function is executed in Thumb mode, as we can see from the different instructions having both a width of 16 and 32 bits. Furthermore, Thumb specific instructions such as PUSH.W, POP.W and STR.W are contained in the disassembly.

With regard to the instruction semantics, we shed light on the following examples:

LDRSH.W  R3, [R0,#0x5A]: Load Signed Halfword Into Register. For the ARM architcture, a word is a 32-bit value and a half-word a 16-bit value. Thus, a signed 16-bit integer value (2 bytes) is loaded from the memory location [R0 + 0x5A] into R3.

LDRH     R3, [R0,#0xC]: Load Half Word Into Register. Slightly different than LDRSH, this instruction loads an unsigned 16-bit integer value into the destination register. In particular, it loads from memory at [R0 + 0xC] into register R3.

ADD.W    R3, R0, R3,LSL#2: This composition of arithmetic operations is equal to the following expression: R3 = R0 + (R3 << 2)

ORR.W    R4, R3, R4,LSL#8: This composition of logical operations is equal to the following expression: R4 = R3 |  (R4 << 8)

Let us know examine the data types of mystery11's arguments. R0 seems to be a pointer to a data structure, as we are accessing its field member at offset 0x5A. We create a provisional unknownStruct1:

unknownStruct1:
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4

R1 is first accessed in line 10 and its value is copied into R9, which is  used in line 46 as a destination address for storing a 32 bit value. Thus, we assume it is a pointer as well to an unknown memory structure.

The third argument in R2 is preserved in R8 (line 6) and used for a check against zero (line 41). Hence, we assume it is a 32-bit unsigned integer.

As far as the return value is concerned, we notice the return register R0 is set either in line 32 or 43 before the function exits. In both cases, an addition is performed so we assume it is of type uint32.

Our first function prototype proposal:

uint32 mystery11 (unknownStruct1* arg1, int* arg2, uint32 arg3);

Although the function contains no loops, the plethora of memory accesses does not permit us to reason about its functionality at a high level. Thus, we provide the first attempt to generate C code from the disassembly step-by-step along with the data structures:

unknownStruct1:
field20_uint32; // unknown 4 byte value - line 38
field28_uint32; // unknown 4 byte value - line 31
field34_uint16; // unknown 2 byte value - line 30
field36_uint16; // unknown 2 byte value - line 40
field3a_uint16; // 2 byte unsigned integer - line 9
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4


unknownStruct2:
field84_unknownstruct3_p; // pointer to an instance of unknownStruct3


unknownStruct3:
field02_c; // line 33
field12_uint16; // line 25
field0C_uint16; // line 18
field40_p; // line 19 


unknownStruct4:
field5c_uint16; // line 15


uint32 mystery11 (unknownStruct1* arg1, int* arg2, uint32 arg3) {

sint16 r3 = arg1->field5a_sint16; // line 4
unknownStruct1* r7 = arg1; // line 5
uint32 r8 = arg3; // line 6

unknownStruct2* r3 = arg1 + (arg1->field5a_sint16 * 4); // lines 4, 7

unknownStruct3* r10 = r3->field84_p; // line 8
uint16 r3 = arg1->field3a_uint16; // line 9
int* r9 = arg2; // line 10

if (r3 == 0) { // line 11

uint32 r2 = arg1 + 0x20; // line 13

sint16 r4 = arg1->field5a_sint16; // line 12

unknownStruct4* r3 = arg1 + (r4 * 2); // line 14

uint16 r5 = r3->field5c_uint16; // line 15

unknownStruct2* r3 = arg1 + (r4 * 4);  // line 16 - Probably unknownStruct5 == unknownStruct2, since we access the same memory location as in line 7

unknownStruct3* r0 = r3->field84_p; // line 17 - strong similarity to line 8

uint16 r3 = r0->field0C_uint16; // line 18
int* r6 = r0->field40_p; // line 19  - this is most likely a pointer due to the subsequent address calculations

uint32 r3 = r3 + (r5 * 2); // line 20
r3 = r6 + r3; // line 21

char r4 = r3[0]; // line 22
char r3 = r3[1]; // line 23 

r4 = r3 | (r4 << 8); // line 24 - we effectively obtain a 16-bit value by ORing two 8-bit values, where one was shifted 8 positions to the left

uint16 r3 = r0->field12_uint16; // line 25

r3 = r3 & r4; // line 26

r1 = r6 + r3; // line 27

sub_101651C(r0, r1, r2); 
}

uint16 r2 = arg1->field34_uint16; // line 30
uint32 r3 = arg1->field28_uint32; // line 31

uint32 r0 = r2 + r3; // line 32

char r3 = r10->field02_c; // line 33

if (r3 == 0) { // line 34
uint32 r2 = arg1->field20_uint32; // line 38
}
else {
uint32 r2 = 0; // line 35
}

uint16 r3 = arg1->field36_uint16; // line 40

if (arg3 != 0) { // lines 41,42
r0 = r0 + r2; // line 43
r3 = r3 + r2; // line 44
}

*arg2 = r3; // line 46
}


To be honest, this looks like a huge mess, but it is just our first attempt and we will try to refine the code and make it more readable.

Firstly, we follow the data flow from the end of the program to see what kind of value will be stored in the pointer arg2. We notice there can be at most a 32-bit value, so we change the data type of arg2 from int* to uint32*.
Secondly, we eliminate instructions that are either unnecessary or duplicated.
Thirdly, we try to minimize the number of C code by combining the load with the different arithmetic or comparison operations.
Fourthly, we try to give the variables more descriptive names. For instance, the third argument arg3 is only used to indicate or whether or not the addition in the last conditional block should be executed. Therefore, we choose its name to be "performAddition". The subroutine sub_101651C was named perform_magic.

unknownStruct1:
field20_uint32; // unknown 4 byte value - line 38
field28_uint32; // unknown 4 byte value - line 31
field34_uint16; // unknown 2 byte value - line 30
field36_uint16; // unknown 2 byte value - line 40
field3a_uint16; // 2 byte unsigned integer - line 9
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4


unknownStruct2:
field84_unknownstruct3_p; // pointer to an instance of unknownStruct3


unknownStruct3:
field02_c; // line 33
field12_uint16; // line 25
field0C_uint16; // line 18
field40_p; // line 19 


unknownStruct4:
field5c_uint16; // line 15


uint32 mystery11 (unknownStruct1* unknownStruct1Ptr, int* dest, uint32 indicator) {


unknownStruct2* r3 = unknownStruct1Ptr + (unknownStruct1Ptr->field5a_sint16 * 4); // lines 4, 7
unknownStruct3* r10 = r3->field84_unknownstruct3_p; // line 8

if (unknownStruct1Ptr->field3a_uint16 == 0) { // lines 9,11

sint16 r4 = unknownStruct1Ptr->field5a_sint16; // line 12

unknownStruct4* r3 = unknownStruct1Ptr + (r4 * 2); // line 14

uint16 r5 = r3->field5c_uint16; // line 15

unknownStruct2* r3 = unknownStruct1Ptr + (r4 * 4);  // line 16 - Probably unknownStruct5 == unknownStruct2, since we access the same memory location as in line 7

unknownStruct3* r0 = r3->field84_unknownstruct3_p; // line 17 - strong similarity to line 8

uint16 r3 = r0->field0C_uint16; // line 18
int* r6 = r0->field40_p; // line 19  - this is most likely a pointer due to the subsequent address calculations

uint32 r3 = r6 + r3 + (r5 * 2); // lines 20, 21


uint32 r4 = r3[1] | (r3[0] << 8); // lines 22, 23, 24 - we effectively obtain a 16-bit value by ORing two 8-bit values, where one was shifted 8 positions to the left

uint32 r3 = r0->field12_uint16 & r4; // lines 25, 26

r1 = r6 + r3; // line 27

perform_magic(r0, r1, &(unknownStruct1Ptr->field20_uint32)); // lines 13, 28
}

uint32 r0 = arg1->field34_uint16 + arg1->field28_uint32; // lines 30,31, 32
uint32 r2 = 0; // line 35

if (r10->field02_c == 0) { // line 34
r2 = arg1->field20_uint32; // line 38
}

uint16 r3 = unknownStruct1Ptr->field36_uint16; // line 40

if (indicator != 0) { // lines 41,42
r0 = r0 + r2; // line 43
r3 = r3 + r2; // line 44
}

*dest = r3; // line 46
return r0;
}

Wednesday, 6 December 2017

Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 10

Exercise 10 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery10:

01:             mystery10
02: 2D E9 70 48   PUSH.W   {R4–R6,R11,LR}
03: 0D F2 0C 0B   ADDW     R11, SP, #0xC
04: 37 F0 CC F9   BL       __security_push_cookie
05: 84 B0         SUB      SP, SP, #0x10
06: 0D 46         MOV      R5, R1
07: 00 24         MOVS     R4, #0
08: 10 2D         CMP      R5, #0x10
09: 16 46         MOV      R6, R2
10: 0C D3         BCC      loc_1010786
11: 1A 4B         LDR      R3, =__imp_GetSystemTime
12: 68 46         MOV      R0, SP
13: 1B 68         LDR      R3, [R3]
14: 98 47         BLX      R3
15: 00 9B         LDR      R3, [SP,#0x1C+var_1C]
16: 10 24         MOVS     R4, #0x10
17: 33 60         STR      R3, [R6]
18: 01 9B         LDR      R3, [SP,#0x1C+var_18]
19: 73 60         STR      R3, [R6,#4]
20: 02 9B         LDR      R3, [SP,#0x1C+var_14]
21: B3 60         STR      R3, [R6,#8]
22: 03 9B         LDR      R3, [SP,#0x1C+var_10]
23: F3 60         STR      R3, [R6,#0xC]

24:      loc_1010786
25: 2B 1B         SUBS     R3, R5, R4
26: 04 2B         CMP      R3, #4
27: 04 D3         BCC      loc_1010796
28: 11 4B         LDR      R3, =__imp_GetCurrentProcessId
29: 1B 68         LDR      R3, [R3]
30: 98 47         BLX      R3
31: 30 51         STR      R0, [R6,R4]
32: 04 34         ADDS     R4, #4

33:      loc_1010796
34: 2B 1B         SUBS     R3, R5, R4
35: 04 2B         CMP      R3, #4
36: 04 D3         BCC      loc_10107A6
37: 0C 4B         LDR      R3, =__imp_GetTickCount
38: 1B 68         LDR      R3, [R3]
39: 98 47         BLX      R3
40: 30 51         STR      R0, [R6,R4]
41: 04 34         ADDS     R4, #4

42:      loc_10107A6
43: 2B 1B         SUBS     R3, R5, R4
44: 08 2B         CMP      R3, #8
45: 09 D3         BCC      loc_10107C0
46: 07 4B         LDR      R3, =__imp_QueryPerformanceCounter
47: 68 46         MOV      R0, SP
48: 1B 68         LDR      R3, [R3]
49: 98 47         BLX      R3
50: 00 9B         LDR      R3, [SP,#0x1C+var_1C]
51: 32 19         ADDS     R2, R6, R4
52: 33 51         STR      R3, [R6,R4]
53: 01 9B         LDR      R3, [SP,#0x1C+var_18]
54: 08 34         ADDS     R4, #8
55: 53 60         STR      R3, [R2,#4]

56:      loc_10107C0
57: 20 46         MOV      R0, R4
58: 04 B0         ADD      SP, SP, #0x10
59: 37 F0 A4 F9   BL              __security_pop_cookie
60: BD E8 70 88   POP.W    {R4–R6,R11,PC}
61:             ; End of function mystery10

Although the function looks complicated at first, we notice it does not contain any kind of loops and only executes sequentially with a couple of conditionals.

Presumably, the security cookie push and pop operations are responsible for inserting and removing a stack cookie, which will protect the function from memory corruption vulnerabilities that aim at overwriting the function's return address [msdn]. 

mystery10 also invokes the following built-in Windows routines:

GetSystemTime [msdn]: 
GetSystemTime "retrieves the current system date and time. The system time is expressed in Coordinated Universal Time (UTC)."
Its function prototype is as follows:

void WINAPI GetSystemTime(
  _Out_ LPSYSTEMTIME lpSystemTime
);

It takes one parameter, which is a pointer to memory where the corresponding SYSTEMTIME structure will be stored. As we can see from the definition, all struct field are of type WORD, which means they are 16-bit unsigned integers. For our purposes, having the relative offsets of its members facilitates decompiling the function:

kd> dt _systemtime
uxtheme!_SYSTEMTIME
   +0x000 wYear            : Uint2B
   +0x002 wMonth           : Uint2B
   +0x004 wDayOfWeek       : Uint2B
   +0x006 wDay             : Uint2B
   +0x008 wHour            : Uint2B
   +0x00a wMinute          : Uint2B
   +0x00c wSecond          : Uint2B
   +0x00e wMilliseconds    : Uint2B

The pointer passed to the function as an argument is contained in register R0, which in turn contains the stack pointer (line 12). Thus, the SYSTEMTIME structure will be saved to the current stack location. 

GetCurrentProcessId [msdn]
The Windows routine GetCurrentProcessId "retrieves the process identifier of the calling process.". 
Its function prototype is as follows:

DWORD WINAPI GetCurrentProcessId(void);

It takes no parameter and returns an unsigned 32-bit integer value.

GetTickCount [msdn]:
The Windows routine GetTickCount "retrieves the number of milliseconds that have elapsed since the system was started, up to 49.7 days."
Its function prototype is:

DWORD WINAPI GetTickCount(void);

It takes no parameter and returns an unsigned 32-bit integer value.

QueryPerformanceCounter [msdn]:
The Windows routine QueryPerformanceCounter "retrieves the current value of the performance counter, which is a high resolution (<1us) time stamp that can be used for time-interval measurements."
Its function prototype is:

BOOL WINAPI QueryPerformanceCounter(
  _Out_ LARGE_INTEGER *lpPerformanceCount
);

It takes one parameter, which is a pointer to a _LARGE_INTEGER variable, which represents a 64-bit signed integer value.

After having discussed the different invoked Windows routines, we have a look at the characteristics of the function mystery10 again.
The disassembly is executed in Thumb mode, as we can infer from the 16-bit instructions and the typical PUSH.W / POP.W instructions.

As far as the instructions are concerned, there is an interesting branch instruction named BCC. BCC stands for Branch Carry Clear and accordingly will perform a branch to the destination when the Carry flag is not set, i.e. CF == 0. Note that the carry flag is only set when an arithmetic operation between two unsigned values overflows. 

mystery10 apparently takes three arguments in R0, R1 and R2. Although only the values in R1 and R2 are read (lines 6 und 9 respectively), there has to be a third parameter as otherwise the previously mentioned values would be stored in R0 and R1. 
The second argument in R1 seems to be a 32-bit unsigned integer, as it is used in CMP instructions (e.g. lines 8 and 26) followed by a BCC instruction. As already mentioned, BCC instructions check the Carry flag, which is set only when an arithmetic operation of unsigned values overflows.
R2 contains a pointer to an unknown memory structure. 

The return value of mystery10 is an unsigned 32 bit integer since it is initialized to 0 (line 7) and only addition operations are performed on it (e.g. lines 16, 32 and 41). 

Our provisional function prototype is as follows:

uint32 mystery10 (void* arg1, uint32 arg2, unknownStruct* arg3) ;

From the function's control flow, we can see that the size of arg2 determines whether the Windows routines are called or not. It is repeatedly compared to R4, which is initialized with 0 and increased for every called Windows routine. Rather than translating the disassembly directly to pseudocode / C, we examine the different input ranges of arg2 that trigger the invocation of Windows routines:

[0;3]: No routine called
[4;7]: GetCurrentProcessId,
[8;15]:GetCurrentProcessId, GetTickCount
[16-19]: GetSystemTime
[20-23]: GetSystemTime, GetCurrentProcessId
[24-31]: GetSystemTime, GetCurrentProcessId, GetTickCount
[32-[: GetSystemTime, GetCurrentProcessId, GetTickCount, QueryPerformanceCounter 

For each called Windows routine, its result value is written to the third argument arg3. Moreover, the number of bytes written to arg2 is added to the current value of R4, which is finally returned by the function (line 57).

We are now ready to create a first draft of mystery10 in C:

uint32 mystery10 (void* r0, uint32 arg2, unknownStruct* arg3) {
uint32 returnValue = 0; //line 7
if (checkValue > 15) { //line 8
SYSTEMTIME currentTime;
GetSystemTime(currentTime);
memcpy(arg3,&currentTime,sizeof(SYSTEMTIME)); // lines 17 - 23
returnValue = 16; // line 16
}
if (arg2 - returnValue > 3) {
uint32 processId = GetCurrentProcessId(); // lines 28-30
*(arg3 + returnValue) = processId; // line 31
returnValue += 4;
}
if (arg2 - returnValue > 3) {
uint32 tickCount = GetTickCount(); // lines 37-39
*(arg3 + returnValue) = tickCount; // line 40
returnValue += 4; // line 41
}

if (arg2 - returnValue > 7) {
_LARGE_INTEGER  perfCount = QueryPerformanceCounter();
*(arg3 + returnValue) = perfCount->HighPart; // line 52
*(arg3 + returnValue + 4) = perfCount->LowPart; // line 55
returnValue += 8; // line 54
}
return returnValue;
}

Effectively, the function returns the overall number of bytes written to the structure passed in arg3. The second argument arg2 can be considered a limit, up to which it is allowed to write bytes to arg3. Hence, it probably describes the size of the memory structure pointed to by arg3.

Finally, we provide a function with more descriptive names:

uint32 retrieveTimeInfo (void* r0, uint32 size, unknownStruct* dest) {
uint32 bytesWritten = 0; //line 7
if (checkValue > 15) { //line 8
SYSTEMTIME currentTime;
GetSystemTime(currentTime);
memcpy(dest,&currentTime,sizeof(SYSTEMTIME)); // lines 17 - 23
bytesWritten = 16; // line 16
}
if (size - bytesWritten > 3) {
uint32 processId = GetCurrentProcessId(); // lines 28-30
*(dest + bytesWritten) = processId; // line 31
bytesWritten += 4;
}
if (size - bytesWritten > 3) {
uint32 tickCount = GetTickCount(); // lines 37-39
*(dest + bytesWritten) = tickCount; // line 40
bytesWritten += 4; // line 41
}

if (size - bytesWritten > 7) {
_LARGE_INTEGER  perfCount = QueryPerformanceCounter();
*(dest + bytesWritten) = perfCount->HighPart; // line 52
*(dest + bytesWritten + 4) = perfCount->LowPart; // line 55
bytesWritten += 8; // line 54
}
return bytesWritten;
}

Tuesday, 5 December 2017

Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 9

Exercise 9 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery9:

01:             mystery9
02: 2D E9 30 48   PUSH.W   {R4,R5,R11,LR}
03: 0D F2 08 0B   ADDW     R11, SP, #8
04: 09 4D         LDR      R5, =byteArray
05: 06 E0         B        loc_100E312

06:      loc_100E304
07: 0B 78         LDRB     R3, [R1]
08: 5A 5D         LDRB     R2, [R3,R5]
09: 63 5D         LDRB     R3, [R4,R5]
10: 93 42         CMP      R3, R2
11: 04 D1         BNE      loc_100E318
12: 01 30         ADDS     R0, #1
13: 01 31         ADDS     R1, #1

14:      loc_100E312
15: 04 78         LDRB     R4, [R0]
16: 00 2C         CMP      R4, #0
17: F5 D1         BNE      loc_100E304

18:      loc_100E318
19: 0B 78         LDRB     R3, [R1]
20: 5A 5D         LDRB     R2, [R3,R5]
21: 03 78         LDRB     R3, [R0]
22: 5B 5D         LDRB     R3, [R3,R5]
23: 98 1A         SUBS     R0, R3, R2
24: BD E8 30 88   POP.W    {R4,R5,R11,PC}
25:             ; End of function mystery9

First of all, mystery9 has a striking similarity to the previously decompiled function mystery8. Its disassembly uses Thumb mode, as we can see for instance from the 16 bit instruction width.

In contrast to mystery9, it takes only two arguments of type char* (strings) and no additional limiting variable. The return value is a signed 32 bit integer, as we can see from line 23. The provisional function prototype is as follows:

sint32 mystery9 (char* arg1, char* arg2);

Merely by looking at the disassemblies, we see that mystery9's functionality can be considered a subset of mystery8's functionality. With the decompilation, the similarity becomes even more evident:

sint32 mystery9 (char* arg1, char* arg2) {
while (*arg1 != 0) {
if (byteArray[*arg1] != byteArray[*arg2]) {
arg1++;
arg2++;
}
}
return byteArray[*arg1] - byteArray[*arg2];
}

The variable byteArray has the same properties as in the function mystery8, i.e. it is an array holding the entire byte value range at the corresponding index : {0, 1, ..., 0xFE, 0xFF}. Likewise, the usage of byteArray can be omitted, since the index and array value are identical:

sint32 mystery9 (char* arg1, char* arg2) {
while (*arg1 != 0) {
if (*arg1 != *arg2) {
arg1++;
arg2++;
}
}
return *arg1 - *arg2;
}

As already mentioned, it essentially performs the same computations as mystery8, however, it does not take a limiting parameter. As a result, it continues in the main loop until the end of both strings or a differing character is found. The return value is the numerical difference between the first differing characters in the strings and thus, when no difference can be found, the value 0.
Therefore, the possible return values are:

  • 0: Both strings are equal
  • >0: The numerical value of the first differing character is greater in arg1 than in arg2
  • <0: The numerical value of the first differing character is smaller in arg1 than in arg2
Lastly, we provide a more descriptive version of mystery9:

sint32 strcmp(char* str1, char* str2) {
 while (*str1 != 0) {
  if (*str1 != *str2) {
   str1++;
   str2++;
  }
 }
 return *str1 - *str2;
}




Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 8

Exercise 8 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery8:

01:             mystery8
02: 2D E9 78 48   PUSH.W   {R3–R6,R11,LR}
03: 0D F2 10 0B   ADDW     R11, SP, #0x10
04: 0C 4E         LDR      R6, =byteArray
05: 09 E0         B        loc_100E34C

06:      loc_100E338
07: 05 78         LDRB     R5, [R0]
08: 01 3A         SUBS     R2, #1
09: 4D B1         CBZ      R5, loc_100E352
10: 0B 78         LDRB     R3, [R1]
11: 9C 5D         LDRB     R4, [R3,R6]
12: AB 5D         LDRB     R3, [R5,R6]
13: A3 42         CMP      R3, R4
14: 04 D1         BNE      loc_100E352 
15: 01 30         ADDS     R0, #1
16: 01 31         ADDS     R1, #1

17:      loc_100E34C
18: 00 2A         CMP      R2, #0
19: F3 DC         BGT      loc_100E338
20: 01 3A         SUBS     R2, #1

21:      loc_100E352
22: 00 2A         CMP      R2, #0
23: 01 DA         BGE      loc_100E35A
24: 00 20         MOVS     R0, #0
25: 04 E0         B        locret_100E364

26:      loc_100E35A
27: 0B 78         LDRB     R3, [R1]
28: 9A 5D         LDRB     R2, [R3,R6]
29: 03 78         LDRB     R3, [R0]
30: 9B 5D         LDRB     R3, [R3,R6]
31: 98 1A         SUBS     R0, R3, R2

32:      locret_100E364
33: BD E8 78 88   POP.W    {R3–R6,R11,PC}
34:             ; End of function mystery8

The function was compiled in Thumb mode, as we can see from the presence of 16 bit instructions, PUSH and POP instructions and Thumb-specific instructions, e.g. CBZ and instructions with the .W suffix.

It apparently takes three input arguments in R0, R1 and R3. Both R0 and R1 are pointers to a structure from which successive single bytes are read (lines 7, 10, 27 and 29), so we assume type char* for these two arguments (i.e. a string). R2, on the other hand, is a numerical value being decreased (lines 8, 20) and compared to 0. Thus, we assume it is a signed 32 bit integer value and used for counting.

The function either returns 0 (lines 24 and 25) or performs a subtraction involving R3 and R2 (line 31).  Therefore, we assume the return value to also be a 32 bit integer value.

Our provisional function prototype is as follows:

sint32 mystery8 (char* arg1, char* arg2, sint32 arg3);

As mentioned in the description, the variable byteArray is an array of 256 byte values, i.e. byteArray[] = {0,1,..,0xFE, 0xFF}

Due to the multitude of conditionals and load operations, it is in our opinion easier to first decompile it and afterwards analyze in detail what its purpose could potentially be. While the following C program can certainly be optimized, the resemblance to the disassembly could make it easier to understand:

sint32 mystery8 (char* arg1, char* arg2, sint32 arg3) {
while (arg3 > 0) { //lines 18 and 19
arg3 --; // line 8

if (*arg1 == 0) { // line 9
break; 
}

if (byteArray[*arg1] != byteArray[*arg2]) { // lines 10-14
break;
}

arg1++; // line 15
arg2++; // line 16
}

if ((arg3-1) < 0) { // lines 8,20,22,23
return 0;
}

return byteArray[*arg1] - byteArray[*arg2]; // lines 27-31

}


The function takes two string values and inputs and uses their single characters for indexing into the array byteArray. However, for each possible index in the range 0, 1, ... , 0xFF, the value of byteArray[index] is exactly the index itself. So it is not entirely clear why the byteArray is actually used. We feel it can be completely omitted for simplification:

sint32 mystery8 (char* arg1, char* arg2, sint32 arg3) {
while (arg3 > 0) { //lines 18 and 19
arg3 --; // line 8

if (*arg1 == 0) { // line 9
break; 
}

if (*arg1 != *arg2) { // lines 10-14
break;
}

arg1++; // line 15
arg2++; // line 16
}

if ((arg3-1) < 0) { // lines 8,20,22,23
return 0;
}

return *arg1 - *arg2; // lines 27-31

}

The central loop always compares the same positions of arg1 and arg2 and exits if and only if one of the following conditions is satisfied:

  1. The counter variable arg3 reaches 0 OR
  2. We have reached the last character of the first string arg1, i.e. the null character OR
  3. We have found a character in the strings arg1 and arg2 that do not match.
Before exiting the function, the counter variable is decremented and checked for non-negativity. If the value is negative, we know that either arg3 was originally <= 0 or we have exited the loop because arg3 was decremented until it reached 0. The return value in this case is 0.

If it is not negative, we know the counter variable was not the limiting factor, but rather the input strings did not match at a certain character position or the end of the first string argument was reached. The return value in this case is the difference between the character values of the input strings arg1 and arg2 at the lastly changed position. Thus, the possible return values are:
  • 0 when *arg1 == *arg2 (i.e. the characters are equal)
  • < 0 when *arg1 < * arg2 (i.e. the character of arg1 is numerically smaller than the character of arg2)
  • >0 when *arg1 > *arg2 (i.e. the character of arg1 is numerically greater than the character of arg2)
Lastly, let us consider a few examples
  1. mystery8("abc","abc",5) returns: 0 (because *arg1 - *arg2 = 0 -0 = 0)
  2. mystery8("abc","abc",2) returns: 0 (because the counter variable reached the limit)
  3. mystery8("abb","aba",5) returns: 1 (because *arg1 - *arg2 = 'b' - 'a' = 0x62 - 0x61 = 1)
  4. mystery8("abb","aba",2) returns: 0 (because the counter variable reached the limit)
  5. mystery8("aba","abb",5): returns: -1 (because *arg1 - *arg2 = 'a' - 'b' = 0x61 - 0x62 = -1)
  6. mystery8("aba","abb",2): returns: 0 (because the counter variable reached the limit)
To conclude, we can summarize the function behaviour in more descriptive words. The function takes two strings and a limiting position variable as inputs and compares the input strings for equality from the first character up to the position imposed by the limit variable. If no difference in the strings is found up to the limiting position, it returns 0. When a difference is found, it returns the numerical difference between the first mismatching character of the input strings.

Lastly, we provide a version of mystery8 with more descriptive names:

sint32 mystery8 (char* str1, char* str2, sint32 limit) {
 while (limit > 0) { //lines 18 and 19
  limit --; // line 8

   if (*str1 == 0) { // line 9
   break; 
  }
  
  if (*str1 != *str2) { // lines 10-14
   break;
  }
  
  str1++; // line 15
  str2++; // line 16
 }

  if ((limit-1) < 0) { // lines 8,20,22,23
  return 0;
 }

  return *str1 - *str2; // lines 27-31

}

    Monday, 4 December 2017

    Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 7

    Exercise 7 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery7:

    01:             mystery7
    02: 02 46         MOV      R2, R0
    03: 08 B9         CBNZ     R0, loc_100E1D8
    04: 00 20         MOVS     R0, #0
    05: 70 47         BX       LR 

    06:      loc_100E1D8
    07: 90 F9 00 30   LDRSB.W  R3, [R0]
    08: 02 E0         B        loc_100E1E4

    09:      loc_100E1DE
    10: 01 32         ADDS     R2, #1
    11: 92 F9 00 30   LDRSB.W  R3, [R2]

    12:      loc_100E1E4
    13: 00 2B         CMP      R3, #0
    14: FA D1         BNE      loc_100E1DE
    15: 10 1A         SUBS     R0, R2, R0
    16: 6F F3 9F 70   BFC.W    R0, #0x1E, #2
    17: 70 47         BX       LR
    18:             ; End of function mystery7

    Again, the function provided is executed in Thumb mode, due to several 16 bit instructions and instructions specific to Thumb mode such as CBNZ and the .W suffix such as in line 7, 11 and 16.

    mystery7() takes one argument, which is a pointer to a structure in memory. Load operations from this address solely read a single byte and the address read from is always increased by 1. Thus, the function argument is probably a pointer to a character array, i.e. a string. The return value is yet to be determined, but it is generated by manipulating the address of the input argument, so it will be a 32 bit value. Our preliminary function prototype:

    uint32 mystery7 (char* arg);

    There is also an interesting instruction in line 16 - BFC. BFC stands for Bit Field Clear and clears a certain number of bits in the destination register. The instruction at hand is BFC.W R0, #0x1E, #2 and clears two bits starting at the least significant bit #0x1E (=30), i.e. the two most significant bits 30 and 31.

    In line 3, the input argument is inspected for nullness, in which case the function returns 0.

    If it is not null, the character stored at the address pointed to by the input argument is loaded (line 11) and compared to 0 (line 13). When it is not 0, the next character is loaded and compared again to 0 and this process is repeated in a loop. Presumably we are searching the end of a string, which is delimited by the null-character 0.

    When the loaded character value matches 0, the address of the start of the array is subtracted from the address where the first 0-character was found. Thereby, we calculate the length of the string and store it in register R0. The last instruction before leaving mystery7() involves clearing the two most significant bits. It does not make immediately sense why the two most significant bits are cleared - maybe there is a maximum string length imposed by the compiler.

    We are now ready to fully decompile mystery7 to:

    uint32 calcStringLength (char* arg) {
    if (arg==null) {
    return 0;
    }

    uint32 counter = 0;

    while (arg[counter] != 0) {
    counter ++;
    }

    uint32 length = arg[counter] - arg;
    length = length && 0x3FFFFFFF; // clear two most significant bits

    return length;
    }