Exercise 11 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery11 - the last exercise of the ARM chapter:
01: 010185B0 mystery11
02: 010185B0 2D E9 F8 4F PUSH.W {R3–R11,LR}
03: 010185B4 0D F2 20 0B ADDW R11, SP, #0x20
04: 010185B8 B0 F9 5A 30 LDRSH.W R3, [R0,#0x5A]
05: 010185BC 07 46 MOV R7, R0
06: 010185BE 90 46 MOV R8, R2
07: 010185C0 00 EB 83 03 ADD.W R3, R0, R3,LSL#2
08: 010185C4 D3 F8 84 A0 LDR.W R10, [R3,#0x84]
09: 010185C8 7B 8F LDRH R3, [R7,#0x3A]
10: 010185CA 89 46 MOV R9, R1
11: 010185CC CB B9 CBNZ R3, loc_1018602
12: 010185CE B0 F9 5A 40 LDRSH.W R4, [R0,#0x5A]
13: 010185D2 17 F1 20 02 ADDS.W R2, R7, #0x20
14: 010185D6 00 EB 44 03 ADD.W R3, R0, R4,LSL#1
15: 010185DA B3 F8 5C 50 LDRH.W R5, [R3,#0x5C]
16: 010185DE 00 EB 84 03 ADD.W R3, R0, R4,LSL#2
17: 010185E2 D3 F8 84 00 LDR.W R0, [R3,#0x84]
18: 010185E6 83 89 LDRH R3, [R0,#0xC]
19: 010185E8 06 6C LDR R6, [R0,#0x40]
20: 010185EA 03 EB 45 03 ADD.W R3, R3, R5,LSL#1
21: 010185EE 9B 19 ADDS R3, R3, R6
22: 010185F0 1C 78 LDRB R4, [R3]
23: 010185F2 5B 78 LDRB R3, [R3,#1]
24: 010185F4 43 EA 04 24 ORR.W R4, R3, R4,LSL#8
25: 010185F8 43 8A LDRH R3, [R0,#0x12]
26: 010185FA 23 40 ANDS R3, R4
27: 010185FC 99 19 ADDS R1, R3, R6
28: 010185FE FD F7 8D FF BL sub_101651C
29: 01018602 loc_1018602
30: 01018602 BA 8E LDRH R2, [R7,#0x34]
31: 01018604 BB 6A LDR R3, [R7,#0x28]
32: 01018606 D0 18 ADDS R0, R2, R3
33: 01018608 9A F8 02 30 LDRB.W R3, [R10,#2]
34: 0101860C 0B B1 CBZ R3, loc_1018612
35: 0101860E 00 22 MOVS R2, #0
36: 01018610 00 E0 B loc_1018614
37: 01018612 loc_1018612
38: 01018612 3A 6A LDR R2, [R7,#0x20]
39: 01018614 loc_1018614
40: 01018614 FB 8E LDRH R3, [R7,#0x36]
41: 01018616 B8 F1 00 0F CMP.W R8, #0
42: 0101861A 01 D0 BEQ loc_1018620
43: 0101861C 80 18 ADDS R0, R0, R2
44: 0101861E 9B 1A SUBS R3, R3, R2
45: 01018620 loc_1018620
46: 01018620 C9 F8 00 30 STR.W R3, [R9]
47: 01018624 BD E8 F8 8F POP.W {R3–R11,PC}
48: 01018624 ; End of function mystery11
According to the exercise description, the called subroutine sub_101651C in line 28 takes three arguments and does not return anything. Thus, we know the registers R0, R1 and R2 are prepared and passed to the function.
The function is executed in Thumb mode, as we can see from the different instructions having both a width of 16 and 32 bits. Furthermore, Thumb specific instructions such as PUSH.W, POP.W and STR.W are contained in the disassembly.
With regard to the instruction semantics, we shed light on the following examples:
LDRSH.W R3, [R0,#0x5A]: Load Signed Halfword Into Register. For the ARM architcture, a word is a 32-bit value and a half-word a 16-bit value. Thus, a signed 16-bit integer value (2 bytes) is loaded from the memory location [R0 + 0x5A] into R3.
LDRH R3, [R0,#0xC]: Load Half Word Into Register. Slightly different than LDRSH, this instruction loads an unsigned 16-bit integer value into the destination register. In particular, it loads from memory at [R0 + 0xC] into register R3.
ADD.W R3, R0, R3,LSL#2: This composition of arithmetic operations is equal to the following expression: R3 = R0 + (R3 << 2)
ORR.W R4, R3, R4,LSL#8: This composition of logical operations is equal to the following expression: R4 = R3 | (R4 << 8)
Let us know examine the data types of mystery11's arguments. R0 seems to be a pointer to a data structure, as we are accessing its field member at offset 0x5A. We create a provisional unknownStruct1:
unknownStruct1:
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4
R1 is first accessed in line 10 and its value is copied into R9, which is used in line 46 as a destination address for storing a 32 bit value. Thus, we assume it is a pointer as well to an unknown memory structure.
The third argument in R2 is preserved in R8 (line 6) and used for a check against zero (line 41). Hence, we assume it is a 32-bit unsigned integer.
As far as the return value is concerned, we notice the return register R0 is set either in line 32 or 43 before the function exits. In both cases, an addition is performed so we assume it is of type uint32.
Our first function prototype proposal:
uint32 mystery11 (unknownStruct1* arg1, int* arg2, uint32 arg3);
Although the function contains no loops, the plethora of memory accesses does not permit us to reason about its functionality at a high level. Thus, we provide the first attempt to generate C code from the disassembly step-by-step along with the data structures:
unknownStruct1:
field20_uint32; // unknown 4 byte value - line 38
field28_uint32; // unknown 4 byte value - line 31
field34_uint16; // unknown 2 byte value - line 30
field36_uint16; // unknown 2 byte value - line 40
field3a_uint16; // 2 byte unsigned integer - line 9
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4
unknownStruct2:
field84_unknownstruct3_p; // pointer to an instance of unknownStruct3
unknownStruct3:
field02_c; // line 33
field12_uint16; // line 25
field0C_uint16; // line 18
field40_p; // line 19
unknownStruct4:
field5c_uint16; // line 15
uint32 mystery11 (unknownStruct1* arg1, int* arg2, uint32 arg3) {
sint16 r3 = arg1->field5a_sint16; // line 4
unknownStruct1* r7 = arg1; // line 5
uint32 r8 = arg3; // line 6
unknownStruct2* r3 = arg1 + (arg1->field5a_sint16 * 4); // lines 4, 7
unknownStruct3* r10 = r3->field84_p; // line 8
uint16 r3 = arg1->field3a_uint16; // line 9
int* r9 = arg2; // line 10
if (r3 == 0) { // line 11
uint32 r2 = arg1 + 0x20; // line 13
sint16 r4 = arg1->field5a_sint16; // line 12
unknownStruct4* r3 = arg1 + (r4 * 2); // line 14
uint16 r5 = r3->field5c_uint16; // line 15
unknownStruct2* r3 = arg1 + (r4 * 4); // line 16 - Probably unknownStruct5 == unknownStruct2, since we access the same memory location as in line 7
unknownStruct3* r0 = r3->field84_p; // line 17 - strong similarity to line 8
uint16 r3 = r0->field0C_uint16; // line 18
int* r6 = r0->field40_p; // line 19 - this is most likely a pointer due to the subsequent address calculations
uint32 r3 = r3 + (r5 * 2); // line 20
r3 = r6 + r3; // line 21
char r4 = r3[0]; // line 22
char r3 = r3[1]; // line 23
r4 = r3 | (r4 << 8); // line 24 - we effectively obtain a 16-bit value by ORing two 8-bit values, where one was shifted 8 positions to the left
uint16 r3 = r0->field12_uint16; // line 25
r3 = r3 & r4; // line 26
r1 = r6 + r3; // line 27
sub_101651C(r0, r1, r2);
}
uint16 r2 = arg1->field34_uint16; // line 30
uint32 r3 = arg1->field28_uint32; // line 31
uint32 r0 = r2 + r3; // line 32
char r3 = r10->field02_c; // line 33
if (r3 == 0) { // line 34
uint32 r2 = arg1->field20_uint32; // line 38
}
else {
uint32 r2 = 0; // line 35
}
uint16 r3 = arg1->field36_uint16; // line 40
if (arg3 != 0) { // lines 41,42
r0 = r0 + r2; // line 43
r3 = r3 + r2; // line 44
}
*arg2 = r3; // line 46
}
To be honest, this looks like a huge mess, but it is just our first attempt and we will try to refine the code and make it more readable.
Firstly, we follow the data flow from the end of the program to see what kind of value will be stored in the pointer arg2. We notice there can be at most a 32-bit value, so we change the data type of arg2 from int* to uint32*.
Secondly, we eliminate instructions that are either unnecessary or duplicated.
Thirdly, we try to minimize the number of C code by combining the load with the different arithmetic or comparison operations.
Fourthly, we try to give the variables more descriptive names. For instance, the third argument arg3 is only used to indicate or whether or not the addition in the last conditional block should be executed. Therefore, we choose its name to be "performAddition". The subroutine sub_101651C was named perform_magic.
unknownStruct1:
field20_uint32; // unknown 4 byte value - line 38
field28_uint32; // unknown 4 byte value - line 31
field34_uint16; // unknown 2 byte value - line 30
field36_uint16; // unknown 2 byte value - line 40
field3a_uint16; // 2 byte unsigned integer - line 9
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4
unknownStruct2:
field84_unknownstruct3_p; // pointer to an instance of unknownStruct3
unknownStruct3:
field02_c; // line 33
field12_uint16; // line 25
field0C_uint16; // line 18
field40_p; // line 19
unknownStruct4:
field5c_uint16; // line 15
uint32 mystery11 (unknownStruct1* unknownStruct1Ptr, int* dest, uint32 indicator) {
unknownStruct2* r3 = unknownStruct1Ptr + (unknownStruct1Ptr->field5a_sint16 * 4); // lines 4, 7
unknownStruct3* r10 = r3->field84_unknownstruct3_p; // line 8
if (unknownStruct1Ptr->field3a_uint16 == 0) { // lines 9,11
sint16 r4 = unknownStruct1Ptr->field5a_sint16; // line 12
unknownStruct4* r3 = unknownStruct1Ptr + (r4 * 2); // line 14
uint16 r5 = r3->field5c_uint16; // line 15
unknownStruct2* r3 = unknownStruct1Ptr + (r4 * 4); // line 16 - Probably unknownStruct5 == unknownStruct2, since we access the same memory location as in line 7
unknownStruct3* r0 = r3->field84_unknownstruct3_p; // line 17 - strong similarity to line 8
uint16 r3 = r0->field0C_uint16; // line 18
int* r6 = r0->field40_p; // line 19 - this is most likely a pointer due to the subsequent address calculations
uint32 r3 = r6 + r3 + (r5 * 2); // lines 20, 21
uint32 r4 = r3[1] | (r3[0] << 8); // lines 22, 23, 24 - we effectively obtain a 16-bit value by ORing two 8-bit values, where one was shifted 8 positions to the left
uint32 r3 = r0->field12_uint16 & r4; // lines 25, 26
r1 = r6 + r3; // line 27
perform_magic(r0, r1, &(unknownStruct1Ptr->field20_uint32)); // lines 13, 28
}
uint32 r0 = arg1->field34_uint16 + arg1->field28_uint32; // lines 30,31, 32
uint32 r2 = 0; // line 35
if (r10->field02_c == 0) { // line 34
r2 = arg1->field20_uint32; // line 38
}
uint16 r3 = unknownStruct1Ptr->field36_uint16; // line 40
if (indicator != 0) { // lines 41,42
r0 = r0 + r2; // line 43
r3 = r3 + r2; // line 44
}
*dest = r3; // line 46
return r0;
}
Thursday, 7 December 2017
Wednesday, 6 December 2017
Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 10
Exercise 10 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery10:
01: mystery10
02: 2D E9 70 48 PUSH.W {R4–R6,R11,LR}
03: 0D F2 0C 0B ADDW R11, SP, #0xC
04: 37 F0 CC F9 BL __security_push_cookie
05: 84 B0 SUB SP, SP, #0x10
06: 0D 46 MOV R5, R1
07: 00 24 MOVS R4, #0
08: 10 2D CMP R5, #0x10
09: 16 46 MOV R6, R2
10: 0C D3 BCC loc_1010786
11: 1A 4B LDR R3, =__imp_GetSystemTime
12: 68 46 MOV R0, SP
13: 1B 68 LDR R3, [R3]
14: 98 47 BLX R3
15: 00 9B LDR R3, [SP,#0x1C+var_1C]
16: 10 24 MOVS R4, #0x10
17: 33 60 STR R3, [R6]
18: 01 9B LDR R3, [SP,#0x1C+var_18]
19: 73 60 STR R3, [R6,#4]
20: 02 9B LDR R3, [SP,#0x1C+var_14]
21: B3 60 STR R3, [R6,#8]
22: 03 9B LDR R3, [SP,#0x1C+var_10]
23: F3 60 STR R3, [R6,#0xC]
24: loc_1010786
25: 2B 1B SUBS R3, R5, R4
26: 04 2B CMP R3, #4
27: 04 D3 BCC loc_1010796
28: 11 4B LDR R3, =__imp_GetCurrentProcessId
29: 1B 68 LDR R3, [R3]
30: 98 47 BLX R3
31: 30 51 STR R0, [R6,R4]
32: 04 34 ADDS R4, #4
33: loc_1010796
34: 2B 1B SUBS R3, R5, R4
35: 04 2B CMP R3, #4
36: 04 D3 BCC loc_10107A6
37: 0C 4B LDR R3, =__imp_GetTickCount
38: 1B 68 LDR R3, [R3]
39: 98 47 BLX R3
40: 30 51 STR R0, [R6,R4]
41: 04 34 ADDS R4, #4
42: loc_10107A6
43: 2B 1B SUBS R3, R5, R4
44: 08 2B CMP R3, #8
45: 09 D3 BCC loc_10107C0
46: 07 4B LDR R3, =__imp_QueryPerformanceCounter
47: 68 46 MOV R0, SP
48: 1B 68 LDR R3, [R3]
49: 98 47 BLX R3
50: 00 9B LDR R3, [SP,#0x1C+var_1C]
51: 32 19 ADDS R2, R6, R4
52: 33 51 STR R3, [R6,R4]
53: 01 9B LDR R3, [SP,#0x1C+var_18]
54: 08 34 ADDS R4, #8
55: 53 60 STR R3, [R2,#4]
56: loc_10107C0
57: 20 46 MOV R0, R4
58: 04 B0 ADD SP, SP, #0x10
59: 37 F0 A4 F9 BL __security_pop_cookie
60: BD E8 70 88 POP.W {R4–R6,R11,PC}
61: ; End of function mystery10
Although the function looks complicated at first, we notice it does not contain any kind of loops and only executes sequentially with a couple of conditionals.
Presumably, the security cookie push and pop operations are responsible for inserting and removing a stack cookie, which will protect the function from memory corruption vulnerabilities that aim at overwriting the function's return address [msdn].
GetSystemTime [msdn]:
GetSystemTime "retrieves the current system date and time. The system time is expressed in Coordinated Universal Time (UTC)."
Its function prototype is as follows:
void WINAPI GetSystemTime(
_Out_ LPSYSTEMTIME lpSystemTime
);
It takes one parameter, which is a pointer to memory where the corresponding SYSTEMTIME structure will be stored. As we can see from the definition, all struct field are of type WORD, which means they are 16-bit unsigned integers. For our purposes, having the relative offsets of its members facilitates decompiling the function:
kd> dt _systemtime
uxtheme!_SYSTEMTIME
+0x000 wYear : Uint2B
+0x002 wMonth : Uint2B
+0x004 wDayOfWeek : Uint2B
+0x006 wDay : Uint2B
+0x008 wHour : Uint2B
+0x00a wMinute : Uint2B
+0x00c wSecond : Uint2B
+0x00e wMilliseconds : Uint2B
The pointer passed to the function as an argument is contained in register R0, which in turn contains the stack pointer (line 12). Thus, the SYSTEMTIME structure will be saved to the current stack location.
GetCurrentProcessId [msdn]
The Windows routine GetCurrentProcessId "retrieves the process identifier of the calling process.".
Its function prototype is as follows:
DWORD WINAPI GetCurrentProcessId(void);
It takes no parameter and returns an unsigned 32-bit integer value.
GetTickCount [msdn]:
The Windows routine GetTickCount "retrieves the number of milliseconds that have elapsed since the system was started, up to 49.7 days."
Its function prototype is:
DWORD WINAPI GetTickCount(void);
It takes no parameter and returns an unsigned 32-bit integer value.
QueryPerformanceCounter [msdn]:
The Windows routine QueryPerformanceCounter "retrieves the current value of the performance counter, which is a high resolution (<1us) time stamp that can be used for time-interval measurements."
Its function prototype is:
BOOL WINAPI QueryPerformanceCounter(
_Out_ LARGE_INTEGER *lpPerformanceCount
);
It takes one parameter, which is a pointer to a _LARGE_INTEGER variable, which represents a 64-bit signed integer value.
After having discussed the different invoked Windows routines, we have a look at the characteristics of the function mystery10 again.
The disassembly is executed in Thumb mode, as we can infer from the 16-bit instructions and the typical PUSH.W / POP.W instructions.
As far as the instructions are concerned, there is an interesting branch instruction named BCC. BCC stands for Branch Carry Clear and accordingly will perform a branch to the destination when the Carry flag is not set, i.e. CF == 0. Note that the carry flag is only set when an arithmetic operation between two unsigned values overflows.
mystery10 apparently takes three arguments in R0, R1 and R2. Although only the values in R1 and R2 are read (lines 6 und 9 respectively), there has to be a third parameter as otherwise the previously mentioned values would be stored in R0 and R1.
The second argument in R1 seems to be a 32-bit unsigned integer, as it is used in CMP instructions (e.g. lines 8 and 26) followed by a BCC instruction. As already mentioned, BCC instructions check the Carry flag, which is set only when an arithmetic operation of unsigned values overflows.
R2 contains a pointer to an unknown memory structure.
The return value of mystery10 is an unsigned 32 bit integer since it is initialized to 0 (line 7) and only addition operations are performed on it (e.g. lines 16, 32 and 41).
Our provisional function prototype is as follows:
uint32 mystery10 (void* arg1, uint32 arg2, unknownStruct* arg3) ;
From the function's control flow, we can see that the size of arg2 determines whether the Windows routines are called or not. It is repeatedly compared to R4, which is initialized with 0 and increased for every called Windows routine. Rather than translating the disassembly directly to pseudocode / C, we examine the different input ranges of arg2 that trigger the invocation of Windows routines:
[0;3]: No routine called
[4;7]: GetCurrentProcessId,
[8;15]:GetCurrentProcessId, GetTickCount
[16-19]: GetSystemTime
[20-23]: GetSystemTime, GetCurrentProcessId
[24-31]: GetSystemTime, GetCurrentProcessId, GetTickCount
[32-[: GetSystemTime, GetCurrentProcessId, GetTickCount, QueryPerformanceCounter
For each called Windows routine, its result value is written to the third argument arg3. Moreover, the number of bytes written to arg2 is added to the current value of R4, which is finally returned by the function (line 57).
We are now ready to create a first draft of mystery10 in C:
uint32 mystery10 (void* r0, uint32 arg2, unknownStruct* arg3) {
uint32 returnValue = 0; //line 7
if (checkValue > 15) { //line 8
SYSTEMTIME currentTime;
GetSystemTime(currentTime);
memcpy(arg3,¤tTime,sizeof(SYSTEMTIME)); // lines 17 - 23
returnValue = 16; // line 16
}
if (arg2 - returnValue > 3) {
uint32 processId = GetCurrentProcessId(); // lines 28-30
*(arg3 + returnValue) = processId; // line 31
returnValue += 4;
}
if (arg2 - returnValue > 3) {
uint32 tickCount = GetTickCount(); // lines 37-39
*(arg3 + returnValue) = tickCount; // line 40
returnValue += 4; // line 41
}
if (arg2 - returnValue > 7) {
_LARGE_INTEGER perfCount = QueryPerformanceCounter();
*(arg3 + returnValue) = perfCount->HighPart; // line 52
*(arg3 + returnValue + 4) = perfCount->LowPart; // line 55
returnValue += 8; // line 54
}
return returnValue;
}
Effectively, the function returns the overall number of bytes written to the structure passed in arg3. The second argument arg2 can be considered a limit, up to which it is allowed to write bytes to arg3. Hence, it probably describes the size of the memory structure pointed to by arg3.
Finally, we provide a function with more descriptive names:
uint32 retrieveTimeInfo (void* r0, uint32 size, unknownStruct* dest) {
uint32 bytesWritten = 0; //line 7
if (checkValue > 15) { //line 8
SYSTEMTIME currentTime;
GetSystemTime(currentTime);
memcpy(dest,¤tTime,sizeof(SYSTEMTIME)); // lines 17 - 23
bytesWritten = 16; // line 16
}
if (size - bytesWritten > 3) {
uint32 processId = GetCurrentProcessId(); // lines 28-30
*(dest + bytesWritten) = processId; // line 31
bytesWritten += 4;
}
if (size - bytesWritten > 3) {
uint32 tickCount = GetTickCount(); // lines 37-39
*(dest + bytesWritten) = tickCount; // line 40
bytesWritten += 4; // line 41
}
if (size - bytesWritten > 7) {
_LARGE_INTEGER perfCount = QueryPerformanceCounter();
*(dest + bytesWritten) = perfCount->HighPart; // line 52
*(dest + bytesWritten + 4) = perfCount->LowPart; // line 55
bytesWritten += 8; // line 54
}
return bytesWritten;
}
Tuesday, 5 December 2017
Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 9
Exercise 9 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery9:
01: mystery9
02: 2D E9 30 48 PUSH.W {R4,R5,R11,LR}
03: 0D F2 08 0B ADDW R11, SP, #8
04: 09 4D LDR R5, =byteArray
05: 06 E0 B loc_100E312
06: loc_100E304
07: 0B 78 LDRB R3, [R1]
08: 5A 5D LDRB R2, [R3,R5]
09: 63 5D LDRB R3, [R4,R5]
10: 93 42 CMP R3, R2
11: 04 D1 BNE loc_100E318
12: 01 30 ADDS R0, #1
13: 01 31 ADDS R1, #1
14: loc_100E312
15: 04 78 LDRB R4, [R0]
16: 00 2C CMP R4, #0
17: F5 D1 BNE loc_100E304
18: loc_100E318
19: 0B 78 LDRB R3, [R1]
20: 5A 5D LDRB R2, [R3,R5]
21: 03 78 LDRB R3, [R0]
22: 5B 5D LDRB R3, [R3,R5]
23: 98 1A SUBS R0, R3, R2
24: BD E8 30 88 POP.W {R4,R5,R11,PC}
25: ; End of function mystery9
First of all, mystery9 has a striking similarity to the previously decompiled function mystery8. Its disassembly uses Thumb mode, as we can see for instance from the 16 bit instruction width.
In contrast to mystery9, it takes only two arguments of type char* (strings) and no additional limiting variable. The return value is a signed 32 bit integer, as we can see from line 23. The provisional function prototype is as follows:
sint32 mystery9 (char* arg1, char* arg2);
Merely by looking at the disassemblies, we see that mystery9's functionality can be considered a subset of mystery8's functionality. With the decompilation, the similarity becomes even more evident:
sint32 mystery9 (char* arg1, char* arg2) {
while (*arg1 != 0) {
if (byteArray[*arg1] != byteArray[*arg2]) {
arg1++;
arg2++;
}
}
return byteArray[*arg1] - byteArray[*arg2];
}
The variable byteArray has the same properties as in the function mystery8, i.e. it is an array holding the entire byte value range at the corresponding index : {0, 1, ..., 0xFE, 0xFF}. Likewise, the usage of byteArray can be omitted, since the index and array value are identical:
sint32 mystery9 (char* arg1, char* arg2) {
while (*arg1 != 0) {
if (*arg1 != *arg2) {
arg1++;
arg2++;
}
}
return *arg1 - *arg2;
}
As already mentioned, it essentially performs the same computations as mystery8, however, it does not take a limiting parameter. As a result, it continues in the main loop until the end of both strings or a differing character is found. The return value is the numerical difference between the first differing characters in the strings and thus, when no difference can be found, the value 0.
Therefore, the possible return values are:
sint32 strcmp(char* str1, char* str2) {
while (*str1 != 0) {
if (*str1 != *str2) {
str1++;
str2++;
}
}
return *str1 - *str2;
}
01: mystery9
02: 2D E9 30 48 PUSH.W {R4,R5,R11,LR}
03: 0D F2 08 0B ADDW R11, SP, #8
04: 09 4D LDR R5, =byteArray
05: 06 E0 B loc_100E312
06: loc_100E304
07: 0B 78 LDRB R3, [R1]
08: 5A 5D LDRB R2, [R3,R5]
09: 63 5D LDRB R3, [R4,R5]
10: 93 42 CMP R3, R2
11: 04 D1 BNE loc_100E318
12: 01 30 ADDS R0, #1
13: 01 31 ADDS R1, #1
14: loc_100E312
15: 04 78 LDRB R4, [R0]
16: 00 2C CMP R4, #0
17: F5 D1 BNE loc_100E304
18: loc_100E318
19: 0B 78 LDRB R3, [R1]
20: 5A 5D LDRB R2, [R3,R5]
21: 03 78 LDRB R3, [R0]
22: 5B 5D LDRB R3, [R3,R5]
23: 98 1A SUBS R0, R3, R2
24: BD E8 30 88 POP.W {R4,R5,R11,PC}
25: ; End of function mystery9
First of all, mystery9 has a striking similarity to the previously decompiled function mystery8. Its disassembly uses Thumb mode, as we can see for instance from the 16 bit instruction width.
In contrast to mystery9, it takes only two arguments of type char* (strings) and no additional limiting variable. The return value is a signed 32 bit integer, as we can see from line 23. The provisional function prototype is as follows:
sint32 mystery9 (char* arg1, char* arg2);
Merely by looking at the disassemblies, we see that mystery9's functionality can be considered a subset of mystery8's functionality. With the decompilation, the similarity becomes even more evident:
sint32 mystery9 (char* arg1, char* arg2) {
while (*arg1 != 0) {
if (byteArray[*arg1] != byteArray[*arg2]) {
arg1++;
arg2++;
}
}
return byteArray[*arg1] - byteArray[*arg2];
}
The variable byteArray has the same properties as in the function mystery8, i.e. it is an array holding the entire byte value range at the corresponding index : {0, 1, ..., 0xFE, 0xFF}. Likewise, the usage of byteArray can be omitted, since the index and array value are identical:
sint32 mystery9 (char* arg1, char* arg2) {
while (*arg1 != 0) {
if (*arg1 != *arg2) {
arg1++;
arg2++;
}
}
return *arg1 - *arg2;
}
As already mentioned, it essentially performs the same computations as mystery8, however, it does not take a limiting parameter. As a result, it continues in the main loop until the end of both strings or a differing character is found. The return value is the numerical difference between the first differing characters in the strings and thus, when no difference can be found, the value 0.
Therefore, the possible return values are:
- 0: Both strings are equal
- >0: The numerical value of the first differing character is greater in arg1 than in arg2
- <0: The numerical value of the first differing character is smaller in arg1 than in arg2
sint32 strcmp(char* str1, char* str2) {
while (*str1 != 0) {
if (*str1 != *str2) {
str1++;
str2++;
}
}
return *str1 - *str2;
}
Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 8
Exercise 8 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery8:
01: mystery8
02: 2D E9 78 48 PUSH.W {R3–R6,R11,LR}
03: 0D F2 10 0B ADDW R11, SP, #0x10
04: 0C 4E LDR R6, =byteArray
05: 09 E0 B loc_100E34C
06: loc_100E338
07: 05 78 LDRB R5, [R0]
08: 01 3A SUBS R2, #1
09: 4D B1 CBZ R5, loc_100E352
10: 0B 78 LDRB R3, [R1]
11: 9C 5D LDRB R4, [R3,R6]
12: AB 5D LDRB R3, [R5,R6]
13: A3 42 CMP R3, R4
14: 04 D1 BNE loc_100E352
15: 01 30 ADDS R0, #1
16: 01 31 ADDS R1, #1
17: loc_100E34C
18: 00 2A CMP R2, #0
19: F3 DC BGT loc_100E338
20: 01 3A SUBS R2, #1
21: loc_100E352
22: 00 2A CMP R2, #0
23: 01 DA BGE loc_100E35A
24: 00 20 MOVS R0, #0
25: 04 E0 B locret_100E364
26: loc_100E35A
27: 0B 78 LDRB R3, [R1]
28: 9A 5D LDRB R2, [R3,R6]
29: 03 78 LDRB R3, [R0]
30: 9B 5D LDRB R3, [R3,R6]
31: 98 1A SUBS R0, R3, R2
32: locret_100E364
33: BD E8 78 88 POP.W {R3–R6,R11,PC}
34: ; End of function mystery8
The function was compiled in Thumb mode, as we can see from the presence of 16 bit instructions, PUSH and POP instructions and Thumb-specific instructions, e.g. CBZ and instructions with the .W suffix.
It apparently takes three input arguments in R0, R1 and R3. Both R0 and R1 are pointers to a structure from which successive single bytes are read (lines 7, 10, 27 and 29), so we assume type char* for these two arguments (i.e. a string). R2, on the other hand, is a numerical value being decreased (lines 8, 20) and compared to 0. Thus, we assume it is a signed 32 bit integer value and used for counting.
The function either returns 0 (lines 24 and 25) or performs a subtraction involving R3 and R2 (line 31). Therefore, we assume the return value to also be a 32 bit integer value.
Our provisional function prototype is as follows:
sint32 mystery8 (char* arg1, char* arg2, sint32 arg3);
As mentioned in the description, the variable byteArray is an array of 256 byte values, i.e. byteArray[] = {0,1,..,0xFE, 0xFF}
Due to the multitude of conditionals and load operations, it is in our opinion easier to first decompile it and afterwards analyze in detail what its purpose could potentially be. While the following C program can certainly be optimized, the resemblance to the disassembly could make it easier to understand:
sint32 mystery8 (char* arg1, char* arg2, sint32 arg3) {
while (arg3 > 0) { //lines 18 and 19
arg3 --; // line 8
if (*arg1 == 0) { // line 9
break;
}
if (byteArray[*arg1] != byteArray[*arg2]) { // lines 10-14
break;
}
arg1++; // line 15
arg2++; // line 16
}
if ((arg3-1) < 0) { // lines 8,20,22,23
return 0;
}
return byteArray[*arg1] - byteArray[*arg2]; // lines 27-31
}
The function takes two string values and inputs and uses their single characters for indexing into the array byteArray. However, for each possible index in the range 0, 1, ... , 0xFF, the value of byteArray[index] is exactly the index itself. So it is not entirely clear why the byteArray is actually used. We feel it can be completely omitted for simplification:
sint32 mystery8 (char* arg1, char* arg2, sint32 arg3) {
while (arg3 > 0) { //lines 18 and 19
arg3 --; // line 8
if (*arg1 == 0) { // line 9
break;
}
if (*arg1 != *arg2) { // lines 10-14
break;
}
arg1++; // line 15
arg2++; // line 16
}
if ((arg3-1) < 0) { // lines 8,20,22,23
return 0;
}
return *arg1 - *arg2; // lines 27-31
}
The central loop always compares the same positions of arg1 and arg2 and exits if and only if one of the following conditions is satisfied:
Lastly, we provide a version of mystery8 with more descriptive names:
sint32 mystery8 (char* str1, char* str2, sint32 limit) {
while (limit > 0) { //lines 18 and 19
limit --; // line 8
if (*str1 == 0) { // line 9
break;
}
if (*str1 != *str2) { // lines 10-14
break;
}
str1++; // line 15
str2++; // line 16
}
if ((limit-1) < 0) { // lines 8,20,22,23
return 0;
}
return *str1 - *str2; // lines 27-31
}
01: mystery8
02: 2D E9 78 48 PUSH.W {R3–R6,R11,LR}
03: 0D F2 10 0B ADDW R11, SP, #0x10
04: 0C 4E LDR R6, =byteArray
05: 09 E0 B loc_100E34C
06: loc_100E338
07: 05 78 LDRB R5, [R0]
08: 01 3A SUBS R2, #1
09: 4D B1 CBZ R5, loc_100E352
10: 0B 78 LDRB R3, [R1]
11: 9C 5D LDRB R4, [R3,R6]
12: AB 5D LDRB R3, [R5,R6]
13: A3 42 CMP R3, R4
14: 04 D1 BNE loc_100E352
15: 01 30 ADDS R0, #1
16: 01 31 ADDS R1, #1
17: loc_100E34C
18: 00 2A CMP R2, #0
19: F3 DC BGT loc_100E338
20: 01 3A SUBS R2, #1
21: loc_100E352
22: 00 2A CMP R2, #0
23: 01 DA BGE loc_100E35A
24: 00 20 MOVS R0, #0
25: 04 E0 B locret_100E364
26: loc_100E35A
27: 0B 78 LDRB R3, [R1]
28: 9A 5D LDRB R2, [R3,R6]
29: 03 78 LDRB R3, [R0]
30: 9B 5D LDRB R3, [R3,R6]
31: 98 1A SUBS R0, R3, R2
32: locret_100E364
33: BD E8 78 88 POP.W {R3–R6,R11,PC}
34: ; End of function mystery8
The function was compiled in Thumb mode, as we can see from the presence of 16 bit instructions, PUSH and POP instructions and Thumb-specific instructions, e.g. CBZ and instructions with the .W suffix.
It apparently takes three input arguments in R0, R1 and R3. Both R0 and R1 are pointers to a structure from which successive single bytes are read (lines 7, 10, 27 and 29), so we assume type char* for these two arguments (i.e. a string). R2, on the other hand, is a numerical value being decreased (lines 8, 20) and compared to 0. Thus, we assume it is a signed 32 bit integer value and used for counting.
The function either returns 0 (lines 24 and 25) or performs a subtraction involving R3 and R2 (line 31). Therefore, we assume the return value to also be a 32 bit integer value.
Our provisional function prototype is as follows:
sint32 mystery8 (char* arg1, char* arg2, sint32 arg3);
As mentioned in the description, the variable byteArray is an array of 256 byte values, i.e. byteArray[] = {0,1,..,0xFE, 0xFF}
Due to the multitude of conditionals and load operations, it is in our opinion easier to first decompile it and afterwards analyze in detail what its purpose could potentially be. While the following C program can certainly be optimized, the resemblance to the disassembly could make it easier to understand:
sint32 mystery8 (char* arg1, char* arg2, sint32 arg3) {
while (arg3 > 0) { //lines 18 and 19
arg3 --; // line 8
if (*arg1 == 0) { // line 9
break;
}
if (byteArray[*arg1] != byteArray[*arg2]) { // lines 10-14
break;
}
arg1++; // line 15
arg2++; // line 16
}
if ((arg3-1) < 0) { // lines 8,20,22,23
return 0;
}
return byteArray[*arg1] - byteArray[*arg2]; // lines 27-31
}
The function takes two string values and inputs and uses their single characters for indexing into the array byteArray. However, for each possible index in the range 0, 1, ... , 0xFF, the value of byteArray[index] is exactly the index itself. So it is not entirely clear why the byteArray is actually used. We feel it can be completely omitted for simplification:
sint32 mystery8 (char* arg1, char* arg2, sint32 arg3) {
while (arg3 > 0) { //lines 18 and 19
arg3 --; // line 8
if (*arg1 == 0) { // line 9
break;
}
if (*arg1 != *arg2) { // lines 10-14
break;
}
arg1++; // line 15
arg2++; // line 16
}
if ((arg3-1) < 0) { // lines 8,20,22,23
return 0;
}
return *arg1 - *arg2; // lines 27-31
}
- The counter variable arg3 reaches 0 OR
- We have reached the last character of the first string arg1, i.e. the null character OR
- We have found a character in the strings arg1 and arg2 that do not match.
Before exiting the function, the counter variable is decremented and checked for non-negativity. If the value is negative, we know that either arg3 was originally <= 0 or we have exited the loop because arg3 was decremented until it reached 0. The return value in this case is 0.
If it is not negative, we know the counter variable was not the limiting factor, but rather the input strings did not match at a certain character position or the end of the first string argument was reached. The return value in this case is the difference between the character values of the input strings arg1 and arg2 at the lastly changed position. Thus, the possible return values are:
- 0 when *arg1 == *arg2 (i.e. the characters are equal)
- < 0 when *arg1 < * arg2 (i.e. the character of arg1 is numerically smaller than the character of arg2)
- >0 when *arg1 > *arg2 (i.e. the character of arg1 is numerically greater than the character of arg2)
Lastly, let us consider a few examples
- mystery8("abc","abc",5) returns: 0 (because *arg1 - *arg2 = 0 -0 = 0)
- mystery8("abc","abc",2) returns: 0 (because the counter variable reached the limit)
- mystery8("abb","aba",5) returns: 1 (because *arg1 - *arg2 = 'b' - 'a' = 0x62 - 0x61 = 1)
- mystery8("abb","aba",2) returns: 0 (because the counter variable reached the limit)
- mystery8("aba","abb",5): returns: -1 (because *arg1 - *arg2 = 'a' - 'b' = 0x61 - 0x62 = -1)
- mystery8("aba","abb",2): returns: 0 (because the counter variable reached the limit)
To conclude, we can summarize the function behaviour in more descriptive words. The function takes two strings and a limiting position variable as inputs and compares the input strings for equality from the first character up to the position imposed by the limit variable. If no difference in the strings is found up to the limiting position, it returns 0. When a difference is found, it returns the numerical difference between the first mismatching character of the input strings.
Lastly, we provide a version of mystery8 with more descriptive names:
sint32 mystery8 (char* str1, char* str2, sint32 limit) {
while (limit > 0) { //lines 18 and 19
limit --; // line 8
if (*str1 == 0) { // line 9
break;
}
if (*str1 != *str2) { // lines 10-14
break;
}
str1++; // line 15
str2++; // line 16
}
if ((limit-1) < 0) { // lines 8,20,22,23
return 0;
}
return *str1 - *str2; // lines 27-31
}
Monday, 4 December 2017
Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 7
Exercise 7 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery7:
01: mystery7
02: 02 46 MOV R2, R0
03: 08 B9 CBNZ R0, loc_100E1D8
04: 00 20 MOVS R0, #0
05: 70 47 BX LR
06: loc_100E1D8
07: 90 F9 00 30 LDRSB.W R3, [R0]
08: 02 E0 B loc_100E1E4
09: loc_100E1DE
10: 01 32 ADDS R2, #1
11: 92 F9 00 30 LDRSB.W R3, [R2]
12: loc_100E1E4
13: 00 2B CMP R3, #0
14: FA D1 BNE loc_100E1DE
15: 10 1A SUBS R0, R2, R0
16: 6F F3 9F 70 BFC.W R0, #0x1E, #2
17: 70 47 BX LR
18: ; End of function mystery7
Again, the function provided is executed in Thumb mode, due to several 16 bit instructions and instructions specific to Thumb mode such as CBNZ and the .W suffix such as in line 7, 11 and 16.
mystery7() takes one argument, which is a pointer to a structure in memory. Load operations from this address solely read a single byte and the address read from is always increased by 1. Thus, the function argument is probably a pointer to a character array, i.e. a string. The return value is yet to be determined, but it is generated by manipulating the address of the input argument, so it will be a 32 bit value. Our preliminary function prototype:
uint32 mystery7 (char* arg);
There is also an interesting instruction in line 16 - BFC. BFC stands for Bit Field Clear and clears a certain number of bits in the destination register. The instruction at hand is BFC.W R0, #0x1E, #2 and clears two bits starting at the least significant bit #0x1E (=30), i.e. the two most significant bits 30 and 31.
In line 3, the input argument is inspected for nullness, in which case the function returns 0.
If it is not null, the character stored at the address pointed to by the input argument is loaded (line 11) and compared to 0 (line 13). When it is not 0, the next character is loaded and compared again to 0 and this process is repeated in a loop. Presumably we are searching the end of a string, which is delimited by the null-character 0.
When the loaded character value matches 0, the address of the start of the array is subtracted from the address where the first 0-character was found. Thereby, we calculate the length of the string and store it in register R0. The last instruction before leaving mystery7() involves clearing the two most significant bits. It does not make immediately sense why the two most significant bits are cleared - maybe there is a maximum string length imposed by the compiler.
We are now ready to fully decompile mystery7 to:
uint32 calcStringLength (char* arg) {
if (arg==null) {
return 0;
}
uint32 counter = 0;
while (arg[counter] != 0) {
counter ++;
}
uint32 length = arg[counter] - arg;
length = length && 0x3FFFFFFF; // clear two most significant bits
return length;
}
01: mystery7
02: 02 46 MOV R2, R0
03: 08 B9 CBNZ R0, loc_100E1D8
04: 00 20 MOVS R0, #0
05: 70 47 BX LR
06: loc_100E1D8
07: 90 F9 00 30 LDRSB.W R3, [R0]
08: 02 E0 B loc_100E1E4
09: loc_100E1DE
10: 01 32 ADDS R2, #1
11: 92 F9 00 30 LDRSB.W R3, [R2]
12: loc_100E1E4
13: 00 2B CMP R3, #0
14: FA D1 BNE loc_100E1DE
15: 10 1A SUBS R0, R2, R0
16: 6F F3 9F 70 BFC.W R0, #0x1E, #2
17: 70 47 BX LR
18: ; End of function mystery7
Again, the function provided is executed in Thumb mode, due to several 16 bit instructions and instructions specific to Thumb mode such as CBNZ and the .W suffix such as in line 7, 11 and 16.
mystery7() takes one argument, which is a pointer to a structure in memory. Load operations from this address solely read a single byte and the address read from is always increased by 1. Thus, the function argument is probably a pointer to a character array, i.e. a string. The return value is yet to be determined, but it is generated by manipulating the address of the input argument, so it will be a 32 bit value. Our preliminary function prototype:
uint32 mystery7 (char* arg);
There is also an interesting instruction in line 16 - BFC. BFC stands for Bit Field Clear and clears a certain number of bits in the destination register. The instruction at hand is BFC.W R0, #0x1E, #2 and clears two bits starting at the least significant bit #0x1E (=30), i.e. the two most significant bits 30 and 31.
In line 3, the input argument is inspected for nullness, in which case the function returns 0.
If it is not null, the character stored at the address pointed to by the input argument is loaded (line 11) and compared to 0 (line 13). When it is not 0, the next character is loaded and compared again to 0 and this process is repeated in a loop. Presumably we are searching the end of a string, which is delimited by the null-character 0.
When the loaded character value matches 0, the address of the start of the array is subtracted from the address where the first 0-character was found. Thereby, we calculate the length of the string and store it in register R0. The last instruction before leaving mystery7() involves clearing the two most significant bits. It does not make immediately sense why the two most significant bits are cleared - maybe there is a maximum string length imposed by the compiler.
We are now ready to fully decompile mystery7 to:
uint32 calcStringLength (char* arg) {
if (arg==null) {
return 0;
}
uint32 counter = 0;
while (arg[counter] != 0) {
counter ++;
}
uint32 length = arg[counter] - arg;
length = length && 0x3FFFFFFF; // clear two most significant bits
return length;
}
Subscribe to:
Posts (Atom)