Small challenge from Gynvael Coldwin
Gynvael Coldwin posted a small challenge at the end of his last podcast on Windows Kernel Debugging with Artem Shishkin:
Our first guess was that the characters use some well-known encoding such as ASCII, but this turned out to be a wrong assumption. To learn more about the structure of the recorded data, we ran a statistical analysis of the byte values:
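A byte-frequency count of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the script we actually used; the function name `byte_histogram` and the sample bytes are hypothetical and are not the challenge data.

```python
from collections import Counter


def byte_histogram(data: bytes) -> list[tuple[int, int]]:
    """Return (byte value, count) pairs, most frequent first."""
    return Counter(data).most_common()


# Hypothetical sample (NOT the challenge data): three keys pressed and
# released one after another, encoded in scan code set 2.
sample = bytes([0x1B, 0xF0, 0x1B, 0x44, 0xF0, 0x44, 0x2D, 0xF0, 0x2D])

for value, count in byte_histogram(sample):
    print(f"\\x{value:02x}: {count}")
```

Even in this tiny sample, one byte value stands out from the rest, which is the kind of skew that hints at a protocol marker rather than text.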
This shows that the byte \xf0 is used far more often than the others, so our research continued in this direction. We realized that this byte is part of a so-called scan code, which is sent whenever a key is pressed or released on a hardware keyboard. The following web page provides detailed information on this matter, especially the different scan code sets. The byte \xf0 is frequently used in scan code set 2, where it prefixes the break code of a key. The available sets are:
- Scan Code Set 1 - Original XT scan code set; supported by some modern keyboards
- Scan Code Set 2 - Default scan code set for all modern keyboards
- Scan Code Set 3 - Optional PS/2 scan code set; rarely used
When handling direct hardware input, the hardware has no knowledge of ASCII encodings, keyboard layouts and so on; this is all handled by the operating system. We only know which make and break codes were sent and can infer from them which keys were pressed (and released).
The basic logic is as follows:
- Key is pressed -> MAKE CODE
- Key is released -> BREAK CODE
Every key has its own make and break codes, which can be looked up in the scan code set.
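The make/break logic can be sketched as a small parser. The sketch assumes scan code set 2, where a break code is the byte \xf0 followed by the key's make code; it deliberately ignores extended keys (the \xe0 prefix). The name `split_events` is made up for this illustration.

```python
BREAK_PREFIX = 0xF0  # in scan code set 2, announces a key release


def split_events(data: bytes) -> list[tuple[str, int]]:
    """Split a raw set 2 byte stream into ("make" | "break", code) events.

    Simplification: extended keys (0xE0 prefix) are not handled.
    """
    events = []
    stream = iter(data)
    for byte in stream:
        if byte == BREAK_PREFIX:
            code = next(stream, None)  # the released key follows the prefix
            if code is None:
                break  # truncated stream: prefix without a key code
            events.append(("break", code))
        else:
            events.append(("make", byte))
    return events


# One key pressed and released: make code 0x1B, then break code F0 1B.
print(split_events(bytes([0x1B, 0xF0, 0x1B])))
```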
To translate the byte sequence to text, we assume that scan code set 2 has been used, both because of the usage of \xf0 and because it is the default for all modern keyboards:
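A decoder along these lines might look as follows. It is a sketch under several assumptions: a US keyboard layout, a deliberately partial lookup table (only a handful of set 2 make codes), and a simplified Shift handling that only upper-cases letters. The names `SET2_KEYS` and `decode` are made up for this illustration, and the demo bytes are not the challenge data.

```python
# Partial scan code set 2 table (make codes, US layout assumed).
SET2_KEYS = {
    0x1C: "a", 0x32: "b", 0x23: "d", 0x24: "e", 0x43: "i", 0x42: "k",
    0x31: "n", 0x44: "o", 0x4D: "p", 0x2D: "r", 0x1B: "s", 0x2C: "t",
    0x35: "y", 0x29: " ", 0x41: ",", 0x49: ".", 0x52: "'",
}
SHIFT_KEYS = {0x12, 0x59}  # left and right Shift
BREAK_PREFIX = 0xF0


def decode(data: bytes) -> str:
    """Translate a set 2 byte stream into text; unknown keys become '?'."""
    text = []
    shift_held = False
    releasing = False  # True right after a 0xF0 prefix
    for byte in data:
        if byte == BREAK_PREFIX:
            releasing = True
            continue
        if byte in SHIFT_KEYS:
            shift_held = not releasing  # pressed -> held, released -> not held
        elif not releasing:
            char = SET2_KEYS.get(byte, "?")
            # Simplification: Shift only upper-cases letters here.
            text.append(char.upper() if shift_held else char)
        releasing = False
    return "".join(text)


# Demo: Shift down, 's' down/up, Shift up, 'o' down/up.
demo = bytes([0x12, 0x1B, 0xF0, 0x1B, 0xF0, 0x12, 0x44, 0xF0, 0x44])
print(decode(demo))
```

Only make codes produce characters; break codes are consumed so that a released key is not counted twice, and Shift is tracked as state between its make and break code.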
Thus, we obtain the resulting text:
Sorry, I don't speak Keyboard.