Challenge description
Our Threat Intel Team discovered this binary during a VirusTotal RetroHunt by looking
for the term 'FCSC'. They took a look and deduced that the program expects to
be executed by Windows Defender. A quick look at the binary using notepad revealed a location
starting with INPUTINPUT..., which looks like the way to pass an argument to
the program (by patching it).
Me, a Windows noob, after reading the challenge description:
Reverse-Engineering
First things first, this is a reverse-engineering challenge so let’s look at the code.
The main function is quite long but pretty straightforward.
It starts off with the initialization of an array using magic bytes.
Then things in the environment are checked.
Followed by some more checks.
And then we finally reach the end of the function. Well rename the helper functions
for clarity:
What we can gather is that our input is xored against the mysterious arrays filled out
throughout the function, using the pieces of information gathered from the environment.
Then it is xored against the array data_140021020
, containing:
key = [
0xb9, 0x75, 0x4b, 0x6b, 0x78, 0xa0, 0x00, 0xc9,
0xce, 0x9a, 0xea, 0xd5, 0x6c, 0xbf, 0x78, 0x45,
0xc0, 0x9e, 0xe4, 0x7c, 0xed, 0xce, 0x39, 0x46,
0x62, 0x6f, 0x52, 0x6a, 0x57
]
Finally, if our input starts with FCSC{
and if the result of the xors is equal to the string Reminder: patch CVE-2021-1647
,
the file C:\result.txt
is created and the Congratz
string is written to it.
To solve this challenge we need the mysterious array constructed from the environment checks (var_9a8).
We then could just xor this magic with the constant array and the expected output to get the flag.
This is all cool and all, but what do they mean by executed by Windows Defender
?
(and no I totally did not lose a few hours setting up a vulnerable Windows VM thinking the challenge binary was
somehow exploiting CVE-2021-1647 to check for the flag).
AV 101
All modern antivirus software operate on two different planes. The first one is static analysis, where the AV will
check a file against a set of known malware signatures (think, specific byte patterns). The second one is dynamic
analysis. To prevent nuking the whole operating system, AVs execute the suspiscious binary files inside a software
sandbox where they emulate the Windows API’s as well as the CPU instructions. They then try to derive from the program’s
runtime behavior if the input file is malicious.
A really great introduction to this topic is the talk from Alexei Bulazel at the BlackHat 2018 conference about
Windows Defender’s emulator. This gives great insight about the internals as well as some
reverse-engineering tips.
Executing inside Windows Defender
The hard way
One way of debugging our challenge binary would be to do it directly on a windows machine. Unfortunately, Windows Defender
is a protected process, which means we need a kernel debugger. Even though I have such a setup ready, I decided instead to
be as confortable as possible and debug Defender on linux using gdb.
The easy way (the year of Defender on the Linux Desktop ?)
As it turns out, the tortured mind of Tavis Ormandy created the loadlibrary project, which allows a linux process to
load and dynamically link against dll’s. His project provides the linker as well as a light windows API emulation layer.
As an example, he ported Windows Defender to Linux (how convenient !).
Downloading Windows Defender
First things first, we git clone the loadlibrary repository. Then we download Windows Defender from the microsoft website here:
https://www.microsoft.com/en-us/wdsi/defenderupdates#manual
What we get from this is the mpam-fe.exe file (the one I used has the md5 91106c1fb00aa156123f6a1bad794e8b). Extract everything
into the engine directory of the repo using cabextract
and we are all set.
Ok so we are done right ? Just load the PE into this thing, read the mysterious array from memory, compute the flag
and done ? WRONG
Finding the magic array
A few fun facts about the Windows Defender emulator:
- It is complex and stripped
- It can make use of dynamic recompilation for speed (or not)
- It can chose to execute the code (or not)
- We don’t have the pdb for this version
- Software mmu
All this challenge boils down to is “How do I stop the emulator at just the right time to get the mystery key from memory, all
the while keeping in mind the aformentioned issues ?”.
Brainstorming
From the previously uncovered issues, stopping the emulator at just the right instruction looks too involved. To do this we would
have to reverse parts of the CPU emulation code, then find different strategies to stop the emulator at just the right time depending
on its mode (emulation vs dynamic recompilation) and state (we want to break at a specific rip).
The solution I used was to rely instead on memory breakpoints. To be able to emulate correctly the input code, the software mmu of the
emulator must contain a proper virtual address space for the emulated process. This means that the pages of our binary will be mapped
verbatim somewhere in the emulator’s address space.
If we were to put a hardware memory breakpoint on an interesting value, read at some point by the
emulated process, then the whole emulation process would stop. Because down the line, it doesn’t matter
if the backend is JITted code or an emulator. The host cpu will have to do a memory load regardless,
which will stop the whole emulation at this exact point.
Also note that since the emulator uses a software mmu, and as we don’t have the function converting guest VA to host VA, we will need
to rely on memory search to find out where the interesting pieces of guest memory are.
Strategy
Let’s take a look back at the program’s code:
During the call to the xor function, the magic value array was not yet modified. The INPUTINPUT...
string is something we control and
is loaded, one byte at a time, inside the xor function. Also, its pattern is easily recognizable in memory. If we were to put a memory breakpoint
on the hardcoded flag, the emulator would stop in the xor function. From there, we could dump the stack frame of the calling function and
recover the magic key !
Step 0 - Stopping the emulator before the execution
To place our breakpoints we need to break into the emulator after our program is mapped AND before its execution. A very simple way of
doing this is by adding a breakpoint to the ReadStream
function inside mpclient.c
of loadlibrary:
static DWORD ReadStream(PVOID this, ULONGLONG Offset, PVOID Buffer, DWORD Size, PDWORD SizeRead)
{
LogMessage("Reading from fp = %p (offset: 0x%lx size: 0x%08x buffer = %p)", this, Offset, Size, Buffer);
fseek(this, Offset, SEEK_SET);
*SizeRead = fread(Buffer, 1, Size, this);
asm volatile("int3"); // Add this
return TRUE;
}
This function is a callback defender uses to read the file to analyze. This can be seen in the main function of mpclient.c
:
ScanParams.Descriptor = &ScanDescriptor;
ScanParams.ScanReply = &ScanReply;
ScanReply.EngineScanCallback = EngineScanCallback;
ScanReply.field_C = 0x7fffffff;
ScanDescriptor.Read = ReadStream; // Our read callback
ScanDescriptor.GetSize = GetStreamSize;
ScanDescriptor.GetName = GetStreamName;
By playing around with gdb and counting the number of times this function is called, we can easily stop after the last page
is loaded and place our memory breakpoints.
Step 1 - Finding the flag
The flag can easily be found by searching for it in memory using gdb.
Step 2 - Finding the stack
For the stack we need to hope that it is not split across two different pages. We can find the top of the calling stack frame by looking
for the constant bytes of the stack array defined at the beginning of main (if the stack frame is split across two different pages
it would be troublesome to find the bottom half without this “anchor”).
Step 3 - Finding the magic key / array
Just dump the stack before and after the first xor and diff what changed. We know as well from the calls to xor that the expected magic is 0x1d
bytes large.
Execution
We can execute the binary as such:
$ gdb --args ./mpclient ../the_offenders.exe
Step 0 - Stopping the emulator before the execution
Just step a few times until you reach the last call to ReadStream.
Step 1 - Finding the flag
We look for the INPUTINPUT...
string:
gef➤ grep INPUTINPUTINPUT
[+] Searching 'INPUTINPUTINPUT' in memory
[+] In '[heap]'(0x565af000-0x58954000), permission=rw-
0x5804d550 - 0x5804d56d → "INPUTINPUTINPUTINPUTINPUTINPU"
0x586ad0f8 - 0x586ad115 → "INPUTINPUTINPUTINPUTINPUTINPU"
We place hardware watchpoints on both addresses and continue execution:
gef➤ rwatch* 0x5804d550
Hardware read watchpoint 1: * 0x5804d550
gef➤ rwatch* 0x586ad0f8
Hardware read watchpoint 2: * 0x586ad0f8
Step 2 - Finding the stack
After a while the code breaks on a byte access:
0x5a8c2458 movzx ecx, BYTE PTR [eax]
→ 0x5a8c245b mov eax, DWORD PTR [edx]
0x5a8c245d mov DWORD PTR [eax], ecx
0x5a8c245f jmp 0x5a8c12a6
0x5a8c2464 mov esi, DWORD PTR [ebx]
0x5a8c2466 mov eax, DWORD PTR [esi+0x4]
0x5a8c2469 movzx eax, BYTE PTR [eax]
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── threads ────
[#0] Id 1, Name: "mpclient", stopped, reason: BREAKPOINT
──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── trace ────
[#0] 0x5a8c245b → mov eax, DWORD PTR [edx]
[#1] 0x5a8a570c → pop edi
[#2] 0x5a8a1ae8 → movzx ecx, WORD PTR [ebx+0x48]
[#3] 0x5a89ebe4 → movzx ecx, WORD PTR [ebx+0x48]
[#4] 0x5a8a4507 → jmp 0x5a8a450e
[#5] 0x5a59acd2 → mov ecx, DWORD PTR [esi+0x242c4]
[#6] 0x5a59a056 → test DWORD PTR [ebx+0x2a068], 0x200
[#7] 0x5a5879b6 → mov esi, DWORD PTR [ebp-0x98]
[#8] 0x5a448af1 → mov DWORD PTR [ebx], eax
[#9] 0x5a42d757 → mov edi, DWORD PTR [ebp-0x167c]
───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
gef➤ i r $ecx
ecx 0x49 0x49
0x49 in ASCII is ‘I’, the first character of our input. Let’s take a quick look back
at the start of main:
We can look for the value 0x11eb3d60 in memory to find the address of the constant
array on the stack:
gef➤ grep 0x11eb3d60
[+] Searching '\x60\x3d\xeb\x11' in memory
[+] In '[heap]'(0x565af000-0x58954000), permission=rw-
0x56c65d19 - 0x56c65d29 → "\x60\x3d\xeb\x11[...]"
0x56c65d39 - 0x56c65d49 → "\x60\x3d\xeb\x11[...]"
0x5801d960 - 0x5801d970 → "\x60\x3d\xeb\x11[...]"
gef➤ hexdump byte 0x56c65d19 16
0x56c65d19 60 3d eb 11 5c 00 00 07 00 01 05 00 00 00 00 2c `=..\..........,
gef➤ hexdump byte 0x56c65d39 16
0x56c65d39 60 3d eb 11 5c 00 00 0f 00 01 05 00 00 00 00 56 `=..\..........V
gef➤ hexdump byte 0x5801d960 16
0x5801d960 60 3d eb 11 15 ca 71 be 2b 73 ae f1 85 7d 77 81 `=....q.+s...}w.
Only the data at the address 0x5801d960 matches the rest of the constant bytes
used in main.
Step 3 - Finding the magic key / array
Dump the stack, continue until next load (after the first xor), and dump the stack again:
gef➤ hexdump byte 0x5801d960 512
0x5801d960 60 3d eb 11 15 ca 71 be 2b 73 ae f1 85 7d 77 81 `=....q.+s...}w.
0x5801d970 1f 35 2c 07 3b 61 09 d7 2d 98 10 a3 09 14 df f4 .5,.;a..-.......
0x5801d980 9b a3 54 10 8e 69 25 ae a5 1a 8b 5f 20 67 fc de ..T..i%...._ g..
0x5801d990 a8 b0 9c 1a 93 d1 95 cd be 49 85 6e b7 5d 5a 9a .........I.n.]Z.
0x5801d9a0 d5 1d ec b9 5b 74 c9 17 fe 6e 42 48 de 09 be 96 ....[t...nBH....
0x5801d9b0 b5 b1 32 8a 26 60 a7 47 98 29 22 29 2f 74 78 b3 ..2.&`.G.)")/tx.
0x5801d9c0 43 a1 81 ac 18 d5 48 bb e6 bb 0a f3 38 b2 b4 65 C.....H.....8..e
0x5801d9d0 b2 86 bf c7 94 e6 18 80 0c cf 3a a9 23 bb 42 1a ..........:.#.B.
0x5801d9e0 a1 8d 23 8a b9 58 6b 31 5f e3 61 c2 67 51 d5 a7 ..#..Xk1_.a.gQ..
0x5801d9f0 37 57 bc 9b a3 b1 a4 1b af 7e 9e b2 8c c5 dc a8 7W.......~......
0x5801da00 17 0b e1 ee ae 53 8a df f1 b0 eb 1d 96 e1 3e ba .....S........>.
0x5801da10 a7 af 0e 6f 04 1e aa 74 ab 60 34 c6 27 a5 e8 6e ...o...t.`4.'..n
0x5801da20 31 90 7e 22 9f c3 f4 fd 6e 73 1f e0 f8 92 21 5a 1.~"....ns....!Z
0x5801da30 e6 e0 f3 d1 e2 fe 59 a5 49 9e 6d 63 6e 3b 85 0d ......Y.I.mcn;..
0x5801da40 93 07 a9 bd 0c c4 5d 40 62 b7 42 a0 9a 25 63 fa ......]@b.B..%c.
0x5801da50 a0 ff a1 ff a2 ff a3 ff a4 ff a5 ff a6 ff a7 ff ................
0x5801da60 ac 59 f2 7e dd 0a 4b 89 99 ef 65 d7 24 b8 85 b2 .Y.~..K...e.$...
0x5801da70 1f 35 2c 07 3b 61 09 d7 2d 98 10 a3 09 14 df f4 .5,.;a..-.......
0x5801da80 42 00 6c 00 61 00 62 00 6c 00 61 00 00 00 bf ff B.l.a.b.l.a.....
0x5801da90 c0 ff c1 ff c2 ff c3 ff c4 ff c5 ff c6 ff c7 ff ................
0x5801daa0 c8 ff c9 ff ca ff cb ff cc ff cd ff ce ff cf ff ................
0x5801dab0 5b 63 68 61 6e 66 6f 6c 64 65 72 5d 0d 0a 6e 30 [chanfolder]..n0
0x5801dac0 3d 23 42 6c 61 62 6c 61 0d 0a 6e 31 3d 23 45 6e =#Blabla..n1=#En
0x5801dad0 64 0d 0a 00 00 00 00 00 00 00 00 00 00 00 00 00 d...............
0x5801dae0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x5801daf0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x5801db00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x5801db10 00 00 00 00 00 00 00 00 d4 68 89 ae 0a 09 00 00 .........h......
0x5801db20 ad 53 75 41 6d 8c 00 d7 98 d5 dc c6 77 b1 55 08 .SuAm.......w.U.
0x5801db30 f6 a4 c0 25 ba 9a 5c 18 3d 32 00 7f 1d 00 00 00 ...%..\.=2......
0x5801db40 80 f7 12 00 00 00 00 00 a9 35 01 40 01 00 00 00 .........5.@....
0x5801db50 f0 f6 12 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
gef➤ c
[...]
gef➤ hexdump byte 0x5801d960 512
0x5801d960 60 3d eb 11 15 ca 71 be 2b 73 ae f1 85 7d 77 81 `=....q.+s...}w.
0x5801d970 1f 35 2c 07 3b 61 09 d7 2d 98 10 a3 09 14 df f4 .5,.;a..-.......
0x5801d980 9b a3 54 10 8e 69 25 ae a5 1a 8b 5f 20 67 fc de ..T..i%...._ g..
0x5801d990 a8 b0 9c 1a 93 d1 95 cd be 49 85 6e b7 5d 5a 9a .........I.n.]Z.
0x5801d9a0 d5 1d ec b9 5b 74 c9 17 fe 6e 42 48 de 09 be 96 ....[t...nBH....
0x5801d9b0 b5 b1 32 8a 26 60 a7 47 98 29 22 29 2f 74 78 b3 ..2.&`.G.)")/tx.
0x5801d9c0 43 a1 81 ac 18 d5 48 bb e6 bb 0a f3 38 b2 b4 65 C.....H.....8..e
0x5801d9d0 b2 86 bf c7 94 e6 18 80 0c cf 3a a9 23 bb 42 1a ..........:.#.B.
0x5801d9e0 a1 8d 23 8a b9 58 6b 31 5f e3 61 c2 67 51 d5 a7 ..#..Xk1_.a.gQ..
0x5801d9f0 37 57 bc 9b a3 b1 a4 1b af 7e 9e b2 8c c5 dc a8 7W.......~......
0x5801da00 17 0b e1 ee ae 53 8a df f1 b0 eb 1d 96 e1 3e ba .....S........>.
0x5801da10 a7 af 0e 6f 04 1e aa 74 ab 60 34 c6 27 a5 e8 6e ...o...t.`4.'..n
0x5801da20 31 90 7e 22 9f c3 f4 fd 6e 73 1f e0 f8 92 21 5a 1.~"....ns....!Z
0x5801da30 e6 e0 f3 d1 e2 fe 59 a5 49 9e 6d 63 6e 3b 85 0d ......Y.I.mcn;..
0x5801da40 93 07 a9 bd 0c c4 5d 40 62 b7 42 a0 9a 25 63 fa ......]@b.B..%c.
0x5801da50 a0 ff a1 ff a2 ff a3 ff a4 ff a5 ff a6 ff a7 ff ................
0x5801da60 ac 59 f2 7e dd 0a 4b 89 99 ef 65 d7 24 b8 85 b2 .Y.~..K...e.$...
0x5801da70 1f 35 2c 07 3b 61 09 d7 2d 98 10 a3 09 14 df f4 .5,.;a..-.......
0x5801da80 42 00 6c 00 61 00 62 00 6c 00 61 00 00 00 bf ff B.l.a.b.l.a.....
0x5801da90 c0 ff c1 ff c2 ff c3 ff c4 ff c5 ff c6 ff c7 ff ................
0x5801daa0 c8 ff c9 ff ca ff cb ff cc ff cd ff ce ff cf ff ................
0x5801dab0 5b 63 68 61 6e 66 6f 6c 64 65 72 5d 0d 0a 6e 30 [chanfolder]..n0
0x5801dac0 3d 23 42 6c 61 62 6c 61 0d 0a 6e 31 3d 23 45 6e =#Blabla..n1=#En
0x5801dad0 64 0d 0a 00 00 00 00 00 00 00 00 00 00 00 00 00 d...............
0x5801dae0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x5801daf0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x5801db00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x5801db10 00 00 00 00 00 00 00 00 d4 68 89 ae 0a 09 00 00 .........h......
0x5801db20 e4 53 75 41 6d 8c 00 d7 98 d5 dc c6 77 b1 55 08 .SuAm.......w.U.
0x5801db30 f6 a4 c0 25 ba 9a 5c 18 3d 32 00 7f 1d 00 00 00 ...%..\.=2......
0x5801db40 80 f7 12 00 00 00 00 00 a9 35 01 40 01 00 00 00 .........5.@....
0x5801db50 f0 f6 12 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
By dumping both hexdumps to a file and diffing them we find the modified buffer:
$ diff -u before.txt after.txt
--- before.txt 2021-05-03 09:29:23.454177001 +0200
+++ after.txt 2021-05-03 09:29:37.480917683 +0200
@@ -26,7 +26,7 @@
0x5801daf0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x5801db00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x5801db10 00 00 00 00 00 00 00 00 d4 68 89 ae 0a 09 00 00 .........h......
-0x5801db20 ad 53 75 41 6d 8c 00 d7 98 d5 dc c6 77 b1 55 08 .SuAm.......w.U.
+0x5801db20 e4 53 75 41 6d 8c 00 d7 98 d5 dc c6 77 b1 55 08 .SuAm.......w.U.
0x5801db30 f6 a4 c0 25 ba 9a 5c 18 3d 32 00 7f 1d 00 00 00 ...%..\.=2......
0x5801db40 80 f7 12 00 00 00 00 00 a9 35 01 40 01 00 00 00 .........5.@....
0x5801db50 f0 f6 12 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
The magic buffer starts at address 0x5801db20 during this run. This checks out as
0xad ^ ord(‘I’) = 0xe4. Using the first hexdump we dump the following magic key:
weird = [
0xad, 0x53, 0x75, 0x41, 0x6d, 0x8c, 0x0, 0xd7,
0x98, 0xd5, 0xdc, 0xc6, 0x77, 0xb1, 0x55, 0x8,
0xf6, 0xa4, 0xc0, 0x25, 0xba, 0x9a, 0x5c, 0x18,
0x3d, 0x32, 0x0, 0x7f, 0x1d
]
Putting everything together, we can get the flag by xoring this buffer with the
expected bytes (Reminder ...
) and the constant key:
weird = [
0xad, 0x53, 0x75, 0x41, 0x6d, 0x8c, 0x00, 0xd7,
0x98, 0xd5, 0xdc, 0xc6, 0x77, 0xb1, 0x55, 0x08,
0xf6, 0xa4, 0xc0, 0x25, 0xba, 0x9a, 0x5c, 0x18,
0x3d, 0x32, 0x00, 0x7f, 0x1d
]
target = [
0x52, 0x65, 0x6d, 0x69, 0x6e, 0x64, 0x65, 0x72,
0x3a, 0x20, 0x70, 0x61, 0x74, 0x63, 0x68, 0x20,
0x43, 0x56, 0x45, 0x2d, 0x32, 0x30, 0x32, 0x31,
0x2d, 0x31, 0x36, 0x34, 0x37
]
key = [
0xb9, 0x75, 0x4b, 0x6b, 0x78, 0xa0, 0x00, 0xc9,
0xce, 0x9a, 0xea, 0xd5, 0x6c, 0xbf, 0x78, 0x45,
0xc0, 0x9e, 0xe4, 0x7c, 0xed, 0xce, 0x39, 0x46,
0x62, 0x6f, 0x52, 0x6a, 0x57
]
for i in range(len(weird)):
weird[i] ^= (key[i] ^ target[i])
print("".join(chr(c) for c in weird))
This code prints out the flag FCSC{HelloFromEmulatedWorld!}
!
Afterword
I really liked the concept behind this challenge, this is something you dont see often
and that is refreshing to see. The execution model of this challenge really forces you to take
a step back, analyse the situation carefuly, and use every trick in the book to get what you
want, no matter how unsexy the final method is.