Sideway

FCSC Quals 2022 - HyperPacker (reverse)

Challenge description

Votre mission est de trouver un moyen d'extraire les secrets de plusieurs binaires packés dans un temps imparti.

nc challenges.france-cybersecurity-challenge.fr 2202

The attached file is a python pwntools script, which serves as a skeleton for the solving script:

#!/usr/bin/env python3

import subprocess
from pwn import *

HOST = args.HOST or "challenges.france-cybersecurity-challenge.fr"
PORT = args.PORT or  2202

def main():
    r = remote(HOST, PORT)
    _ = r.recvlines(5)

    cmdline = r.recvline().strip().decode("utf-8").split(" ")
    assert cmdline[0] == "hashcash"
    assert cmdline[1] == "-mb26"
    assert cmdline[2].isalnum()

    log.info(f"Solving PoW")
    solution = subprocess.check_output([cmdline[0], cmdline[1], cmdline[2]])
    log.success(f"Solved PoW: {solution.decode()}")

    r.sendline(solution.strip())
    _ = r.recvline()

    encoded = r.recvline().strip()
    binary = b64d(encoded)

    with open("/tmp/hyperpacker.bin", "wb") as fp:
        fp.write(binary)

    log.success("Binary written at /tmp/hyperpacker.bin")

    # Solve the challenge here
    # ...

if __name__ == "__main__":
    main()

The binaries

Executing the python code downloads a binary, which is a PE file.

$ file /tmp/hyperpacker.bin
hyperpacker.bin: PE32+ executable (console) x86-64, for MS Windows

If we execute it with wine, it takes some time and then a message box pops up with the secret.

messagebox

So far so good, easy enough. However, after downloading and executing other binaries we can notice a few things. The binaries are completely identical in structure (same addresses for functions or global variables) and only some thing change, such as the big encrypted buffer or constants in certain functions. Some binaries also fail to unpack as their execution is way too long. This looks like an optimization problem !

Code analysis

Let’s load the binary in a disassembler and start at the entrypoint.

start

The entrypoint of calls a function which returns a pointer to code, and if not null jumps to it.

decrypt_check

This function first calls a function with a pointer to a huge buffer in the data section. We can assume this is the packed code. The other function takes the same buffer and seems to return the address of the entrypoint of the unpacked code. Note that if the unpacking fails, the return value is the address of start.

Let’s take a look at the sanity checks in sub_1400973f0.

Sanity check

check_header_main

The checking function first checks that the header is valid, then the other functions are used to VirtualProtect the code, compute the entrypoint and return it.

check_header

The header check simply compares the first two bytes of the unpacked code with the word 0x5a4d, which in ascii iz MZ. This is the same magic bytes by which a PE file starts !

Now we know the form of the unpacked code. We can move on to the actual unpacking algorithm itself.

Unpacking

As we guessed that this challenge seems to be an optimization one, we need to see where the program spends most of its time. This can be done by using perf.

$ perf record wine /tmp/hyperpacker.bin && perf report

On my test binary, the most called function sits à 12.57% of the CPU time and is at address 0x140097358.

long_function

This function does a simple check on an input byte array, doing a logical and with an input byte and the inverse of a byte in the second buffer. Let’s keep this aside for now and look at this function’s parent, which happens to be the function called before the checking code).

unpacking_code

In (1) we have the long function called with a stack buffer and a global byte array. In (2) we have another long mystery function. In (3) is a function wrapping the checking code we have seen earlier. Its return value is used as a check to stop the big loop and enter the actual unpacking process. In (4), the output area, the original packed area as well as the stack buf are passed to the function doing the decryption.

decrypt_function

It simply computes a xor key using the stack_buf passed as an argument and decrypts the packed code into the new buffer. The actual key generation uses a lot of aes intrinsics and you don’t want to reverse that, so let’s just not look into it.

From here the issue is pretty clear. The computation of the stack buffer in the big loop of weird_loop_stuff, from which the xor key is derived, is too slow. We need to find a way to speed it up.

Going fast

long_function_before

The second long function takes the stack buf and iterates from a certain offset, here 0x7 (it changes from binary to binary). It increments the value at this offset, and if it is different from 0xff it returns immediately. Otherwise, it sets the current byte to 0, increments the offset and starts the loop again.

This may seem mystical at first, but what it does is increment some kind of pseudo bignum inside of stack_buf. Calling the function a few times on a number with an offset of 2 would look like this:

00 00 fe 02 00 00 00 00
00 00 ff 02 00 00 00 00
00 00 00 03 00 00 00 00
00 00 01 03 00 00 00 00
...

The stack buf is then compared in the other long function to a global buffer. In the binary used for this test the value in the buffer is:

00 00 00 00 00 00 00 8a 00 00 98 00 00 00 00 00

We can already notice that the offset at which the incrementation starts in long_function_before is the same as the first non null byte of this global constant. To sum up, here is the stack_buf generation algorithm in pseudo code:

while True:
    if check_decrypt(packed_code, stack_buf):
        break

    while True:
        increment(stack_buf)

	if check_constant_and(stack_buf, constant):
            break

compute_xor_key_and_decrypt(..., stack_buf)

The issue with this algorithm is that if the constant bignum is big enough, the incrementation will take forever. To optimize this I chose to patch the binary and replace the incrementation function by a function doing the incrementation only at the same offsets as the non null bytes of the global constant bignum.

uint8_t next(uint8_t cur_val, uint8_t target)
{
    for (uint16_t new_val = cur_val + 1; new_val < 0x100; new_val++)
    {
        if ((~target & new_val) == 0)
            return new_val;
    }

    return 0;
}

void advance(uint8_t* state, uint8_t* final_state)
{
    for (unsigned i = 0; i < 16; i++)
    {
        if (final_state[i] == 0)
            continue;

        uint8_t next_val = next(state[i], final_state[i]);
        state[i] = next_val;

        if (next_val)
            return;

        // Wrap around, we continue
    }
}

I compiled this code in -O3 using clang and overwrote the long_function code at 0x140097358 with it. Now, as the optimized method only generates valid pseudo bignums, the checking part in main can be replaced as well.

orig_loop

To do this I compiled and patched the code from 0x1400972b1 with the following shellcode:

[BITS 64]

section .text

; To inject at 0x1400972b1
_start:
    push rdi
    push rsi

    ; pointer to stack_buf
    lea rdi, [rbp-0x18]
    ; pointer to global constant
    mov rsi, 0x140096a80
    ; address of long_function (patched with optimized shellcode)
    mov rax, 0x140097358
    call rax

    pop rdi
    pop rsi

    ; jump back to loop start
    mov rax, 0x140097277
    jmp rax

With these small patches, all binaries, even the longer ones finished unpacking in only seconds ! However, in some rare instances where the bignum was too big I would get a segfault when jumping to the unpacked code.

A few long hours were spent debugging. Somehow the unpacking succeeded, but the output was complete garbage. I started to doubt my implementation and spent way too much time starring at the same 5 lines of code, thinking something was at fault. I even attempted to try my luck with the remote service, hoping it would only send me a long string of non crashing inputs. After a while everything clicked.

why

Let’s put the checking function back into focus.

checking

Notice how it only compares the first two bytes with ‘MZ’ ? On bigger bignums a lot of xor key would be generated, which also means that it increases the risk of collision. To only match on a correct unpacking, I patched this function with the following code, checking the first 8 bytes instead of only 2:

[BITS 64]

section .text

; To inject at 0x1400972b1
_start:
    mov rax, 0x0000000300905a4d
    cmp rax, [rcx]
    jne _fail

    xor eax, eax
    inc eax
    ret

_fail:
    xor rax, rax
    ret

This fixed all issues and all binaries could be successfuly unpacked ! All that remains is to programmatically recover the secret.

Recovering the secret

The secret is being displayed in a message box, which is pretty troublesome for scripting. The exact api call for displaying the message box can be found through tracing wine:

$ WINEDEBUG=+all wine hyperpacker-1.bin >log.txt 2>&1
$ grep SECRET log.txt
18890.724:00d4:00d8:Call user32.MessageBoxA(00000000,140004018 "SECRET is : QDJQD95V41XV2COIZY4L8WDO5PMGN2F8",140004000 "Take your secret !",00000000) ret=14000158a

MessageBoxA receives as a second argument the string to be printed. I thus chose to patch the user32.dll in /usr/lib/wine/x86_64-windows with the following shellcode to print the flag on stdout and exit:

[BITS 64]

section .text

_start:
    ; write
    xor rax, rax
    inc rax         ; syscall write (1)
    mov rdi, rax    ; fd = 1 (stdout)
    mov rsi, rdx    ; the flag is in rdx because of the Windows ABI
    mov rdx, 0x2d
    syscall

    ; exit
    xor rdi, rdi
    mov rax, 0x3c
    syscall

The secret can now simply be recovered by reading stdout !

$ wine /tmp/hyperpacker.bin
SECRET is : QDJQD95V41XV2COIZY4L8WDO5PMGN2F8%

Solving

Here is the final solving script:

#!/usr/bin/env python3

import subprocess
import time
from pwn import *

HOST = args.HOST or "challenges.france-cybersecurity-challenge.fr"
PORT = args.PORT or  2202

def solve():
    shellcode = [
      0x57, 0x56, 0x48, 0x8d, 0x7d, 0xe8, 0x48, 0xbe, 0x80, 0x6a,
      0x09, 0x40, 0x01, 0x00, 0x00, 0x00, 0x48, 0xb8, 0x58, 0x73,
      0x09, 0x40, 0x01, 0x00, 0x00, 0x00, 0xff, 0xd0, 0x5f, 0x5e,
      0x48, 0xb8, 0x77, 0x72, 0x09, 0x40, 0x01, 0x00, 0x00, 0x00,
      0xff, 0xe0
    ]

    improved_fn_shellcode = [
      0x4c, 0x8d, 0x4f, 0x10, 0xeb, 0x17, 0x66, 0x2e, 0x0f, 0x1f,
      0x84, 0x00, 0x00, 0x00, 0x00, 0x00, 0x48, 0x83, 0xc7, 0x01,
      0x48, 0x83, 0xc6, 0x01, 0x49, 0x39, 0xf9, 0x74, 0x4f, 0x0f,
      0xb6, 0x16, 0x84, 0xd2, 0x74, 0xec, 0x0f, 0xb6, 0x07, 0x44,
      0x8d, 0x40, 0x01, 0x66, 0x3d, 0xff, 0x00, 0x74, 0x2b, 0xf7,
      0xd2, 0x41, 0x0f, 0xb7, 0xc8, 0x85, 0xd1, 0x74, 0x35, 0x83,
      0xc0, 0x02, 0x0f, 0xb7, 0xc0, 0xeb, 0x10, 0x0f, 0x1f, 0x44,
      0x00, 0x00, 0x89, 0xc1, 0x83, 0xc0, 0x01, 0x21, 0xd1, 0x85,
      0xc9, 0x74, 0x1d, 0x41, 0x89, 0xc0, 0x66, 0x3d, 0x00, 0x01,
      0x75, 0xec, 0xc6, 0x07, 0x00, 0x48, 0x83, 0xc7, 0x01, 0x48,
      0x83, 0xc6, 0x01, 0x49, 0x39, 0xf9, 0x75, 0xb1, 0xc3, 0x0f,
      0x1f, 0x00, 0x44, 0x88, 0x07, 0xc3
    ]

    check_shellcode = [
      0x48, 0xb8, 0x4d, 0x5a, 0x90, 0x00, 0x03, 0x00, 0x00, 0x00,
      0x48, 0x3b, 0x01, 0x75, 0x05, 0x31, 0xc0, 0xff, 0xc0, 0xc3,
      0x48, 0x31, 0xc0, 0xc3
    ]

    fd = open("/tmp/hyperpacker.bin", "r+b")
    fd.seek(0x000492b1)
    fd.write(bytearray(shellcode))
    fd.seek(0x000499b9)
    fd.write(bytearray(check_shellcode))
    fd.seek(0x00049358)
    fd.write(bytearray(improved_fn_shellcode))

    fd.close()

    result = subprocess.run(["wine", "/tmp/hyperpacker.bin"], capture_output=True)
    assert result.returncode == 0

    res = result.stdout.decode("UTF-8").strip()
    return res[len("SECRET is : "):].replace("\x00", "")

def main():
    r = remote(HOST, PORT)
    _ = r.recvlines(5)

    cmdline = r.recvline().strip().decode("utf-8").split(" ")
    assert cmdline[0] == "hashcash"
    assert cmdline[1] == "-mb26"
    assert cmdline[2].isalnum()

    log.info(f"Solving PoW")
    solution = subprocess.check_output([cmdline[0], cmdline[1], cmdline[2]])
    log.success(f"Solved PoW: {solution.decode()}")

    r.sendline(solution.strip())

    while True:
        x = r.recvline()

        if b"Here is your binary:" not in x:
            log.info("Message:")
            print(x)
            r.interactive()
            break

        encoded = r.recvline().strip()
        binary = b64d(encoded)

        with open("/tmp/hyperpacker.bin", "wb") as fp:
            fp.write(binary)

        with open("/tmp/hyperpacker.bin.bak", "wb") as fp:
            fp.write(binary)

        log.success("Binary written at /tmp/hyperpacker.bin")

        # Solve the challenge here
        # ...
        _ = r.recv()

        log.info("Solving ...")

        secret = solve()
        r.sendline(secret.encode("UTF-8"))

        log.info(f"Solved: {secret}")

        _ = r.recvline()

if __name__ == "__main__":
    main()

Running the script gives us the following flag:

FCSC{2b60aef3d241d3c37c30373a9a4446017a6d5b6761ae5492e3f824e280cb8ceb}

Afterword

This challenge was one my favourite from this year. It is always really fun to patch and repurpose binaries for your own benefit :P.