HTB 2021 Uni CTF Quals - Mechanical madness

Medium hardware

This challenge consists of an assembly program and simulated CPU with the goal of running the program on the CPU. Since we had no way to assemble the program, we first had to write an assembler for it. To help with the reversing we were also given a short program and its bytecode.

:start
	movl ax, 10
:sub1
	movl bx, 1
	sub  bx
	cmp  ax,  ax
	movl bx,  :sub1
	jnz
	rst
v2.0 raw
10000a 100101 10100 130000 100101 e0000 40000

The CPU and some surrounding hardware were implemented as a circuit in logisim evolution.

screenshot

 Reversing

To start, we tried to decode the instruction by looking at the example program and assembly. Comparing two similar lines gave us an initial idea of the structure.

bytecode      assembly       binary
------------------------------------------------------
10000a        movl ax, 10    000100000000000000001010
100101        movl bx, 1     000100000000000100000001

ins  reg  data
--------------
10   00   0a
10   10   01

To learn more about the device, we focused on the other file we got and followed the flow of data inside logisim.

After reading a block of 21 bits from memory (3 bits are discarded), the data is passed to a piece of logic that breaks apart the instruction in 3 parts like we assumed. While we initially thought the other parts to be reg and data, not all instructions use this format. So, we started to think of them as the first and second argument, although they are more commonly called operands when talking about assembly.

screenshot

Then the parts are fed into the CU and we took a look inside. Here we see the IR data line which contains the byescode for the instruction itself. If we look at one of the the multiplexers on the left and follow the trace, we can find out what IR value decodes to what instruction. In this case we can see that the IR value for JMP is 5.

screenshot

Some of the instructions are grouped, so we have to look inside to find out the real encoding. The arithmetic operations are grouped in the ALU. If we go to the ALU view, we see that the instructions are again selected using a multiplexer and that for example, the IR value for SUB is 2.

screenshot

Conditional jumps are also grouped, but selected using a slightly different method, these are using AND and NOT gates. The NOT gates signal a 0 here, so reading up you can see that JG would be encoded as 01000 or 8.

screenshot

We decode the sub-instructions inside the ALU and CJMP until we cover all instruction used in the program we needed to assemble.

Finally, we used the information we gathered to write a python script that could assemble the instructions. We loaded the resulting bytecode into memory and waited a few minutes without any visible result, then we changed the emulation speed from the default 16Hz to 2048 KHz and slowly got the flag!

screenshot

 The assembler and assembly

The assembler first prepossesses the assembly to replace labels with absolute addresses and standardizes the operands. After this the assembler only had to put the binary values corresponding to the instructions and operands together and format it like a logisim memory image.

registers = {"ax":"0", "bx":"1", "cx":"2", "dx":"3"}

instructions = {'ADD':  '00000', 'SUB':  '00001',
                'MUL':  '00010', 'CLR':  '00011',
                'RST':  '00100', 'JMP':  '00101',
                'LJMP': '00110', 'JLP':  '00111',
                'JG':   '01000', 'JGE':  '01001',
                'JL':   '01010', 'JLE':  '01011',
                'JE':   '01100', 'JZ':   '01101',
                'JNZ':  '01110', 'MOV':  '01111',
                'MOVL': '10000', 'CALL': '10001',
                'RET':  '10010', 'CMP':  '10011',
                'PUSH': '10100', 'POP':  '10101',
                'DIV':  '10110', 'MMIV': '10111',
                'MMOV': '11000', '???':  '11001',
                'MSK':  '11010', 'MSKB': '11011'}

def preproces(asm):
    p_asm = []
    labels = {}
    #Step 1: Put instructions in a list and interpret labels
    for line in asm:
        if line.startswith(":"):
            labels[line] = str(len(p_asm))
        else:
            p_asm.append(line.replace(",", "").strip().split())
    #Step 2: Resolve labels and registers
    for line in range(len(p_asm)):
        for part in range(1, len(p_asm[line])):
            if p_asm[line][part].startswith(":"):
                for label in labels.keys():
                    if label in p_asm[line][part]:
                        p_asm[line][part] = p_asm[line][part].replace(label, labels[label])
            elif p_asm[line][part] in registers:
                p_asm[line][part] = registers[p_asm[line][part]]
    #Step 2: Convert hex values and resolve offsets
    for line in range(len(p_asm)):
        for part in range(1, len(p_asm[line])):
            if not p_asm[line][part].isdigit():
                if p_asm[line][part].startswith("0x"):
                    p_asm[line][part] =  str(int(p_asm[line][part],16))
                elif "+" in p_asm[line][part]:
                    sum = p_asm[line][part].split("+")
                    p_asm[line][part] = str(int(sum[0]) + int(sum[1]))
                elif "-" in p_asm[line][part]:
                    sum = p_asm[line][part].split("-")
                    p_asm[line][part] = str(int(sum[0]) + int(sum[1]))
    return p_asm

def assemble(asm):
    bytecode = []
    for line in range(len(asm)):
        #Get bits for the instruction
        inst = instructions[asm[line][0].upper()].zfill(8)
        for part in range(1, len(asm[line])):
            #Get bits for the arguments
            inst += "{0:b}".format(int(asm[line][part])).zfill(8)
        #Add the hex value to the list, filling unused arguments with zeros
        bytecode.append(hex(int(inst.ljust(24, '0'), 2))[2:])
    return bytecode

def main():
    with open("program.asm", "r") as infile:
        asm = infile.read().splitlines()
    p_asm = preproces(asm)
    #Print bytecode in logisim evolution memory dump format
    print("v2.0 raw")
    print(" ".join(assemble(p_asm)))
main()

For people interested in the assembly we needed to assemble, here is what we used as input:

:start
	movl cx, 10
	clr
	movl bx,  1
	movl dx,  0
	mmiv 0x0,  dx
	movl bx,  :sub4
	call bx,  0
	mmov ax, 0x0
	movl bx,  :sub5
	call bx,  0
:sub1
	movl bx,  1
	movl dx,  0
	push dx,  0
	movl bx,  :sub5
	call bx,  0
	movl bx,  1
	sub  bx,  0
	cmp  ax, ax
	movl bx, :sub1
	jnz
	movl bx,  :sub4
	call bx,  0
:sub2
	movl bx,  0
	mmiv 0x1, bx
	mmiv 0x2, bx
:sub3
	pop  ax,  0
	movl bx,  1
	movl dx,  0
	movl bx,  :sub5
	call bx,  0
	mmov bx, 0x1
	msk
	mmiv 0x1, bx
	mmov bx, 0x2
	mskb
	mmiv 0x2, bx
	movl ax,  0xff
	cmp  bx, ax
	movl bx, :sub3
	jl
	movl bx,  0
	mmov dx, 0x1
	movl cx,  1
	movl cx,  0
	movl bx,  :sub2
	jmp  bx,  0
:sub4
	movl ax,  0x05
	movl bx,  :sub5
	call bx,  0
	movl bx,  1
	sub  bx,  0
	cmp  ax,  ax
	movl bx,  :sub4+1
	jnz
	ret
:sub5
	movl cx,  4
	movl cx,  0
	ret

And this is the output generated by the program:

v2.0 raw
10020a 30000 100101 100300 170003 10012e 110100 180000 100137 110100 100101 100300 140300 100137 110100 100101 10100 130000 10010a e0000 10012e 110100 100100 170101 170201 150000 100101 100300 100137 110100 180101 1a0000 170101 180102 1b0000 170201 1000ff 130100 100119 a0000 100100 180301 100201 100200 100116 50100 100005 100137 110100 100101 10100 130000 10012f e0000 120000 100204 100200 120000