Writeup by spikeroot for PORS

Table of contents

TL;DR: Program obfuscation based on sigreturn frames.

PORS is a challenge I authored on the behalf of ECSC Team France for Compete With Team Europe 2025, a training CTF where all the ECSC national teams play against Team Europe. It has been solved by 3 teams during the CTF.

Introduction

For this challenge we are given two files:

an ELF x86-64 binary pors
a file program.bin

Is that yet another VM challenge? Not this time…

When we run the binary, it prompts us for an input:

$ ./pors
Enter your input: azerazerazer
[-] Wrong input, sorry...

The binary does the following operations:

open program.bin,
map three areas at 0x13370000, 0x42420000, 0xcafe0000 and write some data from program.bin into the two former,
execute a mysterious syscall and exit.

undefined8 FUN_00401206(void)
{
  uint local_30;
  uint local_2c;
  void *local_28;
  void *local_20;
  void *local_18;
  FILE *local_10;
  
  local_10 = fopen("program.bin","r");
  if (local_10 == (FILE *)0x0) {
    puts("[-] Failed to load program :/");
    exit(1);
  }
  fread(&local_2c,4,1,local_10);
  local_18 = mmap((void *)0x13370000,(long)(int)((local_2c & 0xfffff000) + 0x1000),7,0x22,-1,0);
  if (local_18 != (void *)0x13370000) {
    puts("[-] init failed, sorry :/");
    exit(1);
  }
  fread((void *)0x13370000,1,(long)(int)local_2c,local_10);
  fread(&local_30,4,1,local_10);
  local_20 = mmap((void *)0x42420000,(long)(int)((local_30 & 0xfffff000) + 0x1000),3,0x22,-1,0);
  if (local_20 != (void *)0x42420000) {
    puts("[-] init failed, sorry :/");
    exit(1);
  }
  fread((void *)0x42420000,1,(long)(int)local_30,local_10);
  local_28 = mmap((void *)0xcafe0000,0x2000,3,0x22,-1,0);
  if (local_28 != (void *)0xcafe0000) {
    puts("[-] init failed, sorry :/");
    exit(1);
  }
  syscall();
  return 0;
}

Obfuscation

But… where is the actual program? The one that prompts us?

The “mysterious syscall” is a sigreturn.

004013c4 48 c7 c4        MOV        RSP,0x42420000
         00 00 42 42
004013cb 48 c7 c0        MOV        RAX,0xf
         0f 00 00 00
004013d2 0f 05           SYSCALL

Actually, the area at 0x42420000 contains a whole array of sigreturn frames, that encodes partially the logic of the program. Sigreturn frames are especially known for the famous SROP technique used in pwn, that allows reaching an arbitrary RIP value while controlling the values of all registers.

The area at 0x13370000 contains the actual “code” of the program, scattered in small blocks, among them:

a “dispatcher block”

00000004 48 89 d0        MOV        RAX,RDX
00000007 48 83 c0 01     ADD        RAX,0x1
0000000b 48 69 c0        IMUL       RAX,RAX,0xf8
         f8 00 00 00
00000012 48 01 c4        ADD        RSP,RAX
00000015 48 c7 c0        MOV        RAX,0xf
         0f 00 00 00
0000001c 0f 05           SYSCALL

small blocks of code similar to:

### executed code ###
00000234 c6 44 24        MOV        byte ptr [RSP + 0x14],0x0
         14 00
### call to dispatcher block ###
00000239 48 c7 c4        MOV        RSP,0x42420000
         00 00 42 42
00000240 48 c7 84        MOV        qword ptr [RSP + 0x88],0x24
         24 88 00 
         00 00 24 
0000024c 48 c7 c0        MOV        RAX,0xf
         0f 00 00 00
00000253 0f 05           SYSCALL

This pattern reminds of control flow flattening (CFF), where each “basic block” runs an elementary amount of code, and calls a “dispatcher” with the number of the next block to execute (that we call the successor). Then, the dispatcher jump to the successor.

Here, the number of the successor is stored in RDX when the dispatcher is called. The dispatcher essentially runs the block N by doing a sigreturn with RSP = 0x42420000 + (N + 1) * <size_of_sigreturn_frame>: each block is associated to the sigreturn frame corresponding to its number. After running its actual code, each block calls back the dispatcher by setting RSP to 0x42420000, writing the number of the successor at [RSP + 0x88] (which corresponds to RDX in the sigreturn frame) and doing a sigreturn syscall. This is the purpose of the very first frame at 0x42420000, unused by the dispatcher.

Some blocks might jump to several different successors, depending on a conditional branch: for instance, in the following block, if AL != BL, the next block executed is 0x43, otherwise it is 0x35.

00000144 8a 44 24 10     MOV        AL,byte ptr [RSP + offset DAT_42420010]
00000148 8a 5c 24 11     MOV        BL,byte ptr [RSP + offset DAT_42420011]
0000014c 38 d8           CMP        AL,BL
0000014e 48 c7 c4        MOV        RSP,0x42420000
         00 00 42 42
00000155 48 c7 c0        MOV        RAX,0x43
         43 00 00 00
0000015c 48 c7 c3        MOV        RBX,0x35
         35 00 00 00
00000163 48 0f 45 c3     CMOVNZ     RAX,RBX
00000167 48 89 84        MOV        qword ptr [RSP + offset DAT_42420088],RAX
         24 88 00 
         00 00
0000016f 48 c7 c0        MOV        RAX,0xf
         0f 00 00 00
00000176 0f 05           SYSCALL

Deobfuscation

The principal difficulty is that the logic is interleaved between both the code blocks and the values that are previously attributed to registers before the execution of each block (that are stored in the sigreturn frames). For instance, in the block above, RSP has previously been set to 0xcafe1008 by sigreturn (the area mapped at 0xcafe0000 is actually used as a “stack” for the obfuscated program).

One possible way to reconstitute the control flow of the program is as follows:

associate each block to its sigreturn frame by identifying the RIP value of the sigreturn frame,
emulate the “dispatcher” by replacing the sigreturns with direct jumps to the next block, handling carefully the conditionals.

In a nutshell, one can generate a new program_deobf.bin where the blocks are of the form:

    mov r8, <r8_in_the_corresponding_sigreturn_frame>
    mov r9, <r9_in_the_corresponding_sigreturn_frame>
    ...
    mov rdi, <rdi_in_the_corresponding_sigreturn_frame>
    mov rsi, <rsi_in_the_corresponding_sigreturn_frame>
    ...
    [the code block]
    mov rax, <the_offset_of_the_successor>
    jmp rax

For blocks with conditionals, the final part is replaced with

    mov rax, <offset_successor1>
    mov rbx, <offset_successor2>
    j<cond> taken
    jmp rax
taken:
    jmp rbx

The blocks being shuffled and not in the same order as the corresponding sigreturn frames, particular attention is required for this step.

The full script to deobfuscate the program is available below and here.

Solving

Then, the pseudo-code can more or less be reconstituted in a decompiler. After googling a bit or asking our favorite LLM, we figure out that the program encodes a 9x9 Suguru puzzle. The areas are hardcoded in a 81-byte buffer areas whose bytes are equal to values in 0-18. Each area is numbered, and a given square (i,j) belongs to the area N if areas[i*9+j] = N. The constraints (already placed numbers) are also hardcoded in another buffer.

After prompting the user for input, the program checks its validity with respect to data concerning the grid. If the input is correct, it is hashed with SHA512 and xored with a constant (the decryption function is actually in the main binary), which prints the flag. Otherwise, it calls another function that prints the “wrong input” message.

The expected input is the numbers in the solved Suguru grid, as a 81-char long string. It is possible to use Z3 to solve the puzzle.

The solver script is available below and here.

After finding the (unique) solution: 141253514235414232142323541235141232142323541231514132142323251351541434143232521, we input it to the binary to get the flag.

$ ./pors
Enter your input: 141253514235414232142323541235141232142323541231514132142323251351541434143232521
[+] Congratulations! You can validate the challenge with this flag: CWTE{pr0gr4mm4t10n_0r13nt33_r3t0ur_s1gn4l}

FLAG: CWTE{pr0gr4mm4t10n_0r13nt33_r3t0ur_s1gn4l}. It means “signal return oriented programming” in French, and thanks to a great coincidence, “PORS” is also “SROP” in reverse :-)

Appendix: `deobf.py`

from pwn import *
from capstone import Cs, CS_ARCH_X86, CS_MODE_64
from capstone.x86 import *

prog_size, program, srop_size, srop_frames = None, None, None, None

info("Parsing the program...")

with open("program.bin", "rb") as f:
    prog_size = u32(f.read(4))
    program = f.read(prog_size)
    srop_size = u32(f.read(4))
    srop_frames = f.read(srop_size)

frame_number = srop_size // 0xf8

parsed_frames = [SigreturnFrame(arch='amd64') for i in range(frame_number)]
block_indices = [0 for i in range(frame_number)]
inv_block_indices = [0 for i in range(frame_number)]

for i in range(frame_number):
    off = i*0xf8
    parsed_frames[i].r8 = u64(srop_frames[off + 0x28:off + 0x30])
    parsed_frames[i].r9 = u64(srop_frames[off + 0x30:off + 0x38])
    parsed_frames[i].r10 = u64(srop_frames[off + 0x38:off + 0x40])
    parsed_frames[i].r11 = u64(srop_frames[off + 0x40:off + 0x48])
    parsed_frames[i].r12 = u64(srop_frames[off + 0x48:off + 0x50])
    parsed_frames[i].r13 = u64(srop_frames[off + 0x50:off + 0x58])
    parsed_frames[i].r14 = u64(srop_frames[off + 0x58:off + 0x60])
    parsed_frames[i].r15 = u64(srop_frames[off + 0x60:off + 0x68])
    parsed_frames[i].rdi = u64(srop_frames[off + 0x68:off + 0x70])
    parsed_frames[i].rsi = u64(srop_frames[off + 0x70:off + 0x78])
    parsed_frames[i].rbp = u64(srop_frames[off + 0x78:off + 0x80])
    parsed_frames[i].rbx = u64(srop_frames[off + 0x80:off + 0x88])
    parsed_frames[i].rdx = u64(srop_frames[off + 0x88:off + 0x90])
    parsed_frames[i].rax = u64(srop_frames[off + 0x90:off + 0x98])
    parsed_frames[i].rcx = u64(srop_frames[off + 0x98:off + 0xa0])
    parsed_frames[i].rsp = u64(srop_frames[off + 0xa0:off + 0xa8])
    parsed_frames[i].rip = u64(srop_frames[off + 0xa8:off + 0xb0])
    block_indices[i] = (parsed_frames[i].rip - 0x13370000) // 0x50
    inv_block_indices[(parsed_frames[i].rip - 0x13370000) // 0x50] = i

code_blocks = [program[i*0x50:(i+1)*0x50] for i in range(frame_number)]

# print(code_blocks, block_indices)

info("Reconstituting the CFG...")

md = Cs(CS_ARCH_X86, CS_MODE_64)
md.detail = True

actual_blocks = []

for i in range(1, frame_number):
    b = code_blocks[i]
    before_mov_rsp = []
    successors = []
    calls = []
    cmov_condition = None

    found_mov_rsp = False
    for insn in md.disasm(b, 0x1337):
        if not found_mov_rsp:
            if insn.mnemonic.lower() == "mov" and insn.op_str.lower().startswith("rsp, 0x42420000"):
                found_mov_rsp = True
            elif insn.mnemonic.lower() == "jmp":
                calls = [insn.reg_name(insn.operands[0].reg)]
                break
            else:
                before_mov_rsp.append(f"{insn.mnemonic} {insn.op_str}")
        else:
            # Cherche valeurs de MOV vers RAX/RBX
            if insn.mnemonic.lower() == "mov":
                if (
                        len(insn.operands) == 2 and
                        insn.operands[0].type == X86_OP_MEM and
                        insn.operands[0].size == 8 and  # qword
                        insn.operands[0].mem.base != 0 and
                        insn.reg_name(insn.operands[0].mem.base) == "rsp" and
                        insn.operands[0].mem.disp == 0x88
                ):
                    if cmov_condition == None:
                        successors = [insn.operands[1].imm]
                    break
                if insn.operands[0].type == X86_OP_REG and insn.operands[1].type == X86_OP_IMM:
                    reg = insn.reg_name(insn.operands[0].reg)
                    imm = insn.operands[1].imm
                    if reg in ("rax", "rbx"):
                        successors.append(imm)
            elif insn.mnemonic.lower().startswith("cmov"):
                cmov_condition = insn.mnemonic[4:]

    if successors == [] and calls == []:
        print(i, b)

    actual_blocks.append((('\n'.join(before_mov_rsp), successors, calls, cmov_condition, parsed_frames[inv_block_indices[i]])))

# print(actual_blocks)
info("Writing deobfuscated code...")

new_blocks = []
block_size = 0x200

for i in range(frame_number-1):
    ab = actual_blocks[i]
    code = ab[0]
    successors = ab[1]
    calls = ab[2]
    cond = ab[3]
    frame = ab[4]
    new_asm = ''
    new_asm += 'mov r8, 0x%x\n' % frame.r8
    new_asm += 'mov r9, 0x%x\n' % frame.r9
    new_asm += 'mov r10, 0x%x\n' % frame.r10
    new_asm += 'mov r11, 0x%x\n' % frame.r11
    new_asm += 'mov r12, 0x%x\n' % frame.r12
    new_asm += 'mov r13, 0x%x\n' % frame.r13
    new_asm += 'mov r14, 0x%x\n' % frame.r14
    new_asm += 'mov r15, 0x%x\n' % frame.r15
    new_asm += 'mov rdi, 0x%x\n' % frame.rdi
    new_asm += 'mov rsi, 0x%x\n' % frame.rsi
    new_asm += 'mov rbp, 0x%x\n' % frame.rbp
    new_asm += 'mov rbx, 0x%x\n' % frame.rbx
    new_asm += 'mov rdx, 0x%x\n' % frame.rdx
    new_asm += 'mov rax, 0x%x\n' % frame.rax
    new_asm += 'mov rcx, 0x%x\n' % frame.rcx
    new_asm += code
    if len(successors) == 1:
        new_asm += '\n'
        new_asm += 'mov rax, 0x%x\n' % ((block_indices[successors[0]+1] - 1) * block_size + 0x10)
        new_asm += 'jmp rax'
    elif len(successors) == 2:
        new_asm += '\n'
        new_asm += 'mov rax, 0x%x\n' % ((block_indices[successors[0]+1] - 1) * block_size + 0x10)
        new_asm += 'mov rbx, 0x%x\n' % ((block_indices[successors[1]+1] - 1) * block_size + 0x10)
        new_asm += 'j%s taken\n' % cond
        new_asm += 'jmp rax\n'
        new_asm += 'taken:\n'
        new_asm += 'jmp rbx'
    elif len(calls) == 1:
        new_asm += 'call %s' % (calls[0])
    else:
        print("oh no")
        exit(1)

    new_block = asm(new_asm, vma=0, arch='amd64')
    new_block += b'\xcc'*(block_size - len(new_block))
    new_blocks.append(new_block)

final_code = b'' # asm("mov esp, 0xcafe0008")
final_code += b'\x90'*(0x10 - len(final_code))
final_code += b''.join(new_blocks)

open("program_deobf.bin", "wb").write(final_code)

Appendix: `solve.py`

from z3 import *

# The puzzle definition
areas = [ [0x00, 0x00, 0x01, 0x01, 0x02, 0x02, 0x03, 0x03, 0x03],
          [0x00, 0x00, 0x04, 0x04, 0x02, 0x02, 0x02, 0x03, 0x03],
          [0x05, 0x05, 0x04, 0x04, 0x06, 0x06, 0x07, 0x07, 0x07],
          [0x08, 0x05, 0x05, 0x04, 0x09, 0x06, 0x0a, 0x07, 0x07],
          [0x08, 0x08, 0x05, 0x09, 0x09, 0x0a, 0x0a, 0x0a, 0x0a],
          [0x0b, 0x08, 0x0c, 0x09, 0x09, 0x0d, 0x0d, 0x0d, 0x0d],
          [0x0b, 0x0c, 0x0c, 0x0c, 0x0e, 0x0f, 0x0f, 0x0f, 0x0f],
          [0x0b, 0x0c, 0x0e, 0x0e, 0x0e, 0x10, 0x0f, 0x11, 0x11],
          [0x12, 0x12, 0x12, 0x12, 0x0e, 0x10, 0x11, 0x11, 0x11] ]

constraints = [ [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00],
                [0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x00],
                [0x00, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01],
                [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00],
                [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x05, 0x00, 0x00],
                [0x00, 0x00, 0x00, 0x05, 0x00, 0x00, 0x00, 0x00, 0x00],
                [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00],
                [0x00, 0x05, 0x00, 0x00, 0x04, 0x00, 0x00, 0x00, 0x00],
                [0x00, 0x00, 0x00, 0x00, 0x00, 0x02, 0x05, 0x00, 0x01] ]

grid = [[BitVec("x%d%d" % (i,j), 8) for j in range(9)] for i in range(9)]

s = Solver()

for i in range(9):
    for j in range(9):
        s.add(1 <= grid[i][j])
        if constraints[i][j] != 0:
            s.add(grid[i][j] == constraints[i][j])
        s.add(grid[i][j] <= 9)
        if i < 8:
            s.add(grid[i+1][j] != grid[i][j])
            if j < 8:
                s.add(grid[i+1][j+1] != grid[i][j])
            if j > 0:
                s.add(grid[i+1][j-1] != grid[i][j])
        if i > 0:
            s.add(grid[i-1][j] != grid[i][j])
            if j < 8:
                s.add(grid[i-1][j+1] != grid[i][j])
            if j > 0:
                s.add(grid[i-1][j-1] != grid[i][j])
        if j < 8:
            s.add(grid[i][j+1] != grid[i][j])
        if j > 0:
            s.add(grid[i][j-1] != grid[i][j])

c_areas = [0]*0x13
s_areas = [0]*0x13

for i in range(9):
    for j in range(9):
        c_areas[areas[i][j]] += 1
        s_areas[areas[i][j]] += (1 << (grid[i][j] - 1))

for i in range(0x13):
    s.add(s_areas[i] == (1 << c_areas[i]) - 1)

if s.check() == sat:
    m = s.model()
    sol_mat = [[m[grid[i][j]].as_long() for j in range(9)] for i in range(9)]
    sol = ''.join([str(m[grid[i][j]].as_long()) for i in range(9) for j in range(9)])
    print("Solution: %s" % sol)
else:
    print ("[-] Failed to solve puzzle")

s.add(Or([grid[i][j] != sol_mat[i][j] for i in range(9) for j in range(9)]))

if s.check() == unsat:
    print ("[+] Unique solution")
else:
    print ("[-] warning, solution is not unique...")

from pwn import *

io = process('./pors')

io.sendlineafter(b'Enter your input: ', sol.encode())
print(io.recv().decode())