Introduction
ShuffleMe
is a reverse challenge from the FCSC edition 2021 from Cryptanalyse.
Recover the inputs to get the flag.
shuffleMe
- 22.02 KiB
Initial recon
We are facing a stripped ELF 64-bit executable, the binary is asking for an secret from the argv[1]
, then a input from stdin
.
int main(int argc,char **argv)
{
uint8_t buffer [80];
long local_18;
int local_c;
if (argc != 2) {
fprintf(stderr,"Usage: %s <secret>\n",argv[1]);
exit(1);
}
local_c = atoi(argv[1]);
memset(buffer,0,80);
__isoc99_scanf("%70s",buffer);
local_18 = FUN_401230(buffer,local_c);
if (local_18 == 0x460000) {
puts("Yay!");
}
else {
puts("Nope.");
}
return 0;
}
For this writeup, I will use the base address 0x400000
The following function can look weird, the FUN_401230
function call FUN 401245
which seems to end quite suddenly.
00401245 41 58 POP R8
00401247 49 83 c0 38 ADD R8,0x38 ; The next instruction after 0x0040127b
0040124b 50 PUSH RAX ; The Buffer
0040124c 48 89 f0 MOV RAX,RSI ; Store the secret to RAX
0040124f 48 c7 c1 MOV RCX,0xb5
b5 00 00 00
00401256 48 f7 e1 MUL RCX
00401259 48 05 d9 ADD RAX,0xd9
00 00 00
0040125f 48 c7 c1 MOV RCX,0x200
00 02 00 00
00401266 48 f7 f1 DIV RCX
00401269 48 89 d6 MOV RSI,RDX
0040126c 48 89 f1 MOV RCX,RSI
0040126f 48 c1 e1 04 SHL RCX,0x4
00401273 4c 01 c1 ADD RCX,R8
00401276 58 POP RAX
00401277 49 83 eb 07 SUB R11,0x7
0040127b ff e1 JMP RCX
By manually disassembling the code after 0x40127b
we can see a pattern: one instruction, a tiny NOP sled, then a jump to 0x40124b
.
From here, the challenge start drawing by itself, our secret input will be used to select which instruction jump, then the next instruction will be resolved and so on.
This is a kind of Control Flow Flattening protection, but in our case it’s up to the user to choose the instruction flow.
Introduction
The first thing that will help us in the analysis is to get all the possible RCX value from our secret input.
We can manually get the operation and re-write them to a z3 script, it’s not even 10 instructions.
import z3
MASK = 0xffffffffffffffff
def find_solution(target_addr):
s = z3.Solver()
rax = z3.BitVec("reg_RAX", 32) # our 32 bit input
r8 = 0x40127d # addr of jmp rcx
constraints = (((( (rax*0xb5)& MASK)+ 0xD9) & MASK)%0x200)
s.add(((constraints<<4)+r8)==target_addr) # RCX computed value
if s.check() == z3.sat:
model = s.model()
for model_index, _ in enumerate(model):
return int(str(model[_])) # just one solution
else:
print("No solution found !")
exit(1)
The last task that could be done manually is getting all the target address and link them to the a secret value.
By copying the instruction set from 0x40127d
to 0x403278
using objdump
or ghidra
and 2 minutes of pasting using Cyberchef we can have the list of the instructions address.
target_addrr = [4200717, 4200733, 4200749, 4200765, 4200781, ...]
instr2secret = {}
for addrr in target_addrr:
instr2secret[addrr] = find_solution(addrr)
Reversing the Control Flow
It’s not possible to find the flag if I don’t have the code that will check it, so the first step is to find the secret.
Solving this type of challenge can be quite a pain, hopefully, the Miasm framework will help us.
The idea is to emulate FUN_00401230
with the constants found previously for each instruction, store the shellcode generated, and then use a symbolic execution engine
to find the target return address (0x460000).
With this script I can get the shellcode for every secret value.
import pathlib
from miasm.jitter.jitload import JitterException
from miasm.analysis.binary import Container
from miasm.analysis.machine import Machine
from miasm.core.locationdb import LocationDB
from miasm.jitter.csts import PAGE_READ, PAGE_WRITE,EXCEPT_SYSCALL, PAGE_EXEC
from miasm.arch.x86.arch import mn_x86
from miasm.expression.expression import *
from my_z3_script import instr2secret
shellcode = b""
def supervisor(jitter):
global shellcodes go back to
RCXValue = jitter.cpu.RCX
blk = mdis.dis_block(RCXValue)
if(RCXValue == blk.lines[0].offset): # Just to make sure i really execute the instruction
print(blk.lines[0], hex(blk.lines[0].offset)) # print the shellcode instr per instr
shellcode = shellcode+mn_x86.asm(blk.lines[0])[0] # get the bytecode
jitter.running = True # continue the execution
return True
def end_shellcode(jitter):
global shellcode
print(shellcode)
# Now use symbolic execution to find rax=0x460000
shellcode = b""
jitter.running = False
return False
entry1 = b"A"*70 # garbage input
path = str(pathlib.Path(__file__).parent.absolute())+"/"
f = open(path+"shuffleMe", 'rb')
data = f.read()
f.close()
loc_db = LocationDB()
cont = Container.from_string(data, loc_db)
machine = Machine("x86_64")
mdis = machine.dis_engine(cont.bin_stream, loc_db=loc_db)
def init_jit():
global myjit
myjit = machine.jitter(loc_db)
myjit.init_stack()
base_addrr = 0x00400000
myjit.vm.add_memory_page(base_addrr, PAGE_READ | PAGE_WRITE | PAGE_EXEC , data)
myjit.vm.add_memory_page(0xdead0000, PAGE_READ , entry1)
myjit.cpu.RDI = 0xdead0000
myjit.cpu.RAX = 0xdead0000
myjit.add_breakpoint(0x0401205, end_shellcode)
myjit.add_breakpoint(0x40127b, supervisor)
init_jit()
for addrr, secret in instr2secret.items():
print(secret)
entry2 = secret
myjit.cpu.RSI = entry2
myjit.cpu.RDX = entry2
try:
myjit.run(0x401200)
except JitterException: # Some address point will make the program crash
print("Error for: ",hex(addrr), f"({secret})")
myjit.running = False
init_jit()
The next step with the symbolic execution didn’t work, the result was hardly parsable, and the amount of time to handle over the 500 possibilities would be too big. I found a better solution.
The instruction flow created from our secret will define the entry point of the shellcode, but the next address will follow the same set until the end.
NOTE The shellcode is linear, which means there are no conditions or jumps.
By setting the secret value, we finally only select where we start in the shellcode.
The next step is to count all the instructions for each entry point, then sort all the values. The entry with the most of instructions will logically be the plain shellcode.
Some adjustments will be needed:
def supervisor(jitter):
global global_count
RCXValue = jitter.cpu.RCX
blk = mdis.dis_block(RCXValue)
if(RCXValue == blk.lines[0].offset):
global_count+=1
jitter.running = True
return True
# [...]
def end_shellcode(jitter):
global global_count
entry_count[global_secret] = global_count
print(global_secret,":",entry_count[global_secret])
jitter.running = False
return False
# [...]
for addrr, secret in my_z3_script.instr2secret.items():
global_count = 0
global_secret = secret
entry2 = secret
myjit.cpu.RSI = entry2
myjit.cpu.RDX = entry2
try:
myjit.run(0x401200)
except JitterException as e:
print("Error for: ",hex(addrr), f"({secret})")
myjit.running = False
init_jit()
print(dict(sorted(entry_count.items(), key=lambda item: item[1]))) # sorted result
As exepted, after 5 minutes I finally have the sorted result.
From:
2752575725
: The secret value to get directly theRET
instruction, size 1.
To:
972893146
: The secret value to directly go at the begin of the shellcode, size 511.
Let’s go back to our code from earlier and dump the shellcode from the secret 972893146
, save the shellcode, and then analyze it. I use ghidra by manually selecting the language x86 64 bits (because there’s no metadata).
The shellcode is pretty simple.
The red-framed part is the same for every byte of the flag (stored in RDI
).
I have chosen to parse using z3 (again), symbolic execution for later :(
Using gdb I’ll get dynamically the state of the registers (at 0x0000001e
).
RBX = 0
R9 = 0
R10 = 0x10000
R12 = 0x33
Because I’m lazy and I already have the tool in hand, I’ll modify the supervisor function to get the list of constants xored to BX and the RSI value (which is the seed for the next instruction).
I’m sure it will work, and it relieves me of potential calculation errors.
def supervisor(jitter):
global shellcode
global global_count
global RSI_array
global BX_array
RCXValue = jitter.cpu.RCX
jitter.running = True
blk = mdis.dis_block(RCXValue)
if(RCXValue == blk.lines[0].offset):
if blk.lines[0].name =="XOR" and blk.lines[0].args[0] == ExprId('RBX', 64) and blk.lines[0].args[1] == ExprId('RSI', 64):
RSI_array.append(jitter.cpu.RSI)
if blk.lines[0].name =="XOR" and blk.lines[0].args[0] == ExprId('BX', 16) and type(blk.lines[0].args[1]) == ExprInt:
BX_array.append(int(blk.lines[0].args[1]))
print(blk.lines[0], hex(blk.lines[0].offset))
return True
We now have everything in hand to solve this challenge.
import z3
RSI_val = [299, 2, 509, 140, 351, 358, 465, 272, 467, 10, 485, 212, 135, 494, 57, 472, 379, 274, 205, 28, 175, 374, 417, 416, 35, 282, 181, 100, 471, 510, 9, 104, 459, 34, 413, 428, 511, 390, 369, 48, 115, 42, 389, 500, 295, 14, 473, 248, 27, 306, 109, 316, 335, 406, 321, 192, 195, 314, 85, 388, 119, 30, 425, 392, 107, 66, 317, 204, 159, 422]
BX_val = [18285, 17217, 21166, 17359, 31268, 24583, 26551, 13090, 13281, 25449, 26499, 25776, 25827, 13786, 13839, 25530, 26397, 25201, 14581, 12589, 12447, 13379, 25282, 25538, 12817, 14626, 26064, 12372, 13795, 14535, 14385, 14416, 26541, 14619, 12717, 13720, 12751, 13746, 14408, 25684, 25623, 13852, 12725, 25494, 25666, 12607, 13291, 13516, 13614, 13829, 12636, 14596, 26409, 26608, 25636, 26278, 13814, 25432, 14700, 25574, 25109, 26232, 14480, 25578, 13662, 12402, 14340, 25768, 13738, 31963]
s = z3.Solver()
flag = [z3.BitVec(f"flag({i})", 8) for i in range(70)]
RBX = 0
R9 = 0
R10 = 0x10000
R12 = 0x33
for i in range(0, 0x46):
RBX = ((flag[i]<<8)|flag[i])
RBX = RBX^BX_val[i]
RBX = RBX^RSI_val[i]
R9 = R9|RBX
R9 = (R9+R10) & 0xffffffffffffffff
RAX = R9
s.add(RAX==0x460000)
if(s.check() == z3.sat):
model = s.model()
flag = "".join([chr(model[flag[i]].as_long()) for i in range(70)])
print(flag)
else:
print(":(")
FLAG : FCSC{af22cfdd46bfc8105cb28e04988f904049dd60be1245718ffef5b9bbf9b509d5}
Conclusion
Nice challenge. You need a little organization for this kind of challenge. There are a lot of steps, but it remains a very nice puzzle type.