Writeup by sin-infosec for Hashy Parmentier

crypto symmetric

November 22, 2023

Table of contents

TLDR

Find collisions in a homemade hash function.

Recon

The goal is clear, we need to send 64 messages of increasing length to the server, and have only a maximum of 6 distincts corresponding hash values for these messages. Let’s have a look at the hash function:

class Hash:
	def __init__(self):
		self.h = b"\x00" * 4

	def update(self, data):
		assert len(data) % 4 == 0 # TODO
		for i in range(0, len(data), 4):
			block = data[i:i+4] * 4
			h = self.h * 4
			self.h = strxor(strxor(h, block), AES.new(h, AES.MODE_ECB).encrypt(block))[:4]
		return self

	def digest(self):
		return self.h

So our input is divided into 4-byte blocks and processed sequentially. The hash is initialized to four null bytes and then the process is the following for all the input blocks one by one:

  • Our block of data $D$ of 4 bytes is extended to 16 bytes to get $B = D||D||D||D$

  • The 4-byte hash $H$ is extended to 16 bytes to get $K = H||H||H||H$

  • $C = AES_{enc,K}(B)$ is computed using ECB mode.

  • The new $H$ is the first four bytes of $(K \oplus B \oplus C)$

The attack

So let’s assume we find a block of data $D$ so that for a certain $K$, we have $B_{0-4} = C_{0-4}$ where $X_{0-4}$ represents the first four bytes of $X$. Then at the end of each iteration, we would have $H=K_{0-4} = H $, so any input data of any number of this repeating block would result in the same hash. Now what’s the probability to get $B_{0-4} = C_{0-4}$:

Assuming that AES is considered to give a uniformely distributed output undistinguishable from a truly random output, which is widely accepted (otherwise you may have built an AES dinstinguisher), we can define the probability for a given input $B$ independently of the key to get :

$$P[B_{0-4}=C_{0-4}] =\frac{1}{2^{8}} \times \frac{1}{2^{8}} \times \frac{1}{2^{8}} \times \frac{1}{2^{8}} = \frac{1}{2^{32}} $$

We have $2^{32}$ different possible input blocks, so what is the probability of at least a success with probability $p$ after $n$ trials? it is equivalent to $(1 - P[\text{only fails}])$, where the probability of a fail is:

$$P[B_{0-4}\neq C_{0-4}] = 1 - \frac{1}{2^{32}}$$

As the trials are independant, we get :

$$ P[\text{at least once }B_{0-4}=C_{0-4}\text{ in }2^{32} \text{ trials}] =1 - \left(1 - \frac{1}{2^{32}}\right)^{2^{32}} \approx 0.6321$$

Roughly 63% chances to succeed, that’s more than enough! To efficiently bruteforce $2^{32}$ AES encryptions, I quickly wrote a dirty bruteforcer in Go (disclaimer: ugly code incoming):

package main

import (
   "crypto/aes"
   "encoding/hex"
   "fmt"
   "bytes"
  )

func main() {

   var key = []byte{<insert_key_bytes_here>}
   cipher, _ := aes.NewCipher(key)
   var plaintext = []byte{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}
   for a := 0; a < 256; a++ {
      plaintext[0] = byte(a)
      plaintext[4] = byte(a)
      plaintext[8] = byte(a)
      plaintext[12] = byte(a)
      for b := 0; b < 256; b++ {
         plaintext[1] = byte(b)
         plaintext[5] = byte(b)
         plaintext[9] = byte(b)
         plaintext[13] = byte(b)
         for c := 0; c < 256; c++ {
            plaintext[2] = byte(c)
            plaintext[6] = byte(c)
            plaintext[10] = byte(c)
            plaintext[14] = byte(c)
            for d := 0; d < 256; d++ {
               plaintext[3] = byte(d)
               plaintext[7] = byte(d)
               plaintext[11] = byte(d)
               plaintext[15] = byte(d)
               out := make([]byte, len(plaintext))
               cipher.Encrypt(out, []byte(plaintext))
               if  bytes.Compare(out[:4], plaintext[:4]) == 0 {

                  fmt.Printf("%s\n",hex.EncodeToString(plaintext))
               }
            }
         }
      }
   }
}

It runs the $2^{32}$ encryptions in around 3mins whereas python was doing it in 2-3 hours. Now the first key will be 16 null bytes, let’s see if we can find an input that satisfies our conditions with this key:

$ time go run brute.go

real	2m57.667s

No results… well 63% is not 100%! Let’s try again with another key by first feeding in the block of data you like, and then trying with the new $K = H||H||H||H$ (here I randomly took '\x02'*4)

$ time go run brute.go
c9387966c9387966c9387966c9387966

real	3m1.215s

Bingo we get a success! So now by sending first '02020202' to the remote service, I get one hash value in the Set. Then by sending '02020202' + k * 'c9387966' for any positive integer k the hash will remain the same. All we have left to do is implementing the attack:

from pwn import *

o = remote("challenges1.france-cybersecurity-challenge.fr", 6001)

values = ["02020202", "c9387966"]
N = 64
start = values[0]

for i in range(4, 4 * (N + 1), 4):
	o.recvuntil(b": ")
	o.sendline(start)
	start += values[1]

print(o.recvline())
print(o.recvline())

o.close()

I am pretty sure that there was a more intended solution with less bruteforce because here we manage to get a set of length 1 when the limit is 6, but this was just efficient enough :)

$in