Walking Simulator (474pt)
Description: You have given a three graphs. And three final permutations of partial flag permutations. (A i-th perm is modified by a i-th binary) Can you retrieve the flag?
TLDR: black box the fuck out of it
We are given three binaries:
part_3 and a python script:
Essentially, you run
public.py with a flag and it is chunked into 16-byte blocks each of which are turned into a permutation of 37 numbers, for example:
36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 4 6 0 1 3 2 5
When you run the binary, you provide this initial permutation and then need to make choices:
$ ./part_1 FLAG PERMUTATION: 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 4 6 0 1 3 2 5 Ok, let's start! Starting a step. You are in middle node. Choice: 4 terminate called after throwing an instance of 'std::invalid_argument' what(): Incorrect path! Aborted
Most of the time, this will fail with the message like the one above. With some values however, you will get another prompt:
$ ./part_1 FLAG PERMUTATION: 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 4 6 0 1 3 2 5 Ok, let's start! Starting a step. You are in middle node. Choice: 9 Starting a step. You are in middle node. Choice:
Finally, we are provided with three files:
Good job! 11 23 29 10 19 30 27 5 22 25 35 26 12 18 13 17 8 31 9 15 3 36 33 32 14 24 6 20 28 2 1 34 7 0 4 21 16
This file supposedly contains output from running
part_1 on the first flag chunk.
So now we understand the dataflow is
flag --> permuation --> new permuation.
We are provided with
new permutation and we need to reverse the previous two steps to obtain the flag.
Part One: factoradic
I started with the initial python script first since python is generally more comfortable to reverse.
The script is below (restructured a bit and with some comments):
from functools import lru_cache from Crypto.Util.number import bytes_to_long @lru_cache(1024) def factorial(n): return 1 if n == 0 else n * factorial(n-1) def int_to_perm(v): assert v < factorial(PERM_SIZE) tmp =  for x in range(PERM_SIZE-1, -1, -1): tmp += [ v // factorial(x) ] v = v % factorial(x) tmp = tmp[::-1] ret =  for i in range(PERM_SIZE): p = tmp[i] ret = ret[:p] + [i] + ret[p:] return ' '.join(map(str, ret)) PERM_SIZE = 37 def flag_to_perms(flag): assert len(flag) % 16 == 0, "Incorrect flag len" # chunk flag into 16 byte parts parts = [ flag[i*16:(i+1)*16].encode() for i in range(len(flag)//16) ] # convert to integer VALS = map(bytes_to_long, parts) # generate permutations return list(map(int_to_perm, VALS)) flag = input() perms = flag_to_perms(flag) print(*perms, sep='\n') for i in range(len(flag)//16): with open("flag_perm_" + str(i+1) + ".in", "w") as f: f.write(perms[i])
Essentially, the script interprets each 16-byte chunk as an integer and calls
The first loop looks similar to decoding a number in a specific base, except instead of dividing by
base^i, we are dividing by
Interestingly, this is a valid number system: https://en.wikipedia.org/wiki/Factorial_number_system
The second loop constructs a permuation of
range(37) using this representation.
However, since numbers are always added in order, the index of
36 tells us
tmp and after removing this number, the index of
35 tells us
tmp and so on…
I implemented this with the following function:
def perm_to_int(p): # input is a string permutation: '36 35 34 ... 3 2 5' ret = [int(x) for x in p.split(' ')] tmp =  for i in range(PERM_SIZE-1, -1, -1): idx = ret.index(i) tmp.append(idx) ret = ret[:idx] + ret[idx+1:] # tmp represents the integer in factoradic tmp = tmp[::-1] f = 0 for i in range(PERM_SIZE): f += tmp[i] * factorial(i) return f
Ok, so now we understand the first part of the dataflow and we need to reverse the binaries.
Part Two: The binaries
All three binaries are very similar with slightly different data.
When we open
part_1, we are confronted with familiar c++ mangled function names.
Main looks like this:
This code is responsible for printing
FLAG PERMUTATION: and then reading the numbers into a global array.
Then it calls
do_shit with a global paramater (in
part_1 this is
That function looks like:
I started looking into these functions and was confronted with an overwhelming amount of nested comparisons and nested function calls:
Whenever you see this, you know you’re in for a treat:
At this point it was clear that reversing the entire binary would be impossible within the time constraint. Instead we need to find a clever way to solve the challenge.
Recently with RE challenges, I’ve been using a technique where I write down clear questions about the program on paper and try to answer them. This helps me stay focused on one train of thought rather than getting lost jumping around the binary. It also helps ensure I’m always making forward progress and not just spinning my wheels.
I’ll try to replay my though process here:
1. What is the goal state?
This is always a good place to start for reversing challenges. We know that our output contains the string “Good job!” so let’s find it.
This string is referenced in a single function in the binary:
Here we also see code that prints out
input_array which was where our original permutation was stored.
We can examine the cross references to this goal state and it is found only once here:
This function, respectively, is called here:
This function happens to be
print_node_type_scramble from our original
Additionally, here we see that
rdi (the only input) must be
2. Does our original input actually change or is this “choice” stuff just obfuscation?
Since the goal state prints from the same buffer as our original input, I was curious if it is ever modified.
I set a breakpoint right before the call to
print_node_type_scramble and iterated choices until I found that
9 was valid.
After examining memory, our original permutation has been shuffled.
To test this, I wrote a python script that would bruteforce choices and I discovered that several choices were valid for the initial state such as: 9,13,23,30…
3. Do the valid choices depend on our input permutation?
With my bruteforcing script, I tested several input permutations and discovered that the set of valid choices was the same regardless of what input permutation you provided.
This simplifies our understanding of the binary since we know the “correct path” must be hardcoded into the binary in one form or another.
4. Do the valid choices depend on the history or just the current state?
[rbp+0] acts as a state marker that updates at each loop.
I was curious if this was the only parameter to determine what choices were valid.
To test this, I used the fact that
9, 24 were two valid moves (found by bruteforcing).
Specifically, our initial state is
9 transitions us to
In the initial state
24 throws an error and
9 is valid.
However, if we dynamically patch
0x36 in the first iteration, we would expect
24 to be a valid move and
9 to be invalid.
Luckily, this is the observed behavior.
Therefore, the binary behaves like a DFA where each state transition has the side effect of permuting the flag.
Our goal is to find a path from the start state (
0x23) to the end state (
5. Can we dump the transition table?
Let’s go back to the main loop and anaylize it at a higher level with our new understanding:
It’s basically doing this:
state = 0x23 while True: # print_node_type_scramble if state == 0x3: print('Good job!') return choice = int(raw_input()) next_state = get_new_val(fuck_me(state, choice)) # this might throw an error internally if check_invalid_path(state, next_state): state = next_state else: exit(-1)
I tried several different methods to interface these methods in a black-box format and eventually settled on using a python script communication to a gdb process.
In gdb you can define function pointers and then actually emulate function calls like:
pwndbg> set $fuck = (int (*)(int, int))0x555555556380 pwndbg> set $get_val = (int (*)(int))0x5555556EEB00 pwndbg> set $check = (int (*)(int, int))0x555555555C00 pwndbg> p $fuck(0x23, 0x9) $1 = 158 pwndbg> p $get_val(158) $2 = 54 pwndbg> p $check(0x23, 54) $3 = 1
Using this technique I wrote a script to dump every
(state, choice, next) tuple where the validation check
$check(state, next) passes.
6. Final steps
With a transition table you can do a DFS from the end state and find a valid path.
Using the discovered sequence of choices, we can uncover the permutation mapping.
I opted to send a sequence of all
0's and a single
1 in a specific spot.
Then by recording where the
1 ends up, you can decode the mapping.
Luckily for us, the final permutation mapping also doesn’t depend on the original input (which is something I didn’t explicitly test before this stage).
Now, using this mapping, it is simply a matter of reversing the flag output and decoding it as bytes.
The other two binaries were very similar and I just needed to adjust breakpoints and address locations to use the same solution scripts.