# sanitize (282pt)

`Reversing`

Description: Help me to find lost flag.

Files: sanitize.tar.gz

Solution:

The program takes some user input and performs some operations on the input and the flag file (read from disk). Then it prints a long hex string which turns out to be branch counters (eg. how many times execution reached a certain point in the program). The counter code was generated with Sanitizer Coverage.

The goal is to recover the flag by using the leaked information about program execution.

After some initial reversing, we discover that the user input is an ascii string followed by a number `k` greater than two, followed by `k` more mutually exclusive numbers (which will correspond to indexes in the flag). For example:

``````[ascii string]
[k > 2]
[x(0)]
[x(1)]
...
[x(k-1)]
``````

Next the binary initializes some sort of global struct `glob`. Then it iterates over our ascii string and the characters we specified in our flag string and updates the global struct in the following way: `glob = mystery(glob, character)`.

For instance if we give the input:

``````abc
3
0
1
2
``````

the program will do:

``````glob = init()
glob = mystery(glob, "a")
glob = mystery(glob, "b")
glob = mystery(glob, "c")
glob = mystery(glob, "f")
glob = mystery(glob, "l")
glob = mystery(glob, "a")
``````

Ideally we would like to find some place in `mystery` where a branch occurs that is dependent on the character values (since that will enable us to obtain different counter signatures for different flag values).

Reversing the `mystery` function took awhile since it does a lot of struct manipulation. There are essentially two struct types:

``````// size: 0x20
struct data_struct {
char val; // set to each character ascii value
int counter; // + 0x4
long left; // + 0x8
long right; // + 0x10
struct data_struct *other; // + 0x18
};

// size: 0x8
struct ref_struct {
struct data_struct *element;
};
``````

`glob` is set to an empty `ref_struct` at the start. When we call `mystery` it creates a new `data_struct` that is empty except for the `val` parameter which is set to the ascii value of the character. Then it creates a new `ref_struct` that contains a pointer to this `data_struct`. Finally it calls a new function `process_node` with the two `ref_struct`’s as arguments. Here is pseudocode that shows the execution of that function:

``````// func: 0x400d70
struct ref_struct *check_vals(struct ref_struct *ref_glob, struct ref_struct *ref_char) {

struct ref_struct *new_ref = (struct ref_struct *)malloc(8);

// compare dword at offset 0x4
if (ref_glob->element->counter < ref_char->element->counter) {
new_ref->element = ref_glob->element;
ref_glob->element->other = 0;

new_ref->element->other = ref_char->element;
ref_char->element->other = 0;
} else {
new_ref->element = ref_char->element;
ref_char->element->other = 0;

new_ref->element->other = ref_glob->element;
ref_glob->element->other = 0;
}

return new_ref;
}

void process_node(struct ref_struct *ref_glob, struct ref_struct *ref_char) {
struct ref_struct *new_ref = check_vals(ref_glob, ref_char);

// check_vals will order by lowest first
// [lowest val][highest val]

if (new_ref != 0) {
// 0x4011f4
struct data_struct *temp = 0;
struct data_struct *first_elem = new_ref->element;
struct data_struct *second_elem = first_elem->other;

while (1) {
if (second_elem == 0) {
// 0x40145e
return new_ref;
} else {
// 0x40121e
if (first_elem->counter == second_elem->counter) {
// 0x40124d
if (second_elem->other != 0 && second_elem->other->counter == first_elem->counter) {
// 0x4012c0
temp = first_elem;
first_elem = second_elem;
second_elem = second_elem->other;

continue;
}

// 0x40125c
if (first_elem->val <= second_elem->val) {
first_elem->other = second_elem->other;

if (first_elem != second_elem) {
second_elem->left = first_elem;
second_elem->other = first_elem->right;
first_elem->right = second_elem;

first_elem->counter += 1;
}
} else {
if (temp == 0) {
new_ref = second_elem;
} else {
temp->other = second_elem;
}

if (first_elem != second_elem) {
first_elem->left = second_elem;
first_elem->other = second_elem->right;
second_elem->right = first_elem;

second_elem->counter += 1;
}

first_elem = second_elem;
}

second_elem = first_elem->other;
}
}
}
}
}
``````

Essentially, this method builds some sort of data structure by adding our new character node to the current tree (`glob` is therefore the head reference to this structure). I wasn’t able to figure out exactly what kind of structure this is, but we can dynamically examine how the structure grows in order to figure out how to leak flag data. Here is the structure at each step while processing the input characters: `a b c l a g`:

`left` and `right` pointers are represented with left and right arrows, `other` is represented with a down arrow, the `val` and `counter` attributes are written in the boxes.

One useful observation is that every other iteration (starting at the first iteration), the new tree is simply the same tree with the head replaced by our new character (which now has the `other` attribute set to the old head). This is because the counter variables between the two nodes are not equal so we skip most of the tree generation code.

The iteration after this, the new character is prepended to the head again, but now since both counter values will be zero, we enter the main body of the function and specifically reach the branch: `if (first_elem->val <= second_elem->val)`. Specifically we are comparing the current character to the previous character (which became the head of the tree).

Depending on the outcome of this comparison, the tree will adjust in some way and it will cause the counter signatures to be different (depending on the path that was taken).

So for instance if we provide the input `AAf` and select index `10` of the flag, the binary will reach a comparison between `f` and `flag[10]`. If `flag[10]` is greater than `f` we will get some counter signature, otherwise we will get a different counter signature.

One caveat is that we need to select at least three flag indexes. I found that choosing indexes `4` and `2` corresponding to characters `{` and `a` of the known flag prefix `flag{` resulted in behavior that did not depend on the previous comparison. Therefore, our exploit takes the form:

``````AA[test char]
3
[test index]
4
2
``````

As we iterate our test char over the ascii values, we will reach a character that causes the counter signature to change and we know that the previous character was equal to the flag character at that index.

Now we simply go character by character and leak the flag.