Manthan

Campaign setup, instrumented builds, crash analysis and triage workflow for memory corruption discovery using AFL++ and ASAN.

The fuzzing workflow here grew out of research on file-parsing libraries — targets like llama.cpp, libdicom and libbiosig that ingest untrusted binary formats and have historically received less scrutiny than network-facing code. AFL++ with AddressSanitizer is the standard toolchain for this: ASAN surfaces the exact bug class on first reproduction, which collapses the triage step considerably. Most of the campaigns I run are against C/C++ parsers for niche formats where seed corpora are sparse and custom dictionaries matter.

Overview

Fuzzing is the primary methodology here for discovering memory corruption vulnerabilities in C/C++ targets. AFL++ drives coverage-guided mutation, and AddressSanitizer (ASAN) turns otherwise silent memory corruption into an immediate, labeled crash.

Target selection

Targets organized by file format and parser library:

CDF: libgsf
DICOM: libdicom, Grassroots DICOM (GDCM)
EGI/VHDR: libbiosig
FLAC: miniaudio
GGUF: llama.cpp
JP2K: NVIDIA nvJPEG2000
Node: libigl
OTF/TTF/PDF: Adobe Acrobat Reader, Foxit Reader, xpdf

Campaign setup

mkdir -pv ~/campaigns/<target>/{source,input,output}

Building with instrumentation

export LLVM_CONFIG="llvm-config-13"
export CC=afl-clang-fast
export CXX=afl-clang-fast++
export AFL_USE_ASAN=1

./configure --prefix=<campaign-dir> --disable-shared
make clean && make && make install

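The --disable-shared flag builds static libraries only, so the instrumented library code is linked directly into the target binary instead of being loaded at runtime from a shared object that may not be instrumented.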

Running a campaign

afl-fuzz -i input/ -o output/ -- ./bin/target_binary @@
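A single afl-fuzz instance keeps one core busy. The following is a minimal sketch of fanning the same campaign out across several cores using the -M (main) and -S (secondary) options, assuming the input/ and output/ layout from the campaign setup. The function name print_parallel_cmds and the instance names main and secondaryN are hypothetical, and the function only prints the commands so they can be reviewed and started in separate terminals or tmux panes.

```shell
# Print one afl-fuzz command per instance: one main (-M) and count-1 secondaries (-S).
# All instances share the same -o directory so they can sync queue entries.
# $1 = path to the instrumented target, $2 = total number of instances.
print_parallel_cmds() {
  target="$1"
  count="$2"
  echo "afl-fuzz -i input/ -o output/ -M main -- $target @@"
  i=1
  while [ "$i" -lt "$count" ]; do
    echo "afl-fuzz -i input/ -o output/ -S secondary$i -- $target @@"
    i=$((i + 1))
  done
}
```

For example, print_parallel_cmds ./bin/target_binary 4 prints four commands, one per core.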

ASAN configuration

Key environment variables for crash analysis:

export ASAN_OPTIONS="halt_on_error=1:print_stack_trace=1:detect_leaks=0"
export ASAN_OPTIONS="$ASAN_OPTIONS:detect_stack_use_after_return=1"
export ASAN_OPTIONS="$ASAN_OPTIONS:strict_string_checks=1"

Crash analysis workflow

Reproduce: run the crashing input against the ASAN-instrumented binary
Classify: ASAN report identifies the bug class (heap-buffer-overflow, use-after-free, stack-overflow, etc.)
GDB inspection: bt, info registers, examine memory at crash site
Minimize: reduce the PoC to the smallest triggering input
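The classify step can be scripted by reading the bug class off the SUMMARY line that ASAN prints at the end of a report. This is a small sketch, not a complete parser: the function name asan_bug_class is hypothetical, and it assumes the report arrives on stdin with the usual "SUMMARY: AddressSanitizer: <class> ..." line.

```shell
# Read an ASAN report on stdin and print the bug class from its SUMMARY line,
# e.g. heap-buffer-overflow. Prints nothing if no ASAN summary is present.
asan_bug_class() {
  sed -n 's/^SUMMARY: AddressSanitizer: \([^ ]*\).*/\1/p' | head -n 1
}
```

Typical use is ./target crash_input 2>&1 | asan_bug_class, since ASAN writes its report to stderr.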

PoC minimization

afl-tmin -i crash_input -o minimized_input -- ./target @@

Batch reproducibility check:

for f in output/crashes/id:*; do
  echo "=== $f ==="
  timeout 5 ./target "$f" 2>&1 | head -5
done
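The loop above prints the head of every report, which gets noisy once a campaign has produced hundreds of crash files. One rough way to collapse duplicates is to bucket crashes by their ASAN SUMMARY line. The sketch below assumes symbolized ASAN output; the function names crash_signature and bucket_crashes are hypothetical, and a shared SUMMARY line is only a hint of a shared root cause, not proof.

```shell
# Print the ASAN SUMMARY line of a report read from stdin,
# or "no-asan-report" if the run produced none.
crash_signature() {
  sig="$(grep '^SUMMARY: AddressSanitizer:' | head -n 1)"
  echo "${sig:-no-asan-report}"
}

# Replay every crash file and count files per signature, most frequent first.
# $1 = ASAN-instrumented target binary, $2 = crashes directory.
bucket_crashes() {
  for f in "$2"/id:*; do
    [ -e "$f" ] || continue
    timeout 5 "$1" "$f" 2>&1 | crash_signature
  done | sort | uniq -c | sort -rn
}
```

Running bucket_crashes ./target output/crashes gives one line per distinct signature with a count in front, which is a more useful starting point for triage than the raw crash count.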

Security impact triage

Prioritize crashes by exploitability:

Control flow hijacking — overwritten function pointers, vtable corruption, RIP/PC control
Write primitive — arbitrary or bounded write to attacker-controlled address
Information leak — out-of-bounds read exposing heap/stack content
Denial of service — null deref, assertion failure, infinite loop
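A first-pass sort into these buckets can be automated from the ASAN report text. The sketch below is a heuristic of my own choosing, not an exploitability verdict: the function name triage_bucket and the bucket labels are hypothetical, and it keys only on the "WRITE of size", "READ of size" and zero-page hint strings that ASAN prints. Control flow hijacking is deliberately not detected here, since confirming it needs manual inspection in GDB.

```shell
# Read an ASAN report on stdin and print a candidate triage bucket.
# The mapping is a rough heuristic; every result still needs manual confirmation.
triage_bucket() {
  report="$(cat)"
  case "$report" in
    *"WRITE of size"*)                   echo "write-primitive" ;;
    *"READ of size"*)                    echo "information-leak" ;;
    *"address points to the zero page"*) echo "denial-of-service" ;;
    *)                                   echo "manual-review" ;;
  esac
}
```

Typical use is ./target crash_input 2>&1 | triage_bucket. Anything landing in write-primitive or manual-review is worth opening in GDB first.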

Custom mutations

AFL++ dictionaries improve coverage for structured formats. Example for PNG:

# png.dict
header_png="\x89PNG\x0d\x0a\x1a\x0a"
chunk_IHDR="IHDR"
chunk_IDAT="IDAT"
chunk_IEND="IEND"
chunk_PLTE="PLTE"

Use with: afl-fuzz -x png.dict ...
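For niche formats without a ready-made dictionary, entries can be generated from known magic values and chunk tags. AFL++ dictionary strings accept \xNN escapes, so the sketch below hex-escapes every byte rather than guessing which characters are safe to leave as-is. The function name dict_entry is hypothetical; it reads the raw token bytes on stdin and takes the entry name as its first argument.

```shell
# Read raw token bytes on stdin and print an AFL++ dictionary line
# of the form name="\xNN\xNN...". $1 = entry name.
dict_entry() {
  hex="$(od -An -v -tx1 | tr -d ' \n' | sed 's/../\\x&/g')"
  printf '%s="%s"\n' "$1" "$hex"
}
```

For example, printf 'IHDR' | dict_entry chunk_IHDR >> png.dict appends one fully escaped entry to the dictionary file.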

AFL++ installation

sudo apt install build-essential git python3-dev automake flex bison \
  libglib2.0-dev libpixman-1-dev python3-setuptools clang
git clone https://github.com/AFLplusplus/AFLplusplus
cd AFLplusplus && make distrib && sudo make install

What to watch during a run

Campaigns against format parsers tend to surface heap-buffer-overflows and out-of-bounds reads within the first few hours, then slow down as coverage plateaus. When afl-whatsup shows no new paths for a prolonged period, check whether the seed corpus covers the format's variants: a missing magic byte or an unseen chunk type is often the reason coverage stalls. Crash count alone is not triage. A single root cause frequently generates hundreds of distinct crash inputs, so minimize and deduplicate before reporting.
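Stall detection can also be scripted against the fuzzer_stats file that each instance writes into its output directory. The sketch below assumes the file contains a last_find field (last_path in older releases) holding a Unix timestamp in seconds; the function name secs_since_last_find is hypothetical, and the field names should be checked against the installed AFL++ version.

```shell
# Print the number of seconds since the fuzzer last found a new path.
# $1 = path to fuzzer_stats, $2 = current Unix time (defaults to now).
# Returns non-zero if neither last_find nor last_path is present.
secs_since_last_find() {
  stats="$1"
  now="${2:-$(date +%s)}"
  last="$(awk -F: '$1 ~ /^last_(find|path)[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2; exit }' "$stats")"
  [ -n "$last" ] || return 1
  echo $((now - last))
}
```

A value of several hours on every instance is a reasonable cue to revisit the seed corpus or dictionary. A timestamp of 0, meaning no finds yet, shows up as a very large number.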
