Exploit Dev — The Machine

Before you can write a single exploit you have to see the machine the way the CPU does: registers, the instruction pointer you will eventually hijack, how process memory is laid out, why the stack grows down, just enough assembly to read disassembly, the System V calling convention, and little-endian byte order. We build that mental model, preview exactly where a buffer overflow takes control, and set up the lab (GDB+GEF, pwntools, checksec).

See the machine the way the CPU does — the foundation every exploit is built on

Why you start with the machine

Exploitation isn't magic and it isn't memorising payloads — it's understanding how a program actually runs on the hardware, then abusing that. A buffer overflow only makes sense once you know what a "return address" is and why it sits on the stack next to your buffer. ROP only makes sense once you understand call and ret. So before a single exploit, we build the mental model: how the CPU executes code, how memory is laid out, and just enough assembly to read what's happening.

This is Part 1 of a 14-part Exploit Development series. We go fundamentals → advanced, one idea at a time, so that by the end you can find a bug, control execution, and defeat real-world mitigations — not copy a script. This part is the foundation everything else stands on.

Scope: Linux x86-64 (the most common target; Windows comes in Part 14). If you want a pure-assembly primer first, see x86 Assembly Basics — here we cover only the assembly an exploit developer needs, through the exploitation lens.

Registers — the CPU's working memory

The CPU does its work in a handful of tiny, ultra-fast storage slots called registers. You'll be reading and corrupting these constantly, so learn them now. On x86-64 each one is 64 bits wide and its name starts with R (RAX, RSP); the low 32 bits of each are also reachable under the legacy x86 names that start with E (EAX, ESP).


Register	What it is / does	x86 / sub-registers
`RAX`	Accumulator — function return value lands here	EAX (32) / AX / AL
`RDI, RSI, RDX, RCX, R8, R9`	The first six integer/pointer function arguments (System V; see "Calling conventions" below)	EDI, ESI, EDX, ECX, R8D, R9D
`RBP`	Base / frame pointer — anchors the current stack frame	EBP
`RSP`	Stack pointer — always points at the top of the stack	ESP
`RIP`	Instruction pointer — the address of the next instruction to run	EIP
`RFLAGS`	Status flags (zero, carry, sign…) — drive conditional jumps	EFLAGS
`R8–R15`	General-purpose extras (x86-64 added these)	—
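To make the sub-register column concrete, here is a small Python sketch that models how the narrower names are just views onto the low bits of the same 64-bit register. It is an illustration only: `sub_registers` is a made-up helper, and the value loaded into RAX is arbitrary.

```python
# Illustrative only: model how the narrower names alias the low bits of RAX.
def sub_registers(rax: int) -> dict:
    rax &= 0xFFFFFFFFFFFFFFFF          # a register holds exactly 64 bits
    return {
        "RAX": rax,
        "EAX": rax & 0xFFFFFFFF,       # low 32 bits
        "AX":  rax & 0xFFFF,           # low 16 bits
        "AL":  rax & 0xFF,             # low 8 bits
    }

views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name:>3} = {value:#x}")
```

The same pattern holds for the other legacy registers (RBX/EBX/BX/BL and so on), which is why a disassembly that writes to `eax` is still touching the register you know as `RAX`.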

🔑 Key idea: RIP is the crown jewel. Whoever controls RIP controls what the CPU executes next — that is, controls the program. Every memory-corruption exploit is ultimately a fight to put an address of your choosing into RIP. Keep one eye on it the whole series.

How code runs — fetch, decode, execute

How does a program "run"? The CPU repeats one loop forever: fetch the instruction at the address in RIP, decode it, execute it, and advance RIP to the next instruction — unless the instruction itself changes RIP (a jump, call, or return).

🔑 Key idea: "Control flow" is just the sequence of values RIP takes. Normally instructions run in order and RIP walks forward. A jmp/call/ret sets RIP to somewhere else — that's a legitimate control-flow change. An exploit is an illegitimate control-flow change: you trick a ret (or a corrupted pointer) into setting RIP to your address.
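If the loop feels abstract, a toy version may help. The sketch below is not x86: the instruction names, the one-slot-per-instruction addressing, and the `run` helper are all invented for illustration. What it does show faithfully is the idea above: RIP advances by default, and a jump is nothing more than a write to RIP.

```python
# Toy machine, not real x86: each "instruction" is a tuple stored at an address.
program = {
    0x1000: ("set", "rax", 1),
    0x1001: ("jmp", 0x1003),        # changes RIP instead of falling through
    0x1002: ("set", "rax", 99),     # skipped by the jump
    0x1003: ("add", "rax", 41),
    0x1004: ("halt",),
}

def run(program, entry):
    regs = {"rip": entry, "rax": 0}
    trace = []                               # every value RIP takes = the control flow
    while True:
        trace.append(regs["rip"])
        op, *args = program[regs["rip"]]     # fetch + decode
        regs["rip"] += 1                     # default: advance to the next instruction
        if op == "set":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] += args[1]
        elif op == "jmp":
            regs["rip"] = args[0]            # a control-flow change is a write to RIP
        elif op == "halt":
            return regs, trace

regs, trace = run(program, 0x1000)
print(hex(regs["rax"]), [hex(a) for a in trace])
```

The trace never contains `0x1002`. An exploit aims for the same effect, except the value written into RIP comes from attacker-controlled data rather than from the program's own code.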

Process memory layout

When the OS loads a program it carves its virtual address space into regions. You need this map because each bug lives in a specific region — a stack overflow is in the stack, a heap bug in the heap.


Region	What lives there	Position
.text	The machine code (the instructions). Read-only + executable.	low addresses
.rodata	Read-only constants (string literals, etc.)	↑
.data / .bss	Global/static variables (initialised / zero-initialised)	↑
Heap	Dynamic memory (`malloc`) — grows UP toward higher addresses	middle
Libraries (mmap)	libc and other shared objects mapped in	↑
Stack	Function frames: locals, saved registers, return addresses — grows DOWN	high addresses

The two that matter most for control-flow hijacking: the stack (where return addresses live — Parts 2–11) and libc (full of useful code we'll return into — Part 8). Note the heap grows up and the stack grows down, toward each other.
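On Linux you can see this map for a live process in `/proc/<pid>/maps`. The sketch below parses a small excerpt in that format and answers "which region does this address belong to?". The excerpt is hypothetical: the addresses, inode numbers, and paths are made up for illustration, and `parse_maps` / `region_of` are helper names invented here.

```python
# Hypothetical, simplified excerpt in /proc/<pid>/maps format; all values are made up.
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 1234 /home/user/vuln
00601000-00602000 rw-p 00001000 08:01 1234 /home/user/vuln
00602000-00623000 rw-p 00000000 00:00 0 [heap]
7ffff7dd0000-7ffff7f65000 r-xp 00000000 08:01 5678 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""

def parse_maps(text):
    regions = []
    for line in text.splitlines():
        fields = line.split()
        start, end = (int(x, 16) for x in fields[0].split("-"))
        name = fields[5] if len(fields) > 5 else "[anonymous]"
        regions.append((start, end, fields[1], name))   # fields[1] = permissions
    return regions

def region_of(addr, regions):
    for start, end, perms, name in regions:
        if start <= addr < end:
            return name, perms
    return None

regions = parse_maps(SAMPLE_MAPS)
print(region_of(0x7FFFFFFFE000, regions))   # an address on the stack
print(region_of(0x00400ABC, regions))       # an address in the program's code
```

Running `cat /proc/self/maps` on your own box gives the real thing; the habit worth building is asking "which region is this address in, and what are its permissions?" every time you see a pointer.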

The stack — grows down, fills up

The stack is where almost all of the early exploitation happens, so understand it cold. It's a LIFO (last-in-first-out) scratchpad that the CPU manages with two operations and one register (RSP, the stack pointer).


Op	What it does	Direction
`push X`	Decrements `RSP` by 8, then writes `X` at `[RSP]`	the stack grows toward LOWER addresses
`pop X`	Reads `[RSP]` into `X`, then increments `RSP` by 8	the stack shrinks toward HIGHER addresses
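The two rows above can be modelled in a few lines of Python. This is a toy, assuming memory is a dictionary of address to 64-bit value and an arbitrary starting address for RSP; the `Stack` class is invented for illustration.

```python
class Stack:
    """Toy model of the x86-64 stack: memory is a dict of address -> 64-bit value."""
    def __init__(self, top=0x7FFFFFFFE000):   # arbitrary illustrative start address
        self.rsp = top
        self.mem = {}

    def push(self, value):
        self.rsp -= 8                 # RSP moves to a LOWER address first...
        self.mem[self.rsp] = value    # ...then the value is written at [RSP]

    def pop(self):
        value = self.mem[self.rsp]    # read [RSP]...
        self.rsp += 8                 # ...then RSP moves back UP
        return value

s = Stack()
start = s.rsp
s.push(0x1111)
s.push(0x2222)
moved_down = start - s.rsp            # two pushes = 16 bytes lower
first, second = s.pop(), s.pop()      # last in, first out
print(hex(moved_down), hex(first), hex(second))
```

Note that `pop` does not erase anything: the old value stays in memory below RSP until something overwrites it.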

🔑 Key idea: The stack grows downward — new data goes at lower addresses. But a buffer (like a char buf[64]) is written upward, from low to high. So if you write past the end of a buffer, you overwrite the things stored above it on the stack — including, eventually, the saved return address. That single mismatch (stack grows down, buffers fill up) is the seed of the stack buffer overflow.

Just enough assembly to read along

You don't need to write much assembly, but you must be able to read it — disassembly is what you stare at in GDB. Here's the small set that covers most of what you'll see (Intel syntax, dst, src).


Instruction	What it does
`mov rax, rbx`	Copy: `rax = rbx`
`lea rax, [rbp-0x10]`	Load Effective Address: `rax = address of` that slot (no dereference)
`push rax` / `pop rax`	Onto / off the stack (see "The stack" above)
`call func`	Push the return address, then jump to `func` (see "Calling conventions" below)
`ret`	Pop the top of the stack into `RIP` — return to caller
`jmp` / `je` / `jne`	Jump (unconditional / if equal, i.e. ZF=1 / if not equal, i.e. ZF=0); each one sets `RIP`
`add / sub / xor / cmp`	Arithmetic, logic, and compare (all update the flags the conditional jumps read)
`syscall`	Invoke a kernel system call (number in `RAX`, args in `RDI, RSI, RDX, R10, R8, R9`)

Two AT&T-vs-Intel notes since you'll meet both: Intel is mov dst, src (in GDB: set disassembly-flavor intel); AT&T is mov %src, %dst with % on registers and $ on immediates. We use Intel throughout the series.
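The `lea` row in the table is the one beginners most often misread, so here is a toy contrast between `lea rax, [rbp-0x10]` and `mov rax, [rbp-0x10]`. Registers and memory are plain dictionaries, the addresses and stored value are made up, and `lea` / `mov_load` are illustrative helpers rather than any real API.

```python
# Toy model: registers and memory as dicts; addresses and values are made up.
regs = {"rbp": 0x7FFFFFFFE010, "rax": 0}
mem = {0x7FFFFFFFE000: 0x4141414141414141}   # the 8 bytes stored at [rbp-0x10]

def lea(dst, base, disp):
    regs[dst] = regs[base] + disp            # compute the address, never touch memory

def mov_load(dst, base, disp):
    regs[dst] = mem[regs[base] + disp]       # compute the address, then read from it

lea("rax", "rbp", -0x10)
addr = regs["rax"]                           # where the slot is
mov_load("rax", "rbp", -0x10)
value = regs["rax"]                          # what the slot contains
print(hex(addr), hex(value))
```

In practice a `lea` of a stack slot just before a `call` is usually the compiler passing a pointer to a local buffer, which is exactly the pointer an unbounded copy will write through.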

Calling conventions — `call`, `ret` & the return address

This is the heart of why control-flow hijacking works. When one function calls another, there's a strict contract — the calling convention — for where arguments go, where the return value comes back, and crucially how the called function knows where to return to.

System V AMD64 (Linux x86-64)


Element	Where
Arguments 1–6	`RDI, RSI, RDX, RCX, R8, R9` (in that order)
Arguments 7+	Pushed onto the stack
Return value	`RAX`
`call func`	pushes the return address (the instruction after the call) onto the stack, then jumps to `func`
`ret`	pops that saved address off the stack back into `RIP`
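As a quick way to internalise the argument order, the sketch below assigns a call's arguments to registers and stack the way the table describes. It covers integer and pointer arguments only (floating-point arguments use separate registers under System V), and `place_args` is a hypothetical helper written for this illustration.

```python
ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def place_args(*args):
    """Return (register assignments, stack-passed args) for integer/pointer arguments."""
    in_regs = dict(zip(ARG_REGS, args))          # first six go in registers, in order
    on_stack = list(args[len(ARG_REGS):])        # anything beyond that goes on the stack
    return in_regs, on_stack

in_regs, on_stack = place_args(10, 20, 30, 40, 50, 60, 70, 80)
print(in_regs)
print(on_stack)
```

This is worth memorising because later, when you want to call a function with arguments of your choosing, "control the first argument" translates directly to "control RDI".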

asm

; A function's prologue / epilogue (what a typical unoptimised function does):
push rbp            ; save the caller's frame pointer
mov  rbp, rsp       ; set up this function's frame
sub  rsp, 0x40      ; make room for local variables (e.g. char buf[64])
...                 ; the function body
leave               ; == mov rsp, rbp ; pop rbp   (tear down the frame)
ret                 ; pop the saved return address into RIP → back to caller

🔑 Key idea: Look at what ret does: it takes whatever is on top of the stack and jumps there. The saved return address lives on the stack, a few bytes above your local buffer. So if you can overwrite that saved address with bytes of your choosing, then when the function does ret, the CPU jumps to YOUR address. That is the entire mechanism of a stack overflow exploit — and you now understand why it works.
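Here is that mechanism as a toy simulation. The addresses are invented, memory is a dictionary, and `call`, `ret`, and `simulate` are illustrative helpers; the only point is that `ret` trusts whatever value sits in the saved slot.

```python
def call(state, target, return_to):
    state["rsp"] -= 8
    state["mem"][state["rsp"]] = return_to      # call pushes the return address
    state["rip"] = target                       # then jumps to the function

def ret(state):
    state["rip"] = state["mem"][state["rsp"]]   # ret pops whatever is on top into RIP
    state["rsp"] += 8

def simulate(overwrite=None):
    state = {"rip": 0x401000, "rsp": 0x7FFFFFFFE000, "mem": {}}
    call(state, target=0x401100, return_to=0x401005)
    if overwrite is not None:
        state["mem"][state["rsp"]] = overwrite  # memory corruption hits the saved slot
    ret(state)
    return state["rip"]

normal = simulate()
hijacked = simulate(overwrite=0xDEADBEEF)
print(hex(normal), hex(hijacked))
```

Nothing in `ret` checks where the value came from. That missing check is what the mitigations later in the series try to add back.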

Endianness — addresses go in backwards

One small thing that trips up every beginner: x86 is little-endian. Multi-byte values are stored with the least-significant byte first (at the lowest address). So the 4-byte value 0xDEADBEEF sits in memory as EF BE AD DE.

python

# Value:    0x00401234   (an address you want in RIP)
# In memory / in your payload bytes (little-endian):  34 12 40 00

# This is why you NEVER type addresses as raw text — you pack them.
# pwntools does it for you:
from pwn import *
payload = b"A"*72 + p64(0x401234)     # p64 → b"\x34\x12\x40\x00\x00\x00\x00\x00"
# p32() for 32-bit targets, u64()/u32() to unpack leaked bytes back into a number.

Forgetting endianness (or hand-writing an address forwards) is the #1 reason a beginner's first overflow "doesn't work" even though the offset is right. Always pack addresses with p64()/p32().
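If you want to see what `p64()` and `u64()` are doing without installing anything, Python's standard `struct` module can reproduce the packing. The sketch below is a stand-in for illustration, not the pwntools implementation; `pack64` and `unpack64` are names invented here, and padding short input with zero bytes is an assumption about how you would treat a partial leak.

```python
import struct

def pack64(value):
    return struct.pack("<Q", value)    # "<" = little-endian, "Q" = unsigned 64-bit

def unpack64(data):
    # Assumption: a short leak is missing only its high (zero) bytes, so pad to 8.
    return struct.unpack("<Q", data.ljust(8, b"\x00"))[0]

packed = pack64(0x401234)
print(packed.hex(" "))                 # 34 12 40 00 00 00 00 00

# What happens if you write the address "forwards" by hand:
forwards = bytes.fromhex("0000000000401234")
print(hex(unpack64(forwards)))         # 0x3412400000000000, not 0x401234
```

The second print is the beginner bug in one line: the bytes look right to a human, but the CPU reads them as a completely different address.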

Where the bug lives — the overflow, previewed

Now put it together — here's the whole series in one picture, the bug you'll exploit for real in Part 4. A function declares a local buffer; gets() (or any unbounded copy) reads more bytes than fit; the write runs up past the buffer, over the saved RBP, and into the saved return address; the function returns — into your bytes.

c

// The classic vulnerable function (Part 4 builds the exploit):
void vuln() {
    char buf[64];        // 64 bytes on the stack
    gets(buf);           // reads UNBOUNDED input → overflow
}                        // on 'ret', RIP = whatever overwrote the saved return addr

// Stack at the moment of the overflow (low → high addresses):
//   [ buf: 64 bytes ][ saved RBP: 8 ][ saved RET ADDR: 8 ][ ... ]
//     ^ input fills here ──────────► overwrites RBP ──► overwrites RET → controls RIP
//
// So: 64 (buf) + 8 (RBP) = 72 bytes of padding, then 8 bytes = the address RIP takes.
// (72 assumes the compiler puts buf directly below the saved RBP; confirm it in GDB.)
//   payload = b"A"*72 + p64(target_address)
🔑 Key idea: That's it. That number — how many bytes until you reach the return address (the offset) — plus an address to jump to, is a working control-flow hijack. Everything in Tiers 2–4 is what address to jump to when modern defenses make it hard. You now understand the core of binary exploitation; the rest is detail and mitigations.
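You can rehearse the arithmetic before touching a real binary. The sketch below models the frame as a flat byte array laid out low to high (buffer, saved RBP, saved return address) and applies an unbounded copy to it. The layout with no padding, the example addresses, and the helper names are all assumptions made for illustration; a real frame should be checked in GDB.

```python
import struct

BUF_SIZE, SAVED_RBP_SIZE = 64, 8
RET_OFFSET = BUF_SIZE + SAVED_RBP_SIZE            # 72: distance from buf[0] to the return address

def make_frame(ret_addr, rbp=0x7FFFFFFFE100):
    # low address -> high address: buf, saved RBP, saved return address
    return bytearray(b"\x00" * BUF_SIZE
                     + struct.pack("<Q", rbp)
                     + struct.pack("<Q", ret_addr))

def unbounded_copy(frame, data):
    frame[:len(data)] = data                      # like gets(): starts at buf[0], ignores its size

def saved_ret(frame):
    return struct.unpack("<Q", frame[RET_OFFSET:RET_OFFSET + 8])[0]

safe = make_frame(ret_addr=0x401005)
unbounded_copy(safe, b"hello")                    # fits in the buffer: nothing else changes

smashed = make_frame(ret_addr=0x401005)
payload = b"A" * RET_OFFSET + struct.pack("<Q", 0x401234)
unbounded_copy(smashed, payload)                  # 72 bytes of padding, then the new address

print(hex(saved_ret(safe)), hex(saved_ret(smashed)))
```

The first frame still returns to `0x401005`; the second returns wherever the last eight bytes of the payload say.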

Set up your lab

Last, build the lab so Part 2 onward is hands-on. You want a Linux x86-64 box (a VM/container is fine), the GNU toolchain, a debugger with an exploitation plugin, and pwntools.

bash

# 1. Toolchain + debugger + pwntools
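sudo apt install gcc gdb python3-pip python3-venv curl git
python3 -m venv ~/pwn-venv && . ~/pwn-venv/bin/activate   # any path works; newer distros refuse system-wide pip installs
pip install pwntools                      # the exploitation framework (process, p64, cyclic, ELF, ROP)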
# GDB enhancer — pick ONE: GEF or pwndbg
bash -c "$(curl -fsSL https://gef.blah.cat/sh)"     # GEF
# (pwndbg: git clone https://github.com/pwndbg/pwndbg && cd pwndbg && ./setup.sh)

# 2. Compile a deliberately-vulnerable binary with mitigations OFF (for learning)
cat > vuln.c <<'EOF'
#include <stdio.h>
char *gets(char *);   /* removed in C11, so current headers may not declare it */
void vuln(){ char buf[64]; gets(buf); }
int main(){ vuln(); return 0; }
EOF
gcc -fno-stack-protector -z execstack -no-pie -g -o vuln vuln.c
#     ^ no canary          ^ exec stack   ^ no PIE (fixed addresses)

# 3. Check what protections are on (you'll do this on every target)
checksec --file=./vuln        # or:  pwn checksec ./vuln
#   → expect: No canary found, NX disabled, No PIE

# 4. Disable ASLR system-wide for the early labs (re-enable later by writing 2 back to the same file)
echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

# 5. Drive it in GDB
gdb ./vuln
  gef> break vuln
  gef> run
  gef> info registers            # see RIP, RSP, RBP…
  gef> x/20gx $rsp               # examine 20 qwords at the stack pointer
  gef> pattern create 200        # GEF: cyclic pattern to find the offset (Part 4)

⚠ Everything in this series is for binaries you own / are authorised to test — your own lab, CTF pwn challenges, or a scoped engagement. We turn mitigations off only to learn the mechanics; Tier 3 turns them back on and defeats them properly.
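The `pattern create` step in the lab works because every short window of the pattern is unique, so the bytes that land in the saved return address tell you their own offset. The sketch below shows the idea with a simple hand-rolled pattern. It is not the sequence GEF or pwntools generate (offsets from their patterns must be looked up with their own tools), and `make_pattern` / `offset_of` are names invented for this illustration.

```python
import itertools
import string

def make_pattern(length):
    """Aa0Aa1Aa2...: every 4-byte window is unique for lengths up to 20280."""
    chunks = (u + l + d
              for u in string.ascii_uppercase
              for l in string.ascii_lowercase
              for d in string.digits)
    text = "".join(itertools.islice(chunks, (length + 2) // 3))
    return text[:length].encode()

def offset_of(pattern, value):
    # The debugger shows the slot as a little-endian integer; turn it back into bytes.
    needle = value.to_bytes(8, "little")
    return pattern.find(needle)

pattern = make_pattern(200)

# Pretend the 8 bytes at offset 72 are what ended up in the saved return address slot.
saved_ret_value = int.from_bytes(pattern[72:80], "little")
print(offset_of(pattern, saved_ret_value))   # 72
```

Notice the endianness conversion in `offset_of`: the number you read in the debugger has to be turned back into memory order before you search for it.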

Closing

The foundation is in place. You can name the registers, you know RIP is the prize, you can read the basic assembly GDB shows you, and you understand that the stack grows down while the buffers on it fill upward. So when an overflow runs past the end of a buffer and over the saved return address, you know exactly what just happened and why ret hands you control.

Don't move on until that mental model is solid. Compile the vuln binary, open it in GDB, break on vuln, run, then x/20gx $rsp and find the saved return address sitting on the stack above the buffer — watch RSP and RIP change as you step. The first time you actually see it there, the whole thing clicks.

Published	Jun 12, 2026
Updated	Jul 16, 2026