See the machine the way the CPU does — the foundation every exploit is built on
Why you start with the machine
Exploitation isn't magic and it isn't memorising payloads — it's understanding how a program actually runs on the hardware, then abusing that. A buffer overflow only makes sense once you know what a "return address" is and why it sits on the stack next to your buffer. ROP only makes sense once you understand call and ret. So before a single exploit, we build the mental model: how the CPU executes code, how memory is laid out, and just enough assembly to read what's happening.
This is Part 1 of a 14-part Exploit Development series. We go fundamentals → advanced, one idea at a time, so that by the end you can find a bug, control execution, and defeat real-world mitigations — not copy a script. This part is the foundation everything else stands on.
Registers — the CPU's working memory
The CPU does its work in a handful of tiny, ultra-fast storage slots called registers. You'll be reading and corrupting these constantly, so learn them now. On x86-64 they are 64 bits wide and their names start with R (RAX, RSP, ...); the low 32-bit halves start with E (EAX, ESP, ...), which are also the names used on legacy 32-bit x86.
| Register | What it is / does | 32-bit x86 name / sub-registers |
|---|---|---|
| RAX | Accumulator: function return value lands here | EAX (32) / AX / AL |
| RDI, RSI, RDX, RCX, R8, R9 | The first six function arguments (System V), see §7 | EDI, ESI, … |
| RBP | Base / frame pointer: anchors the current stack frame | EBP |
| RSP | Stack pointer: always points at the top of the stack | ESP |
| RIP | Instruction pointer: the address of the next instruction to run | EIP |
| RFLAGS | Status flags (zero, carry, sign…): drive conditional jumps | EFLAGS |
| R8–R15 | General-purpose extras (x86-64 added these) | none on 32-bit x86 |
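The sub-register column is easier to trust once you see that the narrower names are just masks over the same 64 bits. The sketch below models that with plain integers; the register value is arbitrary and the helper names `write_eax` and `write_ax` are made up for this illustration.

```python
# One 64-bit register viewed through its narrower names.
# The value is arbitrary, chosen so every byte is easy to spot.
rax = 0x1122334455667788

eax = rax & 0xFFFFFFFF   # low 32 bits
ax = rax & 0xFFFF        # low 16 bits
al = rax & 0xFF          # low 8 bits

def write_eax(rax, value):
    """A 32-bit write zero-extends: the upper 32 bits of RAX are cleared."""
    return value & 0xFFFFFFFF

def write_ax(rax, value):
    """A 16-bit write leaves the upper 48 bits of RAX untouched."""
    return (rax & ~0xFFFF) | (value & 0xFFFF)

print(f"RAX = {rax:#018x}")                              # 0x1122334455667788
print(f"EAX = {eax:#010x}")                              # 0x55667788
print(f"AX  = {ax:#06x}")                                # 0x7788
print(f"AL  = {al:#04x}")                                # 0x88
print(f"after mov eax, 0xAABBCCDD: {write_eax(rax, 0xAABBCCDD):#018x}")
print(f"after mov ax, 0xBEEF:      {write_ax(rax, 0xBEEF):#018x}")
```

The zero-extension on 32-bit writes matters later: a short instruction that writes EAX still leaves a clean, fully known RAX.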
RIP is the crown jewel. Whoever controls RIP controls what the CPU executes next, and therefore controls the program. Almost every memory-corruption exploit is ultimately a fight to put an address of your choosing into RIP. Keep one eye on it the whole series.
How code runs: fetch, decode, execute
How does a program "run"? The CPU repeats one loop forever: fetch the instruction at the address in RIP, decode it, execute it, and advance RIP to the next instruction — unless the instruction itself changes RIP (a jump, call, or return).
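To make the loop concrete, here is a toy interpreter. It is only a sketch: the three-instruction "ISA" (`inc`, `jmp`, `halt`) is invented for illustration and is not x86, but the shape of the loop is the one described above.

```python
# Toy fetch/decode/execute loop. Addresses are simply 0, 1, 2, ...
program = {
    0: ("inc", None),   # acc += 1
    1: ("inc", None),
    2: ("jmp", 4),      # control-flow change: the instruction rewrites RIP
    3: ("inc", None),   # skipped, RIP never lands here
    4: ("halt", None),
}

def run(program):
    rip, acc, trace = 0, 0, []
    while True:
        trace.append(rip)            # record the path RIP takes
        op, arg = program[rip]       # fetch + decode
        rip += 1                     # default: advance to the next instruction
        if op == "inc":              # execute
            acc += 1
        elif op == "jmp":
            rip = arg                # the instruction itself changes RIP
        elif op == "halt":
            return acc, trace

acc, trace = run(program)
print(acc, trace)   # 2 [0, 1, 2, 4]
```

The trace shows address 3 was never executed. Whoever chooses the operand of that `jmp` chooses what runs next, which is the whole game.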
Control flow is the path RIP takes. Normally instructions run in order and RIP walks forward. A jmp/call/ret sets RIP to somewhere else; that is a legitimate control-flow change. An exploit is an illegitimate control-flow change: you trick a ret (or a corrupted pointer) into setting RIP to your address.
Process memory layout
When the OS loads a program it carves its virtual address space into regions. You need this map because each bug lives in a specific region — a stack overflow is in the stack, a heap bug in the heap.
| Region | What lives there | Position |
|---|---|---|
| .text | The machine code (the instructions). Read-only + executable. | low addresses |
| .rodata | Read-only constants (string literals, etc.) | ↑ |
| .data / .bss | Global/static variables (initialised / zero-initialised) | ↑ |
| Heap | Dynamic memory (malloc) — grows UP toward higher addresses | middle |
| Libraries (mmap) | libc and other shared objects mapped in | ↑ |
| Stack | Function frames: locals, saved registers, return addresses — grows DOWN | high addresses |
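On Linux you can see this map for a live process in `/proc/<pid>/maps`. The sketch below parses text in that format and answers "which region does this address fall in?". `SAMPLE_MAPS` is a made-up, shortened listing (paths, inode numbers and addresses are illustrative only), and `parse_maps` and `region_of` are hypothetical helper names.

```python
# Illustrative listing in /proc/<pid>/maps format; not taken from a real process.
SAMPLE_MAPS = """\
00400000-00401000 r-xp 00000000 08:01 1234 /home/user/vuln
00601000-00602000 rw-p 00001000 08:01 1234 /home/user/vuln
01a2b000-01a4c000 rw-p 00000000 00:00 0 [heap]
7ffff7dd0000-7ffff7f90000 r-xp 00000000 08:01 5678 /usr/lib/x86_64-linux-gnu/libc.so.6
7ffffffde000-7ffffffff000 rw-p 00000000 00:00 0 [stack]
"""

def parse_maps(text):
    """Return a list of (start, end, perms, name) tuples."""
    regions = []
    for line in text.splitlines():
        fields = line.split(maxsplit=5)
        if len(fields) < 5:
            continue
        start, end = (int(x, 16) for x in fields[0].split("-"))
        name = fields[5] if len(fields) > 5 else "(anonymous)"
        regions.append((start, end, fields[1], name))
    return regions

def region_of(regions, addr):
    """Name and permissions of the region containing addr, or None if unmapped."""
    for start, end, perms, name in regions:
        if start <= addr < end:
            return name, perms
    return None

regions = parse_maps(SAMPLE_MAPS)
for addr in (0x400123, 0x1a2b010, 0x7ffffffde010, 0x10):
    print(hex(addr), region_of(regions, addr))
```

To inspect the running Python process instead, pass `open("/proc/self/maps").read()` to `parse_maps` on a Linux machine.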
The stack — grows down, fills up
The stack is where almost all of the early exploitation happens, so understand it cold. It's a LIFO (last-in-first-out) scratchpad that the CPU manages with two operations and one register (RSP, the stack pointer).
| Op | What it does | Direction |
|---|---|---|
| push X | Decrements RSP by 8, then writes X at [RSP] | the stack grows toward LOWER addresses |
| pop X | Reads [RSP] into X, then increments RSP by 8 | the stack shrinks toward HIGHER addresses |
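The two operations in the table can be modelled in a few lines. This is a toy, not an emulator: the `Stack` class is hypothetical, a small byte array stands in for stack memory, and the starting address is an arbitrary illustrative value.

```python
import struct

class Stack:
    """Toy stack: RSP starts at the high end and moves down on push."""

    def __init__(self, size=64, top=0x7FFFFFFFE000):
        self.mem = bytearray(size)
        self.base = top - size       # lowest address backed by self.mem
        self.rsp = top               # empty stack

    def push(self, value):
        self.rsp -= 8                                        # decrement first
        off = self.rsp - self.base
        self.mem[off:off + 8] = struct.pack("<Q", value)     # then write at [RSP]

    def pop(self):
        off = self.rsp - self.base
        (value,) = struct.unpack("<Q", self.mem[off:off + 8])  # read [RSP]
        self.rsp += 8                                        # then increment
        return value

s = Stack()
start = s.rsp
s.push(0x1111)
s.push(0x2222)
print(hex(start - s.rsp))              # 0x10: two pushes moved RSP down 16 bytes
print(hex(s.pop()), hex(s.pop()))      # 0x2222 0x1111: last in, first out
```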
The catch: the stack grows down, but data copied into a buffer (for example char buf[64]) is written upward, from low to high addresses. So if you write past the end of a buffer, you overwrite the things stored above it on the stack, including, eventually, the saved return address. That single mismatch (stack grows down, buffers fill up) is the seed of the stack buffer overflow.
Just enough assembly to read along
You don't need to write much assembly, but you must be able to read it — disassembly is what you stare at in GDB. Here's the small set that covers most of what you'll see (Intel syntax, dst, src).
| Instruction | What it does |
|---|---|
| mov rax, rbx | Copy: rax = rbx |
| lea rax, [rbp-0x10] | Load Effective Address: rax = address of that slot (no dereference) |
| push rax / pop rax | Onto / off the stack (§5) |
| call func | Push the return address, then jump to func (§7) |
| ret | Pop the top of the stack into RIP: return to caller |
| jmp / je / jne | Jump (unconditional / if equal, ZF=1 / if not equal, ZF=0); sets RIP |
| add / sub / xor / cmp | Arithmetic / compare (sets flags for the conditional jumps) |
| syscall | Invoke a kernel system call (number in RAX, args in RDI…) |
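Two rows in that table trip people up most often: lea versus a mov from memory, and how cmp feeds a conditional jump. A small model may help. Everything here is illustrative: the addresses and stored value are invented, and a dict stands in for memory.

```python
# Toy memory: address -> 8-byte value.
memory = {0x7FFE0FF0: 0xDEADBEEF}
rbp = 0x7FFE1000

# lea rax, [rbp-0x10]  -> rax gets the ADDRESS; memory is never read
rax_after_lea = rbp - 0x10

# mov rax, [rbp-0x10]  -> rax gets the VALUE stored at that address
rax_after_mov = memory[rbp - 0x10]

print(hex(rax_after_lea))   # 0x7ffe0ff0
print(hex(rax_after_mov))   # 0xdeadbeef

def cmp_sets_zf(a, b):
    """cmp a, b computes a - b, throws the result away, and keeps the flags."""
    return (a - b) == 0     # ZF is set when the operands were equal

def je_taken(zf):
    return zf               # je jumps when ZF is set

def jne_taken(zf):
    return not zf           # jne jumps when ZF is clear

print(je_taken(cmp_sets_zf(5, 5)), jne_taken(cmp_sets_zf(5, 5)))   # True False
print(je_taken(cmp_sets_zf(5, 7)), jne_taken(cmp_sets_zf(5, 7)))   # False True
```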
Intel syntax is mov dst, src (in GDB: set disassembly-flavor intel); AT&T is mov %src, %dst with % on registers and $ on immediates. We use Intel throughout the series.
Calling conventions: call, ret & the return address
This is the heart of why control-flow hijacking works. When one function calls another, there's a strict contract — the calling convention — for where arguments go, where the return value comes back, and crucially how the called function knows where to return to.
System V AMD64 (Linux x86-64)
| Element | Where |
|---|---|
| Arguments 1–6 | RDI, RSI, RDX, RCX, R8, R9 (in that order) |
| Arguments 7+ | Pushed onto the stack |
| Return value | RAX |
| call func | pushes the return address (the instruction after the call) onto the stack, then jumps to func |
| ret | pops that saved address off the stack back into RIP |
    ; A typical function prologue / epilogue (optimised builds may omit the frame pointer):
    push rbp            ; save the caller's frame pointer
    mov  rbp, rsp       ; set up this function's frame
    sub  rsp, 0x40      ; make room for local variables (e.g. char buf[64])
    ...                 ; the function body
    leave               ; == mov rsp, rbp ; pop rbp   (tear down the frame)
    ret                 ; pop the saved return address into RIP, back to caller
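The prologue and epilogue can be traced step by step with a toy model. This is a sketch under stated assumptions: a dict stands in for memory, every address is invented, and the helper names (`call`, `prologue`, `epilogue`) are hypothetical stand-ins for the instructions they mimic.

```python
mem = {}
regs = {"rip": 0x401000, "rsp": 0x7FFE1000, "rbp": 0x7FFE2000}

def push(value):
    regs["rsp"] -= 8
    mem[regs["rsp"]] = value

def pop():
    value = mem[regs["rsp"]]
    regs["rsp"] += 8
    return value

def call(target, return_addr):
    push(return_addr)            # call: save where to come back to
    regs["rip"] = target         # then jump

def prologue(locals_size):
    push(regs["rbp"])            # push rbp
    regs["rbp"] = regs["rsp"]    # mov rbp, rsp
    regs["rsp"] -= locals_size   # sub rsp, 0x40

def epilogue():
    regs["rsp"] = regs["rbp"]    # leave: mov rsp, rbp
    regs["rbp"] = pop()          #        pop rbp
    regs["rip"] = pop()          # ret: pop the saved return address into RIP

call(target=0x401136, return_addr=0x401005)
prologue(0x40)
buf_start = regs["rbp"] - 0x40       # where a 64-byte local buffer would begin
saved_ret_slot = regs["rbp"] + 8     # return address sits just above saved RBP
print(saved_ret_slot - buf_start)    # 72
epilogue()
print(hex(regs["rip"]))              # 0x401005
```

The distance of 72 bytes between the start of the buffer and the return-address slot falls straight out of the frame layout: 64 bytes of locals plus 8 bytes of saved RBP.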
Look at what ret does: it takes whatever is on top of the stack and jumps there. The saved return address lives on the stack, a few bytes above your local buffer. So if you can overwrite that saved address with bytes of your choosing, then when the function does ret, the CPU jumps to YOUR address. That is the entire mechanism of a stack overflow exploit, and you now understand why it works.
Endianness: addresses go in backwards
One small thing that trips up every beginner: x86 is little-endian. Multi-byte values are stored with the least-significant byte first (at the lowest address). So the 4-byte value 0xDEADBEEF sits in memory as EF BE AD DE.
    # Value: 0x00401234 (an address you want in RIP)
    # In memory / in your payload bytes (little-endian): 34 12 40 00
    # This is why you NEVER type addresses as raw text; you pack them.
    # pwntools does it for you:
    from pwn import *
    payload = b"A"*72 + p64(0x401234)   # p64 → b"\x34\x12\x40\x00\x00\x00\x00\x00"
    # p32() for 32-bit targets, u64()/u32() to unpack leaked bytes back into a number.
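If you want to see the byte order without installing anything, Python's standard `struct` module produces the same little-endian bytes. The helpers `pack64` and `unpack64` below are hypothetical names for this sketch, not part of pwntools, and the "leaked" bytes are an invented example.

```python
import struct

def pack64(value):
    """8-byte little-endian packing, the byte order p64() uses by default."""
    return struct.pack("<Q", value)

def unpack64(data):
    """Inverse of pack64: 8 little-endian bytes back to an integer."""
    return struct.unpack("<Q", data)[0]

print(struct.pack("<I", 0xDEADBEEF).hex(" "))   # ef be ad de
print(pack64(0x401234).hex(" "))                # 34 12 40 00 00 00 00 00

# Leaks often arrive as 6 bytes because the top two bytes of a user-space
# address are zero; pad to 8 before unpacking. These bytes are made up.
leak = b"\x80\x46\xa5\xf7\xff\x7f"
print(hex(unpack64(leak.ljust(8, b"\x00"))))    # 0x7ffff7a54680
```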
Rule of thumb: build every address in a payload with p64()/p32().
Where the bug lives: the overflow, previewed
Now put it together — here's the whole series in one picture, the bug you'll exploit for real in Part 4. A function declares a local buffer; gets() (or any unbounded copy) reads more bytes than fit; the write runs up past the buffer, over the saved RBP, and into the saved return address; the function returns — into your bytes.
    // The classic vulnerable function (Part 4 builds the exploit):
    void vuln() {
        char buf[64];   // 64 bytes on the stack
        gets(buf);      // reads UNBOUNDED input → overflow
    }                   // on 'ret', RIP = whatever overwrote the saved return addr

    # Stack at the moment of the overflow (low → high addresses):
    #   [ buf: 64 bytes ][ saved RBP: 8 ][ saved RET ADDR: 8 ][ ... ]
    #     ^ input fills here ──────────► overwrites RBP ──► overwrites RET → controls RIP
    #
    # So: 64 (buf) + 8 (RBP) = 72 bytes of padding, then 8 bytes = the address RIP takes.
    # payload = b"A"*72 + p64(target_address)
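You can rehearse the overwrite without a target binary. The sketch below models the frame as 80 bytes (buf, saved RBP, saved return address) and performs an unbounded copy into it. It assumes the simple layout drawn above with no extra padding; the addresses are invented and `unbounded_copy` is a hypothetical stand-in for gets().

```python
import struct

BUF_SIZE = 64
RET_OFFSET = BUF_SIZE + 8                        # 64 (buf) + 8 (saved RBP) = 72

frame = bytearray(BUF_SIZE + 8 + 8)              # [ buf ][ saved RBP ][ saved RET ]
frame[64:72] = struct.pack("<Q", 0x7FFE2000)     # saved RBP (invented)
frame[72:80] = struct.pack("<Q", 0x401005)       # legitimate return address (invented)

def unbounded_copy(frame, data):
    """Like gets(): writes from the start of buf with no length check."""
    frame[0:len(data)] = data

def saved_ret(frame):
    return struct.unpack("<Q", frame[RET_OFFSET:RET_OFFSET + 8])[0]

target = 0x401234                                # hypothetical address to land on
payload = b"A" * RET_OFFSET + struct.pack("<Q", target)

before = saved_ret(frame)
unbounded_copy(frame, payload)
after = saved_ret(frame)
print(hex(before), "->", hex(after))             # 0x401005 -> 0x401234
```

Note that the saved RBP is now eight `A` bytes as well. That is harmless for this first exploit, but it is the reason a crashed target often shows 0x4141414141414141 in RBP.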
Set up your lab
Last, build the lab so Part 2 onward is hands-on. You want a Linux x86-64 box (a VM/container is fine), the GNU toolchain, a debugger with an exploitation plugin, and pwntools.
    # 1. Toolchain + debugger + pwntools
    sudo apt install gcc gdb python3-pip
    pip install pwntools        # the exploitation framework (process, p64, cyclic, ELF, ROP)

    # GDB enhancer, pick ONE: GEF or pwndbg
    bash -c "$(curl -fsSL https://gef.blah.cat/sh)"        # GEF
    # (pwndbg: git clone https://github.com/pwndbg/pwndbg && cd pwndbg && ./setup.sh)

    # 2. Compile a deliberately-vulnerable binary with mitigations OFF (for learning)
    cat > vuln.c <<'EOF'
    #include <stdio.h>
    void vuln(){ char buf[64]; gets(buf); }
    int main(){ vuln(); return 0; }
    EOF
    gcc -fno-stack-protector -z execstack -no-pie -g -o vuln vuln.c
    #   ^ no canary         ^ exec stack ^ no PIE (fixed addresses)
    # (if a recent GCC rejects gets() as undeclared, adding -std=c99 is a common workaround)

    # 3. Check what protections are on (you'll do this on every target)
    checksec --file=./vuln        # or: pwn checksec ./vuln
    # → expect: No canary found, NX disabled, No PIE

    # 4. Disable ASLR system-wide for the early labs (re-enable later by writing 2)
    echo 0 | sudo tee /proc/sys/kernel/randomize_va_space

    # 5. Drive it in GDB
    gdb ./vuln
    gef> break vuln
    gef> run
    gef> info registers        # see RIP, RSP, RBP…
    gef> x/20gx $rsp           # examine 20 qwords at the stack pointer
    gef> pattern create 200    # GEF: cyclic pattern to find the offset (Part 4)
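Before each lab session it is worth confirming that ASLR really is in the state you expect, since the setting resets on reboot. A minimal check, assuming a Linux system that exposes the usual `/proc` file; `describe_aslr` and `read_aslr` are hypothetical helper names:

```python
from pathlib import Path

ASLR_MODES = {
    "0": "disabled: addresses stay fixed between runs",
    "1": "partial: stack, mmap regions and vDSO are randomised",
    "2": "full: as 1, plus the heap (brk) base",
}

def describe_aslr(raw):
    """Translate the contents of randomize_va_space into a short description."""
    return ASLR_MODES.get(raw.strip(), "unknown value")

def read_aslr(path="/proc/sys/kernel/randomize_va_space"):
    """Return the raw setting, or None when the file does not exist."""
    p = Path(path)
    return p.read_text() if p.exists() else None

raw = read_aslr()
print(describe_aslr(raw) if raw is not None else "no /proc ASLR setting found")
```

For the early labs you want this to report "disabled"; anything else means the addresses you note in GDB will not hold on the next run.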
Closing
The foundation is in place. You can name the registers, you know RIP is the prize, you can read the basic assembly GDB shows you, and you understand the mismatch between a stack that grows down and buffers that fill up. So when an overflow runs past the end of a buffer and over the saved return address, you know exactly what just happened and why ret hands you control.
Don't move on until that mental model is solid. Compile the vuln binary, open it in GDB, break on vuln, run, then x/20gx $rsp and find the saved return address sitting on the stack above the buffer — watch RSP and RIP change as you step. The first time you actually see it there, the whole thing clicks.