Memory Corruption

Definition

Memory corruption is the class of software bugs where a program reads from or writes to memory outside its intended bounds, breaking the integrity of program state. The corruption can be silent (the program continues, possibly producing wrong results) or fatal (segmentation fault). Crucially for security, an attacker who controls the corrupting input can sometimes turn the bug into a primitive that controls the program's behavior — from leaking information to executing arbitrary code. Memory corruption is the foundational vulnerability class of the binary-exploitation branch; everything else in this branch is either a specific corruption type, a mitigation against the chain from corruption to exploitation, or a tool used to find or exploit such bugs.

Why it matters

Memory-safety bugs account for roughly 70% of the vulnerabilities Microsoft assigns a CVE to each year, a share the Microsoft Security Response Center reported in 2019 as stable for over a decade, and they remain the dominant exploitable bug class in C, C++, and unsafe Rust codebases. Most major exploitation chains against a modern operating system, browser, or kernel (Pwn2Own entries, iOS/Android jailbreaks, the bulk of recent critical CVEs in Chrome, Windows, and the Linux kernel) reduce at some layer to a memory-corruption primitive.

The class matters as a teaching note because it surfaces three transferable senior facts about systems security:

Memory safety is a property of the language, not the programmer. Diligent C/C++ programmers writing careful code still produce memory-corruption bugs because the languages provide no enforcement. This is why Rust, Swift, modern C++ guidelines, and hardware features like ARM MTE exist — to move the guarantee from human discipline to mechanical enforcement.
A bug is not an exploit. Most memory-corruption bugs in the real world do not become exploits because modern mitigations (exploit-mitigations) interrupt the chain. The attacker's job is increasingly chaining multiple bugs together — info-leak + control-flow primitive + memory write — rather than weaponizing a single bug.
Detection lives at compile time and at runtime equally. Sanitizers (AddressSanitizer, MemorySanitizer, UBSan), fuzzers (libFuzzer, AFL++, syzkaller), hardware memory tagging (ARM MTE), and static analyzers all find different subsets of memory corruption, while hardware control-flow features (Intel CET, ARM PAC) block exploitation rather than find bugs. Senior practice combines them rather than picking one.

The branch's positioning is: this note is the root concept; stack-buffer-overflow is the canonical worked example; future notes cover specific corruption types (use-after-free, heap overflow, integer overflow) and the mitigation landscape. The defender pair is exploit-mitigations.

How it works

Memory corruption reduces to 6 canonical bug classes that an exploit operator and a defender both need to hold:

Stack buffer overflow. A local buffer on the stack receives more data than it has space for, overwriting adjacent stack data — local variables, saved frame pointers, saved return address. The canonical introductory bug. See stack-buffer-overflow for the full deep-dive.
Heap buffer overflow. A heap-allocated buffer receives more data than its allocation. Adjacent heap chunks (allocator metadata, neighboring objects) get corrupted. Exploitation typically targets allocator metadata (for "House of *" techniques against glibc/ptmalloc) or adjacent objects whose function pointers can be hijacked.
Use-after-free (UAF). A program frees a memory allocation, then later accesses it through a stale pointer (a "dangling pointer"). If the allocator has reused that memory for a different object — often controllable by the attacker — the program reads or writes through a pointer that now refers to attacker-shaped data. Browser exploits live here.
Double-free. A program frees the same allocation twice. The allocator's internal state becomes corrupted (free-list links, size metadata). Many "House of *" allocator-exploitation patterns originate from double-frees.
Out-of-bounds read. A program reads from memory outside the intended object boundary. Less directly exploitable for code execution, but the canonical info-leak primitive — leaking stack canaries, PIE base addresses, libc pointers needed to defeat ASLR. Heartbleed (CVE-2014-0160) is the textbook example.
Integer overflow / integer conversion bug. Arithmetic on integer types produces a value outside the type's range. Unsigned arithmetic wraps around; signed overflow is undefined behavior in C and C++; conversions between widths or signedness silently change values. The resulting value is then used as a size or index, leading to undersized allocations, oversized reads, or signed/unsigned confusion. Often a cause of buffer overflows rather than a direct exploit primitive.
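
The integer-overflow class is easiest to see in a size calculation. The following is a minimal sketch, assuming a 32-bit element count read from untrusted input (as in many file-format headers); the helper names alloc_size_unchecked and alloc_size_checked are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper showing the buggy pattern: the 32-bit product wraps
   modulo 2^32, so a large element count can yield a tiny byte size. */
uint32_t alloc_size_unchecked(uint32_t count, uint32_t elem_size) {
    return count * elem_size;   /* unsigned wrap-around: defined, but wrong */
}

/* Hypothetical helper showing the checked pattern: refuses to produce a
   size when the true product would not fit in 32 bits. */
bool alloc_size_checked(uint32_t count, uint32_t elem_size, uint32_t *out) {
    if (elem_size != 0 && count > UINT32_MAX / elem_size) {
        return false;           /* would overflow: reject the input */
    }
    *out = count * elem_size;
    return true;
}
```

With count = 0x40000001 and elem_size = 4, the unchecked product wraps to 4 bytes, so a later loop that writes count elements overruns the allocation. The checked variant rejects the same input before any allocation happens.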

The exploitation chain from any of these six bugs to working code execution typically goes:

bug primitive (write / read)
 → info leak (defeat ASLR/PIE/canary)
 → arbitrary read (read program state, allocator state)
 → control-flow hijack (overwrite a function pointer, vtable, return address, GOT entry)
 → arbitrary execute (jump to attacker-controlled code or ROP chain)

A representative C function with a textbook stack buffer overflow:

void greet(char *name) {
    char buf[64];
    strcpy(buf, name);          // bug: no bounds check
    printf("Hello, %s\n", buf);
}

strcpy writes until it sees \0. A name longer than 63 bytes (plus the null terminator) overflows buf. With no stack canary, no PIE, and no DEP, this is the 1996 Aleph One "Smashing the Stack for Fun and Profit" exploit shape; with modern mitigations, it requires additional primitives. See stack-buffer-overflow.
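
For contrast, one possible bounded variant of the function is sketched below. It is an illustration rather than the canonical fix: greet_bounded is a hypothetical name, and it formats into a caller-supplied buffer (instead of printing) so that truncation can be reported to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical bounded variant of greet(): formats into a caller-supplied
   buffer and reports truncation instead of writing past the end. */
int greet_bounded(char *out, size_t out_size, const char *name) {
    int n = snprintf(out, out_size, "Hello, %s\n", name);
    /* snprintf returns the length it wanted to write; a value >= out_size
       means the output was truncated (it is still NUL-terminated). */
    if (n < 0 || (size_t)n >= out_size) {
        return -1;
    }
    return 0;
}
```

The design choice is that the destination size travels with the destination pointer, so the callee never has to guess how much room it has.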

The bug is not "the program crashed"; it is that the program's notion of which bytes belong to which variable, object, or region was violated. An attacker who controls the boundary-crossing input can sometimes leverage the violation to control the program's execution.

Techniques / patterns

Locate the bug class before the exploit. Most beginners try to "write an exploit"; senior researchers first identify the bug class (UAF? heap overflow? integer overflow leading to undersized alloc?). The class determines which primitives are available and which mitigations matter.
Find the primitive before the payload. A "primitive" is a controlled capability: arbitrary read of N bytes at attacker-chosen address, arbitrary write of one pointer to attacker-chosen address, one-shot control of rip. Senior exploitation is the discipline of building primitives, not writing shellcode. Modern Linux/Windows exploitation rarely involves shellcode at all — it's ROP chains, JIT spraying, data-only attacks.
Use sanitizers during development and fuzzing. Compile with -fsanitize=address,undefined and run the test suite or fuzz harness. ASAN catches stack overflow, heap overflow, UAF, and double-free at the moment of the bug rather than at the eventual crash point hours later.
Coverage-guided fuzzing finds memory corruption better than humans. libFuzzer / AFL++ / honggfuzz instrument the target binary, generate input mutations, and prioritize inputs that explore new code paths. They find bugs in code written by experts daily.
Understand the allocator. Heap exploitation depends on which allocator is in use: glibc ptmalloc (Linux), PartitionAlloc (Chrome), tcmalloc (older Chrome builds, Google server software), mimalloc (Microsoft), jemalloc (FreeBSD, Firefox, older Android releases), the Windows segment heap, the Linux kernel SLUB allocator. Each has its own metadata layout, free-list structure, and exploitation techniques.
Read the disassembly. Source code shows intent; disassembly shows behavior. Compiler optimizations (inlining, dead-code elimination, stack reuse) can introduce or hide memory-corruption bugs that are invisible at the source level.
OPSEC on the defender side: every exploit-quality memory corruption produces detectable telemetry. Crash dumps with non-standard addresses, EDR alerts on unusual control-flow transfers, GS-cookie failure events. The exploit may succeed; the indicator of the attempt is rarely zero.
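
The "understand the allocator" point can be probed with a small experiment. The sketch below checks whether a just-freed chunk is handed straight back for a same-sized request, which is the behavior that makes use-after-free exploitable. The function name freed_chunk_is_reused is hypothetical, and the result depends on the allocator and build: many general-purpose allocators are likely to reuse the chunk, while AddressSanitizer's quarantine deliberately delays reuse.

```c
#include <stdint.h>
#include <stdlib.h>

/* Returns 1 if the allocator handed the just-freed chunk straight back for a
   same-sized request, 0 if it did not, and -1 on allocation failure.
   The first address is saved as an integer before free(), so no dangling
   pointer is ever dereferenced. */
int freed_chunk_is_reused(size_t size) {
    void *first = malloc(size);
    if (first == NULL) return -1;
    uintptr_t first_addr = (uintptr_t)first;
    free(first);

    void *second = malloc(size);
    if (second == NULL) return -1;
    int reused = ((uintptr_t)second == first_addr);
    free(second);
    return reused;
}
```

Calling freed_chunk_is_reused(64) under different allocators or sanitizer builds is a quick way to see how much the reuse policy varies; no particular result is guaranteed.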

Variants and bypasses

Memory corruption splits into 4 modern operational categories, each with its own mitigation landscape and exploitation toolkit.

1. Stack-based corruption

Stack buffer overflow, use of stack memory after its frame has returned (use-after-return), missing bounds checks on local arrays. The simplest category to teach and exploit. Mitigations: stack canaries (MSVC /GS cookies; GCC/Clang SSP via -fstack-protector-strong), hardware shadow stacks (Intel CET), and return-address signing (ARM PAC). Modern operating systems make pure stack-based exploitation difficult; it usually requires chaining with an info leak.
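
How a canary turns a linear overflow into a detectable event can be sketched with a simulated frame. This is an illustration under stated assumptions, not what a compiler emits: struct sim_frame, sim_call, and the fixed CANARY_VALUE are hypothetical, and real canaries are random per process and live in the actual stack frame.

```c
#include <stdint.h>
#include <string.h>

#define CANARY_VALUE 0xC0FFEE0DDF00D5A5ULL  /* arbitrary; real canaries are random */

/* Simulated stack frame: buffer first, then the guard word, then the saved
   return address, in the order a linear overflow walks through them. */
struct sim_frame {
    char     buf[16];
    uint64_t canary;
    uint64_t saved_return;
};

/* Copies len bytes into the frame starting at buf, like an unchecked copy.
   len is clamped to the frame size so the simulation itself stays in bounds.
   Returns 1 if the canary survived, 0 if the overflow was detected. */
int sim_call(const unsigned char *input, size_t len) {
    struct sim_frame frame;
    memset(&frame, 0, sizeof frame);
    frame.canary = CANARY_VALUE;
    frame.saved_return = 0x401000;   /* placeholder "legitimate" return address */

    if (len > sizeof frame) len = sizeof frame;
    memcpy((unsigned char *)&frame, input, len);   /* overflow stays inside the struct */

    return frame.canary == CANARY_VALUE;   /* the check an epilogue would perform */
}
```

An input of 16 bytes fits the buffer and passes the check; 17 or more bytes must cross the guard word before reaching saved_return, so the check fails first. This is also why an info leak matters: an attacker who can read the canary can write it back unchanged.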

2. Heap-based corruption

Heap buffer overflow, UAF, double-free, type confusion. The dominant category in modern exploitation because heap-allocated objects (vtables, function pointers, callback structs) are abundant targets. Mitigations: hardened allocators (PartitionAlloc in Chrome), dangling-pointer protection (MiraclePtr), guard-page allocators (Guard Malloc), heap canaries, heap-spray defenses, ARM MTE memory tagging. Most browser CVEs land here.

3. Kernel-space corruption

Same bug classes (stack/heap overflow, UAF, double-free) but in kernel code. Exploitation primitives differ — kernel exploitation seeks privileged context (root/SYSTEM/EL1) rather than user-space code execution. Mitigations: SMEP, SMAP, KPTI, KASLR, control-flow integrity (Intel CET kernel mode), and OS-specific kernel hardening. Increasingly hard but still produces critical CVEs annually.

4. JIT / VM corruption

JavaScript engines, WebAssembly runtimes, eBPF verifier, and other dynamic-compilation systems generate executable code at runtime. Bugs in the JIT compiler or its type/range inference produce memory corruption with privileged primitives (writeable + executable memory regions, type confusion at the IR level). Browser pwns and kernel-eBPF exploits land here. Mitigations: JIT hardening, type-checking guards, sandbox isolation.

Impact

Ordered by typical real-world severity:

Remote code execution (RCE). Memory corruption in a network-facing service or browser-loaded content reachable over the internet. The highest-impact memory-corruption outcome. Browser RCE chains (Pwn2Own) are the prestige category.
Local privilege escalation (LPE). Kernel memory corruption that yields kernel-context code execution. The standard chain after RCE: browser RCE escapes the renderer sandbox via a kernel LPE.
Sandbox escape. Memory corruption that lets an attacker break out of a sandbox (browser renderer, container, hypervisor, mobile app sandbox). Always a chain link, rarely terminal.
Information disclosure. Out-of-bounds reads that leak secrets — private keys (Heartbleed), session cookies, address-space layouts. Not always direct code execution but frequently the first step.
Denial of service. Memory corruption that crashes the program reliably. The least-impactful outcome but still a meaningful operational concern for critical services.
Data corruption / silent integrity violation. The most insidious outcome — the corrupted memory is used as if valid, producing wrong but plausible behavior. Financial calculations off by a wrap-around increment; safety-critical logic skipping a check. Detection is hardest because there is no crash.

Detection and defense

Ordered by effectiveness (the chain from prevention through detection):

Use a memory-safe language where possible. Rust, Swift, Go, modern Java, C#, TypeScript. New code in critical services should default to memory-safe languages. This is the structural fix that removes the entire bug class outside explicitly unsafe code. The direction is public: Azure's CTO called in 2022 for new projects to use Rust instead of C/C++, and the Linux kernel has accepted Rust code since version 6.1 (December 2022).
Sanitizer-driven CI for C/C++/unsafe Rust. -fsanitize=address,undefined runs in CI on every PR, with ThreadSanitizer (-fsanitize=thread) as a separate build because it cannot be combined with AddressSanitizer. AddressSanitizer alone catches stack overflow, heap overflow, UAF, and double-free at the moment of the violation, with full stack trace. The single highest-leverage technical control for unavoidable C/C++ code.
Coverage-guided fuzzing in CI. libFuzzer / AFL++ / OSS-Fuzz coverage on every PR for input-parsing code. Finds memory corruption that hand-written tests miss. Industry standard for browsers, OS components, and protocol implementations.
Modern compiler-level mitigations. -D_FORTIFY_SOURCE=3, -fstack-protector-strong, -fPIE -pie, -Wl,-z,relro -Wl,-z,now, Intel CET (-fcf-protection=full), ARM PAC (-mbranch-protection=standard on AArch64). These do not fix the bugs but raise exploitation cost. See exploit-mitigations for the full landscape.
Hardware memory tagging. ARM MTE (Memory Tagging Extension) assigns each 16-byte granule of memory a 4-bit tag and keeps the matching tag in the unused top bits of the pointer; the CPU faults when the two differ at access time. Catches UAF, heap overflow, and out-of-bounds access at hardware speed, but probabilistically, since 4 bits allow only 16 tag values. Specified in Armv8.5-A. Google Pixel 8 and later expose MTE as a developer option; Apple announced MTE-based Memory Integrity Enforcement with its 2025 iPhone generation.
Runtime detection via EDR / kernel telemetry. The exploitation chain — info leak followed by ROP gadgets followed by unusual control-flow transfers — produces detectable runtime anomalies. Modern EDR products instrument LoadLibrary, VirtualProtect, and process-memory changes; kernel security telemetry (Sysmon, ETW, Linux kernel audit, eBPF-based tools) catch many real-world exploit chains.
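
The tag-check idea behind memory tagging can be modeled in software. The sketch below is a simulation only, not the ARM instruction set or any real allocator: pool, granule_tag, struct tagged_ptr, sim_tag_region, and sim_load are all hypothetical names, and the only facts borrowed from MTE are the 16-byte granule and the 4-bit tag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE   16u                 /* MTE tags memory in 16-byte granules */
#define POOL_SIZE 256u

static uint8_t pool[POOL_SIZE];
static uint8_t granule_tag[POOL_SIZE / GRANULE];  /* stand-in for tag memory */

/* Simulated tagged pointer: an offset into pool plus the 4-bit tag that real
   hardware keeps in the pointer's unused top bits. */
struct tagged_ptr { size_t offset; uint8_t tag; };

/* "Allocate" [offset, offset + len) by colouring its granules with tag. */
struct tagged_ptr sim_tag_region(size_t offset, size_t len, uint8_t tag) {
    size_t first = offset / GRANULE;
    size_t last  = (offset + len + GRANULE - 1) / GRANULE;  /* one past the end */
    if (last > POOL_SIZE / GRANULE) last = POOL_SIZE / GRANULE;
    for (size_t g = first; g < last; g++) {
        granule_tag[g] = tag & 0xF;
    }
    struct tagged_ptr p = { offset, (uint8_t)(tag & 0xF) };
    return p;
}

/* Simulated load: succeeds only if the pointer tag matches the memory tag. */
bool sim_load(struct tagged_ptr p, size_t index, uint8_t *out) {
    size_t addr = p.offset + index;
    if (addr >= POOL_SIZE || granule_tag[addr / GRANULE] != p.tag) {
        return false;                 /* real hardware would raise a tag-check fault */
    }
    *out = pool[addr];
    return true;
}
```

Tagging two adjacent regions with different tags makes a read one byte past the first region fail, and re-tagging a region (as a free-then-reallocate would) makes loads through the old pointer fail. If the two tags happen to be equal, the access goes undetected, which is the 1-in-16 limitation of a 4-bit tag.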

What does not work as a primary defense

"We have a code review process." Manual code review finds some memory-corruption bugs; sanitizers and fuzzers find more, faster, and with reproducers. Code review remains valuable for design issues; it should not be the primary defense for bug-class issues.
"We use C++ smart pointers." std::unique_ptr and std::shared_ptr mitigate some UAF and double-free cases but not all — raw pointer access, manual lifetime management for performance, and integration with C APIs all leave holes. Smart pointers are good practice; they are not a structural answer.
"Our code is mature and well-tested." Mature C/C++ codebases produce memory-corruption CVEs constantly. Maturity does not solve a language-property issue.
"We block exploitation, so the bug doesn't matter." Mitigations buy time; they do not fix the bug. Multi-stage exploit chains routinely defeat single mitigations. Mitigations are necessary but not sufficient.

Practical labs

Run only against owned lab environments or authorized engagements. Memory corruption in production targets without authorization is a serious offense.

// Lab 1 — Minimal stack buffer overflow target. Compile without mitigations.
// Save as vuln.c
#include <stdio.h>
#include <string.h>
void greet(char *name) {
    char buf[64];
    strcpy(buf, name);
    printf("Hello, %s\n", buf);
}
int main(int argc, char **argv) {
    if (argc < 2) return 1;
    greet(argv[1]);
    return 0;
}

# Lab 1 (cont.) — Compile with mitigations disabled and observe the crash.
gcc -m32 -fno-stack-protector -no-pie -z execstack -o vuln vuln.c
./vuln $(python3 -c 'print("A"*200)')
# Expected: segmentation fault. Inspect with dmesg | tail or gdb.

# Lab 2 — Re-compile with sanitizer; observe the precise catch.
gcc -fsanitize=address -g -o vuln_asan vuln.c
./vuln_asan $(python3 -c 'print("A"*200)')
# Expected: AddressSanitizer prints "stack-buffer-overflow" with exact stack trace.
# This is what every C/C++ CI should do on every test run.

# Lab 3 — Re-compile with mitigations enabled; observe canary detection.
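# -U_FORTIFY_SOURCE: some distros enable fortify by default at -O2, which would abort earlier with "buffer overflow detected".
gcc -fstack-protector-strong -U_FORTIFY_SOURCE -O2 -o vuln_canary vuln.c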
./vuln_canary $(python3 -c 'print("A"*200)')
# Expected: "*** stack smashing detected ***: terminated".
# The canary changes the failure mode from silent corruption to controlled abort.

# Lab 4 — Build a UAF target and observe with ASAN.
cat > uaf.c <<'EOF'
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
int main() {
    char *p = malloc(64);
    strcpy(p, "before free");
    free(p);
    printf("UAF read: %s\n", p);   // bug: read after free
    return 0;
}
EOF
gcc -fsanitize=address -g -o uaf_asan uaf.c
./uaf_asan
# Expected: ASAN reports "heap-use-after-free" with the alloc/free/use stacks.

# Lab 5 — Fuzz a tiny C parser with libFuzzer.
cat > fuzz_target.c <<'EOF'
#include <stdint.h>
#include <string.h>
#include <stdlib.h>
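int parse(const uint8_t *data, size_t size) {
    char buf[32] = {0};                      // initialized so buf[0] is defined when size == 0
    if (size > 0) memcpy(buf, data, size);   // intentional bug: size is never checked against sizeof buf
    return buf[0];
}
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse(data, size);
    return 0;   // non-zero return values are reserved by libFuzzer
}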
EOF
clang -fsanitize=fuzzer,address -g -o fuzz fuzz_target.c
./fuzz -runs=1000000
# Expected: libFuzzer finds the memcpy overflow within seconds, prints the
# reproducer input, and writes it to a crash-* file.

# Lab 6 — Inspect compiler-emitted mitigation primitives.
gcc -O2 -fstack-protector-strong -fPIE -fcf-protection=full -c vuln.c -o vuln.o
# -r prints relocations; in an unlinked object the __stack_chk_fail call target only appears there.
objdump -dr vuln.o | grep -E "endbr|__stack_chk_fail" | head
# Expected: see __stack_chk_fail references (canary check) and endbr32/endbr64
# instructions (Intel CET indirect-branch tracking). These are the primitives
# that exploit mitigations are built from.

Practical examples

Heartbleed (CVE-2014-0160, OpenSSL). Classic out-of-bounds read in the TLS heartbeat extension. The server trusted the attacker-supplied payload length and copied that many bytes starting at a much smaller received payload, returning up to 64 KB of process memory per request. Leaked private keys, session cookies, passwords. The fix was a bounds check; the cost of the unfixed window was industry-wide certificate revocation and reissuance.
Use-after-free in Chrome V8 (regular Pwn2Own target). Modern browser exploitation chains routinely start with a JIT-engine type-confusion or UAF in the rendering engine. Yields renderer-process RCE → escapes the sandbox via a Mojo IPC bug → escalates via a kernel UAF → finally lands as kernel context. Three or four memory-corruption primitives chained.
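iMessage zero-clicks: FORCEDENTRY (CVE-2021-30860) and BLASTPASS (CVE-2023-41064). Both abused Apple's image-parsing code on attacker-supplied content. In FORCEDENTRY, an integer overflow in the JBIG2 decoder, reached through a PDF, produced an undersized allocation followed by a heap overflow. BLASTPASS was a buffer overflow in ImageIO reached through a malicious image, reported alongside the libwebp heap overflow CVE-2023-4863. Full RCE on the device with no user interaction. The chains depended on multiple memory-corruption primitives stacked across parser stages.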
CVE-2021-3156 (Sudo "Baron Samedit") heap buffer overflow. Off-by-one error in Sudo's command-line parsing. Despite stack canaries and ASLR, the bug yields heap corruption sufficient to obtain root locally on virtually every Linux distribution shipping Sudo at the time. Demonstrates that long-mature code (Sudo, decades old) is not immune.
ASAN catches a UAF in CI before deploy. Routine fuzzing run on a new image-parser library finds a UAF in the cleanup path on malformed input. ASAN stack trace points exactly at the dangling-pointer access; engineers fix in 20 minutes. This is the modal real-world detection-of-bug outcome in well-instrumented codebases — most memory corruption is caught pre-deploy, not in production.
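
The Heartbleed pattern above reduces to one missing comparison. The sketch below shows that comparison on a hypothetical record layout (a 2-byte big-endian claimed length followed by the payload); echo_payload and the layout are assumptions for illustration and are not OpenSSL's code or its actual patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical heartbeat-style record: [2-byte big-endian length][payload].
   Echoes the payload into out and returns the number of bytes written,
   or -1 if the claimed length exceeds what was actually received. */
long echo_payload(const uint8_t *record, size_t record_len,
                  uint8_t *out, size_t out_size) {
    if (record_len < 2) return -1;
    size_t claimed = ((size_t)record[0] << 8) | record[1];

    /* The check that Heartbleed-style bugs omit: trust the number of bytes
       received, not the length field supplied by the peer. */
    if (claimed > record_len - 2) return -1;
    if (claimed > out_size) return -1;

    memcpy(out, record + 2, claimed);
    return (long)claimed;
}
```

A record that claims 0xFFFF bytes but carries one byte of payload is rejected instead of echoing adjacent memory; a record whose claimed length matches its payload is echoed unchanged.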

Related notes

stack-buffer-overflow: the canonical worked example of class 1 (stack-based corruption); the deep-dive companion to this note.
exploit-mitigations: the structural defender pair; what ASLR, DEP, CET, MTE, canaries, and PAC actually do and what they cost an attacker.
CIA triad: memory corruption is usually an integrity breach (program state changed without authorization) that enables further breaches; the framing matters for incident classification.
Threat Modeling Quickstart: memory corruption is the canonical "Tampering" STRIDE category at the systems-software layer.
Attacker-Defender Duality: exploit research and mitigation engineering are the same field looked at from opposite chairs; the duality is especially clean here.
AEAD and nonce misuse: memory corruption that leaks AEAD keys (or causes nonce reuse via cross-allocation aliasing) destroys cryptographic security; the two notes intersect.
EDR / process correlation: modern memory-corruption exploits leave EDR-visible signals (unexpected VirtualProtect, ROP gadget patterns, unusual control-flow transfers).
Windows Privilege Escalation: kernel memory corruption is one of the listed privesc primitives; LPE chains often terminate in kernel memory-corruption bugs.

Suggested future atomic notes

heap-buffer-overflow-and-allocator-exploitation — class 2 deep dive: ptmalloc / tcmalloc / mimalloc / segment heap.
use-after-free-and-dangling-pointers — class 3 deep dive: the dominant browser-exploit primitive.
double-free-and-allocator-corruption — class 4 deep dive: House-of-* lineage.
out-of-bounds-read-and-info-leaks — class 5 deep dive: Heartbleed pattern, info-leak primitives.
integer-overflow-and-type-confusion — class 6 deep dive: the canonical cause of allocation-size bugs.
rop-and-ret2libc — the primary technique for code execution despite DEP.
aslr-pie-and-info-leak-chains — defeating address-space layout randomization.
stack-canaries-and-shadow-stacks — Intel CET / ARM PAC / SSP deep dive.
arm-mte-and-memory-tagging — the hardware approach to memory safety.
control-flow-integrity-cfi — the indirect-branch protection layer.
fuzzing-with-libfuzzer-and-afl — modern coverage-guided fuzzing patterns.
sanitizers-asan-msan-ubsan — the runtime-detection toolkit for development and CI.
detect-memory-corruption-exploitation — defender-side playbook pair.

References

Foundational: Aleph One — Smashing the Stack for Fun and Profit (Phrack 49, 1996; the foundational text) — http://phrack.org/issues/49/14.html
Research / Deep Dive: Microsoft Security Response Center — A proactive approach to more secure code (memory-safety vulnerability statistics) — https://msrc.microsoft.com/blog/2019/07/a-proactive-approach-to-more-secure-code/
Research / Deep Dive: Dennis Andriesse — Practical Binary Analysis (No Starch Press, 2018) — the modern reference on binary analysis and exploitation toolchain
Official Tool Docs: AddressSanitizer — https://clang.llvm.org/docs/AddressSanitizer.html
Research / Deep Dive: ARM — Memory Tagging Extension (MTE) — https://developer.arm.com/documentation/108035/latest/

Reference system