
Mastering Ghidra Sleigh: Developing Custom Call Conventions and Data Types for Android RE


Introduction to Ghidra Sleigh for Android Reverse Engineering

Ghidra, the open-source software reverse engineering (SRE) suite from the NSA, has made advanced binary analysis widely accessible. At its core lies Sleigh, a processor specification language that describes how machine instructions are decoded and translated into p-code, Ghidra's intermediate representation. This is what lets Ghidra model the semantics of a wide range of CPU architectures. For Android reverse engineers, understanding Sleigh and the specification files that accompany it helps when tackling native ARM and ARM64 binaries that employ custom calling conventions, obfuscated instruction sequences, or non-standard data types. While Ghidra supports ARM architectures well out of the box, real-world Android binaries often deviate from the standard conventions, especially in proprietary libraries or malware, and may call for custom extensions to the processor module.

This article dives deep into developing custom call conventions and data type interpretations using Ghidra’s Sleigh language. We’ll explore the structure of a Ghidra processor module, specifically focusing on the .pspec, .slaspec, and .cspec files, and demonstrate how to tailor them to enhance decompilation accuracy for challenging Android binaries.

Understanding the Ghidra Processor Module Structure

A Ghidra processor module is a collection of files that describe a CPU architecture to Ghidra. Key files include:

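  • .pspec (Processor Specification): Defines processor-level properties such as the program counter, default context register values, and register groupings. It does not itself link the other files; a separate language definition file (.ldefs) associates each language with its .pspec, its compiler specifications (.cspec), and its compiled Sleigh file (.sla).
  • .slaspec (Sleigh Specification): The heart of the processor module, containing the Sleigh source that defines instruction encodings, operands, and their corresponding p-code semantics. It is compiled into the .sla file that Ghidra loads.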
  • .cspec (Compiler Specification): Defines calling conventions, register usage, stack frame information, and data organization specific to different compilers or operating environments.
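
In practice, a language definition file (.ldefs) in the same directory names the files that make up each language. The fragment below is a minimal sketch modeled on the layout of Ghidra's AARCH64 language definitions. The attribute values are illustrative, and the second compiler entry (its name, its id, and the my_android_arm64.cspec filename) is a hypothetical addition, not something that ships with Ghidra.

```xml
<language_definitions>
  <!-- One <language> element per language ID; the values here are illustrative. -->
  <language processor="AARCH64"
            endian="little"
            size="64"
            variant="v8A"
            version="1.0"
            slafile="AARCH64.sla"
            processorspec="AARCH64.pspec"
            id="AARCH64:LE:64:v8A">
    <description>AArch64 little-endian (illustrative entry)</description>
    <!-- Compiler spec that comes with the module. -->
    <compiler name="default" spec="AARCH64.cspec" id="default"/>
    <!-- Hypothetical extra entry pointing at a custom compiler spec. -->
    <compiler name="Android custom" spec="my_android_arm64.cspec" id="android_custom"/>
  </language>
</language_definitions>
```

Note that slafile refers to the compiled .sla output rather than the .slaspec source, and that each compiler element makes one .cspec selectable for the language at import time.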

For Android reverse engineering, especially when dealing with native shared libraries (.so files), the .cspec is paramount for correctly modeling function calls, while intelligent use of .slaspec can help interpret custom instruction patterns or data accesses.

Developing Custom Call Conventions with .cspec

Android native libraries typically adhere to the standard procedure call conventions for AArch64 (AAPCS64) or 32-bit ARM EABI (AAPCS). However, developers might implement custom function prologues and epilogues, inline assembly, or even custom trampolines that deviate from these standards. Ghidra's decompiler relies heavily on .cspec definitions to identify function arguments, return values, and stack frame layout correctly.

Consider a scenario where a native library uses a custom calling convention for certain internal functions. Suppose these functions always pass the first argument (a pointer to a context struct) in register x19, regardless of other parameters, and return a 64-bit value split across x20 and x21 (lower and upper halves, respectively). This deviates from the standard AArch64 convention, which passes the first eight integer arguments in x0-x7 and returns integer values in x0. Because the standard convention treats x19-x28 as callee-saved, the decompiler will generally not recognize them as parameter or return locations unless the compiler specification says so.

Step 1: Locate or Create a Custom .cspec

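You'll typically find existing .cspec files alongside the other language files in Ghidra's processor definitions (e.g., Ghidra/Processors/AARCH64/data/languages/AARCH64.cspec for 64-bit ARM, or Ghidra/Processors/ARM/data/languages for 32-bit ARM). For a custom module, you might create a new .cspec or modify a copy of an existing one.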

Example my_android_arm64.cspec snippet:

<compiler_spec>
  <global_returns>
    <register name=
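
As a minimal sketch of how the convention described in the scenario might be expressed, the fragment below defines an additional prototype using the element layout found in Ghidra's stock AARCH64.cspec. Several things here are assumptions: the prototype name __ctxcall is hypothetical, the remaining arguments are assumed to follow the usual x0-x7 order, the list of unaffected registers is abbreviated, and the return value is modeled as a 64-bit quantity joined from the 32-bit views w20 (low half) and w21 (high half), which is only one possible reading of the scenario.

```xml
<compiler_spec>
  <!-- data_organization, stackpointer, default_proto, and the other required
       elements are assumed to be copied unchanged from the stock AARCH64.cspec. -->

  <!-- Hypothetical prototype name; any name that does not clash with existing ones works. -->
  <prototype name="__ctxcall" extrapop="0" stackshift="0">
    <input>
      <!-- Context pointer: always passed in x19 in this scenario. -->
      <pentry minsize="1" maxsize="8">
        <register name="x19"/>
      </pentry>
      <!-- Remaining integer arguments, assumed to follow the usual register order. -->
      <pentry minsize="1" maxsize="8">
        <register name="x0"/>
      </pentry>
      <pentry minsize="1" maxsize="8">
        <register name="x1"/>
      </pentry>
      <!-- Entries for x2 through x7 and for stack arguments would follow the same pattern. -->
    </input>
    <output>
      <!-- Assumed interpretation: 64-bit result with the high half in w21 and the low half in w20.
           In a join address, piece1 is the most significant piece. -->
      <pentry minsize="5" maxsize="8">
        <addr space="join" piece1="w21" piece2="w20"/>
      </pentry>
    </output>
    <unaffected>
      <!-- Abbreviated: registers the callee is assumed to preserve. x19, x20, and x21
           are left out because this convention uses them for input and output. -->
      <register name="x22"/>
      <register name="x23"/>
      <register name="x29"/>
      <register name="x30"/>
      <register name="sp"/>
    </unaffected>
  </prototype>
</compiler_spec>
```

Once Ghidra has loaded a specification containing such a prototype, its name should appear among the calling conventions that can be assigned to a function, so it can be applied only to the internal functions that use it while everything else keeps the default prototype.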
