Introduction to JEB Scripting for Android Malware Analysis
JEB Decompiler is an indispensable tool for reverse engineering Android applications, offering powerful static analysis capabilities. While its interactive graphical user interface (GUI) provides extensive features for manual inspection, the true efficiency in dealing with a high volume of samples or performing repetitive, complex tasks lies in leveraging its robust Python scripting API. This masterclass will guide you through the fundamentals of JEB scripting, demonstrating how to automate common Android malware analysis workflows, thereby accelerating your reverse engineering efforts and enabling custom analyses that go beyond the GUI’s default capabilities.
Automating tasks within JEB can transform your workflow from a tedious, manual process into a streamlined, reproducible pipeline. Imagine automatically identifying all cryptographic API calls, extracting specific types of strings (e.g., URLs, IP addresses), or even renaming obfuscated methods across hundreds of samples. JEB’s Python API makes this possible, granting programmatic access to virtually every aspect of the loaded artifacts, from DEX units and classes to methods, instructions, and even decompiled Java code.
Setting Up Your JEB Scripting Environment
JEB scripts are primarily written in Python. To get started, you’ll need to understand the basic structure of a JEB script and how it interacts with the JEB client. JEB executes scripts with Jython, a Java implementation of Python 2.7, so scripts must use Python 2.7 syntax (no f-strings or type annotations) and can import JEB’s Java classes directly. You typically don’t need a separate Python installation unless you’re integrating with external tools or libraries.
Basic Script Structure
Every JEB script must inherit from `com.pnfsoftware.jeb.client.api.IScript` and implement the `run` method. The `run` method receives a context object (`ctx`) which is your gateway to the JEB API. The context allows you to access the project, artifacts, units, and logging facilities.
from com.pnfsoftware.jeb.client.api import IScript
from com.pnfsoftware.jeb.core.units.code.android import IDexUnit


class AutomatedMalwareAnalysis(IScript):
    def run(self, ctx):
        """Main entry point for the JEB script."""
        self.ctx = ctx
        self.logger = ctx.getLogger()
        self.logger.info("JEB Scripting Masterclass: Starting automated analysis...")

        # Access the project; stop early if no sample is loaded
        project = ctx.getMainProject()
        if not project:
            self.logger.error("No project loaded. Please load an Android sample.")
            return

        # Find all DEX units within the project
        dex_units = project.findUnits(IDexUnit, False)  # False for not creating new units
        if not dex_units:
            self.logger.error("No DEX units found in the current project.")
            return

        for dex_unit in dex_units:
            # str.format() is used instead of f-strings, which Jython 2.7 does not support
            self.logger.info("Analyzing DEX unit: {} (ID: {})".format(
                dex_unit.getName(), dex_unit.getUnitId()))
            # Call specific analysis functions here
            self.find_crypto_apis(dex_unit)
            self.extract_strings_with_patterns(dex_unit)

        self.logger.info("Automated analysis completed.")
Running a Script
To run a script in JEB, save your Python file under the same name as the script class (here, `AutomatedMalwareAnalysis.py`), then, within the JEB GUI, go to `File -> Script -> Execute script…` and select your file. The output will appear in the Logger window.
Practical Example 1: Identifying Crypto-Related API Calls
Goal and Rationale
Malware frequently employs encryption for various purposes, such as protecting command-and-control (C2) communications, encrypting configuration data, or obfuscating payloads. Rapidly identifying the use of Java cryptographic APIs (e.g., `javax.crypto.*`, `java.security.*`) is a crucial first step in understanding a sample’s capabilities and potentially locating decryption routines or keys. This automation saves significant time compared to manually searching through method calls.
Script Implementation
This function iterates through all methods in a DEX unit and then examines each instruction within those methods for calls to specific cryptographic packages.
# ... (inside AutomatedMalwareAnalysis class) ...
    def find_crypto_apis(self, dex_unit):
        self.logger.info("Searching for cryptographic API calls in {}...".format(dex_unit.getName()))
        crypto_methods_found = []
        # Common crypto package prefixes, in Dalvik signature form
        crypto_packages = (
            "Ljavax/crypto/",
            "Ljava/security/",
            "Landroid/security/",
        )
        for m in dex_unit.getMethods():
            if not m.isInternal():  # Focus on methods implemented in the app itself
                continue
            body = m.getBody()
            if not body:  # Skip methods without a body (e.g., abstract methods, native methods)
                continue
            for instr in body.getInstructions():
                # Check if the instruction is a method call
                call = instr.getCall()
                if call and call.getMethod():
                    called_method = call.getMethod()
                    # e.g., Ljavax/crypto/Cipher;->getInstance(Ljava/lang/String;)Ljavax/crypto/Cipher;
                    method_signature = called_method.getSignature()
                    # Match on the start of the signature (the declaring class), so that
                    # methods which only take or return a crypto type are not reported
                    if method_signature.startswith(crypto_packages):
                        caller_signature = m.getSignature()  # The method containing the crypto call
                        crypto_methods_found.append((caller_signature, method_signature))
                        self.logger.warn("Found crypto API call in {} to {}".format(
                            caller_signature, method_signature))
        self.logger.info("Finished searching for crypto APIs in {}. Found {} instances.".format(
            dex_unit.getName(), len(crypto_methods_found)))
        return crypto_methods_found
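The package test is the part of this routine that is easiest to get subtly wrong, and it can be exercised without JEB. The sketch below is a standalone helper, not part of the JEB API: `CRYPTO_PREFIXES`, `callee_class`, and `is_crypto_call` are illustrative names, and it assumes signatures in the `Lpkg/Class;->name(args)ret` form shown in the comment above. It looks only at the declaring class, so a method that merely accepts a `java.security.Key` argument is not counted.

```python
# Standalone helper with hypothetical names; no JEB imports are needed.
CRYPTO_PREFIXES = ("Ljavax/crypto/", "Ljava/security/", "Landroid/security/")


def callee_class(signature):
    """Return the declaring class of a 'Lpkg/Cls;->name(args)ret' signature."""
    return signature.split("->", 1)[0]


def is_crypto_call(signature, prefixes=CRYPTO_PREFIXES):
    """True if the called method is declared in one of the crypto packages."""
    return callee_class(signature).startswith(tuple(prefixes))


samples = [
    "Ljavax/crypto/Cipher;->getInstance(Ljava/lang/String;)Ljavax/crypto/Cipher;",
    "Lcom/example/Util;->store(Ljava/security/Key;)V",  # crypto type only as an argument
    "Ljava/security/MessageDigest;->digest([B)[B",
]
hits = [s for s in samples if is_crypto_call(s)]
```

Keeping this logic in a plain function makes it possible to test the filter on a handful of known signatures before running it across a large sample set.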
Practical Example 2: Extracting Hardcoded Strings with Patterns
Motivation
Hardcoded strings are a goldmine in malware analysis. They often reveal C2 server URLs, encryption keys, file paths, package names, unique identifiers, or other configuration details. Manually sifting through all strings in a large application is inefficient. Automating string extraction with regular expressions allows you to quickly pinpoint strings of interest, even if they are subtly obfuscated or mixed with legitimate data.
Script Walkthrough
This script demonstrates how to access all strings within a DEX unit and apply regular expressions to identify common malware-related patterns like URLs and IP addresses. You can easily extend this with patterns for file paths, API keys, or specific identifiers.
# At the top of the script file, next to the other imports
import re

# ... (inside AutomatedMalwareAnalysis class) ...
    def extract_strings_with_patterns(self, dex_unit):
        self.logger.info("Extracting strings based on patterns in {}...".format(dex_unit.getName()))
        extracted_strings = {}
        # Define common patterns for malware analysis
        c2_pattern = re.compile(r"https?://[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}(:\d+)?(/[a-zA-Z0-9._~:/?#\[\]@!$&'()*+,;=-]*)?")
        ip_pattern = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")
        filepath_pattern = re.compile(r"(/storage/emulated/0|/data/data/[a-zA-Z0-9._]+)(/[a-zA-Z0-9._-]+)+")
        # Access all strings referenced in the DEX unit
        for string_ref in dex_unit.getStrings():
            value = string_ref.getValue()
            if not value or len(value) < 5:  # Ignore very short or empty strings
                continue
            # Check for C2 URLs
            if c2_pattern.search(value):
                extracted_strings.setdefault("C2_URLs", []).append(value)
                self.logger.debug("C2 URL found: {}".format(value))
            # Check for IP addresses
            if ip_pattern.search(value):
                extracted_strings.setdefault("IP_Addresses", []).append(value)
                self.logger.debug("IP Address found: {}".format(value))
            # Check for common Android file paths
            if filepath_pattern.search(value):
                extracted_strings.setdefault("File_Paths", []).append(value)
                self.logger.debug("File path found: {}".format(value))
            # Add more patterns here as needed, e.g., specific API keys, package names
        self.logger.info("Finished extracting patterned strings from {}. Total categories found: {}.".format(
            dex_unit.getName(), len(extracted_strings)))
        # Optionally, you can log or save these results to a file
        for category, items in extracted_strings.items():
            self.logger.info(" {}: {} items".format(category, len(items)))
        return extracted_strings
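A loose dotted-quad pattern also matches version numbers and out-of-range values such as `999.300.1.2`, and the same string often appears more than once in a DEX file. The following standalone sketch shows one way to tighten the results after matching. `URL_RE`, `IP_RE`, `valid_ipv4`, and `classify` are hypothetical names, the URL pattern is a simplified stand-in rather than a full RFC 3986 matcher, and the sample strings are made up.

```python
import re

# Simplified patterns for illustration only
URL_RE = re.compile(r"https?://[A-Za-z0-9.-]+\.[A-Za-z]{2,}(?::\d+)?(?:/[^\s\"']*)?")
IP_RE = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")


def valid_ipv4(candidate):
    """Reject dotted quads with an octet above 255, e.g. '999.300.1.2'."""
    return all(int(part) <= 255 for part in candidate.split("."))


def classify(values):
    """Group de-duplicated indicators by kind."""
    found = {"urls": set(), "ips": set()}
    for value in values:
        found["urls"].update(m.group(0) for m in URL_RE.finditer(value))
        found["ips"].update(ip for ip in IP_RE.findall(value) if valid_ipv4(ip))
    return found


sample = [
    "http://evil-c2.example.com:8080/gate.php",
    "connect to 192.168.1.10 then 10.0.0.1",
    "build 999.300.1.2",
    "http://evil-c2.example.com:8080/gate.php",  # duplicate entry
]
result = classify(sample)
```

Using sets keeps each indicator once, and extracting the matched substring (rather than the whole string) makes the output easier to feed into other tools.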
Advanced JEB Scripting Concepts and Best Practices
Interacting with the UI and Decompiler
JEB’s API also allows you to interact with the GUI, display custom dialogs, or even modify the decompiled output. For instance, you can use `ctx.displayMessageBox()` for simple pop-ups or `ctx.executeAction()` to trigger GUI actions programmatically. These UI calls are only meaningful when the script is launched from the GUI client, so guard them if the same script may also run headless. Adding comments or renaming elements directly in the database, using methods like `m.setName()` or `c.setName()` on `IDexMethod` or `IDexClass` objects, can greatly enhance readability for subsequent manual analysis.
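Before renaming anything, it helps to decide which names to change and what to change them to. The sketch below is plain Python with no JEB calls: `looks_obfuscated` and `plan_renames` are hypothetical helpers, and the "one or two lowercase letters" rule is only an assumed heuristic for ProGuard-style names. The plan is keyed by simple name for brevity; a real script would more likely key it by full method signature and then apply each entry through the `setName()` calls mentioned above.

```python
import re

# Assumed heuristic: obfuscated names are one or two lowercase letters
SHORT_NAME_RE = re.compile(r"^[a-z]{1,2}$")


def looks_obfuscated(name):
    return bool(SHORT_NAME_RE.match(name))


def plan_renames(names, prefix="method_"):
    """Map each obfuscated-looking name to a unique, sortable replacement."""
    plan = {}
    for name in names:
        if looks_obfuscated(name) and name not in plan:
            plan[name] = "%s%04d" % (prefix, len(plan))
    return plan


plan = plan_renames(["a", "onCreate", "b", "a", "decryptConfig", "zz"])
```

Separating the planning step from the renaming step also lets you log or review the mapping before the project database is modified.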
Handling Multiple Artifacts and Units
Real-world Android applications often consist of multiple DEX files (e.g., `classes.dex`, `classes2.dex`). The initial script example iterates through all `IDexUnit` instances, ensuring comprehensive analysis across all components of the application. Always design your scripts to be robust against multi-DEX scenarios.
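When each DEX unit produces its own result dictionary, the per-unit results usually need to be combined into one view of the application. A minimal sketch of such a merge follows. `merge_results` is a hypothetical helper, and the input is assumed to have the `{category: [values]}` shape returned by `extract_strings_with_patterns`, keyed by unit name.

```python
def merge_results(per_unit):
    """Combine {unit: {category: [values]}} into {category: {value: [units]}}.

    Each value appears once, with the sorted list of units it was seen in.
    """
    merged = {}
    for unit_name, categories in per_unit.items():
        for category, values in categories.items():
            bucket = merged.setdefault(category, {})
            for value in values:
                units = bucket.setdefault(value, [])
                if unit_name not in units:
                    units.append(unit_name)
    for bucket in merged.values():
        for units in bucket.values():
            units.sort()
    return merged


per_unit = {
    "classes.dex": {"C2_URLs": ["http://c2.example.com/a"], "IP_Addresses": ["10.0.0.1"]},
    "classes2.dex": {"C2_URLs": ["http://c2.example.com/a", "http://c2.example.com/b"]},
}
merged = merge_results(per_unit)
```

Recording which unit each value came from preserves a detail that is often useful later, for example when a secondary DEX file turns out to hold the malicious payload.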
Logging and Debugging
Effective logging is crucial for debugging and understanding your script’s execution. Use `self.logger.info()`, `self.logger.warn()`, `self.logger.error()`, and `self.logger.debug()` to output messages to JEB’s Logger window. For complex scripts, consider writing results to an external file using Python’s standard file I/O operations.
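As one way to persist results, the sketch below writes a result dictionary to a JSON file using only the standard library and syntax that also works under Python 2.7. `save_report` and the file name `jeb_report.json` are illustrative choices, and the report contents are made-up sample data.

```python
import json
import os
import tempfile


def save_report(results, path):
    """Write analysis results as indented JSON and return the path written."""
    with open(path, "w") as handle:
        json.dump(results, handle, indent=2, sort_keys=True)
    return path


report = {
    "C2_URLs": ["http://c2.example.com/gate.php"],
    "IP_Addresses": ["10.0.0.1"],
}
out_dir = tempfile.mkdtemp()
out_path = save_report(report, os.path.join(out_dir, "jeb_report.json"))

# Read the file back to confirm it round-trips
with open(out_path) as handle:
    loaded = json.load(handle)
```

Sorted keys and fixed indentation keep the output stable between runs, which makes reports from different samples easy to diff.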
Modularity and Reusability
As your scripts grow, organize your code into functions and classes. This improves readability, maintainability, and reusability. For common tasks, consider creating a library of helper functions that can be imported into multiple analysis scripts.
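One possible way to structure such a helper library is a small registry of analysis passes that all take the same input. The sketch below is independent of JEB and uses hypothetical names throughout (`analysis_pass`, `PASSES`, `run_all`, and the two example passes); the input here is a plain list of strings standing in for data pulled from a DEX unit.

```python
def analysis_pass(registry):
    """Decorator that records a function in `registry` under its own name."""
    def register(func):
        registry[func.__name__] = func
        return func
    return register


PASSES = {}


@analysis_pass(PASSES)
def count_long_strings(strings):
    return sum(1 for s in strings if len(s) >= 5)


@analysis_pass(PASSES)
def find_urls(strings):
    return [s for s in strings if s.startswith(("http://", "https://"))]


def run_all(strings, passes=PASSES):
    """Run every registered pass on the same input and collect the results."""
    return dict((name, func(strings)) for name, func in passes.items())


summary = run_all(["http://c2.example.com", "abc", "config.bin"])
```

With this layout, adding a new check means writing one decorated function; the script’s `run` method does not need to change.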
Conclusion
JEB scripting turns tedious manual steps in Android malware analysis into efficient, automated workflows. By understanding the core API concepts and applying them to practical problems like identifying crypto calls or extracting patterned strings, you can significantly enhance your reverse engineering capabilities. This masterclass has provided a foundation; the rest comes from applying JEB’s programmatic access to the underlying binary structures to your own analysis challenges. Embrace scripting to make your reverse engineering faster, more consistent, and more thorough.