Understanding the inner workings of the Android kernel is crucial for optimizing performance, enhancing stability, and resolving complex system issues. While various debugging tools exist, Ftrace stands out as an indispensable, in-kernel tracing utility that offers unparalleled visibility into the kernel’s real-time behavior. This article delves into advanced Ftrace techniques, guiding you through its powerful capabilities to diagnose anything from UI jank to subtle system freezes on Android devices.
Accessing and Initializing Ftrace on Android
Before diving into advanced features, ensure you have root access on your Android device and ADB configured on your host machine. Ftrace's control and data files live in tracefs, which has traditionally been reached through debugfs at /sys/kernel/debug/tracing; on newer kernels it may also be mounted directly at /sys/kernel/tracing.
adb shell
su
cd /sys/kernel/debug/tracing
This directory contains numerous files to configure and interact with Ftrace. It’s good practice to clear previous trace data and disable tracing before starting a new session.
echo 0 > tracing_on
echo > trace
echo nop > current_tracer
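The reset sequence above can be wrapped in a small function so every session starts from the same state. This is a minimal sketch, not part of Ftrace itself: the function name `ftrace_reset` and the `TRACING_DIR` variable are assumptions introduced for illustration, with the directory defaulting to the path used in this article.

```shell
# Hypothetical helper: put Ftrace into a clean, idle state.
# TRACING_DIR is an assumed override; it defaults to the article's path.
ftrace_reset() {
    _dir="${TRACING_DIR:-/sys/kernel/debug/tracing}"
    # Fail early if the control file is not writable (no root, not mounted).
    [ -w "$_dir/tracing_on" ] || { echo "cannot write $_dir/tracing_on" >&2; return 1; }
    echo 0 > "$_dir/tracing_on"        # stop recording
    echo > "$_dir/trace"               # clear the ring buffer
    echo nop > "$_dir/current_tracer"  # deselect any active tracer
}
```

Run `ftrace_reset` from the root shell before each capture. Checking writability first turns a silent permission problem into a clear error message.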
Demystifying Ftrace Tracers and Events
Ftrace offers various ‘tracers’, each designed for a specific type of kernel activity. While the function tracer provides basic function call tracking, advanced scenarios often demand more specialized tools.
Event Tracing: Pinpointing Subsystem Behavior
Kernel events are predefined points in the kernel code that log specific actions, such as scheduling decisions, memory allocations, or driver-specific operations. Tracing these events offers a high-level view of system dynamics without the overhead of tracing every function call.
To list available event categories and individual events:
cat available_events
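The available_events file prints one `subsystem:event` pair per line, and the list can run to hundreds of entries. As a rough sketch, the helper below (the name `summarize_events` is made up for this example) counts events per subsystem so you can see which categories your kernel exposes before enabling any of them.

```shell
# Hypothetical helper: read "subsystem:event" lines on stdin and
# print "subsystem count", sorted by subsystem name.
summarize_events() {
    awk -F: 'NF == 2 { n[$1]++ } END { for (s in n) print s, n[s] }' | sort
}
```

Typical use from the tracing directory would be `summarize_events < available_events`.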
For example, to trace scheduler and interrupt events, you would enable them like this:
echo 1 > events/sched/enable
echo 1 > events/irq/enable
echo 1 > tracing_on # Start tracing
# Perform actions you want to trace
echo 0 > tracing_on # Stop tracing
cat trace > /sdcard/sched_irq_trace.txt # Save trace data
Analyzing sched events can reveal scheduler latency, CPU wake-ups, and process priority inversions, which are common culprits for performance issues.
Function Graph Tracer: Unveiling Execution Flow and Latency
The function_graph tracer is a powerful tool for understanding the call graph and execution times of functions. Unlike the simpler function tracer, it shows function entry and exit, along with the time spent within each function and its children. This is invaluable for identifying bottlenecks.
echo function_graph > current_tracer
# To trace a specific function, e.g., 'binder_thread_read'
echo binder_thread_read > set_graph_function
echo 1 > tracing_on # Start tracing
# Reproduce the issue
echo 0 > tracing_on # Stop tracing
cat trace # Full log; use 'cat trace_pipe' while tracing is on for live output
The output provides a hierarchical view, with indentation indicating call depth and a duration printed for each function, making it easy to spot functions consuming excessive time.
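Scanning a long function_graph log by eye is slow. The sketch below filters a saved log down to lines whose duration meets a threshold. It assumes the default output layout, where the duration appears as a number followed by the unit `us`; the function name `slow_calls` and its argument are illustrative, not part of Ftrace.

```shell
# Hypothetical helper: print function_graph lines whose duration is at
# least N microseconds (default 100). Reads a saved trace on stdin.
slow_calls() {
    awk -v min="${1:-100}" '
        /^#/ { next }                     # skip header comment lines
        {
            # The duration is the field immediately before the "us" unit.
            for (i = 2; i <= NF; i++)
                if ($i == "us" && $(i - 1) + 0 >= min + 0) { print; break }
        }
    '
}
```

For example, `slow_calls 500 < graph_trace.txt` keeps only calls that took 500 microseconds or longer, where graph_trace.txt is whatever file you saved the trace to.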
Filtering and Buffering for Precision
The sheer volume of kernel events can quickly overwhelm the trace buffer. Ftrace provides powerful filtering mechanisms to focus on relevant data.
- Function Filtering (set_ftrace_filter): Specify exact function names or glob patterns to trace only specific functions. This dramatically reduces overhead.
    echo 'msm_fb_xxx_commit' > set_ftrace_filter # Trace a specific function
    echo 'drm_*' > set_ftrace_filter # Trace all functions starting with 'drm_' (replaces the previous filter)
  To clear the filter:
    echo > set_ftrace_filter
- Notrace Filter (set_ftrace_notrace): Exclude specific functions from tracing. Useful when a function is too noisy but you still want to trace the other functions in its call path.
    echo 'futex_*' > set_ftrace_notrace
- Ring Buffer Management: Control the size of the kernel’s trace buffer and its overwrite behavior.
    echo 10240 > buffer_size_kb # Set buffer to 10 MB (per CPU)
    echo 1 > options/overwrite # Allow new events to overwrite the oldest ones (default)
    echo 0 > options/overwrite # Keep the oldest events and drop new ones once the buffer is full
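Because buffer_size_kb applies to each CPU separately, the total memory reserved grows with the core count. The arithmetic can be sketched as follows; the function name `total_buffer_kb` is hypothetical, and the CPU count is passed in explicitly rather than detected.

```shell
# Hypothetical helper: total ring buffer memory in KB for a given
# per-CPU size (KB) and number of CPUs.
total_buffer_kb() {
    echo $(( $1 * $2 ))
}
```

With the 10240 KB setting shown earlier on an eight-core device, `total_buffer_kb 10240 8` gives 81920 KB, or 80 MB, which is worth weighing on a memory-constrained phone.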
Practical Walkthrough: Diagnosing Scheduler Latency
Let’s use Ftrace to investigate scheduler latency, a common cause of UI jank. We’ll measure how long a task waits between becoming runnable and actually getting the CPU.
Step 1: Setup and Enable Event Tracing
First, clear any previous trace data and set up for scheduler event tracing.
cd /sys/kernel/debug/tracing
echo 0 > tracing_on
echo > trace
echo nop > current_tracer
# Enable scheduler and task events for detailed insights
echo 1 > events/sched/sched_switch/enable
echo 1 > events/sched/sched_wakeup/enable
echo 1 > events/sched/sched_wakeup_new/enable
echo 1 > events/task/task_newtask/enable
echo 1 > events/task/task_rename/enable
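Not every kernel build exposes every event, and a redirect into a missing enable file fails with an easily overlooked error. As a sketch under that assumption, the helper below enables a list of events and reports any that are absent. The name `enable_events` and the `TRACING_DIR` override are introduced here for illustration only.

```shell
# Hypothetical helper: enable each "subsystem/event" given as an argument.
# Returns non-zero if any requested event is missing on this kernel.
enable_events() {
    _dir="${TRACING_DIR:-/sys/kernel/debug/tracing}"
    _missing=0
    for _ev in "$@"; do
        _f="$_dir/events/$_ev/enable"
        if [ -w "$_f" ]; then
            echo 1 > "$_f"
        else
            echo "event not available: $_ev" >&2
            _missing=$((_missing + 1))
        fi
    done
    [ "$_missing" -eq 0 ]
}
```

The events from this step would be enabled with `enable_events sched/sched_switch sched/sched_wakeup sched/sched_wakeup_new task/task_newtask task/task_rename`.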
Step 2: Capture Trace Data
Start tracing, then perform the UI action or scenario that exhibits jank or latency, for example scrolling a long list or launching an application.
echo 1 > tracing_on # Start capturing
# Perform UI actions or trigger the scenario
echo 0 > tracing_on # Stop capturing
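Typing the stop command by hand leaves the capture window open for as long as you take to switch back to the terminal, and the ring buffer may wrap in the meantime. One option is a fixed-length capture, sketched below. The function name `capture_trace`, its arguments, and the `TRACING_DIR` override are assumptions for illustration; the default output path is the one this walkthrough uses.

```shell
# Hypothetical helper: record for a fixed number of seconds, then save
# the trace as text. Usage: capture_trace [seconds] [output_file]
capture_trace() {
    _dir="${TRACING_DIR:-/sys/kernel/debug/tracing}"
    _secs="${1:-5}"
    _out="${2:-/sdcard/scheduler_latency_trace.txt}"
    echo 1 > "$_dir/tracing_on" || return 1  # start capturing
    sleep "$_secs"                           # reproduce the issue during this window
    echo 0 > "$_dir/tracing_on"              # stop capturing
    cat "$_dir/trace" > "$_out"              # save the buffer contents
}
```

Calling `capture_trace 10` gives you ten seconds to reproduce the jank before recording stops on its own.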
Step 3: Analyze the Trace
Extract the trace data. For deep analysis, transferring it to a host machine is recommended. Note that KernelShark opens the binary trace.dat files recorded by trace-cmd rather than a text dump of the trace file, so the text capture below is best inspected with ordinary text tools. A quick look via trace_pipe or cat trace can already reveal patterns.
# On the device
cat trace > /sdcard/scheduler_latency_trace.txt
# On the host: pull the file for analysis
adb pull /sdcard/scheduler_latency_trace.txt .
Look for sched_wakeup events followed by a significant delay before the corresponding sched_switch for that task. High latency here indicates the task was ready but couldn’t get CPU time. Investigate what task was running during that delay (often shown by other sched_switch events) or if interrupts or other kernel work were occupying the CPU.
system_server-567 [002] ... sched_wakeup: comm=your_app pid=1234 prio=120 target_cpu=002
<preemption-disabled>... # other events occurring while task is runnable but not executing
system_server-567 [002] ... sched_switch: prev_comm=system_server prev_pid=567 prev_prio=120 ... next_comm=your_app next_pid=1234 next_prio=120 ...
The time delta between the sched_wakeup and the sched_switch that names your_app as next_comm is the wakeup latency. The leading task-PID column shows the task that was running on that CPU when the event was logged, so the woken task is identified by the comm= and pid= fields rather than by that column. Check prev_comm in the sched_switch to see what was holding the CPU.
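Pairing wakeups with switches by hand does not scale past a few events. The sketch below automates the pairing on a saved text trace. It assumes the default text layout, in which the timestamp (in seconds, with a trailing colon) immediately precedes the event name, and it identifies tasks by the `pid=` and `next_pid=` fields. The function name `wakeup_latency` is invented for this example, and tasks that were preempted rather than woken are not covered.

```shell
# Hypothetical helper: print "pid latency_us" for each wakeup-to-run pair
# found in a text trace on stdin. Optional argument: minimum latency in us.
wakeup_latency() {
    awk -v min_us="${1:-0}" '
        /^#/ { next }                          # skip header comment lines
        {
            # Locate the event name; the timestamp is the field before it.
            ev = 0
            for (i = 2; i <= NF; i++)
                if ($i == "sched_wakeup:" || $i == "sched_wakeup_new:" || $i == "sched_switch:") { ev = i; break }
            if (!ev) next
            ts = $(ev - 1); sub(/:$/, "", ts)
        }
        $ev != "sched_switch:" {
            # Wakeup: remember the first wakeup time for the woken pid.
            if (match($0, / pid=[0-9]+/)) {
                pid = substr($0, RSTART + 5, RLENGTH - 5)
                if (!(pid in woke)) woke[pid] = ts
            }
            next
        }
        {
            # Switch: if the incoming task has a pending wakeup, report the gap.
            if (!match($0, /next_pid=[0-9]+/)) next
            pid = substr($0, RSTART + 9, RLENGTH - 9)
            if (pid in woke) {
                us = (ts - woke[pid]) * 1000000
                if (us >= min_us + 0) printf "%s %.0f\n", pid, us
                delete woke[pid]
            }
        }
    '
}
```

On the host, `wakeup_latency 2000 < scheduler_latency_trace.txt` would list only the wakeups that waited two milliseconds or more, which is a reasonable starting threshold when a frame budget is around 16 ms.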
Conclusion
Ftrace is an incredibly powerful, yet often underutilized, tool in the Android kernel developer’s arsenal. By mastering advanced techniques such as event tracing, function graph analysis, and intelligent filtering, you can gain unprecedented visibility into kernel operations. This mastery empowers you to precisely pinpoint performance bottlenecks, diagnose obscure stability issues, and ultimately build more robust and efficient Android systems.