Practical Lab: Cache-Timing Side-Channel Attack on Android AES Crypto Implementations

Introduction

Modern cryptographic algorithms, like the Advanced Encryption Standard (AES), are designed to be mathematically robust. However, their physical implementations can inadvertently leak sensitive information through side channels. One prominent side channel is cache timing, where an attacker can observe the timing differences of memory access operations to infer secrets. This article explores the principles and practical considerations of performing a cache-timing side-channel attack specifically targeting AES cryptographic implementations on Android devices.

We will delve into how AES lookup tables interact with CPU caches, how an attacker can leverage these interactions, and outline a methodology for constructing a proof-of-concept attack. Understanding these vulnerabilities is crucial for developing robust, side-channel-resistant cryptographic software.

Background: AES and CPU Caches

How AES Works (Briefly)

AES is a symmetric block cipher that operates on 128-bit blocks of data using key sizes of 128, 192, or 256 bits. Its core operations are SubBytes, ShiftRows, MixColumns, and AddRoundKey. The SubBytes step is particularly interesting for cache-timing attacks. It substitutes each byte using a fixed lookup table known as the S-box. In many software implementations, the S-box (and its inverse, the InvS-box) is precomputed and stored in memory as a 256-byte array. Performance-oriented implementations often merge SubBytes, ShiftRows, and MixColumns into larger lookup tables (commonly called T-tables), which leak in the same way.

CPU Cache Fundamentals

CPUs utilize multiple levels of cache (L1, L2, L3) to bridge the speed gap between the processor and main memory. When the CPU needs data, it first checks the fastest L1 cache. If not found, it checks L2, then L3, and finally main memory. A cache hit (data found in cache) is significantly faster than a cache miss (data not found, requiring retrieval from slower memory). The time difference between a hit and a miss can be exploited.

The Cache-Timing Link to AES

When an AES implementation performs a SubBytes operation, it accesses the S-box lookup table. If the S-box entry for a particular byte is already in the cache, it’s a fast access (cache hit). If not, it’s a slow access (cache miss). An attacker can observe these timing differences to infer which S-box entries were accessed, and over many encryptions, potentially deduce parts of the secret key.

The key insight is that the S-box entries accessed during the first round of AES encryption are indexed by the plaintext XORed with the secret key, so they affect cache behavior in a predictable way. By observing which cache lines are ‘hit’ or ‘missed’ more often, the attacker can narrow down the possible values of the key bytes.
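To make the index-to-cache-line relationship concrete, here is a minimal sketch. It assumes a 256-byte S-box with one-byte entries, aligned to a 64-byte cache line boundary; real tables may be misaligned or may be larger T-tables. The helper names first_round_index and sbox_cache_line are hypothetical and exist only for illustration.

```c
#include <stdint.h>

#define CACHE_LINE_BYTES 64u /* assumed cache line size */

/* Index used for the first-round S-box lookup of one state byte. */
static inline uint8_t first_round_index(uint8_t plaintext_byte, uint8_t key_byte) {
    return (uint8_t)(plaintext_byte ^ key_byte);
}

/* Which of the 4 cache lines of an aligned 256-byte table holds table[index]. */
static inline unsigned sbox_cache_line(uint8_t index) {
    return index / CACHE_LINE_BYTES;
}
```

Under these assumptions the table spans only four cache lines, so one observation of the touched line reveals the top two bits of the plaintext byte XORed with the key byte, not the full value.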

Attack Methodology: Prime + Probe

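A common technique for cache-timing attacks is ‘Prime + Probe’. This method requires the attacker to share a cache with the victim process. On multi-core systems, including Android devices, L1 caches are usually private to each core, so the shared level is either a unified L2 or last-level cache, or the L1 when attacker and victim are scheduled on the same core.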

1. Setting Up the Android Lab Environment

To conduct this lab, you’ll need:

A rooted Android device (for fine-grained control and performance monitoring).
Android SDK with platform tools (ADB).
Android NDK (Native Development Kit) for compiling native C/C++ code.
A method to disable CPU frequency scaling for consistent timings (e.g., setting CPU governor to ‘performance’ via adb shell).

adb shell "su -c 'echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor'"

Repeat for other CPU cores (cpu1, cpu2, etc.).

2. The Attack Mechanism

Prime Phase

The attacker fills a specific part of the shared cache with their own known data. This is typically done by accessing a large array designed to occupy cache lines that the AES S-boxes might use.
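Which attacker addresses compete with a given table line depends on the cache geometry. The sketch below shows the usual set-index calculation under an assumed geometry of 64-byte lines and 128 sets (for example, a 32 KiB 4-way cache); both constants and the function name cache_set_index are illustrative assumptions, and lower cache levels are typically indexed by physical rather than virtual address.

```c
#include <stdint.h>

#define LINE_BYTES 64u  /* assumed cache line size */
#define NUM_SETS   128u /* assumed: 32 KiB / (64 B * 4 ways) */

/* Set index that an address maps to in a set-associative cache. */
unsigned cache_set_index(uintptr_t addr) {
    return (unsigned)((addr / LINE_BYTES) % NUM_SETS);
}
```

Two addresses compete for the same set when this index matches, which is why a priming buffer is normally walked in cache-line-sized strides.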

Victim Phase

The victim process performs an AES encryption using the secret key. During the SubBytes operation, it accesses entries in the S-box. If those S-box entries map to cache sets occupied by the attacker’s data, the attacker’s data is evicted from the cache and replaced by the S-box data.

Probe Phase

The attacker then re-accesses their own data and measures the time taken. If a significant delay is observed for certain cache lines, it indicates that the victim’s AES operation accessed those same cache lines, causing the attacker’s data to be evicted. This timing difference reveals information about the victim’s S-box access patterns.

3. Practical Implementation Steps (Conceptual)

This attack requires highly precise timing, typically achieved using hardware counters (the rdtsc timestamp counter on x86, or on ARM the PMU cycle counter, which usually has to be enabled for userspace access from kernel level). On Android, clock_gettime(CLOCK_MONOTONIC_RAW, ...) reports time in nanosecond units, but its effective resolution and call overhead are often too coarse to separate a single cache hit from a miss, so hardware counters are more reliable for cache timings.
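Whichever timer is used, the cutoff between a hit and a miss has to be calibrated on the target device. One simple approach, sketched here as an assumption rather than a prescribed method, is to collect timing samples for accesses known to hit and known to miss, then take the midpoint between the two medians. The function names are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

static int cmp_u64(const void *a, const void *b) {
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

/* Median of n samples (n > 0). Sorts a private copy; returns 0 if allocation fails. */
static uint64_t median_u64(const uint64_t *samples, size_t n) {
    uint64_t *copy = malloc(n * sizeof *copy);
    if (copy == NULL) return 0;
    memcpy(copy, samples, n * sizeof *copy);
    qsort(copy, n, sizeof *copy, cmp_u64);
    uint64_t median = copy[n / 2];
    free(copy);
    return median;
}

/* Midpoint between the median hit time and the median miss time.
   Falls back to the hit median if the two distributions do not separate. */
uint64_t calibrate_threshold(const uint64_t *hit_samples, size_t n_hits,
                             const uint64_t *miss_samples, size_t n_misses) {
    uint64_t hit_med = median_u64(hit_samples, n_hits);
    uint64_t miss_med = median_u64(miss_samples, n_misses);
    if (miss_med <= hit_med) return hit_med;
    return hit_med + (miss_med - hit_med) / 2;
}
```

Medians are used instead of means so that occasional outliers, such as an interrupt landing inside a measurement, do not skew the cutoff.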

Victim Code Snippet (Conceptual C)

// Assume AES_encrypt_block uses a global S-box lookup table (e.g. Sbox[byte_val])
// that is defined elsewhere in memory.
void victim_encrypt(unsigned char* plaintext, unsigned char* ciphertext, const unsigned char* key) {
    // ... AES_init_key(key, ...);
    AES_encrypt_block(plaintext, ciphertext);
    // The first-round S-box lookups are the ones the attack targets.
}

Attacker Code Snippet (Conceptual C – Prime + Probe Logic)

#include <stdint.h>
#include <stdio.h>
#include <time.h>
#include <stdlib.h>

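#define CACHE_SIZE (64 * 1024) // Example L1d cache size; check the actual geometry of your device
#define PROBE_ARRAY_SIZE (CACHE_SIZE * 2)
#define THRESHOLD_NS 100 // Placeholder hit/miss cutoff in nanoseconds; must be calibrated per device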

volatile uint8_t probe_array[PROBE_ARRAY_SIZE];

// Function to get high-resolution time (platform-specific)
static inline uint64_t get_time_ns() {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
}

void prime_cache() {
    for (int i = 0; i < PROBE_ARRAY_SIZE; i += 64) { // Iterate by cache line size
        probe_array[i] = 1; // Access to bring into cache
    }
}

void probe_cache() {
    // Record timings first and report afterwards: calling printf inside the
    // measurement loop would itself disturb the cache.
    static uint64_t access_times[PROBE_ARRAY_SIZE / 64];
    for (int i = 0; i < PROBE_ARRAY_SIZE; i += 64) {
        uint64_t start_time = get_time_ns();
        volatile uint8_t temp = probe_array[i]; // Access to measure time
        uint64_t end_time = get_time_ns();
        (void)temp;
        access_times[i / 64] = end_time - start_time;
    }

    for (int line = 0; line < PROBE_ARRAY_SIZE / 64; line++) {
        // A significantly slower access suggests a cache miss, i.e. the victim
        // touched this cache set and evicted the attacker's data.
        if (access_times[line] > THRESHOLD_NS) { // THRESHOLD_NS needs calibration
            printf("Cache line %d evicted (likely victim access). Access time: %llu ns\n",
                   line, (unsigned long long)access_times[line]);
        }
    }
}

int main() {
    // 1. Prime the cache
    prime_cache();

    // 2. Trigger victim encryption (this would be in a separate process or thread)
    //    e.g., inter-process communication to tell victim to encrypt
    //    victim_encrypt(plaintext, ciphertext, key); 

    // 3. Probe the cache
    probe_cache();

    return 0;
}

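This attacker code would be compiled using the Android NDK for your target ARM architecture. Recent NDK releases ship Clang rather than GCC, and Bionic provides clock_gettime in libc, so -lrt is not needed. For example, for 64-bit ARM at API level 21 on a Linux host (adjust the host tag and API level to your setup):

$NDK/toolchains/llvm/prebuilt/linux-x86_64/bin/aarch64-linux-android21-clang -static -o attacker_app attacker.c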

Then push to the device and run:

adb push attacker_app /data/local/tmp/
adb shell /data/local/tmp/attacker_app

4. Data Analysis

The goal is to analyze the pattern of cache misses to infer the first round’s S-box lookups. With a known plaintext, each observed cache line constrains the value of the plaintext byte XORed with the key byte, and therefore the key byte itself. Because a cache line holds many table entries, a single observation only narrows the candidates. Over multiple encryptions with varying plaintexts, an attacker can statistically determine the most likely values for each key byte.

This analysis typically involves:

Collecting a large number of timing traces for encryptions of known plaintexts.
Identifying which S-box cache lines correspond to observed timing anomalies.
Using statistical methods or advanced cryptanalysis techniques to correlate these observations with possible key byte values.
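The correlation step can be illustrated with a noise-free toy model. The sketch below assumes 64-byte cache lines, an aligned 256-byte S-box, and that the attacker already knows which table line each encryption touched for one state byte; the function names are hypothetical. Each key-byte candidate earns a vote whenever it is consistent with an observation.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_SHIFT 6 /* log2(64): assumed 64-byte cache lines, 1-byte table entries */

/* For every candidate key byte, count the observations it is consistent with. */
void score_candidates(const uint8_t *plaintext_bytes, const unsigned *observed_lines,
                      size_t n, unsigned scores[256]) {
    for (unsigned c = 0; c < 256; c++) scores[c] = 0;
    for (size_t i = 0; i < n; i++) {
        for (unsigned c = 0; c < 256; c++) {
            unsigned line = (plaintext_bytes[i] ^ c) >> LINE_SHIFT;
            if (line == observed_lines[i]) scores[c]++;
        }
    }
}

/* First candidate with the highest score. */
unsigned best_candidate(const unsigned scores[256]) {
    unsigned best = 0;
    for (unsigned c = 1; c < 256; c++) {
        if (scores[c] > scores[best]) best = c;
    }
    return best;
}
```

In this model every candidate sharing the key byte’s top two bits ties for the highest score, so first-round observations alone narrow each key byte to 64 values. Real measurements are noisy, which is why votes are accumulated over many traces instead of eliminating candidates outright.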

Mitigations Against Cache-Timing Attacks

Defending against cache-timing attacks is challenging but crucial for secure implementations:

Constant-Time Implementations: The most effective mitigation. Ensure that execution time and memory access patterns do not depend on secret values. This means avoiding secret-dependent branches, secret-dependent memory accesses (like S-box lookups indexed by key material), and secret-dependent loop counts. Techniques include bit-slicing or using tables that are accessed in a way that doesn’t depend on secret values.
Hardware Cryptographic Accelerators: Many modern ARM SoCs (including those in Android devices) provide AES in hardware, either as a dedicated crypto engine or as CPU instructions (the ARMv8 Cryptography Extensions). Because these do not index lookup tables in cacheable memory, they avoid the leak described here, and they are generally designed to run in time that is independent of the data.
Cache Flushing/Partitioning: While less practical for general applications, some highly sensitive environments might attempt to flush caches before and after sensitive operations. Cache partitioning (e.g., using Intel’s CAT or ARM’s MPAM) can isolate cryptographic processes, but this is usually a system-level feature not directly controlled by apps.
Masking/Blinding: Randomizing intermediate values in computations to decorrelate them from the secret key. This makes it harder for an attacker to build a consistent timing profile.
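As an illustration of the constant-time idea, the sketch below reads every entry of a 256-byte table on each lookup and selects the wanted one with a bit mask, so the sequence of memory accesses does not depend on the secret index. The name ct_lookup is hypothetical, and this is a teaching sketch, not a replacement for a vetted constant-time library; a compiler could in principle transform the code, so production code should be checked at the assembly level.

```c
#include <stdint.h>

/* Table lookup whose memory access pattern is independent of secret_index. */
uint8_t ct_lookup(const uint8_t table[256], uint8_t secret_index) {
    uint8_t result = 0;
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t diff = i ^ secret_index;          /* 0 only when i == secret_index */
        uint8_t mask = (uint8_t)((diff - 1) >> 8); /* 0xFF if diff == 0, else 0x00 */
        result |= (uint8_t)(table[i] & mask);
    }
    return result;
}
```

The cost is 256 reads per lookup instead of one, which is the trade-off that makes bit-sliced or hardware-backed AES attractive in practice.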

Conclusion

Cache-timing side-channel attacks represent a powerful threat to cryptographic implementations, even on seemingly secure platforms like Android. By carefully observing micro-architectural effects, an attacker can bypass the mathematical strength of algorithms like AES to recover secret keys. This practical lab provides a conceptual framework for understanding and demonstrating such attacks. It underscores the critical importance of designing and implementing cryptographic routines with side-channel resistance in mind, opting for constant-time code or leveraging hardware-backed solutions wherever possible to protect sensitive user data.
