
Measuring Malware Obfuscation: Evaluating CNN-Based Detection for Real-World Resilience

Published: 19 Nov, 2025
Created by:
Michael Reglein

Static malware detection offers the speed and scalability necessary to process millions of files daily. However, it remains vulnerable when confronted with obfuscation, a deliberate modification of code designed to evade detection and analysis.

This study examined how layered obfuscation affects image-based convolutional neural network (CNN) detectors and introduced a novel, reproducible framework for measuring obfuscation itself. Using greyscale byte-to-image representations of Windows portable executables, CNN detectors were tested against progressively more complex obfuscations, ranging from minor structural edits to multilayer overlays. The study highlights that effective detection requires not only capable models but also thoughtful sample exposure, calibration, and validation that reflect the realities of modern obfuscated malware.
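As a rough illustration of what a greyscale byte-to-image representation can look like, the sketch below maps each byte of a file to one pixel intensity (0 to 255) and wraps the byte stream into fixed-width rows. It is only a minimal sketch under stated assumptions: the function name `bytes_to_image_rows`, the row width, and the zero-padding of the final row are hypothetical choices made here, not the preprocessing used in the paper.

```python
from typing import List


def bytes_to_image_rows(data: bytes, width: int = 256) -> List[List[int]]:
    """Map each byte (0-255) to one greyscale pixel, filling rows left to right.

    The row width and the zero-padding of the last row are assumptions of
    this sketch, not details taken from the study.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    rows = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        # Pad the final, shorter row with zeros so every row has equal length.
        rows.append(list(chunk) + [0] * (width - len(chunk)))
    return rows


# Hypothetical five-byte input; real inputs would be whole executable files.
sample = bytes([0x4D, 0x5A, 0x90, 0x00, 0x03])
image = bytes_to_image_rows(sample, width=4)
print(image)  # [[77, 90, 144, 0], [3, 0, 0, 0]]
```

In practice the resulting rows would typically be resized or cropped to a fixed shape before being passed to a CNN; how that is done is one of the preprocessing choices the study reports as affecting performance.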

Results demonstrated that exposure to obfuscated malware, rather than CNN complexity, governed model robustness. Obfuscation-naïve models, trained only on reference binaries, performed well on reference binaries but degraded sharply when tested against obfuscated samples. Conversely, models trained with limited exposure to obfuscated variants maintained better detection rates and resilience. Preprocessing choices for the byte-to-image representation also influenced performance: the selected preprocessing preserved key obfuscation signals while minimizing computational cost.

To verify that the preprocessing maintained meaningful variation between difficulty tiers, a Jensen-Shannon challenge score was introduced to measure divergence between reference and obfuscated binaries. This challenge score confirmed that higher obfuscation levels produced distinct statistical signatures, while also explaining why intermediate tiers sometimes converged. Together, these findings offer instructive insights for researchers and practitioners seeking to enhance their understanding of the resilience of static AI-based malware detectors.
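To make the idea of a Jensen-Shannon-based score concrete, the following sketch computes the Jensen-Shannon divergence between the byte-value histograms of two binaries. This is an illustration only: the summary above does not specify how the challenge score is defined, so comparing raw 256-bin byte distributions, and the helper names `byte_distribution`, `kl_divergence`, and `js_divergence`, are assumptions made here.

```python
import math
from collections import Counter
from typing import List


def byte_distribution(data: bytes) -> List[float]:
    """Return the normalized 256-bin histogram of byte values in data."""
    if not data:
        raise ValueError("data must not be empty")
    counts = Counter(data)
    return [counts.get(value, 0) / len(data) for value in range(256)]


def kl_divergence(p: List[float], q: List[float]) -> float:
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p: List[float], q: List[float]) -> float:
    """Jensen-Shannon divergence in bits, bounded between 0 and 1."""
    # The mixture m is positive wherever p or q is, so the KL terms are finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


# Hypothetical byte strings standing in for a reference and an obfuscated binary.
reference = byte_distribution(b"\x00\x00\x01\x01")
obfuscated = byte_distribution(b"\x00\x01\x02\x03")
score = js_divergence(reference, obfuscated)
print(round(score, 4))
```

With the base-2 logarithm, identical distributions score 0 and distributions with no byte values in common score 1, which gives a bounded scale on which obfuscation tiers could be compared.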
