Malware Detection with CNNs on Entropy and Greyscale Images

Harry Darton, Sheffield Hallam University, School of Computing and Digital Technologies, https://orcid.org/0000-0001-6511-7646

Keywords: malware detection, convolutional neural networks, entropy images, greyscale images, static analysis

Abstract

This study investigates whether convolutional neural networks (CNNs) trained on visual representations of Portable Executable (PE) files can rival traditional machine learning classifiers trained on engineered features. A dataset of over 200,000 PE files [1] was used to derive two feature sets (Basic and Ember-Lite) [2] and to generate 256×256 greyscale and entropy images [3], [4]. Three CNNs (SimpleCNN, ResNet-18 [5], EfficientNet-B0 [6]) were trained and evaluated against five baselines (Random Forest, XGBoost [7], CatBoost [8], LightGBM, Logistic Regression). Tree-based models trained on the enriched feature set achieved the highest scores, with CatBoost reaching a ROC-AUC of 0.990. The best CNN, EfficientNet-B0 on entropy images, obtained a ROC-AUC of 0.954. Although the CNNs did not surpass the feature-based models, they were competitive when feature engineering was constrained. These findings indicate that visual approaches offer a promising alternative for static malware detection, particularly when combined with entropy-based representations.
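The abstract does not spell out how the 256×256 images were produced, so the following is only a minimal sketch of one common way to build such representations, not the study's actual pipeline. It assumes that the greyscale image maps one byte to one pixel (truncating long files and zero-padding short ones) and that the entropy image splits the file into 256×256 contiguous chunks and scales each chunk's Shannon entropy (0 to 8 bits) to the range 0 to 255. The function names `bytes_to_greyscale`, `shannon_entropy`, and `bytes_to_entropy_image` are hypothetical.

```python
import numpy as np


def bytes_to_greyscale(data: bytes, side: int = 256) -> np.ndarray:
    """Map raw bytes to a side x side uint8 image, one byte per pixel.

    Assumption: files longer than side*side bytes are truncated and
    shorter files are zero-padded; resizing would be another option.
    """
    n = side * side
    buf = np.frombuffer(data[:n], dtype=np.uint8)
    img = np.zeros(n, dtype=np.uint8)
    img[: buf.size] = buf
    return img.reshape(side, side)


def shannon_entropy(chunk: np.ndarray) -> float:
    """Shannon entropy in bits of a uint8 array (0.0 for an empty chunk)."""
    if chunk.size == 0:
        return 0.0
    counts = np.bincount(chunk, minlength=256)
    p = counts[counts > 0] / chunk.size
    return float(-(p * np.log2(p)).sum())


def bytes_to_entropy_image(data: bytes, side: int = 256) -> np.ndarray:
    """Build a side x side image whose pixels encode local byte entropy.

    Assumption: the file is split into side*side contiguous chunks of
    near-equal length, and entropy in [0, 8] bits is scaled to [0, 255].
    """
    arr = np.frombuffer(data, dtype=np.uint8)
    chunks = np.array_split(arr, side * side)
    ent = np.array([shannon_entropy(c) for c in chunks])
    return np.round(ent / 8.0 * 255).astype(np.uint8).reshape(side, side)
```

Under these assumptions, regions of packed or encrypted content would appear bright in the entropy image and padding or sparse regions dark, which is the kind of structure a CNN could pick up without hand-engineered features.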

Accepted

2025-11-26

Darton, H. (2025). Malware Detection with CNNs on Entropy and Greyscale Images. In Latin-American Journal of Computing (Vol. 13, Issue 1). Escuela Politécnica Nacional. https://doi.org/10.5281/zenodo.17941249

Issue

Next Issue

Section

Research Articles for the Next Issue (Early Access)