Robust malware clustering of windows portable executables using ensemble latent representation and distribution modeling. (23rd January 2023)
- Record Type:
- Journal Article
- Title:
- Robust malware clustering of windows portable executables using ensemble latent representation and distribution modeling. (23rd January 2023)
- Main Title:
- Robust malware clustering of windows portable executables using ensemble latent representation and distribution modeling
- Authors:
- Rizvi, Syed Khurram Jah
Fraz, Muhammad Moazam
- Abstract:
- Summary: Malware is a malicious program used for unauthorized access to organizational infrastructure and systems. To overcome challenges of exponential growth of malware, notable research has been made for unsupervised clustering of Windows‐based portable executable (PE). Nevertheless, to the best of our knowledge there has been no research for robust cluster prediction of Windows‐based PEs using static features. To this end, we proposed an ensemble neural network architecture for unsupervised feature learning and its distribution modeling for robust clustering of PE(s). The novel architecture is a cascaded formation of a deep autoencoder (AE) network and latent distribution modeling (LDM) network. The AE performs feature learning using latent representation and LDM performs the distribution modeling of latent representation using Gaussian approximation. An objective function is also devised for model optimization. The network adjusts the Gaussian components to optimize the distribution modeling. It also performs adjustments for data representations toward related Gaussian centers to make the model behave in adaptive manner. A novel malware dataset has also been collected by employing endpoint security management solution over enterprise network to assess proposed architecture. The dataset contains 21,486 samples including 14,497 malicious and 6989 benign ones. We also performed the evaluation of proposed architecture over publicly available benchmark malware dataset including 138,047 samples comprising 96,742 malicious and 41,323 benign PEs. The experimental results demonstrated that the proposed architecture yielded more than 95% accuracy for cluster prediction. The novel architecture has achieved superior performance and outperformed progressive techniques. The dataset along with implementation are accessible at bit.ly/3J6ZF8S.
- Is Part Of:
- Concurrency and computation. Volume 35:Number 8(2023)
- Journal:
- Concurrency and computation
- Issue:
- Volume 35:Number 8(2023)
- Issue Display:
- Volume 35, Issue 8 (2023)
- Year:
- 2023
- Volume:
- 35
- Issue:
- 8
- Issue Sort Value:
- 2023-0035-0008-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2023-01-23
- Subjects:
- autoencoder -- clustering of portable executable -- distribution modeling -- ensemble neural network -- static analysis
Parallel processing (Electronic computers) -- Periodicals
Parallel computers -- Periodicals
004.35
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cpe.7621
- Languages:
- English
- ISSNs:
- 1532-0626
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3405.622000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 26318.xml