Impact of benign sample size on binary classification accuracy. (January 2023)

Record Type:: Journal Article
Title:: Impact of benign sample size on binary classification accuracy. (January 2023)
Main Title:: Impact of benign sample size on binary classification accuracy
Authors:: Mimura, Mamoru
Abstract:: Recently, there has been a significant increase in malware attacks and malicious traffic. Consequently, several machine learning-based detection models have been developed to detect them. However, the detection accuracy of these models is currently evaluated using different methodologies and datasets, with some studies overstating high detection rates. The lack of a common testing approach coupled with the limited datasets used for the experiments make it challenging to compare the performances of these models to identify those that provide superior detection accuracy. A few studies have focused on benign samples and their effects on detection accuracy. The datasets used in the experiments generally consist of benign and malicious samples; hence, binary classification is used in the machine learning models. In the binary classification task, the size of a benign sample affects the classification accuracy of malicious samples, that is, it can either improve or degrade detection accuracy. In this study, we propose a novel metric for evaluating accuracy degradation by increasing benign sample size. We mainly used the FFRI dataset, which consists of 11,243 malware samples and 250,000 benign samples, and evaluated the classification accuracy with extracted strings from the malware. In addition, we obtained other malware samples that we used as supplementary to the main dataset. We increased the number of benign samples for testing by tenfold, while maintaining the …
Is Part Of:: Expert systems with applications. Volume 211(2023)
Journal:: Expert systems with applications
Issue:: Volume 211(2023)
Issue Display:: Volume 211, Issue 2023 (2023)
Year:: 2023
Volume:: 211
Issue:: 2023
Issue Sort Value:: 2023-0211-2023-0000
Page Start:
Page End:
Publication Date:: 2023-01
Subjects:: Malware -- Machine learning -- Binary classification -- Benign sample -- Random forest -- Support vector machine -- XGBoost
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
Journal URLs:: http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
DOI:: 10.1016/j.eswa.2022.118630
Languages:: English
ISSNs:: 0957-4174
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 24122.xml