A comparative study of support vector machine and neural networks for file type identification using n-gram analysis. (April 2021)
- Record Type:
- Journal Article
- Title:
- A comparative study of support vector machine and neural networks for file type identification using n-gram analysis. (April 2021)
- Main Title:
- A comparative study of support vector machine and neural networks for file type identification using n-gram analysis
- Authors:
- Sester, Joachim
Hayes, Darren
Scanlon, Mark
Le-Khac, Nhien-An
- Abstract:
- File type identification (FTI) has become a major discipline for anti-virus developers, firewall designers and for forensic cybercrime investigators. Over the past few years, research has seen the introduction of several classifiers and features. One of these advances is the so-called n-grams analysis, which is an interpretation of statistical counting in classified fragments. Recently, n-grams based approaches were already successfully combined with computational intelligence classifiers. However, the academic body of literature is scant when it comes to a comprehensive explanation of machine learning based approaches such as neural networks (NN) or support vector machines (SVM). For example, how the input parameters, including learning rate, different values of n for n-grams, etc. influence the results. In addition, very few studies have compared the scalability of NN vs. SVM approaches. Therefore, a systematic research in comparing different approaches is needed to address these questions. Hence, this paper investigates this type of comparison, by focusing on the n-gram analysis as a feature for the two different classifiers: SVMs and NNs. This paper details our experiments with two NNs and four SVMs, using linear kernels and RBF kernels on RealDC datasets. In general, we found that SVM-based approaches performed better than the NN, but their scalability is still a challenge.
- Is Part Of:
- Forensic science international: digital investigation. Volume 36(2021)Supplement
- Journal:
- Forensic science international: digital investigation
- Issue:
- Volume 36(2021)Supplement
- Issue Display:
- Volume 36, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 36
- Issue:
- 2021
- Issue Sort Value:
- 2021-0036-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-04
- Subjects:
- File type identification -- n-grams analysis -- Forensic analysis -- Neural networks -- Support vector machine
- Journal URLs:
- http://www.sciencedirect.com/
- DOI:
- 10.1016/j.fsidi.2021.301121
- Languages:
- English
- ISSNs:
- 2666-2817
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 19470.xml