SemTra: A semi-supervised approach to traffic flow labeling with minimal human effort. (July 2019)
- Record Type:
- Journal Article
- Title:
- SemTra: A semi-supervised approach to traffic flow labeling with minimal human effort. (July 2019)
- Main Title:
- SemTra: A semi-supervised approach to traffic flow labeling with minimal human effort
- Authors:
- Fahad, Adil
Almalawi, Abdulmohsen
Tari, Zahir
Alharthi, Kurayman
Al Qahtani, Fawaz S.
Cheriet, Mohamed
- Abstract:
- Highlights: Present a multi-view approach that creates multiple representations for traffic data. Incorporate different distance metrics in clustering to reveal the underlying structure. Explore the concept of evidence accumulation clustering. Propose a mapping process based on the internal structure and the probability function. Propose local and global self-training approaches for predicting the actual class label.
Abstract: Network traffic classification is an essential component for service differentiation, network design and management, and security systems. The limitations of traditional port-based and payload-based methods have been addressed by recent promising studies which rely on the analysis of the statistics of traffic flows and the use of machine learning techniques. However, due to the high cost of manual labeling, it is hard to obtain sufficient, reliable and up-to-date labeled data for effective IP traffic classification. This paper proposes a novel semi-supervised approach, called SemTra, which automatically alleviates the shortage of labeled flows for machine learning by exploiting the advantages of both supervised and unsupervised models. In particular, SemTra involves the following: (i) generating multi-view representations of the original data based on dimensionality reduction methods to have strong discrimination ability; (ii) incorporating the generated representations into the ensemble clustering model to provide a combined clustering output with better quality and stability; (iii) adapting the concept of self-training to iteratively utilize the few labeled data along with the unlabeled data within local and global viewpoints; and (iv) obtaining the final class decision by combining the decisions of the cluster mapping strategy, the local self-training approach and the global self-training approach. Extensive experiments were carried out to compare the effectiveness of SemTra with representative semi-supervised methods using sixteen network traffic datasets. …
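Note: the evidence accumulation idea named in the highlights can be illustrated with a minimal sketch. It is not taken from the article: the function names, the 0.5 threshold and the toy partitions are all assumptions chosen for illustration, and the consensus step (linking items whose co-association exceeds the threshold) is only one simple way to cut a co-association matrix.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_clusters(coassoc, threshold=0.5):
    """Group items connected by co-association above the threshold.

    Equivalent to cutting a single-link grouping at `threshold`
    (an assumed, illustrative choice).
    """
    n = len(coassoc)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base clusterings of six flows, e.g. one per data view.
# Cluster ids need not match across partitions; only co-membership matters.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],  # this view disagrees about the third flow
    [2, 2, 2, 5, 5, 5],
]
coassoc = co_association(partitions)
consensus = consensus_clusters(coassoc)
print(consensus)
```

In this toy case the dissenting second partition is outvoted, so the combined output keeps the first three flows together and the last three together.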
- Is Part Of:
- Pattern recognition. Volume 91(2019:Jul.)
- Journal:
- Pattern recognition
- Issue:
- Volume 91(2019:Jul.)
- Issue Display:
- Volume 91 (2019)
- Year:
- 2019
- Volume:
- 91
- Issue Sort Value:
- 2019-0091-0000-0000
- Page Start:
- 1
- Page End:
- 12
- Publication Date:
- 2019-07
- Subjects:
- Internet traffic classification -- Semi-supervised learning -- Multiview
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2019.02.001
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 9721.xml