A network-based feature extraction model for imbalanced text data. (1st June 2022)
- Record Type:
- Journal Article
- Title:
- A network-based feature extraction model for imbalanced text data. (1st June 2022)
- Main Title:
- A network-based feature extraction model for imbalanced text data
- Authors:
- Li, Keping
Yan, Dongyang
Liu, Yanyan
Zhu, Qiaozhen
- Abstract:
- Highlights: A network-based model is proposed for processing imbalanced text data. We design a Polar Layer to combine random walk paths with a CNN. We introduce an electing strategy to improve the performance of the proposed model. The proposed model effectively addresses the problem of imbalanced text data.
Abstract: The explosive growth of text data has attracted many researchers to explore efficient methods for extracting valuable hidden information. Many technologies, especially deep learning methods, have achieved great success in text analysis. However, the most powerful methods typically require a considerable quantity of training data and may perform poorly when the data are imbalanced. In this paper, we propose a network-based Convolutional Neural Network (NCNN) to mitigate the effect of imbalanced data. The proposed model first generates new synthetic samples for the imbalanced data based on random walks on the network. Then an extra layer, called the Polar Layer, is introduced to connect the output of the network model of the text to a classical CNN. Two electing strategies (n-NCNN and x-NCNN) are proposed to further improve the performance of NCNN. In the experimental section, the proposed model is applied to the Reuters-21578 and WebKB datasets. A comparison with six approaches demonstrates the effectiveness of the proposed NCNN model on imbalanced text data.
- Is Part Of:
- Expert systems with applications. Volume 195 (2022)
- Journal:
- Expert systems with applications
- Issue:
- Volume 195 (2022)
- Issue Display:
- Volume 195, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 195
- Issue:
- 2022
- Issue Sort Value:
- 2022-0195-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-06-01
- Subjects:
- Complex Network -- CNN -- Text Analysis -- Imbalanced Data -- Random Walk
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2022.116600
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21000.xml