Research on Word Vector Training Method Based on Improved Skip-Gram Algorithm. (27th February 2022)
- Record Type:
- Journal Article
- Title:
- Research on Word Vector Training Method Based on Improved Skip-Gram Algorithm. (27th February 2022)
- Main Title:
- Research on Word Vector Training Method Based on Improved Skip-Gram Algorithm
- Authors:
- Tang, Yachun
- Other Names:
- Li Qiangyi Academic Editor.
- Abstract:
- An effective word vector training method yields semantically rich word vectors and better results on the same downstream task. To address the shortcomings of the traditional skip-gram model in encoding and modeling context words, this study proposes an improved word vector training method based on the skip-gram algorithm. Building on an analysis of the existing skip-gram model, it introduces the distributional hypothesis: the distribution of each word over its contexts is taken as the representation of the word, the word is placed in a semantic space, and the word is then modeled with the help of word smoothing and that semantic space. During training, stochastic gradient descent is used to solve for the vector representation of each word and each Chinese character. In the experiments, the proposed training method is compared with skip-gram, CWE + P, and SEING on a word sense similarity task and a text classification task. The experimental results showed that the proposed method had significant advantages in the Chinese word segmentation task, with a performance gain rate of about 30%. The proposed method provides a reference for the in-depth study of word vectors and text mining.
- Is Part Of:
- Advances in multimedia. Volume 2022(2022)
- Journal:
- Advances in multimedia
- Issue:
- Volume 2022(2022)
- Issue Display:
- Volume 2022, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 2022
- Issue:
- 2022
- Issue Sort Value:
- 2022-2022-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-02-27
- Subjects:
- Multimedia systems -- Periodicals
Computer networks -- Periodicals
Multimédia
Réseaux d'ordinateurs
Computer networks
Multimedia systems
Periodicals
006.7
- Journal URLs:
- https://www.hindawi.com/journals/am/
http://bibpurl.oclc.org/web/22854
- DOI:
- 10.1155/2022/4414207
- Languages:
- English
- ISSNs:
- 1687-5680
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library HMNTS - ELD Digital store
- Ingest File:
- 21138.xml