Prediction of Chinese word-formation patterns using the layer-weighted semantic graph-based KFP-MCO classifier. (September 2016)
- Record Type:
- Journal Article
- Title:
- Prediction of Chinese word-formation patterns using the layer-weighted semantic graph-based KFP-MCO classifier. (September 2016)
- Main Title:
- Prediction of Chinese word-formation patterns using the layer-weighted semantic graph-based KFP-MCO classifier
- Authors:
- Gao, Guangxia
Zhang, Zhiwang
- Abstract:
- Highlights: A novel layered semantic graph of Chinese semantic word-formation is proposed. The layer-weighted graph edit distance and the similarity kernel are defined. A new algorithm based on KFP-MCOC is used to predict word-formation patterns. Comparison of predictive performance is conducted between KFP-MCOC and SVM. Statistical test showed that the accuracy of the proposed approach is significant.
Abstract: Natural language processing plays a critical role in intelligent computing, pattern recognition, semantic analysis and machine intelligence. For Chinese information processing, constructing predictive models of different semantic word-formation patterns from a large-scale corpus can significantly improve the efficiency and accuracy of paraphrasing unregistered or new words, ambiguity elimination, automatic lexicography, machine translation and other applications. It is therefore necessary to find the relationship between word-formation patterns and different influential factors, which can be framed as a classification problem. However, due to noise, anomalies, imprecision, polysemy, ambiguity, nonlinear structure, and class imbalance in semantic word-formation data, the multi-criteria optimization classifier (MCOC), support vector machines (SVM) and other traditional classification approaches give poor predictive performance. In this paper, based on a characteristic analysis of Chinese word-formations, we first propose a novel layered semantic graph of each disyllabic word, the layer-weighted graph edit distance (GED) and its similarity kernel embedded into a new vector space; then, on the normalized data, MCOC with kernel, fuzzification and penalty factors (KFP-MCOC) and SVM are employed to predict Chinese semantic word-formation patterns. Our experimental results and the comparison with SVM show that KFP-MCOC based on the layer-weighted semantic graphs can increase the separation of different patterns, the predictive accuracy of target patterns and the generalization of semantic pattern classification on new compound words.
- Is Part Of:
- Computer speech & language. Volume 39(2016)
- Journal:
- Computer speech & language
- Issue:
- Volume 39(2016)
- Issue Display:
- Volume 39, Issue 2016 (2016)
- Year:
- 2016
- Volume:
- 39
- Issue:
- 2016
- Issue Sort Value:
- 2016-0039-2016-0000
- Page Start:
- 29
- Page End:
- 46
- Publication Date:
- 2016-09
- Subjects:
- Semantic -- Word-formation -- Graph edit distance -- Multi-criteria optimization -- Pattern classification
Speech processing systems -- Periodicals
Automatic speech recognition -- Periodicals
Computers -- Periodicals
Linguistics -- Periodicals
Speech-Language Pathology -- Periodicals
Traitement automatique de la parole -- Périodiques
Reconnaissance automatique de la parole -- Périodiques
Automatic speech recognition
Speech processing systems
Electronic journals
Periodicals
006.454
- Journal URLs:
- http://www.journals.elsevier.com/computer-speech-and-language/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.csl.2016.01.005
- Languages:
- English
- ISSNs:
- 0885-2308
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.276600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 2467.xml