EXPLOITING SYNTACTIC, SEMANTIC, AND LEXICAL REGULARITIES IN LANGUAGE MODELING VIA DIRECTED MARKOV RANDOM FIELDS. (10th July 2012)
- Record Type:
- Journal Article
- Title:
- EXPLOITING SYNTACTIC, SEMANTIC, AND LEXICAL REGULARITIES IN LANGUAGE MODELING VIA DIRECTED MARKOV RANDOM FIELDS. (10th July 2012)
- Main Title:
- EXPLOITING SYNTACTIC, SEMANTIC, AND LEXICAL REGULARITIES IN LANGUAGE MODELING VIA DIRECTED MARKOV RANDOM FIELDS
- Authors:
- Wang, Shaojun
Wang, Shaomin
Cheng, Li
Greiner, Russell
Schuurmans, Dale
- Abstract:
- We present a directed Markov random field (MRF) model that combines n‐gram models, probabilistic context‐free grammars (PCFGs), and probabilistic latent semantic analysis (PLSA) for the purpose of statistical language modeling. Even though the composite directed MRF model potentially has an exponential number of loops and becomes a context‐sensitive grammar, we are nevertheless able to estimate its parameters in cubic time using an efficient modified Expectation‐Maximization (EM) method, the generalized inside–outside algorithm, which extends the inside–outside algorithm to incorporate the effects of the n‐gram and PLSA language models. We generalize various smoothing techniques to alleviate the sparseness of n‐gram counts in cases where there are hidden variables. We also derive an analogous algorithm to find the most likely parse of a sentence and to calculate the probability of an initial subsequence of a sentence, all generated by the composite language model. Our experimental results on the Wall Street Journal corpus show that we obtain significant reductions in perplexity compared to the state‐of‐the‐art baseline trigram model with Good–Turing and Kneser–Ney smoothing techniques.
- Is Part Of:
- Computational intelligence. Volume 29:Number 4(2013:Nov.)
- Journal:
- Computational intelligence
- Issue:
- Volume 29:Number 4(2013:Nov.)
- Issue Display:
- Volume 29, Issue 4 (2013)
- Year:
- 2013
- Volume:
- 29
- Issue:
- 4
- Issue Sort Value:
- 2013-0029-0004-0000
- Page Start:
- 649
- Page End:
- 679
- Publication Date:
- 2012-07-10
- Subjects:
- language modeling -- lexical information -- syntactic structure -- semantic content -- directed Markov random field -- generalized inside–outside algorithm
Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/j.1467-8640.2012.00436.x
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 223.xml