DISCOVERING ROBUST EMBEDDINGS IN (DIS)SIMILARITY SPACE FOR HIGH‐DIMENSIONAL LINGUISTIC FEATURES. (6th August 2012)
- Record Type:
- Journal Article
- Title:
- DISCOVERING ROBUST EMBEDDINGS IN (DIS)SIMILARITY SPACE FOR HIGH‐DIMENSIONAL LINGUISTIC FEATURES. (6th August 2012)
- Main Title:
- DISCOVERING ROBUST EMBEDDINGS IN (DIS)SIMILARITY SPACE FOR HIGH‐DIMENSIONAL LINGUISTIC FEATURES
- Authors:
- Mu, Tingting
Miwa, Makoto
Tsujii, Junichi
Ananiadou, Sophia
- Abstract:
- Recent research has shown the effectiveness of rich feature representation for tasks in natural language processing (NLP). However, an exceedingly large number of features does not always improve classification performance. Such features may contain redundant information, lead to noisy feature presentations, and also render the learning algorithms intractable. In this paper, we propose a supervised embedding framework that modifies the relative positions between instances to increase the compatibility between the input features and the output labels, while preserving the local distribution of the original data in the embedded space. The proposed framework attempts to support a flexible balance between the preservation of intrinsic geometry and the enhancement of class separability for both interclass and intraclass instances. It takes into account characteristics of linguistic features by using an inner product‐based optimization template. (Dis)similarity features, also known as empirical kernel mapping, are employed to enable computationally tractable processing of extremely high‐dimensional input, and also to handle nonlinearities in embedding generation when necessary. Evaluated on two NLP tasks with six data sets, the proposed framework provides better classification performance than the support vector machine without using any dimensionality reduction technique. It also generates embeddings with better class discriminability as compared to many existing embedding algorithms.
- Is Part Of:
- Computational intelligence. Volume 30:Number 2(2014:May)
- Journal:
- Computational intelligence
- Issue:
- Volume 30:Number 2(2014:May)
- Issue Display:
- Volume 30, Issue 2 (2014)
- Year:
- 2014
- Volume:
- 30
- Issue:
- 2
- Issue Sort Value:
- 2014-0030-0002-0000
- Page Start:
- 285
- Page End:
- 315
- Publication Date:
- 2012-08-06
- Subjects:
- Artificial intelligence -- Periodicals
Computational linguistics -- Periodicals
006.3
- Journal URLs:
- http://www.blackwellpublishing.com/journal.asp?ref=0824-7935&site=1
http://onlinelibrary.wiley.com/
- DOI:
- 10.1111/j.1467-8640.2012.00452.x
- Languages:
- English
- ISSNs:
- 0824-7935
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.595000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 2963.xml