A negative category based approach for Wikipedia document classification. (10th April 2010)
- Record Type:
- Journal Article
- Title:
- A negative category based approach for Wikipedia document classification. (10th April 2010)
- Main Title:
- A negative category based approach for Wikipedia document classification
- Authors:
- Murugeshan, Meenakshi Sundaram
Lakshmi, K.
Mukherjee, Saswati
- Abstract:
- Profile based methods have been successfully used for the classification of unstructured texts. This paper presents a profile based method for Wikipedia XML document classification. We have used profiles built using negative category information. Our approach exploits the structure of Wikipedia documents to build profiles. Two class profiles are built; one based on the whole content and the other based on the initial description of the Wikipedia documents. In addition, we have also explored the option of using the terms in the section and subsection titles. The effectiveness of cosine and fractional similarity measures in classifying XML documents is compared. Combining the two profile based classifiers is experimentally shown to work better than either classifier individually.
- Is Part Of:
- International journal of knowledge engineering and data mining. Volume 1:Number 1(2010)
- Journal:
- International journal of knowledge engineering and data mining
- Issue:
- Volume 1:Number 1(2010)
- Issue Display:
- Volume 1, Issue 1 (2010)
- Year:
- 2010
- Volume:
- 1
- Issue:
- 1
- Issue Sort Value:
- 2010-0001-0001-0000
- Page Start:
- 84
- Page End:
- 97
- Publication Date:
- 2010-04-10
- Subjects:
- XML classification -- profile creation -- negative categories -- similarity measures -- feature selection -- initial descriptions -- Wikipedia documents -- document classification -- unstructured text -- cosine -- fractional similarity
Knowledge representation (Information theory) -- Periodicals
Data mining -- Periodicals
006.305
- Journal URLs:
- http://www.inderscience.com/browse/index.php?journalCODE=ijkedm
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1755-2087
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 8728.xml