Automated subject indexing using word embeddings and controlled vocabularies: a comparative study. (28th September 2022)
- Record Type:
- Journal Article
- Title:
- Automated subject indexing using word embeddings and controlled vocabularies: a comparative study. (28th September 2022)
- Main Title:
- Automated subject indexing using word embeddings and controlled vocabularies: a comparative study
- Authors:
- Sfakakis, Michalis
Papachristopoulos, Leonidas
Zoutsou, Kyriaki
Papatheodorou, Christos
Tsakonas, Giannis
- Abstract:
- Text mining methods contribute significantly to the understanding and the management of digital content, increasing the potential of entry links. This paper introduces a method for subject analysis that combines topic modelling with automated labelling of the generated topics, exploiting terms from existing knowledge organisation systems. A testbed was developed in which the Latent Dirichlet Allocation (LDA) algorithm was deployed for modelling the topics of a corpus of papers related to the Digital Library Evaluation domain. The generated topics were represented in the form of bags-of-words and word embeddings and were utilised for retrieving terms from the EuroVoc Thesaurus and the Computer Science Ontology (CSO). The results of this study show that the domain of digital libraries (DL) can be described with different vocabularies, but during the process of automatic labelling the context needs to be taken into account.
- Is Part Of:
- International journal of metadata, semantics and ontologies. Volume 15:Number 4(2022)
- Journal:
- International journal of metadata, semantics and ontologies
- Issue:
- Volume 15:Number 4(2022)
- Issue Display:
- Volume 15, Issue 4 (2022)
- Year:
- 2022
- Volume:
- 15
- Issue:
- 4
- Issue Sort Value:
- 2022-0015-0004-0000
- Page Start:
- 233
- Page End:
- 243
- Publication Date:
- 2022-09-28
- Subjects:
- subject indexing -- similarity measures -- text classification -- machine learning -- word embedding
Metadata -- Periodicals
Semantic Web -- Periodicals
Ontologies (Information retrieval) -- Periodicals
Data structures (Computer science) -- Periodicals
Information theory -- Periodicals
005.74
- Journal URLs:
- http://www.inderscience.com/browse/index.php?journalID=152
http://www.inderscience.com/
- Languages:
- English
- ISSNs:
- 1744-2621
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 23407.xml
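
The labelling step described in the abstract (representing a topic by word embeddings and retrieving the closest controlled-vocabulary terms) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the centroid-plus-cosine-similarity ranking, and the three-dimensional toy vectors are all assumptions made for illustration; a real setup would use topic words produced by LDA and pretrained embeddings, and would need some way of embedding multi-word thesaurus terms.

```python
import math


def centroid(vectors):
    """Average a list of equal-length vectors component-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def label_topic(topic_words, embeddings, vocabulary_terms, top_n=2):
    """Rank vocabulary terms by similarity to the centroid of a topic's words.

    Hypothetical helper: words or terms missing from `embeddings` are skipped.
    """
    known = [embeddings[w] for w in topic_words if w in embeddings]
    if not known:
        return []
    topic_vector = centroid(known)
    scored = [
        (term, cosine(topic_vector, embeddings[term]))
        for term in vocabulary_terms
        if term in embeddings
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Toy, hand-made vectors (assumption): stand-ins for real word embeddings.
TOY_EMBEDDINGS = {
    "usability": [0.9, 0.1, 0.0],
    "interface": [0.8, 0.2, 0.1],
    "user": [0.7, 0.3, 0.0],
    "human-computer interaction": [0.85, 0.15, 0.05],
    "information retrieval": [0.1, 0.9, 0.2],
    "agricultural policy": [0.0, 0.1, 0.9],
}

if __name__ == "__main__":
    topic = ["usability", "interface", "user"]  # e.g. top words of one topic
    candidates = [
        "human-computer interaction",
        "information retrieval",
        "agricultural policy",
    ]
    for term, score in label_topic(topic, TOY_EMBEDDINGS, candidates):
        print(f"{term}: {score:.3f}")
```

Averaging the topic's word vectors is only one possible choice; weighting words by their topic probability would be an equally plausible variant.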