A semi-automatic indexing system based on embedded information in HTML documents. Issue 2 (15th June 2015)
- Record Type:
- Journal Article
- Title:
- A semi-automatic indexing system based on embedded information in HTML documents. Issue 2 (15th June 2015)
- Main Title:
- A semi-automatic indexing system based on embedded information in HTML documents
- Authors:
- Vállez, Mari
Pedraza-Jiménez, Rafael
Codina, Lluís
Blanco, Saúl
Rovira, Cristòfol
- Abstract:
- Purpose: The purpose of this paper is to describe and evaluate the tool DigiDoc MetaEdit, which allows the semi-automatic indexing of HTML documents. The tool works by identifying and suggesting keywords from a thesaurus according to the embedded information in HTML documents. This enables the parameterization of keyword assignment based on how frequently the terms appear in the document, the relevance of their position, and the combination of both. Design/methodology/approach: In order to evaluate the efficiency of the indexing tool, the descriptors/keywords suggested by the tool are compared to the keywords which have been assigned manually by human experts. To make this comparison, a corpus of HTML documents is randomly selected from a journal devoted to Library and Information Science. Findings: The results of the evaluation show that: first, there is close to a 50 per cent match or overlap between the two indexing systems; however, if the related terms and the narrow terms are taken into consideration, the matches can reach 73 per cent; and second, the first terms identified by the tool are the most relevant. Originality/value: The tool presented identifies the most important keywords in an HTML document based on the embedded information in HTML documents. Nowadays, representing the contents of documents with keywords is an essential practice in areas such as information retrieval and e-commerce.
- Is Part Of:
- Library hi tech. Volume 33:Issue 2(2015)
- Journal:
- Library hi tech
- Issue:
- Volume 33:Issue 2(2015)
- Issue Display:
- Volume 33, Issue 2 (2015)
- Year:
- 2015
- Volume:
- 33
- Issue:
- 2
- Issue Sort Value:
- 2015-0033-0002-0000
- Page Start:
- 195
- Page End:
- 210
- Publication Date:
- 2015-06-15
- Subjects:
- Digital documents -- Information retrieval -- Indexing -- Search engines -- Hypertext markup language
Library science -- Technological innovations -- Periodicals
Libraries -- Automation -- Periodicals
Information science -- Periodicals
025.00285
- Journal URLs:
- http://www.emeraldinsight.com/0737-8831.htm
http://www.emeraldinsight.com/
http://firstsearch.oclc.org
- DOI:
- 10.1108/LHT-12-2014-0114
- Languages:
- English
- ISSNs:
- 0737-8831
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 5198.870000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 8240.xml