A comprehensive methodology for discovering semantic relationships among geospatial vocabularies using oceanographic data discovery as an example. Issue 11 (2nd November 2017)
- Record Type:
- Journal Article
- Title:
- A comprehensive methodology for discovering semantic relationships among geospatial vocabularies using oceanographic data discovery as an example. Issue 11 (2nd November 2017)
- Main Title:
- A comprehensive methodology for discovering semantic relationships among geospatial vocabularies using oceanographic data discovery as an example
- Authors:
- Jiang, Yongyao
Li, Yun
Yang, Chaowei
Liu, Kai
Armstrong, Edward M.
Huang, Thomas
Moroni, David F.
Finch, Christopher J.
- Abstract:
- ABSTRACT: It is challenging to find relevant data for research and development purposes in the geospatial big data era. One long-standing problem in data discovery is locating, assimilating and utilizing the semantic context for a given query. Most research in the geospatial domain has approached this problem in one of two ways: building a domain-specific ontology manually or automatically discovering semantic relationships using metadata and machine learning techniques. The former relies on rich expert knowledge but is static, costly and labor intensive, whereas the latter is automatic but prone to noise. An emerging trend in information science takes advantage of large-scale user search histories, which are dynamic but subject to user- and crawler-generated noise. Leveraging the benefits of these three approaches and avoiding their weaknesses, a novel methodology is proposed to (1) discover vocabulary-based semantic relationships from user search histories and clickstreams, (2) refine the similarity calculation methods from existing ontologies and (3) integrate the results of ontology, metadata, user search history and clickstream analysis to better determine their semantic relationships. An accuracy assessment by domain experts for the similarity values indicates an 83% overall accuracy for the top 10 related terms over randomly selected sample queries. This research functions as an example for building vocabulary-based semantic relationships for different geographical domains to improve various aspects of data discovery, including the accuracy of the vocabulary relationships of commonly used search terms.
- Is Part Of:
- International journal of geographical information science. Volume 31:Issue 11(2017)
- Journal:
- International journal of geographical information science
- Issue:
- Volume 31:Issue 11(2017)
- Issue Display:
- Volume 31, Issue 11 (2017)
- Year:
- 2017
- Volume:
- 31
- Issue:
- 11
- Issue Sort Value:
- 2017-0031-0011-0000
- Page Start:
- 2310
- Page End:
- 2328
- Publication Date:
- 2017-11-02
- Subjects:
- Query augmentation -- semantic search -- web mining -- search history -- click behavior -- big data
Geography -- Data processing -- Periodicals
Information storage and retrieval systems -- Periodicals
Géomatique -- Périodiques
Systèmes d'information -- Périodiques
910.285
- Journal URLs:
- http://www.tandfonline.com/loi/tgis20
http://www.tandfonline.com/
- DOI:
- 10.1080/13658816.2017.1357819
- Languages:
- English
- ISSNs:
- 1365-8816
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4542.266150
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 4455.xml