SETL: A programmable semantic extract-transform-load framework for semantic data warehouses. (August 2017)
- Record Type:
- Journal Article
- Title:
- SETL: A programmable semantic extract-transform-load framework for semantic data warehouses. (August 2017)
- Main Title:
- SETL: A programmable semantic extract-transform-load framework for semantic data warehouses
- Authors:
- Deb Nath, Rudra Pratap
Hose, Katja
Pedersen, Torben Bach
Romero, Oscar
- Abstract:
- Highlights: This paper describes our programmable Semantic ETL (SETL) framework. SETL builds on Semantic Web (SW) standards and tools. SETL provides a number of powerful modules, classes, and methods for (dimensional and semantic) DW constructs and tasks. SETL supports semantic and traditional data sources, semantic integration, and creating or publishing a multidimensional (MD) semantic DW. Using SETL, we perform a comprehensive experimental evaluation by producing an MD semantic DW that integrates semantic and non-semantic data sources. The evaluation shows that SETL improves considerably over the competing solutions/tools in terms of productivity, knowledge base (KB) quality, and performance.
Abstract: To make better decisions for business analytics, organizations increasingly use external structured, semi-structured, and unstructured data in addition to the (mostly structured) internal data. Current Extract-Transform-Load (ETL) tools are not suitable for this "open world scenario" because they do not consider semantic issues in the integration process. Current ETL tools neither support processing semantic data nor create a semantic Data Warehouse (DW), a repository of semantically integrated data. This paper describes our programmable Semantic ETL (SETL) framework. SETL builds on Semantic Web (SW) standards and tools and supports developers by offering a number of powerful modules, classes, and methods for (dimensional and semantic) DW constructs and tasks. Thus it supports semantic data sources in addition to traditional data sources, semantic integration, and creating or publishing a semantic (multidimensional) DW in terms of a knowledge base. A comprehensive experimental evaluation comparing SETL to a solution made with traditional tools (requiring much more hand-coding) on a concrete use case shows that SETL provides better programmer productivity, knowledge base quality, and performance.
- Is Part Of:
- Information systems. Volume 68(2017)
- Journal:
- Information systems
- Issue:
- Volume 68(2017)
- Issue Display:
- Volume 68, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 68
- Issue:
- 2017
- Issue Sort Value:
- 2017-0068-2017-0000
- Page Start:
- 17
- Page End:
- 43
- Publication Date:
- 2017-08
- Subjects:
- ETL -- RDF -- Semantic integration -- Data warehouse -- Semantic-aware -- Knowledge base
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2017.01.005
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 1071.xml