Multi-valued indexing in Apache AsterixDB (SI DOLAP 2022). Issue 113 (January 2023)
- Record Type:
- Journal Article
- Title:
- Multi-valued indexing in Apache AsterixDB (SI DOLAP 2022). Issue 113 (January 2023)
- Main Title:
- Multi-valued indexing in Apache AsterixDB (SI DOLAP 2022)
- Authors:
- Galvizo, Glenn
Carey, Michael J.
- Abstract:
- Secondary indexes in relational database systems are traditionally built under the assumption that one data record maps to one indexed value. Nowadays, particularly in NoSQL systems, single data records can hold collections of values that users want to access efficiently in an ad-hoc manner. Multi-valued indexes aim to give users the best of both worlds: (i) to keep a more natural data model of records with collections of values, and (ii) to reap the benefits of a secondary index.
In this paper, we detail the steps taken to realize multi-valued indexes in AsterixDB, a Big Data management system with a structured query language operating over a collection of documents. This includes (a) creating the specification language for such indexes, (b) illustrating data flows for bulk-loading and maintaining an index, and (c) discussing query plans to take advantage of multi-valued indexes for use in predicates with existential and universal quantification. We conclude with experiments that measure the impact of maintaining an AsterixDB multi-valued index and experiments that compare the query performance of our multi-valued indexes against similar indexes in MongoDB and Couchbase Server's Query Service.
Highlights: NoSQL databases such as AsterixDB support nested and multi-valued fields. A data definition facility is provided for specifying multi-valued indexes. Data flows for multi-valued index creation and maintenance are detailed. Query compilation and processing with multi-valued indexes are described. AsterixDB's multi-valued indexes perform well compared to other systems.
- Is Part Of:
- Information systems. Issue 113(2023)
- Journal:
- Information systems
- Issue:
- Issue 113(2023)
- Issue Display:
- Volume 113, Issue 113 (2023)
- Year:
- 2023
- Volume:
- 113
- Issue:
- 113
- Issue Sort Value:
- 2023-0113-0113-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- Multi-valued indexing -- Index specification -- Index implementation -- Query optimization -- AsterixDB
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2022.102144
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25942.xml