The negative skycube. (February 2020)
- Record Type:
- Journal Article
- Title:
- The negative skycube. (February 2020)
- Main Title:
- The negative skycube
- Authors:
- Alami, Karim
Hanusse, Nicolas
Kamnang-Wanko, Patrick
Maabout, Sofian - Abstract:
- Abstract: Multidimensional data analysis has attracted a lot of research efforts during past years. One of the aspects that has been addressed so far is that to allow users to analyze their data from different perspectives, each of which corresponds to a selected subset of dimensions. To optimize these analysis queries, precomputation, and materialization, are among most studied solutions. In the context of skyline analysis, the skycube structure has been proposed as an optimization structure to allow users to ask for the non dominated records with respect to every selected dimensions set. More precisely, given a set of dimensions D = { D 1, …, D d } and a relation T ( i d, D ), the Skycube of T is the set of all skylines obtained by considering each of the subsets of D (subspaces). To make the Skycube practically useful, two lines of research have been pursued so far: the first one aims to propose efficient algorithms for computing it. Note that the number of these skylines is exponential w.r.t. | D | . Hence, both execution time and storage space make these solutions struggling with even moderately large datasets, say | D | larger than 10 and the number of tuples greater than 1 0 6 . This motivated the second line of researches which propose Skycube summarization techniques to reduce both time and space consumption. Both lines of research, store the whole or a summary of the following information: "for every tuple t, keep track of the dimensions subsets X (subspaces) whereAbstract: Multidimensional data analysis has attracted a lot of research efforts during past years. One of the aspects that has been addressed so far is that to allow users to analyze their data from different perspectives, each of which corresponds to a selected subset of dimensions. To optimize these analysis queries, precomputation, and materialization, are among most studied solutions. In the context of skyline analysis, the skycube structure has been proposed as an optimization structure to allow users to ask for the non dominated records with respect to every selected dimensions set. More precisely, given a set of dimensions D = { D 1, …, D d } and a relation T ( i d, D ), the Skycube of T is the set of all skylines obtained by considering each of the subsets of D (subspaces). To make the Skycube practically useful, two lines of research have been pursued so far: the first one aims to propose efficient algorithms for computing it. Note that the number of these skylines is exponential w.r.t. | D | . Hence, both execution time and storage space make these solutions struggling with even moderately large datasets, say | D | larger than 10 and the number of tuples greater than 1 0 6 . This motivated the second line of researches which propose Skycube summarization techniques to reduce both time and space consumption. Both lines of research, store the whole or a summary of the following information: "for every tuple t, keep track of the dimensions subsets X (subspaces) where t belongs to the respective skyline". In this paper, we consider the complementary statement, i.e., "for every tuple t, we store a compact data structure encoding the subspaces X with respect to which, t is dominated ". This is what we call the negative skycube . Despite the apparent equivalence between the two statements (dominated vs not dominated), our analysis and extensive experiments show that these two points of view do not lead to the same behavior of the related algorithms. More specifically, our proposal shows that: (i) the negative summary can be obtained much faster than state of the art techniques for positive summaries, (ii) in general, it consumes less memory space, (iii) skyline queries evaluation using this summary is much faster, (iv) the positive Skycube can be obtained more rapidly than state of the art algorithms especially designed for this purpose, and (v) it is highly effective with respect to insertions and deletions. Highlights: We propose an auxiliary compact data structure, called NSC to optimize multidimensional skyline queries. We exhibit some properties to speed up NSC construction. We show the NP-hardness of NSC compression and provide an approximate solution to do so. We show how to maintain NSC after both insertions and deletions. We present experiments showing the advantages and limitations of our proposals. … (more)
- Is Part Of:
- Information systems. Volume 88(2020)
- Journal:
- Information systems
- Issue:
- Volume 88(2020)
- Issue Display:
- Volume 88, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 88
- Issue:
- 2020
- Issue Sort Value:
- 2020-0088-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-02
- Subjects:
- Multidimensional skyline -- Data structure -- Skycube -- Query optimization -- Incremental maintenance
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7 - Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379 ↗
http://www.elsevier.com/journals ↗ - DOI:
- 10.1016/j.is.2019.101443 ↗
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms) ↗
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store - Ingest File:
- 12595.xml