ReX: Representative extrapolating relational databases. (July 2017)
- Record Type:
- Journal Article
- Title:
- ReX: Representative extrapolating relational databases. (July 2017)
- Main Title:
- ReX: Representative extrapolating relational databases
- Authors:
- Buda, Teodora Sandra
Cerqueus, Thomas
Grava, Cristian
Murphy, John
- Abstract:
- Highlights: An automated database extrapolation system that generates a representative extrapolated database of an original database, given a rational scaling rate. An automated sampling method, called RVFDS, that aims to generate a representative sample of the original database. This method is used in addition to the extrapolation method to handle rational scaling rates. A comprehensive experimental comparison, using both real and synthetic databases, between a state-of-the-art scaling technique, namely UpSizeR, and our proposed representative scaling system, ReX.
Abstract: Generating synthetic data is useful in multiple application areas (e.g., database testing, software testing). Nevertheless, existing synthetic data generators are either limited to generating data that only respects the database schema constraints, or they are not accurate in terms of representativeness unless a complex set of inputs is given by the user (such as the data characteristics of the desired generated data). In this paper, we present an extension of a prior representative extrapolation technique, namely ReX [20], which was limited to natural scaling rates. The objective is to produce, in an automated and efficient way, a representative extrapolated database, given an original database O and a rational scaling rate, s ∈ Q. In the extended version, the ReX system can handle rational scaling rates by combining existing efficient sampling and extrapolation techniques. Furthermore, we propose a novel sampling technique, RVFDS, for handling positive rational values for the desired size of the generated database. We evaluate ReX in comparison with a realistic scaling method, namely UpSizeR [43], on both real and synthetic databases. We show that our solution statistically and significantly outperforms the compared method for rational scaling rates in terms of representativeness.
- Is Part Of:
- Information systems. Volume 67(2017)
- Journal:
- Information systems
- Issue:
- Volume 67(2017)
- Issue Display:
- Volume 67, Issue 2017 (2017)
- Year:
- 2017
- Volume:
- 67
- Issue:
- 2017
- Issue Sort Value:
- 2017-0067-2017-0000
- Page Start:
- 83
- Page End:
- 99
- Publication Date:
- 2017-07
- Subjects:
- Relational database -- Extrapolation -- Sampling -- Representative -- Random
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2017.03.001
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 165.xml