SJSON: A succinct representation for JSON documents. Issue 97 (March 2021)
- Record Type:
- Journal Article
- Title:
- SJSON: A succinct representation for JSON documents. Issue 97 (March 2021)
- Main Title:
- SJSON: A succinct representation for JSON documents
- Authors:
- Lee, Junhee
Anjos, Edman
Satti, Srinivasa Rao
- Abstract:
- The massive amounts of data processed in modern computational systems are becoming a problem of increasing importance. This data is commonly stored directly or indirectly through the use of data exchange languages, such as JSON (JavaScript Object Notation) and XML (eXtensible Markup Language), for human-readable platform-agnostic access. This paper focuses on exploring a set of succinct representations for JSON documents, which we call SJSON, achieving both reduced RAM and disk usage while supporting efficient queries on the documents. The representations we propose are mainly based on the idea that JSON documents can be decomposed into structural part and raw data part. In our method, we emulate the structure of the JSON document as a rooted ordered tree and represent it using succinct data structures, as opposed to the usual pointer-based implementation. Furthermore, the remaining raw data is reorganized into arrays of attributes and values. This deconstruction between structure and data allows for a straightforward connection between a node in the succinct tree and its corresponding name–value pair, dispensing pointers altogether. The proposed scheme is implemented as the SJSON library in C++, and evaluated with respect to a number of metrics, comparing its performance with popular alternative JSON parsers. Empirical results show that the library is able to represent JSON files succinctly while efficiently supporting traversal queries.
Highlights: A set of succinct representations of JSON documents is suggested. Succinct tree structure is used to store structure information of the document. Bit string indexing is used to maintain data part of the new document. Experimental results based on the C++ implementation exhibit space-efficiency.
- Is Part Of:
- Information systems. Issue 97(2021)
- Journal:
- Information systems
- Issue:
- Issue 97(2021)
- Issue Display:
- Volume 97, Issue 97 (2021)
- Year:
- 2021
- Volume:
- 97
- Issue:
- 97
- Issue Sort Value:
- 2021-0097-0097-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-03
- Subjects:
- JSON -- Succinct data structure -- Semi-structured document representation -- Heterogeneous array indexing
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2020.101686
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15425.xml
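
Note on the abstract: the decomposition it describes (tree structure stored separately from arrays of names and values, with no pointers between them) can be illustrated with a minimal C++17 sketch. This is not the SJSON library's API. All type and function names here (`Node`, `SuccinctDoc`, `example_doc`, `kNone`) are hypothetical. The sketch assumes a balanced-parentheses bit string as the tree encoding and uses linear scans where a real succinct structure would use rank/select indexes. A node is identified by the position of its opening parenthesis, and its preorder rank indexes the name and value arrays directly.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical pointer-based node, used only to build the example input.
struct Node {
    std::string name;             // empty for the root and for array elements
    std::string value;            // empty for objects and arrays
    std::vector<Node> children;
};

// Sentinel meaning "no such node".
constexpr std::size_t kNone = static_cast<std::size_t>(-1);

// Structure is a balanced-parentheses bit string; raw data lives in
// preorder arrays. No pointers connect the two parts.
struct SuccinctDoc {
    std::vector<bool> bp;             // true = '(', false = ')'
    std::vector<std::string> names;   // names[i] belongs to the i-th node in preorder
    std::vector<std::string> values;  // values[i] likewise

    // Preorder traversal: emit '(' on entry, ')' on exit.
    void build(const Node& n) {
        bp.push_back(true);
        names.push_back(n.name);
        values.push_back(n.value);
        for (const Node& c : n.children) build(c);
        bp.push_back(false);
    }

    // Number of '(' strictly before pos, i.e. the node's preorder index.
    // Linear scan here; an indexed bit vector would answer this in O(1).
    std::size_t rank_open(std::size_t pos) const {
        std::size_t r = 0;
        for (std::size_t i = 0; i < pos; ++i) r += bp[i] ? 1 : 0;
        return r;
    }

    // Position of the ')' matching the '(' at pos.
    std::size_t find_close(std::size_t pos) const {
        assert(bp[pos]);
        long depth = 0;
        for (std::size_t i = pos; i < bp.size(); ++i) {
            depth += bp[i] ? 1 : -1;
            if (depth == 0) return i;
        }
        return kNone;
    }

    std::size_t first_child(std::size_t pos) const {
        return (pos + 1 < bp.size() && bp[pos + 1]) ? pos + 1 : kNone;
    }

    std::size_t next_sibling(std::size_t pos) const {
        std::size_t c = find_close(pos);
        return (c != kNone && c + 1 < bp.size() && bp[c + 1]) ? c + 1 : kNone;
    }

    const std::string& name_at(std::size_t pos) const { return names[rank_open(pos)]; }
    const std::string& value_at(std::size_t pos) const { return values[rank_open(pos)]; }
};

// Hand-built tree for {"name":"SJSON","year":2021,"tags":["json","succinct"]}.
// Values are kept as strings to keep the sketch short.
SuccinctDoc example_doc() {
    Node root{"", "", {
        Node{"name", "SJSON", {}},
        Node{"year", "2021", {}},
        Node{"tags", "", {Node{"", "json", {}}, Node{"", "succinct", {}}}},
    }};
    SuccinctDoc doc;
    doc.build(root);
    return doc;
}
```

In this sketch the six-node example occupies 12 structure bits, and traversal needs only `first_child` and `next_sibling` on the bit string plus one rank to reach the matching name and value.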