GISQAF: MapReduce guided spatial query processing and analytics system. (18th December 2015)
- Record Type:
- Journal Article
- Title:
- GISQAF: MapReduce guided spatial query processing and analytics system. (18th December 2015)
- Main Title:
- GISQAF: MapReduce guided spatial query processing and analytics system
- Authors:
- Al‐Naami, Khaled Mohammed
Seker, Sadi Evren
Khan, Latifur
- Abstract:
- Summary: The Global Database of Event, Language, and Tone (GDELT) is the only global political georeferenced event dataset with more than 250 million observations covering all countries in the world since January 1, 1979. TABARI and CAMEO are the tools that are used to collect and code events from all international news coverage. To query such big geospatial data, traditional RDBMS can no longer be used, and the need for parallel distributed solutions has become a necessity. MapReduce paradigm has proven to be a scalable platform to process and analyze Big Data in the cloud. Hadoop, as an implementation of MapReduce, is an open‐source application that has been widely used and accepted in academia and industry. However, when dealing with Spatial Data, Hadoop is not equipped well and does not perform efficiently. SpatialHadoop is an extension of Hadoop with the support of spatial data. In this paper, we present Geographic Information System Query and Analytics Framework (GISQAF), which has been built on top of SpatialHadoop. GISQAF focuses on two parts: query processing and data analytics. For the query processing part, we show how this solution outperforms Hadoop query processing by orders of magnitude when applying queries on the GDELT dataset with a size of 60 GB. We show the results for various types of queries. For the data analytics part, we present an approach for finding Spatial co‐occurring events. We show how GISQAF is suitable and efficient to handle data analytics techniques. Copyright © 2015 John Wiley & Sons, Ltd.
- Is Part Of:
- Software, practice & experience. Volume 46:Number 10(2016)
- Journal:
- Software, practice & experience
- Issue:
- Volume 46:Number 10(2016)
- Issue Display:
- Volume 46, Issue 10 (2016)
- Year:
- 2016
- Volume:
- 46
- Issue:
- 10
- Issue Sort Value:
- 2016-0046-0010-0000
- Page Start:
- 1329
- Page End:
- 1349
- Publication Date:
- 2015-12-18
- Subjects:
- big data -- MapReduce -- Hadoop -- spatial query processing -- data analytics -- spatial co‐occurring events
Computer software -- Periodicals
Computer programming -- Periodicals
Computer programs -- Periodicals
005.3
- Journal URLs:
- http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/spe.2383
- Languages:
- English
- ISSNs:
- 0038-0644
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 8321.453000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 1106.xml