Real-time Twitter data analysis using Hadoop ecosystem. Issue 1 (1st January 2018)
- Record Type:
- Journal Article
- Title:
- Real-time Twitter data analysis using Hadoop ecosystem. Issue 1 (1st January 2018)
- Main Title:
- Real-time Twitter data analysis using Hadoop ecosystem
- Authors:
- Rodrigues, Anisha P.
Chiplunkar, Niranjan N.
- Editors:
- Robnik-Šikonja, Marko
- Abstract:
- In the era of the Internet, social media has become an integral part of modern society. People use social media to share their opinions and to keep up to date with current trends on a daily basis. Twitter is one of the best-known social media platforms and receives a huge number of tweets each day. This information can be used for economic, industrial, social or governmental purposes by arranging and analyzing the tweets as required. Since Twitter contains a huge volume of data, storing and processing this data is a complex problem. Hadoop is a big data storage and processing framework for analyzing data characterized by the 3Vs, i.e. huge volume, variety and velocity. It has its own family of tools for different processing tasks, grouped under one umbrella called the Hadoop Ecosystem. In this paper, we analyze tweets streamed in real time. We use Apache Flume to capture real-time tweets. For the analysis, we propose a method for finding recent trends in tweets and perform sentiment analysis on real-time tweets. The analysis is done using Hadoop ecosystem tools such as Apache Hive and Apache Pig. Performance in terms of execution time is compared for the analysis of real-time tweets using Pig and Hive. From the experimental results, it can be concluded that Pig is more efficient than Hive, as Pig takes less time to execute.
- Is Part Of:
- Cogent engineering. Volume 5:Issue 1(2018)
- Journal:
- Cogent engineering
- Issue:
- Volume 5:Issue 1(2018)
- Issue Display:
- Volume 5, Issue 1 (2018)
- Year:
- 2018
- Volume:
- 5
- Issue:
- 1
- Issue Sort Value:
- 2018-0005-0001-0000
- Page Start:
- Page End:
- Publication Date:
- 2018-01-01
- Subjects:
- Apache Flume -- Apache Hive -- Apache Pig -- Hadoop
Engineering -- Periodicals
Technology -- Periodicals
Engineering
Technology
Periodicals
620
- Journal URLs:
- http://bibpurl.oclc.org/web/73324
http://cogentoa.tandfonline.com/journal/oaen20
http://www.tandfonline.com/toc/oaen20/1/1
http://www.tandfonline.com/
http://cogentoa.tandfonline.com/journal/oaps20
- DOI:
- 10.1080/23311916.2018.1534519
- Languages:
- English
- ISSNs:
- 2331-1916
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21686.xml