Graph node rank based important keyword detection from Twitter. Issue 2 (17th July 2020)
- Record Type:
- Journal Article
- Title:
- Graph node rank based important keyword detection from Twitter. Issue 2 (17th July 2020)
- Main Title:
- Graph node rank based important keyword detection from Twitter
- Authors:
- Kumar, Mukesh
Rehan, Palak
- Abstract:
- Social media networks like Twitter, Facebook, WhatsApp etc. are the most commonly used media for sharing news and opinions and for staying in touch with peers. Messages on Twitter are limited to 140 characters. This led users to create their own novel syntax in tweets to express more in fewer words. Free writing style, use of URLs, markup syntax, inappropriate punctuation, ungrammatical structures, abbreviations etc. make it harder to mine useful information from them. For each tweet, we can get an explicit time stamp, the name of the user, the social network the user belongs to, or even the GPS coordinates if the tweet is created with a GPS-enabled mobile device. With these features, Twitter is, by nature, a good resource for detecting and analyzing real-time events happening around the world. By using the speed and coverage of Twitter, we can detect events, each a sequence of important keywords being talked about, in a timely manner, which can be used in applications such as natural calamity relief support, earthquake relief support, product launches, suspicious activity detection etc. The keyword detection process from Twitter can be seen as a two-step process: detection of keywords in raw text form (words as posted by the users) and keyword normalization (reforming the users' unstructured words into complete, meaningful English words). In this paper, a keyword detection technique based upon the graph, spanning tree and Page Rank algorithm is proposed. A text normalization technique based upon a hybrid approach using Levenshtein distance, demetaphone algorithm and dictionary mapping is proposed to work upon the unstructured keywords produced by the proposed keyword detector. The proposed normalization technique is validated using the standard lexnorm 1.2 dataset. The proposed system is used to detect keywords from Twitter text posted in real time. The detected and normalized keywords are further validated against search engine results at a later time for detection of events.
- Is Part Of:
- Applied computing and informatics. Volume 17:Issue 2(2021)
- Journal:
- Applied computing and informatics
- Issue:
- Volume 17:Issue 2(2021)
- Issue Display:
- Volume 17, Issue 2 (2021)
- Year:
- 2021
- Volume:
- 17
- Issue:
- 2
- Issue Sort Value:
- 2021-0017-0002-0000
- Page Start:
- 194
- Page End:
- 209
- Publication Date:
- 2020-07-17
- Subjects:
- Graphs -- Normalization -- Social media -- Spanning trees
Information science -- Periodicals
Information storage and retrieval systems -- Periodicals
004
- Journal URLs:
- https://www.emerald.com/insight/publication/issn/2634-1964
http://www.elsevier.com/journals
https://www.emeraldgrouppublishing.com/journal/aci
- DOI:
- 10.1016/j.aci.2018.08.002
- Languages:
- English
- ISSNs:
- 2210-8327
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 22326.xml