A Python library for exploratory data analysis on twitter data based on tokens and aggregated origin–destination information. (February 2022)
- Record Type:
- Journal Article
- Title:
- A Python library for exploratory data analysis on twitter data based on tokens and aggregated origin–destination information. (February 2022)
- Main Title:
- A Python library for exploratory data analysis on twitter data based on tokens and aggregated origin–destination information
- Authors:
- Graff, Mario
Moctezuma, Daniela
Miranda-Jiménez, Sabino
Tellez, Eric S.
- Abstract:
- Abstract: Twitter is perhaps the social media platform most amenable to research. It requires only a few steps to obtain information, and there are plenty of libraries that can help in this regard. Nonetheless, knowing whether a particular event is expressed on Twitter is a challenging task that requires a considerable collection of tweets. This proposal aims to facilitate, for interested researchers, the process of mining events on Twitter by opening a collection of processed information taken from Twitter since December 2015. The events could be related to natural disasters, health issues, and people's mobility, among other studies that can be pursued with the proposed library. Different applications are presented in this contribution to illustrate the library's capabilities: an exploratory analysis of the topics discovered in tweets, a study on similarity among dialects of the Spanish language, and a mobility report on different countries. In summary, the Python library presented is applied to different domains and retrieves a plethora of information in terms of daily frequencies of words and word bi-grams for the Arabic, English, Spanish, and Russian languages, as well as mobility information related to the number of trips among locations for more than 200 countries or territories. Highlights: Library for Exploratory Data Analysis on Twitter data. Token frequencies of n-grams and q-grams in 4 languages for several years. Mobility information of many countries available since late 2015. …
- Is Part Of:
- Computers & geosciences. Volume 159(2022)
- Journal:
- Computers & geosciences
- Issue:
- Volume 159(2022)
- Issue Display:
- Volume 159, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 159
- Issue:
- 2022
- Issue Sort Value:
- 2022-0159-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-02
- Subjects:
- Twitter exploratory analysis -- Mobility patterns -- Open-source Python library
Environmental policy -- Periodicals
550.5
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00983004
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cageo.2021.105012
- Languages:
- English
- ISSNs:
- 0098-3004
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.695000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20668.xml