Automatic Extraction of Medication Mentions from Tweets—Overview of the BioCreative VII Shared Task 3 Competition. (3rd February 2023)
- Record Type:
- Journal Article
- Title:
- Automatic Extraction of Medication Mentions from Tweets—Overview of the BioCreative VII Shared Task 3 Competition. (3rd February 2023)
- Main Title:
- Automatic Extraction of Medication Mentions from Tweets—Overview of the BioCreative VII Shared Task 3 Competition
- Authors:
- Weissenbacher, Davy
O'Connor, Karen
Rawal, Siddharth
Zhang, Yu
Tsai, Richard Tzong-Han
Miller, Timothy
Xu, Dongfang
Anderson, Carol
Liu, Bo
Han, Qing
Zhang, Jinfeng
Kulev, Igor
Köprü, Berkay
Rodriguez-Esteban, Raul
Ozkirimli, Elif
Ayach, Ammer
Roller, Roland
Piccolo, Stephen
Han, Peijin
Vydiswaran, V G Vinod
Tekumalla, Ramya
Banda, Juan M
Bagherzadeh, Parsa
Bergler, Sabine
Silva, João F
Almeida, Tiago
Martinez, Paloma
Rivera-Zavala, Renzo
Wang, Chen-Kai
Dai, Hong-Jie
Alberto Robles Hernandez, Luis
Gonzalez-Hernandez, Graciela
- Abstract:
- This study presents the outcomes of the shared task competition BioCreative VII (Task 3) focusing on the extraction of medication names from a Twitter user's publicly available tweets (the user's 'timeline'). In general, detecting health-related tweets is notoriously challenging for natural language processing tools. The main challenge, aside from the informality of the language used, is that people tweet about any and all topics, and most of their tweets are not related to health. Thus, finding those tweets in a user's timeline that mention specific health-related concepts such as medications requires addressing extreme imbalance. Task 3 called for detecting tweets in a user's timeline that mention a medication name and, for each detected mention, extracting its span. The organizers made available a corpus consisting of 182 049 tweets publicly posted by 212 Twitter users with all medication mentions manually annotated. The corpus exhibits the natural distribution of positive tweets, with only 442 tweets (0.2%) mentioning a medication. This task was an opportunity for participants to evaluate methods that are robust to class imbalance beyond the simple lexical match. A total of 65 teams registered, and 16 teams submitted a system run. This study summarizes the corpus created by the organizers and the approaches taken by the participating teams for this challenge. The corpus is freely available at https://biocreative.bioinformatics.udel.edu/tasks/biocreative-vii/track-3/. The methods and the results of the competing systems are analyzed with a focus on the approaches taken for learning from class-imbalanced data.
- Is Part Of:
- Database. Volume 2023(2023)
- Journal:
- Database
- Issue:
- Volume 2023(2023)
- Issue Display:
- Volume 2023, Issue 2023 (2023)
- Year:
- 2023
- Volume:
- 2023
- Issue:
- 2023
- Issue Sort Value:
- 2023-2023-2023-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-02-03
- Subjects:
- Biology -- Databases -- Periodicals
Bioinformatics -- Periodicals
570.285
- Journal URLs:
- http://database.oxfordjournals.org/
http://ukcatalogue.oup.com/
- DOI:
- 10.1093/database/baac108
- Languages:
- English
- ISSNs:
- 1758-0463
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 25834.xml