A literature embedding model for cardiovascular disease prediction using risk factors, symptoms, and genotype information. (1st March 2023)
- Record Type:
- Journal Article
- Title:
- A literature embedding model for cardiovascular disease prediction using risk factors, symptoms, and genotype information. (1st March 2023)
- Main Title:
- A literature embedding model for cardiovascular disease prediction using risk factors, symptoms, and genotype information
- Authors:
- Moon, Jihye
Posada-Quintero, Hugo F.
Chon, Ki H.
- Abstract:
- Highlights: Literature abstracts provide multifaceted information for cardiovascular disease prediction. Literature abstract data enables identifying new risk factors for cardiovascular disease. Unknown genetic cardiovascular risk factors could be annotated by literature mining. Literature-based artificial intelligence could predict subjects' cardiovascular events.
Abstract: Accurate prediction of cardiovascular disease (CVD) requires multifaceted information consisting of not only a patient's medical history, but also genomic data, symptoms, lifestyle, and risk factors, which are often not incorporated into the decision-making process because the data are vast, difficult to obtain, and require complex algorithms. However, given the vast amount of publicly available information and recent advances in machine and deep learning techniques, acquiring multifaceted data for accurate diagnosis and medical decision making via artificial intelligence has become more attainable. In this work we illustrate a literature embedding model, which identifies various risk factors, symptoms, mechanisms, and genes associated with CVD using a query word (e.g., "stroke"), and then, based on the collected information, predicts whether or not a person will be susceptible to CVD. For this purpose, we collected published literature from PubMed using search keywords consisting of a word such as "heart" and 19,264 human gene names, then trained our literature embedding model using the collected abstracts. For the intrinsic evaluation, we analyzed whether or not the captured words and genes were correctly identified as risk factors and associated symptoms for the input query words. For the extrinsic evaluation, we used our embedding model for feature selection and dimensionality reduction on cohort data for CVD prediction. Our model accurately (average accuracy of >96%) captured associated risk factors, symptoms, and genes for a given input query word (intrinsic evaluation). Using the selected features and reduced dimensions, our method provided better performance for CVD prediction with less computational time when compared with other popular methods (extrinsic evaluation). Our model provided outstanding results for both evaluation tasks, which enables accurate risk factor identification for CVD prediction.
Our model has the potential to facilitate easier collation of multifaceted information for better data mining of vast publicly available data so that efficient and accurate risk factors and symptoms can be identified, which enables better-informed decisions for CVD prediction and treatment.
- Is Part Of:
- Expert systems with applications. Volume 213: Part A (2023)
- Journal:
- Expert systems with applications
- Issue:
- Volume 213: Part A (2023)
- Issue Display:
- Volume 213, Issue 1 (2023)
- Year:
- 2023
- Volume:
- 213
- Issue:
- 1
- Issue Sort Value:
- 2023-0213-0001-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-03-01
- Subjects:
- Machine learning -- Cardiovascular disease prediction -- Knowledge extraction -- Natural language processing -- Risk factor identification -- Information retrieval
Expert systems (Computer science) -- Periodicals
Systèmes experts (Informatique) -- Périodiques
Electronic journals
006.33
- Journal URLs:
- http://www.sciencedirect.com/science/journal/09574174
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.eswa.2022.118930
- Languages:
- English
- ISSNs:
- 0957-4174
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3842.004220
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24386.xml