Use of unstructured text in prognostic clinical prediction models: a systematic review. (27th April 2022)
- Record Type:
- Journal Article
- Title:
- Use of unstructured text in prognostic clinical prediction models: a systematic review. (27th April 2022)
- Main Title:
- Use of unstructured text in prognostic clinical prediction models: a systematic review
- Authors:
- Seinen, Tom M
Fridgeirsson, Egill A
Ioannou, Solomon
Jeannetot, Daniel
John, Luis H
Kors, Jan A
Markus, Aniek F
Pera, Victor
Rekkas, Alexandros
Williams, Ross D
Yang, Cynthia
van Mulligen, Erik M
Rijnbeek, Peter R
- Abstract:
- Objective: This systematic review aims to assess how information from unstructured text is used to develop and validate clinical prognostic prediction models. We summarize the prediction problems and methodological landscape and determine whether using text data in addition to more commonly used structured data improves the prediction performance.
Materials and Methods: We searched Embase, MEDLINE, Web of Science, and Google Scholar to identify studies that developed prognostic prediction models using information extracted from unstructured text in a data-driven manner, published in the period from January 2005 to March 2021. Data items were extracted, analyzed, and a meta-analysis of the model performance was carried out to assess the added value of text to structured-data models.
Results: We identified 126 studies that described 145 clinical prediction problems. Combining text and structured data improved model performance, compared with using only text or only structured data. In these studies, a wide variety of dense and sparse numeric text representations were combined with both deep learning and more traditional machine learning methods. External validation, public availability, and attention for the explainability of the developed models were limited.
Conclusion: The use of unstructured text in the development of prognostic prediction models has been found beneficial in addition to structured data in most studies. The text data are a source of valuable information for prediction model development and should not be neglected. We suggest a future focus on explainability and external validation of the developed models, promoting robust and trustworthy prediction models in clinical practice.
- Is Part Of:
- Journal of the American Medical Informatics Association. Volume 29:Number 7(2022)
- Journal:
- Journal of the American Medical Informatics Association
- Issue:
- Volume 29:Number 7(2022)
- Issue Display:
- Volume 29, Issue 7 (2022)
- Year:
- 2022
- Volume:
- 29
- Issue:
- 7
- Issue Sort Value:
- 2022-0029-0007-0000
- Page Start:
- 1292
- Page End:
- 1302
- Publication Date:
- 2022-04-27
- Subjects:
- clinical prediction model -- prognostic prediction -- natural language processing -- machine learning -- electronic health records
Medical informatics -- Periodicals
Information Services -- Periodicals
Medical Informatics -- Periodicals
Médecine -- Informatique -- Périodiques
Informatica
Geneeskunde
Informatique médicale
Computer network resources
Electronic journals
610.285
- Journal URLs:
- http://jamia.bmj.com/
http://www.jamia.org
http://www.pubmedcentral.nih.gov/tocrender.fcgi?journal=76
http://www.sciencedirect.com/science/journal/10675027
http://jamia.oxfordjournals.org/
http://www.oxfordjournals.org/en/
- DOI:
- 10.1093/jamia/ocac058
- Languages:
- English
- ISSNs:
- 1067-5027
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4689.025000
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 21814.xml