Reliable or not? An automated classification of webpages about early childhood vaccination using supervised machine learning. Issue 6 (June 2021)
- Record Type:
- Journal Article
- Title:
- Reliable or not? An automated classification of webpages about early childhood vaccination using supervised machine learning. Issue 6 (June 2021)
- Main Title:
- Reliable or not? An automated classification of webpages about early childhood vaccination using supervised machine learning
- Authors:
- Meppelink, Corine S.
Hendriks, Hanneke
Trilling, Damian
van Weert, Julia C.M.
Shao, Anqi
Smit, Eline S.
- Abstract:
- Highlights: Many people find it hard to determine whether online health information is reliable. Supervised machine learning (SML) is a useful way to classify online content. Four SML-models were trained on data about early childhood vaccination. Our best performing model also successfully classified texts about HPV vaccination. Basic classifiers are particularly useful to identify reliable information.
Abstract: Objective: To investigate the applicability of supervised machine learning (SML) to classify health-related webpages as 'reliable' or 'unreliable' in an automated way. Methods: We collected the textual content of 468 different Dutch webpages about early childhood vaccination. Webpages were manually coded as 'reliable' or 'unreliable' based on their alignment with evidence-based vaccination guidelines. Four SML models were trained on part of the data, whereas the remaining data was used for model testing. Results: All models appeared to be successful in the automated identification of unreliable (F1 scores: 0.54–0.86) and reliable information (F1 scores: 0.82–0.91). Typical words for unreliable information are 'dr', 'immune system', and 'vaccine damage', whereas 'measles', 'child', and 'immunization rate', were frequent in reliable information. Our best performing model was also successful in terms of out-of-sample prediction, tested on a dataset about HPV vaccination. Conclusion: Automated classification of online content in terms of reliability, using basic classifiers, performs well and is particularly useful to identify reliable information. Practice implications: The classifiers can be used as a starting point to develop more complex classifiers, but also warning tools which can help people evaluate the content they encounter online.
- Is Part Of:
- Patient education and counseling. Volume 104:Issue 6(2021)
- Journal:
- Patient education and counseling
- Issue:
- Volume 104:Issue 6(2021)
- Issue Display:
- Volume 104, Issue 6 (2021)
- Year:
- 2021
- Volume:
- 104
- Issue:
- 6
- Issue Sort Value:
- 2021-0104-0006-0000
- Page Start:
- 1460
- Page End:
- 1466
- Publication Date:
- 2021-06
- Subjects:
- Supervised machine learning -- Consumer health information -- Vaccination -- Misinformation -- Reliability
Patient education -- Periodicals
Health counseling -- Periodicals
Health education -- Periodicals
Counseling -- Periodicals
Patient Education -- Periodicals
Éducation des patients -- Périodiques
Counseling -- Périodiques
Éducation sanitaire -- Périodiques
615.5071
- Journal URLs:
- http://www.sciencedirect.com/science/journal/07383991
http://www.clinicalkey.com/dura/browse/journalIssue/07383991
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.pec.2020.11.013
- Languages:
- English
- ISSNs:
- 0738-3991
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 6412.864600
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 16886.xml