Effects of dataset size and interactions on the prediction performance of logistic regression and deep learning models. (January 2022)

Record Type:: Journal Article
Title:: Effects of dataset size and interactions on the prediction performance of logistic regression and deep learning models. (January 2022)
Main Title:: Effects of dataset size and interactions on the prediction performance of logistic regression and deep learning models
Authors:: Bailly, Alexandre
Blanc, Corentin
Francis, Élie
Guillotin, Thierry
Jamal, Fadi
Wakim, Béchara
Roy, Pascal
Abstract:: Highlights: Machine Learning and Deep Learning models are powerful disease prediction tools. Some of these models were trained on various dataset sizes and compared. ML models were less influenced by the dataset size but needed interaction terms. DL models achieved good results without interaction terms. Well-specified ML models performed better than DL models.
Abstract: Background and objective: Machine learning and deep learning models are very powerful in predicting the presence of a disease. To achieve good predictions, those models require a certain amount of data to train on, whereas this amount i) is generally limited and difficult to obtain; and, ii) increases with the complexity of the interactions between the outcome (disease presence) and the model variables. This study compares the ways training dataset size and interactions affect the performance of those prediction models.
Methods: To compare the two influences, several datasets were simulated that differed in the number of observations and the complexity of the interactions between the variables and the outcome. A few logistic regressions and neural networks were trained on the simulated datasets and their performance evaluated by cross-validation and compared using accuracy, F1 score, and AUC metrics.
Results: Models trained on simulated datasets without interactions provided good results: AUC close to 0.80 with either logistic regression or neural networks. Models trained on simulated dataset with order 2 …
Is Part Of:: Computer methods and programs in biomedicine. Volume 213(2022)
Journal:: Computer methods and programs in biomedicine
Issue:: Volume 213(2022)
Issue Display:: Volume 213, Issue 2022 (2022)
Year:: 2022
Volume:: 213
Issue:: 2022
Issue Sort Value:: 2022-0213-2022-0000
Publication Date:: 2022-01
Subjects:: Deep learning -- Machine learning -- Data set size -- Interactions
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
Journal URLs:: http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
DOI:: 10.1016/j.cmpb.2021.106504
Languages:: English
ISSNs:: 0169-2607
Deposit Type:: Legal deposit
View Content:: Available online (eLD content is only available in our Reading Rooms)
Physical Locations:: British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
Ingest File:: 20071.xml
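
Note:: The Methods summary in the abstract (simulate datasets that differ in size and in interaction complexity, then compare models) can be illustrated with a minimal sketch. This is not the authors' code, and every name and setting in it is an assumption made for illustration: two standard-normal predictors, an outcome that depends only on their order 2 interaction (coefficient `beta=3.0`), a plain gradient-descent logistic regression, and a single 70/30 hold-out split scored by accuracy. The study itself used cross-validation, accuracy, F1 score and AUC, and also trained neural networks, none of which are reproduced here.

```python
import math
import random


def simulate(n, rng, beta=3.0):
    """Draw n observations whose outcome depends only on the x1*x2 interaction.

    beta is a hypothetical interaction strength, not a value from the paper.
    """
    rows, labels = [], []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = 1.0 / (1.0 + math.exp(-beta * x1 * x2))
        rows.append((x1, x2))
        labels.append(1 if rng.random() < p else 0)
    return rows, labels


def design(rows, with_interaction):
    """Build feature vectors: intercept, main effects, optional product term."""
    return [
        [1.0, x1, x2] + ([x1 * x2] if with_interaction else [])
        for x1, x2 in rows
    ]


def sigmoid(z):
    # Clamp the argument so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=100):
    """Fit logistic regression weights by full-batch gradient descent."""
    w = [0.0] * len(X[0])
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def accuracy(w, X, y):
    """Share of observations classified correctly at the 0.5 threshold."""
    hits = sum(
        (sum(wj * xj for wj, xj in zip(w, xi)) > 0) == (yi == 1)
        for xi, yi in zip(X, y)
    )
    return hits / len(y)


def run(n, seed=0):
    """Return hold-out accuracy without (False) and with (True) the product term."""
    rng = random.Random(seed)
    rows, labels = simulate(n, rng)
    cut = int(0.7 * n)  # 70% train, 30% test
    result = {}
    for with_interaction in (False, True):
        X = design(rows, with_interaction)
        w = fit_logistic(X[:cut], labels[:cut])
        result[with_interaction] = accuracy(w, X[cut:], labels[cut:])
    return result


if __name__ == "__main__":
    for n in (200, 2000):
        print(n, run(n))
```

Under these assumptions the model without the product term has no usable signal, while the model that includes it can recover the outcome, which mirrors the highlight that logistic regression "needed interaction terms". Changing `n` in `run` gives a rough feel for the dataset-size effect the record describes.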