Privacy preserving distributed learning classifiers – Sequential learning with small sets of data. (September 2021)
- Record Type:
- Journal Article
- Title:
- Privacy preserving distributed learning classifiers – Sequential learning with small sets of data. (September 2021)
- Main Title:
- Privacy preserving distributed learning classifiers – Sequential learning with small sets of data
- Authors:
- Zerka, Fadila
Urovi, Visara
Bottari, Fabio
Leijenaar, Ralph T.H.
Walsh, Sean
Gabrani-Juma, Hanif
Gueuning, Martin
Vaidyanathan, Akshayaa
Vos, Wim
Occhipinti, Mariaelena
Woodruff, Henry C.
Dumontier, Michel
Lambin, Philippe
- Abstract:
- Background: Artificial intelligence (AI) typically requires a significant amount of high-quality data to build reliable models, and gathering enough data within a single institution can be particularly challenging. In this study we investigated the impact of using sequential learning to exploit very small, siloed sets of clinical and imaging data to train AI models. Furthermore, we evaluated the capacity of such models to achieve performance equivalent to models trained with the same data in a single centralized database. Methods: We propose a privacy preserving distributed learning framework that learns sequentially from each dataset. The framework is applied to three machine learning algorithms: Logistic Regression, Support Vector Machines (SVM), and Perceptron. The models were evaluated using four open-source datasets (Breast cancer, Indian liver, NSCLC-Radiomics, and Stage III NSCLC). Findings: The proposed framework achieved predictive performance comparable to a centralized learning approach. Pairwise DeLong tests showed no significant difference between the compared pairs for each dataset. Interpretation: Distributed learning contributes to preserving medical data privacy. We foresee this technology increasing the number of collaborative opportunities to develop robust AI, becoming the default solution in scenarios where collecting enough data from a single reliable source is logistically impossible. Distributed sequential learning provides a privacy preserving means for institutions with small but clinically valuable datasets to collaboratively train predictive AI models while preserving the privacy of their patients. Such models perform similarly to models built on a larger central dataset. Highlights: Distributed learning as a solution to preserve medical data privacy. Distributed learning to train models using small batch datasets and provide accurate predictions. Facilitating rare disease diagnosis using distributed learning.
- Is Part Of:
- Computers in biology and medicine. Volume 136(2021)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 136(2021)
- Issue Display:
- Volume 136, Issue 2021 (2021)
- Year:
- 2021
- Volume:
- 136
- Issue:
- 2021
- Issue Sort Value:
- 2021-0136-2021-0000
- Page Start:
- Page End:
- Publication Date:
- 2021-09
- Subjects:
- Distributed learning -- Sequential learning -- Rare disease -- Medical data privacy
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2021.104716
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 18638.xml