An adaptive Laplacian weight random forest imputation for imbalance and mixed-type data. Issue 111 (January 2023)
- Record Type:
- Journal Article
- Title:
- An adaptive Laplacian weight random forest imputation for imbalance and mixed-type data. Issue 111 (January 2023)
- Main Title:
- An adaptive Laplacian weight random forest imputation for imbalance and mixed-type data
- Authors:
- Ren, Lijuan
Seklouli, Aicha Sekhari
Zhang, Haiqing
Wang, Tao
Bouras, Abdelaziz
- Abstract:
- The application of information technology in the medical field is producing large amounts of medical data. Because participants withdraw early or refuse to take part, these data contain many missing values. Although various methods for handling missing values have been proposed, few address medical data that are both imbalanced and mixed-type. In this work, we propose an adaptive Laplacian weight random forest, called ALWRF. In ALWRF, feature weights are adjusted dynamically during model construction, which increases the selection probability of features with a low Laplacian score and high importance. A random operator is also introduced to increase the diversity of the trees. Furthermore, we propose an imputation method for imbalanced and mixed-type data, called SncALWRFI, based on the SMOTE-NC oversampling technique and ALWRF. Bayesian optimization and cross-validation are employed to search for optimal parameters. The experimental results show that ALWRF outperforms random forest and Bayesian-optimized random forest in classification and regression accuracy. In the missing-value experiments, SncALWRFI achieved the best imputation accuracy and imputed effectively on public datasets that are imbalanced and mixed-type.
- Is Part Of:
- Information systems. Issue 111(2023)
- Journal:
- Information systems
- Issue:
- Issue 111(2023)
- Issue Display:
- Volume 111, Issue 111 (2023)
- Year:
- 2023
- Volume:
- 111
- Issue:
- 111
- Issue Sort Value:
- 2023-0111-0111-0000
- Page Start:
- Page End:
- Publication Date:
- 2023-01
- Subjects:
- Missing values -- Imputation -- Random forest -- Mixed-type -- Imbalanced
Database management -- Periodicals
Electronic data processing -- Periodicals
Bases de données -- Gestion -- Périodiques
Informatique -- Périodiques
Database management
Electronic data processing
Periodicals
005.7
- Journal URLs:
- http://www.sciencedirect.com/science/journal/03064379
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.is.2022.102122
- Languages:
- English
- ISSNs:
- 0306-4379
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 4496.367300
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24109.xml