Alzheimer-type dementia prediction by sparse logistic regression using claim data. (November 2020)
- Record Type:
- Journal Article
- Title:
- Alzheimer-type dementia prediction by sparse logistic regression using claim data. (November 2020)
- Main Title:
- Alzheimer-type dementia prediction by sparse logistic regression using claim data
- Authors:
- Fukunishi, Hiroaki
Nishiyama, Mitsuki
Luo, Yuan
Kubo, Masahiro
Kobayashi, Yasuki
- Abstract:
- Highlights: This study developed an Alzheimer-type dementia prediction model based on health insurance claim data and long-term care claim data for elderly people in Japan. Feature selection was a critical issue in utilizing claim data, which include a large amount of information. Sparse logistic regression models with L0 regularization (SLR-L0) and L1 regularization (SLR-L1) were used for feature selection. SLR-L0 was more effective than SLR-L1 for selecting influential features.
Abstract: This study aimed to predict the risk of Alzheimer-type dementia for persons aged over 75 years who were not receiving long-term care services, using regularly collected claim data. A refined dataset of 48,123 persons was prepared from health insurance and long-term care insurance claim data in a large city in the metropolitan area in Japan. The features used include the age and sex of subjects, 502 diseases based on ICD-10 diagnosis codes, and 107 prescription drugs based on therapeutic classes. The most important challenge in this work was feature selection from a large number of features. We adopted sparse logistic regression models with L0 regularization (SLR-L0) and L1 regularization (SLR-L1) as classification models based on machine learning. These regularizations enable feature selection by estimating a sparse solution of non-zero coefficients in the model optimization. Predictions were performed by integrating 100 predictors trained on bootstrap samples. As a result, the areas under the ROC curve (AUCs) were 0.663 for SLR-L0 and 0.660 for SLR-L1. These performances were similar; however, the average numbers of selected features, out of a total of 611, were 13 for SLR-L0 and 253 for SLR-L1. The results indicate that SLR-L1 tended to include less useful features, whereas SLR-L0 narrowed the set down to influential features. SLR-L0 might be more useful than SLR-L1 for practical use or for the discussion of risk factors with medical experts.
- Is Part Of:
- Computer methods and programs in biomedicine. Volume 196(2020)
- Journal:
- Computer methods and programs in biomedicine
- Issue:
- Volume 196(2020)
- Issue Display:
- Volume 196, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 196
- Issue:
- 2020
- Issue Sort Value:
- 2020-0196-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-11
- Subjects:
- Machine learning -- Alzheimer-type dementia -- Prediction -- Sparse logistic regression -- Health insurance claim data -- Long-term care insurance claim data
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cmpb.2020.105582
- Languages:
- English
- ISSNs:
- 0169-2607
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 14758.xml
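
Illustrative note on the abstract: the pipeline it describes (sparse logistic regression, an ensemble of predictors trained on bootstrap samples, evaluation by AUC) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation. It covers only the L1 variant, fitted here by proximal gradient descent with soft thresholding; the L0 variant is not shown. The data are synthetic binary flags standing in for diagnosis and prescription features, the ensemble uses 10 predictors rather than 100 to keep the run short, and every function name and parameter value (`fit_l1_logistic`, `lam`, `step`, `make_data`, and so on) is hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_one(w, b, x):
    """Predicted probability for a single feature vector."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_l1_logistic(X, y, lam=0.02, step=0.5, n_iter=100):
    """L1-penalized logistic regression via proximal gradient descent.

    lam, step and n_iter are illustrative values, not taken from the paper.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict_one(w, b, xi) - yi
            grad_b += err
            for j, xij in enumerate(xi):
                if xij:
                    grad_w[j] += err * xij
        b -= step * grad_b / n  # the bias term is not penalized
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if sp > sn else 0.5 if sp == sn else 0.0
               for sp in pos for sn in neg)
    return wins / (len(pos) * len(neg))


def make_data(n, p, rng):
    """Synthetic binary features; only the first two affect the outcome."""
    X = [[1.0 if rng.random() < 0.3 else 0.0 for _ in range(p)]
         for _ in range(n)]
    y = [1 if rng.random() < sigmoid(-1.5 + 2.0 * x[0] + 2.0 * x[1]) else 0
         for x in X]
    return X, y


rng = random.Random(0)
X_train, y_train = make_data(200, 10, rng)
X_test, y_test = make_data(200, 10, rng)

# Train one sparse model per bootstrap resample of the training set.
models = []
for _ in range(10):
    idx = [rng.randrange(len(X_train)) for _ in X_train]
    models.append(fit_l1_logistic([X_train[i] for i in idx],
                                  [y_train[i] for i in idx]))

# Integrate the predictors by averaging their predicted probabilities.
scores = [sum(predict_one(w, b, x) for w, b in models) / len(models)
          for x in X_test]
test_auc = auc(scores, y_test)

# Average number of non-zero coefficients, i.e. selected features.
avg_selected = sum(sum(1 for wj in w if wj != 0.0)
                   for w, _ in models) / len(models)
avg_w = [sum(w[j] for w, _ in models) / len(models) for j in range(10)]

print("test AUC:", round(test_auc, 3))
print("average selected features:", avg_selected)
```

Averaging probabilities is only one way to integrate the predictors; the abstract does not say which rule was used. Counting non-zero coefficients per model mirrors the "average number of selected features" reported there.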