Using imputation to harmonize longitudinal measures of cognition across two large cohorts: AIBL and ADNI: Biomarkers (non‐neuroimaging) / Longitudinal change over time. (7th December 2020)
- Record Type:
- Journal Article
- Title:
- Using imputation to harmonize longitudinal measures of cognition across two large cohorts: AIBL and ADNI: Biomarkers (non‐neuroimaging) / Longitudinal change over time. (7th December 2020)
- Main Title:
- Using imputation to harmonize longitudinal measures of cognition across two large cohorts: AIBL and ADNI
- Authors:
- Shishegar, Rosita
Cox, Timothy
Rolls, David
Dore, Vincent
Lamb, Fiona
Robertson, Joanne S
Laws, Simon M
Porter, Tenielle
Maruff, Paul T
Savage, Greg
Rowe, Christopher C
Masters, Colin L
Weiner, Michael W.
Villemagne, Victor LL
Burnham, Samantha C
- Abstract:
- Background: To ensure the generalisability of findings and consider more nuanced hypotheses, larger sample sizes are required. Combining data from different but similar study cohorts is one solution. However, the disparity of these datasets, e.g. the use of differing tests to assess specific cognitive domains, makes this a non‐trivial task. Here, we propose a harmonisation solution using imputation strategies [1, 2] for cognitive memory performance in AIBL [3] and ADNI [4].
Method: AIBL participants (N=1813 [1180 NC, 297 MCI and 328 AD; aged 72.52±7.72; 777 males]) and ADNI participants (N=1945 [756 NC, 820 MCI, 369 AD; aged 73.92±7.23; 1028 males]) were included in this study. Scores for tests administered in one cohort but not the other were imputed in the cohort for which they were missing (the RAVLT [5] being used in ADNI and the CVLT‐II [6] in AIBL). Non‐parametric multivariate imputation using random forests (missForest) [7] was employed to impute the 'missing' data and test scores across both cohorts from the full neuropsychological testing batteries; age, gender, years of education and APOE‐ɛ4 [8] status were also used to inform the models. The discriminatory power of the actual and imputed scores to differentiate between NC, MCI and AD was assessed. To further validate the method, data for Logical Memory II (LMII; administered in both cohorts) were simulated as missing in each cohort in turn, and the correlation between the actual and imputed LMII test scores was assessed.
Result: Similarly high levels of significant discrimination (p<.001) between clinical classifications were observed for the actual and the imputed scores of the CVLT‐II and RAVLT (Figure 1). Comparing the imputed LMII test scores, which were simulated as missing, with the actual LMII scores resulted in correlation coefficients of ≥90%.
Conclusion: The need to combine large datasets will increase as we aim to improve the generalisability of findings and test more complex hypotheses. The results here suggest it is possible to use data imputation, capitalising on underlying structures and relationships, to impute specific test scores in a cohort for which that test was not administered. This, in turn, provides a practical solution for data harmonisation across large, cross‐sectional and longitudinal datasets.
References: [1] doi:10.1038/srep21689. [2] doi:10.1002/sim.5894. [3] doi:10.1017/S1041610209009405. [4] doi:10.1212/WNL.0b013e3181cb3e25. [5] doi:10.1002/1097-4679(199311)49:6<883::AID-JCLP2270490616>3.0.CO;2-6. [6] doi:10.1076/clin.14.4.526.7204. [7] doi:10.1093/bioinformatics/btr597. [8] doi:10.1212/wnl.43.8.1467.
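The "simulate as missing" validation described in the Method can be illustrated with a minimal sketch. It is not the authors' pipeline: a simple k-nearest-neighbour imputer stands in for missForest, the data are synthetic, and every name (`cohort_a`, `cohort_b`, `shared_test_1`, `shared_test_2`, `lm2`, and all functions) is hypothetical. The idea is to hide a score that was actually observed in one cohort, impute it from tests shared by both cohorts, and correlate the imputed values with the hidden ones.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def knn_impute(rows, target, predictors, k=5):
    """Return copies of rows with missing (None) target values filled in.

    Each missing value becomes the mean target score of the k donor rows
    (rows with the target observed) that are closest on the predictors.
    This is a stand-in for missForest, not an implementation of it.
    """
    donors = [r for r in rows if r[target] is not None]
    filled = []
    for row in rows:
        row = dict(row)
        if row[target] is None:
            nearest = sorted(
                donors,
                key=lambda d: sum((d[p] - row[p]) ** 2 for p in predictors),
            )[:k]
            row[target] = sum(d[target] for d in nearest) / len(nearest)
        filled.append(row)
    return filled


def simulate_cohorts(n_per_cohort=150, seed=0):
    """Synthetic data: three noisy tests driven by one latent ability."""
    rng = random.Random(seed)
    rows = []
    for cohort in ("cohort_a", "cohort_b"):
        for _ in range(n_per_cohort):
            ability = rng.gauss(0.0, 1.0)
            rows.append({
                "cohort": cohort,
                "shared_test_1": ability + rng.gauss(0.0, 0.3),
                "shared_test_2": ability + rng.gauss(0.0, 0.3),
                "lm2": ability + rng.gauss(0.0, 0.3),
            })
    return rows


def validate_by_masking(rows, masked_cohort="cohort_a"):
    """Hide lm2 in one cohort, impute it, and correlate with the truth."""
    actual = [r["lm2"] for r in rows if r["cohort"] == masked_cohort]
    masked = [
        {**r, "lm2": None} if r["cohort"] == masked_cohort else dict(r)
        for r in rows
    ]
    imputed_rows = knn_impute(masked, "lm2", ["shared_test_1", "shared_test_2"])
    imputed = [r["lm2"] for r in imputed_rows if r["cohort"] == masked_cohort]
    return pearson(actual, imputed)


rows = simulate_cohorts()
r = validate_by_masking(rows)
print(f"correlation between actual and imputed scores: {r:.2f}")
```

Swapping the masked cohort (`validate_by_masking(rows, "cohort_b")`) mirrors the abstract's step of simulating the shared test as missing in each cohort in turn. A random-forest imputer would replace `knn_impute` in a closer analogue of the published method.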
- Is Part Of:
- Alzheimer's & dementia. Volume 16 (2020) Supplement 5
- Journal:
- Alzheimer's & dementia
- Issue:
- Volume 16 (2020) Supplement 5
- Issue Display:
- Volume 16, Issue 5 (2020)
- Year:
- 2020
- Volume:
- 16
- Issue:
- 5
- Issue Sort Value:
- 2020-0016-0005-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2020-12-07
- Subjects:
- Alzheimer's disease -- Periodicals
Alzheimer Disease -- Periodicals
Dementia -- Periodicals
Démence
Maladie d'Alzheimer
Périodique électronique (Descripteur de forme)
Ressource Internet (Descripteur de forme)
616.83
- Journal URLs:
- http://www.sciencedirect.com/science/journal/15525260
http://www.elsevier.com/journals
- DOI:
- 10.1002/alz.044302
- Languages:
- English
- ISSNs:
- 1552-5260
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 0806.255333
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 15097.xml