Integrative nearest neighbor classifier for block-missing multi-modality data. (July 2022)
- Record Type:
- Journal Article
- Title:
- Integrative nearest neighbor classifier for block-missing multi-modality data. (July 2022)
- Main Title:
- Integrative nearest neighbor classifier for block-missing multi-modality data
- Authors:
- Yu, Guan
Hou, Surui
- Abstract:
- In modern biomedical classification applications, data are often collected from multiple modalities, ranging from various omics technologies to brain scans. As different modalities provide complementary information, classifiers using multi-modality data usually have good classification performance. However, in many studies, due to the high cost of measures, in a lot of samples, some modalities are missing and therefore all data from those modalities are missing completely. In this case, the training data set is a block-missing multi-modality data set. In this paper, considering such classification problems, we develop a new weighted nearest neighbors classifier, called the integrative nearest neighbor (INN) classifier. INN harnesses all available information in the training data set and the feature vector of the test data point effectively to predict the class label of the test data point without deleting or imputing any missing data. Given a test data point, INN determines the weights on the training samples adaptively by minimizing the worst-case upper bound on the estimation error of the regression function over a convex class of functions. Our simulation study shows that INN outperforms common weighted nearest neighbors classifiers that only use complete training samples or modalities that are available in each sample. It performs better than methods that impute the missing data as well, even for the case where some modalities are missing not at random. The effectiveness of INN has been also demonstrated by our theoretical studies and a real application from the Alzheimer's disease neuroimaging initiative.
- Is Part Of:
- Statistical methods in medical research. Volume 31:Number 7(2022)
- Journal:
- Statistical methods in medical research
- Issue:
- Volume 31:Number 7(2022)
- Issue Display:
- Volume 31, Issue 7 (2022)
- Year:
- 2022
- Volume:
- 31
- Issue:
- 7
- Issue Sort Value:
- 2022-0031-0007-0000
- Page Start:
- 1242
- Page End:
- 1262
- Publication Date:
- 2022-07
- Subjects:
- Block-missing -- classification -- multi-modality data -- nearest neighbor -- prediction
Medicine -- Research -- Statistical methods -- Periodicals
Research -- Periodicals
Review Literature -- Periodicals
Statistics -- methods -- Periodicals
Médecine -- Recherche -- Méthodes statistiques -- Périodiques
610.727
- Journal URLs:
- http://smm.sagepub.com/
http://www.ingentaselect.com/rpsv/cw/arn/09622802/contp1.htm
http://www.uk.sagepub.com/home.nav
http://firstsearch.oclc.org
http://firstsearch.oclc.org/journal=0962-2802;screen=info;ECOIP
- DOI:
- 10.1177/09622802221084596
- Languages:
- English
- ISSNs:
- 0962-2802
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21495.xml