ScHFC: a hybrid fuzzy clustering method for single-cell RNA-seq data optimized by natural computation. Issue 2 (5th February 2022)
- Record Type:
- Journal Article
- Title:
- ScHFC: a hybrid fuzzy clustering method for single-cell RNA-seq data optimized by natural computation. Issue 2 (5th February 2022)
- Main Title:
- ScHFC: a hybrid fuzzy clustering method for single-cell RNA-seq data optimized by natural computation
- Authors:
- Wang, Jing
Xia, Junfeng
Tan, Dayu
Lin, Rongxin
Su, Yansen
Zheng, Chun-Hou
- Abstract:
- Rapid development of single-cell RNA sequencing (scRNA-seq) technology has allowed researchers to explore biological phenomena at the cellular scale. Clustering is a crucial step for studying cell heterogeneity. Although many clustering methods have been proposed, massive dropout events and the curse of dimensionality still make scRNA-seq data difficult to analyze, because they reduce the accuracy of clustering methods and lead to misidentification of cell types. In this work, we propose scHFC, a hybrid fuzzy clustering method optimized by natural computation and based on the Fuzzy C Mean (FCM) and Gath-Geva (GG) algorithms. Specifically, principal component analysis is used to reduce the dimensionality of the scRNA-seq data after preprocessing. Then, the FCM algorithm, optimized by a simulated annealing algorithm and a genetic algorithm, is applied to cluster the data and output a membership matrix, which represents the initial clustering result and is taken as the input for the GG algorithm to obtain the final clustering results. We also develop a cluster number estimation method called multi-index comprehensive estimation, which estimates the number of clusters by combining four clustering effectiveness indexes. The performance of the scHFC method is evaluated on 17 scRNA-seq datasets and compared with six state-of-the-art methods. Experimental results validate the better performance of our scHFC method in terms of clustering accuracy and algorithm stability. In short, scHFC is an effective method for clustering cells in scRNA-seq data, and it shows great potential for downstream analysis of scRNA-seq data. The source code is available at https://github.com/WJ319/scHFC .
- Is Part Of:
- Briefings in bioinformatics. Volume 23:Issue 2(2022)
- Journal:
- Briefings in bioinformatics
- Issue:
- Volume 23:Issue 2(2022)
- Issue Display:
- Volume 23, Issue 2 (2022)
- Year:
- 2022
- Volume:
- 23
- Issue:
- 2
- Issue Sort Value:
- 2022-0023-0002-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-02-05
- Subjects:
- scRNA-seq -- cell clustering -- fuzzy C Mean -- Gath-Geva -- natural computation
Genetics -- Data processing -- Periodicals
Molecular biology -- Data processing -- Periodicals
Genomes -- Data processing -- Periodicals
572.80285
- Journal URLs:
- http://bib.oxfordjournals.org
http://www.oxfordjournals.org/content?genre=journal&issn=1477-4054
http://ukcatalogue.oup.com/
http://firstsearch.oclc.org
- DOI:
- 10.1093/bib/bbab588
- Languages:
- English
- ISSNs:
- 1467-5463
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2283.958363
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20751.xml
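- Illustrative sketch (not part of the catalogue record):
- The first two stages described in the Abstract (dimension reduction by principal component analysis, then fuzzy C-means producing a membership matrix) might look roughly like the NumPy sketch below. This is not the authors' implementation: it uses plain fuzzy C-means with random initialisation, omits the simulated annealing and genetic algorithm optimisation, the Gath-Geva refinement and the cluster number estimation, and runs on synthetic data. The function names `pca_reduce` and `fuzzy_c_means` and all parameter values are assumptions made for illustration.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy C-means (hypothetical helper); returns (centers, U)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalised to sum to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the points.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # guard against division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


# Synthetic stand-in for a preprocessed expression matrix (cells x genes):
# two well-separated groups of 50 "cells" with 20 "genes" each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 20)),
               rng.normal(5.0, 1.0, (50, 20))])

Z = pca_reduce(X, n_components=2)            # reduced representation
centers, U = fuzzy_c_means(Z, n_clusters=2)  # U is the membership matrix
labels = U.argmax(axis=1)                    # hard labels, for inspection only
```

- In the method described in the Abstract, the membership matrix `U` would be passed on to the Gath-Geva step rather than converted directly into hard labels as done here.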