A semi-supervised deep learning method based on stacked sparse auto-encoder for cancer prediction using RNA-seq data. (November 2018)
- Record Type:
- Journal Article
- Title:
- A semi-supervised deep learning method based on stacked sparse auto-encoder for cancer prediction using RNA-seq data. (November 2018)
- Main Title:
- A semi-supervised deep learning method based on stacked sparse auto-encoder for cancer prediction using RNA-seq data
- Authors:
- Xiao, Yawen
Wu, Jun
Lin, Zongli
Zhao, Xiaodong
- Abstract:
- Highlights: A deep learning model, the stacked sparse auto-encoder based model, is proposed for cancer prediction. The deep learning model, with pre-training and sparsity, outperforms classical models. Important and abstract features are extracted. Prediction results are presented on lung, stomach and breast cancer data.
Abstract: Background and objective: Cancer has become a complex health problem due to its high mortality. Over the past few decades, with the rapid development of the high-throughput sequencing technology and the application of various machine learning methods, remarkable progress in cancer research has been made based on gene expression data. At the same time, a growing amount of high-dimensional data has been generated, such as RNA-seq data, which calls for superior machine learning methods able to deal with mass data effectively in order to make accurate treatment decision.
Methods: In this paper, we present a semi-supervised deep learning strategy, the stacked sparse auto-encoder (SSAE) based classification, for cancer prediction using RNA-seq data. The proposed SSAE based method employs the greedy layer-wise pre-training and a sparsity penalty term to help capture and extract important information from the high-dimensional data and then classify the samples.
Results: We tested the proposed SSAE model on three public RNA-seq data sets of three types of cancers and compared the prediction performance with several commonly-used classification methods. The results indicate that our approach outperforms the other methods for all the three cancer data sets in various metrics.
Conclusions: The proposed SSAE based semi-supervised deep learning model shows its promising ability to process high-dimensional gene expression data and is proved to be effective and accurate for cancer prediction.
- Is Part Of:
- Computer methods and programs in biomedicine. Volume 166(2018)
- Journal:
- Computer methods and programs in biomedicine
- Issue:
- Volume 166(2018)
- Issue Display:
- Volume 166, Issue 2018 (2018)
- Year:
- 2018
- Volume:
- 166
- Issue:
- 2018
- Issue Sort Value:
- 2018-0166-2018-0000
- Page Start:
- 99
- Page End:
- 105
- Publication Date:
- 2018-11
- Subjects:
- Stacked sparse auto-encoder -- Cancer prediction -- Gene expression data -- Semi-supervised learning -- Deep learning
Medicine -- Computer programs -- Periodicals
Biology -- Computer programs -- Periodicals
Computers -- Periodicals
Medicine -- Periodicals
Médecine -- Logiciels -- Périodiques
Biologie -- Logiciels -- Périodiques
Biology -- Computer programs
Medicine -- Computer programs
Periodicals
Electronic journals
610.28
- Journal URLs:
- http://www.sciencedirect.com/science/journal/01692607
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.cmpb.2018.10.004
- Languages:
- English
- ISSNs:
- 0169-2607
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.095000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 8546.xml
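
The Methods part of the abstract in this record describes greedy layer-wise pre-training of sparse auto-encoders followed by classification. The following is a minimal NumPy sketch of that general idea, not the authors' implementation. Everything concrete in it is an assumption made for illustration: the synthetic data standing in for RNA-seq features, the layer sizes, the KL-divergence form of the sparsity penalty, the linear decoder, all hyperparameters, and the function names. For brevity it trains a logistic classifier on the pre-trained features of a small labelled subset and skips fine-tuning of the whole stack.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_sparse_ae(X, n_hidden, rho=0.05, beta=0.1, lr=0.05, epochs=300, seed=0):
    """Train one sparse auto-encoder layer; return encoder (W1, b1) and loss history."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)            # hidden activations
        X_hat = H @ W2 + b2                 # linear reconstruction (assumption)
        # Mean activation per hidden unit, clipped to keep the logs finite.
        rho_hat = np.clip(H.mean(axis=0), 1e-6, 1 - 1e-6)
        kl = np.sum(rho * np.log(rho / rho_hat)
                    + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))
        losses.append(0.5 * np.sum((X_hat - X) ** 2) / n + beta * kl)
        # Backpropagation of reconstruction error plus sparsity penalty.
        d_out = (X_hat - X) / n
        d_H = d_out @ W2.T + beta * (-rho / rho_hat + (1 - rho) / (1 - rho_hat)) / n
        d_Z = d_H * H * (1 - H)
        W2 -= lr * (H.T @ d_out); b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (X.T @ d_Z);   b1 -= lr * d_Z.sum(axis=0)
    return W1, b1, losses

def pretrain_stack(X, layer_sizes):
    """Greedy layer-wise pre-training: each layer learns from the previous layer's codes."""
    params, histories, feats = [], [], X
    for size in layer_sizes:
        W, b, losses = train_sparse_ae(feats, size)
        params.append((W, b)); histories.append(losses)
        feats = sigmoid(feats @ W + b)
    return params, histories

def encode(X, params):
    for W, b in params:
        X = sigmoid(X @ W + b)
    return X

def train_logistic(F, y, lr=0.5, epochs=500):
    """Plain logistic regression by gradient descent on the encoded features."""
    w = np.zeros(F.shape[1]); b = 0.0
    for _ in range(epochs):
        p = sigmoid(F @ w + b)
        w -= lr * F.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)
    return w, b

# Toy data (hypothetical): 60 samples, 20 "genes", labels known for the first 20 only.
rng = np.random.default_rng(0)
X = rng.random((60, 20))
y = (X[:, :5].mean(axis=1) > 0.5).astype(float)

params, histories = pretrain_stack(X, layer_sizes=[8, 4])  # unsupervised, all samples
features = encode(X, params)
w, b = train_logistic(features[:20], y[:20])                # supervised, labelled subset
preds = (sigmoid(features @ w + b) > 0.5).astype(float)
```

The pre-training step uses every sample without its label, while the classifier sees only the labelled subset; that split is what makes the sketch semi-supervised in the sense the abstract uses.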