Unsupervised ensemble learning for genome sequencing. (September 2022)
- Record Type:
- Journal Article
- Title:
- Unsupervised ensemble learning for genome sequencing. (September 2022)
- Main Title:
- Unsupervised ensemble learning for genome sequencing
- Authors:
- Pagès-Zamora, Alba
Ochoa, Idoia
Cavero, Gonzalo Ruiz
Villalvilla-Ornat, Pol
- Abstract:
- Highlights: The variant calling step in next-generation sequencing technologies is formulated as a classification problem. An unsupervised ensemble classification method is proposed as a variant caller for DNA sequencing. An EM-based variant calling algorithm that estimates the maximum a posteriori class to take a decision is presented. The number of classes to be decided is greater than the number of different labels that are observed. Experimental results with real human DNA sequencing data support the approach.
Abstract: Unsupervised ensemble learning refers to methods devised for a particular task that combine data provided by decision learners, taking into account their reliability, which is usually inferred from the data. Here, the variant calling step of next-generation sequencing technologies is formulated as an unsupervised ensemble classification problem. A variant calling algorithm based on the expectation-maximization algorithm is further proposed that estimates the maximum a posteriori decision among a number of classes larger than the number of different labels provided by the learners. Experimental results with real human DNA sequencing data show that the proposed algorithm is competitive with state-of-the-art variant callers such as GATK, HTSLIB, and Platypus.
- Is Part Of:
- Pattern recognition. Volume 129(2022)
- Journal:
- Pattern recognition
- Issue:
- Volume 129(2022)
- Issue Display:
- Volume 129, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 129
- Issue:
- 2022
- Issue Sort Value:
- 2022-0129-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-09
- Subjects:
- Expectation maximization algorithm -- Variant calling -- Genome sequencing -- Unsupervised multi-class ensemble classifier -- GATK
Pattern perception -- Periodicals
Perception des structures -- Périodiques
Patroonherkenning
006.4
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00313203
http://www.sciencedirect.com/
- DOI:
- 10.1016/j.patcog.2022.108721
- Languages:
- English
- ISSNs:
- 0031-3203
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 21584.xml