The COPILOT Raw Illumina Genotyping QC Protocol. Issue 4 (22nd April 2022)
- Record Type:
- Journal Article
- Title:
- The COPILOT Raw Illumina Genotyping QC Protocol. Issue 4 (22nd April 2022)
- Main Title:
- The COPILOT Raw Illumina Genotyping QC Protocol
- Authors:
- Patel, Hamel
Lee, Sang‐Hyuck
Breen, Gerome
Menzel, Stephen
Ojewunmi, Oyesola
Dobson, Richard J.B.
- Abstract:
- Abstract: The Illumina genotyping microarrays generate data in image format, which is processed by the platform‐specific software GenomeStudio, followed by an array of complex bioinformatics analyses that rely on various software, different programming languages, and numerous dependencies to be installed and configured correctly. The entire process can be time‐consuming, can lead to reproducibility errors, and can be a daunting task for bioinformaticians. To address this, we introduce the COPILOT protocol, which has been successfully used to transform raw Illumina genotype intensity data into high‐quality analysis‐ready data on tens of thousands of human patient samples that have been genotyped on a variety of Illumina genotyping arrays. This includes processing both mainstream and custom content genotyping chips with over 4 million markers per sample. The COPILOT QC protocol consists of two distinct tandem procedures to process raw Illumina genotyping data. The first protocol is an up‐to‐date process to systematically QC raw Illumina microarray genotyping data using the Illumina‐specific GenomeStudio software. The second protocol takes the output from the first protocol and further processes the data through the COPILOT (Containerised wOrkflow for Processing ILlumina genOtyping daTa) containerized QC pipeline, to automate an array of complex bioinformatics analyses to improve data quality through a secondary clustering algorithm and to automatically identify typical Genome‐Wide Association Study (GWAS) data issues, including gender discrepancies, heterozygosity outliers, related individuals, and population outliers, through ancestry estimation. The data is returned to the user in analysis‐ready PLINK binary format and is accompanied by a comprehensive and interactive HTML summary report file which quickly helps the user understand the data and guides the user for further data analyses. The COPILOT protocol and containerized pipeline are also available at https://khp-informatics.github.io/COPILOT/index.html. © 2022 The Authors. Current Protocols published by Wiley Periodicals LLC. Basic Protocol 1: Processing raw Illumina genotyping data using GenomeStudio. Basic Protocol 2: COPILOT: A containerised workflow for processing Illumina genotyping data.
- Is Part Of:
- Current protocols. Volume 2:Issue 4(2022)
- Journal:
- Current protocols
- Issue:
- Volume 2:Issue 4(2022)
- Issue Display:
- Volume 2, Issue 4 (2022)
- Year:
- 2022
- Volume:
- 2
- Issue:
- 4
- Issue Sort Value:
- 2022-0002-0004-0000
- Page Start:
- n/a
- Page End:
- n/a
- Publication Date:
- 2022-04-22
- Subjects:
- docker -- genotyping -- GWAS -- Illumina -- QC pipeline
Life sciences -- Laboratory manuals -- Periodicals
Biology -- Laboratory manuals -- Periodicals
Life sciences -- Technique -- Periodicals
Biology -- Technique -- Periodicals
570.028
- Journal URLs:
- https://currentprotocols.onlinelibrary.wiley.com/journal/26911299
http://onlinelibrary.wiley.com/
- DOI:
- 10.1002/cpz1.373
- Languages:
- English
- ISSNs:
- 2691-1299
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 26933.xml