Assessing polygenic risk score models for applications in populations with under-represented genomics data: an example of Vietnam. Issue 6 (2nd November 2022)
- Record Type:
- Journal Article
- Title:
- Assessing polygenic risk score models for applications in populations with under-represented genomics data: an example of Vietnam. Issue 6 (2nd November 2022)
- Main Title:
- Assessing polygenic risk score models for applications in populations with under-represented genomics data: an example of Vietnam
- Authors:
- Pham, Duy
Truong, Buu
Tran, Khai
Ni, Guiyan
Nguyen, Dat
Tran, Trang T H
Tran, Mai H
Nguyen Thuy, Duong
Vo, Nam S
Nguyen, Quan
- Abstract:
- Most polygenic risk score (PRS) models have been based on data from populations of European origin (accounting for the majority of the large genomics datasets, e.g. >78% in the UK Biobank and >85% in the GTEx project). Although several large-scale Asian biobanks have been initiated (e.g. Japanese, Korean, Han Chinese biobanks), most other Asian countries have little or near-zero genomics data. To implement PRS models for under-represented populations, we explored transfer learning approaches, assuming that information from existing large datasets can compensate for the small sample size that can be feasibly obtained in developing countries, like Vietnam. Here, we benchmark 13 common PRS methods in a meta-population strategy (combining individual genotype data from multiple populations) and a multi-population strategy (combining summary statistics from multiple populations). Our results highlight the complementarity of different populations and show that the choice of method should depend on the target population. Based on these results, we discuss a set of guidelines to help users select the best method for their datasets. We developed robust and comprehensive software to allow for benchmarking comparisons between methods and proposed a computational framework for improving PRS performance in a dataset with a small sample size. This work is expected to inform the development of genomics applications in under-represented populations. PRSUP framework is available at: https://github.com/BiomedicalMachineLearning/VGP …
- Is Part Of:
- Briefings in bioinformatics. Volume 23:Issue 6(2022)
- Journal:
- Briefings in bioinformatics
- Issue:
- Volume 23:Issue 6(2022)
- Issue Display:
- Volume 23, Issue 6 (2022)
- Year:
- 2022
- Volume:
- 23
- Issue:
- 6
- Issue Sort Value:
- 2022-0023-0006-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-11-02
- Subjects:
- PRS -- GWAS -- trans-ethnic -- across-population
Genetics -- Data processing -- Periodicals
Molecular biology -- Data processing -- Periodicals
Genomes -- Data processing -- Periodicals
572.80285
- Journal URLs:
- http://bib.oxfordjournals.org
http://www.oxfordjournals.org/content?genre=journal&issn=1477-4054
http://ukcatalogue.oup.com/
http://firstsearch.oclc.org
- DOI:
- 10.1093/bib/bbac459
- Languages:
- English
- ISSNs:
- 1467-5463
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 2283.958363
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 24766.xml