The fractal dimension as a measure for characterizing genetic variation of the human genome. (August 2020)
- Record Type:
- Journal Article
- Title:
- The fractal dimension as a measure for characterizing genetic variation of the human genome. (August 2020)
- Main Title:
- The fractal dimension as a measure for characterizing genetic variation of the human genome
- Authors:
- Lee, Chang-Yong
- Abstract:
- Highlights: We analyzed genome-wide SNPs in individuals from different populations. We proposed a set of chromosome-wise fractal dimensions. The fractal dimension quantified the degree of clustered distribution of SNPs. A set of fractal dimensions described the genetic variation of global populations. The selected feature model predicted the global population with an accuracy of 77%.
Abstract: Motivated by the highly clustered distribution of single nucleotide polymorphisms (SNPs) across the human genome, we propose a set of chromosome-wise fractal dimensions as a measure for identifying an individual for human polymorphism. The fractal dimension quantifies the degree of clustered distribution of SNPs and represents parsimoniously the genetic variation in a chromosome. In this sense, the proposed scheme projects the SNP genotype data into a new space which is simpler and lower in dimension. As an illustrative example, we estimate the chromosome-wise fractal dimensions of SNPs that are extracted from the HapMap Phase III data set. To determine the validity of the proposed measure, we apply principal component analysis (PCA) to the set of estimated fractal dimensions and demonstrate that the set more or less describes the population structure of 11 global populations. We also use multidimensional scaling to relate the genetic distances based on PCA to the geographical distances between global populations. This shows that, similar to the SNP genotype data, the fractal dimensions also have a role in genetic distance in the population structure. In addition, we use the proposed measure as a signature for the classification of global populations by developing a support vector machine model. The selected feature model predicts the global population with a balanced accuracy of about 77%. These results support the view that the fractal dimension is an efficient way to describe the genetic variation of global populations.
- Is Part Of:
- Computational biology and chemistry. Volume 87(2020)
- Journal:
- Computational biology and chemistry
- Issue:
- Volume 87(2020)
- Issue Display:
- Volume 87, Issue 2020 (2020)
- Year:
- 2020
- Volume:
- 87
- Issue:
- 2020
- Issue Sort Value:
- 2020-0087-2020-0000
- Page Start:
- Page End:
- Publication Date:
- 2020-08
- Subjects:
- Fractal dimension -- Single nucleotide polymorphism -- human genome -- population structure -- genetic variation
Chemistry -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
Biochemistry -- Data processing
Biology -- Data processing
Molecular biology -- Data processing
Periodicals
Electronic journals
542.85
- Journal URLs:
- http://www.sciencedirect.com/science/journal/14769271
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiolchem.2020.107278
- Languages:
- English
- ISSNs:
- 1476-9271
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3390.576700
British Library DSC - BLDSS-3PM
British Library STI - ELD Digital store
- Ingest File:
- 13579.xml
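
The abstract above describes the fractal dimension as a measure of how clustered SNPs are along a chromosome. The record does not say which estimator the article uses, so the following is only a minimal sketch that assumes a standard box-counting estimate. The function names, the `base` and `n_scales` parameters, and the synthetic positions are all illustrative assumptions and are not taken from the article or from HapMap data.

```python
import math


def box_counting_dimension(positions, length, base=2, n_scales=8):
    """Estimate the box-counting dimension of points on [0, length).

    Assumption: a plain box-counting estimator; the catalogued article
    may use a different definition of fractal dimension.
    """
    log_inv_eps, log_counts = [], []
    for k in range(1, n_scales + 1):
        n_boxes = base ** k
        eps = length / n_boxes  # box size at this scale
        # Index of the box containing each position (0 <= p < length).
        occupied = {int(p * n_boxes // length) for p in positions}
        log_inv_eps.append(math.log(1.0 / eps))
        log_counts.append(math.log(len(occupied)))
    # Least-squares slope of log N(eps) against log(1/eps).
    n = len(log_inv_eps)
    mean_x = sum(log_inv_eps) / n
    mean_y = sum(log_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in log_inv_eps)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log_inv_eps, log_counts))
    return sxy / sxx


def cantor_positions(level):
    """Synthetic, strongly clustered positions (a Cantor-like set).

    Hypothetical test data: integers below 3**level whose base-3 digits
    are all 0 or 2.
    """
    points = [0]
    for i in range(level):
        step = 2 * 3 ** i
        points = points + [p + step for p in points]
    return points


if __name__ == "__main__":
    # Evenly spread positions: the estimate is close to 1.
    uniform = list(range(1024))
    print(box_counting_dimension(uniform, 1024))

    # Clustered positions: the estimate is close to log 2 / log 3.
    clustered = cantor_positions(6)
    print(box_counting_dimension(clustered, 3 ** 6, base=3, n_scales=6))
```

In this sketch, evenly spread positions give an estimate near 1 and more strongly clustered positions give a smaller value, which is the sense in which a single number per chromosome can summarize the clustering of SNPs.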