Feature-extraction and analysis based on spatial distribution of amino acids for SARS-CoV-2 Protein sequences. (February 2022)
- Record Type:
- Journal Article
- Title:
- Feature-extraction and analysis based on spatial distribution of amino acids for SARS-CoV-2 Protein sequences. (February 2022)
- Main Title:
- Feature-extraction and analysis based on spatial distribution of amino acids for SARS-CoV-2 Protein sequences
- Authors:
- Rout, Ranjeet Kumar
Hassan, Sk Sarif
Sheikh, Sabha
Umer, Saiyed
Sahoo, Kshira Sagar
Gandomi, Amir H.
- Abstract:
- Background and objective: The world is currently facing a global emergency due to COVID-19, which requires immediate strategies to strengthen healthcare facilities and prevent further deaths. To achieve effective remedies and solutions, research on different aspects, including the genomic and proteomic level characterizations of SARS-CoV-2, is critical. In this work, the spatial representation/composition and distribution frequency of 20 amino acids across the primary protein sequences of SARS-CoV-2 were examined according to different parameters.
Method: To identify the spatial distribution of amino acids over the primary protein sequences of SARS-CoV-2, the Hurst exponent and Shannon entropy were applied as parameters to capture the autocorrelation and amount of information over the spatial representations. The frequency distribution of each amino acid over the protein sequences was also evaluated. In the case of a one-dimensional sequence, the Hurst exponent (HE) was utilized due to its linear relationship with the fractal dimension (D), i.e. D + HE = 2, to characterize fractality. Moreover, binary Shannon entropy was considered to measure the uncertainty in a binary sequence and was then applied to calculate amino acid conservation in the primary protein sequences.
Results and conclusion: Fourteen (14) SARS-CoV protein sequences were evaluated and compared with 105 SARS-CoV-2 proteins. The simulation results demonstrate the differences in the collected information about the amino acid spatial distribution in the SARS-CoV-2 and SARS-CoV proteins, enabling researchers to distinguish between the two types of CoV. The spatial arrangement of amino acids also reveals similarities and dissimilarities among the important structural proteins, E, M, N and S, which is pivotal for establishing an evolutionary tree with other CoV strains.
Highlights: The genomic and proteomic level characterizations of SARS-CoV-2 are critical. The spatial representation and distribution frequency of amino acids were examined. The Hurst exponent and entropy were applied to capture the autocorrelation. The simulation results make it possible to distinguish between the two types of CoV. The spatial arrangement reveals important information about the structural proteins.
- Is Part Of:
- Computers in biology and medicine. Volume 141(2022)
- Journal:
- Computers in biology and medicine
- Issue:
- Volume 141(2022)
- Issue Display:
- Volume 141, Issue 2022 (2022)
- Year:
- 2022
- Volume:
- 141
- Issue:
- 2022
- Issue Sort Value:
- 2022-0141-2022-0000
- Page Start:
- Page End:
- Publication Date:
- 2022-02
- Subjects:
- Shannon entropy -- Hurst exponent -- Amino acid -- Frequency distribution -- SARS-CoV-2
Medicine -- Data processing -- Periodicals
Biology -- Data processing -- Periodicals
610.285
- Journal URLs:
- http://www.sciencedirect.com/science/journal/00104825/
http://www.elsevier.com/journals
- DOI:
- 10.1016/j.compbiomed.2021.105024
- Languages:
- English
- ISSNs:
- 0010-4825
- Deposit Type:
- Legaldeposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 3394.880000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 20684.xml
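
Illustrative note on the abstract: the two quantities named there, binary Shannon entropy and the relation D + HE = 2, can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation. In particular, the per-amino-acid indicator encoding (1 where a chosen residue occurs, 0 elsewhere) is an assumption about how a binary sequence might be derived from a protein sequence, and the function names and the toy sequence are hypothetical.

```python
import math


def indicator_sequence(protein: str, amino_acid: str) -> list[int]:
    """Assumed encoding: 1 where `amino_acid` occurs in `protein`, else 0."""
    return [1 if residue == amino_acid else 0 for residue in protein]


def binary_shannon_entropy(bits: list[int]) -> float:
    """Shannon entropy (in bits) of a 0/1 sequence, from the fraction of ones."""
    if not bits:
        raise ValueError("sequence must not be empty")
    p = sum(bits) / len(bits)
    # A sequence of all zeros or all ones carries no uncertainty.
    if p == 0.0 or p == 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def fractal_dimension(hurst_exponent: float) -> float:
    """Fractal dimension of a one-dimensional sequence via D + HE = 2."""
    return 2.0 - hurst_exponent


# Hypothetical toy sequence, not taken from the article's dataset.
toy_protein = "AKAK"
bits = indicator_sequence(toy_protein, "A")
print(bits, binary_shannon_entropy(bits), fractal_dimension(0.5))
```

The entropy is highest (1 bit) when the chosen residue fills exactly half of the positions and falls to 0 when it is absent or fills every position. Estimating the Hurst exponent itself is omitted here, since the abstract does not say which estimator was used.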