Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Issue Volume 44:Issue D1(2016) (8th November 2015)
- Record Type:
- Journal Article
- Title:
- Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation. Issue Volume 44:Issue D1(2016) (8th November 2015)
- Main Title:
- Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation
- Authors:
- O'Leary, Nuala A.
Wright, Mathew W.
Brister, J. Rodney
Ciufo, Stacy
Haddad, Diana
McVeigh, Rich
Rajput, Bhanu
Robbertse, Barbara
Smith-White, Brian
Ako-Adjei, Danso
Astashyn, Alexander
Badretdin, Azat
Bao, Yiming
Blinkova, Olga
Brover, Vyacheslav
Chetvernin, Vyacheslav
Choi, Jinna
Cox, Eric
Ermolaeva, Olga
Farrell, Catherine M.
Goldfarb, Tamara
Gupta, Tripti
Haft, Daniel
Hatcher, Eneida
Hlavina, Wratko
Joardar, Vinita S.
Kodali, Vamsi K.
Li, Wenjun
Maglott, Donna
Masterson, Patrick
McGarvey, Kelly M.
Murphy, Michael R.
O'Neill, Kathleen
Pujar, Shashikant
Rangwala, Sanjida H.
Rausch, Daniel
Riddick, Lillian D.
Schoch, Conrad
Shkeda, Andrei
Storz, Susan S.
Sun, Hanzhen
Thibaud-Nissen, Francoise
Tolstoy, Igor
Tully, Raymond E.
Vatsan, Anjana R.
Wallin, Craig
Webb, David
Wu, Wendy
Landrum, Melissa J.
Kimchi, Avi
Tatusova, Tatiana
DiCuccio, Michael
Kitts, Paul
Murphy, Terence D.
Pruitt, Kim D.
- Abstract:
- The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55 000 organisms (>4800 viruses, >40 000 prokaryotes and >10 000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.
- Is Part Of:
- Nucleic acids research. Volume 44:Issue D1(2016)
- Journal:
- Nucleic acids research
- Issue:
- Volume 44:Issue D1(2016)
- Issue Display:
- Volume 44, Issue 1 (2016)
- Year:
- 2016
- Volume:
- 44
- Issue:
- 1
- Issue Sort Value:
- 2016-0044-0001-0000
- Page Start:
- D733
- Page End:
- D745
- Publication Date:
- 2015-11-08
- Subjects:
- Nucleic acids -- Periodicals
Molecular biology -- Periodicals
572.805
- Journal URLs:
- http://nar.oxfordjournals.org/
http://www.ncbi.nlm.nih.gov/pmc/journals/4
http://ukcatalogue.oup.com/
http://firstsearch.oclc.org
- DOI:
- 10.1093/nar/gkv1189
- Languages:
- English
- ISSNs:
- 0305-1048
- Deposit Type:
- Legal deposit
- View Content:
- Available online (eLD content is only available in our Reading Rooms)
- Physical Locations:
- British Library DSC - 6183.850000
British Library DSC - BLDSS-3PM
British Library HMNTS - ELD Digital store
- Ingest File:
- 17261.xml