The Reference Sequence (RefSeq) collection aims to provide a comprehensive, integrated, non-redundant, well-annotated set of sequences, including genomic DNA, transcripts, and proteins. RefSeq records are a foundation for medical, functional, and diversity studies; they provide a stable reference for genome annotation, gene identification and characterization, mutation and polymorphism analysis (especially RefSeqGene records), expression studies, and comparative analyses.
Scope
NCBI provides RefSeqs for taxonomically diverse organisms including eukaryotes, bacteria, and viruses. Additional records are added to the collection as data become publicly available.
Announcements
April 22, 2011: The Consensus CDS (CCDS) database released an update for human following genome reannotation by NCBI and Ensembl. This update adds 2,126 new CCDS IDs, bringing the total to 25,564 consistently annotated coding regions that pass CCDS QA tests.
January 24, 2011: The Consensus CDS (CCDS) database released an update for mouse following genome reannotation by NCBI and Ensembl. This update adds 4,561 new CCDS IDs, bringing the total to 22,187 consistently annotated coding regions that pass CCDS QA tests.
May 12, 2011: RefSeq Release 47 available for FTP
This release includes:
To receive announcements of future RefSeq releases and large incremental updates, please subscribe to NCBI's refseq-announce mailing list.
BLAST databases: Formatted genomic, mRNA, and protein RefSeq BLAST databases are available for FTP.
Data Access and Availability
RefSeq is accessible via BLAST, Entrez, and the NCBI FTP site. Information is also available in Entrez Genomes and Entrez Gene, and for some genomes additional information is available in the Map Viewer. Special properties have been defined to facilitate Entrez-based retrieval. Also see: Entrez Query Hints
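As a rough illustration of Entrez-based retrieval, the sketch below builds a search URL that restricts a query to RefSeq records. It assumes the `srcdb_refseq[PROP]` property filter and the public E-utilities `esearch` endpoint; the function name `refseq_search_url`, its parameters, and the sample query are hypothetical and chosen only for this example. The code constructs the URL and does not send a request.

```python
from urllib.parse import urlencode

# Assumed endpoint: the public NCBI E-utilities esearch service.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def refseq_search_url(query, db="nucleotide", retmax=20):
    """Build an esearch URL that limits `query` to RefSeq records.

    The srcdb_refseq[PROP] filter is the assumed Entrez property
    for selecting RefSeq entries; the helper itself is illustrative.
    """
    term = f"({query}) AND srcdb_refseq[PROP]"
    params = {"db": db, "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


# Hypothetical example query: RefSeq nucleotide records for a human gene.
url = refseq_search_url("BRCA1[Gene Name] AND Homo sapiens[Organism]")
print(url)
```

The same filter text can be typed directly into an Entrez search box; the helper only saves the URL-encoding step.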
Distinguishing Features
The main features of the RefSeq collection include:
non-redundancy
explicitly linked nucleotide and protein sequences
updates to reflect current knowledge of sequence data and biology
data validation and format consistency
distinct accession series (all accessions include an underscore '_' character)
ongoing curation by NCBI staff and collaborators, with reviewed records indicated
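The underscore convention in the accession series makes it possible to screen identifiers with a simple pattern. The sketch below is a heuristic only: it assumes a RefSeq-style accession has a two-letter prefix, an underscore, an alphanumeric body, and an optional version suffix. The function name `looks_like_refseq` and the sample accessions are illustrative, and a match does not confirm that a record exists.

```python
import re

# Assumed shape: two uppercase letters, '_', alphanumeric body,
# and an optional ".version" suffix (for example "NM_000546.6").
REFSEQ_LIKE = re.compile(r"^[A-Z]{2}_[A-Z0-9]+(\.\d+)?$")


def looks_like_refseq(accession):
    """Return True if the string has the assumed RefSeq accession shape."""
    return bool(REFSEQ_LIKE.match(accession.strip()))


# Illustrative identifiers: the first three follow the underscore
# convention; the last two are shaped like non-RefSeq accessions.
for acc in ["NM_000546.6", "NP_000537", "NG_017013.2", "AF123456.1", "U12345"]:
    print(acc, looks_like_refseq(acc))
```

A check like this is useful for routing mixed identifier lists before lookup, but any stricter validation would need the official accession prefix table.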
References
Please refer to the Publications page for a full list of articles describing or using the RefSeq dataset. When using the RefSeq database, please cite one of the following:
The NCBI handbook [Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; 2002 Oct. Chapter 18, The Reference Sequence (RefSeq) Project. Available from http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=Books
Pruitt KD, Tatusova T, Klimke W, Maglott DR. NCBI Reference Sequences: current status, policy, and new initiatives. Nucleic Acids Res. 2009 Jan;37(Database issue):D32-6.