GenBank: GenBank® is a sequence database that
contains publicly available DNA sequences for more than
170,000 different organisms. GenBank data is accessible through
NCBI's retrieval system, Entrez, which integrates data from
the major DNA and protein sequence databases along with
taxonomy, genome, mapping, protein structure and domain
information, and the biomedical literature via PubMed.
Sequence similarity searching is provided by the BLAST
family of programs. Complete bimonthly releases and daily
updates of the GenBank database are available by FTP.
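As a rough illustration of how a GenBank record might be requested through Entrez, the sketch below builds a URL for the `efetch` endpoint of NCBI's E-utilities without sending it. The helper name `genbank_efetch_url` and the example accession are illustrative assumptions, not part of the description above.

```python
from urllib.parse import urlencode

# Base URL of the NCBI E-utilities efetch service.
EFETCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def genbank_efetch_url(accession: str) -> str:
    """Build (but do not send) an efetch URL for one nucleotide record."""
    params = {
        "db": "nuccore",    # Entrez nucleotide database
        "id": accession,    # accession number of the record
        "rettype": "gb",    # GenBank flat-file format
        "retmode": "text",  # plain text rather than XML
    }
    return f"{EFETCH_BASE}?{urlencode(params)}"


# Example accession chosen only for illustration.
url = genbank_efetch_url("U49845")
print(url)
```

The URL can then be passed to any HTTP client; keeping URL construction separate from the network call makes the request easy to inspect first.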
UniGene: UniGene provides an organized
view of transcribed sequences by collapsing them into
groups that correspond to genes and then connecting these
entries to other classes of information that may shed light
on gene expression and function. Tools provided on the UniGene
web site allow researchers to browse cDNA libraries by
developmental stage or tissue of origin, view surrogate
expression profiles based on EST counts, and find groups of
genes with similar expression patterns.
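The idea of a surrogate expression profile based on EST counts can be sketched as follows: for each tissue, the number of ESTs assigned to a gene cluster is scaled by the total number of ESTs sampled from that tissue. The transcripts-per-million scaling, the function name `est_profile`, and all counts below are assumptions made for illustration, not UniGene data.

```python
def est_profile(cluster_counts, library_totals):
    """Scale per-tissue EST counts for one cluster to transcripts per million."""
    profile = {}
    for tissue, total in library_totals.items():
        count = cluster_counts.get(tissue, 0)  # tissues with no ESTs count as 0
        profile[tissue] = 1_000_000 * count / total if total else 0.0
    return profile


# Hypothetical counts: ESTs from one gene cluster, and all ESTs per tissue.
cluster_counts = {"liver": 30, "brain": 2}
library_totals = {"liver": 200_000, "brain": 400_000, "heart": 100_000}

profile = est_profile(cluster_counts, library_totals)
for tissue, tpm in profile.items():
    print(f"{tissue}: {tpm:.1f} TPM")
```

Normalizing by the tissue total matters because libraries differ greatly in size, so raw EST counts are not comparable across tissues.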
COG: The database of Clusters of
Orthologous Groups of proteins (COGs) is an attempt at
phylogenetic classification of the proteins encoded in
complete genomes. Each COG includes proteins that are
inferred to be orthologs (direct evolutionary counterparts).
The current release consists of 138,458 proteins, which form 4,873 COGs
and comprise 75% of the 185,505 proteins from 50 bacterial
genomes, 13 archaeal genomes, and three genomes of
unicellular eukaryotes. The COG database is updated periodically as new genomes become available.
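The release figures quoted for COG can be checked with a few lines of arithmetic. This sketch reads 138,458 as the number of proteins assigned to COGs, which is an interpretation of the statistics; the variable names are illustrative only.

```python
# Figures quoted in the COG description.
proteins_in_cogs = 138_458
total_proteins = 185_505
cog_count = 4873
genomes = {"bacterial": 50, "archaeal": 13, "unicellular eukaryote": 3}

coverage = proteins_in_cogs / total_proteins  # fraction of proteins in COGs
total_genomes = sum(genomes.values())         # genomes covered by the release
mean_cog_size = proteins_in_cogs / cog_count  # average proteins per COG

print(f"coverage: {coverage:.1%}")
print(f"genomes: {total_genomes}")
print(f"mean proteins per COG: {mean_cog_size:.1f}")
```

The coverage rounds to the quoted 75%, and the three genome counts sum to 66 complete genomes.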
EMBL Nucleotide Sequence Database (URL:
http://www.ebi.ac.uk/embl/) is maintained at the European
Bioinformatics Institute (EBI) in an international collaboration
with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI
(USA). Data is exchanged amongst the collaborating databases on a
daily basis. Network services allow free access to the most up-to-date
data collection. A variety of tools (e.g. FASTA, BLAST) are available
for sequence similarity searching.
Ensembl is a joint project between EMBL-EBI and the Sanger
Institute to develop a software system that produces and
maintains automatic annotation of metazoan genomes. This site
provides free access to all the data and software from the
Ensembl project.
TIGR Gene Indices:
Analysis of the public EST sequences, available
through the TIGR Gene Indices (TGI;
http://www.tigr.org/tdb/tdb.html), is an attempt to identify the
genes represented by those data and to provide functional, structural,
and evolutionary information regarding those genes. Gene Indices are
constructed by first clustering, then assembling EST
and annotated gene sequences from GenBank. This process produces a
set of unique, high-fidelity virtual transcripts, or Tentative
Consensus (TC) sequences. The TC sequences can be used to provide
putative genes with functional annotation, to link the transcripts
to mapping and genomic sequence data, and to provide links between
orthologous and paralogous genes.
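The cluster-then-assemble idea behind the Gene Indices can be sketched with a toy example. This is not the TIGR software: exact suffix-prefix overlaps stand in for the alignment-based comparison a real pipeline would use, and the function names, the `min_len` threshold, and the sequences are all assumptions made for illustration.

```python
def overlap(a, b, min_len):
    """Longest suffix of a equal to a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def cluster(seqs, min_len):
    """Single-linkage clustering: overlapping sequences share a cluster."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(seqs)):
        for j in range(len(seqs)):
            if i != j and overlap(seqs[i], seqs[j], min_len):
                parent[find(i)] = find(j)

    groups = {}
    for i, s in enumerate(seqs):
        groups.setdefault(find(i), []).append(s)
    return list(groups.values())


def assemble(members, min_len):
    """Greedily merge the best-overlapping pair until none remains."""
    contigs = list(members)
    while len(contigs) > 1:
        n, i, j = max(
            (overlap(a, b, min_len), i, j)
            for i, a in enumerate(contigs)
            for j, b in enumerate(contigs)
            if i != j
        )
        if n == 0:
            break
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)
    return contigs


# Made-up reads: the first three overlap end to end, the fourth is unrelated.
ests = ["ATGGCGTACG", "GTACGTTAGC", "TTAGCCATAA", "CCCGGGTTTAAA"]
clusters = cluster(ests, min_len=5)
consensus = [assemble(c, min_len=5) for c in clusters]
print(consensus)
```

Clustering first keeps the pairwise assembly step confined to sequences that plausibly come from the same gene, which is the role it plays in the description above.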
ASDB - Alternative Splicing Database:
ASDB consists of two divisions, ASDB (proteins), which contains
amino acid sequences, and ASDB (nucleotides) with genomic
sequences. The protein entries from SwissProt are joined into
clusters corresponding to alternatively spliced variants of one
gene. The DNA division consists of complete genes with alternative
splicing mentioned or annotated in GenBank. The search engine
allows one to search over SwissProt and GenBank fields and then
follow the links to all variants.