Types of
Biological Databases
Genomic Databases
Amandeep Singh
Assistant Professor
Department of Biotechnology
GSSDGS Khalsa College Patiala
Primary Database
It acts as a repository of raw data (obtained directly through experimentation)
For DNA (Nucleotide/Genome Sequence Database)
1. GenBank
2. EMBL
For Protein (Proteome Sequence Database)
1. SWISS-PROT
2. PIR
Nucleotide and Genome Sequence
Database
GenBank
• NIH genetic sequence database.
• Annotated collection of all publicly available DNA sequences.
• A new release is made every two months.
GenBank
International Nucleotide Sequence Database Collaboration (INSDC) members:
• DNA Data Bank of Japan (DDBJ)
• European Molecular Biology Laboratory (EMBL)
• National Center for Biotechnology Information (NCBI), which maintains GenBank
GenBank Entry
1. Concise description of sequence
2. Scientific nomenclature and taxonomy of source organism
3. Table of features: coding regions and sites
• Transcription site
• Mutation site
• Modification site
4. Protein translation of coding region
5. Bibliographic references
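The parts of an entry listed above map onto the keyword sections of a GenBank flat-file record (DEFINITION, SOURCE/ORGANISM, FEATURES, REFERENCE, ORIGIN). The following is a minimal sketch of reading those sections with the Python standard library. The record `TOY_RECORD` is invented for illustration and is not a real GenBank entry, and the helper names `parse_sections` and `extract_sequence` are hypothetical. Real records have more fields and edge cases, so a dedicated parser would normally be preferable.

```python
# Toy record, invented for illustration only (not a real GenBank entry).
TOY_RECORD = """\
LOCUS       TOY0001                   12 bp    DNA     linear   UNA 01-JAN-2000
DEFINITION  Hypothetical toy sequence for illustration only.
ACCESSION   TOY0001
SOURCE      Saccharomyces cerevisiae
  ORGANISM  Saccharomyces cerevisiae
            Eukaryota; Fungi.
REFERENCE   1  (bases 1 to 12)
  TITLE     Placeholder reference, not a real publication
FEATURES             Location/Qualifiers
     CDS             1..12
                     /translation="MHAC"
ORIGIN
        1 atgcatgcat gc
//
"""


def parse_sections(text):
    """Group lines under the top-level keyword that starts in column 0."""
    sections = {}
    current = None
    for line in text.splitlines():
        if line.startswith("//"):  # end-of-record marker
            break
        if line and not line[0].isspace():
            # A new top-level keyword such as DEFINITION or FEATURES.
            key, _, rest = line.partition(" ")
            current = key
            sections[current] = [rest.strip()]
        elif current is not None:
            # Indented lines continue the current section.
            sections[current].append(line.strip())
    return {key: "\n".join(lines).strip() for key, lines in sections.items()}


def extract_sequence(sections):
    """Drop position numbers and spaces from the ORIGIN block."""
    return "".join(ch for ch in sections["ORIGIN"] if ch.isalpha())


sections = parse_sections(TOY_RECORD)
sequence = extract_sequence(sections)
```

Here `sections["DEFINITION"]` holds the concise description, `sections["SOURCE"]` the organism and taxonomy, `sections["FEATURES"]` the feature table including the protein translation, and `sections["REFERENCE"]` the bibliographic reference.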
EMBL (European Molecular
Biology Laboratory)
Nucleotide sequence database maintained by the European Bioinformatics
Institute (EBI) in Hinxton, Cambridge, UK.
Accessed through:
• SRS (Sequence Retrieval System) at EBI
• Download of the entire database as a single flat file
UniGene
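• GenBank sequences are grouped into non-redundant gene clusters.
• Each cluster represents a unique gene.
Example of overlapping sequences that fall into one cluster:
ATGCATGCATGC
ATGCATGCATGC
ATGCATGCATGCAT
ATGCATGCATGCATGC
ATGCATGCATGCATGCAT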
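The idea of collapsing redundant sequences into one cluster per gene can be sketched with a toy example. The code below groups sequences by simple containment (a shorter sequence joins the cluster of a longer sequence that contains it). This is only an illustration of non-redundant clustering and is an assumption made here, not UniGene's actual procedure. The function name `cluster_by_containment` and the extra sequence `GGGTTTCCC` are hypothetical; the other sequences are the ones shown on the slide.

```python
def cluster_by_containment(seqs):
    """Group sequences so that each cluster is led by its longest member.

    Exact duplicates are dropped first; a sequence joins the first cluster
    whose longest member contains it, otherwise it starts a new cluster.
    """
    clusters = []
    # Longest first, ties broken alphabetically so the result is deterministic.
    for seq in sorted(set(seqs), key=lambda s: (-len(s), s)):
        for cluster in clusters:
            if seq in cluster[0]:
                cluster.append(seq)
                break
        else:
            clusters.append([seq])
    return clusters


slide_sequences = [
    "ATGCATGCATGC",
    "ATGCATGCATGC",
    "ATGCATGCATGCAT",
    "ATGCATGCATGCATGC",
    "ATGCATGCATGCATGCAT",
]
# Hypothetical unrelated sequence, added only to show a second cluster.
clusters = cluster_by_containment(slide_sequences + ["GGGTTTCCC"])
```

All slide sequences end up in one cluster led by the longest sequence, with the duplicate removed, while the unrelated sequence forms its own cluster.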
SGD (Saccharomyces Genome
Database)
Covers the molecular biology and genetics of the yeast Saccharomyces cerevisiae.
EBI Genome
Provides access to, and statistics for, complete genomes, along with
information about ongoing projects.
Ensembl
• Joint project between EMBL-EBI and the Sanger Centre.
• Develops a software system that produces and maintains automatic
annotation of eukaryotic genomes.
Entrez
Retrieval system for searching several linked databases.
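Entrez can also be queried programmatically through NCBI's E-utilities, which the slide does not mention. The sketch below only builds an ESearch request URL and does not contact the server. The helper name `build_esearch_url` and the example search term are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# Base address of the ESearch utility in NCBI's E-utilities.
ESEARCH_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(db, term, retmax=5):
    """Return an ESearch URL for one Entrez database and search term."""
    params = {
        "db": db,          # which linked database to search
        "term": term,      # Entrez query string
        "retmax": retmax,  # maximum number of IDs to return
        "retmode": "json",
    }
    return ESEARCH_BASE + "?" + urlencode(params)


# Example query (illustrative): nucleotide records for one organism.
url = build_esearch_url("nucleotide", "Saccharomyces cerevisiae[Organism]")
```

Changing the `db` argument (for example to `protein`) points the same query at a different linked database, which is the sense in which Entrez searches several databases through one interface.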
Gene Census
• Provides an entry point for genome analysis, with interactive
whole-genome comparison from an evolutionary perspective.
• Builds phylogenetic trees.
COGs
• This database classifies proteins encoded in 43 complete genomes on the basis of
sequence similarity.
• Clusters of Orthologous Groups (COGs): each COG is a collection of homologous
(orthologous) genes that is useful for studying evolutionary relationships.
• Basic applications: predicting the function of an uncharacterized protein from the
characterized proteins in its cluster, and identifying phylogenetic patterns of
protein occurrence.
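Grouping proteins by sequence similarity and then transferring function can be illustrated with reciprocal best hits between two genomes. This is a simplified sketch: the published COG procedure is more involved and compares best hits across several genomes, so the code below should not be read as the COG algorithm itself. All protein names, scores, and the annotation text are hypothetical.

```python
def best_hits(scores):
    """Map each query protein to its highest-scoring hit."""
    best = {}
    for (query, hit), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (hit, score)
    return {query: hit for query, (hit, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return pairs (a, b) where a's best hit is b and b's best hit is a."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores between proteins of genome A and genome B.
a_vs_b = {("A1", "B1"): 90, ("A1", "B2"): 40, ("A2", "B2"): 75, ("A3", "B1"): 60}
b_vs_a = {("B1", "A1"): 88, ("B1", "A3"): 55, ("B2", "A2"): 70, ("B2", "A1"): 35}

pairs = reciprocal_best_hits(a_vs_b, b_vs_a)

# Function transfer: an uncharacterized protein inherits the annotation
# of its characterized partner (annotation text is a placeholder).
known_function = {"B1": "placeholder annotation"}
predicted = {a: known_function[b] for a, b in pairs if b in known_function}
```

In this toy data, A3 also matches B1 but is not B1's best hit, so it is left out of the paired set and receives no predicted function.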
