Bioinformatics Databases


1.1 International Nucleotide Sequence Database Collaboration
Database name
Full name and/or description
URL
DDBJ-DNA Data Bank of Japan
All known nucleotide and protein sequences
EMBL-Nucleotide Sequence Database
All known nucleotide and protein sequences
GenBank
All known nucleotide and protein sequences
1.2. DNA sequences: genes, motifs and regulatory sites
1.2.1. Coding and non-coding DNA
Database name
Full name and/or description
URL
ACLAME
A classification of genetic mobile elements
CUTG
Codon usage tabulated from GenBank
Genetic Codes
Genetic codes in various organisms and organelles
Entrez Gene
Gene-centered information at NCBI
HERVd
Human endogenous retrovirus database
Hoppsigen
Human and mouse homologous processed pseudogenes
Imprinted Gene Catalogue
Imprinted genes and parent-of-origin effects in animals
Islander
Pathogenicity islands and prophages in bacterial genomes
MICdb
Prokaryotic microsatellites
NPRD
Nucleosome positioning region database
STRBase
Short tandem DNA repeats database
TIGR Gene Indices
Organism-specific databases of EST and gene sequences
Transterm
Codon usage, start and stop signals
UniGene
Non-redundant set of eukaryotic gene-oriented clusters
UniVec
Vector sequences, adapters, linkers and primers used in DNA cloning; can be used to check for vector contamination
VectorDB
Characterization and classification of nucleic acid vectors
Xpro
Eukaryotic protein-encoding DNA sequences, both intron-containing and intronless genes
1.2.2. Gene structure, introns and exons, splice sites
Database name
Full name and/or description
URL
ASAP
Alternative spliced isoforms
ASD
Alternative splicing database at EBI; includes three databases: AltSplice, AltExtron and AEdb
ASDB
Alternative splicing database: protein products and expression patterns of alternatively spliced genes
ASHESdb
Alternatively spliced human genes by exon skipping database
EASED
Extended alternatively spliced EST database
ECgene
Genome annotation for alternative splicing
EDAS
EST-derived alternative splicing database
ExInt
Exon-intron structure of eukaryotic genes
HS3D
Homo sapiens splice sites dataset
Intronerator
Alternative splicing in C. elegans and C. briggsae
SpliceDB
Canonical and non-canonical mammalian splice sites
SpliceInfo
Modes of alternative splicing in human genome
SpliceNest
A tool for visualizing splicing of genes from EST data
1.2.3. Transcriptional regulator sites and transcription factors
Database name
Full name and/or description
URL
ACTIVITY
Functional DNA/RNA site activity
DBTBS
Bacillus subtilis promoters and transcription factors
DoOP
Database of orthologous promoters: chordates and plants
DPInteract
Binding sites for E. coli DNA-binding proteins
EPD
Eukaryotic promoter database
HemoPDB
Hematopoietic promoter database: transcriptional regulation in hematopoiesis
JASPAR
PSSMs for transcription factor DNA-binding sites
MAPPER
Putative transcription factor binding sites in various genomes
PLACE
Plant cis-acting regulatory DNA elements
PlantCARE
Plant promoters and cis-acting regulatory elements
PlantProm
Plant promoter sequences for RNA polymerase II
PRODORIC
Prokaryotic database of gene regulation networks
PromEC
E. coli promoters with experimentally identified transcriptional start sites
SELEX_DB
DNA and RNA binding sites for various proteins, found by systematic evolution of ligands by exponential enrichment
TESS
Transcription element search system
TRACTOR db
Transcription factors in gamma-proteobacteria database
TRANSCompel
Composite regulatory elements affecting gene transcription in eukaryotes
TRANSFAC
Transcription factors and binding sites
TRED
Transcriptional regulatory element database
TRRD
Transcription regulatory regions of eukaryotic genes
2. RNA sequence databases
Database name
Full name and/or description
URL
16S and 23S rRNA Mutation Database
16S and 23S ribosomal RNA mutations
5S rRNA Database
5S rRNA sequences
Aptamer database
Small RNA/DNA molecules binding nucleic acids, proteins
ARED
AU-rich element-containing mRNA database
Mobile group II introns
A database of group II introns, self-splicing catalytic RNAs
European rRNA database
All complete or nearly complete rRNA sequences
GtRDB
Genomic tRNA database
Guide RNA Database
RNA editing in various kinetoplastid species
HIV Sequence Database
HIV RNA sequences
HuSiDa
Human siRNA database
HyPaLib
Hybrid pattern library: structural elements in classes of RNA
IRESdb
Internal ribosome entry site database
microRNA Registry
Database of microRNAs (small non-coding RNAs)
NCIR
Non-canonical interactions in RNA structures
ncRNAs Database
Non-coding RNAs with regulatory functions
NONCODE
A database of non-coding RNAs
PLANTncRNAs
Plant non-coding RNAs
Plant snoRNA DB
snoRNA genes in plant species
PolyA_DB
A database of mammalian mRNA polyadenylation
PseudoBase
Database of RNA pseudoknots
Rfam
Non-coding RNA families
RISSC
Ribosomal internal spacer sequence collection
RNAdb
Mammalian non-coding RNA database
RNA Modification Database
Naturally modified nucleosides in RNA
RRNDB
rRNA operon numbers in various prokaryotes
siRNAdb
siRNA database and search engine
Small RNA Database
Small RNAs from prokaryotes and eukaryotes
SRPDB
Signal recognition particle database
SSU rRNA Modification Database
Modified nucleosides in small subunit rRNA
Subviral RNA Database
Viroids and viroid-like RNAs
tmRNA Website
tmRNA sequences and alignments
tmRDB
tmRNA database
tRNA sequences
tRNA viewer and sequence editor
UTRdb/UTRsite
5'- and 3'-UTRs of eukaryotic mRNAs
3. Protein sequence databases
3.1. General sequence databases
Database name
Full name and/or description
URL
EXProt
Sequences of proteins with experimentally verified function
NCBI Protein database
All protein sequences: translated from GenBank and imported from other protein databases
PA-GOSUB
Protein sequences from model organisms, GO assignment and subcellular localization
PIR-PSD
Protein information resource protein sequence database, has been merged into the UniProt knowledgebase
PIR-NREF
PIR's non-redundant reference protein database
PRF
Protein research foundation database of peptides: sequences, literature and unnatural amino acids
Swiss-Prot
Now UniProt/Swiss-Prot: expertly curated protein sequence database, section of the UniProt knowledgebase
TrEMBL
Now UniProt/TrEMBL: computer-annotated translations of EMBL nucleotide sequence entries: section of the UniProt knowledgebase
UniParc
UniProt archive: a repository of all protein sequences, consisting only of unique identifiers and sequence
UniProt
Universal protein knowledgebase: merged data from Swiss-Prot, TrEMBL and PIR protein sequence databases
UniRef
UniProt non-redundant reference database: clustered sets of related sequences (including splice variants and isoforms)
3.2. Protein properties
Database name
Full name and/or description
URL
AAindex
Physicochemical properties of amino acids
ProNIT
Thermodynamic data on protein-nucleic acid interactions
ProTherm
Thermodynamic data for wild-type and mutant proteins
TECRdb
Thermodynamics of enzyme-catalyzed reactions
3.3. Protein localization and targeting
Database name
Full name and/or description
URL
DBSubLoc
Database of protein subcellular localization
NESbase
Nuclear export signals database
NLSdb
Nuclear localization signals
NMPdb
Nuclear matrix associated proteins database
NOPdb
Nucleolar proteome database
PSORTdb
Protein subcellular localization in bacteria
SPD
Secreted protein database
THGS
Transmembrane helices in genome sequences
TMPDB
Experimentally characterized transmembrane topologies
3.4. Protein sequence motifs and active sites
Database name
Full name and/or description
URL
ASC
Active sequence collection: biologically active peptides
Blocks
Alignments of conserved regions in protein families
CSA
Catalytic site atlas: active sites and catalytic residues in enzymes of known 3D structure
COMe
Co-ordination of metals etc.: classification of bioinorganic proteins (metalloproteins and some other complex proteins)
CopS
Comprehensive peptide signature database
eBLOCKS
Highly conserved protein sequence blocks
eMOTIF
Protein sequence motif determination and searches
Metalloprotein Site Database
Metal-binding sites in metalloproteins
O-GlycBase
O- and C-linked glycosylation sites in proteins
PDBSite
3D structure of protein functional sites
Phospho.ELM
S/T/Y protein phosphorylation sites (formerly PhosphoBase)
PROMISE
Prosthetic centers and metal ions in protein active sites
PROSITE
Biologically significant protein patterns and profiles
ProTeus
Signature sequences at the protein N- and C-termini
3.5. Protein domain databases; protein classification
Database name
Full name and/or description
URL
ADDA
A database of protein domain classification
CDD
Conserved domain database; includes protein domains from Pfam, SMART, COG and KOG databases
CluSTr
Clusters of Swiss-Prot + TrEMBL proteins
FunShift
Functional divergence between the subfamilies of a protein domain family
Hits
A database of protein domains and motifs
InterPro
Integrated resource of protein families, domains and functional sites
iProClass
Integrated protein classification database
PIRSF
Family/superfamily classification of whole proteins
PRINTS
Hierarchical gene family fingerprints
Pfam
Protein families: multiple sequence alignments and profile hidden Markov models of protein domains
PRECISE
Predicted and consensus interaction sites in enzymes
ProDom
Protein domain families
ProtoMap
Hierarchical classification of Swiss-Prot proteins
ProtoNet
Hierarchical clustering of Swiss-Prot proteins
S4
Structure-based sequence alignments of SCOP superfamilies
SBASE
Protein domain sequences and tools
SMART
Simple modular architecture research tool: signalling, extracellular and chromatin-associated protein domains
SUPFAM
Grouping of sequence families into superfamilies
SYSTERS
Systematic re-searching and clustering of proteins
TIGRFAMs
TIGR protein families adapted for functional annotation
3.6. Databases of individual protein families
Database name
Full name and/or description
URL
AARSDB
Aminoacyl-tRNA synthetase database
ASPD
Artificial selected proteins/peptides database
BacTregulators
Transcriptional regulators of AraC and TetR families
CSDBase
Cold shock domain-containing proteins
CuticleDB
Structural proteins of Arthropod cuticle
DCCP
Database of copper-chelating proteins
DExH/D Family Database
DEAD-box, DEAH-box and DExH-box proteins
Endogenous GPCR List
G protein-coupled receptors; expression in cell lines
ESTHER
Esterases and other alpha/beta hydrolase enzymes
EyeSite
Families of proteins functioning in the eye
GPCRDB
G protein-coupled receptors database
gpDB
G-proteins and their interaction with GPCRs
Histone Database
Histone fold sequences and structures
Homeobox Page
Homeobox proteins, classification and evolution
Hox-Pro
Homeobox genes database
Homeodomain Resource
Homeodomain sequences, structures and related genetic and genomic information
HORDE
Human olfactory receptor data exploratorium
InBase
Inteins (protein splicing elements) database: properties, sequences, bibliography
KinG (Kinases in Genomes)
S/T/Y-specific protein kinases encoded in complete genomes
Knottins
Database of knottins: small proteins with an unusual 'disulfide through disulfide' knot
LGICdb
Ligand-gated ion channel subunit sequences database
Lipase Engineering Database
Sequence, structure and function of lipases and esterases
LOX-DB
Mammalian, invertebrate, plant and fungal lipoxygenases
MEROPS
Database of proteolytic enzymes (peptidases)
NPD
Nuclear protein database
NucleaRDB
Nuclear receptor superfamily
Nuclear Receptor Resource
Nuclear receptor superfamily
NUREBASE
Nuclear hormone receptors database
Olfactory Receptor Database
Sequences for olfactory receptor-like molecules
ooTFD
Object-oriented transcription factors database
PKR
Protein kinase resource: sequences, enzymology, genetics and molecular and structural properties
PLPMDB
Pyridoxal-5'-phosphate dependent enzymes mutations
ProLysED
A database of bacterial protease systems
Prolysis
Proteases and natural and synthetic protease inhibitors
REBASE
Restriction enzymes and associated methylases
Ribonuclease P Database
RNase P sequences, alignments and structures
RPG
Ribosomal protein gene database
RTKdb
Receptor tyrosine kinase sequences
S/MARt DB
Nuclear scaffold/matrix attached regions
Scorpion
Database of scorpion toxins
SDAP
Structural database of allergenic proteins and food allergens
SENTRA
Sensory signal transduction proteins
SEVENS
7-transmembrane helix receptors (G-protein-coupled)
SRPDB
Proteins of the signal recognition particles
TrSDB
Transcription factor database
VKCDB
Voltage-gated potassium channel database
Wnt Database
Wnt proteins and phenotypes
4. Structure Databases
4.1. Small molecules
Database name
Full name and/or description
URL
ChEBI
Chemical entities of biological interest
CSD
Cambridge structural database: crystal structure information for organic and metal-organic compounds
HIC-Up
Hetero-compound Information Centre Uppsala
AANT
Amino acid-nucleotide interaction database
Klotho
Collection and categorization of biological compounds
LIGAND
Chemical compounds and reactions in biological pathways
PDB-Ligand
3D structures of small molecules bound to proteins and nucleic acids
Ligand Depot
Ligand Depot is a data warehouse which integrates databases, services, tools and methods related to small molecules bound to macromolecules.
PubChem
Structures and biological activities of small organic molecules
4.2. Carbohydrates
Database name
Full name and/or description
URL
CCSD
Complex carbohydrate structure database (CarbBank)
CSS
Carbohydrate structure suite: carbohydrate 3D structures derived from the PDB
Glycan
Carbohydrate database, part of the KEGG system
GlycoSuiteDB
N- and O-linked glycan structures and biological sources
Monosaccharide Browser
Space-filling Fischer projections of monosaccharides
SWEET-DB
Annotated carbohydrate structure and substance information
4.3. Nucleic acid structure
Database name
Full name and/or description
URL
NDB
Nucleic acid-containing structures
NTDB
Thermodynamic data for nucleic acids
RNABase
RNA-containing structures from PDB and NDB
SCOR
Structural classification of RNA: RNA motifs by structure, function and tertiary interactions
4.4. Protein structure
Database name
Full name and/or description
URL
ArchDB
Automated classification of protein loop structures
wwPDB
Worldwide Protein Data Bank
PDBj
Protein Data Bank Japan: the archive for macromolecular structures
ASTRAL
Sequences of domains of known structure, selected subsets and sequence-structure correspondences
BAliBASE
A database for comparison of multiple sequence alignments
BioMagResBank
NMR spectroscopic data for proteins and nucleic acids
CADB
Conformational angles in proteins database
CATH
Protein domain structures database
CE
3D protein structure alignments
CKAAPs DB
Structurally similar proteins with dissimilar sequences
Dali
Protein fold classification using the Dali search engine
Decoys 'R' Us
Computer-generated protein conformations
DisProt
Database of Protein Disorder: proteins that lack fixed 3D structure in their native states
DomIns
Domain insertions in known protein structures
DSDBASE
Native and modeled disulfide bonds in proteins
DSMM
Database of simulated molecular motions
eF-site
Electrostatic surface of Functional site: electrostatic potentials and hydrophobic properties of the active sites
GenDiS
Genomic distribution of protein structural superfamilies
Gene3D
Precalculated structural assignments for whole genomes
GTD
Genomic threading database: structural annotations of complete proteomes
GTOP
Protein fold predictions from genome sequences
Het-PDB Navi
Hetero-atoms in protein structures
HOMSTRAD
Homologous structure alignment database: curated structure-based alignments for protein families
IMB Jena Image Library
Visualization and analysis of 3D biopolymer structures
IMGT/3Dstructure-DB
Sequences and 3D structures of vertebrate immunoglobulins, T cell receptors and MHC proteins
ISSD
Integrated sequence-structure database
LPFC
Library of protein family core structures
MMDB
NCBI's database of 3D structures, part of NCBI Entrez
E-MSD
EBI's macromolecular structure database
ModBase
Annotated comparative protein structure models
MolMovDB
Database of macromolecular movements: descriptions of protein and macromolecular motions, including movies
PALI
Phylogeny and alignment of homologous protein structures
PASS2
Structural motifs of protein superfamilies
PepConfDB
A database of peptide conformations
PDB
Protein structure databank: all publicly available 3D structures of proteins and nucleic acids
PDB-REPRDB
Representative protein chains, based on PDB entries
PDBsum
Summaries and analyses of PDB structures
PDB_TM
Transmembrane proteins with known 3D structure
Protein Folding Database
Experimental data on protein folding
SCOP
Structural classification of proteins
Sloop
Classification of protein loops
Structure Superposition Database
Pairwise superposition of TIM-barrel structures
SWISS-MODEL Repository
Database of annotated 3D protein structure models
SUPERFAMILY
Assignments of proteins to structural superfamilies
SURFACE
Surface residues and functions annotated, compared and evaluated: a database of protein surface patches
TargetDB
Target data from worldwide structural genomics projects
3D-GENOMICS
Structural annotations for complete proteomes
TOPS
Topology of protein structures database
5. Genomics Databases (non-human)
5.1. Genome annotation terms, ontologies and nomenclature
Database name
Full name and/or description
URL
Genew
Human gene nomenclature: approved gene symbols
GO
Gene ontology consortium database
GOA
EBI's gene ontology annotation project
IUBMB Nomenclature database
Nomenclature of enzymes, membrane transporters, electron transport proteins and other proteins
IUPAC Nomenclature database
Nomenclature of biochemical and organic compounds approved by the IUBMB-IUPAC Joint Commission
IUPHAR-RD
The International Union of Pharmacology recommendations on receptor nomenclature and drug classification
PANTHER
Gene products organized by biological function
UMLS
Unified medical language system
5.1.1. Taxonomy and Identification
Database name
Full name and/or description
URL
ICB
gyrB database for identification and classification of bacteria
NCBI Taxonomy
Names of all organisms represented in GenBank
PANDIT
Protein and associated nucleotide domains with inferred trees
RIDOM
rRNA-based differentiation of medical microorganisms
RDP-II
Ribosomal database project
Tree of Life
Information on phylogeny and biodiversity
5.2. General genomics databases
Database name
Full name and/or description
URL
COG
Clusters of orthologous groups of proteins
COGENT
Complete genome tracking: predicted peptides from fully sequenced genomes
CORG
Comparative regulatory genomics: conserved non-coding sequence blocks
DEG
Database of essential genes from bacteria and yeast
EBI Genomes
EBI's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes
EGO
Eukaryotic gene orthologs: orthologous DNA sequences in the TIGR gene indices
EMGlib
Enhanced microbial genomes library: completely sequenced genomes of unicellular organisms
Entrez Genomes
NCBI's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes
ERGO Light
Integrated biochemical data on nine bacterial genomes: publicly available portion of the ERGO database
FusionDB
Database of bacterial and archaeal gene fusion events
Genome Atlas
DNA structural properties of sequenced genomes
Genome Information Broker
DDBJ's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes
Genome Reviews
Integrated view of complete genomes
GOLD
Genomes online database: a listing of completed and ongoing genome projects
HGT-DB
Putative horizontally transferred genes in prokaryotic genomes
Integr8
Functional classification of proteins in whole genomes
KEGG
Kyoto encyclopedia of genes and genomes: integrated suite of databases on genes, proteins and metabolic pathways
MBGD
Microbial genome database for comparative analysis
ORFanage
Database of orphan ORFs (ORFs with no homologs) in complete microbial genomes
PACRAT
Archaeal and bacterial intergenic sequence features
PartiGeneDB
Assembled partial genomes for ~250 eukaryotic organisms
PEDANT
Results of an automated analysis of genomic sequences
TIGR Microbial Database
Lists of completed and ongoing genome projects with links to complete genome sequences
TIGR Comprehensive Microbial Resource
Various data on complete microbial genomes: uniform annotation, properties of DNA and predicted proteins
TransportDB
Predicted membrane transporters in complete genomes, classified according to the TC classification system
WIT3
What is there? Metabolic reconstruction for completely sequenced microbial genomes