Bioinformatics Database
1.1. International Nucleotide Sequence Database Collaboration
Database name | Full name and/or description | URL |
DDBJ-DNA Data Bank of Japan | All known nucleotide and protein sequences | |
EMBL-Nucleotide Sequence Database | All known nucleotide and protein sequences | |
GenBank | All known nucleotide and protein sequences |
1.2. DNA sequences: genes, motifs and regulatory sites
1.2.1. Coding and non-coding DNA
Database name | Full name and/or description | URL |
ACLAME | A classification of genetic mobile elements | |
CUTG | Codon usage tabulated from GenBank | |
Genetic Codes | Genetic codes in various organisms and organelles | |
Entrez Gene | Gene-centered information at NCBI | |
HERVd | Human endogenous retrovirus database | |
Hoppsigen | Human and mouse homologous processed pseudogenes | |
Imprinted Gene Catalogue | Imprinted genes and parent-of-origin effects in animals | |
Islander | Pathogenicity islands and prophages in bacterial genomes | |
MICdb | Prokaryotic microsatellites | |
NPRD | Nucleosome positioning region database | |
STRBase | Short tandem DNA repeats database | |
TIGR Gene Indices | Organism-specific databases of EST and gene sequences | |
Transterm | Codon usage, start and stop signals | |
UniGene | Non-redundant set of eukaryotic gene-oriented clusters | |
UniVec | Vector sequences, adapters, linkers and primers used in DNA cloning; can be used to check for vector contamination | |
VectorDB | Characterization and classification of nucleic acid vectors | |
Xpro | Eukaryotic protein-encoding DNA sequences, both intron-containing and intron-less genes |
1.2.2. Gene structure, introns and exons, splice sites
Database name | Full name and/or description | URL |
ASAP | Alternative spliced isoforms | |
ASD | Alternative splicing database at EBI; includes three databases: AltSplice, AltExtron and AEdb | |
ASDB | Alternative splicing database: protein products and expression patterns of alternatively spliced genes | |
ASHESdb | Alternatively spliced human genes by exon skipping database | |
EASED | Extended alternatively spliced EST database | |
ECgene | Genome annotation for alternative splicing | |
EDAS | EST-derived alternative splicing database | |
ExInt | Exon-intron structure of eukaryotic genes | |
HS3D | Homo sapiens splice sites dataset | |
Intronerator | Alternative splicing in C. elegans and C. briggsae | |
SpliceDB | Canonical and non-canonical mammalian splice sites | |
SpliceInfo | Modes of alternative splicing in human genome | |
SpliceNest | A tool for visualizing splicing of genes from EST data |
1.2.3. Transcriptional regulator sites and transcription factors
Database name | Full name and/or description | URL |
ACTIVITY | Functional DNA/RNA site activity | |
DBTBS | Bacillus subtilis promoters and transcription factors | |
DoOP | Database of orthologous promoters: chordates and plants | |
DPInteract | Binding sites for E. coli DNA-binding proteins | |
EPD | Eukaryotic promoter database | |
HemoPDB | Hematopoietic promoter database: transcriptional regulation in hematopoiesis | |
JASPAR | PSSMs for transcription factor DNA-binding sites | |
MAPPER | Putative transcription factor binding sites in various genomes | |
PLACE | Plant cis-acting regulatory DNA elements | |
PlantCARE | Plant promoters and cis-acting regulatory elements | |
PlantProm | Plant promoter sequences for RNA polymerase II | |
PRODORIC | Prokaryotic database of gene regulation networks | |
PromEC | E. coli promoters with experimentally identified transcriptional start sites | |
SELEX_DB | DNA and RNA binding sites for various proteins, found by systematic evolution of ligands by exponential enrichment | |
TESS | Transcription element search system | |
TRACTOR db | Transcription factors in gamma-proteobacteria database | |
TRANSCompel | Composite regulatory elements affecting gene transcription in eukaryotes | |
TRANSFAC | Transcription factors and binding sites | |
TRED | Transcriptional regulatory element database | |
TRRD | Transcription regulatory regions of eukaryotic genes |
2. RNA sequence databases
Database name | Full name and/or description | URL |
16S and 23S rRNA Mutation Database | 16S and 23S ribosomal RNA mutations | |
5S rRNA Database | 5S rRNA sequences | |
Aptamer database | Small RNA/DNA molecules binding nucleic acids, proteins | |
ARED | AU-rich element-containing mRNA database | |
Mobile group II introns | A database of group II introns, self-splicing catalytic RNAs | |
European rRNA database | All complete or nearly complete rRNA sequences | |
GtRDB | Genomic tRNA database | |
Guide RNA Database | RNA editing in various kinetoplastid species | |
HIV Sequence Database | HIV RNA sequences | |
HuSiDa | Human siRNA database | |
HyPaLib | Hybrid pattern library: structural elements in classes of RNA | |
IRESdb | Internal ribosome entry site database | |
microRNA Registry | Database of microRNAs (small non-coding RNAs) | |
NCIR | Non-canonical interactions in RNA structures | |
ncRNAs Database | Non-coding RNAs with regulatory functions | |
NONCODE | A database of non-coding RNAs | |
PLANTncRNAs | Plant non-coding RNAs | |
Plant snoRNA DB | snoRNA genes in plant species | |
PolyA_DB | A database of mammalian mRNA polyadenylation | |
PseudoBase | Database of RNA pseudoknots | |
Rfam | Non-coding RNA families | |
RISSC | Ribosomal internal spacer sequence collection | |
RNAdb | Mammalian non-coding RNA database | |
RNA Modification Database | Naturally modified nucleosides in RNA | |
RRNDB | rRNA operon numbers in various prokaryotes | |
siRNAdb | siRNA database and search engine | |
Small RNA Database | Small RNAs from prokaryotes and eukaryotes | |
SRPDB | Signal recognition particle database | |
SSU rRNA Modification Database | Modified nucleosides in small subunit rRNA | |
Subviral RNA Database | Viroids and viroid-like RNAs | |
tmRNA Website | tmRNA sequences and alignments | |
tmRDB | tmRNA database | |
tRNA sequences | tRNA viewer and sequence editor | |
UTRdb/UTRsite | 5'- and 3'-UTRs of eukaryotic mRNAs |
3. Protein sequence databases
3.1. General sequence databases
Database name | Full name and/or description | URL |
EXProt | Sequences of proteins with experimentally verified function | |
NCBI Protein database | All protein sequences: translated from GenBank and imported from other protein databases | |
PA-GOSUB | Protein sequences from model organisms, GO assignment and subcellular localization | |
PIR-PSD | Protein information resource protein sequence database, has been merged into the UniProt knowledgebase | |
PIR-NREF | PIR's non-redundant reference protein database | |
PRF | Protein research foundation database of peptides: sequences, literature and unnatural amino acids | |
Swiss-Prot | Now UniProt/Swiss-Prot: expertly curated protein sequence database, section of the UniProt knowledgebase | |
TrEMBL | Now UniProt/TrEMBL: computer-annotated translations of EMBL nucleotide sequence entries: section of the UniProt knowledgebase | |
UniParc | UniProt archive: a repository of all protein sequences, consisting only of unique identifiers and sequence | |
UniProt | Universal protein knowledgebase: merged data from Swiss-Prot, TrEMBL and PIR protein sequence databases | |
UniRef | UniProt non-redundant reference database: clustered sets of related sequences (including splice variants and isoforms) |
3.2. Protein properties
Database name | Full name and/or description | URL |
AAindex | Physicochemical properties of amino acids | |
ProNIT | Thermodynamic data on protein-nucleic acid interactions | |
ProTherm | Thermodynamic data for wild-type and mutant proteins | |
TECRdb | Thermodynamics of enzyme-catalyzed reactions |
3.3. Protein localization and targeting
Database name | Full name and/or description | URL |
DBSubLoc | Database of protein subcellular localization | |
NESbase | Nuclear export signals database | |
NLSdb | Nuclear localization signals | |
NMPdb | Nuclear matrix associated proteins database | |
NOPdb | Nucleolar proteome database | |
PSORTdb | Protein subcellular localization in bacteria | |
SPD | Secreted protein database | |
THGS | Transmembrane helices in genome sequences | |
TMPDB | Experimentally characterized transmembrane topologies |
3.4. Protein sequence motifs and active sites
Database name | Full name and/or description | URL |
ASC | Active sequence collection: biologically active peptides | |
Blocks | Alignments of conserved regions in protein families | |
CSA | Catalytic site atlas: active sites and catalytic residues in enzymes of known 3D structure | |
COMe | Co-ordination of metals etc.: classification of bioinorganic proteins (metalloproteins and some other complex proteins) | |
CopS | Comprehensive peptide signature database | |
eBLOCKS | Highly conserved protein sequence blocks | |
eMOTIF | Protein sequence motif determination and searches | |
Metalloprotein Site Database | Metal-binding sites in metalloproteins | |
O-GlycBase | O- and C-linked glycosylation sites in proteins | |
PDBSite | 3D structure of protein functional sites | |
Phospho.ELM | S/T/Y protein phosphorylation sites (formerly PhosphoBase) | |
PROMISE | Prosthetic centers and metal ions in protein active sites | |
PROSITE | Biologically significant protein patterns and profiles | |
ProTeus | Signature sequences at the protein N- and C-termini |
3.5. Protein domain databases; protein classification
Database name | Full name and/or description | URL |
ADDA | A database of protein domain classification | |
CDD | Conserved domain database, includes protein domains from Pfam, SMART, COG and KOG databases | |
CluSTr | Clusters of Swiss-Prot + TrEMBL proteins | |
FunShift | Functional divergence between the subfamilies of a protein domain family | |
Hits | A database of protein domains and motifs | |
InterPro | Integrated resource of protein families, domains and functional sites | |
iProClass | Integrated protein classification database | |
PIRSF | Family/superfamily classification of whole proteins | |
PRINTS | Hierarchical gene family fingerprints | |
Pfam | Protein families: multiple sequence alignments and profile hidden Markov models of protein domains | |
PRECISE | Predicted and consensus interaction sites in enzymes | |
ProDom | Protein domain families | |
ProtoMap | Hierarchical classification of Swiss-Prot proteins | |
ProtoNet | Hierarchical clustering of Swiss-Prot proteins | |
S4 | Structure-based sequence alignments of SCOP superfamilies | |
SBASE | Protein domain sequences and tools | |
SMART | Simple modular architecture research tool: signalling, extracellular and chromatin-associated protein domains | |
SUPFAM | Grouping of sequence families into superfamilies | |
SYSTERS | Systematic re-searching and clustering of proteins | |
TIGRFAMs | TIGR protein families adapted for functional annotation |
3.6. Databases of individual protein families
Database name | Full name and/or description | URL |
AARSDB | Aminoacyl-tRNA synthetase database | |
ASPD | Artificial selected proteins/peptides database | |
BacTregulators | Transcriptional regulators of AraC and TetR families | |
CSDBase | Cold shock domain-containing proteins | |
CuticleDB | Structural proteins of Arthropod cuticle | |
DCCP | Database of copper-chelating proteins | |
DExH/D Family Database | DEAD-box, DEAH-box and DExH-box proteins | |
Endogenous GPCR List | G protein-coupled receptors; expression in cell lines | |
ESTHER | Esterases and other alpha/beta hydrolase enzymes | |
EyeSite | Families of proteins functioning in the eye | |
GPCRDB | G protein-coupled receptors database | |
gpDB | G-proteins and their interaction with GPCRs | |
Histone Database | Histone fold sequences and structures | |
Homeobox Page | Homeobox proteins, classification and evolution | |
Hox-Pro | Homeobox genes database | |
Homeodomain Resource | Homeodomain sequences, structures and related genetic and genomic information | |
HORDE | Human olfactory receptor data exploratorium | |
InBase | Inteins (protein splicing elements) database: properties, sequences, bibliography | |
KinG - Kinases in Genomes | S/T/Y-specific protein kinases encoded in complete genomes | |
Knottins | Database of knottins: small proteins with an unusual 'disulfide through disulfide' knot | |
LGICdb | Ligand-gated ion channel subunit sequences database | |
Lipase Engineering Database | Sequence, structure and function of lipases and esterases | |
LOX-DB | Mammalian, invertebrate, plant and fungal lipoxygenases | |
MEROPS | Database of proteolytic enzymes (peptidases) | |
NPD | Nuclear protein database | |
NucleaRDB | Nuclear receptor superfamily | |
Nuclear Receptor Resource | Nuclear receptor superfamily | |
NUREBASE | Nuclear hormone receptors database | |
Olfactory Receptor Database | Sequences for olfactory receptor-like molecules | |
ooTFD | Object-oriented transcription factors database | |
PKR | Protein kinase resource: sequences, enzymology, genetics and molecular and structural properties | |
PLPMDB | Pyridoxal-5'-phosphate dependent enzymes mutations | |
ProLysED | A database of bacterial protease systems | |
Prolysis | Proteases and natural and synthetic protease inhibitors | |
REBASE | Restriction enzymes and associated methylases | |
Ribonuclease P Database | RNase P sequences, alignments and structures | |
RPG | Ribosomal protein gene database | |
RTKdb | Receptor tyrosine kinase sequences | |
S/MARt dB | Nuclear scaffold/matrix attached regions | |
Scorpion | Database of scorpion toxins | |
SDAP | Structural database of allergenic proteins and food allergens | |
SENTRA | Sensory signal transduction proteins | |
SEVENS | 7-transmembrane helix receptors (G-protein-coupled) | |
SRPDB | Proteins of the signal recognition particles | |
TrSDB | Transcription factor database | |
VKCDB | Voltage-gated potassium channel database | |
Wnt Database | Wnt proteins and phenotypes |
4. Structure Databases
4.1. Small molecules
Database name | Full name and/or description | URL |
ChEBI | Chemical entities of biological interest | |
CSD | Cambridge structural database: crystal structure information for organic and metal-organic compounds | |
HIC-Up | Hetero-compound Information Centre, Uppsala | |
AANT | Amino acid-nucleotide interaction database | |
Klotho | Collection and categorization of biological compounds | |
LIGAND | Chemical compounds and reactions in biological pathways | |
PDB-Ligand | 3D structures of small molecules bound to proteins and nucleic acids | |
Ligand Depot | Data warehouse integrating databases, services, tools and methods related to small molecules bound to macromolecules | |
PubChem | Structures and biological activities of small organic molecules |
4.2. Carbohydrates
Database name | Full name and/or description | URL |
CCSD | Complex carbohydrate structure database (CarbBank) | |
CSS | Carbohydrate structure suite: carbohydrate 3D structures derived from the PDB | |
Glycan | Carbohydrate database, part of the KEGG system | |
GlycoSuiteDB | N- and O-linked glycan structures and biological sources | |
Monosaccharide Browser | Space-filling Fischer projections of monosaccharides | |
SWEET-DB | Annotated carbohydrate structure and substance information |
4.3. Nucleic acid structure
Database name | Full name and/or description | URL |
NDB | Nucleic acid-containing structures | |
NTDB | Thermodynamic data for nucleic acids | |
RNABase | RNA-containing structures from PDB and NDB | |
SCOR | Structural classification of RNA: RNA motifs by structure, function and tertiary interactions |
4.4. Protein structure
Database name | Full name and/or description | URL |
ArchDB | Automated classification of protein loop structures | |
wwPDB | Worldwide Protein Data Bank | |
PDBj | Protein Data Bank Japan: the archive for macromolecular structures | |
ASTRAL | Sequences of domains of known structure, selected subsets and sequence-structure correspondences | |
BAliBASE | A database for comparison of multiple sequence alignments | |
BioMagResBank | NMR spectroscopic data for proteins and nucleic acids | |
CADB | Conformational angles in proteins database | |
CATH | Protein domain structures database | |
CE | 3D protein structure alignments | |
CKAAPs DB | Structurally similar proteins with dissimilar sequences | |
Dali | Protein fold classification using the Dali search engine | |
Decoys 'R' Us | Computer-generated protein conformations | |
DisProt | Database of Protein Disorder: proteins that lack fixed 3D structure in their native states | |
DomIns | Domain insertions in known protein structures | |
DSDBASE | Native and modeled disulfide bonds in proteins | |
DSMM | Database of simulated molecular motions | |
eF-site | Electrostatic surface of Functional site: electrostatic potentials and hydrophobic properties of the active sites | |
GenDiS | Genomic distribution of protein structural superfamilies | |
Gene3D | Precalculated structural assignments for whole genomes | |
GTD | Genomic threading database: structural annotations of complete proteomes | |
GTOP | Protein fold predictions from genome sequences | |
Het-PDB Navi | Hetero-atoms in protein structures | |
HOMSTRAD | Homologous structure alignment database: curated structure-based alignments for protein families | |
IMB Jena Image Library | Visualization and analysis of 3D biopolymer structures | |
IMGT/3Dstructure-DB | Sequences and 3D structures of vertebrate immunoglobulins, T cell receptors and MHC proteins | |
ISSD | Integrated sequence-structure database | |
LPFC | Library of protein family core structures | |
MMDB | NCBI's database of 3D structures, part of NCBI Entrez | |
E-MSD | EBI's macromolecular structure database | |
ModBase | Annotated comparative protein structure models | |
MolMovDB | Database of macromolecular movements: descriptions of protein and macromolecular motions, including movies | |
PALI | Phylogeny and alignment of homologous protein structures | |
PASS2 | Structural motifs of protein superfamilies | |
PepConfDB | A database of peptide conformations | |
PDB | Protein structure databank: all publicly available 3D structures of proteins and nucleic acids | |
PDB-REPRDB | Representative protein chains, based on PDB entries | |
PDBsum | Summaries and analyses of PDB structures | |
PDB_TM | Transmembrane proteins with known 3D structure | |
Protein Folding Database | Experimental data on protein folding | |
SCOP | Structural classification of proteins | |
Sloop | Classification of protein loops | |
Structure Superposition Database | Pairwise superposition of TIM-barrel structures | |
SWISS-MODEL Repository | Database of annotated 3D protein structure models | |
SUPERFAMILY | Assignments of proteins to structural superfamilies | |
SURFACE | Surface residues and functions annotated, compared and evaluated: a database of protein surface patches | |
TargetDB | Target data from worldwide structural genomics projects | |
3D-GENOMICS | Structural annotations for complete proteomes | |
TOPS | Topology of protein structures database |
5. Genomics Databases (non-human)
5.1. Genome annotation terms, ontologies and nomenclature
Database name | Full name and/or description | URL |
Genew | Human gene nomenclature: approved gene symbols | |
GO | Gene ontology consortium database | |
GOA | EBI's gene ontology annotation project | |
IUBMB Nomenclature database | Nomenclature of enzymes, membrane transporters, electron transport proteins and other proteins | |
IUPAC Nomenclature database | Nomenclature of biochemical and organic compounds approved by the IUBMB-IUPAC Joint Commission | |
IUPHAR-RD | The International Union of Pharmacology recommendations on receptor nomenclature and drug classification | |
PANTHER | Gene products organized by biological function | |
UMLS | Unified medical language system |
5.1.1. Taxonomy and Identification
Database name | Full name and/or description | URL |
ICB | gyrB database for identification and classification of bacteria | |
NCBI Taxonomy | Names of all organisms represented in GenBank | |
PANDIT | Protein and associated nucleotide domains with inferred trees | |
RIDOM | rRNA-based differentiation of medical microorganisms | |
RDP-II | Ribosomal database project | |
Tree of Life | Information on phylogeny and biodiversity |
5.2. General genomics databases
Database name | Full name and/or description | URL |
COG | Clusters of orthologous groups of proteins | |
COGENT | Complete genome tracking: predicted peptides from fully sequenced genomes | |
CORG | Comparative regulatory genomics: conserved non-coding sequence blocks | |
DEG | Database of essential genes from bacteria and yeast | |
EBI Genomes | EBI's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes | |
EGO | Eukaryotic gene orthologs: orthologous DNA sequences in the TIGR gene indices | |
EMGlib | Enhanced microbial genomes library: completely sequenced genomes of unicellular organisms | |
Entrez Genomes | NCBI's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes | |
ERGO Light | Integrated biochemical data on nine bacterial genomes: publicly available portion of the ERGO database | |
FusionDB | Database of bacterial and archaeal gene fusion events | |
Genome Atlas | DNA structural properties of sequenced genomes | |
Genome Information Broker | DDBJ's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes | |
Genome Reviews | Integrated view of complete genomes | |
GOLD | Genomes online database: a listing of completed and ongoing genome projects | |
HGT-DB | Putative horizontally transferred genes in prokaryotic genomes | |
Integr8 | Functional classification of proteins in whole genomes | |
KEGG | Kyoto encyclopedia of genes and genomes: integrated suite of databases on genes, proteins and metabolic pathways | |
MBGD | Microbial genome database for comparative analysis | |
ORFanage | Database of orphan ORFs (ORFs with no homologs) in complete microbial genomes | |
PACRAT | Archaeal and bacterial intergenic sequence features | |
PartiGeneDB | Assembled partial genomes for ~250 eukaryotic organisms | |
PEDANT | Results of an automated analysis of genomic sequences | |
TIGR Microbial Database | Lists of completed and ongoing genome projects with links to complete genome sequences | |
TIGR Comprehensive Microbial Resource | Various data on complete microbial genomes: uniform annotation, properties of DNA and predicted proteins | |
TransportDB | Predicted membrane transporters in complete genomes, classified according to the TC classification system | |
WIT3 | What is there? Metabolic reconstruction for completely sequenced microbial genomes |