Bioinformatics Database
1.1 International Nucleotide Sequence Database Collaboration
| Database name | Full name and/or description | URL |
| DDBJ-DNA Data Bank of Japan | All known nucleotide and protein sequences | |
| EMBL-Nucleotide Sequence Database | All known nucleotide and protein sequences | |
| GenBank | All known nucleotide and protein sequences | |
1.2. DNA sequences: genes, motifs and regulatory sites
1.2.1. Coding and non-coding DNA
| Database name | Full name and/or description | URL |
| ACLAME | A classification of genetic mobile elements | |
| CUTG | Codon usage tabulated from GenBank | |
| Genetic Codes | Genetic codes in various organisms and organelles | |
| Entrez Gene | Gene-centered information at NCBI | |
| HERVd | Human endogenous retrovirus database | |
| Hoppsigen | Human and mouse homologous processed pseudogenes | |
| Imprinted Gene Catalogue | Imprinted genes and parent-of-origin effects in animals | |
| Islander | Pathogenicity islands and prophages in bacterial genomes | |
| MICdb | Prokaryotic microsatellites | |
| NPRD | Nucleosome positioning region database | |
| STRBase | Short tandem DNA repeats database | |
| TIGR Gene Indices | Organism-specific databases of EST and gene sequences | |
| Transterm | Codon usage, start and stop signals | |
| UniGene | Non-redundant set of eukaryotic gene-oriented clusters | |
| UniVec | Vector sequences, adapters, linkers and primers used in DNA cloning; can be used to check for vector contamination | |
| VectorDB | Characterization and classification of nucleic acid vectors | |
| Xpro | Eukaryotic protein-encoding DNA sequences, both intron-containing and intron-less genes | |
1.2.2. Gene structure, introns and exons, splice sites
| Database name | Full name and/or description | URL |
| ASAP | Alternative spliced isoforms | |
| ASD | Alternative splicing database at EBI, includes three databases AltSplice, AltExtron and AEdb | |
| ASDB | Alternative splicing database: protein products and expression patterns of alternatively spliced genes | |
| ASHESdb | Alternatively spliced human genes by exon skipping database | |
| EASED | Extended alternatively spliced EST database | |
| ECgene | Genome annotation for alternative splicing | |
| EDAS | EST-derived alternative splicing database | |
| ExInt | Exon-intron structure of eukaryotic genes | |
| HS3D | Homo sapiens splice sites dataset | |
| Intronerator | Alternative splicing in C. elegans and C. briggsae | |
| SpliceDB | Canonical and non-canonical mammalian splice sites | |
| SpliceInfo | Modes of alternative splicing in human genome | |
| SpliceNest | A tool for visualizing splicing of genes from EST data | |
1.2.3. Transcriptional regulator sites and transcription factors
| Database name | Full name and/or description | URL |
| ACTIVITY | Functional DNA/RNA site activity | |
| DBTBS | Bacillus subtilis promoters and transcription factors | |
| DoOP | Database of orthologous promoters: chordates and plants | |
| DPInteract | Binding sites for E. coli DNA-binding proteins | |
| EPD | Eukaryotic promoter database | |
| HemoPDB | Hematopoietic promoter database: transcriptional regulation in hematopoiesis | |
| JASPAR | PSSMs for transcription factor DNA-binding sites | |
| MAPPER | Putative transcription factor binding sites in various genomes | |
| PLACE | Plant cis-acting regulatory DNA elements | |
| PlantCARE | Plant promoters and cis-acting regulatory elements | |
| PlantProm | Plant promoter sequences for RNA polymerase II | |
| PRODORIC | Prokaryotic database of gene regulation networks | |
| PromEC | E. coli promoters with experimentally identified transcriptional start sites | |
| SELEX_DB | DNA and RNA binding sites for various proteins, found by systematic evolution of ligands by exponential enrichment | |
| TESS | Transcription element search system | |
| TRACTOR db | Transcription factors in gamma-proteobacteria database | |
| TRANSCompel | Composite regulatory elements affecting gene transcription in eukaryotes | |
| TRANSFAC | Transcription factors and binding sites | |
| TRED | Transcriptional regulatory element database | |
| TRRD | Transcription regulatory regions of eukaryotic genes | |
2. RNA sequence databases
| Database name | Full name and/or description | URL |
| 16S and 23S rRNA Mutation Database | 16S and 23S ribosomal RNA mutations | |
| 5S rRNA Database | 5S rRNA sequences | |
| Aptamer database | Small RNA/DNA molecules binding nucleic acids, proteins | |
| ARED | AU-rich element-containing mRNA database | |
| Mobile group II introns | A database of group II introns, self-splicing catalytic RNAs | |
| European rRNA database | All complete or nearly complete rRNA sequences | |
| GtRDB | Genomic tRNA database | |
| Guide RNA Database | RNA editing in various kinetoplastid species | |
| HIV Sequence Database | HIV RNA sequences | |
| HuSiDa | Human siRNA database | |
| HyPaLib | Hybrid pattern library: structural elements in classes of RNA | |
| IRESdb | Internal ribosome entry site database | |
| microRNA Registry | Database of microRNAs (small non-coding RNAs) | |
| NCIR | Non-canonical interactions in RNA structures | |
| ncRNAs Database | Non-coding RNAs with regulatory functions | |
| NONCODE | A database of non-coding RNAs | |
| PLANTncRNAs | Plant non-coding RNAs | |
| Plant snoRNA DB | snoRNA genes in plant species | |
| PolyA_DB | A database of mammalian mRNA polyadenylation | |
| PseudoBase | Database of RNA pseudoknots | |
| Rfam | Non-coding RNA families | |
| RISSC | Ribosomal internal spacer sequence collection | |
| RNAdb | Mammalian non-coding RNA database | |
| RNA Modification Database | Naturally modified nucleosides in RNA | |
| RRNDB | rRNA operon numbers in various prokaryotes | |
| siRNAdb | siRNA database and search engine | |
| Small RNA Database | Small RNAs from prokaryotes and eukaryotes | |
| SRPDB | Signal recognition particle database | |
| SSU rRNA Modification Database | Modified nucleosides in small subunit rRNA | |
| Subviral RNA Database | Viroids and viroid-like RNAs | |
| tmRNA Website | tmRNA sequences and alignments | |
| tmRDB | tmRNA database | |
| tRNA sequences | tRNA viewer and sequence editor | |
| UTRdb/UTRsite | 5'- and 3'-UTRs of eukaryotic mRNAs | |
3. Protein sequence databases
3.1. General sequence databases
| Database name | Full name and/or description | URL |
| EXProt | Sequences of proteins with experimentally verified function | |
| NCBI Protein database | All protein sequences: translated from GenBank and imported from other protein databases | |
| PA-GOSUB | Protein sequences from model organisms, GO assignment and subcellular localization | |
| PIR-PSD | Protein information resource protein sequence database, has been merged into the UniProt knowledgebase | |
| PIR-NREF | PIR's non-redundant reference protein database | |
| PRF | Protein research foundation database of peptides: sequences, literature and unnatural amino acids | |
| Swiss-Prot | Now UniProt/Swiss-Prot: expertly curated protein sequence database, section of the UniProt knowledgebase | |
| TrEMBL | Now UniProt/TrEMBL: computer-annotated translations of EMBL nucleotide sequence entries: section of the UniProt knowledgebase | |
| UniParc | UniProt archive: a repository of all protein sequences, consisting only of unique identifiers and sequence | |
| UniProt | Universal protein knowledgebase: merged data from Swiss-Prot, TrEMBL and PIR protein sequence databases | |
| UniRef | UniProt non-redundant reference database: clustered sets of related sequences (including splice variants and isoforms) | |
3.2. Protein properties
| Database name | Full name and/or description | URL |
| AAindex | Physicochemical properties of amino acids | |
| ProNIT | Thermodynamic data on protein-nucleic acid interactions | |
| ProTherm | Thermodynamic data for wild-type and mutant proteins | |
| TECRdb | Thermodynamics of enzyme-catalyzed reactions | |
3.3. Protein localization and targeting
| Database name | Full name and/or description | URL |
| DBSubLoc | Database of protein subcellular localization | |
| NESbase | Nuclear export signals database | |
| NLSdb | Nuclear localization signals | |
| NMPdb | Nuclear matrix associated proteins database | |
| NOPdb | Nucleolar proteome database | |
| PSORTdb | Protein subcellular localization in bacteria | |
| SPD | Secreted protein database | |
| THGS | Transmembrane helices in genome sequences | |
| TMPDB | Experimentally characterized transmembrane topologies | |
3.4. Protein sequence motifs and active sites
| Database name | Full name and/or description | URL |
| ASC | Active sequence collection: biologically active peptides | |
| Blocks | Alignments of conserved regions in protein families | |
| CSA | Catalytic site atlas: active sites and catalytic residues in enzymes of known 3D structure | |
| COMe | Co-ordination of metals etc.: classification of bioinorganic proteins (metalloproteins and some other complex proteins) | |
| CopS | Comprehensive peptide signature database | |
| eBLOCKS | Highly conserved protein sequence blocks | |
| eMOTIF | Protein sequence motif determination and searches | |
| Metalloprotein Site Database | Metal-binding sites in metalloproteins | |
| O-GlycBase | O- and C-linked glycosylation sites in proteins | |
| PDBSite | 3D structure of protein functional sites | |
| Phospho.ELM | S/T/Y protein phosphorylation sites (formerly PhosphoBase) | |
| PROMISE | Prosthetic centers and metal ions in protein active sites | |
| PROSITE | Biologically significant protein patterns and profiles | |
| ProTeus | Signature sequences at the protein N- and C-termini | |
3.5. Protein domain databases; protein classification
| Database name | Full name and/or description | URL |
| ADDA | A database of protein domain classification | |
| CDD | Conserved domain database, includes protein domains from Pfam, SMART, COG and KOG databases | |
| CluSTr | Clusters of Swiss-Prot + TrEMBL proteins | |
| FunShift | Functional divergence between the subfamilies of a protein domain family | |
| Hits | A database of protein domains and motifs | |
| InterPro | Integrated resource of protein families, domains and functional sites | |
| iProClass | Integrated protein classification database | |
| PIRSF | Family/superfamily classification of whole proteins | |
| PRINTS | Hierarchical gene family fingerprints | |
| Pfam | Protein families: multiple sequence alignments and profile hidden Markov models of protein domains | |
| PRECISE | Predicted and consensus interaction sites in enzymes | |
| ProDom | Protein domain families | |
| ProtoMap | Hierarchical classification of Swiss-Prot proteins | |
| ProtoNet | Hierarchical clustering of Swiss-Prot proteins | |
| S4 | Structure-based sequence alignments of SCOP superfamilies | |
| SBASE | Protein domain sequences and tools | |
| SMART | Simple modular architecture research tool: signalling, extracellular and chromatin-associated protein domains | |
| SUPFAM | Grouping of sequence families into superfamilies | |
| SYSTERS | Systematic re-searching and clustering of proteins | |
| TIGRFAMs | TIGR protein families adapted for functional annotation | |
3.6. Databases of individual protein families
| Database name | Full name and/or description | URL |
| AARSDB | Aminoacyl-tRNA synthetase database | |
| ASPD | Artificial selected proteins/peptides database | |
| BacTregulators | Transcriptional regulators of AraC and TetR families | |
| CSDBase | Cold shock domain-containing proteins | |
| CuticleDB | Structural proteins of Arthropod cuticle | |
| DCCP | Database of copper-chelating proteins | |
| DExH/D Family Database | DEAD-box, DEAH-box and DExH-box proteins | |
| Endogenous GPCR List | G protein-coupled receptors; expression in cell lines | |
| ESTHER | Esterases and other alpha/beta hydrolase enzymes | |
| EyeSite | Families of proteins functioning in the eye | |
| GPCRDB | G protein-coupled receptors database | |
| gpDB | G-proteins and their interaction with GPCRs | |
| Histone Database | Histone fold sequences and structures | |
| Homeobox Page | Homeobox proteins, classification and evolution | |
| Hox-Pro | Homeobox genes database | |
| Homeodomain Resource | Homeodomain sequences, structures and related genetic and genomic information | |
| HORDE | Human olfactory receptor data exploratorium | |
| InBase | Inteins (protein splicing elements) database: properties, sequences, bibliography | |
| KinG-Kinases in Genomes | S/T/Y-specific protein kinases encoded in complete genomes | |
| Knottins | Database of knottins: small proteins with an unusual 'disulfide through disulfide' knot | |
| LGICdb | Ligand-gated ion channel subunit sequences database | |
| Lipase Engineering Database | Sequence, structure and function of lipases and esterases | |
| LOX-DB | Mammalian, invertebrate, plant and fungal lipoxygenases | |
| MEROPS | Database of proteolytic enzymes (peptidases) | |
| NPD | Nuclear protein database | |
| NucleaRDB | Nuclear receptor superfamily | |
| Nuclear Receptor Resource | Nuclear receptor superfamily | |
| NUREBASE | Nuclear hormone receptors database | |
| Olfactory Receptor Database | Sequences for olfactory receptor-like molecules | |
| ooTFD | Object-oriented transcription factors database | |
| PKR | Protein kinase resource: sequences, enzymology, genetics and molecular and structural properties | |
| PLPMDB | Pyridoxal-5'-phosphate dependent enzymes mutations | |
| ProLysED | A database of bacterial protease systems | |
| Prolysis | Proteases and natural and synthetic protease inhibitors | |
| REBASE | Restriction enzymes and associated methylases | |
| Ribonuclease P Database | RNase P sequences, alignments and structures | |
| RPG | Ribosomal protein gene database | |
| RTKdb | Receptor tyrosine kinase sequences | |
| S/MARt dB | Nuclear scaffold/matrix attached regions | |
| Scorpion | Database of scorpion toxins | |
| SDAP | Structural database of allergenic proteins and food allergens | |
| SENTRA | Sensory signal transduction proteins | |
| SEVENS | 7-transmembrane helix receptors (G-protein-coupled) | |
| SRPDB | Proteins of the signal recognition particles | |
| TrSDB | Transcription factor database | |
| VKCDB | Voltage-gated potassium channel database | |
| Wnt Database | Wnt proteins and phenotypes | |
4. Structure Databases
4.1. Small molecules
| Database name | Full name and/or description | URL |
| ChEBI | Chemical entities of biological interest | |
| CSD | Cambridge structural database: crystal structure information for organic and metal-organic compounds | |
| HIC-Up | Hetero-compound Information Centre - Uppsala | |
| AANT | Amino acid-nucleotide interaction database | |
| Klotho | Collection and categorization of biological compounds | |
| LIGAND | Chemical compounds and reactions in biological pathways | |
| PDB-Ligand | 3D structures of small molecules bound to proteins and nucleic acids | |
| Ligand Depot | Ligand Depot is a data warehouse which integrates databases, services, tools and methods related to small molecules bound to macromolecules. | |
| PubChem | Structures and biological activities of small organic molecules | |
4.2. Carbohydrates
| Database name | Full name and/or description | URL |
| CCSD | Complex carbohydrate structure database (CarbBank) | |
| CSS | Carbohydrate structure suite: carbohydrate 3D structures derived from the PDB | |
| Glycan | Carbohydrate database, part of the KEGG system | |
| GlycoSuiteDB | N- and O-linked glycan structures and biological sources | |
| Monosaccharide Browser | Space-filling Fischer projections of monosaccharides | |
| SWEET-DB | Annotated carbohydrate structure and substance information | |
4.3. Nucleic acid structure
| Database name | Full name and/or description | URL |
| NDB | Nucleic acid-containing structures | |
| NTDB | Thermodynamic data for nucleic acids | |
| RNABase | RNA-containing structures from PDB and NDB | |
| SCOR | Structural classification of RNA: RNA motifs by structure, function and tertiary interactions | |
4.4. Protein structure
| Database name | Full name and/or description | URL |
| ArchDB | Automated classification of protein loop structures | |
| wwPDB | Worldwide Protein Data Bank | |
| PDBj | Protein Data Bank Japan-the archive for macromolecular structures. | |
| ASTRAL | Sequences of domains of known structure, selected subsets and sequence-structure correspondences | |
| BAliBASE | A database for comparison of multiple sequence alignments | |
| BioMagResBank | NMR spectroscopic data for proteins and nucleic acids | |
| CADB | Conformational angles in proteins database | |
| CATH | Protein domain structures database | |
| CE | 3D protein structure alignments | |
| CKAAPs DB | Structurally similar proteins with dissimilar sequences | |
| Dali | Protein fold classification using the Dali search engine | |
| Decoys 'R' Us | Computer-generated protein conformations | |
| DisProt | Database of Protein Disorder: proteins that lack fixed 3D structure in their native states | |
| DomIns | Domain insertions in known protein structures | |
| DSDBASE | Native and modeled disulfide bonds in proteins | |
| DSMM | Database of simulated molecular motions | |
| eF-site | Electrostatic surface of Functional site: electrostatic potentials and hydrophobic properties of the active sites | |
| GenDiS | Genomic distribution of protein structural superfamilies | |
| Gene3D | Precalculated structural assignments for whole genomes | |
| GTD | Genomic threading database: structural annotations of complete proteomes | |
| GTOP | Protein fold predictions from genome sequences | |
| Het-PDB Navi | Hetero-atoms in protein structures | |
| HOMSTRAD | Homologous structure alignment database: curated structure-based alignments for protein families | |
| IMB Jena Image Library | Visualization and analysis of 3D biopolymer structures | |
| IMGT/3Dstructure-DB | Sequences and 3D structures of vertebrate immunoglobulins, T cell receptors and MHC proteins | |
| ISSD | Integrated sequence-structure database | |
| LPFC | Library of protein family core structures | |
| MMDB | NCBI's database of 3D structures, part of NCBI Entrez | |
| E-MSD | EBI's macromolecular structure database | |
| ModBase | Annotated comparative protein structure models | |
| MolMovDB | Database of macromolecular movements: descriptions of protein and macromolecular motions, including movies | |
| PALI | Phylogeny and alignment of homologous protein structures | |
| PASS2 | Structural motifs of protein superfamilies | |
| PepConfDB | A database of peptide conformations | |
| PDB | Protein structure databank: all publicly available 3D structures of proteins and nucleic acids | |
| PDB-REPRDB | Representative protein chains, based on PDB entries | |
| PDBsum | Summaries and analyses of PDB structures | |
| PDB_TM | Transmembrane proteins with known 3D structure | |
| Protein Folding Database | Experimental data on protein folding | |
| SCOP | Structural classification of proteins | |
| Sloop | Classification of protein loops | |
| Structure Superposition Database | Pairwise superposition of TIM-barrel structures | |
| SWISS-MODEL Repository | Database of annotated 3D protein structure models | |
| SUPERFAMILY | Assignments of proteins to structural superfamilies | |
| SURFACE | Surface residues and functions annotated, compared and evaluated: a database of protein surface patches | |
| TargetDB | Target data from worldwide structural genomics projects | |
| 3D-GENOMICS | Structural annotations for complete proteomes | |
| TOPS | Topology of protein structures database | |
5. Genomics Databases (non-human)
5.1. Genome annotation terms, ontologies and nomenclature
| Database name | Full name and/or description | URL |
| Genew | Human gene nomenclature: approved gene symbols | |
| GO | Gene ontology consortium database | |
| GOA | EBI's gene ontology annotation project | |
| IUBMB Nomenclature database | Nomenclature of enzymes, membrane transporters, electron transport proteins and other proteins | |
| IUPAC Nomenclature database | Nomenclature of biochemical and organic compounds approved by the IUBMB-IUPAC Joint Commission | |
| IUPHAR-RD | The International Union of Pharmacology recommendations on receptor nomenclature and drug classification | |
| PANTHER | Gene products organized by biological function | |
| UMLS | Unified medical language system | |
5.1.1. Taxonomy and Identification
| Database name | Full name and/or description | URL |
| ICB | gyrB database for identification and classification of bacteria | |
| NCBI Taxonomy | Names of all organisms represented in GenBank | |
| PANDIT | Protein and associated nucleotide domains with inferred trees | |
| RIDOM | rRNA-based differentiation of medical microorganisms | |
| RDP-II | Ribosomal database project | |
| Tree of Life | Information on phylogeny and biodiversity | |
5.2. General genomics databases
| Database name | Full name and/or description | URL |
| COG | Clusters of orthologous groups of proteins | |
| COGENT | Complete genome tracking: predicted peptides from fully sequenced genomes | |
| CORG | Comparative regulatory genomics: conserved non-coding sequence blocks | |
| DEG | Database of essential genes from bacteria and yeast | |
| EBI Genomes | EBI's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes | |
| EGO | Eukaryotic gene orthologs: orthologous DNA sequences in the TIGR gene indices | |
| EMGlib | Enhanced microbial genomes library: completely sequenced genomes of unicellular organisms | |
| Entrez Genomes | NCBI's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes | |
| ERGO Light | Integrated biochemical data on nine bacterial genomes: publicly available portion of the ERGO database | |
| FusionDB | Database of bacterial and archaeal gene fusion events | |
| Genome Atlas | DNA structural properties of sequenced genomes | |
| Genome Information Broker | DDBJ's collection of databases for the analysis of complete and unfinished viral, pro- and eukaryotic genomes | |
| Genome Reviews | Integrated view of complete genomes | |
| GOLD | Genomes online database: a listing of completed and ongoing genome projects | |
| HGT-DB | Putative horizontally transferred genes in prokaryotic genomes | |
| Integr8 | Functional classification of proteins in whole genomes | |
| KEGG | Kyoto encyclopedia of genes and genomes: integrated suite of databases on genes, proteins and metabolic pathways | |
| MBGD | Microbial genome database for comparative analysis | |
| ORFanage | Database of orphan ORFs (ORFs with no homologs) in complete microbial genomes | |
| PACRAT | Archaeal and bacterial intergenic sequence features | |
| PartiGeneDB | Assembled partial genomes for ~250 eukaryotic organisms | |
| PEDANT | Results of an automated analysis of genomic sequences | |
| TIGR Microbial Database | Lists of completed and ongoing genome projects with links to complete genome sequences | |
| TIGR Comprehensive Microbial Resource | Various data on complete microbial genomes: uniform annotation, properties of DNA and predicted proteins | |
| TransportDB | Predicted membrane transporters in complete genomes, classified according to the TC classification system | |
| WIT3 | What is there? Metabolic reconstruction for completely sequenced microbial genomes | |