COMBREX is a project to increase the speed of the functional annotation of new bacterial and archaeal genomes. It consists of a database of functional predictions produced by computational biologists and a mechanism for experimental biochemists to bid for the validation of those predictions. Small grants are available to support successful bids.
Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) Finder detects this family of direct repeats found in the DNA of many bacteria and archaea.
Composition Vector Tree (CVTree) infers phylogenetic relationships between microbial organisms by comparing their proteomes using a composition vector approach.
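The composition vector idea can be illustrated with a minimal sketch: each organism's proteome is summarized as a vector of overlapping k-mer frequencies, and two organisms are compared by the angle between their vectors. This is a simplified illustration only, not CVTree's actual implementation, which applies further processing not described here. The function names, the choice of k = 3, and the use of plain cosine distance are assumptions made for the example.

```python
from collections import Counter
from math import sqrt


def composition_vector(proteins, k=3):
    """Return relative frequencies of overlapping k-mers over all proteins of one organism."""
    counts = Counter()
    for seq in proteins:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(u, v):
    """Return 1 - cosine similarity between two sparse frequency vectors."""
    dot = sum(x * v.get(kmer, 0.0) for kmer, x in u.items())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # treat an empty vector as maximally distant (assumption)
    return 1.0 - dot / (norm_u * norm_v)
```

A pairwise distance matrix built this way could then be passed to any standard distance-based tree-building method.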
The Genomic Disulfide Analysis Program (GDAP) predicts disulfide bonds for a user-supplied protein sequence. GDAP also provides access to pre-computed predictions of disulfide bonds for over 100 microbial genomes.
Gene3D provides accurate structural domain family assignments for over 1100 genomes and nearly 10,000,000 proteins. A hidden Markov model library, constructed from the manually curated CATH structural domain hierarchy, is used to search UniProt, RefSeq and Ensembl protein sequences. The resulting matches are refined into simple multi-domain architectures. The domain assignments are integrated with multiple external protein function descriptions (e.g. Gene Ontology and KEGG), structural annotations (e.g. coiled coils, disordered regions and sequence polymorphisms) and family resources (e.g. Pfam and eggNOG). Gene3D also provides a set of services, including an interactive genome coverage graph visualizer, DAS annotation resources, sequence search facilities and SOAP services.
The Genomes On Line Database (GOLD) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. GOLD complies with the Minimum Information about a (Meta)Genome Sequence (MIGS/MIMS) specification.
This server offers a wide range of tools for pattern discovery in DNA and protein sequences as well as in text. It also hosts tools for multiple sequence alignment, gene discovery, protein annotation, and other applications. A detailed help page is provided for all tools.
The Integrated Microbial Genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context.
IslandPath aids genomic island detection in prokaryotic genome sequences, using features such as dinucleotide bias, G+C content, location of tRNA genes, and annotations of mobility genes. Genomic islands are defined here as genomic regions of potential horizontal origin.
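One of the features named above, G+C content, can be illustrated with a minimal sliding-window sketch that flags windows whose G+C fraction deviates from the genome-wide mean. This is not IslandPath's algorithm; the function names, window size, step, and deviation threshold are hypothetical values chosen for the example, and a real analysis would combine several lines of evidence.

```python
def gc_fraction(seq):
    """Return the fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def atypical_gc_windows(genome, window=1000, step=500, max_delta=0.1):
    """Return (start, gc) for each window whose G+C differs from the genome mean by more than max_delta."""
    mean_gc = gc_fraction(genome)
    hits = []
    # Always examine at least one window, even if the genome is shorter than the window.
    last_start = max(len(genome) - window + 1, 1)
    for start in range(0, last_start, step):
        gc = gc_fraction(genome[start:start + window])
        if abs(gc - mean_gc) > max_delta:
            hits.append((start, round(gc, 3)))
    return hits
```

Windows flagged this way are only candidates; atypical composition alone does not establish horizontal origin.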
The MetaCyc database is a comprehensive resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. Pathway reactions are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes, and literature citations.
The Microbe Browser is a web server providing comparative microbial genomics data integrated from GenBank, RefSeq, UniProt, InterPro, Gene Ontology and the Orthologs Matrix Project (OMA) databases. Gene predictions based on five software packages are also displayed.
The MiST2 database identifies and catalogs the repertoire of signal transduction proteins in microbial genomes. These are identified by searching protein sequences for specific domain profiles that implicate a protein in signal transduction. MiST2 contains a host of new features and improvements including the following: draft genomes; extracytoplasmic function (ECF) sigma factor protein identification; enhanced classification of signaling proteins; novel, high-quality domain models for identifying histidine kinases and response regulators; neighboring two-component genes; gene cart; better search capabilities; enhanced taxonomy browser; advanced genome browser; and a modern, biologist-friendly web interface.