AGenDA is a web tool that compares the genomic sequences from evolutionarily related organisms in order to make gene predictions. It takes pairs of genomic sequences as input, aligns the sequences, and makes predictions based on splice signals, start and stop codons, and areas of conserved sequence.
Babelomics is a suite of web tools for the functional annotation and analysis of groups of genes in high-throughput experiments. Tools include: FatiGO, FatiGOplus, Fatiscan, Gene Set Enrichment Analysis (GSEA), Marmite, and the Tissues Mining Tool (TMT). Other tools include Biocarta pathways, Transfac and a tool for de novo functional annotation of sequences.
The Berkeley Phylogenomics Group provides a series of web servers for phylogenomic analysis: classification of sequences to pre-computed families and subfamilies using the PhyloFacts Phylogenomic Encyclopedia, FlowerPower clustering of proteins sharing the same domain architecture, MUSCLE multiple sequence alignment, SATCHMO simultaneous alignment and tree construction, and SCI-PHY subfamily identification.
BLASTO (BLAST on Orthologous groups) is a modified BLAST tool for searching orthologous group data. It treats each orthologous group as a unit and outputs a ranked list of orthologous groups instead of single sequences.
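The idea of ranking orthologous groups rather than individual sequences can be sketched as follows. This is a minimal illustration, not BLASTO's actual scoring scheme: it assumes each group is scored by the best bit score among its member hits, and the names `rank_groups`, `hits` and `seq_to_group` are hypothetical.

```python
def rank_groups(hits, seq_to_group):
    """Collapse per-sequence hits into a ranked list of orthologous groups.

    hits          -- iterable of (sequence_id, score) pairs, e.g. parsed BLAST output
    seq_to_group  -- dict mapping sequence_id to an orthologous group id
    Assumption: a group's score is the best score of any member sequence.
    """
    best = {}
    for seq_id, score in hits:
        group = seq_to_group.get(seq_id)
        if group is None:
            continue  # hit is not assigned to any orthologous group
        if group not in best or score > best[group]:
            best[group] = score
    # Highest-scoring group first
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


example_hits = [("a1", 50.0), ("b1", 80.0), ("a2", 120.0), ("x9", 200.0)]
example_map = {"a1": "OG1", "a2": "OG1", "b1": "OG2"}
ranked = rank_groups(example_hits, example_map)
```

Here the ungrouped hit `x9` is dropped, and `OG1` outranks `OG2` because its best member scores higher.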
CATH is a manually curated classification of protein domain structures. Each protein has been chopped into structural domains and assigned to homologous superfamilies (groups of domains that are related by evolution). This classification procedure uses a combination of automated and manual techniques, which include computational algorithms, empirical and statistical evidence, literature review and expert analysis.
The Cyber infrastructure for Fusarium (CiF) consists of Fusarium-ID, Fusarium Comparative Genomics Platform (FCGP) and Fusarium Community Platform (FCP). The Fusarium-ID archives phylogenetic marker sequences from most known species along with information associated with characterized isolates and supports strain identification and phylogenetic analyses. The FCGP currently archives five genomes from four species. The FCGP presents computed characteristics of multiple gene families and functional groups. The Cart/Favorite function allows users to collect sequences from Fusarium-ID and the FCGP and analyze them using multiple tools without requiring repeated copying-and-pasting of sequences.
Clustal X is a version of the Clustal W multiple sequence alignment program with a graphical interface. The display colours allow conserved features to be highlighted for easy viewing in the alignment. It is available for several platforms, including Windows, Macintosh PowerMac, Linux and Solaris.
CONREAL (Conserved Regulatory Elements Anchored Alignment) allows identification of transcription factor binding sites (TFBS) that are conserved between two orthologous promoter sequences.
The ConSurf server allows one to map levels of amino acid conservation to known protein structures in order to study areas of potential functional importance on the surface of the protein. A PDB file is required as input, and a multiple sequence alignment is optional. If an alignment is not provided, ConSurf will build one by performing a search for homologous sequences and then aligning them.
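To make the notion of per-position conservation concrete, the sketch below scores each column of a multiple sequence alignment. It is not ConSurf's scoring method: it swaps in plain Shannon entropy as a much simpler stand-in, and the function name `column_entropy` is hypothetical. Lower entropy means a more conserved column.

```python
from collections import Counter
from math import log2


def column_entropy(alignment):
    """Return the Shannon entropy (in bits) of each alignment column.

    alignment -- list of equal-length aligned sequences, with '-' for gaps
    Gaps are ignored; a column containing only gaps yields None.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(None)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * log2(n / total) for n in counts.values())
        scores.append(entropy)
    return scores


example_alignment = ["ACD", "ACE", "A-D", "ACE"]
entropies = column_entropy(example_alignment)
```

In this toy alignment the first two columns are fully conserved, while the third is split evenly between two residues. Scores like these could then be mapped onto residues of a structure for visual inspection.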
ConSurf 2010 combines ConSurf and ConSeq for an easier, more intuitive interface.
CORUM is a database that provides a manually curated repository of experimentally characterized protein complexes from mammalian organisms, mainly human (64%), mouse (16%) and rat (12%). The CORUM dataset is built from 3198 different genes, representing approximately 16% of the protein coding genes in humans. Each protein complex is described by a protein complex name, subunit composition, function as well as the literature reference that characterizes the respective protein complex. A 'Phylogenetic Conservation' analysis tool allows one to predict the occurrence of protein complexes in different phylogenetic groups.
Composition Vector Tree (CVTree) infers phylogenetic relationships between microbial organisms by comparing their proteomes using a composition vector approach.
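A composition vector comparison can be sketched as below. This is a simplified illustration under stated assumptions, not CVTree's implementation: it uses raw k-mer frequencies without any background correction, and the function names and the default `k=3` are hypothetical choices. Each proteome becomes a vector of k-mer frequencies, and two vectors are compared with a cosine-based distance.

```python
from collections import Counter
from math import sqrt


def composition_vector(proteome, k=3):
    """Map a proteome (list of protein sequences) to k-mer frequencies."""
    counts = Counter()
    for seq in proteome:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no k-mers found; sequences are shorter than k")
    return {kmer: n / total for kmer, n in counts.items()}


def cosine_distance(u, v):
    """Distance (1 - C) / 2, where C is the cosine of the angle between u and v."""
    dot = sum(u[kmer] * v[kmer] for kmer in u.keys() & v.keys())
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return (1.0 - dot / (norm_u * norm_v)) / 2.0


cv_a = composition_vector(["MKVLAA", "MKVLGG"])
cv_b = composition_vector(["WWWWWW"])
d_same = cosine_distance(cv_a, cv_a)
d_diff = cosine_distance(cv_a, cv_b)
```

A matrix of such pairwise distances across many organisms could then be passed to a standard tree-building method such as neighbour joining.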
A Mixer of Protein Domain Analysis Tools (d-Omix) provides tools to analyze, compare and visualize protein data sets with respect to their combinations of protein domains.