The goal is to determine the gene expression profiles of normal, precancer, and cancer cells. Resources for human and mouse include ESTs, gene expression patterns, SNPs, cluster assemblies, cytogenetic information, and tools to query and analyze the data.
CanPredict uses a combination of computational methods to predict whether specific sequence changes in a protein are likely to be cancer-associated mutations.
CaSNP is a database for storing and interrogating quantitative copy number alteration (CNA) data from SNP arrays, covering 34 cancer types in 104 studies. Given a user-supplied region or gene of interest, CaSNP returns CNA information summarizing the frequencies of gain and loss and the averaged copy number for each study, and provides links to download the data or visualize it in the UCSC Genome Browser. For a more comprehensive view, CaSNP also displays a heatmap of the copy numbers estimated at each SNP marker around the query region across all studies.
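The per-study summary described for CaSNP (frequency of gain, frequency of loss, averaged copy number) can be illustrated with a minimal sketch. This is not CaSNP's own code or API: the function name, the input layout, and the gain and loss thresholds of 2.5 and 1.5 copies are all assumptions chosen for illustration.

```python
from statistics import mean

# Assumed cut-offs around the diploid copy number of 2; not taken from CaSNP.
GAIN_THRESHOLD = 2.5
LOSS_THRESHOLD = 1.5


def summarize_cna(copy_numbers_by_study):
    """Summarize copy numbers for one query region, study by study.

    copy_numbers_by_study maps a study name to a list of per-sample
    copy-number estimates for the region (hypothetical input layout).
    """
    summary = {}
    for study, values in copy_numbers_by_study.items():
        n = len(values)
        if n == 0:
            continue  # nothing to summarize for an empty study
        summary[study] = {
            "gain_freq": sum(v > GAIN_THRESHOLD for v in values) / n,
            "loss_freq": sum(v < LOSS_THRESHOLD for v in values) / n,
            "mean_copy_number": mean(values),
        }
    return summary
```

Calling `summarize_cna({"study_1": [2.0, 3.0, 1.0, 2.0]})` would report one gain and one loss among four samples for that hypothetical study.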
CCancer is an automatically collected database of gene lists reported in various studies. The current coverage is 3369 gene lists from 2644 papers. Its enrichment analysis reports the stored gene lists that intersect with a user-supplied gene list.
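The idea of intersecting a query gene list with stored gene lists can be sketched as follows. The entry does not say which statistic CCancer uses, so the hypergeometric test here is an assumption, and the function names, data layout, and `background_size` parameter are hypothetical.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(overlap >= k) when N genes are drawn from M, of which n are in the stored list."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def intersecting_lists(query, stored_lists, background_size):
    """Return (list name, shared genes, p-value) for every stored list that overlaps the query."""
    query = set(query)
    results = []
    for name, genes in stored_lists.items():
        genes = set(genes)
        overlap = query & genes
        if not overlap:
            continue  # report only lists that actually intersect
        p = hypergeom_sf(len(overlap), background_size, len(genes), len(query))
        results.append((name, sorted(overlap), p))
    # Most significant overlaps first
    return sorted(results, key=lambda r: r[2])
```

The background size (the number of genes that could have appeared in any list) strongly affects the p-values, so it should be chosen to match the platform or genome the lists were drawn from.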
COSMIC curates comprehensive information on somatic mutations in human cancer. Release v48 (July 2010) describes over 136,000 coding mutations in almost 542,000 tumour samples; of the 18,490 genes documented, 4803 (26%) have one or more mutations. Full scientific literature curations are available on 83 major cancer genes and 49 fusion gene pairs. Biomart allows more automated data mining and integration with other biological databases. Annotation of genomic features has become a significant focus. COSMIC integrates many diverse types of mutation information and is making much closer links with Ensembl and other data resources.
The database of Chromosomal Rearrangements In Diseases (dbCRID, http://dbCRID.biolead.org) documents human chromosomal rearrangement (CR) events and their associated diseases. For each reported CR event, dbCRID records the type of the event, the associated disease or symptoms, and, when possible, detailed information about the event, including precise breakpoint positions, junction sequences, the genes and gene regions disrupted, and the experimental techniques used to discover or analyze it.
The database of differentially expressed proteins in human cancers (dbDEPC) collects curated cancer proteomics data, provides a resource for information on protein-level expression changes, and supports exploration of protein profile differences among cancers. dbDEPC currently contains 1803 proteins differentially expressed in 15 cancers, curated from 65 mass spectrometry (MS) experiments in peer-reviewed publications.
GeneHub-GEPIS is a tool for inferring human and mouse gene expression patterns based on normalized EST abundance in various normal and cancerous tissues.
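As an illustration of what "normalized EST abundance" can mean, the sketch below scales a gene's EST count in each tissue by the total number of ESTs sequenced from that tissue. The entry does not specify GeneHub-GEPIS's actual normalization, so the per-million scaling, the function name, and the input layout are assumptions.

```python
def normalized_est_abundance(gene_est_counts, library_sizes, scale=1_000_000):
    """Express one gene's EST counts per `scale` ESTs sequenced in each tissue.

    gene_est_counts: tissue name -> number of ESTs matching the gene
    library_sizes:   tissue name -> total ESTs sequenced from that tissue
    (both layouts are hypothetical)
    """
    abundance = {}
    for tissue, count in gene_est_counts.items():
        total = library_sizes.get(tissue, 0)
        if total <= 0:
            continue  # skip tissues with no usable library size
        abundance[tissue] = count * scale / total
    return abundance
```

Normalizing by library size makes counts comparable between tissues that were sequenced to different depths, for example a normal and a cancerous sample of the same tissue.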
GeneSigDB is a manually curated database of gene expression signatures. It focuses on cancer, development, and stem cell gene signatures and was constructed from thousands of publications, from which the gene signatures were manually transcribed. Gene signatures are mapped to the genome to extract standardized lists of EnsEMBL gene identifiers. GeneSigDB provides the original gene signature, the standardized gene list, and a fully traceable mapping history for each gene, from the original transcribed data table through to the standardized list of genes. GeneSigDB release 3.0 (December 2010) contained over 2,000 gene signatures.
The Network of Cancer Genes (NCG) collects and integrates data on 736 human genes that are mutated in various types of cancer. For each gene, NCG provides information on duplicability, orthology, evolutionary appearance and topological properties of the encoded protein in a comprehensive version of the human protein-protein interaction network. NCG also stores information on all primary interactors of cancer proteins, thus providing a complete overview of 5357 proteins that constitute direct and indirect determinants of human cancer.
Onto-Tools is a suite of tools for data mining based on information from Gene Ontology (GO). Onto-Tools includes an annotation database and the data mining tools: Onto-Express, Onto-Compare, Onto-Design, Onto-Translate, Onto-Miner, Pathway-Express, Promoter-Express, nsSNPCounter, TAQ, and OE2GO; free registration is required.
The Selective Targets database is a curated and growing collection of public mutation data on mononucleotide repeat tracts (MNRs) in microsatellite-unstable human tumors. Regression calculations for various high microsatellite instability (MSI-H) tumor entities identify statistically deviant mutation frequencies and thereby predict genes that are shown or strongly suspected to be involved in MSI tumorigenesis. Many tools for further analyzing genomic DNA and the derived wild-type and mutated cDNAs and peptides are integrated. A comprehensive database of all human coding, untranslated, non-coding RNA, and intronic MNRs (MNR_ensembl) is also included.