The allele frequency net database is an online repository that contains information on the frequencies of immune genes and their corresponding alleles in different populations. At present, the system contains data on the frequency of genes from different polymorphic regions such as human leukocyte antigens, killer-cell immunoglobulin-like receptors, major histocompatibility complex Class I chain-related genes and a number of cytokine gene polymorphisms.
AntigenDB contains 500 antigens from pathogenic species, curated from the literature and other immunological resources. Each database entry contains information on the sequence, structure, origin, etc. of an antigen, with additional information such as B and T-cell epitopes, MHC binding, function, gene expression and post-translational modifications, where available. AntigenDB also provides links to major internal and external databases.
ASPicDB provides a unique annotation resource of human protein variants generated by alternative splicing. A total of 256,939 protein variants from 17,191 multi-exon genes have been extensively annotated with machine learning tools, providing information on the protein type (globular or transmembrane), localization, and the presence of PFAM domains, signal peptides, GPI-anchor propeptides, transmembrane segments and coiled-coil segments. Furthermore, full-length variants can now be specifically selected based on the annotation of CAGE-tags and of polyA signals and/or polyA sites, which mark transcription initiation and termination sites, respectively. Retrieval can be carried out at the gene, transcript, exon, protein or splice site level, allowing the selection of data sets that fulfil one or more criteria set by the user.
ConsensusPathDB is a meta-database that integrates physical protein interactions, metabolic and signaling reactions and gene regulatory interactions in a seamless functional association network that simultaneously describes multiple functional aspects of genes, proteins, complexes, metabolites, etc. ConsensusPathDB offers different ways of utilizing these integrated interaction data, with tools for visualization, analysis and interpretation of high-throughput expression data in the light of functional interactions and biological pathways.
COSMIC curates comprehensive information on somatic mutations in human cancer. Release v48 (July 2010) describes over 136,000 coding mutations in almost 542,000 tumour samples; of the 18,490 genes documented, 4803 (26%) have one or more mutations. Full scientific literature curations are available on 83 major cancer genes and 49 fusion gene pairs. BioMart allows more automated data mining and integration with other biological databases. Annotation of genomic features has become a significant focus. COSMIC integrates many diverse types of mutation information and is building much closer links with Ensembl and other data resources.
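As a quick check of the proportion quoted above, the share of documented genes with at least one mutation can be recomputed from the two counts given in the text. The variable names below are illustrative only and are not part of any COSMIC interface.

```python
# Counts taken directly from the release v48 summary above.
genes_documented = 18_490
genes_mutated = 4_803

# Fraction of documented genes carrying one or more mutations.
fraction_mutated = genes_mutated / genes_documented

# Rounded to the nearest whole percent, as quoted in the text.
percent_mutated = round(fraction_mutated * 100)

print(f"{percent_mutated}% of documented genes have one or more mutations")
```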
A database of differentially expressed proteins in human cancers (dbDEPC) collects curated cancer proteomics data, provides a resource for information on protein-level expression changes, and explores protein profile differences among different cancers. dbDEPC currently contains 1803 proteins differentially expressed in 15 cancers, curated from 65 mass spectrometry (MS) experiments in peer-reviewed publications.
The Functional Annotation Of the Mammalian Genome (FANTOM) is a database for the transcriptional network that regulates macrophage differentiation. Data comes from cap analysis of gene expression (CAGE), sequencing mRNA 5'-ends with a second-generation sequencer to quantify promoter activities even in the absence of gene annotation. Additional genome-wide experiments complement the setup including short RNA sequencing, microarray gene expression profiling on large-scale perturbation experiments and ChIP-chip for epigenetic marks and transcription factors.
The Frequency of INherited Disorders database (FINDbase) records frequencies of causative genetic variations worldwide. Database records include the population and ethnic group or geographical region, the disorder name and the related gene, accompanied by links to any related external resources, and the genetic variation together with its frequency in that population. Other features include: (i) the systematic collection and thorough documentation of population- and ethnic group-specific allele frequencies for markers in genes of pharmacogenomic interest, drawn from different classes of drug-metabolizing enzymes and transporters and representing 150 populations and ethnic groups worldwide; (ii) new data querying and visualization tools for the expanded FINDbase data collection, which facilitate querying large data sets and visualizing the results; and (iii) the establishment of the first database journal, by affiliating FINDbase with the journal Human Genomics and Proteomics.
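The record layout described above can be pictured as a small data structure. The following is a minimal sketch only: the class name, field names and helper function are hypothetical, chosen to mirror the fields listed in the text, and do not reflect the actual FINDbase schema. All values are placeholders, not real frequency data.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FrequencyRecord:
    """Hypothetical record mirroring the fields listed in the text."""
    population: str
    group_or_region: str      # ethnic group or geographical region
    disorder: str
    gene: str
    variation: str
    frequency: float          # frequency of the variation in this population
    external_links: Tuple[str, ...] = ()


def records_for_gene(records: List[FrequencyRecord], gene: str) -> List[FrequencyRecord]:
    """Return the records for one gene, highest frequency first."""
    hits = [r for r in records if r.gene == gene]
    return sorted(hits, key=lambda r: r.frequency, reverse=True)


# Placeholder records (invented for illustration, not real data).
RECORDS = [
    FrequencyRecord("POPULATION_1", "REGION_A", "DISORDER_X", "GENE_A", "VARIANT_1", 0.02),
    FrequencyRecord("POPULATION_2", "REGION_B", "DISORDER_X", "GENE_A", "VARIANT_1", 0.05),
    FrequencyRecord("POPULATION_1", "REGION_A", "DISORDER_Y", "GENE_B", "VARIANT_2", 0.01),
]

for record in records_for_gene(RECORDS, "GENE_A"):
    print(record.population, record.variation, record.frequency)
```

Keeping the population and the frequency in the same record reflects the point made in the text: a frequency is only meaningful together with the population in which it was measured.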
The goal of the Gene Wiki is to build a gene-specific review article for every gene in the human genome, where each article is collaboratively written, continuously updated and community reviewed. Gene Wiki articles are freely accessible within the Wikipedia web site.
The H-Invitational Database (H-InvDB) is a comprehensive annotation resource of human genes and transcripts. The latest release of H-InvDB (release 6.2) provides annotation for 219,765 human transcripts in 43,159 human gene clusters based on human full-length cDNAs and mRNAs. H-InvDB now provides several new annotation features, such as mapping of microarray probes, new gene models, relation to known ncRNAs and information from the Glycogene database. H-InvDB also provides useful data mining resources: 'Navigation search', 'H-InvDB Enrichment Analysis Tool (HEAT)' and web service APIs.
The human lung cancer database (HLungDB) integrates lung cancer-related genes, proteins and miRNAs with the corresponding clinical information. Currently, we have collected, through text mining, 2585 genes and 212 miRNAs with experimental evidence of involvement in the different stages of lung carcinogenesis. The results from analysis of transcription factor-binding motifs, promoters and SNP sites for each gene are also included, as are genes with epigenetic regulation.
The Hormone Receptor Target Binding Loci database (HRTBLDb) contains hormone receptor binding regions (binding loci) from in vivo ChIP-based high-throughput experiments, as well as in silico (computationally predicted) binding motifs and cis-regulatory modules for the co-occurring transcription factor binding motifs within a binding locus. It also contains individual binding sites whose regulatory action has been verified by in vitro experiments.
The Indian Genetic Disease Database (IGDD) is an integrated and curated repository of mutation data on common genetic diseases afflicting the Indian populations. Information on locus heterogeneity, type of mutation, clinical and biochemical data, geographical location and common mutations is provided, based on the published literature. The database can be searched by disease of interest, causal gene, type of mutation and geographical location of the patients or carriers.
The IMGT/HLA database provides a searchable repository of highly curated HLA sequences. The naming of these HLA genes and alleles and their quality control is the responsibility of the WHO Nomenclature Committee for Factors of the HLA System. Through the work of the HLA Informatics Group and in collaboration with the European Bioinformatics Institute, we are able to provide public access to this data.
The Immune Epitope Database (IEDB) provides a catalog of experimentally characterized B and T cell epitopes, as well as data on Major Histocompatibility Complex (MHC) binding and MHC ligand elution experiments. The database represents the molecular structures recognized by adaptive immune receptors and the experimental contexts in which these molecules were determined to be immune epitopes. Epitopes recognized in humans, nonhuman primates, rodents, pigs, cats and all other tested species are included. Both positive and negative experimental results are captured. The database can be queried by epitope structure, source organism, MHC restriction, assay type or host organism, among other criteria.
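The query criteria listed above can be illustrated with a small in-memory filter. This is a sketch under stated assumptions, not the IEDB interface: the record class, its field names and the query function are hypothetical, and all values are placeholders rather than real epitope data. The boolean field shows how positive and negative results could sit side by side in one collection.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class AssayRecord:
    """Hypothetical record with one field per query criterion in the text."""
    epitope: str
    source_organism: str
    mhc_restriction: str
    assay_type: str
    host_organism: str
    positive: bool            # True for a positive result, False for a negative one


def query(records: List[AssayRecord], **criteria) -> List[AssayRecord]:
    """Return the records matching every supplied field=value criterion."""
    known = {f.name for f in fields(AssayRecord)}
    unknown = set(criteria) - known
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    return [
        r for r in records
        if all(getattr(r, name) == value for name, value in criteria.items())
    ]


# Placeholder records (invented for illustration, not real data).
RECORDS = [
    AssayRecord("PEPTIDE_1", "ORGANISM_A", "MHC_X", "ASSAY_T", "HOST_A", True),
    AssayRecord("PEPTIDE_2", "ORGANISM_A", "MHC_X", "ASSAY_B", "HOST_A", False),
    AssayRecord("PEPTIDE_3", "ORGANISM_B", "MHC_Y", "ASSAY_T", "HOST_B", True),
]

for record in query(RECORDS, host_organism="HOST_A", positive=False):
    print(record.epitope, record.assay_type)
```

Criteria combine with a logical AND, so adding a criterion can only narrow the result set; rejecting unknown field names avoids silently returning everything when a criterion is misspelled.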