3D-footprint provides estimates of binding specificity for all protein-DNA complexes available in the Protein Data Bank. The web interface allows the user to: (i) browse DNA-binding proteins by keyword; (ii) find proteins that recognize a similar DNA motif; and (iii) BLAST similar DNA-binding proteins, highlighting interface residues in the resulting alignments. Comparisons with the expert-curated databases RegulonDB and TRANSFAC support the quality of the structure-based estimates of specificity.
BioLit is a web server resource that integrates scientific publications with existing biological databases. To make these links, BioLit searches the full text of each article for metadata such as database identifiers and ontology terms.
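The idea of scanning article text for identifiers can be illustrated with a minimal Python sketch. The two regular expressions and the `find_identifiers` helper are assumptions made for this example; they are not BioLit's actual rules, and a real system would need far more careful patterns to avoid false matches.

```python
import re

# Illustrative patterns only (assumptions, not BioLit's real matching rules).
# GO terms follow the fixed "GO:" + 7 digits form.
GO_TERM = re.compile(r"\bGO:\d{7}\b")
# PDB IDs are 4 characters starting with a digit; here we only accept them
# when the text explicitly says "PDB" or "PDB ID" first, to limit false hits.
PDB_ID = re.compile(r"\bPDB(?:\s+ID)?:?\s*([0-9][A-Za-z0-9]{3})\b", re.IGNORECASE)

def find_identifiers(text):
    """Return the GO terms and PDB IDs mentioned in a block of article text."""
    return {
        "go_terms": sorted(set(GO_TERM.findall(text))),
        "pdb_ids": sorted({m.upper() for m in PDB_ID.findall(text)}),
    }

sample = (
    "The structure (PDB ID: 1abc) was annotated with GO:0003677 "
    "and GO:0006355; see also PDB 2XYZ."
)
print(find_identifiers(sample))
```

Requiring the literal "PDB" prefix trades recall for precision, since a bare four-character token such as "2009" would otherwise match.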
Chilibot searches the PubMed literature database based on specific relationships between proteins, genes, or keywords. The results are returned as a graph.
ChimerDB is a knowledgebase of fusion genes identified from bioinformatics analysis of transcript sequences in GenBank and various other public resources such as the Sanger Cancer Genome Project (CGP), OMIM, PubMed and the Mitelman database. A new, more sensitive algorithm has detected 2699 fusion transcripts with high confidence. Furthermore, it can identify interchromosomal translocations as well as intrachromosomal deletions or inversions of large DNA segments. Results from the analysis of next-generation sequencing data in the short read archives are incorporated along with a new alignment viewer.
eTBLAST is a textual similarity search engine. This server can parse and summarize the results of an abstract similarity search to find appropriate journals for publication, authors with expertise in a given field, and documents similar to a submitted query.
GenoWatch is a disease gene mining browser for association studies. It provides a real-time batch SNP and short tandem repeat polymorphism pipeline that extracts current information from public domain websites such as NCBI, UniProt, KEGG and GO so that users can select appropriate disease candidate genes.
HubMed uses information from PubMed's database, provided by the NCBI through the EUtils web service, to produce a search interface focused on browsing, organising and gathering information from the biomedical literature.
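For readers unfamiliar with EUtils, the following Python sketch shows how a PubMed search request to the ESearch endpoint can be assembled. It only builds the URL and makes no network call; the `pubmed_search_url` helper and the example query are assumptions for illustration and do not reflect how HubMed itself is implemented.

```python
from urllib.parse import urlencode

# Public base URL of the NCBI E-utilities ESearch endpoint.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_search_url(term, retmax=20):
    """Build an ESearch URL that asks PubMed for up to `retmax` matching IDs."""
    params = {
        "db": "pubmed",      # database to search
        "term": term,        # query string, using PubMed search syntax
        "retmax": retmax,    # maximum number of IDs to return
        "retmode": "json",   # ask for a JSON response
    }
    return ESEARCH + "?" + urlencode(params)

print(pubmed_search_url("laminin AND review[pt]"))
```

Fetching the resulting URL with any HTTP client returns a list of PubMed IDs, which a front end can then pass to other EUtils calls to retrieve records.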
iHOP (Information Hyperlinked over Proteins) allows researchers to explore a network of gene and protein interactions based on published scientific literature. For each gene search, iHOP reports sentences from abstracts associating it with other genes, links out to full abstracts, and reports experimental evidence for the interactions, if available. You can also select sentences to create and visualize your own gene model.
The Laminin(LM)-database is a database focusing on the non-collagenous extracellular matrix protein family, the LMs. The homepage is subdivided into LMs, receptors, extracellular binding and other related proteins. Each tab opens into a given LM or LM-related molecule, where the reader finds a series of further tabs for 'protein', 'gene structure', 'gene expression', 'tissue distribution' and 'therapy'. Data are separated by species, comprising Homo sapiens, Mus musculus and Rattus norvegicus.
LitInspector is a literature search tool that provides gene homonym mining within the PubMed database. Search terms are highlighted in the results. LitInspector also performs signal transduction pathway mining using a manually curated database of pathway names, pathway components and pathway keywords.
LitMiner is a literature data mining tool. It annotates key terms in article abstracts and then applies statistical co-citation analysis to the annotated terms in order to predict relationships between genes, compounds, diseases and phenotypes, and tissues and organs.
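As a rough illustration of the counting step behind this kind of analysis, the Python sketch below tallies how often pairs of annotated terms appear in the same abstract. It is a simplified co-occurrence count, not LitMiner's statistical method; the `cooccurrence_counts` function and the toy term sets are assumptions made up for this example.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(annotated_abstracts):
    """Count, for every pair of terms, the number of abstracts containing both."""
    counts = Counter()
    for terms in annotated_abstracts:
        # Sort so each pair has one canonical order, e.g. ("BRCA1", "TP53").
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts

# Toy input: each set holds the key terms annotated in one abstract.
abstracts = [
    {"TP53", "apoptosis", "liver"},
    {"TP53", "apoptosis"},
    {"BRCA1", "TP53"},
]
counts = cooccurrence_counts(abstracts)
print(counts.most_common(1))
```

A real analysis would compare these raw counts against how often each term occurs on its own, so that very common terms do not dominate the predicted relationships.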
The National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data made available through the NCBI web site.
NLProt is a tool for finding protein names in natural language text. This data-mining method is useful for extracting protein UniProt IDs from research articles for the construction of custom datasets and/or databases.