Patent Informatics: Sequence & Chemical Databases for Prior Art Searching

Workshop Details

Date: April 19, 2012 or April 20, 2012
Location: Downtown Toronto, ON
Lead Faculty (2012): Jennifer McDowall and Louisa Bellis (European Bioinformatics Institute)
Registration Fee for Applications received before April 5, 2012: $100 + HST
Registration Fee for Applications received after April 5, 2012: $200 + HST


Target Audience
This workshop is intended for those involved in biotechnology patenting, including patent searchers, patent attorneys, patent agents, patent examiners, technology transfer officers, scientific researchers, and company executives.

Prerequisites: Basic familiarity with computer databases, and your own laptop computer. If you do not have access to a laptop, you may borrow one from the CBW. Please contact course_info@bioinformatics.ca for more information.


Course Objectives
Patenting biotechnological inventions, including medicines, recombinant DNA, diagnostic and therapeutic sequences, and agriculturally useful sequences, is a growing industry. Whether you are working on patentable biotechnology, or wish to determine your freedom to operate in a specific field, at some point you will need to search patent databases to determine the scope of protection available. There are a number of free-to-use patent sequence and chemical databases that can provide the information you require without paying large subscription fees.

The CBW will host a 1-day workshop covering key patent sequence and chemical databases that are freely available to the public. Through international agreements, the patent sequence databases at the European Bioinformatics Institute (EBI) cover patented protein and nucleotide sequences from the US Patent and Trademark Office (USPTO), the European Patent Office (EPO), the Japan Patent Office (JPO), and the Korean Intellectual Property Office (KIPO). New sequences are also being introduced by the Canadian Intellectual Property Office (CIPO) and the State Intellectual Property Office in China (SIPO). These databases contain over 23.1 million nucleotide and 6.3 million protein sequences. The workshop will focus on how to search these databases using both text and sequence search methods.

The workshop will also cover the major chemical databases ChEMBL and ChEBI, which hold bioactivity data and literature references for over 1.2 million compounds and patent information for over 5,200 compounds that have been synthesized over the past 30 years. ChEMBL is a database of drug-like small molecules with associated bioactivity data that have been curated from primary literature sources. ChEBI (Chemical Entities of Biological Interest) is a dictionary of small chemical compounds. This workshop will provide an overview of ChEBI and ChEMBL, including searching using text and chemical structures.



Course Outline

Day 1
Module 1: Patent Sequence Search Strategies (Jennifer McDowall)

  • Overview of non-redundant patent protein/nucleotide sequence databases
  • Advantages of grouping sequences by patent families
  • Obtaining non-redundant sequence lists for patent documents
  • Sequence search strategies: which algorithm is best to use?

Exercises: Participants will conduct sequence searches and explore how different algorithms and parameters affect the search results. Text-based searches will also be explored.


Module 2: Chemical Databases (Louisa Bellis)

  • Overview of ChEMBL and ChEBI databases
  • Compound patent searching using ChEBI and ChEMBL
  • Combining structure drawing, text and compound identifiers as search tools

Exercises: Participants will conduct text- and structure-based searches and explore how different search methods affect the results. The data held in the ChEMBL and ChEBI databases will also be explored further.