Sequence Hybridization Language in Patent Claims

22 March 2018 by Cambia Staff in Frequent Questions

There are many ways to define how far a claim extends towards related gene, nucleotide, or protein sequences. One way is a direct comparison of two aligned sequences to determine whether they are "similar" enough to each other to satisfy the claim language. Typically this requires a quantifiable comparison, often in the form of "% identity":

US 2005/50583 (assigned to Genesis/Agrigenesis; according to press releases, now subject to the rights of ArborGen), Claim 3: An isolated polynucleotide comprising a sequence selected from the group consisting of: (a) sequences having at least 75% identity to a sequence of SEQ ID NO: 1-67, 131-481, 833-888, 946-952 and 960-974;

In this case, if a compared sequence (even a sequence not in the possession of the patentee, such as one discovered in another species) is at least 75% identical to a listed sequence, it could potentially be subject to the patent applicant's claims. Alternatively, subject to a number of detailed caveats (for the US see 35 USC 102), if a sequence at least 75% identical was made public or submitted for patent by someone else prior to the priority date of the patent application ("prior art"), the patent claim may be invalid.

Such a comparison is technically straightforward. However, for a long sequence listing with more than a handful of sequences (almost 500 are claimed here), a thorough analysis takes a large amount of time, both for the examiner determining whether the claim is valid and for a researcher or member of the public determining whether any particular sequence might infringe.
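As a rough illustration of a "% identity" comparison, the sketch below scores two sequences that have already been aligned (gaps shown as "-"). The function names and the choice to ignore gapped columns are assumptions made for this example; alignment tools and patent specifications differ in how they count gaps and which length they divide by, so the method stated in the patent document itself should always take precedence.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap.

    Assumption for this sketch: gapped columns are skipped entirely.
    Other conventions divide by the full alignment length or by the
    length of the shorter sequence.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 75.0) -> bool:
    """True if the pair reaches the claimed identity threshold (75% in the example claim)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

For a claim such as the one above, a check like this would have to be repeated against every listed SEQ ID NO, which is where the time cost for examiners and researchers arises.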
Even more difficult for the patent examiner and the public to assess, when trying to define clearly the bounds of what is subject to patent claims, is claim language such as:

US 2005/91708 (assigned to Dow Chemical), Claim 1: An isolated nucleic acid selected from the group consisting of SEQ ID NOs: 1-7554 and nucleic acid sequences that hybridize to any thereof under conditions of low stringency, wherein expression of said isolated nucleic acid in a plant results in an altered metabolic characteristic.

Hybridisation experiments would seem to be a definitive way to define similarity in patent documents. Unfortunately, the patent office lacks tools to perform fast and accurate sequence comparisons based on such language without the use of a research laboratory. What is more, a skilled practitioner will realise that many (possibly only distantly related!) nucleic acid sequences will hybridise together under low stringency conditions. At some frequency, any DNA at all will hybridise to the claimed sequence under any stringency conditions. Hence hybridisation language, when used in patent documents, has the potential to create uncertainty about the scope of the claim.

The probability that an applicant will achieve such claims to all 7554 sequences is small. However, it is possible not only that some of these sequences will be successfully claimed in issued patents, but also that most will remain pending in future applications. The uncertainty with this claim is two-fold:
  • Which sequences are going to be successfully claimed in future?
  • Will the applicant also achieve claims to related sequences? On the face of it, this claim would apply to sequences from any other species (not necessarily a plant species, and not necessarily sequences in the possession of the patentee) that hybridise to any of the 7554 claimed sequences and alter anything about a plant when expressed in the plant.
Below we explain, for those unfamiliar with the concept, how hybridisation works and how it is often presented in a patent application or patent.

The Basis for Hybridisation Language

DNA is a biological polymer consisting of "simple" repeated subunits, or bases. It is within the sequence of these bases that genetic information is stored. Although DNA can exist as a single strand in solution, it is generally found within living systems together with its complementary strand (see the figure below). If we consider a simple representational diagram of a DNA molecule, consisting of a sequence of the bases A, G, C, and T, we can see that some of the bases interact with one another. The bases A, T, G, and C form hydrogen bonds with their respective partner bases T, A, C, and G in the complementary strand. Under normal circumstances these bonds stabilise double-stranded DNA (dsDNA), preventing it from dissociating into single-stranded DNA (ssDNA). Significantly, dsDNA can be "melted" to produce ssDNA under certain physical circumstances. This melting is influenced by many factors, such as temperature, salt concentration, organic solvents (for example formamide), sequence composition, and (importantly) the degree of sequence similarity or mismatch between the ssDNA molecules.
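The pairing rule described above (A with T, G with C) can be sketched in a few lines. The helper names are hypothetical and the example only counts positions that fail to pair; it does not model the thermodynamics of real duplexes.

```python
# Watson-Crick pairing partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def percent_mismatch(probe: str, target: str) -> float:
    """Percentage of positions at which probe and target fail to base-pair.

    Both strands are given 5' to 3' and must have the same length;
    the target is reverse-complemented so it can be compared position
    by position with the probe.
    """
    if len(probe) != len(target):
        raise ValueError("strands must have equal length")
    partner = reverse_complement(target)
    mismatches = sum(1 for p, q in zip(probe.upper(), partner) if p != q)
    return 100.0 * mismatches / len(probe)
```

A perfectly complementary pair gives 0% mismatch; each unpaired position raises the figure and, as discussed in the next section, lowers the temperature at which the duplex melts.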

The Concept of Melting Temperature (Tm)

Mathematical relationships between the various physical parameters listed above and the melting point of DNA have been established. Historically, the point at which 50% of the DNA molecule is ssDNA is referred to as the melting temperature, or Tm. The Tm for a DNA sequence can be estimated via a commonly used calculation whose terms are:

  • Tm = melting temperature in °C
  • [Na+] = molar concentration of sodium ions
  • %[G+C] = percent of G+C bases in the DNA sequence
  • n = length of the DNA sequence in bases
  • P = temperature correction for % mismatched base pairs (~1°C per 1% mismatch)
  • F = correction for formamide concentration (= 0.63°C per 1% [formamide])

Note: A particular DNA sequence will have different Tm values under different conditions. This calculation is sometimes referred to as the effective Tm.

The fact that ssDNA can re-anneal to form dsDNA is used in the process called hybridisation.

NOTE that the Tm refers to 50% hybridisation, not complete hybridisation!
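To make the terms of the Tm calculation concrete, here is a minimal sketch using one widely quoted textbook form: Tm = 81.5 + 16.6 log10[Na+] + 0.41(%[G+C]) - 600/n - P - F. The constants 81.5, 16.6, 0.41, and 600 are assumptions, as they are not stated in the text above, and the length term in particular appears in the literature as 500/n, 600/n, or 675/n. Only the P and F coefficients (about 1°C per 1% mismatch, 0.63°C per 1% formamide) come from the definitions given here. Function and parameter names are hypothetical.

```python
import math


def gc_percent(seq: str) -> float:
    """Percent of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(na_molar: float, gc_pct: float, length: int,
                mismatch_pct: float = 0.0, formamide_pct: float = 0.0,
                length_constant: float = 600.0) -> float:
    """Estimate the effective Tm in degrees Celsius.

    ASSUMED constants (not from the source text): 81.5, 16.6, 0.41 and
    the length_constant of 600. From the source text: ~1 degree per 1%
    mismatch (P) and 0.63 degrees per 1% formamide (F).
    """
    p = 1.0 * mismatch_pct      # P: mismatch correction
    f = 0.63 * formamide_pct    # F: formamide correction
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_pct
            - length_constant / length
            - p
            - f)
```

With these assumed constants, a 600-base sequence of 50% G+C in 1 M sodium comes out at about 101°C; the same sequence with 10% mismatch in 50% formamide comes out at about 59.5°C, which shows how strongly the stated conditions shift the result.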

A typical hybridisation experiment consists of a number of steps:
  1. Binding of target ssDNA to a solid support (often a thin nitrocellulose, nylon, or polymer membrane), referred to here as a "blot".
  2. Blocking any remaining free binding sites for DNA on the blot.
  3. Labeling of the "query" ssDNA or probe.
  4. Creation of a hybridisation solution containing the probe at a known concentration of salt, formamide,...
  5. Equilibrating the "blot" with the hybridisation solution at a defined temperature (for a specified length of time).
  6. Washing the unhybridised probe off the blot using a specific wash protocol;  the wash conditions are actually more important than the hybridisation conditions for determining what will remain hybridised, but are less often specified in patent applications!
  7. Detecting the bound probe by some means.
If the probe (or query DNA) is similar enough to the target ssDNA (on the blot), then the probe will remain bound to it after the wash step.

"Stringency" refers to the concentrations and temperatures in steps 4-6 above, and it is most often "stringency" that is referred to in order to determine the scope of claims. Strictly speaking, this should refer to the stringency of the wash step required to leave the probe bound to target sequences covered by the claims; the "query" is often a sequence mentioned in the claims section (as in the example given above).

Although the table below is not legally binding, and indeed patent applications may make their own definitions of stringency levels, the typical usage is roughly:
Stringency | [NaCl] (Molar) | Range from Tm (°C) | Description
High | 0.0165 - 0.0330 | 5 to 10°C below Tm | Only highly similar sequences will bind under these conditions (typically >95%)
Medium/Moderate | 0.165 - 0.330 | 20 to 29°C below Tm | Those sequences above and less homologous ones (typically >80%)
Low | 0.330 - 0.825 | 40 to 48°C below Tm | Sequences with lower homology will bind under these conditions (>50%?)
Notes:
  1. The values here are for the purpose of discussion. Actual values and conditions must be acquired from the patent document specifications.
  2. The "Range from Tm" is the temperature of the wash solution required to achieve the level of stringency and is dependent on the factors described above.
  3. Values in Column 2 are typical only!
  4. Values in Column 3 were obtained from US 2006/0057724 A1.
  5. Values in Column 4 are typical only and sourced from the web. Refer to patent specifications for your details!
  6. Definition of stringency levels may vary between patents (BEWARE). Stringency conditions are usually defined in the patent document!
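As an illustration of how the "Range from Tm" column translates into an actual wash temperature, the sketch below takes a Tm value (however it was estimated) and returns the corresponding temperature window. The offsets are copied from the table above and are for discussion only; the names are hypothetical. The second helper applies the rule of thumb of roughly 1°C per 1% mismatch from the Tm definitions, which is only an approximation and does not reproduce the Description column exactly.

```python
# Offsets below Tm, in degrees Celsius, taken from the discussion table above.
# These are illustrative; a patent document may define its own levels.
STRINGENCY_OFFSETS_C = {
    "high": (5.0, 10.0),
    "medium": (20.0, 29.0),
    "low": (40.0, 48.0),
}


def wash_temperature_range(tm_c: float, stringency: str) -> tuple:
    """Return (coolest, warmest) wash temperature for a stringency level."""
    nearest, furthest = STRINGENCY_OFFSETS_C[stringency.lower()]
    return (tm_c - furthest, tm_c - nearest)


def approx_min_identity(offset_below_tm_c: float,
                        degrees_per_pct_mismatch: float = 1.0) -> float:
    """Rough minimum % identity still bound when washing this far below Tm.

    Assumes about 1 degree Celsius of Tm is lost per 1% mismatch.
    """
    return 100.0 - offset_below_tm_c / degrees_per_pct_mismatch
```

For a sequence with an estimated Tm of 85°C, a high stringency wash falls between 75°C and 80°C, while a low stringency wash falls between 37°C and 45°C.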
Sometimes the patent examiner will require the applicant to stipulate the stringency conditions, either by naming a level (high, medium, or low stringency) or by specifying the [NaCl] and temperature of the wash conditions. Few applicants bother to state the conditions relative to Tm (a value that can be calculated). It is therefore possible that the conditions defined as stringent in one patent document are not as stringent as those defined in another!