Abstract: A sequence of polymer units in a polymer (3), e.g. DNA, is estimated from at least one series of measurements related to the polymer, e.g. ion current as a function of translocation through a nanopore (1), wherein the value of each measurement is dependent on a k-mer, being a group of k polymer units (4). A probabilistic model, especially a hidden Markov model (HMM), is provided comprising, for a set of possible k-mers: transition weightings representing the chances of transitions from origin k-mers to destination k-mers; and emission weightings in respect of each k-mer that represent the chances of observing given values of measurements for that k-mer. The series of measurements is analysed using an analytical technique, e.g. Viterbi decoding, that refers to the model and estimates at least one estimated sequence of polymer units in the polymer based on the likelihood predicted by the model of the series of measurements being produced by sequences of polymer units. In a further embodiment, different voltages are applied across the nanopore during translocation in order to improve the resolution of polymer units.
Analysis of a Polymer comprising Polymer Units
The present invention relates generally to the field of analysing a polymer comprising polymer units, for example but without limitation a polynucleotide, by making measurements related to the polymer. The first aspect of the present invention relates specifically to the estimation of a sequence of polymer units in the polymer. The second and third aspects of the present invention relate to the measurement of ion current flowing through a nanopore during translocation of a polymer for analysis of the polymer.
There are many types of measurement system that provide measurements of a polymer for the purpose of analysing the polymer and/or determining the sequence of polymer units.
For example but without limitation, one type of measurement system utilises a nanopore through which the polymer is translocated. Some property of the system depends on the polymer units in the nanopore, and measurements of that property are taken. For example, a measurement system may be created by placing a nanopore in an insulating membrane and measuring voltage-driven ionic transport through the nanopore in the presence of analyte molecules. Depending on the nature of the nanopore, the identity of an analyte may be revealed through its distinctive ion current signature, notably the duration and extent of current block and the variance of current levels. This type of measurement system using a nanopore has considerable promise, particularly in the field of sequencing a polynucleotide such as DNA or RNA, and has been the subject of much recent development.
There is currently a need for rapid and cheap nucleic acid (e.g. DNA or RNA) sequencing technologies across a wide range of applications. Existing technologies are slow and expensive mainly because they rely on amplification techniques to produce large volumes of nucleic acid and require a high quantity of specialist fluorescent chemicals for signal detection. Nanopore sensing has the potential to provide rapid and cheap nucleic acid sequencing by reducing the quantity of nucleotide and reagents required.
The present invention relates to a situation where the value of each measurement is dependent on a group of k polymer units where k is a positive integer (i.e. a 'k-mer').
Furthermore, it is typical of many types of measurement system, including the majority of currently known biological nanopores, for the value of each measurement to be dependent on a k-mer where k is a plural integer. This is because more than one polymer unit contributes to the observed signal, which might be thought of conceptually as the measurement system having a "blunt reader head" that is bigger than the polymer unit being measured. In such a situation, the number of different k-mers to be resolved increases as the k-th power of the number of possible polymer units. For example, if there are n possible polymer units, the number of different k-mers to be resolved is n^k. While it is desirable to have clear separation between measurements for different k-mers, it is common for some of these measurements to overlap. Especially with high numbers of polymer units in the k-mer, i.e. high values of k, it can become difficult to resolve the measurements produced by different k-mers, to the detriment of deriving information about the polymer, for example an estimate of the underlying sequence of polymer units.
Accordingly, much of the development work has been directed towards the design of a measurement system that improves the resolution of measurements. This is difficult in practical measurement systems, due to variation in measurements that can arise to varying extents from inherent variation in the underlying physical or biological system and/or measurement noise that is inevitable due to the small magnitude of the properties being measured.
Much research has aimed at design of a measurement system that provides resolvable measurements that are dependent on a single polymer unit. However, this has proved difficult in practice.
Other work has accepted measurements that are dependent on k-mers where k is a plural integer, but has aimed at design of a measurement system in which the measurements from different k-mers are resolvable from each other. However practical limitations mean again that this is very difficult. Distributions of signals produced by some different k-mers can often overlap.
In principle, it might be possible to combine information from k measurements, where k is a plural integer, that each depend in part on the same polymer unit to obtain a single value that is resolved at the level of a polymer unit. However, this is difficult in practice. Firstly, this relies on the possibility of identifying a suitable transform to transform a set of k measurements. However, for many measurement systems, due to the complexity of the interactions in the underlying physical or biological system, such a transform either does not exist or is impractical to identify. Secondly, even if such a transform might exist in principle for a given measurement system, the variation in measurements makes the transform difficult to identify and/or the transform might still provide values that cannot be resolved from each other. Thirdly, with such techniques it is difficult or impossible to take account of missed measurements, that is where a measurement that is dependent on a given k-mer is missing in the sequence of polymer units, as can sometimes be the case in a practical measurement system, for example due to the measurement system failing to take the measurement or due to an error in the subsequent data processing.
The first aspect of the present invention is concerned with the provision of techniques that improve the accuracy of estimating a sequence of polymer units in a polymer from such measurements that are dependent on a k-mer.
According to the first aspect of the present invention, there is provided a method of estimating a sequence of polymer units in a polymer from at least one series of measurements related to the polymer, wherein the value of each measurement is dependent on a k-mer, a k-mer being a group of k polymer units where k is a positive integer, the method comprising:
providing a model comprising, for a set of possible k-mers:
transition weightings representing the chances of transitions from origin k-mers to destination k-mers; and
emission weightings in respect of each k-mer that represent the chances of observing given values of measurements for that k-mer; and
analysing the series of measurements using an analytical technique that refers to the model and estimating at least one estimated sequence of polymer units in the polymer based on the likelihood predicted by the model of the series of measurements being produced by sequences of polymer units.
Further according to the first aspect of the present invention, there is provided an analysis apparatus that implements a similar method.
Therefore, the first aspect of the present invention makes use of a model of the measurement system that produces the measurements. Given any series of measurements, the model represents the chances of different sequences of k-mers having produced those measurements. The first aspect of the present invention is particularly suitable for situations in which the value of each measurement is dependent on a k-mer, where k is a plural integer.
The model considers the possible k-mers. For example, in a polymer where each polymer unit may be one of 4 polymer units (or more generally n polymer units) there are 4^k possible k-mers (or more generally n^k possible k-mers), unless any specific k-mer does not exist physically. For all k-mers that may exist, the emission weightings take account of the chance of observing given values of measurements. The emission weightings in respect of each k-mer represent the chances of observing given values of measurements for that k-mer.
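By way of illustration only, the set of possible k-mers and a per-k-mer emission weighting might be sketched as follows. The four-letter alphabet, the function names and the choice of a Gaussian form are assumptions made for this sketch; it is not a definitive implementation.

```python
import itertools
import math

UNITS = "ACGT"  # assumed alphabet: n = 4 naturally occurring DNA bases

def all_kmers(k, units=UNITS):
    """Return all n**k possible k-mers over the given polymer units."""
    return ["".join(p) for p in itertools.product(units, repeat=k)]

def gaussian_emission(x, mean, sd):
    """Emission weighting: Gaussian density of measurement x for one k-mer.

    mean and sd are hypothetical parameters of that k-mer's distribution.
    """
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

kmers = all_kmers(3)
print(len(kmers))  # 64, i.e. 4**3

# A measurement near the expected value is weighted more highly than one far away.
print(gaussian_emission(50.0, mean=50.0, sd=2.0))
print(gaussian_emission(55.0, mean=50.0, sd=2.0))
```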
The transition weightings represent the chances of transitions from origin k-mers to destination k-mers, and therefore take account of the chance of the k-mer on which the measurements depend transitioning between different k-mers. The transition weightings may therefore take account of transitions that are more and less likely. By way of example, where k is a plural integer, for a given origin k-mer the transition weightings may represent a greater chance of preferred transitions, being transitions to destination k-mers that have a sequence in which the first (k-1) polymer units are the final (k-1) polymer units of the origin k-mer, than of non-preferred transitions, being transitions to destination k-mers that have a sequence different from the origin k-mer and in which the first (k-1) polymer units are not the final (k-1) polymer units of the origin k-mer. For example, for 3-mers where the polymer units are naturally occurring DNA bases, state CGT has preferred transitions to GTC, GTG, GTT and GTA. By way of example without limitation, the model may be a Hidden Markov Model in which the transition weightings and emission weightings are probabilities.
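The preferred transitions described above can be sketched in a few lines. The helper names and the value of p_nonpreferred are hypothetical, and sharing the non-preferred weight equally among all other destinations (including remaining on the origin k-mer) is a simplifying assumption of this sketch.

```python
import itertools

UNITS = "ACGT"  # assumed alphabet: the four naturally occurring DNA bases

def preferred_destinations(origin, units=UNITS):
    """Destinations whose first (k-1) units are the final (k-1) units of origin."""
    return [origin[1:] + u for u in units]

def transition_row(origin, p_nonpreferred=0.01, units=UNITS):
    """Transition weightings (as probabilities) from one origin k-mer, k >= 2.

    Simplifying assumption: every destination that is not preferred shares
    the total weight p_nonpreferred equally.
    """
    kmers = ["".join(p) for p in itertools.product(units, repeat=len(origin))]
    preferred = set(preferred_destinations(origin, units))
    n_other = len(kmers) - len(preferred)
    if n_other == 0:
        raise ValueError("k must be at least 2")
    row = {}
    for dest in kmers:
        if dest in preferred:
            row[dest] = (1.0 - p_nonpreferred) / len(preferred)
        else:
            row[dest] = p_nonpreferred / n_other
    return row

row = transition_row("CGT")
print(sorted(preferred_destinations("CGT")))  # ['GTA', 'GTC', 'GTG', 'GTT']
print(row["GTA"], row["AAA"])
```

A table of such rows, one per origin k-mer, together with the emission weightings, makes up a model of the kind described.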
This allows the series of measurements to be analysed using an analytical technique that refers to the model. At least one estimated sequence of polymer units in the polymer is estimated based on the likelihood predicted by the model of the series of measurements being produced by sequences of polymer units. For example but without limitation, the analytical technique may be a probabilistic technique.
In particular, the measurements from individual k-mers are not required to be resolvable from each other, and it is not required that there is a transform from groups of k measurements that are dependent on the same polymer unit to a value in respect of that transform, i.e. the set of observed states is not required to be a function of a smaller number of parameters (although this is not excluded). Instead, the use of the model provides accurate estimation by taking plural measurements into account in the consideration of the likelihood predicted by the model of the series of measurements being produced by sequences of polymer units. Conceptually, the transition weightings may be viewed as allowing the model to take account, in the estimation of any given polymer unit, of at least the k measurements that are dependent in part on that polymer unit, and indeed also on measurements from greater distances in the sequence. The model may effectively take into account large numbers of measurements in the estimation of any given polymer unit, giving a result that may be more accurate.
Similarly, the use of such a model may allow the analytical technique to take account of missing measurements from a given k-mer and/or to take account of outliers in the measurement produced by a given k-mer. This may be accounted for in the transition weightings and/or emission weightings. For example, the transition weightings may represent non-zero chances of at least some of the non-preferred transitions and/or the emission weightings may represent non-zero chances of observing all possible measurements.
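As a minimal sketch of one such analytical technique, the following applies Viterbi decoding to a toy model of 2-mers. The expected levels in LEVEL, the spread SD and the weight P_NONPREF are hypothetical values chosen only for illustration, and the Gaussian emission form is an assumption. Non-preferred transitions are given a small non-zero chance, as discussed above.

```python
import itertools
import math

UNITS = "ACGT"  # assumed alphabet
K = 2
KMERS = ["".join(p) for p in itertools.product(UNITS, repeat=K)]

# Hypothetical expected measurement for each k-mer (illustrative values only).
LEVEL = {kmer: 10.0 * i for i, kmer in enumerate(KMERS)}
SD = 3.0          # assumed spread of measurements about each level
P_NONPREF = 1e-3  # assumed total chance of a non-preferred transition

def log_emission(x, kmer):
    """Log of a Gaussian emission weighting for measurement x given a k-mer."""
    z = (x - LEVEL[kmer]) / SD
    return -0.5 * z * z - math.log(SD * math.sqrt(2.0 * math.pi))

def log_transition(origin, dest):
    """Log transition weighting; preferred if dest starts with origin's tail."""
    if dest[:-1] == origin[1:]:
        return math.log((1.0 - P_NONPREF) / len(UNITS))
    return math.log(P_NONPREF / (len(KMERS) - len(UNITS)))

def viterbi(measurements):
    """Return the most likely series of k-mers for a series of measurements."""
    # score[s] is the best log-likelihood of any path ending in state s.
    score = {s: -math.log(len(KMERS)) + log_emission(measurements[0], s)
             for s in KMERS}
    back = []
    for x in measurements[1:]:
        new_score, pointers = {}, {}
        for dest in KMERS:
            best = max(KMERS, key=lambda o: score[o] + log_transition(o, dest))
            pointers[dest] = best
            new_score[dest] = (score[best] + log_transition(best, dest)
                               + log_emission(x, dest))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(KMERS, key=score.get)
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path

def kmers_to_sequence(path):
    """Collapse a path of overlapping k-mers (preferred transitions only)."""
    return path[0] + "".join(s[-1] for s in path[1:])

true_sequence = "ACGTTGCA"
true_kmers = [true_sequence[i:i + K] for i in range(len(true_sequence) - K + 1)]
measurements = [LEVEL[m] + 1.0 for m in true_kmers]  # small fixed offset as "noise"
print(kmers_to_sequence(viterbi(measurements)))
```

In this toy model a first measurement of 15.0 lies exactly midway between the levels assumed for AC and AG, so it cannot be resolved on its own. If the next measurement is 60.0, the level assumed for CG, the decoded path is AC followed by CG, because AC to CG is a preferred transition whereas AG to CG is not.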
The second and third aspects of the present invention are concerned with the provision of techniques that assist the analysis of polymers using measurements of ion current flowing through a nanopore while the polymer is translocated through the nanopore.
According to the second aspect of the present invention, there is provided a method of analysing a polymer comprising polymer units, the method comprising:
during translocation of a polymer through a nanopore while a voltage is applied across the nanopore, making measurements that are dependent on the identity of k-mers in the nanopore, a k-mer being k polymer units of the polymer, where k is a positive integer, wherein the measurements comprise, in respect of individual k-mers, separate measurements made at different levels of said voltage applied across the nanopore; and
analysing the measurements at said different levels of said voltage to determine the identity of at least part of the polymer.
The method involves making measurements that are dependent on the identity of k-mers in the nanopore, a k-mer being k polymer units of the polymer, where k is a positive integer. In particular, the measurements comprise, in respect of individual k-mers, separate measurements made at different levels of said voltage applied across the nanopore. The present inventors have appreciated and demonstrated that such measurements at different levels of said voltage applied across the nanopore provide additional information, rather than being merely duplicative. For example, the measurements at different voltages allow resolution of different states. For example, some k-mers that cannot be resolved at a given voltage can be resolved at another voltage.
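The benefit of measuring each k-mer at more than one voltage can be illustrated with a toy calculation. All numbers below (the two voltages, the mean currents and the spread SD) are hypothetical and are not taken from the examples described herein; the independent Gaussian likelihood is likewise an assumption of the sketch.

```python
import math

# Hypothetical mean currents (pA) for two k-mers at two applied voltages (mV).
MEAN_CURRENT = {
    "kmer_a": {120: 50.0, 180: 80.0},
    "kmer_b": {120: 50.5, 180: 88.0},
}
SD = 1.5  # assumed measurement spread in pA, the same for every state

def separation(voltage):
    """Distance between the two expected currents, in units of SD."""
    a = MEAN_CURRENT["kmer_a"][voltage]
    b = MEAN_CURRENT["kmer_b"][voltage]
    return abs(a - b) / SD

def log_likelihood(kmer, observed):
    """Joint Gaussian log-likelihood of currents observed at several voltages."""
    total = 0.0
    for voltage, current in observed.items():
        z = (current - MEAN_CURRENT[kmer][voltage]) / SD
        total += -0.5 * z * z - math.log(SD * math.sqrt(2.0 * math.pi))
    return total

def classify(observed):
    """Pick the k-mer that best explains the measurements at all voltages."""
    return max(MEAN_CURRENT, key=lambda kmer: log_likelihood(kmer, observed))

observed = {120: 50.2, 180: 87.5}
print(separation(120), separation(180))  # poorly separated at 120, well at 180
print(classify(observed))
```

With these assumed values the two k-mers are separated by less than one standard deviation at the lower voltage but by several at the higher voltage, so the combined measurements identify the k-mer where the lower-voltage measurement alone would not.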
The third aspect of the present invention provides a method of making measurements made under the application of different levels of voltage across the nanopore, that may optionally be applied in the second aspect of the invention. In particular, according to the third aspect of the present invention, there is provided a method of making measurements of a polymer comprising polymer units, the method comprising:
performing a translocation of said polymer through a nanopore while a voltage is applied across the nanopore;
during said translocation of the polymer through the nanopore, applying different levels of said voltage in a cycle, and
making measurements that are dependent on the identity of k-mers in the nanopore, a k-mer being k polymer units of the polymer, where k is a positive integer, the measurements comprising separate measurements in respect of individual k-mers at said different levels of said voltage in said cycle, the cycle having a cycle period shorter than states in which said measurements are dependent on said individual k-mers.
Thus the third aspect of the present invention provides the same advantages as the second aspect of the present invention, in particular that the measurements provide additional information, rather than being merely duplicative. The measurements at different voltages allow resolution of different states in a subsequent analysis of the measurements. For example, some states that cannot be resolved at a given voltage can be resolved at another voltage.
This is based on an innovation in which measurements at different voltages are acquired during a single translocation of a polymer through a nanopore. This is achieved by changing the level of said voltage in a cycle, selected so that the cycle period is shorter than the duration of states that are measured.
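A minimal sketch of how samples acquired under such a cycle might be separated by voltage level within each state is given below. The function names, the square-wave cycle indexed by sample number and the known state boundaries are all assumptions of the sketch; any transient in the current immediately after a voltage step is ignored here.

```python
from collections import defaultdict

def level_for_sample(i, levels, samples_per_level):
    """Voltage level applied at sample index i for a repeating step cycle."""
    return levels[(i // samples_per_level) % len(levels)]

def per_state_means(currents, state_bounds, levels, samples_per_level):
    """Average the current separately at each voltage level within each state.

    state_bounds is a list of (start, end) sample indices, end exclusive.
    Each state must last at least one full cycle so that every voltage
    level is sampled at least once during the state.
    """
    cycle = samples_per_level * len(levels)
    result = []
    for start, end in state_bounds:
        if end - start < cycle:
            raise ValueError("state is shorter than one voltage cycle")
        sums = defaultdict(float)
        counts = defaultdict(int)
        for i in range(start, end):
            v = level_for_sample(i, levels, samples_per_level)
            sums[v] += currents[i]
            counts[v] += 1
        result.append({v: sums[v] / counts[v] for v in levels})
    return result

# Hypothetical data: two states of 8 samples, cycle of 2 levels x 2 samples.
levels = [120, 180]
expected = [{120: 50.0, 180: 80.0}, {120: 51.0, 180: 90.0}]
currents = [expected[i // 8][level_for_sample(i, levels, 2)] for i in range(16)]
print(per_state_means(currents, [(0, 8), (8, 16)], levels, 2))
```

The check on the state length reflects the requirement that the cycle period be shorter than the duration of the states being measured.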
However, it is not essential to use this method within the second aspect of the invention. As an alternative, the ion current measurements at different magnitudes of the voltage may be made during different translocations of the polymer through the nanopore which may be translocations in the same direction, or may include translocations in opposite directions.
Thus, the methods of the second aspect and third aspect of the present invention can provide additional information that improves subsequent analysis of the measurements to derive information about the polymer. Some examples of the types of information that may be derived are as follows.
The analysis may be to derive the timings of transitions between states. In this case, the additional information provided by the measurements of each state at different potentials improves the accuracy. For example, in the case that a transition between two states cannot be resolved at one voltage, the transition may be identified by the change in the level of the ion current measurement at another voltage. This potentially allows identification of a transition that would not be apparent working only at one voltage or a determination with a higher degree of confidence that a transition did not in fact occur. This identification may be used in subsequent analysis of the measurements.
In general, carrying out measurements at the different voltage levels provides more information than may be obtained at one voltage level. For example in the measurement of ion flow through the nanopore, information that may be obtained from the measurements includes the current level and the signal variance (noise) for a particular state. For example for translocation of DNA through a nanopore, k-mers comprising the nucleotide base G tend to give rise to states having increased signal variance. It may be difficult to determine whether a transition in states has occurred, for example due to respective states having similar current levels or where one or both of the respective states have high signal variance. The current level and signal variance for a particular state may differ for different voltage levels and thus measurement at the different voltage levels may enable the determination of high variance states or increase the level of confidence in determining a state. Consequently, it may be easier to determine a transition between states at one voltage level compared to another voltage level.
The analysis may be to estimate the identity of the polymer or to estimate a sequence of polymer units in the polymer. In this case, the additional information provided by the measurements of each state at different potentials improves the accuracy of the estimation.
In the case of estimating a sequence of polymer units, the analysis may use a method in accordance with the first aspect of the present invention. Accordingly, the features of the first aspect of the present invention may be combined with the features of the second aspect and/or third aspect of the present invention, in any combination.
Further according to the second and third aspects of the present invention, there is provided an analysis apparatus that implements a similar method.
To allow better understanding, embodiments of the present invention will now be described by way of non-limitative example with reference to the accompanying drawings, in which:
Fig. 1 is a schematic diagram of a measurement system comprising a nanopore;
Fig. 2 is a plot of a signal of an event measured over time by a measurement system;
Fig. 3 is a graph of the frequency distributions of measurements of two different polynucleotides in a measurement system comprising a nanopore;
Figs. 4 and 5 are plots of 64 3-mer coefficients and 1024 5-mer coefficients, respectively, against predicted values from a first order linear model applied to sets of experimentally derived current measurements;
Fig. 6 is a flowchart of a method of analyzing an input signal comprising measurements of a polymer;
Fig. 7 is a flowchart of a state detection step of Fig. 6;
Fig. 8 is a flowchart of an analysis step of Fig. 6;
Figs. 9 and 10 are plots, respectively, of an input signal subject to the state detection step and of the resultant series of measurements;
Fig. 11 is a pictorial representation of a transition matrix;
Fig. 12 is a graph of the expected measurements in respect of k-mer states in a simulated example;
Fig. 13 shows an input signal simulated from the expected measurements illustrated in Fig. 12;
Fig. 14 shows a series of measurements derived from the input signal of Fig. 13;
Figs. 15 and 16 show respective transition matrices of transition weightings;
Figs. 17 to 19 are graphs of emission weightings having possible distributions that are, respectively, Gaussian, triangular and square;
Fig. 20 is a graph of the current space alignment between a set of simulated measurements and the expected measurements shown in Fig. 12;
Fig. 21 is a graph of the k-mer space alignment between the actual k-mers and the k-mers estimated from the simulated measurements of Fig. 20;
Fig. 22 is a graph of the current space alignment between a further set of simulated measurements and the expected measurements shown in Fig. 12;
Figs. 23 and 24 are graphs of the k-mer space alignment between the actual k-mers and the k-mers estimated from the simulated measurements of Fig. 22 with the transition matrices of Figs. 15 and 16, respectively;
Fig. 25 is a graph of emission weightings having a square distribution with a small non-zero background with distributions centred on the expected measurements of Fig. 12;
Fig. 26 is a graph of the k-mer space alignment between the actual k-mers and the k-mers estimated from the simulated measurements of Fig. 20 with the transition matrix of Fig. 15 and the emission weightings of Fig. 25;
Fig. 27 is a graph of emission weightings having a square distribution with a zero background with distributions centred on the expected measurements of Fig. 12;
Fig. 28 is a graph of the k-mer space alignment between the actual k-mers and the k-mers estimated from the simulated measurements of Fig. 20 with the transition matrix of Fig. 15 and the emission weightings of Fig. 27;
Fig. 29 is a scatter plot of current measurements obtained from DNA strands held in a MS-(B2)8 nanopore using streptavidin;
Fig. 30 is a transition matrix for an example training process;
Fig. 31 is an enlarged portion of the transition matrix of Fig. 30;
Figs. 32 and 33 are graphs of emission weightings for, respectively, a model of 64 k-mers derived from a static training process and a translation of that model into a model of approximately 400 states;
Fig. 34 is a flow chart of a training process;
Fig. 35 is a graph of emission weightings determined by the training process of Fig. 34;
Fig. 36 is a graph of current measurements aggregated over several experiments with the expected measurements from a model;
Fig. 37 is a graph of the k-mer space alignment between the actual k-mers and the estimated k-mers;
Fig. 38 shows an estimated sequence of estimated k-mers aligned with the actual sequence;
Fig. 39 shows separate estimated sequences of sense and antisense regions of a polymer together with an estimated sequence derived by treating measurements from the sense and antisense regions as arranged in two respective dimensions;
Fig. 40 is a set of histograms of ion current measurements for a set of DNA strands in a nanopore at three different voltages in a first example;
Fig. 41 is a pair of graphs of applied potential and resultant ion current over a common time period for a single strand in a nanopore in a second example;
Figs. 42 to 45 are scatter plots of the measured current for each of the DNA strands indexed horizontally at four levels of voltage, respectively, in the second example;
Fig. 46 is a plot of the measured current for each DNA strand against the applied voltage in the second example;
Fig. 47 is a plot of the standard deviation of the current measurements for each DNA strand in the second example against the applied voltage;
Fig. 48 is a flow chart of a method of making ion current measurements;
Figs. 49 and 50 are each a pair of graphs of applied potential and resultant ion current over a common time period in a third example;
Fig. 51 is a flow chart of an alternative method of making ion current measurements; and
Figs. 52a and 52b are plots over the same time scale of shaped voltage steps applied across a nanopore and the resultant current.
All the aspects of the present invention may be applied to a range of polymers as follows.
The polymer may be a polynucleotide (or nucleic acid), a polypeptide such as a protein, a polysaccharide, or any other polymer. The polymer may be natural or synthetic.
In the case of a polynucleotide or nucleic acid, the polymer units may be nucleotides. The nucleic acid is typically deoxyribonucleic acid (DNA), ribonucleic acid (RNA), cDNA or a synthetic nucleic acid known in the art, such as peptide nucleic acid (PNA), glycerol nucleic acid (GNA), threose nucleic acid (TNA), locked nucleic acid (LNA) or other synthetic polymers with nucleotide side chains. The nucleic acid may be single-stranded, be double-stranded or comprise both single-stranded and double-stranded regions. Typically cDNA, RNA, GNA, TNA or LNA are single stranded. The methods of the invention may be used to identify any nucleotide. The nucleotide can be naturally occurring or artificial. A nucleotide typically contains a nucleobase, a sugar and at least one phosphate group. The nucleobase is typically heterocyclic. Suitable nucleobases include purines and pyrimidines and more specifically adenine, guanine, thymine, uracil and cytosine. The sugar is typically a pentose sugar. Suitable sugars include, but are not limited to, ribose and deoxyribose. The nucleotide is typically a ribonucleotide or deoxyribonucleotide. The nucleotide typically contains a monophosphate, diphosphate or triphosphate.
The nucleotide can be a damaged or epigenetic base. The nucleotide can be labelled or modified to act as a marker with a distinct signal. This technique can be used to identify the absence of a base, for example, an abasic unit or spacer in the polynucleotide. The method could also be applied to any type of polymer.
Of particular use when considering measurements of modified or damaged DNA (or similar systems) are the methods where complementary data are considered. The additional information provided allows distinction between a larger number of underlying states.
In the case of a polypeptide, the polymer units may be amino acids that are naturally occurring or synthetic.
In the case of a polysaccharide, the polymer units may be monosaccharides.
The present invention may be applied to measurements taken by a range of measurement systems, as discussed further below.
In accordance with all aspects of the present invention, the measurement system may be a nanopore system that comprises a nanopore. In this case, the measurements may be taken during translocation of the polymer through the nanopore. The translocation of the polymer through the nanopore generates a characteristic signal in the measured property that may be observed, and may be referred to overall as an "event".
The nanopore is a pore, typically having a size of the order of nanometres, that allows the passage of polymers therethrough. A property that depends on the polymer units translocating through the pore may be measured. The property may be associated with an interaction between the polymer and the pore. Interaction of the polymer may occur at a constricted region of the pore. The measurement system measures the property, producing a measurement that is dependent on the polymer units of the polymer.
The nanopore may be a biological pore or a solid state pore.
Where the nanopore is a biological pore, it may have the following properties.
The biological pore may be a transmembrane protein pore. Transmembrane protein pores for use in accordance with the invention can be derived from β-barrel pores or α-helix bundle pores. β-barrel pores comprise a barrel or channel that is formed from β-strands. Suitable β-barrel pores include, but are not limited to, β-toxins, such as α-hemolysin, anthrax toxin and leukocidins, and outer membrane proteins/porins of bacteria, such as Mycobacterium smegmatis porin (Msp), for example MspA, outer membrane porin F (OmpF), outer membrane porin G (OmpG), outer membrane phospholipase A and Neisseria autotransporter lipoprotein (NalP). α-helix bundle pores comprise a barrel or channel that is formed from α-helices. Suitable α-helix bundle pores include, but are not limited to, inner membrane proteins and α outer membrane proteins, such as WZA and ClyA toxin. The transmembrane pore may be derived from Msp or from α-hemolysin (α-HL).
The transmembrane protein pore is typically derived from Msp, preferably from MspA. Such a pore will be oligomeric and typically comprises 7, 8, 9 or 10 monomers derived from Msp. The pore may be a homo-oligomeric pore derived from Msp comprising identical monomers.
Alternatively, the pore may be a hetero-oligomeric pore derived from Msp comprising at least one monomer that differs from the others. The pore may also comprise one or more constructs that comprise two or more covalently attached monomers derived from Msp. Suitable pores are disclosed in US Provisional Application No. 61/441,718 (filed 11 February 2011). Preferably the pore is derived from MspA or a homolog or paralog thereof.
The biological pore may be a naturally occurring pore or may be a mutant pore. Typical pores are described in WO-2010/109197, Stoddart D et al., Proc Natl Acad Sci, 12;106(19):7702-7, Stoddart D et al., Angew Chem Int Ed Engl. 2010;49(3):556-9, Stoddart D et al., Nano Lett. 2010 Sep 8;10(9):3633-7, Butler TZ et al., Proc Natl Acad Sci 2008;105(52):20647-52, and US Provisional Application 61/441718.
The biological pore may be MS-(B1)8. The nucleotide sequence encoding B1 and the amino acid sequence of B1 are shown below (Seq ID: 1 and Seq ID: 2).
Seq ID 1: MS-(B1)8 = MS-(D90N/D91N/D93N/D118R/D134R/E139K)8
ATGGGTCTGGATAATGAACTGAGCCTGGTGGACGGTCAAGATCGTACCCTGACGGTGCA ACAATGGGATACCTTTCTGAATGGCGTTTTTCCGCTGGATCGTAATCGCCTGACCCGTGA ATGGTTTCATTCCGGTCGCGCAAAATATATCGTCGCAGGCCCGGGTGCTGACGAATTCGA AGGCACGCTGGAACTGGGTTATCAGATTGGCTTTCCGTGGTCACTGGGCGTTGGTATCAA CTTCTCGTACACCACGCCGAATATTCTGATCAACAATGGTAACATTACCGCACCGCCGTT TGGCCTGAACAGCGTGATTACGCCGAACCTGTTTCCGGGTGTTAGCATCTCTGCCCGTCT GGGCAATGGTCCGGGCATTCAAGAAGTGGCAACCTTTAGTGTGCGCGTTTCCGGCGCTA AAGGCGGTGTCGCGGTGTCTAACGCCCACGGTACCGTTACGGGCGCGGCCGGCGGTGTC CTGCTGCGTCCGTTCGCGCGCCTGATTGCCTCTACCGGCGACAGCGTTACGACCTATGGC GAACCGTGGAATATGAACTAA
Seq ID 2: MS-(B1)8 = MS-(D90N/D91N/D93N/D118R/D134R/E139K)8
GLDNELSLVDGQDRTLTVQQWDTFLNGVFPLDRNRLTREWFHSGRAKYIVAGPGADEFEGT LELGYQIGFPWSLGVGINFSYTTPNILINNGNITAPPFGLNSVITPNLFPGVSISARLGNGPGIQE VATFSVRVSGAKGGVAVSNAHGTVTGAAGGVLLRPFARLIASTGDSVTTYGEPWNMN
The biological pore is more preferably MS-(B2)8. The amino acid sequence of B2 is identical to that of B1 except for the mutation L88N. The nucleotide sequence encoding B2 and the amino acid sequence of B2 are shown below (Seq ID: 3 and Seq ID: 4).
Seq ID 3: MS-(B2)8 = MS-(L88N/D90N/D91N/D93N/D118R/D134R/E139K)8
ATGGGTCTGGATAATGAACTGAGCCTGGTGGACGGTCAAGATCGTACCCTGACGGTGCA
ATGGTTTCATTCCGGTCGCGCAAAATATATCGTCGCAGGCCCGGGTGCTGACGAATTCGA AGGCACGCTGGAACTGGGTTATCAGATTGGCTTTCCGTGGTCACTGGGCGTTGGTATCAA
CTTCTCGTACACCACGCCGAATATTAACATCAACAATGGTAACATTACCGCACCGCCGTT TGGCCTGAACAGCGTGATTACGCCGAACCTGTTTCCGGGTGTTAGCATCTCTGCCCGTCT GGGCAATGGTCCGGGCATTCAAGAAGTGGCAACCTTTAGTGTGCGCGTTTCCGGCGCTA AAGGCGGTGTCGCGGTGTCTAACGCCCACGGTACCGTTACGGGCGCGGCCGGCGGTGTC CTGCTGCGTCCGTTCGCGCGCCTGATTGCCTCTACCGGCGACAGCGTTACGACCTATGGC GAACCGTGGAATATGAACTAA
Seq ID 4: MS-(B2)8 = MS-(L88N/D90N/D91N/D93N/D118R/D134R/E139K)8
GLDNELSLVDGQDRTLTVQQWDTFLNGVFPLDRNRLTREWFHSGRAKYIVAGPGADEFEGT
LELGYQIGFPWSLGVGINFSYTTPNININNGNITAPPFGLNSVITPNLFPGVSISARLGNGPGIQE VATFSVRVSGAKGGVAVSNAHGTVTGAAGGVLLRPFARLIASTGDSVTTYGEPWNMN
The biological pore may be inserted into an amphiphilic layer such as a biological membrane, for example a lipid bilayer. An amphiphilic layer is a layer formed from amphiphilic molecules, such as phospholipids, which have both hydrophilic and lipophilic properties. The amphiphilic layer may be a monolayer or a bilayer. The amphiphilic layer may be a co-block polymer such as disclosed by (Gonzalez-Perez et al., Langmuir, 2009, 25, 10447-10450). Alternatively, a biological pore may be inserted into a solid state layer.
Alternatively, a nanopore may be a solid state pore comprising an aperture formed in a solid state layer.
A solid-state layer is not of biological origin. In other words, a solid state layer is not derived from or isolated from a biological environment such as an organism or cell, or a synthetically manufactured version of a biologically available structure. Solid state layers can be formed from both organic and inorganic materials including, but not limited to, microelectronic materials, insulating materials such as Si3N4, Al2O3, and SiO, organic and inorganic polymers such as polyamide, plastics such as Teflon® or elastomers such as two-component addition-cure silicone rubber, and glasses. The solid state layer may be formed from graphene. Suitable graphene layers are disclosed in WO 2009/035647 and WO-2011/046706.
A solid state pore is typically an aperture in a solid state layer. The aperture may be modified, chemically, or otherwise, to enhance its properties as a nanopore. A solid state pore may be used in combination with additional components which provide an alternative or additional measurement of the polymer such as tunnelling electrodes (Ivanov AP et al., Nano Lett. 2011 Jan 12;11(1):279-85), or a field effect transistor (FET) device (International Application WO 2005/124888). Solid state pores may be formed by known processes including for example those described in WO 00/79257.
In one type of measurement system, measurements of the ion current flowing through a nanopore may be used. These and other electrical measurements may be made using standard single channel recording equipment as described in Stoddart D et al., Proc Natl Acad Sci, 12;106(19):7702-7, Lieberman KR et al, J Am Chem Soc. 2010;132(50):17961-72, and International Application WO-2000/28312. Alternatively, electrical measurements may be made using a multi-channel system, for example as described in International Application WO-2009/077734 and International Application WO-2011/067559.
In order to allow measurements to be taken as the polymer translocates through a nanopore, the rate of translocation can be controlled by a polymer binding moiety. Typically the moiety can move the polymer through the nanopore with or against an applied field. The moiety can act as a molecular motor, using for example enzymatic activity in the case where the moiety is an enzyme, or as a molecular brake. Where the polymer is a polynucleotide there are a number of methods proposed for controlling the rate of translocation including use of polynucleotide binding enzymes. Suitable enzymes for controlling the rate of translocation of polynucleotides include, but are not limited to, polymerases, helicases, exonucleases, single stranded and double stranded binding proteins, and topoisomerases, such as gyrases. For other polymer types, moieties that interact with that polymer type can be used. The polymer interacting moiety may be any disclosed in International Application No. PCT/GB 10/000133 or US 61/441718, (Lieberman KR et al, J Am Chem Soc. 2010;132(50):17961-72), and for voltage gated schemes (Luan B et al., Phys Rev Lett. 2010;104(23):238103).
The polymer binding moiety can be used in a number of ways to control the polymer motion. The moiety can move the polymer through the nanopore with or against the applied field. The moiety can be used as a molecular motor using for example, in the case where the moiety is an enzyme, enzymatic activity, or as a molecular brake. The translocation of the polymer may be controlled by a molecular ratchet that controls the movement of the polymer through the pore. The molecular ratchet may be a polymer binding protein. For polynucleotides, the polynucleotide binding protein is preferably a polynucleotide handling enzyme. A polynucleotide handling enzyme is a polypeptide that is capable of interacting with and modifying at least one property of a polynucleotide. The enzyme may modify the polynucleotide by cleaving it to form individual nucleotides or shorter chains of nucleotides, such as di- or trinucleotides. The enzyme may modify the polynucleotide by orienting it or moving it to a specific position. The polynucleotide handling enzyme does not need to display enzymatic activity as long as it is capable of binding the target polynucleotide and controlling its movement through the pore. For instance, the enzyme may be modified to remove its enzymatic activity or may be used under conditions which prevent it from acting as an enzyme. Such conditions are discussed in more detail below.
The polynucleotide handling enzyme may be derived from a nucleolytic enzyme. The polynucleotide handling enzyme used in the construct of the enzyme is more preferably derived from a member of any of the Enzyme Classification (EC) groups 3.1.11, 3.1.13, 3.1.14, 3.1.15, 3.1.16, 3.1.21, 3.1.22, 3.1.25, 3.1.26, 3.1.27, 3.1.30 and 3.1.31. The enzyme may be any of those disclosed in International Application No. PCT/GB 10/000133 (published as WO 2010/086603).
Preferred enzymes are polymerases, exonucleases, helicases and topoisomerases, such as gyrases. Suitable enzymes include, but are not limited to, exonuclease I from E. coli (SEQ ID NO: 8), exonuclease III enzyme from E. coli (SEQ ID NO: 10), RecJ from T. thermophilus (SEQ ID NO: 12) and bacteriophage lambda exonuclease (SEQ ID NO: 14) and variants thereof. Three subunits comprising the sequence shown in SEQ ID NO: 14 or a variant thereof interact to form a trimer exonuclease. The enzyme is preferably derived from a Phi29 DNA polymerase. An enzyme derived from Phi29 polymerase comprises the sequence shown in SEQ ID NO: 6 or a variant thereof.
A variant of SEQ ID NOs: 6, 8, 10, 12 or 14 is an enzyme that has an amino acid sequence which varies from that of SEQ ID NO: 6, 8, 10, 12 or 14 and which retains polynucleotide binding ability. The variant may include modifications that facilitate binding of the polynucleotide and/or facilitate its activity at high salt concentrations and/or room temperature.
Over the entire length of the amino acid sequence of SEQ ID NO: 6, 8, 10, 12 or 14, a variant will preferably be at least 50% homologous to that sequence based on amino acid identity. More preferably, the variant polypeptide may be at least 55%, at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 90% and more preferably at least 95%, 97% or 99% homologous based on amino acid identity to the amino acid sequence of SEQ ID NO: 6, 8, 10, 12 or 14 over the entire sequence. There may be at least 80%, for example at least 85%, 90% or 95%, amino acid identity over a stretch of 200 or more, for example 230, 250, 270 or 280 or more, contiguous amino acids ("hard homology"). Homology is determined as described above. The variant may differ from the wild-type sequence in any of the ways discussed above with reference to SEQ ID NO: 2. The enzyme may be covalently attached to the pore as discussed above.
The two strategies for single strand DNA sequencing are the translocation of the DNA through the nanopore, both cis to trans and trans to cis, either with or against an applied potential. The most advantageous mechanism for strand sequencing is the controlled translocation of single strand DNA through the nanopore under an applied potential. Exonucleases that act progressively or processively on double stranded DNA can be used on the cis side of the pore to feed the remaining single strand through under an applied potential or the trans side under a reverse potential. Likewise, a helicase that unwinds the double stranded DNA can also be used in a similar manner. There are also possibilities for sequencing applications that require strand translocation against an applied potential, but the DNA must be first "caught" by the enzyme under a reverse or no potential. With the potential then switched back following binding, the strand will pass cis to trans through the pore and be held in an extended conformation by the current flow. The single strand DNA exonucleases or single strand DNA dependent polymerases can act as molecular motors to pull the recently translocated single strand back through the pore in a controlled stepwise manner, trans to cis, against the applied potential. Alternatively, the single strand DNA dependent polymerases can act as a molecular brake slowing down the movement of a polynucleotide through the pore. Any moieties, techniques or enzymes described in Provisional Application US 61/441718 or US Provisional Application No. 61/402903 could be used to control polymer motion.
However, alternative types of measurement system and measurements are also possible.
Some non-limitative examples of alternative types of measurement system are as follows. The measurement system may be a scanning probe microscope. The scanning probe microscope may be an atomic force microscope (AFM), a scanning tunnelling microscope (STM) or another form of scanning microscope.
In the case where the reader is an AFM, the resolution of the AFM tip may be less fine than the dimensions of an individual polymer unit. As such the measurement may be a function of multiple polymer units. The AFM tip may be functionalised to interact with the polymer units in an alternative manner to if it were not functionalised. The AFM may be operated in contact mode, non-contact mode, tapping mode or any other mode.
In the case where the reader is a STM the resolution of the measurement may be less fine than the dimensions of an individual polymer unit such that the measurement is a function of multiple polymer units. The STM may be operated conventionally or to make a spectroscopic measurement (STS) or in any other mode.
Some examples of alternative types of measurement include without limitation: electrical measurements and optical measurements. A suitable optical method involving the measurement of fluorescence is disclosed by J. Am. Chem. Soc. 2009, 131, 1652-1653. Possible electrical measurements include: current measurements, impedance measurements, tunnelling measurements (for example as disclosed in Ivanov AP et al., Nano Lett. 2011 Jan 12;11(1):279-85), and FET measurements (for example as disclosed in International Application WO2005/124888). Optical measurements may be combined with electrical measurements (Soni GV et al., Rev Sci Instrum. 2010 Jan;81(1):014301). The measurement may be a transmembrane current measurement such as measurement of ion current flow through a nanopore. The ion current may typically be the DC ion current, although in principle an alternative is to use the AC current flow (i.e. the magnitude of the AC current flowing under application of an AC voltage).
Herein, the term 'k-mer' refers to a group of k polymer units, where k is a positive integer, including the case that k is one, in which the k-mer is a single polymer unit. In some contexts, reference is made to k-mers where k is a plural integer, being a subset of k-mers in general excluding the case that k is one.
Although ideally the measurements would be dependent on a single polymer unit, with many typical measurement systems, the measurement is dependent on a k-mer of the polymer where k is a plural integer. That is, each measurement is dependent on the sequence of each of the polymer units in a k-mer where k is a plural integer. Typically the measurements are of a property that is associated with an interaction between the polymer and the measurement system.
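By way of illustration only, the relationship between a sequence of polymer units and the k-mers on which successive measurements may depend can be sketched in Python as follows. The function names, and the assumption that the polymer advances by one polymer unit between successive k-mers, are hypothetical and are not taken from the description above.

```python
from itertools import product


def kmers(sequence, k):
    """Return the overlapping k-mers of a sequence, in order of position.

    Assumes (hypothetically) that the polymer moves by one polymer unit
    between successive k-mers.
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def possible_kmers(alphabet, k):
    """Enumerate every possible k-mer over the given polymer-unit alphabet."""
    return ["".join(units) for units in product(alphabet, repeat=k)]


# Example: a short DNA fragment read as triplets (k = 3).
print(kmers("TATGAT", 3))
# The case k = 1 reduces to the individual polymer units.
print(kmers("TATGAT", 1))
# Number of distinct k-mers for a four-letter alphabet and k = 3.
print(len(possible_kmers("ACGT", 3)))
```

The number of possible k-mers grows as the alphabet size raised to the power k, which is why the choice of k matters for any model defined over the set of k-mers.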
In some embodiments of the present invention it is preferred to use measurements that are dependent on small groups of polymer units, for example doublets or triplets of polymer units (i.e. in which k=2 or k=3). In other embodiments, it is preferred to use measurements that are dependent on larger groups of polymer units, i.e. with a "broad" resolution. Such broad resolution may be particularly useful for examining homopolymer regions.
Especially where measurements are dependent on a k-mer where k is a plural integer, it is desirable that the measurements are resolvable (i.e. separated) for as many as possible of the possible k-mers. Typically this can be achieved if the measurements produced by different k-mers are well spread over the measurement range and/or have a narrow distribution. This may be achieved to varying extents by different measurement systems. However, it is a particular advantage of the present invention, that it is not essential for the measurements produced by different k-mers to be resolvable.
Fig. 1 schematically illustrates an example of a measurement system 8 comprising a nanopore that is a biological pore 1 inserted in a biological membrane 2 such as an amphiphilic layer. A polymer 3 comprising a series of polymer units 4 is translocated through the biological pore 1 as shown by the arrows. The polymer 3 may be a polynucleotide in which the polymer units 4 are nucleotides. The polymer 3 interacts with an active part 5 of the biological pore 1 causing an electrical property such as the trans-membrane current to vary in dependence on a k-mer inside the biological pore 1. In this example, the active part 5 is illustrated as interacting with a k-mer of three polymer units 4, but this is not limitative.
Electrodes 6 arranged on each side of the biological membrane 2 are connected to an electrical circuit 7, including a control circuit 71 and a measurement circuit 72.
The control circuit 71 is arranged to supply a voltage to the electrodes 6 for application across the biological pore 1.
The measurement circuit 72 is arranged to measure the electrical property. Thus the measurements are dependent on the k-mer inside the biological pore 1.
A typical type of signal output by a measurement system and which is an input signal to be analysed in accordance with the present invention is a "noisy step wave", although without limitation to this signal type. An example of an input signal having this form is shown in Fig. 2 for the case of an ion current measurement obtained using a measurement system comprising a nanopore.
This type of input signal comprises an input series of measurements in which successive groups of plural measurements are dependent on the same k-mer. The plural measurements in each group are of a constant value, subject to some variance discussed below, and therefore form a "level" in the signal, corresponding to a state of the measurement system. The signal moves between a set of levels, which may be a large set. Given the sampling rate of the instrumentation and the noise on the signal, the transitions between levels can be considered instantaneous, thus the signal can be approximated by an idealised step trace.
The measurements corresponding to each state are constant over the time scale of the event, but for most measurement systems will be subject to variance over a short time scale. Variance can result from measurement noise, for example arising from the electrical circuits and signal processing, notably from the amplifier in the particular case of electrophysiology. Such measurement noise is inevitable due to the small magnitude of the properties being measured. Variance can also result from inherent variation or spread in the underlying physical or biological system of the measurement system. Most measurement systems will experience such inherent variation to greater or lesser extents. For any given measurement system, both sources of variation may contribute or one of these noise sources may be dominant.
In addition, typically there is no a priori knowledge of the number of measurements in the group, which varies unpredictably.
These two factors of variance and lack of knowledge of the number of measurements can make it hard to distinguish some of the groups, for example where the group is short and/or the levels of the measurements of two successive groups are close to one another.
The signal takes this form as a result of the physical or biological processes occurring in the measurement system. Thus, each group of measurements may be referred to as a "state".
For example, in some measurement systems comprising a nanopore, the event consisting of translocation of the polymer through the nanopore may occur in a ratcheted manner. During each step of the ratcheted movement, the ion current flowing through the nanopore at a given voltage across the nanopore is constant, subject to the variance discussed above. Thus, each group of measurements is associated with a step of the ratcheted movement. Each step corresponds to a state in which the polymer is in a respective position relative to the nanopore. Although there may be some variation in the precise position during the period of a state, there are large scale movements of the polymer between states. Depending on the nature of the measurement system, the states may occur as a result of a binding event in the nanopore.
The duration of individual states may be dependent upon a number of factors, such as the potential applied across the pore, the type of enzyme used to ratchet the polymer, whether the polymer is being pushed or pulled through the pore by the enzyme, pH, salt concentration and the type of nucleoside triphosphate present. The duration of a state may vary typically between 0.5ms and 3 s, depending on the measurement system, and for any given nanopore system, having some random variation between states. The expected distribution of durations may be determined experimentally for any given measurement system.
The method may use plural input series of measurements each taking the form described above in which successive groups of plural measurements in each series are dependent on the same k-mer. Such plural series might be registered so that it is known a priori which measurements from the respective series correspond and are dependent on the same k-mer, for example if the measurements of each series are taken at the same time. This might be the case, for example, if the measurements are of different properties measured by different measurement systems in synchronisation.
Alternatively, such plural series might not be registered so that it is not known a priori which measurements from the respective series correspond and are dependent on the same k-mer. This might be the case, for example, if the series of measurements are taken at different times.
The method according to the third aspect discussed below in which measurements are made under the application of different levels of voltage across a nanopore, provides a series of measurements in respect of each level of voltage. In this case, the cycle period of the measurements is chosen having regard to the cycle period of the states for the measurement system in question.
Ideally, the cycle period is shorter than the duration of all states, which is achieved by selecting a cycle period that is shorter than the minimum expected cycle period for the measurement system. However useful information may be obtained from measurements made during cycle periods that are shorter than the duration of only some states, for example shorter than the average, 60%, 70%, 80%, 90%, 95%, or 99% of the duration of states. Typically the cycle period may be at most 3s, more typically at most 2s or at most 1s. Typically the cycle period may be at least 0.5ms, more typically at least 1ms or at least 2ms.
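As a rough sketch of how a cycle period might be checked against an experimentally determined set of state durations, the following Python fragment computes the fraction of states that last at least one full cycle, and the longest cycle period that a chosen percentage of states can accommodate. The function names and the example durations are hypothetical, and a state is counted here if its duration is at least equal to the cycle period, which is an assumption made for simplicity.

```python
def fraction_of_states_covered(durations, cycle_period):
    """Fraction of states whose duration is at least one full cycle period."""
    if not durations:
        raise ValueError("at least one state duration is required")
    return sum(d >= cycle_period for d in durations) / len(durations)


def longest_cycle_period(durations, percent):
    """Longest observed duration reached by at least `percent` % of states.

    `percent` is an integer in the range 1..100; 100 returns the minimum
    observed duration, i.e. a cycle period no longer than any state.
    """
    if not durations or not 0 < percent <= 100:
        raise ValueError("need durations and a percentage in 1..100")
    ordered = sorted(durations, reverse=True)
    needed = -(-percent * len(ordered) // 100)  # integer ceiling division
    return ordered[needed - 1]


# Hypothetical state durations in seconds, within the 0.5 ms to 3 s range.
example_durations = [0.004, 0.01, 0.02, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0, 3.0]

for pct in (100, 90, 60):
    period = longest_cycle_period(example_durations, pct)
    covered = fraction_of_states_covered(example_durations, period)
    print(f"{pct}% target: cycle period {period} s covers {covered:.0%} of states")
```

In practice the distribution of durations would be determined experimentally for the measurement system in question, as noted above.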
More than one voltage cycle may be applied for the duration of a state, for example a number between 2 and 10.
Multiple measurements may be made at one voltage level (or multiple measurements at each of plural voltage levels) in respect of each k-mer. In one possible approach, the different levels of voltage may each be applied continuously for a period of time, for example when the voltage waveform is a step wave, and during respective ones of the periods of time, a group of multiple measurements are made at the one of the voltages applied during that period.
The multiple measurements may themselves be used in the subsequent analysis.
Alternatively, one or more summary measurements at the (or each) voltage level may be derived from each group of multiple measurements. The one or more summary measurements may be derived from the multiple measurements at any given voltage level in respect of any given k-mer in any manner, for example as an average or median, or as a measure of statistical variation, for example the standard deviation. The one or more summary measurements may then be used in the subsequent analysis.
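A minimal sketch of deriving summary measurements is given below, assuming that the multiple measurements for one k-mer state have already been grouped by voltage level. The function name, the dictionary layout and the example values (voltage levels in mV, currents in pA) are illustrative assumptions only.

```python
import statistics


def summarise_by_voltage(groups):
    """Derive summary measurements for each voltage level.

    `groups` maps a voltage level to the list of measurements made at that
    level for a single k-mer state (a hypothetical data layout).
    """
    summary = {}
    for level, values in groups.items():
        summary[level] = {
            "mean": statistics.fmean(values),
            "median": statistics.median(values),
            # The sample standard deviation needs at least two measurements.
            "stdev": statistics.stdev(values) if len(values) > 1 else 0.0,
        }
    return summary


# Hypothetical measurements for one state at two voltage levels.
state = {140: [52.0, 51.0, 53.0], 180: [70.0, 72.0, 71.0, 73.0]}
for level, stats in summarise_by_voltage(state).items():
    print(level, stats)
```

Either the summary values or the underlying multiple measurements could then be passed to the subsequent analysis.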
The voltage cycle may be chosen from a number of different waveforms. The waveform may be asymmetric, symmetric, regular or irregular.
In one example of a cycle, the different levels of voltage may each be applied continuously for a period of time, i.e. for a partial period of the cycle, with a transition between those different levels, for example a square wave or stepped wave. The transitions between the voltage levels may be sharp or may be ramped over a period of time.
In another example of a cycle, the voltage level may vary continuously, for example being ramped between different levels, for example a triangular or sawtooth wave. In this case measurements at different levels may be made by making measurements at times within the cycle corresponding to the desired voltage level.
Information may be derived from measurement at a voltage plateau or from measurement of the slope. Further information may be derived in addition to measurements made at different voltage levels, for example by measurement of the shape of the transient between one voltage level and another.
In a stepped voltage scheme the transitions between voltage levels may be shaped such that any capacitive transients are minimised. Considering the nanopore system as a simple RC circuit, the current flowing, I, is given by the equation I = V/R + C dV/dt, where V is the applied potential, R the resistance (typically of the pore), t time and C the capacitance (typically of the bilayer). In this model system the transition between two voltage levels V1 and V2 would follow an exponential of time constant τ = RC, where V = V2 - (V2 - V1)*exp(-t/τ).
Figs. 52a and 52b illustrate the cases where the time constant τ of the transition between the voltage levels is chosen such that the transition speed is optimised, too fast and too slow. Where the voltage transition is too fast, a spike (overshoot) is seen in the measured current signal; where it is too slow, the measured signal does not flatten out quickly enough (undershoot). In the case where the transition speed is optimised the time where the measured current is distorted from the ideal sharp transition is minimised. The time constant τ of the transition may be determined from measurement of the electrical properties of the measurement system, or from testing of different transitions.
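The behaviour described for the simple RC model can be checked numerically. The sketch below evaluates I = V/R + C dV/dt for an exponential voltage transition from V1 to V2 with a chosen time constant, and compares the current at the start of the transition with the steady-state value V2/R. The component values and voltage levels are hypothetical and chosen only for illustration.

```python
import math


def transition_current(t, v1, v2, r, c, tau):
    """Current I = V/R + C dV/dt during an exponential voltage transition.

    The applied voltage is V(t) = V2 - (V2 - V1) * exp(-t / tau).
    """
    decay = math.exp(-t / tau)
    v = v2 - (v2 - v1) * decay
    dv_dt = (v2 - v1) / tau * decay
    return v / r + c * dv_dt


# Hypothetical values: 1 GOhm pore, 50 pF bilayer, step from 100 mV to 180 mV.
R = 1e9
C = 50e-12
V1, V2 = 0.10, 0.18
RC = R * C
STEADY = V2 / R  # current once the transition has settled

for label, tau in (("too fast", 0.5 * RC), ("matched", RC), ("too slow", 2.0 * RC)):
    start = transition_current(0.0, V1, V2, R, C, tau)
    excess_pa = (start - STEADY) * 1e12
    print(f"{label:8s} tau = {tau:.3f} s, I(0) - I(settled) = {excess_pa:+.1f} pA")
```

In this model the excess current is proportional to (RC - τ), so it is positive (overshoot) when the transition is faster than RC, negative (undershoot) when it is slower, and vanishes when τ = RC.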
Measurements may be made at any number of two or more levels of voltage. The levels of voltage are selected so that the measurements at each level of voltage provide information about the identities of the k-mers upon which the measurements depend. The choice of levels therefore depends on the nature of the measurement system. The extent of potential difference applied across the nanopore will depend upon factors such as the stability of the amphiphilic layer, the type of enzyme used and the desired speed of translocation. Typically each of the levels of voltage will be of the same polarity, although in general one or more of the levels of voltage could be of an opposite polarity to the others. In general, for most nanopore systems each level of voltage might typically be between 10mV and 2V relative to ground. Thus the voltage difference between the voltage levels may typically be at least 10mV, more preferably at least 20mV. The voltage difference between the voltage levels may typically be at most 1.5V, more typically at most 400mV. Greater voltage differences tend to give rise to greater differences in current between the voltage levels and therefore potentially a greater differentiation between respective states. However high voltage levels may give rise for example to more noise in the system or result in disruption of translocation by the enzyme. Conversely smaller voltage differences tend to give rise to smaller differences in current. An optimum potential difference may be chosen depending upon the experimental conditions or the type of enzyme ratchet.
A k-mer measured at one voltage level might not necessarily be the same k-mer as measured at a different voltage level. The value of k may differ between k-mers measured at different potentials. Should this be the case, it is likely however that there will be polymer units that are common to each k-mer measured at the different voltage levels. Without being bound by theory, it is thought that any differences in the k-mers being measured may be due to a change of conformation of the polymer within the nanopore at higher potential differences applied across the nanopore resulting in a change in the number of polymer units being measured by the reader head. The extent of this conformational change is likely to be dependent upon the difference in potential between one value and another.
There may be other information available either as part of the measurement or from additional sources that provides registration information. This other information may enable states to be identified.
Alternatively, the signal may take an arbitrary form. In these cases, the measurements corresponding to k-mers may also be described in terms of a set of emissions and transitions. For example, a measurement that is dependent on a particular k-mer may comprise of a series of measurements occurring in a fashion amenable to description by these methods.
The extent to which a given measurement system provides measurements that are dependent on k-mers and the size of the k-mers may be examined experimentally. For example, known polymers may be synthesized and held at predetermined locations relative to the measurement system to investigate from the resultant measurements how the measurements depend on the identity of k-mers that interact with the measurement system.
One possible approach is to use a set of polymers having identical sequences except for a k-mer at a predetermined position that varies for each polymer of the set. The size and identity of the k-mers can be varied to investigate the effect on the measurements.
Another possible approach is to use a set of polymers in which the polymer units outside a k-mer under investigation at a predetermined position vary for each polymer of the set. As an example of such an approach, Fig. 3 is a frequency distribution of current measurements of two polynucleotides in a measurement system comprising a nanopore. In one of the polynucleotides (labelled polyT), every base in the region of the nanopore is a T, and in the other of the polynucleotides (labelled N11-TATGAT-N8), 11 bases to the left and 8 to the right of a specific fixed 6-mer (having the sequence TATGAT) are allowed to vary. The example of Fig. 3 shows excellent separation of the two strands in terms of the current measurement. The range of values seen by the N11-TATGAT-N8 strand is also only slightly broader than that seen by the polyT. In this way and measuring polymers with other sequences also, it can be ascertained that, for the particular measurement system in question, measurements are dependent on 6-mers to a good approximation.
This approach, or similar, can be generalised for any measurement system enabling the location and a minimal k-mer description to be determined.
A probabilistic framework, in particular techniques applying multiple measurements under different conditions or via different detection methods, may enable a lower-k description of the polymer to be used. For example in the case of Sense and Antisense DNA measurements discussed below, a 3-mer description may be sufficient to determine the underlying polymer k-mers where a more accurate description of each k-mer measurement would be a 6-mer. Similarly, in the case of measurement at multiple potentials, a k-mer description wherein k has a lower value may be sufficient to determine the underlying polymer k-mers where a more accurate description of each k-mer measurement would be a k-mer or k-mers wherein k has a higher value.
Similar methodology may be used to identify location and width of well-approximating k-mers in a general measurement system. In the example of Fig. 3, this is achieved by changing the position of the 6-mer relative to the pore (e.g. by varying the number of Ns before and after) to detect location of the best approximating k-mer and increasing and decreasing the number of fixed bases from 6. The value of k can be minimal subject to the spread of values being sufficiently narrow. The location of the k-mer can be chosen to minimise peak width.
For typical measurement systems, it is usually the case that measurements that are dependent on different k-mers are not all uniquely resolvable. For example, in the measurement system to which Fig. 3 relates, it is observed that the range of the measurements produced by DNA strands with a fixed 6-mer is of the order of 2 pA and the approximate measurement range of this system is between 30 pA and 70 pA. For a 6-mer, there are 4096 possible k-mers. Given that each of these has a similar variation of 2 pA, it is clear that in a 40 pA measurement range these signals will not be uniquely resolvable. Even where measurements of some k-mers are resolvable, it is typically observed that measurements of many other k-mers are not.
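The arithmetic behind this observation can be made explicit with a short calculation. The sketch below uses the figures quoted above (a four-base alphabet, 6-mers, a 2 pA spread per k-mer and a 30 pA to 70 pA range) and makes the simplifying assumption, not stated in the description, that a level is only distinguishable if it occupies its own non-overlapping band of the measurement range.

```python
def count_possible_kmers(alphabet_size, k):
    """Number of distinct k-mers over an alphabet of the given size."""
    return alphabet_size ** k


def max_distinct_levels(range_low, range_high, spread):
    """Upper bound on non-overlapping bands of width `spread` in the range."""
    return int((range_high - range_low) // spread)


n_kmers = count_possible_kmers(4, 6)
n_levels = max_distinct_levels(30.0, 70.0, 2.0)
kmers_per_level = n_kmers / n_levels

print(f"possible 6-mers: {n_kmers}")
print(f"non-overlapping 2 pA bands in a 40 pA range: {n_levels}")
print(f"average number of 6-mers sharing a band: {kmers_per_level:.1f}")
```

Under this simplification each band would be shared on average by some two hundred 6-mers, which is consistent with the statement that the signals are not uniquely resolvable.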
For many actual measurement systems, it is not possible to identify a function that transforms k measurements, that each depend in part on the same polymer unit, to obtain a single value that is resolved at the level of a polymer unit, or more generally the k-mer measurement is not describable by a set of parameters smaller than the number of k-mers.
By way of example, it will now be demonstrated that, for a particular measurement system comprising a nanopore, experimentally derived ion current measurements of polynucleotides are not accurately describable by a simple first order linear model. This is demonstrated for the two training sets described in more detail below. The simple first order linear model used for this demonstration is:
Current = Sum [ fn(Bn) ] + E
where fn are coefficients for each base Bn occurring at each position n in the measurement system and E represents the random error due to experimental variability. The data are fit to this model by a least squares method, although any one of many methods known in the art could alternatively be used. Figs. 4 and 5 are plots of the best model fit against the current measurements. If the data was well described by this model, then the points should closely follow the diagonal line within a typical experimental error (for example 2 pA). This is not the case, showing that the data is not well described by this linear model for either set of coefficients.
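To illustrate what such a fit involves, the following sketch fits a first order model of the form Current = Sum [ fn(Bn) ] + E to synthetic data and reports the largest residual. It is not the fitting procedure used for Figs. 4 and 5. It relies on the assumption that every possible k-mer is measured exactly once (a balanced design), in which case the least squares solution reduces to a grand mean plus per-position marginal means; for other data sets a general least squares solver would be needed. All names and numerical values are hypothetical.

```python
from itertools import product
from statistics import fmean

BASES = "ACGT"


def fit_first_order(currents):
    """Least squares fit of an additive per-position model (balanced design).

    `currents` maps every possible k-mer to one measured current. Returns the
    grand mean and, for each position n, the effect of each base at n.
    """
    k = len(next(iter(currents)))
    grand = fmean(currents.values())
    effects = []
    for n in range(k):
        effects.append({
            b: fmean(v for kmer, v in currents.items() if kmer[n] == b) - grand
            for b in BASES
        })
    return grand, effects


def predict(kmer, grand, effects):
    """Model prediction: grand mean plus the effect of each base in place."""
    return grand + sum(effects[n][b] for n, b in enumerate(kmer))


def max_residual(currents):
    """Largest absolute difference between data and the fitted model."""
    grand, effects = fit_first_order(currents)
    return max(abs(predict(kmer, grand, effects) - v) for kmer, v in currents.items())


# Synthetic 3-mer data (pA) that is exactly additive in the three positions.
WEIGHTS = [
    {"A": 0.0, "C": 2.0, "G": -3.0, "T": 1.0},
    {"A": 4.0, "C": -1.0, "G": 0.5, "T": -2.0},
    {"A": -1.5, "C": 0.0, "G": 2.5, "T": 1.0},
]
additive = {
    "".join(p): 50.0 + sum(WEIGHTS[n][b] for n, b in enumerate(p))
    for p in product(BASES, repeat=3)
}
# The same data with a hypothetical 5 pA interaction between two adjacent bases.
with_interaction = {
    kmer: v + (5.0 if kmer.startswith("GG") else 0.0) for kmer, v in additive.items()
}

print(f"additive data, max residual: {max_residual(additive):.3f} pA")
print(f"interaction data, max residual: {max_residual(with_interaction):.3f} pA")
```

The additive data are reproduced to within rounding error, whereas the interaction term leaves a residual that no choice of per-position coefficients can remove, which is the kind of departure from the diagonal described above.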
There will now be described a specific method of analysing an input signal that is a noisy step wave, that embodies the first aspect of the present invention. The following method relates to the case that measurements are dependent on a k-mer where k is two or more, but the same method may be applied in simplified form to measurements that are dependent on a k-mer where k is one.
The method is illustrated in Fig. 6 and may be implemented in an analysis unit 10 illustrated schematically in Fig. 6. The analysis unit 10 receives and analyses an input signal that comprises measurements from the measurement circuit 72. The analysis unit 10 and the measurement system 8 are therefore connected and together constitute an apparatus for analysing a polymer. The analysis unit 10 may also provide control signals to the control circuit 71 to select the voltage applied across the biological pore 1 in the measurement system 8, and may analyse the measurements from the measurement circuit 72 in accordance with applied voltage.
The apparatus including the analysis unit 10 and the measurement system 8 may be arranged as disclosed in any of WO-2008/102210, WO-2009/07734, WO-2010/122293 and/or WO-2011/067559.
The analysis unit 10 may be implemented by a computer program executed in a computer apparatus or may be implemented by a dedicated hardware device, or any combination thereof. In either case, the data used by the method is stored in a memory in the analysis unit 10. The computer apparatus, where used, may be any type of computer system but is typically of conventional construction. The computer program may be written in any suitable programming language. The computer program may be stored on a computer-readable storage medium, which may be of any type, for example: a recording medium which is insertable into a drive of the computing system and which may store information magnetically, optically or opto-magnetically; a fixed recording medium of the computer system such as a hard drive; or a computer memory.
The method is performed on an input signal 11 that comprises a series of measurements (or more generally any number of series, as described further below) of the type described above comprising successive groups of plural measurements that are dependent on the same k-mer without a priori knowledge of the number of measurements in any group. An example of such an input signal 11 is shown in Fig. 2 as previously described.
In a state detection step S1, the input signal 11 is processed to identify successive groups of measurements and to derive a series of measurements 12 consisting of a predetermined number, being one or more, of measurements in respect of each identified group. An analysis step S2 is performed on the thus derived series of measurements 12. The purpose of the state detection step S1 is to reduce the input signal to a predetermined number of measurements associated with each k-mer state to simplify the analysis step S2. For example a noisy step wave signal, as shown in Fig. 2, may be reduced to states where a single measurement associated with each state may be the mean current. This state may be termed a level.
The state detection step S1 may be performed using the method shown in Fig. 7 that looks for short-term increases in the derivative of the input signal 11 as follows.
In step S1-1, the input signal 11 is differentiated to derive its derivative.
In step S1-2, the derivative from step S1-1 is subjected to low-pass filtering to suppress high-frequency noise (which the differentiation tends to amplify).
In step S1-3, the filtered derivative from step S1-2 is thresholded to detect transition points between the groups of measurements, and thereby identify the groups of data.
In step S1-4, a predetermined number of measurements is derived from the input signal 11 in each group identified in step S1-3. In the simplest approach, a single measurement is derived, for example as the mean, median, or other measure of location, of the measurements in each identified group. The measurements output from step S1-4 form the series of measurements 12. In other approaches, plural measurements in respect of each group are derived.
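The four sub-steps of the state detection step can be sketched as follows. This is a minimal sketch under stated assumptions: the moving-average filter, the filter width, the threshold value and the function name `detect_states` are illustrative choices made here, not parameters given in the present disclosure.

```python
import numpy as np

def detect_states(signal, filter_width=5, threshold=1.0):
    """Return (levels, boundaries) for a noisy step wave.

    filter_width (odd) and threshold are illustrative tuning values.
    """
    signal = np.asarray(signal, dtype=float)
    # S1-1: differentiate the input signal.
    derivative = np.diff(signal)
    # S1-2: low-pass filter the derivative, here with a moving average.
    kernel = np.ones(filter_width) / filter_width
    smoothed = np.convolve(derivative, kernel, mode="same")
    # S1-3: threshold the filtered derivative; each run above the
    # threshold is treated as one transition between groups.
    above = np.abs(smoothed) > threshold
    edges = np.diff(above.astype(int))
    starts = np.flatnonzero(edges == 1) + 1
    ends = np.flatnonzero(edges == -1)
    if above[0]:
        starts = np.r_[0, starts]
    if above[-1]:
        ends = np.r_[ends, above.size - 1]
    # Place each group boundary just after the centre of its run.
    boundaries = (starts + ends) // 2 + 1
    # S1-4: derive one measurement per group, here the mean.
    levels = np.array([group.mean() for group in np.split(signal, boundaries)])
    return levels, boundaries

# Synthetic three-level trace (hypothetical values in pA).
rng = np.random.default_rng(1)
clean = np.concatenate([np.full(20, 50.0), np.full(30, 60.0), np.full(25, 45.0)])
signal = clean + rng.normal(0.0, 0.2, clean.size)
levels, boundaries = detect_states(signal)
```

Replacing `group.mean()` with `np.median(group)` gives the median variant mentioned in step S1-4; returning several statistics per group would correspond to the approaches in which plural measurements are derived.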
A common simplification of this technique is to use a sliding window analysis whereby one compares the means of two adjacent windows of data. A threshold can then be either put directly on the difference in mean, or can be set based on the variance of the data points in the two windows (for example, by calculating Student's t-statistic). A particular advantage of these methods is that they can be applied without imposing many assumptions on the data.
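The sliding window comparison can be sketched as follows, assuming two adjacent windows of equal length and a t-statistic computed from their means and sample variances. The window length, the small constant guarding against zero variance and the function name are assumptions made for illustration.

```python
import numpy as np

def window_t_statistic(signal, window=10):
    """t-statistic comparing the `window` samples before and after each index."""
    signal = np.asarray(signal, dtype=float)
    t = np.zeros(signal.size)
    for i in range(window, signal.size - window + 1):
        left = signal[i - window:i]
        right = signal[i:i + window]
        # Standard error of the difference in means for equal window sizes.
        stderr = np.sqrt((left.var(ddof=1) + right.var(ddof=1)) / window) + 1e-12
        t[i] = (right.mean() - left.mean()) / stderr
    return t

# Synthetic trace with one transition at index 50 (hypothetical values in pA).
rng = np.random.default_rng(2)
signal = np.concatenate([np.full(50, 50.0), np.full(50, 55.0)])
signal = signal + rng.normal(0.0, 0.5, signal.size)
t = window_t_statistic(signal, window=10)
transition = int(np.argmax(np.abs(t)))
```

A transition would then be declared wherever the magnitude of `t` exceeds a chosen threshold; thresholding the raw difference in means instead simply omits the division by `stderr`.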
Other information associated with the measured levels can be stored for use later in the analysis. Such information may include without limitation any of: the variance of the signal; asymmetry information; the confidence of the observation; the length of the group.
By way of example, Fig. 9 illustrates an experimentally determined input signal 11 reduced by a moving window t-test. In particular, Fig. 9 shows the input signal 11 as the light line. Levels following state detection are shown overlaid as the dark line. Fig. 10 shows the series of measurements 12 derived for the entire trace, calculating the level of each state from the mean value between transitions.
However, as described in more detail below, the state detection step S1 is optional and, in an alternative described further below, may be omitted. In this case, as shown schematically by the dotted line in Fig. 6, the analysis step S2 is performed on the input signal 11 itself, instead of the series of measurements 12.
The analysis step S2 will now be described.
The analysis step S2 uses an analytical technique that refers to a model 13 stored in the analysis unit 10. The analysis step S2 estimates an estimated sequence 16 of polymer units in the polymer based on the likelihood predicted by the model 13 of the series of measurements 12 being produced by sequences of polymer units. In the simplest case, the estimated sequence 16 may be a representation that provides a single estimated identity for each polymer unit. More generally, the estimated sequence 16 may be any representation of the sequence of polymer units according to some optimality criterion. For example, the estimated sequence 16 may comprise plural sequences, for example including plural estimated identities of one or more polymer units in part or all of the polymer.
The analysis step S2 also provides quality scores 17 that are described further below. The mathematical basis of the model 13 will now be considered.
The relationship between a sequence of random variables {X1, X2, ..., Xn} from which currents are sampled may be represented by a simple graphical model A, which represents the conditional independence relationships between variables:
[Graphical model A: diagram not reproduced]
Each current measurement is dependent on a k-mer being read, so there is an underlying set of random variables {S1, S2, ..., Sn} representing the underlying sequence of k-mers and with a corresponding graphical model B:
[Graphical model B: diagram not reproduced; each measurement Xm is linked to its underlying k-mer Sm]
These models as applied to the current area of application take advantage of the Markov property. In model A, if f(Xi) is taken to represent the probability density function of the random variable Xi, then the Markov property can be represented as:
f(X_{m} | X_{m-1}) = f(X_{m} | X_{1}, X_{2}, ..., X_{m-1})
In model B, the Markov property can be represented as:
P(S_{m} | S_{m-1}) = P(S_{m} | S_{1}, S_{2}, ..., S_{m-1})
Depending on exactly how the problem is encoded, natural methods for solution may include Bayesian networks, Markov random fields, Hidden Markov Models, and also including variants of these models, for example conditional or maximum entropy formulations of such models. Methods of solution within these slightly different frameworks are often similar. Generally, the model 13 comprises transition weightings 14 representing the chances of transitions from origin k-mers to destination k-mers; and emission weightings 15 in respect of each k-mer that represent the chances of observing given values of measurements for that k-mer. An explanation will now be given in the case that the model 13 is a Hidden Markov Model.
The Hidden Markov Model (HMM) is a natural representation in the setting given here in graphical model B. In a HMM, the relationship between the discrete random variables Sm and Sm+1 is defined in terms of a transition matrix of transition weightings 14 that in this case are probabilities representing the probabilities of transitions between the possible states that each random variable can take, that is from origin k-mers to destination k-mers. For example, conventionally the (i,j)th entry of the transition matrix is a transition weighting 14 representing the probability that S_{m+1} = s_{m+1,j} given that S_{m} = s_{m,i}, i.e. the probability of transitioning to the j'th possible value of Sm+1 given that Sm takes on its i'th possible value.
Fig. 11 is a pictorial representation of the transition matrix from Sm to Sm+1. Here Sm and Sm+1 only show 4 values for sake of illustration, but in reality there would be as many states as there are different k-mers. Each edge represents a transition, and may be labelled with the entry from the transition matrix representing the transition probability. In Fig. 11, the transition probabilities of the four edges connecting each node in the Sm layer to the Sm+1 layer would classically sum to one, although non-probabilistic weightings may be used.
In general, it is desirable that the transition weightings 14 comprise values of non-binary variables (non-binary values). This allows the model 13 to represent the actual probabilities of transitions between the k-mers.
Considering that the model 13 represents the k-mers, any given k-mer has a set of preferred transitions, one for each possible type of polymer unit, being transitions from origin k-mers to destination k-mers that have a sequence in which the first (k-1) polymer units are the final (k-1) polymer units of the origin k-mer. For example in the case of polynucleotides consisting of the 4 nucleotides G, T, A and C, the origin 3-mer TAC has preferred transitions to the 3-mers ACA, ACC, ACT and ACG. To a first approximation, conceptually one might consider that the transition probabilities of the four preferred transitions are equal, being 0.25, and that the transition probabilities of the other non-preferred transitions are zero, the non-preferred transitions being transitions from origin k-mers to destination k-mers that have a sequence different from the origin k-mer and in which the first (k-1) polymer units are not the final (k-1) polymer units of the origin k-mer. However, whilst this approximation is useful for understanding, the actual chances of transitions may in general vary from this approximation in any given measurement system. This can be reflected by the transition weightings 14 taking values of non-binary variables (non-binary values). Some examples of such variation that may be represented are as follows.
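The first approximation described above, in which each of the four preferred transitions has probability 0.25 and all other transitions have probability zero, can be sketched as follows for polynucleotides. The function name and the ordering of the k-mers in the matrix are assumptions made for illustration.

```python
import itertools
import numpy as np

UNITS = "ACGT"  # the 4 nucleotides

def first_approximation_matrix(k):
    """Transition matrix with equal weight on preferred transitions only."""
    kmers = ["".join(p) for p in itertools.product(UNITS, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    T = np.zeros((len(kmers), len(kmers)))
    for origin in kmers:
        for unit in UNITS:
            # Destination: drop the first unit of the origin, append one new unit.
            T[index[origin], index[origin[1:] + unit]] = 1.0 / len(UNITS)
    return kmers, T

kmers, T = first_approximation_matrix(3)
row = T[kmers.index("TAC")]
preferred = sorted(kmers[j] for j in np.flatnonzero(row))
```

Here `preferred` lists the destinations of TAC with non-zero weight, and every row of `T` sums to one. The variations discussed in the text correspond to replacing the equal entries with unequal values and the zeros with small non-zero values.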
One example is that the transition probabilities of the preferred transitions might not be equal. This allows the model 13 to represent polymers in which there is an interrelationship between polymer units in a sequence.
Another example is that the transition probabilities of at least some of the non-preferred transitions might be non-zero. This allows the model 13 to take account of missed measurements, that is in which there is no measurement that is dependent on one (or more) of the k-mers in the actual polymer. Such missed measurements might occur either due to a problem in the measurement system such that the measurement is not physically taken, or due to a problem in the subsequent data analysis, such as the state detection step S1 failing to identify one of the groups of measurements, for example because a given group is too short or two groups do not have sufficiently separated levels.
Notwithstanding the generality of allowing the transition weightings 14 to have any value, typically it will be the case that the transition weightings 14 represent non-zero chances of the preferred transitions from origin k-mers to destination k-mers that have a sequence in which the first (k-1) polymer units are the final (k-1) polymer units of the origin k-mer, and represent lower chances of non-preferred transitions. Typically also, the transition weightings 14 represent non-zero chances of at least some of said non-preferred transitions, even though the chances may be close to zero, or may be zero for some of the transitions that are absolutely excluded.
To allow for single missed k-mers in the sequence, the transition weightings 14 may represent non-zero chances of non-preferred transitions from origin k-mers to destination k-mers that have a sequence wherein the first (k-2) polymer units are the final (k-2) polymer units of the origin k-mer. For example in the case of polynucleotides consisting of 4 nucleotides, for the origin 3-mer TAC these are the transitions to all possible 3-mers starting with C. We may define the transitions corresponding to these single missed k-mers as "skips".
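The set of skip destinations for a given origin k-mer can be enumerated as in the following sketch. The function name is hypothetical, and excluding destinations that are already preferred transitions is an interpretation of "non-preferred" adopted here for illustration.

```python
import itertools

UNITS = "ACGT"  # assumed alphabet of polymer units

def skip_destinations(origin):
    """Non-preferred destinations whose first (k-2) units are the final (k-2) units of origin."""
    k = len(origin)
    if k < 2:
        raise ValueError("skips as defined here need k >= 2")
    candidates = (
        origin[2:] + "".join(tail) for tail in itertools.product(UNITS, repeat=2)
    )
    # Drop destinations that are already preferred (single-step) transitions.
    return [dest for dest in candidates if dest[: k - 1] != origin[1:]]

skips = skip_destinations("TAC")
```

For the origin TAC this yields every 3-mer starting with C, in line with the example in the text. For a homopolymer origin such as AAA, some candidates coincide with preferred transitions and are filtered out.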
In the case of analysing the series of measurements 12 comprising a single measurement in respect of each k-mer, then the transition weightings 14 will represent a high chance of transition for each measurement 12. Depending on the nature of the measurements, the chance of transition from an origin k-mer to a destination k-mer that is the same as the origin k-mer may be zero or close to zero, or may be similar to the chance of the non-preferred transitions.
Similarly in the case of analysing a series of measurements 12 comprising a predetermined number of measurements in respect of each k-mer, then the transition weightings 14 may represent a low or zero chance of transition between the measurements 12 in respect of the same k-mer. It is possible to change the transition weightings 14 to allow the origin k-mer and destination k-mer to be the same k-mer. This allows, for example, for falsely detected state transitions. We may define the transitions corresponding to these repeated same k-mers as "stays." We note that in the case where all of the polymer units in the k-mer are identical, a homopolymer, a preferred transition would be a stay transition. In these cases the polymer has moved one position but the k-mer remains the same.
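To show how preferred transitions, skips and stays might be combined in one set of transition weightings and then used to estimate a sequence, the following is a minimal sketch. Everything numeric in it is a labelled assumption: the weights 0.90, 0.05 and 0.05, the Gaussian emission weightings, the evenly spaced mean levels and the noise-free test levels are hypothetical, and the Viterbi decoder shown is a generic textbook form rather than the specific implementation of the analysis step S2.

```python
import itertools
import numpy as np

UNITS = "ACGT"
K = 3
KMERS = ["".join(p) for p in itertools.product(UNITS, repeat=K)]
INDEX = {kmer: i for i, kmer in enumerate(KMERS)}

def transition_matrix(p_step=0.90, p_skip=0.05, p_stay=0.05):
    """Hypothetical weights: steps shared over 4 destinations, skips over 16, plus a stay."""
    T = np.zeros((len(KMERS), len(KMERS)))
    for origin in KMERS:
        i = INDEX[origin]
        for a in UNITS:
            T[i, INDEX[origin[1:] + a]] += p_step / 4
            for b in UNITS:
                T[i, INDEX[origin[2:] + a + b]] += p_skip / 16
        # For a homopolymer the stay coincides with a step, so the weights add.
        T[i, i] += p_stay
    return T / T.sum(axis=1, keepdims=True)

def viterbi(levels, means, sigma, T):
    """Most likely k-mer path given one measured level per state."""
    levels = np.asarray(levels, dtype=float)
    with np.errstate(divide="ignore"):
        log_T = np.log(T)  # zero weights become -inf
    # Gaussian log emission weightings, dropping terms common to all states.
    log_E = -0.5 * ((levels[:, None] - means[None, :]) / sigma) ** 2
    n, s = log_E.shape
    score = log_E[0] - np.log(s)  # uniform prior over the first k-mer
    back = np.zeros((n, s), dtype=int)
    for t in range(1, n):
        cand = score[:, None] + log_T  # cand[i, j]: arrive at j from i
        back[t] = np.argmax(cand, axis=0)
        score = cand[back[t], np.arange(s)] + log_E[t]
    path = [int(np.argmax(score))]
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return [KMERS[i] for i in reversed(path)]

def kmers_to_units(path):
    """Collapse a path of single-step k-mers into polymer units (ignores skips and stays)."""
    return path[0] + "".join(kmer[-1] for kmer in path[1:])

# Noise-free illustration with made-up mean levels (pA) for each 3-mer.
sequence = "GATTACAGC"
true_path = [sequence[i:i + K] for i in range(len(sequence) - K + 1)]
means = 40.0 + 0.5 * np.arange(len(KMERS))
levels = [means[INDEX[kmer]] for kmer in true_path]
decoded = viterbi(levels, means, sigma=0.1, T=transition_matrix())
```

In this toy case `decoded` reproduces `true_path` and `kmers_to_units(decoded)` returns the original sequence. With real measurements the emission weightings overlap between k-mers, which is where the relative sizes of the step, skip and stay weightings start to influence the estimate.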
Claims
1. A method of estimating a sequence of polymer units in a polymer from at least one series of measurements related to the polymer, wherein the value of each measurement is dependent on a k-mer, being a group of k polymer units where k is a positive integer, the method comprising:
providing a model comprising, for a set of possible k-mers:
transition weightings representing the chances of transitions from origin k-mers to destination k-mers; and
emission weightings in respect of each k-mer that represent the chances of observing given values of measurements for that k-mer; and
analysing the series of measurements using an analytical technique that refers to the model and estimating at least one estimated sequence of polymer units in the polymer based on the likelihood predicted by the model of the series of measurements being produced by sequences of polymer units.
2. A method according to claim 1, wherein at least one of the transition weightings and the emission weightings comprise values of non-binary variables.
3. A method according to claim 2, wherein both of the transition weightings and the emission weightings comprise values of non-binary variables.
4. A method according to any one of claims 1 to 3, wherein the emission weightings represent non-zero chances of observing all possible measurements.
5. A method according to any one of claims 1 to 4, wherein the emission weightings in respect of each k-mer have a unimodal or multimodal distribution over the values of measurements.
6. A method according to claim 5, wherein the emission weightings in respect of each k-mer have a Gaussian, Laplace, square or triangular distribution over the values of measurements.
7. A method according to any one of claims 1 to 6, wherein k is a plural integer.
8. A method according to claim 7, wherein the transition weightings represent non-zero chances of preferred transitions, being transitions from origin k-mers to destination k-mers that have a sequence in which the first (k-1) polymer units are the final (k-1) polymer units of the origin k-mer, and represent lower chances of non-preferred transitions, being transitions from origin k-mers to destination k-mers that have a sequence different from the origin k-mer and in which the first (k-1) polymer units are not the final (k-1) polymer units of the origin k-mer.
9. A method according to claim 8, wherein the transition weightings represent non-zero chances of at least some of said non-preferred transitions.
10. A method according to claim 9, wherein the transition weightings represent non-zero chances of non-preferred transitions from origin k-mers to destination k-mers that have a sequence wherein the first (k-2) polymer units are the final (k-2) polymer units of the origin k-mer.
11. A method according to any one of claims 1 to 10, wherein the analytical technique is a probabilistic technique.
12. A method according to any one of claims 1 to 11, wherein the transition weightings are probabilities, and/or the emission weightings are probabilities.
13. A method according to any one of claims 1 to 12, wherein the model is a Hidden Markov Model.
14. A method according to any one of claims 1 to 13, wherein the step of analysing further comprises deriving a quality score in respect of the or each estimated sequence that represents the likelihood predicted by the model of the series of measurements being produced by the estimated sequence of polymer units.
15. A method according to any one of claims 1 to 14, wherein the step of analysing further comprises deriving quality scores in respect of individual k-mers corresponding to the estimated sequence of polymer units, that represent the likelihoods predicted by the model of the series of measurements being produced by a sequence including the individual k-mers.
16. A method according to any one of claims 1 to 15, wherein the step of analysing further comprises deriving quality scores in respect of sequences of k-mers corresponding to the estimated sequence of polymer units, that represent the likelihoods predicted by the model of the series of measurements being produced by the given sequences of k-mers.
17. A method according to any one of claims 1 to 16, wherein the step of analysing derives plural estimated sequences of polymer units in the polymer.
18. A method according to any one of claims 1 to 17, wherein the step of estimating at least one estimated sequence of polymer units in the polymer comprises:
estimating a sequence of k-mers based on the likelihood predicted by the model of the series of measurements being produced by the individual k-mers; and
estimating a sequence of polymer units from the estimated sequence of k-mers.
19. A method according to any one of claims 1 to 18, wherein the step of estimating at least one estimated sequence of polymer units in the polymer comprises:
estimating at least one sequence of k-mers based on the likelihood predicted by the model of the series of measurements being produced by overall sequences of k-mers; and
estimating a sequence of polymer units from the estimated sequence of k-mers.
20. A method according to any one of claims 1 to 19, wherein, in the at least one series of measurements, a predetermined number of measurements are dependent on each k-mer, the predetermined number being one or more.
21. A method according to claim 20, wherein
the method comprises receiving at least one input signal comprising an input series of measurements in which groups of plural measurements are dependent on the same k-mer, without a priori knowledge of the number of measurements in the group, and
before the step of analysing, processing the at least one input signal to identify successive groups of measurements and to derive said predetermined number of measurements in respect of each identified group, the step of analysing being performed on the or each series of measurements thus derived.
22. A method according to any one of claims 1 to 19, wherein, in the at least one series of measurements, groups of plural measurements are dependent on the same k-mer, without a priori knowledge of number of measurements in the group.
23. A method according to any one of claims 1 to 22, further comprising making said measurements of a polymer.
24. A method according to claim 23, wherein said measurements of the polymer are made during translocation of the polymer through a nanopore.
25. A method according to claim 24, wherein translocation of the polymer is performed such that groups of plural measurements are dependent on the same k-mer.
26. A method according to claim 24 or 25 wherein translocation of the polymer through the nanopore is performed in a ratcheted manner.
27. A method according to any one of claims 24 to 26, wherein the polymer is a polynucleotide, and the polymer units are nucleotides.
28. A method according to any one of claims 24 to 27, wherein the series of measurements are measurements taken during translocation of the polymer through a nanopore.
29. A method according to any one of claims 24 to 28, wherein the nanopore is a biological pore.
30. A method according to any one of claims 24 to 29, wherein the measurements comprise one or more of current measurements, impedance measurements, tunnelling measurements, FET measurements and optical measurements.
31. A method according to any one of claims 24 to 30, wherein
the method is performed on plural series of measurements each related to said polymer, wherein the value of each measurement is dependent on a k-mer,
the analytical technique treats the plural series of measurements as arranged in plural, respective dimensions.
32. A method according to claim 31, wherein each series of measurements are measurements of the same region of the same polymer.
33. A method according to claim 31, wherein the plural series of measurements comprise two series of measurements, wherein the first series of measurements are measurements of a first region of a polymer and the second series of measurements are measurements of a second region of a polymer that is related to said first region.
34. A method according to claim 33, wherein the first and second regions are related regions of the same polymer.
35. A method according to claims 33 or 34 wherein the related regions are complementary.
36. A method according to any one of claims 1 to 35, wherein the model is stored in a memory.
37. A method according to any one of claims 1 to 36, wherein the steps of providing a model and analysing measurements are implemented in a hardware apparatus or in a computer apparatus.
38. A device configured to perform a method according to any one of claims 1 to 37.
39. An analysis device for estimating a sequence of polymer units in a polymer from at least one series of measurements related to the polymer, wherein the value of each measurement is dependent on a k-mer being a group of k polymer units, where k is a plural integer, the analysis device comprising:
a memory storing a model comprising, for a set of possible k-mers:
transition weightings representing the chances of transitions from origin k-mers to destination k-mers; and
emission weightings in respect of each k-mer that represent the chances of observing given values of measurements for that k-mer; and
an analysis unit configured to analyse the series of measurements using an analytical technique that refers to the model and to estimate at least one estimated sequence of polymer units in the polymer based on the likelihood predicted by the model of the series of measurements being produced by sequences of polymer units.
40. A sequencing apparatus comprising:
a measurement device configured to make said measurements of a polymer; and
an analysis device according to claim 38 or 39.
41. A method of analysing a polymer comprising polymer units, the method comprising:
during translocation of a polymer through a nanopore while a voltage is applied across the nanopore, making measurements that are dependent on the identity of k-mers in the nanopore, a k-mer being k polymer units of the polymer, where k is a positive integer, wherein the measurements comprise, in respect of individual k-mers, separate measurements made at different levels of said voltage applied across the nanopore; and
analysing the measurements at said different levels of said voltage to determine the identity of at least part of the polymer.
42. A method according to claim 41, wherein said step of making measurements comprises: performing plural translocations of said polymer through a nanopore while a voltage is applied across the nanopore at different levels in different translocations;
during said different translocations, making measurements of said k-mers at said different levels of said voltage across the nanopore.
43. A method according to claim 42, wherein said plural translocations include translocation in a first direction through the nanopore and translocation in the opposite direction through the nanopore to the first direction.
44. A method according to claim 41, wherein said step of making measurements comprises: performing a translocation of said polymer through a nanopore while a voltage is applied across the nanopore;
during said translocation of the polymer through the nanopore, applying said different levels of said voltage in a cycle having a cycle period shorter than the duration of states in which said measurements are dependent on said individual k-mers, and making said separate measurements in respect of said individual k-mers at said different levels of said voltage in said cycle.
45. A method of making measurements of a polymer comprising polymer units, the method comprising:
performing a translocation of said polymer through a nanopore while a voltage is applied across the nanopore;
during said translocation of the polymer through the nanopore, applying different levels of said voltage in a cycle, and
making measurements that are dependent on the identity of k-mers in the nanopore, a k-mer being k polymer units of the polymer, where k is a positive integer, the measurements comprising separate measurements in respect of individual k-mers at said different levels of said voltage in said cycle, the cycle having a cycle period shorter than states in which said measurements are dependent on said individual k-mers.
46. A method according to claim 44 or 45, wherein the cycle period is at most 3s.
47. A method according to any one of claims 44 to 46, wherein the cycle period is at least 0.5ms.
48. A method according to any one of claims 44 to 47, wherein the different levels of said voltage are each applied continuously for partial periods of said cycle.
49. A method according to claim 48, wherein the transitions between said different levels of said voltage in said cycle are shaped to reduce capacitive transients in the measurement caused by the voltage changes.
50. A method according to claim 45 or any one of claims 46 to 49 when appendant to claim 45, further comprising analysing the measurements to determine the identity of the polymer.
51. A method according to any one of claims 41 to 44 or 50, wherein the step of analysing the measurements to estimate the identity of the polymer comprises analysing the measurements to estimate a sequence of polymer units in the polymer.
52. A method according to claim 51, wherein the step of analysing the measurements to estimate a sequence of polymer units in the polymer comprises:
providing a model comprising, for a set of possible k-mers:
transition weightings representing the chances of transitions from origin k-mers to destination k-mers; and
emission weightings in respect of each k-mer that represent the chances of observing given values of measurements for that k-mer; and
analysing the measurements using an analytical technique that refers to the model and treats the measurements made under the application of different levels of voltage across the nanopore as measurements in plural dimensions, and estimating at least one estimated sequence of polymer units in the polymer based on the likelihood predicted by the model of the series of measurements being produced by sequences of polymer units.
53. A method according to any one of claims 41 to 44, 51 or 52, wherein the step of analysing the measurements to determine the identity of the polymer further comprises comparing the separate measurements made at said different voltage levels to determine a transition between states in which said measurements are dependent on said individual k-mers.
54. A method according to any one of the preceding claims, wherein the difference between said different levels of voltage is in the range from 10 mV to 1.5 V.
55. A method according to any one of the preceding claims, wherein said different levels consist of two different levels.
56. A method according to any one of the preceding claims, wherein the different levels of voltage are of the same polarity.
57. A method according to any one of the preceding claims, wherein said measurements are measurements of ion current flow through the nanopore.
58. A method according to claim 57, wherein said measurements of ion current flow through the nanopore are measurements of DC ion current flow through the nanopore.
59. A method according to any one of the preceding claims, comprising
making groups of multiple measurements at each one of said different levels of said voltage; and
deriving one or more summary measurements from each group of multiple measurements at each one of said different levels to constitute said separate measurements in respect of an individual k-mer.
60. A method according to claim 59, wherein the different levels of said voltage are each applied continuously for a period of time and
during each respective period of time, making one of the groups of multiple measurements at one of the said different levels of said voltage applied during the respective period.
61. A method according to any one of the preceding claims, wherein the polymer is a polynucleotide, and the polymer units are nucleotides.
62. A method according to any one of the preceding claims, wherein the nanopore is a biological pore.
63. A method according to any one of the preceding claims, wherein said translocation of the polymer through the nanopore is performed in a ratcheted manner in which successive k-mers are registered with the nanopore.
64. A method according to any one of the preceding claims, wherein the translocation of the polymer is controlled by a molecular ratchet.
65. A method according to claim 64, wherein the molecular ratchet is an enzyme.
66. An apparatus for analysing a polymer comprising polymer units, the apparatus comprising: a nanopore through which a polymer may be translocated;
a control circuit arranged to apply a voltage across the nanopore during translocation of the polymer through the nanopore; and
a measurement circuit arranged to make measurements that are dependent on the identity of k-mers in the nanopore, a k-mer being k polymer units of the polymer, where k is a positive integer, wherein the control circuit is arranged to apply different levels of voltage across the nanopore and the measurement circuit is arranged to make separate measurements, in respect of individual k-mers, at different levels of said voltage applied across the nanopore; and
an analysis unit arranged to analyse the measurements at said different levels of said voltage to determine the identity of at least part of the polymer.
67. An apparatus according to claim 66, wherein the control circuit is arranged to apply different levels of voltage across the nanopore during different translocations of said polymer through a nanopore, and the measurement circuit is arranged to make separate measurements, in respect of individual k-mers, during said different translocations at different levels of said voltage.
68. An apparatus according to claim 66, wherein the control circuit is arranged, during said translocation of the polymer through the nanopore, to apply said different levels of said voltage in a cycle having a cycle period shorter than the duration of states in which said measurements are dependent on said individual k-mers, and the measurement circuit is arranged to make separate measurements, in respect of individual k-mers, at said different levels of said voltage in said cycle.
69. An apparatus for measuring a polymer comprising polymer units, the apparatus comprising: a nanopore through which a polymer may be translocated;
a control circuit arranged, during translocation of the polymer through the nanopore, to apply different levels of said voltage in a cycle having a cycle period shorter than the duration of states in which said measurements are dependent on said individual k-mers; and
a measurement circuit arranged to make separate measurements, in respect of individual k-mers, at different levels of said voltage applied across the nanopore.
70. An apparatus according to claim 69, further comprising an analysis unit arranged to analyse the measurements at said different levels of said voltage to determine the identity of at least part of the polymer.
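By way of illustration only, the separate measurements recited in claims 68 to 70 can be sketched as a grouping step: while the voltage is cycled faster than the k-mer states change, each state yields several samples at every voltage level, and these can be summarised per level. The following minimal Python sketch assumes that the samples arrive as (voltage, current) pairs in time order and that the state boundaries have already been found by some earlier step. The function name `per_state_levels`, the data layout and all numerical values are hypothetical and are not taken from the specification.

```python
from collections import defaultdict
from statistics import mean


def per_state_levels(samples, state_bounds):
    """Summarise current per voltage level within each k-mer state.

    samples      -- list of (voltage_mV, current_pA) pairs in time order
                    (assumed layout, for illustration only)
    state_bounds -- list of (start, end) sample indices, end exclusive,
                    one pair per state (assumed to be known already)

    Returns one dict per state mapping voltage level -> mean current.
    """
    results = []
    for start, end in state_bounds:
        by_voltage = defaultdict(list)
        # Separate the samples of this state by the voltage applied
        # at the moment each sample was taken.
        for voltage, current in samples[start:end]:
            by_voltage[voltage].append(current)
        results.append({v: mean(c) for v, c in by_voltage.items()})
    return results


# Hypothetical trace: the voltage alternates between 100 mV and 140 mV
# on every sample, so each 4-sample state spans two full cycles.
samples = [
    (100, 50.0), (140, 70.0), (100, 52.0), (140, 72.0),  # first state
    (100, 40.0), (140, 65.0), (100, 42.0), (140, 67.0),  # second state
]
state_bounds = [(0, 4), (4, 8)]

levels = per_state_levels(samples, state_bounds)
print(levels)
```

Each state then carries one summary value per voltage level, which is the form of input an analysis unit of the kind recited in claim 70 might use to distinguish k-mers that give similar currents at a single voltage.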
| # | Name | Date |
|---|---|---|
| 1 | Sequence Listing.txt | 2014-03-20 |
| 2 | Form 5.pdf | 2014-03-20 |
| 3 | Form 3.pdf | 2014-03-20 |
| 4 | Drawings.pdf | 2014-03-20 |
| 5 | Complete Specification.pdf | 2014-03-20 |
| 6 | Abstract.pdf | 2014-03-20 |
| 7 | 2037-delnp-2014-Correspondence-Others-(20-03-2014).pdf | 2014-03-20 |
| 8 | 2037-DELNP-2014.pdf | 2014-03-21 |
| 9 | 2037-DELNP-2014-GPA-(24-04-2014).pdf | 2014-04-24 |
| 10 | 2037-DELNP-2014-Correspondence-Others-(24-04-2014).pdf | 2014-04-24 |
| 11 | 2037-DELNP-2014-Form-3-(29-08-2014).pdf | 2014-08-29 |
| 12 | 2037-DELNP-2014-Correspondence-Others-(29-08-2014).pdf | 2014-08-29 |
| 13 | Other Document [23-09-2015(online)].pdf | 2015-09-23 |
| 14 | Marked Copy [23-09-2015(online)].pdf | 2015-09-23 |
| 15 | Form 13 [23-09-2015(online)].pdf | 2015-09-23 |
| 16 | Description(Complete) [23-09-2015(online)].pdf | 2015-09-23 |
| 17 | 2037-DELNP-2014-SEQUENCE LISTING [17-07-2020(online)].txt | 2020-07-17 |
| 18 | 2037-DELNP-2014-RELEVANT DOCUMENTS [17-07-2020(online)].pdf | 2020-07-17 |
| 19 | 2037-DELNP-2014-CORRESPONDENCE [17-07-2020(online)].pdf | 2020-07-17 |
| 20 | 2037-DELNP-2014-OTHERS [17-07-2020(online)].pdf | 2020-07-17 |
| 21 | 2037-DELNP-2014-Information under section 8(2) [17-07-2020(online)].pdf | 2020-07-17 |
| 22 | 2037-DELNP-2014-Information under section 8(2) [17-07-2020(online)]-1.pdf | 2020-07-17 |
| 23 | 2037-DELNP-2014-FORM 3 [17-07-2020(online)].pdf | 2020-07-17 |
| 24 | 2037-DELNP-2014-FER_SER_REPLY [17-07-2020(online)].pdf | 2020-07-17 |
| 25 | 2037-DELNP-2014-DRAWING [17-07-2020(online)].pdf | 2020-07-17 |
| 26 | 2037-DELNP-2014-PETITION UNDER RULE 137 [17-07-2020(online)].pdf | 2020-07-17 |
| 27 | 2037-DELNP-2014-COMPLETE SPECIFICATION [17-07-2020(online)].pdf | 2020-07-17 |
| 28 | 2037-DELNP-2014-CLAIMS [17-07-2020(online)].pdf | 2020-07-17 |
| 29 | 2037-DELNP-2014-ABSTRACT [17-07-2020(online)].pdf | 2020-07-17 |
| 30 | 2037-DELNP-2014-FER.pdf | 2021-10-17 |
| 31 | 2037-DELNP-2014-RELEVANT DOCUMENTS [02-12-2021(online)].pdf | 2021-12-02 |
| 32 | 2037-DELNP-2014-RELEVANT DOCUMENTS [02-12-2021(online)]-1.pdf | 2021-12-02 |
| 33 | 2037-DELNP-2014-POA [02-12-2021(online)].pdf | 2021-12-02 |
| 34 | 2037-DELNP-2014-POA [02-12-2021(online)]-1.pdf | 2021-12-02 |
| 35 | 2037-DELNP-2014-MARKED COPIES OF AMENDEMENTS [02-12-2021(online)].pdf | 2021-12-02 |
| 36 | 2037-DELNP-2014-MARKED COPIES OF AMENDEMENTS [02-12-2021(online)]-1.pdf | 2021-12-02 |
| 37 | 2037-DELNP-2014-FORM 13 [02-12-2021(online)].pdf | 2021-12-02 |
| 38 | 2037-DELNP-2014-FORM 13 [02-12-2021(online)]-1.pdf | 2021-12-02 |
| 39 | 2037-DELNP-2014-AMENDED DOCUMENTS [02-12-2021(online)].pdf | 2021-12-02 |
| 40 | 2037-DELNP-2014-AMENDED DOCUMENTS [02-12-2021(online)]-1.pdf | 2021-12-02 |
| 41 | 2037-DELNP-2014-Response to office action [07-12-2021(online)].pdf | 2021-12-07 |
| 42 | 2037-DELNP-2014-Response to office action [01-03-2023(online)].pdf | 2023-03-01 |
| 43 | 2037-DELNP-2014-PatentCertificate03-03-2023.pdf | 2023-03-03 |
| 44 | 2037-DELNP-2014-IntimationOfGrant03-03-2023.pdf | 2023-03-03 |
| 1 | 1searchstrgy_11-02-2020.pdf | |