
Cyclopropene Amino Acids And Methods

Abstract: The invention relates to a polypeptide comprising an amino acid having a cyclopropene group wherein said cyclopropene group is joined to the amino acid via a carbamate group. Suitably the cyclopropene group is a 1,3-disubstituted cyclopropene such as a 1,3-dimethylcyclopropene. Suitably the cyclopropene group is present as a residue of a lysine amino acid. The invention also relates to methods of making the polypeptides. The invention also relates to an amino acid comprising cyclopropene wherein said cyclopropene group is joined to the amino acid moiety via a carbamate group.


Patent Information

Application #
Filing Date
09 August 2016
Publication Number
36/2016
Publication Type
INA
Invention Field
BIOTECHNOLOGY
Status
Email
rahulb@adastraip.com
Parent Application
Patent Number
Legal Status
Grant Date
2020-11-28
Renewal Date

Applicants

MEDICAL RESEARCH COUNCIL
2nd Floor David Philips Building Polaris House North Star Avenue Swindon SN2 1FL

Inventors

1. ELLIOTT Thomas
MRC Laboratory of Molecular Biology Francis Crick Avenue Cambridge Biomedical Campus Cambridge Cambridgeshire CB2 0QH

Specification

The invention relates to site-specific incorporation of bio-orthogonal groups via the
(expanded) genetic code. In particular the invention relates to incorporation of
carbamate-bonded cyclopropenes into polypeptides via genetically incorporated amino
acids such as lysines. Such cyclopropene groups are useful for addition of further
chemical groups such as tetrazines.
BACKGROUND TO THE INVENTION
The site-specific incorporation of bio-orthogonal groups via genetic code expansion
provides a powerful general strategy for site specifically labelling proteins with any
probe. However, the slow reactivity of the bio-orthogonal functional groups that can be
genetically encoded, and/or their need for photoactivation, has limited this strategy's
utility.
The rapid, site-specific labeling of proteins with diverse probes remains an outstanding
challenge for chemical biologists; enzyme mediated labeling approaches may be rapid,
but use protein or peptide fusions that introduce perturbations into the protein under
study and may limit the sites that can be labeled, while many 'bio-orthogonal' reactions
for which a component can be genetically encoded are too slow to effect the
quantitative and site specific labeling of proteins on a time-scale that is useful to study
many biological processes.
There is a pressing need for general methods to site-specifically label proteins, in
diverse contexts, with user-defined probes.
Inverse electron demand Diels-Alder reactions involving tetrazines have emerged as an
important class of rapid bio-orthogonal reactions. The rates reported for some of these
reactions are very fast.
Yu et al. 2012 (Angew. Chem. Int. Ed. Volume 51, pages 10600-10604) disclose
Genetically Encoded Cyclopropene Directs Rapid, Photoclick Chemistry Mediated
Protein Labelling in Mammalian Cells. The authors report the synthesis of a stable
cyclopropene amino acid, the characterisation of its reactivity in a photo induced
cycloaddition reaction with two tetrazoles, its site-specific incorporation into proteins
both in E. coli and in mammalian cells, and its use in directing bioorthogonal labelling of
proteins both in vitro and in vivo. In order to incorporate their cyclopropene
containing amino acid into proteins, the authors had to evolve an orthogonal
tRNA/tRNA synthetase pair that selectively charges their cyclopropene lysine amino
acid in response to a TAG amber codon. This required a synthetase library to be
constructed, five positions within that synthetase to be randomised, together with at
least five rounds of positive and negative selection screening. It is a drawback of this
work that it relies on the specific mutant synthetase produced. In joining their
tetrazole compounds to the cyclopropene moiety in their modified amino acids, Yu et al.
use photo activation. Photo activation is carried out at either 302 nanometres or 365
nanometres. The requirement for photo activation in joining tetrazoles to the amino
acid of Yu et al. is a drawback in the art. This is a laborious extra step in the conjugation
chemistry. UV is also damaging to cells and so is disadvantageous in the in vivo/
cellular setting.
Kamber et al. disclose Isomeric Cyclopropenes Exhibiting Unique Bioorthogonal
Reactivities (2013 JACS Volume 135, pages 13680-13683). The authors discuss two
reactions that can be used to tag biomolecules in complex environments: the inverse
electron demand Diels-Alder reaction of tetrazines with 1,3-disubstituted
cyclopropenes, and the 1,3-dipolar cycloaddition of nitrile imines with 3,3-disubstituted
cyclopropenes. The authors discuss various chemical reaction schemes used to
generate stable cycloadducts. None of the molecules discussed by Kamber et al. are
amino acids. There is no reason to imagine that the compounds as described could be
incorporated into amino acids. Even if any such incorporation was attempted, there is
absolutely no suggestion or guidance which might allow such compounds to be
incorporated into polypeptides. No schemes for synthesis of amino acids comprising
any of the chemical groups described are presented by Kamber et al. There are no
biochemical tools for incorporation into proteins mentioned anywhere in this
document. Kamber et al. are solely concerned with examining the substitution pattern
on the cyclopropene, one such pattern allowing reactions with tetrazines and one such
pattern not being permissive of reactions with tetrazines.
The present invention seeks to overcome problem(s) associated with the prior art.
SUMMARY OF THE INVENTION
In one aspect the invention provides a polypeptide comprising an amino acid having a
cyclopropene group wherein said cyclopropene group is joined to the amino acid via a
carbamate group.
Suitably said cyclopropene group is a 1,3-disubstituted cyclopropene. Suitably said
cyclopropene is a 1,3-dimethylcyclopropene. Suitably said cyclopropene group is
present as a residue of a lysine amino acid. Suitably said polypeptide further comprises
a tetrazine compound linked to said cyclopropene group.
In another aspect, the invention relates to an amino acid comprising cyclopropene
wherein said cyclopropene group is joined to the amino acid moiety via a carbamate
group.
Suitably said cyclopropene is a 1,3-disubstituted cyclopropene. Suitably said
cyclopropene is a 1,3-dimethylcyclopropene. Suitably said amino acid is a lysine amino
acid. Suitably said amino acid comprises
Nε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine.
Suitably said amino acid comprises, or more suitably consists of:
In another aspect, the invention relates to a method of producing a polypeptide
comprising a cyclopropene group wherein said cyclopropene group is joined to the
amino acid moiety via a carbamate group, said method comprising genetically
incorporating an amino acid comprising a cyclopropene group joined to the amino acid
moiety via a carbamate group, into a polypeptide.
Suitably producing the polypeptide comprises
(i) providing a nucleic acid encoding the polypeptide which nucleic acid comprises
an orthogonal codon encoding the amino acid having a cyclopropene group;
(ii) translating said nucleic acid in the presence of an orthogonal tRNA
synthetase/tRNA pair capable of recognising said orthogonal codon and incorporating
said amino acid having a cyclopropene group into the polypeptide chain.
Suitably said orthogonal codon comprises an amber codon (TAG), said tRNA comprises
MbtRNACUA and said tRNA synthetase comprises MbPylRS.
Suitably said orthogonal codon comprises an amber codon (TAG), said tRNA comprises
MmtRNACUA and said tRNA synthetase comprises MmPylRS.
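Steps (i) and (ii) can be illustrated with a minimal Python sketch. It is not part of the described method: the coding sequence is made up, the codon table is deliberately partial, and the one-letter symbol "X" for the cyclopropene amino acid is a hypothetical placeholder. It only shows the difference between reading an in-frame TAG as a stop and reading it through an orthogonal synthetase/tRNA pair.

```python
# Minimal sketch (illustrative assumptions only): translate a coding sequence,
# reading an in-frame TAG as the cyclopropene amino acid when an orthogonal
# synthetase/tRNA pair is present, and as a stop otherwise.

# Partial codon table covering only the codons used in the toy sequence below.
CODON_TABLE = {
    "ATG": "M", "AAA": "K", "AAG": "K", "GGT": "G", "GAA": "E",
    "TAA": "*", "TGA": "*",  # ochre and opal still terminate translation
}
AMBER = "TAG"
UNNATURAL = "X"  # hypothetical symbol for the cyclopropene amino acid


def translate(cds: str, amber_suppression: bool) -> str:
    """Translate cds codon by codon until a stop codon is reached."""
    protein = []
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        if codon == AMBER:
            if not amber_suppression:
                break  # no orthogonal pair: amber acts as a stop codon
            protein.append(UNNATURAL)
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


toy_cds = "ATGGGTTAGGAAAAATAA"  # made-up sequence: M G (amber) E K stop
print(translate(toy_cds, amber_suppression=True))   # MGXEK
print(translate(toy_cds, amber_suppression=False))  # MG
```

Without the orthogonal pair the toy sequence yields a truncated product, which is why full-length polypeptide is a convenient readout of incorporation.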
In another aspect, the invention relates to a method as described above wherein said
amino acid comprising a cyclopropene group is an amino acid as described above.
In another aspect, the invention relates to a method of producing a polypeptide
comprising a tetrazine group, said method comprising providing a polypeptide
comprising a cyclopropene group as described above, contacting said polypeptide with
a tetrazine compound, and incubating to allow joining of the tetrazine to the
cyclopropene group by an inverse electron demand Diels-Alder cycloaddition reaction.
Suitably said reaction is allowed to proceed for 10 minutes or less, preferably for 1
minute or less, preferably for 30 seconds or less. Reactions in vivo, or in eukaryotic
culture conditions such as tissue culture medium or other suitable media for eukaryotic
cells, may need to be conducted for longer than 30 seconds to achieve maximal
labelling. The skilled operator can determine optimum reaction times by trial and error
based on the guidance provided herein.
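As a rough aid to choosing reaction times, labelling can be estimated with standard pseudo-first-order kinetics, assuming the tetrazine is in large excess over the cyclopropene sites. The sketch below is an illustration only: the rate constant and the tetrazine concentration are hypothetical placeholders, not values taken from this specification, and should be replaced by measured values.

```python
import math

# Minimal sketch, assuming pseudo-first-order conditions (tetrazine in large
# excess): fraction labelled = 1 - exp(-k2 * [tetrazine] * t).


def fraction_labelled(k2: float, tetrazine_molar: float, seconds: float) -> float:
    """Fraction of cyclopropene sites labelled after `seconds`."""
    return 1.0 - math.exp(-k2 * tetrazine_molar * seconds)


def time_to_fraction(k2: float, tetrazine_molar: float, target_fraction: float) -> float:
    """Time in seconds needed to reach `target_fraction` labelling."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    return -math.log(1.0 - target_fraction) / (k2 * tetrazine_molar)


# Hypothetical inputs, chosen only to exercise the functions.
K2_HYPOTHETICAL = 10.0            # second-order rate constant, M^-1 s^-1
TETRAZINE_M_HYPOTHETICAL = 1e-3   # tetrazine concentration, mol/L

for t in (30, 60, 600):
    labelled = fraction_labelled(K2_HYPOTHETICAL, TETRAZINE_M_HYPOTHETICAL, t)
    print(f"{t:>4} s: {labelled:.3f} labelled")
```

Because the observed rate scales with tetrazine concentration, a longer incubation and a higher probe concentration are interchangeable within this simple model.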
In another aspect, the invention relates to a polypeptide as described above wherein
said polypeptide comprises two or more amino acids each having a cyclopropene group,
wherein each said cyclopropene group is joined to each said amino acid via a carbamate
group. Provision of two or more cyclopropene groups on the polypeptide
advantageously allows joining of two or more conjugated groups (functional groups) to
the polypeptide. This is especially helpful when the conjugated groups (functional
groups) comprise drug molecules such as cytotoxic molecules such as in an antibody-drug
conjugate.
Suitably said polypeptide comprises four amino acids each having a cyclopropene
group.
Suitably the antibody drug conjugate (ADC) comprising a polypeptide as described
above comprises four amino acids each having a cyclopropene group. This is especially
advantageous for the joining of four cytotoxic molecules to the ADC of interest.
In another aspect, the invention relates to an antibody drug conjugate (ADC)
comprising a polypeptide as described above. Suitably the polypeptide is an antibody
polypeptide such as whole antibody (e.g. a monoclonal antibody (mAb)) or is an
antibody fragment (e.g. a single-chain variable fragment [scFv]), suitably an antibody
fragment comprising CDR amino acid sequence.
Suitably the antibody polypeptide (or fragment) may advantageously be humanised by
manufacture of chimaeric antibody polypeptide(s); suitably the antibody polypeptide
(or fragment) may advantageously be CDR-grafted; suitably the antibody polypeptide
(or fragment) may advantageously be fully humanised to the extent that the technology
permits.
Suitably the antibody polypeptide (or fragment) may be fused to another polypeptide of
interest such as a ligand for the transferrin receptor, for example transferrin or
a part thereof, to assist in transport and/or targeting of the ADC.
In another aspect, the invention relates to a polypeptide as described above wherein
said tetrazine group is further joined to a fluorophore.
Suitably said fluorophore comprises fluorescein, tetramethyl rhodamine (TAMRA) or
boron-dipyrromethene (BODIPY).
Suitably said fluorophore may comprise one or more Alexa fluorophore(s). Suitably
said fluorophore may comprise one or more Cyanine based fluorophore(s).
DETAILED DESCRIPTION
Genetic code expansion methods allow the quantitative, site-specific, and genetically
directed incorporation of unnatural amino acids with diverse chemical structures and
bearing diverse functional groups. This is most commonly achieved by inserting the
unnatural amino acid in response to an amber stop codon introduced into a gene of
interest. Genetic code expansion is achieved via the introduction of an orthogonal
aminoacyl-tRNA synthetase/tRNACUA pair into cells. The pyrrolysyl-tRNA
synthetase/tRNACUA pair is amongst the most useful pairs for genetic code expansion,
because it 1) can specifically recognize a range of useful unnatural amino acids, 2) can
be evolved to recognize an extended range of chemical structures, and 3) can be used as
an orthogonal pair for genetic code expansion in E. coli, yeast, mammalian cells,
C. elegans and D. melanogaster.
We demonstrate production of newly synthesized proteins with cyclopropene groups
that can be labelled with tetrazine probes introduced via a chemoselective inverse
electron demand Diels-Alder reaction.
In another aspect, the invention relates to a homogeneous recombinant polypeptide as
described above. Suitably said polypeptide is made by a method as described above.
Also disclosed is a polypeptide produced according to the method(s) described herein.
As well as being the product of those new methods, such a polypeptide has the technical
feature of comprising cyclopropene, suitably carbamate-linked cyclopropene.
Mutating has its normal meaning in the art and may refer to the substitution or
truncation or deletion of the residue, motif or domain referred to. Mutation may be
effected at the polypeptide level e.g. by synthesis of a polypeptide having the mutated
sequence, or may be effected at the nucleotide level e.g. by making a nucleic acid
encoding the mutated sequence, which nucleic acid may be subsequently translated to
produce the mutated polypeptide. Where no amino acid is specified as the replacement
amino acid for a given mutation site, suitably a randomisation of said site is used. As a
default mutation, alanine (A) may be used. Suitably the mutations used at particular
site(s) are as set out herein.
A fragment is suitably at least 10 amino acids in length, suitably at least 25 amino acids,
suitably at least 50 amino acids, suitably at least 100 amino acids, suitably at least 200
amino acids, suitably at least 250 amino acids, suitably at least 300 amino acids,
suitably at least 313 amino acids, or suitably the majority of the polypeptide of interest.
The methods of the invention may be practiced in vivo or in vitro.
In one embodiment, suitably the methods of the invention are not applied to the
human or animal body. Suitably the methods of the invention are in vitro methods.
Suitably the methods do not require the presence of the human or animal body.
Suitably the methods are not methods of diagnosis or of surgery or of therapy of the
human or animal body.
The term 'comprises' (comprise, comprising) should be understood to have its normal
meaning in the art, i.e. that the stated feature or group of features is included, but that
the term does not exclude any other stated feature or group of features from also being
present.
ADVANTAGES
Cyclopropene is a less carbon rich group than known protein labelling groups.
Cyclopropene amino acid of the current invention leads to more rapid protein labelling
than prior art techniques.
Using the cyclopropene amino acid of the present invention leads to a more efficient
incorporation than prior art labelled amino acids.
It has been known to incorporate amino acids bearing norbornene groups into proteins.
The present invention offers specific advantages over prior art methods involving
norbornene groups. For example, although the conjugation chemistry for cyclopropene
amino acids of the invention is similar to that of norbornene containing amino acids,
conjugation to cyclopropene amino acids can be faster.
Incorporation of cyclopropene amino acids according to the invention can be more
efficient than incorporation of prior art unnatural amino acids. The incorporation of
cyclopropene amino acids according to the invention can lead to a higher level of
incorporation than prior art unnatural amino acids.
It is an advantage of the invention that the cyclopropene amino acids taught can be
incorporated using wild type tRNA synthetases. Prior art unnatural amino acids have
tended to require mutant tRNA synthetases for their incorporation, such as, for
example, amino acids incorporating BCN groups.
Rapid conjugation reactions for unnatural amino acids incorporated into polypeptides
have been mentioned in the prior art. For example, TCO/BCN amino acids offer rapid
reaction times, which can be faster than norbornene reaction times. However, it is an
advantage of the cyclopropene amino acids that very rapid reaction times are provided.
Certain known unnatural amino acids are able to use the wild type tRNA synthetases.
For example, amino acids comprising norbornene groups can be incorporated using
wild type tRNA synthetase. However, by using cyclopropene containing amino acids of
the invention a higher level of incorporation is achieved. In other words, the amount of
material produced which comprises the unnatural amino acid is greater when using
cyclopropene containing amino acids of the invention than when using prior art
unnatural amino acids such as those comprising norbornene.
It is an advantage of the invention that the cyclopropene amino acids form excellent
substrates for the tRNA synthetases noted herein, most suitably the wild type tRNA
synthetases noted herein.
It is an advantage of the invention that the cyclopropene containing amino acids
support excellent linker chemistry, for example rapid and specific reaction with
tetrazine containing compounds.
It is an advantage of the invention that the cyclopropene containing amino acids are
smaller in size than known unnatural amino acids previously used to label proteins.
For example, a known unnatural amino acid comprising norbornene can be
incorporated into polypeptides, but cyclopropene containing amino acids of the
invention are advantageously of smaller size than the norbornene containing amino
acids of the prior art.
It is an advantage of the invention that the cyclopropene amino acids are less likely to
perturb protein structure when incorporated into polypeptides. At least part of this
advantageous effect may be attributed to the small size of the cyclopropene molecular
group.
A key advantage of incorporation of a cyclopropene group is that it permits a range of
extremely useful further compounds such as labels to be easily and specifically attached
to the cyclopropene group.
In another aspect, the invention relates to a polypeptide as described above wherein
said cyclopropene group is joined to a tetrazine group.
An unnatural amino acid comprising an amide bonded cyclopropene has been
described in the prior art (Yu et al. 2012). This amino acid is 3,3-disubstituted. This
amino acid is as follows:
In order to incorporate this amino acid into polypeptides, it is essential to use a mutant
tRNA synthetase.
In contrast, the amino acid comprising cyclopropene of the present invention contains
a carbamate group (rather than an amide group). The cyclopropene containing amino
acid of the present invention is therefore chemically distinct from the amide bonded
cyclopropene amino acid in the art.
An exemplary amino acid of the invention is 1,3-disubstituted. An exemplary amino
acid of the invention is as follows:
It is an advantage of the carbamate - cyclopropene amino acid of the invention that it is
incorporated well by the wild type tRNA synthetase. This has the advantage of
requiring less biological manipulation in order to obtain good incorporation. This also
provides the advantage of enhanced or increased incorporation. In other words, the
cyclopropene - carbamate amino acid of the present invention is incorporated to
higher levels and/or more efficiently than known unnatural amino acids.
Use of the cyclopropene amino acid of the invention may provide a superior rate of
reaction with tetrazine compounds.
The carbamate chemistry of the invention provides the advantage of more degrees of
freedom in the chemical structure of the incorporated amino acid. In particular, the
carbamate cyclopropene of the invention has more degrees of freedom compared to the
amide cyclopropene known in the art. Similarly, the carbamate cyclopropene of the
invention is more accessible when present in the polypeptide chain.
By comparison with the amide bonded cyclopropene known in the art, the carbamate
cyclopropene of the present invention is a slightly "longer" amino acid. This provides
the advantage of a greater "reach" for the groups of the amino acid protruding away
from the amino acid backbone. Again, this can render those groups more accessible for
further labelling or conjugation reactions.
The chemical structure of the carbamate cyclopropene of the invention advantageously
provides more conformational degrees of freedom. In other words, the carbamate
cyclopropene group of the invention can adopt more conformations within a protein
structure than prior art amide bonded cyclopropene amino acids.
In more detail, this may arise from the nature of the bonding between cyclopropene
group and amino acid group. In the prior art amide arrangement, the important bond
is SP2 hybridised. In the invention, the important bond is SP3 hybridised, which is a
more flexible bonding arrangement.
Moreover, the cyclopropene carbamate arrangement of the invention comprises a
methylene group between the carbamate and the cyclopropene group. Firstly, this
provides a longer molecule. The prior art amide bonded version is a less advantageous
shorter molecule. More specifically, at the position occupied by the methylene carbon
in the amino acid of the present invention, the prior art amide amino acid carries a
double bonded oxygen group (=O) instead of the advantageous methylene carbon. The
double bonded version in the prior art amide amino acid cannot rotate as freely as the
methylene carbon bonded group in the amino acid of the invention.
The fact that the amino acid of the present invention is smaller than prior art
norbornene containing amino acids and yet still preserves the advantageous carbamate
chemistry is a benefit of the invention. This benefit provides, among other things,
better incorporation of the amino acid into the polypeptide chain.
In addition, the joining to tetrazine compounds (tetrazine conjugation) is
advantageously facilitated by the carbamate cyclopropene arrangement in the amino
acid of the present invention.
Suitably said tetrazine group is further joined to a fluorophore.
Suitably said tetrazine group is further joined to a polyethylene glycol (PEG) group.
Suitably said fluorophore comprises fluorescein, tetramethyl rhodamine (TAMRA) or
boron-dipyrromethene (BODIPY).
Suitably the cyclopropene amino acid of the invention is incorporated into a
polypeptide using the wild type tRNA synthetase.
Suitably the amino acid having a cyclopropene group is incorporated at a position
corresponding to a lysine residue in the wild type polypeptide. This has the advantage
of maintaining the closest possible structural relationship of the cyclopropene
containing polypeptide to the wild type polypeptide from which it is derived.
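Placing the cyclopropene amino acid at a position that encodes lysine in the wild type sequence amounts, at the nucleic acid level, to replacing that lysine codon with the orthogonal codon. The following Python sketch shows this for an amber codon. The function name and the toy coding sequence are hypothetical and serve only as an illustration.

```python
# Minimal sketch (hypothetical helper, made-up sequence): replace the lysine
# codon at a chosen residue of a coding sequence with the amber codon TAG.

LYSINE_CODONS = {"AAA", "AAG"}
AMBER = "TAG"


def lysine_to_amber(cds: str, residue_number: int) -> str:
    """Return cds with the lysine codon at 1-based residue_number set to TAG."""
    start = (residue_number - 1) * 3
    codon = cds[start:start + 3]
    if codon not in LYSINE_CODONS:
        # Refuse to mutate anything that is not a lysine codon.
        raise ValueError(
            f"residue {residue_number} is encoded by {codon!r}, not a lysine codon"
        )
    return cds[:start] + AMBER + cds[start + 3:]


toy_cds = "ATGAAAGGTAAGTAA"  # made-up sequence: M K G K stop
print(lysine_to_amber(toy_cds, 2))  # ATGTAGGGTAAGTAA
print(lysine_to_amber(toy_cds, 4))  # ATGAAAGGTTAGTAA
```

Applying the function at more than one lysine position gives a coding sequence with multiple incorporation sites.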
Suitably the polypeptide comprises a single cyclopropene group. This has the
advantage of maintaining specificity for any further chemical modifications which
might be directed at the cyclopropene group. For example when there is only a single
cyclopropene group in the polypeptide of interest then possible issues of partial
modification (e.g. where only a subset of cyclopropene groups in the polypeptide are
subsequently modified) or issues of reaction microenvironments varying between
alternate cyclopropene groups in the same polypeptides (which could lead to unequal
reactivity between different cyclopropene group(s) at different locations in the
polypeptide) are advantageously avoided.
Suitably the polypeptide comprises two cyclopropene groups; suitably the polypeptide
comprises three cyclopropene groups; suitably the polypeptide comprises four
cyclopropene groups; suitably the polypeptide comprises five cyclopropene groups;
suitably the polypeptide comprises ten cyclopropene groups or even more.
In principle multiple cyclopropene containing amino acids could be incorporated by the
same or by different orthogonal codons/orthogonal tRNA pairs. Suitably multiple
cyclopropene containing amino acids are incorporated by insertion of multiple amber
codons (together with a suitable orthogonal tRNA synthetase as described herein).
Suitably the amino acid comprising cyclopropene is a lysine amino acid. In one
embodiment, the tRNA may be from one species such as Methanosarcina barkeri, and
the tRNA synthetase may be from another species such as Methanosarcina mazei. In
another embodiment, the tRNA may be from a first species such as Methanosarcina mazei
and the tRNA synthetase may be from a second species such as Methanosarcina barkeri.
When an orthogonal pair comprises tRNA and tRNA synthetase from different species,
it is always with the proviso that the orthogonal pair work effectively together, i.e. that
the tRNA synthetase will effectively aminoacylate the tRNA with the amino acid of
interest. Equally, mutant tRNAs or mutant tRNA synthetases may be used provided
they have the correct aminoacylation activity. Although it is an advantage of the
invention that the cyclopropene containing amino acids of the invention are effectively
charged onto tRNAs using the wild type PylRS synthetase, it is equally possible to use
mutant PylRS synthetases provided they are effective in charging the tRNA with the
cyclopropene containing amino acid of the invention. Most suitably, orthogonal pairs
comprise the tRNA and a tRNA synthetase from the same species.
Of course it is possible to evolve the wild type synthetase (or another variant of a
suitable synthetase) to make a synthetase for incorporation of the cyclopropene amino
acid of the invention which may have increased efficiency. In principle, a Pyl derived
tRNA synthetase might be of use. Chimeric tRNA synthetases may be produced
provided that the charging/acylation part of the tRNA synthetase molecule is based
on or derived from Pyl tRNA synthetase. In other words, the anti-codon part of the
tRNA molecule may be varied according to operator choice, for example to direct tRNA
in recognising an alternate codon such as a sense codon, a quadruplet codon, an amber
codon or another "stop" codon. However, the functional acylation/ charging part of the
tRNA molecule should be conserved in order to preserve the cyclopropene charging
activity.
Either of the Methanosarcina barkeri and Methanosarcina mazei species pyrrolysine
tRNA synthetases are suitable.
Both the Methanosarcina barkeri and Methanosarcina mazei tRNAs are suitable. In
any case these tRNAs differ by only one nucleotide. This one nucleotide difference has
no impact on their activity in connection with cyclopropene containing amino acids.
Therefore, either tRNA is equally applicable in the present invention.
The tRNA used may be varied such as mutated. In all cases, any such variants or
mutants of the Pyl tRNA should always retain the capacity to interact productively with
the tRNA synthetase used to charge the tRNA with the cyclopropene containing amino
acid.
Genetic Incorporation and Polypeptide Production
In the method according to the invention, said genetic incorporation preferably uses an
orthogonal or expanded genetic code, in which one or more specific orthogonal codons
have been allocated to encode the specific amino acid residue with the cyclopropene
group so that it can be genetically incorporated by using an orthogonal tRNA
synthetase/tRNA pair. The orthogonal tRNA synthetase/tRNA pair can in principle be
any such pair capable of charging the tRNA with the amino acid comprising the
cyclopropene group and capable of incorporating that amino acid comprising the
cyclopropene group into the polypeptide chain in response to the orthogonal codon.
The orthogonal codon may be the orthogonal codon amber, ochre, opal or a quadruplet
codon. The codon simply has to correspond to the orthogonal tRNA which will be used
to carry the amino acid comprising the cyclopropene group. Preferably the orthogonal
codon is amber.
It should be noted that many of the specific examples shown herein have used the
amber codon and the corresponding tRNA/tRNA synthetase. As noted above, these
may be varied. Alternatively, in order to use other codons without going to the trouble
of using or selecting alternative tRNA/tRNA synthetase pairs capable of working with
the amino acid comprising the cyclopropene group, the anticodon region of the tRNA
may simply be swapped for the desired anticodon region for the codon of choice. The
anticodon region is not involved in the charging or incorporation functions of the tRNA
nor recognition by the tRNA synthetase so such swaps are entirely within the ambit of
the skilled operator. Thus in some embodiments the anticodon region of the tRNA
used in the invention such as MbtRNACUA or MmtRNACUA may be exchanged i.e. a
chimeric tRNACUA may be used such that the anticodon region is swapped to recognise
an alternate codon so that the cyclopropene containing amino acid may be incorporated
in response to a different orthogonal codon as discussed herein including ochre, opal or
a quadruplet codon, and the nucleic acid encoding the polypeptide into which the
cyclopropene amino acid is to be incorporated is correspondingly mutated to introduce
the cognate codon at the point of incorporation of the cyclopropene amino acid. Most
suitably the orthogonal codon is amber.
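The relationship between an orthogonal codon and the anticodon that would be swapped into the tRNA can be sketched in Python as a reverse complement, assuming strict Watson-Crick pairing and ignoring wobble. The function name is hypothetical, and the quadruplet codon AGGA is used purely as an illustrative input rather than as a codon named in this specification.

```python
# Minimal sketch, assuming strict Watson-Crick pairing (no wobble): the
# anticodon, written 5' to 3', is the reverse complement of the codon.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon: str) -> str:
    """Return the 5'->3' RNA anticodon pairing with a DNA or RNA codon."""
    rna = codon.upper().replace("T", "U")
    return "".join(COMPLEMENT[base] for base in reversed(rna))


for name, codon in [("amber", "TAG"), ("ochre", "TAA"), ("opal", "TGA")]:
    print(name, codon, anticodon_for(codon))
# amber TAG CUA
# ochre TAA UUA
# opal TGA UCA

print(anticodon_for("AGGA"))  # UCCU, illustrative quadruplet codon
```

The amber result, CUA, is the anticodon that gives the tRNACUA suppressor its name.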
Thus alternative orthogonal tRNA synthetase/tRNA pairs may be used if desired.
Preferably the orthogonal synthetase/tRNA pair are Methanosarcina barkeri MS
pyrrolysine tRNA synthetase (MbPylRS) and its cognate amber suppressor tRNA
(MbtRNACUA).
The Methanosarcina barkeri PylT gene encodes the MbtRNACUA tRNA.
The Methanosarcina barkeri PylS gene encodes the MbPylRS tRNA synthetase protein.
When particular amino acid residues are referred to using numeric addresses, the
numbering is taken using MbPylRS (Methanosarcina barkeri pyrrolysyl-tRNA
synthetase) amino acid sequence as the reference sequence (i.e. as encoded by the
publicly available wild type Methanosarcina barkeri PylS gene Accession number
Q46E77):
MDKKPLDVLI SATGLWMSRT GTLHKIKHYE VSRSKIYIEM ACGDHLVVNN
SRSCRTARAF RHHKYRKTCK RCRVSDEDTN NFLTRSTEGK TSVKVKVVSA
PKVKKAMPKS VSRAPKPLEN PVSAKASTDT SRSVPSPAKS TPNSPVPTSA
PAPSLTRSQL DRVEALLSPE DKISLNIAKP FRELESELVT RRKNDFQRLY
TNDREDYLGK LERDITKFFV DRDFLEIKSP ILIPAEYVER MGINNDTELS
KQIFRVDKNL CLRPMLAPTL YNYLRKLDRI LPDPIKIFEV GPCYRKESDG
KEHLEEFTMV NFCQMGSGCT RENLESLIKE FLDYLEIDFE IVGDSCMVYG
DTLDIMHGDL ELSSAVVGPV PLDREWGIDK PWIGAGFGLE RLLKVMHGFK
NIKRASRSES YYNGISTNL.
If required, the person skilled in the art may adapt PylRS tRNA synthetase protein
by mutating it so as to optimise for the cyclopropene amino acid to be used. The need
for mutation (if any) depends on the cyclopropene amino acid used. An example where
the MbPylRS tRNA synthetase may need to be mutated is when the cyclopropene
amino acid is not processed by the PylRS tRNA synthetase protein.
Such mutation (if desired) may be carried out by introducing mutations into the
MbPylRS tRNA synthetase, for example at one or more of the following positions in the
PylRS tRNA synthetase: M241, A267, Y271, L274 and C313.
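The numeric addresses above can be checked against the reference sequence with a short Python sketch. The residue string below is the reference listing given earlier, entered on the assumption that evident extraction damage is repaired (doubled valines restored and stray non-residue characters read as isoleucine). That assumption, like the helper name, is ours and should be verified against accession Q46E77 before any practical use.

```python
# Minimal sketch: confirm that 1-based residue addresses such as "M241" match
# the reference sequence. The sequence is the listing from this document with
# assumed repairs of extraction damage; verify against Q46E77 before use.

REFERENCE = (
    "MDKKPLDVLISATGLWMSRTGTLHKIKHYEVSRSKIYIEMACGDHLVVNN"
    "SRSCRTARAFRHHKYRKTCKRCRVSDEDTNNFLTRSTEGKTSVKVKVVSA"
    "PKVKKAMPKSVSRAPKPLENPVSAKASTDTSRSVPSPAKSTPNSPVPTSA"
    "PAPSLTRSQLDRVEALLSPEDKISLNIAKPFRELESELVTRRKNDFQRLY"
    "TNDREDYLGKLERDITKFFVDRDFLEIKSPILIPAEYVERMGINNDTELS"
    "KQIFRVDKNLCLRPMLAPTLYNYLRKLDRILPDPIKIFEVGPCYRKESDG"
    "KEHLEEFTMVNFCQMGSGCTRENLESLIKEFLDYLEIDFEIVGDSCMVYG"
    "DTLDIMHGDLELSSAVVGPVPLDREWGIDKPWIGAGFGLERLLKVMHGFK"
    "NIKRASRSESYYNGISTNL"
)


def check_addresses(sequence: str, addresses: list[str]) -> dict[str, bool]:
    """Map each address (e.g. 'M241') to whether the sequence agrees with it."""
    result = {}
    for address in addresses:
        expected, position = address[0], int(address[1:])
        # Addresses are 1-based; Python strings are 0-based.
        result[address] = sequence[position - 1] == expected
    return result


print(len(REFERENCE))  # 419
print(check_addresses(REFERENCE, ["M241", "A267", "Y271", "L274", "C313"]))
```

All five addresses resolve to the stated residues only when the doubled valines are restored, which supports reading the short ten-residue groups in the listing as extraction damage.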
tRNA Synthetases
The tRNA synthetase of the invention may be varied. Although specific tRNA
synthetase sequences may have been used in the examples, the invention is not
intended to be confined only to those examples.
In principle any tRNA synthetase which provides the same tRNA charging
(aminoacylation) function can be employed in the invention.
For example the tRNA synthetase may be from any suitable species such as from
archaea, for example from Methanosarcina barkeri MS; Methanosarcina barkeri str.
Fusaro; Methanosarcina mazei Go1; Methanosarcina acetivorans C2A;
Methanosarcina thermophila; or Methanococcoides burtonii. Alternatively the
tRNA synthetase may be from bacteria, for example from Desulfitobacterium hafniense
DCB-2; Desulfitobacterium hafniense Y51; Desulfitobacterium hafniense PCP1;
Desulfotomaculum acetoxidans DSM 771.
Exemplary sequences from these organisms are the publicly available sequences. The
following examples are provided as exemplary sequences for pyrrolysine tRNA
synthetases:
>M.barkeriMS/1-419/
Methanosarcina barkeri MS
VERSION Q6WRH6.1 G 745 4
MDKKPLDVLISATGLWMSRTGTLHKIKHHEVSRSKIYIEMACGDHLVVNNSRSCRTA
RAFRHiIKYRKTCKRCRVSDEDMNFLTRSTESKNSVK\¾WSAPK\ KKAMPKSVSRAP
KPLENSVSAKASTNTSRSVPSPAKSTPNSSVPASAPAPSLTRSQLDRVEALLSPEDKISL
NMAKPFRELEPEL\TRRKNDFQRLYTNDREDYLGKLERDITKFF\¾RGFLEIKSPILIP
AEWERMGINND^LSKQIFRV DKNLCLRPMLAPTLYNYLRKLDRILPGPIKIFEVGPC
YRKESDGKEHLEEFTMWFCQMGSGCTOENLEALIKEFLDYLEIDFEIVGDSCMVYGD
TLDIMHGDLFi.SSAWGPVSLDREWGIDKPWIGAGFGLERLLKVMHGFKNIKRASRS
ESYYNGISTNL
S
>M.barkeriF/1-419/
Methanosarcina barkeri str. Fusaro
VERSION YP_304395.1 GI:73668380
MDKKPLDV1JSATGIA¥MSRTG
RAFRHHKYRKTCKRCRVSDEDMNFLTRSTEGKTSVKVKA^APKVK^
KPLENP\¾AKASTDTSRSWSPAKSTPNSPWTSAPAPSLTRSQLDR\¾ALLSPEDKISL
NIAKPFRELESELVTRRl¾ DFQRLY'rNDREDYLGKLERDITKFWDRDFLEIKSPILrPA
EY\T^RMGINNDTELSKQIFRVDKNirJ RPMIAPTO^LRKLDRILPDPIKIFEVGPCY
RKESDGKEHLEEFTMVNFCQMGSGCTRENLESLIKEFLDYLEIDFEIVGDSCMWGDT
LDIMHGDI^LSSAWGPWLDREWGIDKPWIGAGFGLERLLK\^HGFKNIKRASRSE
SYY GIST L
>M.mazei/1-454
Methanosarcina mazei Go1
VERSION NP_633469.1 GI:21227547
MDKKPLNTLISATGIAYMSRTGTIHKIKHHEVSRSKIYIEMACGDFILVVNNSRSSRTAR
ALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKVKVVSAPTRTKKMIP
PKPLEN^AAQAQPSGSKFSPAIPVSTQESVSVPASVSTSISSISTGATASAI^GNTNPI
TSMSAPVQASAPALTTCSQTORI-EVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQIY
AEERENYLGKLERFJTRFFVDRGFLEIKSPILIPLEYIERMGIDNDTELSKQIFRVDKNF
CLRPMI-APNLYNYLRKLDRALPDPIKIFEIGPCTRKESDGKEHLE^
CTRENIESIITDFLNFILGIDFKWGDSCIyR^ODTLDVMFIGDLELSSAWGPIPLDREW
GIDKPWIGAGFGLERLLKVKHDFKNIKRAARSESYYNGISTNL
>M.acetivorans/1-443
Methanosarcina acetivorans C2A
VERSION NP_615128.2 GI:161484944
MDKKPLDTLISATGLWMSRTCMffl
ALRHHK^ RKTCRFICRVSDEDINNFLTKTSEEKTT\iaTa^SAPRWKAMPi
>M.thermophila/1-478
Methanosarcina thermophila, VERSION DQ017250.1 GI:67773308
MDKKPLNTLISATGLW
RALRHHKYRKICKHCRVSDEDLNKFLTRimDK^
PKPLENTAPVQTLPSESQPAPITPISASITAPASTSTfAPAPASITAPAPASITAPASAST
TISTSAMPASTSAQGTTKFNYISGGFPRPIPVQASAPALTKSQIDRLQGLLSPKDEISLDS
GTPFRKLESELLSRRRKDLKQIYi\EEREHYLGKLEREITKi^\¾RGFLEIKSPILIPMEYI
ERMGIDNDKFa.SKQIFRVDNNFCLRPMLAPNL^m.RKLNRALPDPIKIFEIGPCYRK
ESDGKEHLEEFIMLNICQMGSGCTRENLEAIIKDFLDYLGIDFEIVGDSCMVYGDTLD
\TV1HGDLELSSAWGPWMDRDWGINKPWIGAGFGLERLLK\TV1HNFKNIKRASRSES
YYNGISTNL
>M.burtonii/1-416
Methanococcoides burtonii DSM 6242, VERSION YP_566710.1 GI:91774018
MEKQLLD L Έ LNGVWLSRSGLLHGIRNFE K H ETDCGA NS SS SA
SLRHNKYRKPCKRCRPADEQIDRFVKKTFKEKRQWSWSSPKKHWKKPK\ AVIKSFS
ISTPSPKEASVSNSIPTPSISVVKDEVKVPEVKYITSQIERLK^MSPDDKIPIQDELPEF
K EKELIQRRRDDIJCKN^EDREDRLGKLERDITEFWDRGFLEIKSPIMIPFEYIER
MGIDM )DHLNKQIFRVDESMCLRPMm CLYNYLRKLDKVXPDPIRIFEIGPCTRKES
DGSSHLFZ,FTNI\¾TCQMGSGCTRENMEALIDEFLEHLGIEYEIEADNCNI\TGDTIDI
MHGDLELSSA\^OPrPLDREWGVNKP\mGAGFGLERLLKVRHNY raiRRASRSELYY
NGINTNL
>D.hafniense_DCB-2/1-279
Desulfitobacterium hafniense DCB-2
VERSION YP_002461289.1 GI:219670854
MSSFWT¾VQYQRLKELNASGEQI^MGFSDALSRDRAFQGIEHQLMSQGKRHLEQLR
WKHRPALLELEEGLAKALHQQ
KKCLRPMLAPNLYTLWRELERLWDKPIRIFEIGTCYRKESQGAQHLNEFmLNLTEL
GTPLEERHQRLEDMARWVLEAAGI
PHFLDEKWErVDPWVGLGFGLERLLMIllEGTQHVQSMARSLSYLDGVRLNIN
>D.hafniense_Y51/1-312
Desulfitobacterium hafniense Y51
VERSION YP 52 92. GI:89897705
MDRIDHTDSKFVQAGE VLPA FL
MGFSDALSRDRAFQGIEHQLMSQGKRHLEQLRTYKHRPALLELEEGLAKALHQQGF
VQV^TPTIITKSALAKMTIGEDHPLF
DKPlRIFEIGTCTRKESQGAQHLNEFTMLNLTELGTPLEERHQRLEDAlAR\WLEAi\Gl
E E ESS GD D KGDLE SGAMG F EK Έ D 'GLGFGLERLL
MIREGTQHVQSMARSLSYLDGVRLNIN
>D.hafniensePCP1/1-288
Desulfitobacterium hafniense
VERSION AY692340.1 GI:53771772
MFLTRRDPPLSSFVVTiCVQYQRLKELNASGEQLEMGFSDALSRDRAFQGIEHQLMSQG
KIlHLEQLRTVXHRPALLELEEKL\KALHQQGWQVVTPTnTKSALAiaViTIGEDHPLF
SQWWLDGKKCLRPMIAPNLYTLWRELERLWDKPIRIFEIGTCYRKESQGAQHLNEF
TMLNLTELGTPLEERHQRLEDMAIlWVLEAAGIIlEFELVTESSVVYGD^aWMKGDLE
LASGAMGPHFLDFJiWEIFDPWGLGFGLERLLMIREGTQHVQSMARSLSYLDGVRL
NIN
>D.acetoxidans/1-277
Desulfotomaculum acetoxidans DSM 771
VERSION YP_003189614.1 GI:258513392
MSFLWTVSQQKRLSELNASEEEKNMSFSSTSD REAAYKRVEMRLINESKQRLNKLRH
ETRPAICALENRLAAALRGAGFVQVATPVIL^
LRPMIAPNLYYILKDLLRLV\¾KPWIFEIGSCFRKESQGSNHLNEFTMLNLVEWGLPE
EQ QK SELAKL DETGIDF LEί ES GET ^ D ELGSGALGP FLD
GRWGWGPVVVGIGFGLERLLMVEQGGQNVRSMGKSLTYLDGVRLNI
When the particular tRNA charging (aminoacylation) function has been provided by
mutating the tRNA synthetase, then it may not be appropriate to simply use another
wild-type tRNA sequence, for example one selected from the above. In this scenario, it
will be important to preserve the same tRNA charging (aminoacylation) function. This
is accomplished by transferring the mutation(s) in the exemplary tRNA synthetase into
an alternate tRNA synthetase backbone, such as one selected from the above.
In this way it should be possible to transfer selected mutations to corresponding tRNA
synthetase sequences such as corresponding pylS sequences from other organisms
beyond exemplary M.barkeri and/or M.mazei sequences.
Target tRNA synthetase proteins/backbones may be selected by alignment to known
tRNA synthetases such as exemplary M.barkeri and/or M.mazei sequences.
This subject is now illustrated by reference to the pylS (pyrrolysine tRNA synthetase)
sequences but the principles apply equally to the particular tRNA synthetase of interest.
For example, an alignment of all PylS sequences may be prepared. These can have a
low overall %sequence identity. Thus it is important to study the sequence such as by
aligning the sequence to known tRNA synthetases (rather than simply to use a low
sequence identity score) to ensure that the sequence being used is indeed a tRNA
synthetase.
Thus suitably when sequence identity is being considered, suitably it is considered
across the sequences of the examples of tRNA synthetases as above. Suitably the %
identity may be as defined from an alignment of the above sequences.
It may be useful to focus on the catalytic region. The aim of this is to provide a tRNA
catalytic region from which a high % identity can be defined to capture/identify
backbone scaffolds suitable for accepting mutations transplanted in order to produce
the same tRNA charging (aminoacylation) function, for example new or unnatural
amino acid recognition.
Thus suitably when sequence identity is being considered, suitably it is considered
across the catalytic region. Suitably the % identity may be as defined from the catalytic
region.
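As an illustration of defining % identity either across whole aligned sequences or across a chosen region such as the catalytic region, the following minimal sketch computes identity over a column range of two pre-aligned sequences. The function name, the toy sequences and the column boundaries are hypothetical and chosen only for illustration; they are not taken from the sequences above, and the gap-handling convention (columns where both sequences have a gap are ignored) is an assumption.

```python
def percent_identity(aligned_a, aligned_b, start=0, end=None):
    """Percent identity of two pre-aligned sequences over columns [start, end).

    Gaps are written as '-'. Columns where both sequences have a gap are
    ignored; a column with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = list(zip(aligned_a[start:end], aligned_b[start:end]))
    columns = [(a, b) for a, b in columns if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment (not a real synthetase alignment)
ref = "MDKKPLDVLISATG--LWMSR"
alt = "MDKKPLNTLISATGIALWMSR"

overall = percent_identity(ref, alt)                # whole alignment
region = percent_identity(ref, alt, start=8, end=14)  # a chosen sub-region
print(f"overall: {overall:.1f}%  region: {region:.1f}%")
```

A low overall score can therefore coexist with a high score over the selected region, which is the reason given above for focusing on the catalytic region.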
'Transferring' or 'transplanting' mutations onto an alternate tRNA synthetase backbone
can be accomplished by site directed mutagenesis of a nucleotide sequence encoding
the tRNA synthetase backbone. This technique is well known in the art. Essentially the
backbone pylS sequence is selected (for example using the active site alignment
discussed above) and the selected mutations are transferred to (i.e. made in) the
corresponding/homologous positions.
When particular amino acid residues are referred to using numeric addresses, unless
otherwise apparent, the numbering is taken using MbPylRS (Methanosarcina barkeri
pyrrolysyl-tRNA synthetase) amino acid sequence as the reference sequence (i.e. as
encoded by the publicly available wild type Methanosarcina barkeri PylS gene
Accession number Q46E77):
MDKKPLDVLX SATGLWMSRT GTLHKIKHYE VSRSKIYIEM ACGDHLWNN
SRSCRTARAF RHHKYRKTCK RCRVSDEDIN NFLTRSTEGK TSVKVKWSA
PKVKKAMPKS VSRAPKPLEN PVSAKASTDT SRSVPSPAKS TPNSPVPTSA
PAPSLTRSQL DRVEALLSPE DKISLNIAKP FRELESELVT RRKNDFQRLY
TNDREDYLGK LERDITKFFV DRDFLEIKSP ILIPAEYVER MGINNDTELS
KQIFRVDKNL CLRPMLAPTL YNYLRKLDRI LPDPIKIFEV GPCYRKESDG
KEHLEEFTMV NFCQMGSGCT RENLESLIKE FLDYLEIDFE IVGDSCMVYG
DTLDIMHGDL ELSSAWGPV PLDREWGIDK PWIGAGFGLE RLLKVMHGFK
NIKRASRSES YYNGISTNL.
This is to be used as is well understood in the art to locate the residue of interest. This
is not always a strict counting exercise - attention must be paid to the context or
alignment. For example, if the protein of interest is of a slightly different length, then
location of the correct residue in that sequence corresponding to (for example) L266
may require the sequences to be aligned and the equivalent or corresponding residue
picked, rather than simply taking the 266th residue of the sequence of interest. This is
well within the ambit of the skilled reader.
Notation for mutations used herein is the standard in the art. For example L266M
means that the amino acid corresponding to L at position 266 of the wild type sequence
is replaced with M.
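The notation can be made concrete with a short sketch that parses a mutation string of the form "L266M" and applies it to a sequence, checking that the expected wild type residue is actually present at that position. The function name and the ten-residue toy sequence are hypothetical; positions are 1-indexed as in the notation above.

```python
import re


def apply_mutation(sequence, mutation):
    """Apply a substitution written as <wild type><position><new>, e.g. 'L266M'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation notation: {mutation!r}")
    wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
    if position < 1 or position > len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, "
            f"found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + new + sequence[position:]


# Hypothetical toy sequence: residue 6 is L
toy = "MDKKPLDVLI"
mutant = apply_mutation(toy, "L6M")
print(mutant)
```

The wild type check is what catches the case described above where simple counting lands on the wrong residue in a sequence of different length.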
The transplantation of mutations between alternate tRNA backbones is now illustrated
with reference to exemplary M.barkeri and M.mazei sequences, but the same principles
apply equally to transplantation onto or from other backbones.
For example Mb AcKRS is an engineered synthetase for the incorporation of AcK
Parental protein/backbone: M. barkeri PylS
Mutations: L266V, L270I, Y271F, L274A, C313F
Mb PCKRS: engineered synthetase for the incorporation of PCK
Parental protein/backbone: M. barkeri PylS
Mutations: M241F, A267S, Y271C, L274M
Synthetases with the same substrate specificities can be obtained by transplanting these
mutations into M. mazei PylS. Thus the following synthetases may be generated by
transplantation of the mutations from the Mb backbone onto the Mm tRNA backbone:
Mm AcKRS introducing mutations L301V, L305I, Y306F, L309A, C348F into M. mazei
PylS,
and
Mm PCKRS introducing mutations M276F, A302S, Y306C, L309M into M. mazei PylS.
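The renumbering involved in such a transplantation (for example L266 in the Mb backbone corresponding to L301 in the Mm backbone) can be sketched as a walk along a pairwise alignment. The following is an illustrative sketch only: the function name and the short gapped toy alignment are hypothetical, and in practice the aligned strings would come from an alignment tool of the reader's choice.

```python
import re


def renumber_mutation(aligned_ref, aligned_target, mutation):
    """Translate a mutation from reference numbering to target numbering.

    aligned_ref and aligned_target are equal-length aligned sequences with
    '-' for gaps. The returned string uses the residue actually found in
    the target at the equivalent alignment column.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognised mutation notation: {mutation!r}")
    wild_type, ref_pos, new = match.group(1), int(match.group(2)), match.group(3)

    ref_count = 0
    target_count = 0
    for ref_res, target_res in zip(aligned_ref, aligned_target):
        if ref_res != "-":
            ref_count += 1
        if target_res != "-":
            target_count += 1
        if ref_res != "-" and ref_count == ref_pos:
            if ref_res != wild_type:
                raise ValueError("wild type residue does not match reference")
            if target_res == "-":
                raise ValueError("no equivalent residue in target backbone")
            return f"{target_res}{target_count}{new}"
    raise ValueError("position lies beyond the end of the reference")


# Hypothetical toy alignment: the target carries a 3-residue insertion
ref_aln = "MDK---PLDVL"
tgt_aln = "MDKANEPLNTL"

print(renumber_mutation(ref_aln, tgt_aln, "L5M"))  # conserved residue, shifted number
print(renumber_mutation(ref_aln, tgt_aln, "D6A"))  # equivalent position, different residue
```

As stated in the text, a transplanted synthetase produced from such a mapping would still need to be tested for the desired substrate specificity.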
Full length sequences of these exemplary transplanted mutation synthetases are given
below.
>Mb_PylS/1-419
MDTGCPLDVLISATGLWMSRTGTLHKIKHHEVSRSK1YIEMACGDHLVVNNSRSCRTA
KPLFASVSAKASTNTSRSWSPA
>Mb_AcKRS/1-419
MDKKPLD\XISATGL\\^SRTGTLHKIKHHEVSRSKrnEMACGDHLVVNNSRSCRTA
RAFRHHKYRICrcKRCRVSGEDmNFLTRSTESKNSVKVRWSA^
KPI^NSVSAKASTNTSRSWSPAKSTPNSSWASAPAPSLTRSQLDR_VEALLSPEDKISL
NMAKPFRELEPELVTRRKNDFQRLYTO
AEA^¾RMGINNDTELSKQIFRVDKNLGLRPM\'APTIFNYARKLDRILPGPIKIFEVGPC
YRKESDGKEHLEEFrMVNFFQMGSGCTJlENLEALIKEFLDYLEIDFEIVGDSCMVYGD
TLJ)IMHGDLFJ SSAWGPVSLDREWGIDKPWIGAGFGLERLLK\ ]V1HGFKNIKRASRS
ESYYNGISTNL
>Mb_PCKRS/1-419
MDKKPLD SA WMSRTGTI - KIK EVSRSK EMAC DHI SRSCRTA
Ί v ' Ί R\ V PK . V R P
LENSVSAKAS^^^SRSWSPAKS^TNSSVPASAPAPSLTRSQLDR\ ALLSPEDKISL
NMAKPFRELEPEL\TRRKNDFQRLYTNDS¾EDYLGKLERD1TKFFVDRGFLEIKSPILIP
AF^T^RFGINNDTELSKQIFRVDKNirJ.RPMLSPTir^MRKLDRILPGPIKIFEVGPC
YRS
>Mm_AcKRS/1-454
MDKKPLNTLISATGLWMSRTGTIHKIKHHEVSRSmiElvL\CGDHLVVNNSRSSRTAR
ALRHHKYRKTCKRCRVSDEDLNKFLTKANEDQTSVKAT
>Mm_PCKRS/1-454
MDKKPL TLISA
ALRHHK^ RKTCKRCRYSDEDLNKFLTKANFJ)QTS\^K\^SAPTRTKKAMPKSVARA
PKPLENTEAAQAQPSGSKFSPMPVSTQESVSVPASVSTSISSISTGATASALVKGNTNPI
TSMSAPVQASAPALTKSQTDRLEVLLNPKDEISLNSGKPFRELESELLSRRKKDLQQrir
AEERENYLGKLEREITRFFVDRGFLEIKSPILIPLEYIERFGIDNDTELSKQIFRVDKNFC
LRl¾lLSPNLCNYMRKLDRALPDPIKIFEIGPi¾¾KESDGKEHLEEFTMLNFCQMGSGC
TRENLESIITDFLNHLGIDFK1VGDSCMVYGDTLDVMHGDLELSSAVVGPIPLDREWGI
DKPWTGAGFGLE LL VK DFKNIKR AARSESYYNG ISTNL
The same principle applies equally to other mutations and/or to other backbones.
Transplanted polypeptides produced in this manner should advantageously be tested to
ensure that the desired function/substrate specificities have been preserved.
Polynucleotides encoding the polypeptide of interest for the method described above
can be incorporated into a recombinant replicable vector. The vector may be used to
replicate the nucleic acid in a compatible host cell. Thus in a further embodiment, the
invention provides a method of making polynucleotides of the invention by introducing
a polynucleotide of the invention into a replicable vector, introducing the vector into a
compatible host cell, and growing the host cell under conditions which bring about
replication of the vector. The vector may be recovered from the host cell. Suitable host
cells include bacteria such as E. coli.
Preferably, a polynucleotide of the invention in a vector is operably linked to a control
sequence that is capable of providing for the expression of the coding sequence by the
host cell, i.e. the vector is an expression vector. The term "operably linked" means that
the components described are in a relationship permitting them to function in their
intended manner. A regulatory sequence "operably linked" to a coding sequence is
ligated in such a way that expression of the coding sequence is achieved under
conditions compatible with the control sequences.
Vectors of the invention may be transformed or transfected into a suitable host cell as
described to provide for expression of a protein of the invention. This process may
comprise culturing a host cell transformed with an expression vector as described
above under conditions to provide for expression by the vector of a coding sequence
encoding the protein, and optionally recovering the expressed protein.
The vectors may be, for example, plasmid or virus vectors provided with an origin of
replication, optionally a promoter for the expression of the said polynucleotide and
optionally a regulator of the promoter. The vectors may contain one or more selectable
marker genes, for example an ampicillin resistance gene in the case of a bacterial
plasmid. Vectors may be used, for example, to transfect or transform a host cell.
Control sequences operably linked to sequences encoding the protein of the invention
include promoters/enhancers and other expression regulation signals. These control
sequences may be selected to be compatible with the host cell in which the expression
vector is designed to be used. The term promoter is well-known in the art and
encompasses nucleic acid regions ranging in size and complexity from minimal
promoters to promoters including upstream elements and enhancers.
Another aspect of the invention is a method, such as an in vitro method, of
incorporating the cyclopropene containing amino acid(s) genetically and site-specifically
into the protein of choice, suitably in a eukaryotic cell. One advantage of
incorporating genetically by said method is that it obviates the need to deliver the
proteins comprising the cyclopropene amino acid into a cell once formed, since in this
embodiment they may be synthesised directly in the target cell. The method comprises
the following steps:
i) introducing, or replacing a specific codon with, an orthogonal codon such as
an amber codon at the desired site in the nucleotide sequence encoding the
protein
ii) introducing an expression system of orthogonal tRNA synthetase/tRNA pair in
the cell, such as a pyrrolysyl-tRNA synthetase/tRNA pair
iii) growing the cells in a medium with the cyclopropene containing amino acid
according to the invention.
Step (i) entails introducing, or replacing a specific codon with, an orthogonal codon such as an amber
codon at the desired site in the genetic sequence of the protein. This can be achieved by
simply introducing a construct, such as a plasmid, with the nucleotide sequence
encoding the protein, wherein the site where the cyclopropene containing amino acid is
desired to be introduced/replaced is altered to comprise an orthogonal codon such as
an amber codon. This is well within the ability of the person skilled in the art and examples
of such are given here below.
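At the sequence level, step (i) amounts to swapping one codon of the coding sequence for the amber codon TAG. A minimal in silico sketch is given below; the function name and the five-codon toy coding sequence are hypothetical, and the sketch says nothing about how the altered construct is physically made (for example by site directed mutagenesis).

```python
AMBER = "TAG"  # amber stop codon on the coding strand


def introduce_amber(coding_sequence, residue_number):
    """Replace the codon for the given 1-indexed residue with an amber codon."""
    if len(coding_sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    start = (residue_number - 1) * 3
    if residue_number < 1 or start + 3 > len(coding_sequence):
        raise ValueError("residue number is outside the coding sequence")
    return coding_sequence[:start] + AMBER + coding_sequence[start + 3:]


# Hypothetical toy open reading frame: Met-Ala-Lys-Gly-stop
toy_cds = "ATGGCTAAAGGTTAA"
edited = introduce_amber(toy_cds, 3)  # Lys codon (AAA) becomes TAG
print(edited)
```

In a host cell lacking the orthogonal synthetase/tRNA pair of step (ii), or grown without the amino acid of step (iii), the introduced amber codon would be expected to act as a stop codon.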
Step (ii) requires an orthogonal expression system to specifically incorporate the
cyclopropene containing amino acid at the desired location (e.g. the amber codon).
Thus a specific orthogonal tRNA synthetase such as an orthogonal pyrrolysyl-tRNA
synthetase and a specific corresponding orthogonal tRNA, which together form a pair
capable of charging said tRNA with the cyclopropene containing amino acid, are
required. Examples of these are provided herein.
Protein Expression and Purification
Host cells comprising polynucleotides of the invention may be used to express proteins
of the invention. Host cells may be cultured under suitable conditions which allow
expression of the proteins of the invention. Expression of the proteins of the invention
may be constitutive such that they are continually produced, or inducible, requiring a
stimulus to initiate expression. In the case of inducible expression, protein production
can be initiated when required by, for example, addition of an inducer substance to the
culture medium, for example dexamethasone or IPTG.
Proteins of the invention can be extracted from host cells by a variety of techniques
known in the art, including enzymatic, chemical and/or osmotic lysis and physical
disruption.
Proteins of the invention can be purified by standard techniques known in the art such
as preparative chromatography, affinity purification or any other suitable technique.
FURTHER ADVANTAGES
Yu et al. join tetrazoles to cyclopropene amino acids in polypeptides. Yu et al. require
the use of ultraviolet irradiation in order to photoactivate their conjugation groups.
Their best reaction rates were achieved with 302 nanometre UV irradiation.
However, this type of UV irradiation has high ionisation potential. This means that the
molecules and/or cells upon which the radiation is directed are likely to be damaged by
this UV energy. By contrast, the conjugations of the present invention do not require
any UV step for photoactivation. Even when Yu et al. use a less damaging source of UV
irradiation (e.g. 365 nanometre UV irradiation), the observed reaction rates are
considerably slower than those provided by the present invention. Thus, even if the UV
irradiation is adjusted in Yu et al in an attempt to try to avoid or reduce some of the
drawbacks associated with UV treatment, the same laborious irradiation step must still
be carried out and slower reaction rates are achieved. It is an advantage of the present
invention that UV irradiation can be omitted, and that excellent reaction rates are
obtained even without photoactivation.
It is an advantage of the cyclopropene amino acids of the present invention that they
are easy to manufacture. For example, the number of steps in the synthetic pathway is
advantageously small.
It should be noted that the prior art cyclopropene amino acid of Yu et al. contains an
amide group. This amide bond is a potential substrate for peptidases. Peptidase action
on the amide bond of the prior art cyclopropene amino acid would cleave the
cyclopropene part of the molecule off the polypeptide. This is clearly a disadvantage.
By contrast, it is an advantage of the carbamate linked cyclopropene groups of the
present invention that carbamate bonded cyclopropene is not a target for peptidases.
Prior art based techniques rely on tetrazole chemistry for conjugation. In contrast, the
present invention teaches the use of advantageous tetrazine chemistry.
It is an advantage of the carbamate bonded cyclopropene amino acids of the present
invention that they enable the use of the wild type PylRS synthetase. Making use of the
wild type synthetase is advantageous as it involves less labour by alleviating the need to
prepare mutant synthetases. In addition, the mutant synthetases do not always
aminoacylate tRNA to the same level as wild type tRNA synthetases. In other words, the
mutations required to be made to a synthetase in order to handle prior art
cyclopropene amide bonded amino acids can cause a loss of efficiency of
aminoacylation. In contrast, it is demonstrated herein that aminoacylation using the wild
type synthetase with the amino acid of the present invention is a very efficient process,
which is a further advantage over prior art techniques.
Further particular and preferred aspects are set out in the accompanying independent
and dependent claims. Features of the dependent claims may be combined with
features of the independent claims as appropriate, and in combinations other than
those explicitly set out in the claims.
Where an apparatus feature is described as being operable to provide a function, it will
be appreciated that this includes an apparatus feature which provides that function or
which is adapted or configured to provide that function.
BRIEF DESCRIPTION OF THE DRAWINGS
Embodiments of the present invention will now be described further, with reference to
the accompanying drawings, in which:
Figure 1 SORT-M enables proteome tagging and labelling at diverse codons, with
diverse chemistries, and in genetically targeted cells and tissues. (a) Proteome tagging
via SORT (stochastic orthogonal recoding of translation) uses an orthogonal aminoacyl-tRNA
synthetase/tRNA pair. The pyrrolysyl-tRNA synthetase/tRNA pair is used in this
study. This synthetase (and its previously evolved active-site variants) recognizes a
range of unnatural amino acids (yellow star, and yellow hexagon), does not
aminoacylate endogenous tRNAs, but efficiently aminoacylates its cognate tRNA -
without regard to anticodon identity; PyltRNA is not a substrate for endogenous
aminoacyl-tRNA synthetases. Orthogonal pyrrolysyl-tRNA synthetase/tRNAXXX pairs
(XXX indicates choice of anticodon, yellow) in which the anticodon has been altered
compete for the decoding of sense codons (dark blue and pink) via a pathway that is
orthogonal to that used by natural synthetases and tRNAs (dark blue and pink) to
direct natural amino acids. SORT allows the incorporation of diverse chemical groups
into the proteome, in response to diverse codons. Since there is no competition at the
active site of the orthogonal synthetase, starvation and minimal media are not required.
In addition the expression pattern of the orthogonal proteome tagging system can be
genetically directed allowing tissue specific proteome labelling. Selective pressure
incorporation approaches are shown in Supplementary Fig. 1 for comparison to
SORT. (b) The combination of encoding amino acids (1-3) across the proteome via
SORT and chemoselective modification of 3 with tetrazine probes (4a-g, 5, 6 and 7)
allows detection of labelled proteins via SORT-M (stochastic orthogonal recoding of
translation and chemoselective modification). Amino acid structures:
Nε-((tert-butoxy)carbonyl)-L-lysine 1, Nε-((1-propynyloxy)carbonyl)-L-lysine 2 and
Nε-(((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl)-L-lysine 3.
Figure 2 (Supplementary Figure 2) shows Quantitative site-specific
incorporation of 3 into proteins expressed in E. coli and its rapid and
quantitative labelling with tetrazine probes
A. The PylRS/tRNACUA pair directs efficient, site-specific incorporation of 3 into sfGFP
bearing an amber stop codon at position 150. Incorporation of 3 is more efficient than that of 1,
a well-established excellent substrate for the PylRS/tRNACUA pair.
B. Specific and quantitative labelling of 2 nmol sfGFP bearing 3 with 10 equivalents of
tetrazine fluorophore 4a. ESI-MS analysis of sfGFP-3 purified from E. coli bearing the
PylRS/tRNACUA pair and sfGFP150TAG, grown with 1 mM 3, confirms the incorporation
of 3. sfGFP150-3: Expected mass: 27951.5 Da, Found mass: 27950 ± 1.0 Da, minor
peak 27820 corresponding to loss of N-terminal methionine. Labelling sfGFP150-3
with 4a is quantitative, as judged by ESI-MS of the labelling reaction. Expected mass:
28758.4 Da, Found mass: 28758 ± 1.0 Da, minor peak 28627 corresponds to loss of
N-terminal methionine.
C. Determining the rate constant for labelling of sfGFP-3 (10.6 µM, sfGFP
incorporating 3 at position 150) with 10 equivalents of 4a. 2 nmol of purified sfGFP-3
(10.6 µM in 20 mM Tris-HCl, 100 mM NaCl, 2 mM EDTA, pH 7.4) were incubated with
20 nmol of tetrazine-dye conjugate 4a (10 µL of a 2 mM solution in DMSO). At different
time points 8 µL aliquots were taken from the solution, quenched with a 700-fold
excess of BCN and plunged into liquid nitrogen. Samples were mixed with NuPAGE
LDS sample buffer supplemented with 5 % β-mercaptoethanol, heated for 10 min to
90°C and analyzed by 4-12% SDS-PAGE. The amounts of labelled proteins were
quantified by scanning the fluorescent bands with a Typhoon Trio phosphoimager (GE
Life Sciences). Bands were quantified with the ImageQuant™ TL software (GE Life
Sciences) using rubber band background subtraction. The rate constant was
determined by fitting the data to a single-exponential equation. The calculated
observed rate k' was divided by the concentration of 4a to obtain rate constant k for the
reaction. Measurements were done in triplicate. All data processing was performed
using Kaleidagraph software (Synergy Software, Reading, UK). For comparison the rate
of labelling sfGFP bearing Nε-5-norbornene-2-yloxycarbonyl-L-lysine (NorK), a known
substrate for PylRS, was determined in a similar way using 11.25 µM sfGFP bearing
NorK at position 150 (sfGFP-NorK) and 20 equivalents of 4a.
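The rate analysis described for panel C can be written out as below. This is a sketch under stated assumptions: the quantified band fluorescence F(t) is taken to be proportional to the amount of labelled protein, F_max is a plateau value introduced here for illustration, and the tetrazine is assumed to be in sufficient excess for pseudo-first-order behaviour. The exact fitting function used in the original analysis is not stated beyond "single-exponential".

```latex
F(t) = F_{\max}\left(1 - e^{-k' t}\right), \qquad k = \frac{k'}{[\mathbf{4a}]}
```

Here k' is the observed rate obtained from the fit and k is the rate constant obtained by dividing k' by the concentration of 4a, as described in the legend.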
Figure 3 shows Supplementary Table 1 - Primers
Figure 4 (Supplementary Figure 3) shows SORT-M enables codon specific
proteome tagging and labelling in E. coli
A. Proteome labelling with 3 via the indicated PylRS/tRNAXXX pair. Cells contained two
plasmids, one encoding MbPylRS, the other encoding T4 lysozyme and the indicated
tRNAXXX. Cells were grown in the presence of 0.1 mM 3 from OD600 = 0.2 and T4
lysozyme expression was induced by the addition of 0.2 mM arabinose after 1 h. After a
further 3 h cells were harvested. Tagged proteins in the lysate were detected via an
inverse electron demand Diels-Alder reaction between incorporated 3 and tetrazine
fluorophore 4a (20 µM, 1 h, RT). The amino acids in parentheses are the natural
amino acids encoded by the endogenous tRNA bearing the corresponding anti-codon.
B. Lane profile analysis for each codon.
Figure 5 (Supplementary Figure 4) shows Specific amino acid replacement
in SORT demonstrated by ESI-MS
T4 lysozyme isolated after SORT with UUU(Lys) in the presence of 1 mM 3. Expected
mass WT T4 lysozyme: 19512.2 Da, Found mass: 19510 ± 2.0 Da. Expected mass WT T4
lysozyme Lys→3 single mutation: 19622.3 Da, Found mass: 19620 ± 2.0 Da.
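The difference between the two expected masses quoted for T4 lysozyme can be checked with simple arithmetic. The sketch below assumes that replacing one lysine with amino acid 3 adds a net C6H6O2 to the protein (the (2-methylcycloprop-2-en-1-yl)methoxycarbonyl group in place of one side chain N-H hydrogen); that net formula is derived here from the compound name and is an assumption, as are the rounded standard atomic weights used.

```python
# Rounded standard atomic weights (average masses, in Da)
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Assumed net change when Lys is replaced by amino acid 3: + C6H6O2
ADDED_GROUP = {"C": 6, "H": 6, "O": 2}


def average_mass(formula):
    """Average mass of an elemental composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


wild_type_mass = 19512.2  # expected mass of WT T4 lysozyme quoted above (Da)
shift = average_mass(ADDED_GROUP)
single_mutant_mass = wild_type_mass + shift

print(f"mass shift per Lys->3 replacement: {shift:.1f} Da")
print(f"expected single-replacement mass: {single_mutant_mass:.1f} Da")
```

Under these assumptions the computed value agrees with the 19622.3 Da expected mass quoted in the legend to within rounding.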
Figure 6 (Supplementary Figure 5) shows Incorporation of 3 (0.1 mM) via
SORT-M is not toxic to cells
Chemically competent DH10B cells were transformed with two plasmids: pBKwtPylRS,
necessary for expression of PylRS, and a pBAD_wtT4L_MbPylTXXX plasmid that is
required for expression of PyltRNAXXX and expresses lysozyme under arabinose control.
The cells were recovered in 1 ml SOB medium for one hour at 37°C prior to aliquoting
to 10 ml LB-KT (LB media with 50 µg ml⁻¹ kanamycin and 25 µg ml⁻¹ tetracycline) and
incubated overnight (37°C, 250 rpm, 12 h). The overnight culture (OD600 ≈ 3) was
diluted to an OD600 ≈ 0.3 in 10 mL LB-KT1/2 (LB media with 25 µg ml⁻¹ kanamycin and
12.5 µg ml⁻¹ tetracycline) supplemented with 3 at different concentrations: 0, 0.1, 0.5
mM. 200 µL aliquots of these cultures were transferred into a 96-well plate and OD600
measured using a Microplate reader, Infinite 200 Pro (TECAN). OD600 was measured
for each sample every 10 min with linear 1 mm shaking between the measurements.
Figure 7 (Supplementary Figure 6) shows Measurement of time-dependent
a a in incorporation of 3 in proteome SORT-M at different concentrations of 3 in
response to AAA codon
Chemically competent DH10B cells were transformed with two plasmids: pBKwtPylRS,
necessary for expression of PylRS, and the pBAD_wtT4L_MbPylTUUU plasmid that is required for
expression of PyltRNAUUU. The pBAD_wtT4L_MbPylTUUU plasmid also contains the gene for
expression of T4 lysozyme, downstream of an arabinose-inducible promoter. After
transformation, cells were recovered in 1 ml SOB medium for one hour at 37°C prior to
inoculation in 10 ml LB-KT (LB media with 50 µg ml⁻¹ kanamycin and 25 µg ml⁻¹
tetracycline). The culture was incubated overnight (37°C, 250 rpm, 12 h) and subsequently
diluted to an OD600 ≈ 0.3 in 30 mL LB-KT1/2 (LB media with 25 µg ml⁻¹ kanamycin and 12.5 µg
ml⁻¹ tetracycline) supplemented with 3 at different concentrations: 0, 0.1, 0.5 mM. The cultures
were incubated (37°C, 250 rpm) for 1 h, when OD600 reached approximately 0.6. A 2 ml culture
aliquot was collected in a separate tube for each of the three cultures. This is the pre-induction
culture (lane labelled as 1 in the gel image). Subsequently arabinose was added at a final
concentration of 0.2% (v/v) to induce expression of T4 lysozyme and culture aliquots of 2 mL
were collected every hour (lanes labelled as 2, 3 and 4, corresponding to 1, 2 and 3 h culture
collection after induction). For each of the collected cultures, bacterial cells were pelleted by
centrifugation at 4 °C, washed with ice cold PBS (3 x mL) and subsequently the pellets were
frozen and stored at -20 °C. The pellets were then thawed in 200 µL of ice cold PBS and lysed
by sonication (9 x 0 s ON / 20 s OFF, 70% power). The lysates were clarified by
centrifugation at 15,000 RPM, 4 °C for 30 minutes. The supernatants were transferred to fresh
1.5 mL tubes. 50 µL of supernatant was transferred to a new tube for the labeling reactions, and
the rest was frozen in liquid nitrogen and stored at -80 °C. To the 50 µL of supernatant, 0.5 µL of
2 mM 4a was added and the lysates were incubated at 25°C for 1 hour. After 1 h, 17 µL of 4X
LDS sample buffer (supplemented with 6 mM BCN and 5% BME) was added and mixed by
vortexing gently. Samples were incubated for 10 min before boiling at 90 °C for 10 min.
Samples were analysed by 4-12% SDS-PAGE and fluorescent images were acquired
using Typhoon Trio phosphoimager (GE Life Sciences).
Figure 8 shows Site-specific incorporation of 3 into proteins at diverse codons and
specific proteome labelling using SORT-M in human cells. (a) Western blot analysis
demonstrates the efficient amino acid dependent expression of an mCherry-EGFP
fusion protein separated by an amber stop codon bearing a C-terminal HA-tag
(mCh-TAG-EGFP-HA) in HEK293T cells. Anti-FLAG detected tagged PylRS. (b) Specific
labelling of mCh-TAG-EGFP-HA (immunoprecipitated from 10^6 cells) with 4a (20 µM
in 50 µL PBS, 1 h, RT) confirms the incorporation of 3 into protein in HEK293 cells. (c)
SORT-M labelling of 3 that is statistically incorporated into newly synthesised proteins
across the whole proteome of mammalian cells directed by six different
PylRS/PyltRNAXXX mutants using 0.5 mM 3. Labelling with 4g (20 µM in PBS, 1 h, RT,
as above). The amino acids in parentheses are the natural amino acids encoded by the
endogenous tRNA bearing the corresponding anti-codon.
Figure 9 (Supplementary Figure 11) shows
A. Full blots from Figure 8.
B. Full blots from Figure 10.
Figure 10 shows Site-specific incorporation of amino acid 3 into protein produced in
Drosophila melanogaster. (a) Incorporation of 3 demonstrated by a dual luciferase
reporter. Dual luciferase assay on ovary extract from 10 female flies expressing
Triple-Rep-L in the presence or absence of 10 mM 1 or 10 mM 3. The data show a
representative example from 1 of 3 biological replicates. The error bars represent the
standard deviation of 3 technical replicates from a single biological replicate. (b) Site-specific
incorporation of 3 (or 1) into GFP_TAG_mCherry-HA in flies expressing
PylRS/PyltRNACUA. The full-length protein resulting from unnatural amino acid
incorporation is detected by anti-HA western blot. (c) Specific labelling of encoded 3
with tetrazine probes. Flies were fed with no amino acid, amino acid 1 (500 flies) or
amino acid 3 (100 flies). 5 times more flies were fed with 1 in order to generate a
comparable amount of reporter protein. The full-length protein containing the
unnatural amino acid was immunoprecipitated from lysed ovaries with anti-GFP beads.
The beads were labelled (4g, 4 µM, 200 µL PBS, RT, 2 h) and washed. Full length protein
was detected by anti-HA blot and the same gel imaged on a fluorescence scanner shows
specific fluorescent labelling of the protein incorporating 3 but not 1, confirming the
identity of the incorporated amino acid.
Figure 11 (example 6) Specific protein labeling at genetically encoded unnatural amino acids
1 and 2 . (a) Genetically encoded 1, but not 2 , in calmodulin is specifically labeled with probe 3 ,
Coomassie and fluorescence images demonstrate the specificity of labeling and ESI MS before
labelling (black, expected mass: 17875, found mass: 17874) and after labelling (red, expected
mass: 18553, found mass: 18552) demonstrate the reaction is quantitative, (b) Genetically
encoded 2 , but not 1, in calmodulin is specifically labeled with probe 4 , Coomassie and
fluorescence images demonstrate the specificity of labeling and ESI MS before labeling (black,
expected mass: 17930, found mass: 17930) and after labelling (green, expected mass: 18484,
found mass: 18485) demonstrate the reaction is quantitative. Raw (before deconvolution) ESI-MS
spectra are not shown.
Figure 12 (example 6) Incorporating 1 and 2 at positions 1 and 40 of Calmodulin and the
kinetics of specific labelling. (a) Expression was performed in E. coli bearing ribo-Q1, O-gst-cam1TAG-40AGTA,
the PylRS/tRNAUACU pair and the A PrpRS/tRNACUA pair. Amino acids 1 and 2
were used at 4 and 1 mM, respectively. (b) Labelling time course for reaction of CaM1i2 0 with 3
and 4. Each reaction was followed for 2 h by in-gel fluorescence and mobility shift.
Figure 13 (example 6) Concerted, quantitative one-pot, dual labeling of Calmodulin in 30
minutes. (a) Dye dependent labeling of CaM1,2 0; sequential labeling with purification after
first labeling in lane 4, sequential labeling without purification in lane 5, one-pot dual labeling
in lane 6. (b) ESI-MS of one-pot protein labeling, before labeling (black, expected mass: 18000
found mass: 18000), after labeling (gold, expected mass: 19233 found mass: 19234). Raw
(before deconvolution) ESI-MS spectra are not shown.
Figure 14 shows Scheme A.
Figure 15 (supplementary figure 15) shows the amino acid and DNA sequence of
Drosophila GFP-amber-mCherry-HA.
GFP (amino acid residues 1-238), Amber codon at position 248, mCherry (amino
acid residues 255-489), HA tag (amino acid residues 491-499), Myc tag (amino acid
residues 500-509), His tag (amino acid residues 510-515) and SV40 NLS (amino acid
residues 523-528).
Figure 16 shows the structure of exemplary amino acid Nε-[((2-methylcyclopropyl)
methoxy)carbonyl]-L-lysine.
Although illustrative embodiments of the invention have been disclosed in detail
herein, with reference to the accompanying drawings, it is understood that the
invention is not limited to the precise embodiment and that various changes and
modifications can be effected therein by one skilled in the art without departing from
the scope of the invention as defined by the appended claims and their equivalents.
Chemical syntheses - general methods
All chemicals and solvents were purchased from Sigma-Aldrich, Alfa Aesar or Fisher
Scientific and used without further purification unless otherwise stated. Qualitative
analysis by thin layer chromatography (TLC) was performed on aluminium sheets
coated with silica (Merck TLC 60F-254). The spots were visualized under a short
wavelength ultra-violet lamp (254 nm) or stained with basic, aqueous potassium
permanganate, ethanolic ninhydrin or vanillin. Flash column chromatography was
performed with specified solvent systems on silica gel 60 (mesh 230-400).
LC-MS analysis was performed on an Agilent 1200 machine. The solvents used consisted
of 0.2% formic acid in water (buffer A) and 0.2% formic acid in acetonitrile (buffer B).
LC was performed using a Phenomenex Jupiter C18 column (150 x 2 mm, 5 µm) and
monitored using variable wavelengths. Retention times (Rt) are recorded to the nearest
0.1 min and m/z ratio to the nearest 0.01 mass units. The following programme was used
for the small molecule LC gradient: 0-1 min (A:B 10:90-10:90, 0.3 mL/min), 1-8 min (A:B
10:90-90:10, 0.3 mL/min), 8-10 min (A:B 90:10-90:10, 0.3 mL/min), 10-12 min (A:B
90:10-10:90, 0.3 mL/min).
Mass spectrometry analysis following LC was carried out in ESI mode on a 6130
Quadrupole spectrometer and recorded in both positive and negative ion modes. NMR
analysis was carried out on a Bruker 400 MHz instrument. All reported chemical shifts
(δ) relative to TMS were referenced to the residual protons in the deuterated solvents used:
d-chloroform (1H δ = 7.26 ppm, 13C δ = 77.16 ppm), d6-dimethylsulfoxide (1H δ =
2.49 ppm, 13C δ = 39.52 ppm), D2O (1H δ = 4.70 ppm). APT or two-dimensional experiments
(COSY, HSQC) were always performed to provide additional information used for
analysis where needed. Coupling constants are given in Hz and described as: singlet -
s, doublet - d, triplet - t, quartet - q, broad singlet - br, multiplet - m, doublet of
doublets - dd, etc. and combinations thereof.
Protein expression, purification and labelling of site-specifically
incorporated 3 in E. coli
Expression and purification of sfGFP-3 from E. coli
Electrocompetent E. coli DH10B cells were
co-transformed with pBK-PylRS and psfGFP150TAGPylT.4,26 Transformed cells were
recovered in S.O.B. (1 mL, supplemented with 0.2% glucose) for 1 h at 37 °C and used to
inoculate LB containing 50 µg/mL kanamycin and 25 µg/mL tetracycline (LB-KT). The cells
were incubated with shaking overnight at 37 °C, 250 r.p.m. 1 mL of overnight culture was used
to inoculate 100 mL of LB-KT½; the day culture was then incubated (37 °C, 250 r.p.m.). At
O.D.600 ~0.3, the culture was divided equally and supplemented with either 3 (1 mM) or H2O
(500 µL) and incubated further (37 °C, 250 r.p.m.). At O.D.600 ~0.6 protein expression was
induced by the addition of arabinose (0.2%). After 4 h, the cells were harvested by
centrifugation (4000 r.p.m., 20 min) and the pellet frozen until further use.
The frozen bacterial pellet was thawed on ice and resuspended in 2.5 mL lysis buffer
(Bugbuster®, Novagen®, 50 µg/mL DNAse I, Roche inhibitor cocktail and 20 mM
imidazole). Cells were incubated (4 °C, 30 minutes) then clarified by centrifugation
(16000 g, 4 °C, 30 minutes). The clarified lysates were transferred to fresh tubes and
100 µL Ni-NTA slurry added. The mixtures were incubated with agitation (4 °C, 1 h) and
then collected by centrifugation (1000 g, 4 °C, 5 min). The beads were resuspended
three times in 500 µL wash buffer (10 mM Tris-HCl, 40 mM imidazole, 200 mM NaCl,
pH 8) and collected by centrifugation (1000 g, 4 °C, 5 min). Finally, the beads were
resuspended in 100 µL elution buffer (10 mM Tris-HCl, 300 mM imidazole, 200 mM
NaCl, pH 8), pelleted by centrifugation (1000 g, 4 °C, 5 min) and the supernatant
collected into fresh tubes. The elution was repeated three times with 100 µL of elution
buffer. The purified proteins were analysed by 4-12% SDS-PAGE and LC-MS.
Protein Mass Spectrometry
Using an Agilent 1200 LC-MS system, ESI-MS was additionally carried out with a 6130
Quadrupole spectrometer. The solvent system consisted of 0.1% formic acid in H2O as
buffer A, and 0.1 % formic acid in acetonitrile (MeCN) as buffer B. Protein UV
absorbance was monitored at 214 and 280 nm. Protein MS acquisition was carried out
in positive ion mode and total protein masses were calculated by deconvolution within
the MS Chemstation software (Agilent Technologies).
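The arithmetic behind deconvolution can be illustrated with a minimal sketch. It is not the Chemstation algorithm, which is not described here. It assumes positive-ion peaks from consecutive protonation states of a single protein. The function names and the synthetic peak list (built from the 17875 Da expected mass quoted for Figure 11) are illustrative assumptions.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton

def charge_of_higher_mz_peak(mz_high, mz_low):
    """Charge z of the peak at mz_high, given the adjacent peak at mz_low (charge z + 1)."""
    # mz_high = M/z + mp and mz_low = M/(z+1) + mp, so
    # (mz_low - mp) / (mz_high - mz_low) = z
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))

def neutral_mass(mz, z):
    """Neutral mass M of an ion observed at m/z with z added protons."""
    return z * (mz - PROTON_MASS)

def deconvolute(mz_values):
    """Average neutral mass estimated from a series of adjacent charge states."""
    peaks = sorted(mz_values, reverse=True)
    masses = []
    for mz_high, mz_low in zip(peaks, peaks[1:]):
        z = charge_of_higher_mz_peak(mz_high, mz_low)
        masses.append(neutral_mass(mz_high, z))
    return sum(masses) / len(masses)

# Synthetic charge-state envelope (z = 15..20) for a hypothetical 17875 Da protein
synthetic_peaks = [(17875.0 + z * PROTON_MASS) / z for z in range(15, 21)]
estimated_mass = deconvolute(synthetic_peaks)
```

Real spectra contain noise, adducts and overlapping species, so production software uses more robust fitting than this pairwise estimate.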
In vitro labeling of purified sfGFP150-3
To purified sfGFP150-1 or sfGFP150-3 protein (~30 µM, in elution buffer) was added
4a (10 molar equivalents, from a 2 mM stock solution in DMSO). The reactants were
mixed by aspirating several times and the mixture was then incubated at room temperature
for 2 hours; a sample was analysed by ESI-MS. Following incubation the proteins were
separated by 4-12% SDS-PAGE and analysed using a Typhoon Trio phosphoimager
(GE Life Sciences).
Time course of sfGFP150-3 and sfGFP150-NorK labelling and rate constant
determination
2 nmol sfGFP-3 (10.6 µM) was labeled at room temperature by the addition of 20 nmol of
tetrazine-dye conjugate 4a (10 µL of a 2 mM solution in DMSO); the samples were mixed by
aspirating several times. At different time points, 8 aliquots were taken from the solution,
quenched with a 700-fold excess of bicyclo[6.1.0]non-4-yn-9-ylmethanol (BCN) and plunged
into liquid nitrogen. Samples were mixed with NuPAGE LDS sample buffer supplemented with
5% β-mercaptoethanol, heated for 10 min to 90 °C and analyzed by 4-12% SDS-PAGE. The
amounts of labelled proteins were quantified by scanning the fluorescent bands with a Typhoon
Trio phosphoimager (GE Life Sciences). Bands were quantified with the ImageQuant TL
software (GE Life Sciences) using rubber band background subtraction. The rate constant was
determined by fitting the data to a single-exponential equation. The calculated observed rate k'
was divided by the concentration of 4a to obtain the rate constant k for the reaction. Measurements
were done in triplicate. All data processing was performed using Kaleidagraph software
(Synergy Software, Reading, UK). For comparison the rate of labelling sfGFP bearing
Nε-5-norbornene-2-yloxycarbonyl-L-lysine (NorK), a known substrate for PylRS, was determined in
a similar way using 1.25mM sfGFP bearing NorK at position 150 (SfGFP-NorK) and 20
equivalents of 4a.
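The two-step analysis described above (obtain an observed rate k' from the time course, then divide by the probe concentration) can be sketched as follows. This is not the Kaleidagraph fit used in the study. It substitutes a simple log-linear least-squares fit that assumes the plateau (fully labelled) signal is known and that the probe is in sufficient excess for pseudo-first-order behaviour. The function names, the 100 µM probe concentration and the synthetic data are illustrative assumptions.

```python
import math

def fit_observed_rate(times_s, labelled_fractions):
    """Estimate k' (s^-1) for f(t) = 1 - exp(-k' * t).

    Uses a least-squares line through the origin for -ln(1 - f) versus t.
    Points with f outside [0, 1) are skipped because the log is undefined.
    """
    numerator = 0.0
    denominator = 0.0
    for t, f in zip(times_s, labelled_fractions):
        if 0.0 <= f < 1.0:
            y = -math.log(1.0 - f)
            numerator += t * y
            denominator += t * t
    return numerator / denominator

def second_order_rate_constant(k_observed, probe_conc_molar):
    """k (M^-1 s^-1) = k' / [probe], valid when the probe is in excess."""
    return k_observed / probe_conc_molar

# Synthetic time course generated from hypothetical values (not measured data)
assumed_k = 27.0             # M^-1 s^-1
assumed_probe_conc = 100e-6  # M
times = [60, 120, 300, 600, 1200]  # s
fractions = [1.0 - math.exp(-assumed_k * assumed_probe_conc * t) for t in times]

k_obs = fit_observed_rate(times, fractions)
k_fit = second_order_rate_constant(k_obs, assumed_probe_conc)
```

A nonlinear fit of the raw band intensities, with the plateau as a free parameter, is more appropriate for noisy gel data; the linearised form is used here only to keep the sketch dependency-free.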
Plasmid construction for pBAD_wtT4L_MbPylTXXX
pBAD_T4L83TAG_MbPylTCUA was digested with NcoI and KpnI restriction enzymes. The same
restriction enzymes were also used to digest the wild-type T4 lysozyme from (D67)
pBAD_wtT4L. The insert and backbone were ligated in a 3:1 ratio using T4 DNA ligase (RT, 2
hours), transformed into chemically competent DH10B cells and grown on tetracycline agar
plates (37 °C, 18 hours). Single colonies were picked and the correct sequence was confirmed
by DNA sequencing (GATC GmbH); this step created pBAD_wtT4L_MbPylTCUA. All final
constructs were confirmed by DNA sequencing.
Proteomic incorporation of 3 via SORT in E. coli expressing T4 lysozyme
Electrocompetent E. coli DH10B cells (50 µL) were either doubly transformed with
pBAD_wtT4L_MbPylTXXX plasmid (2 µL, necessary for expression of PyltRNAXXX and
expresses T4 lysozyme under arabinose control) and pBKwtPylS plasmid (2 µL, necessary for
expression of PylRS) or singly transformed with pBAD_wtT4L_MbPylTXXX alone.
Transformed cells were recovered in 1 mL S.O.B. (supplemented with 0.2% glucose) for 1 h at
37 °C. 100 µL of the recovery was used to inoculate 5 mL LB-KT (50 µg/mL kanamycin and 25
µg/mL tetracycline) or LB-T (25 µg/mL tetracycline). Cultures were incubated overnight (37
°C, 250 r.p.m.). 1 mL of each overnight culture was used to inoculate 15 mL of ½ strength
antibiotic containing media, LB-T or LB-KT. Cultures were incubated at 37 °C until O.D.600
~0.3 was reached; at this time each culture was divided into 5 mL aliquots and supplemented
with either 3 (0.1 mM final conc.) or H2O (50 µL). Cultures were then incubated (37 °C, 250
r.p.m.). At O.D.600 ~0.6, T4 lysozyme expression was initiated by the addition of arabinose (0.2%
final conc.) and cultures were incubated for a further 4 hours. Cells were harvested by centrifugation
(4000 rpm, 4 °C, 20 minutes) and then resuspended three times in 1 mL of ice cold PBS and
collected by centrifugation (4000 rpm, 4 °C, 20 minutes). The final bacterial pellets were
immediately frozen for storage.
E. coli: Chemoselective labelling of proteomes tagged with 3 with tetrazine-dye
conjugates
Frozen bacterial pellets were resuspended in 500 µL PBS and lysed using a bath
sonicator (energy output 7.0, 90 s total sonication time, 10 s blasts and 20 s breaks,
Misonix Sonicator 3000). The lysate was cleared by centrifugation (4 °C, 14000 r.p.m.,
30 min) and the supernatant aspirated to a fresh tube. To 50 µL of cleared cell lysate
was added 4a (2 mM stock in DMSO, final concentration ~20 µM). The reactions
were mixed by aspirating several times and the samples then incubated in the dark
(room temperature, 1 h). After this time 17 µL of 4X LDS sample buffer (supplemented
with 6 mM BCN and 5% BME) was added and mixed by vortexing gently. Samples were
incubated for 10 min before boiling at 90 °C for 10 min. Samples were analysed by 4-12%
SDS-PAGE and fluorescent images were acquired using a Typhoon Trio
phosphoimager (GE Life Sciences).
The same protocol for fluorescent labelling of the E. coli proteins was applied for all
tetrazine-dye conjugates.
Site-specific incorporation of 3 in HEK293 cells and chemoselective
labelling with tetrazine probes
Site-specific incorporation of 3 in HEK cells
HEK293 cells (ATCC CRL-1573) were plated on 24 well plates and grown to near
confluence. The cells were transfected using Lipofectamine 2000 (Invitrogen) with the
pMmPylS-mCherry-TAG-EGFP-HA construct and the p4CMVE-U6-PylT construct.18
After 16 hrs growth with or without 1 mM 3 or with 1 mM 1 the cells were lysed on ice
using RIPA buffer (Sigma). The lysates were spun down and the supernatant was added
to 4X LDS sample buffer (Life Technologies). The samples were run out by SDS-PAGE,
transferred to a nitrocellulose membrane and blotted using primary rat anti-HA (clone
3F10, Roche, No. 1 867 423) and mouse anti-FLAG (clone G191, Abnova, cat.
MAB8183); the secondary antibodies were anti-rat (Invitrogen, A11077) and anti-mouse
(Cell Signaling Technologies, No. 70768).
Labelling site-specifically incorporated 3 from HEK 293 cells
Adherent HEK293T cells (ATCC CRL-11268; 4 x 10^6 per immunoprecipitation) were
transfected with 7.5 µg p4CMVE-U6-PylT and . pPylRS-mCherry-TAG-EGFP-HA
using TransIT-293 transfection reagent according to the manufacturer's protocol and
cultured for 48 hours in DMEM/10% FBS, supplemented with 0.5 mM 1 or 2 mM 3
where indicated. Cells were washed twice with PBS and lysed on ice for 30 minutes in
1 mL Lysis Buffer (150 mM NaCl, 1% Triton X-100, 50 mM Tris HCl (pH 8.0)). After
clarifying the lysate by centrifugation (10 min at 16000 g), HA-tagged proteins were
captured using 50 µL µMACS HA-tag MicroBeads (Miltenyi Biotec) per transfection,
washed with 0.5 mL RIPA (150 mM NaCl, 1% Igepal CA-630, 0.5% sodium
deoxycholate, 0.1% SDS, 50 mM Tris HCl (pH 8.0)) and 0.5 mL PBS (pH 7.4). The
suspension of MicroBeads was incubated with 50 µL PBS (pH 7.4), 20 µM 4a for 1 hour
and subsequently washed with 0.5 mL RIPA to remove excess dye. HA-tagged proteins
were eluted from beads using SDS sample buffer and separated on a 4-12% Bis-Tris
PAGE gel (Invitrogen), imaged using a Typhoon imager (GE Healthcare) and
subsequently stained with DirectBlue or transferred for western blotting with Anti-HA-tag
pAb-HRP-DirecT (MBL).
Expression and purification of sfGFP from mammalian cells
HEK293T cells were transfected in a 10 cm tissue culture dish with 15 µg DNA using PEI and
incubated for 72 hours with 3 (0.5 mM). Cells were washed twice with PBS and lysed in
1 mL RIPA buffer. Cleared lysate was added to qm GFP-Trap® M (ChromoTek) and
incubated for 4 h. Beads were washed with 1 mL RIPA, 1 mL PBS, 1 mL PBS + 500 mM
NaCl, 1 mL ddH2O and eluted in Acetic Acid/ddH2O. Purified protein was labeled
with 2mM 4a for 4 h and loaded on a 4-12% Bis-Tris PAGE gel. Fluorescence of 4a-labeled
sfGFP was detected on a Typhoon imager and the gel was subsequently stained
with DirectBlue.
Fly plasmids, transgenic flies and culture
For all fly experiments no randomisation or blinding was used within this study.
Plasmid construction for transgenic fly line generation
The PyltRNACUA anticodon was mutated using the QuikChange mutagenesis kit and
pSG108 (pJet1.2-U6-PylT, gift from S. Greiss) as a template. This contains the PylT
gene without its 3' terminal CCA fused to the Drosophila U6-b promoter. Primers
FMT19 and FMT20 were used to generate PyltRNA c to decode alanine codons
(creating pFT18); primers FMT23 and FMT24 were used to generate PyltRNAGCT to
decode serine codons (creating pFT20); primers FMT27 and FMT28 were used to
generate PyltRNACAG to decode leucine codons (creating pFT22) and primers FMT29
and FMT30 were used to generate PyltRNACAT to decode methionine codons (creating
pFT23). The mutated tRNA expression cassettes were subcloned from pFT18, pFT20,
pFT22 and pFT23 into pUC18 using EcoRI and HindIII then multimerised using AsiSI,
BamHI and BglII to create 2, then 4 copies of the tRNA. The 4 copy versions of the
tRNA cassette were subcloned into pSG118 using AsiSI and MluI to create pFT58 (Ala),
pFT60 (Ser), pFT62 (Leu) and pFT63 (Met). pSG118 contains the M. mazei PylRS
gene.20
Fly lines and culture conditions
Transgenic lines were created by P element insertion using a Drosophila embryo
injection service (BestGene Inc.). Lines were generated using the following plasmids:
pFT58 (Ala), pFT60 (Ser), pFT62 (Leu) and pFT63 (Met). nos-Gal4-VP16
(Bloomington 4937) and MS1096-Gal4 (Bloomington 8860) were used as Gal4 drivers.
All flies were grown at 25 °C on standard Iberian medium. Flies were fed unnatural
amino acids by mixing dried yeast with the appropriate concentration of amino acid
(usually 10 mM) diluted in dH2O to make a paste. Ovaries were prepared from females
that were grown on Iberian fly food supplemented with a yeast paste with or without
the amino acid for a minimum of 48 hours. For proteome labelling experiments
transgenic male flies of constructs FT58, FT60, FT62 and FT63 were crossed with nos-vp16-GAL4
virgins to generate FT58/nos-vp16-GAL4, FT60/nos-vp16-GAL4,
FT62/nos-vp16-GAL4 and FT63/nos-vp16-GAL4 respectively.
Site specific incorporation of 3 in D. melanogaster
Luciferase assays
Ovaries from 10 females of Triple Rep-L flies recombined with nos-Gal4-VP16 fed 3, 1
or no amino acid were dissected in 100 µL 1x Passive lysis buffer and processed for
luciferase assays as previously described.
Immunoprecipitation and labelling of site specifically incorporated 3
Ovaries from 100 (for control and 3) or 500 (for 1) females were dissected in PBS then
lysed in 300 or 1500 µL RIPA buffer containing 1x complete protease inhibitor cocktail
(Roche). A sample was taken into 4x LDS buffer as a total lysate control then the
remainder was used for immunoprecipitation with GFP-TRAP agarose beads
(Chromotek) following the manufacturer's instructions. The total volume of the IP was
3 mL. After overnight incubation, the beads were washed 2 x with RIPA buffer then 2 x
with PBS. For tetrazine labeling, the beads were resuspended in 200 µL PBS + 4mM 4g
and incubated for 2 hours on a roller at RT. The beads were washed 3 times with 500
µL of wash buffer then resuspended in 4x LDS sample buffer.
Example 1 - Synthesis of Nε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine 3
A class of reaction useful in protein labelling is the very rapid and specific inverse
electron demand Diels-Alder reaction between strained alkenes (or alkynes) and
tetrazines.21-25
While we, and others, have previously encoded unnatural amino acids bearing strained
alkenes, alkynes and tetrazines via genetic code expansion and demonstrated their use
for site-specific protein labelling via inverse electron demand Diels-Alder reactions,26-30
all the molecules used to date are rather large. We have previously shown that a variety
of carbamate derivatives of lysine are good substrates for PylRS,3 and it has been
demonstrated that 1,3 disubstituted cyclopropenes, unlike 3,3 disubstituted
cyclopropenes,32,24 react efficiently with tetrazines.22 We therefore designed and
synthesized a carbamate derivative of lysine, bearing a 1,3 disubstituted cyclopropene
(Nε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine 3, Fig. 1b), for
incorporation into proteins and labelling with tetrazines.
Synthesis of Nε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine (3)
Scheme 1. Synthesis of Nε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine
3. Reagents and conditions: i. Rh2(OAc)4, propyne, CH2Cl2, 4 °C to RT, 75%
yield; ii. DIBAL-H, CH2Cl2, 0 °C to RT; iii. 4-nitrophenyl chloroformate, Hünig's base,
CH2Cl2, RT, 73% yield; iv. Fmoc-Lys-OH, Hünig's base, THF/DMF, 4 °C to RT, 82%
yield; v. NaOH, THF/H2O, RT, 68% yield.
i. Ethyl 2-methylcycloprop-2-ene-1-carboxylate S1
A 100 mL 2-neck round bottom flask was charged with CH2Cl2 (2 mL) and rhodium
acetate (442 mg, 1 mmol, 0.05 eq), and fitted with a dry ice condenser. Propyne
(approx. 10 mL) was condensed into the rhodium acetate suspension and the flask
lowered into a water bath (20 °C); a steady reflux of propyne was obtained. Ethyl
diazoacetate (2.1 mL, 20 mmol, 1 eq) was added to the stirred propyne solution dropwise
over 1 h using a syringe pump. The reaction was stirred at room temperature for a
further 10 minutes, after which TLC analysis showed the reaction to be complete.
The cyclopropene product was then purified by silica gel flash column
chromatography eluting with pentane and diethyl ether (90:10). This gave the desired
product S1 as a colourless volatile liquid (1.9 g, 75% yield). 1H NMR δH (400
MHz, CDCl3) 6.35 (1H, t, J 1.4), 4.18-4.09 (2H, m), 2.16 (3H, d, J 1.3), 2.12 (1H, d, J
1.6), 1.26 (3H, t, J 7.1); LRMS m/z (ES+) 127.2 [M+H]+.
These values are in good agreement with literature (Liao, 2004).
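The reported yield can be cross-checked from the quantities given in this step. The sketch below is a plain stoichiometry calculation. The molecular formula C7H10O2 for S1 is derived here from the compound name rather than stated in the text, and the helper names and rounded average atomic masses are illustrative assumptions.

```python
# Average atomic masses (g/mol), rounded standard values
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

def molar_mass(composition):
    """Molar mass in g/mol from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())

def percent_yield(product_mass_g, composition, limiting_reagent_mmol):
    """Isolated yield relative to the limiting reagent, assuming 1:1 stoichiometry."""
    product_mmol = product_mass_g / molar_mass(composition) * 1000.0
    return 100.0 * product_mmol / limiting_reagent_mmol

# S1: assumed formula C7H10O2; 1.9 g isolated from 20 mmol ethyl diazoacetate
s1_formula = {"C": 7, "H": 10, "O": 2}
s1_molar_mass = molar_mass(s1_formula)
s1_yield = percent_yield(1.9, s1_formula, 20.0)
```

Under these assumptions the calculation gives a yield of about 75%, in line with the value reported above, and a molar mass consistent with the observed [M+H]+ ion at m/z 127.2.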
ii. and iii. (2-Methylcycloprop-2-en-1-yl)methyl (4-nitrophenyl) carbonate S3
DIBAL-H (22.5 mL of a 1 M solution in CH2Cl2, 22.5 mmol, 1.5 eq) was added drop-wise
to a stirred solution of cyclopropene ester S1 (1.9 g, 15 mmol, 1 eq) in CH2Cl2 (15 mL) at
-10 °C. The reaction was stirred at -10 °C for 20 minutes before quenching with the
cautious addition of H2O ( mL), then NaOH ( mL of a 1 M solution in H2O) and H2O
(2.3 mL). The mixture was stirred for a further 2 h at room temperature before it was
dried (Na2SO4) and filtered. Hünig's base (3.9 mL, 22.5 mmol, 1.5 eq) was added to the
filtrate (containing crude cyclopropene alcohol S2) followed by the addition of 4-nitrophenyl
chloroformate (3.3 g, 16.5 mmol, 1.1 eq). After stirring at room temperature
for 18 hours a significant colourless precipitate formed, and TLC analysis showed
complete consumption of the crude cyclopropene alcohol S2. The reaction was diluted
with CH2Cl2 and then dry loaded onto silica gel, whereby the activated carbonate S3
was purified by silica gel column chromatography eluting with ethyl acetate and hexane
(20:80). This gave the desired cyclopropene carbonate S3 as a colourless oil (2.7 g, 73%
yield over 2 steps). 1H NMR δH (400 MHz, CDCl3) 8.28 (2H, d, J 9.2), 7.39
(2H, d, J 9.2), 6.62 (1H, s), 4.21 (1H, dd, J 10.9, 5.3), 4.14 (1H, dd, J 10.9, 5.3), 2.18 (3H,
d, J 1.3), 1.78 (1H, td, J 5.3, 1.3).
iv. Nα-(Fmoc)-Nε-(((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl)-L-lysine S4
Fmoc-Lys-OH·HCl (6.7 g, 16.5 mmol, 1.5 eq) was dissolved in THF (30 mL) and DMF
(10 mL); to this solution was added Hünig's base (9.0 mL, 55.0 mmol, 5 eq) followed by
cyclopropene carbonate S3 (2.7 g, 11.0 mmol, 1 eq). An immediate yellow coloration was
observed upon addition of the carbonate. The reaction was stirred at room temperature
for 6 hours and was adjudged complete by the consumption of starting material after
this time as shown by TLC analysis. The crude reaction mixture was dry loaded onto
silica gel and the major product purified by silica gel column chromatography eluting
with ethyl acetate, hexane and acetic acid (50:49:1 then 99:0:1). This gave the desired
product S4 as a colourless gum (4.3 g, 82% yield). 1H NMR δH (400 MHz,
CDCl3) 7.77 (2H, t, J 7.6), 7.65-7.55 (2H, m), 7.39 (2H, t, J 7.6), 7.31 (2H, t, J 7.3), 6.54
(1H, s), 5.68-5.57 (1H, m), 4.84 (1H, br-s), 4.44-4.32 (2H, m), 4.22 (1H, t, J 7.0), 3.98-3.87
(1H, m), 3.17-3.09 (2H, m), 2.15-2.06 (6H, m), 1.99-1.86 (1H, m), 1.84-1.70 (1H,
m), 1.68-1.59 (1H, m), 1.58-1.34 (2H, m); LRMS m/z (ES+) 479.3 [M+H]+, 501.3
[M+Na]+, m/z (ES-) 477.2.
v. Nε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine 3
Nα-(Fmoc)-Nε-(((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl)-L-lysine S4 (3.5 g,
7.0 mmol, 1 eq) was dissolved in THF and H2O (3:1, 40 mL); to this solution was added
sodium hydroxide (0.9 g, 22.6 mmol, 3.1 eq). The reaction was stirred at room
temperature for 8 hours, after which time the reaction was adjudged complete by LC-MS
analysis. The reaction mixture was diluted with H2O (100 mL) and the pH adjusted
to 5 by the addition of HCl (1 M). The aqueous solution was washed with Et2O (5 x 100
mL), then concentrated to dryness yielding a colourless solid. The solid was purified by
preparative HPLC, the product fractions were combined and the solvent removed by
freeze-drying. This gave Nε-[((2-methylcycloprop-2-en-1-yl)methoxy)carbonyl]-L-lysine
3 as a colourless solid. δH (400 MHz, D2O) 6.45 (1H, s), 3.90-3.61 (2H, m), 3.09
(1H, t, J 6.4), 2.98-2.86 (2H, m), 1.92 (3H, s), 1.52-1.37 (2H, m), 1.37-1.22 (2H, m),
1.21-1.08 (2H, m), 0.83 (1H, d, J 5.2). LRMS m/z (ES+) 257.2 [M+H]+, m/z (ES-) 255.2
[M-H]-. δC (100 MHz, D2O) 101.1 (CH), 72.3 (CH2), 55.9 (CH), 40.2 (CH2), 34.3 (CH2),
28.9 (CH2), 20.3 (CH2), 16.6 (CH3), 10.8 (CH). HRMS (ES+) Found: (M+Na)+ 279.1302.
C12H20N2O4Na required M+, 279.1315.
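The HRMS comparison can be reproduced with a short calculation. The sketch below computes the monoisotopic m/z of the sodium adduct for the composition C12H20N2O4Na (taken as the formula of 3 plus Na) and the deviation of the found value in ppm. The function names are illustrative assumptions, and the isotope masses are standard values rounded to seven decimal places.

```python
# Monoisotopic masses (Da) of the most abundant isotopes
MONOISOTOPIC_MASS = {
    "C": 12.0000000,
    "H": 1.0078250,
    "N": 14.0030740,
    "O": 15.9949146,
    "Na": 22.9897693,
}
ELECTRON_MASS = 0.0005486  # Da

def cation_mz(composition, charge=1):
    """m/z of a cation: sum of atom masses minus the mass of the lost electrons."""
    total = sum(MONOISOTOPIC_MASS[el] * n for el, n in composition.items())
    return (total - charge * ELECTRON_MASS) / charge

def ppm_error(found, calculated):
    """Signed mass error in parts per million."""
    return (found - calculated) / calculated * 1e6

sodium_adduct = {"C": 12, "H": 20, "N": 2, "O": 4, "Na": 1}
calculated_mz = cation_mz(sodium_adduct)
error_ppm = ppm_error(279.1302, calculated_mz)
```

The calculated value rounds to 279.1315, matching the required mass quoted above, and the found value lies within about 5 ppm of it.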
Example 2 - Encoding the site-specific incorporation of 3 in E. coli
We demonstrated that 3 is efficiently and site-specifically incorporated into
recombinant proteins in response to the amber codon using the PylRS/tRNACUA pair
and an SfGFP gene bearing an amber codon at position 150 (Supplementary Fig.
2a). The yield of protein is 8 mg per litre of culture, which is greater than that obtained
for a well-established efficient substrate for PylRS, Nε-[(tert-butoxy)carbonyl]-L-lysine 1
(4 mg per litre of culture).33 Electrospray ionisation mass spectrometry of SfGFP
bearing 3 at position 150 (SfGFP-3) confirms the incorporation of the unnatural amino
acid (Supplementary Fig. 2b). SfGFP-3 was specifically labelled with the fluorescent
tetrazine probe 4a, while SfGFP-1 was left unlabelled (Supplementary Fig. 2b). 2
nmol of SfGFP-3 was quantitatively labelled with 10 equivalents of 4a in 30 minutes, as
judged by both fluorescence imaging and mass spectrometry (Supplementary Fig.
2b). The second order rate constant for labelling SfGFP-3 with 4a was 27 ± 1.8 M^-1 s^-1
(Supplementary Fig. 2c).26
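As a rough consistency check, the reported rate constant can be related to the observation of quantitative labelling within 30 minutes. The sketch below assumes pseudo-first-order kinetics with the probe in excess and an illustrative probe concentration of 100 µM. That concentration and the function names are assumptions made for this example only.

```python
import math

def fraction_labelled(k_second_order, probe_conc_molar, time_s):
    """Fraction of protein labelled after time_s under pseudo-first-order conditions."""
    return 1.0 - math.exp(-k_second_order * probe_conc_molar * time_s)

def half_life_s(k_second_order, probe_conc_molar):
    """Time for half of the protein to be labelled."""
    return math.log(2.0) / (k_second_order * probe_conc_molar)

k = 27.0             # M^-1 s^-1, value reported above
probe_conc = 100e-6  # M, hypothetical probe concentration

labelled_after_30_min = fraction_labelled(k, probe_conc, 30 * 60)
t_half = half_life_s(k, probe_conc)
```

Under these assumptions more than 99% of the protein is labelled after 30 minutes, with a half-life of a little over four minutes.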
Since PylRS does not recognize the anticodon of its cognate tRNA,3 it is possible to alter
the anticodon of this tRNA to decode distinct codons. We created a new tRNA in which
the anticodon of PyltRNACUA was converted from CUA to UUU (Supplementary
Table 1), to decode a set of lysine codons. We added 0.1 mM 3 to cells containing
PylRS, PyltRNAUUU, and the gene for T4 lysozyme. Following expression of T4
lysozyme we detected proteins in the lysate bearing 3 with the tetrazine probe 4a (20
µM, 1 h, Supplementary Fig. 3). Control experiments show that the observed
labelling requires the presence of the synthetase and tRNA, and electrospray ionization
mass spectrometry demonstrates the incorporation of 3 in place of lysine in T4
lysozyme (Supplementary Fig. 4). The addition of 3 (0.1 or 0.5 mM) has little or no
effect on cell growth (Supplementary Fig. 5), suggesting that the amino acid is not
toxic at the concentration used, and there is substantial labelling within 1 h of amino
acid addition (Supplementary Fig. 6).
Example 3 - Genetic encoding of 3 in human cells
Full-length mCherry-3-GFP-HA was expressed in HEK293 cells carrying the
PylRS/tRNACUA pair and mCherry-TAG-EGFP-HA (a fusion between the mCherry gene
and the EGFP gene with a C-terminal HA tag, separated by the amber stop codon
(TAG)).18 Full-length protein was detected only in the presence of 3 (Fig. 8a; full
gels in Supplementary Fig. 11). mCherry-3-EGFP-HA was selectively labelled with
4a, while mCherry-1-EGFP-HA was not labelled (Fig. 8b),18 demonstrating the site-specific
incorporation of 3 with the PylRS/tRNACUA pair in human cells.
Example 4 - Genetic encoding of 3 in D. melanogaster
We demonstrated that 3 can be site specifically incorporated into proteins in D.
melanogaster. To achieve this, we used flies containing the PylRS/tRNACUA pair (with
the tRNA expressed ubiquitously from a U6 promoter and UAS-PylRS expression
directed to ovaries using a nos-vp16-GAL4 driver), and a dual luciferase reporter
bearing an amber codon between firefly and renilla luciferase.20 We observe a strong
luciferase signal that is dependent on the addition of 1 or 3, and the dual luciferase
signal is larger with 3. These experiments demonstrate that 3 is taken up by flies and is
more efficiently incorporated in vivo in response to an amber codon than 1 (Fig. 10a),
a known excellent substrate for PylRS. 3 may be supplied by feeding food
supplemented with amino acid 3 at 10 mM. In additional experiments, we
demonstrated by western blot the efficient incorporation of 3 into a GFP-TAG-mCherry-HA
construct (Supplementary Fig. 15) expressed in ovaries20 (Fig. 10b),
and the specific fluorescent labelling of the incorporated amino acid with 4g (Fig.
10c).
Example 5 - Synthesis of Tetrazine-BODIPY FL 4
Scheme 3. Synthesis of tetrazine-biotin 4. Reagents and conditions: i. HCl, dioxane,
RT, 100% yield; ii. Bodipy-FL-NHS ester, Hünig's base, DMF, RT.
i. S6
Boc-protected Tetrazine S5 was synthesized using the procedure reported earlier.6 4 M
HCl in dioxane (500 µL, 2.0 mmol) was added to a stirring solution of Tetrazine S5 (8
mg, 0.02 mmol) in DCM (500 µL). The reaction was carried out for 2 h at room
temperature and subsequently the solvent was removed under reduced pressure to
yield primary amine hydrochloride S6 as a pink solid (6 mg, 0.02 mmol, 100%). The
compound was directly used in the next step without any further purification.
ii. 4d
BODIPY FL succinimidyl ester (5 mg, 0.013 mmol, Life Technologies) and Hünig's base
(50 µL, 2.8 mmol) were added to the solution of Tetrazine-amine S6 (6 mg, 0.02 mmol)
in dry DMF (1 mL). The reaction mixture was stirred at room temperature for 16 h. The
reaction mixture was diluted with 4 mL of water and the product was purified by semi-preparative
reverse phase HPLC using a gradient from 10% to 90% of buffer B in buffer
A (buffer A: H2O; buffer B: acetonitrile). The identity and purity of the tetrazine-BODIPY FL
conjugate 4 was confirmed by LC-MS. ESI-MS: [M-H]-, calcd. 581.38,
found 581.2.
SUMMARY OF EXAMPLES 1 to 5
We have characterized the synthesis of, and the genetically encoded, site-specific
incorporation of a cyclopropene containing amino acid 3, and demonstrated the
quantitative labelling of 3, with tetrazine probes, in proteins expressed in E. coli,
mammalian cells and D. melanogaster, thereby showing the widespread utility and
industrial application of the present invention.
Supplementary References to Examples 1 to 5
1. Gautier, A. et al. Genetically Encoded Photocontrol of Protein Localization in
Mammalian Cells. Journal of the American Chemical Society 132, 4086-4088
(2010).
2. Karp, N.A., Kreil, D.P. & Lilley, K.S. Determining a significant change in protein
expression with DeCyder during a pair-wise comparison using two-dimensional
difference gel electrophoresis. Proteomics 4, 1421-1432 (2004).
3. Karp, N.A. & Lilley, K.S. Design and analysis issues in quantitative proteomics
studies. Proteomics 7 Suppl 1, 42-50 (2007).
4. Lilley, K.S. in Current Protocols in Protein Science (John Wiley & Sons, Inc.,
2001).
5. Von Stetina, J.R., Lafever, K.S., Rubin, M. & Drummond-Barbosa, D. A Genetic
Screen for Dominant Enhancers of the Cell-Cycle Regulator alpha-Endosulfine
Identifies Matrimony as a Strong Functional Interactor in Drosophila. G3
(Bethesda) 1, 607-613 (2011).
6. Lang, K. et al. Genetically encoded norbornene directs site-specific cellular
protein labelling via a rapid bioorthogonal reaction. Nat Chem 4, 298-304
(2012).
Example 6 - Dual Labelling of Proteins
The ability to attach two distinct molecules to programmed sites in proteins will facilitate a
variety of applications including FRET1,2 to study protein structure, conformation and
dynamics. Several approaches for doubly labeling proteins have been reported. One
approach relies on the installation of one unnatural amino acid that is specifically labeled in
combination with cysteine thiol labeling, but this approach is generally limited to proteins
that do not contain free thiols.3,4 Chemical ligation approaches can be combined with the
genetic encoding of a single unnatural amino acid for protein labeling,5 but this may limit
the size and/or sites that may be labeled. Perhaps the most generally applicable approach
for protein double labelling is based on the genetic incorporation of two distinct amino
acids in response to two distinct codons introduced at user defined sites in the gene of
interest.
An ideal strategy for dual labeling requires i) the efficient, cellular, incorporation of two
distinct unnatural amino acids into a protein that can be labelled in mutually orthogonal
reactions, and ii) the development of mutually orthogonal reactions that allow the
simultaneous addition of two molecules to the protein for rapid, quantitative labelling of
the protein in aqueous media at physiological pH, temperature and pressure.
Scheme A (Figure 14) shows concerted, rapid, one-pot quantitative dual labelling of
proteins in aqueous medium at physiological pH and temperature. (a) Unnatural amino
acids and fluorophores used in this example. (b) Concerted labeling at an encoded terminal
alkyne and an encoded cyclopropene via mutually orthogonal cycloadditions.
A limited range of chemistries has been investigated for the double labeling of proteins
containing pairs of unnatural amino acids. The incorporation of azide- and alkyne-containing
amino acids, and their non-quantitative labeling with alkyne and azide based
fluorophores, has been reported,7 but this is not ideal for double labeling of proteins; if the
encoded azide and alkyne are in proximity they can react to form a triazole in the protein, a
strategy which allows genetically directed protein stapling,6 but precludes labeling with
probes. Moreover, an efficient one-pot reaction is not feasible because the azide- and
alkyne-bearing probes react with each other. The incorporation of ketone
and azide containing amino acids has been reported,8,10 which allows one-pot reaction of
the encoded ketone with alpha effect nucleophiles, and the azides with alkyne probes.10
However, this approach is problematic because encoded azides are subject to reduction in
many proteins when expressed in E. coli,8,11 which prevents quantitative labeling.
Moreover, ketone labeling with alpha effect nucleophiles is very slow (rate constant
approximately 10^-4 M^-1 s^-1) and the reaction is optimal at pH 4-5.5,12 which limits its utility
for many proteins that are denatured or precipitate when kept for long periods under acidic
conditions. We recently genetically installed a deactivated tetrazine containing amino acid13
and a norbornene containing amino acid14-16 into proteins using our optimized orthogonal
translation system.9 Because the rate of the inverse electron demand Diels-Alder reaction
between the deactivated tetrazine and norbornene is very slow, but the tetrazine can react
with bicyclononyne based probes and the norbornene can react with activated tetrazine
probes, we were able to use this approach to specifically and quantitatively double label
proteins.9 While this approach has the advantage of proceeding in aqueous media at
physiological pH, temperature and pressure, it does require sequential labeling steps (to
avoid inverse electron demand reactions between probes), each of which takes several
hours, with purification between steps. All approaches reported to date for doubly labeling
proteins at genetically encoded unnatural amino acids take tens of hours to days to reach
completion.
An ideal approach to double label proteins would allow rapid, one-pot labeling of genetically
installed bio-orthogonal functional groups, proceed rapidly in aqueous media at
physiological pH, temperature and pressure, and be implemented simply by adding the
labeling reagents to a recombinant protein bearing the site specifically incorporated
bioorthogonal groups. A promising pair of mutually orthogonal reactions for one-pot
labeling under aqueous conditions at physiological pH is the Cu(I)-catalysed 3+2
cycloaddition between azides and terminal alkynes,17 and the inverse electron demand
Diels-Alder reaction of a strained alkene and a tetrazine23 (Figure 11). The reaction of
strained alkynes and azides can also be orthogonal to strained alkene tetrazine reactions,
but since tetrazines react with strained alkynes this approach requires careful tuning of the
rate constants for each reaction.24 No combination of 3+2 cycloaddition and inverse
electron demand Diels-Alder reaction has been demonstrated for protein labelling.
We demonstrated in examples 1 to 5 that a 1,3 disubstituted cyclopropene containing amino
acid, 2 (referred to as 3 in examples 1 to 5 and elsewhere in this document), can be
efficiently and site specifically incorporated into proteins using the PylRS/tRNACUA pair.25
This amino acid, unlike the 3,3 disubstituted cyclopropene incorporated for photoclick
reactions,26 reacts with tetrazines25,27 with on-protein rate constants of 27 M^-1 s^-1. Here we
demonstrate the efficient genetic encoding of a terminal alkyne containing amino acid 1 and
a cyclopropene containing amino acid 2 into a single protein and their rapid, quantitative,
one-pot labeling with azide and tetrazine probes (Figure 11). This work provides the first
approach to the concerted double labeling of proteins in a one-pot process under aqueous
conditions, at physiological pH, and provides a step change in the speed of double labeling,
from days in previous work to 30 minutes in the approach reported here.
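To give a feel for what an on-protein rate constant of 27 M^-1 s^-1 implies, the following minimal sketch estimates labelling time under pseudo-first-order conditions, that is, assuming the probe is in large excess over the protein. The probe concentrations and the function name are hypothetical and chosen only for illustration; this example does not state reaction volumes or concentrations.

```python
import math


def time_to_fraction(k_m1_s1: float, probe_molar: float, fraction: float) -> float:
    """Seconds needed to label `fraction` of the protein.

    Assumes the probe is in large excess, so its concentration stays
    effectively constant (pseudo-first-order approximation).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    k_obs = k_m1_s1 * probe_molar  # observed first-order rate constant, s^-1
    return -math.log(1.0 - fraction) / k_obs


# On-protein rate constant quoted in the text (M^-1 s^-1).
K_CYCLOPROPENE_TETRAZINE = 27.0

if __name__ == "__main__":
    # Hypothetical probe concentrations (assumptions, not from the source).
    for probe_um in (10, 100, 1000):
        seconds = time_to_fraction(K_CYCLOPROPENE_TETRAZINE, probe_um * 1e-6, 0.99)
        print(f"{probe_um:>5} uM probe: {seconds / 60:.1f} min to 99% labelling")
```

Because the observed rate scales linearly with probe concentration, a tenfold increase in probe shortens the estimated labelling time tenfold within this approximation.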
Proteins containing either 1 or 2 were overexpressed to examine the specificity and
orthogonality of the proposed labeling reactions. A fusion protein of glutathione-S-transferase
and calmodulin (GST-CaM) with amino acid 1 at position 1 in calmodulin was
expressed from cells containing ribo-Q1 (an evolved orthogonal ribosome28,29), O-gst-cam1TAG
(a fusion gene between glutathione S transferase (gst) and calmodulin (cam) on
an orthogonal message30 in which the first codon of cam is replaced with a TAG codon),
and MjPrpRS/tRNACUA (a synthetase / tRNA pair developed for incorporating 1 in response
to the TAG codon),31 grown in the presence of 1 (4 mM). The GST tag was subsequently
removed by cleavage using thrombin at an engineered thrombin-cleavage site between GST
and CaM. CaM1 (CaM containing 1 at position 1, ~100 pmol) was labelled with the azide
containing fluorophore 3 (2 nmol), in a Cu(I)-catalysed click reaction. The reaction was
quantitative as judged by both the quantitative shift of the fluorescently labelled protein by
SDS-PAGE and electrospray ionization mass spectrometry (ESI-MS) (Figure 11a).
The cyclopropene containing amino acid, 2, was site specifically incorporated at position 40
of calmodulin. The modified protein was expressed in cells bearing the PylRS/tRNACUA pair
(that efficiently directs the site specific incorporation of 2),25 ribo-Q1, and O-gst-cam40TAG,
grown in the presence of 2 (1 mM). CaM240 (~100 pmol) (obtained after thrombin cleavage
of the GST tag) was labelled with the tetrazine containing fluorophore 4 (2 nmol). The
reaction was quantitative as judged by both the quantitative shift of the fluorescently
labelled protein by SDS-PAGE and electrospray ionization mass spectrometry (ESI-MS)
(Figure 11b). CaM240 was not labeled with 3 under the conditions that led to quantitative
labeling of CaM1 with 3 (Figure 11a). Similarly, CaM1 was not labeled with 4 under
conditions where CaM240 was quantitatively labeled with 4. These experiments
demonstrate that the two labeling reagents react quantitatively with their target amino acid,
but do not react with their non-targeted unnatural amino acid in proteins.
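The cross-labelling controls described above amount to a small truth table: each probe should label only the protein carrying its target amino acid. The sketch below records the four reported outcomes and checks them for mutual orthogonality. The data structure and function name are illustrative assumptions; the outcomes themselves are the ones stated in the preceding paragraphs.

```python
# Reported outcomes: (protein, probe) -> was the protein labelled?
# CaM1 carries the alkyne amino acid 1; CaM240 carries the cyclopropene amino acid 2.
# Probe 3 is the azide fluorophore; probe 4 is the tetrazine fluorophore.
observed = {
    ("CaM1", "3"): True,
    ("CaM1", "4"): False,
    ("CaM240", "3"): False,
    ("CaM240", "4"): True,
}

# The probe each protein is intended to react with.
intended = {"CaM1": "3", "CaM240": "4"}


def mutually_orthogonal(observed: dict, intended: dict) -> bool:
    """True if every protein is labelled by its intended probe and by no other."""
    for (protein, probe), labelled in observed.items():
        expected = intended[protein] == probe
        if labelled != expected:
            return False
    return True


if __name__ == "__main__":
    print(mutually_orthogonal(observed, intended))
```

A single off-target reaction, for example the tetrazine probe labelling the alkyne protein, would make the check return False.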
Next we investigated labeling 1 and 2 within the same protein. We site-specifically
incorporated 1 and 2 at positions 1 and 40 of calmodulin to produce CaM1 240 (Figure 12).
We directed the incorporation of amino acid 1 with an MjPrpRS/tRNACUA pair and the
incorporation of amino acid 2 with the evolved PylRS/tRNAUACU pair, which efficiently
decodes the quadruplet AGTA codon on orthogonal messages using ribo-Q1.9 Unnatural
amino acids were incorporated in response to UAG and AGTA codons at positions 1 and 40
in calmodulin, within a GST-calmodulin gene on an orthogonal message (O-gst-cam1TAG-
40AGTA). Expression of full-length GST-CaM1 240 was dependent on the addition of amino
acids 1 and 2 to E. coli, and ESI-MS demonstrated the genetically directed incorporation of
amino acids 1 and 2 (Figure 12c). The yield of full length GST-CaM1 240 was ~2 mg per L
of culture.
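As an illustration of how a triplet amber codon and a quadruplet codon can be placed at defined positions in a coding sequence, the sketch below substitutes codons 1 and 40 of a placeholder reading frame. The placeholder sequence (repeated GCT codons), its length and the function name are hypothetical; the actual calmodulin sequence is not reproduced here.

```python
def place_codons(sense_codons: list[str], replacements: dict[int, str]) -> str:
    """Return the coding sequence with the given 1-based codon positions replaced.

    A quadruplet replacement lengthens the message by one nucleotide, so the
    resulting sequence is only read in frame by a system that decodes that
    quadruplet as a single codon.
    """
    out = list(sense_codons)
    for position, codon in replacements.items():
        if not 1 <= position <= len(out):
            raise IndexError(f"codon position {position} is out of range")
        out[position - 1] = codon
    return "".join(out)


if __name__ == "__main__":
    # Hypothetical 45-codon placeholder frame (not the real cam gene).
    placeholder = ["GCT"] * 45
    message = place_codons(placeholder, {1: "TAG", 40: "AGTA"})
    print(message[:3], message[117:121], len(message))
```

In this placeholder the quadruplet begins at nucleotide index 117 (39 preceding triplets), and the message is one nucleotide longer than 45 triplets.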
To determine the time required to quantitatively label CaM1 240 with azide 3 or tetrazine 4
we incubated 100 pmol of CaM1 240 with 2 nmol of either 3 or 4 and followed each reaction
by both mobility shift on SDS-PAGE and fluorescent imaging upon labeling (Figure 12b).
These experiments demonstrate that fluorophore labeling is complete in 30 minutes.
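The quantities used in these labelling reactions correspond to a fixed molar excess of probe over protein. A minimal sketch of that arithmetic is given below; the function name is an illustrative assumption, while the amounts are those stated in the text.

```python
def molar_excess(probe_nmol: float, protein_pmol: float) -> float:
    """Fold molar excess of probe over protein (1 nmol = 1000 pmol)."""
    if protein_pmol <= 0:
        raise ValueError("protein amount must be positive")
    return (probe_nmol * 1000.0) / protein_pmol


if __name__ == "__main__":
    # 2 nmol of probe with 100 pmol of protein, as used in this example.
    print(molar_excess(2, 100))  # prints 20.0
```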
Next we investigated the labeling of CaM1 240 with both 3 and 4 (Figure 13). We first
tested the addition of 4 (2 nmol) to CaM1 240 (100 pmol) followed by purification to
remove free 4, and subsequent labelling with 3 (2 nmol) (Figure 13a lane 4). This led to
efficient double labelling as judged by SDS-PAGE mobility shift and fluorescence imaging.
Next we performed sequential labeling without purification by incubating CaM1 240 with 4
for 30 minutes and then adding 3 and click reagents and incubating for a further 30 min
(Figure 13a lane 5). This also led to efficient double labelling as judged by SDS-PAGE
mobility shift and fluorescence imaging. Finally, we simultaneously added 4 (2 nmol), 3 (2
nmol) and click reagents to CaM1 240 (100 pmol) and incubated for 30 minutes (Figure
13a lane 6). This again led to efficient double labelling as judged by SDS-PAGE mobility
shift and fluorescence imaging. In all doubly labeled proteins we observe a decrease in the
BODIPY-FL fluorescence relative to the singly labeled control upon excitation at 488 nm
(compare lanes 4, 5, and 6 to lane 3 in Figure 13a), consistent with in gel Förster
resonance energy transfer (FRET) between BODIPY-FL and BODIPY-TMR-X. ESI-MS
further demonstrates that this concerted, one-pot protocol leads to genetically directed,
efficient, rapid and quantitative double labeling of proteins.
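A decrease in donor fluorescence of this kind is commonly summarised as an apparent FRET efficiency, E = 1 - F_DA / F_D, where F_D is the donor signal without acceptor and F_DA is the donor signal with acceptor present. The sketch below applies that standard relationship. The band intensities are hypothetical numbers for illustration only; this example reports the decrease qualitatively and gives no intensity values.

```python
def apparent_fret_efficiency(donor_only: float, donor_with_acceptor: float) -> float:
    """Apparent FRET efficiency from donor quenching: E = 1 - F_DA / F_D."""
    if donor_only <= 0:
        raise ValueError("donor-only intensity must be positive")
    if donor_with_acceptor < 0:
        raise ValueError("intensity cannot be negative")
    return 1.0 - donor_with_acceptor / donor_only


if __name__ == "__main__":
    # Hypothetical in-gel donor band intensities (arbitrary units).
    singly_labelled = 1000.0   # donor-only control
    doubly_labelled = 400.0    # donor in the presence of acceptor
    efficiency = apparent_fret_efficiency(singly_labelled, doubly_labelled)
    print(f"apparent FRET efficiency: {efficiency:.2f}")
```

Such an estimate is only meaningful when the compared lanes carry equal amounts of donor-labelled protein, which is an assumption of this sketch.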
In summary, in this example we show an efficient and rapid protocol for expressing
recombinant proteins bearing a site specifically incorporated alkyne and a site specifically
incorporated cyclopropene. We demonstrate that the inverse electron demand Diels-Alder
reaction of an encoded 1,3 disubstituted cyclopropene and tetrazine probe, and the 3+2
cycloaddition reaction of the encoded alkyne and azide probe, are mutually orthogonal to
each other and to the functional groups in proteins. By combining the genetic encoding of
an alkyne and a cyclopropene in a single protein and labelling with the mutually orthogonal
reactions we demonstrate the concerted, one-pot, rapid double labeling of a protein in
aqueous media at physiological pH and temperature. This strategy has utility for doubly
labeling proteins for a variety of studies and applications, and may be extended to the
double labeling of diverse molecules in diverse cells and organisms.
Note on example 6: The chemical designations in example 6 and in the corresponding
figures (drawings) discussed in example 6 are self-contained and apply only to example 6.
Discussion of chemical designations in the rest of this document are consistent with the
exception of example 6. For example, the skilled reader will immediately appreciate that
compound 2 of example 6 corresponds to compound 3 in the rest of this document (i.e. the
exemplary cyclopropene amino acid of the invention). Compounds 3 and 4 of example 6
are tetrazine compounds.
REFERENCES TO EXAMPLE 6
(1) Zhang, J.; Campbell, R. E.; Ting, A. Y.; Tsien, R. Y. Nature Reviews Molecular Cell Biology 2002, 3, 906.
(2) Kajihara, D.; Abe, R.; Iijima, I.; Komiyama, C.; Sisido, M.; Hohsaka, T. Nat Methods 2006, 3, 923.
(3) Brustad, E. M.; Lemke, E. A.; Schultz, P. G.; Deniz, A. A. J Am Chem Soc 2008, 130, 17664.
(4) Nguyen, D. P.; Elliott, T.; Holt, M.; Muir, T. W.; Chin, J. W. J Am Chem Soc 2011, 133, 11418.
(5) Wissner, R. F.; Batjargal, S.; Fadzen, C. M.; Petersson, E. J. J Am Chem Soc 2013, 135, 6529.
(6) Neumann, H.; Wang, K.; Davis, L.; Garcia-Alai, M.; Chin, J. W. Nature 2010, 464, 441.
(7) Wan, W.; Huang, Y.; Wang, Z.; Russell, W. K.; Pai, P. J.; Russell, D. H.; Liu, W. R. Angew Chem Int Ed Engl
2010, 49, 3211.
(8) Chatterjee, A.; Sun, S. B.; Furman, J. L.; Xiao, H.; Schultz, P. G. Biochemistry 2013,
(9) Wang, K.; Sachdeva, A.; Cox, D. J.; Wilf, N. M.; Wallace, S.; Mehl, R. A.; Chin, J. W. submitted.
(10) Wu, B.; Wang, Z.; Huang, Y.; Liu, W. R. Chembiochem: a European journal of chemical biology 2012, 13,
1405.
(11) Sasmal, P. K.; Carregal-Romero, S.; Han, A. A.; Streu, C. N.; Lin, Z.; Namikawa, K.; Elliott, S. L.; Koster, R.
W.; Parak, W. J.; Meggers, E. ChemBioChem 2012, 13, 1116.
(12) Rotenberg, S. A.; Calogeropoulou, T.; Jaworski, J. S.; Weinstein, I. B.; Rideout, D. Proceedings of the
National Academy of Sciences of the United States of America 1991, 88, 2490.
(13) Seitchik, J. L.; Peeler, J. C.; Taylor, M. T.; Blackman, M. L.; Rhoads, T. W.; Cooley, R. B.; Refakis, C.; Fox, J.
M.; Mehl, R. A. J Am Chem Soc 2012, 134, 2898.
(14) Lang, K.; Davis, L.; Torres-Kolbus, J.; Chou, C.; Deiters, A.; Chin, J. W. Nat Chem 2012, 4, 298.
(15) Plass, T.; Milles, S.; Koehler, C.; Szymanski, J.; Mueller, R.; Wießler, M.; Schultz, C.; Lemke, E. A.
Angewandte Chemie International Edition 2012, 51, 4166.
(16) Kaya, E.; Vrabel, M.; Deiml, C.; Prill, S.; Fluxa, V. S.; Carell, T. Angewandte Chemie International Edition
2012, 51, 4466.
(17) Wang, Q.; Chan, T. R.; Hilgraf, R.; Fokin, V. V.; Sharpless, K. B.; Finn, M. G. J Am Chem Soc 2003, 125,
3192.
(18) Devaraj, N. K.; Weissleder, R. Accounts of Chemical Research 2011, 44, 816.
(19) Yang, J.; Seckute, J.; Cole, M.; Devaraj, N. K. Angewandte Chemie International Edition 2012, 51, 7476.
(20) Blackman, M. L.; Royzen, M.; Fox, J. M. J Am Chem Soc 2008, 130, 13518.
(21) Lang, K.; Davis, L.; Wallace, S.; Mahesh, M.; Cox, D. J.; Blackman, M. L.; Fox, J. M.; Chin, J. W. J Am Chem
Soc 2012, 134, 10317.
(22) Borrmann, A.; Milles, S.; Plass, T.; Dommerholt, J.; Verkade, J. M. M.; Wießler, M.; Schultz, C.; van Hest, J.
C. M.; van Delft, F. L.; Lemke, E. A. ChemBioChem 2012, 13, 2094.
(23) Schoch, J.; Staudt, M.; Samanta, A.; Wiessler, M.; Jaschke, A. Bioconjug Chem 2012, 23, 1382.
(24) Karver, M. R.; Weissleder, R.; Hilderbrand, S. A. Angew Chem Int Ed Engl 2012, 51, 920.
(25) Bianco, A.; Elliott, T. S.; Townsley, F. M.; Pisa, R.; Davis, L.; Elsasser, S. J.; Ernst, R. J.; Lang, K.; Sachdeva,
A.; Chin, J. W. Under Review.
(26) Yu, Z.; Pan, Y.; Wang, Z.; Wang, J.; Lin, Q. Angewandte Chemie International Edition 2012, 51, 10600.
(27) Kamber, D. N.; Nazarova, L. A.; Liang, Y.; Lopez, S. A.; Patterson, D. M.; Shih, H. W.; Houk, K. N.; Prescher,
J. A. J Am Chem Soc 2013, 135, 13680.
(28) Wang, K.; Schmied, W. H.; Chin, J. W. Angew Chem Int Ed Engl 2012, 51, 2288.
(29) Wang, K.; Neumann, H.; Peak-Chew, S. Y.; Chin, J. W. Nature biotechnology 2007, 25, 770.
(30) Rackham, O.; Chin, J. W. Nature chemical biology 2005, 1, 159.
(31) Deiters, A.; Schultz, P. G. Bioorganic & Medicinal Chemistry Letters 2005, 15, 1521.

claims
1. A polypeptide comprising an amino acid having a cyclopropene group wherein
said cyclopropene group is joined to the amino acid via a carbamate group.
2. A polypeptide according to claim 1 wherein said cyclopropene group is a
1,3-disubstituted cyclopropene.
3. A polypeptide according to claim 2 wherein said cyclopropene is a
1,3-dimethylcyclopropene.
4. A polypeptide according to any of claims 1 to 3 wherein said cyclopropene group
is present as a residue of a lysine amino acid.
5. A polypeptide according to any of claims 1 to 4 further comprising a tetrazine
compound linked to said cyclopropene group.
6. An amino acid comprising cyclopropene wherein said cyclopropene group is
joined to the amino acid moiety via a carbamate group.
7. An amino acid according to claim 6 wherein said cyclopropene is a
1,3-disubstituted cyclopropene.
8. An amino acid according to claim 7 wherein said cyclopropene is a
1,3-dimethylcyclopropene.
9. An amino acid according to any of claims 6 to 8 wherein said amino acid is a
lysine amino acid.
10. An amino acid according to claim 9 which comprises Nε-[((2-methylcycloprop-
2-en-1-yl)methoxy)carbonyl]-L-lysine.
11. An amino acid according to claim 10 which consists of
12. A method of producing a polypeptide comprising a cyclopropene group wherein
said cyclopropene group is joined to the amino acid moiety via a carbamate group, said
method comprising genetically incorporating an amino acid comprising a cyclopropene
group joined to the amino acid moiety via a carbamate group, into a polypeptide.
13. A method according to claim 12 wherein producing the polypeptide comprises
(i) providing a nucleic acid encoding the polypeptide which nucleic acid comprises
an orthogonal codon encoding the amino acid having a cyclopropene group;
(ii) translating said nucleic acid in the presence of an orthogonal tRNA
synthetase-tRNA pair capable of recognising said orthogonal codon and incorporating
said amino acid having a cyclopropene group into the polypeptide chain.
14. A method according to claim 12 or claim 13 wherein said orthogonal codon
comprises an amber codon (TAG), said tRNA comprises tRNACUA and said tRNA
synthetase comprises PylRS; or wherein said orthogonal codon comprises an amber
codon (TAG), said tRNA comprises tRNACUA and said tRNA synthetase comprises
PylRS.
15. A method according to any of claims 12 to 14 wherein said amino acid
comprising a cyclopropene group is an amino acid according to any of claims 6 to 11.
16. A method of producing a polypeptide comprising a tetrazine group, said method
comprising providing a polypeptide according to any of claims 1 to 4, contacting said
polypeptide with a tetrazine compound, and incubating to allow joining of the tetrazine
to the cyclopropene group by an inverse electron demand Diels-Alder cycloaddition
reaction.
17. A method according to claim 16 wherein said reaction is allowed to proceed for
10 minutes or less, preferably for 1 minute or less, preferably for 30 seconds or less.
18. A polypeptide according to any of claims 1 to 5 wherein said polypeptide
comprises two or more amino acids each having a cyclopropene group, wherein each
said cyclopropene group is joined to each said amino acid via a carbamate group.
19. A polypeptide according to claim 18 wherein said polypeptide comprises four
amino acids each having a cyclopropene group.
20. An antibody drug conjugate (ADC) comprising a polypeptide according to any of
claims 1 to 5, 18 or 19.
21. A compound, polypeptide or method substantially as described herein.
22. A compound, polypeptide or method substantially as described herein with
reference to the accompanying drawings.

Documents

Application Documents

# Name Date
1 Form 5 [09-08-2016(online)].pdf 2016-08-09
2 Form 20 [09-08-2016(online)].pdf 2016-08-09
3 Drawing [09-08-2016(online)].pdf 2016-08-09
4 Description(Complete) [09-08-2016(online)].pdf 2016-08-09
5 201617027126.pdf 2016-08-18
6 abstract.jpg 2016-09-03
7 Form 26 [29-09-2016(online)].pdf 2016-09-29
8 Form 3 [15-11-2016(online)].pdf 2016-11-15
9 201617027126-FORM 3 [17-08-2017(online)].pdf 2017-08-17
10 201617027126-FORM 18 [16-01-2018(online)].pdf 2018-01-16
11 201617027126-PA [16-07-2019(online)].pdf 2019-07-16
12 201617027126-ASSIGNMENT DOCUMENTS [16-07-2019(online)].pdf 2019-07-16
13 201617027126-8(i)-Substitution-Change Of Applicant - Form 6 [16-07-2019(online)].pdf 2019-07-16
14 201617027126-Further Evidence [29-07-2019(online)].pdf 2019-07-29
15 201617027126-RELEVANT DOCUMENTS [18-08-2020(online)].pdf 2020-08-18
16 201617027126-FORM 13 [18-08-2020(online)].pdf 2020-08-18
17 201617027126-AMENDED DOCUMENTS [18-08-2020(online)].pdf 2020-08-18
18 201617027126-FORM 4(ii) [24-08-2020(online)].pdf 2020-08-24
19 201617027126-OTHERS [26-11-2020(online)].pdf 2020-11-26
20 201617027126-FER_SER_REPLY [26-11-2020(online)].pdf 2020-11-26
21 201617027126-DRAWING [26-11-2020(online)].pdf 2020-11-26
22 201617027126-CORRESPONDENCE [26-11-2020(online)].pdf 2020-11-26
23 201617027126-COMPLETE SPECIFICATION [26-11-2020(online)].pdf 2020-11-26
24 201617027126-CLAIMS [26-11-2020(online)].pdf 2020-11-26
25 201617027126-PatentCertificate28-11-2020.pdf 2020-11-28
26 201617027126-IntimationOfGrant28-11-2020.pdf 2020-11-28
27 201617027126-Power of Authority [17-03-2021(online)].pdf 2021-03-17
28 201617027126-PETITION u-r 6(6) [17-03-2021(online)].pdf 2021-03-17
29 201617027126-Covering Letter [17-03-2021(online)].pdf 2021-03-17
30 201617027126-RELEVANT DOCUMENTS [30-09-2021(online)].pdf 2021-09-30
31 201617027126-FER.pdf 2021-10-17
32 201617027126-RELEVANT DOCUMENTS [30-09-2022(online)].pdf 2022-09-30
33 201617027126-RELEVANT DOCUMENTS [11-11-2023(online)].pdf 2023-11-11

Search Strategy

1 Search_23-01-2020.pdf

ERegister / Renewals

3rd: 17 Mar 2021 (From 10/03/2017 - To 10/03/2018)
4th: 17 Mar 2021 (From 10/03/2018 - To 10/03/2019)
5th: 17 Mar 2021 (From 10/03/2019 - To 10/03/2020)
6th: 17 Mar 2021 (From 10/03/2020 - To 10/03/2021)
7th: 17 Mar 2021 (From 10/03/2021 - To 10/03/2022)
8th: 17 Mar 2021 (From 10/03/2022 - To 10/03/2023)
9th: 03 Mar 2023 (From 10/03/2023 - To 10/03/2024)
10th: 09 Mar 2024 (From 10/03/2024 - To 10/03/2025)
11th: 26 Feb 2025 (From 10/03/2025 - To 10/03/2026)