
Expression Of Heterologous Proteins In Bacterial System Using A Novel Fusion Tag

Abstract: A fusion tag comprising the C-terminus amino acids of human Granulocyte Macrophage Colony Stimulating Factor (GM-CSF), a START codon and an enterokinase cleavage site, adapted to express otherwise non-expressible genes. A process for formation of the fusion tag comprising amplification of the C-terminus 45 amino acids from a full-length human GM-CSF synthetic gene using a gene-specific forward primer which contains a specific restriction site providing the START codon, and a reverse primer containing sequences corresponding to the enterokinase cleavage site. A fusion protein comprising the fusion tag towards the N terminus, said fusion tag comprising the C-terminus amino acids of human GM-CSF, a START codon and an enterokinase cleavage site and adapted to express otherwise non-expressible genes, and a non-GM peptide towards the C terminus. A method of producing the fusion protein, comprising transforming a host cell with the DNA expression vector and expressing said fusion protein. A method of purifying or isolating the fusion protein using affinity chromatography. An expression vector comprising a DNA sequence coding for a fusion protein, said fusion protein comprising a fusion tag capable of being used in N-terminus tagging, wherein the fusion tag has the 45 amino acids of human GM-CSF, a START codon and an enterokinase cleavage site.


Patent Information

Application #
Filing Date
03 July 2008
Publication Number
2/2010
Publication Type
INA
Invention Field
BIOTECHNOLOGY
Status
Email
Parent Application

Applicants

LUPIN LIMITED
LUPIN LIMITED, 159 CST ROAD KALINA, SANTACRUZ (EAST), MUMBAI-400 098, STATE OF MAHARASHTRA, INDIA AND ALSO HAVING A PLACE OF BUSINESS AT 1/1, SASHI SHEKHAR BOSE ROAD, KOLKATA

Inventors

1. BANERJEE SAMPALI
LUPIN LIMITED (RESEARCH PARK), 46A/47A, VILLAGE NANDE, TALUKA MULSHI, PUNE 411042
2. APTE DESHPANDE, ANJALI
LUPIN LIMITED (RESEARCH PARK), 46A/47A, VILLAGE NANDE, TALUKA MULSHI, PUNE 411042
3. MANDI NAGANATH
LUPIN LIMITED (RESEARCH PARK), 46A/47A, VILLAGE NANDE, TALUKA MULSHI, PUNE 411042
4. PADMANABHAN SRIRAM
LUPIN LIMITED (RESEARCH PARK), 46A/47A, VILLAGE NANDE, TALUKA MULSHI, PUNE 411042

Specification

Field of Invention
The present invention relates to a fusion tag comprising amino acids of human origin
and having a very high potential to aid in expression of foreign genes. More particularly,
the invention relates to a fusion tag which comprises the C-terminus domain of human
Granulocyte-Macrophage Colony Stimulating Factor (hGM-CSF), a START codon and an
enterokinase cleavage site and is adapted for expression of otherwise non-expressible
genes. The said tag is capable of expressing genes with rare codons in prokaryotic
cells without a supply of rare codons. The present invention further relates to the use of
the said fusion tag in protein purification, to an expression vector comprising the fusion
tag, and to a fusion protein comprising the fusion tag at its N terminus and a non-GM
peptide at its C terminus. The invention also relates to a method of formation of the
vector and to a kit comprising the said fusion tag.
Background and Prior Art
With the advent of recombinant DNA technology, production of heterologous proteins in
E. coli has become the expression system of choice for most researchers. It has
certain advantages over other systems, viz. it is very well characterized and has relatively
simple genetics, high growth and production rates, low cost, etc. However, the
disadvantages are not few: there is no post-translational modification and, most importantly,
high-level expression causes the formation of insoluble protein aggregates (known as
inclusion bodies), which are mostly inactive [Harrison: Innovations, 11:4-7 (2000)]. To obtain
an active protein, optimization of the expression conditions or refolding studies are
required, which can be time consuming and cost intensive. Harrison's paper teaches
new fusion protein systems based on a solubility model. Three E. coli proteins,
BFR, GrpE and NusA, were identified that gave a high level of solubility when expressed
as a fusion with the target heterologous protein human interleukin-3 (hIL-3). Apart from
the solubility problem, many mammalian proteins cannot be expressed successfully in E. coli,
which leaves the researcher to explore expression either in a wide range of E. coli hosts or
with different temperatures and fusion tags [Cabrita et al., BMC Biotechnology 6 12-19
2006]. Cabrita et al. teach rationally designed T7-based E. coli expression
vectors that incorporate solubility tags to assist in expression/purification, a
TEV-cleavable sequence and a LIC sequence. Thus there are many commercial and
non-commercial E. coli expression vectors available that incorporate fusion tags (both for
purification and enhanced solubility).

Literature evidence shows that formation of secondary structures in transcribed mRNA
reduces expression of heterologous genes. These secondary structures interfere with
the binding of the ribosome to the mRNA and thereby prevent efficient translation initiation.
These deleterious secondary structures most likely occur due to short-range RNA-RNA
interactions. An N-terminal tag may provide a reliable context for efficient translation by
reducing the possibility of secondary structure formation in the mRNA and thus may improve
the yield of recombinant proteins [Trends in Biotechnology (2005), 23, 316-320].
Also, sequence determinants at both the N- and C-termini of proteins can influence their
stability towards protease degradation. In some cases, fusion tags might improve the
yield of recombinant proteins by rendering them more resistant to intracellular proteases.
Therefore, one approach to deal with this difficulty has been to express proteins with N- or
C-terminal fusion tags. Several fusion tags are available for ease of
expression and purification of recombinant proteins. The smallest fusion tag available is
the His-tag (6-10 aa) [JBC 263 7211-7215 1988], but it has the potential problem of leakage
of the Ni2+ ions used for purification. The other tags available include thioredoxin (109 aa)
[Biotechnology 11 187-193 1993], GST-tag (236 aa) [Gene 67 31-40 1988], MBP-tag
(363 aa) [Eur J Biochem 171 541-549 1988] and NusA (435 aa) [Biotechnol Bioeng 65
382-388 1999].
As mentioned in US 7261895, a number of epitope tags are commercially
available. Often they are incorporated into expression vectors for mammalian, insect,
yeast or bacterial cells. A variety of epitope tags are available, including c-myc, FLAG,
HA, His6, T7-Tag, HSV-Tag, Pk-Tag, VSV-Tag, Glu-Glu, BTag and S-Tag. Most
epitopes that have been popular for epitope tagging are highly charged (HA, c-myc,
FLAG). Since one generally aims to place the tag in an external portion of the target
protein, it is appropriate that the tag be charged rather than hydrophobic. However, a
tag of extreme or inappropriate charge could cause problems in some cases, for
example if a basic domain of a protein is tagged with an acidic sequence. The epitope
tags without highly charged amino acids are T7-Tag and BTag. US '895 describes a tag of
ten amino acid residues, SSTSSDFRDR, and designates this sequence
E2Tag. All ten amino acids are required and are sufficient for strong interaction
with monoclonal antibody 3F12. The first half of the sequence consists of polar amino
acids, and the second half contains charged amino acid residues, resulting in a very
hydrophilic peptide. Such highly hydrophilic sequences have strong antigenicity and
are correspondingly likely to adopt a highly exposed conformation in the
three-dimensional folding of a protein.

Most of these tags are affinity tags and facilitate the purification of the fused proteins.
Some of them (Trx, NusA etc.) are also reported to increase the solubility of the proteins
when overexpressed [Harrison: Innovations, 11:4-7 (2000)], and there are many
commercial and non-commercial E. coli expression vectors available that incorporate
fusion tags (for both purification and solubility). Most of these tags are large in size and
of microbial origin.
Some proteins are known to be expressed in E. coli only after supply of rare codons in
the host, which are known to play a regulatory function [Annals New York Academy of
Sciences, 782: 79-86, 1996; Kanakura et al., Blood 77 1033-1043 1991]. Thus there is a
need to provide fusion tags comprising eukaryotic genes which may be expressed in E.
coli without a supply of the rare codons which play a regulatory role.
Human granulocyte-macrophage colony stimulating factor (hGM-CSF) is a glycoprotein
growth factor which induces the proliferation of hematopoietic progenitor cells and
functionally activates mature granulocytes and monocytes. The processed human GM-CSF
is a 127 amino acid polypeptide of which about 5-10 amino acids at the amino and
the carboxy ends are not essential, while a synthetic fragment comprising residues 31 to
113 was biologically active [Kanakura et al., Blood 77 1033-1043 1991].
US 6379661 teaches novel polypeptides possessing part or all of the primary structural
conformation and one or more of the biological properties of a mammalian (e.g., human)
pluripotent GRANULOCYTE colony-stimulating factor ("hpG-CSF") which
are characterized in preferred forms by being the product of prokaryotic or eukaryotic
host expression of an exogenous DNA sequence. Sequences coding for
part or all of the sequence of amino acid residues of hpG-CSF or for
analogs thereof may be incorporated into autonomously replicating plasmid or
viral vectors employed to transform or transfect suitable prokaryotic or eukaryotic host
cells such as bacteria, yeast or vertebrate cells in culture. Products of expression of the
DNA sequences display, e.g., the physical and immunological properties and in vitro
biological activities of isolates of hpG-CSF derived from natural sources. Disclosed also
are chemically synthesized polypeptides sharing the biochemical and immunological
properties of hpG-CSF. It thus discloses formation of a peptide of mammalian
origin and its expression in a bacterial host, and shows formation of vectors with the DNA
of granulocyte colony-stimulating factor ("hpG-CSF"). Thus there is teaching that
DNA from a mammalian source may be expressed in E. coli and that its characteristics are
retained. However, at the end of Example 4 it is mentioned that, in order to express the
genes in the host, a special codon such as a start codon must be present.
Specifically, it mentions that the sequence is not readily susceptible to securing direct
expression of hpG-CSF in a microbial host and that, to secure such expression, the hpG-CSF
coding region should be provided with an initial ATG codon and the sequence
should be inserted in a transformation vector at a site under control of a suitable
promoter-regulator DNA sequence. Accordingly, though this art teaches that
mammalian peptides can be expressed in a microbial host, it requires specific start
codons in order for them to be expressed in prokaryotic cells.
US 5359035 teaches a bifunctional fusion protein comprising IL-2 and GM-CSF which are
linked together. The said protein is capable of being expressed in bacteria. In this art the
inventors have fused IL-2 and GM-CSF together via a linker amino acid sequence. In
that fusion construct, both proteins are complete and retain their respective biological
activities. However, there is no teaching that a few amino acids from the C terminus could
be used to form a fusion protein which would provide enhanced expression of otherwise
non-expressible genes.
US 2008/096249, in one of its embodiments, teaches formation of fusion
proteins with tags where the tag includes a cleavage site.
EP1599589 (WO 2004/076670) teaches fusion proteins where tagging
is carried out by biotinylation of the N terminus of peptides. Use of such
tags in fusion proteins increases the
expression of genes in the host cells. This document thus discloses enhancement of
expression by use of fusion tags. However, the said tagging is carried out by
post-translational manipulation, i.e. the fusion polypeptide needs to be biotinylated after
synthesis for easy purification.
The GM fusion protein does not need any post-translational modification for purification.
It is well known that GM-CSF contains a heparin-binding domain and that this
domain lies at its C terminus. The present inventors have found that this
property can be exploited to use the said C-terminus region as a fusion tag and to purify the
tagged proteins described here in a single step.

Objects of the Invention
An object of the present invention is to provide a fusion tag comprising 45 amino acids of
the C terminus of human GM-CSF with a START codon and an enterokinase cleavage
site and adapted to express otherwise non-expressible genes.
Another object is to provide a fusion tag capable of expression in a cell
without requirement of rare codons.
A further object is to provide a fusion protein comprising a non-GM protein and a fusion
tag comprising 45 amino acids of the C terminus of human GM-CSF with a START codon
and an enterokinase cleavage site and adapted to express otherwise non-expressible genes.
Another object of the invention is to provide a vector comprising the fusion tag.
A further object is to provide a process for preparation of the fusion tag.
Another object is to provide a method of producing the fusion protein comprising the
following steps: a) transforming a host cell with the DNA expression vector and b)
expressing said fusion protein.
A further object is to provide a method of purifying a protein with the fusion tag.
Another object is to provide a kit comprising the expression vector comprising the fusion
tag.
Summary of the Invention
An aspect of the present invention is to provide a fusion tag of 45 amino acids of the C
terminus portion of human Granulocyte Macrophage Colony Stimulating Factor, START
codon and enterokinase cleavage site and adapted to express otherwise non expressible
genes in prokaryotes.
According to an aspect there is provided a fusion protein comprising a fusion tag towards
the N terminus, said fusion tag comprising C-terminus amino acids of human
Granulocyte Macrophage Colony Stimulating Factor (GM-CSF), a START codon and an
enterokinase cleavage site and adapted to express otherwise non-expressible genes
when cloned at the C terminus of the fusion tag.

According to another aspect there is provided an expression vector comprising the fusion
tag. The vector may have one or more cloning sites.
According to a further aspect there is provided a process for formation of the fusion tag
comprising amplification of the C-terminus 45 amino acids from a full-length human GM-CSF
synthetic gene using a gene-specific forward primer which contains a specific
restriction site providing the START codon and a reverse primer containing sequences
corresponding to the enterokinase cleavage site.
According to another aspect there is provided a method of producing the fusion protein
comprising transforming a host cell with the DNA expression vector as defined above
and expressing said fusion protein.
According to a further aspect there is provided a method of purifying protein with the
fusion tag, said process comprising cloning the protein of interest to the expression
vector, exposing to heparin sepharose, eluting the bound protein and ultimately obtaining
the protein of interest bound to the heparin by the fusion tag which has affinity for binding
to heparin.
According to another aspect there is provided a kit comprising the expression vector
comprising the fusion tag.
Brief description of accompanying drawings
Figure 1: Amplification of the C terminus of GM-CSF forming the fusion tag (GM tag)
Figure 2: Colony PCR of pET21a-GM transformants
Figure 3: Restriction digestion of two PCR-positive clones of a new vector comprising the
fusion tag (GM tag) of the present invention, named pCGM
Figure 4: Amplification of hIFN α2b
Figure 5: Colony PCR screening for IFN transformants
Figure 6: Restriction analysis of PCR-positive clone
Figure 7: SDS-PAGE analysis of rhIFN expression with or without GM tag in E. coli host
BL21(DE3) Codon Plus cells
Figure 8: Expression of GM-rhIFN in different E. coli expression hosts
Figure 9: SDS-PAGE analysis of hIL11 and hIL2 with and without GM tag, expressed in
BL21A1
Figure 10: Immunoblot analysis of GM-fusion protein (for example GM-GCSF fusion
protein) with antibodies against GMCSF and GCSF
Figure 11: Cloning of C-terminal GMCSF into pET21a to construct pCGM (5.55 kbp)
Figure 12: Enterokinase cleavage site in commercial vectors
Figure 13: Enterokinase cleavage site in the vector of the present invention
Detailed Description of The Invention
In the present invention, the C-terminus part of GM-CSF is used as a
fusion tag to aid the expression of an otherwise non-expressible gene. This portion of
GM-CSF may or may not retain the biological activity of full-length GM-CSF. The idea of
using truncated GM-CSF as an N-terminus fusion partner is hitherto unknown. This domain
of hGM-CSF has been found to have the properties to be used as a fusion partner to
achieve high-level expression or to escape the requirement of rare codons for certain
genes in a prokaryotic host, preferably a bacterial host. In the fusion protein, the
C-terminus domain of hGM-CSF, i.e. the fusion tag, is located towards the N terminus of the
fusion protein and a non-GM peptide is located towards the C terminus of the fusion
protein.
The use of an antibody to the tag provides a rapid method for identifying, isolating,
purifying and quantifying the fusion protein. The antibody to the tag can be
affixed onto sepharose or other beads suitable for column separation. The sample
containing the chimeric protein is passed over the column containing the sepharose
beads, or the binding is performed in batch by end-over-end mixing, and the fusion
protein having the tag is bound to the beads via the coupled specific antibody. After a
series of washes, either the protein is eluted from the column using standard
elution techniques, or the expression and/or the function of the fusion protein is studied
using the matrix-attached tagged protein.
The tag of the invention is used to detect protein expression in transformed
bacteria as well as in transfected eukaryotic cells. The tag is also used to identify the
cellular localization of the fusion protein in the intact cell and to track the
protein through the cellular milieu
by immunofluorescence using commercially available anti-hGMCSF antibody.
The fusion tag has an affinity to bind to heparin [Sebollela et. al., Journal of Biological
Chemistry 280 31049-31956; 2005] and thus can be purified by affinity chromatography
using immobilized heparin sepharose matrices.
As used herein the term "fusion tag" refers to the peptide having 45 amino acids of the C
terminus of human GM CSF. As used herein, the term "tagging" refers to introducing by
recombinant methods one or more nucleotide sequences encoding a peptide tag into a
polypeptide encoding gene. "Fusion protein" refers to the protein whose N terminus is
formed by the fusion tag comprising the C terminus portion of human GM CSF and a non
GM peptide at the C terminus.
This is the first report of the smallest fusion tag of human origin useful for expression of
otherwise non-expressible genes. This fusion tag contains an enterokinase cleavage site
and has been designed in such a way that the peptide fused to the said tag can
be obtained with no extra amino acid at its N terminus after enterokinase cleavage. The
fusion tag increases the expression
of heterologous protein either by reducing the probability of formation of secondary
structures near the ribosome binding site, thereby helping efficient translation
initiation, or by increasing mRNA or protein stability.
The fusion tag has the following nucleotide sequence (SEQ ID 1)
5'atgcactacaagcagcactgccctccaaccccggaaacttcctgtgcaacccagattatc
acctttgaaagtttcaaagagaacctgaaggactttctgcttgtcatcccctttgactgc
tgggagccagtcggatccgatgatgatgataaa3'
The corresponding amino acid sequence of the fusion tag is as under (SEQ ID 2)
MHYKQHCPPTPETSCATQII
TFESFKENLKDFLLVIPFDC
WEPVGSDDDDK (DDDDK is the EK cleavage site)

Alternatively expressed as Met His Tyr Lys Gln His Cys Pro Pro Thr Pro Glu Thr Ser
Cys Ala Thr Gln Ile Ile Thr Phe Glu Ser Phe Lys Glu Asn Leu Lys Asp Phe Leu Leu Val
Ile Pro Phe Asp Cys Trp Glu Pro Val Gly Ser Asp Asp Asp Asp Lys
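The correspondence between SEQ ID 1 and SEQ ID 2 can be checked by translating the nucleotide sequence with the standard genetic code. The following is a minimal sketch for illustration only; the function and variable names are not part of the specification.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID 1 as printed in the specification (5' to 3').
SEQ_ID_1 = (
    "atgcactacaagcagcactgccctccaaccccggaaacttcctgtgcaacccagattatc"
    "acctttgaaagtttcaaagagaacctgaaggactttctgcttgtcatcccctttgactgc"
    "tgggagccagtcggatccgatgatgatgataaa"
)

# SEQ ID 2 as printed in the specification (one-letter code).
SEQ_ID_2 = "MHYKQHCPPTPETSCATQII" "TFESFKENLKDFLLVIPFDC" "WEPVGSDDDDK"


def translate(dna):
    """Translate a coding sequence codon by codon, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


protein = translate(SEQ_ID_1)
print(protein)
print("matches SEQ ID 2:", protein == SEQ_ID_2)
print("ends with EK site DDDDK:", protein.endswith("DDDDK"))
```

The check confirms that the tag reading frame starts at the ATG and ends with the GGA TCC (Gly Ser, BamHI) and DDDDK (enterokinase) codons.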
According to the present invention, the fusion tag removes the need to supply rare codons for
certain genes such as hIFN α2b, which cannot otherwise be expressed in an ordinary host, a
supply of rare codons being a prerequisite for interferon expression in E. coli [Annals New
York Academy of Sciences; Olivares-Trejo et al., Molecular Microbiology, 2003, 49,
1043-1049].
It has been shown that genes containing rare codons (with respect to E. coli codon
preference) near the initiation codon AUG inhibit cell growth and protein synthesis due
to ribosome stalling and premature release of specific peptidyl-tRNAs from the ribosome
at the rare codons [Molecular Microbiology (2003), 49, 1043-1049].
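The position of rare codons relative to the initiation codon can be inspected with a short script. The sketch below is illustrative only: the set of rare codons (AGG, AGA, CGA, CTA, ATA, CCC) is an assumption based on commonly cited E. coli rare codons and is not taken from this specification, and the function name and the window parameter are hypothetical. It is applied here to the GM tag coding sequence (SEQ ID 1).

```python
# Assumed set of E. coli rare codons (DNA alphabet); not defined by the specification.
RARE_CODONS = {"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"}

# GM tag coding sequence (SEQ ID 1), 5' to 3'.
GM_TAG = (
    "atgcactacaagcagcactgccctccaaccccggaaacttcctgtgcaacccagattatc"
    "acctttgaaagtttcaaagagaacctgaaggactttctgcttgtcatcccctttgactgc"
    "tgggagccagtcggatccgatgatgatgataaa"
)


def rare_codon_positions(cds, window=None):
    """Return (1-based codon number, codon) for each rare codon in frame.

    If window is given, only the first `window` codons after the start are scanned.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if window is not None:
        codons = codons[:window]
    return [(n + 1, c) for n, c in enumerate(codons) if c in RARE_CODONS]


print("rare codons in the whole tag:", rare_codon_positions(GM_TAG))
print("rare codons in the first 25 codons:", rare_codon_positions(GM_TAG, window=25))
```

Under this assumed codon set, placing the tag ahead of a gene of interest moves any rare codons of that gene at least 51 codons away from the AUG, which is consistent with the reasoning given above but is not by itself evidence for the mechanism.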
Accordingly, the presence of rare-codon-specific tRNAs in abundance is not sufficient for
expression of a gene with rare codons; RNA stability and secondary structure have an
immense effect on the expression of these genes [Biochemical and Biophysical
Research Communications (2004), 313, 89-96].
The present fusion tag facilitates the expression of genes containing rare codons
either by altering RNA secondary structure formation or by increasing mRNA
stability, without an additional supply of tRNAs corresponding to the rare codons.
The fusion tag of the present invention (GM tag) increases the expression of heterologous
protein either by reducing the probability of formation of secondary structures near the
ribosome binding site, thereby helping efficient translation initiation, or by increasing
mRNA or protein stability.
It is known that bacterial protein synthesis initiates at a START codon (ATG in the case of
E. coli) and that the majority of genes of mammalian origin are expressed along with a signal
peptide which is later processed to yield the mature peptide. Therefore, in most
cases mature peptides (like hGCSF) do not contain methionine (coded by ATG) as the first
amino acid. To express these mature peptides in E. coli, one has to provide a START
codon for efficient initiation of protein synthesis. In the present invention, the GM
fusion tag is provided with a START codon, and any gene, which may or may not contain ATG
depending on the sequence of the mature peptide, can be cloned at the C terminus of the
GM tag and expressed.

According to another aspect there is provided an expression vector comprising a DNA
sequence coding for a fusion protein, said fusion protein comprising a fusion tag capable of
being used in N-terminus tagging, wherein the fusion tag has the 45 amino acids of human
GM-CSF with a START codon and an enterokinase cleavage site, and further a gene of interest
to be tagged with said fusion tag, the expressed protein of interest being directly cleavable
at the cleavage site without intervening amino acids. The vector is such that a gene of
interest which was otherwise non-expressible without a tag is expressed in higher amount.
The vector is such that the gene of interest is expressed without supplement of
START codons (if the mature peptide does not have methionine as its first amino acid) in the
host, unlike other vectors known in the art. Such a vector, designated pCGM, may be
constructed and made available to the public. Further, the fusion construct (i.e. vector
pCGM) could be made available commercially.
Formation of the expression fusion tag:
Specific primers are designed to amplify the C terminus of GM-CSF from a synthetic gene
construct corresponding to human GMCSF as described in Example 1. The forward
primer contains a sequence corresponding to an NdeI restriction site, which
incorporates the START codon in the fusion tag. The reverse primer is designed to have a
BamHI restriction site followed by a stretch of sequence that codes for the enterokinase
cleavage site and then an EcoRI site. Thus the enterokinase cleavage site is
incorporated in the fusion tag, and the fusion tag comprising the 45 amino acids of the C
terminus of hGMCSF, the START codon and the enterokinase cleavage site is formed. This
is illustrated in Figure 11.
Formation of the vector from the fusion tag
A schematic presentation of the vector construct with GM tag (I) containing the enterokinase
cleavage site (II) is shown in Figure 11. A commercially available expression vector is
digested with specific enzymes and ligated to the fusion tag of the present invention. The
ligation mix is introduced into an appropriate E. coli host by transformation.
The gene of interest (GOI, III) is expressed as an N-terminus GM fusion and
can be cleaved by enterokinase to obtain the protein of interest without any extra amino
acid at its N terminus, since the tag does not add any amino acid at this end.
The gene of interest (GOI), which is to form the C-terminus part of the fusion while the GM
fusion tag of the present invention forms the N terminus, is PCR amplified using a
gene-specific 5' primer containing a BamHI restriction site followed by sequence
corresponding to the EK site and a 3' primer containing an EcoRI site (see Example 2). This
amplicon is cloned into pCGM at the BamHI/EcoRI sites.
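The primer layout described above for the gene of interest can be sketched as follows. This is an illustrative sketch under stated assumptions, not the primer design actually used: the 5' clamp ccg ccg is copied from the example primers of this specification and assumed to be reusable, the annealing length of 18 nt is arbitrary, and the demonstration gene is a hypothetical placeholder sequence.

```python
BAMHI = "GGATCC"             # BamHI recognition site (encodes Gly Ser in frame)
ECORI = "GAATTC"             # EcoRI recognition site
EK_SITE = "GATGATGATGATAAA"  # encodes DDDDK, as in SEQ ID 5
CLAMP = "CCGCCG"             # 5' overhang; reuse here is an assumption

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (upper case)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_goi_primers(gene, anneal=18):
    """Build a forward primer (BamHI + EK site + gene start) and a reverse primer (EcoRI + gene end)."""
    gene = gene.upper()
    for name, site in (("BamHI", BAMHI), ("EcoRI", ECORI)):
        if site in gene:
            raise ValueError(f"gene contains an internal {name} site")
    forward = CLAMP + BAMHI + EK_SITE + gene[:anneal]
    reverse = CLAMP + ECORI + reverse_complement(gene[-anneal:])
    return forward, reverse


# Hypothetical placeholder open reading frame, for demonstration only.
demo_gene = "ATGGCTAAAGGTGAACTGCTGACCCGTTCTGGTAAACACTAA"
fwd, rev = design_goi_primers(demo_gene)
print("forward 5'-" + fwd + "-3'")
print("reverse 5'-" + rev + "-3'")
```

In this layout the first codon of the gene of interest directly follows the AAA (Lys) codon of the EK site, so enterokinase cleavage leaves no vector-derived residue on the product.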
The vector with the fusion tag is such that no extra amino acids remain at the N terminus
of the synthesized protein after enterokinase cleavage. This vector with the fusion tag is
capable of being used to clone any foreign gene.
The construct has been designed in such a way that after EK cleavage, the next amino
acid is the first amino acid of the foreign protein. For most of the commercially available
vectors, the protein resulting from EK cleavage contains extra amino acids at its
N terminus that originate from the vector and not from the native protein. In commercial
vectors, the fusion protein contains at least two extra amino acids (glycine and serine), as
encoded by nucleotide SEQ ID 3, GAT GAT GAT GAT AAA GGA TCC ATG, having
amino acid sequence SEQ ID 4, D D D D K G S M, in which the
enterokinase cleavage site is formed by D D D D K and the BamHI site encodes G and S, as
described in Figure 12.
In the vector of the present invention, on the other hand, the junction is
encoded by nucleotide SEQ ID 5, GGA TCC GAT GAT GAT GAT AAA ATG, having
amino acid sequence SEQ ID 6, G S D D D D K M, in
which the enterokinase cleavage site is formed by D D D D K and the BamHI site encodes G
and S, as described in Figure 13. Because G and S lie upstream of the cleavage site, they are
removed together with the tag.
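The difference between the two junctions can be illustrated by translating SEQ ID 3 and SEQ ID 5 with the standard genetic code and cutting after the lysine of DDDDK, which is where enterokinase cleaves. The sketch below is illustrative only; the function names are hypothetical, and the final ATG stands for the first codon of the protein of interest.

```python
from itertools import product

# Standard genetic code, built from the conventional T, C, A, G base ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

EK_SITE = "DDDDK"


def translate(dna):
    """Translate a DNA sequence in frame; spaces between codons are ignored."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def ek_cleave(protein):
    """Split a protein after the first DDDDK motif; return (removed part, product)."""
    idx = protein.find(EK_SITE)
    if idx == -1:
        raise ValueError("no enterokinase site found")
    cut = idx + len(EK_SITE)
    return protein[:cut], protein[cut:]


commercial = translate("GAT GAT GAT GAT AAA GGA TCC ATG")  # SEQ ID 3
pcgm = translate("GGA TCC GAT GAT GAT GAT AAA ATG")        # SEQ ID 5

for name, junction in (("commercial vector", commercial), ("pCGM", pcgm)):
    removed, released = ek_cleave(junction)
    print(f"{name}: junction {junction}, product starts with {released}")
```

With the commercial arrangement the product begins with the vector-derived residues G and S before the first methionine, whereas with the pCGM arrangement it begins directly with the first residue of the protein of interest.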
A further aspect is to provide a method of producing a fusion protein comprising
transforming a host cell with the DNA expression vector as defined above and
expressing said fusion protein. The host cell is prokaryotic cell, preferably an E. coli.
According to another aspect there is provided a method of purifying or isolating the
protein of interest using fusion tag, wherein affinity chromatography using immobilized
heparin is carried out.
The method of identifying, purifying or isolation of the fusion protein may be carried out
by using an antibody raised against the fusion tag (more precisely, against full length
hGMCSF) using immunoblotting to identify the tagged protein of interest.
In yet another aspect of this invention a kit for protein tagging is disclosed. The kit
comprises reagents such as heparin sepharose (immobilized column material)
for purification of proteins tagged with the peptide tag of the invention. In a further
embodiment the kit additionally comprises the DNA expression vector comprising DNA
coding for the peptide tag of the invention. In another embodiment the kit comprises the
expression vector comprising the sequences coding for the peptide tag of the invention
and having at least one cloning site. The vector may have multiple cloning sites. The
peptide or protein of interest to be purified is cloned into the expression vector. The
expressed fusion protein is then run over the heparin sepharose. Owing to its affinity for
heparin, the fusion tag will bind to it, and with it the protein of interest will also be bound.
This facilitates separation/purification of proteins using the fusion tag of the present
invention.
Advantages of the fusion tag
- The tag enhances/improves the expression of already expressing foreign proteins
by nearly 20%.
- The GM fusion tag is capable of expressing genes containing rare codons in prokaryotes
without additional rare-codon-specific tRNA supplementation.
- It has a heparin-binding domain for affinity purification.
- The tag can be easily cleaved off by enterokinase to give the
authentic N terminus of the protein of interest.
- It can be easily detected by commercially available anti-hGM-CSF antibody.
- The tagged fusion protein could be more active than the native protein and may be of
potential use in diagnostics.
- The tag could be less immunogenic as it is of human origin.
- Since the tag is small, there is a possibility of the fusion protein being
more protease resistant.
The invention is now described by way of non limiting illustrative examples
Example 1A
Preparation of GM tag
GM tag is the 45 amino acids long C-terminus part of human GM-CSF gene. To prepare
GM tag, the tag was amplified from a full length human GM-CSF synthetic gene using
gene specific primers as mentioned below.

The forward primer was designed in such a way that it contains an NdeI restriction site
(which provides a START codon (ATG) at the N-terminus of the tag).
The reverse primer contains two restriction sites, namely EcoRI and BamHI, to enable the
construction of the vector and subsequent cloning of the genes of interest. Sequences
corresponding to the enterokinase (EK) cleavage site have been incorporated in the reverse
primer between the two above-mentioned restriction sites.
Forward primer: 5' ccg ccg gaa ttc cat atg cac tac aag cag cac tgc cct cca 3' (SEQ ID 7)
(The NdeI restriction site, cat atg, comprises the START codon.)
Reverse primer: 5' ccg ccg gaa ttc ttt atc atc atc atc gga tcc gac tgg ctc cca gca gtc 3'
(SEQ ID 8)
(The EcoRI and BamHI restriction sites are gaa ttc and gga tcc, respectively; ttt atc atc
atc atc is the reverse complement of the sequence coding for the EK site.)
Using this set of primers, the GM tag was amplified by PCR; the amplicon was
purified using a gel extraction kit from Sigma following the manufacturer's protocol and
digested with NdeI and EcoRI (Fig 1). The digested amplicon is ready to use for
construction of the vector.
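The relationship between the two primers and the tag sequence (SEQ ID 1) can be checked in silico. The sketch below is illustrative only; it assumes that the printed "ate" triplets in the reverse primer are extraction errors for "atc", and the helper names are hypothetical.

```python
# GM tag coding sequence (SEQ ID 1), 5' to 3'.
GM_TAG = (
    "atgcactacaagcagcactgccctccaaccccggaaacttcctgtgcaacccagattatc"
    "acctttgaaagtttcaaagagaacctgaaggactttctgcttgtcatcccctttgactgc"
    "tgggagccagtcggatccgatgatgatgataaa"
)

FORWARD = "ccgccggaattccatatgcactacaagcagcactgccctcca"           # SEQ ID 7
REVERSE = "ccgccggaattctttatcatcatcatcggatccgactggctcccagcagtc"  # SEQ ID 8, 'ate' read as 'atc' (assumption)

SITES = {"NdeI": "catatg", "BamHI": "ggatcc", "EcoRI": "gaattc"}
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lower-case DNA sequence."""
    return seq.lower().translate(_COMPLEMENT)[::-1]


# Part of the forward primer that should match the 5' end of the tag (from the ATG onward).
fwd_anneal = FORWARD[FORWARD.index("atg"):]

# Top-strand view of the reverse primer; everything before EcoRI should match the 3' end of the tag.
rev_top = reverse_complement(REVERSE)
rev_anneal = rev_top[:rev_top.index(SITES["EcoRI"])]

# Predicted PCR product: forward 5' extension + tag + reverse 5' extension (top-strand view).
amplicon = FORWARD[:FORWARD.index("atg")] + GM_TAG + rev_top[len(rev_anneal):]

print("forward primer matches tag start:", GM_TAG.startswith(fwd_anneal))
print("reverse primer matches tag end:", GM_TAG.endswith(rev_anneal))
print("predicted amplicon length (bp):", len(amplicon))
for name, site in SITES.items():
    print(name, "sites in amplicon:", amplicon.count(site))
```

Under this reading of the primers, the amplicon carries the NdeI site at the START codon and the BamHI, EK and EcoRI elements in the order described for the reverse primer.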
Example 1 B
Construction of the fusion tag vector (pCGM)
A commercially available expression vector, pET21a (from Novagen), was digested with
NdeI and EcoRI enzymes and ligated to the digested fragment of C-terminus GM prepared
as given in Example 1A. The ligation mix was introduced into an appropriate E. coli
host by transformation.
The transformants were screened by PCR using the same set of primers (fig. 2), and
PCR-positive clones were validated by restriction digestion with NdeI/HindIII, which
releases a ~200 bp long fragment (fig. 3). This restriction analysis confirms the
construction of a new vector containing the GM tag, hence named pCGM.
Example 1C
Construction of fusion protein
To construct the fusion protein, the gene corresponding to the protein is amplified using
specific primers that contain a BamHI site followed by the EK site in the forward primer
and an EcoRI site in the reverse primer. The PCR-amplified product is then digested with
the same pair of enzymes and ligated to pCGM as a BamHI/EcoRI fragment.
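As an illustration of the primer layout described in Example 1C, the following sketch
assembles a forward primer (BamHI site, EK codons, start of the gene) and a reverse
primer (EcoRI site, reverse complement of the end of the gene) for an arbitrary coding
sequence. It is a hedged example and not the procedure actually used. The function
design_insert, the ccgccg 5' clamp (borrowed from the Example 1A primers), the 18-nt
annealing length, the check for internal restriction sites and the toy coding sequence
are all assumptions introduced here.

```python
BAMHI = "ggatcc"
ECORI = "gaattc"
EK_CODONS = "gatgatgatgataaa"  # Asp-Asp-Asp-Asp-Lys, as in the GM tag
CLAMP = "ccgccg"               # assumed 5' clamp, as on the Example 1A primers

# Hypothetical placeholder coding sequence ending in a stop codon (not from the source).
TOY_GENE = "gctccgacctcttctaccaaaaaaacccagctgtaa"


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    pairs = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(seq))


def design_insert(gene, anneal=18):
    """Return (forward_primer, reverse_primer, amplicon) for a coding sequence."""
    gene = gene.lower()
    if len(gene) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    for site in (BAMHI, ECORI):
        if site in gene:
            # An internal site would be cut during the BamHI/EcoRI digestion.
            raise ValueError("internal restriction site " + site + " in gene")
    forward = CLAMP + BAMHI + EK_CODONS + gene[:anneal]
    reverse = CLAMP + ECORI + reverse_complement(gene[-anneal:])
    # Sense strand of the expected PCR product (EcoRI is palindromic).
    amplicon = CLAMP + BAMHI + EK_CODONS + gene + ECORI + reverse_complement(CLAMP)
    return forward, reverse, amplicon


forward, reverse, amplicon = design_insert(TOY_GENE)
print("forward primer:", forward)
print("reverse primer:", reverse)
print("amplicon:", amplicon)
```

Because gga tcc is read in frame as Gly-Ser in the GM tag sequence, placing the EK codons
directly after the BamHI site keeps the gene of interest in frame with the tag.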
Example 2
Cloning of rhIFNα2b as an N-terminal GM fusion: fusion protein comprising a peptide
of interest plus the tag
The present example is aimed at cloning hIFNα2b as an N-terminal GM fusion and
investigating the rare-codon requirement of GM-hIFNα2b during expression in E. coli.
For this purpose, the expression level of IFNα2b with and without the fusion tag was
compared and the change in the solubility pattern of the protein was assessed.
The hIFNα2b gene was amplified with specific primers using a synthetic gene as template
(fig. 4).

The amplicon was digested and ligated into the BamHI/EcoRI sites of the pCGM vector,
which carries the GM tag peptide as an N-terminal fusion, including the EK site. The
clones were screened by colony PCR using the same set of primers (fig. 5). The
PCR-positive clones were confirmed by restriction digestion with BamHI/EcoRI and
NdeI/HindIII, which release ~500 bp and ~638 bp fragments, respectively (fig. 6). The GM
tag without any foreign gene was taken as the control.
Example 3
Expression of GM-tagged rhIFNα2b
The GM-hIFN clones obtained in Example 2 were introduced into the E. coli expression
host BL21(DE3) Codon Plus for expression studies. An hIFNα2b clone without the GM fusion
was used as the control. The cells were induced with 1 mM IPTG for expression of IFNα2b,
and induction was continued for 4 hours at 37°C. After induction, the soluble and
insoluble fractions were separated and analysed by SDS-PAGE (fig. 7).

IFN was found to be completely insoluble when expressed alone (fig. 7, lane 21a-IFN). In
contrast, it is ~10% soluble when expressed as a GM tag fusion (fig. 7, lane GM-IFN).
The yield of rhIFN is ~20% higher when expressed as a GM fusion than when expressed
without the fusion (fig. 7, right panel, lanes GM-IFN and 21a-IFN, respectively).
Accordingly, an otherwise insoluble protein can be made partly soluble when tagged with
the fusion peptide in the fusion protein of the present invention.
Example 4
IFNα2b expression in different E. coli hosts
IFNα2b was expressed in native form and as the tagged fusion in three different E. coli
hosts.
As is evident from fig. 8 (lanes 1, 2, 3: native IFN in BL21A1, BL21(DE3) and
BL21(DE3) codon+, respectively; lanes 4, 5, 6: GM-IFN fusion in BL21A1, BL21(DE3)
and BL21(DE3) codon+, respectively), IFN is not expressed in hosts that do not
supplement rare codons (lanes 1 and 2).
However, GM-IFN expression was seen in regular E. coli hosts that do not supply rare
codons (fig. 8, lane 4).
Example 5
GM fusion facilitates expression of proteins that are otherwise
difficult to express
Some genes, particularly those of human origin, are known to be very difficult to express
in E. coli hosts; one alternative is to express them with either N- or C-terminal
fusion tags. Here, the expression of hIL2 and hIL11 was studied in bacterial hosts. For
this, the genes corresponding to the mature peptides were both cloned into the pET21a
vector as NdeI/HindIII fragments.
Clones of untagged hIL11 and hIL2 did not express in the bacterial hosts tested, such as
BL21A1, BL21(DE3) and BL21(DE3) Codon Plus (data not shown). The GM fusion tag was
therefore used to express these proteins. Both hIL11 and hIL2 were cloned as GM fusion
peptides in the same way as IFNα2b and introduced into the E. coli host BL21A1 for
expression studies, and the cells were induced with 13 mM arabinose at 37°C.

Fig. 9, lanes 2 and 4, shows the expression of GM-IL11 and GM-IL2, respectively; neither
protein is expressed in the same host when untagged (Fig. 10, lanes 3 and 5).
Example 6
Detection of GM-fusion protein on immunoblot with commercially available
antibody against hGMCSF
GCSF was cloned in the pCGM vector as a GM-GCSF fusion as in Example 1, and expression
was carried out in the E. coli expression host BL21A1 as before.
The immunoblot analysis was done with both anti-GCSF and anti-GMCSF antibodies.
Fig. 10 shows that GM-GCSF is recognised by the GCSF antibody (lane 1) as well as by
the GMCSF antibody (lane 3).
As expected, untagged GCSF is recognised only by the GCSF antibody (lane 2) and not by
the GMCSF antibody (lane 4).

We Claim:
1. Fusion tag comprising C terminus amino acids of human Granulocyte
Macrophage Colony Stimulating Factor (GM CSF), START codon and
enterokinase cleavage site and adapted to express otherwise non-expressible
genes.
2. The fusion tag as claimed in claim 1 comprising 45 amino acids of the C terminus
of human GM CSF.
3. The fusion tag as claimed in any of claims 1 or 2 comprising nucleotide sequence
ID 1 as under:
5'atgcactacaagcagcactgccctccaaccccggaaacttcctgtgcaacccagattatc
acctttgaaagtttcaaagagaacctgaaggactttctgcttgtcatcccctttgactgc
tgggagccagtcggatccgatgatgatgataaa3'
4. The fusion tag as claimed in any of claims 1 or 2 comprising amino acid sequence
ID 2 as under:
MHYKQHCPPTPETSCATQII
TFESFKENLKDFLLVIPFDC
WEPVGSDDDDK
out of which D D D D K at the carboxy end is the enterokinase cleavage site.
5. The fusion tag as claimed in any preceding claims capable of expression of
genes containing rare codons, in prokaryotes without additional
supplementation of specific tRNAs corresponding to rare codons.
6. The fusion tag as claimed in any preceding claims wherein no additional amino
acids are present at N-terminus end beyond the enterokinase cleavage site of the
fusion tag.
7. A process for formation of fusion tag comprising amplification of C terminus 45
amino acids from a full length human GM-CSF synthetic gene using a gene specific
forward primer which contains a specific restriction site for providing START
codon, and a reverse primer containing sequences corresponding to enterokinase
cleavage site.

8. Fusion protein comprising fusion tag towards the N terminus, said fusion tag
comprising C terminus amino acids of human Granulocyte Macrophage Colony
Stimulating Factor (GM CSF), START codon and enterokinase cleavage site and
adapted to express otherwise non-expressible genes, and non GM peptide
towards the C-terminus.
9. An expression vector comprising a DNA sequence coding for a fusion protein
said fusion protein comprising a fusion tag being capable of being used in N-
terminus tagging, wherein the fusion tag has the 45 amino acid of human GM
CSF, START codon and enterokinase cleavage site.
10. The vector as claimed in claim 9, further comprising a gene of interest to be
tagged with said fusion tag, resulting protein directly cleavable from cleavage site
without intervening amino acids.
11. The vector as claimed in claim 10 wherein the gene of interest is such that the
same is otherwise non- expressible.
12. The vector as claimed in claim 10 wherein the gene of interest is such that it can
be expressed without supplement of rare codons in the host.
13. A method of producing a fusion protein as defined in claim 8, comprising the
following steps: a) transforming a host cell with the DNA expression vector as
defined in claim 9, and b) expressing said fusion protein.
14. The method as defined in claim 13, wherein the host cell is prokaryotic cell,
preferably an E. coli.
15. A method of purifying or isolating the fusion protein of claim 8, wherein affinity
chromatography is used to purify or isolate the fusion protein.
16. The method as claimed in claim 15 wherein affinity chromatography is carried out
using immobilized heparin.
17. A method of identifying, purifying or isolating the fusion protein of claim 8 by using
an antibody raised against hGMCSF.

18. The method of claim 17, wherein immunoblotting is used to identify the fusion
protein.
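
The correspondence between nucleotide sequence ID 1 (claim 3) and amino acid sequence
ID 2 (claim 4) can be checked by translating the former with the standard genetic code.
The sketch below is an illustrative check only and forms no part of the claims; the
names CODON_TABLE and translate are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Nucleotide sequence ID 1 as printed in claim 3 (three lines joined).
SEQ_ID_1 = (
    "atgcactacaagcagcactgccctccaaccccggaaacttcctgtgcaacccagattatc"
    "acctttgaaagtttcaaagagaacctgaaggactttctgcttgtcatcccctttgactgc"
    "tgggagccagtcggatccgatgatgatgataaa"
)


def translate(dna):
    """Translate a lowercase coding sequence, codon by codon, into one-letter amino acids."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


protein = translate(SEQ_ID_1)
print(len(SEQ_ID_1), "nucleotides ->", len(protein), "residues")
print(protein)
```

The translation reproduces amino acid sequence ID 2 as printed in claim 4, ending in the
D D D D K enterokinase cleavage site.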


Documents

Application Documents

# Name Date
1 01164-kol-2008-sequence listing.pdf 2011-10-07
1 1164-KOL-2008-AbandonedLetter.pdf 2017-10-07
2 01164-kol-2008-gpa.pdf 2011-10-07
2 1164-KOL-2008-FER.pdf 2017-03-27
3 Form 13 [15-09-2016(online)].pdf 2016-09-15
3 01164-kol-2008-form 3.pdf 2011-10-07
4 Other Document [15-09-2016(online)].pdf 2016-09-15
4 01164-kol-2008-form 2.pdf 2011-10-07
5 1164-KOL-2008-(08-02-2013)-CORRESPONDENCE.pdf 2013-02-08
5 01164-kol-2008-form 1.pdf 2011-10-07
6 1164-KOL-2008-(08-02-2013)-FORM-1.pdf 2013-02-08
6 01164-kol-2008-drawings.pdf 2011-10-07
7 1164-KOL-2008-(08-02-2013)-FORM-13.pdf 2013-02-08
7 01164-kol-2008-description complete.pdf 2011-10-07
8 1164-KOL-2008-(08-02-2013)-OTHERS.pdf 2013-02-08
8 01164-kol-2008-correspondence others.pdf 2011-10-07
9 01164-kol-2008-claims.pdf 2011-10-07
9 1160-KOL-2008-FORM-18.pdf 2012-07-07
10 01164-kol-2008-abstract.pdf 2011-10-07
10 1164-KOL-2008-FORM-1.118.pdf 2012-07-07

Search Strategy

1 searchstrategy_09-03-2017.pdf