Specification
CHEMICAL COMPOUNDS
This invention relates particularly to gene directed enzyme prodrug therapy (GDEPT) using in situ antibody generation to provide enhanced selectivity, particularly for use in cancer therapy.
Known gene therapy based prodrug therapeutic approaches include virus-directed enzyme prodrug therapy (VDEPT) and gene-directed enzyme prodrug therapy (GDEPT), the latter term encompassing both VDEPT and non-viral delivery systems. VDEPT involves targeting tumour cells with a viral vector carrying a gene which codes for an enzyme capable of activating a prodrug. The viral vector enters the tumour cell and enzyme is expressed from the enzyme gene inside the cell. In GDEPT, alternative approaches such as microinjection, liposomal delivery and receptor mediated DNA uptake as well as viruses may be used to deliver the gene encoding the enzyme.
In both VDEPT and GDEPT the enzyme gene can be transcriptionally regulated by DNA sequences capable of being selectively activated in mammalian cells e.g. tumour cells (EP 415 731 (Wellcome); Huber et al, Proc. Natl. Acad. Sci. USA, 88, 8039-8043,1991). While giving some degree of selectivity, gene expression may also occur in non-target cells and this is clearly undesirable when the approach is being used to activate prodrugs into potent cytotoxic agents. In addition these regulatory sequences will generally lead to reduced expression of the enzyme compared with using viral promoters and this will lead to a reduced ability to convert prodrug in the target tissue.
Expression and localisation of the prodrug activating enzyme inside the cell has disadvantages. Prodrug design is severely limited by the fact that the prodrug has to be able to cross the cell membrane and enter the cell but not be toxic until it is converted to the drug inside the cell by the activating enzyme. Most prodrugs utilise hydrophilic groups to prevent cell entry and thus reduce cytotoxicity. Prodrug turnover by activating enzyme produces a less hydrophilic drug which can enter cells to produce anti-cancer effects. This approach cannot be used when the activating enzyme is expressed inside the cell. Another disadvantage is that target cells which lack intracellular activating enzyme will be difficult to attack because they are unable to generate active drug. To achieve this desirable "bystander activity" (or "neighbouring cell kill"), the active drug will have to be capable of diffusion out of the cell containing activating enzyme to reach target cells which lack enzyme expression.
Many active drugs when produced inside a cell will be unable to escape from the cell to achieve this bystander effect.
Modifications of GDEPT have been put forward to overcome some of the problems described above. Firstly vectors have been described which are said to express the activating enzyme on the surface of the target cell (WO 96/03515) by attaching a signal peptide and transmembrane domain to the activating enzyme. The approach, if viable, would overcome
the problems of having the activating enzyme located inside the cell but would still have to
rely on transcriptionally regulated sequences capable of being selectively expressed in target
cells to restrict cell expression. As described above there are disadvantages of using such sequences. Secondly vectors have been described which result in secretion of the enzyme from the target cell (WO 96/16179). In this approach the enzyme would be able to diffuse away from its site of generation since it is extracellular and not attached to the cell surface. Enzyme which has diffused away from the target site would be capable of activating prodrug at non-target sites leading to unwanted toxicity. To achieve some selectivity it is suggested that enzyme precursors could be used which are cleaved by pathology associated proteases to form active enzyme. Some selectivity is likely to be achieved by this approach but it is unlikely that activation will only occur at target sites. In addition, once activated, the enzyme will still be free to diffuse away from the target site and thus suffer from the same drawback described above.
For GDEPT approaches, three levels of selectivity can be observed. Firstly, there is selectivity at the cell infection stage such that only specific cell types are targeted. For example cell selectivity can be provided by the gene delivery system per se. An example of this type of selectivity is set out in International Patent Application WO 95/26412 (UAB Research Foundation) which describes the use of modified adenovirus fiber proteins incorporating cell specific ligands. Other examples of cell specific targeting include ex vivo gene transfer to specific cell populations such as lymphocytes and direct injection of DNA into muscle tissue.
The second level of selectivity is control of gene expression after cell infection such as for example by the use of cell or tissue specific promoters. If the gene has been delivered to a cell type in a selective manner then it is important that a promoter is chosen that is compatible with activity in the cell type.
The third level of selectivity can be considered as the selectivity of the expressed gene construct. Selectivity at this level has received scant attention to date. In International patent application WO 96/16179 (Wellcome Foundation) it is suggested that enzyme precursors could be used which are cleaved by pathology associated proteases to form active enzyme. Some selectivity is likely to be achieved by this approach but it is unlikely that activation will only occur at target sites. In addition, once activated, the enzyme will still be free to diffuse away from the target site and thus suffer from the same drawback of activating prodrug at non-target sites leading to unwanted toxicity.
There exists a need for more selective GDEPT systems to reduce undesirable effects in normal tissues arising from erroneous prodrug activation.
The present invention is based on the discovery that antibody-heterologous enzyme gene constructs can be expressed intracellularly and used in GDEPT systems (or other systems such as AMIRACS - see below) for cell targeting arising from antibody specificity to deliver cell surface available enzyme in a selective manner. This approach may be used optionally in combination with any other suitable specificity enhancing technique(s) such as targeted cell infection and/or tissue specific expression.
According to one aspect of the present invention there is provided a gene construct encoding a cell targeting antibody and a heterologous enzyme for use as a medicament in a mammalian host wherein the gene construct is capable of expressing the antibody and enzyme as a conjugate within a target cell in the mammalian host and wherein the conjugate can leave the cell thereafter for selective localisation at a cell surface antigen recognised by the antibody.
According to another aspect of the present invention there is provided a gene construct encoding a cell targeting moiety and a heterologous prodrug activating enzyme for use as a medicament in a mammalian host wherein the gene construct is capable of expressing the cell targeting moiety and enzyme as a conjugate within a target cell in the mammalian host and wherein the conjugate is directed to leave the cell thereafter for selective localisation at a cell surface antigen recognised by the cell targeting moiety.
The "cell targeting moiety" is defined as any polypeptide or fragment thereof which selectively binds to a particular cell type in a host through recognition of a cell surface antigen. Preferably the cell targeting moiety is an antibody. Cell targeting moieties other than antibodies include ligands as described for use in Ligand Directed Enzyme Prodrug Therapy in International patent application WO 97/26918 (Cancer Research Campaign Technology Limited), such as for example epidermal growth factor, heregulin, c-erbB2 and vascular endothelial growth factor, with the latter being preferred.
A "cell targeting antibody" is defined as an antibody or fragment thereof which selectively binds to a particular cell type in a host through recognition of a cell surface antigen. Preferred cell targeting antibodies are specific for solid tumours, more preferably colorectal tumours, more preferably an anti-CEA antibody, more preferably antibody A5
Similar studies can be used to demonstrate that the antibody-enzyme gene delivered in an appropriate gene delivery vector to established LoVo tumours produced from non-transfected parental LoVo cells when used in combination with the PGP prodrug can result in significant anti-tumour activity. Thus non-transfected LoVo cells are injected into athymic nude mice (1 × 10^7 tumour cells per mouse) and once the tumours are 5-7 mm in diameter the vector containing the antibody-enzyme fusion protein gene is injected intra-tumourally. After 1-3 days to allow the antibody-enzyme fusion protein to be expressed by, and bind to, the LoVo tumour cells, the PGP prodrug is administered as described above. This results in significant anti-tumour activity compared with controls.
Example 13
Construction of an (806.077 Fab-CPG2)2 fusion protein
The construction of a (806.077 Fab-CPG2)2 enzyme fusion was planned with the aim of obtaining a bivalent human carcinoembryonic antigen (CEA) binding molecule which also exhibits CPG2 enzyme activity. To this end the initial construct was designed to contain an 806.077 antibody heavy chain Fd fragment linked at its C-terminus via a flexible (G4S)3 peptide linker to the N-terminus of the CPG2 polypeptide (as shown in Figure 1 but substituting 806.077 in place of A5B7).
The antibody 806.077 (described in International Patent Application WO 97/42329, Zeneca Limited) binds with a very high degree of specificity to human CEA. Thus the 806.077 antibody is particularly suitable for targeting colorectal carcinoma or other CEA antigen bearing cells.
In general, antibody (or antibody fragment)-enzyme conjugate or fusion proteins should be at least divalent, that is to say capable of binding at least 2 tumour associated antigens (which may be the same or different). In the case of the (806.077 Fab-CPG2)2 fusion protein, dimerisation of the enzyme component takes place (after expression, as with the native enzyme) thus forming an enzymatic molecule which contains two Fab antibody fragments (and is thus bivalent with respect to antibody binding sites) and two molecules of CPG2 (Figure 2a).
a) Cloning of the 806.077 antibody genes
Methods for the cloning and characterisation of recombinant murine 806.077 F(ab')2 antibody have been published (International Patent Application WO 97/42329, Example 7). Reference Example 7.5 describes cloning of the 806.077 antibody variable region genes into Bluescript™ KS+ vectors. These vectors were subsequently used as the source of the 806.077 variable region genes for the construction of 806.077 chimaeric light and heavy chain Fd genes.
b) Chimaeric 806.077 antibody vector constructs
International Patent Application WO 97/42329, Example 8 describes the cloning of the 806.077 chimaeric light and heavy chain Fd genes in the vectors pNG3-Vkss-HuCk-NEO (NCIMB deposit no. 40799) and pNG4-VHss-HuIgG2CH1' (NCIMB deposit no. 40797) respectively. The resulting vectors were designated pNG4/VHss806.077VH-IgG2CH1' (806.077 chimaeric heavy chain Fd') and pNG3/VKss806.077VK-HuCK-NEO (806.077 chimaeric light chain). These vectors were the source of the 806.077 antibody genes for the construction of the 806.077 Fab-CPG2 fusion protein.
c) Construction of the 806.077 heavy chain Fd-CPG2 fusion protein gene
The cloning and construction of the CPG2 gene used are described in Example 1, sections c and d. Similarly, the construction of the pNG4/A5B7VH-IgG2CH1/CPG2 R6 vector, which was used for the construction of the 806.077 heavy chain Fd-CPG2, is described in Example 1, section e. The 806.077 variable heavy chain gene was removed from the pNG4/VHss806.077VH-IgG2CH1' vector by digestion with restriction enzymes HindIII and NheI and a band of the expected size (approximately 300 b.p.) which contained the variable region gene was purified. The same restriction enzymes (HindIII/NheI) were used to digest the vector pNG4/A5B7VH-IgG2CH1/CPG2 R6 in preparation for the substitution of the 806.077 variable region for that of the A5B7 antibody. After digestion, the DNA was dephosphorylated, then the larger vector band was separated and purified. The similarly restricted variable region gene fragment was then ligated into this prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained, analysed by restriction digest analysis and subsequently sequenced to confirm the fusion gene sequence. A number of the clones were found to be correct and one of these clones, pNG4/VHss806VH-IgG2CH1/CPG2 R6, was chosen for further work. The sequence of the 806.077 heavy chain Fd-CPG2 fusion protein gene created is shown in SEQ ID NOS: 25 and 26.
d) Co-transfection, transient expression and analysis of fusion protein
The plasmids pNG4/VHss806.077VH-IgG2CH1/CPG2 R6 (encoding the antibody chimaeric Fd-CPG2 fusion protein) and pNG3/VKss806.077VK-HuCK-NEO (encoding the antibody chimaeric light chain) were co-transfected into COS-7 cells using a LIPOFECTIN™ based procedure described in Example 1f above. Analysis of the fusion protein was performed as described in Example 1g. The HPLC based enzyme activity assay clearly showed CPG2 enzyme activity to be present in the cell supernatant and both the anti-CEA ELISA assays exhibited binding of protein at levels commensurate with a bivalent 806.077 antibody molecule. The fact that the anti-CEA ELISA detected with an anti-CPG2 reporter antibody also exhibited clear CEA binding indicated that not only antibody but also antibody-CPG2 fusion protein was binding CEA. Western blot analysis with both reporter antibody assays clearly displayed a (806.077 Fab-CPG2)2 fusion protein subunit of the expected approximately 90 kDa size with only a small amount of degradation or smaller products (such as Fab or enzyme) observable. Since CPG2 is only known to exhibit enzyme activity when it is in a dimeric state, and since only antibody-enzyme fusion protein is present, this indicates that the 90 kDa fusion protein (seen under SDS/PAGE conditions) dimerises via the natural CPG2 dimerisation mechanism to form a 180 kDa dimeric antibody-enzyme fusion protein molecule (Figure 2a) in "native" buffer conditions. Furthermore, this molecule exhibits both CPG2 enzymatic activity and CEA antigen binding properties which do not appear to be significantly different in the fusion protein compared with enzyme or antibody alone.
e) Construction of a (806.077 Fab-CPG2)2 fusion protein coexpression vector for use in transient and stable cell line expression
For a simpler transfection methodology and the direct coupling of both expression cassettes to a single selection marker, a co-expression vector for fusion protein expression was constructed using the existing vectors pNG4/VHss806.077VH-IgG2CH1/CPG2 (encoding the antibody Fd-CPG2 fusion protein) and pNG3/VKss806.077VK-HuCK-NEO (encoding the antibody light chain). The pNG4/VHss806.077VH-IgG2CH1/CPG2 plasmid was first digested with the restriction enzyme ScaI, the linear vector band purified, digested with the restriction enzymes BglII and BamHI and a desired band (approximately 2700 b.p.) purified. The plasmid pNG3/VKss806.077VK-HuCK-NEO was digested with the restriction enzyme BamHI after which the DNA was dephosphorylated and the vector band purified. The heavy chain expression cassette fragment was ligated into the prepared vector and the ligation mix transformed into E. coli. The orientation was checked by a variety of restriction digests and clones selected which had the heavy chain cassette in the same direction as that of the light chain. This plasmid was termed pNG3-806.077-CPG2/R6-coexp.-NEO.
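The orientation check by restriction digest described above can be illustrated with a minimal sketch. It assumes a circular plasmid with one diagnostic site in the vector backbone and one asymmetrically placed site inside the inserted cassette; the plasmid length, insert position and site offsets below are hypothetical illustration values and are not taken from the vectors described here (only the approximately 2700 b.p. cassette size comes from the text).

```python
def circular_fragments(plasmid_len, cut_positions):
    """Return sorted fragment lengths produced by cutting a circular plasmid."""
    cuts = sorted({p % plasmid_len for p in cut_positions})
    if not cuts:
        return [plasmid_len]  # uncut circle
    frags = []
    for i, start in enumerate(cuts):
        end = cuts[(i + 1) % len(cuts)]
        length = (end - start) % plasmid_len
        # a single cut linearises the plasmid: one full-length fragment
        frags.append(length if length else plasmid_len)
    return sorted(frags)


def insert_site_position(insert_start, insert_len, site_offset, forward):
    """Plasmid coordinate of a site lying site_offset bp from the insert's 5' end."""
    if forward:
        return insert_start + site_offset
    return insert_start + insert_len - site_offset


# Hypothetical values for illustration only.
PLASMID_LEN = 9000      # assumed total size after ligation
INSERT_START = 1000     # assumed insertion coordinate
INSERT_LEN = 2700       # cassette size quoted in the text (approximate)
VECTOR_SITE = 500       # assumed diagnostic site in the backbone
SITE_OFFSET = 300       # assumed site position within the cassette

for forward in (True, False):
    site = insert_site_position(INSERT_START, INSERT_LEN, SITE_OFFSET, forward)
    print("forward" if forward else "reverse",
          circular_fragments(PLASMID_LEN, [VECTOR_SITE, site]))
```

Because the site inside the cassette is off-centre, the two orientations give different fragment patterns, which is what allows clones with the desired orientation to be selected on a gel.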
Example 14
Construction of a (55.1 scFv-CPG2)2 fusion protein
The 55.1 antibody, described in United States Patent 5,665,357, recognises the CA55.1 tumour associated antigen which is expressed on the majority of colorectal tumours and is only weakly expressed or absent in normal colonic tissue. The determination of the 55.1 heavy and light chain cDNA sequences is described in Example 3 of the aforementioned US patent. A plasmid expression vector allowing the secretion of antibody fragments into the periplasm of E. coli utilizing a single pelB leader sequence (pICI266) has been deposited as accession number NCIMB 40589 on 11 Oct 93 under the Budapest Treaty at the National Collections of Industrial and Marine Bacteria Limited (NCIMB), 23 St. Machar Drive, Aberdeen, AB2 1RY, Scotland, U.K. This vector was modified as described in Example 3.3a of United States Patent 5,665,357 to create pICI1646; this plasmid was used for cloning of various 55.1 antibody fragments as described in further subsections of Example 3, including the production of a 55.1 scFv construct which was designated pICI1657.
The pICI1657 (otherwise known as pICI-55.1 scFv) was used as the starting point for the construction of the (55.1 scFv-CPG2)2 fusion protein. The 55.1 scFv gene was amplified using the oligonucleotides CME 3270 and CME 3272 (SEQ ID NOS: 27 and 28 respectively) and the plasmid pICI1657 as the template DNA. The resulting PCR product band of about 790 b.p. was purified. Similarly the pNG4/A5B7VH-IgG2CH1/CPG2 R6 plasmid described in Example 1e above was used as the template DNA in a standard PCR reaction to amplify the CPG2 gene using the oligonucleotide primers CME 3274 and CME 3275 (SEQ ID NOS: 29 and 30 respectively). The expected PCR product band of about 1200 b.p. was purified.
A further PCR reaction was performed to join (or splice) the two purified PCR reaction products together. Standard PCR reaction conditions were used with varying amounts (between 0.5 and 2 µl) of each PCR product but utilising 25 cycles (instead of the usual 15 cycles) with the oligonucleotides CME 3270 and CME 3275 (SEQ ID NOS: 27 & 30). A reaction product of the expected size (approximately 2000 b.p.) was excised, purified and eluted in 20 µl H2O, digested using the restriction enzyme EcoRI and purified. The vector pNG4/VHss806.077VH-IgG2CH1/CPG2 was prepared to receive the above PCR product by digestion with restriction enzyme EcoRI and dephosphorylation, after which the larger vector band was separated from the smaller fragment and purified. The similarly restricted PCR product was ligated into the prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained and analysed by HindIII/NotI restriction digestion to check for correct fragment orientation and appropriate clones subsequently sequenced to confirm the fusion gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation pNG4/55.1scFv/CPG2 R6. The DNA and amino acid sequences of the fusion protein are shown in SEQ ID NOS: 31 and 32.
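The joining (splicing) of two PCR products through a shared end sequence can be sketched in a few lines. The sketch below assumes that the 3' end of the upstream product and the 5' end of the downstream product carry an identical overlap (for example a linker coding region introduced by the primers); the toy sequences and the function name are hypothetical and do not reproduce the CME oligonucleotides or SEQ ID NOS of this specification.

```python
def splice_by_overlap(upstream, downstream, min_overlap=15):
    """Join two sequences that share an identical end overlap.

    The longest suffix of `upstream` that equals a prefix of `downstream`
    is located and the overlap is kept once in the joined product.
    """
    longest = min(len(upstream), len(downstream))
    for k in range(longest, min_overlap - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream + downstream[k:]
    raise ValueError("no overlap of at least %d bases found" % min_overlap)


# Toy example with a hypothetical 18-base shared linker region.
linker = "GGTGGCGGATCCGGAGGT"
scfv_product = "ATGGCC" + linker      # stands in for the ~790 b.p. product
cpg2_product = linker + "CAGAAG"      # stands in for the ~1200 b.p. product

joined = splice_by_overlap(scfv_product, cpg2_product)
print(joined, len(joined))
```

The joined product is as long as the two inputs together minus the overlap, which is consistent with products of about 790 b.p. and about 1200 b.p. giving a band of approximately 2000 b.p.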
Example 15
Modification of the plasmid pNG4/55.1scFv/CPG2 R6 to facilitate scFv gene exchange
During the construction of pNG4/55.1scFv/CPG2 R6 a unique BspEI site (BspEI is an isoschizomer of AccIII) was introduced into the flexible (G4S)3 linker coding sequence, situated between the antibody and CPG2 genes. To facilitate cloning of alternative scFv constructs the EcoRI site 3' of the CPG2 gene in pNG4/55.1scFv/CPG2 R6 was deleted in order to enable insertion of alternative scFv antibody genes in frame, both behind the plasmid signal sequence and 5' of the CPG2 gene, via EcoRI/BspEI fragment cloning. This modification was achieved by PCR mutagenesis in which first the pNG4/55.1scFv/CPG2 R6 was amplified using oligonucleotides CME 3903 and CME 3906 (SEQ ID NOS: 33 and 34 respectively). Secondly, the pNG4/55.1scFv/CPG2 R6 was again amplified but using oligonucleotides CME 4040 and CME 3905 (SEQ ID NOS: 35 and 36 respectively). The first expected PCR product band of about 420 b.p. was purified. The second PCR reaction was similarly treated and the expected PCR product band of about 450 b.p. purified.
A further PCR reaction was performed to join (or splice) the two purified PCR reaction products together. Standard PCR reaction conditions were used with varying amounts (between 0.5 and 2 µl) of each PCR product but utilising between 15 and 25 cycles with oligonucleotides CME 3905 and CME 3906 (SEQ ID NOS: 36 & 34). A reaction product of the expected size (approximately 840 b.p.) was purified, digested using the restriction enzymes NotI and XbaI and the expected fragment band of ca. 460 b.p. was purified.
The original pNG4/55.1scFv/CPG2 R6 was prepared to receive the above PCR product by digestion with restriction enzymes NotI and XbaI and dephosphorylation, after which the larger vector band was separated from the smaller fragment. The vector band was purified and subsequently the similarly restricted PCR product was ligated into the prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained and analysed by EcoRI restriction digestion to check for insertion of the modified fragment and appropriate clones subsequently sequenced to confirm the sequence change. A number of clones with the correct sequence were obtained and one of these clones was given the plasmid designation pNG4/55.1scFv/CPG2 R6/del EcoRI. This mutation removes the EcoRI site which was 3' of the CPG2 gene and simultaneously introduces an additional stop codon. The DNA sequence of the fusion protein gene up to, and including, the two stop codons is shown in SEQ ID NO: 37.
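A sequence-level check that a mutagenesis step has removed one restriction site while leaving others intact can be sketched as follows. The recognition sequences used (EcoRI, GAATTC; BspEI, TCCGGA) are the standard ones, but the toy sequences are hypothetical and are not the plasmid sequences of this specification.

```python
# Standard recognition sequences; both are palindromic, so scanning one
# strand is sufficient.
SITES = {"EcoRI": "GAATTC", "BspEI": "TCCGGA"}


def find_sites(sequence, enzyme):
    """Return the 0-based start positions of every recognition site."""
    site = SITES[enzyme]
    sequence = sequence.upper()
    hits = []
    pos = sequence.find(site)
    while pos != -1:
        hits.append(pos)
        pos = sequence.find(site, pos + 1)
    return hits


# Hypothetical toy sequence: an EcoRI site, a BspEI site, then a second
# EcoRI site downstream.
before = "AAGAATTCGGTCCGGATTTGATAAGAATTCAA"
# Toy "mutated" version in which the downstream EcoRI site is replaced.
after = before[:24] + "TGATAA" + before[30:]

print(find_sites(before, "EcoRI"), find_sites(after, "EcoRI"))
print(find_sites(after, "BspEI"))
```

A clone carrying the intended change would show one EcoRI site fewer than the parent while retaining the unique BspEI site, in line with the EcoRI digest used above to screen clones.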
Example 16
Construction of an 806.077 scFv antibody gene
The 806.077 scFv was created using vectors pNG4/VHss806.077VH-IgG2CH1' and pNG3/VKss806.077VK-HuCK-NEO which are sources for 806.077 VH and VK variable region genes. The 806.077 VH gene was amplified from the pNG4/VHss806.077VH-IgG2CH1' plasmid using standard PCR conditions with the oligonucleotides CME 3260 and CME 3266 (SEQ ID NOS: 39 and 40 respectively). The 806.077 VK was amplified from the pNG3/VKss806.077VK-HuCK-NEO plasmid using oligonucleotides CME 3262 and CME 3267 (SEQ ID NOS: 41 and 42 respectively). The VH and VK PCR reaction products were purified.
A further PCR reaction was performed to join (or splice) the two purified PCR reaction products together. Standard PCR reaction conditions were used with varying amounts (between 0.5 and 2 µl) of each PCR product but utilising between 15 and 25 cycles with the flanking oligonucleotides CME 3260 and CME 3262 (SEQ ID NOS: 39 & 41). A reaction product of the expected size (approximately 730 b.p.) was purified, digested using the restriction enzymes NcoI and XhoI and an expected fragment band of about 720 b.p. purified.
The pICI1657 plasmid (otherwise known as pICI-55.1 scFv) had been further modified by the insertion of a double stranded DNA cassette produced from the two oligonucleotides CME 3143 and CME 3145 (SEQ ID NOS: 45 and 46) between the existing XhoI and EcoRI restriction sites by standard cloning techniques to create the vector pICI266-55.1 scFv tag/his (the DNA sequence of the resulting 55.1 scFv tag/his gene is shown in SEQ ID NO: 47). This vector was prepared to receive the above PCR product by digestion with restriction enzymes NcoI and XhoI and dephosphorylation, after which the larger vector band was separated from the smaller fragment. The vector band was purified and subsequently the similarly restricted PCR product was ligated into the prepared vector and the ligation mix transformed into E. coli. DNA was prepared from the clones obtained and analysed by EcoRI restriction digestion to check for insertion of the modified fragment and appropriate clones subsequently sequenced to confirm the sequence change. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation pICI266/806IscFvtag/his (alternatively known as pICI266-806VH/VLscFvtag/his). The DNA and protein sequences of the 806I scFvtag/his gene are shown in SEQ ID NOS: 25 and 26.
Example 17
Construction of an (806.077 scFv-CPG2)2 fusion protein
The pICI266/806IscFvtag/his plasmid was used as the source for the 806scFv. The gene was amplified using oligonucleotides CME 3907 and CME 3908 (SEQ ID NOS: 48 and 49) and a band of the expected size purified. This fragment was then digested using the restriction enzymes EcoRI and BspEI after which an expected fragment band of about 760 b.p. was purified.
The pNG4/55.1scFv/CPG2 R6/del EcoRI plasmid was prepared to receive the above fragment by digestion with restriction enzymes EcoRI and BspEI and dephosphorylation, after which the larger vector band was separated from the smaller fragment. The vector band was purified and subsequently the similarly restricted fragment ligated into the prepared vector and the ligation mix was transformed into E. coli. DNA was prepared from the clones obtained and analysed by EcoRI restriction digestion to check for insertion of the modified fragment. Appropriate clones were subsequently sequenced to confirm the gene sequence. A number of the clones with the correct sequence were obtained and one of these clones was given the plasmid designation pNG4/806IscFv/CPG2 R6/del EcoRI. The DNA and protein sequences of the fusion protein gene 806IscFv/CPG2 R6 are shown in SEQ ID NOS: 50 and 51.
Example 18
Co-transfection, transient expression of antibody-CPG2 fusion proteins
As described in Example 1f, plasmids encoding other fusion protein variants can be transfected using the given standard conditions in order to obtain transient expression of their encoded fusion protein from COS-7 cells. In the case of (Fab-CPG2)2 fusion proteins either co-transfection of appropriate plasmids or transfection of co-expression plasmids can be performed. Similarly, the single expression plasmids of (scFv-CPG2)2 fusion proteins can also be transfected by the same protocol. In each case a maximum total of 4 mg DNA is used in an individual transfection.
Example 19
Gene switches for protein expression
As described in Example 1j, tightly controlled but inducible gene switch systems such as the "TET on" or "TET off" (Gossen, M. et al (1995) Science 268: 1766-1769) or the ecdysone/muristerone A (No, D. et al (1996) PNAS 93: 3346-3351) systems may be used for the expression of fusion proteins. Appropriate methodology and cloning strategies as described in Example 5 may be used for antibody Fab-enzyme fusions requiring an IRES sequence for expression. Insertion of the appropriate gene cassette into the switchable expression vectors may be used if the fusion protein product is a single polypeptide chain such as in scFv-enzyme constructs.
Example 20
Determination of the properties of COS-7 cell secreted antibody-enzyme fusion proteins
The COS-7 cell supernatant material can be analysed for the presence of antibody fusion proteins as described in Example 1g. Similarly the use of expressed fusion protein and CPG2 prodrug in an in vitro cytotoxicity assay can be performed as previously described in Example 1h. The HPLC based enzyme activity assay can show CPG2 enzyme activity to be present in the cell supernatant and anti-CEA ELISA can be detected with an anti-CPG2 reporter antibody to confirm binding of protein at levels commensurate with a bivalent A5B7 antibody molecule and also to demonstrate that antibody-CPG2 fusion protein (and not just the antibody component) is binding CEA.
Western blot analysis with both reporter antibody assays can clearly display a fusion protein subunit of the expected size. Since CPG2 is only known to exhibit enzyme activity when it is in a dimeric state, and since only antibody-enzyme fusion protein is present, this indicates that the fusion protein (seen under SDS/PAGE conditions) dimerises via the natural CPG2 dimerisation mechanism to form a dimeric antibody-enzyme fusion protein molecule in "native" buffer conditions. Furthermore, this molecule exhibits both CPG2 enzymatic activity and CEA antigen binding properties which do not appear to be significantly different in the fusion protein compared with enzyme or antibody alone. Results obtained from the cytotoxicity assay can demonstrate that antibody-enzyme fusion protein (together with prodrug) causes at least equivalent cell kill and results in lower numbers of cells at the end of the assay period than the equivalent levels of A5B7 F(ab')2-CPG2 conjugate (with the same prodrug). Cell killing (above basal control levels) can only occur if the prodrug is converted to active drug by the CPG2 enzyme and, since the cells are washed to remove unbound protein, only cell bound enzyme will remain at the stage where the prodrug is added. Thus this experiment can demonstrate that at least as much of the (A5B7-CPG2 R6)2 fusion protein remains bound compared with conventional A5B7 F(ab')2-CPG2 conjugate, since a greater degree of cell killing (presumably due to higher prodrug to drug conversion) occurs.
Example 21
In vitro and in vivo determination of the properties of antibody-enzyme fusion proteins expressed from recombinant tumour cells
The construction of fusion protein expressing tumour cell lines can be performed as described in Example 4.
Retention of the fusion protein on the cell surface of recombinant LoVo tumour cells expressing antibody-enzyme fusion protein can be shown using the techniques described in Example 7. Selective killing of cultured LoVo tumour cells transfected with an antibody-CPG2 fusion protein gene by a prodrug that is converted by the enzyme into an active drug can be demonstrated as described in Example 8. Establishment of antibody-enzyme fusion protein expressing LoVo tumour xenografts in athymic mice can be performed as described in Example 9. Enzyme activity in tumour xenograft samples can also be determined as described in Example 10.
Determination of enzyme activity in plasma samples can be performed as described in Example 11. The anti-tumour activity of PGP prodrug in LoVo tumours expressing the antibody-CPG2 fusion protein can be evaluated using the method described in Example 12.
The results from these experiments can be used to show that the antibody-CPG2 fusion protein secreted from CEA positive tumour cell lines binds to the surface of the cells (via CEA) whereas the same protein expressed from CEA negative tumours shows no such binding.
These results can demonstrate that the transfected cells which express the antibody-CPG2 fusion protein can convert the PGP prodrug into the more potent active drug while non-transfected LoVo cells are unable to convert the prodrug. Consequently the transfected LoVo cells will be over 100 fold more sensitive to the PGP prodrug in terms of cell killing compared to the non-transfected LoVo cells, thus demonstrating that transfecting tumour cells with a gene for an antibody-enzyme fusion protein can lead to selective tumour cell killing with a prodrug.
Administration of PGP to LoVo tumours established from recombinant LoVo cells or recombinant LoVo/parental LoVo cell mixes can result in a significant anti-tumour effect, as judged by the PGP treated tumours decreasing in size compared to tumours treated with formulation buffer only and by the PGP treated tumours taking a significantly longer time to reach 4 times their initial tumour volume compared with formulation buffer treated tumours. In contrast, administration of PGP to LoVo tumours established from non-transfected cells would result in no significant anti-tumour activity.
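The "time to reach 4 times the initial tumour volume" endpoint can be estimated from serial volume measurements. The sketch below uses simple linear interpolation between measurement days, which is an assumption made for illustration rather than a method stated in this specification; the measurement values are hypothetical and are not experimental results.

```python
def time_to_fold_volume(days, volumes, fold=4.0):
    """Estimate the day on which volume first reaches `fold` x the initial value.

    Uses linear interpolation between consecutive measurements and returns
    None if the target is not reached within the measured period.
    """
    target = volumes[0] * fold
    for i in range(1, len(days)):
        v0, v1 = volumes[i - 1], volumes[i]
        if v0 < target <= v1:
            fraction = (target - v0) / (v1 - v0)
            return days[i - 1] + fraction * (days[i] - days[i - 1])
    return None


# Hypothetical tumour volumes (mm^3) measured weekly, for illustration only.
days = [0, 7, 14, 21]
buffer_treated = [100.0, 250.0, 450.0, 800.0]
pgp_treated = [100.0, 80.0, 150.0, 300.0]

print(time_to_fold_volume(days, buffer_treated))
print(time_to_fold_volume(days, pgp_treated))
```

A tumour that does not quadruple during the observation period yields None, so such animals would need to be handled as censored observations in any statistical comparison between groups.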
Similar studies can be used to demonstrate that the antibody-enzyme gene delivered in an appropriate gene delivery vector to established LoVo tumours produced from non-transfected parental LoVo cells when used in combination with the PGP prodrug can result in significant anti-tumour activity. Thus non-transfected LoVo cells are injected into athymic nude mice (1 × 10^7 tumour cells per mouse) and once the tumours are 5-7 mm in diameter the vector containing the antibody-enzyme fusion protein gene is injected intra-tumourally. After 1-7 days to allow the antibody-enzyme fusion protein to be expressed by, and bind to, the LoVo tumour cells, the PGP prodrug is administered as previously described. This results in significant anti-tumour activity compared with control mice receiving formulation buffer instead of PGP prodrug.
Example 22
Preparation of (murine A5B7 Fab-CPG2)2 fusion protein
(Murine A5B7 Fab-CPG2)2 is expressed from COS-7 and CHO cells essentially as described in part (d) of Example 48 of International Patent Application WO 97/42329 (Zeneca Limited, published 13 November 1997) by cloning the genes for A5B7 light chain and A5B7 Fd linked at its C-terminus via a flexible (G4S)3 peptide linker to CPG2 in the pEE14 co-expression vector.
The murine A5B7 light chain is isolated from pAF8 (described in part g of Reference Example 5 in International Patent Application WO 96/20011, Zeneca Limited). Plasmid pAF8 is cut with EcoRI and the resulting 732 bp fragment isolated by electrophoresis on a 1% agarose gel. This fragment is cloned into pEE14 (described by Bebbington in METHODS: A Companion to Methods in Enzymology (1991) 2, 136-145) similarly cut with EcoRI and the resulting plasmid used to transform E. coli strain DH5α. The transformed cells are plated onto L agar plus ampicillin (100 µg/ml). A clone containing a plasmid with the correct sequence and orientation is confirmed by DNA sequence analysis (SEQ ID NO: 57) and the plasmid named pEE14/A5B7muVkmuCK. The amino acid sequence of the encoded signal sequence (amino acid residues 1 to 22) and murine light chain (amino acid residues 23 to 235) is shown in SEQ ID NO: 58.
The murine Fd-CPG2 gene is prepared from the R6 variant of the CPG2 gene (part d of Example 1) and the murine A5B7 Fd sequence in pAF1 (described in part d of Reference Example 5 in International Patent Application WO 96/20011, Zeneca Limited). A PCR reaction with oligonucleotides SEQ ID NOS: 53 and 54 on pAF1 gives a 247 bp fragment. This is cut with HindIII and BamHI and cloned into similarly cut pUC19. The resulting plasmid is used to transform E. coli strain DH5α. The transformed cells are plated onto L agar plus ampicillin (100 µg/ml). A clone containing a plasmid with the correct sequence is named pUC19/muCH1/NcoI-AccIII(Fd). A second PCR with oligonucleotides SEQ ID NOS: 55 and 56 on pNG/VKss/CPG2/R6-neo (Example 1) gives a 265 bp fragment which is cut with HindIII and EcoRI and cloned into similarly cut pUC19 as above to give plasmid pUC19/muCH1-linker-CPG2/AccIII-SacII. Plasmid pUC19/muCH1/NcoI-AccIII(Fd) is cut with HindIII and AccIII and the 258 bp fragment isolated by electrophoresis on a 1% agarose gel. This fragment is cloned into HindIII and AccIII cut pUC19/muCH1-linker-CPG2/AccIII-SacII to give plasmid pUC19/muCH1-linker-CPG2/NcoI-SacII. A 956 bp fragment is isolated from pNG/VKss/CPG2/R6-neo by cutting it with SacII and EcoRI. This is cloned into SacII and EcoRI cut pUC19/muCH1-linker-CPG2/NcoI-SacII to give plasmid pUC19/muCH1-linker-RC/CPG2(R6). The complete gene construct is prepared by isolating a 498 bp HindIII to NcoI fragment from pAF1 and cloning it into HindIII and NcoI cut pUC19/muCH1-linker-RC/CPG2(R6). The resulting plasmid is used to transform E. coli strain DH5α. The transformed cells are plated onto L agar plus ampicillin (100 µg/ml). A clone containing a plasmid with the correct sequence and orientation is confirmed by DNA sequence analysis (SEQ ID NO: 59) and the plasmid named pUC19/muA5B7-RC/CPG2(R6). The amino acid sequence of the encoded signal sequence (amino acid residues 1 to 19) and murine Fd-linker-CPG2 (amino acid residues 20 to 647) is shown in SEQ ID NO: 60. Alternatively, the CPG2 gene sequence described in Example 1 can be obtained by total gene synthesis and converted to the R6 variant as described in part d of Example 1. In this case, the base residue C at position 933 in SEQ ID NO: 59 is changed to G. The amino acid sequence of SEQ ID NO: 60 remains unaltered.
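A base change that leaves the amino acid sequence unaltered is typically one at the third position of a codon. The following Python sketch shows the arithmetic for position 933, assuming (this is an assumption, since SEQ ID NO: 59 is not reproduced here) that the reading frame starts at base 1 of SEQ ID NO: 59. The example codon pair GGC/GGG is hypothetical and is used only to show what a silent C to G change looks like; it is not taken from the sequence.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_position(base_position: int) -> tuple:
    """Map a 1-based base position to (codon number, position within codon),
    assuming the reading frame starts at base 1."""
    codon_number = (base_position - 1) // 3 + 1
    position_in_codon = (base_position - 1) % 3 + 1
    return codon_number, position_in_codon


codon, pos = codon_position(933)
print("base 933 falls in codon", codon, "at codon position", pos)

# Hypothetical illustration of a silent third-position C -> G change.
print(CODON_TABLE["GGC"], CODON_TABLE["GGG"])
```

Under the stated assumption, base 933 is the third base of codon 311, which is consistent with the statement that the encoded amino acid sequence is unchanged.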
For expression in the pEE14 vector, the gene is first cloned into pEE6 (this is a derivative of pEE6.hCMV - Stephens and Cockett, 1989, Nucleic Acids Research 17, 7110, in which a HindIII site upstream of the hCMV promoter has been converted to a BglII site). Plasmid pUC19/muA5B7-RC/CPG2(R6) is cut with HindIII and EcoRI and the 1974 bp fragment isolated by electrophoresis on a 1% agarose gel. This is cloned into HindIII and EcoRI cut pEE6 in E. coli strain DH5α to give plasmid pEE6/muA5B7-RC/CPG2(R6). The pEE14 co-expression vector is made by first cutting pEE6/muA5B7-RC/CPG2(R6) with BglII and BamHI and isolating the 4320 bp fragment on a 1% agarose gel. This fragment is cloned into BglII and BamHI cut pEE14/A5B7muVkmuCK. The resulting plasmid is used to transform E. coli strain DH5α. The transformed cells are plated onto L agar plus ampicillin (100 µg/ml). A clone containing a plasmid with the correct sequence and orientation is confirmed by DNA sequence analysis and the plasmid named pEE14/muA5B7-RC/CPG2(R6).
For expression of (murine A5B7 Fab-CPG2)2, plasmid pEE14/muA5B7-RC/CPG2(R6) is used to transfect COS-7 or CHO cells as described in Example 48 of International Patent Application WO 97/42329, Zeneca Limited, published 13 November 1997. COS cell supernatants and CHO clone supernatants are assayed for activity as described in Example 1 and shown to have CEA binding and CPG2 enzyme activity.
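As a consistency check on the flexible linker used in fusion constructs of this kind, the linker-encoding stretch in the sequence listing can be translated directly. The following Python sketch translates bases 715 to 759 of SEQ ID NO: 15 (corresponding to residues 239 to 253 of SEQ ID NO: 16) and the oligonucleotide SEQ ID NO: 14, both copied from the listing below. The helper names are illustrative, and the sketch assumes the standard genetic code and that each stretch is read in frame from its first base.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes,
    ignoring any incomplete trailing codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Bases 715-759 of SEQ ID NO: 15 (linker between Fd and CPG2).
LINKER_DNA = "GGCGGTGGTGGCTCTGGTGGTGGCGGTAGCGGTGGCGGGGGTTCC"

# SEQ ID NO: 14 (54-base oligonucleotide spanning linker and start of CPG2).
SEQ_ID_14 = "TCTGGTGGTGGCGGTAGCGGTGGCGGGGGTTCCCAGAAGCGCGACAACGTGCTG"

print(translate(LINKER_DNA))  # three Gly-Gly-Gly-Gly-Ser repeats
print(translate(SEQ_ID_14))
```

The first translation gives three consecutive Gly-Gly-Gly-Gly-Ser repeats, matching the residues shown in SEQ ID NO: 16.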
Example 23
Pharmaceutical composition
The following illustrates a representative pharmaceutical dosage form containing a gene construct of the invention which may be used for therapy in combination with a suitable prodrug.
A sterile aqueous solution, for injection either parenterally or directly into tumour tissue, containing 10⁷-10¹¹ adenovirus particles comprising a gene construct as described in Example 1. After 3-7 days, three 1 g doses of prodrug are administered as sterile solutions at hourly intervals. Prodrug is selected from N-(4-[N,N-bis(2-iodoethyl)amino]-phenoxycarbonyl)-L-glutamic acid, N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic-gamma-(3,5-dicarboxy)anilide or N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid or a pharmaceutically acceptable salt thereof.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: Zeneca Limited
(B) STREET: 15 Stanhope Gate
(C) CITY: London
(D) STATE: England
(E) COUNTRY: United Kingdom
(F) POSTAL CODE (ZIP): W1Y 6LN
(G) TELEPHONE: 0171 304 5000
(H) TELEFAX: 0171 304 5151
(I) TELEX: 0171 304 2042
(ii) TITLE OF INVENTION: CHEMICAL COMPOUNDS
(iii) NUMBER OF SEQUENCES: 60
(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: GB 9709421.3
(B) FILING DATE: 10-MAY-1997
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
GGGAATTCCT CGAGGAGCTC C 21
(2) INFORMATION FOR SEQ ID NO: 2: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
CCGGGGAGCT CCTCGAGGAA TTCCCGC 27
(2) INFORMATION FOR SEQ ID NO: 3: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CAGAAGCGCG ACAACGTG 18
(2) INFORMATION FOR SEQ ID NO: 4: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
CGAGGCCTTG CCGGTGATCT GGACCTGCAC GTAGGCGAT 39
(2) INFORMATION FOR SEQ ID NO: 5: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GGGGATGATG TTCGAGACCT GGCCGGCCTT GGCGATGGTC CACTGGAAGC GCAGGTTCTT 60
CGC 63
(2) INFORMATION FOR SEQ ID NO: 6: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
CTTGCCGGCG CCCAGATC 18
(2) INFORMATION FOR SEQ ID NO: 7: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
GTCTCGAACA TCATCCCC 18
(2) INFORMATION FOR SEQ ID NO: 8: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
ATCACCGGCA AGGCCTCG 18
(2) INFORMATION FOR SEQ ID NO: 9: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1236 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:
ATGGATTTTC AAGTGCAGAT TTTCAGCTTC CTGCTAATCA GTGCTTCAGT CATAATGTCC 60
CGCGGGCAGA AGCGCGACAA CGTGCTGTTC CAGGCAGCTA CCGACGAGCA GCCGGCCGTG 120
ATCAAGACGC TGGAGAAGCT GGTCAACATC GAGACCGGCA CCGGTGACGC CGAGGGCATC 180
GCCGCTGCGG GCAACTTCCT CGAGGCCGAG CTCAAGAACC TCGGCTTCAC GGTCACGCGA 240
AGCAAGTCGG CCGGCCTGGT GGTGGGCGAC AACATCGTGG GCAAGATCAA GGGCCGCGGC 300
GGCAAGAACC TGCTGCTGAT GTCGCACATG GACACCGTCT ACCTCAAGGG CATTCTCGCG 360
AAGGCCCCGT TCCGCGTCGA AGGCGACAAG GCCTACGGCC CGGGCATCGC CGACGACAAG 420
GGCGGCAACG CGGTCATCCT GCACACGCTC AAGCTGCTGA AGGAATACGG CGTGCGCGAC 480
TACGGCACCA TCACCGTGCT GTTCAACACC GACGAGGAAA AGGGTTCCTT CGGCTCGCGC 540
GACCTGATCC AGGAAGAAGC CAAGCTGGCC GACTACGTGC TCTCCTTCGA GCCCACCAGC 600
GCAGGCGACG AAAAACTCTC GCTGGGCACC TCGGGCATCG CCTACGTGCA GGTCCAGATC 660
ACCGGCAAGG CCTCGCATGC CGGCGCCGCG CCCGAGCTGG GCGTGAACGC GCTGGTCGAG 720
GCTTCCGACC TCGTGCTGCG CACGATGAAC ATCGACGACA AGGCGAAGAA CCTGCGCTTC 780
CAGTGGACCA TCGCCAAGGC CGGCCAGGTC TCGAACATCA TCCCCGCCAG CGCCACGCTG 840
AACGCCGACG TGCGCTACGC GCGCAACGAG GACTTCGACG CCGCCATGAA GACGCTGGAA 900
GAGCGCGCGC AGCAGAAGAA GCTGCCCGAG GCCGACGTGA AGGTGATCGT CACGCGCGGC 960
CGCCCGGCCT TCAATGCCGG CGAAGGCGGC AAGAAGCTGG TCGACAAGGC GGTGGCCTAC 1020
TACAAGGAAG CCGGCGGCAC GCTGGGCGTG GAAGAGCGCA CCGGCGGCGG CACCGACGCG 1080
GCCTACGCCG CGCTCTCAGG CAAGCCAGTG ATCGAGAGCC TGGGCCTGCC GGGCTTCGGC 1140
TACCACAGCG ACAAGGCCGA GTACGTGGAC ATCAGCGCGA TTCCGCGCCG CCTGTACATG 1200
GCTGCGCGCC TGATCATGGA TCTGGGCGCC GGCAAG 1236
(2) INFORMATION FOR SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 412 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
Met Asp Phe Gln Val Gln Ile Phe Ser Phe Leu Leu Ile Ser Ala Ser
1 5 10 15
Val Ile Met Ser Arg Gly Gln Lys Arg Asp Asn Val Leu Phe Gln Ala
20 25 30
Ala Thr Asp Glu Gln Pro Ala Val Ile Lys Thr Leu Glu Lys Leu Val
35 40 45
Asn Ile Glu Thr Gly Thr Gly Asp Ala Glu Gly Ile Ala Ala Ala Gly
50 55 60
Asn Phe Leu Glu Ala Glu Leu Lys Asn Leu Gly Phe Thr Val Thr Arg
65 70 75 80
Ser Lys Ser Ala Gly Leu Val Val Gly Asp Asn Ile Val Gly Lys Ile
85 90 95
Lys Gly Arg Gly Gly Lys Asn Leu Leu Leu Met Ser His Met Asp Thr
100 105 110
Val Tyr Leu Lys Gly Ile Leu Ala Lys Ala Pro Phe Arg Val Glu Gly
115 120 125
Asp Lys Ala Tyr Gly Pro Gly Ile Ala Asp Asp Lys Gly Gly Asn Ala
130 135 140
Val Ile Leu His Thr Leu Lys Leu Leu Lys Glu Tyr Gly Val Arg Asp
145 150 155 160
Tyr Gly Thr Ile Thr Val Leu Phe Asn Thr Asp Glu Glu Lys Gly Ser
165 170 175
Phe Gly Ser Arg Asp Leu Ile Gln Glu Glu Ala Lys Leu Ala Asp Tyr
180 185 190
Val Leu Ser Phe Glu Pro Thr Ser Ala Gly Asp Glu Lys Leu Ser Leu
195 200 205
Gly Thr Ser Gly Ile Ala Tyr Val Gln Val Gln Ile Thr Gly Lys Ala
210 215 220
Ser His Ala Gly Ala Ala Pro Glu Leu Gly Val Asn Ala Leu Val Glu
225 230 235 240
Ala Ser Asp Leu Val Leu Arg Thr Met Asn Ile Asp Asp Lys Ala Lys
245 250 255
Asn Leu Arg Phe Gln Trp Thr Ile Ala Lys Ala Gly Gln Val Ser Asn
260 265 270
Ile Ile Pro Ala Ser Ala Thr Leu Asn Ala Asp Val Arg Tyr Ala Arg
275 280 285
Asn Glu Asp Phe Asp Ala Ala Met Lys Thr Leu Glu Glu Arg Ala Gln
290 295 300
Gln Lys Lys Leu Pro Glu Ala Asp Val Lys Val Ile Val Thr Arg Gly
305 310 315 320
Arg Pro Ala Phe Asn Ala Gly Glu Gly Gly Lys Lys Leu Val Asp Lys
325 330 335
Ala Val Ala Tyr Tyr Lys Glu Ala Gly Gly Thr Leu Gly Val Glu Glu
340 345 350
Arg Thr Gly Gly Gly Thr Asp Ala Ala Tyr Ala Ala Leu Ser Gly Lys
355 360 365
Pro Val Ile Glu Ser Leu Gly Leu Pro Gly Phe Gly Tyr His Ser Asp
370 375 380
Lys Ala Glu Tyr Val Asp Ile Ser Ala Ile Pro Arg Arg Leu Tyr Met
385 390 395 400
Ala Ala Arg Leu Ile Met Asp Leu Gly Ala Gly Lys
405 410
(2) INFORMATION FOR SEQ ID NO: 11: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
CCACTCTCAC AGTGAGCTCG G 21
(2) INFORMATION FOR SEQ ID NO: 12: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 55 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
ACCGCTACCG CCACCACCAG AGCCACCACC GCCAACTGTC TTGTCCACCT TGGTG 55
(2) INFORMATION FOR SEQ ID NO: 13: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
ACCCCCTCTA GAGTCGAC 18
(2) INFORMATION FOR SEQ ID NO: 14: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:
TCTGGTGGTG GCGGTAGCGG TGGCGGGGGT TCCCAGAAGC GCGACAACGT GCTG 54
(2) INFORMATION FOR SEQ ID NO: 15: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1929 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
ATGGAGTTGT GGCTGAACTG GATTTTCCTT GTAACACTTT TAAATGGTAT CCAGTGTGAG 60
GTGAAGCTGG TGGAGTCTGG AGGAGGCTTG GTACAGCCTG GGGGTTCTCT GAGACTCTCC 120
TGTGCAACTT CTGGGTTCAC CTTCACTGAT TACTACATGA ACTGGGTCCG CCAGCCTCCA 180
GGAAAGGCAC TTGAGTGGTT GGGTTTTATT GGAAACAAAG CTAATGGTTA CACAACAGAG 240
TACAGTGCAT CTGTGAAGGG TCGGTTCACC ATCTCCAGAG ATAAATCCCA AAGCATCCTC 300
TATCTTCAAA TGAACACCCT GAGAGCTGAG GACAGTGCCA CTTATTACTG TACAAGAGAT 360
AGGGGGCTAC GGTTCTACTT TGACTACTGG GGCCAAGGCA CCACTCTCAC AGTGAGCTCG 420
GCTAGCACCA AGGGACCATC GGTCTTCCCC CTGGCCCCCT GCTCCAGGAG CACCTCCGAG 480
AGCACAGCCG CCCTGGGCTG CCTGGTCAAG GACTACTTCC CCGAACCGGT GACGGTGTCG 540
TGGAACTCAG GCGCTCTGAC CAGCGGCGTG CACACCTTCC CGGCTGTCCT ACAGTCCTCA 600
GGACTCTACT CCCTCAGCAG CGTCGTGACG GTGCCCTCCA GCAACTTCGG CACCCAGACC 660
TACACCTGCA ACGTAGATCA CAAGCCCAGC AACACCAAGG TGGACAAGAC AGTTGGCGGT 720
GGTGGCTCTG GTGGTGGCGG TAGCGGTGGC GGGGGTTCCC AGAAGCGCGA CAACGTGCTG 780
TTCCAGGCAG CTACCGACGA GCAGCCGGCC GTGATCAAGA CGCTGGAGAA GCTGGTCAAC 840
ATCGAGACCG GCACCGGTGA CGCCGAGGGC ATCGCCGCTG CGGGCAACTT CCTCGAGGCC 900
GAGCTCAAGA ACCTCGGCTT CACGGTCACG CGAAGCAAGT CGGCCGGCCT GGTGGTGGGC 960
GACAACATCG TGGGCAAGAT CAAGGGCCGC GGCGGCAAGA ACCTGCTGCT GATGTCGCAC 1020
ATGGACACCG TCTACCTCAA GGGCATTCTC GCGAAGGCCC CGTTCCGCGT CGAAGGCGAC 1080
AAGGCCTACG GCCCGGGCAT CGCCGACGAC AAGGGCGGCA ACGCGGTCAT CCTGCACACG 1140
CTCAAGCTGC TGAAGGAATA CGGCGTGCGC GACTACGGCA CCATCACCGT GCTGTTCAAC 1200
ACCGACGAGG AAAAGGGTTC CTTCGGCTCG CGCGACCTGA TCCAGGAAGA AGCCAAGCTG 1260
GCCGACTACG TGCTCTCCTT CGAGCCCACC AGCGCAGGCG ACGAAAAACT CTCGCTGGGC 1320
ACCTCGGGCA TCGCCTACGT GCAGGTCCAG ATCACCGGCA AGGCCTCGCA TGCCGGCGCC 1380
GCGCCCGAGC TGGGCGTGAA CGCGCTGGTC GAGGCTTCCG ACCTCGTGCT GCGCACGATG 1440
AACATCGACG ACAAGGCGAA GAACCTGCGC TTCCAGTGGA CCATCGCCAA GGCCGGCCAG 1500
GTCTCGAACA TCATCCCCGC CAGCGCCACG CTGAACGCCG ACGTGCGCTA CGCGCGCAAC 1560
GAGGACTTCG ACGCCGCCAT GAAGACGCTG GAAGAGCGCG CGCAGCAGAA GAAGCTGCCC 1620
GAGGCCGACG TGAAGGTGAT CGTCACGCGC GGCCGCCCGG CCTTCAATGC CGGCGAAGGC 1680
GGCAAGAAGC TGGTCGACAA GGCGGTGGCC TACTACAAGG AAGCCGGCGG CACGCTGGGC 1740
GTGGAAGAGC GCACCGGCGG CGGCACCGAC GCGGCCTACG CCGCGCTCTC AGGCAAGCCA 1800
GTGATCGAGA GCCTGGGCCT GCCGGGCTTC GGCTACCACA GCGACAAGGC CGAGTACGTG 1860
GACATCAGCG CGATTCCGCG CCGCCTGTAC ATGGCTGCGC GCCTGATCAT GGATCTGGGC 1920
GCCGGCAAG 1929
(2) INFORMATION FOR SEQ ID NO: 16: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 643 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
Met Glu Leu Trp Leu Asn Trp Ile Phe Leu Val Thr Leu Leu Asn Gly
1 5 10 15
Ile Gln Cys Glu Val Lys Leu Val Glu Ser Gly Gly Gly Leu Val Gln
20 25 30
Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Thr Ser Gly Phe Thr Phe
35 40 45
Thr Asp Tyr Tyr Met Asn Trp Val Arg Gln Pro Pro Gly Lys Ala Leu
50 55 60
Glu Trp Leu Gly Phe Ile Gly Asn Lys Ala Asn Gly Tyr Thr Thr Glu
65 70 75 80
Tyr Ser Ala Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Lys Ser
85 90 95
Gln Ser Ile Leu Tyr Leu Gln Met Asn Thr Leu Arg Ala Glu Asp Ser
100 105 110
Ala Thr Tyr Tyr Cys Thr Arg Asp Arg Gly Leu Arg Phe Tyr Phe Asp
115 120 125
Tyr Trp Gly Gln Gly Thr Thr Leu Thr Val Ser Ser Ala Ser Thr Lys
130 135 140
Gly Pro Ser Val Phe Pro Leu Ala Pro Cys Ser Arg Ser Thr Ser Glu
145 150 155 160
Ser Thr Ala Ala Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro
165 170 175
Val Thr Val Ser Trp Asn Ser Gly Ala Leu Thr Ser Gly Val His Thr
180 185 190
Phe Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser Ser Val
195 200 205
Val Thr Val Pro Ser Ser Asn Phe Gly Thr Gln Thr Tyr Thr Cys Asn
210 215 220
Val Asp His Lys Pro Ser Asn Thr Lys Val Asp Lys Thr Val Gly Gly
225 230 235 240
Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gln Lys Arg
245 250 255
Asp Asn Val Leu Phe Gln Ala Ala Thr Asp Glu Gln Pro Ala Val Ile
260 265 270
Lys Thr Leu Glu Lys Leu Val Asn Ile Glu Thr Gly Thr Gly Asp Ala
275 280 285
Glu Gly Ile Ala Ala Ala Gly Asn Phe Leu Glu Ala Glu Leu Lys Asn
290 295 300
Leu Gly Phe Thr Val Thr Arg Ser Lys Ser Ala Gly Leu Val Val Gly
305 310 315 320
Asp Asn Ile Val Gly Lys Ile Lys Gly Arg Gly Gly Lys Asn Leu Leu
325 330 335
Leu Met Ser His Met Asp Thr Val Tyr Leu Lys Gly Ile Leu Ala Lys
340 345 350
Ala Pro Phe Arg Val Glu Gly Asp Lys Ala Tyr Gly Pro Gly Ile Ala
355 360 365
Asp Asp Lys Gly Gly Asn Ala Val Ile Leu His Thr Leu Lys Leu Leu
370 375 380
Lys Glu Tyr Gly Val Arg Asp Tyr Gly Thr Ile Thr Val Leu Phe Asn
385 390 395 400
Thr Asp Glu Glu Lys Gly Ser Phe Gly Ser Arg Asp Leu Ile Gln Glu
405 410 415
Glu Ala Lys Leu Ala Asp Tyr Val Leu Ser Phe Glu Pro Thr Ser Ala
420 425 430
Gly Asp Glu Lys Leu Ser Leu Gly Thr Ser Gly Ile Ala Tyr Val Gln
435 440 445
Val Gln Ile Thr Gly Lys Ala Ser His Ala Gly Ala Ala Pro Glu Leu
450 455 460
Gly Val Asn Ala Leu Val Glu Ala Ser Asp Leu Val Leu Arg Thr Met
465 470 475 480
Asn Ile Asp Asp Lys Ala Lys Asn Leu Arg Phe Gln Trp Thr Ile Ala
485 490 495
Lys Ala Gly Gln Val Ser Asn Ile Ile Pro Ala Ser Ala Thr Leu Asn
500 505 510
Ala Asp Val Arg Tyr Ala Arg Asn Glu Asp Phe Asp Ala Ala Met Lys
515 520 525
Thr Leu Glu Glu Arg Ala Gln Gln Lys Lys Leu Pro Glu Ala Asp Val
530 535 540
Lys Val Ile Val Thr Arg Gly Arg Pro Ala Phe Asn Ala Gly Glu Gly
545 550 555 560
Gly Lys Lys Leu Val Asp Lys Ala Val Ala Tyr Tyr Lys Glu Ala Gly
565 570 575
Gly Thr Leu Gly Val Glu Glu Arg Thr Gly Gly Gly Thr Asp Ala Ala
580 585 590
Tyr Ala Ala Leu Ser Gly Lys Pro Val Ile Glu Ser Leu Gly Leu Pro
595 600 605
Gly Phe Gly Tyr His Ser Asp Lys Ala Glu Tyr Val Asp Ile Ser Ala
610 615 620
Ile Pro Arg Arg Leu Tyr Met Ala Ala Arg Leu Ile Met Asp Leu Gly
625 630 635 640
Ala Gly Lys
(2) INFORMATION FOR SEQ ID NO: 17: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 705 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:
ATGGATTTTC AAGTGCAGAT TTTCAGCTTC CTGCTAATCA GTGCTTCAGT CATAATGTCC 60
AGAGGACAAA CTGTTCTCTC CCAGTCTCCA GCAATCCTGT CTGCATCTCC AGGGGAGAAG 120
GTCACAATGA CTTGCAGGGC CAGCTCAAGT GTAACTTACA TTCACTGGTA CCAGCAGAAG 180
CCAGGTTCCT CCCCCAAATC CTGGATTTAT GCCACATCCA ACCTGGCTTC TGGAGTCCCT 240
GCTCGCTTCA GTGGCAGTGG GTCTGGGACC TCTTACTCTC TCACAATCAG CAGAGTGGAG 300
GCTGAAGATG CTGCCACTTA TTACTGCCAA CATTGGAGTA GTAAACCACC GACGTTCGGT 360
GGAGGCACCA AGCTCGAGAT CAAACGGACT GTGGCTGCAC CATCTGTCTT CATCTTCCCG 420
CCATCTGATG AGCAGTTGAA ATCTGGAACT GCCTCTGTTG TGTGCCTGCT GAATAACTTC 480
TATCCCAGAG AGGCCAAAGT ACAGTGGAAG GTGGATAACG CCCTCCAATC GGGTAACTCC 540
CAGGAGAGTG TCACAGAGCA GGACAGCAAG GACAGCACCT ACAGCCTCAG CAGCACCCTG 600
ACGCTGAGCA AAGCAGACTA CGAGAAACAC AAAGTCTACG CCTGCGAAGT CACCCATCAG 660
GGCCTGAGTT CGCCCGTCAC AAAGAGCTTC AACAGGGGAG AGTGT 705
(2) INFORMATION FOR SEQ ID NO: 18: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 235 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
Met Asp Phe Gln Val Gln Ile Phe Ser Phe Leu Leu Ile Ser Ala Ser
1 5 10 15
Val Ile Met Ser Arg Gly Gln Thr Val Leu Ser Gln Ser Pro Ala Ile
20 25 30
Leu Ser Ala Ser Pro Gly Glu Lys Val Thr Met Thr Cys Arg Ala Ser
35 40 45
Ser Ser Val Thr Tyr Ile His Trp Tyr Gln Gln Lys Pro Gly Ser Ser
50 55 60
Pro Lys Ser Trp Ile Tyr Ala Thr Ser Asn Leu Ala Ser Gly Val Pro
65 70 75 80
Ala Arg Phe Ser Gly Ser Gly Ser Gly Thr Ser Tyr Ser Leu Thr Ile
85 90 95
Ser Arg Val Glu Ala Glu Asp Ala Ala Thr Tyr Tyr Cys Gln His Trp
100 105 110
Ser Ser Lys Pro Pro Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
115 120 125
Arg Thr Val Ala Ala Pro Ser Val Phe Ile Phe Pro Pro Ser Asp Glu
130 135 140
Gln Leu Lys Ser Gly Thr Ala Ser Val Val Cys Leu Leu Asn Asn Phe
145 150 155 160
Tyr Pro Arg Glu Ala Lys Val Gln Trp Lys Val Asp Asn Ala Leu Gln
165 170 175
Ser Gly Asn Ser Gln Glu Ser Val Thr Glu Gln Asp Ser Lys Asp Ser
180 185 190
Thr Tyr Ser Leu Ser Ser Thr Leu Thr Leu Ser Lys Ala Asp Tyr Glu
195 200 205
Lys His Lys Val Tyr Ala Cys Glu Val Thr His Gln Gly Leu Ser Ser
210 215 220
Pro Val Thr Lys Ser Phe Asn Arg Gly Glu Cys
225 230 235
(2) INFORMATION FOR SEQ ID NO: 19: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:
AAGCTTGAAT TCGCCGCCAC TATGGATTTT CAAGTGCAG 39
(2) INFORMATION FOR SEQ ID NO: 20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 44 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:
TTAATTGGAT CCGAGCTCCT ATTAACACTC TCCCCTGTTG AAGC 44
(2) INFORMATION FOR SEQ ID NO: 21: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 50 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
AAGCTTCCGG ATCCCTGCAG CCATGGAGTT GTGGCTGAAC TGGATTTTCC 50
(2) INFORMATION FOR SEQ ID NO: 22: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 38 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22:
AAGCTTAGTC TAGATTATCA CTTGCCGGCG CCCAGATC 38
(2) INFORMATION FOR SEQ ID NO: 23: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 46 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:
CGGGGGATCC AGATCTGAGC TCCTGTAGAC GTCGACATTA ATTCCG 46
(2) INFORMATION FOR SEQ ID NO: 24: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GGAAAATCCA GTTCAGCCAC AACTCCATGG 30
(2) INFORMATION FOR SEQ ID NO: 25: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1926 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:
ATGAAGTTGT GGCTGAACTG GATTTTCCTT GTAACACTTT TAAATGGAAT TCAGTGTGAG 60
GTGCAGCTGC AGCAGTCTGG GGCAGAGCTT GTGAGGTCAG GGGCCTCAGT CAAGTTGTCC 120
TGCACAGCTT CTGGCTTCAA CATTAAAGAC AACTATATGC ACTGGGTGAA GCAGAGGCCT 180
GAACAGGGCC TGGAGTGGAT TGCATGGATT GATCCTGAGA ATGGTGATAC TGAATATGCC 240
CCGAAGTTCC GGGGCAAGGC CACTTTGACT GCAGACTCAT CCTCCAACAC AGCCTACCTG 300
CACCTCAGCA GCCTGACATC TGAGGACACT GCCGTCTATT ACTGTCATGT CCTGATCTAT 360
GCTGGTTATT TGGCTATGGA CTACTGGGGT CAAGGAACCT CAGTCGCCGT GAGCTCGGCT 420
AGCACCAAGG GACCATCGGT CTTCCCCCTG GCCCCCTGCT CCAGGAGCAC CTCCGAGAGC 480
ACAGCCGCCC TGGGCTGCCT GGTCAAGGAC TACTTCCCCG AACCGGTGAC GGTGTCGTGG 540
AACTCAGGCG CTCTGACCAG CGGCGTGCAC ACCTTCCCGG CTGTCCTACA GTCCTCAGGA 600
CTCTACTCCC TCAGCAGCGT CGTGACGGTG CCCTCCAGCA ACTTCGGCAC CCAGACCTAC 660
ACCTGCAACG TAGATCACAA GCCCAGCAAC ACCAAGGTGG ACAAGACAGT TGGCGGTGGT 720
GGCTCTGGTG GTGGCGGTAG CGGTGGCGGG GGTTCCCAGA AGCGCGACAA CGTGCTGTTC 780
CAGGCAGCTA CCGACGAGCA GCCGGCCGTG ATCAAGACGC TGGAGAAGCT GGTCAACATC 840
GAGACCGGCA CCGGTGACGC CGAGGGCATC GCCGCTGCGG GCAACTTCCT CGAGGCCGAG 900
CTCAAGAACC TCGGCTTCAC GGTCACGCGA AGCAAGTCGG CCGGCCTGGT GGTGGGCGAC 960
AACATCGTGG GCAAGATCAA GGGCCGCGGC GGCAAGAACC TGCTGCTGAT GTCGCACATG 1020
GACACCGTCT ACCTCAAGGG CATTCTCGCG AAGGCCCCGT TCCGCGTCGA AGGCGACAAG 1080
GCCTACGGCC CGGGCATCGC CGACGACAAG GGCGGCAACG CGGTCATCCT GCACACGCTC 1140
AAGCTGCTGA AGGAATACGG CGTGCGCGAC TACGGCACCA TCACCGTGCT GTTCAACACC 1200
GACGAGGAAA AGGGTTCCTT CGGCTCGCGC GACCTGATCC AGGAAGAAGC CAAGCTGGCC 1260
GACTACGTGC TCTCCTTCGA GCCCACCAGC GCAGGCGACG AAAAACTCTC GCTGGGCACC 1320
TCGGGCATCG CCTACGTGCA GGTCCAGATC ACCGGCAAGG CCTCGCATGC CGGCGCCGCG 1380
CCCGAGCTGG GCGTGAACGC GCTGGTCGAG GCTTCCGACC TCGTGCTGCG CACGATGAAC 1440
ATCGACGACA AGGCGAAGAA CCTGCGCTTC CAGTGGACCA TCGCCAAGGC CGGCCAGGTC 1500
TCGAACATCA TCCCCGCCAG CGCCACGCTG AACGCCGACG TGCGCTACGC GCGCAACGAG 1560
GACTTCGACG CCGCCATGAA GACGCTGGAA GAGCGCGCGC AGCAGAAGAA GCTGCCCGAG 1620
GCCGACGTGA AGGTGATCGT CACGCGCGGC CGCCCGGCCT TCAATGCCGG CGAAGGCGGC 1680
AAGAAGCTGG TCGACAAGGC GGTGGCCTAC TACAAGGAAG CCGGCGGCAC GCTGGGCGTG 1740
GAAGAGCGCA CCGGCGGCGG CACCGACGCG GCCTACGCCG CGCTCTCAGG CAAGCCAGTG 1800
ATCGAGAGCC TGGGCCTGCC GGGCTTCGGC TACCACAGCG ACAAGGCCGA GTACGTGGAC 1860
ATCAGCGCGA TTCCGCGCCG CCTGTACATG GCTGCGCGCC TGATCATGGA TCTGGGCGCC 1920
GGCAAG 1926
(2) INFORMATION FOR SEQ ID NO: 26: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 642 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:
Met Lys Leu Trp Leu Asn Trp Ile Phe Leu Val Thr Leu Leu Asn Gly
1 5 10 15
Ile Gln Cys Glu Val Gln Leu Gln Gln Ser Gly Ala Glu Leu Val Arg
20 25 30
Ser Gly Ala Ser Val Lys Leu Ser Cys Thr Ala Ser Gly Phe Asn Ile
35 40 45
Lys Asp Asn Tyr Met His Trp Val Lys Gln Arg Pro Glu Gln Gly Leu
50 55 60
Glu Trp Ile Ala Trp Ile Asp Pro Glu Asn Gly Asp Thr Glu Tyr Ala
65 70 75 80
Pro Lys Phe Arg Gly Lys Ala Thr Leu Thr Ala Asp Ser Ser Ser Asn
85 90 95
Thr Ala Tyr Leu His Leu Ser Ser Leu Thr Ser Glu Asp Thr Ala Val
100 105 110
Tyr Tyr Cys His Val Leu Ile Tyr Ala Gly Tyr Leu Ala Met Asp Tyr
115 120 125
Trp Gly Gln Gly Thr Ser Val Ala Val Ser Ser Ala Ser Thr Lys Gly
130 135 140
Pro Ser Val Phe Pro Leu Ala Pro Cys Ser Arg Ser Thr Ser Glu Ser
145 150 155 160
Thr Ala Ala Leu Gly Cys Leu Val Lys Asp Tyr Phe Pro Glu Pro Val
165 170 175
Thr Val Ser Trp Asn Ser Gly Ala Leu Thr Ser Gly Val His Thr Phe
180 185 190
Pro Ala Val Leu Gln Ser Ser Gly Leu Tyr Ser Leu Ser Ser Val Val
195 200 205
Thr Val Pro Ser Ser Asn Phe Gly Thr Gln Thr Tyr Thr Cys Asn Val
210 215 220
Asp His Lys Pro Ser Asn Thr Lys Val Asp Lys Thr Val Gly Gly Gly
225 230 235 240
Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gln Lys Arg Asp
245 250 255
Asn Val Leu Phe Gln Ala Ala Thr Asp Glu Gln Pro Ala Val Ile Lys
260 265 270
Thr Leu Glu Lys Leu Val Asn Ile Glu Thr Gly Thr Gly Asp Ala Glu
275 280 285
Gly Ile Ala Ala Ala Gly Asn Phe Leu Glu Ala Glu Leu Lys Asn Leu
290 295 300
Gly Phe Thr Val Thr Arg Ser Lys Ser Ala Gly Leu Val Val Gly Asp
305 310 315 320
Asn Ile Val Gly Lys Ile Lys Gly Arg Gly Gly Lys Asn Leu Leu Leu
325 330 335
Met Ser His Met Asp Thr Val Tyr Leu Lys Gly Ile Leu Ala Lys Ala
340 345 350
Pro Phe Arg Val Glu Gly Asp Lys Ala Tyr Gly Pro Gly Ile Ala Asp
355 360 365
Asp Lys Gly Gly Asn Ala Val Ile Leu His Thr Leu Lys Leu Leu Lys
370 375 380
Glu Tyr Gly Val Arg Asp Tyr Gly Thr Ile Thr Val Leu Phe Asn Thr
385 390 395 400
Asp Glu Glu Lys Gly Ser Phe Gly Ser Arg Asp Leu Ile Gln Glu Glu
405 410 415
Ala Lys Leu Ala Asp Tyr Val Leu Ser Phe Glu Pro Thr Ser Ala Gly
420 425 430
Asp Glu Lys Leu Ser Leu Gly Thr Ser Gly Ile Ala Tyr Val Gln Val
435 440 445
Gln Ile Thr Gly Lys Ala Ser His Ala Gly Ala Ala Pro Glu Leu Gly
450 455 460
Val Asn Ala Leu Val Glu Ala Ser Asp Leu Val Leu Arg Thr Met Asn
465 470 475 480
Ile Asp Asp Lys Ala Lys Asn Leu Arg Phe Gln Trp Thr Ile Ala Lys
485 490 495
Ala Gly Gln Val Ser Asn Ile Ile Pro Ala Ser Ala Thr Leu Asn Ala
500 505 510
Asp Val Arg Tyr Ala Arg Asn Glu Asp Phe Asp Ala Ala Met Lys Thr
515 520 525
Leu Glu Glu Arg Ala Gln Gln Lys Lys Leu Pro Glu Ala Asp Val Lys
530 535 540
Val Ile Val Thr Arg Gly Arg Pro Ala Phe Asn Ala Gly Glu Gly Gly
545 550 555 560
Lys Lys Leu Val Asp Lys Ala Val Ala Tyr Tyr Lys Glu Ala Gly Gly
565 570 575
Thr Leu Gly Val Glu Glu Arg Thr Gly Gly Gly Thr Asp Ala Ala Tyr
580 585 590
Ala Ala Leu Ser Gly Lys Pro Val Ile Glu Ser Leu Gly Leu Pro Gly
595 600 605
Phe Gly Tyr His Ser Asp Lys Ala Glu Tyr Val Asp Ile Ser Ala Ile
610 615 620
Pro Arg Arg Leu Tyr Met Ala Ala Arg Leu Ile Met Asp Leu Gly Ala
625 630 635 640
Gly Lys
(2) INFORMATION FOR SEQ ID NO: 27: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
AAGCTTGGAA TTCAGTGTCA GGTCCAACTG CAGCAGCCT 39
(2) INFORMATION FOR SEQ ID NO: 28: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
GCTACCGCCA CCTCCGGAGC CACCACCGCC CCGTTTGATC TCGAGCTTGG TGCC 54
(2) INFORMATION FOR SEQ ID NO: 29: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 58 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:
TCCGGAGGTG GCGGTAGCGG TGGCGGGGGT TCCCAGAAGC GCGACAACGT GCTGTTCC 58
(2) INFORMATION FOR SEQ ID NO: 30: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:
CCTCGAGGAA TTCTTTCACT TGCC 24
(2) INFORMATION FOR SEQ ID NO: 31: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2019 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
ATGAAGTTGT GGCTGAACTG GATTTTCCTT GTAACACTTT TAAATGGAAT TCAGTGTCAG 60
GTCCAACTGC AGCAGCCTGG GGCTGAACTG GTGAAGCCTG GGGCTTCAGT GCAGCTGTCC 120
TGCAAGGCTT CTGGCTACAC CTTCACCGGC TACTGGATAC ACTGGGTGAA GCAGAGGCCT 180
GGACAAGGCC TTGAGTGGAT TGGAGAGGTT AATCCTAGTA CCGGTCGTTC TGACTACAAT 240
GAGAAGTTCA AGAACAAGGC CACACTGACT GTAGACAAAT CCTCCACCAC AGCCTACATG 300
CAACTCAGCA GCCTGACATC TGAGGACTCT GCGGTCTATT ACTGTGCAAG AGAGAGGGCC 360
TATGGTTACG ACGATGCTAT GGACTACTGG GGCCAAGGGA CCACGGTCAC CGTCTCCTCA 420
GGTGGCGGTG GCTCGGGCGG TGGTGGGTCG GGTGGCGGCG GATCTGACAT TGAGCTCTCA 480
CAGTCTCCAT CCTCCCTGGC TGTGTCAGCA GGAGAGAAGG TCACCATGAG CTGCAAATCC 540
AGTCAGAGTC TCCTCAACAG TAGAACCCGA AAGAACTACT TGGCTTGGTA CCAGCAGAGA 600
CCAGGGCAGT CTCCTAAACT GCTGATCTAT TGGGCATCCA CTAGGACATC TGGGGTCCCT 660
GATCGCTTCA CAGGCAGTGG ATCTGGGACA GATTTCACTC TCACCATCAG CAGTGTGCAG 720
GCTGAAGACC TGGCAATTTA TTACTGCAAG CAATCTTATA CTCTTCGGAC GTTCGGTGGA 780
GGCACCAAGC TCGAGATCAA ACGGGGCGGT GGTGGCTCCG GAGGTGGCGG TAGCGGTGGC 840
GGGGGTTCCC AGAAGCGCGA CAACGTGCTG TTCCAGGCAG CTACCGACGA GCAGCCGGCC 900
GTGATCAAGA CGCTGGAGAA GCTGGTCAAC ATCGAGACCG GCACCGGTGA CGCCGAGGGC 960
ATCGCCGCTG CGGGCAACTT CCTCGAGGCC GAGCTCAAGA ACCTCGGCTT CACGGTCACG 1020
CGAAGCAAGT CGGCCGGCCT GGTGGTGGGC GACAACATCG TGGGCAAGAT CAAGGGCCGC 1080
GGCGGCAAGA ACCTGCTGCT GATGTCGCAC ATGGACACCG TCTACCTCAA GGGCATTCTC 1140
GCGAAGGCCC CGTTCCGCGT CGAAGGCGAC AAGGCCTACG GCCCGGGCAT CGCCGACGAC 1200
AAGGGCGGCA ACGCGGTCAT CCTGCACACG CTCAAGCTGC TGAAGGAATA CGGCGTGCGC 1260
GACTACGGCA CCATCACCGT GCTGTTCAAC ACCGACGAGG AAAAGGGTTC CTTCGGCTCG 1320
CGCGACCTGA TCCAGGAAGA AGCCAAGCTG GCCGACTACG TGCTCTCCTT CGAGCCCACC 1380
AGCGCAGGCG ACGAAAAACT CTCGCTGGGC ACCTCGGGCA TCGCCTACGT GCAGGTCCAG 1440
ATCACCGGCA AGGCCTCGCA TGCCGGCGCC GCGCCCGAGC TGGGCGTGAA CGCGCTGGTC 1500
GAGGCTTCCG ACCTCGTGCT GCGCACGATG AACATCGACG ACAAGGCGAA GAACCTGCGC 1560
TTCCAGTGGA CCATCGCCAA GGCCGGCCAG GTCTCGAACA TCATCCCCGC CAGCGCCACG 1620
CTGAACGCCG ACGTGCGCTA CGCGCGCAAC GAGGACTTCG ACGCCGCCAT GAAGACGCTG 1680
GAAGAGCGCG CGCAGCAGAA GAAGCTGCCC GAGGCCGACG TGAAGGTGAT CGTCACGCGC 1740
GGCCGCCCGG CCTTCAATGC CGGCGAAGGC GGCAAGAAGC TGGTCGACAA GGCGGTGGCC 1800
TACTACAAGG AAGCCGGCGG CACGCTGGGC GTGGAAGAGC GCACCGGCGG CGGCACCGAC 1860
GCGGCCTACG CCGCGCTCTC AGGCAAGCCA GTGATCGAGA GCCTGGGCCT GCCGGGCTTC 1920
GGCTACCACA GCGACAAGGC CGAGTACGTG GACATCAGCG CGATTCCGCG CCGCCTGTAC 1980
ATGGCTGCGC GCCTGATCAT GGATCTGGGC GCCGGCAAG 2019
(2) INFORMATION FOR SEQ ID NO: 32: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 673 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32:
Met Lys Leu Trp Leu Asn Trp Ile Phe Leu Val Thr Leu Leu Asn Gly
1 5 10 15
Ile Gln Cys Gln Val Gln Leu Gln Gln Pro Gly Ala Glu Leu Val Lys
20 25 30
Pro Gly Ala Ser Val Gln Leu Ser Cys Lys Ala Ser Gly Tyr Thr Phe
35 40 45
Thr Gly Tyr Trp Ile His Trp Val Lys Gln Arg Pro Gly Gln Gly Leu
50 55 60
Glu Trp Ile Gly Glu Val Asn Pro Ser Thr Gly Arg Ser Asp Tyr Asn
65 70 75 80
Glu Lys Phe Lys Asn Lys Ala Thr Leu Thr Val Asp Lys Ser Ser Thr
85 90 95
Thr Ala Tyr Met Gln Leu Ser Ser Leu Thr Ser Glu Asp Ser Ala Val
100 105 110
Tyr Tyr Cys Ala Arg Glu Arg Ala Tyr Gly Tyr Asp Asp Ala Met Asp
115 120 125
Tyr Trp Gly Gln Gly Thr Thr Val Thr Val Ser Ser Gly Gly Gly Gly
130 135 140
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Asp Ile Glu Leu Ser
145 150 155 160
Gln Ser Pro Ser Ser Leu Ala Val Ser Ala Gly Glu Lys Val Thr Met
165 170 175
Ser Cys Lys Ser Ser Gln Ser Leu Leu Asn Ser Arg Thr Arg Lys Asn
180 185 190
Tyr Leu Ala Trp Tyr Gln Gln Arg Pro Gly Gln Ser Pro Lys Leu Leu
195 200 205
Ile Tyr Trp Ala Ser Thr Arg Thr Ser Gly Val Pro Asp Arg Phe Thr
210 215 220
Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser Ser Val Gln
225 230 235 240
Ala Glu Asp Leu Ala Ile Tyr Tyr Cys Lys Gln Ser Tyr Thr Leu Arg
245 250 255
Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Arg Gly Gly Gly Gly
260 265 270
Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gln Lys Arg Asp Asn
275 280 285
Val Leu Phe Gln Ala Ala Thr Asp Glu Gln Pro Ala Val Ile Lys Thr
290 295 300
Leu Glu Lys Leu Val Asn Ile Glu Thr Gly Thr Gly Asp Ala Glu Gly
305 310 315 320
Ile Ala Ala Ala Gly Asn Phe Leu Glu Ala Glu Leu Lys Asn Leu Gly
325 330 335
Phe Thr Val Thr Arg Ser Lys Ser Ala Gly Leu Val Val Gly Asp Asn
340 345 350
Ile Val Gly Lys Ile Lys Gly Arg Gly Gly Lys Asn Leu Leu Leu Met
355 360 365
Ser His Met Asp Thr Val Tyr Leu Lys Gly Ile Leu Ala Lys Ala Pro
370 375 380
Phe Arg Val Glu Gly Asp Lys Ala Tyr Gly Pro Gly Ile Ala Asp Asp
385 390 395 400
Lys Gly Gly Asn Ala Val Ile Leu His Thr Leu Lys Leu Leu Lys Glu
405 410 415
Tyr Gly Val Arg Asp Tyr Gly Thr Ile Thr Val Leu Phe Asn Thr Asp
420 425 430
Glu Glu Lys Gly Ser Phe Gly Ser Arg Asp Leu Ile Gln Glu Glu Ala
435 440 445
Lys Leu Ala Asp Tyr Val Leu Ser Phe Glu Pro Thr Ser Ala Gly Asp
450 455 460
Glu Lys Leu Ser Leu Gly Thr Ser Gly Ile Ala Tyr Val Gln Val Gln
465 470 475 480
Ile Thr Gly Lys Ala Ser His Ala Gly Ala Ala Pro Glu Leu Gly Val
485 490 495
Asn Ala Leu Val Glu Ala Ser Asp Leu Val Leu Arg Thr Met Asn Ile
500 505 510
Asp Asp Lys Ala Lys Asn Leu Arg Phe Gln Trp Thr Ile Ala Lys Ala
515 520 525
Gly Gln Val Ser Asn Ile Ile Pro Ala Ser Ala Thr Leu Asn Ala Asp
530 535 540
Val Arg Tyr Ala Arg Asn Glu Asp Phe Asp Ala Ala Met Lys Thr Leu
545 550 555 560
Glu Glu Arg Ala Gln Gln Lys Lys Leu Pro Glu Ala Asp Val Lys Val
565 570 575
Ile Val Thr Arg Gly Arg Pro Ala Phe Asn Ala Gly Glu Gly Gly Lys
580 585 590
Lys Leu Val Asp Lys Ala Val Ala Tyr Tyr Lys Glu Ala Gly Gly Thr
595 600 605
Leu Gly Val Glu Glu Arg Thr Gly Gly Gly Thr Asp Ala Ala Tyr Ala
610 615 620
Ala Leu Ser Gly Lys Pro Val Ile Glu Ser Leu Gly Leu Pro Gly Phe
625 630 635 640
Gly Tyr His Ser Asp Lys Ala Glu Tyr Val Asp Ile Ser Ala Ile Pro
645 650 655
Arg Arg Leu Tyr Met Ala Ala Arg Leu Ile Met Asp Leu Gly Ala Gly
660 665 670
Lys
(2) INFORMATION FOR SEQ ID NO: 33: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
GGGCGCCGGC AAGTGATAAA ATTCCTCGAG GAGCTCC 37
(2) INFORMATION FOR SEQ ID NO: 34: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
CGCCACCTCT GACTTGAGC 19
(2) INFORMATION FOR SEQ ID NO: 35: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 37 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:
GGAGCTCCTC GAGGAATTTT ATCACTTGCC GGCGCCC 37
(2) INFORMATION FOR SEQ ID NO: 36: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:
GCTGAACGCC GACGTGCGC 19
(2) INFORMATION FOR SEQ ID NO: 37: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2025 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:
ATGAAGTTGT GGCTGAACTG GATTTTCCTT GTAACACTTT TAAATGGAAT TCAGTGTCAG 60
GTCCAACTGC AGCAGCCTGG GGCTGAACTG GTGAAGCCTG GGGCTTCAGT GCAGCTGTCC 120
TGCAAGGCTT CTGGCTACAC CTTCACCGGC TACTGGATAC ACTGGGTGAA GCAGAGGCCT 180
GGACAAGGCC TTGAGTGGAT TGGAGAGGTT AATCCTAGTA CCGGTCGTTC TGACTACAAT 240
GAGAAGTTCA AGAACAAGGC CACACTGACT GTAGACAAAT CCTCCACCAC AGCCTACATG 300
CAACTCAGCA GCCTGACATC TGAGGACTCT GCGGTCTATT ACTGTGCAAG AGAGAGGGCC 360
TATGGTTACG ACGATGCTAT GGACTACTGG GGCCAAGGGA CCACGGTCAC CGTCTCCTCA 420
GGTGGCGGTG GCTCGGGCGG TGGTGGGTCG GGTGGCGGCG GATCTGACAT TGAGCTCTCA 480
CAGTCTCCAT CCTCCCTGGC TGTGTCAGCA GGAGAGAAGG TCACCATGAG CTGCAAATCC 540
AGTCAGAGTC TCCTCAACAG TAGAACCCGA AAGAACTACT TGGCTTGGTA CCAGCAGAGA 600
CCAGGGCAGT CTCCTAAACT GCTGATCTAT TGGGCATCCA CTAGGACATC TGGGGTCCCT 660
GATCGCTTCA CAGGCAGTGG ATCTGGGACA GATTTCACTC TCACCATCAG CAGTGTGCAG 720
GCTGAAGACC TGGCAATTTA TTACTGCAAG CAATCTTATA CTCTTCGGAC GTTCGGTGGA 780
GGCACCAAGC TCGAGATCAA ACGGGGCGGT GGTGGCTCCG GAGGTGGCGG TAGCGGTGGC 840
GGGGGTTCCC AGAAGCGCGA CAACGTGCTG TTCCAGGCAG CTACCGACGA GCAGCCGGCC 900
GTGATCAAGA CGCTGGAGAA GCTGGTCAAC ATCGAGACCG GCACCGGTGA CGCCGAGGGC 960
ATCGCCGCTG CGGGCAACTT CCTCGAGGCC GAGCTCAAGA ACCTCGGCTT CACGGTCACG 1020
CGAAGCAAGT CGGCCGGCCT GGTGGTGGGC GACAACATCG TGGGCAAGAT CAAGGGCCGC 1080
GGCGGCAAGA ACCTGCTGCT GATGTCGCAC ATGGACACCG TCTACCTCAA GGGCATTCTC 1140
GCGAAGGCCC CGTTCCGCGT CGAAGGCGAC AAGGCCTACG GCCCGGGCAT CGCCGACGAC 1200
AAGGGCGGCA ACGCGGTCAT CCTGCACACG CTCAAGCTGC TGAAGGAATA CGGCGTGCGC 1260
GACTACGGCA CCATCACCGT GCTGTTCAAC ACCGACGAGG AAAAGGGTTC CTTCGGCTCG 1320
CGCGACCTGA TCCAGGAAGA AGCCAAGCTG GCCGACTACG TGCTCTCCTT CGAGCCCACC 1380
AGCGCAGGCG ACGAAAAACT CTCGCTGGGC ACCTCGGGCA TCGCCTACGT GCAGGTCCAG 1440
ATCACCGGCA AGGCCTCGCA TGCCGGCGCC GCGCCCGAGC TGGGCGTGAA CGCGCTGGTC 1500
GAGGCTTCCG ACCTCGTGCT GCGCACGATG AACATCGACG ACAAGGCGAA GAACCTGCGC 1560
TTCCAGTGGA CCATCGCCAA GGCCGGCCAG GTCTCGAACA TCATCCCCGC CAGCGCCACG 1620
CTGAACGCCG ACGTGCGCTA CGCGCGCAAC GAGGACTTCG ACGCCGCCAT GAAGACGCTG 1680
GAAGAGCGCG CGCAGCAGAA GAAGCTGCCC GAGGCCGACG TGAAGGTGAT CGTCACGCGC 1740
GGCCGCCCGG CCTTCAATGC CGGCGAAGGC GGCAAGAAGC TGGTCGACAA GGCGGTGGCC 1800
TACTACAAGG AAGCCGGCGG CACGCTGGGC GTGGAAGAGC GCACCGGCGG CGGCACCGAC 1860
GCGGCCTACG CCGCGCTCTC AGGCAAGCCA GTGATCGAGA GCCTGGGCCT GCCGGGCTTC 1920
GGCTACCACA GCGACAAGGC CGAGTACGTG GACATCAGCG CGATTCCGCG CCGCCTGTAC 1980
ATGGCTGCGC GCCTGATCAT GGATCTGGGC GCCGGCAAGT GATAA 2025
(2) INFORMATION FOR SEQ ID NO: 38: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 288 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:
Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1 5 10 15
Ala Gln Pro Ala Met Ala Gln Val Gln Leu Gln Gln Pro Gly Ala Glu
20 25 30
Leu Val Lys Pro Gly Ala Ser Val Gln Leu Ser Cys Lys Ala Ser Gly
35 40 45
Tyr Thr Phe Thr Gly Tyr Trp Ile His Trp Val Lys Gln Arg Pro Gly
50 55 60
Gln Gly Leu Glu Trp Ile Gly Glu Val Asn Pro Ser Thr Gly Arg Ser
65 70 75 80
Asp Tyr Asn Glu Lys Phe Lys Asn Lys Ala Thr Leu Thr Val Asp Lys
85 90 95
Ser Ser Thr Thr Ala Tyr Met Gln Leu Ser Ser Leu Thr Ser Glu Asp
100 105 110
Ser Ala Val Tyr Tyr Cys Ala Arg Glu Arg Ala Tyr Gly Tyr Asp Asp
115 120 125
Ala Met Asp Tyr Trp Gly Gln Gly Thr Thr Val Thr Val Ser Ser Gly
130 135 140
Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Asp Ile
145 150 155 160
Glu Leu Ser Gln Ser Pro Ser Ser Leu Ala Val Ser Ala Gly Glu Lys
165 170 175
Val Thr Met Ser Cys Lys Ser Ser Gln Ser Leu Leu Asn Ser Arg Thr
180 185 190
Arg Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Arg Pro Gly Gln Ser Pro
195 200 205
Lys Leu Leu Ile Tyr Trp Ala Ser Thr Arg Thr Ser Gly Val Pro Asp
210 215 220
Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Thr Ile Ser
225 230 235 240
Ser Val Gln Ala Glu Asp Leu Ala Ile Tyr Tyr Cys Lys Gln Ser Tyr
245 250 255
Thr Leu Arg Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys Arg Glu
260 265 270
Gln Lys Leu Ile Ser Glu Glu Asp Leu Asn His His His His His His
275 280 285
(2) INFORMATION FOR SEQ ID NO: 39: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 36 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
GCCCAACCAG CCATGGCCGA GGTGCAGCTG CAGCAG 36
(2) INFORMATION FOR SEQ ID NO: 40: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:
CGACCCACCA CCGCCCGAGC CACCGCCACC CGAGCTCACG GCGACTGAGG TTCC 54
(2) INFORMATION FOR SEQ ID NO: 41: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 54 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:
TCGGGCGGTG GTGGGTCGGG TGGCGGCGGA TCTCAGATTG TGCTCACCCA GTCT 54
(2) INFORMATION FOR SEQ ID NO: 42: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:
CCGTTTGATC TCGAGCTTGG TCCC 24
(2) INFORMATION FOR SEQ ID NO: 43: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 843 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:
ATGAAATACC TATTGCCTAC GGCAGCCGCT GGATTGTTAT TACTCGCTGC CCAACCAGCC 60
ATGGCCGAGG TGCAGCTGCA GCAGTCTGGG GCAGAGCTTG TGAGGTCAGG GGCCTCAGTC 120
AAGTTGTCCT GCACAGCTTC TGGCTTCAAC ATTAAAGACA ACTATATGCA CTGGGTGAAG 180
CAGAGGCCTG AACAGGGCCT GGAGTGGATT GCATGGATTG ATCCTGAGAA TGGTGATACT 240
GAATATGCCC CGAAGTTCCG GGGCAAGGCC ACTTTGACTG CAGACTCATC CTCCAACACA 300
GCCTACCTGC ACCTCAGCAG CCTGACATCT GAGGACACTG CCGTCTATTA CTGTCATGTC 360
CTGATCTATG CTGGTTATTT GGCTATGGAC TACTGGGGTC AAGGAACCTC AGTCGCCGTG 420
AGCTCGGGTG GCGGTGGCTC GGGCGGTGGT GGGTCGGGTG GCGGCGGATC TCAGATTGTG 480
CTCACCCAGT CTCCAGCAAT CATGTCTGCA TCTCCAGGGG AGAAGGTCAC CATAACCTGC 540
AGTGCCAGCT CAAGTGTAAC TTACATGCAC TGGTTCCAGC AGAAGCCAGG CACTTCTCCC 600
AAACTCTGGA TTTATAGCAC ATCCAACCTG GCTTCTGGAG TCCCTGCTCG CTTCAGTGGC 660
AGTGGATCTG GGACCTCTTA CTCTCTCACA ATCAGCCGAA TGGAGGCTGA AGATGCTGCC 720
ACTTATTACT GCCAGCAAAG GAGTACTTAC CCGCTCACGT TCGGTGCTGG GACCAAGCTC 780
GAGATCAAAC GGGAACAAAA ACTCATCTCA GAAGAAGATC TGAATCACCA CCATCACCAC 840
CAT 843
(2) INFORMATION FOR SEQ ID NO: 44: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 281 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:
Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
1 5 10 15
Ala Gln Pro Ala Met Ala Glu Val Gln Leu Gln Gln Ser Gly Ala Glu
20 25 30
Leu Val Arg Ser Gly Ala Ser Val Lys Leu Ser Cys Thr Ala Ser Gly
35 40 45
Phe Asn Ile Lys Asp Asn Tyr Met His Trp Val Lys Gln Arg Pro Glu
50 55 60
Gln Gly Leu Glu Trp Ile Ala Trp Ile Asp Pro Glu Asn Gly Asp Thr
65 70 75 80
Glu Tyr Ala Pro Lys Phe Arg Gly Lys Ala Thr Leu Thr Ala Asp Ser
85 90 95
Ser Ser Asn Thr Ala Tyr Leu His Leu Ser Ser Leu Thr Ser Glu Asp
100 105 110
Thr Ala Val Tyr Tyr Cys His Val Leu Ile Tyr Ala Gly Tyr Leu Ala
115 120 125
Met Asp Tyr Trp Gly Gln Gly Thr Ser Val Ala Val Ser Ser Gly Gly
130 135 140
Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gln Ile Val
145 150 155 160
Leu Thr Gln Ser Pro Ala Ile Met Ser Ala Ser Pro Gly Glu Lys Val
165 170 175
Thr Ile Thr Cys Ser Ala Ser Ser Ser Val Thr Tyr Met His Trp Phe
180 185 190
Gln Gln Lys Pro Gly Thr Ser Pro Lys Leu Trp Ile Tyr Ser Thr Ser
195 200 205
Asn Leu Ala Ser Gly Val Pro Ala Arg Phe Ser Gly Ser Gly Ser Gly
210 215 220
Thr Ser Tyr Ser Leu Thr Ile Ser Arg Met Glu Ala Glu Asp Ala Ala
225 230 235 240
Thr Tyr Tyr Cys Gln Gln Arg Ser Thr Tyr Pro Leu Thr Phe Gly Ala
245 250 255
Gly Thr Lys Leu Glu Ile Lys Arg Glu Gln Lys Leu Ile Ser Glu Glu
260 265 270
Asp Leu Asn His His His His His His
275 280
(2) INFORMATION FOR SEQ ID NO: 45: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
TCGAGATCAA ACGGGAACAA AAACTCATCT CAGAAGAAGA TCTGAATCAC CACCATCACC 60
ACCATTAATG AG 72
(2) INFORMATION FOR SEQ ID NO: 46: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 72 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:
AATTCTCATT AATGGTGGTG ATGGTGGTGA TTCAGATCTT CTTCTGAGAT GAGTTTTTGT 60
TCCCGTTTGA TC 72
(2) INFORMATION FOR SEQ ID NO: 47: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 864 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:
ATGAAATACC TATTGCCTAC GGCAGCCGCT GGATTGTTAT TACTCGCTGC CCAACCAGCC 60
ATGGCCCAGG TCCAACTGCA GCAGCCTGGG GCTGAACTGG TGAAGCCTGG GGCTTCAGTG 120
CAGCTGTCCT GCAAGGCTTC TGGCTACACC TTCACCGGCT ACTGGATACA CTGGGTGAAG 180
CAGAGGCCTG GACAAGGCCT TGAGTGGATT GGAGAGGTTA ATCCTAGTAC CGGTCGTTCT 240
GACTACAATG AGAAGTTCAA GAACAAGGCC ACACTGACTG TAGACAAATC CTCCACCACA 300
GCCTACATGC AACTCAGCAG CCTGACATCT GAGGACTCTG CGGTCTATTA CTGTGCAAGA 360
GAGAGGGCCT ATGGTTACGA CGATGCTATG GACTACTGGG GCCAAGGGAC CACGGTCACC 420
GTCTCCTCAG GTGGCGGTGG CTCGGGCGGT GGTGGGTCGG GTGGCGGCGG ATCTGACATT 480
GAGCTCTCAC AGTCTCCATC CTCCCTGGCT GTGTCAGCAG GAGAGAAGGT CACCATGAGC 540
TGCAAATCCA GTCAGAGTCT CCTCAACAGT AGAACCCGAA AGAACTACTT GGCTTGGTAC 600
CAGCAGAGAC CAGGGCAGTC TCCTAAACTG CTGATCTATT GGGCATCCAC TAGGACATCT 660
GGGGTCCCTG ATCGCTTCAC AGGCAGTGGA TCTGGGACAG ATTTCACTCT CACCATCAGC 720
AGTGTGCAGG CTGAAGACCT GGCAATTTAT TACTGCAAGC AATCTTATAC TCTTCGGACG 780
TTCGGTGGAG GCACCAAGCT CGAGATCAAA CGGGAACAAA AACTCATCTC AGAAGAAGAT 840
CTGAATCACC ACCATCACCA CCAT 864
(2) INFORMATION FOR SEQ ID NO: 48: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:
AAGCTTGGAA TTCAGTGTGA GGTGCAGCTG CAGC 34
(2) INFORMATION FOR SEQ ID NO: 49: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 45 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:
CGCCACCTCC GGAGCCACCA CCGCCCCGTT TGATCTCGAG CTTGG 45
(2) INFORMATION FOR SEQ ID NO: 50: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1998 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:
ATGAAGTTGT GGCTGAACTG GATTTTCCTT GTAACACTTT TAAATGGAAT TCAGTGTGAG 60
GTGCAGCTGC AGCAGTCTGG GGCAGAGCTT GTGAGGTCAG GGGCCTCAGT CAAGTTGTCC 120
TGCACAGCTT CTGGCTTCAA CATTAAAGAC AACTATATGC ACTGGGTGAA GCAGAGGCCT 180
GAACAGGGCC TGGAGTGGAT TGCATGGATT GATCCTGAGA ATGGTGATAC TGAATATGCC 240
CCGAAGTTCC GGGGCAAGGC CACTTTGACT GCAGACTCAT CCTCCAACAC AGCCTACCTG 300
CACCTCAGCA GCCTGACATC TGAGGACACT GCCGTCTATT ACTGTCATGT CCTGATCTAT 360
GCTGGTTATT TGGCTATGGA CTACTGGGGT CAAGGAACCT CAGTCGCCGT GAGCTCGGGT 420
GGCGGTGGCT CGGGCGGTGG TGGGTCGGGT GGCGGCGGAT CTCAGATTGT GCTCACCCAG 480
TCTCCAGCAA TCATGTCTGC ATCTCCAGGG GAGAAGGTCA CCATAACCTG CAGTGCCAGC 540
TCAAGTGTAA CTTACATGCA CTGGTTCCAG CAGAAGCCAG GCACTTCTCC CAAACTCTGG 600
ATTTATAGCA CATCCAACCT GGCTTCTGGA GTCCCTGCTC GCTTCAGTGG CAGTGGATCT 660
GGGACCTCTT ACTCTCTCAC AATCAGCCGA ATGGAGGCTG AAGATGCTGC CACTTATTAC 720
TGCCAGCAAA GGAGTACTTA CCCGCTCACG TTCGGTGCTG GGACCAAGCT CGAGATCAAA 780
CGGGGCGGTG GTGGCTCCGG AGGTGGCGGT AGCGGTGGCG GGGGTTCCCA GAAGCGCGAC 840
AACGTGCTGT TCCAGGCAGC TACCGACGAG CAGCCGGCCG TGATCAAGAC GCTGGAGAAG 900
CTGGTCAACA TCGAGACCGG CACCGGTGAC GCCGAGGGCA TCGCCGCTGC GGGCAACTTC 960
CTCGAGGCCG AGCTCAAGAA CCTCGGCTTC ACGGTCACGC GAAGCAAGTC GGCCGGCCTG 1020
GTGGTGGGCG ACAACATCGT GGGCAAGATC AAGGGCCGCG GCGGCAAGAA CCTGCTGCTG 1080
ATGTCGCACA TGGACACCGT CTACCTCAAG GGCATTCTCG CGAAGGCCCC GTTCCGCGTC 1140
GAAGGCGACA AGGCCTACGG CCCGGGCATC GCCGACGACA AGGGCGGCAA CGCGGTCATC 1200
CTGCACACGC TCAAGCTGCT GAAGGAATAC GGCGTGCGCG ACTACGGCAC CATCACCGTG 1260
CTGTTCAACA CCGACGAGGA AAAGGGTTCC TTCGGCTCGC GCGACCTGAT CCAGGAAGAA 1320
GCCAAGCTGG CCGACTACGT GCTCTCCTTC GAGCCCACCA GCGCAGGCGA CGAAAAACTC 1380
TCGCTGGGCA CCTCGGGCAT CGCCTACGTG CAGGTCCAGA TCACCGGCAA GGCCTCGCAT 1440
GCCGGCGCCG CGCCCGAGCT GGGCGTGAAC GCGCTGGTCG AGGCTTCCGA CCTCGTGCTG 1500
CGCACGATGA ACATCGACGA CAAGGCGAAG AACCTGCGCT TCCAGTGGAC CATCGCCAAG 1560
GCCGGCCAGG TCTCGAACAT CATCCCCGCC AGCGCCACGC TGAACGCCGA CGTGCGCTAC 1620
GCGCGCAACG AGGACTTCGA CGCCGCCATG AAGACGCTGG AAGAGCGCGC GCAGCAGAAG 1680
AAGCTGCCCG AGGCCGACGT GAAGGTGATC GTCACGCGCG GCCGCCCGGC CTTCAATGCC 1740
GGCGAAGGCG GCAAGAAGCT GGTCGACAAG GCGGTGGCCT ACTACAAGGA AGCCGGCGGC 1800
ACGCTGGGCG TGGAAGAGCG CACCGGCGGC GGCACCGACG CGGCCTACGC CGCGCTCTCA 1860
GGCAAGCCAG TGATCGAGAG CCTGGGCCTG CCGGGCTTCG GCTACCACAG CGACAAGGCC 1920
GAGTACGTGG ACATCAGCGC GATTCCGCGC CGCCTGTACA TGGCTGCGCG CCTGATCATG 1980
GATCTGGGCG CCGGCAAG 1998
(2) INFORMATION FOR SEQ ID NO: 51: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 666 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:
Met Lys Leu Trp Leu Asn Trp Ile Phe Leu Val Thr Leu Leu Asn Gly
1 5 10 15
Ile Gln Cys Glu Val Gln Leu Gln Gln Ser Gly Ala Glu Leu Val Arg
20 25 30
Ser Gly Ala Ser Val Lys Leu Ser Cys Thr Ala Ser Gly Phe Asn Ile
35 40 45
Lys Asp Asn Tyr Met His Trp Val Lys Gln Arg Pro Glu Gln Gly Leu
50 55 60
Glu Trp Ile Ala Trp Ile Asp Pro Glu Asn Gly Asp Thr Glu Tyr Ala
65 70 75 80
Pro Lys Phe Arg Gly Lys Ala Thr Leu Thr Ala Asp Ser Ser Ser Asn
85 90 95
Thr Ala Tyr Leu His Leu Ser Ser Leu Thr Ser Glu Asp Thr Ala Val
100 105 110
Tyr Tyr Cys His Val Leu Ile Tyr Ala Gly Tyr Leu Ala Met Asp Tyr
115 120 125
Trp Gly Gln Gly Thr Ser Val Ala Val Ser Ser Gly Gly Gly Gly Ser
130 135 140
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gln Ile Val Leu Thr Gln
145 150 155 160
Ser Pro Ala Ile Met Ser Ala Ser Pro Gly Glu Lys Val Thr Ile Thr
165 170 175
Cys Ser Ala Ser Ser Ser Val Thr Tyr Met His Trp Phe Gln Gln Lys
180 185 190
Pro Gly Thr Ser Pro Lys Leu Trp Ile Tyr Ser Thr Ser Asn Leu Ala
195 200 205
Ser Gly Val Pro Ala Arg Phe Ser Gly Ser Gly Ser Gly Thr Ser Tyr
210 215 220
Ser Leu Thr Ile Ser Arg Met Glu Ala Glu Asp Ala Ala Thr Tyr Tyr
225 230 235 240
Cys Gln Gln Arg Ser Thr Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys
245 250 255
Leu Glu Ile Lys Arg Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly
260 265 270
Gly Gly Gly Ser Gln Lys Arg Asp Asn Val Leu Phe Gln Ala Ala Thr
275 280 285
Asp Glu Gln Pro Ala Val Ile Lys Thr Leu Glu Lys Leu Val Asn Ile
290 295 300
Glu Thr Gly Thr Gly Asp Ala Glu Gly Ile Ala Ala Ala Gly Asn Phe
305 310 315 320
Leu Glu Ala Glu Leu Lys Asn Leu Gly Phe Thr Val Thr Arg Ser Lys
325 330 335
Ser Ala Gly Leu Val Val Gly Asp Asn Ile Val Gly Lys Ile Lys Gly
340 345 350
Arg Gly Gly Lys Asn Leu Leu Leu Met Ser His Met Asp Thr Val Tyr
355 360 365
Leu Lys Gly Ile Leu Ala Lys Ala Pro Phe Arg Val Glu Gly Asp Lys
370 375 380
Ala Tyr Gly Pro Gly Ile Ala Asp Asp Lys Gly Gly Asn Ala Val Ile
385 390 395 400
Leu His Thr Leu Lys Leu Leu Lys Glu Tyr Gly Val Arg Asp Tyr Gly
405 410 415
Thr Ile Thr Val Leu Phe Asn Thr Asp Glu Glu Lys Gly Ser Phe Gly
420 425 430
Ser Arg Asp Leu Ile Gln Glu Glu Ala Lys Leu Ala Asp Tyr Val Leu
435 440 445
Ser Phe Glu Pro Thr Ser Ala Gly Asp Glu Lys Leu Ser Leu Gly Thr
450 455 460
Ser Gly Ile Ala Tyr Val Gln Val Gln Ile Thr Gly Lys Ala Ser His
465 470 475 480
Ala Gly Ala Ala Pro Glu Leu Gly Val Asn Ala Leu Val Glu Ala Ser
485 490 495
Asp Leu Val Leu Arg Thr Met Asn Ile Asp Asp Lys Ala Lys Asn Leu
500 505 510
Arg Phe Gln Trp Thr Ile Ala Lys Ala Gly Gln Val Ser Asn Ile Ile
515 520 525
Pro Ala Ser Ala Thr Leu Asn Ala Asp Val Arg Tyr Ala Arg Asn Glu
530 535 540
Asp Phe Asp Ala Ala Met Lys Thr Leu Glu Glu Arg Ala Gln Gln Lys
545 550 555 560
Lys Leu Pro Glu Ala Asp Val Lys Val Ile Val Thr Arg Gly Arg Pro
565 570 575
Ala Phe Asn Ala Gly Glu Gly Gly Lys Lys Leu Val Asp Lys Ala Val
580 585 590
Ala Tyr Tyr Lys Glu Ala Gly Gly Thr Leu Gly Val Glu Glu Arg Thr
595 600 605
Gly Gly Gly Thr Asp Ala Ala Tyr Ala Ala Leu Ser Gly Lys Pro Val
610 615 620
Ile Glu Ser Leu Gly Leu Pro Gly Phe Gly Tyr His Ser Asp Lys Ala
625 630 635 640
Glu Tyr Val Asp Ile Ser Ala Ile Pro Arg Arg Leu Tyr Met Ala Ala
645 650 655
Arg Leu Ile Met Asp Leu Gly Ala Gly Lys
660 665
(2) INFORMATION FOR SEQ ID NO: 52: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 3217 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52:
GAATTCGCCG CCACTATGGA TTTTCAAGTG CAGATTTTCA GCTTCCTGCT AATCAGTGCT 60
TCAGTCATAA TGTCCAGAGG ACAAACTGTT CTCTCCCAGT CTCCAGCAAT CCTGTCTGCA 120
TCTCCAGGGG AGAAGGTCAC AATGACTTGC AGGGCCAGCT CAAGTGTAAC TTACATTCAC 180
TGGTACCAGC AGAAGCCAGG TTCCTCCCCC AAATCCTGGA TTTATGCCAC ATCCAACCTG 240
GCTTCTGGAG TCCCTGCTCG CTTCAGTGGC AGTGGGTCTG GGACCTCTTA CTCTCTCACA 300
ATCAGCAGAG TGGAGGCTGA AGATGCTGCC ACTTATTACT GCCAACATTG GAGTAGTAAA 360
CCACCGACGT TCGGTGGAGG CACCAAGCTC GAGATCAAAC GGACTGTGGC TGCACCATCT 420
GTCTTCATCT TCCCGCCATC TGATGAGCAG TTGAAATCTG GAACTGCCTC TGTTGTGTGC 480
CTGCTGAATA ACTTCTATCC CAGAGAGGCC AAAGTACAGT GGAAGGTGGA TAACGCCCTC 540
CAATCGGGTA ACTCCCAGGA GAGTGTCACA GAGCAGGACA GCAAGGACAG CACCTACAGC 600
CTCAGCAGCA CCCTGACGCT GAGCAAAGCA GACTACGAGA AACACAAAGT CTACGCCTGC 660
GAAGTCACCC ATCAGGGCCT GAGTTCGCCC GTCACAAAGA GCTTCAACAG GGGAGAGTGT 720
TAATAGGAGC TCGGATCCAG ATCTGAGCTC CTGTAGACGT CGACATTAAT TCCGGTTATT 780
TTCCACCATA TTGCCGTCTT TTGGCAATGT GAGGGCCCGG AAACCTGGCC CTGTCTTCTT 840
GACGAGCATT CCTAGGGGTC TTTCCCCTCT CGCCAAAGGA ATGCAAGGTC TGTTGAATGT 900
CGTGAAGGAA GCAGTTCCTC TGGAAGCTTC TTGAAGACAA ACAACGTCTG TAGCGACCCT 960
TTGCAGGCAG CGGAACCCCC CACCTGGCGA CAGGTGCCTC TGCGGCCAAA AGCCACGTGT 1020
ATAAGATACA CCTGCAAAGG CGGCACAACC CCAGTGCCAC GTTGTGAGTT GGATAGTTGT 1080
GGAAAGAGTC AAATGGCTCT CCTCAAGCGT ATTCAACAAG GGGCTGAAGG ATGCCCAGAA 1140
GGTACCCCAT TGTATGGGAT CTGATCTGGG GCCTCGGTGC ACATGCTTTA CATGTGTTTA 1200
GTCGAGGTTA AAAAACGTCT AGGCCCCCCG AACCACGGGG ACGTGGTTTT CCTTTGAAAA 1260
ACACGATGAT AATACCATGG AGTTGTGGCT GAACTGGATT TTCCTTGTAA CACTTTTAAA 1320
TGGTATCCAG TGTGAGGTGA AGCTGGTGGA GTCTGGAGGA GGCTTGGTAC AGCCTGGGGG 1380
TTCTCTGAGA CTCTCCTGTG CAACTTCTGG GTTCACCTTC ACTGATTACT ACATGAACTG 1440
GGTCCGCCAG CCTCCAGGAA AGGCACTTGA GTGGTTGGGT TTTATTGGAA ACAAAGCTAA 1500
TGGTTACACA ACAGAGTACA GTGCATCTGT GAAGGGTCGG TTCACCATCT CCAGAGATAA 1560
ATCCCAAAGC ATCCTCTATC TTCAAATGAA CACCCTGAGA GCTGAGGACA GTGCCACTTA 1620
TTACTGTACA AGAGATAGGG GGCTACGGTT CTACTTTGAC TACTGGGGCC AAGGCACCAC 1680
TCTCACAGTG AGCTCGGCTA GCACCAAGGG ACCATCGGTC TTCCCCCTGG CCCCCTGCTC 1740
CAGGAGCACC TCCGAGAGCA CAGCCGCCCT GGGCTGCCTG GTCAAGGACT ACTTCCCCGA 1800
ACCGGTGACG GTGTCGTGGA ACTCAGGCGC TCTGACCAGC GGCGTGCACA CCTTCCCGGC 1860
TGTCCTACAG TCCTCAGGAC TCTACTCCCT CAGCAGCGTC GTGACGGTGC CCTCCAGCAA 1920
CTTCGGCACC CAGACCTACA CCTGCAACGT AGATCACAAG CCCAGCAACA CCAAGGTGGA 1980
CAAGACAGTT GGCGGTGGTG GCTCTGGTGG TGGCGGTAGC GGTGGCGGGG GTTCCCAGAA 2040
GCGCGACAAC GTGCTGTTCC AGGCAGCTAC CGACGAGCAG CCGGCCGTGA TCAAGACGCT 2100
GGAGAAGCTG GTCAACATCG AGACCGGCAC CGGTGACGCC GAGGGCATCG CCGCTGCGGG 2160
CAACTTCCTC GAGGCCGAGC TCAAGAACCT CGGCTTCACG GTCACGCGAA GCAAGTCGGC 2220
CGGCCTGGTG GTGGGCGACA ACATCGTGGG CAAGATCAAG GGCCGCGGCG GCAAGAACCT 2280
GCTGCTGATG TCGCACATGG ACACCGTCTA CCTCAAGGGC ATTCTCGCGA AGGCCCCGTT 2340
CCGCGTCGAA GGCGACAAGG CCTACGGCCC GGGCATCGCC GACGACAAGG GCGGCAACGC 2400
GGTCATCCTG CACACGCTCA AGCTGCTGAA GGAATACGGC GTGCGCGACT ACGGCACCAT 2460
CACCGTGCTG TTCAACACCG ACGAGGAAAA GGGTTCCTTC GGCTCGCGCG ACCTGATCCA 2520
GGAAGAAGCC AAGCTGGCCG ACTACGTGCT CTCCTTCGAG CCCACCAGCG CAGGCGACGA 2580
AAAACTCTCG CTGGGCACCT CGGGCATCGC CTACGTGCAG GTCCAGATCA CCGGCAAGGC 2640
CTCGCATGCC GGCGCCGCGC CCGAGCTGGG CGTGAACGCG CTGGTCGAGG CTTCCGACCT 2700
CGTGCTGCGC ACGATGAACA TCGACGACAA GGCGAAGAAC CTGCGCTTCC AGTGGACCAT 2760
CGCCAAGGCC GGCCAGGTCT CGAACATCAT CCCCGCCAGC GCCACGCTGA ACGCCGACGT 2820
GCGCTACGCG CGCAACGAGG ACTTCGACGC CGCCATGAAG ACGCTGGAAG AGCGCGCGCA 2880
GCAGAAGAAG CTGCCCGAGG CCGACGTGAA GGTGATCGTC ACGCGCGGCC GCCCGGCCTT 2940
CAATGCCGGC GAAGGCGGCA AGAAGCTGGT CGACAAGGCG GTGGCCTACT ACAAGGAAGC 3000
CGGCGGCACG CTGGGCGTGG AAGAGCGCAC CGGCGGCGGC ACCGACGCGG CCTACGCCGC 3060
GCTCTCAGGC AAGCCAGTGA TCGAGAGCCT GGGCCTGCCG GGCTTCGGCT ACCACAGCGA 3120
CAAGGCCGAG TACGTGGACA TCAGCGCGAT TCCGCGCCGC CTGTACATGG CTGCGCGCCT 3180
GATCATGGAT CTGGGCGCCG GCAAGTGATA ATCTAGA 3217
(2) INFORMATION FOR SEQ ID NO: 53: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 35 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53:
TGGATCTGAA GCTTAAACTA ACTCCATGGT GACCC 35
(2) INFORMATION FOR SEQ ID NO: 54: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 61 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:
GCCACGGATC CCGCCACCTC CGGAGCCACC ACCGCCACAA TCCCTGGGCA CAATTTTCTT 60
G 61
(2) INFORMATION FOR SEQ ID NO: 55: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 94 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55:
GCCCAGGAAG CTTGGCGGTG GTGGCTCCGG AGGTGGCGGT AGCGGTGGCG GGGGTTCCCA 60
GAAGCGCGAC AACGTGCTGT TCCAGGCAGC TACC 94
(2) INFORMATION FOR SEQ ID NO: 56: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 51 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56:
ATGTGCGAAT TCAGCAGCAG GTTCTTGCCG CCGCGGCCCT TGATCTTGCC C 51
(2) INFORMATION FOR SEQ ID NO: 57: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 732 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION:16..720
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57:
GAATTCGCCG CCACC ATG GAT TTT CAA GTG CAG ATT TTC AGC TTC CTG CTA 51
Met Asp Phe Gln Val Gln Ile Phe Ser Phe Leu Leu
1 5 10
ATC AGT GCT TCA GTC ATA ATG TCC AGA GGA CAA ACT GTT CTC TCC CAG 99
Ile Ser Ala Ser Val Ile Met Ser Arg Gly Gln Thr Val Leu Ser Gln
15 20 25
TCT CCA GCA ATC CTG TCT GCA TCT CCA GGG GAG AAG GTC ACA ATG ACT 147
Ser Pro Ala Ile Leu Ser Ala Ser Pro Gly Glu Lys Val Thr Met Thr
30 35 40
TGC AGG GCC AGC TCA AGT GTA ACT TAC ATT CAC TGG TAC CAG CAG AAG 195
Cys Arg Ala Ser Ser Ser Val Thr Tyr Ile His Trp Tyr Gln Gln Lys
45 50 55 60
CCA GGT TCC TCC CCC AAA TCC TGG ATT TAT GCC ACA TCC AAC CTG GCT 243
Pro Gly Ser Ser Pro Lys Ser Trp Ile Tyr Ala Thr Ser Asn Leu Ala
65 70 75
TCT GGA GTC CCT GCT CGC TTC AGT GGC AGT GGG TCT GGG ACC TCT TAC 291
Ser Gly Val Pro Ala Arg Phe Ser Gly Ser Gly Ser Gly Thr Ser Tyr
80 85 90
TCT CTC ACA ATC AGC AGA GTG GAG GCT GAA GAT GCT GCC ACT TAT TAC 339
Ser Leu Thr Ile Ser Arg Val Glu Ala Glu Asp Ala Ala Thr Tyr Tyr
95 100 105
TGC CAA CAT TGG AGT AGT AAA CCA CCG ACG TTC GGT GGA GGC ACC AAG 387
Cys Gln His Trp Ser Ser Lys Pro Pro Thr Phe Gly Gly Gly Thr Lys
110 115 120
CTG GAA ATC AAA CGG GCT GAT GCT GCA CCA ACT GTA TCC ATC TTC CCA 435
Leu Glu Ile Lys Arg Ala Asp Ala Ala Pro Thr Val Ser Ile Phe Pro
125 130 135 140
CCA TCC AGT GAG CAG TTA ACA TCT GGA GGT GCC TCA GTC GTG TGC TTC 483
Pro Ser Ser Glu Gln Leu Thr Ser Gly Gly Ala Ser Val Val Cys Phe
145 150 155
TTG AAC AAC TTC TAC CCC AAA GAC ATC AAT GTC AAG TGG AAG ATT GAT 531
Leu Asn Asn Phe Tyr Pro Lys Asp Ile Asn Val Lys Trp Lys Ile Asp
160 165 170
GGC AGT GAA CGA CAA AAT GGC GTC CTG AAC AGT TGG ACT GAT CAG GAC 579
Gly Ser Glu Arg Gln Asn Gly Val Leu Asn Ser Trp Thr Asp Gln Asp
175 180 185
AGC AAA GAC AGC ACC TAC AGC ATG AGC AGC ACC CTC ACG TTG ACC AAG 627
Ser Lys Asp Ser Thr Tyr Ser Met Ser Ser Thr Leu Thr Leu Thr Lys
190 195 200
GAC GAG TAT GAA CGA CAT AAC AGC TAT ACC TGT GAG GCC ACT CAC AAG 675
Asp Glu Tyr Glu Arg His Asn Ser Tyr Thr Cys Glu Ala Thr His Lys
205 210 215 220
ACA TCA ACT TCA CCC ATT GTC AAG AGC TTC AAC AGG AAT GAG TGT 720
Thr Ser Thr Ser Pro Ile Val Lys Ser Phe Asn Arg Asn Glu Cys
225 230 235
TAATAAGAAT TC 732
(2) INFORMATION FOR SEQ ID NO: 58:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 235 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:
Met Asp Phe Gln Val Gln Ile Phe Ser Phe Leu Leu Ile Ser Ala Ser
1 5 10 15
Val Ile Met Ser Arg Gly Gln Thr Val Leu Ser Gln Ser Pro Ala Ile
20 25 30
Leu Ser Ala Ser Pro Gly Glu Lys Val Thr Met Thr Cys Arg Ala Ser
35 40 45
Ser Ser Val Thr Tyr Ile His Trp Tyr Gln Gln Lys Pro Gly Ser Ser
50 55 60
Pro Lys Ser Trp Ile Tyr Ala Thr Ser Asn Leu Ala Ser Gly Val Pro
65 70 75 80
Ala Arg Phe Ser Gly Ser Gly Ser Gly Thr Ser Tyr Ser Leu Thr Ile
85 90 95
Ser Arg Val Glu Ala Glu Asp Ala Ala Thr Tyr Tyr Cys Gln His Trp
100 105 110
Ser Ser Lys Pro Pro Thr Phe Gly Gly Gly Thr Lys Leu Glu Ile Lys
115 120 125
Arg Ala Asp Ala Ala Pro Thr Val Ser Ile Phe Pro Pro Ser Ser Glu
130 135 140
Gln Leu Thr Ser Gly Gly Ala Ser Val Val Cys Phe Leu Asn Asn Phe
145 150 155 160
Tyr Pro Lys Asp Ile Asn Val Lys Trp Lys Ile Asp Gly Ser Glu Arg
165 170 175
Gln Asn Gly Val Leu Asn Ser Trp Thr Asp Gln Asp Ser Lys Asp Ser
180 185 190
Thr Tyr Ser Met Ser Ser Thr Leu Thr Leu Thr Lys Asp Glu Tyr Glu
195 200 205
Arg His Asn Ser Tyr Thr Cys Glu Ala Thr His Lys Thr Ser Thr Ser
210 215 220
Pro Ile Val Lys Ser Phe Asn Arg Asn Glu Cys
225 230 235
(2) INFORMATION FOR SEQ ID NO: 59: (i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1974 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid (ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION:16..1956
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:
AAGCTTGCCG CCACC ATG AAG TTG TGG CTG AAC TGG ATT TTC CTT GTA ACA 51
Met Lys Leu Trp Leu Asn Trp Ile Phe Leu Val Thr
1 5 10
CTT TTA AAT GGT ATC CAG TGT GAG GTG AAG CTG GTG GAG TCT GGA GGA 99
Leu Leu Asn Gly Ile Gln Cys Glu Val Lys Leu Val Glu Ser Gly Gly
15 20 25
GGC TTG GTA CAG CCT GGG GGT TCT CTG AGA CTC TCC TGT GCA ACT TCT 147
Gly Leu Val Gln Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Thr Ser
30 35 40
GGG TTC ACC TTC ACT GAT TAC TAC ATG AAC TGG GTC CGC CAG CCT CCA 195
Gly Phe Thr Phe Thr Asp Tyr Tyr Met Asn Trp Val Arg Gln Pro Pro
45 50 55 60
GGA AAG GCA CTT GAG TGG TTG GGT TTT ATT GGA AAC AAA GCT AAT GGT 243
Gly Lys Ala Leu Glu Trp Leu Gly Phe Ile Gly Asn Lys Ala Asn Gly
65 70 75
TAC ACA ACA GAG TAC ACT GCA TCT GTG AAG GGT CGG TTC ACC ATC TCC 291
Tyr Thr Thr Glu Tyr Ser Ala Ser Val Lys Gly Arg Phe Thr Ile Ser
80 85 90
AGA GAT AAA TCC CAA AGC ATC CTC TAT CTT CAA ATG AAC ACC CTG AGA 339
Arg Asp Lys Ser Gln Ser Ile Leu Tyr Leu Gln Met Asn Thr Leu Arg
95 100 105
GCT GAG GAC ACT GCC ACT TAT TAC TGT ACA AGA GAT AGG GGG CTA CGG 387
Ala Glu Asp Ser Ala Thr Tyr Tyr Cys Thr Arg Asp Arg Gly Leu Arg
110 115 120
TTC TAC TTT GAC TAC TGG GGC CAA GGC ACC ACT CTC ACA GTC TCC TCA 435
Phe Tyr Phe Asp Tyr Trp Gly Gln Gly Thr Thr Leu Thr Val Ser Ser
125 130 135 140
GCC AAA ACG ACA CCC CCA TCT GTC TAT CCA CTG GCC CCT GGA TCT GCT 483
Ala Lys Thr Thr Pro Pro Ser Val Tyr Pro Leu Ala Pro Gly Ser Ala
145 150 155
GCC CAA ACT AAC TCC ATG GTG ACC CTG GGA TGC CTG GTC AAG GGC TAT 531
Ala Gln Thr Asn Ser Met Val Thr Leu Gly Cys Leu Val Lys Gly Tyr
160 165 170
TTC CCT GAG CCA GTG ACA GTG ACC TGG AAC TCT GGA TCT CTG TCC AGC 579
Phe Pro Glu Pro Val Thr Val Thr Trp Asn Ser Gly Ser Leu Ser Ser
175 180 185
GGT GTG CAC ACC TTC CCA GCT GTC CTG CAG TCT GAC CTC TAC ACT CTG 627
Gly Val His Thr Phe Pro Ala Val Leu Gln Ser Asp Leu Tyr Thr Leu
190 195 200
AGC AGC TCA GTG ACT GTC CCC TCC AGC ACC TGG CCC AGC GAG ACC GTC 675
Ser Ser Ser Val Thr Val Pro Ser Ser Thr Trp Pro Ser Glu Thr Val
205 210 215 220
ACC TGC AAC GTT GCC CAC CCG GCC AGC AGC ACC AAG GTG GAC AAG AAA 723
Thr Cys Asn Val Ala His Pro Ala Ser Ser Thr Lys Val Asp Lys Lys
225 230 235
ATT GTG CCC AGG GAT TGT GGC GGT GGT GGC TCC GGA GGT GGC GGT AGC 771
Ile Val Pro Arg Asp Cys Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
240 245 250
GGT GGC GGG GGT TCC CAG AAG CGC GAC AAC GTG CTG TTC CAG GCA GCT 819
Gly Gly Gly Gly Ser Gln Lys Arg Asp Asn Val Leu Phe Gln Ala Ala
255 260 265
ACC GAC GAG CAG CCG GCC GTG ATC AAG ACG CTG GAG AAG CTG GTC AAC 867
Thr Asp Glu Gln Pro Ala Val Ile Lys Thr Leu Glu Lys Leu Val Asn
270 275 280
ATC GAG ACC GGC ACC GGT GAC GCC GAG GGC ATC GCC GCT GCG GGC AAC 915
Ile Glu Thr Gly Thr Gly Asp Ala Glu Gly Ile Ala Ala Ala Gly Asn
285 290 295 300
TTC CTC GAG GCC GAG CTC AAG AAC CTC GGC TTC ACG GTC ACG CGA AGC 963
Phe Leu Glu Ala Glu Leu Lys Asn Leu Gly Phe Thr Val Thr Arg Ser
305 310 315
AAG TCG GCC GGC CTG GTG GTG GGC GAC AAC ATC GTG GGC AAG ATC AAG 1011
Lys Ser Ala Gly Leu Val Val Gly Asp Asn Ile Val Gly Lys Ile Lys
320 325 330
GGC CGC GGC GGC AAG AAC CTG CTG CTG ATG TCG CAC ATG GAC ACC GTC 1059
Gly Arg Gly Gly Lys Asn Leu Leu Leu Met Ser His Met Asp Thr Val
335 340 345
TAC CTC AAG GGC ATT CTC GCG AAG GCC CCG TTC CGC GTC GAA GGC GAC 1107
Tyr Leu Lys Gly Ile Leu Ala Lys Ala Pro Phe Arg Val Glu Gly Asp
350 355 360
AAG GCC TAC GGC CCG GGC ATC GCC GAC GAC AAG GGC GGC AAC GCG GTC 1155
Lys Ala Tyr Gly Pro Gly Ile Ala Asp Asp Lys Gly Gly Asn Ala Val
365 370 375 380
ATC CTG CAC ACG CTC AAG CTG CTG AAG GAA TAC GGC GTG CGC GAC TAC 1203
Ile Leu His Thr Leu Lys Leu Leu Lys Glu Tyr Gly Val Arg Asp Tyr
385 390 395
GGC ACC ATC ACC GTG CTG TTC AAC ACC GAC GAG GAA AAG GGT TCC TTC 1251
Gly Thr Ile Thr Val Leu Phe Asn Thr Asp Glu Glu Lys Gly Ser Phe
400 405 410
GGC TCG CGC GAC CTG ATC CAG GAA GAA GCC AAG CTG GCC GAC TAC GTG 1299
Gly Ser Arg Asp Leu Ile Gln Glu Glu Ala Lys Leu Ala Asp Tyr Val
415 420 425
CTC TCC TTC GAG CCC ACC AGC GCA GGC GAC GAA AAA CTC TCG CTG GGC 1347
Leu Ser Phe Glu Pro Thr Ser Ala Gly Asp Glu Lys Leu Ser Leu Gly
430 435 440
ACC TCG GGC ATC GCC TAC GTG CAG GTC AAC ATC ACC GGC AAG GCC TCG 1395
Thr Ser Gly Ile Ala Tyr Val Gln Val Asn Ile Thr Gly Lys Ala Ser
445 450 455 460
CAT GCC GGC GCC GCG CCC GAG CTG GGC GTG AAC GCG CTG GTC GAG GCT 1443
His Ala Gly Ala Ala Pro Glu Leu Gly Val Asn Ala Leu Val Glu Ala
465 470 475
TCC GAC CTC GTG CTG CGC ACG ATG AAC ATC GAC GAC AAG GCG AAG AAC 1491
Ser Asp Leu Val Leu Arg Thr Met Asn Ile Asp Asp Lys Ala Lys Asn
480 485 490
CTG CGC TTC AAC TGG ACC ATC GCC AAG GCC GGC AAC GTC TCG AAC ATC 1539
Leu Arg Phe Asn Trp Thr Ile Ala Lys Ala Gly Asn Val Ser Asn Ile
495 500 505
ATC CCC GCC AGC GCC ACG CTG AAC GCC GAC GTG CGC TAC GCG CGC AAC 1587
Ile Pro Ala Ser Ala Thr Leu Asn Ala Asp Val Arg Tyr Ala Arg Asn
510 515 520
GAG GAC TTC GAC GCC GCC ATG AAG ACG CTG GAA GAG CGC GCG CAG CAG 1635
Glu Asp Phe Asp Ala Ala Met Lys Thr Leu Glu Glu Arg Ala Gln Gln
525 530 535 540
AAG AAG CTG CCC GAG GCC GAC GTG AAG GTG ATC GTC ACG CGC GGC CGC 1683
Lys Lys Leu Pro Glu Ala Asp Val Lys Val Ile Val Thr Arg Gly Arg
545 550 555
CCG GCC TTC AAT GCC GGC GAA GGC GGC AAG AAG CTG GTC GAC AAG GCG 1731
Pro Ala Phe Asn Ala Gly Glu Gly Gly Lys Lys Leu Val Asp Lys Ala
560 565 570
GTG GCC TAC TAC AAG GAA GCC GGC GGC ACG CTG GGC GTG GAA GAG CGC 1779
Val Ala Tyr Tyr Lys Glu Ala Gly Gly Thr Leu Gly Val Glu Glu Arg
575 580 585
ACC GGC GGC GGC ACC GAC GCG GCC TAC GCC GCG CTC TCA GGC AAG CCA 1827
Thr Gly Gly Gly Thr Asp Ala Ala Tyr Ala Ala Leu Ser Gly Lys Pro
590 595 600
GTG ATC GAG AGC CTG GGC CTG CCG GGC TTC GGC TAC CAC AGC GAC AAG 1875
Val Ile Glu Ser Leu Gly Leu Pro Gly Phe Gly Tyr His Ser Asp Lys
605 610 615 620
GCC GAG TAC GTG GAC ATC AGC GCG ATT CCG CGC CGC CTG TAC ATG GCT 1923
Ala Glu Tyr Val Asp Ile Ser Ala Ile Pro Arg Arg Leu Tyr Met Ala
625 630 635
GCG CGC CTG ATC ATG GAT CTG GGC GCC GGC AAG TGATAAGAAT TCCTCGAG 1974
Ala Arg Leu Ile Met Asp Leu Gly Ala Gly Lys
640 645
(2) INFORMATION FOR SEQ ID NO: 60:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 647 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60:
Met Lys Leu Trp Leu Asn Trp Ile Phe Leu Val Thr Leu Leu Asn Gly
1 5 10 15
Ile Gln Cys Glu Val Lys Leu Val Glu Ser Gly Gly Gly Leu Val Gln
20 25 30
Pro Gly Gly Ser Leu Arg Leu Ser Cys Ala Thr Ser Gly Phe Thr Phe
35 40 45
Thr Asp Tyr Tyr Met Asn Trp Val Arg Gln Pro Pro Gly Lys Ala Leu
50 55 60
Glu Trp Leu Gly Phe Ile Gly Asn Lys Ala Asn Gly Tyr Thr Thr Glu
65 70 75 80
Tyr Ser Ala Ser Val Lys Gly Arg Phe Thr Ile Ser Arg Asp Lys Ser
85 90 95
Gln Ser Ile Leu Tyr Leu Gln Met Asn Thr Leu Arg Ala Glu Asp Ser
100 105 110
Ala Thr Tyr Tyr Cys Thr Arg Asp Arg Gly Leu Arg Phe Tyr Phe Asp
115 120 125
Tyr Trp Gly Gln Gly Thr Thr Leu Thr Val Ser Ser Ala Lys Thr Thr
130 135 140
Pro Pro Ser Val Tyr Pro Leu Ala Pro Gly Ser Ala Ala Gln Thr Asn
145 150 155 160
Ser Met Val Thr Leu Gly Cys Leu Val Lys Gly Tyr Phe Pro Glu Pro
165 170 175
Val Thr Val Thr Trp Asn Ser Gly Ser Leu Ser Ser Gly Val His Thr
180 185 190
Phe Pro Ala Val Leu Gln Ser Asp Leu Tyr Thr Leu Ser Ser Ser Val
195 200 205
Thr Val Pro Ser Ser Thr Trp Pro Ser Glu Thr Val Thr Cys Asn Val
210 215 220
Ala His Pro Ala Ser Ser Thr Lys Val Asp Lys Lys Ile Val Pro Arg
225 230 235 240
Asp Cys Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly
245 250 255
Ser Gln Lys Arg Asp Asn Val Leu Phe Gln Ala Ala Thr Asp Glu Gln
260 265 270
Pro Ala Val Ile Lys Thr Leu Glu Lys Leu Val Asn Ile Glu Thr Gly
275 280 285
Thr Gly Asp Ala Glu Gly Ile Ala Ala Ala Gly Asn Phe Leu Glu Ala
290 295 300
Glu Leu Lys Asn Leu Gly Phe Thr Val Thr Arg Ser Lys Ser Ala Gly
305 310 315 320
Leu Val Val Gly Asp Asn Ile Val Gly Lys Ile Lys Gly Arg Gly Gly
325 330 335
Lys Asn Leu Leu Leu Met Ser His Met Asp Thr Val Tyr Leu Lys Gly
340 345 350
Ile Leu Ala Lys Ala Pro Phe Arg Val Glu Gly Asp Lys Ala Tyr Gly
355 360 365
Pro Gly Ile Ala Asp Asp Lys Gly Gly Asn Ala Val Ile Leu His Thr
370 375 380
Leu Lys Leu Leu Lys Glu Tyr Gly Val Arg Asp Tyr Gly Thr Ile Thr
385 390 395 400
Val Leu Phe Asn Thr Asp Glu Glu Lys Gly Ser Phe Gly Ser Arg Asp
405 410 415
Leu Ile Gln Glu Glu Ala Lys Leu Ala Asp Tyr Val Leu Ser Phe Glu
420 425 430
Pro Thr Ser Ala Gly Asp Glu Lys Leu Ser Leu Gly Thr Ser Gly Ile
435 440 445
Ala Tyr Val Gln Val Asn Ile Thr Gly Lys Ala Ser His Ala Gly Ala
450 455 460
Ala Pro Glu Leu Gly Val Asn Ala Leu Val Glu Ala Ser Asp Leu Val
465 470 475 480
Leu Arg Thr Met Asn Ile Asp Asp Lys Ala Lys Asn Leu Arg Phe Asn
485 490 495
Trp Thr Ile Ala Lys Ala Gly Asn Val Ser Asn Ile Ile Pro Ala Ser
500 505 510
Ala Thr Leu Asn Ala Asp Val Arg Tyr Ala Arg Asn Glu Asp Phe Asp
515 520 525
Ala Ala Met Lys Thr Leu Glu Glu Arg Ala Gln Gln Lys Lys Leu Pro
530 535 540
Glu Ala Asp Val Lys Val Ile Val Thr Arg Gly Arg Pro Ala Phe Asn
545 550 555 560
Ala Gly Glu Gly Gly Lys Lys Leu Val Asp Lys Ala Val Ala Tyr Tyr
565 570 575
Lys Glu Ala Gly Gly Thr Leu Gly Val Glu Glu Arg Thr Gly Gly Gly
580 585 590
Thr Asp Ala Ala Tyr Ala Ala Leu Ser Gly Lys Pro Val Ile Glu Ser
595 600 605
Leu Gly Leu Pro Gly Phe Gly Tyr His Ser Asp Lys Ala Glu Tyr Val
610 615 620
Asp Ile Ser Ala Ile Pro Arg Arg Leu Tyr Met Ala Ala Arg Leu Ile
625 630 635 640
Met Asp Leu Gly Ala Gly Lys
645
CLAIMS
1 A gene construct encoding a cell targeting moiety and a heterologous prodrug
activating enzyme for use as a medicament in a mammalian host wherein the gene construct is
capable of expressing the cell targeting moiety and enzyme as a conjugate within a target cell
in the mammalian host and wherein the conjugate is directed to leave the cell thereafter for
selective localisation at a cell surface antigen recognised by the cell targeting moiety.
2 A gene construct for use as a medicament according to claim 1 wherein the cell
targeting moiety is an antibody.
3 A gene construct for use as a medicament according to claim 2 wherein the antibody is
an anti-CEA antibody selected from antibody A5B7 or 806.077 antibody.
4 A gene construct for use as a medicament according to any preceding claim wherein
the heterologous enzyme is a carboxypeptidase.
5 A gene construct for use as a medicament according to claim 4 wherein the
carboxypeptidase is CPG2.
6 A gene construct for use as a medicament according to claim 5 wherein the CPG2 has
mutated polypeptide glycosylation sites so as to prevent or reduce glycosylation on expression
in mammalian cells.
7 A gene construct for use as a medicament according to any one of claims 5-6 in which
the antibody-enzyme CPG2 conjugate is a fusion protein in which the enzyme is fused to the
C terminus of the antibody through the heavy or light chain thereof whereby dimerisation of
the encoded conjugate when expressed can take place through a dimerisation domain on
CPG2.
8 A gene construct for use as a medicament according to claim 7 wherein the fusion
protein is formed through linking a C-terminus of an antibody Fab heavy chain to an N-
terminus of a CPG2 molecule to form a Fab-CPG2 whereby two Fab-CPG2 molecules when
expressed dimerise through CPG2 to form a (Fab-CPG2)2 conjugate.
9 A gene construct for use as a medicament according to claim 4 wherein the
carboxypeptidase is selected from [D253K]HCPB, [G251T,D253K]HCPB or
[A248S,G251T,D253K]HCPB.
10 A gene construct for use as a medicament according to any preceding claim
comprising a transcriptional regulatory sequence which comprises a promoter and a control
element which is a genetic switch to control expression of the gene construct.
11 A gene construct for use as a medicament according to claim 10 in which the
transcriptional regulatory sequence comprises a genetic switch control element regulated by
presence of tetracycline or ecdysone.
12 A gene construct for use as a medicament according to claim 10 or 11 wherein the
promoter is dependent on cell type and is selected from the following promoters:
carcinoembryonic antigen (CEA); alpha-foetoprotein (AFP); tyrosine hydroxylase; choline
acetyl transferase; neurone specific enolase; insulin; glial fibrillary acidic protein; HER-2/neu;
c-erbB2; and N-myc.
13 A gene construct for use as a medicament according to any preceding claim which is
packaged within an adenovirus for delivery to the mammalian host.
14 Use of a gene construct as defined in any one of claims 1-12 for manufacture of a
medicament for cancer therapy in a mammalian host.
15 A matched two component system designed for use in a mammalian host in which the
components comprise:
(i) a first component that is a gene construct as defined in any one of claims 1-13; and
(ii) a second component that is a prodrug which can be converted into a cytotoxic drug by the heterologous enzyme encoded by the first component.
16 A matched two component system according to claim 15 in which:
the first component comprises a gene encoding the heterologous enzyme CPG2; and the second component prodrug is selected from N-(4-[N,N-bis(2-iodoethyl)amino]-phenoxycarbonyl)-L-glutamic acid, N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic-gamma-(3,5-dicarboxy)anilide or N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid or a pharmaceutically acceptable salt thereof.
17 A method for the delivery of a cytotoxic drug to a site which comprises administering
to a host a first component that is a gene construct as defined in any one of claims 1-13;
followed by administration to the host of a second component that is a prodrug which can be
converted into a cytotoxic drug by the heterologous enzyme encoded by the first component.
18 A method according to claim 17 in which the first component comprises a gene
encoding the heterologous enzyme CPG2; and the second component prodrug is selected from
N-(4-[N,N-bis(2-iodoethyl)amino]phenoxycarbonyl)-L-glutamic acid,
N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic-gamma-(3,5-dicarboxy)anilide or
N-(4-[N,N-bis(2-chloroethyl)amino]-phenoxycarbonyl)-L-glutamic acid or a pharmaceutically acceptable salt thereof.
19. A gene construct encoding a cell substantially as hereinbefore described with reference to the foregoing examples and accompanying drawings.