
Expression And Secretion Of Recombinant Proteins In Pichia Pastoris

Abstract: The present invention relates to compositions and methods for enhanced secretion of recombinant proteins in yeast expression systems. Specifically, the invention provides engineered signal peptides, promoters, codon-optimized expression constructs, and recombinant vectors that collectively improve the secretion efficiency of heterologous proteins in budding yeast. Also disclosed are recombinant host cells comprising such constructs, along with fermentation processes optimized for high-yield expression and secretion. The invention enables improved production, folding, and recovery of biologically active recombinant proteins suitable for use in food, nutrition, pharmaceuticals, cosmetics, personal care, and industrial applications.


Patent Information

Application #
Filing Date
16 August 2024
Publication Number
36/2025
Publication Type
INA
Invention Field
BIOTECHNOLOGY
Status
Parent Application

Applicants

SARVATGC TECHNOLOGIES PVT. LTD.
MERA HOMES, H Block, Flat Number 301, Whitefield, Seegehalli, Kannanmangala, Bangalore South, Bangalore - 560067

Inventors

1. Alok Kumar Malaviya
C/O Sarvatgc Technologies Pvt. Ltd. Mera Homes, H Block, Flat Number 301, Whitefield, Seegehalli, Kannanmangala, Bangalore South, Bangalore - 560067
2. Akshay Mittal
C/O Sarvatgc Technologies Pvt. Ltd. Mera Homes, H Block, Flat Number 301, Whitefield, Seegehalli, Kannanmangala, Bangalore South, Bangalore - 560067

Specification

DESC:SEQUENCE LISTING
[1] A sequence listing in compliance with WIPO Standard ST.26 has been submitted separately in XML format (fileName="Pichia pastoris Sequences-11082025.xml" softwareName="WIPO Sequence" productionDate="2025-08-11") as part of this application. The content of the sequence listing is incorporated herein by reference in its entirety. In case of any inconsistency between the written specification and the sequence listing, the content of the XML file shall prevail with respect to nucleotide and amino acid sequence disclosures.
FIELD OF INVENTION
[2] The present invention relates to the field of recombinant protein production in yeast systems. More particularly, it relates to the construction and optimization of recombinant vectors for improved secretion of heterologous proteins, with the aim of enhancing expression efficiency and extracellular yield in the host.
BACKGROUND OF THE INVENTION
[3] Numerous proteins of importance in research, food, industry, and medicine—including enzymes, vaccines, hormones, and biopharmaceuticals—are produced using recombinant host cells. Among eukaryotic systems, budding yeasts are preferred due to rapid growth, cost-effective culturing, and the ability to perform post-translational modifications such as proteolytic processing, disulfide bond formation, phosphorylation, and glycosylation. Common yeast hosts include Saccharomyces cerevisiae, Pichia pastoris, Hansenula polymorpha, and Kluyveromyces lactis.
[4] Pichia pastoris (now Komagataella phaffii) is widely used for recombinant protein (RP) production, offering advantages such as high-density growth, scalable fermentation, low-cost media, limited secretion of host proteins, ease of genetic manipulation, and FDA GRAS status. Over the past three decades, P. pastoris has supported the production of thousands of valuable proteins for industrial, pharmaceutical, and nutritional applications (Cereghino et al., 2000; Du et al., 2011).
[5] Recombinant protein (RP) expression in Pichia pastoris typically utilizes codon-optimized genes cloned into expression vectors, which are then transformed into host strains such as X-33, GS115, BG10, or BG12. The expression vector includes key genetic elements such as a promoter, secretion signal peptide, coding sequence for the target protein, transcription terminator, selection marker, yeast-specific genomic integration locus, and bacterial elements. Among the commonly used promoters, AOX1p (a methanol-inducible promoter) and GAPp (a constitutive promoter) are widely employed. More recently, alternative endogenous promoters such as CAT1p, FLDp, DAS, and GCW14p, as well as orthologous promoters like MOXp and FMDp from Hansenula polymorpha, have shown comparable or superior expression performance relative to AOX1p (Dou et al., 2021; Vogl et al., 2020).
[6] In heterologous expression systems, recombinant proteins are frequently directed for secretion into the extracellular medium via a signal peptide fused to the N-terminus of the protein coding sequence. This secretion facilitates simplified downstream processing and reduced purification costs.
[7] In yeast and filamentous fungi, most signal peptides are short sequences of 15–30 amino acids, sufficient to direct nascent proteins to the endoplasmic reticulum (ER) and subsequent translocation. In contrast, the Saccharomyces cerevisiae α-mating factor (α-MF) signal peptide comprises two distinct regions: a short pre-region (19 amino acids) that directs the nascent protein to the ER, and a longer pro-region (~70 amino acids) that aids in proper folding and enhances secretion efficiency. Together, these domains have been shown to significantly improve the yield of secreted recombinant proteins. Although both native and α-MF signal peptides are commonly used in Pichia pastoris, secretion efficiency can vary and is often protein-specific. Consequently, several studies have investigated modified versions of α-MF and alternative signal peptides to improve secretion, with mixed results, as described by Kumar et al., 2022.
[8] Although promoter and signal peptide modifications have individually enhanced expression or secretion, their combinatorial optimization remains underexplored. Signal peptides are often evaluated only with AOX1p, and promoter studies rarely modify the signal peptide component. This underlines a key gap and opportunity for improved expression by systematically pairing strong promoters with optimized signal peptides in the same expression vector.
[9] However, high-level protein expression can overwhelm the ER folding machinery, leading to misfolding, ER stress, and reduced secretion. Improper folding or processing often triggers quality control pathways, lowering yield and functionality, as reported by Brodsky et al., 2011.
[10] Thus, optimizing both expression and secretion components is critical. Existing systems often lack integrated approaches to balance protein expression strength with proper protein folding and secretion efficiency.
[11] Therefore, there is a clear need for recombinant expression systems that:
• Enable high-yield production of heterologous proteins;
• Promote efficient secretion into the culture medium;
• Ensure correct folding and post-translational modifications;
• Minimize ER stress and protein aggregation.
[12] The present invention addresses these needs by providing engineered signal peptides, combinatorially optimized promoter–signal peptide combinations, and recombinant yeast strains tailored for high secretory yields of recombinant proteins in budding yeast.
BRIEF SUMMARY OF INVENTION
[13] The present disclosure provides engineered expression systems and methods for the high-yield secretory expression of heterologous proteins in budding yeast. Certain aspects of the present invention relate to recombinant vectors encoding engineered or chimeric secretion signal peptides operably linked to the N-terminal region of heterologous proteins. Other aspects pertain to combinatorially optimized expression cassettes comprising strong promoters and scarless vector designs for improved extracellular protein recovery.
[14] One aspect of the present invention relates to a nucleic acid construct encoding a fusion polypeptide comprising a secretion signal peptide and the heterologous protein.
[15] In another aspect of the present invention, the heterologous protein is a food-grade protein.
[16] One aspect of the present invention relates to the engineered α-MF secretion signal peptide comprising one or more pro-region modifications, including but not limited to a point mutation and/or deletion of the EAEA repeat sequence, a tetrapeptide.
[17] Another aspect of the present invention comprises one or more modifications in the pre-region, obtained by combining pre-region segments derived from signal peptides of different functional proteins, including but not limited to α-amylase, lactoferrin, β-lactoglobulin, lysozyme, serum albumin, EPX1, 0030, UTH1, or SCW10.
[18] Another aspect of the invention relates to an engineered α-MF secretion signal peptide comprising a modified pre-region and/or pro-region sequence, designed to enhance the secretion of heterologous proteins.
[19] Another aspect of the present invention provides an engineered signal peptide that improves ER targeting and translocation, enhances post-translational processing, and increases extracellular secretion efficiency of RPs.
[20] Another aspect of the present invention provides a nucleic acid construct comprising a strong promoter including but not limited to AOX1p, CAT1p, FLDp, GCW14p, FMDp, or MOXp.
[21] Another aspect of the present invention relates to a recombinant expression system comprising a host cell transformed with the engineered nucleic acid construct, wherein the host cell is a budding yeast, preferably Pichia pastoris (Komagataella phaffii), and more specifically a strain selected from wild-type or auxotrophic strains.
[22] Another aspect of the present invention relates to an engineered signal peptide that enhances secretion compared to the native α-MF signal peptide, achieving improved extracellular titers of RPs.
[23] Another aspect of the present invention provides a method for producing food-grade proteins, comprising culturing the recombinant yeast strain under expression conditions and recovering the protein from the culture medium using standard purification methods.
[24] Another aspect of the present invention relates to the recovery of the food-grade protein as a functional protein and its use in food, beverage, nutraceutical, or personal care applications.
[25] In another aspect of the present invention, the expression system described herein is adapted to express other food-grade proteins, including but not limited to thaumatin, monellin, curculin, mabinlin, brazzein, lactoferrin, casein, α-lactalbumin, or β-lactoglobulin.
BRIEF DESCRIPTION OF THE DRAWINGS
[26] The drawings described below are intended to provide a better understanding of the invention and illustrate exemplary embodiments. These figures are not necessarily drawn to scale and should not be interpreted as limiting the scope of the invention. Wherever possible, like reference numerals are used to refer to like elements throughout the drawings. The embodiments shown are for illustrative purposes only and are not intended to restrict the invention to the specific forms disclosed.
Figure 1: Expression Vector Map
[27] Figure 1 depicts a schematic representation of a recombinant expression vector. The construct includes a gene encoding a protein of interest, fused at its N-terminus to an engineered secretion signal peptide. The signal peptide comprises a functional variant of the Saccharomyces cerevisiae α-MF (pre-pro-α-MF), modified to enhance secretion efficiency in Pichia pastoris. Key features of the vector (the promoter, secretion signal, coding region for the protein of interest, terminator, selection marker, yeast target locus, and bacterial elements) are annotated.
Figure 2: SDS-PAGE Image – Comparison of Expression Levels in EC1 and EC2 Construct Clones
[28] Figure 2 shows an SDS-PAGE gel image of culture supernatants from different clones expressing Brazzein. Lane M contains molecular weight markers (PAGEmark Tricolor PLUS, G Biosciences, 786-854). ‘EC’ denotes expression cassettes as described in Table 6. Lanes 1–2 represent EC1 clones (pPICZα vector); lanes 3–4 represent EC2 clones (engineered vectors). Lane S is loaded with an in-house Brazzein standard. Expression was performed in parallel shake flask cultures under identical conditions, and equal volumes of supernatant were analyzed.
Figure 3: Bar Graph – Comparison of Expression Levels in EC1 and EC2 Construct Clones
[29] Figure 3 presents a bar graph comparing extracellular secretion levels of Brazzein expressed with EC1 (pPICZα vector) versus EC2 (engineered vector). Data represent densitometric quantification of protein bands from the SDS-PAGE gel in Figure 2, analyzed using Bio-Rad Image Lab software (Version 6.1.0). The expression level of EC1 is used as the baseline for calculating relative fold improvement.
Figure 4: SDS-PAGE Image – Comparison of Expression Levels among Multiple Engineered Construct Clones
[30] Figure 4 shows an SDS-PAGE gel image of culture supernatants from different clones expressing Brazzein. Lane M contains molecular weight markers (PAGEmark Tricolor PLUS, G Biosciences, 786-854). 'EC' denotes the expression cassette as described in Table 6. Lanes 1–3 represent EC3 clones; lanes 4–6 represent EC4 clones; lanes 7–8, 15–16, and 20–21 represent EC2 clones; lanes 9–11 represent EC5 clones; lanes 12–14 represent EC6 clones; lanes 17–19 represent EC7 clones. Lane S is loaded with an in-house Brazzein standard. Expression was performed in parallel shake flask cultures under identical conditions, and equal volumes of supernatant were analyzed.
Figure 5: Bar Graph – Comparison of Expression Levels among Multiple Engineered Construct Clones
[31] Figure 5 presents a bar graph comparing the extracellular expression levels of Brazzein expressed with multiple engineered expression constructs. Data reflect densitometric quantification of protein bands from SDS-PAGE gel (Figure 4), using Bio-Rad Image Lab software (Version 6.1.0). The expression level of EC2 is used as a baseline for calculating relative fold improvement.
Table 1: Details and Source of Signal Peptides
[32] Table 1 lists the origin and classification of signal peptides evaluated in the present invention. Signal peptides SP1 and SP2 correspond to the native and commercial vector sequences of the S. cerevisiae α-MF signal peptide, and SP3 represents the engineered sequence of the α-MF signal peptide. SP4 to SP8 are native signal peptides derived from heterologous proteins across various species, including mammalian, avian, and fungal origins. SP9 to SP13 represent chimeric signal peptides, each consisting of a heterologous pre-region (from SP4–SP8) fused to a modified α-MF pro-region (from SP3). The table provides the source or reference for each peptide sequence.
Table 2: Amino Acid Sequences of Signal Peptides
[33] Table 2 provides the amino acid sequences corresponding to the signal peptides listed in Table 1. SEQ ID NOs:1 and 2 correspond to the native and commercial vector sequences of the S. cerevisiae α-MF signal peptide, and SEQ ID NO:3 represents the engineered sequence of the α-MF signal peptide. SEQ ID NOs:4 to 8 present the amino acid sequences of native signal peptides from heterologous proteins such as β-lactoglobulin, serum albumin, lysozyme, and two secreted proteins from Pichia pastoris. SEQ ID NOs:9 to 13 contain chimeric peptide sequences, each combining a heterologous pre-region with the engineered α-MF pro-region. These sequences were designed to assess secretion performance in recombinant expression systems.
Table 3: Nucleotide Sequences of Signal Peptides
[34] Table 3 displays nucleotide sequences corresponding to the signal peptides in Table 2. SEQ ID NOs:14 and 15 correspond to native and commercial α-MF nucleotide sequences, while SEQ ID NO:16 represents a codon-optimized version of the engineered α-MF signal peptide. SEQ ID NOs:17 to 21 correspond to nucleotide sequences of heterologous peptides (SP4 to SP8), and SEQ ID NOs:22 to 26 represent the nucleotide sequences of chimeric peptides (SP9 to SP13), constructed by fusing the pre-region and engineered α-MF pro-region coding sequences.
Table 4: Details of Promoters Used for Expression
[35] Table 4 summarizes the promoters used for yeast-based recombinant protein expression. These include both inducible and constitutive promoters from Pichia pastoris and Hansenula polymorpha. Promoters such as AOX1 (Alcohol Oxidase 1), FLD (Formaldehyde Dehydrogenase), CAT1 (Catalase 1), and GCW14 (GPI-anchored cell wall protein) are derived from P. pastoris, while FMD (Formate Dehydrogenase) and MOX1 (Methanol Oxidase) originate from H. polymorpha. Promoters were selected based on regulatory behavior, inducibility, and compatibility with high-level expression.
Table 5: Sequence of Brazzein
[36] Table 5 presents the protein and codon-optimized nucleotide sequences of Brazzein, a natural sweet-tasting protein originally isolated from Pentadiplandra brazzeana. SEQ ID NO: 27 contains the mature amino acid sequence of Des-pyrE-Brazzein (53 amino acid form), and SEQ ID NO: 28 contains the codon-optimized version adapted for yeast expression. These sequences are compatible with the signal peptide and promoter systems disclosed herein.
Table 6: Details of Protein Expression Cassettes.
[37] Table 6 provides an overview of expression cassettes designed to assess the secretion and expression potential of Brazzein. Each cassette includes a promoter, signal peptide, protein of interest (Brazzein), and a terminator. Variations in promoter and signal peptide combinations were systematically tested to evaluate their impact on expression yield and secretion efficiency in yeast systems.
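The systematic pairing of promoters and signal peptides described above can be illustrated with a short Python sketch. The promoter names and the SP1 to SP13 labels are taken from this specification; the `Cassette` type, the placeholder terminator label, and the exhaustive enumeration are assumptions made only for illustration. The sketch does not reproduce the actual cassettes EC1 to EC7 of Table 6.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Cassette:
    """Hypothetical record for one promoter / signal peptide / protein / terminator design."""
    promoter: str
    signal_peptide: str
    protein: str
    terminator: str


# Promoter names as listed in the specification.
PROMOTERS = ["AOX1p", "CAT1p", "FLDp", "GCW14p", "FMDp", "MOXp"]
# Signal peptide labels SP1..SP13 as used in Tables 1 to 3.
SIGNAL_PEPTIDES = [f"SP{i}" for i in range(1, 14)]


def enumerate_cassettes(protein="Brazzein", terminator="terminator"):
    """Return every promoter x signal peptide pairing as a candidate design.

    The terminator label is a placeholder; the specification does not name one here.
    """
    return [
        Cassette(promoter, sp, protein, terminator)
        for promoter, sp in product(PROMOTERS, SIGNAL_PEPTIDES)
    ]


cassettes = enumerate_cassettes()
print(len(cassettes))  # 6 promoters * 13 signal peptides = 78
```

In practice only a subset of such pairings would be built and screened; the full enumeration simply makes the design space explicit.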
DESCRIPTION OF THE INVENTION
[38] The following detailed description broadly outlines the key features and technical merits of the present invention to facilitate a clearer understanding of the embodiments that follow. Further aspects, examples, and advantages will be described in subsequent sections, forming the core of the invention’s disclosure. It should be understood by those skilled in the art that various modifications, equivalent methods, and alternative implementations may be employed without departing from the scope and spirit of the invention. The present invention relates to recombinant expression systems optimized for the efficient secretory production of heterologous proteins in yeast or fungi, particularly in budding yeast. Specifically, the invention focuses on engineered secretion signal peptides derived from the pre-pro region of the Saccharomyces cerevisiae α-MF to increase the secretion of the protein of interest into the extracellular culture medium.
Definitions:
[39] For the purpose of this specification, the following terms shall have the meanings set forth below. These definitions are intended to clarify the scope and interpretation of the invention, and where any discrepancy exists between standard usage and the definitions herein, the definitions in this section shall prevail.
[40] The headings provided herein are not limitations of the various aspects or embodiments of the invention which can be had by reference to the specification as a whole. Accordingly, the terms defined immediately below are more fully defined by reference to the specification as a whole. Numeric ranges are inclusive of the numbers defining the range. Unless otherwise indicated, nucleic acids are written left to right in 5' to 3' orientation, and amino acid sequences are written left to right in amino to carboxy orientation.
[41] It should be noted that, as used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the content clearly dictates otherwise. Thus, for example, reference to a composition containing “a compound” includes a mixture of two or more compounds. It should also be noted that the term “or” is generally employed in its sense including “and/or” unless the content clearly dictates otherwise.
[42] The terms “peptides”, “proteins”, and “polypeptides” are used interchangeably herein.
[43] The term “Signal Peptide / Secretion Signal Peptide” as used herein includes a short amino acid sequence located at the N-terminus of a protein that directs the nascent peptide to the host’s secretory pathway, facilitating its translocation into the endoplasmic reticulum (ER) and eventual extracellular secretion.
[44] The term “native” as used herein refers to a biological sequence (e.g., DNA, RNA, peptide, or protein) or regulatory element that is naturally occurring and unmodified, derived from the host organism in which it is used. Native sequences are endogenously present in the genome or proteome of the host species.
[45] The term “heterologous signal peptide” as used herein refers to a signal peptide that is not naturally associated with the heterologous protein to which it is fused. The signal peptide may be derived from a different gene, a different organism, or may be synthetically engineered or adapted to enhance secretion efficiency in a given host cell. In some embodiments, the heterologous signal peptide is an engineered α-mating factor (α-MF) signal peptide optimized for expression and secretion in Pichia pastoris. In certain cases, the signal peptide may be native to the host organism, but not to the same gene or expression cassette, and is therefore considered heterologous in the context of the recombinant construct.
[46] The term “synthetic” as used herein refers to any biological sequence that has been artificially designed or constructed, either de novo or based on native or non-native templates. Synthetic sequences may include codon-optimized versions, chimeric constructs, mutated variants, or entirely novel sequences not found in nature. These are typically generated using in silico design tools, gene synthesis technologies, or molecular engineering techniques.
[47] The term “non-native” as used herein refers to any biological sequence—such as a nucleotide or amino acid sequence—that originates from an organism different from the host in which it is expressed or utilized. Such sequences may perform the same or different biological function in the host organism compared to their original source. A non-native sequence may encode the same protein as a native one but is derived from a different species or engineered source.
[48] The term “host” as used herein refers to any microorganism, including but not limited to bacteria, yeast, and filamentous fungi, that is capable of being genetically engineered and cultivated under laboratory or industrial fermentation conditions for the purpose of producing desired biomolecules.
[49] The term “recombinant” as used herein refers to a molecule, cell, or organism that has been genetically modified to contain nucleic acid sequences not naturally present in that context, or arranged in a manner not found in nature. This includes sequences introduced via artificial means such as cloning, vector assembly, or genome editing.
[50] The term “heterologous protein expression” as used herein refers to the production of a protein in a host organism that does not naturally produce that protein. The protein-encoding gene is introduced into the host through genetic engineering, and the host’s cellular machinery is utilized for transcription, translation, and processing.
[51] The term “heterologous protein” as used herein refers to a protein produced in a host organism that does not naturally produce that protein.
[52] The term “expression construct” as used herein refers to a synthetically assembled nucleic acid molecule designed to enable expression of a gene of interest. Such a construct typically includes regulatory elements such as promoters, signal peptides, coding sequences, and terminators, and may be maintained on a plasmid or integrated into the genome.
[53] The term “chromosomal integration” as used herein refers to the insertion of a foreign nucleic acid sequence into the genome of a host organism, such that it becomes part of the host’s chromosomal DNA and is stably inherited across generations.
[54] The term “codon optimization” as used herein refers to the alteration of a DNA sequence encoding a protein, without changing the amino acid sequence, to use codons that are more frequently used by the host organism, thereby improving translation efficiency.
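The definition above can be made concrete with a minimal Python sketch: each codon is replaced by a synonymous codon chosen for the host, and the encoded amino acid sequence is checked to be unchanged. The standard genetic code used for translation is well established; the `preferred` codon choices below are hypothetical placeholders and are not measured Pichia pastoris codon-usage values, and the function names are introduced only for this example.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def codon_optimize(dna, preferred):
    """Re-encode a coding sequence using one preferred codon per amino acid.

    `preferred` maps each amino acid to the codon assumed to be favoured by the host.
    """
    protein = translate(dna)
    optimized = "".join(preferred[aa] for aa in protein)
    # The defining property of codon optimization: the protein is unchanged.
    if translate(optimized) != protein:
        raise ValueError("preferred codons change the encoded protein")
    return optimized


# Hypothetical preferred codons (illustrative only, not real usage data).
preferred = {"M": "ATG", "K": "AAG", "E": "GAA", "A": "GCT"}
original = "ATGAAAGAGGCG"  # encodes M-K-E-A
optimized = codon_optimize(original, preferred)
print(optimized)  # ATGAAGGAAGCT
```

Real codon optimization typically also weighs codon frequency distributions, GC content, and mRNA secondary structure; this sketch covers only the synonymous-substitution step named in the definition.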
[55] The term “vector” as used herein refers to a nucleic acid molecule capable of transporting another nucleic acid molecule to which it has been linked.
[56] The term “vector” includes “plasmids”, which as used herein generally refers to circular double-stranded DNA loops into which additional DNA segments can be ligated, as well as linear double-stranded molecules, such as those resulting from amplification by the polymerase chain reaction (PCR) or from treatment of a plasmid with a restriction enzyme. Other non-limiting examples of vectors include bacteriophages, cosmids, bacterial artificial chromosomes (BAC), yeast artificial chromosomes (YAC), and viral vectors (i.e., complete or partial viral genomes into which additional DNA segments are ligated). Certain vectors are capable of autonomous replication in a recombinant host cell into which they are introduced (e.g., vectors having an origin of replication that functions in the cell). Other vectors upon introduction can be integrated into the genome of a recombinant host cell and are thereby replicated along with the cell genome.
[57] The term “protein folding” refers to the physical process by which a linear polypeptide folds into its characteristic and functional three-dimensional structure. A polypeptide exists in an unfolded state, or as a random coil, when first translated from an mRNA sequence into a linear chain of amino acids.
[58] The term "modification" refers to any alteration made to a native (wild-type) sequence, structure, or component, including but not limited to deletions, insertions, substitutions, truncations, fusions, or chemical derivatizations. Modifications may affect nucleic acid sequences, amino acid sequences, regulatory elements (e.g., promoters or signal peptides), vector backbones, or host cells. Modifications can be achieved through recombinant DNA technology, site-directed mutagenesis, synthetic gene design, or other molecular biology techniques, and may be intended to improve expression, stability, secretion, solubility, or biological activity of the recombinant product.
[59] The term "point mutation" refers to a change at a single nucleotide position within a nucleic acid sequence, resulting in the substitution of one base pair for another. This may lead to a silent (synonymous), missense (amino acid altering), or nonsense (stop codon introducing) mutation in the corresponding amino acid sequence. Point mutations can be introduced intentionally through mutagenesis or may occur naturally, and they are often employed to modify specific protein characteristics, such as secretion efficiency, enzymatic activity, or folding stability.
[60] The term “promoter” as used herein refers to a DNA sequence upstream of a coding region that initiates transcription of the associated gene by providing a binding site for RNA polymerase and associated transcription factors.
[61] The term “constitutive promoter” as used herein refers to a promoter that drives continuous gene expression regardless of external stimuli or environmental conditions.
[62] The term “inducible promoter” as used herein refers to a promoter that enables gene expression only in response to specific environmental cues, nutrients, or chemical inducers.
[63] The term “target loci” as used herein refers to specific, predefined regions in a host genome selected for genetic modification or integration of an expression construct due to their transcriptional activity and genomic stability.
[64] The term “signal recognition particle (SRP)” as used herein refers to a ribonucleoprotein complex in cells that recognizes signal peptides on nascent proteins and directs the ribosome to the endoplasmic reticulum for co-translational translocation.
[65] The term “fermentation” as used herein refers to the controlled cultivation of microbial cells under specific conditions in bioreactors or flasks for the purpose of producing recombinant proteins or other desired biomolecules.
[66] The term “GRAS” (Generally Recognized As Safe) as used herein refers to a designation by regulatory agencies, such as the U.S. FDA, indicating that a substance is considered safe for use in food based on a long history of common use or scientific evidence.
[67] The term “Brazzein” as used herein refers to a naturally occurring sweet-tasting protein originally isolated from the fruit of the African plant Pentadiplandra brazzeana, and characterized by its high sweetness intensity and heat stability.
[68] The term “sweet protein” as used herein refers to a class of proteins that elicit a sweet taste in humans, typically much sweeter than sucrose on a weight basis, and are used as natural non-caloric sweeteners.
[69] The term “food grade” as used herein refers to substances or processes that meet the safety, hygiene, and purity standards required for human consumption as defined by regulatory agencies.
[70] The term “purification” as used herein refers to the set of processes used to isolate a target protein or molecule from a mixture, such as a fermentation broth, to achieve the desired level of purity for its intended application.
[71] The term “lyophilization” as used herein refers to a freeze-drying technique used to preserve proteins or biologics by removing water under low temperature and pressure, resulting in a stable, dry powder.
[72] The term “expression cassette” as used herein refers to a modular DNA segment comprising regulatory elements (such as a promoter and a terminator), a signal peptide coding sequence, and a protein coding sequence, which is capable of driving expression of a gene in a host organism.
[75] The term “chimeric” as used herein refers to genetic chimerism, wherein a single organism is composed of cells with different genotypes. In the context of microbes, “chimeric” describes engineered combinations of genetic or protein components derived from different sources, created to improve function, expression, or overall utility in microbial systems.
[76] The term “ion-exchange chromatography” as used herein refers to a purification technique in which proteins are separated based on their net charge, using a charged resin that binds oppositely charged molecules under specific buffer conditions.
[77] The term “adapted” as used herein refers to a nucleic acid sequence or polypeptide—such as a signal peptide or pre-region—that has been modified, engineered, selected, or functionally optimized to perform a desired function more effectively in a specific host system or context. In particular, when referring to a heterologous pre-region, the term "adapted" indicates that the sequence has been configured or engineered to enhance the secretion of a heterologous protein into the extracellular culture medium of a host cell, such as Pichia pastoris.
[78] The term “enhances secretion” as used herein refers to an improvement in the quantity or efficiency of extracellular secretion of a heterologous protein by a host cell, relative to a reference signal peptide under comparable expression conditions. The enhancement may be: quantitative, such as at least a 1.5-fold, 2-fold, or greater increase in the amount of heterologous protein detected in the culture medium; qualitative, such as more complete cleavage of the signal peptide, reduction in intracellular retention or aggregation, or improved folding or solubility resulting in enhanced extracellular localization; or a combination thereof. The enhancement is typically measured using analytical techniques such as ELISA, SDS-PAGE, Western blotting, or protein activity assays, comparing the test construct to a control expressing the same heterologous protein with a standard or unmodified α-mating factor signal peptide.
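The quantitative criterion above amounts to a ratio of test signal to reference signal. A minimal Python sketch follows, assuming that both measurements come from the same assay under comparable conditions (for example, densitometric band intensities from equal supernatant volumes). The function names and the numeric values are hypothetical and are not data from this specification; the 1.5-fold default threshold is the lowest value named in the definition.

```python
def fold_enhancement(test_signal, reference_signal):
    """Return the test/reference ratio for two measurements from the same assay."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return test_signal / reference_signal


def enhances_secretion(test_signal, reference_signal, threshold=1.5):
    """Apply the quantitative criterion: at least `threshold`-fold over the reference."""
    return fold_enhancement(test_signal, reference_signal) >= threshold


# Hypothetical band intensities (arbitrary densitometry units).
reference = 1000.0  # control construct with the unmodified signal peptide
test = 2500.0       # construct with an engineered signal peptide

print(fold_enhancement(test, reference))    # 2.5
print(enhances_secretion(test, reference))  # True
```

The qualitative criteria in the definition (signal peptide cleavage, intracellular retention, folding) are not captured by this ratio and would need separate analytical evidence.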
[79] The term “scar-sequence-free” or “scarless secretion” as used herein refers to the secretion of a recombinant protein from a host cell such that the mature protein product lacks extraneous amino acid residues at its N-terminus, which may otherwise arise from incomplete proteolytic processing of the signal peptide, particularly the pro-region of the Saccharomyces cerevisiae a-mating factor (a-MF) signal peptide. In conventional a-MF signal peptides, incomplete cleavage by Ste13 dipeptidyl aminopeptidase may leave residual amino acid sequences such as the EAEA tetrapeptide at the N-terminus of the mature secreted protein. Such residual "scar" sequences can negatively impact the functionality, stability, safety, or regulatory acceptability of the expressed protein, especially in food, therapeutic, or diagnostic applications. In contrast, “scar-sequence-free” as used herein refers to engineered a-MF signal peptides that incorporate specific deletions—such as the removal of the EAEA spacer motif located downstream of the Kex2 protease cleavage site—thus allowing precise and complete proteolytic processing. This results in secretion of the intended mature protein without any non-native N-terminal residues. The absence of scar sequences may be confirmed by N-terminal sequencing, mass spectrometry, or comparative peptide mapping of the secreted protein product.
[80] The term “regulatory element” refers to any nucleotide sequence that modulates, enhances, suppresses, or otherwise regulates the expression of a nucleic acid sequence. Such elements may include, but are not limited to, promoters, enhancers, silencers, operator sequences, insulators, transcription terminators, polyadenylation signals, 5' and 3' untranslated regions (UTRs), and intronic regulatory motifs.
[81] In one embodiment, the invention relates to a recombinant expression system comprising an engineered vector designed to enhance the expression of a heterologous protein of interest in a host cell. The engineered vector comprises an engineered signal sequence configured to facilitate efficient secretion of the target protein, a strong promoter, a terminator, and a target genomic integration locus, as illustrated in Figure 1.
[82] In another embodiment, the invention features an engineered signal peptide, wherein the signal peptide is derived from the native α-MF signal peptide of Saccharomyces cerevisiae or from the variant used in the commercial vector pPICZα (Invitrogen). In certain embodiments, the signal peptide comprises modifications in the pro-region, the pre-region, or both, to enhance secretion efficiency and improve N-terminal processing of the expressed heterologous protein.
[83] Secretion of recombinant proteins into the extracellular medium offers significant advantages for downstream processing, eliminating the need for cell lysis and simplifying purification. In yeast or fungi, secretion is typically mediated by signal peptides that direct the nascent polypeptide into the ER for post-translational processing and trafficking. A widely used signal peptide for this purpose is the α-MF pre-pro sequence from S. cerevisiae.
[84] The α-MF signal peptide consists of two distinct regions:
• The pre-region, approximately 19 amino acids in length, functions as a classical ER-targeting signal. It is recognized by the signal recognition particle (SRP) and cleaved by signal peptidase as the polypeptide enters the ER lumen.
• The pro-region, approximately 70 amino acids in length, remains transiently in the ER and facilitates protein folding, trafficking, and packaging into secretory vesicles. This region is typically processed by the Kex2 and Ste13 proteases, as reported by Brake et al. (1984).
[85] Despite its widespread use in yeast and fungal systems, the performance of the α-MF signal peptide is protein-dependent and not optimal for all heterologous proteins, as also reported by Cereghino et al. (2000) and Dou et al. (2021). The following limitations have been identified:
1. Inefficient ER Translocation: The pre-region mediates post-translational translocation of the RP into the ER via the Sec pathway. However, if the RP folds prematurely in the cytoplasm, it may fail to enter the ER, resulting in cytoplasmic accumulation or even blockage of the translocon.
2. ER Stress and Protein Misfolding: While the pro-region facilitates proper folding and trafficking, its fusion to certain heterologous proteins can promote aggregation or misfolding within the ER, especially under over-expression conditions. This can activate the unfolded protein response (UPR) and lead to degradation of the RP via ER-associated degradation (ERAD) pathways.
3. Incomplete N-terminal Processing: The Ste13 protease trims dipeptides such as Glu-Ala (E-A) from the N-terminus of the mature protein. Inefficient cleavage may leave residual EAEA scar sequences, altering the N-terminus of the secreted RPs. This issue is particularly undesirable for food proteins, as the presence of extra amino acid residues can impact safety, functionality, and regulatory approval.
[86] To address these challenges, the present invention provides engineered α-MF secretion signal peptides with specific modifications in the pre-region, the pro-region, or both.
[87] Moreover, previous approaches have typically focused on either promoter engineering or signal peptide modifications in isolation. The synergistic impact of combining optimized promoters with engineered signal peptides remains underexplored. To address this gap, the present invention provides an optimal promoter to be used with the engineered signal peptides to drive and enhance RP expression and secretion by avoiding ER stress and entry into degradation pathways.
[88] In one embodiment, the expression system comprises engineered α-MF signal peptides represented by polypeptide sequences SEQ ID NOs: 1, 2, and 3, and their corresponding nucleotide sequences SEQ ID NOs: 14, 15, and 16. SEQ ID NO: 1 or 14 corresponds to the native α-MF leader sequence derived from Saccharomyces cerevisiae. SEQ ID NO: 2 or 15 represents the α-MF signal peptide variant utilized in the commercial expression vector pPICZα (Invitrogen, Thermo Fisher). SEQ ID NO: 3, or its codon-optimized nucleotide sequence SEQ ID NO: 16, comprises an engineered α-MF signal peptide with one or more modifications in the pro-region, designed to enhance secretion efficiency and improve N-terminal processing of the expressed heterologous protein, as described in Tables 2 and 3 of the present invention.
[89] In one embodiment, the engineered pro-region of the signal peptide incorporates one or more of the following modifications:
• A V50A substitution in the pro-region, previously reported to reduce endoplasmic reticulum (ER) stress and improve secretion of recombinant proteins (Ito et al., 2022);
• Deletion of the EAEA spacer motif, located after the Kex2 cleavage site, thereby enabling production of a scarless mature protein and improved proteolytic processing;
• Additional neutral point mutations, such as cloning-derived substitutions or silent restriction sites, which do not impair secretion efficiency.
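As a minimal sketch, the first two modifications listed above can be written as string operations on the amino acid sequence of the pPICZα leader (SEQ ID NO: 2, Table 2). The helper name `engineer_pro_region` and the use of Python are illustrative assumptions and not part of the disclosed method; the sketch only checks that a V50A substitution combined with removal of the C-terminal EAEA spacer reproduces SEQ ID NO: 3.

```python
# Amino acid sequences copied from Table 2 of this specification.
SEQ_ID_2 = "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKREAEA"
SEQ_ID_3 = "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR"


def engineer_pro_region(leader: str) -> str:
    """Hypothetical helper: apply V50A, then delete the EAEA spacer after Lys-Arg."""
    position = 50  # 1-based residue number, counted from the initiator Met
    if leader[position - 1] != "V":
        raise ValueError("expected Val at position 50")
    mutated = leader[: position - 1] + "A" + leader[position:]
    if not mutated.endswith("KREAEA"):
        raise ValueError("expected a Kex2 site (KR) followed by an EAEA spacer")
    return mutated[: -len("EAEA")]


engineered = engineer_pro_region(SEQ_ID_2)
print(engineered == SEQ_ID_3)
```

The position check and the `KREAEA` suffix check are deliberately strict, so the helper fails loudly if it is applied to a leader that does not have the expected layout.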
[90] In one embodiment, the invention provides engineered signal peptides comprising polypeptides selected from SEQ ID NOs: 1, 2, or 3, or polypeptides having at least 80%, 85%, 90%, 95%, or 99% sequence identity to any of SEQ ID NOs: 1–3. These polypeptides comprise the native α-MF signal peptide (SEQ ID NO: 1) and modified versions thereof, the latter incorporating specific changes such as the V50A substitution and the deletion of the EAEA motif (SEQ ID NO: 3). The signal peptides are summarized in Table 1, and their amino acid sequences are listed in Table 2.
[91] In another embodiment, the invention provides nucleic acid sequences encoding the aforementioned engineered signal peptides, selected from SEQ ID NOs: 14, 15, or 16, or nucleic acid sequences having at least 80%, 85%, 90%, 95%, or 99% sequence identity to any of SEQ ID NOs: 14–16. These nucleotide sequences correspond to the signal peptides described above and are similarly characterized by changes relative to the native α-MF signal peptide, as summarized in Table 1 and listed in Table 3.
[92] In one embodiment, the engineered signal peptide includes a restriction site strategically positioned within the pro-region of the α-MF sequence. The signal peptide comprises a polypeptide sequence selected from SEQ ID NO: 2 or 3, and/or a corresponding nucleotide sequence selected from SEQ ID NO: 15 or 16.
[93] In a related embodiment, the signal peptide incorporates a deletion of the EAEA spacer sequence within the pro-region of the α-MF to prevent the inclusion of non-native scar residues at the N-terminus of the expressed recombinant protein. This deletion is reflected in the polypeptide sequence defined by SEQ ID NO: 3 and/or its encoding nucleotide sequence, SEQ ID NO: 16.
[94] In another embodiment, the engineered signal peptide exhibits at least 80%, 85%, 90%, 95%, or 99% sequence identity to SEQ ID NO: 3 at the polypeptide level, and to SEQ ID NO: 16 at the nucleotide level.
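The specification does not state how percent sequence identity is to be computed, so the following is only a sketch under stated assumptions: a Needleman-Wunsch global alignment with unit scores (match +1, mismatch -1, gap -1), with identity defined as identical columns divided by alignment length. The function names and scoring values are hypothetical choices for illustration; a different alignment tool or identity definition may give a different figure.

```python
# Amino acid sequences copied from Table 2 of this specification.
SEQ_ID_2 = "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKREAEA"
SEQ_ID_3 = "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR"


def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment with assumed unit scores."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom))


def percent_identity(a, b):
    """Identical alignment columns as a percentage of all alignment columns."""
    top, bottom = global_align(a, b)
    identical = sum(x == y for x, y in zip(top, bottom))
    return 100.0 * identical / len(top)


identity_2_vs_3 = percent_identity(SEQ_ID_2, SEQ_ID_3)
print(f"SEQ ID NO: 2 vs SEQ ID NO: 3: {identity_2_vs_3:.1f}% identity")
```

Under these assumed parameters, the two leaders differ by one substitution (V50A) and four gap columns (the deleted EAEA spacer), so the comparison counts 84 identical positions over 89 aligned columns.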
[95] In one embodiment, comparative analysis demonstrated that constructs incorporating the engineered signal peptide sequences SEQ ID NO: 3 or SEQ ID NO: 16 led to significantly enhanced secretion of the target protein compared to those containing the commercial sequences SEQ ID NO: 2 or SEQ ID NO: 15. This improvement in secretion efficiency is illustrated in Figure 2 (SDS-PAGE analysis) and Figure 3 (quantitative bar graph), highlighting the superior performance of the engineered constructs in promoting extracellular protein secretion, likely due to improved ER folding and efficient vesicle trafficking.
[96] In certain cases, such as with proteins capable of folding in the cytosol, high-level expression of RP can hinder the post-translational ER translocation mediated by the pre-peptide of α-MF. Employing hybrid signal peptides that promote co-translational translocation can improve ER entry of the nascent RP, thereby enhancing secretion efficiency (Barrero et al., 2018). However, combinatorial approaches to relieve bottlenecks at each step of the secretion pathway, including translocation, folding, vesicular trafficking, and maturation processing of RP, remain underexplored. Such integrated strategies are essential to exploit the full potential of the host organism.
[97] To address this limitation, additional embodiments incorporate chimeric signal peptides that combine the engineered α-MF pro-region (including V50A and/or ΔEAEA modifications) with pre-region sequences sourced from yeast, fungal, mammalian, or synthetic origins. These chimeric peptides were designed to improve ER translocation, folding kinetics, trafficking efficiency, and compatibility with diverse target proteins.
These include:
• Heterologous pre-regions from signal peptides such as those of β-lactoglobulin (Bos taurus), lysozyme (Gallus gallus), serum albumin (Homo sapiens), EPX1 (P. pastoris), or 0030 (P. pastoris);
• A pro-region derived from the engineered α-MF signal peptide incorporating the V50A mutation and EAEA deletion, as represented by the polypeptide sequence defined in SEQ ID NO: 3 and/or the corresponding encoding nucleotide sequence defined in SEQ ID NO: 16.
[98] In another aspect of the present invention, the engineered signal peptide is a chimeric signal peptide, wherein the pre-region is derived from the signal peptide of a functional protein such as α-amylase, lactoferrin, β-lactoglobulin, lysozyme, EPX1, 0030, serum albumin, UTH1, or SCW10, and the pro-region is derived or modified from the S. cerevisiae α-MF.
[99] In one embodiment, the present invention provides a chimeric signal peptide in which both the pre-region and pro-region are engineered. Specifically, the pre-region of the α-MF signal peptide is substituted with pre-regions derived from native or heterologous signal peptides originating from various species, including yeast, fungi, mammals, or synthetic hybrids.
[100] In one embodiment, the chimeric signal peptides of the present invention enhance ER targeting and improve the secretion of the heterologous protein.
[101] In one embodiment, the pre-region used to construct a chimeric signal peptide comprises a polypeptide sequence selected from SEQ ID NOs: 4, 5, 6, 7, or 8.
[102] In another embodiment, the invention provides the corresponding nucleic acid sequences encoding the pre-region of the chimeric signal peptide, selected from SEQ ID NOs: 17, 18, 19, 20, or 21.
[103] In one embodiment, the engineered signal peptide used in the expression construct comprises a polypeptide sequence selected from SEQ ID NOs: 9, 10, 11, 12, or 13, and is operably fused to the N-terminus of a codon-optimized sequence encoding a heterologous protein, such as a food-grade protein.
[104] In another embodiment, the nucleotide sequence that encodes the signal peptide is selected from SEQ ID NOs: 22, 23, 24, 25, or 26, or a sequence having at least 80%, 85%, 90%, 95%, or 99% identity thereto, and is operably fused to a codon-optimized nucleotide sequence encoding a heterologous protein, such as a food-grade protein.
[105] Table 1 summarizes the signal sequences, both native and non-native, that fall within the scope of the present invention.
[106] Table-1: The sequences used in the above embodiments are summarized below.
Signal peptide | Description | Source | Reference
SP1 | α-MF native | Saccharomyces cerevisiae | GenBank: ATI15595.1
SP2 | α-MF pPICZα vector | Commercial vector | Invitrogen
SP3 | α-MF pro region modified | Engineered sequence | Present invention
SP4 | β-lactoglobulin | Bos taurus | UniProt: P02754
SP5 | Serum albumin | Homo sapiens | UniProt: P02768
SP6 | Lysozyme | Gallus gallus | UniProt: P27042
SP7 | EPX1 (Extracellular protein X1) | Pichia pastoris | GenBank: CAH2450272.1
SP8 | 0030 (PAS_chr3_0030) | Pichia pastoris | GenBank: CAH2450323.1
SP9 | pre SP4 + pro SP3 | Engineered sequence | Present invention
SP10 | pre SP5 + pro SP3 | Engineered sequence | Present invention
SP11 | pre SP6 + pro SP3 | Engineered sequence | Present invention
SP12 | pre SP7 + pro SP3 | Engineered sequence | Present invention
SP13 | pre SP8 + pro SP3 | Engineered sequence | Present invention

Signal sequence list and details
[107] Table 2: Amino acid sequence of signal peptides.
SEQ ID No: Signal peptide Sequence
01 SP1 MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYLDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLDKREAEA
02 SP2 MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKREAEA
03 SP3 MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR
04 SP4 MKCLLLALALTCGAQA
05 SP5 MKWVTFISLLFLFSSAYS
06 SP6 MLGKNDPMCLVLVLLGLTALLGICQG
07 SP7 MKLSTNLILAIAAASAVVSA
08 SP8 MKFAISTLLIILQAAAVFA
09 SP9 MKCLLLALALTCGAQAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR
10 SP10 MKWVTFISLLFLFSSAYSAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR
11 SP11 MLGKNDPMCLVLVLLGLTALLGICQGAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR
12 SP12 MKLSTNLILAIAAASAVVSAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR
13 SP13 MKFAISTLLIILQAAAVFAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR
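The chimeric entries SP9 to SP13 are described in Table 1 as a heterologous pre-region joined to the pro-region of SP3. A minimal sketch of that assembly is given below, assuming the α-MF pre-region is exactly the first 19 residues of SEQ ID NO: 3 (the description gives the length as approximately 19 amino acids). The dictionary and function names are hypothetical and used only for illustration.

```python
# Pre-regions (SEQ ID NOs: 4-8) and the engineered leader (SEQ ID NO: 3), from Table 2.
PRE_REGIONS = {
    "SP4": "MKCLLLALALTCGAQA",
    "SP5": "MKWVTFISLLFLFSSAYS",
    "SP6": "MLGKNDPMCLVLVLLGLTALLGICQG",
    "SP7": "MKLSTNLILAIAAASAVVSA",
    "SP8": "MKFAISTLLIILQAAAVFA",
}
SP3 = "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR"

ALPHA_MF_PRE_LENGTH = 19  # assumption: pre-region is exactly residues 1-19
PRO_SP3 = SP3[ALPHA_MF_PRE_LENGTH:]  # engineered pro-region (V50A, no EAEA)

# Pairings taken from Table 1 (chimeric peptide -> source of the pre-region).
CHIMERA_PARENTS = {"SP9": "SP4", "SP10": "SP5", "SP11": "SP6", "SP12": "SP7", "SP13": "SP8"}


def build_chimera(pre_name: str) -> str:
    """Join a heterologous pre-region to the engineered alpha-MF pro-region."""
    return PRE_REGIONS[pre_name] + PRO_SP3


chimeras = {name: build_chimera(pre) for name, pre in CHIMERA_PARENTS.items()}
for name, sequence in chimeras.items():
    print(name, len(sequence), sequence)
```

Comparing the printed sequences with the SP9 to SP13 rows of Table 2 is a quick consistency check on the sequence listing.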
[108] Table 3: Nucleotide sequence of signal peptides.
SEQ ID No: Signal peptide Sequence
14 SP1 atgagatttccttcaatttttactgcagttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatgaaaccgcacaaattccggctgaagctgtcatcggttacttggatttagaaggggatttcgatgttgctgttttaccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctttggataaaagagaggctgaagct
15 SP2 atgagatttccttcaatttttactgctgttttattcgcagcatcctccgcattagctgctccagtcaacactacaacagaagatgaaacggcacaaattccggctgaagctgtcatcggttactcagatttagaaggggatttcgatgttgctgttttgccattttccaacagcacaaataacgggttattgtttataaatactactattgccagcattgctgctaaagaagaaggggtatctctcgagaaaagagaggctgaagct
16 SP3 (Codon optimised) atgcgcttcccgtccatcttcacagctgtactgtttgcggcctcttcggcattagctgcgcccgttaatacgactaccgaagatgagacagcacaaattcctgctgaagcggtcatcggctattctgatttagagggtgacttcgacgcagccgtacttccattttctaacagtacgaacaatggactgctctttataaatactaccattgcctcgatagcagctaaagaggaaggggtgagcttggaaaagaga
17 SP4 (Codon optimised) atgaaatgcctattactcgccttggctcttacctgtggcgcacaagcg
18 SP5 (Codon optimised) atgaaatgggtcactttcatctctctcctttttctattttcctcggcgtatagt
19 SP6 (Codon optimised) atgttgggtaaaaatgatcctatgtgtctggttttagtgctacttggcctcacggccctgctcgggatttgccaagga
20 SP7 (Codon optimised) atgaagctctccaccaatttgattctagctattgcagcagcttccgccgttgtctcagct
21 SP8 (Codon optimised) atgaagttcgcaatttcaacacttcttattatcctacaggctgccgctgtttttgct
22 SP9 (Codon optimised) atgaaatgcctattactcgccttggctcttacctgtggcgcacaagcggctcctgtcaacaccaccactgaagacgaaaccgctcaaattccagctgaagcagttattggttattctgatttggaaggtgacttcgatgctgcagttttgccattttctaattcaactaacaatggtttgttgtttattaatactacaattgcttctattgctgcaaaagaagaaggtgtttctctcgagaagaga
23 SP10 (Codon optimised) atgaaatgggtcactttcatctctctcctttttctattttcctcggcgtatagtgctcctgtcaacaccaccactgaagacgaaaccgctcaaattccagctgaagcagttattggttattctgatttggaaggtgacttcgatgctgcagttttgccattttctaattcaactaacaatggtttgttgtttattaatactacaattgcttctattgctgcaaaagaagaaggtgtttctctcgagaagaga
24 SP11 (Codon optimised) atgttgggtaaaaatgatcctatgtgtctggttttagtgctacttggcctcacggccctgctcgggatttgccaaggagctcctgtcaacaccaccactgaagacgaaaccgctcaaattccagctgaagcagttattggttattctgatttggaaggtgacttcgatgctgcagttttgccattttctaattcaactaacaatggtttgttgtttattaatactacaattgcttctattgctgcaaaagaagaaggtgtttctctcgagaagaga
25 SP12 (Codon optimised) atgaagctctccaccaatttgattctagctattgcagcagcttccgccgttgtctcagctgctcctgtcaacaccaccactgaagacgaaaccgctcaaattccagctgaagcagttattggttattctgatttggaaggtgacttcgatgctgcagttttgccattttctaattcaactaacaatggtttgttgtttattaatactacaattgcttctattgctgcaaaagaagaaggtgtttctctcgagaagaga
26 SP13 (Codon optimised) atgaagttcgcaatttcaacacttcttattatcctacaggctgccgctgtttttgctgctcctgtcaacaccaccactgaagacgaaaccgctcaaattccagctgaagcagttattggttattctgatttggaaggtgacttcgatgctgcagttttgccattttctaattcaactaacaatggtttgttgtttattaatactacaattgcttctattgctgcaaaagaagaaggtgtttctctcgagaagaga
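The correspondence between the nucleotide sequences of Table 3 and the amino acid sequences of Table 2 can be checked by in-silico translation. The sketch below does this for the five codon-optimised pre-regions (SEQ ID NOs: 17 to 21), assuming the standard genetic code; the `translate` helper and the compact codon-table construction are illustrative choices, not part of the disclosure.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order (assumed applicable here).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna: str) -> str:
    """Translate an in-frame coding sequence into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (nucleotide sequence from Table 3, expected amino acid sequence from Table 2)
PRE_REGION_PAIRS = {
    "SP4": ("atgaaatgcctattactcgccttggctcttacctgtggcgcacaagcg", "MKCLLLALALTCGAQA"),
    "SP5": ("atgaaatgggtcactttcatctctctcctttttctattttcctcggcgtatagt", "MKWVTFISLLFLFSSAYS"),
    "SP6": ("atgttgggtaaaaatgatcctatgtgtctggttttagtgctacttggcctcacggccctgctcgggatttgccaagga",
            "MLGKNDPMCLVLVLLGLTALLGICQG"),
    "SP7": ("atgaagctctccaccaatttgattctagctattgcagcagcttccgccgttgtctcagct", "MKLSTNLILAIAAASAVVSA"),
    "SP8": ("atgaagttcgcaatttcaacacttcttattatcctacaggctgccgctgtttttgct", "MKFAISTLLIILQAAAVFA"),
}

translation_matches = {
    name: translate(dna) == protein for name, (dna, protein) in PRE_REGION_PAIRS.items()
}
print(translation_matches)
```

The same helper can be applied to the longer entries (SEQ ID NOs: 14 to 16 and 22 to 26) by pasting the corresponding rows of Table 3.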
[109] In one aspect of the present invention, the expression construct comprises a strong promoter selected from, but not limited to, AOX1p, CAT1p, FLDp, GCW14p, FMDp, or MOXp. These promoters have demonstrated robust transcriptional activity and are suitable for high-level expression of heterologous proteins in Pichia pastoris.
[110] In some embodiments, the promoter sequence is derived from endogenous genes of P. pastoris, including both constitutive and methanol-inducible promoters. In other embodiments, the promoter is sourced from homologous or orthologous organisms, including other eukaryotic species. For example, orthologous promoters such as MOXp and FMDp from Hansenula polymorpha offer regulatory compatibility with P. pastoris and can be used to tailor gene expression to specific fermentation conditions or host strains. This promoter flexibility allows for the customization of the expression system based on desired protein yield, induction strategy, and process scalability.
[111] In one embodiment, the promoter is defined as either a methanol-inducible promoter (e.g., AOX1p, CAT1p, FLDp, FMDp, or MOXp) or a constitutive promoter (e.g., GCW14p, GAPp), providing options for regulated or continuous expression, respectively.
[112] In a preferred embodiment, the promoter is selected from the group consisting of AOX1p, CAT1p, FLDp, GCW14p, FMDp, and MOXp. These promoters provide viable alternatives to conventional AOX1p systems, enabling strain-specific optimization and broader application.
[113] In one embodiment, the expression construct includes the selected promoter in combination with a chimeric α-MF signal peptide, represented in SEQ ID NOs: 9, 10, 11, 12, or 13. These engineered signal peptides are designed to enhance ER translocation efficiency, reduce retention, and improve secretion of the target protein.
[114] In another embodiment, the expression system employs a dual-optimization strategy wherein a strong promoter (e.g., AOX1p, CAT1p, FLDp, GCW14p, FMDp, or MOXp), as listed in Table 4, is integrated with the engineered secretion signal peptide within the same expression cassette. This coordinated approach enhances transcriptional strength and secretory efficiency, and maximizes extracellular protein yield.
[115] In one embodiment, the recombinant expression construct comprises a promoter sequence that is selected from endogenous (native to Pichia pastoris), heterologous, or orthologous sources, providing flexibility in tailoring transcriptional regulation to the expression system. Native promoters include those naturally occurring in P. pastoris, while heterologous promoters are derived from other yeast species. Orthologous promoters, such as MOXp or FMDp from Hansenula polymorpha, offer regulatory profiles compatible with P. pastoris. This promoter diversity enables the construct to be optimized for different host strains, expression levels, and fermentation conditions, supporting improved productivity and process adaptability.
[116] In one embodiment, the promoter is further defined as either an inducible promoter, such as AOX1p, MOXp, or FMDp, or a constitutive promoter, such as GCW14p or GAPp, depending on the desired level and timing of gene expression. This choice enables tight control of protein production or continuous high-yield expression without the need for inducers.
[117] In one embodiment, the promoter sequence is specifically selected from the group consisting of AOX1p, CAT1p, FLDp, GCW14p, FMDp, or MOXp, which have been demonstrated to support high expression of heterologous proteins in P. pastoris and related yeast species. These promoters provide alternatives to traditional AOX1p-based systems and enable strain-specific optimization.
[118] In one embodiment, the expression system comprises the combinatorial use of a selected strong promoter (e.g., AOX1p, CAT1p, FLDp, GCW14p, FMDp, or MOXp), as listed in Table 4, with the engineered secretion signal peptide of the invention, integrated within the same expression cassette. This dual optimization of transcriptional and secretory control elements enables high-level expression with efficient secretion, minimizing ER stress and maximizing extracellular protein yield in the host cell.
Table 4 lists the promoters encompassed by the invention.
[119] Table-4: Details of promoters
Description | Source
AOX1 (Alcohol Oxidase 1) | Pichia pastoris
FLD (Formaldehyde dehydrogenase) | Pichia pastoris
CAT1 (Catalase 1) | Pichia pastoris
GCW14 (Potential glycosyl phosphatidylinositol (GPI)-anchored protein) | Pichia pastoris
FMD (Formate dehydrogenase) | Hansenula polymorpha
MOX1 (Methanol oxidase) | Hansenula polymorpha
[120] In one aspect of the present invention, the expression system comprises a combination of a strong promoter and an engineered or chimeric signal peptide, as described herein, for the high-yield expression and secretion of a heterologous protein in a host cell system. The heterologous protein may be selected from a wide range of functionally and commercially relevant protein classes, including but not limited to a food-grade protein, an enzyme, a peptide, an antibody or antigen-binding fragment thereof, a protein antibiotic, a fusion protein, a vaccine or a vaccine-like protein or particle, a growth factor, a hormone, or a cytokine.
[121] In certain embodiments, the heterologous protein is a food-grade protein, which may include sweet or non-sweet proteins suitable for direct consumption, food enhancement, or formulation. Non-limiting examples include milk and dairy proteins: casein, β-lactoglobulin, α-lactalbumin, immunoglobulins, lactoferrin, lysozyme, and osteopontin.
[122] Further non-limiting examples include sweet proteins: brazzein, thaumatin, monellin, curculin, mabinlin, miraculin; meat proteins: myoglobin, collagen; and egg proteins: ovotransferrin, ovomucoid, avidin. These proteins may be incorporated into functional foods, dairy alternatives, high-protein beverages, or sugar-reduced food formulations.
[123] In other embodiments, the heterologous protein is a therapeutic protein intended for pharmaceutical or clinical applications. Examples include: Monoclonal antibodies and fragments (Fab, scFv, nanobodies); Hormones: insulin, erythropoietin, human growth hormone; Cytokines and interferons: IL-2, IFN-α; Vaccine antigens: subunit proteins, VLPs; Enzymes for enzyme replacement therapy: glucocerebrosidase, asparaginase; Fusion proteins with therapeutic activity or extended half-life.
[124] In some embodiments, the heterologous protein is used for diagnostic, analytical, or research applications, including: Reporter proteins: GFP, luciferase, SEAP; Molecular biology enzymes: Taq polymerase, restriction enzymes, ligases; Affinity tags and binding proteins: His-tag, FLAG, Protein A/G; Assay components for ELISA, biosensors, or immunoassays.
[125] In other embodiments, the heterologous protein is an industrial enzyme used in manufacturing, processing, or environmental applications. Examples include: Cellulases and xylanases for biofuel or textile processing; Lipases for detergents and oleochemical production; Proteases for food and feed industries; Phytases for livestock feed; Laccases for bioremediation, wine clarification, or dye processing.
[126] In some embodiments, the heterologous protein is a structural, self-assembling, or adhesive protein for use in biomaterials, tissue engineering, or sustainable materials. Examples include: Collagen, gelatin, and elastin-like proteins (ELPs); Silk fibroin for biofilms and fibers; Mussel foot proteins (MFPs) for underwater adhesives; Resilin for flexible biomaterials.
[127] In further embodiments, the heterologous protein is used for animal or agricultural applications, such as: Antimicrobial peptides (AMPs) for disease control; Insecticidal proteins: Bt toxins, proteinase inhibitors; Plant growth regulators or resistance enhancers; Veterinary vaccines or immune-boosting proteins.
Edible and Nutritional Applications
[128] In one or more embodiments, the heterologous protein is suitable for use in a wide variety of edible and food-related applications, including but not limited to: Beverages: dairy analogues, protein shakes, sweetened drinks, juices; Confectionery products: chocolates, candies, jellies, chewing gum; Frozen desserts; Bakery items: low-sugar cookies, cakes, pastries; Dairy and plant-based alternatives: yogurt, cheese analogues, milks; Nutraceuticals and functional foods: fortified supplements, high-protein snacks; Sweetened food items that require natural sugar substitutes, where sweet proteins such as brazzein and thaumatin provide sweetness without calories or glycemic impact; Pet foods and treats with functional or high-value protein additives.
[129] In additional embodiments, the expressed heterologous proteins are suitable for use in: Pharmaceuticals and biotherapeutics; Personal care products: cosmeceuticals, anti-aging serums, protein-based moisturizers; Cosmetics: protein additives in creams, masks, or makeup; Industrial and environmental applications: enzyme cleaners, biofuels, or green chemistry reactions.
[130] In one aspect of the present invention, the expression system comprises a combination of a strong promoter and an engineered or chimeric signal peptide, as described in the present invention, to express a heterologous protein. The heterologous protein is a food-grade protein, which may be a sweet or non-sweet protein.
[131] In a further embodiment, the heterologous protein is selected from the group consisting of milk proteins, such as casein, whey protein, immunoglobulin, lactoferrin, lysozyme, and osteopontin; sweet proteins, such as brazzein, thaumatin, monellin, curculin, mabinlin, and miraculin; meat proteins, such as collagen and myoglobin; and egg proteins, such as ovotransferrin and ovomucoid.
[132] In a preferred embodiment, the sweet protein is Brazzein, selected for its favorable molecular characteristics, including small size, high sweetness potency (approximately 500–2000 times that of sucrose), excellent water solubility, and notable thermal and pH stability. Brazzein is composed of 54 amino acids and exists naturally in two isoforms: Type I (pyrE-Brazzein), which retains the N-terminal pyroglutamate residue, and Type II (des-pyrE-Brazzein), which lacks this residue and comprises 53 amino acids. These isoforms are found in the ripe fruit of Pentadiplandra brazzeana in an approximate ratio of 80% Type I to 20% Type II. In the preferred embodiment, Type II (des-pyrE-Brazzein) is favored due to its higher sweetness intensity.

[133] Table 5: Sequence of Brazzein.
SEQ ID No: Description Sequence
27 Brazzein (Type II, amino acid sequence) DKCKKVYENYPVSKCQLANQCNYDCKLDKHARSGECFYDEKRNLQCICDYCEY
28 Brazzein (Type II, codon optimised nucleotide sequence) GACAAGTGCAAGAAAGTTTATGAGAACTATCCAGTGTCGAAATGCCAATTAGCAAACCAGTGTAATTACGACTGCAAACTCGATAAGCACGCCCGATCTGGTGAGTGCTTCTACGATGAAAAGCGCAATCTGCAGTGTATTTGTGATTATTGTGAATAC
[134] The sweet taste of Brazzein is attributed to its compact tertiary structure, stabilized by four intramolecular disulfide bonds that leave no free thiols. These disulfide linkages are essential for maintaining the protein’s conformational stability and biological activity under a range of processing conditions. Functional analyses have identified critical amino acid residues, such as Asp29, Lys30, Arg33, Glu41, and Arg43, as key contributors to its sweetness perception. Given the well-documented health concerns associated with high sugar intake, including obesity, cardiovascular diseases, type 2 diabetes, and metabolic syndrome, Brazzein represents a promising natural alternative to conventional sweeteners, as reported by Bill Tawil et al. (2025). However, large-scale production from its native source is limited due to the restricted cultivation of Pentadiplandra brazzeana, which is endemic to specific tropical regions. To address this limitation, the present invention provides recombinant expression strategies employing microbial systems for the efficient and scalable biosynthesis of functional Brazzein.
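Two of the structural statements above can be cross-checked against SEQ ID NO: 27 with a short sketch. It assumes that the residue numbers quoted in the description follow the 54-residue Type I numbering, so that each position is shifted by one in the 53-residue Type II sequence; that offset is an inference made for this illustration and is not stated in the specification. Variable and function names are hypothetical.

```python
# Type II (des-pyrE) Brazzein, SEQ ID NO: 27 from Table 5.
BRAZZEIN_TYPE_II = "DKCKKVYENYPVSKCQLANQCNYDCKLDKHARSGECFYDEKRNLQCICDYCEY"

# Eight cysteines are the minimum needed to form four disulfide bonds with no free thiol.
cysteine_count = BRAZZEIN_TYPE_II.count("C")
max_disulfide_bonds = cysteine_count // 2

# Residues named in the description: position -> one-letter code.
KEY_RESIDUES = {29: "D", 30: "K", 33: "R", 41: "E", 43: "R"}
TYPE_I_OFFSET = 1  # assumption: Type II lacks the N-terminal pyroglutamate of Type I


def residue_at_type_i_position(position: int) -> str:
    """Look up a residue in SEQ ID NO: 27 using assumed Type I (54-residue) numbering."""
    return BRAZZEIN_TYPE_II[position - TYPE_I_OFFSET - 1]


numbering_consistent = all(
    residue_at_type_i_position(pos) == aa for pos, aa in KEY_RESIDUES.items()
)
print(cysteine_count, max_disulfide_bonds, numbering_consistent)
```

If the quoted positions were instead read directly against SEQ ID NO: 27 without the offset, the lookups would not return the named residues, which is why the numbering assumption is made explicit here.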
[135] In one embodiment, the engineered signal peptide selected from SEQ ID NOs: 9, 10, 11, 12, or 13 is fused to the N-terminus of type II Brazzein, which is encoded by the codon-optimized sequence defined in SEQ ID NO: 28.
[136] In another embodiment, the engineered signal peptide fused to the Brazzein coding sequence is placed under the control of a promoter selected from Table 4 (AOX1p, CAT1p, FLDp, GCW14p, FMDp, or MOXp).
[137] In a further embodiment, the engineered signal peptide is processed in the ER and/or Golgi apparatus to release mature Brazzein protein into the extracellular medium in its native type II form.
[138] In one embodiment, the recombinant Brazzein produced by the host strain is secreted into the culture medium in a properly folded and biologically active form, retaining its intense sweetness. The recombinant Brazzein produced using the engineered signal peptide does not contain non-native N-terminal residues, due to precise cleavage of the signal peptide.
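The difference between a scarred and a scarless N-terminus can be illustrated with a deliberately simplified model of leader processing. In the sketch below, Kex2 is assumed to cut immediately after the last Lys-Arg pair of the leader, and Ste13 is assumed to either remove all Glu-Ala dipeptides or none; real processing is partial and protein-dependent. The function `secreted_product` and its `ste13_complete` flag are hypothetical. The search for the Kex2 site is restricted to the leader because SEQ ID NO: 27 itself contains a Lys-Arg pair.

```python
# Leaders from Table 2 and Type II Brazzein from Table 5.
SP2_LEADER = "MRFPSIFTAVLFAASSALAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDVAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKREAEA"
SP9_LEADER = "MKCLLLALALTCGAQAAPVNTTTEDETAQIPAEAVIGYSDLEGDFDAAVLPFSNSTNNGLLFINTTIASIAAKEEGVSLEKR"
BRAZZEIN_TYPE_II = "DKCKKVYENYPVSKCQLANQCNYDCKLDKHARSGECFYDEKRNLQCICDYCEY"


def secreted_product(leader: str, protein: str, ste13_complete: bool) -> str:
    """Toy model: Kex2 cuts after the leader's last Lys-Arg; Ste13 may trim Glu-Ala pairs."""
    kex2_cut = leader.rfind("KR") + 2
    if kex2_cut < 2:
        raise ValueError("leader has no Lys-Arg (KR) Kex2 site")
    spacer = leader[kex2_cut:]  # residues left on the protein after Kex2 cleavage
    if ste13_complete:
        while spacer.startswith("EA"):
            spacer = spacer[2:]
    return spacer + protein


scarred = secreted_product(SP2_LEADER, BRAZZEIN_TYPE_II, ste13_complete=False)
trimmed = secreted_product(SP2_LEADER, BRAZZEIN_TYPE_II, ste13_complete=True)
scarless = secreted_product(SP9_LEADER, BRAZZEIN_TYPE_II, ste13_complete=False)
print(scarred[:8], trimmed[:8], scarless[:8])
```

In this model the SP9 leader yields the native N-terminus regardless of Ste13 activity, whereas the SP2 leader depends on complete Ste13 trimming to avoid an EAEA extension.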
[139] In one embodiment, the recombinant Brazzein retains its heat stability, enabling use in food and beverage applications requiring high-temperature processing, including baking and cooking.
[140] In one embodiment, the recombinant Brazzein is non-caloric, making it suitable for formulation of food and beverages targeted toward individuals seeking low-calorie diets.
[141] In one embodiment, the recombinant Brazzein is suitable for diabetic consumption, as it has no effect on blood glucose levels and possesses a low glycemic index.
[142] In one embodiment, the recombinant expression system further comprises a transcription terminator sequence operably linked downstream of the coding sequence of the heterologous protein to ensure proper termination and stabilization of the transcribed mRNA. This facilitates efficient release of RNA polymerase and enhances transcript stability, thereby contributing to improved expression yields.
[143] In a preferred embodiment, the terminator used in the expression system is the AOX1 terminator derived from the Pichia pastoris alcohol oxidase 1 gene. The AOX1 terminator is one of the most commonly employed terminators in Pichia-based expression systems due to its high performance and increased mRNA stability under both inducible and constitutive conditions.
[144] In another embodiment, the terminator may be selected from a range of commonly employed yeast terminators, including but not limited to the CYC1 terminator, GAP terminator, TEF1 terminator, ADH1 terminator, TRP1 terminator, and/or FLD terminator. Alternatively, synthetic minimal terminators such as tCYC1-min or tADH1-short may be utilized. These engineered terminators are designed for reduced sequence length and minimal sequence redundancy, thereby improving construct design flexibility in recombinant systems.
[145] In a particularly preferred embodiment of the invention, the AOX1 terminator is employed to maintain compatibility with inducible expression systems driven by the AOX1 promoter, although it can also be effectively paired with alternative promoters. The selection of the terminator is based on promoter strength, construct stability, and the need to avoid homologous recombination or repetitive sequence elements.
[146] In one embodiment, the expression vector comprises a target locus. The target locus is selected from AOX1, AOX2, GAP, FLD, HIS, or other targeted genomic loci to facilitate stable integration into the genome of Pichia pastoris via homologous recombination.
[147] In one embodiment, the expression vector comprises a selection marker. The selection marker may be either an auxotrophic gene, such as HIS4 or URA3, or an antibiotic resistance gene that confers resistance to antibiotics such as Zeocin, G418 (Geneticin), Hygromycin B, or Nourseothricin, thereby allowing for selection of transformed cells.
[148] In one embodiment, the expression vector comprises a protein expression cassette (EC) containing the following elements in sequence: a promoter, a signal peptide, a protein of interest, and a terminator.
[149] In a specific embodiment, the present invention provides a series of expression cassettes designed to evaluate the enhanced secretion of heterologous proteins using engineered signal peptides. These signal peptides were developed through pre- and pro-region modifications as described herein.
[150] The constructed expression cassettes, designated EC1 to EC7, each incorporate one of the engineered signal peptides SP2, SP3, SP9, SP10, SP11, SP12, and SP13, respectively. The amino acid sequences of these signal peptides are listed in Table 2, and their corresponding nucleotide sequences are provided in Table 3.
[151] A comprehensive description of each expression cassette—including the associated engineered signal peptide, the heterologous protein sequence, the promoter, and the terminator used—is summarized in Table 6.
[152] Table 6: Details of protein expression cassette.

Expression cassette (EC) | Promoter | Signal peptide | Protein of interest | Terminator
EC1 | AOX1 | SP2 | Brazzein | AOX1
EC2 | FMD | SP3 | Brazzein | AOX1
EC3 | FMD | SP9 | Brazzein | AOX1
EC4 | FMD | SP10 | Brazzein | AOX1
EC5 | FMD | SP11 | Brazzein | AOX1
EC6 | FMD | SP12 | Brazzein | AOX1
EC7 | FMD | SP13 | Brazzein | AOX1
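By way of illustration only, the cassette layout of paragraph [148] and the rows of Table 6 may be captured in a small data structure. The following Python sketch is not part of the disclosed constructs; the class name `ExpressionCassette` and the helper `cassettes_with_promoter` are hypothetical names introduced here, and only the promoter, signal peptide, protein, and terminator labels are taken from Table 6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionCassette:
    """One row of Table 6 (hypothetical helper type, for illustration)."""
    name: str
    promoter: str
    signal_peptide: str
    protein: str
    terminator: str

    def elements(self):
        # Element order stated in paragraph [148]:
        # promoter, signal peptide, protein of interest, terminator.
        return [self.promoter, self.signal_peptide, self.protein, self.terminator]


# (cassette, promoter, signal peptide) transcribed from Table 6.
_TABLE_6_ROWS = [
    ("EC1", "AOX1", "SP2"),
    ("EC2", "FMD", "SP3"),
    ("EC3", "FMD", "SP9"),
    ("EC4", "FMD", "SP10"),
    ("EC5", "FMD", "SP11"),
    ("EC6", "FMD", "SP12"),
    ("EC7", "FMD", "SP13"),
]

# Every cassette in Table 6 encodes Brazzein and uses the AOX1 terminator.
CASSETTES = {
    name: ExpressionCassette(name, promoter, sp, "Brazzein", "AOX1")
    for name, promoter, sp in _TABLE_6_ROWS
}


def cassettes_with_promoter(promoter):
    """Return the names of all cassettes driven by the given promoter."""
    return sorted(n for n, ec in CASSETTES.items() if ec.promoter == promoter)
```

For example, `cassettes_with_promoter("FMD")` lists EC2 through EC7, which reflects that only EC1 is driven by the AOX1 promoter.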

[153] In one aspect of the present invention, the host cell is selected from budding yeasts or filamentous fungi, preferably from budding yeast species such as Pichia pastoris, Saccharomyces cerevisiae, Hansenula polymorpha (Ogataea polymorpha), Yarrowia lipolytica, Candida boidinii, and Kluyveromyces lactis.
[154] In a preferred embodiment, the host is a Pichia pastoris wild-type strain or an auxotrophic strain. The wild-type strains are selected from X33 or BG10, and the auxotrophic strains are selected from GS115 or BG12.
[155] In another aspect of the present invention, the host cell is a genetically modified Pichia pastoris strain comprising one or more inserted genes selected from folding chaperones, transcription factors, translation factors, secretion enhancers, protease-deficient alleles, or metabolic regulators to enhance secretory expression and protein stability, as described in Potvin et al., 2012.
[156] In one embodiment of the present invention, the engineered yeast cell expresses and secretes heterologous protein directly into the culture medium under methanol-inducible or constitutive conditions.
[157] The recombinant host cell may be a Pichia pastoris strain selected from Mut⁺ (methanol utilization positive), which enables strong induction under the AOX1 promoter; Mutˢ (methanol utilization slow), which offers reduced methanol consumption and oxygen demand; or Mut⁻ (methanol utilization negative), used in methanol-free expression systems driven by alternative promoters such as FMD, GAP, GCW14, or MOX.
[158] In another aspect of the present invention, a host cell, such as Pichia pastoris strain BG12, is transformed with an expression vector comprising one of the expression cassettes listed in Table 6, resulting in the secretion of the expressed Brazzein protein into the extracellular medium. Transformation of Pichia pastoris competent cells is performed by electroporation using linearized plasmid DNA containing the expression cassette.
[159] In one embodiment, transformants are selected either on minimal medium plates lacking essential amino acids (e.g., histidine or uracil) or on complex medium plates supplemented with antibiotics such as Zeocin, Hygromycin B, or Nourseothricin.
[160] In one embodiment, the genomic integration of the engineered Brazzein expression construct, including the signal peptide, is confirmed by polymerase chain reaction (PCR) using primers flanking the insertion site. The resulting PCR products are analyzed by agarose gel electrophoresis to verify integration at the correct genomic locus.
[161] In one embodiment, PCR-positive transformants or clones are cultured in shake flasks under methanol-inducible conditions to evaluate expression and extracellular secretion of Brazzein, as described in Procedure 3 of Example 2.
[162] In another embodiment, the secreted Brazzein protein is analyzed by SDS-PAGE, and its concentration is quantified using band densitometry.
[163] In one embodiment, recombinant Pichia pastoris clones comprising the engineered expression cassette EC2, composed of the FMD promoter and an engineered pro-region of the α-MF, demonstrate an approximately 1.4-fold increase in Brazzein expression levels compared to clones carrying a commercial expression vector.
[164] In one embodiment, recombinant Pichia pastoris clones with engineered expression cassettes EC5, EC6, or EC7, each comprising the FMD promoter and an α-MF signal peptide consisting of a chimeric pre-region and an engineered pro-region, demonstrate an approximately 1.2- to 1.6-fold increase in Brazzein expression levels compared to clones carrying the EC2 cassette. Pichia pastoris clones with engineered expression cassettes EC3 and EC4 show the same or lower expression of Brazzein than the EC2 control.
[165] In one embodiment, a recombinant Pichia pastoris clone comprising an engineered expression vector for Brazzein expression is selected based on protein expression levels in shake-flask cultures and advanced to laboratory-scale fermentation in a defined medium. Cells are then subjected to fermentation in accordance with the Pichia Fermentation Process Guidelines provided by Invitrogen or equivalent protocols.
[166] In one embodiment, the recombinant Pichia pastoris host cell, transformed with the engineered construct, is cultivated in defined media under controlled conditions, including temperature (20–30°C), pH (5.0–6.5), dissolved oxygen (>15%), and a suitable feeding strategy. The expression construct may include co-expression of a folding chaperone or helper protein under a separate promoter to enhance secretion.
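As a non-limiting illustration, the controlled conditions recited in paragraph [166] may be expressed as a simple set-point check. The sketch below assumes that the temperature and pH ranges are inclusive and that dissolved oxygen must be strictly greater than 15%; the function and variable names are hypothetical and do not correspond to any particular bioreactor control software.

```python
# Ranges taken from paragraph [166]; inclusive bounds are an assumption.
TEMPERATURE_RANGE_C = (20.0, 30.0)
PH_RANGE = (5.0, 6.5)
MIN_DISSOLVED_OXYGEN_PCT = 15.0  # the text states ">15%"


def out_of_range(temperature_c, ph, dissolved_oxygen_pct):
    """Return the names of parameters that fall outside the stated ranges."""
    problems = []
    if not TEMPERATURE_RANGE_C[0] <= temperature_c <= TEMPERATURE_RANGE_C[1]:
        problems.append("temperature_c")
    if not PH_RANGE[0] <= ph <= PH_RANGE[1]:
        problems.append("ph")
    if dissolved_oxygen_pct <= MIN_DISSOLVED_OXYGEN_PCT:
        problems.append("dissolved_oxygen_pct")
    return problems
```

A reading of 28°C, pH 6.0, and 30% dissolved oxygen returns an empty list, whereas 32°C, pH 4.5, and 15% dissolved oxygen flags all three parameters.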
[167] In one embodiment, the improved expression system enables a consistent and tunable fermentation process for high-yield production of Brazzein. The culture broth is clarified by centrifugation and microfiltration (MF), followed by ultrafiltration (UF) and diafiltration (DF) using a molecular weight cut-off appropriate for the target protein (e.g., 1–100 kDa). The resulting protein is subsequently purified by column chromatography. Purity is assessed by SDS-PAGE and additional analytical methods, such as densitometric analysis or high-performance liquid chromatography (HPLC), and typically exceeds 90%.
[168] In one embodiment, the final purified protein corresponds to the native Type II Brazzein sequence, free of any extraneous or scar sequences at the N-terminus or C-terminus. This is confirmed using analytical methods such as intact mass analysis, N-terminal sequencing, and/or peptide mapping.
[169] In one embodiment, the recombinant Brazzein protein produced using the methods described herein is subjected to lyophilization (freeze-drying under vacuum), or spray drying to yield a stable, dry powder formulation. This process preserves the structural integrity and sweetness activity of the protein, enabling long-term storage and transport without the requirement for cold chain logistics. The lyophilized protein retains thermal stability, pH tolerance, and functional properties upon reconstitution in water.
[170] In one embodiment, the lyophilized Brazzein is reconstituted in aqueous solution and demonstrates retention of its sweetness profile, making it suitable for incorporation into a variety of liquid and semi-solid food products, beverages, and dietary supplements.
[171] In various embodiments, the lyophilized Brazzein powder is formulated into multiple dosage and delivery formats including, but not limited to, tablets, lozenges, powders, sachets, capsules, oral films, emulsions, nanoemulsions, gels, gums, sprays, and lollipops. These forms enhance palatability, dosage precision, portability, and consumer convenience.
[172] In some embodiments, the lyophilized protein is blended with food-grade excipients, carriers, or bulking agents such as maltodextrin, inulin, cellulose, or equivalent materials to facilitate dosage uniformity, stability, and dispersion in food and beverage formulations.
[173] In other embodiments, Brazzein is formulated into liquid-based formats, such as syrups or concentrated solutions, optionally in buffered or preserved media, to enhance water solubility and application in beverages, liquid dietary supplements, and oral formulations.
[174] In one embodiment, purified Brazzein is crystallized from solution to yield crystalline forms suitable for use in controlled-release applications, specialty foods requiring prolonged sweetness perception, or formulation into compressible tablets.
[175] In certain embodiments, gel-based formulations of Brazzein are used in applications requiring specific textures or viscosities. These gel formats may also be incorporated into food matrices or semi-solid nutraceutical compositions.
[176] In other embodiments, Brazzein is utilized as a sweetener or sweetness enhancer in a wide range of product categories including functional beverages, diabetic- and keto-friendly foods, infant and geriatric nutrition, confectionery, dairy analogs, baked goods, and ready-to-drink (RTD) products. Brazzein’s favorable heat and pH stability make it suitable for these applications without loss of sweetness or sensory profile.
[177] In further embodiments, Brazzein is combined with other natural or synthetic sweeteners, such as stevia, sucralose, erythritol, or monk fruit extract, to synergistically enhance taste, reduce overall sweetness dosage, and mask off-notes, thereby improving product formulation and consumer acceptance.
[178] In preferred embodiments, Brazzein is produced recombinantly using the engineered constructs and signal sequences disclosed herein, and recovered without the use of fusion tags, glycosylation enhancers, or co-expression of exogenous chaperones. The resulting protein retains a sweetness potency equivalent to or exceeding that of plant-extracted Brazzein, with the added advantages of batch-to-batch consistency, scalability, and regulatory compliance for food-grade and pharmaceutical applications.
EXAMPLES
[179] The following examples are included to demonstrate certain embodiments disclosed herein. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the disclosed embodiments, and thus can be considered to constitute certain modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the embodiments disclosed herein.
EXAMPLE 1: Expression Vector construction for Secretory Expression of Brazzein
[180] This example describes the construction of various recombinant expression vectors designed for the secretory expression of Brazzein in yeast. Each vector contains a Brazzein expression cassette comprising a promoter, secretion signal peptide, the Brazzein coding sequence, and a transcription terminator. In addition, the vectors include yeast targeting sequences, a selection marker for yeast transformation, and bacterial elements required for plasmid propagation and maintenance in E. coli.
[181] The specific components of the Brazzein expression cassettes used in the various expression vectors developed in this invention are detailed in Table 6.
[182] Procedure 1 - Vector construction for Expression Cassette 1 (EC1):
[183] A codon-optimized nucleotide sequence encoding the type II variant of Brazzein was synthesized de novo by GenScript (USA) and cloned into the pPICZαA vector. The insert was placed in-frame with the α-mating factor (α-MF) secretion signal, using EcoRI and NotI restriction sites, following standard molecular cloning techniques. The resulting construct is designated as EC1_pPICZαA.
[184] This vector features the AOX1 promoter to drive Brazzein expression and includes the BleoR gene, which confers resistance to the antibiotic Zeocin, allowing for selection of recombinant clones.
[185] Procedure 2 - Vector construction for Expression Cassette 2 (EC2):
[186] A codon-optimized nucleotide sequence encoding an engineered α-MF secretion signal (SP3) followed by the type II variant of Brazzein was synthesized de novo as a single fragment by GenScript (USA). This construct also included a promoter sequence derived from the FMD gene of Hansenula polymorpha and a terminator sequence from the AOX1 gene of Pichia pastoris. The synthesized fragment was cloned into the pAO815 vector at the BamHI restriction site using standard molecular cloning techniques. The resulting construct is designated as EC2_pAO815.
[187] This vector contains the AOX1 promoter and the AOX1 3' fragment (region downstream of the AOX1 gene), which serve as homology arms for targeted integration into the AOX1 locus of the P. pastoris genome. It also includes a functional HIS4 gene, enabling selection of recombinant clones in his⁻ auxotrophic strains.
[188] Procedure 3 - Vector construction for Expression Cassettes 3 to 7 (EC3 to EC7):
[189] Codon-optimized nucleotide sequences encoding engineered α-MF secretion signals SP9, SP10, SP11, SP12, or SP13, each followed by the type II variant of Brazzein, were synthesized de novo as single fragments by Twist Bioscience (USA). Each synthesized fragment was cloned into the EC2_pAO815 vector to replace the original SP3 signal peptide, using standard molecular cloning techniques. The resulting constructs are designated as EC3_pAO815, EC4_pAO815, EC5_pAO815, EC6_pAO815, and EC7_pAO815, respectively.
[190] An illustrative diagram of the pAO815 vector containing an engineered expression cassette for the secretory expression of Brazzein in Pichia pastoris is shown in Figure 1.
EXAMPLE 2: Brazzein Expression Using Commercial and Engineered Vectors
[191] This example describes the procedure for, and a comparison of, the secretory expression of Brazzein in Pichia pastoris using a commercial expression vector (EC1_pPICZαA) and an engineered expression vector (EC2_pAO815).
[192] Procedure 1 - Integration of EC1_pPICZαA into Pichia pastoris genome
[193] The EC1_pPICZαA plasmid was linearized with the SacI restriction enzyme and column-purified for genomic integration.
[194] Electrocompetent cells of the Pichia pastoris auxotrophic strain BG12 (BioGrammatics) were prepared and transformed with the linearized EC1_pPICZαA DNA fragment, following the protocol described in the EasySelect Pichia Expression Kit manual (Invitrogen).
[195] Transformed colonies were selected on YPD agar plates containing Zeocin at a concentration of 100 µg/mL. Successful genomic integration of the Brazzein expression cassette was confirmed by PCR amplification of defined genomic regions using gene-specific primers.
[196] Procedure 2 - Integration of EC2_pAO815 into Pichia pastoris genome
[197] The plasmid EC2_pAO815 was linearized with the BglII restriction enzyme. The resulting integration fragment, comprising the Pichia target loci, Brazzein expression cassette, and selection marker, was separated by agarose gel electrophoresis and purified to eliminate bacterial vector elements.
[198] Electrocompetent cells of the Pichia pastoris auxotrophic strain BG12 (BioGrammatics) were prepared and transformed with the purified DNA fragment, following the protocol described in the EasySelect Pichia Expression Kit manual (Invitrogen).
[199] Transformed colonies were selected on minimal media agar plates lacking histidine. Successful genomic integration of the Brazzein expression cassette was confirmed by PCR amplification of defined genomic regions using gene-specific primers.
[200] Procedure 3 - Shake-flask expression and evaluation
[201] On Day 1, the selected clones were inoculated into YPD medium (1% yeast extract, 2% peptone, and 2% dextrose) and cultured for 20 hours at 30 °C with shaking at 250 rpm.
[202] On Day 2, BMGY medium (1% yeast extract, 2% peptone, 2% glycerol, 0.1 M phosphate buffer, pH 6.0, and 1× yeast nitrogen base) was inoculated with the primary culture at an initial OD600 of 0.4 and incubated for 28 hours at 30 °C with shaking at 250 rpm.
[203] On Day 3, the BMGY culture was centrifuged to remove the supernatant. The cell pellet was weighed and resuspended in BMMY medium containing 1% methanol at a volume of 5 mL per gram of wet cell weight to initiate induction (0-hour time point).
[204] On Day 4, the culture was supplemented with 1.0% methanol.
[205] On Day 5, the culture was supplemented with 1.0% methanol.
[206] On Day 6, the culture broth was harvested and the supernatant fraction was separated by centrifugation.
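The volume calculations implied by the Day 2 to Day 5 steps above may be sketched as follows. This is an illustrative aid only, under stated assumptions: the inoculum is calculated by simple dilution (C1·V1 = C2·V2), the methanol percentage is taken as volume per volume of pure methanol relative to the culture volume, and losses through evaporation or consumption are ignored. All function names are hypothetical.

```python
def inoculum_volume_ml(primary_od600, final_volume_ml, target_od600=0.4):
    """Volume of primary culture giving target_od600 in final_volume_ml.

    Uses the dilution relation C1 * V1 = C2 * V2; the default target of
    0.4 is the initial OD600 stated for the BMGY culture.
    """
    if primary_od600 <= 0 or final_volume_ml <= 0:
        raise ValueError("OD600 and volume must be positive")
    if primary_od600 < target_od600:
        raise ValueError("primary culture is more dilute than the target")
    return target_od600 * final_volume_ml / primary_od600


def resuspension_volume_ml(wet_cell_weight_g, ml_per_gram=5.0):
    """BMMY volume for induction: 5 mL per gram of wet cell weight."""
    if wet_cell_weight_g <= 0:
        raise ValueError("wet cell weight must be positive")
    return wet_cell_weight_g * ml_per_gram


def methanol_addition_ml(culture_volume_ml, percent_v_v=1.0):
    """Methanol volume for a daily supplement, assuming % means v/v."""
    return culture_volume_ml * percent_v_v / 100.0
```

Under these assumptions, a primary culture at an OD600 of 20 requires 1 mL of inoculum for 50 mL of BMGY, and a 4 g wet cell pellet is resuspended in 20 mL of BMMY, to which about 0.2 mL of methanol would be added at each daily supplementation.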
[207] Samples were prepared in SDS loading dye, and an equal volume of each sample was loaded per well on a 16% Bis-Tris SDS-PAGE gel. Following electrophoresis, gels were stained with Coomassie Brilliant Blue, destained, and imaged.
[208] Band intensities were quantified using Bio-Rad Image Lab software, and fold improvements were calculated relative to the control clone.
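For illustration, a fold improvement of the kind described above may be computed from densitometric band intensities as sketched below. The specification does not state how intensities from multiple clones were combined; the sketch assumes that the mean intensity of the test clones is divided by the mean intensity of the control clones, with equal sample volumes loaded per well. The function name is hypothetical, and the numbers in the usage note are placeholders rather than measured data.

```python
from statistics import mean


def fold_improvement(test_intensities, control_intensities):
    """Ratio of mean test band intensity to mean control band intensity.

    Intensities are in arbitrary densitometry units; averaging across
    clones is an assumption made for this sketch.
    """
    if not test_intensities or not control_intensities:
        raise ValueError("both intensity lists must be non-empty")
    control_mean = mean(control_intensities)
    if control_mean <= 0:
        raise ValueError("control intensity must be positive")
    return mean(test_intensities) / control_mean
```

With placeholder intensities of 300 and 330 for two test clones and 200 and 220 for two control clones, the function returns a fold improvement of 1.5.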
[209] Shake-flask expression studies were independently conducted for ten clones each of EC1 and EC2, all of which were PCR-confirmed for single-copy genomic integration. From these, the top two performing clones for each construct were selected for a parallel shake-flask experiment to enable head-to-head comparison. Figure 2 presents the Bis-Tris SDS-PAGE gel image showing the expression profiles of the top two clones for both EC1 and EC2 constructs. Figure 3 displays a bar graph of protein expression levels, quantified by densitometry analysis of the SDS-PAGE gel shown in Figure 2. The results demonstrate an approximate 1.4-fold improvement in secretory expression of Brazzein with the engineered EC2 vector compared to the commercial EC1 vector control.
EXAMPLE 3: Brazzein Expression Using Multiple Engineered Vectors
[210] This example describes the procedure for, and a comparison of, the secretory expression of Brazzein in Pichia pastoris using multiple engineered vectors comprising different chimeric signal peptides.
[211] Procedure 1 - Integration of EC3_pAO815, EC4_pAO815, EC5_pAO815, EC6_pAO815, or EC7_pAO815 into Pichia pastoris genome
[212] The expression vectors EC3_pAO815, EC4_pAO815, EC5_pAO815, EC6_pAO815, and EC7_pAO815 were each linearized using the BglII restriction enzyme. The resulting integration fragment, comprising the Pichia target loci, Brazzein expression cassette, and selection marker, was separated by agarose gel electrophoresis and purified to eliminate bacterial vector elements.
[213] Electrocompetent cells of the Pichia pastoris auxotrophic strain BG12 (BioGrammatics) were prepared and transformed with the purified DNA fragment, following the protocol described in the EasySelect Pichia Expression Kit manual (Invitrogen).
[214] Transformed colonies were selected on minimal media agar plates lacking histidine. Successful genomic integration of the Brazzein expression cassette was confirmed by PCR amplification of defined genomic regions using gene-specific primers.
[215] Procedure 2 - Shake-flask expression and evaluation of clones.
[216] Shake-flask expression studies were independently performed on 10 to 15 clones each of constructs designated EC3, EC4, EC5, EC6, and EC7, in accordance with the procedure outlined in Procedure 3 of Example 2. Each clone was confirmed by PCR to contain a single-copy genomic integration of the respective expression cassette. From the confirmed clones, the three highest expressing clones for each construct were selected for subsequent parallel shake-flask experiments to facilitate a direct comparative analysis against clones comprising the EC2 construct.
[217] Figure 4 depicts a Bis-Tris SDS-PAGE gel image illustrating the expression profiles of the top three clones for each of the tested expression constructs. Figure 5 presents a bar graph quantifying protein expression levels based on densitometric analysis of the SDS-PAGE gel shown in Figure 4. The data indicate that the EC7 construct resulted in an approximate 1.6-fold increase in secretory expression of Brazzein relative to the EC2 control construct. Constructs EC5 and EC6 demonstrated approximately 1.2-fold increases in expression compared to EC2, whereas constructs EC3 and EC4 did not exhibit a measurable improvement in expression levels.
[218] The foregoing description and examples are illustrative and not intended to limit the scope of the invention. Modifications and equivalents apparent to those skilled in the art are within the scope defined by the appended claims.

References
1. Lin-Cereghino, J., Cereghino, J.L., Ilgen, C., & Cregg, J.M. Heterologous protein expression in the methylotrophic yeast Pichia pastoris. FEMS Microbiol. Rev., 24(1), 45–66 (Jan. 2000).
2. Vogl, T., et al. Orthologous promoters from related methylotrophic yeasts surpass expression of endogenous promoters of Pichia pastoris. AMB Express, 10(1), 38 (Feb. 25, 2020).
3. Dou, Z., et al. Screening and evaluation of the strong endogenous promoters in Pichia pastoris. Microb. Cell Fact., 20, 156 (2021).
4. Vogl, T., Sturmberger, D., Kickenweiz, H., Gasser, M., Maurer, B., Hann, M., Sauer, P., Gasser, A., & Glieder, A. Methanol-independent protein expression in Pichia pastoris from derepressed AOX1 promoters. Microb. Cell Fact., 15, 5 (Jan. 2016).
5. Ito, Y., Ishigami, M., Hashiba, N., Nakamura, Y., Terai, G., Hasunuma, T., & Kondo, A. Avoiding entry into intracellular protein degradation pathways by signal mutations increases protein secretion in Pichia pastoris. Microb. Biotechnol., 15(9), 2364–2378 (2022).
6. Barrero, J.J., Casler, J.C., Valero, F. et al. An improved secretion signal enhances the secretion of model proteins from Pichia pastoris. Microb Cell Fact 17, 161 (2018).
7. Du, G., Zhuang, L., & Chen, J. Engineering microbial factories for synthesis of value-added products. J. Ind. Microbiol. Biotechnol., 38(8), 873–890 (Aug. 2011).
8. Potvin, G., Ahmad, J., & Goulet, D. Recombinant protein production in Pichia pastoris: A review of systems biology tools for strain improvement. Process Biochem., 47(9), 1407–1416 (Sep. 2012).
9. Delic, M., Mattanovich, I., Mattanovich, D., & Gasser, N. The secretory pathway: exploring yeast diversity. FEMS Microbiol. Rev., 37(5), 872–914 (Sep. 2013).
10. Brodsky JL, Skach WR. Protein folding and quality control in the endoplasmic reticulum: Recent lessons from yeast and mammalian cell systems. Curr Opin Cell Biol., 23(4):464-75 (Aug 2011).
11. Invitrogen, EasySelect Pichia Expression Kit. Thermo Fisher Scientific, Publication No. MAN0000042.
12. Invitrogen, Pichia Expression System – Fermentation Process Guidelines. Thermo Fisher Scientific, Publication No. MAN0000036.
13. Kumar, G., Kumar, R., Mishra, D., & Chauhan, R.S. Alpha-mating factor-based signal peptides for recombinant protein secretion in yeast. J. Appl. Biotechnol. Bioeng., 12(2), 78–83 (2022).
CLAIMS: We Claim:
1. A polypeptide comprising an amino acid sequence having at least 80%, 85%, 90%, or 95% sequence identity to an amino acid sequence selected from SEQ ID NO: 3, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, or SEQ ID NO: 13.
2. The polypeptide of claim 1, wherein the polypeptide enhances the secretion of a heterologous protein when expressed in a host cell.
3. The polypeptide of claim 1 or 2, wherein the polypeptide is an engineered alpha-mating factor (α-MF) signal peptide comprising at least one modification in: (i) the pre-region; (ii) the pro-region; or (iii) a combination of (i) and (ii).
4. The polypeptide of claim 3, comprising:
(a) a pro-region comprising one or more modifications selected from mutation, deletion, or substitution, wherein the modified pro-region enhances the secretion efficiency of the heterologous protein; and
(b) a pre-region selected from a heterologous signal peptide of a native or non-native organism,
wherein the pre-region is adapted to improve endoplasmic reticulum (ER) translocation efficiency and thereby enhance the yield of the heterologous protein.
5. The polypeptide of claim 1 or 4, wherein the pro-region comprises one or more modifications selected from:
(i) a point mutation at position 50 comprising a valine-to-alanine substitution (V50A);
(ii) deletion of the EAEA tetrapeptide in the pro-region, wherein the deletion facilitates production of a scar-sequence-free heterologous protein; and
(iii) a combination of two or more modifications selected from items (i) and (ii),
wherein the polypeptide enhances the extracellular yield of the heterologous protein under identical expression conditions.
6. The polypeptide of any one of claims 1 to 5, wherein the pre-region comprises a heterologous signal peptide selected from one or more proteins selected from the group consisting of β-lactoglobulin, lysozyme, serum albumin, EPX1, and 0030.
7. The polypeptide of any one of claims 1 to 6, wherein the signal peptide comprises a heterologous pre-region and a modified α-MF pro-region, and wherein the signal peptide is a chimeric signal peptide comprising said heterologous pre-region and modified α-MF pro-region.
8. The polypeptide of any one of claims 1 to 7, wherein the modified pro-region comprises an amino acid sequence as set forth in SEQ ID NO: 3, and the heterologous pre-region comprises an amino acid sequence selected from SEQ ID NOs: 4, 5, 6, 7, or 8, each having at least 80%, 85%, or 90% sequence identity to the respective SEQ ID NO.
9. A nucleic acid sequence encoding the polypeptide of any one of claims 1 to 8, wherein the nucleotide sequence of the modified pro-region comprises SEQ ID NO: 16, and the nucleotide sequence of the heterologous pre-region is selected from the group consisting of SEQ ID NOs: 17, 18, 19, 20, or 21, each having at least 80%, 85%, 90%, or 95% sequence identity to the respective SEQ ID NO.
10. A nucleic acid sequence encoding the polypeptide of claim 9, wherein the nucleotide sequence comprises a sequence selected from the group consisting of SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, and SEQ ID NO: 26, each having at least 80%, 85%, 90%, or 95% sequence identity.
11. A recombinant expression construct comprising:
(a) a nucleic acid sequence encoding the polypeptide as defined in any one of claims 1 to 10, operably linked to a nucleic acid sequence encoding a heterologous protein; and
(b) a recombinant vector comprising the nucleic acid construct of part (a),
wherein the vector further comprises a promoter and one or more regulatory elements operably linked to the nucleic acid sequence, and
wherein expression of the heterologous protein in a host cell results in enhanced secretion of the heterologous protein into the extracellular culture medium.
12. The recombinant expression construct of claim 11,
wherein the promoter is selected from native, heterologous, or orthologous promoters;
wherein the promoter is either inducible or constitutive; and
wherein the promoter is selected from the group consisting of AOX1p, CAT1p, FLDp, GCW14p, FMDp, and MOXp.
13. The recombinant expression construct of claim 11 or 12, wherein the host cell is selected from the group comprising Saccharomyces cerevisiae, Pichia pastoris, Hansenula polymorpha, and Kluyveromyces lactis.
14. The recombinant expression construct of claim 13, wherein the host cell is a Pichia pastoris wild-type or auxotrophic strain.
15. The recombinant expression construct of claim 11, wherein the heterologous protein is selected from the group consisting of at least one food-grade protein, an enzyme, a peptide, an antibody or antigen-binding fragment thereof, a protein antibiotic, a fusion protein, a vaccine or a vaccine-like protein or particle, a growth factor, a hormone, or a cytokine.
16. The recombinant expression construct of claim 15, wherein the at least one food-grade protein is selected from:
(i) milk proteins selected from the group consisting of casein, whey protein, immunoglobulin, lactoferrin, lysozyme, and osteopontin;
(ii) sweet proteins selected from the group consisting of brazzein, thaumatin, monellin, curculin, mabinlin, and miraculin;
(iii) meat proteins selected from the group consisting of collagen and myoglobin; and
(iv) egg proteins selected from the group consisting of ovotransferrin and ovomucoid.
17. The recombinant expression construct of claim 11, wherein the heterologous protein is suitable for use in food, edible products, beverages, juices, drinks, confectionery, frozen desserts, nutraceuticals, pet foods, pharmaceuticals, cosmetics, diagnostics, research reagents, agricultural applications, veterinary applications, structural or biomaterial applications, or industrial and environmental applications.
18. The recombinant expression construct of claim 11, wherein the heterologous protein is formulated in a form selected from the group comprising powder, syrup, granules, capsules, crystal, gel, solution, liquid, compressed tablets, or tablets.
