Abstract: ENGINEERING OF PROTEINS TO HAVE MULTIPLE ACTIVE SITES TO CATALYZE MULTIPLE REACTIONS. The engineering of proteins to include multiple active sites for catalyzing various reactions represents a significant advancement in enzyme engineering. Currently, most industrial enzymatic processes involve enzymes that catalyze single reactions, making multistep reactions cumbersome due to differing reaction conditions, costs, and purification challenges. This invention focuses on introducing a hetero-active site to an enzyme, which serves as a second active site capable of catalyzing different reactions than the native site. It employs insights from a non-covalent interactions-based 2D reduced density gradient map, a 3D electronic density map, and the topological and algebraic properties of the pocket to compare the query NCI pocket against a reference active site. Using this technology, a transaminase enzyme was engineered to incorporate a hetero-lipase active site, enabling lipase reactions without interfering with the natural active site.
DESCRIPTION
FIELD OF THE INVENTION
[0001] The invention relates to the fields of biological catalysis, life science, computational biology, biocatalysis, molecular biology and chemistry.
BACKGROUND OF THE INVENTION
[0002] Enzymes are biocatalysts present in all living organisms, where they are required for catalysing chemical reactions. Over the last few decades, enzymes have been used industrially to produce many drugs and intermediate compounds. Enzymes are versatile, specific, eco-friendly and natural catalysts. Many enzymes are known for their applications and use in harsh industrial reaction conditions. Many enzymes with reported industrial usage, such as ketoreductases, CALB lipase, transaminases and many others, are used for large-scale production of their respective products (Tjørnelund et al., 2023, US8293507B2). Enzymes are highly specific in their selection of substrates, and many engineering studies are conducted across the globe to develop enzymes that work in industrially viable conditions.
[0003] Although enzymes are used in industrial processes, for example in the production of APIs or drugs, enzymatic routes are not yet feasible in every case: in some steps enzymes cannot be used because of the complex chemistry, and in other cases multiple enzymes are required to reach the end product where a single-step chemical reaction would suffice.
[0004] An enzyme can be engineered using traditional methods such as directed evolution and rational engineering approaches. Many engineering approaches have produced enzymes that accommodate non-native substrates and show greatly improved reaction rates. The active site is the region of the enzyme on which many engineering studies focus, as it is the main motif where the substrate interacts and the reaction happens (Yabukarski, Filip et al., 2020). The active site contains catalytic residues that perform the reaction of interest, while the neighbouring residues, known as pillion residues or second-shell residues, stabilize or accommodate the substrate in the active site. Mutating or engineering the active site has to be done with utmost care, as it affects substrate binding, and mutations are often not beneficial or are even detrimental (Del Sol, Antonio, et al., 2006).
[0005] Generally, a natural active site is a well-constructed geometrical structure of the enzyme in which even minimal alterations or mutations can produce drastic changes. The energy, electrostatic interactions, anchoring residues and geometrical nature of the pocket are important factors that determine the enzyme's reactivity and substrate selection (Judge, Allison et al., 2024). For example, as shown in Figure 1, methyl 3-oxo-4-(2,4,5-trifluorophenyl)butanoate is converted into 3-oxo-4-(2,4,5-trifluorophenyl)butanoic acid in the first step, and this acid then undergoes an amination reaction in which a transaminase enzyme converts its keto group into an amino group, producing (3R)-3-amino-4-(2,4,5-trifluorophenyl)butanoic acid (Khobragade, Taresh P., et al., 2021). This is a two-step reaction in which two different enzymes are used; the two enzymatic reactions could be carried out by a single enzyme if that enzyme possessed two active sites, each for a different type of reaction. Enzymes with double active sites are an intriguing topic in biocatalysis and biotechnology, as they offer the potential advantage of a single enzyme catalysing multiple reactions. Many industrial multi-step reactions are tedious because each enzyme requires different reaction conditions, cost and resources. Incorporating double active sites that perform multiple reactions on the same enzyme could change the perspective on the industrial usage of enzymes (Naveen et al., 2024).
[0006] Double-active sites on enzymes are regions where two substrates can bind at two different sites and undergo chemical reactions: the given substrate binds in the first active site and is converted into a product, and that product then becomes the substrate of the other active site and is converted into the final product. These enzymes are also known as multi-site or double-active site enzymes. If the engineered double-active site enzyme accommodates the same substrate in two different active sites, it can be called a homo-active site enzyme. If the engineered enzyme accommodates different substrates in different active sites, it can be called a hetero-active site enzyme. Multiple studies have been done to develop homo-active site enzymes that accommodate two active sites performing similar reactions (Santiago et al., 2018). Santiago et al. engineered a lipase enzyme to have an additional active site that catalyses the same kind of reaction as the native active site.
[0007] However, an engineered second active site capable of performing a different reaction from the native site can be called a hetero-active site. So far, no successful studies have been reported on hetero-active sites, for example on obtaining lipase activity in a transaminase enzyme. The present invention describes a method of incorporating hetero-active sites into a transaminase. For example, the transaminase enzyme is engineered to have another active site that performs a hydrolysis reaction. The engineered transaminase with double active sites can be used in industrial conditions for performing lipase activity and transamination activity. The method uses NCI analysis as its main component: it compares the NCI indices of an active site of an enzyme with those of a pocket in order to build a hetero-active site (Contreras-García, Julia, et al., 2011).
[0008] Noncovalent interactions are essential for various protein functions, including ligand or substrate binding mode, protein-membrane interactions, enzyme-lipid interactions, and enzymatic reactions such as transition states and intermediate states. These interactions differ from covalent bonds in that they do not alter the chemical bonding of atoms or molecules but rather hold them together. The strength of noncovalent interactions can vary depending on the atoms or molecules involved, ranging from -0.5 to -50 kcal/mol. These interactions provide macromolecules with the necessary dynamic nature for their biological functions at room temperature. Noncovalent interactions are more flexible and easier to form or break compared to covalent connections (Karshikoff, Andrey et. al, 2020). They also have a crucial role in determining the structures and functions of macromolecules or their complexes, including molecular recognition and catalysis, often working together in a coordinated manner. The energy interaction between two interacting atoms or molecules consists of five components: electrostatic, dispersion, polarization, exchange-repulsion, and charge transfer. Computational techniques are necessary to understand these forces. Examples of common noncovalent interactions in biomolecules include Van der Waals contacts, hydrophobic interactions, ionic bonds, and hydrogen bonds. Noncovalent interactions also have a critical influence on protein structure and function (Adhav et al., 2023).
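The five-component description of the interaction energy given above can be written compactly as a sum. The symbols below are illustrative labels introduced here for readability; they are not notation taken from the cited references.

```latex
E_{\mathrm{int}} = E_{\mathrm{elec}} + E_{\mathrm{disp}} + E_{\mathrm{pol}} + E_{\mathrm{exch\text{-}rep}} + E_{\mathrm{ct}}
```

Here the terms stand, in order, for the electrostatic, dispersion, polarization, exchange-repulsion and charge-transfer contributions; a negative total corresponds to a net attractive interaction, consistent with the sign of the energy range quoted above.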
[0009] Among the various noncovalent interactions, hydrogen bonds have a significant impact on protein structure and function. Hydrogen bonds occur when an atom of hydrogen from a molecule or molecular fragment X-H, where X is more electronegative than H, interacts with an atom or group of atoms in the same or a different molecule, leading to bond formation.
[0010] Salt bridges are another non-covalent interaction where a combination of hydrogen bonds and ionic interaction forms strong salt bridges. The interaction occurs between the negatively charged amino acids and the positively charged amino acids which are present in certain residue side-chains such as Arginine, lysine, aspartic acid and glutamic acid.
[0011] Van Der Waals interactions: When uncharged atoms or molecules interact, weak intermolecular forces known as van der Waals forces are created. These forces, which are distance-dependent and rapidly disappear as molecules get farther apart, include attractions and repulsions brought on by momentary shifts in electron density that create temporary dipoles.
[0012] Stacking interactions: Different stacking interactions can be observed within proteins. Pi-Pi stacking occurs between two aromatic amino acids and leads to the formation of strong forces between them; aromatic amino acids such as phenylalanine form strong Pi-Pi stacking with each other. Cation-Pi interactions can be seen between a positively charged amino acid and an aromatic ring; similarly, anion-Pi interactions can be observed between a negatively charged amino acid and an aromatic ring.
[0013] It is important to understand the difference between a pocket and an active site of an enzyme. The enzyme’s active site holds catalytic residues, which are responsible for performing the reaction, and other pocket residues, which anchor the substrate molecule when it binds. A simple pocket does not catalyse a reaction, but it can still hold a function, for example when it is a halogen-binding pocket. The development of multi-site enzymes requires an understanding of protein pockets and of the capabilities and limitations of the computational tools used to explore them. Many computational tools can hint at the pockets of proteins and enzymes, yet they are not sufficient to determine the function of a pocket. The active site holds unique features compared with a normal pocket, which may accommodate no more than water and ions (Bakheit, Ahmed H., et al., 2022).
[0014] The distinction between other pockets in a protein and the active site of an enzyme:
[0015] Proteins hold a specific region on their surfaces called "pockets." These pockets, essentially depressions or clefts formed by the unique arrangement of amino acid side chains, play various roles. They can be involved in binding interactions with other proteins, DNA, or small molecules. However, their function isn't always related to binding; sometimes, they merely contribute to the overall structure of the protein. The nature of these pockets can be diverse—some being hydrophobic, others hydrophilic, depending on the amino acid residues that line them. For instance, certain membrane proteins possess lipid-binding pockets, which help anchor them to the membrane, while some kinases have regulatory pockets that control their activity.
[0016] A specialized type of pocket is the "active site" found in enzymes. These are specific regions where substrate molecules are not only bound but also undergo chemical transformations. The structure of an active site is complementary to its substrate, making these sites highly specific. The amino acid residues in the active site can vary from hydrophobic to hydrophilic and may even contain metal ions, all contributing to the enzyme's catalytic function. Classic examples include the serine proteases like trypsin, where a Ser-His-Asp triad facilitates peptide bond hydrolysis, and metalloenzymes like carbonic anhydrase, where a Zn²⁺ ion is instrumental in the enzymatic reaction (Table 2).
[0017] While all active sites qualify as pockets, the inverse is not true. Active sites are specialized pockets tailored for catalysis, exhibiting high specificity due to the intricate arrangement of their amino acids. On the other hand, general pockets might have a broader range of binding interactions and might not always be associated with a specific function. In essence, while both pockets and active sites are interactive regions on proteins, only active sites have evolved to perform the highly specific task of catalysis. The differences are also listed in Table 1.
Feature | Pocket in a Protein | Active Site in an Enzyme
Residues that Make the Pocket/Active Site | May include a variety of amino acid residues (hydrophobic, hydrophilic, charged, etc.). | Highly specific amino acid residues that are essential for catalytic activity (e.g., Ser, His, Asp in serine proteases).
Nature of the Pocket/Active Site (Electrostatic) | Can be hydrophobic or hydrophilic based on the function or lack thereof. | Typically electrostatically complementary to the substrate for high-affinity binding.
Kinds of Interactions | Majorly Van der Waals forces and hydrophobic interactions; electrostatic interactions in some cases; may not always form hydrogen bonds | Few Van der Waals forces and stabilized with more hydrogen bonds; majorly stabilized by electrostatic interactions; covalent interactions (in some cases, as in cysteine proteases); metal ion interactions (e.g., Zn²⁺)
[0018] Table 1: General differences between a pocket in a protein and an active site in an enzyme. Most of the differences arise from the types of interactions a substrate experiences when it is bound in the active site of an enzyme.
Protein Name | Feature (Pocket/Active Site) | Residues Involved | Interaction/Nature
Haemoglobin | Pocket | Val, Leu | Hydrophobic pockets between subunits
p53 | Pocket | Arg, Lys | Electrostatic interaction with DNA
HIV-1 protease | Pocket | Various hydrophobic residues | Hydrophobic interactions in its flaps and pockets
Calmodulin | Pocket | Various hydrophobic residues | Hydrophobic pocket binding to calcium ions and target proteins
Glucocorticoid receptor | Pocket | Various hydrophobic residues | Hydrophobic pocket binding to glucocorticoids
Lysozyme | Active Site | Glu35, Asp52 | Cleaves polysaccharide chains in bacterial cell walls
Ribonuclease A | Active Site | Various polar and charged residues | Facilitates RNA cleavage
Acetylcholinesterase | Active Site | Glu, Ser, His | Electrostatic interactions with acetylcholine
Carbonic anhydrase | Active Site | Zn²⁺ binding site | Converts carbon dioxide and water to bicarbonate and protons
DNA polymerase | Active Site | Various charged residues | Catalyzes the synthesis of DNA
Estrogen receptor | Pocket | Various hydrophobic residues | Hydrophobic pocket binding to estrogens
Alcohol dehydrogenase | Active Site | Zn²⁺, His, Cys | Catalyzes the conversion of alcohols
Cytochrome P450 | Active Site | Heme-containing site | Oxidation of organic substances
Cyclin-dependent kinase 2 | Pocket | Various hydrophobic residues | Regulatory pocket that can bind cyclins
Beta-lactamase | Active Site | Ser, Lys | Hydrolyzes the beta-lactam ring in antibiotics
Myoglobin | Pocket | Val, Leu | Hydrophobic pockets binding to oxygen
Hexokinase | Active Site | Asp, Glu | Phosphorylates hexoses to produce hexose phosphate
Insulin receptor | Pocket | Various hydrophobic residues | Pockets for insulin binding
Phospholipase A2 | Active Site | His, Asp | Cleaves the sn-2 position of phospholipids
Adrenergic receptor | Pocket | Asp, Ser, Phe | Hydrophobic pocket for epinephrine/norepinephrine binding
[0019] Table 2: List of proteins and enzymes bearing either a normal pocket or an active site, with the native type of interaction and the interacting residues in the pocket.
1. Lysozyme:
o Active Site: The Glu35 and Asp52 residues, crucial for activity, form hydrogen bonds with the substrate. Hydrogen bonds are a crucial interaction between the substrate and the active site residues that anchors the substrate (Gálvez-Iriqui, Alma Carolina, et al., 2005). NCI analysis provides insights into important interactions such as hydrogen bonds, VdW interactions, repulsive interactions, etc. Improving these interactions will improve the binding affinity of the substrate and the activity. Glu35 and Asp52 are negatively charged amino acids; Asp52 attacks the carbon attached to the glycosidic bond. The attack conformation of Asp52 when the substrate is bound in the active site can be captured by NCI analysis.
o Beyond the Active Site: The cleft-like arrangement in lysozyme may harbour several weak interactions with the polysaccharide chain, stabilizing its position and guiding it toward the active site. The cleft is also made up of short helices and loops, which show weaker interactions because they are dynamic. NCI analysis helps to understand the interactions around the cleft-like motif, and improving these interactions helps to stabilize the motif. The interactions of the substrate (polysaccharide chain) with residues other than Glu35 and Asp52 are crucial, as they anchor the substrate. NCI analysis can predict these different types of interactions.
2. Ribonuclease A:
o Active Site: The basic residues (His12, His119, Lys41) form both hydrogen bonds and electrostatic interactions with the RNA substrate. Ribonuclease A cleaves the RNA molecule, with His12 and His119 playing an important role as catalytic residues (Polydoridis, Savvas, et al., 2007). NCI analysis can capture the electrostatic interactions involving His12 and His119; the reaction proceeds in two steps, and all of its transition states can be analysed using NCI analysis.
o Beyond the Active Site: Basic surface residues that are distal to the active site can attract the RNA's negatively charged backbone, guiding it towards the active site. Using the NCI index, one can map these interactions along the enzyme's surface. Pillion residues such as Asp83, Ser123, Asp121, and Gln11 are present in the second shell of the active site. The NCI index can capture the interactions of these residues with the catalytic residues and the stabilization of the substrate through water bridges.
3. Acetylcholinesterase:
o Active Site: The catalytic triad residues Ser200, His440, and Glu199 interact with the substrate via hydrogen bonds and electrostatic interactions. The catalytic centre is known to have a large number of hydrophobic and aromatic residues in the pocket (Fotheringham et al., 2000). Several phenylalanine and tyrosine residues are located in the pocket apart from the catalytic residues. NCI analysis can be utilized to understand the cation-Pi, anion-Pi or Pi-Pi stacking between the substrate and pocket residues.
o Beyond the Active Site: The gorge leading to the active site likely contains multiple weak interactions that guide and stabilize the acetylcholine molecule. The NCI index can be used to visualize these guiding interactions, giving insight into the enzyme's specificity. Pillion residues such as Tyr130, Tyr116, and Phe331 show Pi-stacking with the residues. NCI analysis can capture the intensity of the Pi-Pi stacking interactions.
4. Hexokinase:
o Active Site: Asp211 and Asp269 form strong hydrogen bonds with the hexose substrate, aiding in phosphorylation. The catalytic site is highly hydrophilic in nature due to the presence of Asn, Gln and Asp in the pocket (Nishimasu, Hiroshi, et al. 2007). NCI analysis can predict the hydrogen bond interactions with these residues and the electrostatic interaction with Asp.
o Beyond the Active Site: The pocket of hexokinase undergoes conformational changes upon substrate binding. Various weak interactions between the cleft residues and the substrate can be visualized using the NCI index, providing a detailed picture of how the enzyme encloses its substrate. Residues such as Thr212, Asn237, Lys176, and Tyr159 show long-range interactions with the substrate in the pocket. NCI analysis can identify the different interactions with these residues, which will enhance the knowledge available for engineering the enzyme.
5. Lipase:
o Active Site: The active site residues, such as Ser, His, and Asp, are hydrophilic, but the surrounding pocket-forming residues are hydrophobic. The lipase core is maintained as hydrophobic because the enzyme prefers to sit at the oil-water interface. The substrate molecule needs to form strong hydrophobic interactions with the surrounding amino acids, which allows an energetically feasible reaction to occur (Castillo, Edmundo, et al., 2016). NCI analysis helps to find the non-covalent interactions of the substrate with the surrounding residues and with the catalytic residues. Through NCI analysis it is possible to improve the interactions of the substrate, which facilitates the formation of a better near-attack conformation.
o Beyond the Active Site: An energetically feasible binding conformation of the substrate in the active site allows quick conversion of the substrate with the help of the catalytic residues. The binding conformation of the substrate is always defined by its entry at the surface. The amino acid residues present in the bottleneck of the entry channel anchor the substrate molecule and orient it to form a proper near-attack conformation. NCI analysis helps to find the non-covalent interactions along the entry path. Channel engineering can be done with the help of NCI analysis and by understanding atomic NCI interactions.
6. Transaminase:
o Active Site: The active site of the transaminase enzyme is formed by two different chains and is located in the core of the enzyme. The enzyme core holds positively charged amino acids, which are crucial for holding the cofactor and the substrate. The important catalytic residue in transaminase is Lys, which takes part in the reaction together with the cofactor. Charged residues such as Arg and Lys can be seen in the pocket, and Glu and Asp are involved in the stabilization of the PLP cofactor (Guidi, Benedetta, et al., 2018). NCI analysis can capture information such as the interactions and stabilization of PLP, hydrogen bonds with nearby residues, and the interactions of Lys with PLP.
o Beyond the Active Site: Because the active site is located in the core, the substrate molecule has to enter it through a tunnel. The tunnel contains many positively charged residues, one important residue being an Arg located in the entry path; in the absence of this Arg, easy entry of the substrate and a feasible reaction may not be facilitated. The tunnel residues are important for anchoring the substrate so that it binds in a precise conformation. Transaminases are known to produce different enantiomeric products, so studying the pocket residues and their interactions with the substrate using NCI analysis can reveal the crucial interactions.
[0020] In addition to the above list, Table 3 lists enzymes together with the structural elements or residues beyond the active site that help maintain their specificity and mechanism.
Sl. No. | Enzyme Name | External Residue Roles | Contribution to Specificity and Mechanism
1 | Lysozyme | Cleft-like arrangement | Directs the polysaccharide chain and ensures its correct alignment with the active site.
2 | Ribonuclease A | Basic surface residues | Attract the negatively charged RNA and guide it into the active site.
3 | Acetylcholinesterase | Gorge leading to the active site | Helps direct the long acetylcholine molecule into the active site for hydrolysis.
4 | Carbonic anhydrase | Hydrophobic patch | Helps position and orient the substrate, ensuring efficient conversion.
5 | Alcohol dehydrogenase | Rossmann fold | Aids in NAD+ binding, ensuring the correct alignment for electron transfer.
6 | DNA polymerase | Finger, thumb, and palm domains | Ensures DNA is correctly aligned and enclosed, providing fidelity during replication.
7 | Beta-lactamase | Omega loop | Helps in substrate binding and orientation, essential for antibiotic resistance.
8 | Hexokinase | Large cleft | Undergoes a conformational change to enclose the substrate and align it with ATP.
9 | Phospholipase A2 | Lid domain | Covers the active site once substrate binds, ensuring correct orientation and efficient hydrolysis.
10 | Triose phosphate isomerase | Loop region | Undergoes conformational changes upon substrate binding to shield the reaction from water.
[0021] Table 3: List of enzymes and their structure-function relationships, describing the external elements or residues beyond the active site that help maintain specificity and mechanism.
[0022] Non-covalent interactions are interactions that can be captured between molecules or within a molecule, mainly electrostatic interactions, hydrogen bonds, van der Waals interactions, hydrophobic interactions, ionic interactions, etc., and that do not involve electron sharing. These interactions help to understand different states of the substrate, such as the near-attack conformation, the entry conformation and the product exit conformation. Capturing these interactions, mainly when the substrate is inside the active site, gives a better understanding of the stabilizing residues in the pocket and of the orientation of the substrate. NCI analysis can be applied to any molecule and its surroundings. The Noncovalent Interaction (NCI) index is a valuable tool for visualizing and analyzing interactions in molecular systems. At its core, the NCI index is grounded in quantum mechanics. It uses the electron density, ρ(r), and its derivatives, primarily the reduced gradient of the electron density, s, to capture the spatial regions where non-covalent interactions occur.
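To make the role of ρ(r) and s concrete, the following is a minimal sketch of the reduced density gradient as it is commonly defined in the NCI literature, s = |∇ρ| / (2(3π²)^(1/3) ρ^(4/3)). The density used here is a toy sum of spherical Gaussians chosen only so that the gradient is analytic; it is an assumption for illustration and not a real electron density, and the function names are hypothetical rather than part of the described method.

```python
import math

# Prefactor 2 * (3 * pi^2)^(1/3) from the standard reduced density gradient definition.
RDG_PREFACTOR = 2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0)


def toy_density_and_gradient(point, atoms):
    """Return (rho, grad_rho) at `point` for a sum of spherical Gaussians.

    `atoms` is a list of (center, amplitude, exponent) tuples. This toy model
    is an assumption for illustration; it is not a real electron density.
    """
    rho = 0.0
    grad = [0.0, 0.0, 0.0]
    for center, amplitude, exponent in atoms:
        delta = [p - c for p, c in zip(point, center)]
        r2 = sum(d * d for d in delta)
        term = amplitude * math.exp(-exponent * r2)
        rho += term
        for axis in range(3):
            # d/dx of A*exp(-a*r^2) is -2*a*x*A*exp(-a*r^2)
            grad[axis] += -2.0 * exponent * delta[axis] * term
    return rho, grad


def reduced_density_gradient(point, atoms):
    """s = |grad rho| / (2 * (3*pi^2)^(1/3) * rho^(4/3))."""
    rho, grad = toy_density_and_gradient(point, atoms)
    grad_norm = math.sqrt(sum(g * g for g in grad))
    return grad_norm / (RDG_PREFACTOR * rho ** (4.0 / 3.0))


# Two identical toy "atoms" placed 3 length units apart on the x axis.
toy_atoms = [((-1.5, 0.0, 0.0), 1.0, 0.5), ((1.5, 0.0, 0.0), 1.0, 0.5)]

s_midpoint = reduced_density_gradient((0.0, 0.0, 0.0), toy_atoms)
s_off_axis = reduced_density_gradient((0.0, 2.0, 0.0), toy_atoms)
```

In this toy system s vanishes at the midpoint, where the two density tails overlap and the gradient cancels, and is positive away from it. In NCI analysis, regions of low s at low density are the ones associated with non-covalent contacts, and they are commonly classified by plotting s against sign(λ2)ρ, which this sketch does not compute.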
[0023] The present invention describes a method to design hetero-active sites on an enzyme or a protein using NCI analysis, 2D and 3D CNNs, and a feature-based cross-correlation technique for fingerprint matching analysis. The method is utilized to develop hetero-active sites that can perform dual or multiple functions. NCI analysis is better than traditional MD simulations, as it gives a better understanding of a given enzyme-substrate complex.
[0024] As an example, the invention presents an engineered transaminase with multiple active sites having multiple functions. Transaminase is an enzyme known for performing amination reactions; an additional active site, which can function as a hydrolase catalytic site, was built using the step-wise protocol explained in the brief description of the invention section. The invention can be used to engineer any enzyme or protein of interest to build hetero-active sites that perform additional functions as required.
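The text names a feature-based cross-correlation technique for fingerprint matching but does not spell out its formulation here. As a hedged illustration only, the sketch below uses a zero-lag normalized cross-correlation between two equal-length feature vectors (for instance, binned NCI features of a query pocket and of a reference active site). The fingerprint encoding, the function names and the example values are all assumptions introduced for this sketch, not the method of the invention.

```python
import math


def normalized_cross_correlation(query, reference):
    """Zero-lag normalized cross-correlation of two equal-length feature vectors.

    Returns a value in [-1, 1]; 1 means the two fingerprints have the same
    shape up to an offset and a positive scale factor.
    """
    if len(query) != len(reference) or not query:
        raise ValueError("fingerprints must be non-empty and of equal length")
    mean_q = sum(query) / len(query)
    mean_r = sum(reference) / len(reference)
    dq = [q - mean_q for q in query]
    dr = [r - mean_r for r in reference]
    norm = math.sqrt(sum(x * x for x in dq) * sum(y * y for y in dr))
    if norm == 0.0:
        return 0.0  # a constant fingerprint carries no shape information
    return sum(x * y for x, y in zip(dq, dr)) / norm


def rank_pockets(reference_fingerprint, pocket_fingerprints):
    """Sort hypothetical candidate pockets by similarity to the reference site."""
    scores = {
        name: normalized_cross_correlation(fingerprint, reference_fingerprint)
        for name, fingerprint in pocket_fingerprints.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Made-up fingerprints for illustration; these are not data from the invention.
reference_site = [0.1, 0.4, 0.9, 0.4, 0.1]
candidates = {
    "pocket_A": [0.2, 0.8, 1.8, 0.8, 0.2],  # same shape as the reference, scaled by 2
    "pocket_B": [0.9, 0.4, 0.1, 0.4, 0.9],  # roughly inverted shape
}
ranking = rank_pockets(reference_site, candidates)
```

With these made-up inputs, `pocket_A` ranks first with a score of essentially 1 and `pocket_B` receives a negative score. A normalized score was chosen so that pockets are compared by the shape of their fingerprints rather than by overall magnitude; whether that is appropriate depends on how the fingerprints are actually built.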
OBJECTS OF THE INVENTION
[0025] The objective is to develop a technology to create a hetero lipase active site on any given enzyme or a protein without disturbing its natural active site or function.
[0026] The invention aims to create a hetero-lipase active site on an enzyme or a protein without affecting its natural active site. The developed technology utilizes information from NCI indices, 2D and 3D CNNs, topology, and algebraic properties to develop the hetero-lipase active site on an enzyme. This new site performs a different function from the enzyme's native function.
[0027] Terminologies
[0028] Unless otherwise defined, all technical and scientific terms used herein are generally understood to have the same meaning as commonly understood by one of ordinary skill in the art to which this invention pertains. As used herein, the following terms are intended to have the following meanings.
[0029] “polypeptide” and “protein” are used interchangeably to denote a polymer of at least two amino acids covalently linked by an amide bond, regardless of length or post-translational modification.
[0030] "Derived from" denotes the original polypeptide or the gene that encodes it, on which the engineering or processes are based.
[0031] "Conversion" refers to the enzymatic transformation of the substrate into the corresponding product. "Percentage conversion" indicates the proportion of the substrate that is converted to the product within a specified time under specific conditions. Therefore, the "enzymatic activity" or "activity" of an enzyme(s) can be expressed as the "percentage conversion" of the substrate to the product.
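The definition of percentage conversion above amounts to simple arithmetic, sketched below. The sketch assumes that conversion is measured by substrate depletion, with amounts given in the same unit and no side reactions; the function name and the example numbers are hypothetical.

```python
def percentage_conversion(initial_substrate, remaining_substrate):
    """Percent of substrate converted, from two amounts in the same unit."""
    if initial_substrate <= 0:
        raise ValueError("initial substrate amount must be positive")
    if not 0 <= remaining_substrate <= initial_substrate:
        raise ValueError("remaining substrate must lie between 0 and the initial amount")
    return 100.0 * (initial_substrate - remaining_substrate) / initial_substrate


# Hypothetical numbers: 50 mM substrate at the start, 12.5 mM left at the sampling time.
conversion = percentage_conversion(50.0, 12.5)
```

For the hypothetical numbers shown, 37.5 mM of the 50 mM has been consumed, so the conversion is 75%.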
"Cofactor" refers to a substance that is essential or beneficial for the activity of an enzyme. In the case of leucine dehydrogenase, the cofactor is typically a nicotinamide cofactor.
[0032] "Thermostable protein" refers to a polypeptide that maintains comparable activity (more than, for example, 50% to 70%) even after exposure to high or low temperatures (10 - 20°C or 30 - 60°C) for a specific duration (e.g., 0.5 - 24 hrs), when compared to the untreated polypeptide.
[0033] "pH stable protein" refers to a polypeptide that maintains comparable activity (more than, for example, 50% to 70%) even after exposure to high or low pH levels (4.5 - 6 or 8 - 12) for a specific duration (e.g., 0.5 - 24 hrs), when compared to the untreated polypeptide.
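The thermostability and pH-stability definitions above both compare the activity of a treated sample with that of the untreated polypeptide. A minimal sketch of that comparison follows; the default threshold of 50% is taken from the lower end of the example range in the text, and the function names and activity values are hypothetical.

```python
def retained_activity_percent(treated_activity, untreated_activity):
    """Activity of the treated sample as a percentage of the untreated control."""
    if untreated_activity <= 0:
        raise ValueError("untreated activity must be positive")
    if treated_activity < 0:
        raise ValueError("treated activity cannot be negative")
    return 100.0 * treated_activity / untreated_activity


def is_stable(treated_activity, untreated_activity, threshold_percent=50.0):
    """True if the retained activity exceeds the chosen threshold (assumed 50% by default)."""
    return retained_activity_percent(treated_activity, untreated_activity) > threshold_percent


# Hypothetical activities, e.g. percentage conversion after a temperature or pH challenge.
retained = retained_activity_percent(60.0, 80.0)
stable_at_50 = is_stable(60.0, 80.0)
stable_at_70 = is_stable(60.0, 80.0, threshold_percent=70.0)
unstable_example = is_stable(36.0, 80.0)
```

The same check applies to either definition; only the treatment (temperature or pH exposure, and its duration) differs, and those conditions are recorded outside this function.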
[0034] In the context of the polypeptides shown here, "amino acid" or "residue" refers to the specific building block at a particular position in the sequence (e.g., X288 indicates the "amino acid" or "residue" at position 288).
[0035] When an amino acid is incorporated into a peptide or polypeptide, it is classified as an "acidic amino acid or residue" if it is a hydrophilic amino acid or residue with a side chain that has a pKa value lower than approximately 6. Acidic amino acids usually have negatively charged side chains at physiological pH due to the loss of a hydrogen ion. L-Glu (E) and L-Asp (D) are examples of genetically encoded acidic amino acids.
[0036] When an amino acid is incorporated into a peptide or polypeptide, it is classified as a "basic amino acid or residue" if it is a hydrophilic amino acid or residue with a side chain that has a pKa value higher than approximately 6. Basic amino acids typically have positively charged side chains at physiological pH due to their association with hydronium ions. L-Arg (R) and L-Lys (K) are two examples of genetically encoded basic amino acids. "Polar amino acid or residue" refers to a hydrophilic amino acid or residue with a side chain that is uncharged at physiological pH but has at least one bond in which the shared pair of electrons is held more closely by one of the two atoms. The genetically encoded polar amino acids include L-Asn (N), L-Gln (Q), L-Ser (S), and L-Thr (T). A "non-polar amino acid or residue" is a hydrophobic amino acid or residue with a side chain that is uncharged at physiological pH and in which the pair of electrons shared by two atoms is generally held equally by each atom, i.e., the side chain is not polar. The genetically encoded non-polar amino acids include L-Gly (G), L-Leu (L), L-Val (V), L-Ile (I), L-Met (M), and L-Ala (A).
[0037] On the other hand, a "hydrophilic amino acid or residue" is defined as having a side chain that exhibits hydrophobicity of less than zero, according to the normalized consensus hydrophobicity scale. The genetically encoded hydrophilic amino acids are L-Thr (T), L-Ser (S), L-His (H), L-Glu (E), L-Asn (N), L-Gln (Q), L-Asp (D), L-Lys (K), and L-Arg (R).
[0038] According to the normalized consensus hydrophobicity scale, an amino acid or residue is considered "hydrophobic" if its side chain has a hydrophobicity value higher than zero. The genetically encoded hydrophobic amino acids are L-Pro (P), L-Ile (I), L-Phe (F), L-Val (V), L-Leu (L), L-Trp (W), L-Met (M), L-Ala (A), and L-Tyr (Y).
[0039] An "aromatic amino acid or residue" refers to a hydrophilic or hydrophobic amino acid or residue that contains at least one aromatic or heteroaromatic ring in its side chain. The genetically encoded aromatic amino acids are L-Phe (F), L-Tyr (Y), and L-Trp (W). L-His (H) can be categorized as either an aromatic residue due to its side chain's heteroaromatic ring or as a basic residue due to the pKa of its heteroaromatic nitrogen atom.
[0040] An "aliphatic amino acid or residue" is a hydrophobic amino acid or residue with an aliphatic hydrocarbon side chain. The genetically encoded aliphatic amino acids are L-Ala (A), L-Val (V), L-Leu (L), and L-Ile (I).
[0041] A difference in amino acids or residues between a polypeptide sequence and a reference sequence at a specific location is referred to as an "amino acid difference or residue difference". For example, where the reference sequence contains valine at position X43, a change to any residue other than valine is a residue difference at that position. An enzyme may have one or more residues that differ from the reference sequence, usually indicated by a list of the specific locations where the modifications occur. Amino acid residues with charges in their side chains are referred to as "charged residues." Negatively charged residues include L-Asp (D) and L-Glu (E), while positively charged residues include L-Arg (R) and L-Lys (K). In certain embodiments, "salt bridges" are the interactions between charged residues, specifically between L-Arg (R) and either L-Asp (D) or L-Glu (E), as well as between L-Lys (K) and either L-Asp (D) or L-Glu (E). These charged residue interactions help stabilize enzyme structures.
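The residue-difference notation can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length and reports each difference as "X", the 1-based position, and the variant residue, mirroring the style of the mutation lists in this description. The function name and the toy sequences are hypothetical and are not taken from the SEQ IDs.

```python
from typing import List


def residue_differences(reference: str, variant: str) -> List[str]:
    """List residue differences as 'X<position><variant residue>'.

    Assumes both sequences are already aligned and of equal length
    (an assumption made for this illustration only).
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    for position, (ref_res, var_res) in enumerate(zip(reference, variant), start=1):
        if ref_res != var_res:
            differences.append(f"X{position}{var_res}")
    return differences


# Hypothetical toy sequences: valine at position 3 is replaced by alanine.
example = residue_differences("MKVLA", "MKALA")
print(example)
```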
[0042] Interactions
[0043] Understanding the interactions between two atoms or sets of atoms is essential for studying any reaction or reactive complex. A well-known example of an energetically controlled biological system with many such interactions is the enzyme-substrate complex, in which the substrate is bound to the enzyme’s active site and is converted into the product with the help of the enzyme’s amino acids. The path from substrate entry to product release involves multiple steps. During entry, the amino acids along the entry path anchor or interact with the substrate’s atoms to orient the substrate along a specific path. Once the substrate is in the pocket, the pocket residues interact with it and stabilize it so that it is ready to react. During the reaction, many crucial residues interact with the substrate to stabilize the intermediate states. When the product is formed, its atoms interact with amino acids in the pocket and along the dissociation path so that the product can leave the protein.
[0044] During the reaction process described above, different amino acids of the enzyme show different types of interactions with the substrate, such as ionic bonds, hydrogen bonds, van der Waals interactions, other non-covalent interactions, and localized and delocalized pi-pi stacking. These interactions are necessary for maintaining an energetically feasible conformation of the substrate.
[0045] NCI interactions
[0046] Non-covalent interactions are interactions in which molecules, or parts of the same molecule, interact electromagnetically with one another without sharing electrons.
[0047] NCI index and electron density
[0048] The NCI Index is an analysis that captures hydrogen bonds, ionic interactions, van der Waals interactions and other non-covalent interactions, and provides a 2D reduced density gradient map and a 3D electronic density map. Understanding these interactions in the enzyme-substrate complex is necessary for studying its fundamentals. The NCI Index is derived using the NCI Analysis tool. The tool is based on key concepts of Density Functional Theory (DFT), in which the electronic density can be directly related to the energy of interaction between atoms of the enzyme-substrate complex; the density of individual atoms or groups of atoms therefore provides energetics that help derive the energetic state of the complex.
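As an illustration of the quantities behind such maps, the sketch below evaluates the reduced density gradient in its standard NCI form, s = |∇ρ| / (2(3π²)^(1/3) ρ^(4/3)), together with the density signed by the second Hessian eigenvalue, on a regular grid. It uses finite differences and a toy Gaussian density; the function names are hypothetical, and this is not the NCI Analysis tool itself, which may work differently.

```python
import numpy as np


def reduced_density_gradient(rho, spacing):
    """Reduced density gradient s on a regular 3D grid (finite differences)."""
    gx, gy, gz = np.gradient(rho, spacing)
    grad_norm = np.sqrt(gx**2 + gy**2 + gz**2)
    prefactor = 1.0 / (2.0 * (3.0 * np.pi**2) ** (1.0 / 3.0))
    return prefactor * grad_norm / rho ** (4.0 / 3.0)


def sign_lambda2_rho(rho, spacing):
    """Electron density multiplied by the sign of the second Hessian eigenvalue."""
    first = np.gradient(rho, spacing)
    hessian = np.empty(rho.shape + (3, 3))
    for i, component in enumerate(first):
        second = np.gradient(component, spacing)
        for j in range(3):
            hessian[..., i, j] = second[j]
    # Symmetrise to remove small finite-difference asymmetries.
    hessian = 0.5 * (hessian + np.swapaxes(hessian, -1, -2))
    eigenvalues = np.linalg.eigvalsh(hessian)  # ascending order
    return np.sign(eigenvalues[..., 1]) * rho


# Toy density: a single Gaussian centred on the grid (illustrative only).
axis = np.linspace(-2.0, 2.0, 21)
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
rho = np.exp(-(x**2 + y**2 + z**2))
spacing = axis[1] - axis[0]

s = reduced_density_gradient(rho, spacing)
signed_rho = sign_lambda2_rho(rho, spacing)
```

Plotting `s` against `signed_rho` for all grid points gives the kind of 2D graph referred to here, while iso-surfaces of `s` coloured by `signed_rho` correspond to the 3D view.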
[0049] Natural active site
[0050] In this invention, "natural active site" refers to the active site that occurs naturally in the core of an enzyme. The natural active site carries out the native reaction or function of the enzyme using its catalytic residues.
[0051] 3DCNN for matching the 3D pockets:
[0052] The 3D NCI plot is a visual representation of the reduced density gradient and is typically depicted as iso-surfaces coloured using a strength scale. The strength of each iso-surface is generally determined by calculating the product of the electron density and the second eigenvalue (λH) of the Hessian of the electron density at each point on the iso-surface. The sign of λH indicates whether the interaction is attractive or repulsive. We have implemented the iterative closest point (ICP) algorithm for 3D matching of the electron density maps generated from the 3D NCI plot. The ICP algorithm is an iterative optimization algorithm that seeks to minimize the distance between two sets of points by iteratively aligning them using a rigid transformation. To use the ICP algorithm for electron density matching, we first convert the electron density maps into point clouds. We then apply the ICP algorithm to iteratively align the point clouds, starting with an initial guess for the transformation and refining it at each iteration. At each iteration, the ICP algorithm finds the closest point in the target point cloud for each point in the source point cloud, using a suitable distance metric such as the Euclidean distance or the Chamfer distance, and computes the rigid transformation that best aligns the corresponding points, using techniques such as the singular value decomposition (SVD) or quaternion-based rotation. This transformation is then applied to the source point cloud to bring it closer to the target point cloud.
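The ICP loop described above can be sketched as follows. This is a minimal illustration that assumes Euclidean nearest-neighbour matching by brute force and an SVD-based (Kabsch) rigid fit; the function names, the iteration limit, the tolerance and the toy lattice are all hypothetical and do not reproduce the implementation used in the invention.

```python
import numpy as np


def best_rigid_transform(source, target):
    """SVD-based rotation R and translation t that best map source onto target."""
    src_centroid = source.mean(axis=0)
    tgt_centroid = target.mean(axis=0)
    covariance = (source - src_centroid).T @ (target - tgt_centroid)
    u, _, vt = np.linalg.svd(covariance)
    # Guard against reflections so that R is a proper rotation.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    translation = tgt_centroid - rotation @ src_centroid
    return rotation, translation


def nearest_neighbours(source, target):
    """Index of, and Euclidean distance to, the closest target point per source point."""
    distances = np.linalg.norm(source[:, None, :] - target[None, :, :], axis=2)
    index = distances.argmin(axis=1)
    return index, distances[np.arange(len(source)), index]


def icp(source, target, max_iterations=50, tolerance=1e-8):
    """Align source to target; return the aligned points and the mean residual."""
    current = np.asarray(source, dtype=float).copy()
    previous_error = np.inf
    for _ in range(max_iterations):
        index, distances = nearest_neighbours(current, target)
        error = distances.mean()
        if abs(previous_error - error) < tolerance:
            break
        rotation, translation = best_rigid_transform(current, target[index])
        current = current @ rotation.T + translation
        previous_error = error
    _, distances = nearest_neighbours(current, target)
    return current, float(distances.mean())


# Toy point cloud: a 3x3x3 lattice, slightly rotated and shifted.
grid = np.arange(3) * 2.0
target = np.array([[gx, gy, gz] for gx in grid for gy in grid for gz in grid])
angle = np.deg2rad(5.0)
rotation_z = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                       [np.sin(angle), np.cos(angle), 0.0],
                       [0.0, 0.0, 1.0]])
source = target @ rotation_z.T + np.array([0.1, 0.0, 0.0])

aligned, residual = icp(source, target)
```

Like any ICP variant, this sketch depends on a reasonable initial alignment; for larger initial misalignments the nearest-neighbour matches can be wrong and the loop may settle in a local minimum.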
[0053] Hetero-active site
[0054] In this invention, “hetero-active site” refers to an additional active site, engineered into an enzyme alongside its natural active site, that performs a reaction different from that of the native active site.
[0055] Database
[0056] In this invention, “database” refers to the storage of the 2D reduced density gradient maps, the 3D electronic density maps, and the topology and algebraic properties of the pockets from the Reference NCI Indices and the Query NCI Indices.
[0057] Training set
[0058] In this invention, the term 'training set' refers to the collection of catalytic conformations or enzyme-substrate complexes utilized to generate 2D reduced density gradient maps and 3D electronic density maps through NCI analysis, as well as the topology and algebraic properties of the pocket. These data are stored in a database for algorithm training purposes and are collectively known as the training set.
[0059] Testing set
[0060] In this invention, “testing set” refers to the list of pocket-substrate conformations used to derive 2D reduced density gradient maps and 3D electronic density maps from the NCI Analysis, together with the topology and algebraic properties of the pocket. These data are stored in a database and are collectively called the testing set.
[0061] 3D Electronic Density map
[0062] In this invention, “3D Electronic Density map” refers to the output of NCI analysis in which electronic clouds define the type of interaction at the atomic level in the given complex. If the given complex is an enzyme-substrate complex, the electronic clouds appear between the substrate and the surrounding amino acids.
[0063] 2D reduced density gradient map
[0064] In this invention, “2D reduced density gradient map” refers to the output of NCI Analysis presented as a 2D graph that represents attractive and repulsive forces. The map gives a representative view of how the complex is built in terms of local interactions.
[0065] CNN
[0066] The NCI Index captures hydrogen bonds, ionic interactions, van der Waals interactions and other non-covalent interactions, which are analysed here using a 3D CNN. The NCI Analysis provides the non-covalent interactions for the given enzyme-substrate (ES) complex, which can be represented as 2D images. The NCI tool is based on key concepts of Density Functional Theory (DFT), in which the electronic density can be directly related to the energy of interaction between atoms of the ES complex; the density of individual atoms or groups of atoms therefore provides energetics that help derive the energetic state of the complex. Interactions that are masked in the 2D plots were explored using the 3D CNN, which provides a 3-dimensional understanding of the complex. The binding pose, the energy of the ES complex, and the reactive distances and angles between atoms of the substrate and the catalytic residues were captured using the 3D CNN.
[0067] Reference NCI Indices
[0068] In this context, Reference NCI Indices refers to the 2D reduced density gradient maps, 3D electronic density maps, and topology and algebraic properties developed from the collected diverse sequences of different classes of enzymes.
[0069] Query NCI Indices
[0070] In this context, Query NCI Indices refers to the 2D reduced density gradient map, 3D electronic density map, and topology and algebraic properties developed for the target or query protein or enzyme in which a hetero-active site is to be established.
[0071] Homo-active site enzyme
[0072] The developed technology can be used to create an additional active site that performs a function similar to that of the native active site. In this invention, a homo-active site enzyme is an enzyme capable of catalysing the same reaction at multiple active sites.
[0073] Hetero active site enzyme
[0074] The developed technology can be used to create an additional active site that performs a different reaction from that of the native active site. In this invention, a hetero-active site enzyme is an enzyme capable of catalysing two different reactions at separate active sites.
[0075] Feature-Based Cross-Correlation Technique
[0076] Cross-correlation is a standard statistical tool used to measure the similarity between two images or datasets as a function of the displacement of one relative to the other. When applied in the context of image processing or computer vision, cross-correlation can help identify the presence of a template or a feature within a larger image by providing a measure of similarity.
[0077] The Feature-Based Cross-Correlation technique augments the traditional method by focusing on distinct and prominent features of the image or signal rather than raw intensity values. Here’s a brief overview:
[0078] Feature Extraction: Before performing cross-correlation, salient features from both the main image and the template are extracted. Features could be corners, edges, or other distinct structures within the image.
[0079] Feature Matching: Once features are extracted from both images, the next step is to find matches. This could be done using descriptors derived from the features and measuring their similarity.
[0080] Cross-Correlation Using Features: Rather than using the raw intensity values, the cross-correlation is performed based on the matched features. This is often more robust and can handle variations in illumination, minor deformations, and noise better than traditional cross-correlation.
[0081] Localization: The outcome of the correlation process will highlight areas in the main image that are most similar to the template based on the features. Peaks in the cross-correlation output indicate potential matches.
[0082] In summary, Feature-Based Cross-Correlation combines feature extraction with the traditional cross-correlation method to offer more robustness to noise, variations in illumination, and minor deformations in the image. Additionally, by narrowing down the regions of interest based on features, it can be computationally more efficient than searching the entire image space, making it an efficient approach to template matching and similarity measurement in images.
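The overview above can be sketched in a few lines, under simplifying assumptions: gradient magnitude stands in for the extracted features, and the matching and correlation steps are merged into a normalised cross-correlation over the feature maps. The function names and the toy image are hypothetical, and the features actually used in the invention are not specified here.

```python
import numpy as np


def edge_features(image):
    """Feature extraction: gradient magnitude as a simple edge feature."""
    grad_rows, grad_cols = np.gradient(np.asarray(image, dtype=float))
    return np.hypot(grad_rows, grad_cols)


def normalised_cross_correlation(image, template):
    """Slide the template over the image and score each offset in [-1, 1]."""
    t_rows, t_cols = template.shape
    t_centred = template - template.mean()
    t_norm = np.sqrt((t_centred**2).sum())
    scores = np.zeros((image.shape[0] - t_rows + 1, image.shape[1] - t_cols + 1))
    for r in range(scores.shape[0]):
        for c in range(scores.shape[1]):
            window = image[r:r + t_rows, c:c + t_cols]
            w_centred = window - window.mean()
            denominator = np.sqrt((w_centred**2).sum()) * t_norm
            if denominator > 0:
                scores[r, c] = (w_centred * t_centred).sum() / denominator
    return scores


def locate(image, template):
    """Localization: offset of the correlation peak computed on feature maps."""
    scores = normalised_cross_correlation(edge_features(image), edge_features(template))
    row, col = np.unravel_index(scores.argmax(), scores.shape)
    return (int(row), int(col)), float(scores[row, col])


# Toy example: a 7x7 template containing a small pattern, placed in a 20x20 image.
template = np.zeros((7, 7))
template[2:5, 2:5] = np.arange(1.0, 10.0).reshape(3, 3)
image = np.zeros((20, 20))
image[5:12, 8:15] = template

position, score = locate(image, template)
```

Working on feature maps rather than raw intensities is what gives the robustness to illumination changes mentioned above, since a constant offset in intensity leaves the gradients unchanged.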
[0083] In this invention, “catalytic conformation” refers to an enzyme-substrate complex in which the distance between the carboxyl carbon of the substrate and the reactive atom of the catalytic residue Ser is less than 4.0 Å. The catalytic conformation can also be called the near-attack conformation.
[0084] In some embodiments, the developed technology is used to engineer a transaminase and a ketoreductase to harbour hetero-lipase active sites for performing hydrolysis reactions.
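The distance criterion for a catalytic conformation can be expressed as a short check. The sketch assumes Cartesian coordinates in Å for the two atoms; the function name and the example coordinates are hypothetical, and the 4.0 Å default simply follows the definition given in this paragraph.

```python
import math


def is_catalytic_conformation(carboxyl_carbon, ser_reactive_atom, cutoff=4.0):
    """True if the substrate carbon lies within the cutoff (in Å) of the Ser atom."""
    return math.dist(carboxyl_carbon, ser_reactive_atom) < cutoff


# Hypothetical coordinates in Å.
close_pose = is_catalytic_conformation((0.0, 0.0, 0.0), (3.0, 0.0, 0.0))
distant_pose = is_catalytic_conformation((0.0, 0.0, 0.0), (0.0, 4.5, 0.0))
```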
[0085] In some embodiments, the engineered transaminase enzymes are mutated or substituted at one or multiple places, one at a time or simultaneously to develop the hetero-lipase active site.
[0086] In some embodiments, the transaminase enzyme engineered to have a hetero-lipase active site is given in SEQ ID 1 to 20.
[0087] In some embodiments, the engineered transaminase with a hetero-lipase active site is engineered to accommodate a broad range of substrates. In the current invention, the engineered transaminase enzyme with a hetero-lipase active site is tested against different substrates, where the substrate undergoes hydrolysis followed by an amination reaction.
[0088] In some embodiments, the engineered transaminase with hetero-lipase active site is tested against (5-methoxy-3,5-dioxopentyl)(methyl)phosphinic acid, methyl 3-oxohexanoate, methyl (4R)-4-cyclopropyl-3-oxopentanoate, methyl (5R)-5-methyl-3-oxooctanoate, methyl 3-oxo-3-phenylpropanoate, methyl 3-(4-methylphenyl)-3-oxopropanoate, methyl 3-(3-methylphenyl)-3-oxopropanoate, methyl 3-oxo-4-phenylbutanoate, methyl 3-(4-methoxyphenyl)-3-oxopropanoate, methyl 3-(3,4-dimethoxyphenyl)-3-oxopropanoate, methyl 3-(4-chlorophenyl)-3-oxopropanoate, methyl 3-(4-fluorophenyl)-3-oxopropanoate, methyl 3-oxo-5-(2,4,5-trifluorophenyl)pentanoate, and methyl 3-oxo-5-phenylpentanoate.
[0089] In the embodiments, the activity of the engineered transaminase with hetero-lipase active site was obtained against 5-[hydroxy(methyl)phosphoryl]-3-oxopentanoic acid, 3-oxohexanoic acid, (4R)-4-cyclopropyl-3-oxopentanoic acid, (5R)-5-methyl-3-oxooctanoic acid, 3-oxo-3-phenylpropanoic acid, 3-(4-methylphenyl)-3-oxopropanoic acid, 3-(3-methylphenyl)-3-oxopropanoic acid, 3-oxo-4-phenylbutanoic acid, 3-(4-methoxyphenyl)-3-oxopropanoic acid, 3-(3,4-dimethoxyphenyl)-3-oxopropanoic acid, 3-(4-chlorophenyl)-3-oxopropanoic acid, 3-(4-fluorophenyl)-3-oxopropanoic acid, 3-oxo-5-(2,4,5-trifluorophenyl)pentanoic acid, and 3-oxo-5-phenylpentanoic acid.
[0090] In some embodiments, the engineered transaminase enzyme was tested against the above-mentioned substrates and the activity data are listed in Table 4.
[0091] In some embodiments, the engineered transaminase was tested for amination activity and the activity data are given in Table 4.
Sl.No | ID | Mutations | Lipase relative activity (%) | Transaminase relative activity (%)
1 | SEQ ID 1 | X198E, X228V, X236L, X238A, X268S, X269D, X276S, X277L, X349H, X350T, X354L | + | +++++
2 | SEQ ID 2 | X228F, X236I, X237A, X239S, X269D, X276S, X277Y, X349R | + | +++++
3 | SEQ ID 3 | X238A, X239D, X269D, X276S, X277E, X349R, X373S, X357R | ++ | +++++
4 | SEQ ID 4 | X198E, X228V, X236I, X237A, X268A, X269D, X276S, X277L, X350E | ++ | +++++
5 | SEQ ID 5 | X236I, X237A, X239A, X269D, X276S, X277L, X349R, X353A, X354L | ++ | +++++
6 | SEQ ID 6 | X198H, X236L, X238A, X268S, X269D, X276S, X277Y, X350E, X353A | ++ | +++++
7 | SEQ ID 7 | X228F, X236I, X237A, X239S, X269D, X276S, X277E, X349R, X350T | ++ | +++++
8 | SEQ ID 8 | X236L, X238A, X268A, X269D, X276S, X277L, X349H, X350E, X353A | +++ | +++++
9 | SEQ ID 9 | X228F, X236I, X237A, X239A, X268A, X269D, X276S, X277Y, X349R, X350E, X354I, X361E, X373S | +++ | +++++
10 | SEQ ID 10 | X198H, X236L, X238A, X268S, X269D, X276S, X277L, X349H, X350T, X353A, X361Q, X365D, X371E | ++ | +++++
11 | SEQ ID 11 | X228F, X236I, X237A, X239A, X268A, X269D, X276S, X277E, X349R, X350T, X354L, X361E, X379I | +++ | +++++
12 | SEQ ID 12 | X236L, X238A, X268S, X269D, X276S, X277L, X349H, X350E, X353A, X361Q, X365R, X373T | ++ | +++++
13 | SEQ ID 13 | X198H, X228F, X236I, X237A, X239S, X269D, X276S, X277Y, X349R, X350T, X354I, X357H, X371T | +++ | +++++
14 | SEQ ID 14 | X228V, X236L, X238A, X268A, X269D, X276S, X277L, X349H, X350E, X353A, X373T | +++ | +++++
15 | SEQ ID 15 | X228F, X236I, X237A, X239A, X268A, X269D, X276S, X277E, X349R, X350T, X354I, X353A, X361E, X373S | ++ | +++++
16 | SEQ ID 16 | X236L, X238A, X268S, X269D, X276S, X277L, X349H, X350E, X353A, X354L, X371A | +++ | +++++
17 | SEQ ID 17 | X198H, X228F, X236I, X237A, X239S, X269D, X276S, X277Y, X349R, X350T, X354I, X357R, X371T | +++ | +++++
18 | SEQ ID 18 | X228F, X236I, X237A, X239S, X268A, X269D, X276S, X277Y, X349R, X350E, X354L, X357R, X361E, X364I, X371A | ++++ | +++++
19 | SEQ ID 19 | X236L, X238A, X239D, X268S, X269D, X276S, X277L, X349H, X350T, X353A, X361Q, X365A, X373T, X421E, X354I | ++++ | +++++
20 | SEQ ID 20 | X228F, X236I, X237A, X239A, X268A, X269D, X276S, X277E, X349R, X350E, X354I, X353A, X361Q, X364I, X379I | ++++ | +++++
[0092] Table 4: The table shows the activity data of the hetero-lipase active site for the hydrolysis reaction and of the transaminase for the amination reaction. The converted product of the lipase was bound in the transaminase active site as a substrate. The indication “+” highlights the activity of the enzyme for hydrolysis and amination. “+” indicates activity of ≤20%, “++” indicates activity of >20% and ≤35%, “+++” indicates activity of >35% and ≤55%, “++++” indicates activity of >55% and ≤80%, and “+++++” indicates activity of >80% and ≤99%.
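For readers who want to convert measured relative activities into the symbols used in the table, the following sketch encodes the activity bands. It assumes that each upper bound is inclusive and treats values outside 0 to 99% as undefined, since the legend does not cover them; the function name is hypothetical.

```python
def activity_symbol(percent):
    """Map a relative activity (%) to the '+' notation of the activity tables."""
    if percent < 0:
        raise ValueError("relative activity cannot be negative")
    # (inclusive upper bound in %, symbol); inclusivity is an assumption.
    bands = [(20, "+"), (35, "++"), (55, "+++"), (80, "++++"), (99, "+++++")]
    for upper_bound, symbol in bands:
        if percent <= upper_bound:
            return symbol
    raise ValueError("values above 99% are not covered by the legend")


symbols = [activity_symbol(value) for value in (12, 20, 30, 50, 70, 95)]
```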
[0093] In some embodiments, the engineered ketoreductase enzyme is mutated or substituted at one or multiple places, one at a time or simultaneously to develop the hetero-lipase active site.
[0094] In some embodiments, the Ketoreductase enzyme engineered to have a hetero-lipase active site is given in SEQ ID 21 to 33.
[0095] In some embodiments, the engineered Ketoreductase with hetero-lipase active site was tested for hydrolysis reaction followed by reduction reaction.
[0096] In some embodiments, the engineered Ketoreductase with hetero-lipase active site was tested against ethyl 2-oxo-4-phenylbutanoate (ethyl benzylpyruvate) and tert-butyl (5S)-6-chloro-5-hydroxy-3-oxohexanoate for hydrolysis reaction.
[0097] In some embodiments, the engineered Ketoreductase with hetero-lipase active site enzyme was tested against the above-mentioned substrates and the activity data are listed in Table 5. The same table gives the activity data for reduction by the natural active site of the ketoreductase.
Sl.No | ID | Mutations | Lipase relative activity (%) | Ketoreductase relative activity (%)
1 | SEQ ID 21 | X212D, X247H, X237S | + | +++++
2 | SEQ ID 22 | X212D, X247H, X237S, X238G | + | +++++
3 | SEQ ID 23 | X212D, X247H, X237S, X238G, X239G | + | +++++
4 | SEQ ID 24 | X55V, X120I, X212D, X175S, X230I, X237S, X238G, X239G, X247H, X290C | ++ | +++++
5 | SEQ ID 25 | X60I, X130V, X212D, X180T, X245V, X237S, X238G, X239G, X247H, X310I | ++ | +++++
6 | SEQ ID 26 | X65T, X140V, X212D, X190I, X255S, X237S, X238G, X239G, X247H, X320G | ++ | +++++
7 | SEQ ID 27 | X55I, X120V, X212D, X175T, X230I, X237S, X238G, X239G, X247H, X290C | ++ | +++++
8 | SEQ ID 28 | X60T, X130V, X212D, X180S, X245I, X237S, X238G, X239G, X247H, X310V, X350V, X365T | ++ | +++++
9 | SEQ ID 29 | X65V, X140T, X212D, X190S, X255I, X237S, X238G, X239G, X247H, X320V, X380C | ++ | +++++
10 | SEQ ID 30 | X55V, X120I, X212D, X175T, X230S, X237S, X238G, X239G, X247H, X290V, X330C, X350G, X410D, X440T | +++ | +++++
11 | SEQ ID 31 | X60I, X130V, X212D, X180S, X245T, X237S, X238G, X239G, X247H, X285V, X310I, X365D, X420T, X450V | +++ | +++++
12 | SEQ ID 32 | X65T, X140I, X212D, X190S, X255V, X237S, X238G, X239G, X247H, X300G, X340C, X370T, X430D, X460V | ++++ | +++++
13 | SEQ ID 33 | X70V, X150T, X212D, X200I, X265V, X237S, X238G, X239G, X247H, X315C, X355G, X380T, X440V, X470D | ++++ | +++++
[0098] Table 5: The table shows the activity data of the hetero-lipase active site for the hydrolysis reaction and of the Ketoreductase enzyme for the reduction reaction. The converted product of the lipase was bound in the ketoreductase active site as a substrate. The indication “+” highlights the activity of the enzyme for hydrolysis and reduction. “+” indicates activity of ≤20%, “++” indicates activity of >20% and ≤35%, “+++” indicates activity of >35% and ≤55%, “++++” indicates activity of >55% and ≤80%, and “+++++” indicates activity of >80% and ≤99%.
BRIEF DESCRIPTION OF FIGURES
[0099] Figure 1: The reaction scheme shows A) the conversion of methyl 3-oxo-4-(2,4,5-trifluorophenyl)butanoate into 3-oxo-4-(2,4,5-trifluorophenyl)butanoic acid in the first step, catalysed by a lipase enzyme, and B) the amination of 3-oxo-4-(2,4,5-trifluorophenyl)butanoic acid, in which a transaminase enzyme converts the keto group into an amino group to produce (3R)-3-amino-4-(2,4,5-trifluorophenyl)butanoic acid as the final product. The reaction is a two-step reaction that requires two different enzymes in different media.
[0100] Figure 2: The reaction scheme shows A) the conversion of R-methyl-3-oxo-propanoate into R-methyl-3-oxo-propanoic acid using the engineered transaminase enzyme, which accommodates a new lipase active site and performs the hydrolysis function, and B) the amination of R-methyl-3-oxo-propanoic acid in the natural active site of the transaminase. The engineered transaminase can accommodate a range of substrates, as given by the R group, in both the engineered active site and the natural active site. The engineered enzyme offers an advantage over the multiple enzymes otherwise required for the same reaction.
[0101] Figure 3: The steps for incorporating a hetero-active site begin with gathering diverse sequences of hydrolase classes. The collected sequences are modelled using AlphaFold. Docking studies are carried out using the Autodock4 tool, generating 1000 different conformations for every reference enzyme. The most energetically feasible conformation is selected and subjected to Non-covalent interaction studies. Data such as a 2D reduced density gradient map, 3D electronic density map, topology, and algebraic properties are extracted and stored in the database as Reference NCI Indices. In the second step of the innovation, the target proteins or enzymes considered for incorporating a hetero-active site are subjected to pocket identification and docking studies. For the docking studies, the reference substrate is used and 1000 different conformations are generated across the filtered pockets. The conformation with the lowest energy is chosen and subjected to NCI Analysis. From the NCI Analysis, data such as a 2D reduced density gradient map, 3D electronic density map, topology, and algebraic properties are extracted and stored in the database as Query NCI Indices. The Query NCI Indices are then compared against the Reference NCI Indices using a 2D-CNN-based image-matching protocol, with the objective of identifying the closest match of the query to the reference (natural) pocket. A matrix is created between the Query NCI Indices and the Reference NCI Indices, categorising the data into three match types: high, medium, and low. On the "high match" pocket, the active site having catalytic residues is developed.
[0102] Figure 4: The process of introducing a novel hetero-active site into the transaminase enzyme involves A) identification of buried pockets, B) identification of dents or pockets on the surface, and C) docking studies to fit the reference substrate molecule into the filtered pockets. The small hidden pockets inside the protein and the dents on the surface are potential regions for introducing new active sites.
[0103] Figure 5: The architecture of the machine learning-based (2+1)D convolutional neural network. The model was trained on the provided non-redundant lipase structures in which the active site held the substrate probe in a near-attack conformation. The algebraic properties and NCI interactions were considered as the key features for training the model. The protein or enzyme in which a new catalytic site is to be incorporated is given as input to the protocol together with the reference substrate. The reference substrate is subjected to docking studies to find potential pockets where the substrate can attain an energetically feasible conformation. A docked conformation from each pocket is subjected to NCI Analysis to produce 2D reduced density gradient maps and 3D electronic density maps; all possible pockets, along with the substrate in them, are analysed in this way. The topology and algebraic parameters are also extracted by finding the interacting residues around the substrate and determining the volume and shape of the pocket. The framework repeatedly checks whether a pocket matches any of the Reference NCI Indices and categorises the data as high match, medium match or low match. The high-match pocket is selected for incorporating catalytic residues.
[0104] Figure 6: The output of the process is categorized into highly matched, medium matched, and low matched pockets. Those in the high-matched category demonstrate a similarity in 3D density maps surrounding the substrate when compared to references. Different pockets from Query NCI Indices were compared against reference NCI Indices.
[0105] Figure 7: Depicts the top-ranked pockets from the high-match list, including A) Reference-1, B) Reference-2, C) *1 pocket, and D) #1 from the high-match list, alongside E) #2 from a low-match pocket.
[0106] Figure 8: A) Quantum Mechanical (QM) studies were conducted on Reference-1, #1, and *1 pockets to assess the feasibility of reactions. B) The schematic outlines the hydrolysis mechanism examined in the QM studies, showcasing the ground state, transition state, and intermediate. Pockets #1 and #2 exhibit interactions similar to Reference-1 and Reference-2, respectively, corroborated by the analogous 3D maps identified by the process. The process’s predictions align with the QM studies, indicating that the *1 pocket necessitates higher energy for the reaction.
[0107] Figure 9: Illustrates A) the Pillion residues responsible for stabilizing the catalytic residues, B) the designed hetero-lipase active site residues showing interaction with reference substrate. The process evaluates individual features present in the reference and compares them with the provided input pockets. The figures detail the key interactions or features identified by the process.
[0108] Figure 10: The developed hetero-active site on the S-Transaminase enzyme, where amination occurs at the natural active site and hydrolysis at the engineered active site. A) The 2D reduced density gradient map of the hetero-active site where the hydrolysis reaction occurs, B) the structure of the engineered S-Transaminase enzyme, where the natural active site is highlighted in dark pink colour and the engineered hetero-active site is shown in green colour, and C) the 2D reduced density gradient map of the natural active site where amination occurs.
[0109] Figure 11: A) the interactions of methyl 3-oxo-4-(2,4,5-trifluorophenyl)-butanoate in the engineered hetero-active site and B) the interaction of 3-oxo-4-(2,4,5-trifluorophenyl)-butanoic acid in the native transaminase pocket. In both, A and B, different residues show different types of interactions which were explained using NCI interaction in Figure 9B.
[0110] Figure 12: The figure illustrates the binding conformation of acetophenone in the active site of the transaminase enzyme, subjected to NCI Analysis. A) It highlights the reduced electron density gradient maps around acetophenone, showcasing large and small binding pockets. B) Displayed are the residues surrounding acetophenone, including Lys, PLP, and other significant pocket residues. C) Depicted are the interactions between PLP, catalytic Lys, and acetophenone, and D) various interactions of acetophenone with surrounding residues along with its reduced electron density maps. E-F) Different binding conformations of acetophenone on tubulin and actin are presented, where similar non-covalent interactions are not observed, yet the pockets in actin and tubulin maintain the necessary interactions to stabilize acetophenone as a substrate.
[0111] Figure 13: Information captured by the NCI Analysis for the different conformations of lipase: I) interaction of catalytic residues with the substrate, II) 3D electron density maps showing the interaction of catalytic residues before the substrate enters the active site, III) 3D electron density maps showing the interaction of pillion residues with the catalytic Ser, His, and Asp, IV) 3D electron density maps showing the interaction of the substrate with the catalytic residues, V) 3D electron density maps showing the interactions of the substrate with the surrounding residues, and VI) interactions of active site residues with pillion residues.
SUMMARY
[0112] The present invention is a method for enhancing the functionality of proteins and enzymes by adding new active sites. This methodology allows a single protein or enzyme to have multiple active sites. As an illustration, a transaminase enzyme, typically engaged in amino group transfers, is engineered to have an additional site that facilitates hydrolysis reactions. Such enzymes with two distinct active sites performing different functions are referred to as "hetero-active site enzymes", where the lipase active site performs the hydrolysis reaction on the given substrate and the transaminase active site performs amination on the product of the hydrolysis reaction. If hetero-active site enzymes with two active sites are not used, each step of the reaction requires a separate enzyme: for instance, a lipase in one medium for hydrolyzing the supplied substrate and a transaminase in another medium, so that the distinct reaction conditions are preserved (Figure 1). The engineered transaminase enzyme is tested for lipase activity in the lab with multiple substrates such as (5-methoxy-3,5-dioxopentyl)(methyl)phosphinic acid, methyl 3-oxohexanoate, methyl (4R)-4-cyclopropyl-3-oxopentanoate, methyl (5R)-5-methyl-3-oxooctanoate, methyl 3-oxo-3-phenylpropanoate, methyl 3-(4-methylphenyl)-3-oxopropanoate, methyl 3-(3-methylphenyl)-3-oxopropanoate, methyl 3-oxo-4-phenylbutanoate, methyl 3-(4-methoxyphenyl)-3-oxopropanoate, methyl 3-(3,4-dimethoxyphenyl)-3-oxopropanoate, methyl 3-(4-chlorophenyl)-3-oxopropanoate, methyl 3-(4-fluorophenyl)-3-oxopropanoate, methyl 3-oxo-5-(2,4,5-trifluorophenyl)pentanoate, and methyl 3-oxo-5-phenylpentanoate.
The products of the lipase reaction with the above substrates are as follows: 5-[hydroxy(methyl)phosphoryl]-3-oxopentanoic acid, 3-oxohexanoic acid, (4R)-4-cyclopropyl-3-oxopentanoic acid, (5R)-5-methyl-3-oxooctanoic acid, 3-oxo-3-phenylpropanoic acid, 3-(4-methylphenyl)-3-oxopropanoic acid, 3-(3-methylphenyl)-3-oxopropanoic acid, 3-oxo-4-phenylbutanoic acid, 3-(4-methoxyphenyl)-3-oxopropanoic acid, 3-(3,4-dimethoxyphenyl)-3-oxopropanoic acid, 3-(4-chlorophenyl)-3-oxopropanoic acid, 3-(4-fluorophenyl)-3-oxopropanoic acid, 3-oxo-5-(2,4,5-trifluorophenyl)pentanoic acid, and 3-oxo-5-phenylpentanoic acid, which were used as substrates for the transaminase activity, respectively (Figure 2). The brief stepwise protocol of the method to incorporate a hetero-active site is given in Figures 3-11. The invention starts with the collection of six different classes of enzymes, each with different subclasses. The diverse sequences of hydrolases, oxidoreductases, transferases, lyases, isomerases, and ligases are collected and modelled using the AlphaFold modelling tool. The enzymes with specific activity are extracted from literature and protein databanks. The diverse sequences with a minimum of 90% identity are modelled and used for docking studies. The enzymes are shown to have activity with different substrates, with the specificity of the substrate determined by the nature of the pocket. Docking studies are conducted using the Autodock4 tool, and 1000 different conformations are generated for each reference protein. The energetically feasible best catalytic conformation is chosen. It is defined by a distance criterion: the distance between the reactive carboxyl carbon of the substrate and the reactive atom of the catalytic residue, Ser-OG, must be less than 3.5 Å.
The extracted catalytic conformation is subjected to Non-covalent interaction studies, and data such as a 2D reduced density gradient map, 3D electronic density map, topology and algebraic properties are extracted and stored in the database as Reference NCI Indices, which are used to develop the template or training database.
[0113] In the second step of the invention, a suitable starting point, essential for developing non-natural and new functions in enzymes and proteins, is considered. Proteins and enzymes are known to have many pockets and dents, including larger clefts or pockets. These pockets are identified, filtered, and subjected to docking studies using the reference substrate which was used in step 2. The docking studies generate 1000 different conformations of the reference substrate across the different filtered pockets. The lowest-energy conformation is selected and subjected to NCI Analysis.
[0114] From the NCI Analysis, similar data such as a 2D reduced density gradient map, a 3D electronic density map, and topological and algebraic properties are extracted and stored in the database as Query NCI Indices. The Query NCI Indices are treated as the testing dataset. The data for the reference enzyme-substrate complex and the query protein-substrate or enzyme-substrate complex are then compared using a 2D-CNN-based image-matching protocol, which compares the Query NCI Indices against the Reference NCI Indices. The objective is to identify the queries that closely match the natural pocket. A pairwise matrix is produced between the Query NCI Indices and the Reference NCI Indices, categorizing the data into three distinct match types: high, medium, and low. The active site is developed based on the "high match" protein-ligand or enzyme-substrate complexes. A pocket showing at least 50% 3D density matching, as derived using the 3D CNN, is taken forward, and the feature-based cross-correlation technique and ResNet are used for defining the catalytic residues.
[0115] The NCI Indices outputs of a high match and a low match were compared, and the 3D electronic density maps of the high match and the low match were observed to have interactions and 3D maps similar to those of the References. To examine this further, Quantum Mechanical (QM) studies were conducted to check the feasibility of the reaction in Reference-1, a high match pocket, and a low match pocket.
[0116] The protocol given above was then implemented on a transaminase, and a hetero-active site was developed. Another aim of the developed method is to differentiate an ordinary pocket from an active site of an enzyme. For example, if the same substrate is present in an active site and in a random pocket, the technology should be capable of differentiating the two. Acetophenone, a known transaminase substrate, was docked in the transaminase pocket and in the tubulin protein. The active site of transaminase holds a large binding pocket and a small binding pocket, while tubulin does not have similar characteristics in any of its pockets. The technology predicted the difference between the two pockets in the presence of acetophenone (Figure 12). In addition, the technology was further developed to analyse the interaction of pillion (beyond active site) residues with the substrate. Initially, the structural alignment of catalytic residues, the binding of the substrate in the pocket, the interaction of only the catalytic residues with the substrate, the interaction of pocket residues with the substrate, and the interaction of pillion residues with the substrate were explored. Non-covalent interactions that could not be observed using other tools were explored using NCI Analysis (Figure 13).
[0117] This invention holds substantial relevance, given the multifaceted roles proteins play as structural components, transporters, messengers, antibodies, and especially as enzymes in the industrial realm. By offering a structured method to introduce secondary active sites in proteins and enzymes, this invention paves the way for multi-functional enzymes, optimizing their utility in diverse applications, including drug production and other industrial processes.
DETAILED DESCRIPTION OF THE INVENTION:
[0118] The invention describes a methodology for constructing a hetero-active site on enzymes or proteins in addition to their natural function. The protocol involves multiple steps, starting with the collection of diverse sequences from different classes of enzymes reported for natural activity with their natural substrates. Sequences are collected from six classes of enzymes: hydrolases, oxidoreductases, transferases, lyases, isomerases and ligases. The collected diverse sequences were then modelled using AlphaFold and further used for docking studies. The docking studies were conducted using the Autodock4 tool, with a native substrate of the particular enzyme class. The docked conformation was subjected to NCI analysis to extract topological and algebraic properties, a 2D reduced density gradient map and a 3D electronic density map. The extracted properties and maps were stored as the Reference NCI Indices.
[0119] Enzymes and proteins are known to have multiple pockets and dents. Many are allosteric pockets; some are simple non-functional pockets and dents. These pockets were filtered based on pocket size and subjected to docking studies with the native substrate. The docked conformation was then subjected to NCI analysis, and the same topological and algebraic properties, a 2D reduced density gradient map and a 3D electronic density map were extracted and stored in the database as Query NCI Indices. The Reference and Query NCI Indices were then compared to develop a hetero-active site using 2D CNN, 3D CNN and feature-based cross-correlation methods, and ResNet was used for incorporating catalytic residues by comparing the 3D electronic density maps of the reference and query pockets. These steps are summarized in Figure 3, and stepwise explanations are given below.
[0120] Step 1: Collection of enzymes from different classes: There are six classes of enzymes, which perform various types of reactions and which also contain different subclasses. Sequences from the hydrolase, oxidoreductase, transferase, lyase, isomerase and ligase classes are collected and modelled using the AlphaFold modelling tool. The enzymes reported for a specific activity were extracted from literature and the protein databank. The diverse sequences of each class of enzymes are then stored in a database, which is used for docking studies. The diverse sequences are selected on the criterion that all the sequences are a minimum of 90% identical to each other. The AlphaFold-modelled relaxed structure was chosen for the next set of studies.
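As an illustration of the identity criterion in Step 1, the following minimal sketch computes pairwise percent identity between sequences that have already been aligned and checks that every pair meets a threshold. The function names, the toy sequences, and the convention of counting identity only over columns without gaps are assumptions made for this example; the specification does not state which alignment tool or identity definition was used.

```python
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    columns = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def all_pairs_meet_threshold(seqs: dict, threshold: float = 90.0) -> bool:
    """True if every pair of aligned sequences is at least `threshold` % identical."""
    return all(
        percent_identity(seqs[p], seqs[q]) >= threshold
        for p, q in combinations(seqs, 2)
    )


# Hypothetical aligned fragments, for illustration only.
toy = {"seqA": "ACDEFGHIKL", "seqB": "ACDEFGHIKV", "seqC": "ACDEFGHIKL"}
keep = all_pairs_meet_threshold(toy, threshold=90.0)
```

In practice the aligned strings would come from a multiple sequence alignment, and the threshold would be applied before the structures are modelled.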
[0121] Step 2: Docking studies with representative substrate: Enzymes of the different classes show activity with different types of substrates. The specificity for the substrate is defined by the nature of the pocket. The docking studies were conducted to establish a reference enzyme-substrate complex. Docking is performed on the native pocket of the enzymes with the reference substrate molecule of that class. The docking studies were conducted using the Autodock4 tool, with a grid size of 40, 40, and 40 in the X, Y, and Z directions respectively, a grid spacing of 0.225 Å, and the grid centre defined as the centre of the pocket. For each reference protein, 500 different conformations were generated.
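The grid settings in Step 2 can be checked with a short calculation. The sketch below assumes that the stated grid size of 40 corresponds to the AutoGrid `npts` value (the number of intervals per axis), so that the box edge is 40 × 0.225 Å = 9.0 Å, and that the "centre of the pocket" is taken as the unweighted centroid of the pocket atoms. Both are assumptions for illustration, and the helper names are hypothetical. Only three grid parameter file keywords are emitted; a complete file needs further entries.

```python
def grid_box(pocket_coords, npts=(40, 40, 40), spacing=0.225):
    """Return (centre, edge_lengths_in_angstrom) for a docking grid around a pocket."""
    n = len(pocket_coords)
    if n == 0:
        raise ValueError("pocket_coords must not be empty")
    # Unweighted centroid of the pocket atoms (assumed definition of the centre).
    centre = tuple(sum(atom[i] for atom in pocket_coords) / n for i in range(3))
    # Edge length along each axis = number of intervals x spacing.
    edges = tuple(k * spacing for k in npts)
    return centre, edges


def gpf_lines(centre, npts=(40, 40, 40), spacing=0.225):
    """Format the grid-size, spacing and centre lines of a grid parameter file."""
    return [
        "npts {} {} {}".format(*npts),
        "spacing {}".format(spacing),
        "gridcenter {:.3f} {:.3f} {:.3f}".format(*centre),
    ]


# Hypothetical pocket atom coordinates in angstrom.
centre, edges = grid_box([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
lines = gpf_lines(centre)
```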
[0122] Step 3: Energetically feasible conformation of the representative substrate in the pocket: The energetically feasible conformation with appropriate distances to the catalytic residues is chosen as the best conformation. The enzyme-substrate complex of the chosen conformation was extracted and used for the next set of calculations and analyses, in which non-covalent interaction studies were conducted.
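The selection in Step 3 can be sketched as a filter followed by a minimum. The summary above defines the catalytic conformation by a distance of less than 3.5 Å between the reactive carboxyl carbon of the substrate and the Ser-OG atom; the sketch applies that cutoff and then takes the lowest-energy pose among those that pass. The pose dictionary layout, the key names and the example values are hypothetical and stand in for whatever the docking output parser provides.

```python
import math


def select_catalytic_pose(poses, ser_og, cutoff=3.5):
    """Return the lowest-energy pose whose reactive carbon lies within `cutoff`
    angstrom of the Ser-OG atom, or None if no pose satisfies the criterion.

    poses  : list of dicts with 'energy' (kcal/mol) and 'reactive_c' (x, y, z)
    ser_og : (x, y, z) coordinates of the catalytic serine OG atom
    """
    feasible = [p for p in poses if math.dist(p["reactive_c"], ser_og) < cutoff]
    if not feasible:
        return None
    return min(feasible, key=lambda p: p["energy"])


# Hypothetical docking results, for illustration only.
poses = [
    {"id": 1, "energy": -7.2, "reactive_c": (5.0, 0.0, 0.0)},  # too far from OG
    {"id": 2, "energy": -6.1, "reactive_c": (3.0, 0.0, 0.0)},
    {"id": 3, "energy": -5.0, "reactive_c": (2.0, 0.0, 0.0)},
]
best = select_catalytic_pose(poses, ser_og=(0.0, 0.0, 0.0))
```

Note that the pose with the best docking energy overall (id 1) is rejected because it fails the distance criterion, which is the point of combining the two conditions.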
[0123] Step 4: NCI Analysis to extract 2D NCI Index and 3D electronic map of interactions (Reference NCI Indices): The enzyme-substrate conformation is subjected to NCI analysis, where the coordinates of the substrate and of the residues within 5 Å of the substrate are extracted. The NCI analysis provides a 2D reduced density gradient map and a 3D electronic density map. These maps, together with the topological and algebraic properties of the pocket, are stored in the database as Reference NCI Indices.
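The coordinate extraction in Step 4 can be illustrated as follows. This is a minimal sketch, assuming that a residue is selected when any of its atoms lies within 5 Å of any substrate atom and that whole residues are then written out; the tuple layouts, residue labels and function names are hypothetical. The output follows the common XYZ convention of an atom count line, a comment line, and one "element x y z" line per atom.

```python
import math


def residues_near(substrate, protein, cutoff=5.0):
    """Return all atoms of every residue having at least one atom within
    `cutoff` angstrom of any substrate atom.

    substrate : list of (element, x, y, z)
    protein   : list of (residue_id, element, x, y, z)
    """
    near = {
        res
        for res, _, *xyz in protein
        if any(math.dist(xyz, s[1:]) <= cutoff for s in substrate)
    }
    return [atom for atom in protein if atom[0] in near]


def to_xyz(substrate, pocket_atoms, comment="substrate plus pocket residues"):
    """Serialize substrate and pocket atoms as XYZ-format text."""
    lines = [str(len(substrate) + len(pocket_atoms)), comment]
    lines += [f"{el} {x:.3f} {y:.3f} {z:.3f}" for el, x, y, z in substrate]
    lines += [f"{el} {x:.3f} {y:.3f} {z:.3f}" for _, el, x, y, z in pocket_atoms]
    return "\n".join(lines) + "\n"


# Hypothetical coordinates, for illustration only.
substrate = [("C", 0.0, 0.0, 0.0)]
protein = [
    ("SER105", "O", 3.0, 0.0, 0.0),  # within 5 angstrom, selects the residue
    ("SER105", "C", 6.0, 0.0, 0.0),  # same residue, kept as part of it
    ("GLY10", "N", 9.0, 0.0, 0.0),   # no atom within 5 angstrom, dropped
]
pocket = residues_near(substrate, protein)
xyz_text = to_xyz(substrate, pocket)
```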
[0124] Step 1a: Selection of a starting point: A suitable starting point enables the fitting of suitable substrates in the active site. A suitable starting point was used for the development of non-natural and new functions while retaining its natural function. For example, a transaminase was chosen as a starting point for developing the hetero-active site to perform a hydrolysis reaction without disturbing the amination reaction. An enzyme with only a natural active site can function only in its native reaction of interest. Introducing an additional hetero-active site on an enzyme can yield a multi-site functional enzyme. Similarly, proteins are known to perform their own function inside a cell without performing any biochemical reactions. Introducing an active site makes such a protein multifunctional and able to perform biochemical reactions.
[0125] Step 2a: Pocket Identification and Docking Studies: Proteins and enzymes are known to have numerous pockets and dents. Larger clefts or pockets that do not possess any function were identified and filtered based on the radius of the pocket. The Caver tool is used to identify the pockets and dents in a given protein or enzyme. The radius of gyration of the substrate was calculated, which gives the average space required for its rotation. Pockets with a larger radius are selected for docking studies. The filtered pockets were subjected to docking studies with the reference substrate used in Step 2 (Figure 4).
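The pocket filter in Step 2a can be sketched with a radius of gyration calculation. The example below uses the unweighted (geometric) definition, the root-mean-square distance of the atoms from their centroid, and keeps pockets whose radius exceeds that value. Whether the original calculation was mass-weighted, and the exact comparison rule, are not stated in the specification, so both are assumptions here; the pocket names and radii are hypothetical.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration: RMS distance of atoms from their centroid."""
    n = len(coords)
    if n == 0:
        raise ValueError("coords must not be empty")
    centroid = [sum(p[i] for p in coords) / n for i in range(3)]
    return math.sqrt(sum(math.dist(p, centroid) ** 2 for p in coords) / n)


def filter_pockets(pocket_radii, substrate_rg):
    """Keep pockets whose radius is larger than the substrate radius of gyration."""
    return [name for name, radius in pocket_radii.items() if radius > substrate_rg]


# Hypothetical two-atom substrate and pocket radii in angstrom.
rg = radius_of_gyration([(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
kept = filter_pockets({"pocket_1": 0.8, "pocket_2": 2.5}, rg)
```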
[0126] Step 3a: The docking studies give rise to varying conformations of the substrate situated in the different filtered pockets. The docked conformation of the substrate varies with the geometry of the local region and the presence of amino acids that provide interactions and anchoring points. The interacting residues around the substrate define the lowest-energy conformation in the area. The docking studies generated 1000 different conformations of the reference substrate across the different filtered pockets. A conformation in which the substrate is bound in the pocket with better interactions and lower energy was selected for the next steps, where it is subjected to NCI Analysis.
[0127] Step 4a: The coordinates of the chosen reference substrate-bound complex were subjected to NCI Analysis to generate a 2D reduced density gradient map and a 3D electronic density map. The coordinates of the reference substrate molecule and the surrounding amino acids within 5 Å of the reference substrate were extracted and converted into XYZ format for NCI analysis. The resulting 2D reduced density gradient map and 3D electronic map, representing hydrogen bonds, van der Waals interactions, electrostatic interactions, etc., were extracted together with the topological and algebraic properties of the pocket and stored in the database as Query NCI Indices.
[0128] Step 5: Storing the data: The data for the reference enzyme-substrate complex (Reference NCI Indices) and the query protein-substrate or enzyme-substrate complex (Query NCI Indices) are stored in a database. The stored data are analysed using different ML-based algorithms, which are explained in the steps below.
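For readers unfamiliar with the quantity behind the 2D map, the reduced density gradient used in the NCIPLOT approach (Contreras-García et al., 2011, listed under Other publications) is s = |∇ρ| / (2 (3π²)^(1/3) ρ^(4/3)), where ρ is the electron density. The sketch below evaluates s at a single point from a density value and its gradient vector. It is an illustration of the formula only, assuming densities and gradients in atomic units supplied by an external electronic structure or promolecular calculation; it is not a description of the software used in this work.

```python
import math

# Prefactor 2 * (3 * pi^2)^(1/3) from the reduced density gradient definition.
C_S = 2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0)


def reduced_density_gradient(rho, grad):
    """Reduced density gradient s at one point.

    rho  : electron density at the point (must be positive)
    grad : (d rho/dx, d rho/dy, d rho/dz) at the same point
    """
    if rho <= 0.0:
        raise ValueError("density must be positive")
    return math.hypot(*grad) / (C_S * rho ** (4.0 / 3.0))


# A stationary point of the density (zero gradient) gives s = 0,
# which is where non-covalent interaction troughs appear in the 2D map.
s_flat = reduced_density_gradient(0.05, (0.0, 0.0, 0.0))
s_unit = reduced_density_gradient(1.0, (1.0, 0.0, 0.0))
```

Plotting s against the density over all grid points of the pocket gives the kind of 2D reduced density gradient map referred to in Steps 4 and 4a.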
[0129] Step 6: 2D-CNN-based image matching algorithm for the comparison of the Query NCI Indices and Reference NCI Indices: The images and maps generated from NCI Analysis are processed by a 2D-CNN-based image-matching protocol, enabling the comparison of all the Query NCI Indices against the Reference NCI Indices. The objective of the image comparison is to identify the queries that closely match the natural pocket. All the 2D reduced density gradient maps are converted into NCI indices for comparison (Figure 5).
[0130] Step 7: Pairwise match matrix: A matrix is produced between the Query NCI Indices and the Reference NCI Indices after comparing the 2D reduced density gradient maps. This process categorizes the data into three distinct match types: high, medium, and low. The pairwise matrix contains numerical values indicating a high match, medium match or low match (Figure 6).
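The categorization in Step 7 can be sketched as a thresholding of pairwise similarity scores. The specification does not give the cut-off values separating high, medium and low matches, so the thresholds below (0.8 and 0.5 on a 0 to 1 scale) are purely hypothetical placeholders, as are the pocket and reference names and the scores themselves.

```python
def classify(score, high=0.8, medium=0.5):
    """Map a similarity score in [0, 1] to a match category (assumed thresholds)."""
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"


def match_matrix(scores, high=0.8, medium=0.5):
    """Convert {query: {reference: score}} into {query: {reference: category}}."""
    return {
        query: {ref: classify(s, high, medium) for ref, s in row.items()}
        for query, row in scores.items()
    }


def high_match_pairs(scores, high=0.8):
    """List (query, reference) pairs that fall in the high-match category."""
    return [
        (query, ref)
        for query, row in scores.items()
        for ref, s in row.items()
        if s >= high
    ]


# Hypothetical similarity scores from the 2D map comparison.
scores = {
    "pocket_1": {"Reference-1": 0.91, "Reference-2": 0.42},
    "pocket_2": {"Reference-1": 0.63, "Reference-2": 0.12},
}
matrix = match_matrix(scores)
selected = high_match_pairs(scores)
```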
[0131] Step 8: 3D-CNN-based image matching algorithm for the comparison of the Query NCI Indices and Reference NCI Indices: The 3D electronic maps hold key information about the reference substrate in the filtered pockets. The 3D electronic map shows all types of non-covalent interactions. For the selected high matches, these interactions were compared by superimposing the 3D electronic maps of the Reference NCI Indices and the Query NCI Indices using 3D CNN. A pocket which shows at least 50% density matching, as derived using 3D CNN, is further used for defining the catalytic residues (Figure 7).
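To make the 50% criterion in Step 8 concrete, the sketch below computes a plain voxel overlap between two density maps sampled on the same grid after superposition. This simple overlap fraction is used here in place of the 3D CNN, whose architecture is not described in the specification; the isovalue, the dictionary representation of the maps and the function names are assumptions for illustration.

```python
def density_match_fraction(reference, query, iso=0.0):
    """Fraction of reference voxels above `iso` that are also above `iso` in the query.

    reference, query : dicts mapping voxel index (i, j, k) to a density value,
                       both defined on the same superimposed grid.
    """
    ref_voxels = {v for v, d in reference.items() if d > iso}
    if not ref_voxels:
        return 0.0
    query_voxels = {v for v, d in query.items() if d > iso}
    return len(ref_voxels & query_voxels) / len(ref_voxels)


def passes_density_criterion(reference, query, minimum=0.5, iso=0.0):
    """Apply the 'at least 50% density matching' rule to a pair of maps."""
    return density_match_fraction(reference, query, iso) >= minimum


# Hypothetical maps: the query reproduces two of the four reference voxels.
ref_map = {(0, 0, 0): 0.02, (0, 0, 1): 0.03, (0, 1, 0): 0.01, (1, 0, 0): 0.04}
qry_map = {(0, 0, 0): 0.02, (0, 0, 1): 0.01, (2, 2, 2): 0.05}
fraction = density_match_fraction(ref_map, qry_map)
accepted = passes_density_criterion(ref_map, qry_map)
```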
[0132] Step 9: Feature-based cross-correlation and ResNet for comparing the features and incorporating the catalytic residues: Only the “high match” protein-ligand or enzyme-substrate complexes were considered for developing the active site after comparing the 3D electronic maps. The catalytic residues were replaced as suggested by the feature-based cross-correlation method. Additionally, the high match conformation of the pocket was re-examined by the ResNet protocol to optimize the pocket for incorporating catalytic residues. The stability and reaction mechanism of the high match pocket were further analysed by conducting QM studies (Figure 8), which were also used to understand whether the developed technology had compared similar interactions beyond the active site, for the pillion residues. The interaction of the substrate with the active site and the pillion residues was similar to that in the Reference-1 pocket (Figure 9).
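One common way to compare feature vectors, shown here only as an assumed stand-in for the feature-based cross-correlation named in Step 9, is the zero-mean normalized cross-correlation at zero lag. It returns 1 for features that vary together, -1 for features that vary oppositely, and values near 0 for unrelated features. The feature values below are hypothetical.

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equal-length feature vectors."""
    if len(a) != len(b) or not a:
        raise ValueError("feature vectors must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # at least one vector is constant, so correlation is undefined
    return sum(x * y for x, y in zip(da, db)) / denom


# Hypothetical per-pocket features (for example, interaction counts by type).
reference_features = [1.0, 2.0, 3.0]
query_features = [2.0, 4.0, 6.0]
ncc = normalized_cross_correlation(reference_features, query_features)
```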
[0133] Step 10: Synthesis, expression, purification and assay for testing the activity of the developed hetero-active site enzyme: The engineered transaminase enzyme was expressed in E. coli BL21(DE3)/pET-28a, which was inoculated in 200 mL of LB medium containing 50 µg·mL–1 kanamycin and grown at 37 °C for three hours, or until the optical density at 600 nm (OD600) reached about 0.6. Enzyme expression was induced with 200 µg·mL–1 IPTG for 20 hours at 25 °C and 150 rpm. The cells were harvested by centrifugation for 10 minutes at 8000 rpm and 4 °C. They were then resuspended in binding buffer (20 mmol·L–1 imidazole, 50 mmol·L–1 PBS, pH 8.0) and homogenized under high pressure.
[0134] The engineered enzyme was tested against different sets of substrates for the different activities. For lipase activity, the substrates (5-methoxy-3,5-dioxopentyl)(methyl)phosphinic acid, methyl 3-oxohexanoate, methyl (4R)-4-cyclopropyl-3-oxopentanoate, methyl (5R)-5-methyl-3-oxooctanoate, methyl 3-oxo-3-phenylpropanoate, methyl 3-(4-methylphenyl)-3-oxopropanoate, methyl 3-(3-methylphenyl)-3-oxopropanoate, methyl 3-oxo-4-phenylbutanoate, methyl 3-(4-methoxyphenyl)-3-oxopropanoate, methyl 3-(3,4-dimethoxyphenyl)-3-oxopropanoate, methyl 3-(4-chlorophenyl)-3-oxopropanoate, methyl 3-(4-fluorophenyl)-3-oxopropanoate, methyl 3-oxo-5-(2,4,5-trifluorophenyl)pentanoate, and methyl 3-oxo-5-phenylpentanoate were used and tested in the lab. For transaminase activity, the following substrates were used: 5-[hydroxy(methyl)phosphoryl]-3-oxopentanoic acid, 3-oxohexanoic acid, (4R)-4-cyclopropyl-3-oxopentanoic acid, (5R)-5-methyl-3-oxooctanoic acid, 3-oxo-3-phenylpropanoic acid, 3-(4-methylphenyl)-3-oxopropanoic acid, 3-(3-methylphenyl)-3-oxopropanoic acid, 3-oxo-4-phenylbutanoic acid, 3-(4-methoxyphenyl)-3-oxopropanoic acid, 3-(3,4-dimethoxyphenyl)-3-oxopropanoic acid, 3-(4-chlorophenyl)-3-oxopropanoic acid, 3-(4-fluorophenyl)-3-oxopropanoic acid, 3-oxo-5-(2,4,5-trifluorophenyl)pentanoic acid, and 3-oxo-5-phenylpentanoic acid. These substrates were tested using the engineered transaminase enzyme, with the lipase reaction performed first, followed by the transaminase reaction. The reaction was conducted in PBS buffer with the pH maintained between 7.2 and 8.0, using 0.05 to 5 mg/mL of enzyme powder and 3-10 mM substrate in an equal volume or 20% (v/v) of DMSO. For the transaminase reaction, 10 mmol·L–1 1-(R)-phenylethylamine and 0.1 mmol·L–1 PLP were added. The reaction mixture was agitated at 100-1000 rpm and 30-45 °C.
Examples
[0135] Example 1:
[0136] Gene Cloning and Expression of engineered transaminase with lipase active site: establishing transaminase activity
[0137] Engineered transaminases were expressed in E. coli BL21(DE3)/pET-28a, which was inoculated in 200 mL of LB medium containing 50 µg·mL–1 kanamycin and grown at 37 °C for three hours, or until the optical density at 600 nm (OD600) reached about 0.6. Enzyme expression was induced with 200 µg·mL–1 IPTG for 20 hours at 25 °C and 150 rpm. The cells were harvested by centrifugation for 10 minutes at 8000 rpm and 4 °C. They were then resuspended in binding buffer (20 mmol·L–1 imidazole, 50 mmol·L–1 PBS, pH 8.0) and homogenized under high pressure. The cleared lysate was obtained by centrifugation for 10 minutes at 8000 rpm and 4 °C.
[0138] Enzyme Activity of Engineered transaminase for transamination activity with Acetophenone substrate
[0139] A 500 µL volume containing 50 mmol·L–1 PBS (pH 8.0), 0.05–0.50 g·L–1 purified engineered transaminases, 10 mmol·L–1 1-acetonaphthone, 10 mmol·L–1 1-(R)-phenylethylamine, 20% (v/v) DMSO, and 0.1 mmol·L–1 PLP was used for the reaction. The reaction mixture was agitated for thirty minutes at 500 rpm and 30 °C. An aliquot of the mixture (100 µL) was diluted with 50% acetonitrile and subjected to HPLC analysis.
[0140] Enzyme Activity of Engineered transaminase for lipase activity for hydrolysis of p-nitrophenol for benchmarking lipase activity.
[0141] A 1 mg/mL solution of enzyme powder was prepared in 50 mM phosphate buffer, pH 7.4. A 3 mM p-nitrophenol stock solution was prepared in dimethyl sulfoxide, and an equal volume of sodium phosphate buffer (0.1 M, pH 6.8) was added. Reaction mixtures containing 500 µL of buffered substrate solution and 500 µL of enzyme solution were incubated at 40 °C for 30 min. After incubation, 250 µL of 0.1 M sodium carbonate was added to stop the reaction. The substrate-to-product conversion can then be measured by HPLC or any other preferred method.
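The working concentrations implied by this protocol follow from simple dilution arithmetic, sketched below under the assumption that volumes are additive. The 3 mM stock mixed with an equal volume of buffer gives 1.5 mM; combining 500 µL of this with 500 µL of enzyme solution gives 0.75 mM substrate and 0.5 mg/mL enzyme in the 1 mL reaction; and adding 250 µL of 0.1 M sodium carbonate gives 0.02 M carbonate in the quenched 1.25 mL mixture. The helper name is hypothetical.

```python
def dilute(conc, vol_in, vol_total):
    """Concentration after `vol_in` at `conc` is made up to `vol_total`
    (both volumes in the same unit; additive volumes assumed)."""
    if vol_total <= 0:
        raise ValueError("total volume must be positive")
    return conc * vol_in / vol_total


# Stock mixed 1:1 with phosphate buffer (mM).
substrate_buffered = dilute(3.0, 1.0, 2.0)
# 500 uL buffered substrate + 500 uL enzyme solution (mM).
substrate_reaction = dilute(substrate_buffered, 500.0, 1000.0)
# Enzyme from the 1 mg/mL solution in the same 1 mL reaction (mg/mL).
enzyme_reaction = dilute(1.0, 500.0, 1000.0)
# 250 uL of 0.1 M sodium carbonate added to the 1 mL reaction (M).
carbonate_final = dilute(0.1, 250.0, 1250.0)
```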
[0142] Site-directed mutagenesis of Transaminase
[0143] Site-directed mutagenesis of the transaminase gene was conducted using whole-plasmid two-step PCR, with a primer containing the required codons for the mutations (sense strand) and an antisense-strand primer, based on the expression vector pET-28a(+). The primers used for site-directed mutagenesis were synthesized by Agilent, and the mutations were introduced using QuikChange mutagenesis kits. Luria-Bertani (LB) broth containing ampicillin (100 µg/mL) or kanamycin (50 µg/mL) was used for growth of the recombinant bacteria.
[0144] Advantages/applications
[0145] With the proliferation of enzyme usage in drug production, the ability to equip a single enzyme with multiple functionalities can revolutionize the pharmaceutical industry. The invention allows for multi-tasking enzymes, reducing the need for multiple enzymes in industrial processes and potentially leading to cost and time efficiencies. The multi-active site enzyme can be used wherever two consecutive reactions need to be carried out. The engineered hetero-active site enzyme reduces cost, experimental set-up and human resources, and removes the need for multiple reaction conditions.
[0146] Other publications:
[0147] Tjørnelund et al., 2023 Candida antarctica lipase B performance in organic solvent at varying water activities studied by molecular dynamics simulations. Computational and Structural Biotechnology Journal, 21, 5451–5462. https://doi.org/10.1016/j.csbj.2023.10.049
[0148] US8293507B2; Christopher Savile, Transaminase biocatalysts, 23 Oct. 2012
[0149] Yabukarski, Filip, et al., 2020 “Assessment of enzyme active site positioning and tests of catalytic mechanisms through x-ray–derived conformational ensembles.” Proceedings of the National Academy of Sciences, vol. 117, no. 52, 21 Dec. 2020, pp. 33204–33215, https://doi.org/10.1073/pnas.2011350117
[0150] Del Sol, Antonio, et al., 2006 “Residue centrality, functionally important residues, and active site shape: Analysis of enzyme and non-enzyme families.” Protein Science, vol. 15, no. 9, Sept. 2006, pp. 2120–2128, https://doi.org/10.1110/ps.062249106
[0151] Judge, Allison, et al., 2024 “Network of epistatic interactions in an enzyme active site revealed by large-scale deep mutational scanning.” Proceedings of the National Academy of Sciences, vol. 121, no. 12, 14 Mar. 2024, https://doi.org/10.1073/pnas.2313513121
[0152] Khobragade, Taresh P., et al., 2021 “Synthesis of sitagliptin intermediate by a multi-enzymatic cascade system using lipase and transaminase with benzylamine as an amino donor.” Frontiers in Bioengineering and Biotechnology, vol. 9, 6 Oct. 2021, https://doi.org/10.3389/fbioe.2021.757062
[0153] Naveen et al., 2024 “Computational studies on the catalytic potential of the double active site for enzyme engineering.” Scientific Reports, vol. 14, no. 1, 2 Aug. 2024, https://doi.org/10.1038/s41598-024-60824-x
[0154] Santiago, Gerard, et al. 2018 “Rational engineering of multiple active sites in an ester hydrolase.” Biochemistry, vol. 57, no. 15, 30 Mar. 2018, pp. 2245–2255, https://doi.org/10.1021/acs.biochem.8b00274
[0155] Contreras-García, Julia, et al., 2011 “NCIPLOT: A program for plotting noncovalent interaction regions.” Journal of Chemical Theory and Computation, vol. 7, no. 3, 25 Jan. 2011, pp. 625–632, https://doi.org/10.1021/ct100641a
[0156] Adhav et al., 2023 “The realm of unconventional noncovalent interactions in proteins: Their significance in structure and function.” ACS Omega, vol. 8, no. 25, 13 June 2023, pp. 22268–22284, https://doi.org/10.1021/acsomega.3c00205
[0157] Karshikoff, Andrey, et al., 2020. Non-Covalent Interactions in Proteins, 2nd Edition, World Scientific, 9 Sept. 2020, https://doi.org/10.1142/12035
[0158] Bakheit, Ahmed H., et al., 2022. “Thermodynamic and computational (DFT) study of non-covalent interaction mechanisms of charge transfer complex of linagliptin with 2,3-dichloro-5,6-dicyano-1,4-benzoquinone (DDQ) and chloranilic acid (CHA).” Molecules, vol. 27, no. 19, 25 Sept. 2022, p. 6320, https://doi.org/10.3390/molecules27196320
[0159] Gálvez-Iriqui, Alma Carolina, et al., 2020 “Lysozymes: Characteristics, mechanism of action and technological applications on the control of pathogenic microorganisms.” Revista Mexicana de Fitopatología, Mexican Journal of Phytopathology, vol. 38, no. 3, 11 Aug. 2020, https://doi.org/10.18781/r.mex.fit.2005-6
[0160] Polydoridis, Savvas, et al., 2007 “Recognition of ribonuclease a by 3'–5'-pyrophosphate-linked dinucleotide inhibitors: A molecular dynamics/continuum electrostatics analysis.” Biophysical Journal, vol. 92, no. 5, Mar. 2007, pp. 1659–1672, https://doi.org/10.1529/biophysj.106.093419
[0161] Fotheringham et al., 2000 “Engineering biosynthetic pathways: New routes to chiral amino acids.” Current Opinion in Chemical Biology, vol. 4, no. 1, 1 Feb. 2000, pp. 120–124, https://doi.org/10.1016/s1367-5931(99)00062-9
[0162] Nishimasu, Hiroshi, et al. 2007 “Crystal Structures of an ATP-dependent hexokinase with broad substrate specificity from the hyperthermophilic archaeon Sulfolobus tokodaii.” Journal of Biological Chemistry, vol. 282, no. 13, Mar. 2007, pp. 9923–9931, https://doi.org/10.1074/jbc.m610678200
[0163] Castillo, Edmundo, et al., 2016 “Medium-engineering: A useful tool for modulating lipase activity and selectivity.” Biocatalysis, vol. 1, no. 1, 7 Jan. 2016, https://doi.org/10.1515/boca-2015-0013
[0164] Guidi, Benedetta, et al., 2018 “Strategic single point mutation yields a solvent- and salt-stable transaminase from virgibacillus sp. in soluble form.” Scientific Reports, vol. 8, no. 1, 6 Nov. 2018, https://doi.org/10.1038/s41598-018-34434-3
CLAIMS
We Claim:
1. A method for developing a hetero-lipase active site on a given protein or enzyme, the method comprising the steps of:
a) Collecting diverse sequences of reported lipases with activity and modelling 3D structures of the same.
b) Docking of reference substrates with the modelled enzyme structures and identification of catalytic conformation to perform NCI Analysis.
c) Conducting NCI Analysis by extracting the coordinates of the substrate and the amino acids present within 5.0Å of the substrate conformation identified as the catalytic conformation (as described in the previous step) using XYZ format of the same.
d) Using the coordinates, plotting a 2D reduced density gradient map and a 3D electronic density map to derive NCI Indices, calculating algebraic and topological properties, and storing these in a database.
e) Identifying pockets on a protein of interest to build a hetero-lipase active site and fit the substrate of interest followed by docking studies with the reference substrate resulting in the identification of pockets with energetically feasible substrate conformation.
f) Conducting NCI Analysis by extracting the coordinates of the reference substrate and the amino acids present within 5.0Å of the substrate conformation in the pocket.
g) Performing NCI Analysis on the identified pockets and the amino acids present within 5Å around the substrate by extracting the coordinates and converting them into XYZ format.
h) Obtaining the output in the form of a 2D reduced density gradient map, a 3D electronic density map and algebraic and topological properties of the pocket and storing them as Query pocket NCI indices in the database.
i) Comparing 2D reduced density gradient map of query pocket NCI indices against the reference active site NCI indices using 2DCNN and classifying the data into high match, medium match, and low match pockets based on the similarity percentages and representing the same in the form of a matrix to filter out the high match pockets.
j) Comparing 3D electronic maps of the high match pockets against the reference active site NCI indices using 3DCNN and comparing features of the same using the Feature based cross-correlation technique followed by incorporating appropriate catalytic residues in the high match pocket using the ResNet algorithm to derive a desired hetero-active site for a desired function on the enzyme of interest
2. Engineering of enzyme transaminase of SEQ ID 1 to SEQ ID 20 using the method of claim 1, synthesis and expression of the same using E. coli BL21(DE3)/pET-28a to produce transaminase enzyme powder followed by enzyme assay to check dual-activity of lipase activity followed by transaminase activity using different substrates such as (5-methoxy-3,5-dioxopentyl)(methyl)phosphinic acid, methyl 3-oxohexanoate, methyl (4R)-4-cyclopropyl-3-oxopentanoate, methyl (5R)-5-methyl-3-oxooctanoate, methyl 3-oxo-3-phenylpropanoate, methyl 3-(4-methylphenyl)-3-oxopropanoate, methyl 3-(3-methylphenyl)-3-oxopropanoate, methyl 3-oxo-4-phenylbutanoate, methyl 3-(4-methoxyphenyl)-3-oxopropanoate, methyl 3-(3,4-dimethoxyphenyl)-3-oxopropanoate, methyl 3-(4-chlorophenyl)-3-oxopropanoate, methyl 3-(4-fluorophenyl)-3-oxopropanoate, methyl 3-oxo-5-(2,4,5-trifluorophenyl)pentanoate, and methyl 3-oxo-5-phenylpentanoate for lipase activity to produce 5-[hydroxy(methyl)phosphoryl]-3-oxopentanoic acid, 3-oxohexanoic acid, (4R)-4-cyclopropyl-3-oxopentanoic acid, (5R)-5-methyl-3-oxooctanoic acid, 3-oxo-3-phenylpropanoic acid, 3-(4-methylphenyl)-3-oxopropanoic acid, 3-(3-methylphenyl)-3-oxopropanoic acid, 3-oxo-4-phenylbutanoic acid, 3-(4-methoxyphenyl)-3-oxopropanoic acid, 3-(3,4-dimethoxyphenyl)-3-oxopropanoic acid, 3-(4-chlorophenyl)-3-oxopropanoic acid, 3-(4-fluorophenyl)-3-oxopropanoic acid, 3-oxo-5-(2,4,5-trifluorophenyl)pentanoic acid, and 3-oxo-5-phenylpentanoic acid, respectively. The product of lipase was taken as substrate by transaminase pocket.
3. Engineering of enzyme Ketoreductases of SEQ ID 21 to SEQ ID 33 using the method of claim 1, synthesis and expression of the same using E. coli BL21(DE3)/pET-28a to produce Ketoreductases enzyme powder followed by enzyme assay to check dual-activity of lipase and followed by Ketoreductases activity using different substrates such as ethyl 2-oxo-4-phenylbutanoate (ethyl benzylpyruvate) and tert-butyl (5S)-6-chloro-5-hydroxy-3-oxohexanoate. The product of the hetero-lipase active site was taken as substrate by the ketoreductase natural active site.
| # | Name | Date |
|---|---|---|
| 1 | 202341065417-STATEMENT OF UNDERTAKING (FORM 3) [29-09-2023(online)].pdf | 2023-09-29 |
| 2 | 202341065417-PROVISIONAL SPECIFICATION [29-09-2023(online)].pdf | 2023-09-29 |
| 3 | 202341065417-POWER OF AUTHORITY [29-09-2023(online)].pdf | 2023-09-29 |
| 4 | 202341065417-FORM FOR SMALL ENTITY(FORM-28) [29-09-2023(online)].pdf | 2023-09-29 |
| 5 | 202341065417-FORM 1 [29-09-2023(online)].pdf | 2023-09-29 |
| 6 | 202341065417-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [29-09-2023(online)].pdf | 2023-09-29 |
| 7 | 202341065417-DECLARATION OF INVENTORSHIP (FORM 5) [29-09-2023(online)].pdf | 2023-09-29 |
| 8 | 202341065417-Proof of Right [09-07-2024(online)].pdf | 2024-07-09 |
| 9 | 202341065417-STARTUP [27-09-2024(online)].pdf | 2024-09-27 |
| 10 | 202341065417-Sequence Listing in PDF [27-09-2024(online)].pdf | 2024-09-27 |
| 11 | 202341065417-FORM28 [27-09-2024(online)].pdf | 2024-09-27 |
| 12 | 202341065417-FORM-9 [27-09-2024(online)].pdf | 2024-09-27 |
| 13 | 202341065417-FORM 18A [27-09-2024(online)].pdf | 2024-09-27 |
| 14 | 202341065417-DRAWING [27-09-2024(online)].pdf | 2024-09-27 |
| 15 | 202341065417-COMPLETE SPECIFICATION [27-09-2024(online)].pdf | 2024-09-27 |