Abstract: AN ARTIFICIAL INTELLIGENCE METHOD TO PREDICT SELECTIVITY OF TRANSAMINASES AND ENANTIOMERIC EXCESS OF TRANSAMINASE-KETONE REACTIONS. The present invention describes identifying whether a given transaminase is (R)-selective or (S)-selective. The determination is based on the binding of the substrate in the active site of the transaminase: primarily on the torsion angle between specific atoms of the cofactor PLP and the catalytic lysine, and on the ten interactions of the substrate with the cofactor PLP and the catalytic lysine. Angle ranges specific to the two types of transaminase are defined in this embodiment on a 0° to +180° and 0° to -180° scale, or on a 0° to 360° scale. The embodiment also describes a methodology for categorizing the bound substrate conformers as (R), (S) and non-catalytic conformers, and a formula to calculate the percentage enantiomeric excess of the product. Reference Figure: Figure 1
DESCRIPTION
FIELD
[001] The disclosed subject matter relates to the field of Biology, Biocatalysis, Life Science, Computational Biology and Chemistry.
DISCUSSION OF RELATED FIELD
[002] Commercial use of Transaminases has increased to manufacture several drugs with an intermediary transamination reaction.
[003] Sustainable syntheses of chiral amines are highly sought after as 70% of all pharmaceuticals are derivatives of chiral amines.
[004] Transaminases are pyridoxal-5'-phosphate-(PLP)-dependent enzymes that catalyse the reversible transfer of an amine group from an amino donor to prochiral keto acids, ketones or aldehydes.
[005] Transaminases are highly enantioselective towards their natural substrates, making them crucial in biocatalysis for industrial applications.
[006] They are either (S)-selective, preferring the (S)-conformer of the substrate and giving the (S)-enantiomer of the product, or (R)-selective, preferring the (R)-conformer of the substrate and giving the (R)-enantiomer of the product.
[007] The limited substrate scope of transaminase is due to its active site, which is situated in the middle of a large binding pocket (LBP) and a small binding pocket (SBP).
[008] The keto acid or ketone substrates bind in the active site, placing their bulky and small substituent groups in the LBP and the SBP in a specific manner that depends on whether the enzyme is (S)- or (R)-selective, to form the (S)- or (R)-enantiomer product, respectively.
[009] This limits the use of transaminases for large scale industrial applications where either an (S) or an (R) enantiomeric excess is required from a reaction.
SUMMARY
[0010] The invention described in this embodiment entails a methodology to screen any given pyridoxal phosphate (PLP)-dependent enzyme as an (S)-selective or (R)-selective enzyme. This is followed by a process that involves non-covalent modelling or non-covalent docking of ketones in the active site of the enzyme-PLP complex, where PLP is quantum chemically transformed to pyridoxamine phosphate (PMP) and the catalytic lysine (Lys) has a neutral NH2; this is referred to here as the Enzyme-PMP complex. A further process predicts the selectivity of the enzyme in binding the keto acid or ketone substrate in the active site, with the bulky and small substituent groups placed in the LBP and the SBP, to give the (S)- and/or (R)-enantiomer product, and calculates its enantiomeric excess using machine learning and an artificial convolutional neural network.
BRIEF DESCRIPTION OF DRAWINGS
[0011] Figure 1. Flow chart Algorithm showing stepwise process to determine protein stereoselectivity
[0012] Figure 2. Shows the hydrogen bond interaction of modelled PMP and Lys with Substrate and active site residues. PLP and Lys are shown in blue, active site residues are shown in grey, and the substrate is shown in pink. The dashed line represents hydrogen bonding.
[0014] Figure 3a. With PLP as the central molecule, the figure shows the orientation of catalytic Lysine in the Crystal structure of Omega transaminase of Chromobacterium violaceum (PDB ID: 4A6T) with the angle of -95.118° (-180° to +180° scale) between selected atoms of Lys and PLP, and in the Crystal structure of the first (R)-stereoselective transaminase from Arthrobacter sp. KNK168 (PDB ID: 3WWH) with the angle of +130.26° (-180° to +180° scale) between selected atoms of Lys and PLP. b) Atoms of PLP and catalytic Lysine that were chosen for calculating the angle are shown in purple van der Waals representation.
[0016] Figure 4a. Wheel to identify the selective nature of Transaminase. The wheel shows angles from 0º to +180º and from 0º to -180º, with a schematic representation of substrate in the center of the wheel in two possible binding conformations. PLP is shown as a green stick. When the calculated angle between the PLP and catalytic Lysine falls between +30º and +160º, the protein is (R)-selective, and when the same falls between -30º and -160º, the protein is (S)-selective. Tyr in the active site that helps form (R) conformation of the substrate is shown in a purple stick. In most (S)-selective transaminases, the calculated torsion angle falls within the brown dashed lines; similarly, in most (R)-selective transaminases, the calculated torsion angle falls within the purple dashed lines.
[0017] Figure 4b. Wheel to identify the selective nature of Transaminase as shown in Figure 4a, the scale being 0° to +360°
[0018] Figure 5. Flow chart algorithm showing the stepwise process of Prediction of R and S conformation of Substrate and Enantiomeric excess percentage calculation
[0019] Figure 6.A-F. Shows the different binding modes of the substrate in the active site of (S)-selective transaminase defining the (R)-conformations, (S)-conformations and non-catalytic conformations (A) Large moiety (LM) of the substrate (acetophenone), large binding pocket (LBP) of enzyme and small moiety (SM) of the substrate (acetophenone), small binding pocket (SBP). (B) (S)-selective enzyme showing (S)-conformation of the substrate with LM of substrate binding in the LBP of the enzyme, SM of substrate binding in the SBP of the enzyme and ketone projected towards catalytic lysine. (C) (S)-selective enzyme showing (R)-conformation of the substrate with LM of substrate binding in the SBP of the enzyme, SM of substrate binding in the LBP of the enzyme and ketone projected towards catalytic lysine. (D) (S)-selective enzyme showing non-catalytic binding mode with LM of substrate binding in the LBP of the enzyme, SM of substrate projected towards catalytic lysine and ketone binding in SBP of the enzyme. (E) (S)-selective enzyme showing non-catalytic binding mode with LM of substrate binding in the SBP of the enzyme, SM of substrate projected towards catalytic lysine and ketone binding in LBP of the enzyme. (F) (S)-selective enzyme showing non-catalytic binding mode with LM of substrate binding in the LBP of the enzyme, SM of substrate binding in the SBP of the enzyme, but ketone projected away from catalytic lysine.
[0020] Figure 6.G – K Shows the different binding modes of the substrate in the active site of (R)-selective transaminase, defining the (R)-conformations, (S)-conformations and non-catalytic conformations. (G) (R)-selective enzyme showing (R)-conformation with LM of substrate binding in the LBP of the enzyme, SM of substrate binding in the SBP of the enzyme and ketone projected towards catalytic lysine. (H) (R)-selective enzyme showing (S)-conformation with LM of substrate binding in the SBP of the enzyme, SM of substrate binding in the LBP of the enzyme and ketone projected towards catalytic lysine. (I) (R)-selective enzyme showing noncatalytic binding mode with LM of substrate binding in the LBP of the enzyme, SM of substrate projected towards catalytic lysine and ketone binding in SBP of the enzyme. (J) (R)-selective enzyme showing noncatalytic binding mode with LM of substrate binding in the SBP of the enzyme, SM of substrate projected towards catalytic lysine and ketone binding in LBP of the enzyme. (K) (R)-selective enzyme showing non-catalytic binding mode with LM of substrate binding in the LBP of the enzyme, SM of substrate binding in the SBP of the enzyme, but ketone projected away from catalytic lysine.
[0021] Figure 7. The distance constraints and variables used in Equation 1 to count the R/S selectivity of the substrate conformations, as explained in Table 4. (A) depicts the binding mode of acetophenone in (S)-specific Transaminase to give (S)-phenylethylamine as the product (B) depicts the binding mode of acetophenone in (S)-specific Transaminase to give (R)-phenylethylamine as the product (C) depicts the binding mode of acetophenone in (R)-specific Transaminase to give (R)-phenylethylamine as the product (D) depicts the binding mode of acetophenone in (R)-specific Transaminase to give (S)-phenylethylamine as the product.
[0022] Figure 8. A stereo diagram showing the residues of Transaminase that make the Small pocket (Brown) and the Large pocket (Red). (A) shows Transaminase with PDB Id 4A6T with the substrate bound to catalytic Lys288 and PLP (B) shows Transaminase with PDB Id 3WWH with substrate bound to catalytic Lys188 and PLP
[0023] Figure 9. The sphere representing grid enclosing the active site with a radius of 6Å used for Ensemble docking; Substrate (Acetophenone) is shown in pink stick, PLP is shown in light sea green stick
DETAILED DESCRIPTION OF THE DRAWINGS
[0024] Figure 1 shows a process or an algorithm that uses a GPU script executed on a computer system with multiple processors to identify whether a given 3D structure of a transaminase enzyme complexed with PLP is (S)-selective or (R)-selective.
[0025] Figure 1: Flow chart Algorithm showing stepwise process to determine protein stereoselectivity
[0026] Figure 1 includes steps 3, 4 and 5, wherein pyridoxamine phosphate (PMP) is modelled in the crystal structure or the modelled structure of the transaminase, as shown in Figure 2. The Lys closest to PMP in the active site is designated the catalytic Lys, and the angle between PMP and the catalytic Lys is calculated using a Python script: the atoms C2, C5 and C4A of PMP are selected as plane one and the CA (alpha carbon) of the catalytic Lysine as plane two, and the angle between plane one and plane two is calculated as shown in Figure 3a and 3b.
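The angle calculation described above can be sketched as a standard four-point torsion (dihedral) angle. This is a minimal illustration only, not the Python script referred to in the text: the function name `torsion_deg`, the atom ordering (C2, C5, C4A, then Lys CA) and the sign convention are assumptions, and the sign of the result depends on that ordering.

```python
import math

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def torsion_deg(p0, p1, p2, p3):
    """Signed torsion angle in degrees (-180 to +180) for four 3D points.

    Assumed usage: p0, p1, p2 are three cofactor atoms and p3 is the
    alpha carbon of the catalytic lysine (ordering is hypothetical).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(c / norm for c in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond b1.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

In practice the four coordinates would be read from the PDB file of the enzyme-cofactor complex; the coordinates in any quick check are placeholders, not taken from 4A6T or 3WWH.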
[0027] Figure 2. Shows the hydrogen bond interaction of modelled PMP and Lys with Substrate and active site residues. PMP and Lys are shown in blue, active site residues are shown in grey, and Substrate is shown in pink. The dashed line represents hydrogen bonding.
[0028] Figure 3. a) With PLP as the central molecule, the figure shows the orientation of catalytic Lysine in the Crystal structure of Omega transaminase of Chromobacterium violaceum (PDB ID: 4A6T) with the angle of -95.118° (-180° to +180° scale) between selected atoms of Lys and PLP, and in the Crystal structure of the first (R)-stereoselective transaminase from Arthrobacter sp. KNK168 (PDB ID: 3WWH) with the angle of +130.26° (-180° to +180° scale) between selected atoms of Lys and PLP. b) Atoms of PLP and catalytic Lysine that were chosen for calculating the angle are shown in purple van der Waals representation.
[0029] The given transaminase is determined to be an (R)-selective transaminase if the angle calculated between plane one and plane two, as explained in step 5 of Figure 1, falls between +30º and +160º, as shown in Figure 4a and 4b. The given transaminase is determined to be an (S)-selective transaminase if the same angle falls between -30º and -160º, or between 210º and 330º on the 0º to 360º scale, as shown in Figure 4a and Figure 4b.
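The angle windows above can be expressed as a small classifier. This is a hedged sketch: the function names are hypothetical, the window bounds are treated as inclusive (the text does not say), and readings on the 0° to 360° scale are first mapped onto the signed scale by subtracting 360° where needed, which is an assumption about how the two scales relate.

```python
def to_signed(angle_deg):
    """Map any angle onto the -180 to +180 degree scale."""
    return (angle_deg + 180.0) % 360.0 - 180.0

def classify_selectivity(angle_deg):
    """Return 'R', 'S' or 'undetermined' from the PLP-Lys torsion angle."""
    a = to_signed(angle_deg)
    if 30.0 <= a <= 160.0:
        return "R"
    if -160.0 <= a <= -30.0:
        return "S"
    return "undetermined"  # outside both windows

# Angles quoted in the text for 4A6T and 3WWH.
print(classify_selectivity(-95.118), classify_selectivity(130.26))
```

Angles outside both windows are reported as undetermined here rather than forced into a class, since the text defines no outcome for them.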
[0030] Figure 4a. Wheel to identify the selective nature of Transaminase. The wheel shows angles from 0º to +180º and from 0º to -180º, with a schematic representation of substrate in the center of the wheel in two possible binding conformations. PLP is shown as a green stick. When the calculated angle between the PLP and catalytic Lys falls between +30º and +160º, the protein is (R)-selective, and when the same falls between -30º and -160º, the protein is (S)-selective. Tyr in the active site that helps form (R) conformation of the substrate is shown in a purple stick. In most (S)-selective transaminases, the calculated torsion angle falls within the brown dashed lines; similarly, in most (R)-selective transaminases, the calculated torsion angle falls within the purple dashed lines.
[0031] Figure 4b. Wheel to identify the selective nature of Transaminase as shown in Figure 4a, the scale being 0° to +360°
[0032] Figure. 5 shows a process or algorithm flow chart which uses a GPU script executed on a computer system with multiple processors to predict the enantiomeric excess of a product that will result from a transaminase enzyme-substrate complex.
[0033] Figure 5: Flow chart algorithm showing the stepwise process of Prediction of (R) and (S)-conformation of Substrate and Enantiomeric excess percentage (ee%) calculation; SSC - Selectivity of substrate conformation, CNN – Convolutional neural network, AI – Artificial Intelligence, ML - Machine Learning.
[0034] Step 8 of Figure 5 includes performing geometric modelling, ensemble docking, Molecular Dynamics (MD) simulations and Quantum Mechanics/Molecular Mechanics (QM/MM) studies or calculations on the transaminase enzyme-substrate complex, wherein the transaminase may be (R)-selective or (S)-selective.
[0035] In step 9 of Figure 5, the conformations obtained from the results of steps 9 and 10 of Figure 5 are screened and categorized, based on the binding mode of the substrate in the active site of the transaminase, as catalytic conformations (explained with Figure 6B, 6C, 6G and 6H) or as non-catalytic conformations (explained with Figure 6D, 6E, 6F, 6I, 6J and 6K). The non-catalytic conformations are purged using a Python script. Each catalytic conformation is then predicted to hold the (R)-conformer or the (S)-conformer of the substrate based on the binding mode of the substrate in the active-site pocket (as explained here), and the numbers of (R)-conformers and (S)-conformers are counted. If the number of (R)-conformers of the substrate is high, the percentage of the (R)-enantiomer of the product is high, and vice versa.
[0036] For any given substrate conformation, the selectivity is determined using Equation 1; the variables are shown in Figure 7 and explained in Table 4. This equation is applied to the catalytic conformations, in which the c and p distances shown in Figure 7C and 7D are less than 4 Å; conformations with c and p distances greater than 4 Å are considered non-catalytic conformations.
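The distance screen described above can be illustrated as follows. This is a minimal sketch under stated assumptions: each conformation is represented as a dictionary with hypothetical keys "c" and "p" holding the two distances in Å, and a distance of exactly 4 Å is treated as non-catalytic because the text only specifies "less than" and "greater than".

```python
def is_catalytic(c, p, cutoff=4.0):
    """True when both the c and p distances (in angstroms) are below the cutoff."""
    return c < cutoff and p < cutoff

def purge_non_catalytic(conformations, cutoff=4.0):
    """Keep only conformations whose 'c' and 'p' distances pass the cutoff."""
    return [conf for conf in conformations
            if is_catalytic(conf["c"], conf["p"], cutoff)]

# Placeholder distances, not taken from any simulation in the text.
poses = [
    {"id": 1, "c": 3.4, "p": 3.6},
    {"id": 2, "c": 5.1, "p": 3.3},
]
kept = purge_non_catalytic(poses)
```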
[0037] Selectivity of substrate conformation (SSC) = log(S(1/a + 1/b)) - log(S(1/d + 1/e)) ---------- Equation (1)
The total numbers of (R)-conformers (SSCR) and (S)-conformers (SSCS) are substituted in Equation 2 to determine the enantiomeric excess percentage of the product obtained from the transaminase-substrate reaction, as mentioned in Figure 5.
ee % = ( (SSCR – SSCS) / (SSCR + SSCS) ) x 100 ---------- Equation (2)
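Equations 1 and 2 can be sketched numerically as below. Several points are assumptions rather than statements from the text: the symbol S in Equation 1 is taken to denote a summation that reduces to the single bracketed term for one conformation, the logarithm is taken as base 10 (the sign of SSC does not depend on the base), and the function names are hypothetical. A positive `ee_percent` here means an excess of (R)-conformers and a negative value an excess of (S)-conformers.

```python
import math

def ssc(a, b, d, e):
    """Selectivity of a substrate conformation from the four pocket distances.

    Positive when the large moiety sits nearer the large pocket and the
    small moiety nearer the small pocket (small a and b), negative when
    the moieties are swapped (small d and e).
    """
    return math.log10(1.0 / a + 1.0 / b) - math.log10(1.0 / d + 1.0 / e)

def ee_percent(n_r, n_s):
    """Signed enantiomeric excess (%) from counts of (R)- and (S)-conformers."""
    total = n_r + n_s
    if total == 0:
        raise ValueError("no catalytic conformations to count")
    return (n_r - n_s) / total * 100.0

# Counts reported for 4A6T with SUB1 in Table 3: 57 (S) and 6 (R).
example_ee = ee_percent(6, 57)
```

With 6 (R)-conformers and 57 (S)-conformers the magnitude is (57 - 6) / 63 × 100, about 80.95, in favour of (S).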
[0038] Figure 6. A-F. Shows the different binding modes of the substrate in the active site of (S)-selective transaminase defining the (R)-conformations, (S)-conformations and non-catalytic conformations (A) Large moiety (LM) of the substrate (acetophenone) in large binding pocket (LBP) of enzyme and small moiety (SM) of the substrate (acetophenone) in small binding pocket (SBP). (B) (S)-selective enzyme showing (S)-conformation of the substrate with LM of substrate binding in the LBP of the enzyme, SM of substrate binding in the SBP of the enzyme and ketone projected towards catalytic lysine. (C) (S)-selective enzyme showing (R)-conformation of the substrate with LM of substrate binding in the SBP of the enzyme, SM of substrate binding in the LBP of the enzyme and ketone projected towards catalytic lysine. (D) (S)-selective enzyme showing non-catalytic binding mode with LM of substrate binding in the LBP of the enzyme, SM of substrate projected towards catalytic lysine and ketone binding in SBP of the enzyme. (E) (S)-selective enzyme showing non-catalytic binding mode with LM of substrate binding in the SBP of the enzyme, SM of substrate projected towards catalytic lysine and ketone binding in LBP of the enzyme. (F) (S)-selective enzyme showing non-catalytic binding mode with LM of substrate binding in the LBP of the enzyme, SM of substrate binding in the SBP of the enzyme, but ketone projected away from catalytic lysine.
[0039] Figure 6. G – K Shows the different binding modes of the substrate in the active site of (R)-selective transaminase, defining the (R)-conformations, (S)-conformations and non-catalytic conformations. (G) (R)-selective enzyme showing (R)-conformation with LM of substrate binding in the LBP of the enzyme, SM of substrate binding in the SBP of the enzyme and ketone projected towards catalytic lysine. (H) (R)-selective enzyme showing (S)-conformation with LM of substrate binding in the SBP of the enzyme, SM of substrate binding in the LBP of the enzyme and ketone projected towards catalytic lysine. (I) (R)-selective enzyme showing noncatalytic binding mode with LM of substrate binding in the LBP of the enzyme, SM of substrate projected towards catalytic lysine and ketone binding in SBP of the enzyme. (J) (R)-selective enzyme showing noncatalytic binding mode with LM of substrate binding in the SBP of the enzyme, SM of substrate projected towards catalytic lysine and ketone binding in LBP of the enzyme. (K) (R)-selective enzyme showing non-catalytic binding mode with LM of substrate binding in the LBP of the enzyme, SM of substrate binding in the SBP of the enzyme, but ketone projected away from catalytic lysine.
[0040] Figure 7. The distance constraints and variables used in Equation 1 to count the R/S selectivity of the substrate conformations, as explained in Table 4. (A) depicts the binding mode of acetophenone in (S)-specific Transaminase to give (S)-phenylethylamine as the product, wherein 'a' is the distance between the CoM of the large binding pocket (C1) and the CoM of the large moiety (C2) and 'b' is the distance between the CoM of the small binding pocket (C3) and the CoM of the small moiety (C4). (B) depicts the binding mode of acetophenone in (S)-specific Transaminase to give (R)-phenylethylamine as the product (C) depicts the binding mode of acetophenone in (R)-specific Transaminase to give (R)-phenylethylamine as the product (D) depicts the binding mode of acetophenone in (R)-specific Transaminase to give (S)-phenylethylamine as the product, wherein 'd' is the distance between the CoM of the large binding pocket (C1) and the CoM of the small moiety (C4) and 'e' is the distance between the CoM of the small binding pocket (C3) and the CoM of the large moiety (C2).
GRID-BASED METHOD
[0041] It is natural for (S)-selective transaminases to give the (S)-amine product. Here the enzyme transfers the amine from the PMP cofactor to the Re-face of a prochiral ketone. If the larger substituent has priority over the smaller substituent in the enzyme's active site, i.e., if the larger substituent binds in the large binding pocket and the smaller substituent binds in the small binding pocket, the enzyme gives an (S)-amine product. This depends directly on the structure of the substrate: with a different substrate structure, the enzyme can give more priority to the smaller substituent, so that the smaller substituent binds in the large binding pocket and the larger substituent binds in the small binding pocket, and the enzyme yields an (R)-enantiomer product. An (S)-transaminase can also be engineered to perform an (R)-conversion; in this case, again, the smaller substituent binds in the large binding pocket and the larger substituent binds in the small binding pocket.
[0042] Similarly, it is natural for (R)-selective transaminases to give the (R)-amine product: when the enzyme is (R)-selective, the larger substituent binds in the large binding pocket and the smaller substituent binds in the small binding pocket, giving an (R)-amine product. However, an (R)-selective transaminase can give the (S)-enantiomer product if the enzyme binds the substrate in such a way that the priority towards the substituents is altered, i.e., the enzyme gives more priority to the smaller substituent when it is bound in the large binding pocket and to the larger substituent when it is bound in the small binding pocket.
[0043] In addition to this, an (R)-selective Transaminase can be engineered to give an (S)-enantiomer product by altering the priority as mentioned above.
[0044] The active site of (S)- and (R)-specific transaminase proteins contains two binding pockets, a small binding pocket and a large binding pocket, around the cofactor pyridoxal-5'-phosphate (PLP) for binding the substrates, as shown in Figure 8A and 8B. The amino acid residues that constitute these binding pockets and allow substrate accommodation in different orientations are also shown in Figure 8A and 8B.
[0045] Figure 8: A stereo diagram showing the residues of Transaminase that make the Small pocket (Brown) and the Large pocket (Red). (A) shows Transaminase with PDB Id 4A6T with the substrate bound to catalytic Lys288 and PLP (B) shows Transaminase with PDB Id 3WWH with substrate bound to catalytic Lys188 and PLP
[0046] The flow chart for the identification of the stereoselectivity of the protein is shown in Figure 1. The algorithm is validated with approximately 1462 proteins retrieved from RCSB PDB (Step 1). Of the 1462 proteins, those which did not have the appropriate structure of PLP were eliminated from the validation studies (Step 2). The algorithm first identifies the C4A atom of PLP in the active site in any one protein chain and captures the catalytic Lysine situated close to the PLP (Step 3/Block 3).
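Step 3 can be illustrated with a nearest-neighbour search. This is a sketch under assumptions: the text only says the catalytic lysine is the one situated close to PLP, so measuring from the C4A atom to each lysine side-chain nitrogen (NZ) is an illustrative choice, and the function name and input layout are hypothetical.

```python
import math

def nearest_lysine(c4a_xyz, lysine_atoms):
    """Return the (residue_number, xyz) pair closest to the PLP C4A atom.

    lysine_atoms: iterable of (residue_number, (x, y, z)) tuples, one per
    lysine in the chain (assumed here to be the NZ atom coordinates).
    """
    return min(lysine_atoms, key=lambda item: math.dist(c4a_xyz, item[1]))

# Placeholder coordinates for illustration only.
candidates = [(100, (10.0, 0.0, 0.0)), (288, (1.5, 0.0, 0.0)), (301, (0.0, 7.0, 2.0))]
catalytic = nearest_lysine((0.0, 0.0, 0.0), candidates)
```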
[0047] Next, the angle between PLP and the catalytic lysine is calculated using the Kcat Torsion Analysis, wherein the C5, C2 and C4A atoms of PLP and the CA of the catalytic Lysine are considered (Steps 4, 5 and 6). The calculated torsion angle determines the stereoselectivity of the protein (Step 7). A protein with an angle from +30º to +160º as per Figure 4a is considered (R)-selective (Step 7a), and a protein with an angle from -30º to -160º, or +210º to +330º, as per Figure 4a or Figure 4b is considered (S)-selective (Step 7b). This can be understood clearly with Figure 3a, which shows the torsion angle of -95.118° calculated by the algorithm for the Crystal structure of Omega transaminase from Chromobacterium violaceum (PDB ID: 4A6T) in complex with PLP, which is defined as an (S)-selective protein, and an angle of +130.26° for the Crystal structure of transaminase from Arthrobacter sp. KNK168 (PDB ID: 3WWH), which is defined as an (R)-selective protein. Figure 3b shows the atoms of PLP and catalytic Lysine used for calculating the angle.
[0048] The flow chart for the prediction of enantiomeric excess of an enzyme-substrate complex is shown in Figure 5. The protein from Step 7a or 7b is subjected to geometric modelling of the PLP by adding an amino group (NH2) at the C4A atom, giving what is considered PMP or modelled PLP, and by converting the NH3 of the catalytic lysine to NH2, to obtain the critical interaction depicted in Figure 2. The (R)- or (S)-selective protein is docked with the given substrate, for instance acetophenone, in the active site using 7D grid technology. The docking employed is called Ensemble docking. The grid in the active site is defined by a sphere centered on the N1 atom of PLP, enclosing about a 6 Å radius of the active site, as depicted in Figure 9. Random poses in the docking are generated using an AI-based algorithm for any enzyme complex.
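The spherical docking region can be pictured with a simple lattice of points inside a 6 Å sphere. This sketch does not reproduce the 7D grid technology or the ensemble docking named in the text; it only shows the geometric constraint. The function name and the 1 Å spacing are hypothetical.

```python
import itertools

def sphere_grid(center, radius=6.0, spacing=1.0):
    """Cubic lattice points lying within `radius` of `center` (angstroms)."""
    n = int(radius // spacing)
    offsets = [i * spacing for i in range(-n, n + 1)]
    return [
        (center[0] + dx, center[1] + dy, center[2] + dz)
        for dx, dy, dz in itertools.product(offsets, repeat=3)
        if dx * dx + dy * dy + dz * dz <= radius * radius
    ]

# The center would be the N1 atom of the cofactor; placeholder origin used here.
grid = sphere_grid((0.0, 0.0, 0.0))
```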
[0049] Figure 9. The sphere representing grid enclosing the active site with a radius of 6Å used for Ensemble docking
[0050] The substrate and active site are optimised, and partial charges are derived using quantum mechanical calculations. Further, the degrees of freedom of the substrate are captured along the simulation time. These conformations are filtered using an algorithm that defines distance and angle criteria, based on which (S), (R) and non-catalytic conformations are identified and counted, and ee% is calculated based on Equation 2. Conformations of the substrate in which the keto group is posed opposite to the catalytic Lysine, shown in Figure 6D and 6E for an (S)-selective protein and in Figure 6I and 6J for an (R)-selective protein, are geometrically not feasible in a static structure. The QM/MM or quantum chemical optimization of these conformations shows a higher transition state (activation energy), which is unfavorable for the reaction to proceed, so these conformations are termed non-catalytic conformations. Therefore, these conformations are geometrically restricted and are not counted when calculating ee%. Thus, the non-catalytic conformations are purged, and only catalytic conformations are taken for further analysis (Step 9).
[0051] Catalytic conformations obtained from Step 10 of Figure 5 are further filtered to get appropriate conformations based on factors such as a threshold distance of 3.2 Å to 4 Å between the carbonyl carbon of the substrate and the amino nitrogen of PLP, the positioning of the keto group as facing towards or opposite to the catalytic lysine, and the binding of the small moiety (SM) and large moiety (LM) in the small binding pocket (SBP) and the large binding pocket (LBP), as shown and explained with Figure 6A to 6K. They are categorized into (R)-conformations and (S)-conformations of the substrate in the active site using two methods, 10a and 10b, as explained above.
[0052] Figure 6A-6K represents the different binding modes of the substrate in the active site used to predict the (R)-conformations, (S)-conformations and non-catalytic conformations.
[0053] Figure.6B indicates the (S)-selective protein showing S specific conformation with the keto group facing catalytic lysine in the active site with LM in LBP and SM in SBP, Figure. 6C indicates (S)-selective protein showing R specific conformation with keto group facing catalytic lysine in the active site with LM in SBP and SM in LBP. (R)-selective protein showing R specific conformation with keto group facing catalytic lysine in the active site with LM in LBP and SM in SBP is shown in Figure 6G and (R)-selective protein showing S specific conformation with keto group facing catalytic lysine in the active site with LM in SBP and SM in LBP is shown in Figure. 6H.
[0054] Figure 6D shows an (S)-selective protein in a non-catalytic binding mode with the LM of the substrate binding in the LBP of the enzyme, the SM of the substrate projected towards the catalytic lysine and the ketone binding in the SBP of the enzyme. Figure 6E shows an (S)-selective protein in a non-catalytic binding mode with the LM of the substrate binding in the SBP of the enzyme, the SM of the substrate projected towards the catalytic lysine and the ketone binding in the LBP of the enzyme. Figure 6F shows an (S)-selective protein in a non-catalytic binding mode with the keto group facing opposite to the catalytic lysine in the active site, with LM in LBP and SM in SBP. Figure 6I shows an (R)-selective protein in a non-catalytic binding mode with the LM of the substrate binding in the LBP of the enzyme, the SM of the substrate projected towards the catalytic lysine and the ketone binding in the SBP of the enzyme. Figure 6J shows an (R)-selective protein in a non-catalytic binding mode with the LM of the substrate binding in the SBP of the enzyme, the SM of the substrate projected towards the catalytic lysine and the ketone binding in the LBP of the enzyme. Figure 6K shows an (R)-selective enzyme in a non-catalytic binding mode with the LM of the substrate binding in the LBP of the enzyme and the SM of the substrate binding in the SBP of the enzyme, but with the ketone projected away from the catalytic lysine.
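The binding modes of Figure 6B to 6K can be summarised as a small decision rule. This is an illustrative sketch, not the filtering script of the embodiment: the function name, the string labels and the idea of passing the pocket occupied by each moiety as an argument are all assumptions.

```python
def label_conformation(enzyme_selectivity, lm_pocket, sm_pocket, ketone_faces_lys):
    """Label a bound conformation as 'R', 'S' or 'non-catalytic'.

    enzyme_selectivity: 'R' or 'S' (from the torsion-angle screen).
    lm_pocket, sm_pocket: where the large and small moieties sit, expected
    to be 'LBP' or 'SBP'; any other value (for example a moiety pointing
    at the lysine) is treated as non-catalytic.
    ketone_faces_lys: True when the keto group projects towards the lysine.
    """
    if not ketone_faces_lys or {lm_pocket, sm_pocket} != {"LBP", "SBP"}:
        return "non-catalytic"  # modes D, E, F, I, J, K
    if lm_pocket == "LBP":
        return enzyme_selectivity  # modes B and G: matched orientation
    return "R" if enzyme_selectivity == "S" else "S"  # modes C and H: swapped

print(label_conformation("S", "SBP", "LBP", True))
```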
[0055] Figure 4a and Figure 4b show another way to identify (R)- and (S)-conformations of the substrate. Conformations for which the angle calculated as explained in Step 5 falls between the purple dotted lines are considered (R)-conformations, and those for which it falls between the brown dotted lines are considered (S)-conformations.
[0056] The numbers of (R)- and (S)-conformations are counted and substituted in Equation 2 to calculate the enantiomeric excess of a transaminase-substrate reaction.
[0057] The process explained in this embodiment is executed with selected transaminases, the results of which are produced here as an example. Table 1 depicts the selectivity, as (R)- or (S)-selective transaminase, determined based on torsion angles calculated between specified atoms using an algorithm for selected transaminases. Table 3 shows the percentage of enantiomeric excess calculated for the chosen transaminases as per the formula given in the process, after finding the number of (R)- and (S)-conformations of substrates.
Protein ID | Torsion angle with specified atoms (°) | Nature of Selectivity
4A6T | -95 | S
5KQU | -79.5415 | S
6GWI | -103.222 | S
3WWH | 130.26 | R
6Q1Q | 107.623 | R
4CMD | 142.449 | R
6XWB | 122.489 | R
[0058] Table. 1. Selective nature of transaminase determined based on Torsion angles calculated between specified atoms using the algorithm described in Figure 1 for selected transaminases
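Substrate ID | Substrate | Substrate ID | Substrate
SUB1 | [structure not reproduced] | SUB6 | [structure not reproduced]
SUB2 | [structure not reproduced] | SUB7 | [structure not reproduced]
SUB3 | [structure not reproduced] | SUB8 | [structure not reproduced]
SUB4 | [structure not reproduced] | SUB9 | [structure not reproduced]
SUB5 | [structure not reproduced] | SUB10 | [structure not reproduced]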
[0059] Table 2: Substrates used to verify algorithm and for training the data set.
Protein ID | Protein stereospecificity | Substrate | (S)-conformations count | (R)-conformations count | Non-catalytic conformations | Predicted ee %
4A6T | S | SUB1 | 57 | 6 | 37 | 80.95% (S)
4A6T | S | SUB2 | 45 | 5 | 50 | 88.88% (S)
4A6T | S | SUB3 | 49 | 7 | 44 | 75% (S)
4A6T | S | SUB4 | 65 | 5 | 30 | 85.71% (S)
4A6T | S | SUB5 | 61 | 10 | 29 | 71.83% (S)
4A6T | S | SUB6 | 65 | 12 | 23 | 68.8% (S)
4A6T | S | SUB7 | 50 | 5 | 45 | 81.81% (S)
4A6T | S | SUB8 | 62 | 5 | 33 | 85.0% (S)
5KQU | S | SUB1 | 20 | 3 | 27 | 73.91% (S)
5KQU | S | SUB2 | 30 | 7 | 63 | 62.16% (S)
5KQU | S | SUB3 | 40 | 9 | 51 | 63.26% (S)
5KQU | S | SUB4 | 39 | 0 | 61 | >99% (S)
5KQU | S | SUB5 | 35 | 0 | 65 | >99% (S)
5KQU | S | SUB6 | 28 | 8 | 64 | 55.55% (S)
5KQU | S | SUB7 | 50 | 5 | 45 | 81.81% (S)
5KQU | S | SUB8 | 28 | 6 | 66 | 64.70% (S)
6GWI | S | SUB1 | 44 | 0 | 56 | >99% (S)
6GWI | S | SUB2 | 41 | 11 | 48 | 57.69% (S)
6GWI | S | SUB3 | 53 | 4 | 47 | 85.96% (S)
6GWI | S | SUB4 | 30 | 0 | 70 | >99% (S)
6GWI | S | SUB5 | 21 | 5 | 74 | 61.53% (S)
6GWI | S | SUB6 | 43 | 8 | 49 | 68.62% (S)
6GWI | S | SUB7 | 62 | 5 | 33 | 85.07% (S)
6GWI | S | SUB8 | 73 | 0 | 27 | >99% (S)
3WWH | R | SUB1 | 10 | 58 | 32 | 70.58% (R)
3WWH | R | SUB2 | 0 | 17 | 83 | >99% (R)
3WWH | R | SUB3 | 4 | 18 | 78 | 63.63% (R)
3WWH | R | SUB4 | 0 | 6 | 94 | >99% (R)
3WWH | R | SUB6 | 5 | 29 | 66 | 70.58% (R)
3WWH | R | SUB7 | 9 | 40 | 51 | 63.26% (R)
3WWH | R | SUB8 | 2 | 30 | 68 | 87.5% (R)
3WWH | R | SUB9 | 8 | 56 | 36 | 75% (R)
3WWJ | R | SUB10 | 0 | 15 | 85 | >99% (R)
6XWB | R | SUB1 | 0 | 61 | 39 | >99% (R)
6XWB | R | SUB2 | 14 | 52 | 34 | 57.57% (R)
6XWB | R | SUB3 | 11 | 68 | 21 | 72.15% (R)
6XWB | R | SUB4 | 4 | 55 | 41 | 86.44% (R)
6XWB | R | SUB6 | 7 | 26 | 67 | 57.57% (R)
6XWB | R | SUB7 | 4 | 40 | 56 | 81.81% (R)
6XWB | R | SUB8 | 2 | 26 | 72 | 85.71% (R)
4CMD | R | SUB1 | 0 | 29 | 71 | >99% (R)
4CMD | R | SUB2 | 3 | 49 | 48 | 88.46% (R)
4CMD | R | SUB3 | 11 | 47 | 42 | 62.06% (R)
4CMD | R | SUB4 | 9 | 37 | 54 | 60.86% (R)
4CMD | R | SUB6 | 8 | 29 | 63 | 56.75% (R)
4CMD | R | SUB7 | 4 | 38 | 58 | 80.95% (R)
4CMD | R | SUB8 | 0 | 57 | 42 | >99% (R)
[0060] Table 3. Percentage of enantiomeric excess calculated for selected Transaminases with substrates mentioned in Table 2 as per the algorithm described in Figure.5 and equation 2 after finding the number of (R) and (S) conformations of substrates using equation 1.
Variables | Explanation
C1 | Centre of mass (COM) of Large binding pocket (LBP)
C2 | Centre of mass (COM) of Large Moiety (LM)
C3 | Centre of mass (COM) of Small binding pocket (SBP)
C4 | Centre of mass (COM) of Small Moiety (SM)
CC | Carbonyl carbon of Ketone Substrate
O | Ketone of Substrate
LYS | Catalytic Lysine
NH2 | Amine of PMP
a | distance between C1 and C2
b | distance between C3 and C4
c | distance between CC and Amine (N) of LYS
d | distance between C1 and C4
e | distance between C3 and C2
p | distance between NH2 and CC
[0061] Table 4: The table contains different distance constraints derived from the three-dimensional orientation of ketone complexed in the Enzyme-PMP complex as depicted in Figure 7; COM – Centre of mass, LBP – Large binding pocket, SBP – Small binding pocket, LM – Large Moiety, SM – Small Moiety, CC – Carbonyl carbon.
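The variables of Table 4 can be computed from atomic coordinates as sketched below. This is a minimal illustration: the function names and the (mass, coordinates) input layout are hypothetical, and which atoms make up each pocket or moiety is not specified here.

```python
import math

def centre_of_mass(atoms):
    """Mass-weighted centre of a list of (mass, (x, y, z)) tuples."""
    total = sum(mass for mass, _ in atoms)
    return tuple(sum(mass * xyz[i] for mass, xyz in atoms) / total
                 for i in range(3))

def pocket_distances(c1, c2, c3, c4):
    """Distances a, b, d, e between the four centres of mass.

    c1: large binding pocket, c2: large moiety,
    c3: small binding pocket, c4: small moiety.
    """
    return {
        "a": math.dist(c1, c2),
        "b": math.dist(c3, c4),
        "d": math.dist(c1, c4),
        "e": math.dist(c3, c2),
    }

# Placeholder geometry: each moiety sits 1 angstrom from its matching pocket.
dists = pocket_distances((0, 0, 0), (1, 0, 0), (5, 0, 0), (4, 0, 0))
```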
Entry Substrate Structure ID Selectivity of TA Protein (a) Experimental ee% Product
1 SUB1 4A6T (S) TA-P1-A06 >99 (S)
2 SUB1 5KQU (S) TA-P1-A06 >99 (S)
3 SUB1 3WWH (R) ATA-025 >99 (R)
4 SUB1 6XWB (R) ATA-025 >99 (R)
5 SUB2 4A6T (S) PD, PF >99 (S)
6 SUB2 5KQU (S) PD, PF >99 (S)
7 SUB2 3WWH (R) ArR, AT, HN >99 (R)
8 SUB2 6XWB (R) ArR, AT, HN >99 (R)
9 SUB3 4A6T (S) PF 94 (S)
10 SUB3 5KQU (S) PF 94 (S)
11 SUB3 3WWH (R) AT >99 (R)
12 SUB4 4A6T (S) CV(wild type) >99 (S)
13 SUB4 5KQU (S) CV(wild type) >99 (S)
14 SUB4 4A6T (S) CV(PHE88ALA) >90 (S)
15 SUB5 4A6T (S) CV(wild type) >99 (S)
16 SUB5 4A6T (S) CV(PHE88ALA) >90 (R)
17 SUB6 4A6T (S) CV(wild type) >99 (S)
18 SUB6 5KQU (S) CV(wild type) >99 (S)
19 SUB6 6XWB (R) AT >99 (R)
20 SUB7 4A6T (S) CV(wild type) >99 (S)
21 SUB7 5KQU (S) CV(wild type) >99 (S)
22 SUB7 3WWH (R) ArR, AT, HN >99 (R)
23 SUB7 6XWB (R) ArR, AT, HN >99 (R)
24 SUB8 4A6T (S) PD >99 (S)
25 SUB8 5KQU (S) PD >99 (S)
26 SUB8 3WWH (R) ArR, AT, HN >99 (R)
27 SUB9 3WWH (R) AT >99 (R)
28 SUB10 3WWJ (R) AT >99 (R)
29 SUB1 6GWI (S) TA-P1-A06 >99 (S)
30 SUB2 6GWI (S) PD, PF >99 (S)
31 SUB3 6GWI (S) PF 94 (S)
32 SUB4 6GWI (S) CV(wild type) >99 (S)
33 SUB5 6GWI (S) CV(wild type) >99 (S)
34 SUB6 6GWI (S) CV(wild type) >99 (S)
35 SUB7 6GWI (S) CV(wild type) >99 (S)
36 SUB6 3WWH (R) AT >99 (R)
37 SUB1 4CMD (R) ATA-025 >99 (R)
38 SUB2 4CMD (R) ArR, AT, HN >99 (R)
39 SUB3 4CMD (R) AT >99 (R)
40 SUB4 4A6T (S) CV(PHE88ALA_ALA231PHE) >90 (S)
41 SUB5 4A6T (S) CV(PHE88ALA_ALA231PHE) >90 (R)
42 SUB6 4CMD (R) AT >99 (R)
43 SUB7 4CMD (R) ArR, AT, HN >99 (R)
44 SUB8 4CMD (R) ArR, AT, HN >99 (R)
45 SUB5 5KQU (S) CV(wild type) >99 (S)
46 SUB8 6XWB (R) ArR, AT, HN >99 (R)
47 SUB8 6GWI (S) PD >99 (S)
48 SUB3 6XWB (R) AT >99 (R)
[0062] Table 5: Training set (entries 1-28), test set (entries 29-36) and validation set (entries 37-48).
(a) Origin of ω-TAs: ArR, (R)-selective Arthrobacter sp.; AT, Aspergillus terreus; CV, Chromobacterium violaceum; HN, Hyphomonas neptunium; TA-P1-A06, ATA-025, ATA-033, ω-TAs from the Codex TA screening kit; VF, Vibrio fluvialis. *Reaction centre for conversion into chiral amines.
CLAIMS
We Claim,
1. In the field of biocatalysis, we claim a method to predict whether any pyridoxal phosphate-bound enzyme is an (S)-selective or (R)-selective transaminase, and a process that uses molecular modelling combined with machine learning and artificial intelligence to predict the enantiomeric excess of the product of a transaminase-ketone substrate reaction, as shown in Figure 1.
2. In claim 1, we claim the method of capturing the three-dimensional orientation of the catalytic lysine and pyridoxal phosphate in the 3D structure of a transaminase to predict whether the enzyme is an (S)-selective or (R)-selective transaminase, which applies to an experimental Enzyme-Pyridoxal phosphate complex or a theoretically modelled Enzyme-Pyridoxal phosphate complex.
3. In claim 1, we claim the prediction of the transaminase to be (R)-selective when the calculated torsion angle lies between +30° and +160°, or (S)-selective when the calculated torsion angle lies between -30° and -160°, wherein the torsion angle is calculated between atoms C5, C2 and C4A of PLP and the CA atom of the catalytic lysine, selected in that sequential order, in the Enzyme-Pyridoxal phosphate complex.
4. In claim 1, we claim the method of non-covalent modelling or non-covalent docking of ketones in the active site of the Enzyme-Pyridoxal phosphate complex, wherein pyridoxal phosphate (PLP) is quantum chemically transformed to pyridoxamine phosphate (PMP) and the catalytic lysine has a neutral NH2 group; this complex is hereafter referred to as the Enzyme-PMP complex in this embodiment.
5. We claim the categorization of the docked conformations of the ketones as (R) conformations or (S) conformations based on the three-dimensional orientation of the ketones in the active site of the (R)-specific transaminase and of the (S)-specific transaminase, which are expected to give (R) amine and (S) amine products respectively, as depicted in a two-dimensional image in Figure 7.
6. Different distance constraints are derived from the three-dimensional orientation of the ketone complexed in the Enzyme-PMP complex, as depicted in Figure 5 and shown in Table 4. We claim the process and the algorithm used to automate the identification of conformations of the substrate in the active site of the Enzyme-PMP complex that would yield an (R)-enantiomer product (R amine) or an (S)-enantiomer product using the constraints given in Table 4.
7. We claim a process and Equations 1 and 2 of this embodiment to identify and count the conformations of the substrate in the active site of the Enzyme-PMP complex, as shown in Figure 5, that would yield an (R)-enantiomer product or an (S)-enantiomer product, wherein Equation 1 is called the Selectivity of Substrate Conformation (SSC) equation and its result is substituted into a general ee% equation, as shown and explained below.
SSC = log(Σ(1/a + 1/b)) - log(Σ(1/d + 1/e)), wherein a, b, d and e are described in Table 4.
A positive SSC value derived from docking of a ketone in an (S)-selective transaminase is predicted (counted) as an (S)-enantiomer product, and a negative SSC value is counted as an (R)-enantiomer product.
A positive SSC value derived from docking of a ketone in an (R)-selective transaminase is predicted (counted) as an (R)-enantiomer product, and a negative SSC value is counted as an (S)-enantiomer product.
8. In claim 1, we claim identifying the (R)-specific or (S)-specific substrate conformation in the Enzyme-PMP complex using the grid-based method shown in Figure 5 and explained under the subheading “GRID BASED METHOD”, also described below, wherein the grid is already trained with (S)-specific conformations and (R)-specific conformations of ketones in the Enzyme-PMP complex.
The grid is made of grid points spaced at 1.0 Å resolution, which are defined as grid cell occupancy descriptors and are composed of electrostatic descriptors, van der Waals descriptors or quantum chemical descriptors; the grid covers up to a 10 Å radius from the ring of PMP in the Enzyme-PMP and substrate complex. The training set and the test set used for training the grid are given in Table 5, and the external validation set is also shown in Table 5. An energy density is calculated using the descriptors derived from the docked or modelled Enzyme-PMP and substrate complex. We have implemented machine learning and deep learning using a convolutional neural network (CNN) algorithm, in which the algorithm extracts the superimposable descriptors over the substrate Enzyme-PMP complex. The training and test sets are initially constructed as matrices by the CNN algorithm, which forms the model as part of machine learning. The CNN algorithm ingests and processes external validation datasets as tensors. Tensors are matrices of numbers that are compared with the training set models to identify the descriptors and predict ee%. This method is a quantitative structure-activity approach. As an artificial intelligence approach, the CNN algorithm further augments and rebuilds the existing training set matrix with external validation set data, which would help predict superimposable descriptors over the substrate Enzyme-PMP complex.
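The torsion-angle rule of claim 3 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation. The torsion is computed with the standard atan2 formula over four points taken in the order C5, C2, C4A (PLP) and CA (catalytic lysine). The sign convention of the tool used in the specification is not stated, so the one below is an assumption, as is treating the range limits as inclusive. The function names and test coordinates are hypothetical.

```python
import math
from typing import Sequence

Vec = Sequence[float]


def _sub(u: Vec, v: Vec):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u: Vec, v: Vec) -> float:
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u: Vec, v: Vec):
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def torsion_deg(p1: Vec, p2: Vec, p3: Vec, p4: Vec) -> float:
    """Signed torsion angle p1-p2-p3-p4 in degrees, in (-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n2 = _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(_cross(b1, b2), n2)
    return math.degrees(math.atan2(y, x))


def classify_selectivity(angle_deg: float) -> str:
    """Apply the angle ranges of claim 3 (limits assumed inclusive)."""
    if 30.0 <= angle_deg <= 160.0:
        return "R"
    if -160.0 <= angle_deg <= -30.0:
        return "S"
    return "undetermined"


# Hypothetical coordinates standing in for C5, C2, C4A and lysine CA.
angle = torsion_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
print(round(angle, 1), classify_selectivity(angle))
```

Angles outside both ranges are returned as "undetermined" because claim 3 does not assign them to either class.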