Method For Protein Sequencing By Advanced Data Dependent Acquisition


Abstract: The present invention discloses a robust, reliable, and highly reproducible method for protein sequencing by liquid chromatography-tandem mass spectrometry (LC-MS/MS). The method employs data-dependent acquisition (DDA) with fast acquisition rates and improved MS/MS sensitivity, and yields improved sequence coverage with one protease and complete sequence coverage with as few as two proteases in a single experimental iteration. The disclosed method teaches superior fragmentation and subsequent coverage of b and y ions. Further, the method also finds utility in identifying post-translational modifications in the peptide under analysis. The automated DDA peptide mapping workflow disclosed herein greatly reduces the time between data acquisition and decision-making, thereby accelerating the drug development process. The method can be employed in early-stage drug development for amino acid sequencing of biologic products as well as at various stages of clone development.


Patent Information

Application #

Filing Date

28 December 2023

Publication Number

27/2025

Publication Type

INA

Invention Field

BIO-CHEMISTRY

Status

Parent Application

Applicants

Dr. Reddy’s Laboratories Limited

8-2-337 Road No. 3, Banjara Hills Hyderabad Telangana India 500034

Inventors

1. Ravi Kumar Marikanti

House no. 3-136/7/7 Prashanthi hills, road no.14. Meerpet Hyderabad Telangana India 500097

2. Giridhar Sivalanka

Flat no: 505 Balaji Residency, Satavahana Nagar Eluru Andhra Pradesh India 534003

3. Sireesha Goswamy Kaligatla

1-22-14; ABC Apartments; F-10 Prakash Street; Sri ram nagar Kakinada, East Godavari Dist. Andhra Pradesh India 533001

4. Murali Jayaraman

Door No 7 Third Street, Nandivaram Guduvancheri Post Kancheepuram Dt Tamil Nadu India 603202

Specification

Description:
FIELD OF THE INVENTION
The present invention relates to protein sequencing using liquid chromatography coupled with tandem mass spectrometry.
BACKGROUND OF THE INVENTION
Protein-based therapeutics (“biologics”) are large, complex molecules that require high-end analytical techniques for characterization. Among these, liquid chromatography coupled with mass spectrometry (LC-MS) is a powerful and sensitive method. LC-MS can deliver a multitude of insights about the components of interest in a given solution (in terms of identity, quantity and quality) with a limited amount of material (e.g. 2-10 µg in the case of protein) and without the need for cumbersome sample processing steps.
While LC-MS is helpful in measuring the molecular weights of peptides and proteins, LC-MS/MS (tandem mass spectrometry) also gives insight into the primary amino acid sequence of peptides. This approach, commonly referred to as ‘bottom-up’ proteomics, comprises cleaving the protein of interest into smaller peptides with the aid of one or more proteases, followed by chromatographic separation of the peptides, MS-based mass determination, selection of the desired peptide fraction from the MS spectra and MS/MS-based sequence determination from the mass-to-charge ratio (m/z) of the ions generated. For this, peptides in the desired fraction (the “precursor ions”) are selected by applying certain criteria and subjected to collision-induced dissociation (CID) in the gas phase of the collision cell, followed by data acquisition of the fragment ions generated (‘b’ and ‘y’ ions). The amino acid sequence is deduced from the resultant MS/MS spectrum either by matching with a theoretical sequence (in the case of a known peptide) or by manual interpretation (in the case of a novel peptide sequence).
Selection of species for MS/MS analysis can be done in several ways. Data-dependent acquisition (DDA) is one such means, wherein a narrow selection window is applied to select the desired MS peak(s) (typically, the precursor ion peaks with the greatest signal intensity), followed by fragmentation and an MS/MS scan.
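The intensity-based selection described above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the function name `select_precursors`, the `top_n` value and the survey peak list are hypothetical and are not taken from the disclosed method.

```python
from typing import List, Tuple

def select_precursors(peaks: List[Tuple[float, float]],
                      top_n: int = 3,
                      min_intensity: float = 1000.0) -> List[float]:
    """Return the m/z values of the top_n most intense survey peaks.

    peaks is a list of (m/z, intensity) pairs from one MS survey scan.
    top_n and min_intensity are illustrative values only.
    """
    # Keep only peaks above the intensity threshold.
    eligible = [p for p in peaks if p[1] > min_intensity]
    # Most intense first, as in conventional DDA.
    eligible.sort(key=lambda p: p[1], reverse=True)
    return [mz for mz, _ in eligible[:top_n]]

# Hypothetical survey scan: (m/z, intensity in counts)
survey = [(450.2, 800.0), (612.8, 15000.0), (733.4, 4200.0),
          (901.5, 2500.0), (1020.6, 9800.0)]
selected = select_precursors(survey)
print(selected)
```

Each selected m/z would then be isolated with a narrow window and fragmented; that part is instrument-specific and is not modelled here.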
While the conventional DDA method is helpful in bottom-up proteomics, 100% amino acid sequence coverage with existing methods is rarely achieved. The method requires combining data from multiple protease digestions and several experimental iterations to achieve maximal sequence coverage. This in turn impacts the total sample requirement and overall time of analysis for a given sequence analysis.
Thus, there is a need to develop a data-dependent acquisition-based LC-MS/MS method with superior amino acid sequence coverage capability that employs fewer protease digestions and fewer experimental runs. Also, it is highly desirable in the analytical strategy for biologic products to have a platform method that can be used across several types of proteins, thereby accelerating the overall drug development process.
SUMMARY OF THE INVENTION
Accordingly, it is an object of the present invention to develop a robust, reliable, and highly reproducible method for protein sequencing by liquid chromatography-tandem mass spectrometry (LC-MS/MS). The method employs data-dependent acquisition (DDA) with fast acquisition rates and improved MS/MS sensitivity, and yields improved sequence coverage with one protease and complete sequence coverage with as few as two proteases in a single experimental iteration. The disclosed method teaches superior fragmentation and subsequent coverage of b and y ions. Further, the method also finds utility in identifying post-translational modifications in the peptide under analysis. The automated DDA peptide mapping workflow disclosed herein greatly reduces the time between data acquisition and decision-making, thereby accelerating the drug development process. The method can be employed in early-stage drug development for amino acid sequencing of biologic products as well as at various stages of clone development.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1: Illustrates % sequence coverage as a function of relative hydrophobicity of the peptide.
Figure 2: Illustrates % sequence coverage as a function of length of peptide analysed.
Figure 3: Illustrates sequence coverage of Mab3 light chain after MS/MS analysis of protein sample digested with trypsin and elastase enzymes
Figure 4: Illustrates sequence coverage of Mab3 heavy chain after MS/MS analysis of protein sample digested with trypsin, Asp-N and elastase enzymes
Figure 5: Shows comparative MS/MS spectra of a representative peptide sequence as sequenced using HD-DDA and the method of present invention
Figure 6: Shows comparative MS/MS spectra of a representative peptide sequence as sequenced using HD-DDA and the method of present invention
Figure 7: MS/MS spectrum of two representative runs of trypsin-digested peptides
Figure 8: MS/MS spectrum of a peptide in the presence and absence of site-specific modification
DETAILED DESCRIPTION OF THE INVENTION
The present invention discloses a mass spectrometry method for determination of amino acid sequence of a protein, wherein the method achieves 100% amino acid sequence coverage in one iteration.
In an embodiment, the present invention discloses a method for the determination of amino acid sequence of a protein, the method comprising:
(a) obtaining a composition comprising the protein;
(b) subjecting the composition to protease treatment to generate peptide fragments;
(c) separating the peptide fragments by liquid chromatography;
(d) ionizing the separated peptide fragments;
(e) detecting generated ions of step (d) by mass spectrometry;
(f) selecting the ions of step (e);
(g) fragmenting the selected ions of step (f) and
(h) detecting the fragmented ions of step (g) by tandem mass spectrometry;
wherein the selecting, fragmenting and detecting of steps (f) to (h) is performed according to the parameters of Table (A); and
wherein in step (b) the composition is subject to at least one protease to achieve improved, preferably 100%, amino acid sequence coverage in one iteration; and
wherein the method provides for determination of post-translational modification comprising oxidation, deamidation and pyroglutamination in the peptide under analysis.
| MS/MS Parameters | Value |
|---|---|
| Scan Time | 0.2 sec |
| Stop MS/MS | TIC intensity <10000 |
| Timeout | 3 sec |
| Start MS/MS | TIC intensity >1000 |
| Collision energy (CE) (Ramp) | Low mass: 6-10 eV; High mass: 40-70 eV |

Table (A)
In another embodiment, the present invention discloses a method for the determination of amino acid sequence of a protein, the method comprising:
(a) obtaining a composition comprising the protein;
(b) subjecting the composition to Asp-N treatment to generate peptide fragments;
(c) separating the peptide fragments by liquid chromatography;
(d) ionizing the separated peptide fragments;
(e) detecting generated ions by mass spectrometry;
(f) selecting the ions of step (e);
(g) fragmenting the selected ions of step (f) and
(h) detecting the fragmented ions of step (g) by tandem mass spectrometry;
wherein the selecting, fragmenting and detecting of steps (f) to (h) is performed according to the parameters of Table (A);
wherein the method yields improved, preferably 100%, amino acid sequence coverage in one iteration; and
wherein the method provides for determination of post-translational modification comprising oxidation, deamidation and pyroglutamination in the peptide under analysis.
In yet another embodiment, the present invention discloses a method for the determination of amino acid sequence of a protein, the method comprising:
(a) obtaining a composition comprising the protein;
(b) subjecting the composition separately to trypsin and elastase treatment to generate peptide fragments; and performing steps (c) to (h) for trypsin and elastase-treated peptides separately;
(c) separating the peptide fragments by liquid chromatography;
(d) ionizing the separated peptide fragments;
(e) detecting generated ions by mass spectrometry;
(f) selecting the ions of step (e);
(g) fragmenting the selected ions of step (f) and
(h) detecting the fragmented ions of step (g) by tandem mass spectrometry;
wherein the selecting, fragmenting and detecting of steps (f) to (h) is performed according to the parameters of Table (A);
wherein the method yields improved, preferably 100% amino acid sequence coverage; and
wherein the method provides for determination of post-translational modification comprising oxidation, deamidation and pyroglutamination in the peptide under analysis.
In yet another embodiment the present invention discloses a method for the determination of amino acid sequence of a protein, the method comprising:
(a) obtaining a composition comprising the protein;
(b) subjecting the composition separately to trypsin, elastase and Asp-N to generate peptide fragments; and performing steps (c) to (h) for trypsin-, elastase- and Asp-N-treated peptides separately;
(c) separating the peptide fragments by liquid chromatography;
(d) ionizing the separated peptide fragments;
(e) detecting generated ions by mass spectrometry;
(f) selecting the ions of step (e);
(g) fragmenting the selected ions of step (f) and
(h) detecting the fragmented ions of step (g) by tandem mass spectrometry;
wherein the selecting, fragmenting and detecting of steps (f) to (h) is performed according to the parameters of Table (A);
wherein the method yields improved, preferably 100% amino acid sequence coverage; and
wherein the method provides for determination of post-translational modification comprising oxidation, deamidation and pyroglutamination in the peptide under analysis.
In a further embodiment, the method provides complete fragmentation of the peptide and complete coverage of both b and y ions, thereby delivering 100% amino acid sequence coverage, wherein the ion intensity of all the fragmented peptides is ≥2000 counts.
In a further embodiment, the method provides ≥80% amino acid sequence coverage of peptides.
In a related embodiment, the method provides ≥80% amino acid sequence coverage of peptides that are about 5 to about 35 amino acids long.
In a further embodiment, the method provides accurate determination of C-terminal basic amino acid in the peptide.
In a related embodiment, the protein is a recombinant product including an Fc-containing protein, an antibody, a monoclonal antibody or a fusion protein.
Definitions:
The term “collision energy” refers to the energy (in eV) which is used to accelerate the precursor ions in collision cell to induce fragmentation for MS/MS analysis.
The term “complete fragmentation” as used herein refers to 95%, 96%, 97%, 98%, 99% or 100% fragmentation of the peptide under analysis.
High definition data dependent acquisition (HD-DDA) is a novel MS method which incorporates instrumental and application benefits for the identification of proteins and peptides, where ion mobility spectrometry is incorporated into a quadrupole time-of-flight mass spectrometer. HD-DDA uses a high duty cycle mode and enhanced decision making to provide a highly sensitive and selective/specific experiment. HD-DDA enhancements include full support for Wideband Enhancement, which affords a signal increase of five- to ten-fold as well as enhanced decision making logic when switching between MS and MS/MS modes. Wideband Enhancement utilizes ion mobility separation of product ions of a single charge state in combination with pusher synchronization to achieve nearly 100% duty cycle. Spectral quality for low abundance species/peptides is significantly increased by HD-DDA’s method. The percentage of MS/MS spectra generating a positive match is dramatically increased using HD-DDA data.
Mass spectrometry is an analytical technique that is used to identify unknown compounds, quantify known materials, and elucidate the structural and physical properties of ions. Mass Spectrometry can be used in conjunction with chromatography techniques, such as LC-MS and GC-MS. Examples of mass spectrometry tools for use as detection agents include, but are not limited to, electron ionisation (EI), chemical ionisation (CI), fast atom bombardment (FAB)/liquid secondary ionisation (LSIMS), matrix assisted laser desorption ionisation (MALDI), and electrospray ionisation (ESI). See, for example, Gary Siuzdak, Mass Spectrometry for Biotechnology, Academic Press, San Diego, 1996.
The term “Reverse phase chromatography” is a chromatographic technique wherein mobile phase solute (e.g. proteins/peptides etc.) binds to an immobilized n-alkyl hydrocarbon or aromatic ligand via hydrophobic interaction. The biomolecules are then generally eluted using gradient elution instead of isocratic elution. While biomolecules are strongly adsorbed to the surface of a reversed phase matrix under aqueous/relatively less organic conditions, they desorb from the matrix within a very narrow window of organic/ relatively increased organic modifier concentration. Since biomolecules would vary in terms of their hydrophobicity, it is an efficient technique to separate biomolecules by using gradient of organic modifier and thus pattern their separation.
The term “MS/MS scan time” refers to the duration (in seconds), over which ion detections in each individual MS/MS spectrum is recorded.
The term “start MS/MS” refers to the selection setting of the peak where MS/MS acquisition begins. This is either based on total ion count or intensity of the individual ions.
The term “stop MS/MS” refers to the scan setting where MS/MS acquisition is terminated which is usually based on either with reference to start MS/MS time or based on ion intensity threshold.
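As a rough illustration of how the start, stop and timeout settings of Table (A) interact, the following Python sketch models the switching decision. It is a simplified reading of the definitions above, not the instrument's actual control logic; the function names and the example TIC values are hypothetical.

```python
START_TIC = 1000.0    # counts; "Start MS/MS" threshold from Table (A)
STOP_TIC = 10000.0    # counts; "Stop MS/MS" threshold from Table (A)
TIMEOUT_S = 3.0       # seconds; "Timeout" from Table (A)
SCAN_TIME_S = 0.2     # seconds; MS/MS scan time from Table (A)

def should_start_msms(survey_tic: float) -> bool:
    """Switch from MS to MS/MS once the TIC rises above the start threshold."""
    return survey_tic > START_TIC

def msms_scans_acquired(msms_tic_per_scan: list) -> int:
    """Count MS/MS scans recorded before a stop condition is met.

    Assumption: acquisition ends after the first scan whose TIC falls
    below STOP_TIC, or when the timeout is reached, whichever is first.
    """
    max_scans = round(TIMEOUT_S / SCAN_TIME_S)  # 15 scans fit in 3 sec
    count = 0
    for tic in msms_tic_per_scan[:max_scans]:
        count += 1
        if tic < STOP_TIC:
            break
    return count

print(should_start_msms(1500.0))
print(msms_scans_acquired([50000.0, 30000.0, 8000.0, 40000.0]))
```

Under this reading, a precursor is fragmented for at most 15 scans of 0.2 sec each before the instrument returns to the survey scan.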
The term “sequence coverage” or “coverage” as used herein refers to the extent to which the exact identity of the amino acid and corresponding amino acid sequence in the peptide under analysis is determined by employing the claimed method. In an MS/MS spectrum, the difference between successive peaks of a b ion series or y ion series helps in determining the identity and sequence of amino acids in the peptide. The term “improved amino acid sequence coverage” as used herein refers to at least 6% or more increase in correct determination of amino acids sequence when compared to a method in the art. The term “100% amino acid sequence coverage” or “complete sequence coverage” as used herein refers to correct determination of all amino acids in the sequence.
The term “one iteration” as used herein refers to one experimental run comprising one protease treatment, one liquid chromatography-based separation and one MS/MS detection and analysis.
Abbreviations:
CD20: cluster of differentiation 20
CE: collision energy
CID: collision-induced dissociation
DDA: data-dependent acquisition
EDTA: ethylenediamine tetraacetic acid
eV: Electronvolt
Fc: fragment crystallizable region (pertaining to immunoglobulin)
GdnHCl: guanidinium hydrochloride
HC: heavy chain
HD-DDA: high-definition data-dependent acquisition
IAM: Iodoacetamide
LC: light chain
LC-MS/MS: liquid chromatography coupled with tandem mass spectrometry
LC-MS: liquid chromatography coupled with mass spectrometry
m/z: mass-to-charge ratio
M: Molar
min: Minute
mM: Millimolar
PD-1: programmed cell death protein - 1
sec: Second
TFA: Trifluoroacetic acid
TIC: Total ion chromatogram
UPLC: ultraperformance liquid chromatography

EXAMPLES
Those skilled in the art will recognize that several embodiments are possible within the scope and spirit of this invention. The invention will now be described in greater detail by reference to the following non-limiting examples. The following examples further illustrate the invention but, of course, should not be construed as in any way limiting its scope.
The disclosed method is applicable for MS/MS sequence mapping for all biologic molecules.
Example 1: Sample preparation
Sample preparation involves the steps of denaturation, reduction and alkylation, followed by treatment with one or more protease.
Denaturation is performed to unfold the protein, such that reducing agents and proteases have greater access to the protein structure. For this, the protein sample was diluted to a concentration of 1 mg/mL with denaturation buffer (pH 7.5) comprising EDTA and guanidinium hydrochloride (GdnHCl). Next, disulfide bonds were reduced in a solution containing 10 mM DTT; the sample was mixed gently and incubated at 37 °C for 30 min. To alkylate the free sulfhydryl groups and thereby prevent disulfide bonds from re-forming, 20 µL of sample was treated with 20 mM iodoacetamide (IAM). The sample was mixed gently and incubated at room temperature for 40 min in the dark.
Subsequently, the reduced and alkylated protein was buffer-exchanged into protease digestion buffer (comprising 1 mM EDTA, 1 M urea, 20 mM hydroxylammonium chloride and 0.1 M Tris, pH 7.5) using a PD-10 desalting column (Cytiva™) to remove remnants of GdnHCl, DTT and IAM. For this, the column was washed with water and equilibrated with digestion buffer, followed by loading of the alkylated protein sample. Protein bound to the column was eluted with protease digestion buffer. Protein content was independently assessed by the Bradford method, and fractions containing protein at a concentration of 1 mg/mL were then subjected to protease digestion with one or more of the following enzymes, as described below.
Digestion with protease:
a) Trypsin: Trypsin is a serine protease which hydrolyzes proteins at the carboxyl terminal of lysine and arginine. The protein sample was digested with trypsin at a protein:enzyme ratio of 25:1 and incubated at 37 °C for more than 16 hours.
b) Asp-N: Endoproteinase Asp-N is a zinc metalloendopeptidase which cleaves at the amino terminal of aspartic acid residues. The protein sample was mixed with Asp-N at a protein:enzyme ratio of 25:1 and incubated at 37 °C for more than 16 hours.
c) Elastase: Elastase catalyzes cleavage at the carboxyl terminal of hydrophobic amino acids such as glycine, alanine, valine, serine, leucine and isoleucine. The protein sample was mixed with elastase at a protein:enzyme ratio of 25:1 and incubated at 37 °C for 1 hour.
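The cleavage rules listed above can be sketched as a simple in silico digestion in Python. This is an idealised illustration only: it assumes complete cleavage at every listed site, ignores missed cleavages (such as the T6-7 peptide reported in Table 6) and any sequence-context effects, and the function name `digest` is hypothetical. Elastase in particular is far less specific in practice than this rule suggests.

```python
import re
from typing import List

# Zero-width patterns marking cleavage positions, following the rules
# stated in the text: trypsin after K/R, Asp-N before D, elastase after
# G, A, V, S, L or I.
CLEAVAGE_RULES = {
    "trypsin": r"(?<=[KR])",
    "asp-n": r"(?=D)",
    "elastase": r"(?<=[GAVSLI])",
}

def digest(sequence: str, protease: str) -> List[str]:
    """Split a protein sequence at every cleavage site of the protease."""
    pattern = CLEAVAGE_RULES[protease.lower()]
    # re.split supports zero-width matches in Python 3.7 and later.
    return [piece for piece in re.split(pattern, sequence) if piece]

# First two tryptic peptides of the Mab1 light chain from Table 6, joined.
segment = "DIQMTQSPSSLSASVGDRVTITCR"
print(digest(segment, "trypsin"))
print(digest(segment, "asp-n"))
```

The same segment yields different peptide boundaries with each protease, which is why digests from separate enzymes can cover regions that a single digest misses.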
Example 2: Separation by liquid chromatography
Following proteolytic digestion, the protein sample was subjected to ultraperformance liquid chromatographic (UPLC) separation. The method was performed as per the parameters in Table 1 (UPLC parameters) and Table 2 (mobile phase gradient).
| Parameter | Setting |
|---|---|
| Mobile Phase A | 100% Milli-Q Water |
| Mobile Phase B | 100% Acetonitrile |
| Mobile Phase C | 1% TFA |
| Column | Waters Acquity UPLC BEH C18, 2.1 × 150 mm |
| Column Temperature | 60°C ± 3°C |
| Sample Temperature | 6°C ± 3°C |
| Wavelength | 214/280 nm |

Table 1
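| Time (min) | %A | %B | %C | Curve |
|---|---|---|---|---|
| 0 | 88 | 2 | 10 | Initial |
| 3 | 88 | 2 | 10 | 6 |
| 10 | 84 | 6 | 10 | 6 |
| 40 | 70 | 20 | 10 | 6 |
| 85 | 48 | 42 | 10 | 6 |
| 90 | 2 | 88 | 10 | 6 |
| 95 | 2 | 88 | 10 | 6 |
| 95.1 | 88 | 2 | 10 | 6 |
| 100 | 88 | 2 | 10 | 6 |

Table 2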
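To read the gradient of Table 2 at an arbitrary time point, the composition can be interpolated between adjacent rows. The Python sketch below assumes that curve type 6 denotes a linear change between time points, which is the usual Waters convention but is not stated in this document; the function name `composition_at` is hypothetical.

```python
from bisect import bisect_right

# (time in min, %A, %B, %C), copied from Table 2
GRADIENT = [
    (0, 88, 2, 10), (3, 88, 2, 10), (10, 84, 6, 10), (40, 70, 20, 10),
    (85, 48, 42, 10), (90, 2, 88, 10), (95, 2, 88, 10),
    (95.1, 88, 2, 10), (100, 88, 2, 10),
]

def composition_at(t: float) -> tuple:
    """Return (%A, %B, %C) at time t, assuming linear segments."""
    times = [row[0] for row in GRADIENT]
    if not times[0] <= t <= times[-1]:
        raise ValueError("time outside the gradient programme")
    i = bisect_right(times, t) - 1          # segment that contains t
    if i == len(GRADIENT) - 1:              # exactly at the final time point
        return tuple(float(v) for v in GRADIENT[-1][1:])
    t0, *c0 = GRADIENT[i]
    t1, *c1 = GRADIENT[i + 1]
    frac = (t - t0) / (t1 - t0)
    return tuple(a + frac * (b - a) for a, b in zip(c0, c1))

print(composition_at(25))
```

Because mobile phase C (1% TFA) is held at 10% throughout, the effective TFA concentration in the eluent stays at about 0.1% for the whole run.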
Example 3: Elucidation of primary structure of trypsin-digested proteins
Elucidation of primary structure was performed with LC-MS/MS. The UPLC instrument was connected to the mass spectrometer, and the protein sample obtained after sample preparation and separation was passed directly through the ion source of the mass spectrometer. The sample was analysed either by high-definition data-dependent acquisition (HD-DDA) on a Synapt G2-Si HDMS mass spectrometer or by the presently disclosed method on the Xevo® G2-XS QTof (UNIFI platform). The HD-DDA MS analysis was performed as described in the Indian patent application 201941009557. The source parameters, MS parameters and MS/MS parameters of the presently disclosed method can be found in Table 3, Table 4 and Table 5 respectively.
Table 6 depicts analysis results of trypsin-digested antibody ‘Mab1’ light chain (LC) and heavy chain (HC) peptides. In the example of Mab1 shown, sequencing of the heavy chain and light chain yielded greater coverage by the present method than by the HD-DDA method. There was a 6% increase in total sequence coverage (77% versus 71%) using trypsin digestion alone. More specifically, the present method yielded 100% coverage of a peptide sequence (LC: T6-7, VEIKR) which otherwise yielded only 62.5% coverage by the HD-DDA method.
Table 7 gives a similar depiction for trypsin-digested antibody ‘Mab2’, together with the percentage sequence coverage and hydrophobicity of each peptide. The method yielded ≥80% amino acid sequence coverage for peptides with a net hydrophobicity of about 7.95% to about 44% that are about 5 to about 35 amino acids long. The data reveal that percentage coverage is largely independent of the hydrophobicity (Figure 1) or length (Figure 2) of the peptide under analysis, indicating that the method is applicable to the analysis of diverse proteins and can serve as a platform method for MS/MS-based sequence analysis.

| Source Parameters | Set Point |
|---|---|
| Source Temperature | 120°C |
| Capillary Voltage | 3.5 V |
| Cone Voltage | 50 V |
| De-solvation Temperature | 500°C |
| De-solvation gas flow | 1000°C |
| Cone gas flow | 50 L/min |

Table 3

| MS Parameters | Set Point | Method Operating Range |
|---|---|---|
| Scan Range | 50-2000 m/z | 50-5000 |
| Polarity | Positive | |
| Scan Time | 0.3 sec | 0.1-1 |

Table 4

| MS/MS Parameters | Value |
|---|---|
| Scan Time | 0.2 sec |
| Stop MS/MS | When TIC falls below 10000 intensity (counts) |
| Timeout | 3 sec |
| Start MS/MS | When TIC rises above 1000 intensity (counts) |
| Collision energy (Ramp) | Low mass: 6-10 eV; High mass: 40-70 eV |

Table 5
| Peptide ID | Reference amino acid sequence | No. of residues in reference sequence | HD-DDA method (% MS/MS coverage) | Residues covered (HD-DDA) | Claimed DDA method (% coverage) | Residues covered (Present method) |
|---|---|---|---|---|---|---|
| LC: T1 | DIQMTQSPSSLSASVGDR | 18 | 82.4 | 14.8 | 94.1 | 16.9 |
| LC: T2& | VTITCR | 6 | 90 | 5.4 | 90 | 5.4 |
| LC: T3 | ASSSVSYMHWYQQKPGK | 17 | 90.6 | 15.4 | 96.9 | 16.5 |
| LC: T4 | APKPLIYAPSNLASGVPSR | 19 | 86.1 | 16.4 | 83.3 | 15.8 |
| LC: T5& | FSGSGSGTDFTLTISSLQPEDFATYYCQQWSFNPPTFGQGTK | 42 | 51.2 | 21.5 | 53.7 | 22.5 |
| LC: T6-7 | VEIKR | 5 | 62.5 | 3.1 | 100 | 5.0 |
| LC: T8 | TVAAPSVFIFPPSDEQLK | 18 | 76.5 | 13.8 | 94.1 | 16.9 |
| LC: T9& | SGTASVVCLLNNFYPR | 16 | 83.3 | 13.3 | 93.3 | 14.9 |
| LC: T11 | VQWK | 4 | 84.3 | 3.4 | 100 | 4.0 |
| LC: T12 | VDNALQSGNSQESVTEQDSK | 20 | 86.8 | 17.4 | 97.4 | 19.5 |
| LC: T13 | DSTYSLSSTLTLSK | 14 | 84.6 | 11.8 | 96.2 | 13.5 |
| LC: T14 | ADYEK | 5 | 87.5 | 4.4 | 87.5 | 4.4 |
| LC: T16& | VYACEVTHQGLSSPVTK | 17 | 96.9 | 16.5 | 96.9 | 16.5 |
| LC: T17 | SFNR | 4 | 83.3 | 3.3 | 83.3 | 3.3 |
| HC: T1 | EVQLVESGGGLVQPGGSLR | 19 | 83.3 | 15.8 | 100 | 19.0 |
| HC: T2& | LSCAASGYTFTSYNMHWVR | 19 | 80.6 | 15.3 | 75 | 14.3 |
| HC: T4 | GLEWVGAIYPGNGDTSYNQK | 20 | 71.1 | 14.2 | 86.8 | 17.4 |
| HC: T7 | FTISVDK | 7 | 75 | 5.3 | 75 | 5.3 |
| HC: T9 | NTLYLQMNSLR | 11 | 85 | 9.4 | 100 | 11.0 |
| HC: T10& | AEDTAVYYCAR | 11 | 85 | 9.4 | 95 | 10.5 |
| HC: T12 | GPSVFPLAPSSK | 12 | 90.9 | 10.9 | 90.9 | 10.9 |
| HC: T13& | STSGGTAALGCLVK | 14 | 96.2 | 13.5 | 96.2 | 13.5 |
| HC: T14& | DYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTK | 64 | 30.7 | 19.6 | 34.7 | 22.2 |
| HC: T14& | DYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTQTYICNVNHKPSNTK | 63 | 24.2 | 15.2 | 33.1 | 20.8 |
| HC: T19& | THTCPPCPAPELLGGPSVFLFPPKPK | 26 | 86 | 22.4 | 92 | 23.9 |
| HC: T20 | DTLMISR | 7 | 91.7 | 6.4 | 100 | 7.0 |
| HC: T21& | TPEVTCVVVDVSHEDPEVK | 19 | 94.4 | 17.9 | 100 | 19.0 |
| HC: T22 | FNWYVDGVEVHNAK | 14 | 96.2 | 13.5 | 96.2 | 13.5 |
| HC: T24 | EEQYNSTYR | 9 | 43.8 | 3.9 | 43.8 | 3.9 |
| HC: T25 | VVSVLTVLHQDWLNGK | 16 | 96.7 | 15.5 | 96.7 | 15.5 |
| HC: T29 | ALPAPIEK | 8 | 92.9 | 7.4 | 92.9 | 7.4 |
| HC: T30 | TISK | 4 | 83.3 | 3.3 | 83.3 | 3.3 |
| HC: T32 | GQPR | 4 | 83.3 | 3.3 | 83.3 | 3.3 |
| HC: T33 | EPQVYTLPPSR | 11 | 95 | 10.5 | 100 | 11.0 |
| HC: T34 | EEMTK | 5 | 100 | 5.0 | 100 | 5.0 |
| HC: T35& | NQVSLTCLVK | 10 | 100 | 10.0 | 94.4 | 9.4 |
| HC: T36 | GFYPSDIAVEWESNGQPENNYK | 22 | 73.8 | 16.2 | 90.0 | 17.8 |
| HC: T37 | TTPPVLDSDGSFFLYSK | 17 | 84.4 | 14.3 | 87.5 | 14.9 |
| HC: T38 | LTVDK | 5 | 87.5 | 4.4 | 87.5 | 4.4 |
| HC: T40& | WQQGNVFSCSVMHEALHNHYTQK | 23 | 72.7 | 16.7 | 86.4 | 19.9 |
| HC: T41& | SLSLSPG | 7 | 71.4 | 5.0 | 71.4 | 5.0 |
| Total | | 652 | | 465 | | 504 |
| % Total coverage | | | 71% | | 77% | |

Table 6
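The arithmetic behind Table 6 can be checked with a few lines of Python. The sketch assumes that the "Residues covered" columns are the percentage coverage multiplied by the number of residues, and that the total coverage is the summed residues covered divided by the total residue count; both readings are inferred from the tabulated values rather than stated in the text, and the function names are hypothetical.

```python
def residues_covered(pct_coverage: float, n_residues: int) -> float:
    """Residues covered = % coverage x number of residues, to 1 decimal."""
    return round(pct_coverage / 100 * n_residues, 1)

def total_coverage(total_covered: float, total_residues: int) -> float:
    """Overall % coverage from the summed residues covered."""
    return 100 * total_covered / total_residues

# LC: T1 (18 residues): 82.4% by HD-DDA, 94.1% by the present method
print(residues_covered(82.4, 18), residues_covered(94.1, 18))

# Totals row of Table 6: 652 residues, 465 vs 504 residues covered
hd_dda_total = total_coverage(465, 652)
present_total = total_coverage(504, 652)
print(round(hd_dda_total), round(present_total))
```

The difference between the two rounded totals corresponds to the 6% increase in total sequence coverage reported for trypsin digestion alone.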
| Peptide ID | Reference amino acid sequence | % MS/MS coverage in HD-DDA | % MS/MS coverage by present method | Hydrophobicity |
|---|---|---|---|---|
| P-HC:T15& | DYFPEPVTVSWNSGALTSGVHTFPAVLQSSGLYSLSSVVTVPSSSLGTK | 60.42 | 69.79 | 52.41 |
| P-HC:T20& | YGPPCPPCPAPEFLGGPSVFLFPPKPK | 80.77 | 86.54 | 44.37 |
| P-HC:T22& | TPEVTCVVVDVSQEDPEVQFNWYVDGVEVHNAK | 78.13 | 85.94 | 44.02 |
| P-LC:T5 | LLIYLASYLESGVPAR | 86.67 | 96.67 | 42.35 |
| P-LC:T6& | FSGSGSGTDFTLTISSLEPEDFAVYYCQHSR | 73.33 | 83.33 | 41.59 |
| P-LC:T11& | SGTASVVCLLNNFYPR | 86.67 | 96.67 | 40.32 |
| P-HC:T25 | VVSVLTVLHQDWLNGK | 93.33 | 96.67 | 39.99 |
| P-HC:T4 | ASGYTFTNYYMYWVR | 75 | 96.43 | 38.38 |
| P-LC:T10 | TVAAPSVFIFPPSDEQLK | 64.71 | 97.06 | 38.33 |
| P-HC:T12 | FDMGFDYWGQGTTVTVSSASTK | 78.57 | 90.48 | 38.21 |
| P-HC:T36 | TTPPVLDSDGSFFLYSR | 87.5 | 93.75 | 37.04 |
| P-HC:T35 | GFYPSDIAVEWESNGQPENNYK | 71.43 | 80.95 | 34.98 |
| P-LC:T1 | EIVLTQSPATLSLSPGER | 85.29 | 97.06 | 34.13 |
| P-HC:T5 | QAPGQGLEWMGGINPSNGGTNFNEK | 68.75 | 85.42 | 33.34 |
| P-HC:T39& | WQEGNVFSCSVMHEALHNHYTQK | 79.55 | 86.36 | 33.32 |
| P-HC:T8 | VTLTTDSSTTTAYMELK | 87.5 | 96.88 | 30.73 |
| P-LC:T4 | GVSTSGYSYLHWYQQKPGQAPR | 76.19 | 95.24 | 30.24 |
| P-HC:T9& | SLQFDDTAVYYCAR | 80.77 | 96.15 | 30.12 |
| P-LC:T15 | DSTYSLSSTLTLSK | 80.77 | 96.15 | 28.65 |
| P-HC:T33 | EPQVYTLPPSQEEMTK | 86.67 | 96.67 | 27.22 |
| P-HC:T13& | GPSVFPLAPCSR | 90.91 | 95.45 | 26.33 |
| P-LC:T18& | VYACEVTHQGLSSPVTK | 78.13 | 96.88 | 26.03 |
| P-LC:T7 | DLPLTFGGGTK | 75 | 95 | 25.28 |
| P-HC:T34& | NQVSLTCLVK | 88.89 | 94.44 | 24.97 |
| P-HC:T14& | STSESTAALGCLVK | 88.46 | 96.15 | 24.76 |
| P-HC:T40 | SLSLSLG | 78.57 | 83.33 | 22.96 |
| P-HC:T1& | QVQLVQSGVEVK | 68.18 | 100 | 22.32 |
| P-HC:T29 | GLPSSIEK | 78.57 | 92.86 | 15.18 |
| P-LC:T14 | VDNALQSGNSQESVTEQDSK | 89.47 | 97.37 | 14.55 |
| P-HC:T24 | EEQFNSTYR | 56.25 | 81.25 | 13 |
| P-LC:T13 | VQWK | 83.33 | 100 | 10.28 |
| P-HC:T16& | TYTCNVDHKPSNTK | 92.31 | 96.15 | 7.95 |

Table 7

Example 4: Elucidation of primary structure of antibody heavy and light chains
Sample preparation of protein sample containing antibody ‘Mab3’ was performed as aforementioned. For digestion, eluted fraction from PD-10 column was digested with trypsin. Additionally, freshly eluted fractions were individually digested with Asp-N and elastase and subjected to LC-MS/MS separation as described earlier. Reference sequence of Mab3 was fed into the UNIFI software platform, which was used for data processing & analysis.
It was found that MS/MS data of the sample digested with trypsin alone yielded more than 90% total amino acid sequence coverage, as tabulated in Table 8.
| Enzyme | Light/heavy chain | % Sequence coverage at MS/MS level |
|---|---|---|
| Trypsin | Light chain | 94.0% |
| Trypsin | Heavy chain | 87.4% |
| % Total sequence coverage | | 90.7% |

Table 8
Trypsin is the first protease of choice for protein digestion. However, smaller peptides generated as a result of trypsin digestion might elute out in the void region and can be missed from analysis. Hence, separately digesting with alternate proteases can improve overall sequence coverage.
Employing present method, when data from digests of trypsin and other proteases was combined, 100% sequence coverage of Mab3 light chain could be achieved with trypsin and elastase digestion alone (2 enzymes) in one iteration (Table 9, Figure 3). Similarly, in the case of heavy chain, stitching together data of 3 enzyme digestions (trypsin, Asp-N, elastase) yielded 100% sequence coverage (Table 10, Figure 4) in one iteration. The experimentally verified sequence of Mab3 light and heavy chains obtained using the disclosed method thus matched 100% with the theoretical sequence.
| Enzyme | Light/heavy chain | % sequence coverage at MS/MS level |
|---|---|---|
| Trypsin | Light chain | 94% |
| Elastase | Light chain | 6% |
| % Total sequence coverage | | 100% |

Table 9

| Enzyme | Light/heavy chain | % sequence coverage at MS/MS level |
|---|---|---|
| Trypsin | Heavy chain | 87.4% |
| Asp-N | Heavy chain | 5.3% |
| Elastase | Heavy chain | 7.3% |
| % Total sequence coverage | | 100% |

Table 10
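Combining ("stitching together") coverage from separate digests, as in Tables 9 and 10, amounts to taking the union of the sequence positions confirmed by each digest. The Python sketch below illustrates this with a toy 50-residue chain; the position ranges are hypothetical and were chosen only to mirror the proportions of Table 9, where the elastase figure reads as the additional coverage beyond trypsin.

```python
from typing import Dict, List, Tuple

def combined_coverage(seq_len: int,
                      digests: Dict[str, List[Tuple[int, int]]]) -> float:
    """Percent of positions covered by at least one digest.

    digests maps an enzyme name to the (start, end) ranges, 1-based and
    inclusive, that were confirmed at MS/MS level in that digest.
    """
    covered = set()
    for ranges in digests.values():
        for start, end in ranges:
            covered.update(range(start, end + 1))
    return 100 * len(covered) / seq_len

# Toy example: trypsin confirms positions 1-47, elastase confirms 45-50.
toy = {"trypsin": [(1, 47)], "elastase": [(45, 50)]}
trypsin_only = combined_coverage(50, {"trypsin": toy["trypsin"]})
both = combined_coverage(50, toy)
print(trypsin_only, both)
```

Using a set of positions means that overlapping regions are counted once, so the combined value can never exceed 100%.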
Example 5: Percentage fragmentation
Generally, in data-dependent acquisition, the most intense precursor ions are selected and then subjected to high-energy collisions, resulting in fragments extending from the N terminus and C terminus (b and y ions respectively). These ions are subsequently acquired for MS/MS analysis. A good MS/MS fragmentation-acquisition strategy should have collision energy settings that achieve maximal fragmentation. Thus, during development, an aim of the present method was to achieve complete fragmentation of the peptide (and subsequently maximum coverage of the fragmented ions, as discussed in the following Example) with a minimum number of iterations for a given sequencing exercise. In other words, the aim was to generate and analyze as many b and y ions as possible in a single run, so that the method would ideally deliver 100% sequence coverage in as few iterations as possible.
% Fragmentation was calculated as:
$$\%\ \text{Fragmentation} = \frac{\text{Observed no. of primary ions}}{\text{Total no. of primary ions}} \times 100$$
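The formula can be written as a short Python function. The sketch assumes that the "primary ions" of a peptide with n residues are its n - 1 b ions plus its n - 1 y ions, i.e. 2(n - 1) ions in total. This is an inference, not a statement from the text, although it is consistent with the tabulated values (for example, 97.06% for an 18-residue peptide corresponds to 33 of 34 ions).

```python
def percent_fragmentation(observed_ions: int, peptide_length: int) -> float:
    """% fragmentation = observed primary ions / total primary ions x 100.

    Assumption: total primary ions = (n - 1) b ions + (n - 1) y ions.
    """
    total_ions = 2 * (peptide_length - 1)
    return 100 * observed_ions / total_ions

# 18-residue peptide with 33 of its 34 b/y ions observed
print(round(percent_fragmentation(33, 18), 2))
# 4-residue peptide (e.g. VQWK) with 5 of its 6 b/y ions observed
print(round(percent_fragmentation(5, 4), 2))
```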
Various parameters were analyzed to assess the level of fragmentation as shown in Table 11. The results of the different trials using various light chain (LC) and heavy chain (HC) fragments as well as the resultant % fragmentation are tabulated in Tables 12 & 13. Parameters of present method (listed in last column of Table 11) resulted in maximum % fragmentation among all the samples tested, leading to selection of the said parameters.

| MS-MS/MS parameters | Low CE Trial | High CE Trial | Increased scan time Trial | Present method Trial |
|---|---|---|---|---|
| Low CE | 6 to 10 | 10 to 20 | 6 to 10 | 6 to 10 |
| High CE | 25 to 50 | 45 to 75 | 40 to 60 | 40 to 60 |
| MS Scan Time (sec) | 0.3 | 0.3 | 0.5 | 0.3 |
| MS/MS Scan Time (sec) | 0.2 | 0.2 | 0.3 | 0.2 |

Table 11

% MS/MS Coverage of LC

| MD Parameters | LC:T1 | LC:T4 | LC:T5 | LC:T7 | LC:T10 | LC:T11 |
|---|---|---|---|---|---|---|
| Low CE Trial | 94.12 | 90.48 | 93.33 | 90 | 94.12 | 90 |
| High CE Trial | 91.18 | 88.1 | 96.67 | 90 | 91.18 | 90 |
| Increased Scan time Trial | 97.06 | 90.48 | 96.67 | 90 | 97.06 | 90 |
| Present Method Trial | 97.06 | 95.24 | 96.67 | 95 | 97.06 | 96.67 |

Table 12
% MS/MS Coverage of HC

| MD Parameters | HC:T1 | HC:T4 | HC:T8 | HC:T9 | HC:T13 | HC:T14 |
|---|---|---|---|---|---|---|
| Low CE Trial | 95.45 | 89.29 | 96.88 | 96.15 | 95.45 | 96.15 |
| High CE Trial | 100 | 92.86 | 96.88 | 96.15 | 81.82 | 92.31 |
| Increased Scan time Trial | 100 | 92.86 | 96.88 | 96.15 | 92.45 | 96.15 |
| Present Method Trial | 100 | 96.43 | 96.88 | 96.15 | 95.45 | 96.15 |

Table 13
Example 6: Extent of coverage and comparative intensities of b and y ions
In an MS/MS spectrum, the difference between successive peaks of a b ion series or y ion series helps in determining the series of amino acids in the peptide under analysis.
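The idea of reading residues from the spacing of successive b or y ions can be sketched in Python as follows. The residue masses are standard monoisotopic values; the function names, the 0.01 tolerance and the use of ideal, noise-free singly charged ions are illustrative assumptions. Leucine and isoleucine have identical masses, so the sketch reports "L" for either.

```python
# Standard monoisotopic residue masses (Da)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565

def b_ions(peptide: str) -> list:
    """Singly charged b-ion m/z values b1..b(n-1)."""
    total, series = PROTON, []
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        series.append(total)
    return series

def y_ions(peptide: str) -> list:
    """Singly charged y-ion m/z values y1..y(n-1)."""
    total, series = WATER + PROTON, []
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        series.append(total)
    return series

def read_ladder(mz_series: list, tol: float = 0.01) -> str:
    """Name the residue matching each gap between successive ions."""
    residues = []
    for low, high in zip(mz_series, mz_series[1:]):
        delta = high - low
        hits = [aa for aa, m in RESIDUE_MASS.items() if abs(m - delta) <= tol]
        residues.append(hits[0] if hits else "?")
    return "".join(residues)

print(read_ladder(b_ions("VQWK")))   # read from the N terminus
print(read_ladder(y_ions("VQWK")))   # read from the C terminus
```

A gap that matches no residue is reported as "?", which is the situation a missing b or y ion creates and the reason why complete ion coverage matters for sequencing.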
A superior MS/MS methodology should have not only energy settings that maximize fragmentation, but also settings that maximize coverage of the resulting b and y ions. With the present method, increased coverage of both b and y ions was observed (Figure 5). Figure 5 shows comparative MS/MS spectra of a representative peptide sequenced using HD-DDA and using the method of the present invention. As is evident from the figure, the present method gives improved fragmentation as well as improved coverage of both b and y ions, thereby considerably reducing the number of iterations needed to achieve maximum sequence coverage. In the example of Figure 5, the peptide stretch ‘PPSDEQLK’ (encircled at the top) is completely lost from the analysis range of the HD-DDA method but is fully captured by the presently disclosed method.
Figure 6 gives another representative view of the same aspect. Figures 5 and 6 also both show that the present method yields a higher-quality spectrum because of the higher intensity of the fragment ions.
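As a worked illustration of reading a sequence from successive peaks, the sketch below computes theoretical singly charged b and y ions for the peptide stretch 'PPSDEQLK' and assigns a residue to each gap in an ion ladder. It assumes standard monoisotopic residue masses; the function names and the 0.01 Da tolerance are hypothetical choices and not settings of the disclosed method.

```python
# Standard monoisotopic residue masses in Da (assumption: unmodified residues).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276
WATER = 18.010565


def b_ions(peptide):
    """Singly charged b1..b(n-1) m/z values (N-terminal fragments)."""
    series, total = [], 0.0
    for aa in peptide[:-1]:
        total += MONO[aa]
        series.append(total + PROTON)
    return series


def y_ions(peptide):
    """Singly charged y1..y(n-1) m/z values (C-terminal fragments)."""
    series, total = [], WATER
    for aa in reversed(peptide[1:]):
        total += MONO[aa]
        series.append(total + PROTON)
    return series


def read_residues(ladder, tol=0.01):
    """Assign a residue to each gap between successive peaks of one ion series."""
    calls = []
    for low, high in zip(ladder, ladder[1:]):
        delta = high - low
        hits = [aa for aa, mass in MONO.items() if abs(mass - delta) <= tol]
        calls.append("/".join(hits) if hits else "?")
    return calls


if __name__ == "__main__":
    peptide = "PPSDEQLK"
    print("b ladder reads (N to C):", read_residues(b_ions(peptide)))
    print("y ladder reads (C to N):", read_residues(y_ions(peptide)))
```

Leucine and isoleucine have identical masses, so the sketch reports them together as "L/I". A gap that matches no residue within tolerance is reported as "?", which is the situation that incomplete b or y ion coverage produces in practice.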
Example 7: Coverage of y1 ions
In a trypsin-digested sample, the C-terminal residue of a peptide is always either lysine or arginine. A y1 ion with a mass (m/z) of 147 or 175 in a given mass spectrum indicates the presence of lysine or arginine, respectively, at the C-terminal end of the peptide. In conventional DDA-based methods, it is often difficult to locate the y1 ion of a tryptic peptide because of its low ion intensity. Because of this limitation, researchers have to rely on multi-enzyme data to confirm the identity of the C-terminal residue.
With the presently disclosed method, the y1 ion was successfully detected, without any such loss, for all the tryptic peptides analyzed; it would otherwise have been missed with conventional DDA methods. Figure 7 shows MS/MS spectra from two representative runs of trypsin-digested peptides in which the y1 ion representing lysine was picked up by the claimed method but missed by conventional DDA analysis.
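The y1 values of 147 and 175 follow directly from the residue masses of lysine and arginine. The sketch below is illustrative only: it assumes standard monoisotopic masses and a singly charged y1 ion, and the function names, the peak-list format of (m/z, intensity) pairs, and the 0.02 Da tolerance are hypothetical.

```python
PROTON = 1.007276
WATER = 18.010565
# Monoisotopic residue masses (Da) of the two possible tryptic C-terminal residues.
Y1_RESIDUES = {"K": 128.09496, "R": 156.10111}


def y1_mz(residue):
    """Theoretical m/z of a singly charged y1 ion: residue + water + proton."""
    return Y1_RESIDUES[residue] + WATER + PROTON


def c_terminal_residue(peaks, tol=0.02, min_intensity=0.0):
    """Return 'K' or 'R' for the most intense peak matching a y1 ion, else None.

    peaks: iterable of (mz, intensity) pairs from one MS/MS spectrum.
    """
    best = None
    for mz, intensity in peaks:
        if intensity < min_intensity:
            continue
        for residue in Y1_RESIDUES:
            if abs(mz - y1_mz(residue)) <= tol:
                if best is None or intensity > best[1]:
                    best = (residue, intensity)
    return best[0] if best else None


if __name__ == "__main__":
    # Made-up peak list for illustration; not data from the specification.
    spectrum = [(147.113, 3500.0), (260.197, 8200.0), (388.255, 6100.0)]
    print("y1(K) m/z:", round(y1_mz("K"), 4))
    print("y1(R) m/z:", round(y1_mz("R"), 4))
    print("C-terminal residue:", c_terminal_residue(spectrum))
```

Raising min_intensity mimics the conventional case described above: when the y1 peak falls below the usable intensity, the function returns None and the C-terminal residue cannot be confirmed from that spectrum alone.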
Example 8: Identification of post-translational modifications
With the present method, it was possible to determine post-translational modifications of peptides with a high level of certainty. For example, Figure 8 shows an MS/MS spectrum of a peptide with the amino acid sequence “DTLMISR”. A site-specific modification (specifically, oxidation) at the 4th amino acid, “M”, produced a corresponding shift in the readings from the 4th amino acid onwards, confirming the utility of the present method in identifying post-translational modifications in the peptide under analysis.
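The shift "from the 4th amino acid onwards" can be reproduced with a short calculation. The sketch below compares theoretical singly charged b ions of “DTLMISR” with and without oxidation of the methionine at position 4. It assumes standard monoisotopic residue masses and the usual oxidation mass shift of +15.994915 Da; the function names are hypothetical and the values are theoretical, not read from Figure 8.

```python
# Monoisotopic residue masses (Da) for the residues of DTLMISR only.
MONO = {
    "D": 115.02694, "T": 101.04768, "L": 113.08406, "M": 131.04049,
    "I": 113.08406, "S": 87.03203, "R": 156.10111,
}
PROTON = 1.007276
OXIDATION = 15.994915  # mass added by one oxygen atom


def b_series(peptide, mods=None):
    """Singly charged b1..b(n-1) m/z values.

    mods maps a 1-based residue position to a mass shift in Da.
    """
    mods = mods or {}
    total, series = 0.0, []
    for position, aa in enumerate(peptide, start=1):
        total += MONO[aa] + mods.get(position, 0.0)
        series.append(total + PROTON)
    return series[:-1]


def first_shifted_position(unmodified, modified, tol=0.01):
    """1-based index of the first fragment whose m/z differs, or None."""
    for position, (u, m) in enumerate(zip(unmodified, modified), start=1):
        if abs(m - u) > tol:
            return position
    return None


if __name__ == "__main__":
    plain = b_series("DTLMISR")
    oxidized = b_series("DTLMISR", mods={4: OXIDATION})
    for i, (u, m) in enumerate(zip(plain, oxidized), start=1):
        print(f"b{i}: {u:.4f} -> {m:.4f} (shift {m - u:+.4f})")
    print("first shifted b ion: b%d" % first_shifted_position(plain, oxidized))
```

Fragments b1 to b3 do not contain the methionine and are unchanged, while b4 to b6 all carry the same added mass, which is what localizes the modification to the 4th residue.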
The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments and examples are therefore to be considered in all respects illustrative rather than limiting on the invention described herein.

Date: Signature:____________________
V.R. Srinivas, Ph.D., LL.B
(Head–IPM, Biologics)
For, Dr. Reddy’s Laboratories Limited

Claims:
We Claim:
1. A method for the determination of amino acid sequence of a protein, the method comprising:
(a) obtaining a composition comprising the protein;
(b) subjecting the composition to protease treatment to generate peptide fragments;
(c) separating the peptide fragments by liquid chromatography;
(d) ionizing the separated peptide fragments;
(e) detecting generated ions of step (d) by mass spectrometry;
(f) selecting the ions of step (e);
(g) fragmenting the selected ions of step (f); and
(h) detecting the fragmented ions of step (g) by tandem mass spectrometry;
wherein the selecting, fragmenting and detecting of steps (f) to (h) are performed according to the parameters of Table (A);
wherein in step (b) the composition is subjected to at least one protease to achieve improved, preferably 100%, amino acid sequence coverage in one iteration; and
wherein the method provides for determination of post-translational modification comprising oxidation, deamidation and pyroglutamination in the peptide under analysis.
MS/MS Parameters	Value
Scan Time	0.2 sec
Stop MS/MS TIC intensity	<10000
Timeout	3 sec
Start MS/MS TIC intensity	>1000
Collision energy (CE) (Ramp)	Low mass: 6-10 eV; High mass: 40-70 eV
Table (A)
2. A method for the determination of amino acid sequence of a protein, the method comprising:
(a) obtaining a composition comprising the protein;
(b) subjecting the composition to Asp-N treatment to generate peptide fragments;
(c) separating the peptide fragments by liquid chromatography;
(d) ionizing the separated peptide fragments;
(e) detecting generated ions by mass spectrometry;
(f) selecting the ions of step (e);
(g) fragmenting the selected ions of step (f); and
(h) detecting the fragmented ions of step (g) by tandem mass spectrometry;
wherein the selecting, fragmenting and detecting of steps (f) to (h) are performed according to the parameters of Table (A);
wherein the method yields improved, preferably 100%, amino acid sequence coverage in one iteration; and
wherein the method provides for determination of post-translational modification comprising oxidation, deamidation and pyroglutamination in the peptide under analysis.
3. A method for the determination of amino acid sequence of a protein, the method comprising:
(a) obtaining a composition comprising the protein;
(b) subjecting the composition separately to trypsin and elastase treatment to generate peptide fragments; and performing steps (c) to (h) for trypsin- and elastase-treated peptides separately;
(c) separating the peptide fragments by liquid chromatography;
(d) ionizing the separated peptide fragments;
(e) detecting generated ions by mass spectrometry;
(f) selecting the ions of step (e);
(g) fragmenting the selected ions of step (f); and
(h) detecting the fragmented ions of step (g) by tandem mass spectrometry;
wherein the selecting, fragmenting and detecting of steps (f) to (h) are performed according to the parameters of Table (A);
wherein the method yields improved, preferably 100%, amino acid sequence coverage; and
wherein the method provides for determination of post-translational modification comprising oxidation, deamidation and pyroglutamination in the peptide under analysis.
4. A method for the determination of amino acid sequence of a protein, the method comprising:
(a) obtaining a composition comprising the protein;
(b) subjecting the composition separately to trypsin, elastase and Asp-N treatment to generate peptide fragments; and performing steps (c) to (h) for trypsin-, elastase- and Asp-N-treated peptides separately;
(c) separating the peptide fragments by liquid chromatography;
(d) ionizing the separated peptide fragments;
(e) detecting generated ions by mass spectrometry;
(f) selecting the ions of step (e);
(g) fragmenting the selected ions of step (f); and
(h) detecting the fragmented ions of step (g) by tandem mass spectrometry;
wherein the selecting, fragmenting and detecting of steps (f) to (h) are performed according to the parameters of Table (A);
wherein the method yields improved, preferably 100%, amino acid sequence coverage; and
wherein the method provides for determination of post-translational modification comprising oxidation, deamidation and pyroglutamination in the peptide under analysis.
5. The method as claimed in the aforementioned claims, wherein the method provides complete fragmentation of the peptide and complete coverage of both b and y ions, thereby delivering 100% amino acid sequence coverage, wherein the ion intensity of all the fragmented peptides is ≥2000 counts.
6. The method as claimed in the aforementioned claims, wherein the method provides ≥80% amino acid sequence coverage of peptides.
7. The method as claimed in the aforementioned claims, wherein the method provides ≥80% amino acid sequence coverage of peptides that are about 5 to about 35 amino acids long.
8. The method as claimed in the previous claims, wherein the method provides accurate determination of the C-terminal basic amino acid in the peptide.
9. The method as claimed in the preceding claims, wherein the protein is a recombinant product including an Fc-containing protein, an antibody, a monoclonal antibody or a fusion protein.


Documents

Application Documents

#	Name	Date
1	202341089248-STATEMENT OF UNDERTAKING (FORM 3) [28-12-2023(online)].pdf	2023-12-28
2	202341089248-POWER OF AUTHORITY [28-12-2023(online)].pdf	2023-12-28
3	202341089248-FORM 1 [28-12-2023(online)].pdf	2023-12-28
4	202341089248-DRAWINGS [28-12-2023(online)].pdf	2023-12-28
5	202341089248-COMPLETE SPECIFICATION [28-12-2023(online)].pdf	2023-12-28
6	202341089248-ENDORSEMENT BY INVENTORS [04-01-2024(online)].pdf	2024-01-04