Broadly Protective Vaccines And Methods Thereof

Abstract: The present disclosure relates to broadly protective vaccines and methods thereof. Particularly, the disclosure relates to vaccines that induce/elicit both humoral (B cell) and cellular (T cell) immune responses. More particularly, the disclosure relates to a nucleic acid sequence that encodes a chimeric immunogen, comprising a first sequence encoding a consensus B cell immunogen and a second sequence encoding a consensus T cell immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of a pathogen or different variants of an antigen. The nucleic acid sequence is generated by employing a phylogeny-based multi-level consensus sequence generation method.

Patent Information

Application #

Filing Date

20 September 2024

Publication Number

40/2024

Publication Type

INA

Invention Field

BIOTECHNOLOGY

Status

Parent Application

Applicants

INDIAN INSTITUTE OF SCIENCE

CV Raman Road Bangalore Karnataka India 560012

Inventors

1. SHASHANK TRIPATHI

IISc CV Raman Road Bangalore Karnataka India 560012

2. RAJESH THANGAVEL YADAV

IISc CV Raman Road Bangalore Karnataka India 560012

3. RISHAD SHIRAZ

IISc CV Raman Road Bangalore Karnataka India 560012

Claims

1. A nucleic acid sequence encoding a chimeric immunogen, comprising a first sequence encoding a consensus B cell immunogen of the chimeric immunogen and a second sequence encoding a consensus T cell immunogen of the chimeric immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of a pathogen or different variants of an antigen.

2. The nucleic acid sequence as claimed in claim 1, wherein the first sequence is operably linked to the second sequence.

3. The nucleic acid sequence as claimed in claim 1 or 2, wherein the second sequence comprises nucleotides that encode the CD4-CD8 bi-specific epitopes and nucleotides that encode for linkers that link the CD4-CD8 bi-specific epitopes.

4. The nucleic acid sequence as claimed in any one of claims 1-3, wherein amino acids of the consensus B cell immunogen encoded by the first sequence are conserved at a frequency of about 50% or more across the different strains or different clades of the pathogen.

5. The nucleic acid sequence as claimed in any one of claims 1-4, wherein amino acids of the CD4-CD8 bi-specific epitopes encoded by the second sequence are conserved at a frequency of about 50% or more across the different strains or different clades of the pathogen.

6. The nucleic acid sequence as claimed in any one of claims 1-5, wherein the second sequence encodes about 2-15 CD4-CD8 bi-specific epitopes.

7. The nucleic acid sequence as claimed in any one of claims 1-6, wherein each of the CD4-CD8 bi-specific epitopes encoded by the second sequence has a length of about 15-40 amino acids.

8. The nucleic acid sequence as claimed in any one of claims 1-7, wherein the first sequence encodes the consensus B cell immunogen selected from a consensus spike protein of SARS-CoV-2, a consensus hemagglutinin of an influenza virus, or a consensus neuraminidase (N) of an influenza virus.

9. The nucleic acid sequence as claimed in any one of claims 1-8, wherein the second sequence encodes consensus CD4-CD8 bi-specific epitopes of SARS-CoV-2 or influenza virus.

10. The nucleic acid sequence as claimed in any one of claims 1-9, wherein the second sequence encodes CD4-CD8 bi-specific epitopes of a consensus nucleoprotein of influenza virus, a consensus matrix protein 1 (M1) of influenza virus, or both.

11. The nucleic acid sequence as claimed in any one of claims 8-10, wherein the influenza virus is selected from influenza A virus or influenza B virus.

12. The nucleic acid sequence as claimed in claim 11, wherein influenza A virus includes, but is not limited to, H1N1, H3N2, H5N1 and H7N9 subtypes.

13. The nucleic acid sequence as claimed in any one of claims 1-12, wherein the nucleic acid sequence is a DNA sequence or an mRNA sequence.

14. A vector comprising the nucleic acid sequence as claimed in any one of claims 1-12.

15. The vector as claimed in claim 14, wherein the vector is a plasmid or a viral vector.

16. An immunogenic composition comprising the nucleic acid sequence as claimed in any one of claims 1-13 or the vector as claimed in claim 14 or 15 and a pharmaceutical carrier.

17. The immunogenic composition as claimed in claim 16, wherein the composition induces a protective immune response to different variants or different clades of a pathogen or different variants of an antigen.

18. The immunogenic composition as claimed in claim 17, wherein the pathogen is SARS-CoV-2 or influenza.

19. A chimeric immunogen comprising, a consensus B cell immunogen and a consensus T cell immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of a pathogen or different variants of an antigen.

20. The chimeric immunogen as claimed in claim 19, wherein the consensus B cell immunogen is linked to the consensus T cell immunogen.

21. The chimeric immunogen as claimed in claim 19 or 20, wherein the C-terminus of the consensus B cell immunogen is linked to the N-terminus of the consensus T cell immunogen.

22. The chimeric immunogen as claimed in any one of claims 19-21, wherein amino acids of the consensus B cell immunogen are conserved at a frequency of about 50% or more across the different strains or different clades of the pathogen.

23. The chimeric immunogen as claimed in any one of claims 19-22, wherein amino acids of the CD4-CD8 bi-specific epitopes are conserved at a frequency of about 50% or more across the different strains or different clades of the pathogen.

24. The chimeric immunogen as claimed in any one of claims 19-23, wherein the consensus T cell immunogen comprises about 2-15 CD4-CD8 bi-specific epitopes.

25. The chimeric immunogen as claimed in any one of claims 19-24, wherein each of the CD4-CD8 bi-specific epitopes has a length of about 15-40 amino acids.

26. The chimeric immunogen as claimed in any one of claims 19-25, wherein the consensus B cell immunogen is selected from a consensus spike protein of SARS-CoV-2, a consensus hemagglutinin of an influenza virus, or a consensus neuraminidase (N) of an influenza virus.

27. The chimeric immunogen as claimed in any one of claims 19-26, wherein the consensus T cell immunogen comprises consensus CD4-CD8 bi-specific epitopes of SARS-CoV-2 or influenza virus.

28. The chimeric immunogen as claimed in any one of claims 19-27, wherein the consensus T cell immunogen comprises consensus CD4-CD8 bi-specific epitopes of nucleoprotein of influenza virus, matrix protein 1 (M1) of influenza virus, or both.

29. The chimeric immunogen as claimed in any one of claims 26-28, wherein the influenza virus is selected from influenza A virus or influenza B virus.

30. The chimeric immunogen as claimed in claim 29, wherein influenza A virus is H1N1 or H3N2.

31. An immunogenic composition comprising the chimeric immunogen as claimed in any one of claims 19-30 and a pharmaceutical carrier.

32. The immunogenic composition as claimed in claim 31, wherein the composition induces a protective immune response to different variants or different clades of a pathogen or different variants of an antigen.

33. The immunogenic composition as claimed in claim 32, wherein the pathogen is SARS-CoV-2 or influenza.

34. A method for inducing an immune response in a subject, comprising administering to the subject the nucleic acid sequence as claimed in any one of claims 1-13; the vector as claimed in claim 14 or 15; the immunogenic composition as claimed in any one of claims 16-18 or 31-33; or the chimeric immunogen as claimed in any one of claims 19-30.

35. The method as claimed in claim 34, wherein the nucleic acid sequence, the vector, the immunogenic composition, or the chimeric antigen induces a protective immune response to different strains or different clades of a pathogen.

36. The method as claimed in claim 35, wherein the pathogen is SARS-CoV-2 or influenza.

37. The method as claimed in claim 36, wherein influenza is influenza A or influenza B.

38. The method as claimed in claim 37, wherein influenza A is H1N1 or H3N2.

39. A method for generating the nucleic acid sequence as claimed in claim 1, comprising: - classifying, by a system, sequences corresponding to a plurality of lineages/clades of a pathogen into a plurality of groups based on respective evolutionary relationships; - performing one of, by the system, generation of a final consensus sequence for B-cell immunogen and for prediction of CD4 epitopes and CD8 epitopes or generation of the final consensus sequence individually for each of, the B-cell immunogen and for the prediction of the CD4 epitopes, and the CD8 epitopes, using a multi-level consensus technique on the classified sequences, wherein the final consensus sequence is indicative of evolutionary average representing each lineage of the plurality of lineages of the pathogen; and - identifying, by the system, one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and the CD8 epitopes, wherein the at least one CD4-CD8 bispecific epitopes are combined with the B-cell immunogen for generating a nucleic acid sequence encoding a chimeric immunogen.

40. The method as claimed in claim 39, wherein the multi-level consensus technique comprises: - generating a consensus sequence for each of the plurality of lineages at a first level of a plurality of levels; - generating, iteratively at one or more subsequent levels of the plurality of levels, one or more combined consensus sequences formed based on a combination of (a) consensus of a parent lineage, (b) consensus sequences of consecutive lineages related to a common parent lineage generated at a previous level and optionally (c) a consensus sequence of a recombinant lineage comprising at least a part of the parent lineage; and - generating a final consensus sequence based on a combination of each of the one or more combined consensus sequences generated at each of the one or more subsequent levels.

41. The method as claimed in claim 1, wherein predicting the CD4 and CD8 epitopes comprises: - processing the final consensus sequence into one or more peptides; - predicting a binding relationship of each of the one or more peptides with a corresponding Major Histocompatibility Complex (MHC) molecule present in a host cell which presents peptides to T-cells, based on predefined binding techniques; and - predicting whether each of the one or more peptides having the binding relationship with respective MHC is triggering an immune response using a predefined predictor.

42. The method as claimed in claim 3, wherein the CD4 and CD8 epitopes are predicted using a first set of predefined techniques and a second set of predefined techniques, respectively.

43. The method as claimed in claim 1, wherein identifying one or more regions containing at least one CD4-CD8 bispecific epitope comprises identifying a predefined number of amino acid overlap regions between the CD4 epitopes and the CD8 epitopes to obtain a plurality of overlapping CD4-CD8 bispecific peptides, by performing an overlap between the predicted CD4 epitopes and CD8 epitopes.

44. The method as claimed in claim 5 further comprising: - determining, from the plurality of overlapping CD4-CD8 bispecific peptides, a set of overlapping CD4-CD8 bispecific peptides which overlap with each other with a predefined number of amino acids; and - determining, iteratively, subsequent overlapping CD4-CD8 bispecific peptides based on respective previously determined overlapping CD4-CD8 bispecific peptide, until no more overlapping is possible or the CD4-CD8 bi-specific peptides reach a predefined threshold length of amino acids.

45. The method as claimed in claim 1, wherein the at least one CD4-CD8 bispecific epitopes are linked with each other using a peptide linker.

46. The method as claimed in claim 1, wherein the nucleic acid sequence is one of, a DNA sequence or an mRNA sequence.

47. The method as claimed in claim 1, wherein the final consensus sequence corresponding to the B cell immunogen comprises amino acids which are conserved at a frequency of at least 50% across the plurality of lineages of the pathogen.

48. The method as claimed in claim 1, wherein the CD4-CD8 bi-specific epitopes comprises amino acids which are conserved at a frequency of at least 50% across the plurality of lineages of the pathogen.

49. The method as claimed in claim 1, wherein each of the at least one CD4-CD8 bi-specific epitopes include a length of about 15-40 amino acids.

50. The method as claimed in claim 1, wherein the evolutionary relationships correspond to phylogenetic relationship.

51. A system for generating a nucleic acid sequence encoding a chimeric immunogen, comprising: a processor; and a memory communicatively coupled to the processor, wherein the memory stores processor instructions, which, on execution, causes the processor to: - classify sequences corresponding to a plurality of lineages/clades of a pathogen into a plurality of groups based on respective evolutionary relationships; - perform one of, generation of a final consensus sequence for B-cell immunogen and for prediction of CD4 epitopes and CD8 epitopes or generation of the final consensus sequence individually for each of, the B-cell immunogen and for the prediction of the CD4 epitopes, and the CD8 epitopes, using a multi-level consensus technique on the classified sequences, wherein the final consensus sequence is indicative of evolutionary average representing each lineage of the plurality of lineages of the pathogen; and - identify one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and the CD8 epitopes, wherein the at least one CD4-CD8 bispecific epitopes are combined with the B-cell immunogen for generating a nucleic acid sequence encoding a chimeric immunogen.

Specification

Description:

TECHNICAL FIELD
The present disclosure relates to broadly protective vaccines and methods thereof. Particularly, the disclosure relates to vaccines that induce/elicit both humoral (B cell) and cellular (T cell) immune responses to a phylogenetically conserved sequence(s) of one or more antigens.

BACKGROUND OF THE DISCLOSURE
Viruses affect billions of animals and humans each year and inflict an enormous economic burden on society. Viral pathogens, especially those that contain an RNA genome and cause acute infections, such as Influenza, Beta Coronaviruses, and Flaviviruses, are major public health problems. Almost all emerging and reemerging viruses of epidemic/pandemic potential fall in this category. These viruses mutate and evolve rapidly, thereby escaping host immunity acquired through infection or vaccination. This necessitates periodic upgradation of the vaccine candidate and frequent boosters. For example, Influenza vaccines need annual upgradation, yet their efficacy is still below 50%, and there is often a mismatch between the vaccine strains and the circulating strains. Similarly, SARS-CoV-2 vaccines designed against the ancestral Wuhan strain became ineffective by the time the Omicron variant emerged.

In the past, researchers have used either the CD4 and/or CD8 T-cell peptide pool or the CD4 and CD8 peptides linked via linkers to design multi-epitope vaccines. The most significant drawbacks of these approaches are the narrow MHC population coverage and the limitation on the number of T-cell epitopes one can add to the final construct. The traditional approach has further drawbacks, such as a suboptimal response in each individual and a focus on prevalent MHCs.

Thus, broadly protective vaccines against these viruses, which can provide protection against newly emerged variants and minimize the requirements of vaccine upgradation and frequent boosters, are highly desired. However, developing effective vaccines against infectious diseases is a challenging and complicated endeavor, more so if the pathogen mutates fast and escapes host immunity, as RNA viruses do. In such cases, frequent upgrades of vaccine design and periodic boosters become necessary, which is undesirable considering the already prevalent vaccine hesitancy.

Accordingly, to address the shortcomings of the prior art, the inventors of the present disclosure have developed broadly protective vaccines and methods thereof that induce both humoral (B cell) and cellular (T cell) immune response against different variants/lineages/clades of a pathogen or an antigen. The vaccine is based on an evolutionary average of an immunogen representing sequences conserved across different variants/lineages/clades of a pathogen or an antigen.

STATEMENT OF THE DISCLOSURE
The present disclosure relates to a nucleic acid sequence encoding a chimeric immunogen, comprising a first sequence encoding a consensus B cell immunogen of the chimeric immunogen and a second sequence encoding a consensus T cell immunogen of the chimeric immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of a pathogen or different variants of an antigen. This nucleic acid sequence simultaneously induces a local innate immune response and a systemic adaptive immune response, which in turn is divided into B-cell-mediated humoral immunity and T-cell-mediated cellular immunity. The T-cell response elicited by administration of the nucleic acid sequence is further divided into the CD8 response, responsible for cytotoxicity, and the CD4 response, which augments both humoral and cellular immunity.
The present disclosure also provides a vector comprising the nucleic acid sequence encoding the chimeric immunogen. In some embodiments, the vector is a plasmid or a viral vector.
The present disclosure also provides an immunogenic composition comprising a nucleic acid sequence encoding the chimeric immunogen, or a vector comprising said nucleic acid and a pharmaceutical carrier. In some embodiments, the immunogenic composition induces a protective immune response to different variants or different clades of a pathogen or different variants of an antigen. In some embodiments, the pathogen is SARS-CoV-2 or influenza.
The present disclosure also provides a chimeric immunogen, which is an amino acid sequence encoded by the nucleic acid sequence of the present disclosure, comprising a consensus B cell immunogen and a consensus T cell immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of the same pathogen or different variants of the same antigen.
The present disclosure also provides a method for inducing an immune response in a subject, comprising administering to the subject the nucleic acid sequence encoding the chimeric immunogen; the vector; the immunogenic composition; or the chimeric immunogen. In some embodiments, said method induces a protective immune response to different variants or different clades of a pathogen or different variants of an antigen.

The present disclosure also provides a method for generating the nucleic acid sequence encoding the chimeric immunogen, said method comprising: a) classifying, by a system, sequences corresponding to a plurality of lineages/clades of a pathogen or an antigen into a plurality of groups based on respective evolutionary relationships; b) performing one of, by the system, generation of a final consensus sequence for B-cell immunogen and for prediction of CD4 epitopes and CD8 epitopes or generation of the final consensus sequence individually for each of, the B-cell immunogen and for the prediction of the CD4 epitopes, and the CD8 epitopes, using a multi-level consensus technique on the classified sequences, wherein the final consensus sequence is indicative of evolutionary average representing each lineage of the plurality of lineages of the pathogen or the antigen; and c) identifying, by the system, one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and the CD8 epitopes, wherein the at least one CD4-CD8 bispecific epitopes are combined with the B-cell immunogen for generating the nucleic acid sequence encoding the chimeric immunogen.

The present disclosure also provides a system for generating the nucleic acid sequence encoding a chimeric immunogen, comprising:
a) a processor; and
b) a memory communicatively coupled to the processor, wherein the memory stores processor instructions, which, on execution, causes the processor to:
- classify sequences corresponding to a plurality of lineages/clades of a pathogen or an antigen into a plurality of groups based on respective evolutionary relationships;
- perform one of, generation of a final consensus sequence for B-cell immunogen and for prediction of CD4 epitopes and CD8 epitopes or generation of the final consensus sequence individually for each of, the B-cell immunogen and for the prediction of the CD4 epitopes, and the CD8 epitopes, using a multi-level consensus technique on the classified sequences, wherein the final consensus sequence is indicative of evolutionary average representing each lineage of the plurality of lineages of the pathogen or the antigen; and
- identify one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and the CD8 epitopes, wherein the at least one CD4-CD8 bispecific epitopes are combined with the B-cell immunogen for generating a nucleic acid sequence encoding a chimeric immunogen.

BRIEF DESCRIPTION OF THE ACCOMPANYING FIGURES
The accompanying drawings illustrate some of the embodiments of the present invention and together with the descriptions, serve to explain the invention. These drawings have been provided by way of illustration and not by way of limitation. The components in the drawings are not necessarily drawn to scale, emphasis instead being placed upon clearly illustrating the principles of the aspects of the embodiments.
Figure 1 shows a schematic of the chimeric immunogen comprising a consensus B cell immunogen and a consensus T cell immunogen comprising multiple CD4-CD8 bi-specific epitopes according to one embodiment.
Figure 2A shows a schematic for consensus sequence generation according to one embodiment.
Figure 2B shows a schematic explaining the multi-level consensus sequence generation for a highly diverse immunogen according to one embodiment.
Figure 3 shows a general framework for developing a consensus sequence for an immunogen.
Figure 4A depicts a pairwise sequence identity comparison of SARS-CoV-2 consensus protein immunogen (SCoP) with SARS-CoV-2 variant consensus sequences of Spike protein (upper panel) and a pairwise sequence identity comparison of SCoP immunogen with SARS-CoV-2 variant consensus sequences of Nucleocapsid protein (lower panel).
Figure 4B provides docking scores of neutralizing mAbs binding to spike protein based on SCoP sequence or SARS-CoV-2 variant sequences.
Figure 5 represents a pairwise sequence identity comparison of consensus sequence to the individual clade sequences of Influenza virus for NP (upper panel) and M1 (lower panel).
Figure 6 illustrates a schematic showing the CD4 and CD8 bispecific T-cell peptide saturation according to one embodiment.
Figure 7 provides a 3D structure of a chimeric vaccine antigen for Influenza H1N1 (upper panel) and SARS-CoV-2 (lower panel) according to one embodiment.
Figure 8 is a schematic showing the linking of a B cell immunogen with a T cell immunogen of a chimeric vaccine according to one embodiment.
Figure 9A shows a design of a Broadly Protective SCoP mRNA Vaccine against SARS-CoV-2.
Figure 9B shows the results of an immunogenicity assay of the mRNA vaccine against SARS-CoV-2 in mice.
Figure 10A shows a design of a Chimeric Broadly protective immunogen based on consensus HA and CD4-CD8 bi-specific regions from M1 and NP for influenza A virus according to one embodiment.
Figure 10B shows the docking scores for mAb binding of consensus H1 HA when compared to homologous HA proteins.
Figure 10C shows the results of percentage body weight change post virus challenge upon intranasal immunization of mice with a T-cell peptide pool generated according to the present disclosure.
Figure 10D shows the results of a survival assay upon intranasal immunization of mice with a T-cell peptide pool generated according to the present disclosure.
Figure 11 shows a flow chart illustrating method operations for generating the nucleic acid sequence according to one embodiment of the disclosure.
Figure 12 shows a flow chart illustrating method operations for generating a multi-level consensus according to one embodiment of the disclosure.
Figure 13 shows an exemplary illustration of generation of multi-level consensus according to one embodiment of the disclosure.
Figure 14 shows an exemplary schematic for the generation of CD4 and CD8 bispecific T-cell peptide saturation.
Figure 15 illustrates an exemplary overview of a system for generating a nucleic acid sequence encoding a chimeric immunogen in accordance with an embodiment of the disclosure.
Figure 16 shows a map of an exemplary SARS-CoV-2 Consensus Spike sequence according to one embodiment.
DETAILED DESCRIPTION OF THE DISCLOSURE
With respect to the use of substantially any plural and/or singular terms herein, those having skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for sake of clarity. The use of the expression “at least” or “at least one” suggests the use of one or more elements or ingredients or quantities, as the use may be in the embodiment of the disclosure to achieve one or more of the desired objects or results. Throughout this specification, the word “comprise”, or variations such as “comprises” or “comprising” or “containing” or “has” or “having”, or “including but not limited to” wherever used, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.

Reference throughout this specification to “one embodiment”, “an embodiment”, or “some embodiments” means that a particular feature, structure or characteristic described in connection with the embodiment may be included in at least one embodiment of the present disclosure. Thus, the appearances of the phrases “in one embodiment”, “in an embodiment”, or “in some embodiments” in various places throughout this specification may not necessarily all refer to the same embodiment. It is appreciated that certain features of the disclosure, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the disclosure, which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination.

As used herein, the term “immunogen” refers to a specific sequence that elicits a B cell-mediated humoral immune response, a T cell-mediated immune response, or both in a subject. The term “humoral immunogen” or “B-cell immunogen” refers to a consensus sequence that elicits a B cell-mediated humoral immune response. The term “cellular immunogen” or “T-cell immunogen” refers to a consensus sequence that activates a CD4 and/or CD8 T cell-mediated immune response. The term “CD4-CD8 bispecific immunogen” refers to a consensus sequence that activates both CD4 and CD8 T cell-mediated immune responses. In some embodiments, the consensus sequence is a nucleic acid sequence. In some embodiments, the consensus sequence is an amino acid sequence.

As used herein, the term “chimeric immunogen” refers to a consensus sequence that elicits both a B cell-mediated humoral immune response and a T cell-mediated immune response. In some embodiments, the consensus sequence is a nucleic acid sequence. In some embodiments, the consensus sequence is an amino acid sequence.

As used herein, the term “epitope” refers to a part of an antigen that is recognized by the immune system, such as by antibodies, B cells, or T cells and is capable of stimulating an immune response.

As used herein, the term “CD4-CD8 bispecific epitope” refers to a peptide that contains both CD4 and CD8 epitopes. In some embodiments, the length of a CD4-CD8 bispecific epitope can vary from 15-40 amino acids.
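
By way of a minimal, non-limiting sketch, the idea of a peptide region that contains both CD4 and CD8 epitopes can be illustrated as interval arithmetic on positions of the consensus sequence. The function name `bispecific_regions`, the half-open `(start, end)` coordinate convention, the rule that a CD4 epitope must fully contain a CD8 epitope, and the toy coordinates are all assumptions made for illustration; the epitope predictions themselves would come from separate prediction tools not shown here. Only the 15-40 amino acid length window is taken from the definition above.

```python
def bispecific_regions(cd4, cd8, min_len=15, max_len=40):
    """Find regions that contain both a CD4 and a CD8 epitope.

    cd4, cd8: lists of (start, end) half-open intervals on the consensus
    sequence (hypothetical outputs of epitope predictors).
    """
    # Seed regions: CD4 epitopes that fully contain at least one CD8 epitope
    # (an assumed criterion for "contains both").
    seeds = sorted(
        (s4, e4)
        for s4, e4 in cd4
        if any(s4 <= s8 and e8 <= e4 for s8, e8 in cd8)
    )
    regions = []
    for start, end in seeds:
        if regions:
            prev_start, prev_end = regions[-1]
            merged_end = max(prev_end, end)
            # Extend the previous region while seeds overlap and the merged
            # peptide stays within the maximum length.
            if start < prev_end and merged_end - prev_start <= max_len:
                regions[-1] = (prev_start, merged_end)
                continue
        regions.append((start, end))
    # Keep only regions inside the allowed length window.
    return [(s, e) for s, e in regions if min_len <= e - s <= max_len]


# Toy coordinates, not real predictions.
cd4_epitopes = [(10, 25), (18, 33), (60, 75), (100, 115)]
cd8_epitopes = [(12, 21), (22, 31), (90, 99), (104, 113)]
print(bispecific_regions(cd4_epitopes, cd8_epitopes))  # [(10, 33), (100, 115)]
```

In this toy run, the two overlapping seeds at positions 10-25 and 18-33 merge into one 23-residue region, while the CD4 epitope at 60-75 is dropped because it contains no CD8 epitope.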

As used herein, the term “multi-epitope” refers to a structure that comprises multiple CD4-CD8 bispecific epitopes. In some embodiments, the multi-epitope comprises 2-15 CD4-CD8 bi-specific epitopes.

As used herein, the term “Consensus Sequence” refers to a nucleotide or amino acid sequence in which the nucleotides or amino acids are the ones which occur most frequently at that site in nature across variants of different lineages. The consensus sequence is a sequence of most frequent residues, either nucleotide or amino acid that is arrived at from multiple sequence alignments in which related sequences are compared to each other and most frequently occurring residues at a given site are determined. For example, in the context of a viral protein/antigen, first, closely related sequences such as sequences of subclades encoding that protein/antigen are compared to each other to arrive at a consensus sequence for that clade, then consensus sequences of different clades are compared to arrive at the next level of a consensus sequence. The process is repeated for subclades and clades of different variants of the same virus to arrive at a final consensus sequence that represents an evolutionary average of that specific viral protein/antigen. A person of ordinary skill in the art would understand that a similar process can be followed for bacteria or oncogenes to arrive at a final consensus sequence of a bacterial protein or oncogene.
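
The multi-level process described above may be sketched, purely for illustration, as a recursive majority vote over a nested grouping of pre-aligned sequences. The function names (`consensus`, `multi_level_consensus`, `pooled`), the nested-dictionary representation of clades and subclades, the alphabetical tie-break, and the toy sequences are assumptions and are not taken from the disclosure; sequence alignment and phylogenetic classification are presumed to have been done beforehand.

```python
from collections import Counter


def consensus(aligned):
    """Return the most frequent residue at each column of pre-aligned sequences."""
    if not aligned:
        raise ValueError("need at least one sequence")
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    residues = []
    for column in zip(*aligned):
        counts = Counter(column)
        # Assumed tie-break: highest count first, then alphabetical order.
        residues.append(min(counts, key=lambda r: (-counts[r], r)))
    return "".join(residues)


def multi_level_consensus(tree):
    """tree: a list of aligned sequences (one subclade) or a dict of subtrees."""
    if isinstance(tree, dict):
        # Consensus of the consensus sequences from the level below.
        return consensus([multi_level_consensus(sub) for sub in tree.values()])
    return consensus(tree)


def pooled(tree):
    """Flatten every sequence in the tree into one list (single-level baseline)."""
    if isinstance(tree, dict):
        return [seq for sub in tree.values() for seq in pooled(sub)]
    return list(tree)


# Toy data (not real viral sequences): clade B is heavily over-sampled.
clades = {
    "clade_A": {
        "A.1": ["MKTAY", "MKTAY", "MRTAY"],
        "A.2": ["MKSAY", "MKSAY"],
        "A.3": ["MKTAY"],
    },
    "clade_B": {"B.1": ["MKTAF"] * 10, "B.2": ["MKTAF", "MKTAY", "MKTAF"]},
    "clade_C": {"C.1": ["MKTAY", "MKTAY"]},
}

print(multi_level_consensus(clades))  # MKTAY
print(consensus(pooled(clades)))      # MKTAF
```

The two printed results differ because a single pooled vote is dominated by the over-sampled clade, whereas the level-by-level vote gives each clade equal weight, which is one way to read the phrase "evolutionary average".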

As used herein, the term “protective immune response/protective immunity” refers to the immune system’s ability to resist infection or reinfection or attenuate an infectious disease or its clinical presentation or the immune system’s ability to mount an immune response to oncogenes.

The term “broadly protective vaccine” refers to a vaccine that can activate the immune system in response to variants of different lineages of the same pathogen or the same antigen.

The present invention aims at developing broadly protective vaccines/immunogenic compositions. The present disclosure provides a nucleic acid sequence encoding a chimeric immunogen, comprising a first sequence encoding a consensus B cell immunogen of the chimeric immunogen and a second sequence encoding a consensus T cell immunogen of the chimeric immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of the same pathogen or different variants of the same antigen. The term “antigen” encompasses antigens from the pathogen as well as a cancer-associated antigen such as an oncogene.

Further, the present disclosure also provides a method for preparing a vaccine, wherein a humoral immunogen and a cellular immunogen are developed from an evolutionary consensus of an antigen of interest and are operably linked. Further, to activate both CD4 and CD8 T cell responses, the inventors have generated CD4-CD8 bispecific epitopes of the antigen of interest and multiple bispecific epitopes are linked to each other through linkers to form a multi-epitope cellular immunogen. In some embodiments, this multi-epitope cellular immunogen is linked directly or through another sequence to the C-terminal cytoplasmic tail of the consensus B-cell immunogen. In some other embodiments, the consensus B-cell immunogen is linked directly or through another sequence to the C-terminal cytoplasmic tail of the multi-epitope cellular immunogen.

The present invention also discloses a method for inducing immune response by administering to the subject the nucleic acid sequence encoding the chimeric immunogen; and/or a vector comprising said nucleic acid sequence; and/or an immunogenic composition comprising the nucleic acid sequence or the vector and a pharmaceutical carrier; and/or chimeric immunogen.

In some embodiments, provided herein is a nucleic acid sequence encoding a chimeric immunogen, comprising a first sequence encoding a consensus B cell immunogen of the chimeric immunogen and a second sequence encoding a consensus T cell immunogen of the chimeric immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of the same pathogen or different variants of the same antigen. Upon administration of this nucleic acid sequence to a subject, the nucleic acid sequence is translated by the subject’s cellular machinery into the chimeric antigen, which in turn, induces a B-cell-mediated humoral immune response and a T-cell-mediated cellular immune response. The T-cell response is further divided into a CD8 response, responsible for killing infected cells, and a CD4 response, which augments both humoral and cellular immune responses. Figure 1 shows a schematic of the chimeric immunogen according to one embodiment.

In some embodiments, in the nucleic acid sequence encoding the chimeric immunogen, the first nucleic acid sequence is operably linked to the second nucleic acid sequence.

In some embodiments, the first nucleic acid sequence comprises nucleotides that encode a consensus B cell immunogen and the second nucleic acid sequence comprises nucleotides that encode the CD4-CD8 bi-specific epitopes and nucleotides that encode for linkers that link the CD4-CD8 bi-specific epitopes.

In some other embodiments, the first nucleic acid sequence comprises nucleotides that encode the CD4-CD8 bi-specific epitopes and nucleotides that encode for linkers that link the CD4-CD8 bi-specific epitopes and the second nucleic acid sequence comprises nucleotides that encode a consensus B cell immunogen.

In some embodiments, a third nucleic acid sequence is included between the first nucleic acid sequence and the second nucleic acid, where the third nucleic acid sequence comprises nucleotides that encode a 2A peptide.

In some embodiments, amino acids of the consensus B cell immunogen encoded by the first sequence are conserved at a frequency of about 50% or more across the different variants or different clades of the same pathogen or different variants of the same antigen.
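
As a hedged illustration of the conservation criterion, the sketch below computes, for each position, the fraction of aligned sequences that carry the consensus residue, and then checks the 50% threshold. The function name `conservation` and the toy sequences are hypothetical, and how the frequency is actually measured across variants or clades may differ in practice.

```python
def conservation(consensus, aligned):
    """Fraction of aligned sequences matching the consensus at each position.

    Assumption: `aligned` holds equal-length sequences (for example, one
    consensus sequence per clade) aligned to `consensus`.
    """
    total = len(aligned)
    return [
        sum(seq[i] == residue for seq in aligned) / total
        for i, residue in enumerate(consensus)
    ]


# Toy, hypothetical clade-level sequences (not real data).
clades = ["MKTAY", "MKSAY", "MRTAY", "MKTAF"]
frequencies = conservation("MKTAY", clades)
meets_threshold = all(freq >= 0.5 for freq in frequencies)
print(frequencies, meets_threshold)
```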

In some embodiments, amino acids of the CD4-CD8 bi-specific epitopes encoded by the second sequence are conserved at a frequency of about 50% or more across the different variants or different clades of the same pathogen or different variants of the same antigen. In some embodiments, the second sequence encodes about 2-15 CD4-CD8 bi-specific epitopes. For example, in some embodiments, the second sequence encodes 2-14, 2-13, 2-11, 2-10, 2-8, 2-6, 2-5, 2-4, 3-15, 3-13, 3-12, 3-10, 3-8, 3-6, 3-5, 4-15, 4-14, 4-12, 4-10, 4-8, 4-7, 4-6, 5-15, 5-12, 5-10, 5-8, 6-15, 6-12, 6-10, 6-8, 8-15, 8-13, 8-12, 8-10, 9-15, 9-13, 10-15, 10-13, 11-15, 12-15, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 CD4-CD8 bi-specific epitopes. The multiple bi-specific epitopes are linked to each other by linkers such as a -(GGGGS)- or G4S peptide linker. The G4S peptide linker is a flexible linker which can be easily cleaved inside the cell, leaving the individual bi-specific epitope-containing peptides for antigen presentation.

In some embodiments, each of the CD4-CD8 bi-specific epitopes encoded by the second sequence has a length of about 15-40 amino acids, such as, for example, about 15-35, 15-30, 15-25, 15-20, 18-40, 18-35, 18-30, 18-25, 18-22, 20-40, 20-30, 20-25, 25-40, 25-35, 25-30, 30-40, or 35-40 amino acids.

The nucleic acid sequence encoding the chimeric immunogen according to the present disclosure can be designed as a vaccine or an immunogenic composition to activate immunity against any infectious agent including viruses, bacteria, fungi, and the like as well as other antigens such as cancer-associated antigens or antigens associated with other diseases.

In some embodiments, in the nucleic acid sequence encoding the chimeric immunogen, the first sequence encodes the consensus B cell immunogen selected from a consensus spike protein of SARS-CoV-2 or a consensus hemagglutinin (HA) or neuraminidase (N) protein of an influenza virus. In some embodiments, the second sequence encodes consensus CD4-CD8 bi-specific epitopes from the proteome of SARS-CoV-2 or an influenza virus. In some embodiments, the second sequence encodes CD4-CD8 bi-specific epitopes of a consensus nucleoprotein of an influenza virus, a consensus matrix protein 1 (M1) of an influenza virus, or both. A person of ordinary skill in the art would understand that the terms like “consensus spike protein of SARS-CoV-2”, “consensus hemagglutinin protein of an influenza virus”, “consensus nucleoprotein of an influenza virus” refer to a final consensus sequence of that protein arrived at by multiple level sequence alignments across subclades and clades of different variants as described above.

In some embodiments, in the nucleic acid sequence encoding the chimeric immunogen, the first sequence encodes a consensus spike protein of SARS-CoV-2 and the second sequence encodes consensus CD4-CD8 bi-specific epitopes from the proteome of SARS-CoV-2. In some embodiments, the first sequence encodes a consensus hemagglutinin protein of influenza virus and the second sequence encodes CD4-CD8 bi-specific epitopes of a consensus nucleoprotein of influenza virus, a consensus matrix protein 1 (M1) of influenza virus, or both.

In some embodiments, the influenza virus is selected from influenza A virus or influenza B virus. That is, in some embodiments, the first sequence encodes a consensus hemagglutinin protein of influenza A virus, and the second sequence encodes CD4-CD8 bi-specific epitopes of a consensus nucleoprotein of influenza A virus, a consensus matrix protein 1 (M1) of influenza A virus, or both. In some embodiments, the first sequence encodes a consensus hemagglutinin protein of influenza B virus, and the second sequence encodes CD4-CD8 bi-specific epitopes of a consensus nucleoprotein of influenza B virus, a consensus matrix protein 1 (M1) of influenza B virus, or both. In some embodiments, influenza A virus is selected from H1N1, H3N2, H5N1, or H7N9.

In some embodiments, the nucleic acid sequence encoding the chimeric antigen is a DNA sequence or an mRNA sequence.

In some embodiments, the present disclosure provides a vector comprising the above-described nucleic acid sequence encoding the chimeric immunogen. In some embodiments, the vector is a plasmid or a viral vector. Exemplary viral vectors include an adenoviral vector or a vaccine vector like modified vaccinia virus Ankara (MVA) vector.

In some embodiments, the present disclosure provides an immunogenic composition comprising the above-described nucleic acid sequence encoding the chimeric immunogen, or the vector, and a pharmaceutical carrier. In some embodiments, the immunogenic composition comprises an adjuvant such as cGAMP and/or CpG ODN 1018.

In some embodiments, the immunogenic composition induces a protective immune response to different variants or different clades of the same pathogen or different variants of the same antigen. In some embodiments, the immunogenic composition induces a protective immune response to different variants or different clades of SARS-CoV-2 or influenza.

The present disclosure also provides a chimeric immunogen comprising a consensus B cell immunogen and a consensus T cell immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of the same pathogen or different variants of the same antigen. In the chimeric immunogen, the C-terminus of the consensus B cell immunogen may be linked to the N-terminus of the consensus T cell immunogen or vice versa. In some embodiments, the amino acids of the consensus B cell immunogen are conserved at a frequency of about 50% or more across the different variants or different clades of the same pathogen or different variants of the same antigen. In some embodiments, in the chimeric immunogen, amino acids of the CD4-CD8 bi-specific epitopes are conserved at a frequency of about 50% or more across the different variants or different clades of the same pathogen or different variants of the same antigen.

In some embodiments, the consensus T cell immunogen comprises about 2-15 CD4-CD8 bi-specific epitopes. For example, in some embodiments, the consensus T cell immunogen comprises 2-14, 2-13, 2-11, 2-10, 2-8, 2-6, 2-5, 2-4, 3-15, 3-13, 3-12, 3-10, 3-8, 3-6, 3-5, 4-15, 4-14, 4-12, 4-10, 4-8, 4-7, 4-6, 5-15, 5-12, 5-10, 5-8, 6-15, 6-12, 6-10, 6-8, 8-15, 8-13, 8-12, 8-10, 9-15, 9-13, 10-15, 10-13, 11-15, 12-15, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 CD4-CD8 bi-specific epitopes.

In some embodiments, in the chimeric immunogen, each of the CD4-CD8 bi-specific epitopes has a length of about 15-40 amino acids, such as, for example, about 15-35, 15-30, 15-25, 15-20, 18-40, 18-35, 18-30, 18-25, 18-22, 20-40, 20-30, 20-25, 25-40, 25-35, 25-30, 30-40, or 35-40 amino acids.

In some embodiments of the chimeric immunogen, the consensus B cell immunogen is selected from a consensus spike protein of SARS-CoV-2 or a consensus HA or N protein of an influenza virus. In some embodiments, in the nucleic acid sequence encoding the chimeric immunogen, the first sequence encodes the consensus B cell immunogen selected from a consensus spike protein of SARS-CoV-2 or a consensus HA or N protein of an influenza virus. In some embodiments, the second sequence encodes consensus CD4-CD8 bi-specific epitopes from the proteome of SARS-CoV-2 or an influenza virus. In some embodiments, the second sequence encodes CD4-CD8 bi-specific epitopes of a consensus nucleoprotein of an influenza virus, a consensus matrix protein 1 (M1) of an influenza virus, or both.

In some embodiments of the chimeric immunogen, the consensus T cell immunogen comprises consensus CD4-CD8 bi-specific epitopes of non-surface proteins of SARS-CoV-2 or an influenza virus. In some embodiments, the consensus T cell immunogen comprises consensus CD4-CD8 bi-specific epitopes of a consensus nucleoprotein of an influenza virus, a consensus M1 protein of an influenza virus, or both. In some embodiments, the chimeric immunogen as provided herein comprises a consensus B cell immunogen and a consensus T cell immunogen of the influenza virus selected from influenza A or influenza B. In some embodiments, influenza A virus is H1N1, H3N2, H5N1, or H7N9.

In some embodiments, the consensus B cell immunogen is linked to the consensus T cell immunogen through an amino acid sequence that forms a transmembrane region and/or an amino acid sequence of a 2A peptide.

The present disclosure also provides an immunogenic composition comprising the chimeric immunogen and a pharmaceutical carrier. In some embodiments, the immunogenic composition comprises only a portion of the chimeric immunogen (a sub-unit vaccine). For example, in some embodiments, the immunogenic composition may comprise only the consensus B cell immunogen or a fragment thereof that is capable of inducing a B cell immune response. In some embodiments, the immunogenic composition may comprise only the consensus T cell immunogen or a fragment thereof that is capable of inducing a CD4 T cell response, a CD8 T cell response or both. In some embodiments, the chimeric immunogen is administered in the form of virus-like particles (VLP). In some embodiments, the subunit vaccine is loaded in nanoparticles.

In some embodiments, the immunogenic composition comprising the chimeric immunogen or a portion thereof comprises an adjuvant. Exemplary adjuvants include, but are not limited to, alum-based adjuvants, AS03, AS01 (MPL + QS21 loaded in liposomes), MF59, and AS04 (detoxified LPS + alum).

In some embodiments, the composition induces a protective immune response to different variants or different clades of the same pathogen or different variants of the same antigen. In some embodiments, the pathogen is SARS-CoV-2 or influenza.

The present disclosure also provides a method for inducing an immune response in a subject, comprising administering to the subject: a) the nucleic acid sequence encoding the chimeric immunogen; b) a vector comprising said nucleic acid sequence; c) the chimeric immunogen; or d) an immunogenic composition comprising said nucleic acid sequence, vector or the chimeric immunogen. In some embodiments, said method induces a protective immune response to different variants or different clades of the same pathogen or different variants of the same antigen. In some embodiments, the method induces a protective immune response to SARS-CoV-2 or influenza. For example, in some embodiments, the method induces a protective immune response to influenza A or influenza B. In some embodiments, influenza A is H1N1, H3N2, H5N1, or H7N9.

The present disclosure also provides a method for generating the nucleic acid sequence encoding a chimeric immunogen, comprising:
- classifying, by a system, sequences corresponding to a plurality of variants/lineages/clades of a pathogen or an antigen into a plurality of groups based on respective evolutionary relationships;
- performing, by the system, one of: generation of a final consensus sequence for the B-cell immunogen and a sequence for prediction of CD4 epitopes and CD8 epitopes; or generation of the final consensus sequence individually for each of the B-cell immunogen and the prediction of the CD4 epitopes and the CD8 epitopes, using a multi-level consensus technique on the classified sequences, wherein the final consensus sequence is indicative of an evolutionary average representing each variant/lineage/clade of the plurality of variants/lineages/clades of the pathogen or the antigen; and
- identifying, by the system, one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and the CD8 epitopes, wherein the at least one CD4-CD8 bispecific epitope is combined with the B-cell immunogen for generating said nucleic acid sequence encoding a chimeric immunogen.

In some embodiments, the multi-level consensus technique comprises:
- generating a consensus sequence for each of the plurality of variants/lineages/clades at a first level of a plurality of levels;
- generating, iteratively at one or more subsequent levels of the plurality of levels, one or more combined consensus sequences formed based on a combination of (a) consensus of a parent lineage, (b) consensus sequences of consecutive lineages related to a common parent lineage generated at a previous level and optionally (c) a consensus sequence of a recombinant lineage comprising at least a part of the parent lineage; and
- generating a final consensus sequence based on a combination of each of the one or more combined consensus sequences generated at each of the one or more subsequent levels.

In some embodiments, a method for predicting the CD4 and CD8 epitopes comprises:
- processing the final consensus sequence into one or more peptides;
- predicting a binding relationship of each of the one or more peptides with a corresponding Major Histocompatibility Complex (MHC) molecule present in a host cell which presents peptides to T-cells, based on predefined binding techniques; and
- predicting whether each of the one or more peptides having the binding relationship with respective MHC is triggering an immune response using a predefined predictor.

In some embodiments, in said method, the CD4 and CD8 epitopes are predicted using a first set of predefined techniques and a second set of predefined techniques, respectively.

In some embodiments, in said method, identifying one or more regions containing at least one CD4-CD8 bispecific epitope comprises identifying a predefined number of amino acid overlap regions between the CD4 epitopes and the CD8 epitopes to obtain a plurality of overlapping CD4-CD8 bispecific peptides, by performing an overlap between the predicted CD4 epitopes and CD8 epitopes.

In some embodiments, said method further comprises:
- determining, from the plurality of overlapping CD4-CD8 bispecific peptides, a set of overlapping CD4-CD8 bispecific peptides which overlap with each other by a predefined number of amino acids; and
- determining, iteratively, subsequent overlapping CD4-CD8 bispecific peptides based on the respective previously determined overlapping CD4-CD8 bispecific peptide, until no more overlapping is possible or the CD4-CD8 bi-specific peptides reach a predefined threshold length of amino acids.

In some embodiments, in said method, the CD4-CD8 bispecific epitopes are linked with each other using a peptide linker.

In some embodiments, the nucleic acid sequence generated by the method is a DNA sequence or an mRNA sequence.

In some embodiments of said method, the final consensus sequence corresponding to the B cell immunogen comprises amino acids which are conserved at a frequency of at least 50% across the plurality of variants/lineages/clades of the same pathogen or across the plurality of variants of the same antigen.

In some embodiments, the CD4-CD8 bi-specific epitopes generated/identified by said method comprise amino acids which are conserved at a frequency of at least 50% across the plurality of variants/lineages/clades of the same pathogen or across the plurality of variants of the same antigen. In some embodiments, the at least one CD4-CD8 bi-specific epitope generated by the method has a length of about 15-40 amino acids.

In some embodiments of said method, the evolutionary relationships correspond to phylogenetic relationship.

In some embodiments, the present disclosure also provides a system for generating a nucleic acid sequence encoding a chimeric immunogen, comprising:
a processor; and
a memory communicatively coupled to the processor, wherein the memory stores processor instructions, which, on execution, cause the processor to:
- classify sequences corresponding to a plurality of variants/lineages/clades of a pathogen or an antigen into a plurality of groups based on respective evolutionary relationships;
- perform one of: generation of a final consensus sequence for the B-cell immunogen and a sequence for prediction of CD4 epitopes and CD8 epitopes; or generation of the final consensus sequence individually for each of the B-cell immunogen and the prediction of the CD4 epitopes and the CD8 epitopes, using a multi-level consensus technique on the classified sequences, wherein the final consensus sequence is indicative of an evolutionary average representing each variant/lineage/clade of the plurality of variants/lineages/clades of the pathogen or the antigen; and
- identify one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and the CD8 epitopes, wherein the at least one CD4-CD8 bispecific epitope is combined with the B-cell immunogen for generating said nucleic acid sequence encoding a chimeric immunogen.

Figure 11 shows a flow chart illustrating method operations for generating the nucleic acid sequence, in accordance with one or more example embodiments of the disclosure.

As illustrated in Figure 11, the method 1100 may include one or more operations. The method 1100 may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform particular functions or implement particular abstract data types.

The order in which the method 1100 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Additionally, individual blocks may be deleted from the methods without departing from the scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.

At step 1101, sequences corresponding to a plurality of lineages/clades of a pathogen are classified by a system into a plurality of groups based on respective evolutionary relationships. The system is explained in detail with reference to a subsequent figure. The sequences may be obtained from one or more sources, such as, but not limited to, databases (such as GenBank and GISAID) and direct sequencing efforts. The sequences may be classified into different lineages/clades/serotypes based on the immunogen of interest. The pathogen may be, for example, SARS-CoV-2, influenza, and the like. Herein, the evolutionary relationships correspond to phylogenetic relationships.
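
The grouping performed at step 1101 can be pictured with a small sketch. It assumes each sequence already carries a lineage or clade label (for example, from database metadata or a separate phylogenetic analysis, neither of which is shown); the function name `classify_by_lineage` and the labels are hypothetical.

```python
from collections import defaultdict


def classify_by_lineage(records):
    """Group (lineage_label, sequence) pairs into {lineage_label: [sequences]}.

    Assumption: lineage labels were assigned upstream; this sketch only
    performs the bookkeeping, not the phylogenetic inference itself.
    """
    groups = defaultdict(list)
    for lineage, sequence in records:
        groups[lineage].append(sequence)
    return dict(groups)


# Toy, hypothetical records (labels and sequences are placeholders).
records = [("A", "MKT"), ("B", "MRT"), ("A", "MKS")]
print(classify_by_lineage(records))
```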

At step 1102, generation of a final consensus sequence for B-cell immunogen and for prediction of CD4 epitopes and CD8 epitopes or generation of the final consensus sequence individually for each of, the B-cell immunogen and for the prediction of the CD4 epitopes and the CD8 epitopes is performed by the system, using a multi-level consensus technique, on the classified sequences. Herein, the final consensus sequence is indicative of evolutionary average representing each lineage of the plurality of lineages of the pathogen. In an embodiment, the final consensus sequence corresponding to the B cell immunogen comprises amino acids which are conserved at a frequency of at least 50% across the plurality of lineages of the pathogen.

The analysis from multi-level consensus provides insights into the evolutionary relationships between the sequences, their shared structural and functional features, and possible mutations that may have occurred over time. Figure 12 shows a flow chart illustrating method operations for generating a multi-level consensus. As shown, at step 1201, a consensus sequence is generated by the system for each of the plurality of lineages at a first level of a plurality of levels. Particularly, the sequences are aligned using appropriate alignment tools such as, but not limited to, MAFFT, CLUSTALW, MUSCLE, and the like. In an embodiment, a scoring matrix assigns scores to different alignments of sequences based on the similarity between the amino acids or nucleotides. In one example, scoring matrices may include PAM, BLOSUM, and the like.

At step 1202, iteratively at one or more subsequent levels of the plurality of levels, one or more combined consensus sequences are generated based on a combination of (a) consensus of a parent lineage, (b) consensus sequences of consecutive lineages related to a common parent lineage generated at a previous level and optionally (c) a consensus sequence of a recombinant lineage comprising at least a part of the parent lineage.

At step 1203, a final consensus sequence is generated based on a combination of each of the one or more combined consensus sequences generated at each of the one or more subsequent levels.

Figure 13 shows an exemplary illustration of the generation of a multi-level consensus. As shown, consider viral sequences of lineages A, B, and C, each of which includes two or three sublineages. In the case of multi-level consensus, the sequences in each lineage are taken as input and a multi-level consensus sequence is constructed.

The multi-level consensus generation technique consists of multiple iterations of single-level consensus generation at different levels of the evolutionary tree. Each lineage or phylogenetic group may first undergo single-level consensus generation to obtain the consensus sequence of each lineage/clade/phylogenetic group for the immunogen. As shown, there are n+1 levels, where n is the number of sublineage levels. The (n+1)th level deals with recombinant lineages or recombinants of recombinant lineages. A recombinant lineage is a descendant of two lineages, so for each recombinant lineage, the consensus sequence is added to the consensus pool of both parent lineages. If there are no recombinant phylogenetic groups, the multi-level consensus starts from the nth level.

At the nth level, the most evolved sublineages at the far end of the phylogeny are dealt with. Single-level consensus generation may be applied to these sublineages, and the resulting consensus sequences may be added to the consensus lineage pool of their immediate parent lineages. Next, the lineage pools at the (n-1)th level are considered: single-level consensus generation may be applied to each sublineage pool, and the consensus sequences generated may be added to the consensus lineage pool of their immediate parent lineages. The iterations may proceed through successively lower levels until the 1st level is reached, where single-level consensus generation is applied to generate a single final consensus for the lineage. This may be performed until the root variant in each evolutionary direction is reached. Then, a consensus sequence is generated (shown in bold line in Figure 13) from the multi-level consensus sequences of all the root variant lineages to make the final multi-level consensus sequence. This multi-level consensus avoids the input bias from a predominant lineage, which is present in higher numbers than the others.

Thus, the multi-level consensus technique relies on combining information about the context of amino acid positions in the alignment with their phylogenetic relationships, which provides a more accurate representation of the consensus amino acid at those positions. The multi-level consensus reduces input bias, i.e., combining phylogenetic information with protein amino acid sequence information for consensus generation alleviates temporal, geographic, phylogenetic, and sampling bias in the immunogen sequence dataset. It helps to broaden the immune specificity of the immunogen as it determines the context of each position in the alignment.
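
The bottom-up pooling described above can be sketched as a recursion over a lineage tree. This is a simplified illustration under stated assumptions, not the disclosed implementation: sequences are pre-aligned to equal length, each node is a hypothetical dictionary with `seqs` (its own sequences) and `children` (its sublineages), recombinant lineages are not handled, and ties fall to the residue seen first.

```python
from collections import Counter


def column_consensus(aligned):
    """Single-level consensus: most frequent residue at each column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def multilevel_consensus(node):
    """Consensus of a lineage from its own consensus plus its children's.

    Each sublineage contributes exactly one sequence to its parent's pool,
    so a heavily sampled sublineage cannot outvote the others.
    """
    pool = []
    if node.get("seqs"):
        pool.append(column_consensus(node["seqs"]))
    for child in node.get("children", []):
        pool.append(multilevel_consensus(child))
    if not pool:
        raise ValueError("lineage has no sequences and no sublineages")
    return column_consensus(pool)


# Toy tree: one over-sampled sublineage (50 copies) and two small ones.
tree = {"seqs": [], "children": [
    {"seqs": ["MKT"] * 50, "children": []},
    {"seqs": ["MRT"] * 2, "children": []},
    {"seqs": ["MRT"] * 3, "children": []},
]}
flat = column_consensus(["MKT"] * 50 + ["MRT"] * 5)  # dominated by sampling
multi = multilevel_consensus(tree)                   # one vote per sublineage
print(flat, multi)
```

In this toy case the flat consensus follows the over-sampled sublineage, while the multi-level consensus reflects the residue shared by two of the three sublineages.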

Returning to Figure 11, after the final consensus is generated, the CD4 epitopes and CD8 epitopes are predicted. The prediction of the CD4 epitopes and CD8 epitopes includes processing the final consensus sequence into one or more peptides. In an embodiment, T-cell peptide prediction techniques use computational tools to identify short segments of proteins (peptides) that can bind to and activate T-cells. Table 1 below shows exemplary computational tools for the prediction of CD4 and CD8 T-cell peptides. The peptide prediction techniques are crucial in various fields, such as, but not limited to, vaccine development, immunology research, and the like. The protein sequence is processed into potential peptides/peptide fragments. These fragments are typically 8-15 amino acids long.

Next, a binding relationship of each of the one or more peptides is predicted with a corresponding Major Histocompatibility Complex (MHC) molecule present in a host cell which presents peptides to T-cells, based on predefined binding techniques. The MHC molecules are proteins on the surface of cells that present peptides to T-cells. Each MHC molecule has a specific binding pocket accommodating certain amino acids. The human MHC alleles with the highest world population coverage may be selected to predict the CD4 and CD8 T-cell epitopes. The CD4 and CD8 epitopes are predicted using a first set of predefined techniques and a second set of predefined techniques, respectively. The first set of predefined techniques may be motif-based methods, which rely on identifying known amino acid patterns associated with MHC binding. They are relatively simple and fast. The second set of predefined techniques may be structure-based methods, which may use computational models of the MHC binding pocket to predict how well a peptide may fit.

Finally, whether each of the one or more peptides having a binding relationship with the respective MHC triggers an immune response is predicted using a predefined predictor. For example, the ability of a peptide to trigger an immune response may be predicted using the PRIME immunogenicity predictor.
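
The peptide-generation step can be sketched as a sliding window over the final consensus sequence. This is only an illustration: the 8-15 residue window follows the typical fragment length stated above, while the function names and the `binds` callback are hypothetical stand-ins for an external MHC-binding or immunogenicity predictor, whose real interfaces are not reproduced here.

```python
def peptides(sequence, min_len=8, max_len=15):
    """Yield (start, peptide) for every window of min_len..max_len residues."""
    for size in range(min_len, max_len + 1):
        for start in range(len(sequence) - size + 1):
            yield start, sequence[start:start + size]


def candidate_epitopes(sequence, binds, min_len=8, max_len=15):
    """Keep the windows that a user-supplied predictor accepts.

    `binds` is a hypothetical callback (peptide -> bool); in practice it
    would wrap one of the prediction tools and an allele of interest.
    """
    return [(start, pep) for start, pep in peptides(sequence, min_len, max_len)
            if binds(pep)]


# Toy example with a placeholder sequence and a dummy predictor.
toy_sequence = "ABCDEFGHIJ"
hits = candidate_epitopes(toy_sequence, lambda pep: pep.startswith("C"), 8, 9)
print(hits)
```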

CD4 tools | CD8 tools
NetMHCIIpan BA | ANN
NetMHCIIpan EL | HLAthena
MixMHCIIpred | MHCflurry
NN Align | MHCSeqNet Oneshot
NetMHCII | MHCSeqNet Sequence
MHCII3D | NetMHC
SMM Align | NetMHCpan BA
Comblib | NetMHCpan EL
Sturniolo | MixMHCpred
- | PSSMHCpan
- | Pickpocket
- | SMMPMBEC

Table 1

At step 1103, one or more regions containing at least one CD4-CD8 bispecific epitope are identified by the system from the predicted CD4 epitopes and CD8 epitopes. The at least one CD4-CD8 bispecific epitope is combined with the B-cell immunogen for generating a nucleic acid sequence encoding a chimeric immunogen. Herein, the nucleic acid sequence is one of a DNA sequence or an mRNA sequence. Particularly, identifying one or more regions containing at least one CD4-CD8 bispecific epitope comprises identifying a predefined number of amino acid overlap regions between the CD4 epitopes and the CD8 epitopes to obtain a plurality of overlapping CD4-CD8 bispecific peptides, by performing an overlap between the predicted CD4 epitopes and CD8 epitopes. The CD4-CD8 bispecific epitopes are linked with each other using a peptide linker. In an embodiment, the CD4-CD8 bi-specific epitopes comprise amino acids which are conserved at a frequency of at least 50% across the plurality of lineages of the pathogen. In an embodiment, each of the at least one CD4-CD8 bi-specific epitopes has a length of about 15-40 amino acids.
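
One way to picture the overlap step is to treat each predicted epitope as an interval on the consensus sequence and keep every CD4/CD8 pair that shares enough residues. The sketch below is an interpretation, not the disclosed procedure: the half-open `(start, end)` coordinates, the function name `bispecific_regions`, and the default `min_overlap` value are assumptions made for illustration.

```python
def bispecific_regions(cd4, cd8, min_overlap=3):
    """Return merged spans where a CD4 and a CD8 epitope overlap.

    cd4, cd8: lists of half-open (start, end) positions on the consensus.
    min_overlap: assumed minimum number of shared residues (illustrative).
    """
    regions = set()
    for cd4_start, cd4_end in cd4:
        for cd8_start, cd8_end in cd8:
            shared = min(cd4_end, cd8_end) - max(cd4_start, cd8_start)
            if shared >= min_overlap:
                # The bispecific peptide spans both epitopes.
                regions.add((min(cd4_start, cd8_start), max(cd4_end, cd8_end)))
    return sorted(regions)


# Toy coordinates (hypothetical predictions).
cd4_epitopes = [(0, 15), (30, 45)]
cd8_epitopes = [(10, 19), (44, 53), (60, 69)]
print(bispecific_regions(cd4_epitopes, cd8_epitopes))
```

Here only the first pair shares at least three residues, so a single bispecific span is reported; the second pair overlaps by one residue and is rejected.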

In addition, the method includes performing bispecific multiepitope saturation, which includes determining, from the plurality of overlapping CD4-CD8 bispecific peptides, a set of overlapping CD4-CD8 bispecific peptides which overlap with each other by a predefined number of amino acids. In an example, the predefined number of amino acids may be three. Then, subsequent overlapping CD4-CD8 bispecific peptides are determined iteratively, based on the respective previously determined overlapping CD4-CD8 bispecific peptides, until no more overlapping is possible or the CD4-CD8 bi-specific peptides reach a predefined threshold length of amino acids. In an example, the predefined threshold length may be 40 amino acids. An exemplary schematic for the generation of CD4 and CD8 bispecific T-cell peptide saturation is shown in Figure 14.
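
The saturation step can be approximated by a greedy merge of overlapping spans. The sketch below is a simplified reading of the description, not the procedure of Figure 14: it uses the example values given above (an overlap of three amino acids and a 40 amino acid threshold), and the function name `saturate` and the interval representation are assumptions.

```python
def saturate(regions, min_overlap=3, max_len=40):
    """Greedily extend bispecific spans with overlapping neighbours.

    regions: half-open (start, end) spans of bispecific peptides.
    A neighbour is merged when it shares at least `min_overlap` residues
    with the current span and the merged span stays within `max_len`.
    """
    merged = []
    for start, end in sorted(regions):
        if merged:
            cur_start, cur_end = merged[-1]
            shared = min(cur_end, end) - start
            new_end = max(cur_end, end)
            if shared >= min_overlap and new_end - cur_start <= max_len:
                merged[-1] = (cur_start, new_end)
                continue
        # No usable overlap, or the length threshold would be exceeded.
        merged.append((start, end))
    return merged


# Toy spans (hypothetical): the second merges, the third would exceed 40.
spans = [(0, 19), (15, 32), (28, 45), (60, 75)]
print(saturate(spans))
```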

Therefore, the predicted CD4 and CD8 epitopes help to achieve bi-specificity, as a single peptide can activate both CD4 and CD8 T-cell immunity, which is unprecedented. The bi-specific activity of the peptide is achieved without a linker sequence between the CD4 and CD8 T-cell epitopes. The predicted CD4 and CD8 epitopes also provide high population coverage. That is, because the present method uses a bispecific peptide design strategy, the low population coverage that is a drawback of T-cell peptide-based vaccines can be overcome. This is achieved by first saturating the T-cell peptide with multiple overlapping CD4 and CD8 bispecific T-cell epitopes, which bind to multiple MHC alleles, thereby increasing the population coverage. Further selection of specific T-cell peptides may help to achieve over 95% population coverage for both MHC class I and MHC class II HLA alleles. Further, the predicted CD4 and CD8 epitopes help to achieve high efficiency, since the present method achieves CD4 and CD8 responses with high population coverage even with peptides of fewer than 40 amino acids.
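
To make the notion of population coverage concrete, the sketch below estimates the fraction of individuals carrying at least one allele bound by the selected peptides at a single locus. It is a simplification under stated assumptions (one locus, Hardy-Weinberg proportions) and is not necessarily how coverage was computed for the present disclosure; the allele names and frequencies are hypothetical placeholders, not measured values.

```python
def locus_coverage(allele_freqs, covered):
    """Fraction of individuals with at least one covered allele at a locus.

    Assumes Hardy-Weinberg proportions: an individual is missed only if
    both of its alleles fall outside the covered set.
    """
    covered_freq = sum(allele_freqs[allele] for allele in covered)
    return 1 - (1 - covered_freq) ** 2


# Hypothetical allele frequencies (placeholders, not real HLA data).
freqs = {"allele_1": 0.3, "allele_2": 0.2, "allele_3": 0.5}
coverage = locus_coverage(freqs, {"allele_1", "allele_2"})
print(round(coverage, 2))
```

Because the uncovered fraction is squared, adding peptides that bind further alleles raises coverage quickly, which is consistent with the rationale for saturating a peptide with epitopes for multiple MHC alleles.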

Thus, the sequences encoding bispecific multi-epitope peptides having high CD4 and CD8 population coverage in combination can be selected for the final immunogenic composition. These bispecific peptides may be linked together using -(GGGGS)- (G4S) peptide linkers. The G4S peptide linker is a flexible linker which may be easily cleaved inside the cell, leaving the T-cell peptides available for antigen presentation.
Figure 15 illustrates an exemplary overview of a system for generating a nucleic acid sequence encoding a chimeric immunogen, in accordance with an embodiment of the disclosure. With reference to FIG. 15, the system 1501 may include a processor 1503, a memory 1504, an Input/Output (I/O) interface 1505, and a network interface 1506, and may be connected to a database 1502 via a communication network 1507.

In an embodiment, the processor 1503 may be configured to execute instructions, such as those that may be loaded into the memory 1504. The instructions could be associated with a process for generating a nucleic acid sequence encoding a chimeric immunogen. The processor 1503, for example, may be a Proportional-Integral-Derivative (PID) controller or multivariable controllers, such as controllers implementing Model Predictive Control (MPC) or other Advanced Predictive Control (APC). As a particular example, the processor 1503 could represent a computing device running a real-time operating system, a WINDOWS operating system, or other operating systems. The memory 1504 may be a storage device, which represents any structure(s) capable of storing and facilitating retrieval of information (such as data, program code, and/or other suitable information on a temporary or permanent basis). The memory 1504 may represent a random access memory or any other suitable volatile or non-volatile storage device(s). The memory 1504 may contain one or more components or devices supporting longer-term storage of data, such as a read only memory, hard drive, Flash memory, or optical disc.

In an embodiment, the memory 1504 may be coupled to the system 1501 through the I/O interface 1505. Program instructions and data stored via the memory 1504 may be transmitted by transmission media or signals such as electrical, electromagnetic, or digital signals, which may be conveyed via a communication medium such as a network and/or a wireless link, such as may be implemented via the network interface 1506. In some implementations, the I/O interface 1505 may be configured to coordinate I/O traffic between the processor 1503, the memory 1504, and any peripheral devices, including the network interface 1506 or other peripheral interfaces, such as input/output devices. In some implementations, the I/O interface 1505 may perform any necessary protocol, timing, or other data transformations to convert data signals from one component (e.g., the memory 1504) into a format suitable for use by another component (e.g., the processor 1503). The I/O interface 1505 may allow for input and output of data. For example, the I/O interface 1505 may provide a connection for user input through a keyboard, mouse, keypad, touchscreen, or other suitable input device. The I/O interface 1505 may also send output to a display, printer, or other suitable output device.

In an embodiment, the processor 1503 may be connected to the database 1502 through the communication network 1507. The processor 1503 may receive the sequences, such as amino acid sequences, from the database 1502 through the communication network 1507. The communication network 1507 may include wired connections such as Ethernet cables or industrial communication protocols like MODBUS or OPC (OLE for Process Control). In some cases, wireless communication technologies such as Wi-Fi, cellular networks, or proprietary wireless protocols may be used for data transfer. Common protocols for transmitting data may include FTP (File Transfer Protocol), HTTP (Hypertext Transfer Protocol), MQTT (Message Queuing Telemetry Transport), or custom APIs (Application Programming Interfaces) provided by data service providers. Additionally, to ensure data security and integrity, encrypted connections such as HTTPS (HTTP Secure) or VPNs (Virtual Private Networks) may be utilized for data transmission. In an embodiment, the sequences may be associated with the pathogens.

In an embodiment, the processor 1503 may receive the sequences and classify them corresponding to a plurality of lineages/clades of a pathogen into a plurality of groups based on respective evolutionary relationships.

In an embodiment, the processor 1503 may perform one of, generation of a final consensus sequence for B-cell immunogen and for prediction of CD4 epitopes and CD8 epitopes or generation of the final consensus sequence individually for each of, the B-cell immunogen and for the prediction of the CD4 epitopes, and the CD8 epitopes, using a multi-level consensus technique on the classified sequences. The final consensus sequence is indicative of evolutionary average representing each lineage of the plurality of lineages of the pathogen.
The processor 1503 uses the multi-level technique, wherein the processor 1503 firstly generates a consensus sequence for each of the plurality of lineages at a first level of a plurality of levels. Then, the processor 1503 generates, iteratively at one or more subsequent levels of the plurality of levels, one or more combined consensus sequences formed based on a combination of (a) consensus of a parent lineage, (b) consensus sequences of consecutive lineages related to a common parent lineage generated at a previous level and optionally (c) a consensus sequence of a recombinant lineage comprising at least a part of the parent lineage. Thereafter, the processor 1503 generates a final consensus sequence based on a combination of each of the one or more combined consensus sequences generated at each of the one or more subsequent levels.
Further, the processor 1503 uses the final consensus for predicting the CD4 and CD8 epitopes. That is, the processor 1503 processes the final consensus sequence into one or more peptides and predicts a binding relationship of each of the one or more peptides with a corresponding Major Histocompatibility Complex (MHC) molecule present in a host cell which presents peptides to T-cells, based on predefined binding techniques. Then, the processor 1503 predicts whether each of the one or more peptides having the binding relationship with respective MHC is triggering an immune response using a predefined predictor. Herein, the processor 1503 predicts the CD4 and CD8 epitopes using the first set of predefined techniques and the second set of predefined techniques, respectively.

In an embodiment, the processor 1503 may identify one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and CD8 epitopes. The CD4-CD8 bispecific epitopes are combined with the B-cell immunogen to generate a nucleic acid sequence encoding the chimeric immunogen. Particularly, the processor 1503 identifies the one or more regions by overlapping the predicted CD4 epitopes with the predicted CD8 epitopes and identifying overlap regions of a predefined number of amino acids between them, to obtain a plurality of overlapping CD4-CD8 bispecific peptides. In an embodiment, the CD4-CD8 bi-specific epitopes comprise amino acids which are conserved at a frequency of at least 50% across the plurality of lineages of the pathogen. In an embodiment, each of the CD4-CD8 bi-specific epitopes has a length of about 15-40 amino acids.

Further, the processor 1503 may be configured to determine from the plurality of overlapping CD4-CD8 bispecific peptides, a set of overlapping CD4-CD8 bispecific peptides which overlap with each other with a predefined number of amino acids. Then, the processor 1503 determines, iteratively, subsequent overlapping CD4-CD8 bispecific peptides based on respective previously determined overlapping CD4-CD8 bispecific peptide, until no more overlapping is possible or the CD4-CD8 bi-specific peptides reach a predefined threshold length of amino acids.

ADVANTAGES
The nucleic acid sequence encoding the chimeric immunogen can elicit both humoral (antibody-mediated) and cellular (T-cell-mediated) immune responses and offers several advantages in providing comprehensive protection against infectious diseases or diseases like cancer such as:
a. Dual Defense Mechanisms: The immunogenic compositions of the present disclosure elicit both humoral and cellular immunity. Humoral immunity involves the production of antibodies that can neutralize pathogens or disease-specific antigens and prevent them from infecting host cells. Cellular immunity, on the other hand, involves the activation of T-cells, particularly Cytotoxic T Lymphocytes (CTLs), which can directly target and destroy infected cells.
b. Broader Protection: Some pathogens are better controlled by humoral immunity, while others require a strong cellular immune response. An immunogenic composition/vaccine that activates both arms of the immune system can offer broader protection, making the strategy applicable to a variety of infectious agents, including viruses, bacteria, and intracellular parasites.
c. Effective Control of Intracellular Pathogens: Cellular immunity is particularly important for combating intracellular pathogens that reside within host cells. CTLs can recognize and eliminate infected cells, preventing the spread of the pathogen and the development of persistent infections.
d. Memory Responses: Both humoral and cellular immune responses contribute to the development of immunological memory. Memory B-cells and memory T-cells are generated, providing a rapid and robust response upon re-exposure to the pathogen. This can result in long-lasting protection and reduce the need for frequent booster vaccinations.
e. Adaptability to Pathogen Variability: Pathogens, especially viruses, can undergo mutations and variations. By eliciting both humoral and cellular responses, a vaccine is more likely to provide protection against different strains or variants of the pathogen. This adaptability is crucial in the face of evolving infectious agents.
f. Preventing Pathogen Spread: Humoral immunity can neutralize circulating pathogens, preventing their entry into host cells. Meanwhile, cellular immunity can eliminate infected cells, reducing the host's reservoir of the pathogen and minimizing the transmission risk.
g. Bi-specificity: In the present invention, a single peptide can activate CD4 and CD8 T-cell immunity, which is unprecedented. The bi-specific activity of each CD4-CD8 bispecific epitope is achieved without a linker sequence between the CD4 and CD8 T-cell epitopes present on that bi-specific epitope.

h. High Population coverage: The present bispecific CD4-CD8 epitope design strategy overcomes a major drawback of T-cell peptide-based vaccines: low population coverage. This is achieved by first saturating the T-cell peptide with multiple overlapping CD4 and CD8 bispecific T-cell epitopes, which would bind to multiple MHC alleles, thereby increasing the population coverage. Further selecting the best 3 T-cell peptides is expected to achieve over 95% of population coverage for both MHC class I and MHC class II HLA alleles (checked using IEDB Population coverage).

i. High Efficiency: The traditional methods of designing T-cell peptide-based vaccines are quite inefficient, in that long peptides (for example, peptides of more than 40 amino acids) are linked via peptide linkers to generate T-cell immunity with viable population coverage. However, in the present approach, CD4 and CD8 responses with a high population coverage are achieved even with peptides of less than 40 amino acids.

It is to be understood that the foregoing descriptive matter is illustrative of the disclosure and not a limitation. While considerable emphasis has been placed herein on the particular features of this disclosure, it will be appreciated that various modifications can be made, and that many changes can be made in the preferred embodiments without departing from the principles of the disclosure. Those skilled in the art will recognize that the embodiments herein can be practiced with modification within the spirit and scope of the embodiments as described herein. Similarly, additional embodiments and features of the present disclosure will be apparent to one of ordinary skill in art based upon description provided herein.
Descriptions of well-known/conventional methods/steps and techniques are omitted so as to not unnecessarily obscure the embodiments herein. Further, the disclosure herein provides for examples illustrating the above-described embodiments, and in order to illustrate the embodiments of the present disclosure certain aspects have been employed. The examples used herein for such illustration are intended merely to facilitate an understanding of ways in which the embodiments herein may be practiced and to further enable those of skill in the art to practice the embodiments herein. Accordingly, the following examples should not be construed as limiting the scope of the embodiments herein.

EXAMPLES
The following examples particularly describe the manner in which the invention is to be performed. However, the embodiments disclosed herein do not limit the scope of the invention in any manner.
Example 1: Multi-level Consensus Sequence Generation
The multi-level consensus generation includes multiple iterations of single-level consensus generations at different levels of the evolutionary tree. Each lineage or phylogenetic group first underwent a single-level consensus generation to get the first level consensus sequence of each lineage/clade/phylogenetic group for the immunogen. The first level consensus sequences then underwent consensus generation with the next closest lineage to obtain the second level of consensus sequence. The process is iterated with the next closest lineage and so on to arrive at a final consensus sequence. Since multiple stepwise consensus sequences are generated to arrive at the final consensus sequence, the method is referred to herein as multi-level consensus sequence generation. The multi-level consensus generation is performed for each immunogen – B cell immunogen and T cell immunogen. Figure 3 depicts a general framework for developing a consensus sequence for any immunogen.
1.1 Data Preprocessing
i. Sequence Collection: Immunogen sequences were collected from various sources, such as public databases (e.g., GenBank, GISAID) and directed sequencing efforts.
ii. Sequence Quality Control: Quality control checks were performed on the collected sequences to identify and remove low-quality or erroneous sequences. This involved checking for sequencing errors, ambiguous nucleotides, lab strains, and incomplete sequences.
iii. Sequence Segregation: Based on the immunogen of interest, sequences were organized phylogenetically into different lineage/clade/serotypes.

1.2 Single-Level Consensus (Consensus Sequence Generation within a group):
Each single-level consensus generation consisted of at least three steps as depicted in Figure 2A:
i. Multiple Sequence Alignment: The high-quality sequences were aligned using appropriate alignment tools such as MAFFT, CLUSTALW, MUSCLE, etc. The protein sequences were aligned based on similarities, generating a multiple sequence alignment (MSA). Further, scoring matrices such as PAM and BLOSUM were used to score the alignments based on the similarity between the amino acids or nucleotides. The MSA provided valuable insights into the evolutionary relationships between the sequences, their shared structural and functional features, and potential mutations that may have occurred over time.

ii. Trimming Alignment: When alignment was done across many sequences, some sequences had unique amino acid insertions. Because of these unique or rare insertion events, the corresponding positions were added to the whole alignment. Amino acid positions that were not present in at least 50% of the sequences in the group were therefore trimmed off using sequence trimming tools such as trimAl.

iii. Identify Consensus Positions: The most frequent amino acid that had a frequency of more than or equal to 50 % formed the consensus for that position.
a) Resolving Ambiguities: If no consensus could be reached, an IUPAC ambiguity code to represent the possible nucleotide/RNA/amino acid was used.
b) Generating Consensus Sequence: The consensus nucleotides were concatenated for all positions to generate the consensus sequence of the immunogen.
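
In a non-limiting illustration, the trimming and consensus-position steps of the single-level consensus may be sketched as follows. The sketch assumes that the sequences have already been aligned to equal length with "-" as the gap character, and it uses "X" as the ambiguity symbol for amino acids; the function names and the toy alignment are hypothetical. Ties between equally frequent residues are not given any special treatment here.

```python
from collections import Counter
from typing import List

GAP = "-"
AMBIGUOUS = "X"  # assumed ambiguity symbol for an unresolved amino acid position


def trim_columns(alignment: List[str], min_occupancy: float = 0.5) -> List[str]:
    """Drop columns in which fewer than `min_occupancy` of sequences carry a residue."""
    n = len(alignment)
    keep = [i for i in range(len(alignment[0]))
            if sum(seq[i] != GAP for seq in alignment) / n >= min_occupancy]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def consensus(alignment: List[str], threshold: float = 0.5) -> str:
    """Per column, take the most frequent residue if its frequency is >= threshold."""
    n = len(alignment)
    out = []
    for column in zip(*alignment):
        counts = Counter(c for c in column if c != GAP)
        if not counts:
            out.append(AMBIGUOUS)
            continue
        residue, count = counts.most_common(1)[0]
        out.append(residue if count / n >= threshold else AMBIGUOUS)
    return "".join(out)


# Toy alignment (hypothetical): the third column is a rare insertion.
aligned = ["MK-TA", "MKQTA", "MR-TG", "MK-SA"]
print(consensus(trim_columns(aligned)))  # prints MKTA
```

The rare insertion in the third column is present in only one of four sequences and is therefore trimmed before the consensus is called.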

1.3 Multi-level Consensus:
The multi-level consensus generation as depicted in Figure 2(B) includes multiple iterations of single-level consensus generations at different levels of the evolutionary tree. A single-level consensus sequence was first generated for each lineage/clade/phylogenetic group of the immunogen. Multi-level consensus generation was then performed for each immunogen.
There were n+1 levels, where n was the number of sub-lineage levels. The (n+1)th level dealt with recombinant lineages or recombinants of recombinant lineages. Here, recombinant lineages were descendants of two lineages. So, for each recombinant lineage, the consensus sequence was added to the consensus pool of both parent lineages.
Multi-level consensus started from the nth level if there were no recombinant phylogenetic groups. The most evolved sublineages at the far end of the phylogeny were dealt with at the nth level: single-level consensus generation was applied to these sublineage pools, and the consensus sequences generated were added to the consensus lineage pools of their immediate parent lineages. Next, the inventors dealt with the lineage pools in the (n-1)th level in the same way: single-level consensus generation was applied to each sublineage pool, and the consensus sequences generated were added to the consensus lineage pools of their immediate parent lineages.
These iterations continued down to the 1st level, where single-level consensus generation was applied to generate a single final consensus for the lineage. This was done until the inventors reached the root variants in each evolutionary direction. Then, a consensus sequence was generated from the multi-level consensus sequences of all root variant lineages to make the final multi-level consensus sequence. Figure 13 provides a schematic of the steps involved in multi-level consensus sequence generation.
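In a non-limiting illustration, the bottom-up pooling described above may be sketched as a recursion over a lineage tree. The sketch assumes that all sequences are already aligned to equal length, that each lineage pool holds the lineage's own consensus together with the consensus sequences of its immediate sublineages, and that "X" marks unresolved positions; the `Lineage` class, the function names, and the toy sequences are hypothetical. A recombinant lineage could be listed as a child of both of its parents so that its consensus enters both pools.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


def consensus(seqs: List[str], threshold: float = 0.5) -> str:
    """Single-level consensus of equal-length sequences; 'X' where no residue reaches threshold."""
    n = len(seqs)
    out = []
    for column in zip(*seqs):
        residue, count = Counter(column).most_common(1)[0]
        out.append(residue if count / n >= threshold else "X")
    return "".join(out)


@dataclass
class Lineage:
    name: str
    sequences: List[str]
    children: List["Lineage"] = field(default_factory=list)


def multilevel_consensus(node: Lineage) -> str:
    """Pool the node's own consensus with its sublineages' consensuses, then call consensus again."""
    pool = [consensus(node.sequences)] if node.sequences else []
    pool += [multilevel_consensus(child) for child in node.children]
    return consensus(pool)


def final_consensus(roots: List[Lineage]) -> str:
    """Consensus across the multi-level consensus sequences of all root lineages."""
    return consensus([multilevel_consensus(root) for root in roots])


# Toy example (hypothetical): parent lineage A is over-sampled relative to A.1 and A.2.
a1 = Lineage("A.1", ["MRTA", "MRTA"])
a2 = Lineage("A.2", ["MRTA"])
a = Lineage("A", ["MKTA"] * 4, children=[a1, a2])
print(multilevel_consensus(a))                                 # prints MRTA
print(consensus(a.sequences + a1.sequences + a2.sequences))    # prints MKTA
```

In this toy case a flat consensus over all seven sequences follows the heavily sampled parent lineage, whereas the multi-level consensus weights each lineage once, which is consistent with the stated aim of representing each lineage in the final sequence.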

1.4 Post-processing
i. Analyze Consensus Sequence: The consensus sequence was analyzed to identify potential mutations, deletions, or insertions. These changes provided insights into the evolutionary history of the immunogen and its potential impact on vaccine efficacy.

ii. Compare Consensus Sequence: The consensus sequence was compared to known immunogen variants to determine its closest relatives. Figure 13 illustrates the steps involved in multi-level consensus sequence generation.

Further, Figure 4 provides a comparison of consensus immunogen with SARS-CoV-2 variants. a) Pairwise sequence identity comparison of SARS-CoV-2 consensus protein immunogen (SCoP) with SARS-CoV-2 variants consensus sequences of Spike protein. b) Pairwise sequence identity comparison of SCoP immunogen with SARS-CoV-2 variant consensus sequences of Nucleocapsid protein. c) Comparison of neutralizing mAbs binding to spike protein based on SCoP sequence or SARS-CoV-2 variant sequences. Here, the X-axis shows docking scores, and the Y-axis shows spike immunogen type.

Figure 5 illustrates pairwise sequence identity comparison of consensus sequence to the individual clade sequences of Influenza virus. A) NP ; B) M1.
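
In a non-limiting illustration, a pairwise sequence identity comparison of the kind shown in Figures 4 and 5 may be sketched as follows. The disclosure does not state which identity formula was used; the sketch assumes pre-aligned sequences of equal length and counts identical positions over all columns in which at least one sequence has a residue. The function name and the toy sequences are hypothetical.

```python
from typing import Dict


def percent_identity(a: str, b: str) -> float:
    """Percent identity of two aligned sequences, ignoring columns where both have a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy sequences (hypothetical), not real variant data.
consensus_seq = "MKTA"
variants: Dict[str, str] = {"variant_1": "MKTA", "variant_2": "MKTG"}
table = {name: percent_identity(consensus_seq, seq) for name, seq in variants.items()}
print(table)  # prints {'variant_1': 100.0, 'variant_2': 75.0}
```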

iii. Update Consensus Sequence: If an update was required, new sequences could be added, updating the consensus sequence to reflect the latest genetic information.

EXAMPLE 2: Designing CD4-CD8 bi-specific T-cell peptides
The aim of the inventors was to design a single peptide that contains both CD4 and CD8 T cell activating epitopes (referred to as CD4-CD8 bispecific epitope) so that the single peptide activates both CD4 and CD8 T-cell responses. To achieve this, tools listed in Table 1 were employed to individually predict all potential CD4 and CD8 T-cell epitopes.
2.1. T-cell epitope prediction
T-cell peptide prediction was carried out by employing computational tools to identify short segments of proteins (peptides) that can bind to and activate T-cells.
The first step involved processing the protein sequence into potential peptide fragments. These fragments were typically 8-15 amino acids long. The next step was related to predicting which peptides would likely bind to a specific major histocompatibility complex (MHC) molecule. The Human MHC alleles with the highest world population coverage were selected to predict the CD4 and CD8 T-cell epitopes.
Two approaches were employed for MHC binding prediction:
a. Motif-based methods
b. Structure-based methods
Finally, the PRIME immunogenicity predictor was used to predict whether a peptide that binds to MHC is also likely to be immunogenic, meaning that it can trigger an immune response. This more complex task involved factors beyond MHC binding, such as the individual's T-cell repertoire and other immune system components.
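In a non-limiting illustration, the first step of this section, processing the protein sequence into candidate peptide fragments of 8-15 amino acids, may be sketched as a sliding window. The function name and the toy sequence are hypothetical; the MHC binding and immunogenicity prediction steps are performed by external tools and are not reproduced here.

```python
from typing import Iterator, Tuple


def peptide_fragments(protein: str, min_len: int = 8,
                      max_len: int = 15) -> Iterator[Tuple[int, str]]:
    """Yield (start, peptide) for every window of min_len..max_len residues."""
    for length in range(min_len, max_len + 1):
        for start in range(len(protein) - length + 1):
            yield start, protein[start:start + length]


# Toy 9-residue sequence (hypothetical), restricted to 8-9-mers for brevity.
print(list(peptide_fragments("ACDEFGHIK", 8, 9)))
# prints [(0, 'ACDEFGHI'), (1, 'CDEFGHIK'), (0, 'ACDEFGHIK')]
```

Keeping the start position with each fragment allows predicted epitopes to be mapped back onto the consensus sequence when overlaps between CD4 and CD8 epitopes are later examined.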
2.2. CD4-CD8 bispecific multiepitope saturation:
Once the immunogenic CD4 and CD8 T-cell epitopes were identified, a first round of overlapping of the CD8 and CD4 epitopes was performed so that overlap regions of at least three amino acids were present between the epitopes. After the first round, multiple rounds of overlapping were performed on the previously identified epitopes. This process is called epitope saturation. Multiple iterations of epitope saturation were performed until no more peptides could be overlapped or the peptides reached the maximum allowed length of 40 amino acids. This is where the present disclosure differs significantly from the traditional methods. The present inventors could saturate the maximum number of epitopes in a single peptide and have bi-specific T-cell activity. Figure 6 is a schematic explaining the CD4 and CD8 bispecific T-cell peptide saturation.
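In a non-limiting illustration, the iterative epitope saturation may be sketched as a greedy extension of a first-round bispecific region. The sketch assumes that epitopes are represented by start and end positions on the consensus sequence and that an epitope is merged whenever it overlaps the current region by at least three residues and the merged region does not exceed 40 residues; the function names, the greedy merge order, and the example coordinates are hypothetical and represent only one possible reading of the procedure.

```python
from typing import List, Tuple

# (start, end) on the consensus sequence, 0-based, end exclusive (assumed representation)
Span = Tuple[int, int]


def overlap_len(a: Span, b: Span) -> int:
    """Number of positions shared by two spans (0 if they are disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


def saturate(seed: Span, epitopes: List[Span],
             min_overlap: int = 3, max_len: int = 40) -> Span:
    """Repeatedly merge overlapping epitopes into `seed` until nothing more fits."""
    current = seed
    changed = True
    while changed:
        changed = False
        for ep in epitopes:
            candidate = (min(current[0], ep[0]), max(current[1], ep[1]))
            if candidate == current:
                continue  # epitope already lies within the current region
            fits = candidate[1] - candidate[0] <= max_len
            if overlap_len(current, ep) >= min_overlap and fits:
                current = candidate
                changed = True
    return current


# Hypothetical coordinates: a first-round region and three further epitopes.
print(saturate((0, 21), [(18, 30), (27, 38), (35, 50)]))  # prints (0, 38)
```

The third epitope overlaps the growing region but would extend it to 50 residues, so it is left out and the saturation stops at 38 residues.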
EXAMPLE 3: Chimeric B-Cell antigen - CD4/CD8 bispecific T-cell peptide construct
The steps to design a chimeric construct are as follows.
A. Linking CD4-CD8 Bispecific multi-epitope peptides
Three bispecific multi-epitope peptides having high CD4 and CD8 population coverage in combination are selected to design the final nucleic acid construct. These three bispecific peptides are linked together using -(GGGGS)- (G4S) peptide linkers. The G4S peptide linker is a flexible linker which could be easily cleaved inside the cell, leaving the T-cell peptides available for antigen presentation. Figure 7 depicts a 3D structure of the chimeric vaccine antigen for Influenza H1N1 and SARS-CoV-2.
B. Linking humoral immunogen to cellular immunogen
The multi-level consensus sequence generation method was used to design the B-cell immunogen. To elicit both T-cell and B-cell responses, the B-cell antigen needs to be presented on the cell surface for antibody interaction, and the T-cell immunogen containing multiple bispecific epitopes needs to be presented in the cytoplasm for processing by the proteasome. To make this happen, a 2A peptide sequence followed by a G4S linker can be inserted between the B-cell antigen and the T-cell multi-peptide construct. Moreover, to enhance the 2A peptide cleavage, the G4S linker is incorporated at the N-terminus of the 2A peptide. Figure 8 provides a schematic explaining the chimeric vaccine design.
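In a non-limiting illustration, the assembly of the chimeric construct at the amino acid level may be sketched as simple string concatenation. The sketch assumes one possible ordering, namely B-cell antigen, G4S linker, 2A peptide, and then the G4S-linked T-cell peptides; the 2A sequence is not specified in this section and is therefore supplied by the caller. The function names and the placeholder strings are hypothetical and do not represent real sequences.

```python
from typing import Sequence

G4S = "GGGGS"  # flexible linker named in the disclosure


def link_peptides(peptides: Sequence[str], linker: str = G4S) -> str:
    """Join bispecific multi-epitope peptides with a linker between each pair."""
    return linker.join(peptides)


def chimeric_construct(b_cell_antigen: str, t_cell_peptides: Sequence[str],
                       two_a: str, linker: str = G4S) -> str:
    """Assumed order: B-cell antigen, linker, 2A peptide, linked T-cell peptides."""
    return b_cell_antigen + linker + two_a + link_peptides(t_cell_peptides, linker)


# Placeholder strings only; real antigen, 2A and peptide sequences would be substituted.
print(chimeric_construct("BBBB", ["TTT", "UUU", "VVV"], two_a="[2A]"))
# prints BBBBGGGGS[2A]TTTGGGGSUUUGGGGSVVV
```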
EXAMPLE 4: Designing broadly protective SARS-CoV-2 and Influenza Vaccine
4.1. Broadly Protective SARS-CoV-2 Vaccine
Broadly protective vaccines against SARS-CoV-2 were designed, comprising an evolutionary consensus of the Spike (S) protein linked to a CD4-CD8 bi-specific T-cell multi-epitope derived from other structural proteins. The consensus Spike, alone or in combination with the consensus Nucleoprotein, was tested and found to elicit a broad immune response when delivered through a self-amplifying mRNA platform.

Figure 9A depicts the design of broadly protective mRNA vaccine against SARS-CoV-2.

A consensus SARS-CoV-2 spike protein sequence was generated by employing the multi-level consensus generation method described above:

MFVFLVLLPLVSSQCVNLTTRTQLPPAYTNSFTRGVYYPDKVFRSSVLHSTQDLFLPFFSNVTWFHAIHVSGTNGTKRFDNPVLPFNDGVYFASTEKSNIIRGWIFGTTLDSKTQSLLIVNNATNVVIKVCEFQFCNDPFLGVYYHKNNKSWMESEFRVYSSANNCTFEYVSQPFLMDLEGKQGNFKNLREFVFKNIDGYFKIYSKHTPINLVRDLPQGFSALEPLVDLPIGINITRFQTLLALHRSYLTPGDSSSGWTAGAAAYYVGYLQPRTFLLKYNENGTITDAVDCALDPLSETKCTLKSFTVEKGIYQTSNFRVQPTESIVRFPNITNLCPFGEVFNATRFASVYAWNRKRISNCVADYSVLYNSASFSTFKCYGVSPTKLNDLCFTNVYADSFVIRGDEVRQIAPGQTGNIADYNYKLPDDFTGCVIAWNSNNLDSKVGGNYNYLYRLFRKSNLKPFERDISTEIYQAGSTPCNGVEGFNCYFPLQSYGFQPTYGVGYQPYRVVVLSFELLHAPATVCGPKKSTNLVKNKCVNFNFNGLTGTGVLTESNKKFLPFQQFGRDIADTTDAVRDPQTLEILDITPCSFGGVSVITPGTNTSNQVAVLYQGVNCTEVPVAIHADQLTPTWRVYSTGSNVFQTRAGCLIGAEHVNNSYECDIPIGAGICASYQTQTNSHRRARSVASQSIIAYTMSLGAENSVAYSNNSIAIPTNFTISVTTEILPVSMTKTSVDCTMYICGDSTECSNLLLQYGSFCTQLNRALTGIAVEQDKNTQEVFAQVKQIYKTPPIKDFGGFNFSQILPDPSKPSKRSFIEDLLFNKVTLADAGFIKQYGDCLGDIAARDLICAQKFNGLTVLPPLLTDEMIAQYTSALLAGTITSGWTFGAGAALQIPFAMQMAYRFNGIGVTQNVLYENQKLIANQFNSAIGKIQDSLSSTASALGKLQDVVNQNAQALNTLVKQLSSNFGAISSVLNDILSRLDKVEAEVQIDRLITGRLQSLQTYVTQQLIRAAEIRASANLAATKMSECVLGQSKRVDFCGKGYHLMSFPQSAPHGVVFLHVTYVPAQEKNFTTAPAICHDGKAHFPREGVFVSNGTHWFVTQRNFYEPQIITTDNTFVSGNCDVVIGIVNNTVYDPLQPELDSFKEELDKYFKNHTSPDVDLGDISGINASVVNIQKEIDRLNEVAKNLNESLIDLQELGKYEQYIKWPWYIWLGFIAGLIAIVMVTIMLCCMTSCCSCLKGCCSCGSCCKFDEDDSEPVLKGVKLHYT (SEQ ID NO: 1).

A map for the above SARS-CoV-2 Consensus Spike sequence is shown in Figure 16. Figure 16 shows the analysis of amino acid mutations in the Spike protein of the SCoP consensus sequence compared to variant of concern sequences. The x-axis shows the amino acid position in the protein and the y-axis shows the parent lineage of the variant. Alpha - B.1.1.7; Beta - B.1.351; Gamma - P.1; Delta - B.1.617.2; Omicron - BA.1. Different colours represent different amino acids and black indicates a gap in the amino acid position (colour legends mark amino acids with one-letter amino acid codes). Spike ORF subregions were mapped and marked on top. Regions marked include SP (signal peptide), S1, S2, NTD (N-terminal domain), RBD (receptor-binding domain), RBM (receptor-binding motif), FP (fusion peptide), HR1 (heptad repeat 1), HR2 (heptad repeat 2), TM (transmembrane region), and CP (cytoplasmic region). Different colours mark Spike subregions, with S1 and its subregions represented in shades of orange, while S2 and its subregions are represented in shades of cyan.

Figure 9B shows the results of an immunogenicity assay in mice administered with an mRNA encoding the above-mentioned consensus SARS-CoV-2 spike protein. Specifically, Figure 9B shows the results from plaque reduction neutralization tests (PRNT) of sera isolated from mice immunized with the SCoP immunogen. The results highlighted the 50% neutralization titres against Delta and Alpha variants of SARS-CoV-2, showing a comparable response to the wild type Hong Kong variant. The overall conclusion indicates that the SCoP immunogen enhanced antibody neutralization titres across a broad range of SARS-CoV-2 variants.

4.2. Broadly Protective Influenza Vaccine
The inventors designed a broadly protective vaccine against influenza A virus (IAV), which comprises an evolutionary consensus of Hemagglutinin (HA) protein linked to a CD4-CD8 bi-specific T-cell multi-epitope derived from Nucleoprotein (NP) and Matrix (M1) proteins. A composition comprising a pool of peptides containing CD4-CD8 bi-specific epitopes was administered in mice as an intranasal vaccine. The interaction of the consensus HA immunogen with a range of monoclonal antibodies was examined, and it was found that the interaction was maintained or enhanced compared to the homologous wild-type HA protein. Figure 10A depicts the design of the chimeric broadly protective immunogen based on consensus HA and CD4-CD8 bi-specific regions from M1 and NP. Figure 10B depicts mAb binding of consensus H1 HA when compared to homologous HA proteins. Figure 10C shows the change in body weight in mice post H1N1 or H3N2 virus challenge. The mice group that received the intranasal T-cell peptide pool lost only 10% of their body weight, while the unvaccinated groups lost about 25% of their body weight due to H1N1 or H3N2 virus infection. The vaccinated mice showed mild symptoms of flu infection while the unvaccinated groups showed severe infection symptoms.
Figure 10D shows a survival curve of mice challenged with H1N1 or H3N2. Mice that received intranasal immunization with a T-cell peptide pool, in which the T-cell peptides containing bi-specific epitopes were generated according to the present disclosure, survived the H1N1 or H3N2 exposure, whereas mice that did not receive the immunization with the T-cell peptide pool succumbed to the infection.

We claim:
1. A nucleic acid sequence encoding a chimeric immunogen, comprising a first sequence encoding a consensus B cell immunogen of the chimeric immunogen and a second sequence encoding a consensus T cell immunogen of the chimeric immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of a pathogen or different variants of an antigen.
2. The nucleic acid sequence as claimed in claim 1, wherein the first sequence is operably linked to the second sequence.
3. The nucleic acid sequence as claimed in claim 1 or 2, wherein the second sequence comprises nucleotides that encode the CD4-CD8 bi-specific epitopes and nucleotides that encode for linkers that link the CD4-CD8 bi-specific epitopes.
4. The nucleic acid sequence as claimed in any one of claims 1-3, wherein amino acids of the consensus B cell immunogen encoded by the first sequence are conserved at a frequency of about 50% or more across the different strains or different clades of the pathogen.
5. The nucleic acid sequence as claimed in any one of claims 1-4, wherein amino acids of the CD4-CD8 bi-specific epitopes encoded by the second sequence are conserved at a frequency of about 50% or more across the different strains or different clades of the pathogen.
6. The nucleic acid sequence as claimed in any one of claims 1-5, wherein the second sequence encodes about 2-15 CD4-CD8 bi-specific epitopes.
7. The nucleic acid sequence as claimed in any one of claims 1-6, wherein each of the CD4-CD8 bi-specific epitopes encoded by the second sequence has a length of about 15-40 amino acids.
8. The nucleic acid sequence as claimed in any one of claims 1-7, wherein the first sequence encodes the consensus B cell immunogen selected from a consensus spike protein of SARS-CoV-2, a consensus hemagglutinin of an influenza virus, or a consensus neuraminidase (N) of an influenza virus.
9. The nucleic acid sequence as claimed in any one of claims 1-8, wherein the second sequence encodes consensus CD4-CD8 bi-specific epitopes of SARS-CoV-2 or influenza virus.
10. The nucleic acid sequence as claimed in any one of claims 1-9, wherein the second sequence encodes CD4-CD8 bi-specific epitopes of a consensus nucleoprotein of influenza virus, a consensus matrix protein 1 (M1) of influenza virus, or both.
11. The nucleic acid sequence as claimed in any one of claims 8-10, wherein the influenza virus is selected from influenza A virus or influenza B virus.
12. The nucleic acid sequence as claimed in claim 11, wherein the influenza A virus includes, but is not limited to, H1N1, H3N2, H5N1 and H7N9 subtypes.
13. The nucleic acid sequence as claimed in any one of claims 1-12, wherein the nucleic acid sequence is a DNA sequence or an mRNA sequence.
14. A vector comprising the nucleic acid sequence as claimed in any one of claims 1-12.
15. The vector as claimed in claim 14, wherein the vector is a plasmid or a viral vector.
16. An immunogenic composition comprising the nucleic acid sequence as claimed in any one of claims 1-13 or the vector as claimed in claim 14 or 15 and a pharmaceutical carrier.
17. The immunogenic composition as claimed in claim 16, wherein the composition induces a protective immune response to different variants or different clades of a pathogen or different variants of an antigen.
18. The immunogenic composition as claimed in claim 17, wherein the pathogen is SARS-CoV-2 or influenza.
19. A chimeric immunogen comprising a consensus B cell immunogen and a consensus T cell immunogen comprising CD4-CD8 bi-specific epitopes, wherein the chimeric immunogen induces a B-cell immune response and a T-cell immune response to different variants or different clades of a pathogen or different variants of an antigen.
20. The chimeric immunogen as claimed in claim 19, wherein the consensus B cell immunogen is linked to the consensus T cell immunogen.
21. The chimeric immunogen as claimed in claim 19 or 20, wherein the C-terminus of the consensus B cell immunogen is linked to the N-terminus of the consensus T cell immunogen.
22. The chimeric immunogen as claimed in any one of claims 19-21, wherein amino acids of the consensus B cell immunogen are conserved at a frequency of about 50% or more across the different strains or different clades of the pathogen.
23. The chimeric immunogen as claimed in any one of claims 19-22, wherein amino acids of the CD4-CD8 bi-specific epitopes are conserved at a frequency of about 50% or more across the different strains or different clades of the pathogen.
24. The chimeric immunogen as claimed in any one of claims 19-23, wherein the consensus T cell immunogen comprises about 2-15 CD4-CD8 bi-specific epitopes.
25. The chimeric immunogen as claimed in any one of claims 19-24, wherein each of the CD4-CD8 bi-specific epitopes has a length of about 15-40 amino acids.
26. The chimeric immunogen as claimed in any one of claims 19-25, wherein the consensus B cell immunogen is selected from a consensus spike protein of SARS-CoV-2, a consensus hemagglutinin of an influenza virus, or a consensus neuraminidase (N) of an influenza virus.
27. The chimeric immunogen as claimed in any one of claims 19-26, wherein the consensus T cell immunogen comprises consensus CD4-CD8 bi-specific epitopes of SARS-CoV-2 or influenza virus.
28. The chimeric immunogen as claimed in any one of claims 19-27, wherein the consensus T cell immunogen comprises consensus CD4-CD8 bi-specific epitopes of nucleoprotein of influenza virus, matrix protein 1 (M1) of influenza virus, or both.
29. The chimeric immunogen as claimed in any one of claims 26-28, wherein the influenza virus is selected from influenza A virus or influenza B virus.
30. The chimeric immunogen as claimed in claim 29, wherein influenza A virus is H1N1 or H3N2.
31. An immunogenic composition comprising the chimeric immunogen as claimed in any one of claims 19-30 and a pharmaceutical carrier.
32. The immunogenic composition as claimed in claim 31, wherein the composition induces a protective immune response to different variants or different clades of a pathogen or different variants of an antigen.
33. The immunogenic composition as claimed in claim 32, wherein the pathogen is SARS-CoV-2 or influenza.
34. A method for inducing an immune response in a subject, comprising administering to the subject the nucleic acid sequence as claimed in any one of claims 1-13; the vector as claimed in claim 14 or 15; the immunogenic composition as claimed in any one of claims 16-18 or 31-33; or the chimeric immunogen as claimed in any one of claims 19-30.
35. The method as claimed in claim 34, wherein the nucleic acid sequence, the vector, the immunogenic composition, or the chimeric immunogen induces a protective immune response to different strains or different clades of a pathogen.
36. The method as claimed in claim 35, wherein the pathogen is SARS-CoV-2 or influenza.
37. The method as claimed in claim 36, wherein influenza is influenza A or influenza B.
38. The method as claimed in claim 37, wherein influenza A is H1N1 or H3N2.
39. A method for generating the nucleic acid sequence as claimed in claim 1, comprising:
- classifying, by a system, sequences corresponding to a plurality of lineages/clades of a pathogen into a plurality of groups based on respective evolutionary relationships;

- performing one of, by the system, generation of a final consensus sequence for B-cell immunogen and for prediction of CD4 epitopes and CD8 epitopes or generation of the final consensus sequence individually for each of, the B-cell immunogen and for the prediction of the CD4 epitopes, and the CD8 epitopes, using a multi-level consensus technique on the classified sequences, wherein the final consensus sequence is indicative of evolutionary average representing each lineage of the plurality of lineages of the pathogen; and

- identifying, by the system, one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and the CD8 epitopes, wherein the at least one CD4-CD8 bispecific epitope is combined with the B-cell immunogen for generating a nucleic acid sequence encoding a chimeric immunogen.
40. The method as claimed in claim 39, wherein the multi-level consensus technique comprises:
- generating a consensus sequence for each of the plurality of lineages at a first level of a plurality of levels;
- generating, iteratively at one or more subsequent levels of the plurality of levels, one or more combined consensus sequences formed based on a combination of (a) consensus of a parent lineage, (b) consensus sequences of consecutive lineages related to a common parent lineage generated at a previous level and optionally (c) a consensus sequence of a recombinant lineage comprising at least a part of the parent lineage; and
- generating a final consensus sequence based on a combination of each of the one or more combined consensus sequences generated at each of the one or more subsequent levels.
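
The multi-level consensus steps above can be illustrated with a minimal Python sketch. It is not the applicants' implementation: it assumes the sequences are already aligned to equal length, uses a simple column-wise majority vote as the consensus rule, and represents the lineage hierarchy with a hypothetical nested structure (a list of sequences for a leaf lineage, or a dict with "parent" and "children" keys for an internal node). The optional recombinant-lineage input is omitted.

```python
from collections import Counter

def consensus(seqs):
    """Column-wise majority consensus of equal-length aligned sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    # Ties go to the residue seen first in the column (Counter insertion order).
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))

def multilevel_consensus(node):
    """Hypothetical tree layout (an assumption for illustration):
    a leaf lineage is a list of aligned sequences; an internal node is
    {"parent": [seqs...], "children": [node, ...]} with "parent" optional."""
    if isinstance(node, list):
        # First level: one consensus per lineage.
        return consensus(node)
    # Later levels: combine the parent consensus with each child's consensus.
    parts = [consensus(node["parent"])] if node.get("parent") else []
    parts += [multilevel_consensus(child) for child in node["children"]]
    return consensus(parts)

# One heavily sampled lineage and two sparsely sampled sister lineages.
tree = {"children": [["AAAA"] * 5, ["AGAA"], ["AGAA"]]}
flat = consensus(["AAAA"] * 5 + ["AGAA"] * 2)
levelled = multilevel_consensus(tree)
```

In this toy input the flat consensus is "AAAA", because the oversampled lineage dominates the vote, whereas the multi-level consensus is "AGAA", because each lineage contributes one vote at the second level.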

41. The method as claimed in claim 1, wherein predicting the CD4 and CD8 epitopes comprises:
- processing the final consensus sequence into one or more peptides;
- predicting a binding relationship of each of the one or more peptides with a corresponding Major Histocompatibility Complex (MHC) molecule present in a host cell which presents peptides to T-cells, based on predefined binding techniques; and
- predicting whether each of the one or more peptides having the binding relationship with respective MHC is triggering an immune response using a predefined predictor.
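
The three prediction steps above can be sketched as a small filtering pipeline. The claim does not name the binding or immunogenicity predictors, so the sketch takes them as caller-supplied functions; `binds_mhc` and `is_immunogenic` are hypothetical placeholders, not the API of any real tool, and the toy rules in the usage lines carry no biological meaning.

```python
def tile_peptides(sequence, length):
    """Return (start, peptide) for every window of the given length."""
    return [(i, sequence[i:i + length])
            for i in range(len(sequence) - length + 1)]

def predict_epitopes(sequence, length, binds_mhc, is_immunogenic):
    """Keep windows that pass the binding check and then the immunogenicity check.

    binds_mhc and is_immunogenic are hypothetical callables (peptide -> bool)
    standing in for whatever predefined predictors are actually used.
    """
    return [(start, pep)
            for start, pep in tile_peptides(sequence, length)
            if binds_mhc(pep) and is_immunogenic(pep)]

# Toy stand-ins for the predictors, for illustration only.
hits = predict_epitopes(
    "MKLLKA", 3,
    binds_mhc=lambda pep: pep[0] in "LM",
    is_immunogenic=lambda pep: "K" in pep,
)
```

Running the same function with different window lengths and different predictor pairs would give separate CD4 and CD8 candidate lists from one consensus sequence.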

42. The method as claimed in claim 3, wherein the CD4 and CD8 epitopes are predicted using a first set of predefined techniques and a second set of predefined techniques, respectively.

- determining, iteratively, subsequent overlapping CD4-CD8 bispecific peptides based on respective previously determined overlapping CD4-CD8 bispecific peptide, until no more overlapping is possible or the CD4-CD8 bi-specific peptides reach a predefined threshold length of amino acids.
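
One possible reading of the iterative overlap step above is a left-to-right merge of predicted epitope intervals that stops when nothing more overlaps or when the length cap would be exceeded. The sketch below assumes epitopes are given as half-open (start, end) positions on the consensus sequence and uses 40 residues as the cap, taken from the upper bound recited elsewhere in the claims; the function name and data layout are illustrative assumptions.

```python
def bispecific_regions(cd4, cd8, max_len=40):
    """Merge overlapping CD4 and CD8 epitope intervals into regions.

    cd4, cd8: lists of half-open (start, end) intervals (assumed layout).
    A region is reported only if it contains at least one epitope of each kind.
    """
    tagged = sorted([(s, e, "CD4") for s, e in cd4] +
                    [(s, e, "CD8") for s, e in cd8])
    regions = []
    current = None  # [start, end, kinds seen so far]
    for s, e, kind in tagged:
        fits = current and s < current[1] and max(e, current[1]) - current[0] <= max_len
        if fits:
            # Extend the running region with the overlapping epitope.
            current[1] = max(current[1], e)
            current[2].add(kind)
        else:
            # No overlap, or the cap would be exceeded: close the region.
            if current and current[2] == {"CD4", "CD8"}:
                regions.append((current[0], current[1]))
            current = [s, e, {kind}]
    if current and current[2] == {"CD4", "CD8"}:
        regions.append((current[0], current[1]))
    return regions

cd4_hits = [(0, 15), (50, 65)]
cd8_hits = [(10, 19), (17, 26), (70, 79)]
regions = bispecific_regions(cd4_hits, cd8_hits)
```

Here the first CD4 epitope and the two overlapping CD8 epitopes merge into a single 26-residue region, while the isolated CD4 and CD8 epitopes further along are dropped because neither overlaps an epitope of the other kind.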
45. The method as claimed in claim 1, wherein the at least one CD4-CD8 bispecific epitopes are linked with each other using a peptide linker.
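
A minimal sketch of the assembly step, assuming protein sequences as plain strings. The claims do not specify a linker sequence, so the flexible "GGGGS" linker used here is an assumption for illustration only, as is placing a linker between the B-cell immunogen and the first epitope. The order (B-cell immunogen first, T-cell immunogen at its C-terminus) follows claim 21.

```python
def assemble_chimera(b_immunogen, epitopes, linker="GGGGS"):
    """Join epitopes with a linker and append them to the B-cell immunogen.

    linker="GGGGS" is an illustrative default, not taken from the source.
    """
    if not epitopes:
        raise ValueError("at least one CD4-CD8 bispecific epitope is required")
    # T-cell immunogen: the epitopes joined end to end through the linker.
    t_immunogen = linker.join(epitopes)
    # The C-terminus of the B-cell immunogen is linked to the N-terminus of the T-cell part.
    return b_immunogen + linker + t_immunogen

# Placeholder strings, not real protein sequences.
chimera = assemble_chimera("BBBB", ["EEE", "FFF"])
```

The resulting protein string would then be back-translated, with whatever codon optimisation is chosen, to obtain the DNA or mRNA sequence.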

46. The method as claimed in claim 1, wherein the nucleic acid sequence is one of a DNA sequence or an mRNA sequence.

49. The method as claimed in claim 1, wherein each of the at least one CD4-CD8 bi-specific epitopes has a length of about 15-40 amino acids.

50. The method as claimed in claim 1, wherein the evolutionary relationships correspond to phylogenetic relationship.

51. A system for generating a nucleic acid sequence encoding a chimeric immunogen, comprising:
a processor; and
a memory communicatively coupled to the processor, wherein the memory stores processor instructions, which, on execution, cause the processor to:
- classify sequences corresponding to a plurality of lineages/clades of a pathogen into a plurality of groups based on respective evolutionary relationships;
- perform one of, generation of a final consensus sequence for B-cell immunogen and for prediction of CD4 epitopes and CD8 epitopes or generation of the final consensus sequence individually for each of, the B-cell immunogen and for the prediction of the CD4 epitopes, and the CD8 epitopes, using a multi-level consensus technique on the classified sequences, wherein the final consensus sequence is indicative of evolutionary average representing each lineage of the plurality of lineages of the pathogen; and
- identify one or more regions containing at least one CD4-CD8 bispecific epitope from the predicted CD4 epitopes and the CD8 epitopes, wherein the at least one CD4-CD8 bispecific epitope is combined with the B-cell immunogen for generating a nucleic acid sequence encoding a chimeric immunogen.

Documents

Application Documents

#	Name	Date
1	202441071302-STATEMENT OF UNDERTAKING (FORM 3) [20-09-2024(online)].pdf	2024-09-20
3	202441071302-Sequence Listing in PDF [20-09-2024(online)].pdf	2024-09-20
4	202441071302-POWER OF AUTHORITY [20-09-2024(online)].pdf	2024-09-20
5	202441071302-FORM FOR SMALL ENTITY(FORM-28) [20-09-2024(online)].pdf	2024-09-20
6	202441071302-FORM 1 [20-09-2024(online)].pdf	2024-09-20
7	202441071302-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [20-09-2024(online)].pdf	2024-09-20
8	202441071302-EDUCATIONAL INSTITUTION(S) [20-09-2024(online)].pdf	2024-09-20
9	202441071302-DRAWINGS [20-09-2024(online)].pdf	2024-09-20
10	202441071302-DECLARATION OF INVENTORSHIP (FORM 5) [20-09-2024(online)].pdf	2024-09-20
11	202441071302-COMPLETE SPECIFICATION [20-09-2024(online)].pdf	2024-09-20
12	202441071302-FORM-9 [23-09-2024(online)].pdf	2024-09-23
13	202441071302-FORM-8 [23-09-2024(online)].pdf	2024-09-23
14	202441071302-FORM 18A [23-09-2024(online)].pdf	2024-09-23
15	202441071302-EVIDENCE OF ELIGIBILTY RULE 24C1f [23-09-2024(online)].pdf	2024-09-23
16	202441071302-Proof of Right [01-10-2024(online)].pdf	2024-10-01
17	202441071302-FORM-26 [26-07-2025(online)].pdf	2025-07-26
18	202441071302-FER.pdf	2025-07-30
19	202441071302-FORM 3 [22-09-2025(online)].pdf	2025-09-22

Search Strategy

1	202441071302_SearchStrategyNew_E_Untitled2E_29-07-2025.pdf