
Mitochondrial DNA mutations in Zoroastrian Parsi haplogroups - Implications of mitochondrial signatures in cancers, neurodegenerative disorders, metabolic and rare diseases

Abstract: The waves of pastoralist migrations from Airyanem Vaejah, the ancient homeland of the present-day Zoroastrian-Parsis in the Hyperborean regions of Northern Siberia and the islands of the circumpolar regions [1], led these Indo-Europeans to settle in the Eurasian Steppes [2] and later, as Indo-Iranians, in the Fertile Crescent [3]. From then, the Zoroastrian Achaemenids (550 - 331 BC) and later the Sassanids (224 AD - 642 AD) established the mighty Persian Empires [2]. The Arab invasion of Persia in 642 AD necessitated the migration of Zoroastrians from Pars and Khorasan through the island of Hormuz to India, where they settled as Parsis and practiced their faith, Zoroastrianism. Endogamy became a dogma, and the community has maintained this practice since its arrival in India. Fire is the medium of worship [4], as it is considered pure and sacrosanct. Strict measures continue to be employed to maintain the purity of fire; hence social ostracism continues to be practiced against smokers, resulting in an endogamous, non-smoking community that forms a unique basis for our study. In order to gain a clearer understanding of the historically recorded migration of the Zoroastrian-Parsis from Persia to India, to decipher their phylogenetic relationships and to understand disease associations with their individual mitochondrial genomes, we generated the first complete de novo Zoroastrian-Parsi Mitochondrial Reference Genome (AGENOME-ZPMRG-Hv2a-1). HV2a is the single largest sub-haplogroup represented among the 100 Parsis and includes the de novo AGENOME-ZPMRG-HV2a-1. We analyzed the 100 mitochondrial genomes of the Zoroastrian-Parsis of India for their haplogroups and generated haplogroup-specific Mitochondrial Reference Genomes: AGENOME-ZPMRG-HV, U, T, M, A, F and Z V1.0.
For the first time, our study has assembled the Mitochondrial Consensus Genome for the Zoroastrian-Parsis of India: the Zoroastrian Parsi Mitochondrial Consensus Genome (AGENOME-ZPMCG V 1.0). Phylogenetic analysis demonstrated that the Zoroastrian-Parsis of India are largely of Persian descent. Our analysis of the 420 synonymous and non-synonymous SNPs across the coding regions, tRNA genes and D-loop region of the mitochondrial genomes revealed haplogroup-specific longevity traits and disease associations for Parkinson’s, Alzheimer’s, cancers and rare diseases. We conclude that the endogamous Zoroastrian-Parsis of India represent a genetically validated non-smoking group characterized by the lack of lung cancer specific haplogroup associations and the absence of mutational signatures for tobacco-induced cancers across all samples. We have generated a unique repository/database, AVESTAMITOME™, of the mitochondrial genome sequences of the rapidly declining Zoroastrian-Parsi community of India, a small but unique endogamous, non-smoking community in which disease signatures can be investigated against the backdrop of generations of endogamy, providing exceptional opportunities to understand and mitigate disease. Analysis of the mitochondrial variant data from the WGS of 100 Parsi subjects against the revised Cambridge Reference Sequence (rCRS) standard, using pipeline filter systems, resulted in 420 unique variants. Further analysis with VarDiG® revealed 217 unique variants associated with 41 disease phenotypes, which were classified according to the 7 major haplogroups and their 25 sub-haplogroups.
The present invention relates to seven principal mitochondrial haplogroups, HV, U, T, M, A2v, Z1a and F1g, found in the sequences of the 100 genomes analyzed, from 50 women and 50 men. The present invention relates to human diseases, and more specifically to compositions and methods for detecting predisposition to diseases by detecting single nucleotide polymorphisms in mitochondrial DNA from the Parsi cohort [34]. The invention provides identification and mapping of 420 SNPs throughout the human mitochondrial genome of the non-smoking Parsi cohort. The invention relates to the association of mitochondrial haplogroups with human diseases and to the role of mitochondrial genes in various human diseases. This mapping provides the method, process, and composition to isolate and identify genes that are relevant to the predisposition, prevention, causation, or treatment of human disease conditions. The present invention relates to mitochondrial rare diseases such as Leigh Disease, Mitochondrial Encephalomyopathies, Mitochondrial Myopathies, MELAS Syndrome, MERRF Syndrome, Chronic Progressive External Ophthalmoplegia and Cytochrome-c Oxidase Deficiency, attesting to a role for mitochondrial SNPs in rare diseases that occur at a high frequency in the Parsi population.


Patent Information

Application #
Filing Date
02 June 2020
Publication Number
49/2021
Publication Type
INA
Invention Field
BIOTECHNOLOGY
Status
Email
sheethalsprakash@gmail.com
Parent Application

Applicants

Avesthagen Limited
Avesthagen Limited, THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India
Agenome LLC
Agenome LLC, Regus, 15 N, Main Street, Suite 100, West Hartford, CT - 06107, USA
Avestagenome Project International Private Limited
THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India

Inventors

1. Villoo Morawala Patell
Avesthagen Limited, THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India
2. Naseer Pasha
Avesthagen Limited, THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India
3. Kashyap Krishnasamy
Avesthagen Limited, THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India
4. Bharti Mittal
Avesthagen Limited, THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India
5. Chellappa Gopalakrishnan
Avesthagen Limited, THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India
6. Naveen Sharma
Avesthagen Limited, THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India
7. Renuka Jain
Avesthagen Limited, THE dry lab, Yolee Grande, 2nd Floor, Pottery Road, Richard’s Town, Bangalore, 560005, Karnataka, India

Specification

Claims:

1. A method of determining the first complete de novo Zoroastrian Parsi Mitochondrial Reference Genome (ZPMRG-Hv2a-1).
2. The method according to Claim 1, wherein the analysis of the 100 mitochondrial genome sequences, from 50 women and 50 men, found seven principal mitochondrial haplogroups, HV, U, T, M, A, Z and F, that further split into 25 sub-haplogroups.
3. The method according to Claims 1 and 2, wherein individual reference genomes were generated for each haplogroup and their disease profiles were further analyzed in depth, and wherein the assembly and curation of one hundred Parsi mtDNA genome sequences contributed to building the Zoroastrian Parsi Mitochondrial Consensus Genome (AGENOME-ZPMCG V1.0) for the first time in the world.
4. The method according to Claims 1 to 3, wherein phylogenetic analysis of the 100 mitochondrial genomes and their haplogroups revealed that the present-day Zoroastrian-Parsi sub-haplogroup HV2a (n=14) is Persian and Qashqai in origin.
5. The method according to Claims 1 to 4, wherein the sub-haplogroups were associated as follows: HV12b (n=1) with Persian, Qashqai, Mazandarani; U4b (n=11) with Persian, Khorasani, Qashqai; U2e (n=3) with Persian, Qashqai, Azeri; U7a (n=6) with Persian, Kurd, Tajik; U1a (n=1) with Persian, Armenians; T1a (n=2), T2b (n=1), T2g (n=1), T2i (n=1) with Persians; A2v (n=3) with Persians; F1g (n=1) with Kurd and Turkmen; Z1a (n=1) with Qashqai, Persian; M5a (n=2) with Persian; M33a (n=1) with Azeris; M52b (n=9), M27b (n=1) with Shia Muslims of Persian origin; M24a (n=8) with Persian, Qashqai; M3a (n=8), M4a (n=1), M35b (n=1) with Persians. M30d (n=10) and M39b (n=9) form unique clusters, with one M30d clustering with Bhovi, Kuruva and Brahmin Iyengar, a class in India with Indo-European roots. M2a (n=2) clustered with the relic tribes Lambadi (Caucasoid origin), Hill Kolam, Katkari and Dongri Bhil of Indian origin, while M2b (n=1) clustered with the relic tribes Korku and Hill Kolam of Indian origin.
6. The method according to Claims 1 to 5, wherein these primary associations attest to the migration of the Zoroastrian-Parsis from western Iran, central and southeast Europe, to their historically documented migration from Central Asia and the Iranian plateau, and to their long-held practice of endogamy.
7. The method according to Claims 1 to 6, wherein the analysis of the mitochondrial variant data from the whole genome sequences (WGS) of 100 Parsi subjects against the revised Cambridge Reference Sequence (rCRS) standard, using pipeline filter systems, resulted in 420 unique variants.
8. The method according to Claims 1 to 7, wherein the analysis with VarDiG® revealed 217 unique variants associated with 41 disease phenotypes, which were classified according to the 7 major haplogroups and their 25 sub-haplogroups.
9. The method according to Claims 1 to 8, wherein a high representation of Parkinson’s disease was found in most haplogroups. The variants linked to longevity were found linked to Parkinson’s, Alzheimer’s, breast cancer and cardiomyopathy in 23/25 sub-haplogroups (HV2a, U7a, U4b, T1a, T2g, T2i, T2b, M5a, M39b, M33a, M52b, M24a, M3a, M30d, M2a, M4a, M2b, M35b, M27b, A2v, F1g and Z1a). Longevity variants were absent in 2/25 sub-haplogroups (HV12b and U1a).
10. The method according to Claims 1 to 9, wherein the analysis revealed linkages to colon cancer in 13/23 longevity-linked sub-haplogroups (A2v, F1g, HV2a, M24a, M27b, M2b, M35b, M5a, T2b, U1a, U2e, U7a, Z1a). The variant for colon cancer was missing in 10/23 sub-haplogroups belonging to M39b, M33a, M52b, M3a, M30d, M2a, M4a, T1a, T2g, T2i and T2b.
11. The method according to Claims 1 to 10, wherein there were two outlier sub-haplogroups: U1a was associated with all diseases except longevity, while HV12b was associated with neither colon cancer nor longevity.
12. The method according to Claims 1 to 11, wherein 17 SNPs were present in the tRNA genes, suggesting a role in pathogenicity, as such SNPs are known to be an important cause of human morbidity associated with a wide range of pathologies.
13. The method according to Claims 1 to 12, wherein 6 SNPs in the D-loop region, known to be critical for its role in replication and expression of the mitochondrial genome, were present across all the sub-haplogroups except HV2a, A2v and M39b.
14. The method according to Claims 1 to 13, wherein SNPs in the 5 sub-haplogroups M2b, U4b, U2e, U1a and U7a were associated with rare diseases such as Leigh Disease, Mitochondrial Encephalomyopathies, MELAS Syndrome, MERRF Syndrome, Chronic Progressive External Ophthalmoplegia and Cytochrome-c Oxidase Deficiency, attesting to a role for mitochondrial SNPs in rare diseases that occur at a high frequency in the endogamous Parsi population.
15. The method according to Claims 1 to 14, wherein the mutational signature of C>A transversions, a direct mutational consequence of misreplication of DNA damage induced by tobacco carcinogens [36], was absent in the Zoroastrian-Parsi cohort.
16. The method according to Claims 1 to 15, wherein additional disease association mapping revealed no lung cancer associated haplogroup in this population study, thereby validating its authenticity as an endogamous, non-smoking group.
17. The method according to Claims 1 to 16, wherein a unique repository/database, AVESTAMITOME™, of the mitochondrial genome sequences of the rapidly declining Zoroastrian-Parsi community of India was generated, this being a small but unique endogamous, non-smoking community in which disease signatures can be investigated against the backdrop of generations of endogamy, providing exceptional opportunities to understand and mitigate disease.
18. The method according to Claims 1 to 17, wherein the 420 variants in the Zoroastrian-Parsi community were compared with MITOMASTER (PMID: 18566966), a database that contains all known pathogenic mtDNA mutations and common haplogroup polymorphisms, to identify unique SNPs in the population that have not been reported previously.
19. The method according to Claims 1 to 18, wherein the variants showed the presence of 12 unique SNPs, distributed across 27 subjects, that were not observed in MITOMASTER or in the VarDiG® disease association dataset.
20. A method according to Claims 1 to 19, wherein it is used with analysis of kit or test results for a biological sample, the method comprising: selecting or producing a test kit suitable for performing a test on a biological sample, the test kit comprising one or more biomarkers for one or more areas of interest to the user, and providing the test kit to the user; applying the biological sample to the test kit in order to generate test results dependent upon said biological marker(s);
21. A method according to Claims 1 to 20, wherein the above test is used in a device comprising: a memory storing a database of product codes and associated product recommendations derived from personalized genetic information; a product code reader for reading a product code from a product; a processor for using a read product code to perform a look-up in the database to obtain a product recommendation for the associated product; and an indicator for providing an indication of the obtained product recommendation to a wearer of the device.
22. The method according to Claims 1 to 21 for identifying an individual who has an altered risk for developing diseases, comprising detecting a single nucleotide polymorphism (SNP) in any one of the mitochondrial genome sequences in said individual’s nucleic acids, wherein the presence of the SNP is correlated with an altered risk for disease.
23. The method according to Claims 1 to 22, in which the altered risk is an increased risk.
24. The method according to Claims 1 to 23 for detecting predisposition to diseases by detecting single nucleotide polymorphisms in mitochondrial DNA.

25. The method according to Claims 1 to 23 for diagnosis of diseases by detecting single nucleotide polymorphisms in mitochondrial DNA.
26. The method according to Claims 1 to 23 for prognosis of diseases by detecting single nucleotide polymorphisms in mitochondrial DNA.

Description:

FORM 2
The Patents Act, 1970
(39 of 1970)
&

The Patents Rules, 2003

COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of Invention
Mitochondrial DNA mutations in Zoroastrian Parsi haplogroups - Implications of mitochondrial signatures in cancers, neurodegenerative disorders, metabolic and rare diseases
Applicant Name:

1. AGENOME LLC
Regus, 15 N, Main Street, Suite 100,
West Hartford, CT - 06107, USA
Email : villoo@avesthagen.com
M No. +91-9886037291

2. Avesthagen Limited,
THE dry lab, Yolee Grande,
2nd Floor, Pottery Road, Richard’s Town,
Bangalore, 560005, Karnataka, India
Email : villoo@avesthagen.com
M No. +91-9886037291

3. Avestagenome Project™ International Private Limited
THE dry lab, Yolee Grande, 2nd Floor,
Pottery Road, Richard’s Town,
Bangalore, 560005, Karnataka, India
Email : villoo@avesthagen.com
M No. +91-9886037291

Name and Address of the Applicant’s Patent Agent/Patent Counsel:
Mr. HARIKRISHNA S HOLLA, Holla Associates-Advocates and IPR
Consultants, #193, ‘Kashi Bhavan’, 6th Cross, Gandhi Nagar
Bangalore 560 009, Karnataka State, India
Tel: +91-80-2228 0778; +91-80-2228 0778; Cell: +91-98440 36805
E-mail: hariholla@gmail.com

The following specification particularly describes and ascertains the invention and the manner in which it is to be performed.

FIELD OF THE INVENTION

The present invention relates to the first complete de novo Zoroastrian Parsi Mitochondrial Reference Genome (ZPMRG-Hv2a-1).

The present invention relates to the analysis of the 100 mitochondrial genome sequences, from 50 women and 50 men, in which seven principal mitochondrial haplogroups, HV, U, T, M, A, Z and F, were found, further splitting into 25 sub-haplogroups.

The present invention relates to individual reference genomes for each haplogroup and to the in-depth analysis of their disease profiles.

The present invention relates to the assembly and curation of one hundred Parsi mtDNA genome sequences, which contributed to building the Zoroastrian Parsi Mitochondrial Consensus Genome (AGENOME-ZPMCG V1.0) for the first time in the world.
The present invention relates to the phylogenetic analysis of the 100 mitochondrial genomes and their haplogroups, which revealed that the present-day Zoroastrian-Parsi sub-haplogroup HV2a (n=14) is Persian and Qashqai in origin.
The present invention relates to the following associations of the sub-haplogroups: HV12b (n=1) with Persian, Qashqai, Mazandarani; U4b (n=11) with Persian, Khorasani, Qashqai; U2e (n=3) with Persian, Qashqai, Azeri; U7a (n=6) with Persian, Kurd, Tajik; U1a (n=1) with Persian, Armenians; T1a (n=2), T2b (n=1), T2g (n=1), T2i (n=1) with Persians; A2v (n=3) with Persians; F1g (n=1) with Kurd and Turkmen; Z1a (n=1) with Qashqai, Persian; M5a (n=2) with Persian; M33a (n=1) with Azeris; M52b (n=9), M27b (n=1) with Shia Muslims of Persian origin; M24a (n=8) with Persian, Qashqai; M3a (n=8), M4a (n=1), M35b (n=1) with Persians. M30d (n=10) and M39b (n=9) form unique clusters, with one M30d clustering with Bhovi, Kuruva and Brahmin Iyengar, a class in India with Indo-European roots. M2a (n=2) clustered with the relic tribes Lambadi (Caucasoid origin), Hill Kolam, Katkari and Dongri Bhil of Indian origin, while M2b (n=1) clustered with the relic tribes Korku and Hill Kolam of Indian origin.
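As an illustration only, the sub-haplogroup sample counts stated in the two preceding paragraphs can be tallied by major haplogroup. The sketch below is not part of the described pipeline; the function names and the rule of taking the leading capital letters as the major haplogroup are assumptions made for this example. As transcribed here, the listed counts cover 25 sub-haplogroups and add up to 99 samples.

```python
import re
from collections import Counter

# Sample counts per sub-haplogroup, transcribed from the text above.
SUB_HAPLOGROUP_COUNTS = {
    "HV2a": 14, "HV12b": 1,
    "U4b": 11, "U2e": 3, "U7a": 6, "U1a": 1,
    "T1a": 2, "T2b": 1, "T2g": 1, "T2i": 1,
    "A2v": 3, "F1g": 1, "Z1a": 1,
    "M5a": 2, "M33a": 1, "M52b": 9, "M27b": 1, "M24a": 8,
    "M3a": 8, "M4a": 1, "M35b": 1, "M30d": 10, "M39b": 9,
    "M2a": 2, "M2b": 1,
}


def major_haplogroup(sub_haplogroup: str) -> str:
    """Return the leading capital letters of a label, e.g. 'HV2a' -> 'HV'.

    This naming rule is an assumption for illustration.
    """
    match = re.match(r"[A-Z]+", sub_haplogroup)
    if match is None:
        raise ValueError(f"unexpected haplogroup label: {sub_haplogroup!r}")
    return match.group(0)


def tally_by_major(counts: dict) -> dict:
    """Sum sub-haplogroup sample counts per major haplogroup."""
    totals = Counter()
    for sub, n in counts.items():
        totals[major_haplogroup(sub)] += n
    return dict(totals)


if __name__ == "__main__":
    for group, n in sorted(tally_by_major(SUB_HAPLOGROUP_COUNTS).items()):
        print(group, n)
```

Running the tally gives one total per major haplogroup (HV, U, T, M, A, F and Z), which can be checked against the per-haplogroup figures reported elsewhere in the specification.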
The present invention relates to these primary associations, which attest to the migration of the Zoroastrian-Parsis from western Iran, central and southeast Europe, to their historically documented migration from Central Asia and the Iranian plateau, and to their long-held practice of endogamy.
The present invention relates to the analysis of the mitochondrial variant data from the whole genome sequences (WGS) of 100 Parsi subjects against the revised Cambridge Reference Sequence (rCRS) standard, using pipeline filter systems, which resulted in 420 unique variants.
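The idea of collecting unique variants relative to a reference sequence can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: it assumes every sample has already been aligned to the reference at equal length with no insertions or deletions, it uses short toy sequences rather than the rCRS, and the names `call_snvs` and `unique_variants` are hypothetical.

```python
BASES = set("ACGT")


def call_snvs(reference: str, sample: str) -> list:
    """List single-nucleotide differences as (1-based position, ref, alt).

    Assumes the sample is pre-aligned to the reference at equal length;
    positions where either base is not A/C/G/T (e.g. 'N') are skipped.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    for pos, (ref, alt) in enumerate(zip(reference.upper(), sample.upper()), start=1):
        if ref != alt and ref in BASES and alt in BASES:
            variants.append((pos, ref, alt))
    return variants


def unique_variants(reference: str, samples: dict) -> list:
    """Pool the variants of all samples and keep each distinct one once."""
    seen = set()
    for sequence in samples.values():
        seen.update(call_snvs(reference, sequence))
    return sorted(seen)


if __name__ == "__main__":
    # Toy data for illustration only (not real mtDNA sequences).
    toy_reference = "ACGTACGTAC"
    toy_samples = {
        "subject_1": "ACGTACGTAC",  # identical to the reference
        "subject_2": "ACATACGTAC",  # G>A at position 3
        "subject_3": "ACATACGTTC",  # G>A at position 3, A>T at position 9
    }
    print(unique_variants(toy_reference, toy_samples))
```

A variant shared by several subjects is counted once, which is the sense in which a cohort-level list of unique variants is smaller than the sum of per-subject variant lists.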
The present invention relates to the analysis with VarDiG®, which revealed 217 unique variants associated with 41 disease phenotypes, classified according to the 7 major haplogroups and their 25 sub-haplogroups.
The present invention relates to a high representation of Parkinson’s disease in most haplogroups. The variants linked to longevity were found linked to Parkinson’s, Alzheimer’s, breast cancer and cardiomyopathy in 23/25 sub-haplogroups (HV2a, U7a, U4b, T1a, T2g, T2i, T2b, M5a, M39b, M33a, M52b, M24a, M3a, M30d, M2a, M4a, M2b, M35b, M27b, A2v, F1g and Z1a). Longevity variants were absent in 2/25 sub-haplogroups (HV12b and U1a).
The present invention further relates to the analysis that revealed linkages to colon cancer in 13/23 longevity-linked sub-haplogroups (A2v, F1g, HV2a, M24a, M27b, M2b, M35b, M5a, T2b, U1a, U2e, U7a, Z1a). The variant for colon cancer was missing in 10/23 sub-haplogroups belonging to M39b, M33a, M52b, M3a, M30d, M2a, M4a, T1a, T2g, T2i and T2b.
The present invention relates to the two outlier sub-haplogroups: U1a was associated with all diseases except longevity, while HV12b was associated with neither colon cancer nor longevity.
The present invention relates to the presence of 17 SNPs in the tRNA genes, suggesting a role in pathogenicity, as such SNPs are known to be an important cause of human morbidity associated with a wide range of pathologies.
The present invention relates to 6 SNPs in the D-loop region, known to be critical for its role in replication and expression of the mitochondrial genome, which were present across all the sub-haplogroups except HV2a, A2v and M39b.
The present invention relates to SNPs in the 5 sub-haplogroups M2b, U4b, U2e, U1a and U7a that were associated with rare diseases such as Leigh Disease, Mitochondrial Encephalomyopathies, MELAS Syndrome, MERRF Syndrome, Chronic Progressive External Ophthalmoplegia and Cytochrome-c Oxidase Deficiency, attesting to a role for mitochondrial SNPs in rare diseases that occur at a high frequency in the endogamous Parsi population.
The present invention relates to the mutational signature of C>A transversions, a direct mutational consequence of mis-replication of DNA damage induced by tobacco carcinogens [36], which was absent in the Zoroastrian-Parsi cohort.
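For reference, a C>A change replaces a pyrimidine with a purine, which places it in the transversion class of substitutions. The sketch below shows one way to classify substitutions and to count a given change in a variant list. It is an illustration under stated assumptions, not the analysis performed in the study: the variant tuples are made-up examples, and the function names and the optional pooling of the complementary change (G>T) are choices made for this example.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a base change as 'transition' or 'transversion'.

    Transitions stay within purines (A<->G) or pyrimidines (C<->T);
    transversions swap a purine for a pyrimidine or vice versa.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_chemical_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_chemical_class else "transversion"


def count_change(variants, ref: str, alt: str, include_complement: bool = False) -> int:
    """Count ref>alt among (position, ref, alt) tuples.

    With include_complement=True the complementary-strand equivalent
    (e.g. G>T for C>A) is counted as well.
    """
    targets = {(ref, alt)}
    if include_complement:
        targets.add((COMPLEMENT[ref], COMPLEMENT[alt]))
    return sum(1 for _, r, a in variants if (r, a) in targets)


if __name__ == "__main__":
    # Hypothetical variants for illustration only.
    toy_variants = [(10, "A", "G"), (25, "T", "C"), (40, "C", "A"), (55, "G", "T")]
    print(substitution_class("C", "A"))
    print(count_change(toy_variants, "C", "A"))
    print(count_change(toy_variants, "C", "A", include_complement=True))
```

Applied to a cohort's variant list, a count of zero for the chosen change would correspond to the absence of that signature as described above.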
The present invention relates to additional disease association mapping, which revealed no lung cancer associated haplogroup in this population study, thereby validating its authenticity as an endogamous, non-smoking group.
The present invention relates to the endogamous Zoroastrian-Parsis of India, who represent a genetically validated non-smoking group characterized by the lack of lung cancer specific haplogroup associations and the absence of mutational signatures for tobacco-induced cancers across all samples.
The present invention relates to phylogenetic analysis demonstrating that the Zoroastrian-Parsis of India are largely of Persian descent. Our analysis of the 420 synonymous and non-synonymous SNPs across the coding regions, tRNA genes and D-loop region of the mitochondrial genomes revealed haplogroup-specific longevity traits and disease associations for Parkinson’s, Alzheimer’s, cancers and rare diseases.
The present invention relates to a unique repository/database, AVESTAMITOME™, of the mitochondrial genome sequences of the rapidly declining Zoroastrian-Parsi community of India, a small but unique endogamous, non-smoking community in which disease signatures can be investigated against the backdrop of generations of endogamy, providing exceptional opportunities to understand and mitigate disease.

OBJECTIVE OF THE INVENTION
In this study, our first aim was to gain a clear understanding of the impact, at the genetic level, of the historically recorded migration of the Zoroastrian-Parsis from Persia to India, and to link socio-cultural and ritualistic practices followed over several millennia to their genetic outcomes. We also aimed to shed further light on the impact of migrations followed by integration into communities where ritual and social practices are strictly followed within and with outside communities, and to understand the gene flows between the present-day Iranians, Persians, Indians and the Zoroastrian-Parsis that result in specific traceable signatures.

Secondly, we sought to throw light on the genetic basis of commonly occurring diseases in this endogamous community. To address these issues, we generated the first complete de novo Zoroastrian-Parsi Mitochondrial Genome (ZPMRG-Hv2a-1) and used it to elucidate the mitochondrial haplotype specific Reference Genomes from a hundred Zoroastrian-Parsi individuals. Our study for the first time has assembled the Zoroastrian-Parsi Mitochondrial Consensus Genome (AGENOME-ZPMCG V 1.0) thus creating a mitochondrial reference standard for the endogamous community.

Phylogenetic analysis confirmed that present-day Zoroastrian-Parsis are closely related to Persians and Iranians, attesting to their practice of endogamy. Endogamous communities have comparatively lower genetic diversity and tend to be predisposed to several autosomal recessive and other rare genetic diseases. While the Zoroastrian-Parsis appear to be disproportionately affected by certain diseases such as prostate and breast cancers [5,11], Parkinson’s disease and Alzheimer’s disease, they also possess longevity as a trait and are a long-lived community [6] with lower incidences of lung [12] and head and neck cancers. Reconstruction of the genealogical history and genetics of a close-knit community like the Parsis provides a rare and unique opportunity to better understand wellness and disease. We further demonstrate, for the first time, a genetic association with the commonly occurring diseases in the different haplotypes found in the Zoroastrian Parsi community.

SUMMARY OF THE INVENTION
Zoroastrian-Parsis are a small community of fewer than 45,000 in India (2011 Census, Govt of India). We present the genetic data of the conserved Zoroastrian-Parsi mitochondrion, encapsulated in the resilience of thousands of years of magnificent history: of struggles and overcoming them; of building something out of nothingness; of achievement gained with ethical standards; and philanthropy.
In recent decades, the analysis of the variability of maternally inherited mitochondrial DNA (mtDNA) has been commonly used to reconstruct the history of ethnic groups, especially with respect to maternal inheritance. The lack of genetic recombination in mtDNA results in the accumulation of maternally inherited single nucleotide polymorphisms (SNPs). The accumulation of SNPs along maternally inherited lineages results in phylogenetically traceable haplotypes [15], which can be used to follow maternal genealogies both historically and geographically. This approach has provided insightful findings into the origins and disease etiologies associated with another well-documented endogamous European community: the Icelandic people [6].

Human mtDNA (mitochondrial DNA) is a double-stranded, circular genome (16,569 bp) of bacterial origin [16], primarily encoding vital subunits of the energy-generating oxidative phosphorylation and electron transport chain (ETC) pathway that generates adenosine triphosphate (ATP), the primary energy substrate of the eukaryotic cell. In addition, 22 tRNAs and 2 rRNAs are also encoded by the mtDNA [8].


BACKGROUND OF THE INVENTION

The Burden of History- Travelogue of the Zoroastrian Mitochondrion
Zoroastrian-Parsis of India are followers of the ancient prophet Zarathushtra, claimed by the Greek historian Herodotus to have been born circa 6,450 BC [1]. Zarathushtra advocated the first known monotheistic concept of one supreme intelligence, termed Ahura Mazda, the ‘Majestic Creator’ [2].
The ancient homeland of the present-day Zoroastrian-Parsis finds mention in their sacred Avestan text, the Vendidad, and the location indicated is the North Polar Arctic region [3]. The study Arctic Home in the Vedas by the Sanskrit scholar B. G. Tilak, corroborated by Bennet, suggests that the Indo-European culture originated in the Hyperborean regions of Northern Siberia and the islands of the circumpolar regions [4].
Around 12,000 years ago, this region suffered a natural calamity and became ice-clad [1], necessitating southward migrations of its pastoralist inhabitants, and by 4,000 BC the Indo-Europeans had taken over the Eurasian Steppe [7].
From the late second to early first millennium BC, the Indo-Europeans split, mostly on the basis of religious worship: the Indo-Aryans moved further south and crossed the Hindu Kush, while the Indo-Iranians (the Maad (Medes), Paars (Persians) and Parthay (Parthians)) began populating the western portion of the Iranian plateau, close to the Alborz and Zagros Mountains, and northern Mesopotamia to Southeast Anatolia, in what is called the Fertile Crescent, where significant innovation in agriculture occurred [8].
In 550 BC, Cyrus the Great overthrew the leading Median rule and conquered the Kingdom of Lydia and the Babylonian Empire, after which he established the Persian Zoroastrian Achaemenid Empire (the First Persian Empire). His successors Dariush I (522-485 BC), Xerxes I, Artaxerxes and others extended its borders to encompass several countries across three continents, namely Europe, Africa and Asia, stretching from the Balkans and Eastern Europe proper in the west to the Indus Valley in the east. It was followed by a second Zoroastrian dynasty of Sassanian kings, who ruled Persia starting with Ardashir I (224 AD). This was the golden age of the Persian Empire, and during this period the Zoroastrian scriptures were codified and written down. During the time of the Zoroastrian Achaemenid and Sassanid empires, Persia became a global hub of culture, religion, science, art, gender equality, and technology.
The Persians under Yezdezard III were defeated by the Arab invasion in two decisive battles (Qadisiyah, 636 AD, and Nahavand, 642 AD), resulting in the fall of the Zoroastrian Persian Empire.

It was almost a hundred years later in the 8th century that a few boatloads of Zoroastrians left Paars and Khorasan from the port of Hormuz to sail south towards India. The boats first touched shore on Diu Island on the west coast of India where the refugees stayed for around 19 years.

The environment being non-conducive to progress, they once again set sail and arrived in Sanjan, Gujarat. Vijayaditya of the Chalukya dynasty (also known as Jadi Rana), the ruler, hesitated to give refuge, but once the principles of Zoroastrianism had been explained to him and he found them similar to the Vedic religion, he granted the Parsis refuge.

Endogamy became the norm to preserve their identity, and for the last 1300 years the community has maintained this practice9,10. Fire, being the purest of all elements, is considered sacred by Zoroastrian-Parsis. Strict measures are employed to maintain the purity of fire, hence the strict social ostracism practiced against smokers in the community.
Today, the Zoroastrian-Parsis are a small community of fewer than about 45,000 in India (2011 Census, Govt of India). We present the genetic data of the conserved Zoroastrian-Parsi mitochondrion, encapsulated in the resilience of thousands of years of magnificent history: of struggles and overcoming them; of building something out of nothingness; of achievement gained with ethical standards; and of philanthropy.
In recent decades, the analysis of the variability of maternally inherited mitochondrial DNA (mtDNA) has been commonly used to reconstruct the history of ethnic groups, especially with respect to maternal inheritance. The lack of genetic recombination in mtDNA results in the accumulation of maternally inherited single nucleotide polymorphisms (SNPs). The accumulation of SNPs along maternally inherited lineages results in phylogenetically traceable haplotypes15, which can be used to follow maternal genealogies both historically and geographically. This approach has provided insightful findings into the origins and disease etiologies associated with another well documented endogamous European community: the Icelandic people (reviewed in 13).

Human mitochondrial DNA (mtDNA) is a double-stranded, circular genome (16,569 bp) of bacterial origin16, primarily encoding vital subunits of the energy-generating oxidative phosphorylation and electron transport chain (ETC) pathway that generates adenosine triphosphate (ATP), the primary energy substrate of the eukaryotic cell. In addition, 22 tRNAs and 2 rRNAs are also encoded by the mtDNA17.

In this study, our first aim was to gain a clear understanding, at the genetic level, of the impact of the historically recorded migration of the Zoroastrian-Parsis from Persia to India, and to link socio-cultural and ritualistic practices maintained over several millennia to their genetic outcomes. We also sought to shed further light on the impact of migration followed by integration into communities in which ritual and social practices are strictly observed, both within the community and with outside communities, and to understand the gene flows between present day Iranians, Persians, Indians and the Zoroastrian-Parsis that resulted in specific traceable signatures.
Secondly, we sought to throw light on the genetic basis of commonly occurring diseases in this endogamous community. To address these issues, we generated the first complete de novo Zoroastrian-Parsi Mitochondrial Genome (ZPMRG-Hv2a-1) and used it to elucidate the mitochondrial haplotype specific Reference Genomes from a hundred Zoroastrian-Parsi individuals. Our study has, for the first time, assembled the Zoroastrian-Parsi Mitochondrial Consensus Genome (AGENOME-ZPMCG V 1.0), thus creating a mitochondrial reference standard for the endogamous community.

Phylogenetic analysis confirmed that present day Zoroastrian-Parsis are closely related to Persians and Iranians, attesting to their practice of endogamy. Endogamous communities have comparatively lower genetic diversity and tend to be predisposed to several autosomal recessive and other rare genetic diseases. While the Zoroastrian-Parsis appear to be disproportionately affected with certain diseases such as prostate and breast cancers5,11, Parkinson’s disease and Alzheimer’s disease, they also possess longevity as a trait and are a long-lived community6 with lower incidences of lung12 and head and neck cancers. Reconstruction of the genealogic history and genetics of a close-knit community like the Parsis provides a rare and unique opportunity to better understand wellness and disease. We further demonstrate, for the first time, a genetic association with the commonly occurring diseases in the different haplotypes found in the Zoroastrian Parsi community.

Detailed Description of the Drawings
Table 1: A) Annotation of the de novo Parsi mitochondrial genome AGENOME-ZPMS-HV2a-1. B) The table indicates the SNPs (n=28) found in the AGENOME-ZPMS-HV2a-1 in relation to the revised Cambridge Reference Sequence (rCRS, Reference base)
Table 2: Distribution of 420 variants across coding regions, D-loop of 100 Parsi mitogenomes: Distribution of SNPs across coding genes, D-loop across all the 25 sub-haplogroups, Phylogenetic analysis of the Parsi mitochondrial haplotypes with those of Iranians and Indians
Table 3: Phylogenetic clustering of complete mitogenomes of Parsis with 352 Iranian and 100 relic tribes of Indian origin: Results of the Phylogenetic clustering of the 100 Parsis mitochondrial genomes with 352 mitochondrial genomes of Iranian origin and 100 mitochondrial genomes of relic tribes of Indian origin through Neighbour Joining method. BS indicates Boot-Strap values between each clade.
Table 4: Variants associated with ZPMRG (n=7) and ZPMCG (n=1) mitochondrial genome sequences: List of variants associated with the Haplogroup specific Zoroastrian Parsi Mitochondrial Reference Genomes for A2v, HV, M, U, T, F1g, Z and variants and the Zoroastrian Parsi Mitochondrial Consensus Genome
Table 5: Variants associated with ZPMRG and unique variants in each ZPMRG compared to ZPMCG: (A) Unique variants found in the haplogroup consensus sequences compared to the Parsi consensus mtDNA sequence (AGENOME-ZPMCG-V1) (B) Histogram listing the exact number of variants in each ZPMRG compared to ZPMCG
Table 6: mt-t-RNA variants in our study and their disease association: Analysis of the occurrence of the 420 variants in the tRNA and their disease associations annotated with the PON-mt-tRNA database. A frequency score =0.5 – pathogenic, =0.5 – likely pathogenic, <0.5 - neutral
Table 7: MITOME dataset and their Genbank Accession Number
Table 8: Accession Number for Zoroastrian Parsi Consensus Genome and Zoroastrian Parsi Reference Genome submitted to Genbank NCBI
Figure 1: Identification of 28 variants in the de novo Parsi mitochondrial genome, AGENOME-ZPMS-HV2a-1, Distribution and classification of SNPs in the AGENOME-ZPMS-HV2a-1, Representative histogram showing the base change, variant type, type of loci and distribution of SNPs across genes in the de novo mitochondrial genome AGENOME-ZPMS-HV2a-1
Figure 2: Validation of variants in the AGENOME-ZPMS-HV2a-1 by Sanger sequencing. Confirmation by Sanger sequencing of variants identified with next-generation sequencing (NGS) data. Sequences obtained from the desired regions were analyzed for the presence of variants. Low-quality bases were trimmed at either end of the sequences, which were then aligned with the reference mitochondrial genome (rCRS). A total of 13 variants from the D-loop and internal region of the mitochondrial genome were verified.
Figure 3 : Equal representation of Males and Females in the 100 Zoroastrian-Parsi whole mitogenome study: Distribution of the subjects classified based on gender and age. The bars on the histogram depict further segmentation of the total number of subjects, Male and Female numbers according to their age range.
Figure 4 : Annotation and distribution of 420 variants across 100 Parsi complete mitogenomes
Figure 5: Identification of 25 sub-haplogroups in the 100 Zoroastrian-Parsi study group. Distribution of Parsis across major haplogroups and sub-haplogroups. The table and the histogram show the distribution of 100 Parsi subjects across 7 major haplogroups and 25 sub-haplogroups
Figure 6: Distribution of variants across haplogroups and demographic classification of the 100 Parsi study group. Distribution across the 100 Zoroastrian-Parsi subjects. (A) Representative graph depicting the distribution of SNP counts across the 7 major haplogroups (B) Table depicting the distribution of the subjects classified based on gender across 25 sub-haplogroups
Figure 7: Lack of haplogroup diversity in the Parsi cohort suggesting endogamy. Comparative analysis of major haplogroup distribution in the Parsis and populations of Iranian ethnicities (Persians, Qashqais) (A) Histogram depicting the analysis of the 7 major haplogroups across Parsis (n=100), Persians (n=180) and Qashqais (n=112) (B) Representative figure showing the diversity of major haplogroups in the Parsis and the Persians
Figure 8: Phylogenetic analysis depicting individual sub-haplogroup clusters of 97 Parsis, 352 Iranian and 100 relic tribes of Indian origin
Representative cladograms of the HV sub-haplogroup
Representative cladograms of the U sub-haplogroup
Representative cladograms of the T, F and A sub-haplogroup
Representative cladograms of the M and Z sub-haplogroup
Figure 8E: Table indicates the number of Zoroastrian-Parsis who cluster with Persians or people of Persian origin, relic tribes of Indian origin. Pie chart indicates the percentage of clustering of the HV2a Zoroastrian-Parsis in the phylogenetic clustering analysis. Circular dendrogram of the complete Phylogenetic clustering analysis of Parsis (Blue clades) with Iranian mitogenomes (Green clades) and Indian mitogenomes (Brown clades).
Figure 9: Lack of smoking induced mutational signatures in the Parsi cohort: Mutational signatures observed in the 100 mitochondrial genomes of Parsis. Graph depicts the quantification of both transitions and transversions on both H&L strands of the 100 mitochondrial genomes of Parsis.
Figure 10: VarDiG®-R analysis of 420 variants indicates high association of Parsi specific variants with Parkinson's disease. Variant-disease distribution of 420 Parsi variants. Graph depicts the variant-disease distribution between Parsis (blue) and VarDiG®-R (brown)
Figure 11: Observation of Longevity variants across all sub-haplogroups and predisposition of U and M haplogroups to diseases: Haplogroup specific distribution of diseases. (A) Distribution of 188 diseases across 25 sub-haplogroups of the 100 Parsi subjects analyzed in this study (B) Histogram depicting longevity and disease prevalence across U1a, M52b, M35b, M27b
Figure 12: PCA analysis shows absence of Longevity variants in U1a and F1g sub-haplogroups Principal Component Analysis of disease associations with sub-haplogroups in the Parsi-Zoroastrian group under study
Figure 13: CYTB gene has the highest occurrence of non-synonymous variants in this study Analysis of the non-synonymous variants within 420 variants in the 100 Parsi mitochondrial genome sequences. The histogram and the table show the location of the non-synonymous variants in the coding gene loci in the mitochondrial genome analysed with MitImpact database
Figure 14: Non-synonymous variants among 420 variants and their disease association Analysis of the non-synonymous variants within 420 variants in the 100 Parsi mitochondrial genome sequences. Functional impact of non-synonymous mutations on disease associations.
Figure 15: Non-synonymous variants among 420 variants and their associations with mitochondrial function. Distribution of non-synonymous variants across coding genes. Analysis was performed on the 420 variants linked to the 100 Parsi mitochondrial genomes.
Figure 16: Gene ontology associated with non-synonymous variants among 420 variants Analysis of non-synonymous mutations and their functional classification, engagement in different pathways respectively using DAVID and UNIPROT annotation tools
Figure 17: 12 unique variants found in the current study Comparative analysis of the 420 variants in the AVESTAMITOME™ Zoroastrian-Parsi community dataset with common and disease associated polymorphisms in MITOMASTER database and VarDiG®-R

Supplementary Tables and Figures
Supplementary Table 1: Description of primers used in validation of AGENOME-ZPMS-HV2a-1 by Sanger sequencing: Table shows the list of primer sequences used for Sanger sequencing for validation of selected variants in the AGENOME-ZPMS-HV2a-1
Supplementary Table 2: Description of primers used in validation of AGENOME-ZPMS-HV2a-1 by Sanger sequencing: Table shows the list of primer sequences used for Sanger sequencing for validation of selected variants in the AGENOME-ZPMS-HV2a-1
Supplementary Figure 1: QC analysis of 100 Zoroastrian-Parsi mitochondrial genome sequences: QC analysis of 100 Parsi mitochondrial genomes (A) Frequency of mean PHRED score per read (150 read length) for 100 mitochondrial samples (B) Frequency of mean PHRED score per sequence for 100 mitochondrial samples
Supplementary Figure 2: Distribution of 420 variants across coding genes normalized for gene length Distribution of 420 variants across coding genes normalized to gene length (variants/gene length (bp))
Supplementary Figure 3A: Sub-haplogroup specific breakdown of 420 variants. Distribution of variants across gene loci in the HV haplogroup consisting of HV2a (n=14 subjects) and HV12b (n=1 subject)
Supplementary Figure 3B: Distribution of Variants across gene loci in the U haplogroup consisting of U1a, U4b, U2e and U7a
Supplementary Figure 3C: Distribution of Variants across gene loci in the T haplogroup consisting of T1a, T2b, T2g and T2i
Supplementary Figure 3D: Distribution of Variants across gene loci in the A, Z and F haplogroup consisting of A2v, Z1a and F1g

BRIEF DESCRIPTION OF THE SEQUENCES – SINGLE NUCLEOTIDE POLYMORPHISM (SNP) POSITIONS IN SAMPLES

DETAILED DESCRIPTION OF THE INVENTION
Materials and Methods
Sample collection and ethics statement
One hundred healthy non-smoking Parsi volunteers residing in the cities of Hyderabad-Secunderabad and Bangalore, India were invited to attend blood collection camps at the Zoroastrian centers in their respective cities under the auspices of The Avestagenome Project™. Each adult participant (>18 years) underwent height and weight measurements and answered an extensive questionnaire designed to capture their medical and life history. All subjects provided written informed consent for the collection of samples and subsequent analysis. All health-related data collected from the cohort questionnaire were secured in The Avestagenome Project™ database to ensure data privacy. All procedures performed in this study involving human participants were in accordance with the ethical standards of the institution (Avesthagen Limited, Bangalore, India) and in line with the 1964 Helsinki declaration and its later amendments. This study has been approved by the Avesthagen Ethics Committee (BLAG-CSP-033).

Example 1
Genomic DNA extraction
Genomic DNA from the buffy coat of peripheral blood was extracted using the Qiagen Whole Blood and Tissue Genomic DNA Extraction kit (cat. #69504). Extracted DNA samples were assessed for quality using the Agilent Tape Station and quantified using the Qubit™ dsDNA BR Assay kit (cat. #Q32850) with the Qubit 2.0® fluorometer (Life Technologies™). Purified DNA was subjected to library preparation and sequencing on both long-read (Nanopore GridION-X5 sequencer, Oxford Nanopore Technologies, Oxford, UK) and short-read (Illumina sequencer) platforms.

Example 2
Library preparation and sequencing on the Nanopore platform
Libraries of long reads from genomic DNA were generated using standard protocols from Oxford Nanopore Technology (ONT) using the SQK-LSK109 ligation sequencing kit. Briefly, 1.5 µg of high-molecular-weight genomic DNA was subjected to end repair using the NEBNext Ultra II End Repair kit (NEB, cat. #E7445) and purified using 1x AmPure beads (Beckman Coulter Life Sciences, cat. #A63880). Sequencing adaptors were ligated using NEB Quick T4 DNA ligase (cat. #M0202S) and purified using 0.6x AmPure beads. The final libraries were eluted in 15 µl of elution buffer. Sequencing was performed on a GridION X5 sequencer (Oxford Nanopore Technologies, Oxford, UK) using a SpotON R9.4 flow cell (FLO-MIN106) in a 48-hr sequencing protocol. Nanopore raw reads (fast5 format) were base called (fastq format) using Guppy v2.3.4 software. Samples were run on two flow cells and generated a dataset of ~14 GB.

Example 3
Library preparation and sequencing on the Illumina platform
Genomic DNA samples were quantified using the Qubit fluorometer. For each sample, 100 ng of DNA was fragmented to an average size of 350 bp by ultrasonication (Covaris ME220 ultrasonicator). DNA sequencing libraries were prepared using dual-index adapters with the TruSeq Nano DNA Library Prep kit (Illumina) as per the manufacturer’s protocol. The amplified libraries were checked on Tape Station (Agilent Technologies), quantified by real-time PCR using the KAPA Library Quantification kit (Roche) with the QuantStudio-7flex Real-Time PCR system (Thermo). Equimolar pools of sequencing libraries were sequenced using S4 flow cells in a Novaseq 6000 sequencer (Illumina) to generate 2 x 150-bp sequencing reads.

Example 4
Generation of the de novo Parsi mitochondrial genome AGENOME-ZPMS-HV2a-1
a) Retrieval of mitochondrial reads from whole-genome sequencing (WGS) data:
A total of 16 GB of raw data (.fast5) was generated from a GridION-X5 Nanopore sequencer for AGENOME-ZPMS-HV2a-1 from WGS. About 320 million paired-end raw reads were generated for AGENOME-ZPMS-HV2a-1 by Illumina sequencing.
Long Nanopore reads (.fastq) were generated from the GridION-X5 samples. The high-quality reads were filtered (PHRED score ≥20) and trimmed for adapters using Porechop (v0.2.3). The high-quality reads were then aligned to the human mitochondrial reference (rCRS) NC_012920.1 using Minimap2 software. The aligned SAM file was then converted to a BAM file using SAMtools. The paired aligned reads from the BAM file were extracted using Picard tools (v1.102).
The short Illumina high-quality reads were filtered (PHRED score ≥30). The adapters were trimmed using Trimgalore (v0.4.4) for both forward and reverse reads. The filtered reads were then aligned against the human mitochondrial reference (rCRS21) using the Bowtie2 (v2.2.5) aligner with default parameters. The mapped SAM file was converted to a BAM file using SAMtools, and the mapped paired reads were extracted using Picard tools (v1.102).
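The PHRED-based read filtering described above can be illustrated with a minimal Python sketch. It is not the filtering logic of Porechop or Trimgalore; it assumes Phred+33 quality encoding and a per-read mean-quality threshold, and the function names (phred_scores, parse_fastq, filter_reads) and the two example reads are hypothetical.

```python
from statistics import mean


def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        seq = next(it)
        next(it)  # skip the '+' separator line
        qual = next(it)
        yield header.strip(), seq.strip(), qual.strip()


def filter_reads(lines, min_mean_q, min_len=1):
    """Keep reads whose length and mean Phred score meet the thresholds."""
    kept = []
    for header, seq, qual in parse_fastq(lines):
        if len(seq) >= min_len and mean(phred_scores(qual)) >= min_mean_q:
            kept.append((header, seq, qual))
    return kept


# Hypothetical reads: 'I' encodes Phred 40, '#' encodes Phred 2.
example = [
    "@read_high", "ACGTACGT", "+", "IIIIIIII",
    "@read_low", "ACGTACGT", "+", "########",
]
kept = filter_reads(example, min_mean_q=30)
print([header for header, _, _ in kept])
```

With a threshold of 30, only the first read is retained; the same function with min_mean_q=20 would mirror the cut-off stated for the Nanopore reads.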

b) De novo mitochondrial genome assembly
Mapped reads were used for the de novo hybrid assembly using the Maryland Super-Read Celera Assembler (MaSuRCA-3.2.8) tool. The configuration file from the MaSuRCA tool was edited by adding appropriate Illumina and Nanopore read files. The tool MaSuRCA uses a hybrid approach that has the computational efficiency of the de Bruijn graph methods and the flexibility of overlap-based assembly strategies. It significantly improves assemblies when the original data are augmented with long reads. AGENOME-ZPMS-HV2a-1 was generated by realigning the mapped mitochondrial reads from Illumina as well as Nanopore data with the initial assembly.

Example 5
Confirmation of the SNPs in the de novo Parsi mitochondrial genome using Sanger sequencing
In order to validate the de novo Parsi mitochondrial sequence, AGENOME-ZPMS-HV2a-1, selected SNPs were identified and subjected to PCR amplification. Genomic DNA (20 ng) was PCR amplified using LongAmpTaq 2X master mix (NEB). The PCR amplicons of 781 bases targeting the hypervariable region of the D-loop region and a second amplification of 500 bases for an internal region (6.6–7.1 Kb) were subjected to Sanger sequencing and BLAST analysis to confirm the presence of eight SNPs using primers listed in Supplemental Table 1.

Example 6
Generation of the Zoroastrian Parsi Mitochondrial Consensus Genome (AGENOME-ZPMCG-V1.0) and Parsi haplogroup specific consensus sequences
a) Retrieving mitochondrial reads from 100 Parsi whole-genome sequences
The whole-genome data from one hundred Parsi samples were processed for quality assessment. Adapters were removed using Trimgalore (v0.4.4) for paired-end reads (R1 and R2); sites with PHRED scores lower than 30 and reads below 20 bp in length were removed. The processed Illumina reads were aligned against the human mitochondrial reference (rCRS18: Revised Cambridge Reference Sequence, NC_012920.1) using the bowtie2 aligner, version 2.4.1, with default parameters. Mapped reads were further used for de novo assembly using SPAdes, version 3.11.1, Velvet and IVA, version 1.0.8. Comparison of the assemblies and statistics were obtained using Quast, version 5.0.2. The assembled scaffolds were subjected to Blastn against the NCBI non-redundant nucleotide database (nt) for validation.
b) Variant calling and haplogroup classification
Sequencing reads were mapped to the human mitochondrial genome assembly (rCRS21) using the MEM algorithm of the Burrows-Wheeler Aligner, version 0.7.17-r1188, with default parameters. Variants were called using samtools, version 1.3.1, to transpose the mapped data into a sorted BAM file and calculate the Bayesian prior probability. Next, bcftools, version 1.10.2, was used to calculate the prior probability distribution and obtain the actual genotype for each variant detected. Haplogroup classification and assignment were performed for each of the 100 Parsi mtDNAs after variant calling, by mapping the reference and alternate alleles to standard haplogroups obtained from MITOMAP (Appendix 4).
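The idea of mapping called variants to haplogroup-defining variants can be sketched as follows. This is a simplified illustration and not the MITOMAP assignment matrix: the scoring rule (fraction of a haplogroup's defining variants found in the sample) is an assumption, and the marker sets and haplogroup labels below are toy placeholders, not real haplogroup definitions.

```python
def assign_haplogroup(sample_variants, markers):
    """Return (haplogroup, score), where score is the fraction of that
    haplogroup's defining variants present in the sample."""
    best, best_score = None, -1.0
    # Iterate in sorted order so ties resolve deterministically by name.
    for name, defining in sorted(markers.items()):
        score = len(defining & sample_variants) / len(defining)
        if score > best_score:
            best, best_score = name, score
    return best, best_score


# Toy marker sets: placeholders only, NOT real MITOMAP or PhyloTree definitions.
TOY_MARKERS = {
    "hgX": {"100A>G", "200C>T", "300G>A"},
    "hgY": {"100A>G", "400T>C"},
}

# Hypothetical variant calls for one sample.
sample = {"100A>G", "200C>T", "300G>A", "999T>C"}
result = assign_haplogroup(sample, TOY_MARKERS)
print(result)
```

Here the sample carries all three placeholder markers of hgX but only one of the two hgY markers, so hgX is returned with a score of 1.0. A production tool would also need to handle back-mutations and nested sub-haplogroups, which this sketch ignores.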
c) Haplogroup based consensus sequence
The 97 of 100 full-length Parsi mtDNA sequences were segregated based on haplogroups and separately aligned using MUSCLE to obtain the multiple sequence alignments. The Zoroastrian Parsi Mitochondrial Reference Genome (ZPMRG) and the Parsi haplogroup specific consensus sequences were generated by calculating the ATGC base frequency in each alignment column, that is, by comparing each nucleotide in a column with the nucleotides called for all other samples at the same position. The base with the highest frequency (%) was taken to build the seven Parsi haplogroup ZPMRG and the seven Parsi haplogroup specific consensus sequences.
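The column-wise highest-frequency rule described above can be sketched in a few lines of Python. The function name column_consensus and the four short aligned sequences are hypothetical; tie handling (the base seen first in the column wins) is an assumption, since the text does not state how ties were resolved.

```python
from collections import Counter


def column_consensus(aligned):
    """Majority-vote consensus over equal-length aligned sequences.

    Returns the consensus string and the frequency of the chosen base
    in each alignment column.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    bases, freqs = [], []
    for column in zip(*aligned):
        counts = Counter(b.upper() for b in column)
        # most_common keeps first-seen order among equal counts.
        base, n = counts.most_common(1)[0]
        bases.append(base)
        freqs.append(n / len(column))
    return "".join(bases), freqs


# Hypothetical alignment of four samples from one haplogroup.
aligned = ["ACGTA", "ACGTT", "ACCTA", "ACGTA"]
consensus, freqs = column_consensus(aligned)
print(consensus, freqs)
```

In this toy alignment the third and fifth columns each have one dissenting sample, so the consensus base is supported by 75% of the sequences at those positions and by 100% elsewhere.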
Example 7
Phylogeny build and analysis
The 97 of 100 full-length Parsi mtDNA sequences that were generated as described above were compared with 100 randomly chosen Indian mtDNA sequences retrieved from NCBI Genbank under the accession codes: FJ383174.1-FJ383814.122, DQ246811.1-DQ246833.123; KY824818.1-KY825084.124, and with previously published data on 352 complete Iranian mtDNA sequences25. All mtDNA sequences were aligned using the MUSCLE software26, using the -maxiters 2 -diags 1 options, followed by manual verification using BioEdit (version 7.0.0). Following alignment, the neighbor-joining method, implemented in MEGAX27, was employed to reconstruct the haplotype-based phylogeny. The neighbor-joining method was used because it is more efficient for large data sets28.
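Neighbor-joining operates on a matrix of pairwise distances between the aligned sequences. The sketch below computes a simple p-distance (proportion of differing sites) matrix as one possible input; the distance model actually used in MEGAX is not stated in this description, so the choice of p-distance, the gap handling, and the sequence names are assumptions made for illustration.

```python
def p_distance(a, b, gap="-"):
    """Proportion of differing sites between two aligned sequences,
    skipping columns where either sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        diffs += x != y
    return diffs / compared if compared else 0.0


def distance_matrix(seqs):
    """Return {(name_i, name_j): p-distance} for every ordered pair of names."""
    names = list(seqs)
    return {(m, n): p_distance(seqs[m], seqs[n]) for m in names for n in names}


# Hypothetical aligned fragments; real input would be full mitogenome alignments.
seqs = {
    "s1": "ACGTACGT",
    "s2": "ACGTACGA",
    "s3": "AC-TTCGA",
}
d = distance_matrix(seqs)
print(d[("s1", "s2")], d[("s2", "s3")])
```

A neighbor-joining implementation would then repeatedly join the pair of taxa that minimizes the corrected distance criterion computed from such a matrix.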

Example 8
Variant Disease Analysis
One hundred Parsi mitochondrial sequences extracted from the WGS (whole-genome sequencing) data were uploaded into VarDiG® (https://vardigrviz.genomatics.life/vardig-r-viz/) on Amazon AWS. VarDiG®, developed by Genomatics Private Ltd, connects the relationships between variants, diseases and genes in the human genome. Currently, the VarDiG® knowledge base contains manually curated information on 330,000+ variants and >20K genes covering >4500 phenotypes, including nuclear and mitochondrial regions, from 150,000+ published articles from 3,88+ journals. Variants obtained from the Parsi mitochondria were mapped against all the published variants in VarDiG®. Disease association was ascertained for each variant and putative diseases through VarDiG®.
Seventeen tRNA SNP sites were identified from the 100 Parsi mitochondrial SNP data. The PON-mt-tRNA database (PMID: 26843426) was downloaded to annotate the tRNA SNPs for their impact and disease association. The method is a posterior probability-based method for classification of mitochondrial tRNA variations. PON-mt-tRNA integrates machine learning-based probability of pathogenicity and evidence-based likelihood of pathogenicity to predict the posterior probability of pathogenicity. In the absence of evidence, it classifies the variations based on the machine learning-based probability of pathogenicity.
For annotation of disease pathways associated with the SNPs, we downloaded MitImpact (https://mitimpact.css-mendel.it/) to predict the functional impact and pathogenicity of the non-synonymous SNPs. The database is a collection of non-synonymous mitochondrial SNPs and their functional impact compiled from various databases, viz. SIFT, Polyphen, Clinvar, MutationTaster, dbSNP, APOGEE, and others. Disease association, functional classification, and engagement in different pathways were assessed using the DAVID and UNIPROT annotation tools.

Example 9
Haplogroup and Disease Linkage
Haplogroups linked to disease were grouped together. Principal component analysis (PCA) was performed to visualize the linkage of the haplogroup with disease. XLSTAT (Addinsoft 2020, New York, USA. https://www.xlstat.com) was used for statistical and data analysis, including PCA.

Assembly of the first complete Zoroastrian-Parsi Mitochondrial Sequence: AGENOME-ZPMS-HV2a-1
The first complete de novo nonsmoking Zoroastrian Parsi mitochondrial sequence, AGENOME-ZPMS-HV2a-1, was assembled from a healthy Parsi female sample by combining the sequence data generated from two next-generation sequencing (NGS) platforms and using a novel assembly protocol, as outlined in Materials and Methods. Our approach is unique and accurate in that it combines the sequencing depth and accuracy of short-read technology (Illumina) with the coverage of long-read technology (Nanopore). QC parameters for mitochondrial reads, mitochondrial coverage, and X-coverage were found to be optimal, as seen in Supplementary Figure 1. The hybrid Parsi mitochondrial genome was assembled as a single contig of 16.6 kb (with 99.82% sequence identity), with reads generated from both Illumina and Nanopore resulting in the consensus sequence for the de novo Parsi mitochondrial genome (99.84% sequence identity). This was aligned to the rCRS21, and it was observed that the hybrid assembled genome aligned completely with the standard rCRS reference mitochondrial genome (~16.6 kb). To examine the read utilization in the genome assembly, the Illumina and Nanopore reads were aligned against the consensus sequence for the Parsi de novo mitochondrial genome, as described above. It was observed that 100% of the reads were utilized in the assembly. The common SNPs identified from both the Illumina and Nanopore data were considered to be significant for this de novo Zoroastrian Parsi mitochondrial genome, henceforth referred to as AGENOME-ZPMS-HV2a-1.

Identification of 28 unique SNPs in AGENOME-ZPMS-HV2a-1
A total of 28 significant variants (i.e., SNPs) were identified by BLAST alignment between the Parsi mitochondrial hybrid assembly and the rCRS21 (Figure 1). To confirm the authenticity of the identified variants, we selected a total of seven identified SNPs from the D-loop region and one SNP from the COI gene (m.C7028T, A375A) and subjected them to Sanger sequencing using primers (Supplementary Table 2). All eight predicted SNPs were verified and confirmed for their presence in the consensus Parsi mitochondrial genome (Supplementary Figure 3).
As expected, a large proportion (n=10) of the SNPs identified in the AGENOME-ZPMS-HV2a-1 were found in the hypervariable regions (HVRI and HVRII) of the D-loop (Figure 1). Of the remaining 18 SNPs, eight were found to represent synonymous variants, while four were located in genes for rRNA (n=3) and tRNA (n=1) (Figure 2). The remaining nonsynonymous SNPs were located in the genes for ATPase6 (m.8860 A>G), COIII (m.9336 A>G), ND4 (m.11016 G>A), and two in the CytB gene (m.15326 A>G and m.15792 T>C, Figure 1). With the exception of the ATPase6 gene variant, which has been found to be associated with hypertrophic cardiomyopathy in Iranian individuals29, no associations were found in the published literature for these gene variants, and they need to be further investigated.

Given that the Zoroastrian Parsis are known to have originated in Persia and have practiced endogamy since their arrival on the Indian subcontinent, we wished to determine the mitochondrial haplogroup associated with the first complete Zoroastrian Parsi mitochondrial genome. We therefore compared the variants associated with ZPMS-HV2a-1 to standard haplogroups obtained from PhyloTree (build 17) and determined the haplogroup to be Hv2a (Figure 1). This haplogroup is known to have originated in Iran25, suggesting Persian origins for this Parsi individual, based on maternal inheritance patterns.

Seven major haplogroups identified in 100 Zoroastrian Parsi individuals
In some embodiments, keeping in mind the endogamous nature of the Indian Parsis and to understand the extent of the diversity of the mitochondrial haplogroups in this population, we analyzed mitochondrial genomes from an additional 100 consenting Parsi individuals. Our study had an equal representation of both genders, and 60% of the subjects were of age 30–59 (mean age 50±1.6; Figure 3). Complete analysis of the variants in the 100 Parsi samples identified a total of 420 unique SNPs (Figure 4, Appendix 1). QC parameters for the 100 mitochondrial genomes sequenced were found to be optimal, with PHRED>30 (Supplementary Figure 2). Variant distribution in the coding region normalized to gene length showed that the ND6 gene has the highest prevalence of variants (Supplementary Figure 3). The 100 Zoroastrian Parsi mitochondrial genomes were subjected to haplogroup analysis using the haplogroup specific variant assignment matrix from MITOMAP (Appendix 4). The haplogroup assignment based on SNPs classified the genomes into seven principal haplogroups (HV, U, T, M, A, F, and Z), and 25 sub-haplogroups were identified within the principal haplogroups (Figure 5). The variant count across all sub-haplogroups varied between 14 and 64 (Figure 6A). Analysis of the sub-haplogroups demonstrated that HV2a, which includes AGENOME-ZPMS-HV2a-1, was the single largest representative sub-haplogroup within the Parsi population (n=14; n=9 females, n=5 males; Figure 6B).

The sub-haplogroup HV2a (n=14 subjects) contained 38 SNPs, with the highest number in the HVR II region (n=8). Coding region mutations constituted 20/38 SNPs, with equal distribution between synonymous (n=10) and nonsynonymous substitutions observed for this sub-haplogroup (n=10). Among the coding regions, the largest number of SNPs was found in the gene encoding COI (n=6, Supplementary Figure 4A). Four COI SNPs distributed across all of the 14 subjects in the HV2a sub-haplogroup (m.6104 C>T, m.6179 G>A, m.7028 C>T, and m.7193 T>C) constitute synonymous mutations (F67F, M92M, A375A, and F430F, respectively). Two SNPs (m.7080 T>C and m.7146 A>G), found to occur in one subject each in the sub-haplogroup HV2a, were nonsynonymous substitutions (F393L and T415A, respectively). Further analysis of rare SNPs (occurring only in single subjects or n<8/14) showed their presence in the 16S-RNR2 gene (m.1883 G>A and m.1888 G>A), as well as the COII, COIII (m.8203 C>T and m.9540 T>C), and HVR I (m.16153 G>A and 16274 G>A) genes, which were synonymous substitutions in these coding genes, while we found nonsynonymous substitutions in the COII (m.7650 C>T; T22I), ND5 (m.12358 C>T; T8A), and CYTB (m.14954 A>G; T70A) genes in our analysis. A SNP was seen in the gene encoding for tRNA[R] at m.10410 T>C (n=14 subjects), but no mutations were observed in the D-loop region for the entire group under analysis.
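The relationship between a genomic position and the amino acid label reported for the COI variants above (for example m.7028 C>T and A375A) can be checked with a short sketch. It assumes that the COI coding sequence starts at rCRS position 5904 and uses NCBI translation table 2 (vertebrate mitochondrial code); the reference codons passed to classify are illustrative placeholders chosen to match the reported amino acids, not codons read from the rCRS.

```python
BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COI_START = 5904  # assumed rCRS coordinate of the first base of the COI gene


def codon_number(pos, gene_start):
    """1-based codon index of a genomic position within a forward-strand gene."""
    return (pos - gene_start) // 3 + 1


def codon_offset(pos, gene_start):
    """0-based position (0, 1 or 2) of the base within its codon."""
    return (pos - gene_start) % 3


def classify(ref_codon, offset, alt_base):
    """Return (ref_aa, alt_aa, label) for a single-base change in a codon."""
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODE[ref_codon], CODE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return ref_aa, alt_aa, label


print(codon_number(7028, COI_START), codon_offset(7028, COI_START))
print(codon_number(7080, COI_START), codon_offset(7080, COI_START))
# Illustrative codons only: an alanine codon changed at its third base,
# and a phenylalanine codon changed at its first base.
print(classify("GCC", 2, "T"))
print(classify("TTC", 0, "C"))
```

Under these assumptions, positions 7028 and 7080 fall in codons 375 and 393, which agrees with the A375A and F393L labels given in the text, and a third-base change in an alanine codon is synonymous while a first-base T>C change in a phenylalanine codon yields leucine.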

The sub-haplogroup HV12b contained 17 SNPs. HVR II harbors four SNPs, while the coding genes together contain six SNPs that encode three synonymous and three nonsynonymous substitutions. We observed SNPs encoding nonsynonymous substitutions in this sub-haplogroup in ATPase6 (m.8860 A>G; T112A), ND5 (m.13889 G>A; C518Y), and CYTB (m.15326 A>G; T194A). Three SNPs were found in 12S-RNR1, two SNPs in 16S-RNR2, two in the non-coding regions of HVR I and in the D-loop region. No SNPs were observed in the genes coding for tRNAs in the HV12b sub-haplogroup.

Twenty-one of the subjects analyzed fell into the U haplogroup, which consisted of four sub-haplogroups: U1a (n=1), U4b (n=11), U2e (n=3), and U7a (n=6). The U1a sub-haplogroup contained 44 SNPs distributed across 19 positions in the mitochondrial genome. Twenty-one SNPs were observed in the coding region (17 synonymous, 4 nonsynonymous). The ND5 coding gene contains six SNPs, the most for any locus within the U1a sub-haplogroup. All ND5 SNPs coded for synonymous substitutions, while nonsynonymous substitutions were observed for ND2 (m.4659 G>A; A64T), ATPase6 (m.8860 A>G; T112A), and CYTB (m.14766 C>T; T7I and m.15326 A>G; T194A). Twenty-one of forty-four SNPs fell within coding genes, while the rest were distributed across HVR I (n=4 SNPs), HVR II (n=3 SNPs), HVR III (n=5 SNPs), 12S-RNR1 (n=2 SNPs), 16S-RNR2 (n=4 SNPs), the D-loop region (n=1 SNP), and control regions (n=2 SNPs). Two SNPs were found in regions coding for tRNA[D] and tRNA[L:CUN].

The U4b sub-haplogroup (n=11 subjects) is the most common sub-haplogroup within the U haplogroup in our analysis. In all, 64 SNPs were observed for the U4b sub-haplogroup, with the largest share of the SNPs (n=20) found in the gene encoding 16S-RNR2 (Supplementary Figure 4B). Twenty-one SNPs were found in coding regions (14 synonymous and 7 nonsynonymous substitutions), with the highest number seen in the gene coding for COI (n=6 SNPs). Five of these six SNPs coded for synonymous substitutions, while m.6366 G>A coded for a nonsynonymous substitution (V155I). Three SNPs were found in the gene encoding CYTB and were present in all subjects (n=11) in the U4b sub-haplogroup. All three encoded nonsynonymous substitutions, m.14766 C>T (T7I), m.15326 A>G (T194A), and m.15693 T>C (M316T), and need to be further investigated. Four tRNA mutations and one mutation in the D-loop region were observed in this sub-haplogroup.

Six subjects in the U haplogroup fell within the subgroup U7a. A total of 52 SNPs were observed across all samples in this subgroup (Supplementary Figure 4B). Twenty-seven SNPs were found in noncoding regions, 12S-RNR1, 16S-RNR2, and the D-loop region. Twenty-five SNPs were found in the coding region (17 synonymous and 8 nonsynonymous substitutions), with 17/25 distributed among the ND genes coding for ND1–6. ND5 (n=6 SNPs) encodes five synonymous mutations, with a nonsynonymous mutation observed at m.14110 T>C (F592L, in 4/6 subjects).

Three of twenty-one subjects in the U haplogroup fell within the sub-haplogroup U2e. A total of 55 SNPs was observed for U2e, with the majority (n=33 SNPs) falling in the noncoding regions (HVRI-III and D-loop) and the 12S-RNR1, 16S-RNR2, and tRNA genes. Twenty-two SNPs fell within the coding region (15 synonymous and 7 nonsynonymous substitutions), of which 8 fell in the ND gene complex (four ND2, four ND5) and four in the CYTB gene. While all the SNPs in the ND2 and ND4 genes are synonymous substitutions, all the SNPs in the CYTB gene encoded nonsynonymous mutations (m.14766 C>T; T7I in 3/3 subjects, m.15326 A>G; T194A in 3/3 subjects; m.14831 G>A; A29T and m.15479 C>T; F245L, both in 1/3 subjects).

Five subjects in our analysis (n=100) fell within the T haplogroup. We found four sub-haplogroups within this haplogroup (T1a, 2/5; T2b, T2i, and T2g, with 1 subject each). Our analysis indicated a total of 39 SNPs (Supplementary Figure 4C) for T1a, with 21/39 SNPs found in noncoding regions, including 12S RNA, 16S RNA, tRNAs, and control regions, including the D-loop. Eighteen SNPs were observed in the coding region, with the greatest number occurring in the CYTB gene (n=5 SNPs). Three SNPs within the CYTB gene coded for nonsynonymous mutations, including m.14776 C>T, m.14905 G>A, and m.15452 C>A, coding for T7I, T194A, and L236I substitutions, respectively.

The T2b, T2g, and T2i sub-haplogroups contained 35, 42, and 34 SNPs, respectively, in total. We found that CYTB contained the majority of the SNPs found in the coding regions in these sub-haplogroups, except for the T2i group in which the CYTB SNPs (n=5) constituted the majority of the SNPs found in coding and noncoding regions of the genome. Two SNPs, m.14766 C>T and m.15326 A>G, seen in all three groups code for nonsynonymous substitutions, and m.15452 C>A was seen in T2g and T2i and codes for a nonsynonymous mutation. Single mutations were seen for m.15497 G>A and m.14798 T>C and code for nonsynonymous substitutions, and these warrant further investigation.

The A haplogroup in our study consists of the sub-haplogroup A2v (n=3 subjects). The subjects in the A2v sub-haplogroup had a total of 17 SNPs (Supplementary Figure 4D) distributed across the mitochondrial genome. Twelve of seventeen SNPs were found in the noncoding regions (HVR I, II) and in the 12S rRNA and 16S rRNA genes. Five SNPs were distributed in the coding region across ND2 (m.4769 A>G and m.6095 A>G), ATPase6 (m.8860 A>G), ND4 (m.11881 C>T), and CYTB (m.15326 A>G). Two nonsynonymous substitutions were observed in the ATPase6 and CYTB genes.

The F1g and Z1a sub-haplogroups were each represented by a single subject. A total of 33 and 32 SNPs, respectively, were identified in these groups. Nine CYTB SNPs were observed in total for both groups. Two encoded nonsynonymous substitutions, m.14766 C>T (T7I) and m.15326 A>G (T194A), while the seven other SNPs resulted in synonymous mutations. SNPs in ND4L were seen only in Z1a and F1g, with the m.10609 T>C SNP in F1g resulting in a nonsynonymous change (M47T), while the Z1a SNP resulted in a synonymous substitution (Supplementary Figure 4D).

The M haplogroup (n=52 subjects) consists of 12 sub-haplogroups in our study (Supplementary Figure 4E). M30d is the sub-haplogroup with the highest number of subjects in the M haplogroup (n=11 subjects). Fifty-one SNPs were identified in this sub-haplogroup in total, of which 28 SNPs were seen in the noncoding regions (HVR I, II, III), the D-loop region, and the 12S-RNR1 and 16S-RNR2 genes. The remaining 23 SNPs fell in the coding region, with CYTB (n=8 SNPs) and ND4 (n=5 SNPs) accounting for the majority. Nine of thirteen SNPs in CYTB and ND4 code for synonymous substitutions, while four SNPs in CYTB resulted in nonsynonymous substitutions (m.14766 C>T; T7I, m.15218 A>G; T158A, m.15326 G>A; T194A, and m.15420 G>A; A229T).

M3 M39b (n=10 subjects) is one of the largest sub-haplogroups, and a total of 59 SNPs were seen for this sub-haplogroup. The noncoding regions, 12S, 16S, and control regions together account for 33/59 of the SNPs. Of the remaining 26 SNPs, the 5 SNPs in the CYTB gene constitute the greatest number for a single gene, while the ND gene complex accounts for 12 SNPs (2 ND1, 1 ND2, 2 ND3, 2 ND4, 3 ND5, and 2 ND6). Of the nine remaining SNPs, six are seen in the COI, II, and III genes (two each), while three SNPs are found in the ATPase6 gene.

The M2 sub-haplogroup consists of M2a (n=2 subjects) and M2b (n=1 subject). A total of 110 SNPs was observed for M2a and M2b. In M2a, 23/53 SNPs occurred in noncoding regions (HVR I, II, III), the 12S-RNR1 and 16S-RNR2 genes, the control region (OL), and the D-loop region. Thirty SNPs occurred in the coding regions, making this one of the sub-haplogroups in which SNPs in the coding region outnumber the SNPs in the noncoding region. CYTB harbors seven SNPs, followed by three SNPs in ND4 and three SNPs in ATPase8, ATPase6, and COI. A total of 55 SNPs were observed for M2b, of which 31/55 occurred in the noncoding regions. Twenty-four SNPs were observed in genes coding for COI, III; ND1,2,3,4,5; ATPase6,8; and CYTB. The six SNPs in CYTB constitute the greatest number of SNPs in the coding region. The M2a/b sub-haplogroup is also conspicuous for the presence of SNPs in the ATPase8 gene, which are not observed in any sub-haplogroup besides U4b. The complete distribution of the SNPs across all the sub-haplogroups is presented in Table 2.

Phylogenetic analysis of the Parsi mitochondrial haplotypes with those of Iranians and Indians
To further investigate the substructure of the major haplogroups identified in the Parsi cohort, a comparative analysis of haplotypes from 452 complete mtDNA sequences, including 352 Iranians and 100 Indian mtDNA sequences, was undertaken. The rationale for selection of these two populations centered around the ancestral migration patterns of the Parsis of India30. This grouping also complements the model of the Parsi origin stemming from the ancient Iranian plateau, circa 900 AD31.

Analysis of the haplogroups identified in the Parsis compared with the Iranians, of whom the Persians (n=180) and the Qashqais (n=112) were the most frequent representatives, demonstrated that a) all seven Parsi haplogroups were found within the Iranian haplogroup set and b) a marked lack of haplogroup diversity was observed in the Parsi dataset (n=7 principal haplogroups) compared with the Persians and Qashqais (n=14 principal haplogroups, Figure 7A, B). The reason for this lack of haplotype diversity likely lies in the practice of endogamy, which has been strictly adhered to in the Parsi community for centuries, following their arrival from the Iranian plateau. Contemporary populations of Persians and Qashqais in the Iranian plateau represent diverse haplogroupings, possibly due to admixture following political upheavals in the region after the departure of Parsis from ancient Iran around 900 AD.
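The relationship between the 25 sub-haplogroups and the seven principal haplogroups discussed above can be illustrated with a short sketch. The sub-haplogroup labels are those named in this text (using the M39b spelling); deriving the principal haplogroup from the leading capital letters of a label is a simplifying assumption that happens to hold for these labels and is not a general haplogroup-nomenclature rule.

```python
import re
from collections import defaultdict

# Sub-haplogroup labels named in the text.
SUB_HAPLOGROUPS = [
    "HV2a", "HV12b",
    "U1a", "U2e", "U4b", "U7a",
    "T1a", "T2b", "T2g", "T2i",
    "A2v", "F1g", "Z1a",
    "M2a", "M2b", "M3a", "M4a", "M5a", "M24a", "M27b",
    "M30d", "M33a", "M35b", "M39b", "M52b",
]


def principal_haplogroup(label):
    """ASSUMPTION: the principal haplogroup is the leading run of capitals."""
    match = re.match(r"[A-Z]+", label)
    if match is None:
        raise ValueError(f"unrecognised haplogroup label: {label!r}")
    return match.group(0)


def group_by_principal(labels):
    """Map each principal haplogroup to its list of sub-haplogroups."""
    groups = defaultdict(list)
    for label in labels:
        groups[principal_haplogroup(label)].append(label)
    return dict(groups)


if __name__ == "__main__":
    groups = group_by_principal(SUB_HAPLOGROUPS)
    for name in sorted(groups):
        print(name, len(groups[name]), groups[name])
```

Grouping the labels this way recovers seven principal haplogroups, twelve of the sub-haplogroups belonging to M.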

The presence of the predominantly Eurasian mtDNA haplotypes HV, T, and U in our study cohort was remarkable, given that Parsis have resided on the Indian subcontinent for 1200 years. While the majority of Parsis with M haplogroups can be linked to Persian descent, a very small minority of Parsis were found to be related to Indians with M haplogroups in our analyses.

A detailed phylogenetic clustering of the Parsis to establish more precise ethnic relationships was next undertaken. Our analysis revealed that the Parsis predominantly clustered with populations from Iran (Persians and people of Persian descent, Figure 8A), and the most common HV group showed that all Parsis in the HV2a tree (n=14) clustered with Persians and Qashqais (neighbour-joining tree weight > 0.72/72%, Figure 8A, 8E), while the single Parsi in the HV12b (n=1) haplotype demonstrated a strong association with other Iranian ethnicities, including the Khorasani and Mazandaranis, in addition to the Qashqai and Persians (Table 3). The haplogroup HV2, dated at 36–42 kya, most likely arose in Iran between the time of the first settlement by modern humans and the late glacial melt, and the subclade HV2a has a demonstrated Iranian ancestry, with transfers to India in repeated gene flows from west to east32. HV12b, a branch of HV12, is one of the oldest HV subclades, and has been found in western Iran, India, and sporadically as far as Central and Southeast Asia, with strong associations with the Qashqais, who are Turkic-speaking nomadic pastoralists of southern Iran and who previously resided in the Iranian district of the South Caucasus32.

A total of 20 Parsi individuals in the U macro-haplogroup were found to fall into four subclades, U7a (n=6), U2e (n=3), U4b (n=10), and U1a (n=1), with the highest representation in U4b and U7a (Figure 8B). Phylogenetic analysis demonstrated that the Parsis in the U haplogroup cluster with the Persians most frequently, while a few cluster with Kurds, Armenians, Mazandarani, Azeris, and Khorasanis, who all claim descent from Mesopotamia and the older Persian empire (https://journals.openedition.org/asiecentrale/480). U4b and U7a (the dominant branch of U7) haplotypes are distributed throughout the Near East and South Asia24 with subclades specific to Central Asia in the Volga-Ural region33, Mediterranean, and Southeast Europe, with lower frequencies in populations around the Baltic Sea, such as in Latvians and Tver Karelians33. Haplogroup U2 harbors frequency and diversity peaks in South Asia, whereas its U2d and U2e subclades are confined to the Near East and Europe24.

The T haplogroup in the Parsi cohort was found to consist of T1a, T2g, T2i, and T2b, with a near-even distribution of samples across the subgroups (n=2, 1, 1, 1, respectively). Similar to the haplogroups HV and U, the Persians and Qashqais form the largest ethnic denomination associated with the Parsis with respect to the T haplogroups (>60%, Figure 8C). This haplogroup has been found to be dominant in Western and Central Asia and in Southern Europe. The T haplogroup is also well distributed in Eastern and Northern Europe, as well as in the Indus Valley and the Arabian Peninsula. Younger T subclades are reported to have expanded into Europe and Central Asia during the Neolithic transition34. Five Parsi individuals of the haplogroups A2v (n=3), F1g (n=1), and Z1a (n=1) were observed to be phylogenetically related to Persian, Kurd, Turkmen, and Iranian ethnicities, further attesting to their origin in the Iranian plateau (Figure 8C).

Unlike the HV, U, and T haplogroups, within which Parsis cluster closely with ethnic Iranians, Parsis harboring the M haplogroup appear to demonstrate more diversity in their mitochondrial genomes. This study showed the following breakdown: 8/12 M sub-haplogroups of the 29 Parsi M haplotypes (M24a [n=8], M33a [n=1], M5a [n=2], M4a [n=1], M3a [n=7], M52b [n=8], M27b [n=1], and M35b [n=1]) clustered with the Persians, Qashqais, Azeris of Iranian ethnicity, and others of Persian descent (Figure 8D, Table 3). Only two sub-haplogroups in our study (M2a and M2b [n=21], M30d [n=1]; Figure 8D) clustered more closely with relic tribes of Indian origin. Our phylogenetic analyses further showed that 19 Parsi individuals belonging to the M30d (n=10) and M39d (n=9) haplogroups did not cluster with either Indian or Iranian ethnic groups (Figure 8D) but remained clustered within their own subgroups.

Assembly of the Zoroastrian Parsi Mitochondrial Consensus Genome (AGENOME-ZPMRG-V1.0) and Parsi haplogroup-specific reference sequences
In another aspect, this disclosure is directed to a Parsi-specific mitochondrial consensus genome. Given that the Parsis of India have practiced endogamy for centuries and are a nonsmoking, long-lived community despite the prevalence of many genetic disease manifestations, we generated such a consensus genome to better understand the nuances of disease and wellness in this unique community. To this end, we classified the Parsi mitochondrial genomes based on the seven identified major haplogroups, HV, M, U, T, A, F, and Z. The haplogroup-specific Parsi mitochondrial sequences were aligned, and a consensus call for each nucleotide was made based on the maximal frequency of a base called at each position in the mtDNA (Appendix 2 and 3).
Using this approach, we derived the Zoroastrian Parsi mitochondrial reference sequences for each haplogroup: AGENOME-ZPMRG-HV-V1.0 (n=15 sequences), AGENOME-ZPMRG-U-V1.0 (n=20 sequences), AGENOME-ZPMRG-T-V1.0 (n=5 sequences), AGENOME-ZPMRG-M-V1.0 (n=52 sequences), AGENOME-ZPMRG-A2v-V1.0, AGENOME-ZPMRG-F1a-V1.0, and AGENOME-ZPMRG-Z-V1.0 (Table 4). Additionally, using all 100 Parsi mitochondrial genomes generated in this study (see Materials and Methods), we built the first standard Zoroastrian Parsi mitochondrial consensus genome (AGENOME-ZPMCG-V1.0). This AGENOME-ZPMRG-V1.0 consensus Parsi mtDNA sequence was found to have 31 unique SNPs (Table 5), of which five SNPs (A263G, A750G, A1438G, A4769G, and A15326G) were found to be common to the reference sequences of all seven haplogroups considered (Table 5). While the number of SNPs unique to each of the seven haplogroups ranged from 11 to 33, haplogroup M did not appear to have any unique SNPs when compared with the overall consensus sequence, AGENOME-ZPMRG-V1.0. This newly generated reference standard is intended to support accurate mtDNA-based analyses for the global Zoroastrian Parsi population as well as for individuals of Western Asian origin.
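The per-position consensus call described above (the most frequent base at each aligned position) can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: the input is assumed to be already aligned to a common length, the function name `consensus` and the toy sequences are hypothetical, and ties are resolved in favour of the base seen first, which is an arbitrary choice the text does not specify.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Return the majority base at each column of an alignment.

    ASSUMPTIONS: sequences are pre-aligned and of equal length; ties go
    to the base encountered first among the input sequences.
    """
    if not aligned_sequences:
        raise ValueError("at least one sequence is required")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")

    calls = []
    for column in zip(*aligned_sequences):
        # most_common(1) yields the base with the highest count; equal
        # counts keep their first-seen order.
        base, _count = Counter(column).most_common(1)[0]
        calls.append(base)
    return "".join(calls)


if __name__ == "__main__":
    # Toy alignment, for illustration only (not study data).
    toy = ["ACGT", "ACGA", "ATGA"]
    print(consensus(toy))
```

Applying the same function to each haplogroup-specific alignment and then to all sequences together mirrors the two levels of reference described in this section.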

Disease-specific associations of mtDNA variants predict the prevalence of commonly occurring diseases in the non-smoking Parsi cohort
As demonstrated in Figure 7B, the practice of endogamy has likely restricted the genetic diversity of the Parsis, as measured by the paucity of haplogroups in our cohort compared with the Persian and Qashqai populations, possibly contributing to a number of autosomal recessive and other genetic diseases. In previous studies, Parsis were found to be disproportionately affected by certain diseases, such as prostate and breast cancers5,11, PD, and AD. However, the Parsis are considered to be a long-lived community6 with lower incidences of lung12 and head and neck cancers.

In order to determine whether diseases and conditions known to be prevalent in the Parsi community could in fact be predicted by association using the collective mtDNA variants discovered in this study, we analysed SNPs prevalent in tRNA genes in the mitochondrial genome that have implications for rare, degenerative diseases. We found a total of 17 tRNA-associated SNPs, with a pathogenic variant (G1644A) implicated significantly in LS/HCM/MELAS. We also found a total of six tRNA mutations associated with non-syndromic hearing loss, hypertension, breast/prostate cancer risk, and progressive encephalopathies in the analysis of our 100 Parsi individuals (Table 6).

Further analysis of the nucleotide transitions and transversions that constitute the 420 SNPs revealed that the mutational signatures (C>A and G>T) found in tobacco smoke derived cancers were found at an extremely low frequency (<6% compared to other signatures) in mitochondrial genomes of the Parsi population (Figure 9) who refrain from smoking due to their religious faith.
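The tally behind such a signature analysis can be sketched as below: each SNP is classed as a transition or a transversion, and the share of C>A and G>T changes is reported. The eight SNP labels are a small subset taken from variants named elsewhere in this text and serve only as example input; the result does not reproduce the cohort-wide figure. Function and variable names are hypothetical.

```python
import re

PURINES = {"A", "G"}
# Substitution types the text associates with tobacco smoke exposure.
SMOKE_LIKE = {("C", "A"), ("G", "T")}
SNP_PATTERN = re.compile(r"m\.(\d+)\s+([ACGT])>([ACGT])")


def parse_snp(label):
    """Parse a label such as 'm.7028 C>T' into (position, ref, alt)."""
    match = SNP_PATTERN.fullmatch(label.strip())
    if match is None:
        raise ValueError(f"cannot parse SNP label: {label!r}")
    return int(match.group(1)), match.group(2), match.group(3)


def substitution_class(ref, alt):
    """Transition: purine<->purine or pyrimidine<->pyrimidine."""
    if ref == alt:
        raise ValueError("reference and alternate bases are identical")
    same_family = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_family else "transversion"


def summarize(labels):
    """Count substitution classes and the smoke-like fraction."""
    counts = {"transition": 0, "transversion": 0, "smoke_like": 0}
    for label in labels:
        _pos, ref, alt = parse_snp(label)
        counts[substitution_class(ref, alt)] += 1
        if (ref, alt) in SMOKE_LIKE:
            counts["smoke_like"] += 1
    total = len(labels)
    counts["smoke_like_fraction"] = counts["smoke_like"] / total if total else 0.0
    return counts


# Example input: a few SNPs named in this text (not the full 420).
EXAMPLE_SNPS = [
    "m.6104 C>T", "m.6179 G>A", "m.7028 C>T", "m.7193 T>C",
    "m.7080 T>C", "m.7146 A>G", "m.8860 A>G", "m.15452 C>A",
]

if __name__ == "__main__":
    print(summarize(EXAMPLE_SNPS))
```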

Variant Analysis
In the present disclosure, 420 variants were associated with 41 diseases. SNP disease-association analysis revealed that Parkinson’s disease is highly associated with our variants (178 SNPs, Figure 10). Other neurodegenerative diseases, rare diseases of mitochondrial origin, and cardiovascular and metabolic diseases associated with the variants in our study can be seen in Figure 10.
In some embodiments, the forty-one diseases were spread across 25 haplogroups, with many diseases repeated across haplogroups, giving a total of 188 disease occurrences (Figure 11A). Haplogroup U4b harboured 15 diseases, while the majority of M and T groups had five diseases (Figure 6B). Some of the mitochondrial rare diseases, such as mitochondrial encephalomyopathies, MELAS syndrome, MERRF syndrome, and cytochrome c oxidase deficiency, have been associated with M2b and U1a, U4b, no haplogroup, and M2b, respectively (Figure 11B).

Haplogroup and disease linkage
The 420 variants fell into 25 haplogroups contributing to 41 diseases and conditions. Principal component analysis (PCA) showed the grouping of variants and haplogroups (Figure 12). AD, breast cancer, cardiomyopathies, and PD were represented in all of the 25 haplogroups, and longevity was represented in 23 haplogroups, except for the HV12b and U1a groups. Our tRNA pathogenicity analysis showed that the variability in tRNA was high in the U, T, and M haplogroups compared with other haplogroups (Table 6).

Analysis of SNPs in tRNA genes and the D-loop region in the mitochondrial genome
Most variants in mtDNA do not affect mitochondrial function. Unlike synonymous/neutral variants, however, nonsynonymous/non-neutral variants may have functional consequences, and their effect on mitochondrial metabolism may be strongly deleterious, mildly deleterious, or even beneficial. We thus analysed the SNP dataset obtained from the 100 Parsi subjects for nonsynonymous mutations and identified 63 nonsynonymous SNPs located within different mitochondrial genes (Figure 13). Twenty of the sixty-three SNPs were found in genes encoding CYTB (n=13) and ND2 (n=7), followed by ND5 and ND1. Disease-association analysis showed that these genes were implicated in the onset of neurodegenerative conditions like AD and PD, cancers of colorectal and prostate origin, metabolic diseases such as type 2 diabetes, and rare diseases such as LHON (CYTB and ND2; Figure 14, Figure 15). SNPs implicated in longevity were observed in our study and were distributed across the ND2 gene (Figure 11B). As observed earlier, none of the nonsynonymous variants in our data set was linked to lung cancer or a risk of lung cancer.

To understand the mitochondrial pathways affected by the SNPs in our study, we annotated the pathways associated with SNPs with DAVID and UNIPROT and found that the major genes CYTB and ND2 were implicated in pathways that include the mitochondrial respiratory complex (COI/COII/COIII/COIV), OXPHOS, and metabolic pathways implicated in mitochondrial bioenergetics. Critical disease-related pathways in PD, AD, and cardiac muscle contraction were also associated with CYTB- and ND2-specific SNPs, which possibly explains the high incidence of these diseases in the endogamous Parsi population (Figure 16).

A total of 87 SNPs, including 6 unique variants, were observed in the D-loop region across all 25 sub-haplogroups (n=100 subjects, Table 2). Seventy-four of 100 subjects were found to have m.16519 T>C, while six subjects of the M52 sub-haplogroup were found to have m.16525 A>G. The rest of the SNPs were found at m.16390 G>A (n=4 subjects) and m.16399 A>G, m.16401 C>T, and m.16497 A>G (all with n=1 subject each).
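The D-loop figures above are internally consistent, as the following tally shows: the per-variant carrier counts given in the text sum to 87 occurrences across 6 distinct variants. Only the counts are taken from the text; the dictionary and function names are hypothetical.

```python
# Carrier counts per D-loop variant, as reported in the text (n=100 subjects).
D_LOOP_CARRIERS = {
    "m.16519 T>C": 74,
    "m.16525 A>G": 6,
    "m.16390 G>A": 4,
    "m.16399 A>G": 1,
    "m.16401 C>T": 1,
    "m.16497 A>G": 1,
}
COHORT_SIZE = 100


def carrier_frequency(variant, carriers=D_LOOP_CARRIERS, cohort=COHORT_SIZE):
    """Fraction of the cohort carrying the given variant."""
    return carriers[variant] / cohort


if __name__ == "__main__":
    print("distinct variants:", len(D_LOOP_CARRIERS))
    print("total occurrences:", sum(D_LOOP_CARRIERS.values()))
    for variant in D_LOOP_CARRIERS:
        print(variant, carrier_frequency(variant))
```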

Identification of unique, unreported variants from the 100 Parsi-Zoroastrian mitogenome analysis
The present disclosure provides a comparative analysis of the 420 variants in the Zoroastrian-Parsi community with MITOMASTER45, a database that contains all known pathogenic mtDNA mutations and common haplogroup polymorphisms, to identify unique SNPs in our population that have not been reported previously. Our analysis showed the presence of 12 unique SNPs distributed across 27 subjects that were observed neither in MITOMASTER nor in the VarDIG® disease association dataset (Figure 17). These unique SNPs were observed across different gene loci: 12S-RNA (2 SNPs), 16S-RNA (5 SNPs), and 1 each in ND1, COII, COIII, ND4, and ND6. The SNP haplogroup association showed that they fell into 4 major haplogroups and 13 sub-haplogroups: HV2a=1, M24a=4, M2a=1, M30d=3, M35b=1, M39b=2, M3a=1, M4a=1, M52b=4, M5a=1, T2b=1, U4b=6, U7a=1. No disease associations were observed for any of the 12 variants on analysis with MITOMASTER and VarDIG®.
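The screening step and the tallies reported above can be sketched as follows. The function `unreported_variants` is a hypothetical stand-in for the database comparison (it is a plain set difference and does not call MITOMASTER or VarDIG®), and the identifiers in the demonstration call are placeholders rather than real variants. The locus and sub-haplogroup counts are those given in the text, and taking the leading capitals of a label as its major haplogroup is a simplifying assumption.

```python
import re


def unreported_variants(cohort_variants, known_variants):
    """Variants seen in the cohort but absent from the reference set."""
    return sorted(set(cohort_variants) - set(known_variants))


# Counts of unique SNPs per locus, as reported in the text.
UNIQUE_SNPS_BY_LOCUS = {
    "12S-RNA": 2, "16S-RNA": 5, "ND1": 1, "COII": 1,
    "COIII": 1, "ND4": 1, "ND6": 1,
}

# Subjects carrying a unique SNP, per sub-haplogroup, as reported in the text.
SUBJECTS_BY_SUB_HAPLOGROUP = {
    "HV2a": 1, "M24a": 4, "M2a": 1, "M30d": 3, "M35b": 1, "M39b": 2,
    "M3a": 1, "M4a": 1, "M52b": 4, "M5a": 1, "T2b": 1, "U4b": 6, "U7a": 1,
}


def major_haplogroups(labels):
    """ASSUMPTION: major haplogroup = leading capitals of each label."""
    return sorted({re.match(r"[A-Z]+", label).group(0) for label in labels})


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(unreported_variants({"v1", "v2", "v3"}, {"v2"}))
    print("unique SNPs:", sum(UNIQUE_SNPS_BY_LOCUS.values()))
    print("subjects:", sum(SUBJECTS_BY_SUB_HAPLOGROUP.values()))
    print("sub-haplogroups:", len(SUBJECTS_BY_SUB_HAPLOGROUP))
    print("major haplogroups:", major_haplogroups(SUBJECTS_BY_SUB_HAPLOGROUP))
```

The tallies agree with the text: 12 unique SNPs, 27 subjects, 13 sub-haplogroups, and 4 major haplogroups.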

The present disclosure provides the first de novo Parsi mitochondrial genome from a healthy non-smoking female using a novel hybrid assembly approach. When compared with the revised mtDNA Cambridge reference standard (rCRS), we identified 28 unique SNPs, which upon analysis revealed this individual as belonging to the mt haplotype HV2a. Given that the Parsis of India trace their historical roots to Persia, finding an Iranian haplotype in this individual was intriguing. Upon extending our mtDNA analyses to an additional 99 Parsi individuals, we found that 95 individuals could be separated into four major mitochondrial haplogroups, HV, U, T, and M, while four individuals were found to have the rarer haplogroups A, F, and Z. Fifteen Parsi individuals, the largest representation in our cohort, were found to belong to the HV2a sub-haplogroup. A previous study of Indian Parsis reported a similar haplotype distribution in a sample of 117 Parsis30, with the conspicuous absence of the R haplotype in our cohort.

Historically, the Parsis, escaping religious persecution in Iran in the 8th century CE, fled to the Indian subcontinent where they settled. It has been suggested that, due to the strict endogamy practiced by the Zoroastrian Parsis, their maternally inherited mtDNA lineages have remained aligned with those of their ancestors in Iran. To address this issue, we compared the major mt haplogroups identified in our Parsi cohort to those of 352 Iranians25 and found a remarkable consistency in haplogroups in the Parsi cohort compared with the Persians and Qashqais25, likely due to the practice of endogamy. The presence of predominantly Eurasian mtDNA haplotypes, HV, T, and U, in our Parsi cohort was remarkable, given that Parsis have resided on the Indian subcontinent for 1200 years.

The haplogroup HV2, dated at 36–42 kya, most likely arose in Iran between the time of the first settlement by modern humans and the last glacial melt, and the subclade HV2a has a demonstrated Iranian ancestry, with transfers to India in repeated gene flows from west to east35. HV12b, a branch of the HV12 clade, is one of the oldest HV subclades and has been found in western Iran, India, and sporadically as far as Central and Southeast Asia. It has strong associations with the Qashqais, who are Turkic-speaking nomadic pastoralists of southern Iran and who previously resided in the Iranian region of the South Caucasus35.

Despite the large grouping of the M haplogroup (the largest haplogroup in the Indian subcontinent,34) in our Parsi population, phylogenetic analysis showed that most of the Parsis with M haplogroups are linked to Persian descent, with a very small minority of Parsis found to be related to Indians. This observation suggests minimal gene flow from indigenous females into the Parsi gene pool, as has been previously suggested30.

Phylogenetic analysis also revealed that two Parsi M sub-haplogroups, M30d and M39b, did not cluster with either Indians or Iranians. This anomaly could very likely find a solution by increasing the number and diversity of both Indian and Iranian individuals in our future analyses.

We further present the first complete Zoroastrian Parsi mitochondrial reference genome, built from the mtDNA of 100 nonsmoking, endogamous Parsi individuals representing seven mtDNA haplogroups. Such an ethnic-specific consensus genome for the Parsis is needed for comparative analyses designed to precisely understand patterns of maternally inherited mitochondrial DNA (mtDNA) and to aid in reconstructing the history and prevalent disease associations in this unique community.

Endogamy and consanguineous marriages amongst the Parsis have been linked to a number of diseases and conditions, including prostate and breast cancers9,10, Parkinson’s disease (PD), and Alzheimer’s disease (AD); the community is also noted for its longevity11 and for lower incidences of lung12 and head and neck cancers.
In contrast to the human nuclear genome, which consists of 3.3 billion base pairs of DNA, the human mitochondrial genome contains a mere 16,569 base pairs. Despite its small size, the mitochondrial genome can be used to establish maternal family ties, thanks to its maternal pattern of inheritance. Prevalence studies in populations have shown that adult-onset mitochondrial diseases are much more common than childhood diseases. This is partly because childhood-onset disease is often fatal at a very early age, whereas adults often survive for many years after the diagnosis. Mitochondrial diseases are long-term, genetic, and often inherited disorders that occur when mitochondria fail to produce enough energy for the body to function properly. Since the Parsis are nonetheless one of the longest-living communities, we set out to list the significantly associated cardinal maladies for all the mitochondrial polymorphic regions.

Currently, there are 7517 unique disease variants from human mitochondria in the VarDIG database, which are curated from the literature and include all the associated haplogroups, whereas the 420 unique variants from Parsi mitochondria form 25 haplogroups, leading to 1199 overlapped variants, which fall mainly within the M, T, U, Z, H, F, and A haplogroups. Our analysis indicated that the CYTB gene contained the maximum number of SNPs (n=5) in the coding region of haplogroup M, besides having maximal representation in F1g, T, and HV12b. Haplogroups U, A2v, and Z1a showed dominance for the ND complex genes ND5 and ND2, while the COI genes were the most highly represented in HV2a and U4b (Table 1). PD, known to be prevalent in the Parsi community36, appeared to be the disease with the highest predicted prevalence. Longevity, another common trait in the Parsis, was predicted in most haplogroups, except the U1 haplogroup. SNPs in the CYTB gene are associated with AD, diabetes mellitus, cognitive ability, breast cancer, hearing loss, and asthenozoospermia (Figure 8) and associated with changes in metabolic pathways, cardiac contraction, AD, PD, and rare diseases such as Huntington’s disease (Figure 9), whereas the ND2 and ND5 variants were associated with prostate, ovarian cancer, rare mitochondrial neuronal diseases, such as LHON, cardiomyopathy, AD, and PD (Figures 8 and 9). Our results lend further support for a mitochondrial SNP-linked pattern of inheritance for neurodegenerative diseases, rare diseases, cancers of the breast and prostate and of ovarian origin; and pregnancy loss, while accounting for the longevity observed in the Parsi community. tRNA disease-association analysis in our study showed that these genes were implicated in the onset of neurodegenerative conditions, such as AD, PD, cancers of colorectal and prostate origin, metabolic diseases, such as type 2 diabetes, and rare diseases, such as LHON (CYTB and ND2). 
The D-loop SNP analysis showed the prevalence (in 74/100 subjects) of the m.16519 T>C polymorphism, which has been implicated in chronic kidney disease43, an increased risk for Huntington’s disease, migraine headache, and cyclic vomiting syndrome44.

The major distribution of the M and T haplogroup clusters in Parsis, with a total of 420 variants of which 178 overlap with PD, corresponds to the endogamous nature of the population and the high incidence of PD in the population. Further examination of the diseases associated with these haplogroups suggests that diseases such as AD, PD, breast cancer, and cardiomyopathies, which are associated with all of the 25 haplogroups, have a high prevalence in the population. Studies have shown that haplogroup T is associated with asthenozoospermia37 (reduced sperm motility), which might be one of the reasons for the low birth rate in the population.
As the exact same haplogroups are shared between breast cancer, AD, PD, and cardiomyopathies, the occurrence of the M group indicates the prevalence of breast cancer38. Studies have reported higher AD in the U group39. On the other hand, another study40 corroborated the evidence for the U haplogroup contributing to longevity by reducing the generation of reactive oxygen species.
The lack of signatures for lung cancer in all haplogroups in this nonsmoking Parsi population coupled with the low frequency of tobacco smoke-derived cancers, is intriguing. We observed three heteroplasmy variants associated with more than 50% of the 41 diseases. It is also noteworthy that the established understanding of the lower lung cancer occurrence in the Parsi population41 has been upheld, with no haplogroup predisposition for lung cancer in Parsi mitochondria. Some of the rare diseases also find their way into the Parsi population, and their frequency and distribution of occurrence need to be further explored.

In the current invention the first Zoroastrian Parsi mitochondrial reference genome is generated. The utility of this newly generated reference standard will be found in the accurate mtDNA-based analyses of the global Zoroastrian Parsi population as well as for individuals of western Asian origin. We have also provided evidence that the Indian Parsis, through the process of endogamy over the centuries, have largely retained their Iranian genetic heritage, as measured by maternally inherited mtDNA analysis. In fact, Parsis are more closely related to Persians than Indians. Our data further reveal the relevance of mitochondria and its haplogroup associations with diseases such as breast cancer, AD, PD, and cardiomyopathies and also the association of longevity with the respective haplogroups in the Parsi population, in which disease prevalence in the population can be predicted using community-specific SNPs. Additional studies to connect nuclear SNPs with those identified in the mitochondria are planned. The Parsi population is an excellent starting point for these and other future studies.

References
1. Mistry, R. K. Glimpses of Parsi history, Insights Into The Zarathustrian Religion, p.20.
2. Nariman, R. F. The Inner Fire – Faith, Choice, and Modern Day Living in Zoroastrianism.
3. Vendidad I, 1-2 & II, 5.
4. Bennet, J. G. The Hyperborean Origin of the Indo-European Culture, Journal Systematics. J Syst. 1, (1963).
5. Jussawalla, D. J., Yeole, B. B. & Natekar, M. V. Histological and epidemiological features of breast cancer in different religious groups in Greater Bombay. J. Surg. Oncol. (1981) doi:10.1002/jso.2930180309.
6. Jussawalla, D. J. The persistence of differences in cancer incidence at various anatomical sites 1300 years after immigration. Recent Results Cancer Res. (1975) doi:10.1007/978-3-642-80880-7_22.
7. Anthony, D. The Horse, the Wheel, and Language: How Bronze-Age Riders from the Eurasian Steppes Shaped the Modern World. (Princeton University Press, 2007).
8. Alizadeh, A. The Rise of the Highland Elamite State in Southwestern Iran. Curr. Anthropol. 51, 353–383 (2010).
9. Shroff, Z. & Castro, M. C. The potential impact of intermarriage on the population decline of the Parsis of Mumbai, India. Demogr. Res. 25, 545–564 (2011).
10. Karkal, M. Marriage among Parsis. Demogr. India 4, 128 (1975).
11. Barnabas-Sohi, N. et al. Breast carcinoma in a high-risk population: Structural alterations in neu, int-2, and p-53 genes. Breast Dis. (1993).
12. Jussawalla, D. J. & Jain, D. K. Lung cancer in Greater Bombay: Correlations with religion and smoking habits. Br. J. Cancer (1979) doi:10.1038/bjc.1979.199.
13. Helgason, A., Sigurðardóttir, S., Gulcher, J. R., Ward, R. & Stefánsson, K. mtDNA and the origin of the Icelanders: Deciphering signals of recent population history. Am. J. Hum. Genet. (2000) doi:10.1086/302816.
14. Wallace, D. C. Mitochondrial DNA Variation in Human Radiation and Disease. Cell (2015) doi:10.1016/j.cell.2015.08.067.
15. Wallace, D. C., Brown, M. D. & Lott, M. T. Mitochondrial DNA variation in human evolution and disease. Gene (1999) doi:10.1016/S0378-1119(99)00295-4.
16. Roger, A. J., Muñoz-Gómez, S. A. & Kamikawa, R. The Origin and Diversification of Mitochondria. Current Biology (2017) doi:10.1016/j.cub.2017.09.015.
17. Garcia, I., Jones, E., Ramos, M., Innis-Whitehouse, W. & Gilkerson, R. The little big genome: The organization of mitochondrial DNA. Front. Biosci. - Landmark (2017) doi:10.2741/4511.
18. Stewart, J. B. & Chinnery, P. F. The dynamics of mitochondrial DNA heteroplasmy: Implications for human health and disease. Nature Reviews Genetics (2015) doi:10.1038/nrg3966.
19. Bussard, K. M. & Siracusa, L. D. Understanding mitochondrial polymorphisms in cancer. Cancer Research (2017) doi:10.1158/0008-5472.CAN-17-1939.
20. Alston, C. L., Rocha, M. C., Lax, N. Z., Turnbull, D. M. & Taylor, R. W. The genetics and pathology of mitochondrial disease. J. Pathol. 241, 236–250 (2017).
21. Andrews, R. M. et al. Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. Nature Genetics (1999) doi:10.1038/13779.
22. Chandrasekar, A. et al. Updating phylogeny of mitochondrial DNA macrohaplogroup M in India: dispersal of modern human in South Asian corridor. PLoS One 4, e7447 (2009).
23. Rajkumar, R., Banerjee, J., Gunturi, H. B., Trivedi, R. & Kashyap, V. K. Phylogeny and antiquity of M macrohaplogroup inferred from complete mt DNA sequence of Indian specific lineages. BMC Evol. Biol. 5, 26 (2005).
24. Sahakyan, H. et al. Origin and spread of human mitochondrial DNA haplogroup U7. Sci. Rep. 7, 46044 (2017).
25. Derenko, M. et al. Complete mitochondrial DNA diversity in Iranians. PLoS One (2013) doi:10.1371/journal.pone.0080673.
26. Edgar, R. C. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 32, 1792–1797 (2004).
27. Kumar, S., Stecher, G., Li, M., Knyaz, C. & Tamura, K. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms. Mol. Biol. Evol. 35, 1547–1549 (2018).
28. Tamura, K., Nei, M. & Kumar, S. Prospects for inferring very large phylogenies by using the neighbor-joining method. Proc. Natl. Acad. Sci. U. S. A. 101, 11030–11035 (2004).
29. Houshmand, M. et al. Is 8860 variation a rare polymorphism or associated as a secondary effect in HCM disease? Arch. Med. Sci. (2011) doi:10.5114/aoms.2011.22074.
30. Chaubey, G. et al. ‘Like sugar in milk’: Reconstructing the genetic history of the Parsi population. Genome Biol. (2017) doi:10.1186/s13059-017-1244-9.
31. López, S. et al. The Genetic Legacy of Zoroastrianism in Iran and India: Insights into Population Structure, Gene Flow, and Selection. Am. J. Hum. Genet. (2017) doi:10.1016/j.ajhg.2017.07.013.
32. Quintana-Murci, L. et al. Where west meets east: the complex mtDNA landscape of the southwest and Central Asian corridor. Am. J. Hum. Genet. 74, 827–845 (2004).
33. Shamoon-Pour, M., Li, M. & Merriwether, D. A. Rare human mitochondrial HV lineages spread from the Near East and Caucasus during post-LGM and Neolithic expansions. Sci. Rep. 9, 14751 (2019).
34. Farjadian, S. et al. Discordant Patterns of mtDNA and Ethno-Linguistic Variation in 14 Iranian Ethnic Groups. Hum. Hered. 72, 73–84 (2011).
35. Thangaraj, K. et al. In situ origin of deep rooting lineages of mitochondrial Macrohaplogroup ‘M’ in India. BMC Genomics 7, 151 (2006).
36. Alexandrov, L. B. et al. Mutational signatures associated with tobacco smoking in human cancer. Science 354, 618–622 (2016).
37. Ruiz-Pesini, E. et al. Seminal quality correlates with mitochondrial functionality. Clin. Chim. Acta 300, 97–105 (2000).
38. Fang, H. et al. Cancer type-specific modulation of mitochondrial haplogroups in breast, colorectal and thyroid cancer. BMC Cancer 10, 421 (2010).
39. van der Walt, J. M. et al. Analysis of European mitochondrial haplogroups with Alzheimer disease risk. Neurosci. Lett. 365, 28–32 (2004).
40. van Oven, M. & Kayser, M. Updated comprehensive phylogenetic tree of global human mitochondrial DNA variation. Hum. Mutat. 30, E386–E394 (2009).
41. Yeole, B. B., Kurkure, A. P., Advani, S. H. & Lizzy, S. An assessment of cancer incidence patterns in Parsi and non Parsi populations, Greater Mumbai. Asian Pac. J. Cancer Prev. 2, 293–298 (2001).

Documents

Application Documents

# Name Date
1 202041023042-POWER OF AUTHORITY [02-06-2020(online)].pdf 2020-06-02
2 202041023042-FORM 1 [02-06-2020(online)].pdf 2020-06-02
4 202041023042-DRAWINGS [02-06-2020(online)].pdf 2020-06-02
5 202041023042-COMPLETE SPECIFICATION [02-06-2020(online)].pdf 2020-06-02
6 202041023042-CLAIMS UNDER RULE 1 (PROVISIO) OF RULE 20 [02-06-2020(online)].pdf 2020-06-02
7 202041023042-Form-26_Power of Attorney_11-06-2020.pdf 2020-06-11
8 202041023042-Correspondence_11-06-2020.pdf 2020-06-11
9 202041023042-POA [27-12-2021(online)].pdf 2021-12-27
10 202041023042-FORM 13 [27-12-2021(online)].pdf 2021-12-27
11 202041023042-AMENDED DOCUMENTS [27-12-2021(online)].pdf 2021-12-27
12 202041023042-FORM 18 [28-12-2021(online)].pdf 2021-12-28
13 202041023042-Correspondence_Form13, Form18, Power of Attorney_30-12-2021.pdf 2021-12-30
14 202041023042-Correspondence_Form1, Form13, Power of Attorney_30-12-2021.pdf 2021-12-30
15 202041023042-FER.pdf 2025-09-30

Search Strategy

1 202041023042_SearchStrategyNew_E_Searchstrategy(202041023042)E_18-09-2025.pdf