
Identification Of Biomarkers From Parsi Genome

Abstract: Single nucleotide polymorphisms (SNPs) / Copy Number Variations (CNVs) and Epigenome as in the Parsi-Zarathushti genome sequence and corresponding polynucleotides thereof.


Patent Information

Application #
Filing Date: 02 July 2010
Publication Number: 18/2012
Publication Type: INA
Invention Field: BIOTECHNOLOGY
Status
Parent Application

Applicants

AVESTHAGEN LIMITED
'DISCOVERER', 9TH FLOOR, INTERNATIONAL TECH.PARK, WHITEFIELD ROAD,BANGALORE - 560 066.

Inventors

1. VILLOO MORAWALA PATELL
C/O AVESTHAGEN LIMITED,'DISCOVERER', 9TH FLOOR, INTERNATIONAL TECH PARK, WHITEFIELD ROAD,BANGALORE - 560 066.
2. SAMI NOSHIR GUZDER
C/O AVESTHAGEN LIMITED,'DISCOVERER', 9TH FLOOR, INTERNATIONAL TECH PARK, WHITEFIELD ROAD,BANGALORE - 560 066.
3. CHELLAPPA GOPALAKRISHNAN
C/O AVESTHAGEN LIMITED,'DISCOVERER', 9TH FLOOR, INTERNATIONAL TECH PARK, WHITEFIELD ROAD,BANGALORE - 560 066.
4. NAVEEN SHARMA
C/O AVESTHAGEN LIMITED,'DISCOVERER', 9TH FLOOR, INTERNATIONAL TECH PARK, WHITEFIELD ROAD,BANGALORE - 560 066.

Specification

Technical Field of Invention
Genome-wide association (GWA) studies have been instrumental in identifying loci that are involved in multigenic diseases in humans. Combined with population-based studies, they have led to the identification of biomarkers for a wide range of diseases, such as type II diabetes, atrial fibrillation and exfoliative glaucoma, that can be used as tools for predictive, personalized healthcare.
The AVESTAGENOME Project is unique since it involves a detailed analysis of all the major 'systems' in the cell and subsequent integration that will enable the derivation of a comprehensive status of the cell, and by extension to the individual, under various conditions.
The Parsi population is unique in that it is genetically homogeneous and well defined, such that its genealogy can be traced back for several generations. This is an important consideration for studying the effect of genes and environment on the health of an individual.
Prior Art
The Parsis of India are Zoroastrians who migrated from Iran in the 9th century A.D. Subsequent to their migration, they have maintained the integrity of their population, with marriages being strictly within the community. As a result of generations of inbreeding, increased incidences of both positive traits and certain inherited diseases are seen amongst the Parsi population. Dissecting this complexity requires the ability to gather and correlate detailed information on disease and genetic variations across a large group of people.
A number of common diseases like Parkinson's disease, stroke, heart disease, specific cancers and Alzheimer's result from the interplay of multiple genes and environmental and health factors. Unraveling this complexity requires the ability to gather and correlate detailed information on disease and genetic variations across a large group of people.
The Parsis are a rare population with generally higher longevity and are therefore ideal for population genetic studies. The AVESTAGENOME Project™ is being initiated to enable the archiving of the genomes of the 69,000 members of the Parsi community and to determine the genetic basis of longevity and its related disorders. In addition, the study will generate a model for abbreviated clinical trials, pharmacogenomics-based therapies and development of biomarkers for predictive diagnostics and drug discovery.
GWA studies have been instrumental in identifying loci that are involved in multigenic diseases in humans (Ref. 1). Combined with population-based studies, they have led to the identification of biomarkers for a wide range of diseases, such as type II diabetes, atrial fibrillation and exfoliative glaucoma, that can be used as tools for predictive, personalized healthcare (Ref. 2, 3, 4, 5).

The AVESTAGENOME Project™ is unique since it involves a detailed analysis of all the major 'systems' in the cell that will enable the derivation of a comprehensive status of the cell, and by extension to the individual, under various conditions.

Further, the systems biology approach will be applied to prostate cancer, metabolic diseases such as diabetes, age-related degenerative conditions such as arthritis, and neurological diseases as well.

In this regard, genome-wide scans of genomic DNA from Parsi cases and controls will be undertaken using the Affymetrix SNP 6.0 chip to find differential SNPs that can be associated with a particular trait or disease. In addition, the transcriptome, proteome and metabolome of these samples will be analyzed.

On the bioinformatics front, the tools proposed for use are Affymetrix GeneChip Command Console, Affymetrix Genotyping Console, PLINK (an open-source whole genome association analysis toolset) and Haploview.

Affymetrix GeneChip Command Console (AGCC) provides a set of tools for instrument control and data management used in the processing of GeneChip probe arrays. It summarizes probe cell intensity data (.CEL file generation) and enables sample and array registration, data management, fluidics and scanning instrument control as well as automatic and manual image gridding.

Affymetrix Genotyping Console (GTC) is a genotyping analysis software package for whole-genome genotyping analysis and quality control for collections of Genome-Wide SNP Array 6.0 CEL files. It implements the Birdseed algorithm for SNP Array 6.0. It requires the information stored in library and annotation files from NetAffx to analyze the CEL files and display and export additional information about the SNPs (such as Chromosome, Physical Position, Allele, etc.) as well as for certain analysis and filtering steps.

Affymetrix Power Tools (APT) are a set of cross-platform command line programs that implement algorithms for analyzing and working with Affymetrix GeneChip® arrays. APT is an open-source project licensed under the GNU General Public License (GPL).

PLINK is a free, open-source whole genome association analysis toolset, designed to perform a range of basic, large-scale analyses in a computationally efficient manner. The focus of PLINK is purely on analysis of genotype/phenotype data.

PLINK is being developed by Shaun Purcell at the Center for Human Genetic Research (CHGR), Massachusetts General Hospital (MGH), and the Broad Institute of Harvard & MIT, with the support of others. (URL: http://pngu.mgh.harvard.edu/purcell/plink/)

Haploview is designed to simplify and expedite the process of haplotype analysis by providing a common interface to several related tasks, such as LD and haplotype block analysis and haplotype population frequency estimation.

References:

1. A genome-wide association study identifies KIAA0350 as a type 1 diabetes gene. Nature (2007) 448, 591-594.

2. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature (2007) 447, 661-678.

3. Broad sweep of genome zeroes in on diabetes. Nature (2007) 445, 688-689.

4. Human genetics: Variants in common diseases. Nature (2007) 445, 828-830.

5. Mapping determinants of human gene expression by regional and genome-wide association. Nature (2005) 437, 1365-1369.

Description of Figures:

Figure 1: Genotyping process: Work breakdown structure for wet lab process

Figure 2: Work breakdown structure for array scan and image analysis

Detailed Description

The AVESTAGENOME Project™ is a Systems Biology study of the Parsi- Zarathushti community to determine the genetic basis of longevity and its related disorders. In addition, the study will generate a model for pharmacogenomics-based therapies, development of biomarkers for predictive diagnostics and drug discovery.

Despite extensive research efforts for more than a decade, the genetic basis of common human diseases remains largely unknown. Linkage and candidate gene association studies have often failed to deliver definitive results. Yet the identification of the variants, genes and pathways involved in particular diseases offers a potential route to new therapies, improved diagnosis and better disease prevention. Recent advances, including the International HapMap resource and the availability of dense genotyping chips, provide a possible solution through GWA studies.

This systems biology approach will include: Genomics, wherein a genome-wide analysis for loci associated with longevity and age-related diseases will be performed; Proteomics of the serum proteome for the discovery of novel biomarkers (biosignatures); Metabolome analysis, involving the quantitative measurement of intracellular and/or extracellular metabolites, which is essential to understanding metabolism and the regulation of cellular systems; Transcriptome analysis for gene expression profiling; and the isolation and preservation of peripheral blood mononuclear cells (PBMCs) from the participants for drug screening and potential future therapy.

The information and technology generated from this project will allow multiple future uses, such as identification of genes linked to diseases, risk prediction and planning of strategies for disease prevention, development of diagnostic kits for early prediction of disease, development of population-validated drug targets and molecular biomarkers, and identification of new gene therapy targets.
The results of a study on the Parsi population could have wide-ranging implications on human health for the general population around the world.
Genotyping using the Affymetrix Genome-Wide Human SNP Array 6.0
The Affymetrix Genome-Wide Human SNP Nsp/Sty 5.0/6.0 Assay is designed for processing 48 samples (Figure 1). The protocol is presented in the following stages:
Genomic DNA Plate Preparation
Prepare the genomic DNA by:
Determining the concentration of each sample.
Diluting each sample to 50 ng/µL using reduced EDTA TE buffer.
Aliquoting 5 µL of each sample to the corresponding wells of two 96-well plates.
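The dilution step above follows the usual C1·V1 = C2·V2 relation. The following is a minimal sketch of that arithmetic, not part of the Affymetrix protocol: the function names and the 20 µL final volume are assumptions chosen for illustration, while the 50 ng/µL target and the 5 µL aliquot come from the text.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul=50.0, final_volume_ul=20.0):
    """Return (stock_ul, buffer_ul) needed to reach the target concentration.

    Uses C1 * V1 = C2 * V2. The 20 uL default final volume is an
    illustrative assumption, not a value from the protocol.
    """
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already below the target concentration")
    stock_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_ul, final_volume_ul - stock_ul


def dna_per_well_ng(aliquot_ul=5.0, conc_ng_per_ul=50.0):
    """Mass of genomic DNA placed in each well for one aliquot."""
    return aliquot_ul * conc_ng_per_ul


if __name__ == "__main__":
    stock_ul, buffer_ul = dilution_volumes(200.0)
    print(f"stock: {stock_ul} uL, buffer: {buffer_ul} uL")
    print(f"DNA per well: {dna_per_well_ng()} ng")
```

Under these assumptions, each 5 µL aliquot at 50 ng/µL places 250 ng of genomic DNA in a well.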
Stage 1: Sty Restriction Enzyme Digestion
During this stage, the genomic DNA is digested by the Sty I restriction enzyme.
Prepare a Sty Digestion Master Mix.
Add the master mix to one set of 48 samples.
Place the samples onto a thermal cycler and run the GW5.0/6.0 Digest program.
Stage 2: Sty Ligation
During this stage, the digested samples are ligated using the Sty Adaptor.
Prepare a Sty Ligation Master Mix.
Add the master mix to the samples.
Place the samples onto a thermal cycler and run the GW5.0/6.0 Ligate program.
Dilute the ligated samples with AccuGENE water.
Stage 3: Sty PCR
Steps in this stage:
Transfer equal amounts of each Sty ligated sample into three fresh 96-well plates (as shown in figure below).
Prepare the Sty PCR Master Mix, and add it to each sample.
Place each plate on a thermal cycler and run the GW5.0/6.0 PCR program.
Confirm the PCR by running 3 µL of each PCR product on a 2% TBE gel or an E-Gel 48 2% agarose gel.
Stage 4: Nsp Restriction Enzyme Digestion
During this stage, the genomic DNA is digested by the Nsp I enzyme.
Prepare a Nsp Digestion Master Mix.
Add the master mix to one set of 48 samples.
Place the samples onto a thermal cycler and run the GW5.0/6.0 Digest program.
Stage 5: Nsp Ligation
During this stage, the digested samples are ligated using the Nsp Adaptor.
Prepare a Nsp Ligation Master Mix.
Add the master mix to the samples.
Place the samples onto a thermal cycler and run the GW5.0/6.0 Ligate program.
Dilute the ligated samples with AccuGENE water.
Stage 6: Nsp PCR
Steps in this stage:
Transfer equal amounts of each Nsp ligated sample into four fresh 96-well plates.
Prepare the Nsp PCR Master Mix, and add it to each sample.
Place each plate on a thermal cycler and run the GW5.0/6.0 PCR program.
Confirm the PCR by running 3 µL of each PCR product on a 2% TBE gel or an E-Gel 48 2% agarose gel.
Stage 7: PCR Product Pooling and Purification
Steps in this stage:
Pool the Sty and Nsp PCR reactions to a single deep well pooling plate, for a total of 700 µL/well.
Add beads to each pool and incubate.
Transfer each pool to a filter plate and dry down on a vacuum manifold.
Wash the PCR products with EtOH and dry down.
Elute the PCR products using Buffer EB.
Vacuum and spin transfer the PCR products to a new 96-well plate.
Stage 8: Quantitation
During this stage, prepare one dilution of each PCR product in optical plates. Then quantitate the diluted PCR products.
Stage 9: Fragmentation
During this stage the purified PCR products will be fragmented using Fragmentation Reagent. First dilute the Fragmentation Reagent by adding the appropriate amount of Fragmentation Buffer and AccuGENE water. Quickly add the diluted reagent to each reaction, place the plate onto a thermal cycler, and run the GW5.0/6.0 Fragment program.
Once the program is finished, check the results of this stage by running 1.5 µL of each reaction on a 4% TBE gel or an E-Gel 48 4% agarose gel.
Stage 10: Labeling
Steps in this stage:
Label the fragmented samples using the DNA Labeling Reagent.
Prepare the Labeling Master Mix.
Add the mix to each sample.
Place the samples onto a thermal cycler and run the GW5.0/6.0 Label program.
Stage 11: Target Hybridization
During this stage, each reaction is loaded onto a Genome-Wide Human SNP Array 6.0.
First, prepare a Hybridization Master Mix and add the mix to each sample. Then, denature the samples on a thermal cycler.
After denaturation, load each sample onto a Genome-Wide Human SNP Array 6.0 - one sample per array. The arrays are then placed into a hybridization oven that has been preheated to 50 °C. Samples are left to hybridize for 16 to 18 hours.
Washing, staining and scanning the arrays
Wash and Stain Arrays
The staining protocol for mapping arrays is a three-stage process:
1. A Streptavidin Phycoerythrin (SAPE) stain.
2. An antibody amplification step.
3. A final stain with SAPE.
Once stained, each array is filled with Array Holding Buffer prior to scanning.
Prepare Arrays for Washing and Staining
Prepare the following buffers and solutions: Stain Buffer, SAPE Stain Solution, Antibody Stain Solution and Array Holding Buffer.
Wash and Stain Protocol
The GenomeWideSNP6_450 protocol is an antibody amplification protocol for mapping targets. Use it to wash and stain the Genome-Wide Human SNP Array 6.0.
Scanning the Arrays
The GeneChip Scanner 3000 7G is controlled by the AGCC software.
Prepare the Scanner
Turn on the scanner at least 10 minutes before use.
Prepare Arrays for Scanning
Scanning the Array
1. Select the sample name (AGCC) that corresponds to the array being scanned.
2. Following the AGCC instructions as appropriate, load the array into the scanner and begin the scan.
Only one scan per array is required. Pixel resolution and wavelength are preset and cannot be changed.
The scanned data will be analyzed using software such as Affymetrix GeneChip Command Console, Affymetrix Genotyping Console (GTC) and Affymetrix Power Tools (APT). The end product will be the actual SNP calls and Copy Number Variations (CNVs) for all the data points on the chips. The process is shown diagrammatically in Figure 2.
The following requirements must be observed when the assay is carried out:
DNA must be double-stranded (not single-stranded).
DNA must be free of PCR inhibitors.
DNA must not be contaminated with other human genomic DNA sources, or with genomic DNA from other organisms.
DNA must not be highly degraded.
Bioinformatics
The following types of analysis (Figure 2) have been planned:
I. Single SNP association analysis:
Allele and genotype based association analysis on the disease trait for all common SNPs (MAF > 5%)
II. Copy number variation and LOH analysis:
a. Joint SNP and CNV tests for common copy number variants
b. Analysis of rare copy number variant data
c. Loss of Heterozygosity analysis
III. Haplotype Association analysis:
a. Testing association to two-marker haplotypes
b. Testing for at-risk variants with lower frequencies (<5%) but not very rare (>1%) using block haplotypes with frequency <5% and >1%
IV. Meta-analysis:
Perform the association analysis by combining the results from many case-control groups
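The single-SNP association test (I) and the meta-analysis (IV) listed above can be illustrated with a minimal sketch. This is not PLINK code and not necessarily the statistic the project will use: the allelic chi-square test, the Woolf log odds ratio and the fixed-effect inverse-variance pooling are assumptions picked as common choices, and all function names are hypothetical. Only the MAF > 5% threshold comes from the text.

```python
import math


def minor_allele_frequency(a_count, b_count):
    """Frequency of the rarer of two alleles."""
    return min(a_count, b_count) / (a_count + b_count)


def is_common(a_count, b_count, threshold=0.05):
    """Apply the MAF > 5% filter stated in the analysis plan."""
    return minor_allele_frequency(a_count, b_count) > threshold


def allelic_chi_square(case_a, case_b, ctrl_a, ctrl_b):
    """Pearson chi-square (1 df) on the 2x2 table of allele counts."""
    n = case_a + case_b + ctrl_a + ctrl_b
    den = ((case_a + case_b) * (ctrl_a + ctrl_b)
           * (case_a + ctrl_a) * (case_b + ctrl_b))
    if den == 0:
        raise ValueError("a row or column of the table is empty")
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / den
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def log_odds_ratio(case_a, case_b, ctrl_a, ctrl_b):
    """Log odds ratio and its standard error (Woolf method)."""
    if 0 in (case_a, case_b, ctrl_a, ctrl_b):
        raise ValueError("zero cell: log odds ratio is undefined")
    log_or = math.log((case_a * ctrl_b) / (case_b * ctrl_a))
    se = math.sqrt(1 / case_a + 1 / case_b + 1 / ctrl_a + 1 / ctrl_b)
    return log_or, se


def fixed_effect_meta(estimates):
    """Inverse-variance pooled effect from (log_or, se) pairs, one per group."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    return pooled, math.sqrt(1.0 / sum(weights))


if __name__ == "__main__":
    # Hypothetical allele counts for two case-control groups.
    groups = [(60, 40, 40, 60), (110, 90, 95, 105)]
    for g in groups:
        print(allelic_chi_square(*g))
    print(fixed_effect_meta([log_odds_ratio(*g) for g in groups]))
```

A fixed-effect model assumes the same underlying effect in every case-control group; if the groups are heterogeneous, a random-effects model would be the more cautious choice.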
Current genotyping technology, and the output it produces, does not provide the level of detail required to establish the cause of the trait, condition or disease under study.
In this regard, we will establish an automated high-throughput system that generates a list of targets to be analyzed by direct sequencing using third-generation sequencing technologies, covering all aspects of sample processing and assay.
In the initial phase of the study, 79 samples have been genotyped and analyzed in comparison to the HapMap samples to identify SNP markers specific to the Parsi population.
We have found 564 SNPs that have a different allele in all the 79 samples when compared to the HapMap samples.
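The comparison against HapMap can be sketched as a simple filter. This is only one possible reading of "a different allele in all the 79 samples" (every study sample carries at least one allele never observed in the reference panel at that SNP); the data layout, the function name and the SNP identifiers are hypothetical, and missing genotype calls are not handled.

```python
def population_specific_snps(study, reference):
    """Return SNP ids where every study sample carries an allele
    that is absent from the reference panel.

    Both arguments map a SNP id to a list of two-letter genotype
    calls (for example "AG"), one call per sample.
    """
    flagged = []
    for snp_id, study_calls in study.items():
        ref_calls = reference.get(snp_id)
        if not ref_calls or not study_calls:
            continue  # SNP not typed in both panels
        ref_alleles = set("".join(ref_calls))
        if all(set(call) - ref_alleles for call in study_calls):
            flagged.append(snp_id)
    return sorted(flagged)


if __name__ == "__main__":
    # Toy data with made-up SNP ids.
    study = {"snp1": ["AG", "GG"], "snp2": ["CC", "CT"], "snp3": ["TT"]}
    reference = {"snp1": ["AA", "AA"], "snp2": ["CC", "CT"]}
    print(population_specific_snps(study, reference))
```

A stricter reading, in which the study samples share no allele at all with the reference panel, would replace the set difference with a disjointness check.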







We Propose to claim

1. Single nucleotide polymorphisms (SNPs) / Copy Number Variations (CNVs) and Epigenome as in the Parsi-Zarathushti genome sequence and corresponding polynucleotides thereof.

2. The nucleotide sequence as claimed in claim 1, wherein the SNP / CNV is associated with the trait of interest or study in human.

3. A process to identify SNPs/CNVs / Epigenome in a nucleotide sequence comprising steps of:

a. Aligning genomic sequences of different human populations to the Parsi-Zarathushti genome to select the known and also novel SNPs; and

b. Comparing the highly conserved polynucleotide sequence in human population to the Parsi-Zarathushti genome to identify the said SNPs.

4. A process claimed in claim 3, wherein it is used to identify CNVs in the Parsi-Zarathushti genome nucleotide sequence.

5. Use of the sequences encompassing Parsi-Zarathushti genome sequence, as targets for drug design using bioinformatics and other tools, drug development, for gene therapy and vaccine development.

6. Use of the DNA sequences encompassing Parsi-Zarathushti genome sequence and the Epigenome with all encompassing modifications, as targets for drug design using bioinformatics and other tools, drug development, for gene therapy and vaccine development and for development of drugs effective against infectious diseases.

7. Use of proteins, RNA, DNA and metabolites encoded by the region carrying the polymorphisms in Parsi-Zarathushti genome as claimed in claim 1 for all aspects leading to and comprising of personalized medicine, for RNAi technology and other antisense RNA or noncoding RNA technologies and for drug therapy regimen and pharmacogenomics.

8. The method for generating and developing a database for identification and screening of the polymorphisms in the Parsi-Zarathushti genome as claimed in claim 1.

9. A diagnostic kit(s) for predictive diagnosis of all diseases and prognostic diagnosis of the response to therapy for all disease(s) that may manifest as result of polymorphisms / rearrangements in the Parsi-Zarathushti genome as claimed in claim 1.

10. A diagnostic kit(s) for predictive diagnosis of all health states that may manifest as a result of polymorphisms / rearrangements in the Parsi-Zarathushti genome as claimed in claim 1.

11. The generation of Stem cells from peripheral blood mononuclear cells (PBMCs) using iPSC technology(ies) or any other method thereof. The differentiation of these into a particular tissue or cell type(s) for use in regenerative medicine and / or patient- specific cellular therapy.

12. The use of the patient-specific cells for candidate drug screening during drug development and / or for evaluating drug efficacy. The cells could also be used to rule out adverse drug events for a particular individual thereby leading to personalization of therapy.

Documents

Application Documents

# Name Date
1 1887-che-2010 claims 02-07-2010.pdf 2010-07-02
2 1887-CHE-2010 FORM-2 02-07-2010.pdf 2010-07-02
3 1887-che-2010 correspondence others 02-07-2010.pdf 2010-07-02
4 1887-che-2010 form-5 02-07-2010.pdf 2010-07-02
5 1887-CHE-2010 DESCRIPTION(COMPLETE) 02-07-2010.pdf 2010-07-02
6 1887-che-2010 form-3 02-07-2010.pdf 2010-07-02
7 1887-che-2010 drawings 02-07-2010.pdf 2010-07-02
8 1887-che-2010 form-1 02-07-2010.pdf 2010-07-02