
Design Of Calibrx (Comprehensive Avesthagen Liquid Biopsy Risk Screen), Targeted Sequencing Gene Panels For Rapid Clinical Diagnosis Of Hereditary Genetic Conditions

Abstract: The present invention relates to the design of a multigene NGS capture panel, CALiBRx (Comprehensive Avesthagen Liquid Biopsy Risk Screen), that is useful for prognosis of cancer and for rapid clinical diagnosis of hereditary genetic conditions. The panel is based on nucleosome binding patterns, determined from the size or number of cfNA (e.g., cfDNA) fragments mapping to particular genomic regions. It identifies the predisposition of an individual by employing short-read technology and correlating genetic variants with disease phenotypes that are mined from global repositories of variant-disease associations. The method involves collecting genetic screening and demographic data from clients, storing clients' data and DNA samples, processing and analyzing genetic testing data in conjunction with other relevant health data, generating custom reports, maintaining life-long health records and pre-populating data into user-accessible personal health records. A risk score for each individual is derived by feeding the mutational burden of each of the 624 prognostic genes into a risk score algorithm, based on which the individual is classified as Low, Intermediate, or High Risk.


Patent Information

Application #
Filing Date
23 January 2024
Publication Number
30/2025
Publication Type
INA
Invention Field
BIO-MEDICAL ENGINEERING
Status
Parent Application

Applicants

AVESTAGENOME PROJECT INTERNATIONAL PRIVATE LIMITED
Yolee Grande, Level II, Pottery Road, Richards Town, Bangalore – 560005, Karnataka, India
AGENOME LLC
Regus, 15 N, Main Street, Suite 100, West Hartford, CT - 06107, United States of America

Inventors

1. VILLOO MORAWALA PATELL
Yolee Grande, Level II, 14, Pottery Road, Richards Town, Bangalore - 560005, Karnataka, India
2. KASHYAP KRISHNASAMY
Yolee Grande, Level II, 14, Pottery Road, Richards Town, Bangalore - 560005, Karnataka, India
3. NASEER PASHA
Yolee Grande, Level II, 14, Pottery Road, Richards Town, Bangalore - 560005, Karnataka, India

Specification

CALiBRx (COMPREHENSIVE AVESTHAGEN LIQUID BIOPSY RISK SCREEN) GENE PANEL FOR RAPID CLINICAL DIAGNOSIS OF HEREDITARY GENETIC CONDITIONS
Abstract
The invention pertains to the development of a multigene NGS capture panel, CALiBRx (Comprehensive Avesthagen Liquid Biopsy Risk Screen), designed for cancer prognosis and rapid diagnosis of hereditary genetic conditions. The panel leverages nucleosome binding patterns, determined by the size or number of cfNA (e.g., cfDNA) fragments mapping to specific genomic regions. It employs short-read sequencing to identify disease predisposition by correlating genetic variants with phenotypes from global variant-disease repositories. The method integrates genetic screening and demographic data, processes DNA samples, analyzes genetic and health data, and generates personalized reports. A risk score algorithm based on mutational burdens across 624 prognostic genes classifies individuals into Low, Intermediate, or High-Risk categories, enabling precision diagnostics and lifelong health management.
FIELD OF INVENTION
The present invention relates to the design of CALiBRx (Comprehensive Avesthagen Liquid Biopsy Risk Screen), a targeted sequencing gene panel for rapid clinical diagnosis of hereditary genetic conditions.
BACKGROUND OF INVENTION
1. The Avestagenome Project® is a unique and proprietary biobank of endogamous Zoroastrian-Parsi samples and personalized medical data, founded by Dr. Villoo Morawala-Patell in 2007. Its aims are, first, to preserve the biological heritage of the endogamous Parsi community and, second, to serve as a control population in a systems biology, big data mining approach that accelerates the discovery of novel biomarkers and advanced therapies for lung cancer. The invention is a method of biomarker-based precision medicine for lung cancer that draws on the unique, endogamous Zoroastrian-Parsi population to identify prognostic biomarkers for the onset of lung cancer, combining population genomics, pharmacology, and clinical insights.
2. A comprehensive detailing of mutations is indispensable for understanding, diagnosing and treating many diseases, including cancers and neurodegenerative conditions. A number of methods have been proposed for finding mutations from sequencing data, which usually consist of statistically evaluating the presence of variant bases compared to a reference. However, accurate determination of mutations remains a challenge in situations where a mutation is found only in a small fraction of reads. The delineation of such mutations is especially important in cancer. Such mutations matter not only for samples with low tumor content but also for capturing minor tumor subclones, in order to understand tumor heterogeneity and therefore the underlying causes of recurrence and therapeutic resistance.
3. Enrichment techniques are appealing for studying such samples because of the high uniformity and read depths they make possible. However, although the experimental techniques capture the information accurately, the existing analysis methods are not suited to the detection of low-frequency variants.
4. Cancer is a major cause of disease worldwide. Each year, tens of millions of people are diagnosed with cancer around the world, and more than half of the patients eventually die from it. In many countries, cancer ranks as the second most common cause of death after cardiovascular diseases. Early detection is associated with improved outcomes for many cancers. Lung cancer is the most common cancer worldwide, with poor prognosis following late diagnosis. The five-year survival rate for cancers detected at a localized stage (I–II) is about 56 percent. However, only 16 percent of lung cancer cases are diagnosed at an early stage. For advanced/metastatic tumors (stage IV), the five-year survival rate is only 5 percent. Therefore, it is imperative to identify diagnostic methods for early detection of lung cancer, enabling a timely treatment plan while potentially reducing healthcare costs. A 10-year study (1964-73) of 2,177 lung cancer cases showed that the incidence in non-Parsi males was almost double that in Parsi males. Similarly, well-designed Indian studies on Parkinson's disease and essential tremor estimate prevalence rates in Parsis, who are ethnically distinct from other Indian populations.
5. Among cancers, non-small-cell lung cancer lacks effective early diagnosis: more than 70% of cases are already at a middle or advanced stage when a definite diagnosis is made, by which time the optimal window for surgery has passed. The collective effectiveness of the various chemotherapy regimens for pulmonary carcinoma is only about 30%; some patients cannot tolerate chemotherapy and radiation, and some develop drug resistance after treatment.
6. Neurodegenerative diseases are a group of disorders characterized by changes in normal neuronal function, leading in the majority of cases to neuronal dysfunction and even cell death. Currently, it is estimated that there are in excess of one hundred neurodegenerative diseases. However, we still have little understanding of the etiological cause of these diseases.
7. To detect disease conditions, several screening tests are available. A physical exam and history assess general signs of health, including checking for signs of disease such as lumps or other unusual physical symptoms. A history of the patient's health habits and past illnesses and treatments is also taken. Laboratory tests are another type of screening test and may include medical procedures to procure samples of tissue, blood, urine, or other substances in the body before conducting laboratory testing. Imaging procedures screen for cancer by generating visual representations of areas inside the body. Genetic tests detect certain deleterious gene mutations linked to some types of cancer. Genetic testing is particularly useful for a number of diagnostic methods.
8. In the clinic, about 70% of non-small-cell lung cancer diagnoses rely on tissue biopsy samples, which are used primarily for clinical pathology. Additionally, tumor heterogeneity adds to the diagnostic bottleneck, because a tissue biopsy alone cannot confirm an effective diagnosis; other molecular diagnostic methods are therefore needed to complement pathological analysis.
9. In disease conditions like cancer, cells continually shed DNA into the circulation, where it is readily accessible (Stroun et al., 1987). Analysis of such cell-free DNA (cfDNA) has the potential to transform the detection and tracking of cancers and neurodegenerative conditions. The current invention aims to identify every possible therapeutically relevant change in the tumor through optimized target enrichment, complemented by sequencing on the NovaSeq 6000, the latest and most reliable sequencing technology. Using next-generation sequencing (NGS) technology, our panel covers 624 genes associated with tumors and neurological conditions, together with selected therapy-relevant fusions. Variations in these genes are known to have a significant impact on pathogenesis and progression. The generated data is summarized in a comprehensive report that supports the treating physician in finding an efficient treatment for each patient. Our invention offers the following advantages.
10. Disclosed herein is a method for determining ultra-low-frequency variants in a cell-free nucleic acid (cfNA) sample from an individual by detecting somatic mutations after screening across the CALiBRx gene panel. The method may comprise (a) obtaining a gDNA/cfNA sample; (b) selecting the cfNA for sequences corresponding to a plurality of regions of mutations in a cancer or other condition of interest; (c) sequencing the selected cfDNA, an approach that can potentially be extended to targeted gene sequencing and WES data as well; (d) determining the presence of somatic mutations, wherein the presence of the somatic mutations may be indicative of diseased cells present in the individual; and (e) providing the individual with an assessment of the presence of diseased cells.
11. Large panel approach: Full sequencing and analysis of 624 genes and cancer-specific gene fusions (Tables 1.1 and 1.2).
12. High average sequencing coverage to detect sub-clonal variants: 2500X to a maximum of 5000X (cfDNA) or 100X (Targeted Panel Sequencing) to detect Ultra Low Frequency (ULF) variants.
13. Sensitivity: >99.9%; Specificity: >99.9%. Analysis of sequenced cfDNA/cfRNA (5000X) is underway to identify low and ultra-low-frequency variants specific to lung cancer. The project will inform the decisions of policymakers, regulators, consumers, or healthcare professionals.
14. Compositions and methods, including methods of bioinformatic analysis, are provided for the highly sensitive analysis of circulating tumor DNA (ctDNA), i.e. cell-free DNA (cfDNA) sequences present in the blood of an individual that are derived from diseased cells. The methods of the invention may be referred to as the CALiBRx panel. Tumors of particular interest are solid tumors, including without limitation carcinomas, sarcomas, gliomas, lymphomas, melanomas, etc., although hematologic cancers, such as leukemias, are not excluded.
15. Further described herein is the design of a multigene NGS capture panel, composed of the markers described herein that can be used to provide a prognosis for lung cancer patients, and a larger spectrum of inherited disorders with neurological origin.
16. The present invention relates to genetic health data management. Specifically, the invention relates to a method, system, and computer program for collecting genetic screening and demographic data from clients, storing clients' data and DNA samples, processing and analyzing genetic testing data in conjunction with other relevant health data, generating custom reports, maintaining life-long health records and pre-populating data into user accessible personal health records.
17. A risk score for each individual is then derived by feeding the mutational burden of each of the 624 prognostic genes into a risk score algorithm. Risk groups are also described herein: patients are placed into different risk categories according to their risk score, for example "Low Risk," "Intermediate Risk," or "High Risk."
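
The risk stratification in item 17 can be sketched as follows. The specification does not disclose the scoring formula, so the weighted-sum form, the per-gene weights, the two cut-offs and the gene names (GENE_A, GENE_B, GENE_C) below are illustrative assumptions only; the three category names are the only part taken from the text.

```python
from typing import Dict

# Hypothetical cut-offs: the specification does not state the real thresholds.
LOW_CUTOFF = 5.0
HIGH_CUTOFF = 15.0


def risk_score(burden: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of per-gene mutation counts (an assumed form of the score)."""
    # Genes without an explicit weight default to 1.0 (also an assumption).
    return sum(count * weights.get(gene, 1.0) for gene, count in burden.items())


def risk_category(score: float) -> str:
    """Map a score onto the three categories named in the specification."""
    if score < LOW_CUTOFF:
        return "Low Risk"
    if score < HIGH_CUTOFF:
        return "Intermediate Risk"
    return "High Risk"


if __name__ == "__main__":
    # Toy input: placeholder gene names, made-up counts and weights.
    burden = {"GENE_A": 2, "GENE_B": 1, "GENE_C": 0}
    weights = {"GENE_A": 3.0, "GENE_B": 2.0}
    score = risk_score(burden, weights)
    print(score, risk_category(score))
```

In practice the weights and cut-offs would have to be derived from the study cohorts; the sketch only shows how a per-gene burden could be reduced to one of three categories.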

SUMMARY OF INVENTION
1. The Avestagenome Project®, founded by Dr. Villoo Morawala-Patell in 2007, is a proprietary biobank of Zoroastrian-Parsi samples aimed at preserving their biological heritage and accelerating lung cancer biomarker discovery through a systems biology approach. It enables precision medicine by identifying prognostic biomarkers for lung cancer, leveraging genomics, pharmacology, and clinical insights.
2. Comprehensive mutation analysis is crucial for understanding and treating diseases like cancer and neurodegenerative disorders. Current methods often struggle to detect mutations in a small fraction of reads, which is critical for identifying tumor heterogeneity, recurrence, and resistance.
3. Enrichment techniques provide high uniformity and depth for low-frequency variant detection, but existing analysis methods remain inadequate for such precision.
4. Cancer, the second leading cause of death globally, affects millions annually, with lung cancer being the most prevalent and deadliest due to late-stage diagnosis. Early detection improves survival rates significantly but remains a challenge, particularly for non-Parsi populations who have a higher incidence compared to Parsis.
5. Non-small-cell lung cancer (NSCLC) lacks effective early diagnostic tools, with over 70% diagnosed at advanced stages. Chemotherapy effectiveness is limited, and patients often face drug resistance or intolerability.
6. Neurodegenerative diseases, caused by neuronal dysfunction and cell death, encompass over 100 disorders. Their etiological causes remain poorly understood.
7. Screening methods for disease detection include physical exams, imaging, and genetic testing, which can identify gene mutations linked to certain cancers and other conditions.
8. Diagnosing 70% of NSCLC cases relies on tissue biopsies, but tumor heterogeneity necessitates molecular diagnostics to complement pathological analyses for accurate detection.
9. Cell-free DNA (cfDNA) shed by tumor cells into circulation allows for non-invasive cancer monitoring. Sequencing cfDNA using advanced technology like NovaSeq6000 and a 624-gene panel provides comprehensive data on tumor-related variations, aiding personalized treatment.
10. The CALiBRx method detects ultra-low frequency somatic mutations in cfDNA by targeting cancer-specific mutations. This approach enhances early detection and prognosis through sequencing and bioinformatics analysis.
11. The large-panel approach involves sequencing 624 genes and cancer-specific fusions with high coverage (up to 5000X for cfDNA) to identify low-frequency variants, offering >99.9% sensitivity and specificity.
12. The CALiBRx panel focuses on analyzing cfDNA from solid tumors like carcinomas, sarcomas, and gliomas, as well as hematologic cancers, to inform healthcare decisions.
13. A multigene NGS capture panel predicts lung cancer prognosis and identifies genetic markers for neurological and inherited disorders.
14. The invention includes a system for genetic health data management, combining genetic screening, demographic data, and lifelong health records to generate custom reports and risk assessments.
15. A risk score algorithm evaluates mutational burden across 624 prognostic genes, categorizing patients into "Low," "Intermediate," or "High Risk" groups for personalized care.
DETAILED DESCRIPTION OF THE INVENTION
1. Technical and scientific terms used herein retain their common meanings unless stated otherwise. Exemplary methods and materials are detailed below.
2. Amplifying: Generating one or more copies of a target nucleic acid using it as a template.
3. SNP: Single nucleotide polymorphism; a genomic position with two or more alternative alleles at a frequency of at least 1% in a population.
4. Enriching: Isolating specific genomic regions using a pre-synthesized panel with embedded baits for targeted analysis. This enhances capture uniformity and reproducibility.
5. Enriched Sample: A sample containing isolated DNA fragments, typically 100 bp to 1 kb in length, based on the fragmentation method used.
6. Genomic Region: A defined area of a genome, whether human, animal, or plant.
7. Plurality: Refers to at least two members, scaling up to billions.
8. Sequencing: Identifying at least 10 consecutive nucleotides (e.g., up to 200 or more) of a polynucleotide.
9. Next-Generation Sequencing (NGS): Parallelized sequencing technologies (e.g., Illumina, Roche) or emerging methods like nanopore or Ion Torrent.
10. Sequence Reads: Sequencing outputs represented by nucleotide strings, often accompanied by quality metrics.
11. Sequence Variant: A nucleic acid sequence differing from a reference at one or more positions, such as SNPs or somatic mutations.
12. Low-Frequency Sequence Variant: Variants present at <10% frequency in a sample, typically due to somatic mutations.
13. Reference Sequence: A known sequence used for comparison from public or proprietary databases.
14. Assembling: Aligning and merging sequence fragments to reconstruct a longer nucleic acid sequence.
15. Anchor: A sequence enabling alignment of longer sequences.
16. Sequence Contig: A contiguous nucleotide sequence produced by assembling overlapping fragments.
17. Associated with Cancer/Neurodegenerative Diseases: Genomic regions or genes with mutations linked to cancer or neurodegenerative phenotypes, often causative in nature.
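
As a small illustration of definition 12, the variant allele fraction (VAF) at a position can be computed from read counts and compared with the 10% threshold. This is a minimal sketch; the function names are hypothetical, and treating a position with zero variant reads as "not a variant" is an assumption.

```python
LOW_FREQUENCY_CUTOFF = 0.10  # definition 12: variants present at <10% frequency


def variant_allele_fraction(alt_reads: int, total_reads: int) -> float:
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    if not 0 <= alt_reads <= total_reads:
        raise ValueError("alt_reads must lie between 0 and total_reads")
    return alt_reads / total_reads


def is_low_frequency_variant(alt_reads: int, total_reads: int) -> bool:
    """True when the variant is observed but at a frequency below 10%."""
    vaf = variant_allele_fraction(alt_reads, total_reads)
    return 0.0 < vaf < LOW_FREQUENCY_CUTOFF


if __name__ == "__main__":
    # 25 variant reads out of 5000 is a VAF of 0.5%.
    print(variant_allele_fraction(25, 5000), is_low_frequency_variant(25, 5000))
```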
Methods:
The CALiBRx screening using genomic DNA begins with sample preparation, where genomic DNA is quantified using a Qubit fluorometer. For each sample, 100 ng of DNA is fragmented to an average size of 350 bp using a Covaris ME220 ultrasonicator. Sequencing libraries are prepared with the TruSeq Nano DNA Library Prep Kit (Illumina) and dual-index adapters, following the manufacturer’s protocol. The amplified libraries are evaluated on an Agilent TapeStation and quantified using the KAPA Library Quantification Kit (Roche) on the QuantStudio 7 Flex Real-Time PCR system (Thermo). Equimolar pools of sequencing libraries are then sequenced on an Illumina NovaSeq 6000 using S4 flow cells to generate 2x150 bp reads, ensuring 30x genome coverage per sample.
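
To give a sense of the sequencing volume behind "30x genome coverage" with 2x150 bp reads, mean coverage can be approximated as total sequenced bases divided by genome size. The sketch below assumes a genome size of about 3.1 Gb and ignores duplicates, unmapped reads and trimming losses, so it gives a lower bound rather than a run plan; the function names are hypothetical.

```python
import math

HUMAN_GENOME_BP = 3_100_000_000  # assumed approximate genome size


def read_pairs_for_coverage(target_coverage: float,
                            genome_size_bp: int = HUMAN_GENOME_BP,
                            read_length_bp: int = 150) -> int:
    """Read pairs needed so that total bases / genome size >= target coverage."""
    bases_per_pair = 2 * read_length_bp  # paired-end: two reads per fragment
    return math.ceil(target_coverage * genome_size_bp / bases_per_pair)


def mean_coverage(read_pairs: int,
                  genome_size_bp: int = HUMAN_GENOME_BP,
                  read_length_bp: int = 150) -> float:
    """Approximate mean coverage obtained from a given number of read pairs."""
    return read_pairs * 2 * read_length_bp / genome_size_bp


if __name__ == "__main__":
    pairs = read_pairs_for_coverage(30)
    print(pairs, mean_coverage(pairs))
```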
Capture probe design and enrichment for CALiBRx are carried out using Twist Custom Panels. These panels are designed to cover 624 genes of interest, considering panel sizes, target regions, and multiplexing requirements. The enriched coding regions are sequenced using paired-end 150 bp reads on the Illumina NovaSeq 6000, achieving an average coverage of at least 100x to maintain data quality. The sequencing process includes high-quality DNA isolation, library preparation, target capture, and sequencing, followed by trimming and adapter removal of short reads using AdapterRemoval (v2.2.2). Reads with a quality score below Q30 are discarded. The GATK pipeline (v4.1.5.0), along with Picard (v2.21.9) and Samtools (v1.3.1), is employed for read mapping, variant detection, and annotation. The workflow includes pre-processing raw reads, aligning them to the GRCh38 reference genome with BWA-MEM, and recalibrating base quality scores to reduce sequencing errors. Variants are identified and annotated after duplicate tagging and read alignment.
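
The order of steps in this workflow can be sketched as a list of shell commands assembled in Python. The specification names the tools but not their command lines, so every file name, the thread count, the read-group string, the known-sites file, the choice of HaplotypeCaller as the calling step and the specific options shown are assumptions for illustration. Nothing is executed, and the indexing and annotation steps are omitted.

```python
from typing import List, Tuple


def build_pipeline(sample: str, reference: str, known_sites: str,
                   threads: int = 8) -> List[Tuple[str, str]]:
    """Return (step name, shell command) pairs. Nothing is executed here."""
    # All file names are hypothetical placeholders derived from `sample`.
    r1, r2 = f"{sample}_R1.fastq.gz", f"{sample}_R2.fastq.gz"
    t1, t2 = f"{sample}_R1.trimmed.fastq.gz", f"{sample}_R2.trimmed.fastq.gz"
    read_group = rf"@RG\tID:{sample}\tSM:{sample}\tPL:ILLUMINA"
    sorted_bam = f"{sample}.sorted.bam"
    dedup_bam = f"{sample}.dedup.bam"
    recal_table = f"{sample}.recal.table"
    recal_bam = f"{sample}.recal.bam"
    vcf = f"{sample}.vcf.gz"
    return [
        # Adapter removal and quality trimming (assumed options).
        ("trim",
         f"AdapterRemoval --file1 {r1} --file2 {r2} --output1 {t1} --output2 {t2} "
         f"--gzip --trimns --trimqualities --minquality 30"),
        # Alignment with BWA-MEM, coordinate-sorted with Samtools.
        ("align",
         f"bwa mem -t {threads} -R '{read_group}' {reference} {t1} {t2} "
         f"| samtools sort -o {sorted_bam} -"),
        # Duplicate tagging.
        ("mark_duplicates",
         f"gatk MarkDuplicates -I {sorted_bam} -O {dedup_bam} "
         f"-M {sample}.dup_metrics.txt"),
        # Base quality score recalibration in two passes.
        ("bqsr_table",
         f"gatk BaseRecalibrator -I {dedup_bam} -R {reference} "
         f"--known-sites {known_sites} -O {recal_table}"),
        ("apply_bqsr",
         f"gatk ApplyBQSR -I {dedup_bam} -R {reference} "
         f"--bqsr-recal-file {recal_table} -O {recal_bam}"),
        # Variant calling (assumed to be HaplotypeCaller).
        ("call_variants",
         f"gatk HaplotypeCaller -R {reference} -I {recal_bam} -O {vcf}"),
    ]


if __name__ == "__main__":
    for name, command in build_pipeline("sample1", "GRCh38.fa", "known_sites.vcf.gz"):
        print(f"[{name}] {command}")
```

Keeping the commands as data makes the step order easy to review before anything is run; each string would still need checking against the installed tool versions.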
For the CALiBRx screening using cell-free DNA (cfDNA), whole blood is collected in Streck tubes and processed within three days. Plasma is separated by centrifugation, and cfDNA is extracted using the QIAamp MinElute ccfDNA Midi Kit. This process yields 30 to 300 ng of cfDNA from 8-10 ml of plasma. The cfDNA is prepared for sequencing using the Takara ThruPLEX Tag-Seq HV kit, optimized for cfDNA to enhance library complexity and maintain GC representation. This kit allows input volumes up to 30 µl and incorporates UMI barcoding for high-sensitivity and specificity detection of low-frequency alleles. Libraries generated are suitable for CNV analysis, whole-genome sequencing, or targeted enrichment.
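
The idea behind UMI barcoding can be illustrated with a simple consensus step: reads that share a UMI and an alignment start are treated as copies of one original molecule, and a majority vote per position suppresses amplification and sequencing errors. This is a generic sketch, not the analysis shipped with the kit; the read representation, the minimum family size and the strict-majority rule are assumptions.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

Read = Tuple[str, int, str]  # (UMI, alignment start, read sequence)


def umi_consensus(reads: Iterable[Read],
                  min_family_size: int = 2) -> Dict[Tuple[str, int], str]:
    """Collapse reads sharing (UMI, start) into one consensus sequence each."""
    families: Dict[Tuple[str, int], List[str]] = defaultdict(list)
    for umi, start, seq in reads:
        families[(umi, start)].append(seq)

    consensus: Dict[Tuple[str, int], str] = {}
    for key, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few copies to correct errors (assumed policy)
        length = min(len(s) for s in seqs)  # compare only the shared prefix
        bases = []
        for i in range(length):
            base, count = Counter(s[i] for s in seqs).most_common(1)[0]
            # Keep the base only with a strict majority, otherwise emit N.
            bases.append(base if 2 * count > len(seqs) else "N")
        consensus[key] = "".join(bases)
    return consensus


if __name__ == "__main__":
    reads = [
        ("AAC", 100, "ACGT"),
        ("AAC", 100, "ACGT"),
        ("AAC", 100, "ACTT"),  # one copy carries an error at the third base
        ("GGT", 100, "ACGA"),  # singleton family, dropped
    ]
    print(umi_consensus(reads))
```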
Targeted sequencing for the CALiBRx panel involves hybridizing genomic regions of interest with biotinylated oligonucleotide probes, followed by enrichment using streptavidin beads. The captured libraries are sequenced on an Illumina platform with 2x150 bp chemistry, achieving 5000x coverage for cfDNA and 100x for targeted sequencing. This approach enables the detection of variants at a frequency as low as 0.5%.
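
Why depth matters for a 0.5% variant can be shown with a simple binomial model: at depth n and variant allele fraction p, the number of variant reads is treated as Binomial(n, p), and the variant counts as detectable when at least k variant reads are seen. The model ignores sequencing error and assumes independent reads, and the threshold of 5 supporting reads is a hypothetical choice, not a value from the specification.

```python
import math


def expected_alt_reads(depth: int, vaf: float) -> float:
    """Mean number of variant-supporting reads at a position."""
    return depth * vaf


def detection_probability(depth: int, vaf: float, min_alt_reads: int) -> float:
    """P(at least min_alt_reads variant reads) under a Binomial(depth, vaf) model."""
    if not 0.0 <= vaf <= 1.0:
        raise ValueError("vaf must be between 0 and 1")
    # Sum the probabilities of seeing fewer than min_alt_reads variant reads.
    p_below = sum(
        math.comb(depth, i) * vaf ** i * (1.0 - vaf) ** (depth - i)
        for i in range(min(min_alt_reads, depth + 1))
    )
    return min(1.0, max(0.0, 1.0 - p_below))


if __name__ == "__main__":
    for depth in (100, 5000):
        print(depth,
              expected_alt_reads(depth, 0.005),
              detection_probability(depth, 0.005, min_alt_reads=5))
```

Under these assumptions a 0.5% variant yields about 25 supporting reads on average at 5000x but only about 0.5 at 100x, which is why the lower depth is unsuitable for variants at that frequency.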
The study includes experimental groups representing different genetic and lifestyle contexts. The Parsi cohort, which avoids tobacco-related products due to religious reasons, provides a unique genetic background. In addition, the study involves 200 non-Parsi non-smokers to account for genetic variability, 200 heavy smokers with no diagnosis of tobacco-related pathologies, and 200 heavy smokers diagnosed with lung cancer prior to treatment. This cohort design ensures robust comparisons across genetic and environmental influences on tobacco-related diseases.
Genomic DNA Analysis:
The genomic DNA samples were fragmented to an average size of 350 bp using ultrasonication (Covaris ME220 ultrasonicator). Sequencing libraries were prepared with the TruSeq Nano DNA Library Prep Kit (Illumina) and validated using an Agilent TapeStation. Libraries were pooled equimolarly and sequenced on an Illumina NovaSeq 6000 using S4 flow cells to achieve 30x genome coverage. Twist Custom Panels were designed for the 624 CALiBRx genes, targeting specific coding regions. Sequencing generated high-quality reads with at least 100x average coverage, ensuring the reliability of variant identification.
Quality Control and Variant Calling:
Sequencing reads underwent trimming and adapter removal with AdapterRemoval, retaining only high-quality reads (Q30). The GATK pipeline facilitated variant detection, supported by tools such as BWA-MEM, Picard, and Samtools. Pre-processed reads were mapped to the GRCh38 reference genome, and local reassembly of haplotypes enabled the identification of single nucleotide variants (SNVs) and indels. Variants were annotated and prioritized for clinical relevance.
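
A Phred score Q corresponds to an error probability of 10^(-Q/10), so Q30 means roughly one wrong base call in 1,000. The text does not say whether the Q30 criterion was applied per base or per read; the sketch below assumes a per-read filter on the mean Phred score with the standard Phred+33 encoding, and its function names are hypothetical.

```python
from typing import List


def phred_scores(quality_string: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string (Phred+33 by default) into integer scores."""
    return [ord(ch) - offset for ch in quality_string]


def error_probability(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def passes_q30(quality_string: str, min_mean_quality: float = 30.0) -> bool:
    """Keep a read when its mean Phred score reaches the threshold (assumed rule)."""
    scores = phred_scores(quality_string)
    if not scores:
        return False  # empty reads are discarded
    return sum(scores) / len(scores) >= min_mean_quality


if __name__ == "__main__":
    # 'I' encodes Q40 and '#' encodes Q2 in Phred+33.
    print(passes_q30("IIIIIIII"), passes_q30("########"), error_probability(30))
```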
Experimental data 1:
Figure 1: Overall validated workflow for cfDNA analysis for the present study from plasma separation to sequencing through cfDNA extraction protocols.

Figure 2: Summary of samples collected for cfDNA analysis for early biomarker identification for lung carcinoma associated with smoking



Figure 3: Distribution of samples collected and Age Distribution of male and female subjects across the study groups

Table 1.1: List of genes in CALiBRx Panel

Table 1.2: List of fusion genes part of CALiBRx Panel

Table 1.3: Exonic coordinates of genes in CALiBRx
CLAIMS
We Claim:
1. A method for detecting single nucleotide variants (SNVs) or insertions/deletions (indels) in cell-free DNA (cfDNA) from a plasma sample, comprising:
(a) Amplifying cfDNA molecules from the plasma;
(b) Enriching the amplified cfDNA using a CALiBRx sequencing panel targeting genomic regions selected based on a cancer tumor biopsy, generating an enriched set of cfDNA molecules representing no more than 100-200 base pairs of the human genome;
(c) Detecting the presence or absence of SNVs or indels in the enriched cfDNA using sequencing.
2. The method of claim 1, wherein sequencing is performed at a depth of at least 5000 reads per base.
3. The method of claim 1, further comprising comparing sequence data from cfDNA of lung cancer treatment-naive patients with a cohort of healthy Parsi and non-Parsi individuals, as well as smokers.
4. The method of claim 3, wherein reference sequences are from healthy individuals.
5. The method of claim 1, wherein baseline frequency of base positions is derived from a healthy cohort and used for determining base calls.
6. The method of claim 1, wherein the base frequency in a cohort of healthy individuals is compared to that of a smoker who has developed lung cancer but is untreated.
7. The method of claim 1, wherein the enriched cfDNA represents no more than 150-200 base pairs of the genome.
8. The method of claim 1, wherein the enriched cfDNA represents no more than 5000 base pairs of the human genome.
9. The method of claim 8, wherein the cancer is lung cancer.
10. The method of claim 9, wherein the subject shows no detectable symptoms of cancer.
11. The method of claim 1, further comprising generating a consensus sequence from multiple reads to minimize amplification or sequencing errors.
12. The method of claim 1, further comprising tagging the enriched cfDNA molecules with barcoded primers before sequencing.
13. The method of claim 1, wherein the CALiBRx panel covers no more than 624 genes and gene fusions, achieving at least 85% sensitivity for detecting SNVs or indels. The panel uses 120 nt probes covering 1,741,642 bp of the hg19 reference genome. A total target region of 1,741,642 bp is directly covered by 18,270 probes. The rest of the target regions are not directly covered by probes due to repeats. Most of the individual non-redundant regions specified by the target file will tend to be fully or partially covered by probes. Only 54 out of 9,196 whole target regions are found not to have any probes and will not be captured (these constitute 0.34% of the total unique region targets).
14. The method of claim 1, wherein the panel targets no more than 20 genomic regions to achieve at least 85% sensitivity for lung cancer, validated by association studies in OMIM, ClinVar, and other sources.
15. The method of claim 1, wherein enrichment in step (b) is performed using amplification-based methods.
16. The method of claim 1, further comprising filtering out a portion of the sequence reads or aligned reads.

Dated this 23rd day of January 2024

Sheethal Suryaprakash
(Patent Agent No. IN/PA-1443)
of PHNINE PRIVATE LIMITED
Agent for the Applicant

To,
The Controller of Patents
The Patent Office
Chennai

Documents

Application Documents

# Name Date
1 202441004521-PROVISIONAL SPECIFICATION [23-01-2024(online)].pdf 2024-01-23
2 202441004521-PROOF OF RIGHT [23-01-2024(online)].pdf 2024-01-23
3 202441004521-POWER OF AUTHORITY [23-01-2024(online)].pdf 2024-01-23
4 202441004521-FORM 1 [23-01-2024(online)].pdf 2024-01-23
5 202441004521-DECLARATION OF INVENTORSHIP (FORM 5) [23-01-2024(online)].pdf 2024-01-23
6 202441004521-COMPLETE SPECIFICATION [23-01-2025(online)].pdf 2025-01-23