Abstract: Disclosed herein is a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI (100). The system comprises a 3D MRI data acquisition module (102) configured to obtain volumetric brain scans of a subject. The system also includes a spatial feature extraction module (104) configured to extract local structural features from the acquired 3D MRI data. The system also includes a global contextual modeling module (106) configured to process the spatial features extracted by the CNN. The system also includes a multimodal fusion module (108) configured to integrate additional neuroimaging modalities and clinical biomarkers. The system also includes an explainable artificial intelligence (XAI) module (110) configured to highlight critical brain regions and model features relevant to Alzheimer’s disease classification. The system also includes a classification module (112) configured to generate an early-stage Alzheimer’s disease diagnosis based on the combined output of the Transformer, CNN, and multimodal fusion modules.
Description: FIELD OF THE DISCLOSURE
[0001] The present disclosure relates generally to the field of medical imaging and computational intelligence. More specifically, it pertains to a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI.
BACKGROUND OF THE DISCLOSURE
[0002] Alzheimer’s disease (AD) is a progressive neurodegenerative disorder characterized by cognitive decline, memory impairment, and functional deterioration, ultimately affecting the quality of life of millions of individuals worldwide. As the most common form of dementia, Alzheimer’s disease presents a substantial social, economic, and healthcare burden, particularly in aging populations. The global prevalence of AD has been steadily rising, largely due to increased life expectancy and demographic shifts. Epidemiological studies indicate that an estimated 50 million people are currently living with dementia, and this number is expected to triple by 2050, with Alzheimer’s disease accounting for approximately 60-70% of these cases. Early diagnosis is crucial not only for implementing effective therapeutic interventions but also for slowing disease progression, optimizing care strategies, and facilitating patient planning and support.
[0003] Neuroimaging has become a cornerstone in the diagnosis and management of Alzheimer’s disease. Among various neuroimaging modalities, magnetic resonance imaging (MRI) offers a non-invasive approach for visualizing structural, functional, and microstructural changes in the brain. MRI provides high-resolution images of the cerebral anatomy, enabling clinicians and researchers to examine brain atrophy patterns, ventricular enlargement, and hippocampal volume reduction, which are hallmark features of Alzheimer’s disease. Structural MRI, in particular, has demonstrated significant utility in detecting subtle morphological changes associated with the early stages of AD, including mild cognitive impairment (MCI), which often precedes full-blown clinical manifestations. Studies have shown that patients with MCI exhibit characteristic atrophy in the medial temporal lobe, entorhinal cortex, and hippocampus, regions critically involved in memory processing and cognitive function. These structural biomarkers are essential for early detection and differentiation of AD from other forms of dementia and age-related cognitive decline.
[0004] Conventional approaches for analyzing MRI scans rely heavily on manual inspection and visual assessment by radiologists. While experienced clinicians can identify gross morphological changes, early-stage Alzheimer’s disease often presents with subtle, diffuse alterations that may escape conventional observation. Manual interpretation is also prone to inter- and intra-observer variability, leading to inconsistencies in diagnosis. To address these limitations, automated and semi-automated computational methods have been developed, which utilize machine learning and image processing techniques to extract quantitative features from MRI scans. Early methods often relied on voxel-based morphometry, region-of-interest (ROI) analysis, and statistical parametric mapping to identify disease-related structural variations. These approaches, though informative, have limited ability to capture complex, non-linear patterns inherent in high-dimensional neuroimaging data.
[0005] In recent years, deep learning has emerged as a transformative technology in the field of medical image analysis. Convolutional neural networks (CNNs), a class of deep learning architectures designed for image processing, have demonstrated remarkable success in feature extraction, pattern recognition, and classification tasks. CNNs are particularly well-suited for neuroimaging applications due to their capacity to automatically learn hierarchical representations from raw image data without requiring handcrafted features. In the context of Alzheimer’s disease, CNN-based models have been extensively explored for tasks such as disease classification, progression prediction, and biomarker identification. These models leverage convolutional layers to detect local patterns, pooling layers for spatial downsampling, and fully connected layers for decision-making, enabling robust analysis of complex 3D brain structures.
[0006] Despite their advantages, CNNs face certain limitations when applied to volumetric neuroimaging data. The high dimensionality of 3D MRI scans increases computational demands and memory requirements, making training and deployment of deep CNN models challenging. Moreover, conventional CNNs primarily focus on local receptive fields, which may limit their ability to capture long-range spatial dependencies and global contextual information within the brain. This limitation is particularly critical in Alzheimer’s disease, where structural changes are often distributed across multiple regions and inter-regional interactions provide valuable diagnostic insights. To overcome these challenges, researchers have explored various strategies, including 3D CNN architectures, multi-scale feature extraction, and data augmentation techniques. While these approaches have improved performance, they are still constrained in their ability to model global dependencies and complex anatomical relationships inherent in volumetric brain data.
[0007] Transformers, originally developed for natural language processing tasks, have recently been adapted to vision applications due to their ability to capture long-range dependencies through self-attention mechanisms. Unlike CNNs, which rely on localized convolutions, transformers process input sequences in parallel and compute pairwise interactions between all elements, enabling comprehensive modeling of both local and global relationships. Vision transformers (ViTs) and their 3D extensions have shown promise in medical imaging tasks, including segmentation, classification, and anomaly detection. The self-attention mechanism in transformers allows the model to weigh the importance of different regions of the input, providing flexibility in capturing intricate structural patterns and interactions across the brain. This capacity is particularly beneficial for Alzheimer’s disease diagnosis, where subtle changes in multiple regions may collectively indicate disease onset.
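By way of non-limiting illustration only, the self-attention computation described above may be sketched as follows. This is a minimal NumPy example in which each row of the input represents one feature token (e.g., one brain region or image patch); it deliberately omits the learned query/key/value projections and multi-head structure of a full transformer and is a simplified sketch, not a disclosed implementation.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a sequence of feature vectors.

    x: (n_tokens, d) array; each token could represent one brain region or
    patch. Every output token is a weighted mixture of ALL input tokens,
    which is what gives the mechanism its long-range (global) reach.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)                 # pairwise token similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # each row sums to 1
    return weights @ x                            # global mixing of tokens
```

Because every output row depends on every input row, distant regions can influence one another in a single layer, in contrast to the strictly local receptive field of a convolution.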
[0008] In addition to architectural innovations, the integration of multimodal data has become increasingly important for improving diagnostic accuracy. Alzheimer’s disease is characterized by heterogeneous pathological changes, including amyloid-beta accumulation, tau protein aggregation, vascular alterations, and neuroinflammation. While structural MRI provides valuable anatomical information, combining it with other modalities such as positron emission tomography (PET), functional MRI (fMRI), and cerebrospinal fluid (CSF) biomarkers has been shown to enhance early detection and prognostic assessment. Multimodal approaches enable comprehensive analysis of both structural and functional aspects of the brain, facilitating a more holistic understanding of disease mechanisms. Furthermore, machine learning models that incorporate multimodal features can improve generalizability and robustness, reducing susceptibility to noise and imaging artifacts.
[0009] Preprocessing and standardization of neuroimaging data are critical steps in ensuring reliable analysis. MRI scans are often acquired from multiple centers using different scanners, protocols, and resolutions, leading to variability that can affect model performance. Common preprocessing steps include skull stripping, bias field correction, spatial normalization, intensity normalization, and segmentation of brain tissues into gray matter, white matter, and cerebrospinal fluid. These procedures enhance the quality and consistency of the input data, enabling downstream models to focus on relevant anatomical structures. Additionally, data augmentation techniques, such as rotation, scaling, and flipping, are frequently employed to expand the training dataset and improve model generalization, particularly in cases where the availability of labeled MRI scans is limited.
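As a non-limiting sketch of the augmentation operations mentioned above, the following NumPy example applies a random left-right flip and a mild intensity rescale to a volumetric scan. The flip axis and jitter range are illustrative assumptions chosen for the sketch; any label-preserving transform could be substituted.

```python
import numpy as np

def augment_volume(vol, rng):
    """Simple label-preserving augmentations for a 3D MRI volume.

    vol: (D, H, W) array. A random axis flip and a small intensity rescale
    expand the effective training set without altering diagnostic content.
    """
    if rng.random() < 0.5:
        vol = np.flip(vol, axis=2)     # illustrative left-right flip
    scale = rng.uniform(0.9, 1.1)      # mild intensity jitter
    return vol * scale
```

In practice such transforms are applied on the fly during training so that each epoch sees a slightly different version of every scan.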
[0010] Evaluation of Alzheimer’s disease diagnosis systems involves rigorous performance assessment using appropriate metrics. Classification models are typically evaluated using accuracy, sensitivity, specificity, precision, recall, F1-score, and area under the receiver operating characteristic curve (AUC-ROC). Cross-validation and independent testing on external datasets are essential to ensure robustness and prevent overfitting. Furthermore, explainability and interpretability of models are increasingly emphasized in clinical contexts. Techniques such as saliency maps, Grad-CAM, and attention visualization help elucidate which brain regions contribute most to model decisions, providing clinicians with insights into the underlying pathological changes and fostering trust in automated diagnostic systems.
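The core classification metrics listed above can be computed directly from a binary confusion matrix, as in the following illustrative NumPy sketch (binary AD-vs-control labels are assumed for simplicity; multi-class staging would use per-class versions of the same quantities).

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels.

    Sensitivity is recall on the positive (disease) class; specificity is
    the true-negative rate on the control class.
    """
    y_true = np.asarray(y_true, bool)
    y_pred = np.asarray(y_pred, bool)
    tp = np.sum(y_true & y_pred)
    tn = np.sum(~y_true & ~y_pred)
    fp = np.sum(~y_true & y_pred)
    fn = np.sum(y_true & ~y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
    }
```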
[0011] The growing availability of large-scale neuroimaging datasets has accelerated research in machine learning-based Alzheimer’s disease diagnosis. Public repositories such as the Alzheimer’s Disease Neuroimaging Initiative (ADNI), the Open Access Series of Imaging Studies (OASIS), and the Minimal Interval Resonance Imaging in Alzheimer’s Disease (MIRIAD) dataset provide annotated MRI scans covering various disease stages. These datasets facilitate benchmarking, model comparison, and reproducibility of research findings. In parallel, advances in computational hardware, including graphics processing units (GPUs) and tensor processing units (TPUs), have enabled the training of increasingly complex models on high-dimensional volumetric data, making real-time and large-scale analysis feasible.
[0012] Despite substantial progress, several challenges persist in the domain of automated Alzheimer’s disease diagnosis. First, early-stage detection remains difficult due to subtle anatomical changes and overlapping patterns with normal aging. Second, variability across populations, scanners, and acquisition protocols introduces heterogeneity that can hinder model generalization. Third, interpretability and clinical validation of deep learning models are essential to ensure adoption in real-world healthcare settings. Addressing these challenges requires the development of advanced computational frameworks that combine the strengths of different modeling paradigms, incorporate global and local features, and leverage high-dimensional 3D MRI data effectively.
[0013] The landscape of Alzheimer’s disease diagnosis has evolved significantly with the advent of neuroimaging and computational methods. MRI has established itself as a crucial modality for detecting structural brain changes, while machine learning, particularly deep learning, has enabled automated, quantitative analysis of complex neuroanatomical patterns. CNNs excel at local feature extraction but are limited in capturing long-range dependencies, whereas transformer-based architectures offer enhanced capability for modeling global contextual information. Multimodal integration, robust preprocessing, and access to large annotated datasets further augment the potential of computational approaches. Nevertheless, early diagnosis, generalization across heterogeneous data, and model interpretability continue to pose significant challenges. The ongoing research emphasizes the need for innovative frameworks that can accurately and reliably detect Alzheimer’s disease at its earliest stages, ultimately facilitating timely interventions and improved patient outcomes.
[0014] The body of research in this domain underscores a clear trend toward hybrid modeling approaches, combining the localized feature extraction capability of CNNs with the global attention mechanisms of transformers, particularly for volumetric neuroimaging data such as 3D MRI scans. Such integrative methodologies aim to leverage the complementary strengths of both architectures, addressing limitations inherent in traditional single-paradigm models. Furthermore, with increasing computational power and sophisticated preprocessing pipelines, there is a growing opportunity to harness rich neuroimaging datasets for predictive modeling and early diagnostic interventions. These developments form the backdrop against which innovations in automated Alzheimer’s disease diagnosis continue to emerge, reflecting an ongoing effort to translate advanced computational techniques into clinically actionable tools.
[0015] Thus, in light of the above-stated discussion, there exists a need for a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI.
SUMMARY OF THE DISCLOSURE
[0016] The following is a summary description of illustrative embodiments of the invention. It is provided as a preface to assist those skilled in the art to more rapidly assimilate the detailed design discussion which ensues and is not intended in any way to limit the scope of the claims which are appended hereto in order to particularly point out the invention.
[0017] According to illustrative embodiments, the present disclosure focuses on a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI which overcomes the above-mentioned disadvantages or provides users with a useful or commercial choice.
[0018] An objective of the present disclosure is to optimize computational efficiency of hybrid CNN-Transformer models, reducing training and inference times without compromising accuracy, making the solution practical for routine clinical deployment.
[0019] Another objective of the present disclosure is to develop a robust hybrid deep learning framework that combines the strengths of Convolutional Neural Networks (CNNs) and Transformer models for accurate and early detection of Alzheimer’s disease from 3D brain MRI scans.
[0020] Another objective of the present disclosure is to automatically extract both local and global spatial features from high-dimensional neuroimaging data, leveraging CNNs for fine-grained structural details and Transformers for modeling long-range dependencies across brain regions.
[0021] Another objective of the present disclosure is to improve diagnostic performance through multimodal data fusion, integrating structural MRI, PET scans, cognitive scores, and potentially other relevant clinical or genomic data, enabling comprehensive assessment of Alzheimer’s disease progression.
[0022] Another objective of the present disclosure is to reduce reliance on large labeled datasets by implementing data-efficient training techniques, transfer learning strategies, or self-supervised pretraining, thus addressing one of the major limitations of current deep learning models in clinical neuroimaging.
[0023] Another objective of the present disclosure is to enhance the interpretability of model predictions so that clinicians can understand and trust the diagnostic outputs, aligning AI-based insights with established clinical knowledge for improved decision-making.
[0024] Another objective of the present disclosure is to provide a non-invasive, timely, and scalable diagnostic solution that supports early-stage detection of Alzheimer’s disease, ultimately reducing the risk of physical complications such as falls and fractures among elderly patients.
[0025] Another objective of the present disclosure is to improve generalizability across diverse patient populations and imaging modalities, ensuring that the system maintains high diagnostic accuracy even when exposed to heterogeneous datasets from different scanners or demographic groups.
[0026] Another objective of the present disclosure is to validate and benchmark the proposed system against existing state-of-the-art approaches, demonstrating superior performance in terms of accuracy, sensitivity, specificity, and early detection capability.
[0027] Yet another objective of the present disclosure is to establish a clinically deployable AI-assisted diagnostic tool, facilitating seamless integration into hospital workflows and enabling healthcare professionals to leverage advanced neuroimaging analytics for personalized patient care and monitoring.
[0028] In light of the above, a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI comprises a 3D MRI data acquisition module configured to obtain volumetric brain scans of a subject. The system also includes a spatial feature extraction module configured to extract local structural features from the acquired 3D MRI data. The system also includes a global contextual modeling module configured to process the spatial features extracted by the CNN. The system also includes a multimodal fusion module configured to integrate additional neuroimaging modalities and clinical biomarkers. The system also includes an explainable artificial intelligence (XAI) module configured to highlight critical brain regions and model features relevant to Alzheimer’s disease classification. The system also includes a classification module configured to generate an early-stage Alzheimer’s disease diagnosis based on the combined output of the Transformer, CNN, and multimodal fusion modules.
[0029] In one embodiment, the 3D MRI data acquisition module is configured to perform pre-processing of the volumetric brain scans including noise reduction, normalization, and alignment of the brain images to a standard anatomical template.
[0030] In one embodiment, the spatial feature extraction module comprises a convolutional neural network (CNN) having multiple layers including convolutional layers, pooling layers, and activation layers to extract hierarchical local structural features from the 3D MRI data.
[0031] In one embodiment, the global contextual modeling module comprises a transformer-based architecture configured to capture long-range dependencies between spatial features and model global contextual relationships in the brain scans.
[0032] In one embodiment, the multimodal fusion module is configured to integrate additional neuroimaging modalities including functional MRI (fMRI), diffusion tensor imaging (DTI), and positron emission tomography (PET), and clinical biomarkers such as cerebrospinal fluid (CSF) protein levels and cognitive test scores.
[0033] In one embodiment, the explainable artificial intelligence (XAI) module utilizes saliency maps, attention maps, or gradient-based feature attribution methods to highlight critical brain regions and provide interpretable insights for Alzheimer’s disease classification.
[0034] In one embodiment, the classification module employs a softmax layer or other probabilistic classifier to generate an early-stage Alzheimer’s disease diagnosis and a confidence score corresponding to the classification result.
[0035] In one embodiment, the system further comprises a training module configured to train the hybrid transformer-CNN system using labeled 3D MRI datasets with known Alzheimer’s disease progression stages.
[0036] In one embodiment, the multimodal fusion module employs attention-based weighting to prioritize features from different modalities and clinical biomarkers for improving diagnostic accuracy.
[0037] In one embodiment, the 3D MRI data acquisition module includes motion correction and artifact removal algorithms to improve the quality of volumetric brain scans before feature extraction.
[0038] These and other advantages will be apparent from the present application of the embodiments described herein.
[0039] The preceding is a simplified summary to provide an understanding of some embodiments of the present invention. This summary is neither an extensive nor exhaustive overview of the present invention and its various embodiments. The summary presents selected concepts of the embodiments of the present invention in a simplified form as an introduction to the more detailed description presented below. As will be appreciated, other embodiments of the present invention are possible utilizing, alone or in combination, one or more of the features set forth above or described in detail below.
[0040] These elements, together with the other aspects of the present disclosure and various features are pointed out with particularity in the claims annexed hereto and form a part of the present disclosure. For a better understanding of the present disclosure, its operating advantages, and the specified object attained by its uses, reference should be made to the accompanying drawings and descriptive matter in which there are illustrated exemplary embodiments of the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
[0041] To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the following briefly describes the accompanying drawings required for describing the embodiments or the prior art. Apparently, the accompanying drawings in the following description merely show some embodiments of the present disclosure, and a person of ordinary skill in the art can derive other implementations from these accompanying drawings without creative efforts. All of the embodiments or the implementations shall fall within the protection scope of the present disclosure.
[0042] The advantages and features of the present disclosure will become better understood with reference to the following detailed description taken in conjunction with the accompanying drawings, in which:
[0043] FIG. 1 illustrates a flowchart outlining the sequential steps involved in a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI, in accordance with an exemplary embodiment of the present disclosure; and
[0044] FIG. 2 illustrates a flowchart showing the working of a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI, in accordance with an exemplary embodiment of the present disclosure.
[0045] Like reference numerals refer to like parts throughout the description of the several views of the drawings.
[0046] In the hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI, like reference letters indicate corresponding parts in the various figures. It should be noted that the accompanying figures are intended to present illustrations of exemplary embodiments of the present disclosure. The figures are not intended to limit the scope of the present disclosure, and it should also be noted that they are not necessarily drawn to scale.
DETAILED DESCRIPTION OF THE DISCLOSURE
[0047] The following is a detailed description of embodiments of the disclosure depicted in the accompanying drawings. The embodiments are in such detail as to communicate the disclosure. However, the amount of detail offered is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the present disclosure.
[0048] In the following description, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be apparent to one skilled in the art that embodiments of the present disclosure may be practiced without some of these specific details.
[0049] Various terms as used herein are shown below. To the extent a term is used, it should be given the broadest definition persons in the pertinent art have given that term as reflected in printed publications and issued patents at the time of filing.
[0050] The terms “a” and “an” herein do not denote a limitation of quantity, but rather denote the presence of at least one of the referenced items.
[0051] The terms “having”, “comprising”, “including”, and variations thereof signify the presence of a component.
[0052] Reference is now made to FIG. 1 and FIG. 2 to describe various exemplary embodiments of the present disclosure. FIG. 1 illustrates a flowchart outlining the sequential steps involved in a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI, in accordance with an exemplary embodiment of the present disclosure.
[0053] A hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI 100 comprises a 3D MRI data acquisition module 102 configured to obtain volumetric brain scans of a subject. The 3D MRI data acquisition module 102 is configured to perform pre-processing of the volumetric brain scans including noise reduction, normalization, and alignment of the brain images to a standard anatomical template. The 3D MRI data acquisition module 102 includes motion correction and artifact removal algorithms to improve the quality of volumetric brain scans before feature extraction.
[0054] The system also includes a spatial feature extraction module 104 configured to extract local structural features from the acquired 3D MRI data. The spatial feature extraction module 104 comprises a convolutional neural network (CNN) having multiple layers including convolutional layers, pooling layers, and activation layers to extract hierarchical local structural features from the 3D MRI data.
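By way of non-limiting illustration, the convolution and pooling operations underlying the spatial feature extraction module 104 may be sketched as follows. The loop-based "valid" convolution below is purely didactic (practical systems use optimized GPU kernels with many learned filters); the kernel values, stride, and pooling factor are illustrative assumptions.

```python
import numpy as np

def conv3d_valid(vol, kernel):
    """Minimal 'valid' 3D convolution (no padding, stride 1).

    vol: (D, H, W) volume; kernel: (kd, kh, kw) filter. Each output voxel
    responds only to a local neighborhood, i.e. the CNN's local receptive
    field.
    """
    kd, kh, kw = kernel.shape
    D, H, W = vol.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1))
    for z in range(out.shape[0]):
        for y in range(out.shape[1]):
            for x in range(out.shape[2]):
                out[z, y, x] = np.sum(vol[z:z+kd, y:y+kh, x:x+kw] * kernel)
    return out

def max_pool3d(vol, k=2):
    """Non-overlapping 3D max pooling for spatial downsampling."""
    D, H, W = (s // k for s in vol.shape)
    return vol[:D*k, :H*k, :W*k].reshape(D, k, H, k, W, k).max(axis=(1, 3, 5))
```

Stacking such convolution/pooling stages yields the hierarchical local features that are then handed to the global contextual modeling module.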
[0055] The system also includes a global contextual modeling module 106 configured to process the spatial features extracted by the CNN. The global contextual modeling module 106 comprises a transformer-based architecture configured to capture long-range dependencies between spatial features and model global contextual relationships in the brain scans.
[0056] The system also includes a multimodal fusion module 108 configured to integrate additional neuroimaging modalities and clinical biomarkers. The multimodal fusion module 108 is configured to integrate additional neuroimaging modalities including functional MRI (fMRI), diffusion tensor imaging (DTI), and positron emission tomography (PET), and clinical biomarkers such as cerebrospinal fluid (CSF) protein levels and cognitive test scores. The multimodal fusion module 108 employs attention-based weighting to prioritize features from different modalities and clinical biomarkers for improving diagnostic accuracy.
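The attention-based modality weighting of module 108 may be illustrated, without limitation, by the following sketch: a softmax over per-modality relevance logits yields weights summing to one, so more informative modalities contribute more to the fused vector. The modality names and scalar-logit formulation are illustrative assumptions; a deployed system would learn these weights.

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # numerical stability
    e = np.exp(z)
    return e / e.sum()

def fuse_modalities(features, relevance_logits):
    """Attention-style weighting of per-modality feature vectors.

    features: dict of modality name -> (d,) vector (e.g. 'mri', 'pet', 'csf').
    relevance_logits: dict of modality name -> scalar relevance score.
    Returns the weighted sum of modality vectors.
    """
    names = sorted(features)
    w = softmax(np.array([relevance_logits[n] for n in names]))
    return sum(wi * features[n] for wi, n in zip(w, names))
```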
[0057] The system also includes an explainable artificial intelligence (XAI) module 110 configured to highlight critical brain regions and model features relevant to Alzheimer’s disease classification. The explainable artificial intelligence (XAI) module 110 utilizes saliency maps, attention maps, or gradient-based feature attribution methods to highlight critical brain regions and provide interpretable insights for Alzheimer’s disease classification.
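One model-agnostic way to produce a saliency map of the kind module 110 may output is occlusion analysis, sketched below for illustration only: each cubic patch of the volume is zeroed in turn, and the drop in the model's class score marks how much the model relies on that region. This is one of several attribution techniques (gradient-based methods such as Grad-CAM are common alternatives); the patch size and scoring function here are illustrative assumptions.

```python
import numpy as np

def occlusion_saliency(vol, score_fn, patch=2):
    """Occlusion-based saliency for a 3D volume.

    score_fn: any callable mapping a (D, H, W) volume to a scalar class
    score. Regions whose occlusion causes large score drops are the ones
    the model depends on most.
    """
    base = score_fn(vol)
    D, H, W = vol.shape
    sal = np.zeros_like(vol, dtype=float)
    for z in range(0, D, patch):
        for y in range(0, H, patch):
            for x in range(0, W, patch):
                occluded = vol.copy()
                occluded[z:z+patch, y:y+patch, x:x+patch] = 0.0
                sal[z:z+patch, y:y+patch, x:x+patch] = base - score_fn(occluded)
    return sal
```

The resulting map can be overlaid on the MRI so clinicians can see which anatomical regions drove the classification.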
[0058] The system also includes a classification module 112 configured to generate an early-stage Alzheimer’s disease diagnosis based on the combined output of the Transformer, CNN, and multimodal fusion modules.
[0059] The classification module 112 employs a softmax layer or other probabilistic classifier to generate an early-stage Alzheimer’s disease diagnosis and a confidence score corresponding to the classification result.
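The softmax-based diagnosis and confidence output of module 112 may be sketched as follows. The stage labels (CN / MCI / AD) and the use of the arg-max probability as the confidence score are illustrative assumptions for this non-limiting example.

```python
import numpy as np

def classify(logits, labels=("CN", "MCI", "AD")):
    """Turn raw classifier logits into a diagnosis plus a confidence score.

    logits: (n_classes,) raw scores from the final fully connected layer.
    Returns the arg-max label and its softmax probability as confidence.
    """
    z = logits - logits.max()            # stable softmax
    p = np.exp(z) / np.exp(z).sum()
    i = int(np.argmax(p))
    return labels[i], float(p[i])
```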
[0060] The system further comprises a training module configured to train the hybrid transformer-CNN system using labeled 3D MRI datasets with known Alzheimer’s disease progression stages.
[0061] FIG. 1 illustrates a flowchart outlining the sequential steps involved in a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI.
[0062] At 102, the process begins with the 3D MRI data acquisition module, which is responsible for obtaining high-resolution volumetric scans of the subject's brain. This module ensures that comprehensive anatomical data is captured, encompassing both cortical and subcortical structures. The quality and resolution of the MRI scans acquired at this stage are crucial as they serve as the foundation for all subsequent processing, directly affecting the accuracy of feature extraction and classification.
[0063] At 104, once the MRI data is acquired, it is directed to the spatial feature extraction module. This module leverages convolutional neural networks (CNNs) to capture local structural characteristics of the brain. By applying hierarchical convolutional operations, the system identifies intricate patterns and subtle volumetric changes in regions of interest, such as the hippocampus and entorhinal cortex, which are known to exhibit early signs of Alzheimer’s disease. The CNN effectively reduces the dimensionality of the 3D MRI data while retaining spatially significant features that are essential for accurate modeling. These extracted features represent a detailed mapping of structural anomalies and variations across the brain volume.
[0064] At 106, following spatial feature extraction, the data flows into the global contextual modeling module, which employs a transformer-based architecture to process the CNN-extracted features. This module excels at capturing long-range dependencies and global interactions between distant brain regions, which are often critical in the context of neurodegenerative diseases. The transformer’s attention mechanism enables the system to weigh the relative importance of features across the entire brain volume, ensuring that subtle but clinically significant patterns are emphasized in the final analysis. By integrating these global contextual insights with the localized CNN features, the system achieves a holistic representation of the brain’s structural integrity.
[0065] At 108, the system further enhances its diagnostic capability through the multimodal fusion module. This module allows the integration of additional neuroimaging modalities, such as PET or fMRI, along with relevant clinical biomarkers like cognitive test scores or genetic information. By combining structural MRI data with complementary modalities, the system constructs a richer, multidimensional representation of the patient’s neurological and clinical profile. This fusion ensures that the diagnosis is informed not solely by anatomical changes but also by functional and biochemical indicators, thereby improving sensitivity to early-stage Alzheimer’s disease.
[0066] At 110, to ensure transparency and interpretability of the results, the flowchart incorporates an explainable artificial intelligence (XAI) module. This module highlights critical brain regions and model features that contribute most significantly to the Alzheimer’s disease classification. By generating attention maps or saliency overlays on the 3D MRI scans, the XAI module allows clinicians and researchers to visualize which areas of the brain the system considers most indicative of pathology. This not only aids in clinical validation but also enhances trust in the automated diagnostic process, bridging the gap between black-box AI models and practical medical decision-making.
[0067] At 112, the processed and fused information reaches the classification module. This module synthesizes the outputs from the transformer, CNN, and multimodal fusion modules to generate a definitive early-stage Alzheimer’s disease diagnosis. The classification algorithm evaluates the combined feature representations, assigning a probability or categorical label corresponding to the presence or progression of Alzheimer’s pathology. By leveraging both local structural details and global contextual relationships, as well as multimodal clinical information, the classification module provides a highly accurate and reliable diagnostic prediction, thereby facilitating timely intervention and personalized care planning for patients at risk of Alzheimer’s disease.
[0068] FIG. 2 illustrates a flowchart showing working of a hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI.
[0069] At the outset, the framework begins with input data acquisition, which includes 3D structural MRI (sMRI), positron emission tomography (PET), diffusion tensor imaging (DTI), and cognitive scores. These inputs provide complementary information, capturing not only the anatomical and structural features of the brain but also metabolic, connectivity, and cognitive function patterns that are associated with early-stage Alzheimer’s disease.
[0070] Following data acquisition, a preprocessing module is employed, which performs essential operations such as skull stripping and normalization. Skull stripping removes non-brain tissues from the MRI scans, ensuring that subsequent analysis focuses solely on the relevant brain structures. Normalization standardizes the data across subjects, correcting for differences in scale, orientation, and intensity, thereby enabling consistent feature extraction. Once preprocessing is complete, the 3D MRI data is passed through 3D convolutional layers designed to extract local spatial features. These layers perform localized feature extraction and registration, allowing the system to detect subtle structural variations, such as hippocampal atrophy, that are indicative of Alzheimer’s progression.
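The two computational steps in this paragraph, intensity normalization followed by localized 3D convolution, can be sketched directly. The kernel, volume size, and random input below are illustrative assumptions; a trained network would learn many such kernels per layer.

```python
import numpy as np

def normalize(volume):
    # Intensity standardization across the volume (zero mean, unit std).
    return (volume - volume.mean()) / (volume.std() + 1e-8)

def conv3d(volume, kernel):
    """Naive valid-mode 3D convolution (no padding, stride 1)."""
    k = kernel.shape[0]
    d, h, w = (s - k + 1 for s in volume.shape)
    out = np.zeros((d, h, w))
    for x in range(d):
        for y in range(h):
            for z in range(w):
                out[x, y, z] = np.sum(volume[x:x+k, y:y+k, z:z+k] * kernel)
    return out

rng = np.random.default_rng(42)
scan = normalize(rng.normal(loc=100.0, scale=20.0, size=(8, 8, 8)))
kernel = np.zeros((3, 3, 3))
kernel[1, 1, 1] = 1.0  # identity tap, purely for a checkable example
features = conv3d(scan, kernel)
print(features.shape)  # (6, 6, 6)
```

In practice one would use an optimized library routine rather than this triple loop, but the sliding-window arithmetic is what "localized feature extraction" amounts to.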
[0071] The extracted features are then integrated with multimodal embeddings, which represent complementary data from PET, DTI, and cognitive scores. A feature fusion module is used to combine these diverse inputs through cross-attention and concatenation mechanisms. The cross-attention aligns information across modalities, enabling the model to leverage complementary biomarkers effectively, while concatenation consolidates the fused features into a coherent representation for downstream processing.
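The cross-attention-then-concatenation pattern described here can be shown with a single attention head. Token counts and dimensions are assumptions for the example; real systems would add learned query/key/value projections.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, keys_values):
    """MRI tokens query PET tokens; attended PET context is concatenated
    onto the MRI features (single head, no learned projections)."""
    scores = queries @ keys_values.T / np.sqrt(queries.shape[1])
    weights = softmax(scores, axis=1)   # (n_mri, n_pet), rows sum to 1
    context = weights @ keys_values     # attended PET info per MRI token
    return np.concatenate([queries, context], axis=1)

rng = np.random.default_rng(0)
mri_tokens = rng.normal(size=(4, 16))  # 4 MRI patch features (assumed)
pet_tokens = rng.normal(size=(6, 16))  # 6 PET patch features (assumed)
fused = cross_attend(mri_tokens, pet_tokens)
print(fused.shape)  # (4, 32)
```

Note the division of labor: cross-attention does the alignment across modalities, and the final `concatenate` is the consolidation step the paragraph describes.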
[0072] To capture long-range dependencies and global contextual information, the framework incorporates Transformer encoders. Initially, the features undergo patch embedding, segmenting the volumetric data into manageable patches suitable for the Transformer architecture. The Transformer encoders then apply self-attention mechanisms to model relationships across distant brain regions, thereby complementing the localized spatial features extracted by the CNN. This combination of CNNs for local feature extraction and Transformers for global modeling constitutes the hybrid core of the architecture.
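Patch embedding followed by self-attention can be reduced to its essentials as below. This is a bare sketch under assumed sizes; production encoders add learned linear projections, positional encodings, multi-head attention, MLP blocks, and layer normalization.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def patch_embed(volume, p):
    """Cut the volume into non-overlapping p-cubes; flatten each to a token."""
    d, h, w = volume.shape
    tokens = []
    for x in range(0, d, p):
        for y in range(0, h, p):
            for z in range(0, w, p):
                tokens.append(volume[x:x+p, y:y+p, z:z+p].ravel())
    return np.stack(tokens)

def self_attention(tokens):
    # Every token attends to every other, so spatially distant patches
    # (e.g. left and right hemispheres) interact in one step.
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])
    return softmax(scores, axis=1) @ tokens

vol = np.arange(4 ** 3, dtype=float).reshape(4, 4, 4)  # toy 4x4x4 volume
tokens = patch_embed(vol, 2)    # 8 tokens, each of length 8
out = self_attention(tokens)
print(tokens.shape, out.shape)  # (8, 8) (8, 8)
```

The all-pairs `tokens @ tokens.T` term is what gives the Transformer its long-range reach, in contrast to the fixed-size receptive field of the convolution sketched earlier.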
[0073] The output of the Transformer encoders is passed through fully connected layers followed by a softmax classifier, producing diagnostic predictions for categories such as cognitively normal (CN), mild cognitive impairment (MCI), and Alzheimer’s disease (AD). The model incorporates explainability mechanisms, such as attention mapping and region-aware visualizations, which highlight the brain regions contributing most significantly to the prediction, facilitating interpretability and clinical validation.
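The classification head itself is a small piece of arithmetic: a fully connected layer mapping the fused features to three logits, then a softmax. The weights below are random placeholders standing in for trained parameters, and the 32-dimensional feature size is an assumption.

```python
import numpy as np

CLASSES = ["CN", "MCI", "AD"]  # the three categories named in the text

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(7)
W = rng.normal(size=(3, 32))   # placeholder for trained weights
b = np.zeros(3)                # placeholder bias

features = rng.normal(size=32)          # fused feature vector (assumed dim)
probs = softmax(W @ features + b)       # class probabilities, sum to 1
prediction = CLASSES[int(np.argmax(probs))]
print(prediction, probs.round(3))
```

Reporting `probs` alongside the categorical label also yields the confidence score recited in claim 7.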
[0074] Finally, the performance of the framework is evaluated using standard evaluation metrics including accuracy, AUC, and F1 score. To ensure robustness and generalizability, the system is validated across multiple datasets such as ADNI, AIBL, and OASIS, demonstrating its ability to perform consistent early Alzheimer’s diagnosis across diverse populations.
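For concreteness, two of the cited metrics can be computed from scratch. The label lists below are a made-up toy example, not results from any of the named datasets.

```python
import numpy as np

def accuracy(y_true, y_pred):
    return float(np.mean(np.array(y_true) == np.array(y_pred)))

def f1(y_true, y_pred, cls):
    """One-vs-rest F1 for a single class label."""
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    tp = np.sum((y_pred == cls) & (y_true == cls))
    fp = np.sum((y_pred == cls) & (y_true != cls))
    fn = np.sum((y_pred != cls) & (y_true == cls))
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

y_true = ["CN", "CN", "MCI", "AD", "AD", "MCI"]   # toy ground truth
y_pred = ["CN", "MCI", "MCI", "AD", "AD", "AD"]   # toy predictions
print(accuracy(y_true, y_pred))            # 4/6 ≈ 0.667
print(round(f1(y_true, y_pred, "AD"), 3))  # 0.8
```

AUC additionally requires the predicted probabilities rather than hard labels; averaging the per-class F1 values gives the macro-F1 commonly reported for three-way CN/MCI/AD classification.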
[0075] While the invention has been described in connection with what is presently considered to be the most practical and various embodiments, it will be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims.
[0076] A person of ordinary skill in the art may be aware that, in combination with the examples described in the embodiments disclosed in this specification, units and algorithm steps may be implemented by electronic hardware, computer software, or a combination thereof.
[0077] The foregoing descriptions of specific embodiments of the present disclosure have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the present disclosure to the precise forms disclosed, and many modifications and variations are possible in light of the above teaching. The embodiments were chosen and described to best explain the principles of the present disclosure and its practical application, and to thereby enable others skilled in the art to best utilize the present disclosure and various embodiments with various modifications as are suited to the particular use contemplated. It is understood that various omissions and substitutions of equivalents are contemplated as circumstances may suggest or render expedient, but such omissions and substitutions are intended to cover the application or implementation without departing from the scope of the present disclosure.
[0078] Disjunctive language such as the phrase “at least one of X, Y, Z,” unless specifically stated otherwise, is otherwise understood with the context as used in general to present that an item, term, etc., may be either X, Y, or Z, or any combination thereof (e.g., X, Y, and/or Z). Thus, such disjunctive language is not generally intended to, and should not, imply that certain embodiments require at least one of X, at least one of Y, or at least one of Z to each be present.
[0079] In a case that no conflict occurs, the embodiments in the present disclosure and the features in the embodiments may be mutually combined. The foregoing descriptions are merely specific implementations of the present disclosure, but are not intended to limit the protection scope of the present disclosure. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in the present disclosure shall fall within the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.
Claims:
I/We Claim:
1. A hybrid transformer-CNN system for early Alzheimer’s disease diagnosis using 3D brain MRI (100) comprising:
a 3D MRI data acquisition module (102) configured to obtain volumetric brain scans of a subject;
a spatial feature extraction module (104) configured to extract local structural features from the acquired 3D MRI data;
a global contextual modeling module (106) configured to process the spatial features extracted by the CNN;
a multimodal fusion module (108) configured to integrate additional neuroimaging modalities and clinical biomarkers;
an explainable artificial intelligence (XAI) module (110) configured to highlight critical brain regions and model features relevant to Alzheimer’s disease classification; and
a classification module (112) configured to generate an early-stage Alzheimer’s disease diagnosis based on the combined output of the Transformer, CNN, and multimodal fusion modules.
2. The system (100) as claimed in claim 1, wherein the 3D MRI data acquisition module (102) is configured to perform pre-processing of the volumetric brain scans including noise reduction, normalization, and alignment of the brain images to a standard anatomical template.
3. The system (100) as claimed in claim 1, wherein the spatial feature extraction module (104) comprises a convolutional neural network (CNN) having multiple layers including convolutional layers, pooling layers, and activation layers to extract hierarchical local structural features from the 3D MRI data.
4. The system (100) as claimed in claim 1, wherein the global contextual modeling module (106) comprises a transformer-based architecture configured to capture long-range dependencies between spatial features and model global contextual relationships in the brain scans.
5. The system (100) as claimed in claim 1, wherein the multimodal fusion module (108) is configured to integrate additional neuroimaging modalities including functional MRI (fMRI), diffusion tensor imaging (DTI), and positron emission tomography (PET), and clinical biomarkers such as cerebrospinal fluid (CSF) protein levels and cognitive test scores.
6. The system (100) as claimed in claim 1, wherein the explainable artificial intelligence (XAI) module (110) utilizes saliency maps, attention maps, or gradient-based feature attribution methods to highlight critical brain regions and provide interpretable insights for Alzheimer’s disease classification.
7. The system (100) as claimed in claim 1, wherein the classification module (112) employs a softmax layer or other probabilistic classifier to generate an early-stage Alzheimer’s disease diagnosis and a confidence score corresponding to the classification result.
8. The system (100) as claimed in claim 1, further comprising a training module configured to train the hybrid transformer-CNN system using labeled 3D MRI datasets with known Alzheimer’s disease progression stages.
9. The system (100) as claimed in claim 1, wherein the multimodal fusion module (108) employs attention-based weighting to prioritize features from different modalities and clinical biomarkers for improving diagnostic accuracy.
10. The system (100) as claimed in claim 1, wherein the 3D MRI data acquisition module (102) includes motion correction and artifact removal algorithms to improve the quality of volumetric brain scans before feature extraction.
| # | Name | Date |
|---|---|---|
| 1 | 202541096579-STATEMENT OF UNDERTAKING (FORM 3) [07-10-2025(online)].pdf | 2025-10-07 |
| 2 | 202541096579-REQUEST FOR EARLY PUBLICATION(FORM-9) [07-10-2025(online)].pdf | 2025-10-07 |
| 3 | 202541096579-POWER OF AUTHORITY [07-10-2025(online)].pdf | 2025-10-07 |
| 4 | 202541096579-FORM-9 [07-10-2025(online)].pdf | 2025-10-07 |
| 5 | 202541096579-FORM FOR SMALL ENTITY(FORM-28) [07-10-2025(online)].pdf | 2025-10-07 |
| 6 | 202541096579-FORM 1 [07-10-2025(online)].pdf | 2025-10-07 |
| 7 | 202541096579-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [07-10-2025(online)].pdf | 2025-10-07 |
| 8 | 202541096579-DRAWINGS [07-10-2025(online)].pdf | 2025-10-07 |
| 9 | 202541096579-DECLARATION OF INVENTORSHIP (FORM 5) [07-10-2025(online)].pdf | 2025-10-07 |
| 10 | 202541096579-COMPLETE SPECIFICATION [07-10-2025(online)].pdf | 2025-10-07 |