
Attention-Enhanced InceptionNeXt-Based Hybrid Deep Learning Model for Lung Cancer Detection

Abstract: Worldwide, lung cancer is the leading cause of cancer-related death. Early identification can greatly improve survival chances and halt the progression of this common and deadly disease. Computed tomography (CT) is the gold standard for imaging lung cancer, providing vital information for lung nodule evaluation. The proposed hybrid deep learning model combines Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs). By integrating and optimizing grid and block attention mechanisms with InceptionNeXt blocks, the model captures both fine-grained and large-scale features in CT images. This comprehensive methodology enables the model to distinguish between benign and malignant nodules, and even to recognize particular cancer subtypes, including squamous cell carcinoma, large cell carcinoma, and adenocarcinoma. Because InceptionNeXt blocks enable multi-scale feature processing, the model performs especially well on intricate and varied lung nodule patterns. Grid attention enhances the model's ability to recognize spatial relationships across different image regions, while block attention captures contextual and hierarchical information, enabling accurate identification and classification of lung nodules. Two public datasets, Chest CT and IQ-OTH/NCCD, were used to train and validate the model to guarantee its robustness and generalizability. Transfer learning and pre-processing techniques were used to increase detection accuracy. The proposed model outperformed state-of-the-art CNN-based and ViT-based techniques, achieving an accuracy of 98.55% on the IQ-OTH/NCCD dataset and 97.31% on the Chest CT dataset. With just 18.4 million parameters, the model offers a lightweight yet effective approach to early lung cancer identification, which may enhance clinical outcomes and raise patient survival rates.


Patent Information

Application #
Filing Date
14 July 2025
Publication Number
38/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE

Applicants

SR University
Ananthasagar, Hasanparthy (M), Warangal Urban, Telangana 506371, India.
Mrs. Kasi Sailaja
Research Scholar, School of CS & AI, SR University, Ananthasagar, Hasanparthy (M), Warangal Urban, Telangana 506371, India.
Dr. Anurodh Kumar
Assistant Professor, School of CS & AI, SR University, Ananthasagar, Hasanparthy (M), Warangal Urban, Telangana 506371, India.

Inventors

1. Mrs. Kasi Sailaja
Research Scholar, School of CS & AI, SR University, Ananthasagar, Hasanparthy (M), Warangal Urban, Telangana 506371, India.
2. Dr. Anurodh Kumar
Assistant Professor, School of CS & AI, SR University, Ananthasagar, Hasanparthy (M), Warangal Urban, Telangana 506371, India.

Specification

Description: A novel hybrid deep learning model for lung cancer detection, integrating an attention-enhanced mechanism with the InceptionNeXt architecture. The model combines convolutional and transformer-based attention modules to accurately extract both local and global features from medical imaging data such as CT scans. This invention improves diagnostic performance, sensitivity, and specificity in lung cancer detection by addressing challenges in feature extraction and lesion localization. The hybrid model leverages both the depth and scale-awareness of InceptionNeXt with adaptive attention for enhanced feature discrimination.
BACKGROUND OF THE INVENTION
The following description of related art is intended to provide background information pertaining to the field of the disclosure. This section may include certain aspects of the art that may be related to various features of the present disclosure. However, it should be appreciated that this section should be used only to enhance the understanding of the reader with respect to the present disclosure, and not as admissions of prior art.

Lung cancer detection from radiographic images (e.g., CT or PET scans) is challenging due to variability in tumor size, shape, and texture. Traditional CNN architectures face limitations in capturing long-range dependencies and global context, which are essential for accurate tumor identification. Transformer-based models offer global contextual understanding, but often lack the spatial efficiency of CNNs.

InceptionNeXt, an evolved convolutional architecture combining multi-scale feature extraction with modern design principles (e.g., normalization, residual connections), shows great promise for medical image analysis. Enhancing it with attention modules can further improve the interpretability and accuracy of lung cancer diagnosis.

Worldwide, lung cancer is the leading cause of cancer-related death. Early identification can greatly improve survival chances and halt the progression of this common and deadly disease. Computed tomography (CT) is the gold standard for imaging lung cancer, providing vital information for lung nodule evaluation. The proposed hybrid deep learning model combines Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs).

US20210225511A1: Method and System for Improving Cancer Detection Using Deep Learning (published July 22, 2021). This patent describes a system utilizing a 3D Inflated Inception V1 model trained on CT scan datasets such as NLST and LIDC. It incorporates lung segmentation and multiple binary classifiers to predict cancer presence, mortality risk, and nodule characteristics, leveraging longitudinal and multimodal imaging in a deep learning framework for enhanced cancer detection and diagnosis.

US12059284B2: Lung Cancer Prediction (issued August 13, 2024). This patent outlines a method for assessing lung cancer risk by analyzing temporal changes in lung nodules across multiple CT scans. It employs machine learning models, including CNN architectures such as Inception, ResNet, and VGG, trained on longitudinal data, considering factors such as nodule size, shape, and patient demographics, and determines risk based on the elapsed time between imaging sessions and the differences observed in the nodules.

US12032658B2: Method and System for Improving Cancer Detection Using Deep Learning. Similar to US20210225511A1, this patent focuses on enhancing specificity in lung cancer screening using deep learning models trained on annotated CT datasets. It emphasizes the use of 3D convolutional neural networks for pattern recognition in lung tissue.

US11861881B2: Critical Component Detection Using Deep Learning and Attention (Issued January 2, 2024), this patent focuses on the application of attention mechanisms within deep learning models to detect critical components in various contexts. While not specific to lung cancer, the methodologies described for integrating attention modules into neural networks can be adapted to enhance feature extraction and interpretability in medical imaging applications, including lung cancer detection.

UKGB2211487.0: Enhancing Cancer Prediction in Challenging Screen-Detected Incident Lung Nodules Using Time-Series Deep Learning, this application focuses on improving lung cancer prediction by employing time-series deep learning models. It addresses the challenges in detecting incident lung nodules through advanced AI techniques.

UKGB2113765.8: Methods and Systems for Cancer Prediction Using Deep Learning, this application pertains to methods and systems that utilize deep learning for cancer prediction, potentially encompassing attention mechanisms and hybrid models similar to InceptionNeXt.

WO2022249198A1 – Predicting Lung Cancer Risk, this international patent application, filed under the Patent Cooperation Treaty (PCT) with India as the receiving office, describes a method for predicting lung cancer risk by analyzing chest imaging data. The system utilizes machine learning techniques to assess nodule characteristics and other features indicative of cancer risk.

202131055901: Intelligent Healthcare System for Detection of Tumor Cells in Lung Cancer CT Images Using Image Processing, this patent describes an intelligent healthcare system that employs image processing techniques to detect tumor cells in lung cancer CT images. The system aims to enhance diagnostic accuracy and support clinical decision-making.

US11730387: This patent presents a method for detecting and diagnosing lung and pancreatic cancers from imaging scans. It utilizes a 3D densely connected Convolutional Neural Network (CNN) that processes CT scan volumes by dividing them into a grid of cells. Each cell is analyzed simultaneously to detect nodules, leveraging global contextual information from the entire 3D input volume. This approach allows for efficient and accurate detection without the need for multi-scale processing.

US12051509: This patent describes methods and machine learning systems for predicting the likelihood or risk of having cancer. The system integrates data from blood biomarkers, patient medical records, epidemiological factors, and imaging analyses (such as x-rays and CT scans). It utilizes machine learning to assess a patient's cancer risk relative to a matched cohort and provides assessments for multiple cancers, including lung cancer.
US11276173: This patent focuses on predicting lung cancer risk using deep learning techniques applied to chest X-rays. The system enables real-time, automatic risk prediction without human intervention. It determines multiple parameters from chest X-rays and monitors changes in nodule size, providing a cost-effective method for assessing lung cancer risk.

US12183462: Method for Predicting Lung Cancer Development Based on Artificial Intelligence Model, this patent describes a CNN-based system for predicting lung cancer development. The model comprises multiple convolutional and pooling layers, followed by fully connected layers that output the probability of developing lung cancer. It also discusses using dual neural networks for detecting regions of interest in medical images.

US11730387: Method for Detection and Diagnosis of Lung and Pancreatic Cancers from Imaging Scans, this patent outlines a method employing densely connected convolutional blocks for cancer detection. The network consists of five dense blocks, each with six convolution layers, facilitating efficient information flow and improved feature extraction for accurate diagnosis.

US12039724: Methods of Assessing Lung Disease in Chest X-Rays, this patent discusses systems using CNNs to assess lung diseases from chest X-rays. The CNNs aggregate local information to provide global predictions, utilizing multiple convolutional layers to learn and extract feature maps, aiding in accurate disease assessment.

US12032658: Method and System for Improving Cancer Detection Using Deep Learning, this patent presents a deep learning approach for cancer detection, incorporating a global model trained to predict cancer presence in lung tissue. It includes lung segmentation features and employs 3D data augmentation techniques to enhance model performance.

OBJECTIVE OF THE INVENTION

Some of the objects of the present disclosure, which at least one embodiment herein satisfies, are listed herein below.

To provide a deep learning-based diagnostic model that combines CNN and transformer architectures to enhance lung cancer detection.

To integrate the InceptionNeXt architecture as a multi-scale feature extractor capable of capturing diverse lung nodule patterns.

To embed channel and spatial attention modules such as CBAM and SE blocks for refining feature representation.

To incorporate transformer-based global attention layers that identify non-local dependencies in CT images.

To develop a classifier capable of distinguishing between normal, benign, and malignant lung conditions with high accuracy.

To enable the model's use within a clinical decision support system (CDSS) that assists radiologists in real-time decision-making.

To improve generalizability and performance across diverse datasets through data augmentation, transfer learning, and lightweight architecture.
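To illustrate the augmentation objective above, the following is a minimal NumPy sketch of simple slice-level augmentations (horizontal flip, 90-degree rotation, additive Gaussian noise). The specific operations and noise level are illustrative assumptions; the invention's actual augmentation pipeline is not specified here.

```python
import numpy as np

def augment(image: np.ndarray, rng: np.random.Generator):
    """Return simple augmented variants of a 2-D CT slice:
    horizontal flip, 90-degree rotation, and additive Gaussian noise."""
    flipped = np.fliplr(image)
    rotated = np.rot90(image)
    noisy = image + rng.normal(0.0, 0.01, size=image.shape)
    return flipped, rotated, noisy

rng = np.random.default_rng(0)
slice_ = rng.random((64, 64))          # stand-in for a normalized CT slice
flipped, rotated, noisy = augment(slice_, rng)
print(flipped.shape, rotated.shape, noisy.shape)
```

Each augmented variant preserves the label of the source slice, effectively multiplying the training set without new annotations.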

SUMMARY OF THE INVENTION
This section is provided to introduce certain objects and aspects of the present disclosure in a simplified form that are further described below in the detailed description. This summary is not intended to identify the key features or the scope of the claimed subject matter.

The present invention introduces a hybrid model for lung cancer detection, integrating:
1. InceptionNeXt backbone for multi-scale spatial feature extraction.
2. Attention enhancement modules such as Convolutional Block Attention Module (CBAM) or Squeeze-and-Excitation (SE) blocks for channel and spatial refinement.
3. Transformer-based global attention layers for capturing non-local dependencies.
4. Feature fusion layers to optimally combine hierarchical features from both attention and convolutional streams.
5. Classifier head trained on labeled lung cancer datasets to differentiate between malignant and benign cases or between various cancer stages.

This model is designed for use in CAD (Computer-Aided Diagnosis) systems and can be embedded in hospital-grade diagnostic equipment.
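A minimal sketch of how components 1–5 above could be wired together, assuming PyTorch. The small convolutional stem merely stands in for the InceptionNeXt backbone, and the layer sizes, SE reduction ratio, and single transformer layer are illustrative assumptions, not the claimed architecture.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-Excitation: channel attention via global pooling + gating."""
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(),
            nn.Linear(channels // reduction, channels), nn.Sigmoid())

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))          # squeeze to (B, C)
        return x * w[:, :, None, None]           # excite: reweight channels

class HybridClassifier(nn.Module):
    """Convolutional stem (stand-in for InceptionNeXt) -> SE refinement ->
    transformer encoder over spatial tokens -> 3-way classifier head."""
    def __init__(self, num_classes: int = 3, dim: int = 32):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(1, dim, 3, stride=2, padding=1), nn.GELU(),
            nn.Conv2d(dim, dim, 3, stride=2, padding=1), nn.GELU())
        self.se = SEBlock(dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=1)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        feats = self.se(self.stem(x))                # (B, C, H, W) local features
        tokens = feats.flatten(2).transpose(1, 2)    # (B, H*W, C) spatial tokens
        tokens = self.encoder(tokens)                # global (non-local) attention
        return self.head(tokens.mean(dim=1))         # pooled logits

model = HybridClassifier()
logits = model(torch.randn(2, 1, 64, 64))            # batch of 2 grayscale slices
print(logits.shape)
```

The fusion here is sequential (attention refines convolutional features before the transformer); concatenation-based fusion of parallel streams, as in claim 5, is an equally valid wiring.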

ADVANTAGES OF THE INVENTION
• Enhanced Feature Representation: Combines the strengths of CNNs (local features) and transformers (global context).
• Higher Detection Accuracy: Demonstrates superior performance on benchmark lung cancer datasets (e.g., LIDC-IDRI).
• Model Interpretability: Attention maps aid in visualizing key diagnostic regions.
• Scalable & Portable: Can be optimized for mobile health systems or cloud deployment.

BRIEF DESCRIPTION OF DRAWINGS
The accompanying drawings, which are incorporated herein and constitute a part of this invention, illustrate exemplary embodiments of the disclosed methods and systems, in which like reference numerals refer to the same parts throughout the different drawings. Components in the drawings are not necessarily to scale, emphasis instead being placed upon clearly illustrating the principles of the present invention. Some drawings may indicate the components using block diagrams and may not represent the internal circuitry of each component. It will be appreciated by those skilled in the art that such drawings include the electrical components, electronic components, or circuitry commonly used to implement such components.

FIG. 1 illustrates an exemplary computer-implemented method for detecting lung cancer in medical images, in accordance with an embodiment of the present disclosure.

DETAILED DESCRIPTION OF THE INVENTION

In the following description, for the purposes of explanation, various specific details are set forth in order to provide a thorough understanding of embodiments of the present disclosure. It will be apparent, however, that embodiments of the present disclosure may be practiced without these specific details. Several features described hereafter can each be used independently of one another or with any combination of other features. An individual feature may not address all of the problems discussed above or might address only some of the problems discussed above. Some of the problems discussed above might not be fully addressed by any of the features described herein.

The ensuing description provides exemplary embodiments only and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the disclosure as set forth.

Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail to avoid obscuring the embodiments.

Also, it is noted that individual embodiments may be described as a process that is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed but could have additional steps not included in a figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination can correspond to a return of the function to the calling function or the main function.

The word “exemplary” and/or “demonstrative” is used herein to mean serving as an example, instance, or illustration. For the avoidance of doubt, the subject matter disclosed herein is not limited by such examples. In addition, any aspect or design described herein as “exemplary” and/or “demonstrative” is not necessarily to be construed as preferred or advantageous over other aspects or designs, nor is it meant to preclude equivalent exemplary structures and techniques known to those of ordinary skill in the art. Furthermore, to the extent that the terms “includes,” “has,” “contains,” and other similar words are used in either the detailed description or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising” as an open transition word without precluding any additional or other elements.

Reference throughout this specification to “one embodiment” or “an embodiment” or “an instance” or “one instance” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. As used herein, the singular forms “a”, “an”, and “the” are intended to include the plural forms as well, unless the context indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.

FIG. 1: Proposed hybrid architecture for lung cancer classification:
Successful and efficient treatment of lung cancer depends on early and accurate identification. DL algorithms offer great promise in this area for the identification and categorization of malignant tumors. These algorithms are typically developed using very large datasets. Researchers test both small and large model modifications to determine which model best fits the datasets and problems they are attempting to address, and these models must be tuned to the particular dataset to attain optimal performance. Although architectures that work well on these large datasets may transfer to other datasets, each dataset frequently requires its own optimizations. In this work, the hybrid architecture is redesigned to diagnose lung cancer. Based on the MaxViT architecture, the model in Figure 1 offers a scalable and effective solution with excellent accuracy rates.

The detailed architecture and operation of the hybrid DL model suggested in this study are shown in Figure 1. By merging CNN and ViT-based techniques, the model exhibits a hybrid structure designed to perform delicate and complex tasks such as lung cancer diagnosis. This design is intended to exploit both the global information capture of transformer-based techniques and the local feature extraction strengths of convolutional networks.

InceptionNeXt block:
Faster training times and improved accuracy rates are noteworthy outcomes of this suggested architecture. These enhancements make DL models more effective and flexible, which is especially beneficial when dealing with large datasets such as those used for lung cancer. Combining InceptionNeXt blocks with the hybrid design may yield more precise classifications for lung cancer identification, since it captures both local and global characteristics. The InceptionNeXt block is shown in Figure 2.
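A hedged PyTorch sketch of an InceptionNeXt-style token mixer, following the published InceptionNeXt design of splitting channels across a small square depthwise convolution, two orthogonal band depthwise convolutions, and an identity branch; the branch ratio and band size used below are assumptions.

```python
import torch
import torch.nn as nn

class InceptionDWConv2d(nn.Module):
    """InceptionNeXt-style token mixer: channels are split into four groups
    processed by a 3x3 depthwise conv, two orthogonal band depthwise convs
    (1xk and kx1), and an identity branch, then concatenated back."""
    def __init__(self, dim: int, band: int = 11, ratio: float = 0.125):
        super().__init__()
        g = int(dim * ratio)                       # channels per conv branch
        self.g = g
        self.dw_square = nn.Conv2d(g, g, 3, padding=1, groups=g)
        self.dw_band_w = nn.Conv2d(g, g, (1, band), padding=(0, band // 2), groups=g)
        self.dw_band_h = nn.Conv2d(g, g, (band, 1), padding=(band // 2, 0), groups=g)

    def forward(self, x):
        g = self.g
        # Split along channels: square conv, wide band, tall band, identity.
        x_sq, x_w, x_h, x_id = torch.split(x, [g, g, g, x.size(1) - 3 * g], dim=1)
        return torch.cat([self.dw_square(x_sq), self.dw_band_w(x_w),
                          self.dw_band_h(x_h), x_id], dim=1)

mixer = InceptionDWConv2d(dim=64)
out = mixer(torch.randn(1, 64, 32, 32))
print(out.shape)                                   # spatial size is preserved
```

The large identity branch keeps most channels untouched, which is what makes the block cheap while the band convolutions still provide a wide receptive field.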

Sample images for each class from the IQ-OTH/NCCD dataset:
Each scan in the collection includes 80–200 image slices of the patient's chest from various perspectives. A selection of images from the IQ-OTH/NCCD dataset is displayed in Figure 3. The ethical committees of each medical facility approved the study, and all images were de-identified before processing. Most of the participants in the study were from the central region of Iraq, though the cases varied in age, gender, occupation, and place of residence. The IQ-OTH/NCCD dataset is a significant data source that is frequently utilized in lung cancer studies.

Example images for each class from the chest CT dataset:
CT images of lung tumors can be found in the Kaggle Chest CT dataset, which is organized into three folders: training, test, and validation. The CT scans fall into four categories: normal, squamous cell carcinoma, adenocarcinoma, and large cell carcinoma. The Chest CT dataset includes 1000 CT scans with lung cancer diagnoses, stored as PNG and JPG files. These images, annotated by experienced radiologists, form a publicly accessible dataset. The manner in which the chest CT images are divided into classes is shown in Table 2. Chest CT offers a sizable, annotated dataset for DL model training and testing, yielding reliable and repeatable outcomes for contemporary medical applications. Example images for each class in the Chest CT dataset are shown in Figure 4.

Results of CNN-based and ViT-based models on IQ-OTH/NCCD dataset:
This study compares the outcomes of evaluating the suggested model's lung cancer detection performance on two distinct datasets: Chest CT and IQ-OTH/NCCD. The model's performance is evaluated using metrics such as F1-score, accuracy, sensitivity, and precision, and it is demonstrated to outperform current approaches in the literature on these criteria. Consistent outcomes across two distinct datasets, broad validity, and limited reliance on any particular dataset characterize the model's effectiveness.

Evaluation of the proposed model for IQ-OTH/NCCD dataset against CNN and ViT:
Furthermore, every model compared in the study was trained using identical datasets and experimental setups. This method was applied with extra caution to guarantee the consistency and dependability of the performance evaluations. The primary justification for the suggested hybrid strategy is to combine the global contextual information capture of ViTs with the local feature extraction capability of CNNs, giving a more thorough and efficient analysis for lung cancer diagnosis. The model's lightweight construction made it possible to attain excellent accuracy rates in addition to computational economy. Specifically, the investigation was conducted in an equitable setting using the same experimental settings, which allows for an objective comparison of the results and demonstrates the superior performance of the model.

Confusion matrix for IQ-OTH/NCCD dataset:
The confusion matrix, which evaluates performance on the IQ-OTH/NCCD dataset by class, is shown in Figure 7. This matrix shows the number of accurate and inaccurate classifications for each class in detail, indicating the ability of the proposed model to correctly categorize the normal, benign, and malignant classes. All 24 samples in the benign class were accurately classified, giving 100% accuracy for that class. The model's exceptional success in identifying the critically important malignant tumors is demonstrated by the fact that all 112 samples in the malignant class were accurately recognized. In the normal class, 82 of 83 samples were correctly classified, with just one sample misclassified as benign. This only marginally affected the model's overall success.
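The per-class counts above can be checked numerically. This sketch rebuilds the described confusion matrix and derives overall accuracy and per-class sensitivity from it:

```python
import numpy as np

# Rows = true class, columns = predicted class: [benign, malignant, normal].
# Counts taken from the confusion matrix described above (Figure 7).
cm = np.array([[24,   0,  0],    # benign: all 24 correct
               [ 0, 112,  0],    # malignant: all 112 correct
               [ 1,   0, 82]])   # normal: 82 of 83 correct, 1 -> benign

accuracy = np.trace(cm) / cm.sum()           # correct predictions / all samples
recall = np.diag(cm) / cm.sum(axis=1)        # per-class sensitivity
precision = np.diag(cm) / cm.sum(axis=0)     # per-class precision
print(round(accuracy, 4))                    # 218 / 219 -> 0.9954
```

Note that the single normal-to-benign error slightly lowers benign precision (24/25) while leaving benign and malignant sensitivity at 100%.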

Grad-CAM heat maps for CT scan case:
Grad-CAM heatmaps are shown for three typical CT scan cases in Figure 9: benign, malignant, and normal. The original ground-truth images are shown in the upper row, and the matching Grad-CAM visualizations in the lower row. By showing the areas the model considered most important for its classification decisions, these heatmaps give a clearer picture of the model's behavior. For the benign case (first column), the Grad-CAM image shows localized areas in the lung corresponding to benign nodules or mild anomalies. This targeted activation is consistent with professional radiological assessment, indicating the model's capacity to detect subtle but clinically significant features of benign conditions.
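A minimal Grad-CAM sketch, assuming PyTorch. The tiny stand-in network, the hooked layer, and the input size are illustrative assumptions, but the weighting scheme (gradient-averaged channel weights, ReLU, normalization) is the standard Grad-CAM recipe:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Tiny stand-in classifier; any conv feature map can serve as the CAM target.
net = nn.Sequential(
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 3))

def grad_cam(model, x, target_class, layer_idx=1):
    """Grad-CAM: weight the chosen layer's activations by the spatial mean
    of the class-score gradient, sum over channels, and apply ReLU."""
    acts = {}
    def hook(_, __, out):
        out.retain_grad()          # keep gradients for this non-leaf tensor
        acts["a"] = out
    handle = model[layer_idx].register_forward_hook(hook)
    score = model(x)[0, target_class]
    score.backward()               # populates acts["a"].grad
    handle.remove()
    a, g = acts["a"], acts["a"].grad
    weights = g.mean(dim=(2, 3), keepdim=True)   # per-channel importance
    cam = F.relu((weights * a).sum(dim=1))       # (1, H, W) heatmap
    return cam / (cam.max() + 1e-8)              # normalize to [0, 1]

cam = grad_cam(net, torch.randn(1, 1, 32, 32), target_class=2)
print(cam.shape)
```

In practice the heatmap is upsampled to the input resolution and overlaid on the CT slice, which is how visualizations like Figure 9 are produced.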

Performance comparison of model variants with grid and block attention:
The evaluation comprised four configurations: the baseline model; the baseline model with Grid Attention; the baseline model with Block Attention; and the full proposed model, which combines the Grid and Block Attention mechanisms. Fig. 9 provides an overview of the findings, demonstrating the mechanisms' synergy and the incremental benefits they offer in the suggested architecture.
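The difference between the two mechanisms can be illustrated by how they partition tokens, in the style of MaxViT: block attention groups spatially adjacent windows, while grid attention groups a dilated grid of tokens drawn from across the whole image. A NumPy sketch with assumed sizes:

```python
import numpy as np

def block_partition(x, p):
    """Block (window) attention groups: non-overlapping p x p local windows."""
    H, W, C = x.shape
    return (x.reshape(H // p, p, W // p, p, C)
             .transpose(0, 2, 1, 3, 4).reshape(-1, p * p, C))

def grid_partition(x, p):
    """Grid attention groups: a dilated p x p grid, so each group mixes
    tokens sampled from across the whole image (global interaction)."""
    H, W, C = x.shape
    return (x.reshape(p, H // p, p, W // p, C)
             .transpose(1, 3, 0, 2, 4).reshape(-1, p * p, C))

x = np.arange(8 * 8).reshape(8, 8, 1)    # toy 8x8 token map, 1 channel
blocks = block_partition(x, 4)           # 4 groups of 16 adjacent tokens
grid = grid_partition(x, 4)              # 4 groups of 16 spread-out tokens
print(blocks.shape, grid.shape)
```

Self-attention is then applied within each group, so block groups model local context and grid groups model long-range (global) context at the same cost.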

Paired t-test results for statistical significance:
The paired t-test p-values for accuracy and F1-score on both datasets are summarized in Fig. 10. A p-value of less than 0.05 was considered statistically significant, indicating substantial performance differences between the baseline models and the proposed model. The paired t-test results in Fig. 10 give a thorough assessment of the statistical significance of the proposed model's performance gains over baseline models. These results highlight how the model is stable and reliable across the two datasets, IQ-OTH/NCCD and Chest CT, each of which has unique features and difficulties.
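A paired t-test of matched fold-level scores can be run with SciPy; the scores below are hypothetical placeholders, not the patent's reported numbers:

```python
from scipy import stats

# Hypothetical per-fold accuracy scores (the actual fold-level numbers
# behind Fig. 10 are not reproduced here).
proposed = [0.985, 0.982, 0.987, 0.984, 0.986]
baseline = [0.961, 0.958, 0.965, 0.960, 0.963]

# Paired t-test: compares matched score pairs from the same folds/splits,
# which removes fold-to-fold variance from the comparison.
t_stat, p_value = stats.ttest_rel(proposed, baseline)
print(p_value < 0.05)   # significant at the conventional threshold
```

The pairing is essential: an unpaired test on the same numbers would conflate model differences with fold difficulty.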

While considerable emphasis has been placed herein on the preferred embodiments, it will be appreciated that many embodiments can be made and that many changes can be made in the preferred embodiments without departing from the principles of the invention. These and other changes in the preferred embodiments of the invention will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter to be implemented merely as illustrative of the invention and not as limitation.
Claims:
1. A deep learning-based method for detecting lung cancer in medical images using a hybrid model combining InceptionNeXt and attention modules.
2. The method of claim 1, wherein InceptionNeXt acts as the primary convolutional backbone for multi-scale feature extraction.
3. The method of claim 1, further comprising an attention mechanism including channel and spatial attention via CBAM or SE blocks.
4. The method of claim 1, further comprising a transformer-based global attention module for capturing long-range dependencies.
5. The method of claim 1, wherein outputs from InceptionNeXt and attention modules are fused via concatenation and passed to a classification head.
6. The method of claim 1, wherein the classification head distinguishes between benign and malignant cases of lung cancer.
7. The method of claim 1, wherein data augmentation techniques improve generalization across diverse imaging datasets.
8. A system comprising the model of any of claims 1–7 embedded in a clinical decision support system for radiologists.

Documents

Application Documents

# Name Date
1 202541067117-STATEMENT OF UNDERTAKING (FORM 3) [14-07-2025(online)].pdf 2025-07-14
2 202541067117-REQUEST FOR EARLY PUBLICATION(FORM-9) [14-07-2025(online)].pdf 2025-07-14
3 202541067117-FORM-9 [14-07-2025(online)].pdf 2025-07-14
4 202541067117-FORM 1 [14-07-2025(online)].pdf 2025-07-14
5 202541067117-DRAWINGS [14-07-2025(online)].pdf 2025-07-14
6 202541067117-DECLARATION OF INVENTORSHIP (FORM 5) [14-07-2025(online)].pdf 2025-07-14
7 202541067117-COMPLETE SPECIFICATION [14-07-2025(online)].pdf 2025-07-14
8 202541067117-FORM-26 [16-09-2025(online)].pdf 2025-09-16