
Pneumo Vi T: Pneumonia Detection System With Vision Transformer Integration

Abstract: Pneumonia is a global health crisis that disproportionately affects vulnerable populations and remains a leading cause of mortality worldwide. Chest X-rays (CXRs) are routinely interpreted by expert radiologists, yet this dependence can delay needed treatment. This work presents a Vision Transformer (ViT) approach for CXR pneumonia detection that uses self-attention to extract features for accurate classification. The proposed system employs a Vision Transformer architecture trained on a Kaggle pediatric pneumonia image dataset consisting of 1,341 normal and 3,875 pneumonia images. Training used the Adam optimizer with learning rate scheduling and extensive data augmentation. In evaluation, the proposed model achieved 90% accuracy, surpassing baseline deep learning approaches. The performance of ViTs in identifying pneumonia signals offers strong potential for reliable clinical pneumonia detection.


Patent Information

Application #
Filing Date
17 July 2025
Publication Number
30/2025
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Parent Application

Applicants

Vinay S
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
Swimpy Pahuja
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
Suhas Reddy R P
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
Yashas M C
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
Kavyashree B
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
Arati Chabukswar
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
REVA University
Rukmini Knowledge Park, Kattigenahalli, Bangalore, Karnataka, India, 560064

Inventors

1. Vinay S
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
2. Swimpy Pahuja
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
3. Suhas Reddy R P
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
4. Yashas M C
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
5. Kavyashree B
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064
6. Arati Chabukswar
School of Computing and Information Technology Engineering, REVA University, Bangalore, Karnataka, India, 560064

Specification

Description: The PneumoViT system applies a Vision Transformer (ViT) based architecture to identify pneumonia in chest X-ray images. PneumoViT follows a workflow composed of four main stages: Model Architecture, Data Processing, Training Pipeline, and Evaluation & Inference. Through these successive stages, PneumoViT performs pneumonia diagnosis accurately and efficiently.
5.1. Model Architecture
Google's ViT-Base-Patch16-224 model serves as the backbone of the system. Unlike conventional convolutional neural networks (CNNs), the self-attention layers of a Vision Transformer model relationships across the entire image without locality restrictions. The system exploits this self-attention to capture fine chest X-ray details for diagnosis.
During processing, the ViT model extracts the classification (CLS) token, which summarizes the information in the whole image.
A linear classifier maps this 768-dimensional representation to the Pneumonia and Normal classes.
The architecture is designed to improve generalization and provide robust pneumonia detection across varied datasets.
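The backbone-plus-head design above can be sketched in PyTorch. The snippet below is a minimal illustration, not the patented implementation: the ViT backbone is abstracted away as the 768-dimensional CLS embedding it would emit, and `PneumoViTHead` is a hypothetical name for the linear classification head.

```python
import torch
import torch.nn as nn

class PneumoViTHead(nn.Module):
    """Linear classification head over the ViT CLS token (sketch).

    The described system uses Google's ViT-Base-Patch16-224 backbone;
    here the backbone is represented only by the 768-dim CLS embedding.
    """
    def __init__(self, embed_dim: int = 768, num_classes: int = 2):
        super().__init__()
        self.classifier = nn.Linear(embed_dim, num_classes)

    def forward(self, cls_token: torch.Tensor) -> torch.Tensor:
        # cls_token: (batch, 768) -> logits over {Normal, Pneumonia}
        return self.classifier(cls_token)

# Hypothetical usage: a batch of 4 CLS embeddings from the backbone.
cls = torch.randn(4, 768)
logits = PneumoViTHead()(cls)
print(logits.shape)  # torch.Size([4, 2])
```

In a full implementation, the CLS embedding would come from a pretrained ViT (e.g. via a library such as Hugging Face `transformers`), with this head fine-tuned on the chest X-ray dataset.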
5.2. Data Processing
Model training requires preprocessed data derived from the initial dataset. This stage includes:
The chest X-ray dataset is obtained from trusted sources such as the NIH or Kaggle repositories.
Images are improved through resizing, normalization, and augmentation, which raises input quality and model performance.
The preprocessed data is split into three subsets:
Training Set: Used to learn the model parameters.
Validation Set: Helps fine-tune hyperparameters.
Test Set: Enables evaluation of final model performance.
The images are wrapped in DataLoaders for efficient batch processing during training and testing.
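The split-and-load step above can be sketched with PyTorch utilities. The tensors below are hypothetical stand-ins for preprocessed X-rays, and the 70/15/15 ratio is illustrative; the actual pipeline would load the Kaggle images after resizing, normalization, and augmentation.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

# Hypothetical stand-in: 100 preprocessed 224x224 three-channel CXR tensors.
# In the described pipeline these would come from the Kaggle chest X-ray
# dataset after resizing, normalization, and augmentation.
images = torch.randn(100, 3, 224, 224)
labels = torch.randint(0, 2, (100,))          # 0 = Normal, 1 = Pneumonia
dataset = TensorDataset(images, labels)

# Split into train / validation / test subsets (ratios are illustrative).
train_set, val_set, test_set = random_split(dataset, [70, 15, 15])

# Wrap each subset in a DataLoader for efficient batched iteration.
train_loader = DataLoader(train_set, batch_size=16, shuffle=True)
val_loader = DataLoader(val_set, batch_size=16)
test_loader = DataLoader(test_set, batch_size=16)

xb, yb = next(iter(train_loader))
print(xb.shape)  # torch.Size([16, 3, 224, 224])
```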
5.3. Training Pipeline
The training infrastructure is central to the model's efficiency and accuracy. The procedure has the following steps:
The Vision Transformer model is trained on the dataset to learn the patterns present in chest X-ray images.
Cross Entropy Loss measures the difference between predicted and target classifications.
The AdamW optimizer improves weight updates and convergence speed.
The ReduceLROnPlateau scheduler adapts the learning rate based on model performance, primarily to prevent overfitting.
The checkpoint with the highest validation accuracy is selected as the deployment-ready model.
These optimization strategies speed convergence and improve pneumonia detection accuracy.
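The loop combining these pieces can be sketched as follows. This is a toy illustration only: a tiny linear model and a random batch stand in for the ViT and the real DataLoaders, and the learning rate and patience values are assumptions, not the patented configuration.

```python
import torch
import torch.nn as nn
from torch.optim import AdamW
from torch.optim.lr_scheduler import ReduceLROnPlateau

# Stub model standing in for the ViT backbone + linear head.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 2))
criterion = nn.CrossEntropyLoss()
optimizer = AdamW(model.parameters(), lr=3e-4)
# Reduce the LR when validation accuracy stops improving ("max" mode).
scheduler = ReduceLROnPlateau(optimizer, mode="max", factor=0.5, patience=1)

best_acc, best_state = -1.0, None
xs, ys = torch.randn(32, 3, 8, 8), torch.randint(0, 2, (32,))  # toy batch

for epoch in range(3):
    model.train()
    optimizer.zero_grad()
    loss = criterion(model(xs), ys)   # Cross Entropy against targets
    loss.backward()
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_acc = (model(xs).argmax(1) == ys).float().mean().item()
    scheduler.step(val_acc)           # adapt LR to validation accuracy
    if val_acc > best_acc:            # keep the best-performing checkpoint
        best_acc, best_state = val_acc, model.state_dict()
```

The `best_state` checkpoint corresponds to the validation-accuracy-maximizing model described above.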
5.4. Evaluation & Inference
After training, the model is evaluated with several performance metrics to ensure it is suitable for real-world use.
Single Image Inference:
A new chest X-ray is preprocessed and passed to the trained model.
The model predicts either the Pneumonia or Normal class and returns a probability value.
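The single-image step can be sketched as below. The `predict` helper and the stub classifier are hypothetical; in practice the trained PneumoViT model would replace the stub, and the input tensor would be a genuinely preprocessed chest X-ray.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

CLASSES = ["Normal", "Pneumonia"]

# Stub classifier standing in for the trained PneumoViT model.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 2))
model.eval()

def predict(image: torch.Tensor):
    """Classify one preprocessed CXR tensor of shape (3, 224, 224)."""
    with torch.no_grad():
        logits = model(image.unsqueeze(0))   # add batch dimension
        probs = F.softmax(logits, dim=1)[0]  # class probabilities
    idx = int(probs.argmax())
    return CLASSES[idx], float(probs[idx])   # label + probability value

label, confidence = predict(torch.randn(3, 224, 224))
print(label, round(confidence, 3))
```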
Model Evaluation:
A classification report provides precision, recall, and F1-score metrics to assess model performance. The confusion matrix analyzes errors beyond aggregate accuracy by reporting false positives, false negatives, true positives, and true negatives.

Claims: We claim,
1. A deep learning system that uses a Vision Transformer (ViT) to accurately identify pneumonia in chest X-rays.
2. The model uses a ViT-Base-Patch16-224 backbone with CLS token extraction and a linear classifier to achieve superior image classification results.
3. A customized data splitting method that creates training, validation, and testing datasets, improving the reliability and robustness of the model.

Documents

Application Documents

# Name Date
1 202541068502-STATEMENT OF UNDERTAKING (FORM 3) [17-07-2025(online)].pdf 2025-07-17
2 202541068502-REQUEST FOR EARLY PUBLICATION(FORM-9) [17-07-2025(online)].pdf 2025-07-17
3 202541068502-FORM-9 [17-07-2025(online)].pdf 2025-07-17
4 202541068502-FORM FOR SMALL ENTITY(FORM-28) [17-07-2025(online)].pdf 2025-07-17
5 202541068502-FORM FOR SMALL ENTITY [17-07-2025(online)].pdf 2025-07-17
6 202541068502-FORM 1 [17-07-2025(online)].pdf 2025-07-17
7 202541068502-FIGURE OF ABSTRACT [17-07-2025(online)].pdf 2025-07-17
8 202541068502-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [17-07-2025(online)].pdf 2025-07-17
9 202541068502-EVIDENCE FOR REGISTRATION UNDER SSI [17-07-2025(online)].pdf 2025-07-17
10 202541068502-DRAWINGS [17-07-2025(online)].pdf 2025-07-17
11 202541068502-DECLARATION OF INVENTORSHIP (FORM 5) [17-07-2025(online)].pdf 2025-07-17
12 202541068502-COMPLETE SPECIFICATION [17-07-2025(online)].pdf 2025-07-17