Abstract: The present invention discloses an innovative decision-support model for the classification of gastric histopathology images for the diagnosis of H. pylori infection. The proposed 6-layer CNN architecture serves as the feature extractor for histopathological image classification. Integration of the scalable XGBoost algorithm with the feature extractor results in an advanced model, known as BoostedNet, which demonstrates improved classification performance. The model achieved a remarkable accuracy of 99% for H(+ve) and H(-ve) histopathology image classification. The potential of the proposed model extends to a broader scope: capitalizing on its diagnostic interpretability and enhanced classification accuracy, the model can be further developed into a reliable Computer-Aided Diagnosis (CAD) system for the automated detection of H. pylori bacteria in histopathology images. Future research can explore the applicability of different CNN models and diverse datasets containing images with varying stains to investigate the relationship between deep learning-based feature extraction and clinical diversity.
DESC:FIELD OF THE INVENTION
The present invention relates to a novel decision support model for the identification of Helicobacter pylori bacteria within gastric histopathological images. More specifically, it discloses a decision-support model which leverages a 6-layer CNN model integrated with an Extreme Gradient Boost classifier for accurate inference of discriminative features from gastric histopathology images to classify the images.
BACKGROUND OF THE INVENTION
Helicobacter pylori (H. pylori), a bacterium residing in the human stomach, is widely recognized as a leading cause of various gastric diseases, including gastric ulcer, duodenal ulcer, gastritis and gastric malignancy.
H. pylori is a gram-negative bacterium characterized by its distinctive helical or spiral shape, measuring approximately 2.5-4.0 µm in length and 0.5-1.0 µm in width. It is an exceptionally resilient bacterium that withstands the acidic environment of the stomach and is commonly found in the vicinity of the pyloric region of the stomach.
Detection of H. pylori infection is important for appropriate antibiotic treatment of H. pylori-associated gastritis. Accurate detection of the H. pylori organism is therefore imperative for prompt treatment and for preventing complications of the bacterial infection, i.e., malignancy (carcinoma, lymphoma).
H. pylori infections are usually asymptomatic but are clinically significant when patients develop stomach inflammation (gastritis) or a peptic ulcer due to H. pylori. Several diagnostic methods have been developed over the years with the aim of detecting this organism accurately. These tests include noninvasive methods such as serology, the urea breath test, and the stool antigen test, as well as invasive methods such as testing gastric biopsy samples obtained through upper gastrointestinal endoscopy by culture, histological assessment, and the rapid urease test. Gastric mucosal tissues obtained from upper gastrointestinal endoscopy undergo fixation, processing, and staining, which enable medical professionals to examine the tissue under high-magnification microscopy.
Definitive identification of the microorganism in the gastric mucosa is heavily dependent on careful histological assessment of the gastric mucosal biopsy. Microscopic examination of gastric biopsy tissue slides is considered the gold-standard method for confirming the presence of the H. pylori bacterium. Various staining methods, such as Hematoxylin-Eosin (H&E), modified Giemsa stain, and silver stain, are employed to identify H. pylori. Histological detection of H. pylori by light microscopy entails microscopic scrutiny of an overwhelming inflow of tissue slides by pathologists, a process that may be prone to misdiagnosis and errors.
Histological examination has its own limitations, which include staining artifacts, higher cost, longer turnaround time, dependence on the skills of the operator, and interobserver variability in assessment. Moreover, the density of H. pylori can vary at different sites and may lead to sampling error. When there is dense colonization of H. pylori, it is not difficult to identify the organism in the mucosa; however, expertise and experience are required when organisms are few in number. False-positive outcomes obligate patients to extraneous expenses and toxic medication, while false-negative results may deprive patients of access to potentially beneficial therapies. Failure to identify H. pylori in a tissue biopsy results in undertreated gastritis, persistence of symptoms, and progression to gastric cancer. Among the various staining methods employed to identify the organism, H&E staining is frequently used for routine diagnosis of histopathological images, in which the H. pylori bacterium appears pink or light red in color.
Despite the availability of diverse imaging modalities to assist in the diagnosis of gastric diseases, definitive identification of the microorganism in the gastric mucosa is contingent on precise biopsy results.
In India, the unabating incidence of cancer has evolved into an onerous workload for pathology laboratories. Accurate diagnostic tissue analysis is constrained by the expertise and experience of pathologists, laboratory specific protocols, test instruments utilized, staining techniques, and the methods employed for the examination of histopathological images. Therefore, factors involving pathologists are the predominant contributors to diagnostic variability.
In challenging borderline cases, subjective interpretation results in variance among pathologists. For example, pathologists' misreading of the H. pylori bacterium as mucin strands, debris, or commensal organisms leads to false negatives in the final diagnostic reports, and such misinterpretation of H. pylori results in under-treated gastric conditions that may progress to gastric cancer. These factors underline the need for automated systems capable of meticulously identifying H. pylori bacteria, which are highly recommended for the timely treatment of gastric cancer. Automated systems are anticipated to help pathologists by pinpointing the regions pervaded by H. pylori bacteria in histopathological images, enabling them to concentrate on delivering precise diagnoses and timely results, thus adding efficiency to their practice. Increased adoption of Artificial Intelligence (AI) in clinical practice overcomes the limitations of diagnostic techniques, especially with respect to image recognition and classification. AI systems have the capability to accurately analyze biopsy slides and recognize disease patterns and irregularities that might not be readily apparent to the human eye, thus delivering promising results by uncovering latent indicators that may be overlooked by medical professionals.
Recent advances in Artificial Intelligence (AI) and image processing have provided versatile tools for the characterization of gastric histopathological images, which have proved beneficial in identifying neoplastic or non-neoplastic lesions of the gastric mucosa, detecting gastric cancer at an early stage, and detecting H. pylori in real time.
This has encouraged the employment of an automated system capable of meticulously identifying H. pylori bacteria which will enable pathologists, irrespective of level of experience, to consistently deliver precise diagnoses and timely results, adding efficiency to their practice. Interestingly, such prospective techniques can be used in various settings of practice e.g., small diagnostic labs or high-volume tertiary centers.
Utilization of deep learning (DL) methods in combination with Convolutional Neural Networks (CNNs) has demonstrated exceptional effectiveness in the identification of H. pylori bacteria in gastric histopathological images. Recent advancements in CNN applications and the implementation of transfer learning techniques furnish a promising solution even when dealing with small histopathological image datasets. A computerized framework for histopathological image analysis is envisioned as a diagnostic tool that alleviates pathologists' burden, enhancing their ability to accurately carry out annotative diagnosis of abnormalities in gastric histopathology.
Transfer learning involves deploying a network trained in one domain to serve an application in a different domain. CNN models trained on the extensive ImageNet dataset are intuitively adaptable for the classification of histopathological images, and transfer learning with Convolutional Neural Networks (CNNs) has demonstrated positive prospects in analyzing histological images. However, the interpretability of the decision-making process in transfer learning is challenging, as features learned from the source task may not directly translate to the target task. Hence, opting for an indigenous and simple model becomes a promising solution for detecting the H. pylori bacterium.
Although transfer learning is propitious in leveraging pre-trained models for image classification tasks, it has certain deficiencies when compared with training a custom CNN model of moderate intricacy. A key drawback lies in the mismatch of learned features, as pre-trained models may prioritize features from the source task (e.g., ImageNet classification) rather than task-specific features of H. pylori images. Cross-domain issues may arise if the source and target domains differ markedly, restraining the applicability of learned features. Besides, pre-trained models may lack adaptability to the unique characteristics of the target task, and the larger sizes of pre-trained models are burdensome in scenarios with limited computational resources. Comprehension of the decision-making process is also compromised, as features learned on the source task may not be directly pertinent to the target task.
Conversely, a custom CNN model with minimal complexity offers versatile features, task-specific adaptability, preclusion of domain shift issues, modest model size, and an uncompromised understanding of the learned features. The foregoing pinpoints the call for an indigenous, less complex model, that can achieve superior classification accuracy in the detection of the H. pylori bacterium.
Diverse neural network architectures and traditional ML methods have been employed in the recent past to identify the presence of H. pylori bacteria in gastric histopathological images. In one such study, Sebastian Klein et al., 2020 employed a deep learning architecture to detect H. pylori in H&E- and Giemsa-stained histopathological images, applying a localization process that included downscaling the slide, Otsu thresholding, and morphological operations to create a mask of white regions. The method demonstrated exceptional sensitivity and specificity, particularly on Giemsa-stained images, and exhibited promising results in reducing the false-positive rate compared to technologies like Polymerase Chain Reaction (PCR). These findings underscore the efficacy of deep learning algorithms in detecting H. pylori across various staining techniques.
In another such study, Sharon Zhou et al., 2020 developed a deep learning assistant for the detection of H. pylori in gastric biopsies via image patches extracted from high-resolution WSI. Zhou’s team introduced an ensemble model, adopting ResNet and DenseNet for the detection of H. pylori bacterium. The deep learning (DL) assistant significantly improved the accuracy and speed of diagnosis for H. pylori-positive cases. However, the diagnostic uncertainty for H. pylori-negative cases also increased, leading to an overall decrease in detection accuracy, thus highlighting the potential of DL in assisting pathologists, warranting the need for optimization to mitigate the fallout on diagnostic accuracy.
A work by Yi Juin Lin et al., 2023 proposed a two-tiered, DL-based model for the histologic diagnosis of Helicobacter gastritis, employing a weakly supervised training approach to train a CNN classifier at the slide level for the differentiation of H(+ve) and H(-ve) gastric biopsy samples. They achieved 93% classification accuracy in the identification of H. pylori-positive and -negative images.
In a similar work, Yongquan Yang et al., 2020 introduced a novel weakly supervised multi-task learning framework (WSMLF) to enhance segmentation performance by leveraging a weak supervision scheme based on polygon annotations, thereby avoiding the creep-in of mislabeled pixel-level H. pylori morphologies in WSIs. They achieved an F1 score of 83.20% for segmentation of the H. pylori regions of the WSIs.
Nick Wong et al., 2023 presented a U-Net with a ResNet34 backbone employing the Lovasz-Softmax loss function for the classification of gastric histopathological images, utilizing image processing techniques to extract patches from whole-slide gastric biopsy images, which were then input to the model. Their model exhibited remarkable performance in delineating H. pylori-infected regions, achieving an Intersection over Union (IoU) score of 0.7805.
There have been few research reports on compact CNN models tailored for H. pylori diagnosis in gastric histopathological images, as opposed to pre-trained networks. This analysis affirms that there is room for improvement in the detection of H. pylori from gastric histopathological images, and that there is a need for an indigenous, less complex model, especially for the detection of the H. pylori bacterium, that can achieve superior classification accuracy.
OBJECT OF THE INVENTION
In order to obviate the drawbacks of the existing state of the art, the present invention discloses a novel, unique, indigenous decision-support model for the identification of H. pylori within gastric histopathological images.
The main object of the present invention is to provide a decision-support model for the identification of H. pylori within gastric histopathological images, which leverages a CNN model integrated with the scalable Extreme Gradient Boosting (XGBoost) algorithm, resulting in the creation of an advanced model known as BoostedNet with improved classification performance.
Another object of the invention is to provide a decision-support model for the identification of H. pylori within gastric histopathological images, having enhanced prediction capabilities and the ability to capture unique patterns.
Yet another object of the invention is to provide a decision-support model for the identification of H. pylori within gastric histopathological images having diagnostic interpretability through heatmap visualization, allowing the model to elucidate the rationale behind its decisions.
Yet another object of the invention is to provide a decision-support model with superior classification performance to classify gastric histopathological images as H. pylori-positive or negative.
Yet another object of the invention is to provide a decision-support model for the identification of H. pylori within gastric histopathological images, wherein the base CNN model is a 6-layer model generating Gradient Class Activation Mapping (Grad-CAM) visualization having reduced depth and complexity along with diagnostic interpretability.
Yet another object of the invention is to provide a decision-support model as a support for a computer-aided system for accurate classification of gastric histopathology H(+ve) and H(-ve) images utilizing conventional datasets with differently stained images.
SUMMARY OF THE INVENTION
Accordingly, the present invention discloses a novel decision-support model for the identification of H. pylori bacteria within gastric histopathological images.
The decision-support model, called BoostedNet, comprises two primary components: a Convolutional Neural Network (CNN) feature extractor designed for extracting discriminative features from H. pylori-positive and H. pylori-negative datasets, and an XGBoost classifier. BoostedNet is built on a CNN framework unified with the XGBoost classifier, offering a creative approach to clinical decision-making in the detection of H. pylori bacteria. This unification substantiates the remarkable efficacy of the BoostedNet configuration over the standalone baseline CNN model and sustains an unambiguous classification of the images and detection of the H. pylori bacterium. The conceived lightweight CNN model is provisioned with shallow depth and fewer parameters and is tailored for feature extraction from gastric histopathology images, providing a customized solution for these datasets and ensuring effective feature extraction and image classification.
The CNN feature extractor is composed of three convolutional blocks, each containing two convolutional layers with filters for extracting feature maps. To reduce the spatial dimensions of the extracted feature maps, the CNN model incorporates three max-pooling layers; each max-pooling layer follows the final convolution layer in its convolutional block and halves the spatial dimensions. The final feature maps obtained from the third max-pooling layer are fed into the XGBoost classifier, which conducts a binary classification task to determine whether the histopathology image is H(+ve) or H(-ve) on the basis of the presence or absence of the H. pylori bacterium in the images. The CNN feature extractor is optimized by evaluating the impact of varying the numbers of its convolutional and max-pooling layers.
A comprehensive analysis was conducted to establish the reliability and robustness of the model by comparing the performance of the baseline 6-layer CNN model with that of the BoostedNet model, which helped fine-tune the training process, leading to improved performance and prevention of overfitting. Datasets comprising the H&E-stained DeepHP images and Giemsa-stained gastric histopathological images were used to assess the reliability, robustness, and efficacy of the BoostedNet decision-support model. The DeepHP dataset is composed of histopathological images of gastric mucosa stained with hematoxylin and eosin (H&E) for the detection of the H. pylori bacterium. A Giemsa-stained dataset sourced from Kaggle, comprising image patches derived from gastric histopathological images, was also utilized. All images in the DeepHP and Giemsa datasets were resized in alignment with the baseline CNN model's specifications. The dataset was partitioned into distinct training and validation sets, employing a train-test split ratio of 70:30. The training dataset underwent in-place data augmentation to improve the model's generalization and mitigate overfitting. Performance of the BoostedNet model was appraised with reference to standard metrics, which reaffirmed the model's excellence in classification, with near-error-free accuracy and precision and minimal instances of false positives and false negatives for both the H&E-stained DeepHP and the Giemsa-stained histopathological images. The results clearly demonstrate that the BoostedNet model achieves superior classification performance on histopathological images, substantiating its effectiveness and reliability as a versatile tool for the detection of H. pylori bacteria in gastric histopathological images. Thus, the BoostedNet model is a generic, promising, and reliable model with exceptional performance in the accurate classification of both H&E- and Giemsa-stained gastric histopathological images.
Further, the feature maps extracted from the final convolutional layer of the 6-layer CNN model were subjected to Grad-CAM visualization. The mapping visualization thus generated furnishes diagnostic interpretability of the model, inherently bolstering the pathologist's confidence in automated AI systems. The model exhibits a remarkable performance of 99% for H(+ve) and H(-ve) histopathology image classification. The BoostedNet model also exhibited superior performance over other popular CNN models, namely VGG16, InceptionV3, and ResNet50. The model blends the potential of a low-footprint CNN model and the XGBoost classifier for the detection of the H. pylori bacterium.
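The Grad-CAM visualization referred to above can be sketched as follows. This is a minimal illustration assuming a Keras/TensorFlow implementation; the demo model, layer name, and input size are placeholders and are not the disclosed 6-layer CNN.

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image, conv_layer_name):
    """Grad-CAM heatmap: channel weights are the spatially averaged gradients
    of the predicted score with respect to the chosen conv layer's maps."""
    grad_model = tf.keras.Model(
        model.inputs, [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_maps, preds = grad_model(image[None, ...])
        score = preds[:, 0]
    grads = tape.gradient(score, conv_maps)
    weights = tf.reduce_mean(grads, axis=(1, 2))                  # global-average pooling
    cam = tf.nn.relu(tf.reduce_sum(conv_maps * weights[:, None, None, :], axis=-1))
    cam = cam / (tf.reduce_max(cam) + 1e-8)                       # normalize to [0, 1]
    return cam[0].numpy()

# Tiny placeholder model (not the disclosed architecture), just to exercise the sketch.
inp = tf.keras.Input(shape=(64, 64, 3))
x = tf.keras.layers.Conv2D(8, 3, activation="relu", name="last_conv")(inp)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
out = tf.keras.layers.Dense(1, activation="sigmoid")(x)
demo = tf.keras.Model(inp, out)

heatmap = grad_cam(demo, np.random.rand(64, 64, 3).astype("float32"), "last_conv")
print(heatmap.shape)  # (62, 62)
```

The normalized heatmap would then be upsampled to the input resolution and superimposed on the original image, as in the superimposed visualizations described later.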
BRIEF DESCRIPTION OF DRAWINGS:
Fig. 1: depicts the flow diagram of the proposed BoostedNet model
Fig. 2: depicts the pictorial representation of 6-layer CNN model
Fig. 3a: depicts the DEEP HP samples
Fig. 3b: depicts the GIEMSA samples
Fig. 4: depicts the (a) Validation loss and (b) accuracy curve of the 6-layer CNN model
Fig. 5: depicts the Confusion Matrix of the CNN layer
Fig. 6: depicts the ROC curve analysis: (a) the 6-layer CNN model achieved an AUC of 0.9892; (b) the BoostedNet model achieved an AUC of 0.9902
Fig. 7: depicts the Precision-Recall curve of (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 8: depicts the Precision-Confidence curve of (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 9: depicts the GradCAM visualization of the 6-layer CNN model: (a) correctly classified image samples; (b) incorrectly classified image samples
Fig. 10: depicts the Confusion Matrices of (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 11: depicts the ROC curve analysis of (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 12: depicts the Precision-Recall curve of (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 13: depicts the Precision-Confidence curve of (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 14: depicts the GradCAM visualization of the 6-layer CNN model for (a) correctly classified images and (b) incorrectly classified images
Fig. 15: depicts the Validation loss and accuracy curve of the 6-layer CNN model
Fig. 16: depicts the Confusion Matrices for the testing data (both H&E- and Giemsa-stained images) classified using (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 17: depicts the Precision-Recall curve of (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 18: depicts the Precision-Confidence curve of (a) the 6-layer CNN model and (b) the BoostedNet model
Fig. 19: depicts the GradCAM visualization of the 6-layer CNN model: (a) correctly classified image samples, showing the original images, the heatmap visualizations, and the superimposed images; (b) incorrectly classified image samples
Fig. 20: depicts the comparison of the BoostedNet model on H&E- and Giemsa-stained image classification
Fig. 21: depicts superimposed images with GradCAM visualization of different models: (a) image samples from the DeepHP and Giemsa-stained image datasets; (b) image superimposed with the GradCAM visualization of the final convolutional layer of the 6-layer CNN model; (c) image superimposed with the GradCAM visualization of the final convolutional layer of VGGNet; (d) image superimposed with the GradCAM visualization of the final convolutional layer of ResNet50; (e) image superimposed with the GradCAM visualization of the final convolutional layer of InceptionV3
Fig. 22: depicts Giemsa-stained real-time H(+ve) and H(-ve) images collected from a collaborative super-speciality hospital
DETAILED DESCRIPTION OF THE INVENTION:
The present invention discloses a decision-support model, BoostedNet, for the identification of H. pylori within gastric histopathological images, which leverages a CNN model integrated with the scalable Extreme Gradient Boosting (XGBoost) algorithm. The model comprises two primary components: a Convolutional Neural Network (CNN) feature extractor and an XGBoost classifier. The BoostedNet system is built on a CNN framework unified with the XGBoost classifier, offering a creative approach to clinical decision-making in the detection of H. pylori bacteria.
Fig. 1 depicts the overall flow diagram representing the BoostedNet model. The model is based on an indigenous 6-layer CNN model which is designed for extracting discriminative features from H. pylori positive and H. pylori negative datasets through the CNN feature extractor. The CNN feature extractor’s role is to extract pertinent feature maps that have the capability to identify the H. pylori bacterium within histopathological images. The extracted feature maps are then fed into the XGBoost classifier, which conducts a binary classification task to determine whether the histopathology image is H (+ve) or H(-ve).
CNN Feature Extractor:
The CNN feature extractor (Fig. 2) is composed of three convolutional blocks, each containing two convolutional layers with filters for extracting feature maps. To reduce the spatial dimensions of the feature maps, the CNN model incorporates three max-pooling layers; each max-pooling layer follows the final convolution layer in its convolutional block and halves the spatial dimensions. The final feature maps obtained from the third max-pooling layer are fed into the XGBoost classifier, which conducts a binary classification task to determine whether the histopathology image is H(+ve) or H(-ve) on the basis of the presence or absence of the H. pylori bacterium in the images. The CNN feature extractor is optimized by evaluating the impact of varying the numbers of its convolutional and max-pooling layers, and it extracts pertinent feature maps that have the capability to identify the H. pylori bacterium within histopathological images.
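The feature-extractor structure described above can be sketched in Keras as follows. The framework choice, the filter counts (32/64/128), and the 3×3 kernels are assumptions for illustration only; the description specifies only the block and layer counts.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_feature_extractor(input_shape=(256, 256, 3)):
    """Three convolutional blocks of two conv layers each, each block closed
    by a 2x2 max-pooling layer that halves the spatial dimensions."""
    inputs = layers.Input(shape=input_shape)
    x = inputs
    for filters in (32, 64, 128):  # filter counts are assumed for illustration
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
        x = layers.MaxPooling2D(pool_size=2)(x)
    return models.Model(inputs, x, name="cnn_feature_extractor")

extractor = build_feature_extractor()
# 256 -> 128 -> 64 -> 32 after the three pooling stages
print(extractor.output_shape)  # (None, 32, 32, 128)
```

The flattened output of the third max-pooling layer would then serve as the feature vector passed to the XGBoost classifier.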
Extreme Gradient Boosting Classifier (XGBoost):
XGBoost is a tree-based algorithm that has gained substantial traction in the realm of data classification. It is perceived as a scalable end-to-end tree-boosting system extensively employed in ML for classification and regression tasks. The objective function of the XGBoost model is represented as

obj(θ) = L(θ) + Ω(θ)

where L(θ) is the training loss function and Ω(θ) is the complexity function of the tree. The training loss is

L(θ) = Σ_{i=1}^{n} l(y_i, ŷ_i)

where l(y_i, ŷ_i) is the training loss for each sample, y_i represents the true value of the i-th sample, and ŷ_i represents its estimated value. The complexity function is

Ω(f) = γT + (1/2) λ Σ_{i=1}^{T} w_i²

where w_i is the score on the i-th leaf node and T is the number of leaf nodes in the tree. By adjusting parameters, the objective function is continuously optimized and optimal results are obtained. The XGBoost parameters are set up as follows:
• learning_rate= 0.1
• n_estimators = 100
• max_depth =3
• min_child_weight=1
• gamma = 0
• subsample =0.8
• colsample_bytree =0.8
The default values are used for other parameters, including general parameters, booster parameters, learning task parameters, and command-line parameters.
Development of the BoostedNet model was undertaken based on the data taken from two publicly available datasets: DeepHP and Giemsa-stained gastric histopathological images:
a) DeepHP Dataset:
The DeepHP dataset, composed of histopathological images of gastric mucosa stained with hematoxylin and eosin, was used for the detection of the H. pylori bacterium. The images were captured with a ZEISS Axio Imager microscope at a magnification of 20X, pre-processed, and formatted in RGB with a pixel resolution of 0.16 µm. The dataset, comprising a total of 13,921 images, was derived from 19 WSI scans of histopathological samples; 9,926 of these images were categorized as H(-ve) and the remaining 3,995 were labelled H(+ve), the dimensions of each image being 2776 × 2080 pixels. Preprocessing procedures applied to these images included segmenting them into patches of 1000 × 1000 pixels, yielding 111,000 H(+ve) image patches and 285,000 H(-ve) image patches. Fig. 3a depicts samples of images in the DeepHP dataset.
b) Giemsa Dataset:
A Giemsa-stained dataset, sourced from Kaggle, was utilized to validate the proposed BoostedNet model. This dataset comprised 24,901 image patches derived from gastric histopathological images, comprising 8,403 H(+ve) and 16,500 H(-ve) image patches. Fig. 3b depicts samples from the Giemsa-stained dataset.
Data Preprocessing and Augmentation:
All images in the DeepHP and Giemsa datasets were resized to dimensions of 256 × 256 pixels, aligned with the baseline CNN model's specifications. In the case of the Giemsa-stained images, Gaussian filtering was employed to address distortion in this dataset and remove noise prior to the application of data augmentation. The training dataset underwent in-place data augmentation to improve the model's generalization and mitigate overfitting. This process involved image transformations: rotation within a range of (-20 to +20) degrees, a zoom range of 0.2, a shear range of 0.2, and vertical flipping.
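The preprocessing and augmentation steps above can be sketched as follows, assuming a Keras/scipy toolchain; the Gaussian filter sigma is an assumed value, since the description does not specify it.

```python
import numpy as np
from scipy.ndimage import gaussian_filter
from tensorflow.keras.preprocessing.image import ImageDataGenerator

def denoise(image, sigma=1.0):
    """Per-channel Gaussian filtering to remove noise (sigma is an assumed value)."""
    return np.stack(
        [gaussian_filter(image[..., c], sigma) for c in range(image.shape[-1])],
        axis=-1)

# In-place augmentation with the transformations listed above.
augmenter = ImageDataGenerator(
    rotation_range=20,   # rotation within (-20, +20) degrees
    zoom_range=0.2,
    shear_range=0.2,
    vertical_flip=True,
)

patch = np.random.rand(256, 256, 3).astype("float32")  # stand-in for a resized patch
augmented = augmenter.random_transform(denoise(patch))
print(augmented.shape)  # (256, 256, 3)
```

In practice the augmenter would be applied on the fly to training batches, so each epoch sees a freshly transformed version of the data.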
Model Training Setup:
All input images from the DeepHP and Giemsa datasets were resized to a consistent dimension of 256 × 256 pixels, aligned with the baseline CNN model's specifications. The dataset was partitioned into distinct training and validation sets, employing a train-test split ratio of 70:30. Experimental validation of the BoostedNet model was carried out utilizing a Tesla K80 GPU, accessible through Google Colaboratory. The training process, identical for the DeepHP and Giemsa datasets, utilized the Adam optimizer and the categorical cross-entropy loss function. The BoostedNet model was trained over 100 epochs, with a mini-batch size of 32. Performance of the proposed model was appraised with reference to the following metrics: Accuracy, Precision, Recall, F1 score, Sensitivity, Specificity, MCC, Receiver Operating Characteristic (ROC) curve, Precision-Recall (PR) curve, and Precision-Confidence (PC) curve.
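The training setup above can be sketched as follows; the indices and labels are synthetic placeholders, and the commented compile/fit calls assume a Keras implementation.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Stand-ins for image indices and binary labels.
indices = np.arange(1000)
labels = np.random.randint(0, 2, size=1000)

# 70:30 train/validation partition, as described above.
train_idx, val_idx, y_train, y_val = train_test_split(
    indices, labels, test_size=0.30, random_state=42, stratify=labels)
print(len(train_idx), len(val_idx))  # 700 300

EPOCHS, BATCH_SIZE = 100, 32
# The CNN would then be compiled and trained, e.g.:
# model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_data, epochs=EPOCHS, batch_size=BATCH_SIZE, validation_data=val_data)
```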
Details of these metrics have been reported in Eqn.1, Eqn.2, Eqn.3, Eqn.4, Eqn.5 and Eqn.6. The variables in these equations TP, TN, FP, and FN represent specific classification outcomes. TP (True Positive) signifies the count of images correctly identified as H(+ve) images within the test set, TN (True Negative) indicates the number of images accurately categorized as H(-ve) images in the test set, FP (False Positive) represents the count of images erroneously classified as H(+ve) in the test set and FN (False Negative), denotes the number of images that were misinterpreted as H(-ve) images in the test set. Macroaverage was utilized in the performance metrics calculations to ensure equal consideration for each class.
Accuracy = (TP + TN) / (TP + FP + TN + FN)    (1)
Precision = TP / (TP + FP)    (2)
Recall = TP / (TP + FN)    (3)
Specificity = TN / (TN + FP)    (4)
F1 Score = 2 × (Precision × Recall) / (Precision + Recall)    (5)
MCC = ((TP × TN) − (FP × FN)) / √((TP + FP) × (TP + FN) × (TN + FP) × (TN + FN))    (6)
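A worked example of Eqns. 1-6 computed from confusion-matrix counts follows; the counts are illustrative only, not results from the model.

```python
import math

# Illustrative confusion-matrix counts (not results from the model).
TP, TN, FP, FN = 90, 95, 5, 10

accuracy    = (TP + TN) / (TP + FP + TN + FN)                # Eqn. 1
precision   = TP / (TP + FP)                                 # Eqn. 2
recall      = TP / (TP + FN)                                 # Eqn. 3
specificity = TN / (TN + FP)                                 # Eqn. 4
f1_score    = 2 * precision * recall / (precision + recall)  # Eqn. 5
mcc = ((TP * TN) - (FP * FN)) / math.sqrt(                   # Eqn. 6
    (TP + FP) * (TP + FN) * (TN + FP) * (TN + FN))

print(round(accuracy, 3))  # 0.925
```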
Receiver Operating Characteristic curve:
The ROC curve is a graphical representation of a model’s performance for binary classification problems in the context of varying the decision threshold levels. ROC delineates the trade-off between the True Positive Rate (TPR or Sensitivity) and False Positive Rate (FPR, 1-specificity) at different threshold values. A larger area under the ROC curve (AUC-ROC) value signifies the model’s better discriminative capability and classification performance.
Precision-Recall Curve:
The PR curve is another graphical tool used to evaluate the performance of a binary classification model, focused on Precision and Recall rather than on the TPR and FPR. The PR curve elucidates the trade-off between Precision (Positive Predictive Value) and Recall (Sensitivity or TPR) at various decision thresholds.
Precision-Confidence Curve:
The PC curve illustrates the confidence score corresponding to each precision value. The confidence score, usually expressed as a probability or a model-assigned score, signifies the level of certainty apropos of the correctness of each prediction.
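Each of the curves above is traced by sweeping the decision threshold over the model's predicted scores. As a minimal, library-free sketch of the ROC case (in practice these curves would be computed from the trained model's probabilities, e.g. with scikit-learn), assuming a list of scores and binary labels:

```python
def roc_points(scores, labels, thresholds):
    """Sweep the decision threshold: at each threshold t, a score >= t is
    called positive, and the resulting (FPR, TPR) pair is one ROC point."""
    pos = sum(labels)
    neg = len(labels) - pos
    pts = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        pts.append((fp / neg, tp / pos))  # FPR = 1 - specificity, TPR = sensitivity
    return pts

# Toy scores from a classifier that separates the classes perfectly at 0.5.
pts = roc_points(scores=[0.9, 0.8, 0.2, 0.1], labels=[1, 1, 0, 0],
                 thresholds=[0.0, 0.5, 1.0])
```

The PR and PC curves are built the same way, recording (Recall, Precision) or (confidence, Precision) pairs instead of (FPR, TPR).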
Fine-tuning of CNN Feature Extractor:
The CNN feature extractor was optimized by evaluating the impact of varying the numbers of its convolutional and max-pooling layers. The prudence of adding layers to the CNN framework was gauged by analyzing the classification performance obtained after attaching fully connected layers. The results are presented in Table 1. Three Dense layers were employed to assess the classification performance of the proposed CNN Feature Extractor. The first two Dense layers each used the ReLU activation function and were followed by a dropout layer with a regularization rate of 0.25. The first Dense layer, comprising 64 units, learns high-level features while introducing non-linearity into the model. The second Dense layer, comprising 256 units, focuses on capturing complex patterns and relationships within the extracted features. The third Dense layer has a single unit and applies the sigmoid activation function, providing the binary classification output.
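Based on the layer counts and dense-head details described above, the architecture can be sketched in Keras as follows. The convolution filter counts, kernel sizes, and input resolution are not specified in the text and are illustrative assumptions; likewise, binary cross-entropy is used here to match the single sigmoid output unit, although the training description mentions categorical cross-entropy.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_feature_extractor_cnn(input_shape=(64, 64, 3)):
    """6 convolution layers, 3 max-pooling layers, 3 dense layers, per the text.
    Filter counts, kernel sizes and input resolution are assumed values."""
    model = keras.Sequential([keras.Input(shape=input_shape)])
    for i, filters in enumerate((32, 32, 64, 64, 128, 128)):  # 6 conv layers
        model.add(layers.Conv2D(filters, 3, padding="same", activation="relu"))
        if i % 2 == 1:                                        # 3 max-pooling layers
            model.add(layers.MaxPooling2D())
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation="relu"))    # high-level features
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(256, activation="relu"))   # complex patterns
    model.add(layers.Dropout(0.25))
    model.add(layers.Dense(1, activation="sigmoid"))  # binary H(+ve)/H(-ve) output
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_feature_extractor_cnn()
```

The shallower configurations evaluated in Table 1 correspond to the same pattern with fewer convolution/pooling pairs and a smaller dense head.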
Table 1: Performance analysis of CNN architectures with varying layers on DeepHP dataset
Convolution Layers | MaxPooling Layers | Dense Layers | Epochs | Accuracy (%)
2 | 2 | 2 | 10 | 84.58
2 | 2 | 2 | 20 | 89.01
2 | 2 | 2 | 30 | 90.82
2 | 2 | 2 | 100 | 92.14
2 | 2 | 3 | 10 | 90.23
2 | 2 | 3 | 20 | 94.11
2 | 2 | 3 | 30 | 95.67
2 | 2 | 3 | 100 | 96.93
3 | 2 | 3 | 10 | 92.10
3 | 2 | 3 | 20 | 93.08
3 | 2 | 3 | 30 | 96.34
3 | 2 | 3 | 100 | 97.12
4 | 3 | 3 | 10 | 89.27
4 | 3 | 3 | 20 | 93.58
4 | 3 | 3 | 30 | 95.73
4 | 3 | 3 | 100 | 96.81
5 | 3 | 3 | 10 | 94.15
5 | 3 | 3 | 20 | 95.94
5 | 3 | 3 | 30 | 97.08
5 | 3 | 3 | 100 | 97.93
6 | 3 | 3 | 10 | 96.56
6 | 3 | 3 | 20 | 97.12
6 | 3 | 3 | 30 | 98.19
6 | 3 | 3 | 100 | 98.41
Comparison of the results in Table 1 helped determine which convolutional architecture yielded better performance for H(+ve) and H(-ve) image classification. The models were trained over four durations (10, 20, 30, and 100 epochs) to monitor progress in accuracy over time, and the corresponding accuracy results are reported in Table 1. Among the evaluated architectures, the CNN model configured with 6 convolution layers, 3 max-pooling layers, and 3 dense layers achieved the best classification accuracy of 98.41%. However, increasing the number of convolutional layers beyond 6 did not enhance the classification performance for detecting H. pylori bacteria. It is contended that the 6-layer configuration justifiably enables the model to capture intricate patterns and hierarchical representations within the input images, strengthening the model's overall performance. Incorporation of dropout layers within the fully connected layers precluded overfitting; this regularization technique enhanced the model's generalization ability beyond the training data. This analysis validates the selection of the 6-layer CNN model as the baseline feature extractor for the detection of the H. pylori bacterium in the DeepHP dataset.
Selection of Optimizer:
The performance analysis of the popular optimizers Adam, RMSprop, Stochastic Gradient Descent (SGD), and Adagrad was carried out using a fixed learning rate of 0.001 for training on the DeepHP dataset. Based on the results in Table 2, the Adam optimizer achieved the highest accuracy among the tested optimizers, attaining 98.41%.
Table 2: Comparison of accuracy for different optimizers at a Learning Rate of 0.001
Optimizer Accuracy (%)
Adam 98.41
RMSprop 98.02
Adagrad 93.89
SGD 92.35
These findings affirm that the Adam optimizer is the preferred choice for this specific task. The Adam optimizer is well known for its effectiveness in optimizing DL models. By adjusting the learning rate for each parameter individually, it combines the advantages of the AdaGrad and RMSprop optimizers. Adam's adaptiveness enables it to handle sparse gradients and achieve faster convergence. In terms of accuracy, the Adam optimizer outperformed the other evaluated optimizers, rendering it the optimal choice for tuning the baseline CNN model for H(+ve) and H(-ve) image classification.
Classification Results of BoostedNet Model:
Results from Table 1 demonstrated that the 6-layer CNN model outperformed CNNs with 2, 3, 4 or 5 layers. Consequently, the 6-layer CNN model was selected as the feature extractor for the classification of H(+ve) and H(-ve) images. To assess the generalizability and reliability of the model, empirical experiments were conducted using various combinations of the H&E stained and Giemsa-stained datasets. The first experiment (Experiment-1) exclusively used the H&E stained DeepHP dataset for training and testing the proposed BoostedNet model. Next, a separate analysis (Experiment-2) was conducted solely using the Giemsa-stained dataset. For Experiment-3, images from the H&E stained DeepHP and Giemsa-stained datasets were combined for training and testing the BoostedNet model. The data used for each experiment is reported in Table 3.
Table 3: Data summary of experiments on BoostedNet with DeepHP and Giemsa datasets
Experiment | Train H(+ve) | Train H(-ve) | Validation H(+ve) | Validation H(-ve) | Test H(+ve) | Test H(-ve)
Experiment-1 | 35399 | 39721 | 7583 | 8510 | 8510 | 7583
Experiment-2 | 11633 | 11557 | 2496 | 2474 | 2474 | 2495
Experiment-3 | 47502 | 51278 | 10078 | 10084 | 10985 | 8845
Subsections hereunder present detailed observations from experiments run on the proffered BoostedNet model.
Experiment-1:
Analysis of BoostedNet Model Using H&E Stained DeepHP Dataset:
A comprehensive analysis was conducted to compare the performance of the baseline 6-layer CNN model with the BoostedNet model. The 6-layer CNN model was trained using the DeepHP dataset. The model's performance was appraised by generating training and validation graphs for accuracy and loss, as depicted in Fig. 4. The validation loss and accuracy curves serve as crucial indicators of how accurately the proposed model predicted labels for a new test dataset. Analysis of the patterns and trends within these curves provided ML practitioners with valuable insights into the model's stability, enabling fine-tuning of the training process, improved performance, and the prevention of overfitting.
As illustrated in Fig. 4, the validation loss and accuracy curves were derived from training the tissue images from the DeepHP dataset using the 6-layer CNN model. These graphs demonstrate the model's ability to minimize training and validation losses. Furthermore, they indicate that the model can generalize well to new data samples with acceptable accuracy. A detailed performance comparison between the two models is presented in Table 4. The BoostedNet model turned in outstanding performance, with a 0.83% boost in Accuracy, a 0.93% improvement in Precision, a 0.84% improvement in Recall, a 0.9% improvement in F1 Score, a 1.1% improvement in Specificity, and a 1.7% improvement in MCC compared to the 6-layer CNN model. These metrics reaffirm the model's excellence in classification, with near-error-free accuracy and precision and minimal instances of false positives (FP) and false negatives (FN). The high Recall score implies a low rate of false negatives. In addition, the F1 score, which balances precision and recall, reached an impressive value of 99.19%, highlighting the model's effectiveness in both aspects. The robust MCC, serving as an indicator of binary classification predictive quality, reinforced the model's strong predictive performance.
Fig. 5 illustrates the confusion matrices for the 6-layer CNN and the BoostedNet model. The results showcase the exceptional performance of the BoostedNet model in the classification of the DeepHP dataset compared to the 6-layer CNN model. The ROC analysis conducted on the 6-layer CNN and the BoostedNet models is summarized in Fig. 6. The CNN model yielded an AUC of 0.9892, while the BoostedNet model achieved an AUC of 0.9902, illustrating the near-perfect performance of the BoostedNet model compared to the 6-layer CNN model. The findings signify the efficacious capability of the BoostedNet model in discerning H&E stained H(+ve) and H(-ve) DeepHP images.
Table 4: Performance analysis of BoostedNet model with baseline 6-layer CNN architecture on DeepHP dataset
CNN Models Accuracy (%) Precision (%) Recall/Sensitivity (%) F1 Score (%) Specificity (%) MCC (%)
6 layer CNN Model 98.41 98.56 98.07 98.31 98.71 96.82
BoostedNet (6 layer CNN Model with XGBoost) 99.23 99.48 98.89 99.19 99.80 98.47
An analysis of the PR curve for both the 6-layer CNN and the BoostedNet models is illustrated in Fig. 7. The PR curve computes the discriminatory capabilities of the models at various threshold values. Examination of Fig. 7 reveals that both models displayed comparable performance in Experiment-1 on the H&E stained DeepHP dataset.
The PC Curve illustrated in Fig. 8 validates that the 6-layer CNN and BoostedNet models consistently maintain higher predictions for H(+ve) at different confidence thresholds, establishing their reliability in DeepHP image classification.
The Grad-CAM visualization technique, a gradient-based method widely recognized for its diagnostic insights, was applied to assess the diagnostic interpretability of the proposed CNN model. This scheme generates heatmaps to pinpoint the specific regions where the CNN model focuses during its decision-making process. The feature maps extracted from the final convolutional layer of the 6-layer CNN model were subjected to Grad-CAM visualization. Fig. 9 portrays the Grad-CAM visualizations for images in the DeepHP dataset, wherein both correctly and incorrectly classified images were analyzed, revealing the focal regions influencing the model's decision-making process. Analysis of the correctly classified images in Fig. 9(a) showed that the 6-layer CNN model focuses on the surface epithelium regions of DeepHP images. Given that H. pylori exhibits an affinity for gastric mucus cells and is often observed in the mucus overlying the surface epithelium, as well as in crypts and glands, visualization was focused on these characteristics in the gastric histopathology images. The Grad-CAM visualization of the 6-layer CNN model, as shown in Fig. 9, revealed that the model accurately extracted features from the H. pylori affinity regions and pinpointed the regions overlying the surface epithelium, aligning with the typical localization of the organism. The BoostedNet model demonstrated the facility to precisely identify regions where H. pylori commonly resides, effectively distinguishing between H(+ve) and H(-ve) cases. Above all, the model accurately differentiated other tissue components, such as nuclei and cytoplasm, from the active H. pylori areas in the gastric histopathology images.
Reliability of the suggested BoostedNet model was appraised via heatmaps generated by the final convolutional layer of the baseline 6-layer CNN model on incorrectly classified images. For these images, the highlighted regions do not fully align with the H. pylori affinity regions, mainly picking up features from other parts of the images, such as nuclei and cytoplasm. This suggests that further fine-tuning of the BoostedNet model could improve its ability to accurately classify H(+ve) and H(-ve) images.
The BoostedNet model represents a promising solution, offering not only dependable classification performance but also insightful diagnostic interpretability through its ability to accurately identify and localize H. pylori in gastric histopathological images.
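The Grad-CAM computation itself reduces to a small amount of array arithmetic once two arrays have been extracted from the network (for example with TensorFlow's GradientTape): the final convolutional feature maps and the gradient of the class score with respect to them. A framework-agnostic NumPy sketch, with toy arrays standing in for real activations:

```python
import numpy as np

def grad_cam(feature_maps, gradients):
    """Grad-CAM heatmap for one image.

    feature_maps: (H, W, K) activations of the final convolutional layer.
    gradients:    (H, W, K) gradient of the target class score w.r.t. them.
    Returns an (H, W) heatmap normalized to [0, 1].
    """
    weights = gradients.mean(axis=(0, 1))                 # channel importance
    cam = np.tensordot(feature_maps, weights, axes=([2], [0]))
    cam = np.maximum(cam, 0.0)                            # keep positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                             # scale for heatmap overlay
    return cam

# Toy example: only channel 0 has a positive gradient, so the heatmap
# highlights the location where channel 0 activates.
fmaps = np.zeros((4, 4, 2)); fmaps[1, 1, 0] = 5.0; fmaps[2, 2, 1] = 3.0
grads = np.zeros((4, 4, 2)); grads[..., 0] = 1.0
heat = grad_cam(fmaps, grads)
```

In the reported pipeline the resulting heatmap is upsampled to the input image size and overlaid on the histopathology patch to reveal the focal regions.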
Experiment-2:
Analysis of the BoostedNet Model Using Giemsa-Stained Dataset
Efficacy of the BoostedNet model's classification capability was gauged using the dataset of Giemsa-stained images. The evaluation entailed scrutinizing the performance of the 6-layer CNN and the BoostedNet models with the Giemsa-stained dataset. Overfitting was encountered during the training phase of the 6-layer CNN model, evidenced by the validation loss and accuracy curves, which demonstrated unacceptable convergence. The performance metrics achieved by the 6-layer CNN are tabulated in Table 5.
Table 5: Performance analysis of BoostedNet model with baseline 6-layer CNN architecture on Giemsa-stained dataset
CNN Models Accuracy (%) Precision (%) Recall/Sensitivity (%) F1 Score (%) Specificity (%) MCC (%)
6 layer CNN Model 91.08 87.34 96.19 91.55 85.93 82.59
BoostedNet (6 layer CNN Model with XGBoost) 94.79 98.07 91.42 94.63 98.18 89.79
Notable improvement in performance was apparent on integration of the 6-layer CNN with XGBoost in the BoostedNet model. For Giemsa-stained images, the BoostedNet model demonstrated impressive performance compared to the 6-layer CNN, notching up a notable 4.07% increase in Accuracy, a 12.29% enrichment in Precision, a 3.36% enhancement in F1 score, a 14.26% boost in Specificity, and an 8.72% improvement in MCC. These enhancements underscore the effectiveness of the BoostedNet model in the classification of Giemsa-stained gastric histopathological images in preference to the standalone 6-layer CNN. These metrics reassert the BoostedNet model's strong performance, with a low incidence of false positives and false negatives. The F1 score, which assesses the weighted average of precision and recall, also achieved a satisfactory value, indicating a balanced trade-off between the two metrics. However, a decrease in Recall was observed on integration of the XGBoost algorithm with the 6-layer CNN, indicative of residual misclassifications or inconsistencies in the model's predictions.
Fig. 10 illustrates the confusion matrices for the 6-layer CNN and the BoostedNet models. The number of false negatives increased after integration of XGBoost with the baseline model, highlighting the difficulty of accurately classifying H(+ve) instances in Giemsa-stained samples. The decrease in Recall indicates that, despite the model's promising overall performance, there could be a trade-off in its capability to accurately detect positive instances, which underscores the significance of this metric.
Analysis of the ROC curve for the 6-layer CNN and the BoostedNet models with Giemsa-stained images demonstrated the enhanced performance of the BoostedNet model (Fig. 11). The 6-layer CNN yielded an AUC of 0.9106, whereas the BoostedNet model achieved an AUC of 0.9808, indicative of excellent performance compared to the baseline 6-layer CNN model. The higher AUC reveals a greater capability of the model to correctly discriminate H(+ve) and H(-ve) cases.
The PR Curve evaluates the discrimination power of the model at different threshold values. From Fig. 12, the 6-layer CNN achieved a better Precision-Recall trade-off at high thresholds compared to the BoostedNet model. The BoostedNet model produced fewer false positives but more false negatives than the 6-layer CNN model. The PC curve in Fig. 13 illustrates the capability of the BoostedNet model to consistently maintain higher H(+ve) predictions at different confidence thresholds, rendering it more reliable than the 6-layer CNN model.
The 6-layer CNN model’s diagnostic interpretability of Giemsa-stained images was examined by application of the Grad-CAM visualization technique as shown in Fig. 14. The resulting heatmaps effectively identified surface epithelial regions, allowing for the localization of H. pylori bacterium.
Although the CNN model has shown promising performance, further refinements are needed to tackle the challenges posed by staining variations and cellular similarities, thereby enhancing the model's accuracy and overall predictive capabilities. To address these issues, meticulous hyperparameter tuning was crucial to mitigate overfitting and enhance the model's adaptability to newer datasets. Additional fine-tuning of the CNN model is recommended to enhance its efficacy in classifying Giemsa-stained gastric histopathology images, particularly for identifying H(+ve) and H(-ve) H. pylori cases. The BoostedNet model exhibited outstanding performance with the DeepHP dataset (Experiment-1) compared to the Giemsa-stained dataset. The lower accuracy in the latter case may be attributed to variations in staining intensity and the presence of other cellular structures resembling H. pylori bacteria, potentially leading to misclassification.
Experiment-3:
Analysis of BoostedNet Model Using Both H&E Stained DeepHP and Giemsa-Stained Images
The 6-layer CNN model was trained using both H&E stained DeepHP and Giemsa-stained images. The model's behavior was scrutinized by generating training and validation graphs for accuracy and loss, as depicted in Fig. 15. These graphs demonstrate the model's ability to minimize training and validation losses, suggesting a minimal overfitting problem. Furthermore, they indicate that the model can generalize well to new data samples with acceptable accuracy. The BoostedNet model achieved substantial improvements, including a 3.82% increase in Accuracy, a noteworthy 5.54% enhancement in Precision, a 3.09% improvement in Recall, a 4.24% boost in F1 score, a significant 4.68% improvement in Specificity, and an impressive 8.42% increase in MCC when compared to the 6-layer CNN model.
Table 6: Performance analysis of BoostedNet model with baseline 6-layer CNN architectures on H&E and Giemsa stained images
CNN Models Accuracy (%) Precision (%) Recall/Sensitivity (%) F1 Score (%) Specificity (%) MCC (%)
6 layer CNN Model 92.55 90.78 92.65 91.73 92.14 84.96
BoostedNet (6 layer CNN Model with XGBoost) 96.09 95.73 95.51 95.62 96.45 92.10
These metrics indicate the BoostedNet model's proficiency in classification tasks. The accuracy, precision, and recall scores are notably high, signifying the model's accurate predictability, with a minimal occurrence of false positives and false negatives. Furthermore, the F1 score, which balances precision and recall, also achieved a high value, highlighting the model's effectiveness in both aspects. Moreover, the MCC, an indicator of predictive performance, reached a high level, validating the model's robustness. The higher accuracy observed in Experiment-3 compared to Experiment-2 suggests that combining images with distinct staining helped address the limitations associated with exclusive use of Giemsa staining, ultimately leading to improved results. The results underlined the versatility of the BoostedNet model for effective classification of H(+ve) and H(-ve) instances in H&E stained and Giemsa-stained images. The confusion matrix of Experiment-3 is presented in Fig. 16. These results demonstrate the model's successful performance and the potential benefits of employing multiple staining techniques for enhanced classification accuracy in medical image analysis.
Assessment of the ROC curve, Precision-Recall Curve, and Precision-Confidence Curve for the 6-layer CNN model and the BoostedNet model, utilizing a combined dataset that included H&E and Giemsa-stained images, revealed improved performance of the BoostedNet model (Fig. 19, Fig. 20, Fig. 21). The 6-layer CNN model achieved an AUC of 0.9503 in the ROC analysis, whereas the BoostedNet model demonstrated a superior AUC of 0.9540, indicating improved performance in comparison to the baseline 6-layer CNN. The higher AUC value signifies the model's improved discrimination ability, demonstrating its proficiency in correctly identifying both H&E stained and Giemsa-stained H(+ve) and H(-ve) images.
The PR Curve assesses the model's discriminatory power across various threshold values. As depicted in Fig. 17, the 6-layer CNN model attained a more favorable Precision-Recall trade-off at higher thresholds compared to the BoostedNet model. The BoostedNet model nevertheless demonstrated fewer false positives and false negatives than the 6-layer CNN model.
The PC Curve in Fig. 18 bears testimony to the BoostedNet model's ability to consistently uphold higher H(+ve) predictions across various confidence thresholds, establishing its greater reliability compared to the 6-layer CNN model. To visualize the diagnostic interpretability of the proposed 6-layer CNN model on both H&E stained and Giemsa-stained images, heatmaps were generated through Grad-CAM visualization to pinpoint the specific regions focused on by the CNN model during its decision-making process. Fig. 19 presents the Grad-CAM visualizations of the final convolutional layer of the 6-layer CNN model on combined images from the DeepHP and Giemsa-stained datasets. Both correctly and incorrectly classified images were analyzed in Fig. 19. Apropos of the correctly classified images, the model was observed to focus on the surface epithelium region in the H(+ve) DeepHP and Giemsa-stained images. H. pylori exhibits an affinity for gastric mucus cells and is often observed in the mucus overlying the surface epithelium and in crypts and glands; hence the visualization focused on these characteristics in the H(+ve) gastric histopathological images. Analysis of the H(-ve) DeepHP and Giemsa-stained images revealed that the model specifically focused on the gastric mucus regions. The Grad-CAM visualization of the proposed 6-layer CNN model confirmed that the model extracted features from the H. pylori affinity regions and accurately pinpointed the regions overlying the surface epithelium, aligning with the typical localization of the organism. The model accurately differentiated other tissue components from the active H. pylori areas in the gastric histopathology images.
Reliability of the suggested BoostedNet model was appraised via heatmaps of incorrectly classified images, generated from the final convolutional layer of the baseline 6-layer CNN, as shown in Fig. 19(b). The BoostedNet model's highlighted regions do not fully align with the H. pylori affinity regions. However, the visual representation shows a significant overlap with the regions of interest in the original images. This suggests that fine-tuning the BoostedNet model can enhance its ability to accurately classify H(+ve) and H(-ve) images.
The BoostedNet model showcased commendable performance with H&E stained and Giemsa-stained images, exhibiting the highest overall performance across all metrics. Experiment-2 demonstrated lower performance in terms of the MCC metric. Variability in the staining techniques played a pivotal role in influencing the visual appearance and distinguishability of H. pylori in the images, thereby impacting the models' performance. Despite these differences, all models exhibited favorable outcomes in their respective contexts. Fig. 20 portrays the comparative summary of performance metrics of the BoostedNet model on H&E and Giemsa-stained images. The diverse experimental analyses with distinct sets of stained images reveal the generalizability of the BoostedNet model in handling stain variability effectively.
Performance comparison of BoostedNet Model with Other CNN Models:
In addition to comparing the BoostedNet model with the 6-layer CNN model, its performance was evaluated against other popular CNN models: VGG16, InceptionV3, and ResNet50. Table 7 presents a performance comparison of these distinct models. To assess them, a transfer learning technique was employed, followed by fine-tuning using images from both the DeepHP and Giemsa-stained datasets. Besides, the impact of integrating XGBoost with these models was examined, resulting in improved performance of all models after integration, as shown in Table 7. Both the 6-layer CNN model and the BoostedNet model outperformed these models, affirming the propriety of choosing the 6-layer CNN model as the baseline model for the classification of H(+ve) and H(-ve) images in the DeepHP dataset. Fusion of XGBoost with the 6-layer CNN model culminated in the creation of the BoostedNet model, which represents an innovative and efficient approach for the classification of gastric histopathology image datasets. Overall, the BoostedNet and CNN models demonstrated superior performance in comparison with the VGG16, InceptionV3 and ResNet50 models. The reliability of the model was assessed through Grad-CAM visualization, as illustrated in Fig. 23.
In Fig. 21(a), the heatmaps generated by the 6-layer CNN model indicated a close focus on the surface epithelial regions, where the affinity for H. pylori was highest. Therefore, we claim that the proposed 6-layer CNN model is a preferred choice for the H(+ve) and H(-ve) classification of gastric histopathology images. Fig. 21(b) displayed the heatmaps of the VGG16 model, which concentrated closely on the gastric mucus area—a region also in close proximity to H. pylori affinity. Hence, VGG16 can be considered an effective model for detecting H. pylori in gastric histopathological images. However, the Inception model and ResNet50 model, as shown in Fig. 21(c) and Fig. 21(d), focused on peri-nuclear and inter-cytoplasmic regions, respectively, which lack affinity for H. pylori presence. Therefore, these two models are not suitable for the analysis of gastric histopathology images to detect H(+ve) and H(-ve) cases.
Table 7: Performance comparison of BoostedNet model with other CNN models
Each model is reported on three datasets: the H&E stained DeepHP dataset, the Giemsa-stained dataset, and the combined dataset. Within each group the metrics are, in order: Accuracy (%), Precision (%), Recall (%), F1 Score (%), Specificity (%), MCC (%).
Model | DeepHP dataset | Giemsa-stained dataset | Combined dataset
VGG16 | 96.89 99.85 93.45 96.59 99.96 93.92 | 95.23 96.08 94.35 95.21 96.12 90.48 | 96.65 97.10 93.23 95.12 98.24 90.97
VGG16 + XGBoost | 98.19 98.79 97.36 98.07 98.94 96.39 | 95.27 95.68 94.87 95.27 95.68 90.54 | 96.88 96.23 94.38 95.29 97.35 91.39
InceptionV3 | 85.04 95.61 71.54 81.84 97.07 71.69 | 85.79 85.77 85.97 85.87 85.61 71.58 | 85.32 85.88 82.19 83.99 86.23 71.32
InceptionV3 + XGBoost | 89.11 89.21 87.47 88.33 90.58 78.14 | 86.13 88.56 83.13 85.76 89.17 72.41 | 88.59 90.41 87.68 89.02 91.21 78.93
ResNet50 | 72.03 64.04 92.67 75.73 53.63 49.61 | 85.83 92.34 78.28 84.73 93.45 72.53 | 79.32 76.88 76.11 76.79 81.42 58.14
ResNet50 + XGBoost | 98.04 97.31 98.56 97.93 97.52 96.07 | 86.86 89.09 84.13 86.54 89.61 73.84 | 90.31 91.26 92.82 92.06 91.83 89.18
6-layer CNN | 98.41 98.56 98.07 98.31 98.71 96.82 | 91.08 87.34 96.19 91.55 85.93 82.59 | 92.55 90.78 92.65 91.73 92.14 84.95
BoostedNet (6-layer CNN + XGBoost) | 99.23 99.48 98.89 99.19 99.80 98.47 | 94.79 98.07 91.42 94.63 98.18 89.79 | 96.09 95.73 95.51 95.62 96.45 92.10
The prime objective of the invention is to develop a computer-aided system for accurate classification of gastric histopathology H(+ve) and H(-ve) images utilizing conventional datasets with differently stained images. The proposed BoostedNet model incorporates a custom CNN architecture unified with the XGBoost algorithm, leading to enhanced prediction capabilities and the ability to capture unique patterns. Optimistic outcomes were noted with the BoostedNet model for the identification of H. pylori in gastric histopathology images. Detailed examination of the baseline 6-layer CNN model's performance with different layer configurations demonstrated that the 6-layer model was a judicious choice as the baseline model for classification of H(+ve) and H(-ve) gastric histopathology images. Additionally, the impact of data augmentation on the 6-layer CNN model's performance was explored. The analysis revealed that the CNN model achieved an enhanced classification accuracy of 98.41% with data augmentation, compared to 97.93% without data augmentation, on the DeepHP dataset.
Since 75120 image patches were utilized to train the low-footprint 6-layer CNN model, the impact of data augmentation on the model's performance is minimal. However, it still resulted in a modest improvement of 0.49% in accuracy. Consequently, the 6-layer CNN model was selected as the baseline model, implementing preprocessing steps such as in-place augmentation of the input image data. The detailed results of the analysis with and without data augmentation on H&E stained DeepHP images are tabulated in Table 8.
Table 8: Performance analysis of baseline 6-layer CNN model with and without data augmentation on DeepHP dataset
Method Accuracy (%) Precision (%) Recall/Sensitivity (%) F1 Score (%) Specificity (%) MCC (%)
6 layer CNN Model without data augmentation 97.93 98.12 97.87 97.99 98.16 95.85
6 layer CNN Model with data augmentation 98.41 98.56 98.07 98.31 98.71 96.82
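The exact transforms used for the in-place augmentation are not detailed in the text; a hedged sketch using label-preserving flips and 90-degree rotations, which are common choices for stain-agnostic histopathology patches, might look like:

```python
import numpy as np

def augment(img, rng):
    """Return a randomly flipped/rotated copy of an (H, W, C) image patch.

    The transforms used in the reported pipeline are not specified;
    flips and right-angle rotations are illustrative, label-preserving choices.
    """
    out = img
    if rng.random() < 0.5:
        out = np.fliplr(out)     # random horizontal flip
    if rng.random() < 0.5:
        out = np.flipud(out)     # random vertical flip
    k = int(rng.integers(0, 4))  # rotate by 0, 90, 180 or 270 degrees
    return np.rot90(out, k=k).copy()

rng = np.random.default_rng(0)
patch = np.arange(8 * 8 * 3, dtype=np.float32).reshape(8, 8, 3)
augmented = augment(patch, rng)
```

Applying such a function to each training batch (in place of storing augmented copies) keeps the memory footprint constant while varying the data the model sees each epoch.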
Analysis of the 6-layer CNN model integrated with an XGBoost classifier showed enhanced classification performance. The generalization capability of the BoostedNet model was assessed using the H&E-stained DeepHP and the Giemsa-stained datasets. The results demonstrated that the BoostedNet model is a generic, promising, reliable model with exceptional performance in accurate classification of both H&E and Giemsa-stained gastric histopathological images.
Appraisal of the baseline CNN model against BoostedNet substantiated the latter's enhanced prediction accuracy, although both models demonstrated strong classification performance in the given scenario. The BoostedNet model exhibited higher TNs and TPs and fewer FPs and FNs, leading to an overall superior performance. The higher number of FNs in the case of the CNN model suggests that it misclassified more instances of H. pylori infection as non-infected. This may be due to the complexity and variability of the histopathological images, as well as limitations in the CNN's ability to capture all relevant features and patterns for error-free classification. The BoostedNet model, built by replacing the fully connected layers with an XGBoost classifier, effectively handled the extracted features and enhanced the classification performance. XGBoost has a proven track record of enhanced performance compared to other well-known classifiers, viz., Random Forest, Support Vector Machine, and Logistic Regression. The enhanced performance of the XGBoost classifier reported in the literature motivated this investigation to combine it with the baseline CNN model for H. pylori diagnosis. The enhanced CNN architecture empowered the model to extract discriminative features from the images, improving its ability to differentiate between infected and non-infected cases. The XGBoost algorithm, with its iterative boosting process, further refined the model's predictions by leveraging the strengths of multiple weak classifiers. This helped reduce errors and misclassifications, resulting in a lower number of false negatives.
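The replace-the-dense-head pattern described above can be sketched as follows. This is an illustrative pipeline, not the trained model: random vectors stand in for the CNN-extracted features (the 64-dimensional size is an assumption), and scikit-learn's GradientBoostingClassifier stands in for the XGBoost classifier used in the invention.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

rng = np.random.default_rng(42)

# Stand-in for features produced by the CNN extractor, i.e. what
# feature_extractor.predict(images) would return in the real pipeline.
n_samples, n_features = 400, 64
features = rng.normal(size=(n_samples, n_features))
# Toy labels from a simple rule so the task is learnable: H(+ve) = 1.
labels = (features[:, 0] + features[:, 1] > 0).astype(int)

# Boosted trees replace the fully connected classification head.
clf = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1,
                                 max_depth=3)
clf.fit(features[:300], labels[:300])
acc = clf.score(features[300:], labels[300:])
```

Swapping in `xgboost.XGBClassifier` with the same fit/score interface recovers the BoostedNet configuration; the boosting loop in either library builds each new tree on the residual errors of the ensemble so far.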
Robustness of the BoostedNet model's performance was reaffirmed by evaluation on a set of 15 images sourced from the DeepHP dataset. Notably, these images were not part of the training, validation, or test datasets. This set comprised eight H(+ve) images and seven H(-ve) images. It is worth mentioning that the BoostedNet model classified all 15 of these images correctly.
An extensive analysis was done to assess the effectiveness of the BoostedNet model in real-world scenarios and its capability to address the cross-stain issue. For this, 40 H(+ve) and 15 H(-ve) Giemsa-stained images were obtained from the Amrita Institute of Medical Sciences (AIMS) Research Centre to validate the classification performance of the BoostedNet model. Fig. 22 depicts a few samples in the image set. The BoostedNet model demonstrated exceptional performance, with a Precision of 98%, a Recall of 93% and an F1 Score of 95.4%. These results firmly establish the BoostedNet model as a promising solution for classifying gastric histopathology images, encompassing both H(+ve) and H(-ve) cases.
By leveraging both the image features and the boosted decision trees, the BoostedNet model effectively improved prediction accuracy and reduced the number of false negatives. This combined approach showcases the model's potential in H. pylori image identification, making it a valuable tool in histopathological analysis.
Comparison of BoostedNet Model with State-of-the-art Methods:
There have been limited investigations into H. pylori diagnosis through histopathology image classification using CNNs, with these studies consistently achieving classification accuracy rates exceeding 80%. Table 9 provides a sample of openly accessible research focused on classification using gastric histopathology datasets. The datasets used in the state-of-the-art studies differ from those used in the present invention; as a result, a strictly fair comparison with previous approaches is not feasible.
Table 9: Performance comparison of BoostedNet model with state-of-the-art results
| Paper | Methods | F1-score |
|---|---|---|
| Sharon Zhou et al. [27] | Ensemble model with ResNet and DenseNet | 90% |
| Yi Juin Lin et al. [28] | Two-tiered deep learning model | 93% |
| Yongquan Yang et al. [29] | Weakly supervised multi-task learning framework (WSMLF) | 83.20% |
| Pau Cano et al. [34] | Autoencoders | 91% |
| Proposed BoostedNet Model | 6-layer CNN with XGBoost | 99.09% |
The outcome of this diagnostic study revealed how the integration of the XGBoost classifier with the 6-layer CNN model, forming the BoostedNet model, led to an enhancement in classification performance. Notably, the BoostedNet model achieved classification performance on the DeepHP dataset that was on par with or exceeded state-of-the-art results. The model
outperformed baseline CNN models and demonstrated strong generalization capabilities. The developed computer-aided system holds significant promise for improved accuracy and efficiency in the detection of the H. pylori bacterium, thereby facilitating early diagnosis and timely treatment interventions.
CLAIMS:
1. A decision support model for the detection and identification of Helicobacter pylori bacteria within gastric histopathological images, the model comprising:
- a Convolutional Neural Network (CNN) feature extractor module for extracting feature maps from histopathological images; and
- an Extreme Gradient Boost (XGBoost) classifier module for classifying the histopathological images,
wherein the CNN feature extractor is integrated with the Extreme Gradient Boost classifier for accurate inference of discriminative features from gastric histopathology images to classify the images.
2. The decision support model as claimed in claim 1, wherein the CNN feature extractor module extracts pertinent feature maps from histopathological images to identify the H. pylori bacterium within said histopathological images.
3. The decision support model as claimed in claim 1, wherein the Extreme Gradient Boost classifier module performs binary classification of the extracted feature maps to classify the histopathological images as H. pylori-positive, H(+ve), or H. pylori-negative, H(-ve).
4. The decision support model as claimed in claim 1, wherein a Grad-CAM visualization technique is used to assess the diagnostic interpretability of the said model.
5. The decision support model as claimed in claim 1, wherein the Grad-CAM visualization generates heatmaps to pinpoint the specific regions focused on by the CNN model during its decision-making process.
6. The decision support model as claimed in claim 1, wherein the CNN feature extractor comprises at least three convolutional blocks, each block containing at least two convolutional layers with filters for extracting feature maps.
7. The decision support model as claimed in claim 1, wherein the CNN feature extractor incorporates three max-pooling layers, each max-pooling layer following the final convolutional layer in its convolutional block and effectively reducing the spatial dimensions by a factor of 2.
8. The decision support model as claimed in claim 1, wherein the final extracted feature maps obtained from the third max-pooling layer are fed into the XGBoost classifier, which conducts a binary classification task to determine whether a histopathology image is H(+ve) or H(-ve) on the basis of the presence or absence of the H. pylori bacterium in said image.
9. The decision support model as claimed in claim 1, wherein the feature maps extracted from the final convolutional layer of the 6-layer CNN model are subjected to Grad-CAM visualization.
10. The decision support model as claimed in claim 1, wherein the CNN feature extractor is optimized by evaluating the impact of varying the numbers of its own convolutional and max-pooling layers.