
A Novel Deep Learning Technique To Identify Chronic Kidney Disease (CKD) Through Effective Feature Selection And Classification

Abstract: [036] The present invention relates to a novel deep learning technique to identify chronic kidney disease (CKD) through effective feature selection and classification. The invention introduces a technique to predict chronic kidney disease in five steps. First, in the pre-processing stage, missing values are removed, noise is reduced, and the data are normalized. Then, the EfficientNet V2 approach is employed to extract the features. Once features have been extracted, the Binary Dandelion Algorithm (BDA) is used to select the necessary features and so speed up classification. Then, the HMLSTM approach determines whether the person has CKD. The Lion Swarm Optimization Algorithm (LSOA) is employed to increase prediction accuracy. The chronic kidney disease dataset provides the data for the experiment. In evaluation, the proposed method achieved 99.92% accuracy with less computation time than other existing techniques. The proposed method overcomes the issues reported in the previous literature by using a feature selection technique and an optimization algorithm. In the future, we will develop a hybrid technique with an optimization algorithm to increase the accuracy of disease identification before the condition reveals itself in humans.


Patent Information

Application #
Filing Date
02 September 2023
Publication Number
40/2023
Publication Type
INA
Invention Field
BIOTECHNOLOGY
Status
Email
Parent Application

Applicants

Andhra University
Visakhapatnam, Andhra Pradesh, India. Pin Code: 530003

Inventors

1. Prof. James Stephen Meka
Dr. B. R. Ambedkar Chair Professor, Dean, A.U. TDR-HUB, Andhra University, Visakhapatnam, Andhra Pradesh, India. Pin Code: 530003
2. Mrs.Ramya Asa Latha Busi
Research Scholar, Department of CS & SE, A.U. TDR-HUB, Andhra University, Visakhapatnam, Andhra Pradesh, India. Pin Code: 530003
3. Prof. Prasad Reddy P.V.G.D.
Senior Professor, Department of CS & SE, A.U. College of Engineering (A), Andhra University, Visakhapatnam, Andhra Pradesh, India. Pin Code: 530003

Specification

Description:[001] The invention pertains to the field of the system and method to identify chronic kidney disease, more particularly a novel deep learning technique to identify chronic kidney disease (CKD) through effective feature selection and classification.
BACKGROUND OF THE INVENTION
[002] The following description provides the information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
[003] The increasing chronic kidney disease (CKD) rate is an essential problem for worldwide public health. High mortality rates are associated with the illness, particularly in poorer nations. Since there are no visible early-stage signs, CKD frequently goes undiagnosed. In the meantime, preventing the disease from progressing requires early detection and early clinical care. However, in earlier research, the performance computation time was longer, and the forecast accuracy was lower. To help clinicians discover CKD early, Deep Learning (DL) models can offer an efficient and affordable computer-aided diagnostic. We provide a unique, deep learning-optimized method to predict CKD to get around the issues. It has five steps to perform. Initially, eliminate the missing values, reduce data noise, and normalize in the pre-processing stage. After that, extract the features using the EfficientNet V2 technique. After extracting features, we must select the necessary features to reduce the classification evaluation time utilizing Binary Dandelion Algorithm (BDA). Then, predict whether the person has CKD using the Hierarchical Multi-scale Long Short-Term Memory (HMLSTM) technique. We used Lion Swarm Optimization Algorithm (LSOA) to improve the prediction accuracy. We get information for the experiment from the chronic kidney disease dataset. The experimental findings also demonstrate a favorable impact of feature selection on evaluating the different techniques. The proposed technique has developed a reliable classification system for detecting CKD and might be used to identify diseases in more unbalanced medical datasets.
[004] Chronic kidney disease (CKD), a long-term disorder that worsens with time, impairs the kidneys' function. The kidneys are essential organs that filter waste and extra fluid from the blood to create urine. CKD occurs when the kidneys are damaged and unable to filter and eliminate waste products properly. Hypertension, glomerulonephritis, diabetes, polycystic kidney disease, and other kidney-related disorders are only a few of the conditions that can lead to CKD. Glomerulonephritis is an inflammation of the kidney's filtration units. Age, smoking, obesity, family history of renal disease, and certain drugs are all risk factors for CKD. CKD is divided into stages based on the estimated glomerular filtration rate (eGFR), which gauges how well the kidneys are working. The stages range from Stage 1 (minimally damaged kidneys with normal or high eGFR) to Stage 5 (end-stage renal disease, where patients frequently need dialysis or a kidney transplant because kidney function is substantially compromised). Early-stage CKD may not be accompanied by any apparent symptoms [6-9]. Fatigue, fluid retention (edema), swelling in the legs and ankles, more frequent nighttime urination, difficulty concentrating, anemia, and high blood pressure are among the symptoms that may emerge as the condition worsens. If neglected, CKD can result in significant complications such as heart disease, bone problems, anemia, electrolyte imbalances, and, in severe cases, the need for dialysis or kidney transplantation. CKD is identified through urine and blood tests that evaluate renal function and damage. Essential markers for identification and prognosis include eGFR and albuminuria (protein in the urine). Treatment of the underlying causes, regulation of blood pressure and blood sugar levels, control of fluid and electrolyte balance, and dietary and lifestyle changes are all necessary for CKD management. Medications may be recommended to control side effects and prevent the condition from worsening. 
A balanced diet, moderate exercise, addressing underlying medical disorders, limiting alcohol intake, quitting smoking, and testing kidney function frequently are examples of preventive approaches.
[005] ML techniques in forecasting systems can analyze all the necessary medical information and health factors. This is an efficient way to make early medical diagnoses and to examine the patient's health accurately and effectively. Data mining techniques, such as classification methods, are potent tools frequently employed in numerous studies as an efficient strategy for disease forecasting and anomaly detection. Offline gathering, processing, and analysis of data remain limitations of most CKD prediction systems. Using too many features can lengthen execution and reduce the accuracy of CKD prediction. Moreover, recent research has disregarded various clinical symptoms, such as chest discomfort, nausea, insomnia, and other indications, that might greatly aid the early detection of potential CKD. These problems motivate us to develop a diagnostic forecasting system for CKD that applies feature selection algorithms to all potentially influential factors in order to address the abovementioned constraints. The key contributions of this work are:
• In the pre-processing stage, missing values are eliminated, noise is reduced, and the data are normalized.
• Features are extracted using the EfficientNet V2 technique. After extraction, the necessary features are selected using the Binary Dandelion Algorithm (BDA) to reduce the classification evaluation time.
• The HMLSTM technique then predicts whether or not the person has CKD. The Lion Swarm Optimization Algorithm (LSOA) is used to improve the prediction accuracy.
• The method is evaluated on the chronic kidney disease dataset using accuracy, precision, sensitivity, specificity, and AUC metrics.
[006] Accordingly, on the basis of aforesaid facts, there remains a need in the prior art to provide development of a novel deep learning technique to identify chronic kidney disease (CKD) through effective feature selection and classification. Therefore, it would be useful and desirable to have a system, method, apparatus and interface to meet the above-mentioned needs.
SUMMARY OF THE PRESENT INVENTION
[007] The present invention relates to a novel deep learning technique to identify chronic kidney disease (CKD) through effective feature selection and classification.
[008] In one aspect of the present invention, chronic kidney disease (CKD) is a progressive and long-term condition where the kidneys gradually lose their function over time. Early detection and proactive management are crucial in slowing the progression of CKD and improving quality of life. To predict CKD, we propose a novel deep-learning technique. There are five steps to achieving it. First, in the pre-processing stage, remove missing values and normalize the data while reducing noise. Then, employ the EfficientNet V2 approach to extract the features. The Binary Dandelion Algorithm (BDA) must be used to choose the necessary features once features have been extracted to speed up classification evaluation. Then, using the HMLSTM approach, determine whether the person has CKD. We employed the Lion Swarm Optimization Algorithm (LSOA) to increase forecast accuracy. The dataset on chronic renal illness provides the data we need for the experiment. Figure 1 shows the architecture of the proposed method.
[009] In this respect, before explaining at least one object of the invention in detail, it is to be understood that the invention is not limited in its application to the details of set of rules and to the arrangements of the various models set forth in the following description or illustrated in the drawings. The invention is capable of other objects and of being practiced and carried out in various ways, according to the need of that industry. Also, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting.
[010] These together with other objects of the invention, along with the various features of novelty which characterize the invention, are pointed out with particularity in the disclosure. For a better understanding of the invention, its operating advantages and the specific objects attained by its uses, reference should be made to the accompanying drawings and descriptive matter in which there are illustrated preferred embodiments of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
The invention will be better understood and objects other than those set forth above will become apparent when consideration is given to the following detailed description thereof. Such description makes reference to the annexed drawings wherein:
Figures 1-14 illustrate various representations of a novel deep learning technique to identify chronic kidney disease (CKD) through effective feature selection and classification, in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[011] While the present invention is described herein by way of example using embodiments and illustrative drawings, those skilled in the art will recognize that the invention is not limited to the embodiments of drawing or drawings described and are not intended to represent the scale of the various components. Further, some components that may form a part of the invention may not be illustrated in certain figures, for ease of illustration, and such omissions do not limit the embodiments outlined in any way. It should be understood that the drawings and detailed description thereto are not intended to limit the invention to the particular form disclosed, but on the contrary, the invention is to cover all modifications, equivalents, and alternatives falling within the scope of the present invention as defined by the appended claims. As used throughout this description, the word "may" is used in a permissive sense (i.e. meaning having the potential to), rather than the mandatory sense, (i.e. meaning must). Further, the words "a" or "an" mean "at least one” and the word “plurality” means “one or more” unless otherwise mentioned. Furthermore, the terminology and phraseology used herein is solely used for descriptive purposes and should not be construed as limiting in scope. Language such as "including," "comprising," "having," "containing," or "involving," and variations thereof, is intended to be broad and encompass the subject matter listed thereafter, equivalents, and additional subject matter not recited, and is not intended to exclude other additives, components, integers or steps. Likewise, the term "comprising" is considered synonymous with the terms "including" or "containing" for applicable legal purposes. Any discussion of documents, acts, materials, devices, articles and the like is included in the specification solely for the purpose of providing a context for the present invention. 
It is not suggested or represented that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present invention.
[012] In this disclosure, whenever a composition or an element or a group of elements is preceded with the transitional phrase “comprising”, it is understood that we also contemplate the same composition, element or group of elements with transitional phrases “consisting of”, “consisting”, “selected from the group consisting of”, “including”, or “is” preceding the recitation of the composition, element or group of elements and vice versa.
[013] The present invention is described hereinafter by various embodiments with reference to the accompanying drawings, wherein reference numerals used in the accompanying drawing correspond to the like elements throughout the description. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiment set forth herein. Rather, the embodiment is provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. In the following detailed description, numeric values and ranges are provided for various aspects of the implementations described. These values and ranges are to be treated as examples only and are not intended to limit the scope of the claims. In addition, a number of materials are identified as suitable for various facets of the implementations. These materials are to be treated as exemplary and are not intended to limit the scope of the invention.
[014] The present invention discloses a novel deep learning technique to identify chronic kidney disease (CKD) through effective feature selection and classification.
[015] In this step, the present invention gathers the data from the dataset. Missing values are removed from the input data and noise is reduced. The data are then normalized using the min-max normalization method. Both min-max normalization and z-score-based normalization are implemented at the normalization stage. Normalization is a helpful technique for adjusting the attribute values to fit within a specific range when many records are employed. After normalization, gradient descent converges faster and more precisely [21]. The normalized data for the CKD dataset is shown in Figure 2. Min-max normalization is widely used to scale data into a specified range by applying a linear transformation to the original data. Let $min_A$ and $max_A$ denote an attribute's lowest and highest values. A value $x$ is mapped to a value $x'$ in the range $[new\_min_A, new\_max_A]$ as follows:
$$x' = \frac{x - min_A}{max_A - min_A}\,(new\_max_A - new\_min_A) + new\_min_A$$ (1)
where $min_A$ and $max_A$ are the lowest and highest observed values of the attribute, respectively, and $new\_min_A$ and $new\_max_A$ are the smallest and largest values of the target range.
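The min-max rescaling described above can be sketched in a few lines of Python. This is a minimal illustration rather than the inventors' implementation: the function name `min_max_normalize`, the default target range [0, 1], the handling of constant attributes, and the sample values are all assumptions made for the example.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale a list of numbers into [new_min, new_max]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant attribute has no spread, so map it to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [(v - lo) * scale + new_min for v in values]


# Made-up values for a single numeric attribute (not taken from the dataset).
serum_creatinine = [0.8, 1.2, 3.6, 7.2]
normalized = min_max_normalize(serum_creatinine)
```

The smallest value maps to the lower bound and the largest to the upper bound; everything else keeps its relative position between them.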
Feature Extraction
EfficientNetV2 is used in this system to extract significant attributes from the input data. EfficientNetV2 is a newer family of convolutional neural networks and an upgraded version of EfficientNetV1, emphasizing two specific areas: increasing training speed and increasing parameter efficiency. Compound scaling and training-aware neural architecture search were combined for this purpose. MBConv and Fused-MBConv are the core building blocks of EfficientNetV2.
There are seven blocks in the EfficientNetV2 networks. All blocks in EfficientNetV1 are MBConv. In EfficientNetV2, Fused-MBConv is used in the first three blocks instead of MBConv to increase training speed and decrease model complexity [22]. Three additional blocks are made up of MBConv blocks. Block 7 comprises a 1x1 convolution layer, an average pooling layer, and a fully connected layer. The MBConv structure mainly comprises a 1x1 ordinary convolution (expansion), a kxk depth-wise convolution, an SE module, a second 1x1 ordinary convolution (projection), and a dropout layer. Equations (2) and (3) show the convolutional layer equations:
(2)
(3)
The SE module comprises a global average pooling layer followed by two fully connected layers. To improve feature representations, the SE block uses an attention technique. ReLU discards values less than zero, which could lead to the loss of crucial information. Therefore, the network utilizes h-swish as the activation function instead. The h-swish function is defined as follows:
$$\text{h-swish}(x) = x \cdot \sigma(x)$$ (4)
$$\sigma(x) = \frac{\mathrm{ReLU6}(x + 3)}{6}$$ (5)
Here, σ(x) is the piece-wise linear hard analog of the sigmoid function.
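As a small illustration, the h-swish activation can be written directly in Python, assuming its usual definition h-swish(x) = x · ReLU6(x + 3) / 6. The helper names `relu6`, `hard_sigmoid`, and `h_swish` are chosen for this sketch only.

```python
def relu6(x):
    """ReLU clipped at 6: returns 0 below 0, x in [0, 6], and 6 above 6."""
    return min(max(x, 0.0), 6.0)


def hard_sigmoid(x):
    """Piece-wise linear approximation of the sigmoid, with values in [0, 1]."""
    return relu6(x + 3.0) / 6.0


def h_swish(x):
    """h-swish activation: the input scaled by its hard-sigmoid gate."""
    return x * hard_sigmoid(x)
```

Unlike ReLU, this function lets small negative inputs (between -3 and 0) pass through in attenuated form, which is the property the text appeals to.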
The Fused-MBConv block is another significant component of this network. In it, the expansion conv1x1 and the depth-wise conv3x3 in the main branch of the original MBConv structure are replaced by a single conventional conv3x3. The input data undergoes transformations and convolutions as it moves through the network's layers and blocks. Feature maps, which are representations of the input information at various levels of abstraction, are the end product of these processes.
Feature Selection
The feature selection procedure is required to reduce the evaluation time. The evaluation process can be improved by removing extraneous features and keeping the essential ones. In this case, the Binary Dandelion Algorithm (BDA) was used to choose the required features. Previous research has demonstrated that the Dandelion Algorithm (DA) performs better than some other conventional intelligent optimization algorithms [23]. When optimization precision and speed of convergence are assessed on some benchmark functions, DA performs best. Since DA performs so well, it is a natural candidate for feature selection. However, because the components of the feature selection search space are Boolean values rather than the continuous values DA operates on, DA cannot be applied directly. As a result, BDA is derived as a binary adaptation of the original method.
The search space used for feature selection is binarized, meaning that it only comprises the digits 0 and 1, where 1 denotes that the feature in the corresponding dimension is selected and 0 denotes that it is not. Like DA, BDA is broken down into initialization, normal seeding, mutation seeding, and selection strategy components. To utilize DA for creating feature subsets, one must first discretize the search space because DA operates in a continuous space.
Normal sowing
Among the many approaches for converting continuous optimization to binary optimization, the transfer function is straightforward, effective, and popular. The normal sowing method uses a transfer function to convert the seeding ratio into the probability that the location vector components will take the value 0 or 1.
Position formula
The location vector elements in the binary optimization technique can only have values of 0 or 1. One option is to set each component of the location vector directly to 0 or 1 according to the probability; this method is popular because it is simple and easy to understand. Here a different approach is taken, as indicated in Eq. (6): the probability determines whether each component of the position vector keeps the value of the previous generation or takes its inverse.
(6)
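The keep-or-invert idea behind the position formula can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of Eq. (6): the S-shaped (sigmoid) transfer function, the function names `transfer` and `sow_binary`, and the example seeding ratios are all hypothetical choices.

```python
import math
import random


def transfer(ratio):
    """Assumed S-shaped transfer function: maps a real value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-ratio))


def sow_binary(position, ratios, rng):
    """Return a new 0/1 position vector.

    Each bit is inverted with probability transfer(ratio) and otherwise
    keeps the value it had in the previous generation.
    """
    new_position = []
    for bit, ratio in zip(position, ratios):
        if rng.random() < transfer(ratio):
            new_position.append(1 - bit)  # take the inverse of the old bit
        else:
            new_position.append(bit)      # keep the old bit
    return new_position


rng = random.Random(0)
parent = [1, 0, 1, 1, 0]  # 1 = feature selected, 0 = feature not selected
child = sow_binary(parent, [0.2, -1.5, 3.0, 0.0, -0.4], rng)
```

A large positive ratio makes a flip almost certain, while a large negative ratio leaves the bit almost always unchanged, so the continuous quantity only influences the search through probabilities.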
Mutation seeding
The original mutation seeding formula is no longer applicable because the search space and the location vector are converted into binary form. Instead, the position vector's components are bit-inverted to achieve a similar effect in improving the candidate solutions, as stated in Eq. (7). This approach is fast and effective, and by letting dandelions create complementary solutions, it enables a more thorough search of the solution space in the iterative phase.
(7)
Evaluation function
The classification error rate of the classifier on the test set is frequently the most crucial criterion in feature selection. The number of selected features should also be taken into account. Less redundancy and fewer features generally make the subsequent processing stage easier. As a result, as indicated in Eq. (8), the classification performance of the feature subset and the number of selected features are combined in the evaluation function:
(8)
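One common way to combine the two criteria is a weighted sum of the error rate and the fraction of selected features. The sketch below uses that form purely as an assumption: the exact expression of Eq. (8) is not reproduced in this text, and the name `fitness` and the weight `alpha = 0.99` are illustrative.

```python
def fitness(error_rate, mask, alpha=0.99):
    """Score a feature subset; lower is better.

    error_rate: classification error in [0, 1] obtained with the subset.
    mask: list of 0/1 flags, 1 meaning the feature is selected.
    alpha: assumed weight on the error term; (1 - alpha) weights subset size.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    selected_fraction = sum(mask) / len(mask)
    return alpha * error_rate + (1.0 - alpha) * selected_fraction
```

With a weight close to 1, the error rate dominates, and the subset size mainly breaks ties between subsets that classify equally well.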
Seeding radius
A crucial component of the algorithm is the change in the seeding radius. The size of the seeding radius directly determines where the seeds land. Because each position vector element must be operated on, the seeding radius of every seed is adjusted per dimension. The original seeding radius calculation for the core dandelion would have produced an identical seeding radius for every dimension, which prevents differential tuning of each dimension. Eq. (9) is therefore suggested to accomplish the objective. The assistant dandelion's seeding radius is also adjusted: Eq. (10) calculates the seeding radius of every dimension by comparing the core dandelion with every assistant dandelion.
(9)
Here, Bound is the largest seeding radius, e and r are the growth factor and wilting factor in Eq. (9), an is the judgment factor, and r1, r2 are two random values between [0.5, 0.5].
(10)
where w is the weighting factor in Eq. (10) and r3, r4 are two random values in [0, 1].
Classification/Prediction
Finally, we utilize the Hierarchical Multi-scale Long Short-Term Memory technique to classify each record as CKD or not CKD. A modified LSTM called HMLSTM is used in this work for this purpose. HMLSTM comes in a variety of forms. The HMLSTM has been modified in this study to meet the needs of the CKD prediction model. The general LSTM uses three gates to process its inputs, each passed through the sigmoid activation function [24]. This HMLSTM introduces a parameterized boundary detector that generates a binary output value for each layer to learn the termination conditions and produce the temporal attributes. Also included are dense connections, which enable layer ℓ to take as input the attribute maps from all preceding layers and generate a concatenation of attribute maps. This improves the LSTM model's spatial attribute learning and increases the classifier's effectiveness for CKD detection. The standard LSTM equations are given first.
(11)
(12)
(13)
Here, $x_t$ stands for the LSTM's current input, $h_{t-1}$ for the previous hidden state, and $c_{t-1}$ for the previous cell state. The LSTM receives these three quantities as input. The symbols $i_t$, $f_t$, $o_t$, and $u_t$ denote the input gate, forget gate, output gate, and candidate activation, respectively. The input weight matrix is represented by W, the recurrent weight matrix by U, and the bias by b.
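For orientation, one step of a textbook LSTM cell can be sketched for a single hidden unit. The code follows the standard formulation (sigmoid gates, tanh candidate, gated cell update) and is an assumption about what Eqs. (11) to (13) express; the function name `lstm_step` and the dictionary layout of the parameters are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One step of a single-unit LSTM.

    W, U, b are dicts keyed by 'i' (input gate), 'f' (forget gate),
    'o' (output gate) and 'u' (candidate activation).
    Returns the new hidden state h_t and the new cell state c_t.
    """
    pre = {k: W[k] * x_t + U[k] * h_prev + b[k] for k in ("i", "f", "o", "u")}
    i_t = sigmoid(pre["i"])
    f_t = sigmoid(pre["f"])
    o_t = sigmoid(pre["o"])
    u_t = math.tanh(pre["u"])
    c_t = f_t * c_prev + i_t * u_t   # forget part of the old cell, add the new candidate
    h_t = o_t * math.tanh(c_t)       # expose a gated view of the cell state
    return h_t, c_t


# Example with all parameters set to zero: every gate evaluates to 0.5.
zeros = {k: 0.0 for k in ("i", "f", "o", "u")}
h_t, c_t = lstm_step(x_t=1.0, h_prev=0.0, c_prev=2.0, W=zeros, U=zeros, b=zeros)
```

In a real network the scalars become vectors and matrices, but the gating logic is the same.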
When these standard functions are combined with the boundary detector variable (zt),
(14)
The update process is carried out at each layer in the proposed HMLSTM system, which has L layers (ℓ = 1, 2, . . . , L) at time t.
(15)
The forget gate of HMLSTM, obtained from the two boundary states $z_{t-1}^{\ell}$ and $z_{t}^{\ell-1}$, is represented by the function $f_{\mathrm{HMLSTM}}^{\ell}$. The cell states can be updated as
(16)
Suppose a boundary $z_{t}^{\ell-1}$ is detected at the layer below, but no boundary $z_{t-1}^{\ell}$ was found at the previous time step. In that case, the Update operation is carried out to update the summary representation of layer ℓ. Update operations are executed only sparsely because this circumstance occurs infrequently. The Copy operation is as simple as $(h_{t}^{\ell}, c_{t}^{\ell}) \leftarrow (h_{t-1}^{\ell}, c_{t-1}^{\ell})$. This means that the upper layer remains unmodified until it receives the summarized input from the layer below. Reset deletes the bottom-layer summary if Eject is not run, but otherwise permits the upper layer to absorb it.
The slice function yields the gate values $(f_{t}^{\ell}, i_{t}^{\ell}, o_{t}^{\ell})$, the cell proposal $g_{t}^{\ell}$, and the pre-activation of the boundary detector $\tilde{z}_{t}^{\ell} = \mathrm{hard\,sigm}(U h_{t}^{\ell})$ for each operation.
(17)
Here
(18)
(19)
(20)
It is determined that the binary boundary state ztℓ is
(21)
The deterministic step function can be used to model it.
(22)
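The boundary detector and the choice between operations can be illustrated with a short sketch. It assumes the formulation commonly used for hierarchical multiscale LSTMs: a hard sigmoid followed by a step at 0.5, with the operation selected from the boundary state of the same layer at the previous step and of the layer below at the current step. The names `hard_sigm`, `boundary_state`, and `choose_operation`, the slope parameter, and the label "flush" for the case where the layer itself just closed a segment are assumptions of this example.

```python
def hard_sigm(x, slope=1.0):
    """Piece-wise linear sigmoid: clamps (slope * x + 1) / 2 to [0, 1]."""
    return max(0.0, min(1.0, (slope * x + 1.0) / 2.0))


def boundary_state(pre_activation, threshold=0.5):
    """Deterministic step function: 1 if the detector fires, otherwise 0."""
    return 1 if hard_sigm(pre_activation) > threshold else 0


def choose_operation(z_prev_same_layer, z_now_lower_layer):
    """Select the cell operation for one layer at one time step."""
    if z_prev_same_layer == 1:
        return "flush"   # this layer closed a segment at the previous step
    if z_now_lower_layer == 1:
        return "update"  # the layer below just delivered a summary
    return "copy"        # nothing new: carry the state over unchanged
```

Because the update branch is taken only when the layer below reports a boundary, higher layers change state less often than lower ones.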
Using the HMLSTM technique, we achieved high classification accuracy. To improve the classification accuracy further, we utilize the Lion Swarm Optimization (LSO) algorithm.
Optimization
[016] Lion Swarm Optimization (LSO) is a recent addition to the family of meta-heuristic algorithms. It is based on the natural division of labor between the lion king, lionesses, and lion cubs in a lion group, with the lion king guarding, the lionesses hunting, and the lion cubs following. Due to benefits including its straightforward structure, small number of control parameters, ease of implementation, good robustness, and fast iteration speed, LSO has been extensively employed to address various real-world optimization problems. Several LSO variants have also been developed.
The behavior of real lions
In contrast to other species of the Felidae family, lions typically form two social groups: resident lions, which live in prides that typically number fifteen or more on average, and nomadic lions. Lions are apex predators, that is, predators at the top of a food chain. The lion group has a clear social dominance order.
Lion King: The lion king is the strongest male, chosen according to the principle of survival of the fittest. This dominant male can breed with almost all females in the pride to produce progeny. In addition to protecting and sheltering the lion cubs, the lion king must keep the territory free from invasion from both within and outside the group. The lion king must also be powerful enough to repel threats; otherwise, other male lions from inside or outside the group will take over as king. If other male lions defeat the lion king, he may be killed or driven from the pride and become a wandering lion. The newly established lion king can bring the lionesses into estrus and cohabit with them. He will also try to kill the lion cubs fathered by his predecessor.
Lionesses: The primary duties of the lionesses, commonly referred to as the hunting lions, include raising the lion cubs and hunting cooperatively along the prey's trail. The lionesses follow the prey's track over a vast area, and as they get close, they locate the prey and encircle it.
Lion cubs: Lion cubs mostly conduct their activities around the mother lioness and the lion king. The typical behavior of lion cubs can be broken down into three categories: (1) approaching the lion king for food; (2) following the lioness to learn how to hunt; and (3) leaving the group on reaching maturity to become nomadic lions. Individually, no lion's behavior is complex, but when grouped, the lions can work as a team to find food while adhering to a precise social hierarchy. The three populations in LSO have different location-updating policies.
LSO algorithm
The explanations above allow us to formulate LSO mathematically [25]. First, we state the LSO principle: the algorithm uses a bottom-up design approach inspired by lion swarm hunting behavior and based on a notion of autonomous animates. Once a prey location is discovered that is superior to the one the lion king currently occupies, the lion king takes over that location.
Definition of parameters
(1) The proportion factor of adult lions β
The percentage of adult lions in the lion group significantly affects the optimization outcome. The larger the proportion of adult lions, the smaller the number of lion cubs. The adult lion proportion factor β is a positive random number in [0, 1]. Here, we set β to be smaller than 0.5 to ensure the quick convergence of the LSO.
(2) The moving range disturbance factor of lionesses αf
An optimizer's global exploration capability is crucial for problems that are challenging to optimize. The local exploration capability needs to be strengthened after the approximate location of the optimal solution has been determined. The lionesses' range of activity therefore shrinks gradually as the updating process continues. The disturbance factor αf is defined as follows:
(23)
where T is the maximum number of iterations, t is the current iteration, and Step1 is the step size of the lionesses' activity range, which is determined by
(24)
where max x and min x represent the maximum and minimum values of each dimension, respectively. The lionesses' step is controlled by a1 and a random number in the [0, 1] range.
Random initialization
Every lion's location in the LSO algorithm represents a candidate solution to the problem under consideration, and the quality of the prey reflects the quality (fitness) of the corresponding solution.
(25)
Fitness evaluation
The fitness value of every lion's location is determined by feeding the values of the decision variables into a user-defined fitness function, and the resulting fitness values can be written as follows:
(26)
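Random initialization, fitness evaluation, and the division of the group into roles can be sketched together. This is only an illustration under assumptions: lions are drawn uniformly inside per-dimension bounds, the problem is treated as minimization, the number of adults is taken as the integer part of β times the group size (with at least two adults), and the names `init_pride` and `sphere` are hypothetical.

```python
import random


def init_pride(n_lions, bounds, fitness, beta=0.2, rng=None):
    """Create a random lion group and assign roles by fitness (lower is better).

    bounds: list of (low, high) pairs, one per dimension.
    beta: proportion of adult lions, kept below 0.5 as in the text.
    """
    rng = rng or random.Random()
    lions = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_lions)]
    lions.sort(key=fitness)                 # best (lowest fitness) first
    n_adults = max(2, int(n_lions * beta))  # assumed rounding rule
    return {
        "king": lions[0],                   # best adult becomes the lion king
        "lionesses": lions[1:n_adults],     # remaining adults
        "cubs": lions[n_adults:],           # everyone else
    }


def sphere(x):
    """Toy fitness function used only for this example."""
    return sum(v * v for v in x)


pride = init_pride(10, [(-5.0, 5.0)] * 3, sphere, beta=0.4, rng=random.Random(1))
```

In the classification setting, the fitness function would instead score a candidate set of classifier parameters, for example by validation error.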
Hunting Behaviors of Lions
[017] Every lion in LSO updates its location based on its own experience and its neighbors' experiences. As was already noted, different lions move in different ways during the hunting process. The proposed hunting mechanism can be summarized as follows.
[018] The lion king: The lion king may move to the area with the finest food, i.e., the location with the lowest fitness value, to ensure that he has priority over the other lions hunting prey. In this instance, the new position of the lion king is obtained as follows:
(27)
Lionesses: The typical hunting strategy involves spotting the target, surrounding it, and then charging at it. When lionesses hunt, they frequently work together. Note that the partner chosen from the lioness group to cooperate with a lioness is someone other than herself. The new position of the lionesses in this situation is as follows:
(28)
Lion cubs: Three scenarios may arise while lion cubs are actively hunting, as was stated above. In this situation, the lion cubs' new location can be obtained as follows:
(29)
Results and Discussion
[019] The system's development findings are presented in this section. The settings for the hyper-parameters are shown in Table 1.
Table 1. Settings for hyper-parameters.
Hyper-Parameter              Setting
Batch size                   15
Epochs                       850
Activation function          ReLU
Optimizer                    Adam
Dropout rate                 0.5 to 0.1
Loss                         Binary_Crossentropy
Output-layer activation      Sigmoid

Experimental Setup
[020] Different environments have been used in the system's development. The environment configuration for the developed system is displayed in Table 2.
Table 2. Environment setup of the proposed system.
Resource     Details
CPU          Core i5 Gen6
RAM          8 GB
GPU          4 GB
Software     Python

Dataset Description
[021] The CKD dataset contains records for 400 patients. In addition to the diagnostic class, which takes the two values "CKD" and "notckd", the dataset has 24 features, separated into 11 numerical and 13 categorical features. The features include specific gravity, albumin, blood pressure, red blood cells, pus cell, bacteria, sugar, serum creatinine, blood urea, sodium, hemoglobin, potassium, diabetes mellitus, red blood cell count, packed cell volume, white blood cell count, appetite, hypertension, pedal edema, coronary artery disease, and anemia.
Table 3. Splitting Dataset.
Dataset | Numbers
Training | 320 patients
Testing and validation | 80 patients

The dataset was split into 20% for testing and validation and 80% for training. The split data are displayed in Table 3.
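The split can be sketched as a shuffled index partition. This is a minimal example under stated assumptions: the function name `split_indices`, the seed, and the use of a plain random shuffle (rather than, say, a class-stratified split) are choices made here, since the source does not say how the patients were assigned.

```python
import random


def split_indices(n_samples, test_fraction=0.2, seed=42):
    """Shuffle sample indices and split them into training and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


# 400 patients with a 20% hold-out give 320 training and 80 test patients.
train_idx, test_idx = split_indices(400)
```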
Evaluation Metrics
Performance indicators were employed to assess each classifier's capabilities. One of these is the confusion matrix, from which the accuracy, sensitivity, specificity, and AUC are derived by counting the correctly classified and the wrongly classified samples, as illustrated in the following equations:
Accuracy
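Accuracy measures the proportion of correct predictions among all predictions. With TP, TN, FP, and FN denoting the numbers of true positives, true negatives, false positives, and false negatives, it is defined as:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{30}$$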
Sensitivity
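Sensitivity shows the capacity to identify a patient who has chronic kidney disease and is computed as stated in equation (31), where TP is the number of true positives and FN the number of false negatives.
$$\text{Sensitivity} = \frac{TP}{TP + FN} \tag{31}$$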
Specificity
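Specificity is calculated by dividing the number of true negatives (TN) by the total number of actual negatives, i.e., true negatives plus false positives (FP), as shown in equation (32). A value of 1.0 designates the best specificity, while 0.0 designates the poorest.
$$\text{Specificity} = \frac{TN}{TN + FP} \tag{32}$$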
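The three confusion-matrix metrics can be computed directly from the four counts. The sketch below uses the standard definitions; the function name `classification_metrics` and the example counts are hypothetical and are not results from the source. It assumes at least one actual positive and one actual negative, so no denominator is zero.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, sensitivity, and specificity from confusion-matrix counts."""
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,   # correct predictions / all predictions
        "sensitivity": tp / (tp + fn),   # detected positives / actual positives
        "specificity": tn / (tn + fp),   # detected negatives / actual negatives
    }


# Hypothetical counts for an 80-patient test set (illustration only).
metrics = classification_metrics(tp=48, tn=30, fp=1, fn=1)
```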
The area under the ROC curve (AUC)
AUC is the area under the ROC curve. It reduces the ROC curve to a single scalar value, serves as a summary measure of classifier performance, and is computed as shown in equation (33).
(33)
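Because equation (33) is not reproduced in this text, the sketch below uses one standard way to compute AUC rather than the applicants' formula: the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the example scores are assumptions for illustration.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Three of the four positive/negative pairs are ordered correctly here.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise loop is quadratic in the number of samples, which is acceptable for an 80-patient test set; a rank-based formula would be preferable for large datasets.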
Evaluation of Proposed Implementation
[022] In this paper, we propose a new technique to predict CKD. The results are presented through a GUI that reports whether the person has CKD. Figure 3 shows the front page for entering patient information. The GUI is built with Python Tkinter.
Figure 4 shows that if all the patient information is normal, the GUI displays "Great! You don't have a chronic kidney disease".
Figure 5 shows that if the patient information is abnormal, the GUI displays "Oops! You have a chronic kidney disease".
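The result screen can be sketched as a message chosen from the classifier's decision and shown in a Tkinter dialog. The function names `result_message` and `show_result` and the dialog title are hypothetical; only the two message strings come from the text, and the actual GUI layout in Figures 3 to 5 is not reproduced here.

```python
def result_message(has_ckd):
    """Return the message shown to the user for a given prediction."""
    if has_ckd:
        return "Oops! You have a chronic kidney disease"
    return "Great! You don't have a chronic kidney disease"


def show_result(has_ckd):
    """Display the prediction in a Tkinter message box (requires a display)."""
    import tkinter as tk
    from tkinter import messagebox

    root = tk.Tk()
    root.withdraw()  # hide the empty main window, show only the dialog
    messagebox.showinfo("CKD prediction", result_message(has_ckd))
    root.destroy()
```

Keeping the message logic separate from the dialog code lets the decision text be checked without opening a window.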
Performance of classification techniques without feature selection
[023] The experimental findings from classifier training using the entire feature set are shown in this subsection. Table 4 presents these findings. Additionally, Figure 6 displays each classifier's ROC curves and AUC values.
Figure 7. Accuracy, Sensitivity, and Specificity for each classifier without feature selection.
[024] As shown in Figure 7, the proposed Optimized EfficientNetV2 outperformed the other classifiers without feature selection.
Performance of classification techniques with feature selection
[025] The attributes of chronic kidney disease were ranked using information gain as the basis for feature selection. This stage aims to choose the features that provide the most information about the target variable. The proposed Optimized EfficientNetV2 and the other classification techniques are trained using the reduced attribute set to show the value of attribute selection. The experimental findings are displayed in Table 5. Figure 8 displays the ROC curves and the corresponding AUC values. The evaluation outcomes in Table 5 and Figure 8 demonstrate that the proposed Optimized EfficientNetV2 outperformed the Decision Tree, Logistic Regression, Random Forest, XGBoost, AdaBoost, and SVM. This improvement demonstrates the success of the feature selection stage. As a result, combining feature selection with Optimized EfficientNetV2 is a powerful strategy for predicting CKD.
Figure 9. Accuracy, Sensitivity, and Specificity for each classifier after feature selection.
As shown in Figure 9, the proposed Optimized EfficientNetV2 outperformed the other classifiers with feature selection.
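Ranking by information gain can be sketched as follows: the gain of a feature is the entropy of the class label minus the expected entropy of the label after grouping samples by that feature's value. The function names and the four-patient toy data are hypothetical, and the sketch handles categorical features only; numerical features would first need to be discretized, which the source does not describe.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a categorical feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data (illustration only): hypertension separates the classes, appetite does not.
labels = ["ckd", "ckd", "notckd", "notckd"]
features = {
    "hypertension": ["yes", "yes", "no", "no"],
    "appetite": ["good", "poor", "good", "poor"],
}
ranking = sorted(features, key=lambda f: information_gain(features[f], labels), reverse=True)
```

Features at the top of `ranking` carry the most information about the diagnostic class and would be kept in the reduced attribute set.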
Correlation Matrix
[026] A correlation matrix is often used in statistics and data analysis to understand relationships between multiple variables. Each cell in the matrix represents the correlation coefficient between two variables. If the data contains n variables, the correlation matrix will be an n x n matrix. Correlation matrices are often visualized using heatmaps, which use color gradients to represent the strength of correlations. This visualization can help identify patterns and relationships within the data.
Figure 10 shows the correlation matrix between different features.
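A correlation matrix of this kind can be computed with the Pearson coefficient for every pair of numerical features. The sketch below is a plain-Python illustration; the function names and the four-sample values are hypothetical and are not taken from the CKD dataset. It assumes no column is constant, since a zero standard deviation would make the coefficient undefined.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def correlation_matrix(columns):
    """n x n matrix of pairwise correlations, as a nested dict keyed by feature name."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical values for three numerical features (illustration only).
columns = {
    "serum_creatinine": [1.0, 2.0, 3.0, 4.0],
    "blood_urea": [20.0, 40.0, 60.0, 80.0],
    "hemoglobin": [15.0, 13.0, 11.0, 9.0],
}
corr = correlation_matrix(columns)
```

The matrix is symmetric with ones on the diagonal, so a heatmap of it only needs to show one triangle.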
Comparison
[027] Here, we proposed a technique for predicting CKD. Table 6 shows the overall comparison of the proposed and existing methods. The proposed method achieved 99.92% accuracy, compared with 97% for DT, 97.8% for LR+NN, 98.5% for DBN, and 97.5% for HMANN.
Figure 11 shows the overall comparison of the proposed with existing methods.
[028] A confusion matrix is a tool used in machine learning and statistics to visualize and assess the performance of a classification model. It summarizes the predictions made by a model on a classification problem by comparing them with the actual (ground-truth) labels. The confusion matrix is useful when evaluating algorithms for binary or multi-class classification. Figure 12 shows the confusion matrix with and without optimization techniques.
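For the binary CKD task, building the confusion matrix amounts to counting the four combinations of actual and predicted class. The sketch below is a minimal illustration; the function name `confusion_matrix`, the choice of "ckd" as the positive class, and the six example labels are assumptions, not data from the source.

```python
def confusion_matrix(y_true, y_pred, positive="ckd"):
    """Count true/false positives and negatives for a binary classifier."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for actual, predicted in zip(y_true, y_pred):
        if actual == positive and predicted == positive:
            counts["TP"] += 1   # sick, predicted sick
        elif actual != positive and predicted != positive:
            counts["TN"] += 1   # healthy, predicted healthy
        elif predicted == positive:
            counts["FP"] += 1   # healthy, predicted sick
        else:
            counts["FN"] += 1   # sick, predicted healthy
    return counts


# Hypothetical labels for six patients (illustration only).
y_true = ["ckd", "ckd", "ckd", "notckd", "notckd", "notckd"]
y_pred = ["ckd", "ckd", "notckd", "notckd", "notckd", "ckd"]
cm = confusion_matrix(y_true, y_pred)
```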
Evaluation of training and testing
[029] Figure 13 plots the loss value and the classification accuracy as the number of iteration steps increases. The graph shows that the approach described in this study converges well.
Computation Time
[030] Performance indicators for the chosen classifier include its error and computation time, along with precision, recall, and classification accuracy. Computation time is measured for each classifier. The proposed Optimized EfficientNet V2 technique requires low computation time. Figure 14 displays the computation time for the proposed methodology and the existing methodologies.
[031] The computation time using the proposed framework and the most recent methods is shown in Figure 14. Our proposed method outperformed earlier approaches regarding prediction accuracy and computing time.
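Computation time of this kind is typically measured by wrapping a training or prediction call with a high-resolution timer. The sketch below shows one way to do so; the helper name `timed` is hypothetical, and the source does not state how its timings were obtained or whether they cover training, inference, or both.

```python
import time


def timed(fn, *args, **kwargs):
    """Run fn and return its result together with the elapsed wall-clock seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed


# Example: time a stand-in workload in place of a classifier's predict call.
result, seconds = timed(sum, range(100_000))
```

For a fair comparison between classifiers, each one should be timed on the same data and hardware, ideally averaged over several runs.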
Conclusion
[032] In this research, we propose a technique to predict chronic kidney disease. It is achieved in five steps. First, in the pre-processing stage, missing values are removed and the data are normalized while reducing noise. Then, the EfficientNet V2 approach is employed to extract the features. Once features have been extracted, the Binary Dandelion Algorithm (BDA) is used to choose the necessary features and speed up classification evaluation. Then, the HMLSTM approach determines whether the person has CKD. Finally, the Lion Swarm Optimization Algorithm (LSOA) is employed to increase prediction accuracy. The chronic kidney disease dataset provides the data for the experiment. In the evaluation, the proposed method achieved 99.92% accuracy with less computation time than other existing techniques. The proposed method overcomes issues reported in the previous literature by using a feature selection technique and an optimization algorithm. In the future, we will develop a hybrid technique with an optimization algorithm to increase the accuracy of disease identification before the condition reveals itself in humans.
[033] It is to be understood that the above description is intended to be illustrative, and not restrictive. For example, the above-discussed embodiments may be used in combination with each other. Many other embodiments will be apparent to those of skill in the art upon reviewing the above description.
[034] The benefits and advantages which may be provided by the present invention have been described above with regard to specific embodiments. These benefits and advantages, and any elements or limitations that may cause them to occur or to become more pronounced are not to be construed as critical, required, or essential features of any or all of the embodiments.
[035] While the present invention has been described with reference to particular embodiments, it should be understood that the embodiments are illustrative and that the scope of the invention is not limited to these embodiments. Many variations, modifications, additions and improvements to the embodiments described above are possible. It is contemplated that these variations, modifications, additions and improvements fall within the scope of the invention.
Claims:
1. A method for predicting Chronic Kidney Disease (CKD) using a deep-learning technique, the method comprising:
pre-processing data to remove missing values, normalize said data, and reduce noise;
extracting features from the normalized data using the EfficientNet V2 approach;
selecting essential features from the extracted features using the Binary Dandelion Algorithm (BDA);
determining the presence of CKD in a subject using the HMLSTM approach based on the selected features; and
optimizing prediction accuracy using the Lion Swarm Optimization Algorithm (LSOA).
2. The method as claimed in claim 1, wherein said pre-processing further comprises reducing noise in the data.
3. The method as claimed in claim 1, wherein the features extracted using the EfficientNet V2 approach provide detailed characteristics of potential CKD indicators.
4. The method as claimed in claim 1, wherein the Binary Dandelion Algorithm (BDA) enhances the speed of classification evaluation by selecting only necessary features.
5. The method as claimed in claim 1, wherein the HMLSTM approach is employed to distinguish between subjects with CKD and those without CKD based on the selected features.
6. The method as claimed in claim 1, wherein the Lion Swarm Optimization Algorithm (LSOA) refines the deep-learning model parameters to improve prediction accuracy.
7. A system for predicting Chronic Kidney Disease (CKD), comprising:
a data processor configured to pre-process data by removing missing values, normalizing said data, and reducing noise;
a feature extractor configured to use the EfficientNet V2 approach to extract features from the pre-processed data;
a feature selector employing the Binary Dandelion Algorithm (BDA) to choose necessary features from the extracted features;
a classifier utilizing the HMLSTM approach to determine the presence of CKD based on the selected features; and
an optimizer utilizing the Lion Swarm Optimization Algorithm (LSOA) to enhance the prediction accuracy of the classifier.
8. The system as claimed in claim 7, wherein the data processor further reduces noise in the data for more accurate feature extraction.
9. The system as claimed in claim 7, wherein the feature extractor, using the EfficientNet V2 approach, details potential CKD indicators for improved classification.
10. The system as claimed in claim 7, wherein the optimizer refines the parameters of the deep-learning model for enhanced prediction reliability.

Documents

Application Documents

# Name Date
1 202341058998-STATEMENT OF UNDERTAKING (FORM 3) [02-09-2023(online)].pdf 2023-09-02
2 202341058998-REQUEST FOR EARLY PUBLICATION(FORM-9) [02-09-2023(online)].pdf 2023-09-02
3 202341058998-FORM-9 [02-09-2023(online)].pdf 2023-09-02
4 202341058998-FORM 1 [02-09-2023(online)].pdf 2023-09-02
5 202341058998-DRAWINGS [02-09-2023(online)].pdf 2023-09-02
6 202341058998-DECLARATION OF INVENTORSHIP (FORM 5) [02-09-2023(online)].pdf 2023-09-02
7 202341058998-COMPLETE SPECIFICATION [02-09-2023(online)].pdf 2023-09-02