
System/Method To Detect Diabetes Mellitus Using A Neighborhood Component Analysis With A Hybrid Machine Learning Approach

Abstract: Diabetes mellitus is one of the most rapidly increasing chronic diseases of the modern era. Regardless of where you live or your race or ethnicity, you will likely encounter someone with type 1 or type 2 diabetes, the two main categories of this fast-growing disease. Type 1 diabetes is caused by the immune system's erroneous attack on the pancreatic beta cells, which leaves the body with very little insulin. Type 2 diabetes results when insulin production drops below normal levels or insulin resistance develops. Other researchers have examined diabetes in three groups, the third being gestational diabetes, which occurs only when a woman's hormones fluctuate during pregnancy. Polyuria, polydipsia, and polyphagia, as well as weakness, hair loss, delayed healing, impaired vision, itching, irritability, thrush, partial paresis, muscle stiffness, and partial paraesthesia, are among the most prevalent symptoms of diabetes. In Neighbourhood Component Analysis (NCA), the neighbourhood of a sample is the region surrounding it. Neighbours in this context are other samples, and a short distance between two samples indicates a high degree of similarity. NCA is used to reduce the dimensionality of the dataset, and the reduced dataset is then used to train and test support vector machine and random forest classifiers.


Patent Information

Application #
Filing Date
18 July 2025
Publication Number
30/2025
Publication Type
INA
Invention Field
CHEMICAL
Status
Email
Parent Application

Applicants

MLR Institute of Technology
Hyderabad

Inventors

1. Mr. G.Vidyasagar
Department of Information Technology, MLR Institute of Technology, Hyderabad
2. Mrs.Chandana
Department of Information Technology, MLR Institute of Technology, Hyderabad
3. Mr.Anwar Ali
Department of Information Technology, MLR Institute of Technology, Hyderabad
4. Dr.B.Varija
Department of Information Technology, MLR Institute of Technology, Hyderabad

Specification

Description:
Field of Invention
Diabetes is a chronic disease that adversely affects quality of life, has both chronic and acute complications, and requires medical help and care to relieve or prevent its direct and indirect social and economic consequences. It also burdens the health sector because of the high health expenses it creates. Over the last century, advances in health care and technology, longer life expectancy, urbanization, sedentary lifestyles, and changes in food consumption habits have increased the risk of diabetes. According to World Health Organization data, more than 300 million people in the world are expected to be diagnosed with diabetes by 2025. Type 1 diabetes accounts for approximately 5 to 10% of all diabetes cases. A person with diabetes needs a high level of knowledge and positive behaviors in order to manage the disease well from day to day. The attitudes, beliefs, and actions of individuals who take responsibility for their own health therefore form the basis of successful diabetes management.
Background of the Invention
Diabetes is a chronic disease that negatively affects the lives of individuals in several dimensions, and adjusting to it is not simple. The success of drug treatment depends on the patient's own control of the disease. People with diabetes have to change many things in their lives, including their lifestyle, social activities, and diet, and they need to be aware of their condition and take responsibility for it. In type 1 diabetes, the patient tries to keep elevated blood sugar stable, and episodes of low blood sugar make this harder. People with type 2 diabetes also have to change their lifestyles, which is not easy for patients. It is known that when diabetes is managed well and the patient takes responsibility, blood sugar levels drop, and patients report better outcomes in their lives and greater satisfaction with their recovery. However, most people with diabetes avoid taking this responsibility alone. A diagnosis of diabetes may be met with reactions such as “withdrawal” or “not accepting”, which make it difficult for the person to keep the illness under control. At the time of diagnosis the individual may also feel psychologically unwell, which lowers self-confidence. People with diabetes should therefore receive help from health professionals, along with psychological support, information, and education that contribute to their self-management.
Diabetes occurs when the body's blood sugar levels remain higher than they should be. It can lead to the failure of many organs, including the kidneys, liver, and heart, which can cause death. Machine learning algorithms have been proposed as a way to improve the health care of diabetic patients by predicting their glucose levels. SVM, Random Forest, ANN, and other algorithms that find patterns in data can play a big role in early detection, prediction, and preventive action for diabetic patients. Instead of a single classification model, the health care industry often uses several classifiers to sort data into different groups based on different thresholds. In one earlier study, a random effects model was used to estimate the pooled prevalence of DM in Nigeria while accounting for differences between and within studies. That study found that the number of cases of DM has been rising in all affected parts of the country, with the most cases in the south-south geopolitical zone. Diabetes is more likely in Nigeria among people who live in cities, do not exercise, are older, and eat poorly. The authors called for a national policy on the prevention and care of diabetes.
An Artificial Neural Network is a computational model inspired by the central nervous system of animals and capable of machine learning and pattern recognition. Artificial neural networks are usually presented as systems of interconnected neurons that compute outputs from input values and simulate the behavior of biological neural networks. In data mining, k-means clustering is a technique that aims to divide n observations into k groups, with each observation belonging to the group whose mean is nearest. This partitions the data space into Voronoi cells. K-means clustering tends to find clusters of comparable spatial extent, while the expectation-maximization mechanism allows clusters to take different shapes (US20220044055A1).
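The k-means procedure described above can be sketched in a few lines. This is a minimal illustration, not the method of the cited reference: the function name `kmeans`, the helper `sq_dist`, the fixed iteration count, the random initialisation from the data, and the toy `readings` are all assumptions made for the example.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iterations=20, seed=0):
    """Alternate between assigning points to the nearest mean and recomputing the means."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialisation: k distinct data points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each observation joins the group with the nearest mean.
        assignments = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        # Update step: each mean becomes the average of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a group is empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignments

# Hypothetical two-feature observations forming two well-separated groups.
readings = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(readings, k=2)
print(labels)
```

The assignment step is what produces the Voronoi partition: every point is labelled by its nearest centroid.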
A Decision Tree is an approach that uses diagrams to represent the alternatives and outcomes of decisions, such as investment decisions, together with the probabilities that they will occur. It is based on estimates and probabilities associated with the outcomes of competing courses of action. Essentially, decision trees are diagrams that allow us to represent and evaluate problems involving sequential decisions, highlighting the risks and financial results of the various courses of action.
Support Vector Machines (SVMs) are a set of supervised learning methods that analyse data and construct models for classification and regression. A standard SVM is a non-probabilistic linear classifier: it reads a dataset and predicts which of two classes each input belongs to. An SVM represents the examples as points in space, arranged so that the examples of the two categories are separated by as wide a gap as possible. New examples are then mapped into the same space and assigned to a category according to which side of the gap they fall on. SVM finds a dividing boundary, known as a hyperplane, between the data of the two classes. This hyperplane is chosen to maximize the distance to the closest points of each class. The distance between the hyperplane and the closest point of each class is referred to as the margin. SVM gives priority to classifying the points correctly: it first separates the classes correctly and then, subject to this constraint, maximizes the margin.
The assessment of machine learning algorithms consists of running experiments on data classification and/or prediction with machine learning methods such as neural networks, decision tree algorithms, random forest, and k-nearest neighbors.
A hybrid model combining a genetic algorithm with a nearest neighbor classifier has been used to predict the risk of cardiovascular complications in patients with type 2 diabetes. In that work, the age at which the patient was diagnosed, mean arterial pressure, glucose, cholesterol, gender, presence of familial diabetes mellitus, and the use of calcium antagonists were found to form the feature subset that gives the classification algorithm the best accuracy and performance (US20190041843A1).
Summary of the Invention
For every country, diabetes mellitus is a major health problem that has a significant impact on people's lives and the economy. Its prevalence needs to be reduced, and the confusion surrounding diabetes needs to be resolved. To learn is to grow by doing, reflecting on one's actions and their results so that the knowledge can be used more effectively in future instances. In this invention, Neighborhood Component Analysis (NCA) is used for dimension reduction of the training dataset. The dimensionally reduced dataset is then used to train and test a support vector machine and a random forest classifier.

Brief Description of Drawings
Figure 1: Architecture of the hybrid approach for feature selection based DM Identification
Detailed Description of the Invention
Diabetes is a lifelong condition that occurs when the pancreas does not make enough insulin or the body cannot use the insulin it makes efficiently. As a result, the body is unable to use glucose, the sugar that enters the bloodstream from food, and blood sugar levels rise. Failure to take regular steps to maintain normal blood sugar levels can cause many problems. As the world population increases, the number of people with diabetes is increasing dramatically. The main reasons for the increase are unbalanced nutrition, excess weight, aging, ease of transportation, inactivity, the use of computers, the internet, smartphones, and tablets in all areas of life, and the constant stress caused by e-mail traffic in business life. About 450 million people in the world live with diabetes, more than 77 million of them in India, which ranks second in the world after China in diabetic population. Large expenditures are therefore made on the treatment of this widespread disease. The vast majority of diabetes cases fall into two groups, Type 1 Diabetes and Type 2 Diabetes. Type 2 Diabetes is the most common form, representing about 90% of diabetic patients. The pancreas produces some of the hormones that are important for the digestive system. Usually, when the glucose level in the blood increases, a group of cells called beta cells produce insulin. Depending on the body's needs at the time, this glucose is either converted into energy or stored as a reserve in the form of fat. Type 1 Diabetes results from the destruction of pancreatic beta cells by an immune process, so that little or no insulin is released into the body. In most cases, antibodies that attack beta cells can be detected in blood tests.
Type 1 Diabetes has an abrupt onset (a few days to a few months), with symptoms that strongly indicate the presence of the disease. It usually appears in childhood or adolescence, but may also appear in adults. Insulin, medication, meal planning, and physical activity are the essential lines of treatment for Type 1 diabetes. Type 2 diabetes arises when the body does not produce enough insulin to maintain healthy blood glucose levels. Its onset is spread over a variable period of time, and it may take years for symptoms to appear. During this period the patient passes through stages known as impaired fasting glucose and impaired glucose tolerance. These stages result from a combination of resistance to insulin action and beta cell dysfunction.
The architecture of the machine learning based automatic detection of diabetes mellitus is shown in Figure 1. NCA is used for dimension reduction of the training dataset, and the dimensionally reduced dataset is then used to train and test a support vector machine and a random forest classifier. Neighbourhood component analysis was an early method for dimensionality reduction and metric learning, and it has influenced many later dimensionality reduction methods. In NCA, the neighbourhood of a sample is the region surrounding it. Neighbours in this context are other samples, and a short distance between two samples indicates a high degree of similarity. The concept of "neighbours" is central to dimensionality reduction and metric learning, and numerous methods are built on it. KNN classification is the foundation of NCA, so NCA is a supervised algorithm: the data and their category labels must be provided before training. We rely on metric learning and a learnable Mahalanobis distance because computing Euclidean distances directly in the original, very high-dimensional space requires a large amount of calculation. Metric learning learns a distance function from labelled data, and a Mahalanobis distance measures the squared distance between two samples x and y as (x - y)^T M (x - y) for a positive semidefinite matrix M.
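To make the role of the learned metric concrete, the sketch below evaluates the standard NCA objective for a fixed linear map A, under which the distance between two samples is the squared Euclidean distance between their images Ax. It is an illustration under stated assumptions, not the applicants' implementation: the function names, the toy data `X` and `y`, and the two candidate matrices are hypothetical, and the gradient-based optimisation of A that NCA actually performs is omitted.

```python
import math

def transform(A, x):
    """Apply the linear map A (a list of rows) to the feature vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def nca_objective(A, X, y):
    """Expected number of samples whose stochastically chosen neighbour shares their label."""
    Z = [transform(A, x) for x in X]
    total = 0.0
    for i, zi in enumerate(Z):
        # Soft neighbour weights: closer samples (in the space z = A x) get more weight.
        weights = [
            0.0 if j == i else math.exp(-sum((a - b) ** 2 for a, b in zip(zi, zj)))
            for j, zj in enumerate(Z)
        ]
        norm = sum(weights)
        if norm == 0.0:
            continue  # every other sample is too far away to contribute
        # Probability that sample i picks a neighbour carrying its own label.
        total += sum(w for w, label in zip(weights, y) if label == y[i]) / norm
    return total

# Hypothetical data: the first feature separates the classes, the second is noise.
X = [(0.0, 0.0), (0.2, 3.0), (2.0, 0.1), (2.2, 3.1)]
y = [0, 0, 1, 1]

A_identity = [[1.0, 0.0], [0.0, 1.0]]  # plain Euclidean distance in two dimensions
A_first_feature = [[1.0, 0.0]]         # one-dimensional projection onto the first feature

print(nca_objective(A_identity, X, y))
print(nca_objective(A_first_feature, X, y))
```

On this toy data the one-dimensional projection scores higher than the identity map, which is the sense in which NCA can reduce dimensionality while keeping same-class samples close together.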
A support vector machine (SVM) is one of a set of supervised learning methods that analyze data and build models for classification and regression analysis. A standard SVM takes a dataset as input and predicts, for each input, which of two possible classes it belongs to, making the SVM a non-probabilistic linear classifier. An SVM represents the examples as points in space, mapped so that the examples of the separate categories are divided by a gap that is as wide as possible. New examples are then mapped into the same space and assigned to a category according to which side of the gap they fall on. Each data item is plotted as a point in a space of 𝑛 dimensions (where 𝑛 is the number of variables), with the value of each variable being the value of a particular coordinate, and the algorithm determines whether or not a person is affected by diabetes. SVM finds a dividing boundary, known as a hyperplane, between the data of the two classes. This hyperplane is chosen to maximize the distance to the closest points of each class. The distance between the hyperplane and the closest point of each class is referred to as the margin. SVM gives priority to classifying the points correctly: it first separates the classes correctly and then, subject to this constraint, maximizes the margin. Classification is present in many real-world problems. The support vector machine was originally designed to handle binary (+/- 1) tasks. In terms of accuracy, the results obtained with the different approaches to applying it are comparable. Faced with a practical problem, the choice of approach will depend on the constraints at hand. Relevant factors include the accuracy required, the available design time, the processing time, and the nature of the classification problem.
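The relationship between a separating hyperplane and its margin can be illustrated as follows. The sketch only evaluates two hand-picked hyperplanes on hypothetical data with labels in {-1, +1}; it does not train an SVM, and the names `decision_value`, `predict`, and `geometric_margin` are assumptions made for the example.

```python
import math

def decision_value(w, b, x):
    """Value of w . x + b, whose sign tells which side of the hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Assign the class +1 or -1 according to the side of the hyperplane."""
    return 1 if decision_value(w, b, x) >= 0 else -1

def geometric_margin(w, b, X, y):
    """Smallest signed distance from any sample to the hyperplane.

    The value is negative if at least one sample is on the wrong side.
    """
    norm = math.sqrt(sum(wi * wi for wi in w))
    return min(label * decision_value(w, b, x) / norm for x, label in zip(X, y))

# Hypothetical two-feature samples with binary labels.
X = [(1.0, 1.0), (2.0, 2.0), (4.0, 4.0), (5.0, 5.0)]
y = [-1, -1, 1, 1]

# Two hyperplanes that both separate the classes, with different margins.
margin_a = geometric_margin((1.0, 1.0), -6.0, X, y)
margin_b = geometric_margin((1.0, 0.0), -3.5, X, y)
print(margin_a, margin_b)
```

Both hyperplanes classify every sample correctly, but the first leaves a wider gap to the closest samples, so it is the one a maximum-margin criterion would prefer between the two.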
Random forest is an ensemble learning method for classification, regression, and other tasks. It builds many decision trees in the training phase and outputs either the class chosen by most of the individual trees (classification) or the mean of their predictions (regression). To classify a new object from its attributes, each tree in the forest assigns a classification, and we say that the tree “votes” for that class. The forest chooses the classification with the most votes. Each tree is grown as follows. If the number of observations in the training set is N, a sample of N observations is drawn at random, with replacement, and this sample is the training set for growing the tree. If there are M input variables, a number m << M is specified such that at each node m variables are chosen at random out of the M, and the best split on these m variables is used to split the node. The value of m remains constant as the forest grows. Each tree is grown to the largest extent possible, without pruning.
Claims: The scope of the invention is defined by the following claims:

Claim:
1. A System/Method to Detect Diabetes Mellitus using a Neighborhood Component Analysis with a hybrid machine Learning Approach comprising the steps of
a) The data is pre-processed to handle noisy data and missing values on the PIMA Dataset. From the preprocessed data relevant features were extracted and selected to train the model.
b) The model is trained to classify the diabetes mellitus after the feature extraction and selection.
c) Data mining techniques are preferred for performing the data pre-processing that handles noisy data and missing values.
2. According to claim 1, neighbourhood component analysis is used to select the relevant features from the pre-processed data.
3. According to claim 1, support vector machine and random forest techniques are used to classify the diabetes.

Documents

Application Documents

# Name Date
1 202541068703-REQUEST FOR EARLY PUBLICATION(FORM-9) [18-07-2025(online)].pdf 2025-07-18
2 202541068703-FORM-9 [18-07-2025(online)].pdf 2025-07-18
3 202541068703-FORM FOR STARTUP [18-07-2025(online)].pdf 2025-07-18
4 202541068703-FORM FOR SMALL ENTITY(FORM-28) [18-07-2025(online)].pdf 2025-07-18
5 202541068703-FORM 1 [18-07-2025(online)].pdf 2025-07-18
6 202541068703-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [18-07-2025(online)].pdf 2025-07-18
7 202541068703-EVIDENCE FOR REGISTRATION UNDER SSI [18-07-2025(online)].pdf 2025-07-18
8 202541068703-EDUCATIONAL INSTITUTION(S) [18-07-2025(online)].pdf 2025-07-18
9 202541068703-DRAWINGS [18-07-2025(online)].pdf 2025-07-18
10 202541068703-COMPLETE SPECIFICATION [18-07-2025(online)].pdf 2025-07-18