A System /Method For Improving The Classification Performance In Plant Diseases Using Hybrid Feature Selection Techniques

Abstract: Plant diseases vary with environmental conditions, soil type, groundwater type, and similar factors, which makes diagnosis challenging even for experienced plant pathologists. Machine learning methods have therefore become a crucial tool for classifying such a wide variety of diseases more accurately. The dataset needed to train a machine learning model is assembled from leaf disease images collected from online sources and from farmers' land. Low-contrast and blurred images may lead to over- or under-segmentation. The effect of sunlight, dust particles on the leaves, and the camera's position are also major challenges in feature extraction. To remove the background information and highlight the diseased portion, a fusion of the random path method and k-means clustering (RPK-means) is used for segmentation. RPK-means takes input pixels from the k-means clusters and supplies their coordinates to the random path method for segmentation. The proposed segmentation algorithm addresses under- and over-segmentation by preserving more accurate color and texture information, which improves classification performance. A hybrid feature selection strategy is applied, in which the performance of traditional feature extraction techniques is compared with that of deep features. Finally, a transfer learning approach is adopted to classify plant diseases: different pre-trained models are used to extract features, a new classifier is trained on top of these features, and performance is measured using a 15-layer deep learning model. 4 Claims & 2 Figures

Patent Information

Application #
Filing Date
29 June 2024
Publication Number
27/2024
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
Parent Application

Applicants

MLR Institute of Technology
Laxman Reddy Avenue, Dundigal-500043

Inventors

1. Mr. D. Sandeep
Department of Information Technology, MLR Institute of Technology, Laxman Reddy Avenue, Dundigal-500043
2. Mr. V. Nitin
Department of Information Technology, MLR Institute of Technology, Laxman Reddy Avenue, Dundigal-500043
3. Mr. B. VeeraSekharReddy
Department of Information Technology, MLR Institute of Technology, Laxman Reddy Avenue, Dundigal-500043
4. Mr. G. Satyanarayana
Department of Information Technology, MLR Institute of Technology, Laxman Reddy Avenue, Dundigal-500043

Specification

Description: Field of Invention
The incidence of leaf diseases in plants increases over time and is a cause of production loss. Disease detection in crops is usually performed by consulting field experts and by laboratory testing. However, these techniques are less reliable when the disease is at an early stage, or when the spots are unclear in color and irregular in shape. In addition, traditional disease detection methods in crops are often labour-intensive, time-consuming, and subjective. Because of these challenges, developing an image-based, computer-aided leaf disease detection system has become very important. Such a system can process a large number of input images quickly and reduces the experts' workload by assisting them in decision-making. The final decision also becomes more objective than with expert-driven techniques. The study comprises an accurate, lightweight leaf disease detection model for plants that can help farmers and experts identify a disease.
Background of the Invention
Image segmentation is a crucial step for extracting the region of interest (ROI) from an image, and it has been widely used in agriculture to detect diseased portions of crops and vegetables. The boundary of a leaf is irregular because of surface roughness, the presence of dust particles, the impact of sunlight, the existence of shadow, and the presence of bacterial and fungal infections, which results in erroneous segmentation when conventional image processing techniques are used.
Various segmentation techniques have been used for plant disease detection, but they often struggle with issues such as over-segmentation, under-segmentation, and low classification accuracy. Color-based segmentation struggles with overlapping leaves and similar backgrounds. Clustering methods like k-means and mean shift have limitations in accurately detecting lesion boundaries, while random walk segmentation requires significant human intervention. Fusion of multiple techniques, such as color space transformation with clustering or superpixel segmentation with k-means, has been proposed to enhance accuracy (CN104839010B). However, the problem of under- and over-segmentation persists. Future research could focus on developing fusion approaches that address these challenges and improve the accuracy of segmentation results.
Several studies have explored various techniques for plant disease detection, including color, texture, shape, and hybrid feature selection methods. Although these approaches have shown encouraging results, several concerns remain. One obstacle is the substantial computational expense associated with deep features and the large feature vectors generated when several deep features are merged. To address this issue, strategies such as feature ranking and relevant feature selection have been suggested. Another concern is the substantial quantity of annotated data required to train deep learning models. Future research should prioritize the optimization of feature selection strategies, the resolution of computational problems, and the exploration of methods to use limited labeled data efficiently for accurate plant disease diagnosis (JP6699030B2).
Deep learning models have also shown encouraging results in plant disease identification. Beyond the need for labeled data noted above, the choice of pre-trained model and fine-tuning technique can affect the accuracy and performance of the disease identification system. Transfer learning has been widely used to overcome the data limitation problem and achieve high accuracy in disease detection. Various deep learning architectures, such as ResNet50, VGG16, and DenseNet169, have been applied and compared, with ResNet50 demonstrating superior performance in several studies. Future research could address data limitations, optimize model selection, create lightweight models, and explore advanced techniques for plant disease identification using deep learning (CN105210748A).
Hybrid feature selection techniques have been used in several studies to detect plant diseases. These techniques combine multiple types of features, such as color-based, texture-based, and shape-based features, to improve the accuracy of disease detection. A study by Hasan et al. used a hybrid feature selection approach that combined color, texture, and shape features for the detection of apple diseases; when hybrid features were obtained by combining color, texture, and morphology features, researchers recorded a significant improvement in classification accuracy. To select the best features, others applied Principal Component Analysis (PCA) to these features to improve performance. While deep features are more accurate, dealing with thousands of features increases the computation cost, and combining two or more deep features produces a large feature vector.
Summary of the Invention
Low-contrast and blurred images may lead to over- or under-segmentation. The effect of sunlight, dust particles on the leaves, and the camera's position are also major challenges in feature extraction. The aim is to develop a new hybrid segmentation technique that removes the background information and extracts the ROI from the input image. The main objective is a segmentation technique that solves the problem of over- and under-segmentation by combining k-means clustering with the Random Path (RP) method. The k-means and RP methods are hybridized to produce the RPK-means algorithm: k-means clustering is adapted to obtain the cluster coordinates, and the same coordinates are taken as input to the RP method for further segmentation. After background subtraction and Region of Interest (ROI) extraction with the hybrid segmentation technique, feature extraction and selection strategies are applied to improve the performance of the classifiers.
Brief Description of Drawings
Figure 1: Procedure of RPK-means algorithm
Figure 2: Representation of Pixel Prediction Using K-Means and RPK-Means Techniques
(a) Input image (b) Segmentation using K-Means (c) Segmentation using RPK-means
Detailed Description of the Invention
The incidence of leaf diseases is increasing due to abrupt weather fluctuations and declining soil quality. Clear images of leaves are needed to diagnose these illnesses using computer-based algorithms; however, when a farmer takes a picture of diseased leaves in the field, other leaves may be visible in the background. An automated system is therefore required to segment the ROI precisely and separate it from these background leaves, which could improve feature extraction. To analyze this, different datasets are considered. The dataset is pre-processed using filtering methods to reduce the impact of undesired noise, contrast stretching to improve low-contrast images, and edge sharpening algorithms to improve image sharpness. Contrast stretching is a vital technique in plant image analysis for enhancing the visual appearance of images and improving their interpretability. In plant imaging, it is employed to overcome low contrast, poor visibility of important features, and limited dynamic range. Contrast stretching enhances the differentiation of plant components, such as leaves, stems, and fruits, by expanding the intensity values of the image to cover the entire possible range. This improvement facilitates more accurate analysis and categorization. Contrast stretching is carried out by mapping the initial intensity values of an image to a new range, usually 0 to 255. This increases the range of intensity values and improves the contrast by magnifying the differences between adjacent pixels. Common strategies for contrast stretching are linear stretching, histogram equalization, and adaptive contrast stretching. The minimum and maximum intensity values of the stretched image are denoted min and max, respectively. Here, min is set to 0 and max to 255, which corresponds to the full range of pixel values for an 8-bit image.
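The linear variant of the contrast stretching described above can be sketched as follows. This is a minimal illustration, assuming a grayscale image stored as a list of rows; the function name `stretch_contrast` and the sample pixel values are hypothetical and are not taken from the specification.

```python
def stretch_contrast(image, new_min=0, new_max=255):
    """Linearly map the pixel intensities of a grayscale image to [new_min, new_max]."""
    flat = [p for row in image for p in row]
    old_min, old_max = min(flat), max(flat)
    if old_max == old_min:
        # A perfectly flat image has no contrast to stretch; return a copy.
        return [list(row) for row in image]
    scale = (new_max - new_min) / (old_max - old_min)
    return [[round((p - old_min) * scale + new_min) for p in row] for row in image]


# Hypothetical low-contrast patch: intensities only span 100..150.
low_contrast = [[100, 110], [120, 150]]
stretched = stretch_contrast(low_contrast)
print(stretched)
```

After stretching, the darkest input pixel maps to 0 and the brightest to 255, so the patch uses the full 8-bit range.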
Segmenting images to extract the correct ROI is a key factor in substantially reducing the amount of data to be analyzed and in improving feature extraction. It is therefore preferable to extract only the ROI when analyzing the problem. Several investigators have employed fuzzy logic and k-means clustering-based approaches to separate the ROI from plant images. With conventional image processing techniques, the boundary of leaves cannot be accurately segmented because of the irregular texture, dust particles on the leaf, the effect of sunlight, shadows, and noise. In real-time images there may be many leaves in the background, which need to be separated for better segmentation of the lesion portion; this remains a major challenge in this area. Certain authors have also employed k-means clustering and soft computing techniques to determine the segmentation threshold value. These methods mostly suffered from under- and over-segmentation. Therefore, a few researchers suggested the fusion of two or more approaches to improve segmentation outcomes. Color space change and clustering were combined on the leaves of vegetables and crops. Superpixel segmentation and k-means clustering were used in another study, and improved accuracy was obtained; however, the study was confined to only two kinds of disease.
A hybrid segmentation approach is formed by combining k-means clustering and the random path algorithm. K-means clustering provides a starting point by generating initial clusters, while the random path algorithm refines these clusters by considering pixel similarities and spatial relationships. This hybrid approach overcomes the limitations of each algorithm, leading to improved segmentation results. Because the focus is on images with overlapping leaves and similar background information, only three datasets with real-time unprocessed images are chosen for the experiments: the Mendeley Dataset, the CG Dataset, and the self-created dataset. The Mendeley dataset contains 5932 images covering four kinds of leaf disease: 1584 images of Bacterial Leaf Blight (BLB), 1440 images of Blast (RB), 1600 images of Brown Spot (BS), and 1308 images of Tungro. The CG Dataset consists of four classes: 240 images of Bacterial Leaf Blight, 100 images of Sheath Blight, 88 images of Blast, and 191 images of Healthy Leaves. In the self-created dataset, we collected images of 6 classes: 149 images of Bacterial Leaf Blight (BLB), 569 images of Brown Spot, 246 healthy images, 560 images of leaf scale, 111 images of blast, and 141 images of sheath blight.
First step: Unequal contrast, color illumination, noise, and unsharp edge artifacts make it difficult to segment the diseased portion and remove background information. These issues are resolved by applying the CLAHEEP algorithm. The output image of the CLAHEEP method is denoted by I and is treated as the input for the next phase.
Second step: The pre-processed image (I) is taken as input for k-means clustering, which generates clusters of the segmented image and saves the center of each cluster. K-means clustering is a well-known algorithm, widely used in research for image segmentation, in which clusters of foreground and background pixels are formed based on the similarity of the pixels' intensities and colors. Here, the image is divided into 2 clusters.
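The two-cluster split in this step can be sketched as below. This is only an illustrative sketch, assuming RGB pixels and squared Euclidean color distance; the function name `kmeans_two_clusters`, the deterministic choice of initial centers, and the tiny sample image are assumptions made for the example and are not specified in the source.

```python
def kmeans_two_clusters(pixels, max_iter=50):
    """Cluster RGB pixels into two groups; returns (labels, centers)."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Assumed deterministic start: the darkest and the brightest pixel.
    centers = [min(pixels, key=sum), max(pixels, key=sum)]
    labels = None
    for _ in range(max_iter):
        new_labels = [0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
                      for p in pixels]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in (0, 1):
            members = [p for p, lab in zip(pixels, labels) if lab == k]
            if members:
                # New center is the per-channel mean of the cluster members.
                centers[k] = tuple(sum(ch) / len(members) for ch in zip(*members))
    return labels, centers


# Hypothetical 2x3 image: dark green pixels and bright yellowish pixels.
image = [
    [(20, 60, 20), (25, 70, 22), (200, 190, 60)],
    [(22, 65, 18), (210, 200, 70), (205, 195, 65)],
]
coords = [(r, c) for r in range(len(image)) for c in range(len(image[0]))]
pixels = [image[r][c] for r, c in coords]
labels, centers = kmeans_two_clusters(pixels)

# Keep the pixel coordinates of each cluster so a later stage can reuse them.
cluster_coords = {k: [rc for rc, lab in zip(coords, labels) if lab == k]
                  for k in (0, 1)}
print(cluster_coords)
```

Storing `cluster_coords` alongside the centers mirrors the idea of passing cluster coordinates on to the next phase.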
Third step: In this phase, an RP method is applied, which is well suited to images that have irregular boundaries. It leverages the concept of an RP on a graph to assign labels to pixels based on their affinity or similarity to known regions. The coordinates of the pixels of clusters C1 and C2 obtained in step 2 are used as the seed pixels of the foreground and background.
The final step is developing the hybrid segmentation method. In this approach, infected leaves are first separated by background removal, and at the same time the ROI is extracted from the infected leaves. K-means clustering was used to divide the image into two groups for the foreground and background, but the background could not be completely separated, since some background pixels share information with the foreground pixels. The Random Path method, on its own, cannot segment the affected region of a leaf, although it removes the background area very successfully. Its requirement for human involvement in providing the foreground and background pixel coordinates is a further significant drawback. Combining these two algorithms yields the proposed RPK-means algorithm, which not only segments the lesion correctly but also requires no human intervention. To carry out next-level segmentation, a group of photos was randomly chosen from the dataset. Because the foreground and background pixel coordinates are stored after k-means clustering, this technique also minimizes over- and under-segmentation, as it chooses the seed pixels with a higher degree of precision. These coordinates are then fed into the Random Path algorithm, which produces the final segmented image. Figure 1 provides a graphic illustration of the RPK-means method.
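The combination described above can be sketched as follows. This is a rough illustration under several assumptions that are not stated in the source: the Random Path method is treated as a graph-based random-walker labelling solved by simple iterative relaxation, only pixels close to a k-means center are used as seeds, the brighter cluster is treated as foreground, and the image is grayscale. All function names, the `margin` and `beta` parameters, and the sample image are hypothetical.

```python
import math


def two_means(values, iters=20):
    """1-D k-means with k=2; returns (low_center, high_center)."""
    lo, hi = float(min(values)), float(max(values))
    for _ in range(iters):
        low = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo = sum(low) / len(low)
        if high:
            hi = sum(high) / len(high)
    return lo, hi


def random_walk_labels(gray, fg_seeds, bg_seeds, beta=0.01, sweeps=200):
    """Label pixels True (foreground) or False (background) from seed pixels.

    Relaxes the probability that a random walker starting at a pixel reaches
    a foreground seed before a background seed on a 4-connected grid.
    """
    rows, cols = len(gray), len(gray[0])
    prob = [[0.5] * cols for _ in range(rows)]
    fixed = set(fg_seeds) | set(bg_seeds)
    for r, c in fg_seeds:
        prob[r][c] = 1.0
    for r, c in bg_seeds:
        prob[r][c] = 0.0
    for _ in range(sweeps):
        for r in range(rows):
            for c in range(cols):
                if (r, c) in fixed:
                    continue  # seed probabilities stay fixed
                total = weight_sum = 0.0
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if 0 <= nr < rows and 0 <= nc < cols:
                        # Similar neighbours get weights near 1, dissimilar near 0.
                        w = math.exp(-beta * (gray[r][c] - gray[nr][nc]) ** 2)
                        total += w * prob[nr][nc]
                        weight_sum += w
                if weight_sum:
                    prob[r][c] = total / weight_sum
    return [[p >= 0.5 for p in row] for row in prob]


def rpk_means_sketch(gray, margin=15):
    """Seed the random-walker stage automatically from two k-means centers."""
    lo, hi = two_means([v for row in gray for v in row])
    fg, bg = [], []
    for r, row in enumerate(gray):
        for c, v in enumerate(row):
            if abs(v - hi) <= margin:
                fg.append((r, c))  # assumption: brighter cluster is foreground
            elif abs(v - lo) <= margin:
                bg.append((r, c))
    return random_walk_labels(gray, fg, bg)


# Hypothetical patch: the centre pixel (150) is ambiguous and gets no seed.
gray = [
    [10, 10, 200],
    [10, 150, 200],
    [10, 10, 200],
]
mask = rpk_means_sketch(gray)
print(mask)
```

In this toy patch the unseeded centre pixel is assigned to the foreground because it is far more similar to its bright neighbour than to its dark ones, and no seed coordinates had to be supplied by hand.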
To create a deep learning model for classification, all convolution filters are 3x3x3 with stride [1,1] and 'same' padding; the number of filters is 8 at the first stage (Conv-8), 16 at the second stage (Conv-16), and 32 at the third stage (Conv-32). For feature reduction, max pooling uses a 2x2 filter with stride [2,2] and 'same' padding. The number of neurons in the fully connected layers is set according to the classes available in the dataset. The arrangement of layers was changed many times to analyze the effect on classification accuracy before the final model was selected for further processing.
Here we developed a Hybrid Feature Fusion and Selection Strategy (HFFSS). The proposed strategy comprises three steps. First, all possible feature sets are created by fusing local features (LF) and global features (GF). In the second step, these feature vectors are sent for majority voting to select the best feature vector. For majority voting, a Quadratic SVM model is used and the training-testing ratio is selected between 10% and 30%. Finally, the most relevant features are selected using MRMR (Maximum Relevance and Minimum Redundancy) to reduce the number of features in the selected feature vector. The idea behind majority voting is to aggregate the predictions of different models and make a final decision based on the majority opinion. The same concept is used here to aggregate the features: the feature vectors with the highest votes are aggregated to find the best feature vector. The MRMR feature selection algorithm selects the most informative and non-redundant subset of a dataset's features. It aims to maximize the relevance of the selected features to the target variable while minimizing their redundancy with each other. The proposed feature fusion and selection strategy covers all combinations of feature vectors obtained by blending LF and GF.
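The MRMR idea of trading relevance against redundancy can be sketched with a small greedy selector. This is a minimal sketch, assuming discrete feature values, mutual information as the relevance and redundancy measure, and the "relevance minus mean redundancy" scoring form; the source does not state which MRMR variant is used. The feature names and values are hypothetical toy data, not results.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((nxy / n) * math.log(nxy * n / (px[x] * py[y]))
               for (x, y), nxy in pxy.items())


def mrmr_select(features, target, k):
    """Greedily pick k feature names maximising relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    selected = []
    while len(selected) < min(k, len(features)):
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy

        best = max((n for n in features if n not in selected), key=score)
        selected.append(best)
    return selected


# Toy data (hypothetical): f_color_alt is nearly a copy of f_color.
target = [0, 0, 0, 0, 1, 1, 1, 1]
features = {
    "f_color":     [0, 0, 0, 1, 1, 1, 1, 1],
    "f_color_alt": [0, 0, 1, 1, 1, 1, 1, 1],
    "f_texture":   [0, 0, 1, 0, 1, 1, 0, 1],
}
chosen = mrmr_select(features, target, 2)
print(chosen)
```

On this toy data a relevance-only ranking would take `f_color_alt` second, but the redundancy penalty makes the selector prefer `f_texture`, which carries information not already covered by `f_color`.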
After building a database of all possible combinations of feature vectors, the most useful feature vector is selected using the majority voting algorithm.
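The enumeration and voting steps can be sketched as follows. This is one possible reading, stated as an assumption: each training-testing split casts a vote for the candidate feature set that scored best under the classifier on that split, and the candidate with the most votes wins. The group names and accuracy numbers are invented placeholders for the demonstration and are not measurements from the specification.

```python
from collections import Counter
from itertools import combinations


def candidate_feature_sets(groups):
    """All non-empty combinations of the named feature groups."""
    return [combo for size in range(1, len(groups) + 1)
            for combo in combinations(groups, size)]


def majority_vote(accuracy_by_split):
    """Each split votes for its most accurate candidate; return (winner, votes)."""
    votes = Counter(max(scores, key=scores.get)
                    for scores in accuracy_by_split.values())
    return votes.most_common(1)[0][0], votes


# Hypothetical feature groups standing in for local and global features.
groups = ["color", "texture", "deep"]
candidates = candidate_feature_sets(groups)

# Placeholder accuracies per test ratio (10%, 20%, 30%); a real run would
# obtain these by training a classifier on each candidate feature set.
accuracy_by_split = {
    0.10: {("color",): 0.70, ("color", "deep"): 0.80, ("texture", "deep"): 0.75},
    0.20: {("color",): 0.68, ("color", "deep"): 0.82, ("texture", "deep"): 0.79},
    0.30: {("color",): 0.66, ("color", "deep"): 0.77, ("texture", "deep"): 0.78},
}
winner, votes = majority_vote(accuracy_by_split)
print(len(candidates), winner, dict(votes))
```

Three groups give seven candidate sets, and in this placeholder table the `("color", "deep")` set wins two of the three votes.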
4 Claims & 2 Figures

Claims: The scope of the invention is defined by the following claims:

Claim:
1. A System /Method for improving the classification performance in Plant diseases using Hybrid Feature Selection Techniques comprising the steps of
a) A method is adopted that resolves the problem of over and under-segmentation.
b) A feature extraction model is developed to incorporate local and global features to increase a classification model's accuracy.
c) A classifier is designed for measuring the performance of designed model.
2. A System /Method for improving the classification performance in Plant diseases using Hybrid Feature Selection Techniques as claimed in claim 1, the RPK-means algorithm that combines k-means clustering and the Random Path (RP) method to reduce over and under segmentation problems.
3. A System /Method for improving the classification performance in Plant diseases using Hybrid Feature Selection Techniques as claimed in claim 1, the hybrid feature fusion and selection strategy is developed to create the best feature vector having minimum redundant and maximum relevant features to improve the classifier's performance.
4. A System /Method for improving the classification performance in Plant diseases using Hybrid Feature Selection Techniques as claimed in claim 1, the twelve layer deep learning model along with SVM is used to compare the performance of the developed models.

Documents

Application Documents

# Name Date
1 202441049923-REQUEST FOR EARLY PUBLICATION(FORM-9) [29-06-2024(online)].pdf 2024-06-29
2 202441049923-OTHERS [29-06-2024(online)].pdf 2024-06-29
3 202441049923-FORM-9 [29-06-2024(online)].pdf 2024-06-29
4 202441049923-FORM FOR STARTUP [29-06-2024(online)].pdf 2024-06-29
5 202441049923-FORM FOR SMALL ENTITY(FORM-28) [29-06-2024(online)].pdf 2024-06-29
6 202441049923-FORM 1 [29-06-2024(online)].pdf 2024-06-29
7 202441049923-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [29-06-2024(online)].pdf 2024-06-29
8 202441049923-EDUCATIONAL INSTITUTION(S) [29-06-2024(online)].pdf 2024-06-29
9 202441049923-DRAWINGS [29-06-2024(online)].pdf 2024-06-29
10 202441049923-COMPLETE SPECIFICATION [29-06-2024(online)].pdf 2024-06-29