Real Time Intelligent Auscultation Sound Diagnostic System

Abstract: The present invention discloses a continuous and remote real-time monitoring system (S) that utilizes deep learning algorithms (DLA) to analyze and categorize lung sounds automatically. The system (S) utilizes a combination of electronic stethoscopes, audio recording software (ARS), and deep learning algorithms (DLA) to classify and analyze respiratory sounds, offering a real-time diagnosis of lung health status. The system design incorporates a Raspberry Pi for live recording and can classify sounds into groups based on respiratory conditions. The system (S) is integrated with a state-of-the-art web platform (WP) that provides an automated diagnosis of lung sounds and generates a comprehensive diagnostic report (CDR), which is promptly notified to the user via email. The respiratory diagnosis report is shared with a remote medical professional upon the user’s request.

Patent Information

Application #: 202341045291
Filing Date: 06 July 2023
Publication Number: 35/2023
Publication Type: INA
Invention Field: BIO-MEDICAL ENGINEERING
Status:
Email:
Parent Application:

Applicants

AMRITA VISHWA VIDYAPEETHAM
Amritapuri Campus, Amrita School of Computing, Amritapuri, Clappana PO, Kollam 690 525

Inventors

1. SUJITH, Abhishek
276B 2nd Cross Street, Vinayagar Colony, Agasthiyarpatti, Ambasamudram, Tirunelveli District, Tamil Nadu 627428
2. THACHANKATTIL, Anjali
Room No 11, Amrita Sindhu, MA Math, Kollam, Kerala 690546

Specification

Description: FIELD OF THE INVENTION:
The present invention relates to a continuous and remote real-time monitoring system that utilizes deep learning algorithms to analyze and categorize lung sounds automatically. More particularly, the present invention relates to a system which uses deep learning algorithms powered by artificial intelligence to automatically live-record, identify and categorize lung sounds and share the same with healthcare providers in order to diagnose respiratory diseases in patients.

BACKGROUND OF THE INVENTION:
Respiratory diseases such as chronic obstructive pulmonary disease (COPD) are a leading cause of mortality in India and globally. Respiratory disorders continue to threaten public health significantly, making accurate diagnosis and treatment essential. The sounds produced by the respiratory system during breathing are significant indicators of respiratory health and illness.

Auscultation i.e., listening to internal body sounds, is a crucial component of the physical examination that enables clinicians to identify respiratory conditions and administer first aid. Traditional auscultation using a stethoscope is a rapid, effective, and inexpensive method of assessing respiratory sounds. However, this method of diagnosing respiratory disorders is subjective and reliant on the considerable expertise of healthcare professionals, which may be affected by inter-listener variability, resulting in a high risk of misdiagnosis.

Thus, there is a need for accurate, non-invasive diagnostic techniques that can analyze respiratory sounds, provide an objective diagnosis and thereby improve the treatment of respiratory disorders.
In recent years, advancements in computer systems and deep learning algorithms, capable of gathering and analyzing large volumes of data, have enabled the development of numerous non-invasive diagnostic methods for various illnesses, including respiratory diseases. Deep-learning algorithms can accurately diagnose respiratory illnesses such as asthma, pneumonia, bronchitis and bronchiolitis by analyzing breathing sounds.

OBJECT OF THE INVENTION:
In order to obviate the drawbacks of the existing state of the art, the present invention discloses a continuous and remote real-time monitoring system that utilizes deep learning algorithms to analyze and categorize auscultation sounds automatically, in order to enable diagnosis of respiratory diseases in patients.

The main object of the present invention is to provide a remote and continuous real-time sound monitoring system with automated sound analysis capabilities, incorporating deep learning algorithms, to categorize sounds automatically.

Another object of the invention is to provide a diagnostic system which is portable, affordable, and accessible, thus making it particularly useful for medical professionals working in remote areas.

Another object of the invention is to identify sounds using an electronic stethoscope and audio recording software to accurately classify and diagnose respiratory conditions.

Another object of the invention is to provide audio recording software utilizing advanced deep-learning algorithms powered by artificial intelligence (AI), comprising Gated Recurrent Units (GRUs), Leaky Rectified Linear Units (LeakyReLU) and Convolutional Neural Networks (CNNs), for analyzing and diagnosing respiratory conditions and generating a comprehensive diagnostic report.

Another object of the invention is to identify lung sounds, which are then categorized into one of seven groups: Healthy, Bronchiectasis, Bronchiolitis, COPD, Asthma, Pneumonia, and URTI.

Another object of the invention is to provide a user-friendly interface for uploading and analyzing valuable lung data.

Another object of the invention is to provide a prompt notification of the diagnosed lung condition to the user via email, along with detailed information on the lung health status, and sharing of the diagnosis report with a remote medical professional upon the user’s request.

SUMMARY OF THE INVENTION:
The present invention discloses a remote and continuous real-time sound monitoring system with automated sound analysis capabilities. This system identifies respiratory sounds using an electronic stethoscope and audio recording software employing deep-learning algorithms to accurately classify and diagnose respiratory conditions, thus overcoming the limitations of traditional auscultation methods.

The system of the present invention uses deep learning algorithms to live-record and identify lung sounds using an electronic stethoscope and automatically classify these respiratory sounds, using audio recording software, into one of seven groups: Healthy, Bronchiectasis, Bronchiolitis, COPD, Asthma, Pneumonia, and URTI. The system is portable and affordable, making it particularly useful for medical professionals working in remote areas. The web platform of the system provides automated diagnosis for lung sounds, further enhancing the usability and accessibility of the system. The web platform provides a user-friendly interface for uploading and analyzing valuable lung data of the patient and sharing the same with a remotely located healthcare professional for diagnosis and treatment of the respiratory disease.

The system of the present invention is user-friendly such that even non-technical medical professionals can use it to diagnose respiratory disorders. Integrating an innovative web platform that offers automated diagnosis for lung sounds adds a unique dimension to the proposed invention, allowing users to upload and analyze valuable lung data quickly. The platform utilizes advanced algorithms powered by artificial intelligence and deep learning to generate a comprehensive diagnostic report, providing users with a clear understanding of their lung health status.

STATEMENT OF INVENTION
The present invention discloses a continuous real-time respiratory sound monitoring system having automated sound diagnosis and analysis capabilities. The system utilizes a combination of Raspberry Pi and deep learning algorithms to develop a novel system that can live-record and identify lung sounds for diagnostic purposes. The compact, inexpensive, and accessible solution is designed for medical practitioners operating in outlying locations.

The present invention aims to provide a scalable diagnostic system that can accommodate multiple patients and various diseases. This scalability makes the proposed method more adaptable to healthcare settings and patient populations.

The present invention utilizes an advanced web platform for the automated diagnosis of lung sounds. The platform is designed with a user-friendly interface, ensuring that uploading and analyzing crucial lung data is straight-forward, irrespective of the user’s technical or medical knowledge.

The system of the present invention also allows for sharing the diagnosis with healthcare providers at the user’s request, ensuring seamless collaboration between the user and their healthcare providers for proper diagnosis and treatment.

The present invention is a cutting-edge diagnostic system that seamlessly integrates with other health information systems, electronic medical records (EMR), and telemedicine systems.

BRIEF DESCRIPTION OF THE DRAWINGS:
Fig. 1 depicts respiratory sound monitoring and diagnosis system design

Fig. 2 depicts Distribution of Respiratory Conditions before Data Augmentation

Fig. 3 depicts Distribution of Respiratory Conditions after Data Augmentation

Fig. 4 depicts Scatter Plot of Respiratory Cycle Distribution

Fig. 5 depicts Box Plot of Respiratory Cycle Distribution

Fig. 6 depicts deep learning module

Fig. 7 depicts GRU architecture

Fig. 8 depicts Convolutional Neural Network (CNN)

Fig. 9 depicts Physicians Data Upload WebApp

Fig. 10 depicts Anvil Web Architecture

Fig. 11 depicts Anvil WebApp Working

Fig. 12 depicts Tele Diagnosis WebApp

Fig. 13 depicts Raspberry Pi Architecture

Fig. 14 depicts GRU - Accuracy Curve

Fig. 15 depicts GRU - Loss Curve

Fig. 16 depicts AIMS-GRU Accuracy Curve

Fig. 17 depicts AIMS-GRU Loss Curve

Fig. 18 depicts CNN - ROC Curve

DETAILED DESCRIPTION OF THE INVENTION:
The present invention discloses an automated, continuous, scalable, remote and real-time monitoring and analysis system (S) that utilizes deep learning algorithms (DLA) to analyze and categorize lung sounds automatically. The system (S), as disclosed, is a significant advancement in non-invasive diagnostic methods for respiratory disorders, leveraging cutting-edge technologies to improve healthcare outcomes. The system utilizes a combination of Raspberry Pi with integrated deep learning algorithms for recording, analyzing and categorizing respiratory sounds into major classes of respiratory disorders, thus enabling diagnosis of respiratory conditions of a patient and generating a comprehensive diagnostic report (CDR).

The system (S) further comprises a user-interface (UI) for notifying the CDR to the user of the system (S), via email, which is then shared with healthcare professionals upon the user’s request. This feature adds a significant advantage to the disclosed respiratory sound monitoring system by allowing patients to share their diagnoses with healthcare providers easily. The comprehensive diagnostic report (CDR) can be used to monitor the progress of a patient’s respiratory illness and identify any necessary treatment adjustments. Thus, sharing the diagnosis with healthcare providers allows for a collaborative approach to patient care, promoting better patient outcomes.

INTELLIGENT AUSCULTATION SOUND DIAGNOSIS SYSTEM
The automated, real-time sound monitoring and analysis system (S) of the present invention utilizes a combination of Raspberry Pi and deep learning algorithms (DLA) to develop a novel system that can live-record and identify lung sounds for diagnostic purposes. The compact, inexpensive, and accessible solution is designed for medical practitioners operating in outlying locations. It seeks to offer an intuitive diagnostic method that even non-technical medical practitioners may use. The system offers the following major benefits to patients and health-care professionals in accurate diagnosis and management of respiratory conditions:

Real-time Diagnostic System: live-records respiratory sounds for various respiratory conditions in multiple patients using an electronic stethoscope, and analyzes the recorded sounds using deep learning algorithms (DLA) to prepare a comprehensive diagnostic report (CDR).

Tele - Diagnosis Web Platform: comprising an advanced web platform (WP) with a user-friendly interface for uploading and analyzing crucial lung data, including the type of abnormality detected, its severity and possible next steps for treatment, thus ensuring seamless collaboration between the user and their healthcare providers for proper diagnosis and treatment.

Seamless Healthcare Integration: providing a cutting-edge diagnostic system that seamlessly integrates with other health information systems, electronic medical records (EMR), and telemedicine systems to enhance the quality of patient care.

The automated, real-time sound monitoring and analysis system (S) of the present invention is depicted in Fig. 1. The system can be used for analysis and classification of both pre-recorded sounds and auscultation sounds collected in real time. The system pre-processes the pre-recorded and real-time sounds, extracts distinguishing features of the sounds and classifies them into major classes of respiratory conditions, for diagnosing the respiratory condition of the patient.

(A) SOFTWARE ARCHITECTURE (SA)
The system (S) of the present invention utilizes an audio recording software (ARS) for recording and analyzing respiratory sounds. The software architecture (SA) of the system comprises: the sound processing module (SPM), the deep learning module (DLM), and the web platform module (WPM).

Sound Processing Module (SPM): The sound processing is performed by the sound processing module (SPM), which is responsible for capturing and filtering the respiratory sounds.

(i) Sound Processing
The sound data is captured using the wearable stethoscope and transmitted to the Raspberry Pi via Bluetooth. The sound processing module (SPM) filters out the noise and amplifies the respiratory sounds, making them more audible for the deep learning module (DLM).
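
A minimal sketch of the kind of noise filtering and amplification the SPM could perform is given below, assuming SciPy, a 4 kHz sampling rate and a 100–1800 Hz pass band; none of these parameter values are taken from the specification.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def preprocess_breath_audio(signal, sr=4000, low_hz=100.0, high_hz=1800.0, gain=2.0):
    """Band-pass filter a raw stethoscope signal and amplify it.

    The pass band (100-1800 Hz) and gain are illustrative values; lung
    sounds mostly lie below ~2 kHz, but the specification does not fix
    these parameters.
    """
    sos = butter(4, [low_hz, high_hz], btype="bandpass", fs=sr, output="sos")
    filtered = sosfiltfilt(sos, signal)               # zero-phase noise filtering
    amplified = np.clip(filtered * gain, -1.0, 1.0)   # boost level while avoiding clipping
    return amplified
```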

(ii) Text Pre-Processing
Data pre-processing is a critical step in preparing data for use in machine learning algorithms. In the context of auscultation sound classification, data pre-processing steps can help to optimize the accuracy and reliability of the classification model.
In the given scenario, there is only one patient without an age, and the diagnosis is COPD. To impute the missing age value, the median age of patients with COPD is used as a measure of central tendency. This approach is robust to outliers or extreme values in the data and ensures that the imputed age value is representative of the relevant population. The use of median as an imputation technique is a standard approach in data pre-processing.
Similarly, there is only one patient without sex information. In this case, the most common outcome (male) is used to impute the missing value. Imputing the missing value with the mode of the feature (in this case, sex) is a common technique in data preprocessing, especially when dealing with categorical variables.

In addition to missing age and sex values, there are missing values for BMI. To impute these missing values, the weight and height of the patient are used. BMI is calculated as the ratio of the patient’s weight (in kilograms) to the square of their height (in meters). Imputing missing values for BMI, using a simple formula, is a widely used approach in data preprocessing.
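
A minimal pandas sketch of the imputation steps described above; the column names ('age', 'sex', 'diagnosis', 'weight_kg', 'height_m', 'bmi') are assumed for illustration.

```python
import pandas as pd

def impute_demographics(df: pd.DataFrame) -> pd.DataFrame:
    """Fill missing age, sex and BMI values as described above."""
    df = df.copy()
    # Age: median age of patients sharing the same diagnosis (e.g. COPD)
    df["age"] = df["age"].fillna(
        df.groupby("diagnosis")["age"].transform("median"))
    # Sex: most frequent value (mode) of the column
    df["sex"] = df["sex"].fillna(df["sex"].mode()[0])
    # BMI: weight (kg) divided by the square of height (m)
    computed_bmi = df["weight_kg"] / df["height_m"] ** 2
    df["bmi"] = df["bmi"].fillna(computed_bmi)
    return df
```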

(iii) Audio Pre-Processing
Fig. 2 denotes that the classes of respiratory conditions are imbalanced. Therefore, by applying different modifications to the original data, new examples may be generated that are still representative of the original class, but are less imbalanced and contain variations that can help the model to better generalize to unseen data. This is done using "Data Augmentation".

Data augmentation is a technique commonly used in machine learning to artificially increase the size of the training data set, thus allowing simulation of different environmental conditions or speaker variations. Several techniques are used for data augmentation in auscultation sound classification, some of which are as follows (a short code sketch of these operations is given after the list):
• Adding Noise: Adding random noise to the auscultation sound can help to simulate the real-world noise that might be present during the recording process. For example, by adding white noise i.e., noise containing many frequencies with equal intensities, to a recording of a heart murmur, we can generate a new example that sounds similar to the original, but that may contain additional features that can help the model better discriminate between different types of murmurs. This can help to make the model more robust to noise and improve its accuracy.
• Shifting: Shifting the sound waveform by a few samples to the left or right can create a new sample that is similar to the original one, but with a different start time. This can simulate changes in the position of the stethoscope on the body or variations in the duration of the sound. For instance, shifting a recording of a wheeze by a few milliseconds can generate a new example that sounds similar to the original, but that captures a slightly different phase of the wheeze. This can help the model to learn variations in timing and improve its performance.
• Stretching: Stretching the sound waveform by a certain rate can help to simulate variations in the duration of the sound. This can simulate variations in the rate of breathing or the length of the sound produced. By stretching a recording of a crackle, for example, we can generate a longer version of the sound that captures more of its acoustic properties and provides the model with additional information to make a classification. This can help the model to learn variations in duration and improve its performance.
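
A minimal sketch of the three augmentation operations listed above, assuming NumPy and librosa; the noise level, shift and stretch rate are illustrative parameters, not values from the specification.

```python
import numpy as np
import librosa

def augment(signal, sr=4000, noise_level=0.005, shift_ms=50, stretch_rate=1.1):
    """Generate noisy, shifted and stretched variants of one recording."""
    # Adding noise: white noise with roughly equal intensity across frequencies
    noisy = signal + noise_level * np.random.randn(len(signal))
    # Shifting: move the waveform by a few milliseconds (changes the start time)
    shift = int(sr * shift_ms / 1000)
    shifted = np.roll(signal, shift)
    # Stretching: change the duration of the sound by a given rate
    stretched = librosa.effects.time_stretch(signal, rate=stretch_rate)
    return noisy, shifted, stretched
```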

Fig. 3 depicts the class distribution after data augmentation, wherein the respiratory sounds representative of different lung health conditions are more balanced when compared with the unprocessed sounds.

Respiratory Cycle Extraction is another method of audio pre-processing in which, while preparing audio data for respiratory cycle classification, the portions of the audio file containing respiratory cycles are extracted. This can be achieved by utilizing the start and end times specified for these cycles in the data frame. The start time represents the beginning of a respiratory cycle and is usually given in seconds. To extract the corresponding portion of the audio signal, the start time is multiplied by the sampling rate, converting it into the corresponding sample index in the signal array. Once the portions of the audio signal that contain respiratory cycles have been identified, it is important to confirm that they have the same length.

To determine the best length for the respiratory cycle segments, the distribution of the segment lengths can be analyzed using a scatter plot, as depicted in Fig. 4, and a box plot, as depicted in Fig. 5. Upon analysis of these plots, it has been determined that a length of approximately 6 seconds is a good choice for the respiratory cycle segments. If the difference between the start and end times for a given cycle segment is less than 6 seconds, the segment needs to be zero padded to bring it up to the required length. 'Zero padding' means adding silence to the end of the segment to make it the required length. It is important to note that a single audio file may contain multiple respiratory cycles. Therefore, multiple files need to be created from a single audio file in order to capture all of the respiratory cycles. This can be achieved by dividing the audio signal into non-overlapping segments of the appropriate length and saving each segment as a separate file.
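
A minimal sketch of the cycle extraction and zero-padding steps described above, assuming a NumPy signal array and cycle boundaries given in seconds.

```python
import numpy as np

TARGET_SECONDS = 6  # segment length chosen from the scatter/box plot analysis above

def extract_cycle(audio, sr, start_s, end_s):
    """Cut one respiratory cycle out of a recording and zero-pad it to 6 seconds.

    `audio` is the 1-D signal array, `sr` the sampling rate, and
    `start_s`/`end_s` the cycle boundaries (in seconds) from the data frame.
    """
    start_idx = int(start_s * sr)          # seconds -> sample index
    end_idx = int(end_s * sr)
    segment = audio[start_idx:end_idx]
    target_len = TARGET_SECONDS * sr
    if len(segment) < target_len:
        # zero padding: append silence so that every segment has equal length
        segment = np.pad(segment, (0, target_len - len(segment)))
    return segment[:target_len]
```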

DEEP LEARNING MODULE (DLM)
The deep learning module (DLM), deployed in the system of the present invention, is responsible for analyzing the respiratory sounds and generating a diagnostic report. The deep learning module (DLM) of the present invention uses a Gated Recurrent Unit (GRU) and Leaky Rectified Linear Unit (LeakyReLU) to classify the respiratory sounds into one of seven groups: Healthy, Bronchiectasis, Bronchiolitis, COPD, Asthma, Pneumonia, and URTI. The deep learning module’s accuracy ranges from 93% to 95%. The diagnostic report generated by the deep learning module (DLM) is displayed on the mobile device. The architecture of the deep learning module (DLM) is depicted in Fig. 6, wherein, after data acquisition and pre-processing, the data is split for training and testing. Finally, classification of respiratory sounds is done based on evaluation of the data set.

Gated Recurrent Unit (GRU)
Gated Recurrent Units (GRUs) are a type of neural network architecture that have shown excellent results in processing sequential data, such as natural language processing, speech recognition, and time-series analysis. GRUs are a variation of the commonly used Recurrent Neural Network (RNN) architecture and have been found to perform well in various classification tasks, including auscultation sound classification.

The GRU architecture is composed of a set of recurrent neural network (RNN) units that are gated to selectively allow the flow of information. The gates control the flow of information by regulating the amount of information that is passed between the current time step and the previous time step. The gating mechanism is designed to selectively allow relevant information to pass while blocking irrelevant information.

The GRU architecture is depicted in Fig. 7. It is composed of two types of gates: the update gate and the reset gate. The update gate is responsible for determining how much of the previous state should be passed on to the current state, while the reset gate is responsible for deciding how much of the new input should be incorporated into the current state. The combination of these two gates allows the GRU to effectively capture the temporal dependencies between sequential data, which is crucial in the analysis of auscultation sounds. GRUs are commonly used in the deep learning models developed for auscultation sound classification due to their ability to effectively capture the temporal dependencies between sequential data.
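
For reference, the standard GRU cell equations (not stated explicitly in the specification) are reproduced below, where x_t is the input at time step t, h_{t-1} the previous hidden state, sigma the sigmoid function and ⊙ element-wise multiplication:

```latex
% Standard GRU cell for one time step t
\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) && \text{(update gate)}\\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) && \text{(reset gate)}\\
\tilde{h}_t &= \tanh\bigl(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\bigr) && \text{(candidate state)}\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{aligned}
```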

Leaky Rectified Linear Unit (LeakyReLU)
The Leaky Rectified Linear Unit (LeakyReLU) is a type of activation function that was introduced to address the vanishing gradient problem in neural networks. The vanishing gradient problem occurs when the gradient of the cost function becomes very small during backpropagation, causing the network to learn slowly or not learn at all. The LeakyReLU function provides a small non-zero gradient for negative input values, unlike the traditional ReLU function, which returns zero for negative inputs. The LeakyReLU function can be defined as:
f(x) = x, for x >= 0
f(x) = alpha * x, for x < 0
where alpha is a hyperparameter that defines the slope of the function for negative input values. Typically, alpha is set to a small positive value, such as 0.01.
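
A minimal NumPy sketch of this piecewise definition:

```python
import numpy as np

def leaky_relu(x, alpha=0.01):
    """LeakyReLU as defined above: identity for x >= 0, alpha * x otherwise."""
    return np.where(x >= 0, x, alpha * x)

print(leaky_relu(np.array([-2.0, 0.0, 3.0])))  # -> [-0.02  0.    3.  ]
```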

In auscultation sound classification, the LeakyReLU function is commonly used in the hidden layers of convolutional neural networks (CNNs). The CNNs are trained on spectrograms of heart or lung sounds, which are two dimensional representations of the sound signals. The LeakyReLU function is used as an activation function after the convolutional layers to introduce non-linearity in the network and improve its ability to extract relevant features from the input data. The spectrograms of heart or lung sounds contain both positive and negative values. The LeakyReLU function ensures that negative values are not completely ignored by the network during training. The use of LeakyReLU also helps in preventing overfitting by regularizing the network, as it introduces noise in the gradients during backpropagation.

Convolutional Neural Networks (CNNs)
Convolutional Neural Networks (CNNs) are a class of neural networks that are particularly effective at processing grid-like data, such as images and sounds. In auscultation sound classification, CNNs can be used to automatically extract relevant features from the sound signals and classify them into different categories. A typical CNN, as depicted in Fig. 8, consists of several layers, including convolutional layers, pooling layers, and fully connected layers. The input to the network is a sound signal, which is typically represented as a 1-dimensional waveform. The first layer of the network is usually a convolutional layer, which applies a set of filters to the input waveform to extract features that are relevant to the classification task. The output of the convolutional layer is a set of feature maps, which are then passed through a non-linear activation function LeakyReLU.

The output of the first convolutional layer is then passed through one or more pooling layers, which reduce the spatial dimensions of the feature maps and help to extract more abstract features from the data. The pooling layer typically performs a down sampling operation, such as max pooling, which selects the maximum value within a small region of the feature map.

After the pooling layers, the feature maps are flattened into a 1-dimensional vector and passed through one or more fully connected layers, which perform the final classification. The output of the fully connected layers is a set of class probabilities, which can be used to classify the input sound signal into one of several categories.

CNNs have been shown to be highly effective at auscultation sound classification, achieving state-of-the-art performance on a variety of datasets. In particular, they are well-suited to the task of feature extraction from complex sound signals, which can be difficult to perform manually. Additionally, CNNs can be trained end-to-end, which means that the feature extraction and classification steps are learned jointly from the data, rather than being designed by hand.
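
A minimal sketch of a 1-D CNN of the kind described above, assuming Keras/TensorFlow, a 6-second input sampled at 4 kHz (24,000 samples) and seven output classes; the number of filters, kernel sizes and dense units are illustrative choices, not taken from the specification.

```python
from tensorflow import keras
from tensorflow.keras import layers

NUM_CLASSES = 7  # Healthy, Bronchiectasis, Bronchiolitis, COPD, Asthma, Pneumonia, URTI

def build_cnn(input_length=24000):
    """1-D CNN over a fixed-length waveform (illustrative layer sizes)."""
    model = keras.Sequential([
        layers.Input(shape=(input_length, 1)),
        layers.Conv1D(16, kernel_size=9),       # convolutional feature extraction
        layers.LeakyReLU(0.01),                 # non-linearity discussed above
        layers.MaxPooling1D(pool_size=4),       # down-sampling of feature maps
        layers.Conv1D(32, kernel_size=9),
        layers.LeakyReLU(0.01),
        layers.MaxPooling1D(pool_size=4),
        layers.Flatten(),                       # flatten feature maps to a vector
        layers.Dense(64, activation="relu"),
        layers.Dense(NUM_CLASSES, activation="softmax"),  # class probabilities
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```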

WEB PLATFORM MODULE (WPM)
The web platform module (WPM) provides an automated diagnosis for lung sounds and is accessible from any device with an internet connection. The web platform used in the present invention has been depicted in Fig. 9. The user can upload respiratory sound data to the web platform, where it is analyzed using the same deep learning algorithms as the Raspberry Pi. The platform has a user-friendly interface that makes it easy for non-technical or medical professionals to upload and analyze valuable lung data. Once the respiratory sound data is analyzed, the web platform generates a comprehensive diagnostic report (CDR) and sends it to the user via email. The comprehensive diagnostic report (CDR) contains important information such as the type of abnormality detected and potential next steps, providing users with a clear understanding of their lung health. The user can also choose to share the diagnosis with their healthcare providers upon request. The web platform module ensures that patients can receive accurate and timely diagnosis regardless of their location, allowing for prompt medical intervention when necessary.
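
A minimal sketch of the email-notification step using the Python standard library; the SMTP host, sender address and report wording are placeholders, and the actual web platform may use its own mail facilities.

```python
import smtplib
from email.message import EmailMessage

def send_diagnostic_report(recipient, diagnosis, details,
                           smtp_host="localhost", sender="noreply@example.org"):
    """E-mail the comprehensive diagnostic report (CDR) to the user.

    The SMTP host and sender address are placeholders for illustration.
    """
    msg = EmailMessage()
    msg["Subject"] = f"Lung health diagnostic report: {diagnosis}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"Detected condition: {diagnosis}\n\n{details}\n\n"
        "You may share this report with your healthcare provider on request.")
    with smtplib.SMTP(smtp_host) as server:
        server.send_message(msg)
```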

USER INTERFACE (UI)
The system (S) of the present invention has a user interface (UI) which is an essential aspect of the system’s design. The user interface (UI) connects the user with the web platform (WP), thus ensuring that users can easily control the system, upload respiratory sound data, and receive a diagnostic report. The user interface (UI) provides users with an interface for accessing the system’s capabilities by connecting with the web platform module (WPM). The user interface of the present invention has been depicted in Fig. 10 and Fig. 11. The user is at the center of the user interface (UI), with access to the data stored in the web.

The web platform (WP) is accessible in the user interface (UI), from any device with an internet connection, making it convenient for users who may not have access to the mobile application. The web platform (WP) is designed to be intuitive and straightforward, ensuring that even users with limited technical or medical knowledge can use it. The interactive Tele diagnosis web app of the present invention has been depicted in Fig. 12. This platform enables users to upload respiratory sound data to the system for analysis, using the deep learning algorithms (DLA). Once the analysis is complete, the web platform (WP) generates a comprehensive diagnostic report (CDR) that is sent to the user via email. The report provides users with information on the type of abnormality detected and potential next steps, ensuring that users have a clear understanding of their lung health status. Furthermore, the web platform module (WPM) enables users, via the user interface (UI), to share their diagnostic report with healthcare providers upon request. This feature is essential in ensuring that users can receive the necessary medical attention and treatment.

HARDWARE ARCHITECTURE (HA)
The system (S) of the present invention has a convenient and robust hardware architecture (HA) comprising a Raspberry Pi, a wearable stethoscope, and a mobile device. The Raspberry Pi is the main component that processes and analyzes the sound data captured by the stethoscope. The stethoscope is equipped with integrated processors and sensors that enable mobile and cordless respiratory monitoring. The stethoscope’s design is lightweight and comfortable to wear, making it easy for patients to use. The mobile device is used to control the Raspberry Pi and display the diagnostic report generated by the system.

Raspberry Pi
Raspberry Pi is the main component that processes and analyzes the sound data captured by the stethoscope. Typical architecture of Raspberry Pi has been depicted in Fig. 13. It is a small, single-board computer that has been used in various applications including in auscultation sound classification. It is a low-cost and power-efficient device that can be used as a standalone system or integrated into a larger system. This device can be connected to a microphone and a set of headphones, which can be used to record and play back lung sounds. The audio data can be processed using machine learning algorithms running on the Raspberry Pi, allowing for real-time classification of heart and lung sounds. The portability of Raspberry Pi is a significant advantage in auscultation sound classification. Its small size and low power consumption allow for the creation of a portable auscultation system that can be taken to remote areas or used in regions where resources are limited. The device can also be integrated into a larger system, enabling remote monitoring and data transmission.
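
A minimal sketch of live recording on the Raspberry Pi, assuming the sounddevice package and an attached microphone; the 4 kHz sampling rate and 20-second window are illustrative values.

```python
import sounddevice as sd
from scipy.io.wavfile import write

SAMPLE_RATE = 4000   # illustrative; lung sounds are band-limited well below 2 kHz
DURATION_S = 20      # length of one live recording window

def record_lung_sound(path="lung_sound.wav"):
    """Record one auscultation window from the attached microphone."""
    audio = sd.rec(int(DURATION_S * SAMPLE_RATE),
                   samplerate=SAMPLE_RATE, channels=1)
    sd.wait()                        # block until the recording is finished
    write(path, SAMPLE_RATE, audio)  # save as WAV for the deep learning module
    return path
```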

ANALYSIS, CLASSIFICATION AND DIAGNOSIS OF RESPIRATORY LUNG SOUNDS:
The Real-time Intelligent Auscultation Sound Diagnostic System (S) of the present invention was tested in The Amrita Institute of Medical Sciences & Research Centre (AIMS) situated in Kochi and the Amrita Kripa Hospital situated in Amritapuri. Recordings in this dataset were meticulously captured under various situations, carefully encompassing a range of conditions that closely mirror real-life scenarios. This includes recordings of both clear respiratory sounds and noisy recordings, allowing researchers to tackle the challenges encountered in real-world diagnostic settings. Including noisy recordings is particularly significant, as it provides a realistic representation of the complexities faced by healthcare practitioners when diagnosing respiratory conditions. This diversity of recording conditions makes the dataset highly suitable for developing and evaluating algorithms and models aimed at automating the diagnosis of respiratory disorders.

The dataset consists of a total of 1979 recordings, with a cumulative duration of 9.4 hours. These recordings were obtained from a substantial cohort of 1294 individual patients, indicating a robust sample size that allows for comprehensive analysis and study.

The dataset encompasses a meticulously curated assemblage of recordings representing seven distinct respiratory classes, namely: wheezing, bronchiectasis, bronchiolitis, healthy control (representing individuals devoid of any respiratory pathology), pneumonia, upper respiratory tract infection (URTI), and chronic obstructive pulmonary disease (COPD). Moreover, the dataset extends its breadth to encompass instances of tuberculosis and cancer, augmenting the spectrum of pathological conditions encompassed. Demographics of the collected samples have been provided in Table 1. This demographic information contains essential attributes, including age, gender, and detailed clinical profiles. By considering variables such as age and gender, researchers can discern possible variations in respiratory sound patterns attributed to physiological and anatomical disparities.
Table 1: Demographics of samples collected

The data acquisition process entailed the utilization of a repertoire of advanced tools to capture the intricate acoustic characteristics of respiratory sounds. These instrumentalities encompassed the 3M Littmann Classic III Monitoring Stethoscope, renowned for its exceptional sensitivity and fidelity, the 3M Littmann Core Digital Stethoscope Black Stem 8480, featuring cutting-edge digital signal processing capabilities, and the AKG C417 PP Professional Lavalier Microphone, engineered for precise sound capture in clinical settings.

The initial training accuracy of the GRU model was 97%, i.e., the model was able to classify 97% of the training data correctly. The initial testing accuracy of the model was 92%, which means that the model was able to classify 92% of the testing data correctly.

Implementation of early stopping in the present system improved the training accuracy of the model to 98%. This means that the model was able to classify 98% of the training data correctly. However, the testing accuracy decreased to 88%, indicating that the model may have overfit to the training data. Early stopping is a technique used to prevent overfitting of the model to the training data. It stops the training process once the performance of the model on a validation set does not improve for a specified number of epochs, wherein an ‘epoch’ refers to one complete pass of the training dataset through the algorithm.
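
A minimal Keras sketch of the early-stopping setup described above; the patience of 5 epochs and the choice to restore the best weights are illustrative assumptions.

```python
from tensorflow import keras

def train_with_early_stopping(model, x_train, y_train, x_val, y_val):
    """Train while monitoring validation loss; patience of 5 epochs is illustrative."""
    early_stopping = keras.callbacks.EarlyStopping(
        monitor="val_loss",            # stop when validation loss stops improving
        patience=5,                    # epochs with no improvement before stopping
        restore_best_weights=True)     # roll back to the best epoch seen
    return model.fit(x_train, y_train,
                     validation_data=(x_val, y_val),
                     epochs=100,
                     callbacks=[early_stopping])
```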

The accuracy and loss curves are important visualizations used in the evaluation of machine learning models, including GRU (Gated Recurrent Unit) models. The accuracy curve shows how the accuracy of the model changes during the training process, while the loss curve shows how the loss (i.e., the difference between predicted and actual values) changes during the training process.

The accuracy curve of a GRU model, as depicted in Fig. 14, typically shows an upward trend as the model is trained on the data. This indicates that the model is improving its ability to correctly classify instances in the dataset. However, it is important to monitor the accuracy on both the training and validation data, as a model that overfits to the training data may have high accuracy on the training set but low accuracy on the validation or testing set.

The loss curve, on the other hand, typically shows a downward trend during training. This indicates that the model is reducing the difference between predicted and actual values, which is the goal of the training process.

The loss curve can also help identify potential issues with the model, such as overfitting or underfitting. The loss curve obtained by the system of the present invention is depicted in Fig. 15. In general, a good GRU model should have a high accuracy and a low loss, indicating that it is able to accurately classify instances and reduce the difference between predicted and actual values. However, it is important to consider the trade-off between accuracy and loss, as increasing one may come at the expense of the other.

The model was additionally evaluated using other metrics, such as Precision, Recall and F1 score to gain a more comprehensive understanding of its performance.

Precision is the proportion of true positives (correctly classified positive instances) among the total positive predictions made by the model. The precision of the present model is 0.924573, indicating that out of all the positive predictions made by the model, 92.45% were correct.

Recall is the proportion of true positives among the actual positive instances. The recall of the present model is 0.920398, indicating that out of all the actual positive instances, the model was able to correctly identify 92.04%.

The F1 score is the harmonic mean of precision and recall, providing a balance between the two metrics. The F1 score of the present model is 0.920106.

Cohen’s kappa coefficient measures the agreement between predicted and actual classifications, taking into account the possibility of random agreement. The Cohen’s kappa coefficient of the present model is 0.904365, indicating a good level of agreement between predicted and actual classifications.

The Matthews correlation coefficient is another metric that takes into account true positive, true negative, false positive, and false negative predictions. The Matthews correlation coefficient of the present model is 0.905281.
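
The metrics reported above can be reproduced with scikit-learn as sketched below; weighted averaging over the classes is an assumption, since the text does not state which averaging mode was used.

```python
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             cohen_kappa_score, matthews_corrcoef)

def evaluation_summary(y_true, y_pred):
    """Compute the evaluation metrics reported above for multi-class predictions."""
    return {
        "precision": precision_score(y_true, y_pred, average="weighted"),
        "recall": recall_score(y_true, y_pred, average="weighted"),
        "f1": f1_score(y_true, y_pred, average="weighted"),
        "cohen_kappa": cohen_kappa_score(y_true, y_pred),
        "matthews_corrcoef": matthews_corrcoef(y_true, y_pred),
    }
```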

Based on the results of performance evaluation of the system by different metrics, it is clear that the overall performance of the system is highly accurate and precise. However, the decrease in testing accuracy after implementing early stopping suggests that the model may have overfit the training data, which should be taken into consideration when evaluating its performance.

The classification report, as indicated in Table 2, provides a summary of Precision, Recall, F1 score, and Support metrics for each class in the predicted data. It is, thus, a useful tool for evaluating the performance of a GRU model for classification tasks. The table provides a summary of Precision, Recall and F1-score metrics for each of the six respiratory conditions considered. The average accuracy of the system in detecting the respiratory conditions is 0.92, i.e., it is able to correctly classify the respiratory conditions with 92% accuracy.

Table 2: GRU (Kaggle) - Classification Report
Diagnosis Precision Recall F1-score
Bronchiectasis 1 0.89 0.94
Bronchiolitis 0.84 1 0.91
COPD 0.93 0.9 0.92
Healthy 0.89 0.91 0.9
Pneumonia 0.92 1 0.96
URTI 0.94 0.84 0.89
Accuracy 0.92
Macro Avg. 0.92 0.92 0.92
Weighted Avg. 0.92 0.92 0.92

DIAGNOSIS OF RESPIRATORY DISORDERS IN AMRITA INSTITUTE OF MEDICAL SCIENCES & RESEARCH CENTRE (AIMS):
The GRU model trained on the Amrita Institute of Medical Sciences & Research Centre (AIMS) dataset achieved a training accuracy of 99% and a testing accuracy of 93%. However, after applying early stopping, the model achieved a slightly lower training accuracy of 98% and a testing accuracy of 89%, indicating that the model might have overfit the training data.

For the GRU model in the AIMS dataset, the accuracy curve, as depicted in Fig. 16, shows a steep increase initially, indicating that the model is learning and improving quickly. The curve then levels off or continues to increase gradually as the model fine-tunes its performance. The fact that the Training Accuracy is 99% suggests that the model has learned the training data well and is likely to perform well on similar data.

The Loss curve, as depicted in Fig. 17, represents the change in the model’s loss or error during the training process. Ideally, the Loss curve should decrease steadily as the model learns and improves its performance. However, in the present system, the Loss curve shows an initial decrease followed by a plateau, indicating that the model is overfitting the data, i.e., no longer improving on the training data.
Table 3: GRU (AIMS) - Classification Report
Diagnosis Precision Recall F1-score
Bronchiectasis 0.83 0.86 0.84
Bronchiolitis 0.88 0.75 0.81
COPD 1 0.97 0.99
Healthy 0.87 0.96 0.92
Pneumonia 0.99 0.99 0.99
URTI 0.83 0.76 0.79
Wheezing 0.94 0.92 0.93
Accuracy 0.92
Macro Avg 0.91 0.89 0.9
Weighted Avg 0.92 0.92 0.92

The classification report of the model, as depicted in Table 3, reveals an overall high performance, with a Precision of 0.924830, Recall of 0.924188, and F1 score of 0.923444. The Cohen’s kappa coefficient of 0.907507 and Matthews correlation coefficient of 0.907887 further confirm the reliability of the model in terms of agreement between predicted and actual classifications.

Convolutional Neural Network (CNN):
The performance of a Convolutional Neural Network (CNN) model has been assessed on a Kaggle dataset, with respect to its training and testing accuracy. The results of assessment have been depicted in Table 4. The training accuracy of the CNN model is 95%, indicating that the model is able to correctly classify 95% of the training data. The testing accuracy of the model is 90%, which means that the model is able to correctly classify 90% of the testing data.

The classification report provides a detailed evaluation of the performance of a classification model on a given dataset. In the present system, the model used is a CNN (Convolutional Neural Network), and the dataset is a Kaggle dataset. The report includes metrics such as precision, recall, and F1-score for each class in the dataset.

Table 4: CNN (Kaggle) - Classification Report
Diagnosis Precision Recall F1-score
Bronchiectasis 1 0.67 0.8
Bronchiolitis 0 0 0
COPD 0.96 0.99 0.98
Healthy 0.5 0.29 0.36
Pneumonia 0.67 0.57 0.62
URTI 0.6 0.6 0.6
Accuracy 0.91
Macro Avg. 0.62 0.52 0.56
Weighted Avg. 0.91 0.91 0.91

As indicated in the table above, the precision of the model is highest for the Bronchiectasis class, with a value of 1.0, indicating that all positive predictions for this class were correct. However, the recall for this class is 0.67, which means that the model missed identifying one-third of the actual instances of Bronchiectasis.

Further, the Bronchiolitis class has a precision and recall of 0.0, which means that the model did not correctly identify any instances of this class. The COPD class has high precision and recall values of 0.96 and 0.99, respectively, indicating that the model performed satisfactorily in identifying instances of this class.

The Healthy class has a relatively low precision and recall, indicating that the model did not perform well in identifying instances of this class. The Pneumonia class has a precision of 0.67 and a recall of 0.57, indicating that about two-thirds of the model’s positive predictions for Pneumonia were correct, while it identified 57% of the actual instances. The URTI class has a precision of 0.6 and a recall of 0.6, indicating that the model performed reasonably well in identifying instances of this class.

Furthermore, the model achieved a Receiver Operating Characteristic (ROC) AUC of 0.977231, which is a measure of the model’s ability to distinguish between positive and negative instances. This indicates that the model has a strong performance in correctly identifying the classes within the AIMS dataset. The ROC Curve is a graphical representation of the performance of a binary classification model at different classification thresholds. It is created by plotting the true positive rate (TPR) against the false positive rate (FPR) at various threshold settings.

The area under the ROC curve (AUC) is a measure of how well the model is able to distinguish between positive and negative instances. An AUC of 1.0 represents a perfect classifier, while an AUC of 0.5 indicates a classifier that is no better than random guessing. The ROC curve of the system has been depicted in Fig. 18. The AUC values for all classes except Bronchiolitis and Healthy are very high, indicating that the model is able to distinguish between positive and negative instances with a high degree of accuracy. The AUC values for Bronchiolitis and Healthy are slightly lower, which suggests that the model may have some difficulty distinguishing between these classes.
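
A sketch of how the per-class (one-vs-rest) ROC AUC values can be computed with scikit-learn from the model’s predicted class probabilities; the function and variable names are illustrative.

```python
from sklearn.metrics import roc_curve, auc
from sklearn.preprocessing import label_binarize

def per_class_auc(y_true, y_score, class_names):
    """One-vs-rest ROC AUC for each respiratory class.

    `y_score` is the matrix of predicted class probabilities from the model
    (shape: n_samples x n_classes); the column order must match `class_names`.
    """
    y_bin = label_binarize(y_true, classes=list(range(len(class_names))))
    results = {}
    for i, name in enumerate(class_names):
        fpr, tpr, _ = roc_curve(y_bin[:, i], y_score[:, i])  # TPR vs FPR
        results[name] = auc(fpr, tpr)                        # area under the curve
    return results
```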

ACCURACY COMPARISON:
Table 5 compares the accuracy of the GRU and CNN models on the Kaggle and AIMS datasets. The GRU model trained on the AIMS dataset shows a strong performance, achieving high accuracy and precision scores, a strong Cohen’s kappa coefficient, Matthews correlation coefficient, and an exceptional ROC AUC value. This highlights the potential of GRU models in accurately classifying medical datasets, providing a reliable tool for healthcare professionals in their diagnostic and treatment processes.
Table 5: Accuracy Comparison of Kaggle & AIMS Datasets
Dataset GRU CNN
Kaggle 92% ± 1% 90% ± 0.5%
AIMS 93% ± 1% 90% ± 1%

Overall, the real-time auscultation sound diagnostic system, as disclosed in the present invention, has the potential to significantly improve the diagnosis and treatment of respiratory illnesses, ultimately improving patient outcomes and quality of life.
Claims:
1. A continuous, remote, real-time monitoring system (S) with automated sound analysis, utilizing a deep learning module (DLM) for analyzing and categorizing respiratory sounds, comprising a hardware architecture (HA), a software architecture (SA) and a User Interface (UI), wherein:
said hardware architecture (HA) comprises:
- electronic stethoscope for capturing sound data,
- Raspberry Pi for analysing sound data captured by the stethoscope,
- mobile device for controlling the said Raspberry Pi and displaying the diagnostic report,
said software architecture (SA) comprises:
- sound processing module (SPM), incorporating an audio recording software (ARS), for live-recording respiratory sound data,
- deep learning module (DLM), utilizing advanced deep-learning algorithms powered by artificial intelligence (AI) comprising Gated Recurrent Units (GRUs), Leaky Rectified Linear Units (LeakyReLU) and Convolutional Neural Networks (CNNs), to analyze and diagnose respiratory conditions and generate a comprehensive diagnostic report (CDR), and
- web platform module (WPM) for automatic diagnosis of lung sounds,
said user interface (UI) being connected to the said web platform module (WPM) providing users with an interface for accessing the system’s capabilities for uploading and analyzing lung data,
characterized in that:
- prompt notification of the said lung diagnosis is provided to the user via email along with detailed information on the lung health status, and
- said diagnosis report is shared with a remotely located medical professional upon the user’s request.
2. The system (S) as claimed in claim 1, wherein deep learning module (DLM) uses a Gated Recurrent Unit (GRU) and Leaky Rectified Linear Unit (LeakyReLU) to classify the respiratory sounds into groups based on the respiratory condition.
3. The system (S) as claimed in claim 1, wherein GRU architecture is composed of a set of recurrent neural network (RNN) units that are gated to selectively allow the flow of information.
4. The system (S) as claimed in claim 1, wherein the said recorded lung sounds are classified in groups including: Healthy, Bronchiectasis, Bronchiolitis, COPD, Asthma, Pneumonia, and URTI.
5. The system (S) as claimed in claim 1, wherein the respiratory sound data is uploaded to the web platform (WP) for analysis using deep learning algorithms (DLA).
6. The system (S) as claimed in claim 1, wherein the web platform generates a diagnostic report which is transmitted to the user via email.
7. The system (S) as claimed in claim 1, wherein the diagnostic report is shared with a remotely located healthcare provider upon user’s request.
8. The system (S) as claimed in claim 1, wherein the accuracy of detection of respiratory conditions using GRU is approximately 92%.
9. The system (S) as claimed in claim 1, wherein the accuracy of detection of respiratory conditions using CNN is 91%.
10. The system (S) as claimed in claim 1, wherein the overall accuracy of detection of respiratory conditions using the said system (S) is approximately 93%.
11. A method to record, analyze and categorize respiratory sounds by the system (S) as claimed in claim 1, comprising the steps:
a) capturing respiratory sounds by electronic stethoscope,
b) uploading the recorded sound data in the Raspberry Pi, pre-processing the recorded sound data, extracting sound features for analysis,
c) analyzing and classifying said extracted respiratory sounds by the web platform (WP), using deep-learning algorithms (DLA) powered by artificial intelligence (AI) comprising Gated Recurrent Units (GRUs), Leaky Rectified Linear Units (LeakyReLU) and Convolutional Neural Networks (CNNs),
d) generating lung diagnostic report by the web platform (WP), and transmitting the said diagnostic report to the user by email,
e) sharing said lung diagnosis report with a remotely located health care provider upon user’s request.

Documents

Application Documents

# Name Date
1 202341045291-STATEMENT OF UNDERTAKING (FORM 3) [06-07-2023(online)].pdf 2023-07-06
2 202341045291-FORM FOR SMALL ENTITY(FORM-28) [06-07-2023(online)].pdf 2023-07-06
3 202341045291-FORM 1 [06-07-2023(online)].pdf 2023-07-06
4 202341045291-FIGURE OF ABSTRACT [06-07-2023(online)].pdf 2023-07-06
5 202341045291-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [06-07-2023(online)].pdf 2023-07-06
6 202341045291-EDUCATIONAL INSTITUTION(S) [06-07-2023(online)].pdf 2023-07-06
7 202341045291-DRAWINGS [06-07-2023(online)].pdf 2023-07-06
8 202341045291-DECLARATION OF INVENTORSHIP (FORM 5) [06-07-2023(online)].pdf 2023-07-06
9 202341045291-COMPLETE SPECIFICATION [06-07-2023(online)].pdf 2023-07-06
10 202341045291-FORM-9 [10-07-2023(online)].pdf 2023-07-10
11 202341045291-FORM 18 [10-07-2023(online)].pdf 2023-07-10
12 202341045291-Proof of Right [28-07-2023(online)].pdf 2023-07-28
13 202341045291-ENDORSEMENT BY INVENTORS [28-07-2023(online)].pdf 2023-07-28
14 202341045291-FORM-26 [08-08-2023(online)].pdf 2023-08-08
15 202341045291-FER.pdf 2025-01-30
16 202341045291-OTHERS [28-07-2025(online)].pdf 2025-07-28
17 202341045291-FER_SER_REPLY [28-07-2025(online)].pdf 2025-07-28
18 202341045291-DRAWING [28-07-2025(online)].pdf 2025-07-28

Search Strategy

1 202341045291_SearchStrategyNew_E_SS35E_29-01-2025.pdf