Abstract: In the medical industry, as in any other field, technological innovations rapidly affect all aspects of life. With its decision-making process based on data analysis, artificial intelligence has demonstrated encouraging outcomes in health care. In a short period of time, COVID-19 has affected more than 100 countries, and people on every continent are susceptible to its long-term effects. A system for detecting the coronavirus is therefore essential. Detecting diseases with the aid of artificial intelligence (AI) programmes may be one way of bringing the current situation under control. This invention uses classical and ensemble machine-learning methods to classify textual medical reports into four classes. Feature engineering uses techniques such as term frequency/inverse document frequency (TF/IDF), bag of words (BOW), and report length. These features are supplied to conventional and ensemble machine learning classifiers. With 96.2% testing accuracy, logistic regression and multinomial naive Bayes outperformed the other machine learning techniques. 5 Claims & 1 Figure
Description: Field of Invention
The present invention relates to a system and method for identifying coronavirus by analyzing textual data with machine-learning-based approaches.
The Objectives of this Invention
The primary goal of this invention is to develop a system that can identify the coronavirus. Detecting diseases with various AI tools may be one way of bringing the current situation under control. The invention also aims to implement the framework, including its classification step, with significantly greater accuracy.
Background of the Invention
SARS-CoV-2, also known as the novel coronavirus, first appeared in China and quickly spread over the world. The resulting disease was designated COVID-19 by the World Health Organization (WHO). On March 11, 2020, COVID-19 was declared a pandemic. Fever, cough, tiredness, and myalgia are among the early signs of COVID-19. Shortness of breath, pneumonia, severe acute respiratory illness, cardiac issues, and even death may occur in more serious cases. It is critical to identify who is affected as soon as possible so that quarantine and contact tracing can be used to keep the disease from spreading further. Governments all across the world have issued social distancing and self-isolation orders in reaction to COVID-19. WO2021/231044A1 discloses a machine-learning-based coronavirus detection system. In that system, one or more processors interact with a number of wearable medical sensors (WMSs). The processors are configured to accept physiological data from the WMSs as well as questionnaire data from a user interface. The processors are also configured to construct at least one coronavirus inference model by training at least one neural network on raw physiological data and questionnaire data, supplemented with synthetic data and subjected to a grow-and-prune methodology.
Another application, CN2020/111834010A, provides a COVID-19 false negative detection technique based on attribute reduction and XGBoost, which includes the following steps: S1, obtaining data from COVID-19 case samples, preprocessing, and improving the data; S2, attribute reduction, data dimensionality reduction, and splitting the sample data into training and test sets; S3, importance screening of the core COVID-19 detection indexes using the XGBoost scalable tree boosting system; S4, using the data in the training set to train the XGBoost evaluation method and construct an evaluation system; and S5, using the evaluation algorithm to assess the case data. US2020/10689716B1 discusses test kits and methods for identifying the presence of coronavirus polynucleotides in biological samples. The kit of the invention in TR2020/06563A2 consists of novel primer and probe sets that support all of these processes, as well as customized enzymatic activity and antagonist combinations.
Natural language processing (NLP) has recently attracted considerable interest, particularly in the area of text analysis. One of the key tasks in text analysis is classification, which may be carried out using a variety of methods (Wu et al. [2020], Nature, Vol. 579, pp. 265-269). Kumar et al. [2018], International Journal of Information Technology, 12, pp. 1159–1169, conducted a SWOT analysis of various supervised and unsupervised text classification techniques for mining unstructured data. Text classification has several uses, including sentiment analysis, fraud detection, and spam detection. The main applications of sentiment mining are in business, marketing, and elections. With the aid of lexicon-based dictionaries, Verma et al. (2019), International Journal of Recent Technology and Engineering (IJRTE), Vol. 8, No. 3, pp. 8338-8341, evaluated opinions about Indian government programmes. Machine learning has changed the way diagnosis is approached by proving effective for conditions such as diabetes and seizures.
Bullock et al. (2020), Journal of Artificial Intelligence Research, Vol. 69, pp. 807-845, claim that deep learning and machine learning technologies can replace humans by providing an accurate diagnosis. Such a diagnosis can be more economical than routine COVID-19 examinations and can save radiologists' time. X-rays and computed tomography (CT) scans can be used to train a machine learning model. Several projects are in progress in this area.
Chinnasamy et al. (2022), Materials Today: Proceedings, Volume 64, Part 1, Pages 448-451, present a sentiment analysis using data from Twitter. Their program first retrieves tweets and hashtags about various types of COVID vaccines posted on Twitter using the Twitter API. The imported tweets are then automatically set up to produce a set of variables that are random and uninformed rules. A clinician's workload can be reduced by a diagnostic decision-support tool that helps analyse patients' lung scans. To this end, K. S. Prasad et al. (2022), 2022 International Conference on Computer Communication and Informatics (ICCCI), Coimbatore, India, 2022, pp. 1-5, built machine learning models, notably the Convolutional Neural Network (CNN) VGG16 model.
Summary of the Invention
The current invention primarily addresses and resolves technological issues that existed in the prior art. To address these issues, the invention uses machine-learning-based approaches to detect COVID-19 from clinical text data. The invention comprises several modules that analyze the data and predict COVID-19 by applying machine learning algorithms.
Detailed Description of the Invention
The coronavirus pandemic was declared a global emergency by the WHO. Information on this worldwide epidemic is freely available from researchers and institutions. We gathered the data from GitHub, a publicly accessible data resource. The data cover 212 patients showing characteristics of coronavirus and other viruses and have roughly 24 attributes, including license, clinical notes, and other notes. Because our work involves text mining, we extracted the clinical notes and the findings. The clinical notes consist of text, while the findings attribute provides the label for the corresponding text. The length of each of the 212 reports was computed. Because the text was unstructured, it had to be refined before machine learning techniques could be applied. In this phase, several steps clean the text by removing extraneous material. Lemmatization and case normalization are used to refine the data further. Stop words, symbols, URLs, and links are deleted to improve classification accuracy. Various features are then extracted from the preprocessed clinical reports according to their semantics and converted to probabilistic values. The TF/IDF approach is used to extract relevant features. The bag of words was also considered, and unigrams and bigrams were extracted. We identified 40 relevant features that can be used for classification. These features, with their corresponding weights, are supplied as input to the machine learning algorithms.
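The cleaning and feature-extraction steps described above can be sketched as follows. This is a minimal illustration using only the Python standard library, not the implementation of the invention: the stop-word list, the function names, and the two sample reports are hypothetical, the IDF formula log(N/df) is one common variant that the description does not specify, and lemmatization is omitted because it would require an external library.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real system would use a fuller one).
STOP_WORDS = {"a", "an", "and", "the", "of", "with", "for", "to", "in", "is", "was", "on"}


def clean(text):
    """Lowercase, strip URLs and symbols, drop stop words, and return tokens."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text.lower())
    text = re.sub(r"[^a-z\s]", " ", text)
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def ngrams(tokens):
    """Return unigrams followed by bigrams (adjacent tokens joined by a space)."""
    return tokens + [" ".join(pair) for pair in zip(tokens, tokens[1:])]


def tfidf(documents):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df)."""
    term_lists = [ngrams(clean(doc)) for doc in documents]
    n_docs = len(term_lists)
    # Document frequency: in how many documents each term occurs at least once.
    doc_freq = Counter(term for terms in term_lists for term in set(terms))
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1
        vectors.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return vectors


def report_features(documents):
    """Pair each TF/IDF vector with the report length in cleaned tokens."""
    return [(vec, len(clean(doc))) for vec, doc in zip(tfidf(documents), documents)]


# Made-up sample reports, for illustration only.
reports = [
    "Patient with fever and dry cough, see https://example.org/case1",
    "Patient with fever and chest pain on admission",
]
features = report_features(reports)
```

Terms that occur in every report, such as "fever" in the sample above, receive a weight of zero under this IDF variant, so the remaining weight falls on terms that distinguish one report from another.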
Text and sentiment analysis can be used as part of a working model to identify COVID-19 from textual data. A high-level explanation of how such a model might work follows:
Data collection: The first step is to gather a dataset of textual data related to COVID-19, such as social media posts, news articles, research papers, and medical reports. This dataset should include a mix of positive and negative cases, including both confirmed COVID-19 cases and other respiratory illnesses.
Data preprocessing: The collected textual data needs to be preprocessed to remove noise and standardize the format. This preprocessing may involve steps such as removing irrelevant characters or symbols, converting text to lowercase, removing stopwords, and performing tokenization (splitting the text into individual words or tokens).
Feature extraction: Next, relevant features need to be extracted from the preprocessed text data. In the case of text analysis, common techniques include bag-of-words (BoW), term frequency-inverse document frequency (TF-IDF), and word embeddings (e.g., Word2Vec or GloVe). These techniques capture the semantic meaning and context of the words in the text.
Sentiment analysis: Once the features are extracted, sentiment analysis techniques can be applied to determine the sentiment expressed in the text. Sentiment analysis aims to classify the sentiment as positive, negative, or neutral. This step can be crucial in identifying the emotional tone associated with the disease, such as anxiety, fear, or relief, expressed by individuals in the text data.
Classification model: The preprocessed and feature-extracted data, along with sentiment labels, can be used to train a classification model. Various machine learning algorithms can be employed for this purpose, such as decision trees, random forests, support vector machines (SVM), or more advanced deep learning models like recurrent neural networks (RNNs) or transformers.
Model training and evaluation: The dataset is split into training and testing sets. The training set is used to train the classification model, which learns the patterns and relationships between the textual features and sentiment labels. The testing set is then used to evaluate the model's performance by measuring metrics like accuracy, precision, recall, and F1-score.
Prediction: Once the model is trained and evaluated, it can be used to predict the sentiment and classify new, unseen text data. The model will assign a sentiment label to the text, indicating whether it is related to a COVID-19 case or not.
Refinement and improvement: The model can be continuously refined and improved by incorporating new data and fine-tuning the model parameters. This iterative process helps to enhance the model's performance and adapt to evolving patterns in the text data.
Classification module: With the help of this module, we are able to run all the traditional techniques on the feature information and report the accuracy, precision, recall, and F-score for each algorithm. These four values are displayed for each algorithm in a grouped bar graph, with the algorithm name on the x-axis and the values on the y-axis.
It is important to note that while text and sentiment analysis can provide valuable insights, they should not be considered the sole diagnostic tool for identifying COVID-19. These techniques can complement other medical and diagnostic approaches, such as laboratory tests and clinical evaluations, for more accurate disease identification.
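The training, prediction, and evaluation steps above can be sketched with a small multinomial naive Bayes classifier, one of the two techniques the abstract reports as performing best. This is a hedged illustration rather than the classifier of the invention: the function names, the Laplace smoothing parameter `alpha`, the class labels `COVID` and `OTHER`, and the toy token lists are all assumptions, and the sketch works on raw token counts instead of TF/IDF weights.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit multinomial naive Bayes on tokenized docs with Laplace smoothing alpha."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {tok for tokens in docs for tok in tokens}
    model = {}
    for label, n in class_counts.items():
        total = sum(word_counts[label].values()) + alpha * len(vocab)
        log_prior = math.log(n / len(labels))
        log_like = {w: math.log((word_counts[label][w] + alpha) / total) for w in vocab}
        model[label] = (log_prior, log_like)
    return model


def predict_nb(model, tokens):
    """Return the class with the highest log posterior; unseen words are ignored."""
    def score(label):
        log_prior, log_like = model[label]
        return log_prior + sum(log_like[t] for t in tokens if t in log_like)
    return max(model, key=score)


def evaluate(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    k = len(classes)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {"accuracy": accuracy, "precision": sum(precisions) / k,
            "recall": sum(recalls) / k, "f1": sum(f1s) / k}


# Toy, made-up token lists standing in for cleaned reports (not the invention's data).
train_docs = [["fever", "cough", "pneumonia"], ["fever", "dry", "cough"],
              ["chest", "pain", "fracture"], ["knee", "pain", "swelling"]]
train_labels = ["COVID", "COVID", "OTHER", "OTHER"]
test_docs = [["dry", "cough", "fever"], ["knee", "fracture"]]
test_labels = ["COVID", "OTHER"]

model = train_nb(train_docs, train_labels)
predictions = [predict_nb(model, doc) for doc in test_docs]
metrics = evaluate(test_labels, predictions)
```

Running the same `evaluate` function over the predictions of each classifier would give the per-algorithm accuracy, precision, recall, and F-score values that the classification module plots in its grouped bar graph.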
The detection and analysis of COVID-19 in the healthcare sector using various techniques and technologies offer several advantages. Here are some key benefits:
Early detection: Efficient COVID-19 detection and analysis methods allow healthcare professionals to identify cases early, even before individuals exhibit severe symptoms. This early detection enables timely intervention, isolation, and appropriate treatment, reducing the spread of the virus and potentially saving lives.
Informed decision-making: Data-driven COVID-19 detection and analysis provide valuable insights into the epidemiological patterns, transmission rates, and severity of the disease. These insights enable healthcare authorities to make informed decisions regarding public health measures, resource allocation, and policy implementations, helping to mitigate the impact of the virus.
Improved treatment strategies: Analysis of COVID-19 data, including patient symptoms, comorbidities, and treatment outcomes, helps researchers and healthcare professionals develop and refine treatment strategies. By understanding the factors that influence disease progression and treatment response, healthcare providers can tailor interventions and improve patient outcomes.
Monitoring and surveillance: COVID-19 detection and analysis play a crucial role in monitoring the spread of the virus and assessing its impact on public health. Real-time data analysis allows authorities to identify hotspots, emerging variants, and potential outbreaks, enabling them to implement targeted interventions and preventive measures.
Remote monitoring and telemedicine: With the help of COVID-19 detection and analysis tools, remote monitoring and telemedicine have become increasingly viable options. Patients can provide health-related information, such as symptoms and vital signs, remotely, reducing the need for physical visits to healthcare facilities. This approach not only minimizes exposure risks but also allows healthcare providers to monitor and assess patients' conditions more efficiently.
Research and development: The analysis of COVID-19 data contributes to ongoing research and development efforts. By examining large-scale datasets, researchers can gain insights into the virus's behavior, its impact on different population groups, and the effectiveness of various interventions. This knowledge aids in the development of vaccines, therapies, and public health strategies to combat the pandemic effectively.
Overall, COVID-19 detection and analysis in the healthcare sector offer significant advantages by enabling early detection, informed decision-making, resource optimization, and improved patient care. These approaches play a vital role in mitigating the spread of the virus, managing the impact on healthcare systems, and ultimately saving lives.
5 Claims & 1 Figure
Brief description of Drawing
The figure illustrates an exemplary embodiment of the invention.
Figure 1: The process of identifying COVID-19 using machine learning algorithms.
Claims: The scope of the invention is defined by the following claims:
Claim:
1. A system/method for detecting COVID-19 using a machine learning algorithm, said system/method comprising the steps of:
a) The system starts with patient data (1), from which data collection (2) is performed.
b) Dataset creation (3) then begins with all the information about the patient, after which data preprocessing (4) removes the unwanted attributes from the dataset.
c) Feature engineering (5) is used to extract various features as required.
d) Classification (6) applies the machine learning algorithms to predict the COVID-19 variant (7).
2. As mentioned in claim 1, the data collection process is used to collect the information relating to the patient to make a new dataset.
3. According to claim 1, the data preprocessing is used to remove all the unnecessary data from the dataset.
4. As per claim 1, the feature engineering process is used to extract the features from the dataset and calculate the probabilistic values/scores.
5. As per claim 1, the machine learning process is used to apply the different classification algorithms to predict COVID-19 as well as to improve the accuracy of prediction.
| # | Name | Date |
|---|---|---|
| 1 | 202341077576-REQUEST FOR EARLY PUBLICATION(FORM-9) [15-11-2023(online)].pdf | 2023-11-15 |
| 2 | 202341077576-FORM-9 [15-11-2023(online)].pdf | 2023-11-15 |
| 3 | 202341077576-FORM FOR STARTUP [15-11-2023(online)].pdf | 2023-11-15 |
| 4 | 202341077576-FORM FOR SMALL ENTITY(FORM-28) [15-11-2023(online)].pdf | 2023-11-15 |
| 5 | 202341077576-FORM 1 [15-11-2023(online)].pdf | 2023-11-15 |
| 6 | 202341077576-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [15-11-2023(online)].pdf | 2023-11-15 |
| 7 | 202341077576-EVIDENCE FOR REGISTRATION UNDER SSI [15-11-2023(online)].pdf | 2023-11-15 |
| 8 | 202341077576-DRAWINGS [15-11-2023(online)].pdf | 2023-11-15 |
| 9 | 202341077576-COMPLETE SPECIFICATION [15-11-2023(online)].pdf | 2023-11-15 |