Abstract: SENTIMENT ANALYSIS OF JOURNALS TO MEASURE MEDITATION EFFECTS ON EMOTIONAL WELL-BEING The present invention relates to a computer-implemented method for analyzing the sentiments expressed in journal entries to evaluate the effects of meditation on emotional well-being. The method comprises acquiring a dataset of textual journal entries related to meditation practices, followed by data preprocessing that includes cleaning missing values through mean and mode imputation, removing stop words, punctuation, and special characters, and applying tokenization. The preprocessed text is transformed into numerical features using techniques such as term frequency-inverse document frequency (TF-IDF) and word embeddings like Word2Vec or GloVe. The dataset is split into training and testing sets in an 80:20 ratio to ensure balanced sentiment distribution. Various machine learning models, including Support Vector Machine (SVM), Random Forest, Naïve Bayes, and Convolutional Neural Network (CNN), are trained with optimized hyperparameters using grid search or random search techniques. Model performance is evaluated using accuracy, precision, recall, F1-score, and confusion matrix metrics to ensure effective sentiment classification. The invention provides a robust framework for assessing the emotional impact of meditation using natural language processing and machine learning techniques.
Description: FIELD OF THE INVENTION
This invention relates to Sentiment Analysis of Journals to Measure Meditation Effects on Emotional Well-being.
BACKGROUND OF THE INVENTION
Mental health refers to an individual's emotional, psychological, and social well-being, influencing thoughts, emotions, and behaviors. It plays a critical role in shaping how individuals manage stress, maintain relationships, and make decisions in their daily lives. Optimal mental health is essential for a fulfilling and productive life, contributing to an individual's ability to thrive in both personal and professional domains. However, mental illness is extremely common, broadly encompassing conditions that disrupt thought, feeling, and behavior. These conditions cause considerable distress and affect people of all ages, backgrounds, and walks of life, impairing their ability to function effectively. Sentiment analysis is a technique that provides a computational means of measuring the emotional tone, or sentiment, of written text.
It provides insights into an individual's emotional state, attitudes, or opinions through textual data analysis, making it a very useful tool in various fields, such as mental health research. This process is more than simple textual analysis because it reveals underlying emotional patterns that can indicate psychological well-being. Its utility extends to analyzing personal journals, social media posts, or feedback forms, offering an objective lens to measure emotional trends over time. There is much potential for the application of sentiment analysis in mental health research. Using advanced algorithms, sentiments can be classified as positive, negative, or neutral and help researchers gain an in-depth understanding of emotional patterns.
Machine learning techniques have revolutionized sentiment analysis. Automatic classification of vast amounts of textual data is now possible, which reduces manual effort while enhancing the precision and scalability of sentiment evaluations. The technique's adaptability makes it suitable for studying individual behaviors, public opinions, and the effects of interventions such as meditation. A wide array of machine learning algorithms powers sentiment analysis, including Support Vector Machines (SVM), Logistic Regression, Naïve Bayes, Random Forest, Convolutional Neural Networks (CNNs), and fine-tuning or inference with transformer models such as XLNet and BERT. Each algorithm brings unique strengths, from handling linear classifications to identifying intricate patterns within text. The integration of these models into sentiment analysis frameworks has advanced the field's ability to detect subtler sentiments more accurately.
SUMMARY OF THE INVENTION
This summary is provided to introduce a selection of concepts, in a simplified format, that are further described in the detailed description of the invention.
This summary is neither intended to identify key or essential inventive concepts of the invention, nor is it intended to determine the scope of the invention.
To further clarify advantages and features of the present invention, a more particular description of the invention will be rendered by reference to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail with the accompanying drawings.
This developing capacity underlines sentiment analysis as a transformative tool, particularly in understanding and addressing mental health challenges through innovative data-driven approaches.
I. Literature Review:
The science of sentiment analysis has come to be a benchmark in understanding emotional states and trends in mental health. Since the past decades, several techniques of sentiment analysis have been employed by researchers to analyze text data, which ranges from public opinion to customer feedback and even mental health assessment. This section provides an overview of the existing literature on the application of sentiment analysis in mental health research, with a particular focus on the methodologies and models employed in this study.
eXtreme Language Model (XLNet):
XLNet is an advanced transformer-based language model for NLP. By modeling the relationships between words across many permutations of the input sequence, it captures a broad range of contextual relationships and complex, long-range dependencies in text. Such capabilities are especially useful for sentiment analysis of journals to measure meditation effects on emotional well-being, as meditation-related journal entries can be long and often contain subtle emotional shifts. Applied to this project, XLNet can be fine-tuned on a dataset of journal entries, enabling it to classify emotions as positive, negative, or neutral and to accurately monitor changes in emotional well-being over time, particularly before and after meditation practices. Its flexibility allows it to capture the variability of sentence structures and expressions present in a typical personal narrative, thereby detecting delicate emotional transitions in meditation that would otherwise be missed. Because XLNet is pre-trained on a very large language corpus, it generalizes well even when fine-tuned on smaller domain-specific datasets. This makes it especially useful for analyzing personal journals, where the data may not be as extensive as other sources. In summary, its powerful contextual modeling and large-scale pre-training make XLNet a suitable and effective model for understanding how meditation influences emotional well-being through sentiment analysis of journal entries.
Bidirectional Encoder Representations from Transformers (BERT):
BERT (Bidirectional Encoder Representations from Transformers) is a transformer-based model that understands the context of words in a sentence by considering both the left and right surrounding words, making it very suitable for tasks where context plays a crucial role, such as sentiment analysis. In this project, BERT may be fine-tuned on a dataset of personal journal entries, capturing the emotional tone of the text and categorizing it as positive, negative, or neutral. Since BERT is bidirectional, it is able to understand the full context of a journal entry, meaning that sentiments will be well identified even when the language used is nuanced or indirect.
The pretraining of BERT on massive amounts of text data (like Wikipedia and BooksCorpus) enables it to generalize very well, even when fine-tuned on smaller specific datasets like meditation journals. After fine-tuning, BERT can be utilized to measure emotional shifts within journal entries before and after meditation sessions to provide insight into how meditation affects mental and emotional well-being over time. Given its ability to understand complex relationships between words, BERT is well suited to identify subtle emotional changes in text, which is critical for tracking the effects of meditation on mental health. This makes BERT a powerful tool in sentiment analysis for understanding and improving emotional well-being.
Support Vector Machines (SVM):
SVM is a supervised learning algorithm mainly used for classification tasks; it works by finding the hyperplane that best separates the data into different classes. In the context of sentiment analysis, SVM can be trained to categorize journal entries into sentiment labels such as positive, negative, or neutral based on the content of the text. SVM's strength is its ability to work with high-dimensional data and to determine good decision boundaries even when the data are not linearly separable. Journal entries are rich in emotional content, so they can be preprocessed using techniques such as tokenization, stop-word removal, and feature extraction with TF-IDF or word embeddings. Once features have been extracted, SVM can be trained on the labeled data to classify sentiment. By using SVM to analyze journals before and after meditation, shifts in sentiment and changes in emotional tone can be identified over time. SVM's ability to handle complex and varied textual data is well suited to understanding nuanced emotional changes in meditation journals. Thus, this approach can provide useful insights into how meditation influences emotional states by quantitatively measuring well-being over time.
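The SVM approach described above can be sketched as follows. This is a minimal illustrative pipeline, not the actual system: the journal sentences and labels are invented for demonstration, and scikit-learn's `TfidfVectorizer` and `SVC` stand in for the described feature extraction and classifier.

```python
# Hypothetical sketch: TF-IDF features feeding a linear-kernel SVM.
# The example entries and labels are invented for illustration only.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy journal-style entries with assumed sentiment labels.
entries = [
    "Today I felt calm and grateful after meditating",
    "I was anxious and could not focus all day",
    "The breathing session left me peaceful and rested",
    "Stress and worry kept me awake again",
]
labels = ["positive", "negative", "positive", "negative"]

# TF-IDF turns each entry into a sparse numerical vector; the SVM then
# finds the separating hyperplane between the sentiment classes.
clf = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("svm", SVC(kernel="linear", C=1.0)),
])
clf.fit(entries, labels)

print(clf.predict(["I feel peaceful and calm after the session"]))
```

In a real deployment the pipeline would be fitted on the full labeled journal corpus, with `C` (and, for an RBF kernel, `gamma`) chosen by the tuning procedure described later.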
Random Forest:
Random Forest can be trained to classify journal entries as positive, negative, or neutral by learning from features extracted from the text, such as word frequencies or word embeddings. It can analyze journal entries before and after meditation and identify patterns and shifts in emotional tone, allowing changes in emotional well-being to be measured. The model can also handle high-dimensional data, which makes it suitable for text analysis, where features such as word occurrence or sentiment-related terms are used. It is also resistant to noise, which is common in personal journal entries. Using Random Forest, this approach can effectively track how meditation affects emotional states, providing valuable insight into the long-term impact of meditation on mental health.
Naive Bayes:
Naïve Bayes can be trained on a labeled dataset of journal entries categorized according to their emotional tone. The model learns the relationship between specific words or phrases in the text and the corresponding sentiment. Once trained, Naïve Bayes can classify new journal entries, giving insight into how meditation impacts emotional well-being over time. This classifier has proved quite successful on text data, where feature spaces can be very large, and its simplicity contributes much to its success. Journal entries can be represented as numerical vectors using feature extraction techniques such as TF-IDF or bag-of-words, and those vectors are used to train the model. Thus, this approach can track emotional changes in journal entries before and after meditation, providing psychological insight into the effect of meditation.
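The bag-of-words plus Naïve Bayes pairing described above can be sketched in a few lines. This is a hedged illustration with invented data: `CountVectorizer` provides the bag-of-words counts, and `MultinomialNB`'s `alpha` parameter is the Laplace/additive smoothing term mentioned later under hyperparameter tuning.

```python
# Hedged sketch: Multinomial Naive Bayes over bag-of-words counts.
# The entries and labels below are invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

entries = [
    "grateful calm peaceful after meditation",
    "angry stressed tired and restless",
    "happy relaxed and content this evening",
    "sad worried and overwhelmed today",
]
labels = ["positive", "negative", "positive", "negative"]

vec = CountVectorizer()
X = vec.fit_transform(entries)   # bag-of-words count matrix
nb = MultinomialNB(alpha=1.0)    # alpha=1.0 corresponds to Laplace smoothing
nb.fit(X, labels)

print(nb.predict(vec.transform(["calm and content after meditation"])))
```

Because each word contributes an independent likelihood term, the classifier remains fast even when the vocabulary (and hence the feature space) grows very large.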
Convolutional Neural Networks (CNNs):
CNNs can be used to classify journal entries into positive, negative, or neutral categories. The text data are preprocessed and represented as word embeddings, such as Word2Vec, capturing the semantic meaning of words. The CNN applies convolutional layers to extract relevant features from the text and classifies it based on the emotional tone detected. CNNs are particularly effective at capturing hierarchical patterns in text, such as relationships between words and phrases, which makes them suitable for analyzing journal entries with different structures and emotional subtleties. Applied to sentiment analysis, this model can track emotional shifts in journals before and after meditation, providing insight into how meditation influences emotional well-being and a detailed understanding of its emotional impact over time.
DETAILED DESCRIPTION OF THE INVENTION
The detailed description of various exemplary embodiments of the disclosure is described herein with reference to the accompanying drawings. It should be noted that the embodiments are described herein in such details as to clearly communicate the disclosure. However, the amount of details provided herein is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the present disclosure as defined by the appended claims.
It is also to be understood that various arrangements may be devised that, although not explicitly described or shown herein, embody the principles of the present disclosure. Moreover, all statements herein reciting principles, aspects, and embodiments of the present disclosure, as well as specific examples, are intended to encompass equivalents thereof.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including,” when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
In addition, the descriptions of "first", "second", “third”, and the like in the present invention are used for the purpose of description only, and are not to be construed as indicating or implying their relative importance or implicitly indicating the number of technical features indicated. Thus, features defining "first" and "second" may include at least one of the features, either explicitly or implicitly.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
Data Preprocessing:
The dataset, which is composed of journal entries on meditation and emotional well-being, has undergone the following preprocessing steps:
1. Data Cleaning: Missing values were handled using mean imputation for continuous features (such as word count or sentiment scores) and mode imputation for categorical features (such as sentiment labels).
2. Text Preprocessing: The text data was cleaned by removing stop words, punctuation, and special characters. Tokenization was used to split the journal entries into individual words or phrases.
3. Feature Extraction: Term frequency-inverse document frequency (TF-IDF) and word embeddings, for example, Word2Vec or GloVe, were applied to transform the text data into numerical vectors that can be processed by machine learning algorithms.
4. Splitting Data into Training and Testing Sets: The dataset was split 80:20 so that the positive, negative, and neutral sentiments were appropriately distributed across both the training and testing sets.
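Steps 1 and 4 above can be sketched as follows. This is an illustrative example on an invented miniature dataset: the `word_count` and `sentiment` columns stand in for the continuous and categorical features described, and `stratify` implements the balanced 80:20 split.

```python
# Illustrative sketch: mean/mode imputation, then a stratified 80:20 split.
# The data frame below is invented to mirror the described feature types.
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.DataFrame({
    "word_count": [120, None, 95, 140, 110, None, 130, 100, 90, 125,
                   105, 115, 135, 85, 145],
    "sentiment":  ["positive", "negative", None, "neutral", "positive",
                   "negative", "positive", "neutral", "negative", "positive",
                   "neutral", "positive", "negative", "neutral", "positive"],
})

# Mean imputation for the continuous feature, mode for the categorical one.
df["word_count"] = df["word_count"].fillna(df["word_count"].mean())
df["sentiment"] = df["sentiment"].fillna(df["sentiment"].mode()[0])

# Stratifying on the sentiment column keeps the positive/negative/neutral
# proportions similar in the training and testing sets.
train, test = train_test_split(df, test_size=0.2,
                               stratify=df["sentiment"], random_state=42)
print(len(train), len(test))
```

With 15 rows, the 80:20 split yields 12 training and 3 testing rows, one per sentiment class in the test set.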
Hyper-Parameter Tuning:
Each machine learning model was trained and tuned using either grid search or random search for hyperparameter optimization.
1. SVM: Used both linear and RBF kernels. Important hyperparameters like the regularization parameter C and gamma were optimized.
2. Random Forest: Hyperparameters like the number of trees, maximum depth, and minimum samples for splitting were optimized.
3. Naïve Bayes: The smoothing parameter was optimized using Laplace (additive) smoothing to achieve better performance on the text classification task.
4. CNN: The number of convolutional layers, kernel sizes, and the count of filters for feature extraction were optimized.
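The grid-search step for the SVM (item 1 above) can be sketched as follows. This is a hedged example: synthetic features from `make_classification` stand in for the extracted text vectors, and the grid covers the `C`, `gamma`, and kernel choices described.

```python
# Hedged sketch of grid-search tuning for the SVM's C, gamma, and kernel.
# Synthetic features stand in for the real TF-IDF / embedding vectors.
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Three classes mimic the positive/negative/neutral labels.
X, y = make_classification(n_samples=200, n_features=20, n_classes=3,
                           n_informative=5, random_state=0)

param_grid = {
    "C": [0.1, 1, 10],
    "gamma": ["scale", 0.01, 0.1],   # ignored by the linear kernel
    "kernel": ["linear", "rbf"],
}
search = GridSearchCV(SVC(), param_grid, cv=5)  # 5-fold cross-validation
search.fit(X, y)

print(search.best_params_)
print(round(search.best_score_, 3))
```

`RandomizedSearchCV` could be substituted for `GridSearchCV` when the parameter space is too large to enumerate exhaustively, matching the random-search alternative mentioned in the text.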
Evaluation Metrics:
All models have been evaluated on the test set using the following metrics:
• Accuracy: Overall correct sentiment prediction percentage.
• Precision, Recall, F1-Score: These metrics indicate how well the model classifies each sentiment class, which is particularly important for imbalanced data.
• Confusion Matrix: True positives, false positives, true negatives, and false negatives were computed to analyze classification accuracy across sentiment categories.
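The metrics above can be computed as sketched below. The true and predicted labels are invented for illustration; scikit-learn's metric functions implement the accuracy, per-class precision/recall/F1, and confusion-matrix calculations described.

```python
# Sketch of the evaluation step on hypothetical true/predicted labels.
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

y_true = ["positive", "negative", "neutral", "positive", "negative",
          "neutral", "positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "neutral", "positive", "neutral",
          "neutral", "positive", "negative", "negative", "positive"]

# Overall correct-prediction percentage.
acc = accuracy_score(y_true, y_pred)

# Per-class precision, recall, and F1 in a fixed label order.
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=["positive", "negative", "neutral"],
    zero_division=0)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_true, y_pred,
                      labels=["positive", "negative", "neutral"])

print(acc)  # 8 of 10 labels match, i.e. 0.8
print(cm)
```

Reading the matrix row by row shows exactly which sentiment classes are confused with each other, which a single accuracy number cannot reveal.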
II. Results:
| Model | Accuracy | Precision | Recall | F1-Score |
|---|---|---|---|---|
| XLNet | 0.8656 | 1.00 | 1.00 | 1.00 |
| SVM | 0.7274 | 0.76 | 0.60 | 0.64 |
| Random Forest | 0.7081 | 0.81 | 0.56 | 0.63 |
LOGISTIC REGRESSION:
The first chart shows near-perfect accuracy, precision, recall, and F1-score for the Logistic Regression model. There is very little misclassification between the positive, negative, and neutral sentiment labels, indicating that the model performs well.
DECISION TREE:
Near-perfect performance, with accuracy, precision, recall, and F1-score at 1.0, indicates an excellent classifier with little error. However, such perfect results might indicate overfitting, and the model should therefore be cross-validated on unseen data.
SUPPORT VECTOR MACHINE:
The SVM model shows excellent performance, with precision, recall, and F1-score all at 1.0, indicating almost flawless classification with very few errors. Nonetheless, such results may indicate some overfitting, so further cross-validation on unseen data is required for robustness.
ADABOOST CLASSIFIER:
The AdaBoost classifier gives very close to perfect performance, with accuracy, precision, recall, and F1-score at 1.0, signifying strong classification with few mistakes. Again, a perfect score can be suggestive of overfitting, and the model therefore needs to be validated on unseen data through cross-validation.
RANDOM FOREST CLASSIFIER:
The Random Forest classifier yields near-perfect performance in terms of accuracy, precision, recall, and F1-score at 1.0, which is further evidence of good classification and a low number of errors. Such perfect metrics, however, may indicate overfitting, and validation on unseen data is necessary to establish its generalization capability.
K NEAREST NEIGHBOUR:
The KNN classifier shows almost perfect performance, with accuracy, precision, recall, and F1-score all at 1.0, indicating good classification with minimal mistakes. However, these perfect results may indicate overfitting, so evaluation on unseen data through cross-validation is imperative to validate its robustness.
ARTIFICIAL NEURAL NETWORKS:
The ANN classifier performs near perfectly, with accuracy, precision, recall, and F1-score all at 1.0, reflecting exceptional classification with minimal error. However, perfect metrics can be the result of overfitting, so performance should be verified on unseen data with cross-validation to ensure reliable generalization.
The bar chart compares accuracy scores across the various models; all are close to 100%, including the Logistic Regression, KNN, Decision Tree, SVM, Random Forest, AdaBoost, and ANN models. This suggests that all the models perform well, but such results can sometimes indicate overfitting, so cross-validation on unseen data is necessary to confirm generalization.
LEARNING CURVES:
The learning curve shows how cross-validation scores and training scores vary with training size. The training-score curve (in red) stays very high, whereas the cross-validation score (in green) is low at the beginning but improves as the training size increases, indicating that larger datasets generalize better without overfitting.
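The learning-curve analysis described above can be reproduced with scikit-learn's `learning_curve` utility. This is an illustrative sketch: a synthetic classification problem and a Logistic Regression estimator stand in for the real journal dataset and models.

```python
# Illustrative sketch of the learning-curve computation: training and
# cross-validation scores as a function of training-set size, shown on
# a synthetic dataset standing in for the journal features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import learning_curve

X, y = make_classification(n_samples=300, n_features=10, random_state=0)

sizes, train_scores, cv_scores = learning_curve(
    LogisticRegression(max_iter=1000), X, y,
    train_sizes=np.linspace(0.2, 1.0, 5), cv=5)

# Mean score per training size; the shrinking gap between the two curves
# as the size grows is the pattern the text describes.
print(sizes)
print(train_scores.mean(axis=1).round(3))
print(cv_scores.mean(axis=1).round(3))
```

Plotting the two mean-score arrays against `sizes` (e.g. with matplotlib) yields the red training curve and green cross-validation curve discussed above.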
III. CONCLUSION:
This work compares six machine learning models: SVM, Decision Trees, AdaBoost, k-Nearest Neighbors (KNN), Random Forest, and Artificial Neural Networks (ANN). Among these, the best model was found to be the ANN, with the highest recall and overall F1-score. One reason it is a good fit for this application is its ability to model complex, non-linear relationships between features. Despite its computational load, the ANN's accuracy and reliability promise improved assessment of emotional well-being and, in turn, better outcomes. Future work in this direction can explore the following avenues: investigating advanced ANN architectures, such as CNNs or RNNs, that handle larger and more complex datasets; adding techniques such as SHAP (SHapley Additive exPlanations) to make ANN models interpretable and increase trust among clinicians; and combining the ANN with ensemble methods, such as Random Forest + ANN, so that the strengths of both can be maximized. By addressing these directions, ANN-based systems could make an even greater impact on the precision and early assessment of emotional well-being at a global scale.
Claims: 1. A computer-implemented method for sentiment analysis of journal entries to measure meditation effects on emotional well-being, the method comprising:
collecting journal entries associated with meditation practices and emotional states;
performing data cleaning by imputing missing continuous values using mean and categorical values using mode;
preprocessing the journal entries by removing stop words, punctuation, and special characters, and applying tokenization;
extracting features from the preprocessed text using term frequency-inverse document frequency (TF-IDF) and word embeddings such as Word2Vec or GloVe;
and classifying sentiments using trained machine learning models.
2. The method as claimed in claim 1, wherein the machine learning models include at least one selected from the group consisting of:
a support vector machine (SVM) with linear and RBF kernels,
a random forest classifier,
a Naïve Bayes classifier with optimized smoothing techniques,
and a convolutional neural network (CNN) having optimized convolutional layers, kernel sizes, and filters.
3. The method as claimed in claim 1, wherein the dataset is split into training and testing sets in an 80:20 ratio to ensure balanced distribution of sentiment categories comprising positive, negative, and neutral.
4. The method as claimed in claim 1, wherein hyperparameter tuning of the machine learning models is performed using at least one method selected from the group consisting of grid search and random search, to optimize model-specific parameters including but not limited to:
regularization parameter and gamma for SVM,
number of trees and maximum depth for random forest,
and smoothing parameter for Naïve Bayes.
5. The method as claimed in claim 1, wherein model evaluation is carried out using performance metrics comprising accuracy, precision, recall, F1-score, and confusion matrix analysis, to assess classification effectiveness across various sentiment categories.
| # | Name | Date |
|---|---|---|
| 1 | 202541050035-STATEMENT OF UNDERTAKING (FORM 3) [24-05-2025(online)].pdf | 2025-05-24 |
| 2 | 202541050035-REQUEST FOR EARLY PUBLICATION(FORM-9) [24-05-2025(online)].pdf | 2025-05-24 |
| 3 | 202541050035-POWER OF AUTHORITY [24-05-2025(online)].pdf | 2025-05-24 |
| 4 | 202541050035-FORM-9 [24-05-2025(online)].pdf | 2025-05-24 |
| 5 | 202541050035-FORM FOR SMALL ENTITY(FORM-28) [24-05-2025(online)].pdf | 2025-05-24 |
| 6 | 202541050035-FORM 1 [24-05-2025(online)].pdf | 2025-05-24 |
| 7 | 202541050035-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [24-05-2025(online)].pdf | 2025-05-24 |
| 8 | 202541050035-EVIDENCE FOR REGISTRATION UNDER SSI [24-05-2025(online)].pdf | 2025-05-24 |
| 9 | 202541050035-EDUCATIONAL INSTITUTION(S) [24-05-2025(online)].pdf | 2025-05-24 |
| 10 | 202541050035-DECLARATION OF INVENTORSHIP (FORM 5) [24-05-2025(online)].pdf | 2025-05-24 |
| 11 | 202541050035-COMPLETE SPECIFICATION [24-05-2025(online)].pdf | 2025-05-24 |