Abstract: A UNIFIED TRANSFORMER FRAMEWORK SYSTEM FOR SARCASM, IRONY, HUMOR, AND REGULAR TEXT DETECTION ON SOCIAL MEDIA The present invention discloses a system and method for unified detection of sarcasm, irony, humor, and regular text on social media platforms. The system comprises a preprocessing module, a balanced dataset, a transformer-based classification module utilizing a fine-tuned T5 model, an adversarial testing module, and an interpretability module. Unlike conventional binary sarcasm detectors, the invention provides multi-class classification, enabling nuanced detection of expressive language. The framework demonstrates superior performance, with over 96% accuracy and F1-scores above 0.91 across categories. Robustness is ensured through adversarial testing on hyperbole, slang, idioms, ambiguous expressions, and cultural variations. The method involves preprocessing raw text, classifying it into four categories, testing adversarial resistance, and generating interpretability outputs for transparency. The invention is applicable to sentiment analysis, content moderation, chatbots, and online monitoring, offering a scalable, cloud-deployable, and culturally adaptable solution for detecting context-dependent textual phenomena in dynamic digital environments.
Description: FIELD OF THE INVENTION
The present invention relates to the field of natural language processing (NLP) and artificial intelligence, specifically to transformer-based deep learning systems for context-aware sentiment and expression classification. More particularly, the invention discloses a unified transformer framework capable of detecting sarcasm, irony, humor, and regular text within social media and online communications.
BACKGROUND OF THE INVENTION
Sarcasm detection is a challenging task in natural language processing that plays an important role in sentiment analysis and contextual understanding. This disclosure presents a novel implementation of the T5 (Text-to-Text Transfer Transformer) model for detecting sarcasm and related textual phenomena, including irony, humor, and regular text. To train the T5 model, a dataset containing four balanced classes was assembled to ensure robust training and evaluation. The model achieved an overall accuracy of 96%, with class-specific F1-scores exceeding 0.91, demonstrating its efficacy in handling nuanced and context-dependent linguistic expressions. The model was also tested on adversarial texts across a range of test cases to evaluate its robustness.
US11080485B2: Joke recognition methods include using server(s) coupled with data store(s) to communicatively couple with a first computing device through a telecommunications network. A first communication is provided to a user through a user interface of the first computing device or is received through the user interface. A second communication is provided to the user through the user interface or is received through the user interface. In response to providing or receiving the second communication, the server(s) determine whether the second communication, relative to the first communication, includes a joke and/or a punchline. Upon determining that the second communication includes a joke and/or a punchline, the server(s) initiate sending one or more responses to the first computing device. The response(s) initiate providing, through the user interface, an indication to the user that the second communication is recognized as a joke/punchline. Systems for joke recognition provide the disclosed joke recognition methods.
US10216850B2: In one embodiment, a method includes accessing a plurality of communications, each communication being associated with a particular content item and including a text of the communication; calculating, for each of the communications, sentiment-scores corresponding to sentiments, wherein each sentiment-score is based on a degree to which n-grams of the text of the communication match sentiment-words associated with the sentiments; determining, for each of the communications, an overall sentiment for the communication based on the calculated sentiment-scores for the communication; calculating sentiment levels for the particular content item corresponding sentiments, each sentiment level being based on a total number of communications determined to have the overall sentiment of the sentiment level; and generating a sentiments-module including sentiment-representations corresponding to overall sentiments having sentiment levels greater than a threshold sentiment level.
Traditional sarcasm and humor detection approaches fail to capture context-dependent, non-literal expressions effectively. Existing models generally operate as binary classifiers, limiting themselves to distinguishing sarcasm from non-sarcasm, while ignoring related categories such as irony and humor. These models lack robustness against adversarial examples, fail to generalize across cultural and linguistic variations, and provide limited interpretability. This leads to misclassification of nuanced textual expressions, thereby reducing reliability in real-world applications like sentiment analysis, chatbots, and social media monitoring.
The present invention solves these issues by providing a multi-class transformer-based framework trained on a balanced dataset, optimized to detect sarcasm, irony, humor, and regular text simultaneously. It achieves higher robustness, better contextual understanding, and adaptability across linguistic nuances compared to conventional methods.
SUMMARY OF THE INVENTION
This summary is provided to introduce a selection of concepts, in a simplified format, that are further described in the detailed description of the invention.
This summary is neither intended to identify key or essential inventive concepts of the invention, nor is it intended to determine the scope of the invention.
The invention provides a unified sarcasm and expression detection system utilizing the Text-to-Text Transfer Transformer (T5) model. Unlike existing binary models, the proposed framework supports multi-class classification into four categories: sarcasm, irony, humor, and regular text. It leverages large-scale, balanced datasets sourced from platforms like Twitter and Reddit, enabling improved accuracy and contextual learning.
The system integrates adversarial robustness by training and testing on diverse linguistic cases, including idiomatic expressions, cultural slang, overstatements, and ambiguous statements. It achieves superior performance, with an accuracy exceeding 96% and F1-scores above 0.91 across categories.
The invention provides a comprehensive framework that combines data preprocessing, transformer-based classification, adversarial evaluation, and contextual interpretability into a scalable model deployable across industries. The system may be implemented on cloud-based infrastructures, integrated with social media monitoring platforms, or embedded within conversational AI tools to enhance sentiment understanding and content moderation.
By addressing cultural variability, adversarial resistance, and nuanced contextual comprehension, the invention creates a robust NLP solution for modern digital communication challenges.
To further clarify the advantages and features of the present invention, a more particular description of the invention will be rendered by reference to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail with the accompanying drawings.
Context Dependence: Sarcasm and irony often depend on external context, making them difficult to detect from text alone. For example, “Great job, team!” could be sincere or sarcastic, depending on the preceding sentence and background knowledge.
Linguistic Complexity: Sarcasm employs overstatement, understatement, and rhetorical (non-literal) questions. For example, “Sure, let’s add pineapple to pizza. What could possibly go wrong?” combines overstatement with a rhetorical question, making computational interpretation difficult for automatic detection.
Data Imbalance: Sarcasm detection datasets often suffer from class imbalance, where sarcastic examples are significantly fewer than non-sarcastic ones, leading to biased model predictions.
Evaluation and Robustness: Traditional evaluation metrics like accuracy and F1-score may not fully capture a model’s robustness. Adversarial testing, where models are tested against intentionally challenging examples, has revealed vulnerabilities in existing approaches.
BRIEF DESCRIPTION OF THE DRAWINGS
The illustrated embodiments of the subject matter will be understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of devices, systems, and methods that are consistent with the subject matter as claimed herein, wherein:
FIGURE 1: SYSTEM ARCHITECTURE
FIGURE 2: PRECISION, RECALL, AND F1-SCORE FOR EACH CLASS
The figures depict embodiments of the present subject matter for the purposes of illustration only. A person skilled in the art will easily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
DETAILED DESCRIPTION OF THE INVENTION
The detailed description of various exemplary embodiments of the disclosure is described herein with reference to the accompanying drawings. It should be noted that the embodiments are described herein in sufficient detail to clearly communicate the disclosure. However, the level of detail provided herein is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the present disclosure as defined by the appended claims.
It is also to be understood that various arrangements may be devised that, although not explicitly described or shown herein, embody the principles of the present disclosure. Moreover, all statements herein reciting principles, aspects, and embodiments of the present disclosure, as well as specific examples, are intended to encompass equivalents thereof.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms “a,” “an,” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including,” when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
In addition, the descriptions of "first", "second", “third”, and the like in the present invention are used for the purpose of description only, and are not to be construed as indicating or implying their relative importance or implicitly indicating the number of technical features indicated. Thus, features defining "first" and "second" may include at least one of the features, either explicitly or implicitly.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The invention introduces a transformer-based deep learning framework designed to classify social media text into sarcasm, irony, humor, and regular text. The model architecture employs the T5 (Text-to-Text Transfer Transformer), which is particularly suited for sequence-to-sequence learning and contextual text transformations.
The model begins with a preprocessing stage where raw social media data is cleaned, tokenized, and normalized. Tweets, Reddit posts, and other short-form online texts are processed to remove noise such as special characters, URLs, and emojis while retaining meaningful linguistic markers that often carry sarcasm or humor.
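By way of a non-limiting example, the cleaning step described above may be realized as follows. The exact rules below (which characters are stripped, which punctuation markers are retained) are an illustrative implementation choice, not fixed by this specification:

```python
import re

def preprocess(text: str) -> str:
    """Sketch of the cleaning step: strip URLs, user handles, and
    special symbols/emojis, while keeping punctuation such as '!', '?',
    and ellipses, which often carry sarcastic or humorous emphasis."""
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # remove URLs
    text = re.sub(r"@\w+", " ", text)                   # remove @mentions
    text = re.sub(r"[^\w\s.,!?'\"]", " ", text)         # drop emojis/symbols
    text = re.sub(r"\s+", " ", text).strip()            # normalize whitespace
    return text.lower()
```

In practice the retained-marker set would be tuned per platform, since different social media communities use different sarcasm cues.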
The dataset used for training and evaluation is balanced across the four categories, ensuring that minority classes such as sarcasm and irony are not underrepresented. A total of 79,813 samples are included, providing sufficient diversity for generalization across contexts.
The T5 model is fine-tuned on this dataset using supervised learning, where the input sequence is the raw text and the output sequence is the class label. Unlike traditional classification models, the T5 architecture reformulates the classification problem as a text generation problem, which enhances the capture of contextual relationships between words and phrases.
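The text-to-text reformulation can be sketched as follows. The task prefix `"classify sarcasm:"` and the helper name are illustrative assumptions; in an actual embodiment, both input and target would then be tokenized and the model fine-tuned with a standard sequence-to-sequence cross-entropy loss (e.g., via the `T5ForConditionalGeneration` class of the Hugging Face transformers library):

```python
# The four output labels named in the specification.
LABELS = ["sarcasm", "irony", "humor", "regular"]

def to_t5_example(text: str, label: str) -> dict:
    """Recast a labelled sample as a text-to-text pair: the model is
    trained to *generate* the label string, rather than predict an index."""
    assert label in LABELS, f"unknown label: {label}"
    return {
        "input_text": f"classify sarcasm: {text}",  # task-prefixed input
        "target_text": label,                       # label as output sequence
    }
```

At inference time, the generated string is matched back against the label set to obtain the predicted class.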
During training, performance is monitored across multiple metrics, including accuracy, precision, recall, F1-score, Cohen’s Kappa, and Matthews Correlation Coefficient. These metrics provide a multi-dimensional view of model performance beyond traditional accuracy.
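All of the named metrics are available in scikit-learn, so the multi-metric evaluation may be sketched as below (macro averaging is an assumption; per-class scores could equally be reported):

```python
from sklearn.metrics import (accuracy_score, precision_recall_fscore_support,
                             cohen_kappa_score, matthews_corrcoef)

def evaluate(y_true, y_pred):
    """Compute the multi-dimensional metric view described above."""
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0)
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": p,
        "recall": r,
        "f1": f1,
        "cohens_kappa": cohen_kappa_score(y_true, y_pred),
        "mcc": matthews_corrcoef(y_true, y_pred),
    }
```

Cohen’s Kappa and the Matthews Correlation Coefficient are included because, unlike raw accuracy, they remain informative even if the evaluation split drifts away from perfect class balance.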
Adversarial robustness is tested by introducing challenging test cases such as hyperbole, idioms, ambiguous statements, slang, cross-cultural expressions, and conversational context. The model demonstrates resilience by maintaining consistent classification across these variations, surpassing the limitations of conventional approaches.
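The adversarial evaluation can be organized as a named suite of probe sentences, one per challenge category. The probe texts below are hypothetical illustrations of the categories listed above, not samples from the actual test set:

```python
# Illustrative adversarial probes, keyed by challenge category.
ADVERSARIAL_CASES = {
    "hyperbole": "This is literally the best day in the history of days.",
    "idiom": "Well, the cat is out of the bag now.",
    "slang": "That meeting was lowkey a whole vibe.",
    "ambiguous": "Great job, team!",
}

def run_adversarial_suite(classify, cases=ADVERSARIAL_CASES):
    """Apply any classifier callable (text -> label) to each probe
    and collect the predicted labels per challenge category."""
    return {name: classify(text) for name, text in cases.items()}
```

Consistency is then assessed by checking that the predicted labels match annotator expectations across every category, rather than only in aggregate.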
The system also allows deployment in real-world scenarios where incoming social media streams can be automatically classified. This enables businesses, researchers, and governments to analyze public sentiment more accurately, identify humor-based trends, and detect sarcasm in customer feedback or online discourse.
Cultural adaptability is incorporated through multilingual extensions, where the base T5 model can be fine-tuned for other languages using transfer learning. This ensures global applicability of the invention.
The invention further includes interpretability mechanisms, where attention weights and output explanations can be visualized to provide users with an understanding of why a particular classification decision was made. This addresses the “black box” limitation often associated with transformer-based models.
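One way to realize the attention-weight visualization is to request the raw attention matrices from the model (in the Hugging Face transformers library, via `output_attentions=True`) and aggregate them into a single relevance score per token. The averaging-over-layers-and-heads heuristic below is a common simple choice, not an aggregation fixed by this specification:

```python
import numpy as np

def token_attention_scores(attentions, tokens):
    """Aggregate attention into one relevance score per token.

    `attentions`: list of per-layer arrays, each shaped
    (num_heads, seq_len, seq_len). Scores the attention each token
    *receives*, averaged over layers and heads, normalized to sum to 1.
    """
    stacked = np.stack(attentions)                    # (layers, heads, seq, seq)
    received = stacked.mean(axis=(0, 1)).sum(axis=0)  # incoming attention per token
    scores = received / received.sum()
    return dict(zip(tokens, scores))
```

The resulting per-token scores can then be rendered as a heat map over the input text in the interpretability dashboard.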
Additionally, the system supports integration with AI-based chatbots and content moderation systems, enabling automated detection of sarcastic or humorous statements that may otherwise lead to misinterpretation in automated decision-making pipelines.
The framework can be implemented as a standalone server system, a cloud-based SaaS API, or embedded within enterprise-level NLP pipelines. This flexibility ensures that the invention is scalable and adaptable across diverse industrial applications.
Through its design, the invention creates a unified adaptive framework that enhances the accuracy, robustness, and interpretability of sarcasm and humor detection, providing a critical advancement in NLP technologies.
Best Method of Working
The best method of working involves deploying the model as a cloud-based NLP service integrated with social media monitoring dashboards. The method begins with continuous data ingestion from platforms such as Twitter and Reddit. Preprocessing pipelines normalize incoming data before classification.
The fine-tuned T5 model is deployed within a containerized environment (e.g., Docker) for scalability and integrated with an inference API. Incoming text is classified into sarcasm, irony, humor, or regular text in real-time. Performance is continuously monitored with feedback loops for retraining on newly collected data, ensuring robustness against evolving language patterns.
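The core request-handling logic of such an inference API can be sketched in a framework-agnostic way, as below; the JSON shapes and handler name are illustrative assumptions, and in deployment the handler would be wrapped by a web framework (e.g., FastAPI) inside the container:

```python
import json

def handle_request(body: str, classify) -> str:
    """Parse a JSON request of the form {"text": ...}, classify the text
    with the supplied callable (text -> label), and return a JSON reply."""
    text = json.loads(body)["text"]
    return json.dumps({"text": text, "label": classify(text)})
```

Keeping the classifier a plain callable decouples the API layer from the model runtime, so the containerized T5 backend can be swapped or retrained without changing the endpoint contract.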
The system may also be extended with multilingual support through transfer learning, where the model is fine-tuned on regional datasets. For enterprise applications, interpretability dashboards can visualize attention weights and classification justifications, offering transparency in decision-making.
This method ensures real-world usability, high throughput, and adaptability to dynamic online linguistic environments.
The inventors propose the use of the T5 model for multi-class sarcasm detection, distinguishing sarcasm, irony, and humor in text. The proposed model demonstrates superior performance in sarcasm detection, achieving an overall accuracy of 96% across four classes: irony, sarcasm, humor, and regular text. A well-balanced dataset collected from Twitter and Reddit is utilized, consisting of 79,813 samples across four categories, ensuring a diverse and representative evaluation of sarcasm in social media text. The proposed model demonstrates robustness in handling adversarial text, showing its ability to identify sarcasm in miscellaneous contexts such as overstatement, slang, idiomatic expressions, and cultural variability. The model is evaluated on multiple metrics, including accuracy, precision, recall, F1-score, Cohen’s Kappa, and Matthews Correlation Coefficient, to ensure a wide-ranging assessment.
The proposed approach significantly outperforms earlier models (such as BERT-based architectures), particularly in handling context-dependent and non-literal language, showcasing the adaptability and strength of the T5 model in linguistic subtleties. Unlike traditional binary sarcasm detectors, this work employs the T5 (Text-to-Text Transfer Transformer) model to classify text into four nuanced categories—sarcasm, irony, humor, and regular text—enabling more granular understanding of expressive language.
Claims: 1. A system for detecting sarcasm, irony, humor, and regular text on social media, comprising:
a) a preprocessing module configured to clean and normalize raw text;
b) a dataset module containing multiple categories of text samples including sarcasm, irony, humor, and regular text;
c) a transformer-based sequence-to-sequence classification module fine-tuned for multi-class text detection;
d) an adversarial testing module configured to evaluate hyperbole, idioms, slang, and ambiguous statements;
e) an interpretability module configured to provide attention-weight visualizations and decision explanations;
wherein said modules are interconnected to form a unified framework for robust detection of context-dependent textual expressions.
2. The system as claimed in claim 1, wherein the preprocessing module is adapted to remove noise including URLs, special characters, and emojis while retaining linguistic markers relevant for sarcasm detection.
3. The system as claimed in claim 1, wherein the dataset module includes balanced samples sourced from online social platforms to ensure diversity and coverage.
4. The system as claimed in claim 1, wherein the adversarial testing module evaluates robustness against cultural variability, ambiguous statements, and cross-linguistic idioms.
5. The system as claimed in claim 1, wherein the interpretability module provides visual explanations to enhance transparency of the classification process.
6. A method for detecting sarcasm, irony, humor, and regular text using a transformer-based framework, comprising the steps of:
a) preprocessing raw text data to normalize and clean linguistic content;
b) classifying the preprocessed text into sarcasm, irony, humor, and regular categories using a fine-tuned transformer-based sequence-to-sequence model;
c) evaluating adversarial robustness against hyperbole, slang, idioms, ambiguous expressions, and cultural variability;
d) generating interpretability outputs including attention-weight visualizations and classification reasoning;
wherein the steps are executed within an adaptive unified framework for robust sarcasm and humor detection.
7. The method as claimed in claim 6, wherein the classification step achieves high accuracy and F1-scores across all categories of sarcasm, irony, humor, and regular text.
8. The method as claimed in claim 6, wherein adversarial robustness testing includes conversational context and cross-cultural idiomatic expressions.
9. The method as claimed in claim 6, wherein multilingual adaptability is enabled through transfer learning on region-specific datasets.
10. The method as claimed in claim 6, wherein deployment is performed as a cloud-based application programming interface integrated with social media monitoring dashboards.
| # | Name | Date |
|---|---|---|
| 1 | 202541089583-STATEMENT OF UNDERTAKING (FORM 3) [19-09-2025(online)].pdf | 2025-09-19 |
| 2 | 202541089583-REQUEST FOR EARLY PUBLICATION(FORM-9) [19-09-2025(online)].pdf | 2025-09-19 |
| 3 | 202541089583-POWER OF AUTHORITY [19-09-2025(online)].pdf | 2025-09-19 |
| 4 | 202541089583-FORM-9 [19-09-2025(online)].pdf | 2025-09-19 |
| 5 | 202541089583-FORM FOR SMALL ENTITY(FORM-28) [19-09-2025(online)].pdf | 2025-09-19 |
| 6 | 202541089583-FORM 1 [19-09-2025(online)].pdf | 2025-09-19 |
| 7 | 202541089583-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [19-09-2025(online)].pdf | 2025-09-19 |
| 8 | 202541089583-EVIDENCE FOR REGISTRATION UNDER SSI [19-09-2025(online)].pdf | 2025-09-19 |
| 9 | 202541089583-EDUCATIONAL INSTITUTION(S) [19-09-2025(online)].pdf | 2025-09-19 |
| 10 | 202541089583-DRAWINGS [19-09-2025(online)].pdf | 2025-09-19 |
| 11 | 202541089583-DECLARATION OF INVENTORSHIP (FORM 5) [19-09-2025(online)].pdf | 2025-09-19 |
| 12 | 202541089583-COMPLETE SPECIFICATION [19-09-2025(online)].pdf | 2025-09-19 |