Abstract: The Humanized Voice Assistant represents a significant advancement in artificial intelligence by integrating Natural Language Processing (NLP) and Emotion AI technologies. This sophisticated system is designed not only to comprehend and respond to natural language inputs but also to detect and adapt to users' emotional states in real time. By leveraging advanced NLP algorithms, the assistant can parse complex linguistic structures, understand context, and generate coherent and contextually appropriate responses.

Emotion AI plays a crucial role in enhancing user interactions by analyzing vocal tone, speech patterns, and other non-verbal cues to infer the user's emotional state. This capability allows the assistant to provide empathetic and personalized responses, thereby fostering a deeper connection with users. Features such as voice customization enable users to tailor the assistant's voice to their preferences, further enhancing the sense of personalization.

The assistant's adaptive response mechanism ensures that interactions are not only relevant but also emotionally attuned to the user's current state. This is achieved through continuous machine learning processes that allow the system to learn from past interactions and improve over time. Robust privacy protections are embedded within the system to ensure that user data is handled securely and ethically, addressing concerns related to data privacy and security.

By prioritizing ethical behavior and transparency in its design, the Humanized Voice Assistant sets a new standard for responsible AI. It emphasizes user-centric design principles, ensuring that the technology is not only advanced but also aligned with the values of privacy, security, and ethical interaction.
Description:
Field of Invention
The proposed innovation integrates Natural Language Processing (NLP), Deep Learning, and Emotion AI, together with an additional feature that allows voice customisation based on abstract emotions, with the aim of improving user-voice assistant interaction. The goal of this innovation is to develop a voice assistant that is more emotionally intelligent and individualised. It guarantees a sense of closeness and familiarity by allowing users to customise the voice to that of people they know, offering a distinctive and reassuring user experience. In order to provide a more intimate and engaging relationship between people and artificial intelligence, the technology combines language understanding, emotion identification, and personalised voice modification.
Objective of this Invention
The primary objective of this Humanised Voice Assistant combining NLP and Emotion AI is to improve user interactions with technology. Specifically, the goal is to create a voice assistant that can understand spoken language and respond appropriately to a range of human emotions. By leveraging NLP and Emotion AI technology, the voice assistant can adapt its responses to the user's emotions, thereby providing a more personalised and compassionate experience. This innovation intends to improve user interactions by humanising and customising voice assistants to each user's emotional state, eliminating the impersonal nature of traditional voice assistants.
Background of the Invention
Creating a humanized interaction with technology has become paramount in today's fast-paced world. The conventional voice assistant lacks the personal touch required for an enriched user experience. Recognizing this need, the proposed invention aims to revolutionize user-computer interactions through a Humanized Voice Assistant with Natural Language Processing (NLP) and Emotion AI.
In a world where technology often feels impersonal, this invention strives to create a voice assistant that not only comprehends natural language but also responds to human emotions in a nuanced manner. Traditional voice assistants lack the ability to adapt their responses based on the user's emotional state, leading to a somewhat disconnected experience.
US8706476B2: This patent introduces a method and system for processing natural language sentences to extract information accurately and efficiently. The proposed method involves converting a given sentence into a set of primitive sentences through a series of steps. Initially, verbal blocks in the sentence are identified, and the sentence is then divided into subordinate and/or coordinate logical clauses. Ambiguous verbal blocks within each logical clause are disambiguated, and a primitive sentence is formed for each verbal block by duplicating shared noun phrases. The process aims to enhance the precision and efficiency of information extraction from natural language sentences. In a specific embodiment, the patent suggests using simple regular-expression-like word patterns to extract information from the resulting primitive sentences. Overall, the method outlined in the patent provides a systematic approach for parsing and transforming complex natural language sentences into simpler, more manageable structures for accurate information extraction.

CN110032636A: This invention employs reinforcement learning and an asynchronous model to generate emotionally enriched text. By selecting an agent and predicting suitable emotion keywords, the method enhances responses with heightened emotional correlation and intensity. Through an asynchronous generation framework, it abandons traditional left-to-right approaches, reducing decoder strain and diversifying responses. This innovation optimizes emotion generation, delivering a more expressive and varied user interaction experience.

CN104350541A: This invention introduces a humanoid robot capable of engaging in dialogue with users through two speech recognition modes: open and closed. The closed mode is defined by a concept characterizing a dialogue sequence, and the robot can also be influenced by non-speech or text events. Notably, the robot can execute behaviors, generate expressions, and convey emotions. The innovation reduces programming time and execution latency for dialogue sequences, resulting in a more natural and fluent interaction akin to human dialogue. The humanoid robot includes sensors, an event recognition module, an event generation module, and an artificial intelligence engine to control the output of the event generation module, offering a comprehensive and responsive user-dialogue experience.
US9368102B2: This text-to-speech synthesis method and system involve generating a voice dataset from incidental audio input of speech by an input speaker. Simultaneously, text input is received, synthesized into personalized speech using the voice dataset, and enriched with analyzed expressions. In the case of video communication, the audio input aligns with a visual input of the speaker, facilitating the synthesis of a personalized image. The innovative approach strives for natural and expressive text-to-speech, offering a personalized voice experience, even using recordings from normal audio communications without the sender's awareness.

KR102299455B1: This patent introduces a neural-network-based emotion analysis method involving voice data from users. It calculates emotion analysis results through a two-step process: (a) neural-network-based voice analysis, and (b) context analysis using an emotion dictionary. The results are then integrated through a neural network, providing a comprehensive emotion analysis for the input data. The broader context extends to a system addressing emotion analysis and treatment, aligning with the contemporary strategy of incorporating human emotions into engineering applications for product development.

CN113094502A: This invention presents a multi-granularity sentiment analysis method for user comments, focusing on real user data. The method involves preprocessing comment data through conversion, duplicate removal, and segmentation. It utilizes end-to-end training with a baseline network and enhances feature extraction through an added attention mechanism. Various models, including fastText and BERT-MRC, are employed to explore diverse data characteristics, addressing the limitations of the main model. The approach achieves improved training and classification results efficiently. Finally, the method performs deep fusion on multiple models to jointly undertake an emotion recognition task for sales comment data, offering comprehensive sentiment analysis.

CN110021308B: This application presents a way of recognising emotions in speech that does not rely on conventional voice recognition technology. Rather, it uses direct analysis of user voice data to extract information about the user's attributes, which is subsequently utilised to infer the user's emotional state. In contrast to traditional techniques that use only a universal model, this method customises emotion recognition to specific user characteristics, greatly improving recognition efficacy and accuracy. Using a voice emotion recognition device, the disclosed method can be easily integrated into different types of computer equipment so that the speech emotion recognition feature can be used without requiring text conversion.

By combining these technologies, the proposed model seeks to provide users with a voice assistant that not only understands what is said but also how it is said. This innovation aims to foster a more empathetic and personalized interaction between users and technology, thereby enhancing the overall user experience in the realm of voice-enabled applications.
Summary of the Invention
The Humanised Voice Assistant with NLP and Emotion AI is a groundbreaking invention that seeks to transform user-computer interactions. The system combines Natural Language Processing (NLP) and Emotion AI technology, which sets it apart from other voice assistants and enables it to comprehend and react to users' emotions in a sophisticated way. The technology removes the impersonal aspect of typical voice assistants and offers a more personalised and empathetic user experience by modifying responses depending on emotional indicators. Motivated by current technological advancements, the model fills gaps in the analysis of emotions and provides a complete solution for enhanced human-computer interaction. In the field of voice-enabled applications, this idea aims to offer more personalised and human-like contact by emphasising both language comprehension and emotional reactivity.
Brief Description of Drawings
The invention will be described in detail with reference to the exemplary embodiments shown in the figures, wherein:
Figure-1: Flowgorithm diagram representing the workflow in the field
Figure-2: Diagrammatic representation of the details of the speech-emotion transformation
Figure-3: Flowchart representing the basic training and testing.
Detailed Description of the Invention
The Humanised Voice Assistant with NLP and Emotion AI revolutionises the landscape of user-computer engagement. By utilising compound natural language processing (NLP) techniques, Deep Learning, and Emotion AI, the system is able to understand the subtleties of natural language and make communication with users simple. One essential component is the incorporation of Emotion AI, which gives the voice assistant the capacity to recognise and react to users' emotional states, resulting in a more personalised and sympathetic exchange.

Real-time emotion analysis demonstrates the system's flexibility by dynamically modifying answers to correspond with users' differing emotional cues. Accessibility is improved by the user-friendly interface, which makes the technology engaging and straightforward for users in a variety of situations. Ensuring user confidence requires addressing privacy and security concerns related to emotion analysis and voice customisation. The system is propelled beyond static interactions by continuous learning techniques, which enable it to develop and enhance its comprehension of user preferences over time. This technology can be applied in a wide range of fields, from virtual assistants to instructional tools, and it can improve user experiences in a variety of contexts. The dedication to user-centricity and the appropriate integration of emotional intelligence in technology is highlighted by the inclusion of ethical considerations, such as transparency and responsible data use. With its unmatched personalisation and emotional resonance, the Humanised Voice Assistant with NLP and Emotion AI is, in short, a groundbreaking innovation that has the potential to completely transform the dynamics of human-computer interactions.

Human-computer interactions have advanced significantly with the introduction of Natural Language Processing (NLP) algorithms into the Humanised Voice Assistant. A smooth and context-aware communication experience is provided by the system's ability to understand the nuances of natural language, which is made possible in large part by NLP. The linguistic capabilities of the system are improved by adding features such as sentiment analysis, tokenization, and part-of-speech tagging, which help it produce and analyse text similar to that of a human.

Deep Learning, a branch of machine learning, is essential to the operation of the Humanised Voice Assistant. Through the use of multi-layered neural networks, Deep Learning allows the system to identify intricate patterns in voice data. By enhancing the system's ability to recognise voices, analyse emotions, and function as a whole, this deep neural network design raises the bar for user experience.

A vital component of the Humanised Voice Assistant is Emotion AI, or Affective Computing. Emotion AI gives the system the ability to recognise and react to users' emotional states using facial recognition, voice analysis, and physiological data. Because of this integration, users can engage in more personalised and compassionate interactions, as the technology learns to recognise and respond to the subtle emotional indicators that users express during conversations.

The Humanised Voice Assistant's voice customisation function depends largely on Automatic Speech Recognition (ASR) technology. With the aid of ASR, spoken language can be converted into written text, allowing the assistant to modify its voice to sound like familiar people.
Furthermore, ASR plays a crucial role in real-time emotion analysis by transcribing spoken words for additional processing, ensuring that the system adapts to the changing emotional states of its users.
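The application does not prescribe a specific NLP toolkit, but the tokenization, part-of-speech tagging, and sentiment analysis steps named above can be illustrated with a minimal sketch. The example below uses the open-source NLTK library and its VADER sentiment analyser purely for illustration; the emotion-to-response mapping (`RESPONSE_STYLES`) is a hypothetical simplification of the adaptive response mechanism, not the claimed implementation.

```python
# A minimal, illustrative sketch of the NLP front-end described above:
# tokenization, part-of-speech tagging, and sentiment analysis, followed
# by a hypothetical sentiment-conditioned choice of response style.
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

# One-time downloads of the required NLTK resources.
nltk.download("punkt")
nltk.download("averaged_perceptron_tagger")
nltk.download("vader_lexicon")

# Hypothetical mapping from coarse sentiment to a response tone; the
# actual assistant would use a far richer, learned policy.
RESPONSE_STYLES = {
    "negative": "empathetic",
    "neutral": "informative",
    "positive": "upbeat",
}

def analyse_utterance(text: str) -> dict:
    tokens = nltk.word_tokenize(text)           # tokenization
    pos_tags = nltk.pos_tag(tokens)             # part-of-speech tagging
    scores = SentimentIntensityAnalyzer().polarity_scores(text)
    compound = scores["compound"]               # overall polarity in [-1, 1]
    if compound <= -0.05:
        sentiment = "negative"
    elif compound >= 0.05:
        sentiment = "positive"
    else:
        sentiment = "neutral"
    return {
        "tokens": tokens,
        "pos_tags": pos_tags,
        "sentiment": sentiment,
        "response_style": RESPONSE_STYLES[sentiment],
    }

if __name__ == "__main__":
    print(analyse_utterance("I am really frustrated with this error."))
```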
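Similarly, the ASR stage can be sketched with torchaudio's bundled Wav2Vec2 pipeline, staying within the Torch ecosystem named later in this description. The choice of the `WAV2VEC2_ASR_BASE_960H` model and the greedy CTC decoder below are assumptions made for illustration; the application does not name an ASR engine.

```python
# Hedged ASR sketch using torchaudio's pretrained Wav2Vec2 pipeline to
# convert spoken language into written text, as the description requires.
import torch
import torchaudio

bundle = torchaudio.pipelines.WAV2VEC2_ASR_BASE_960H
model = bundle.get_model()

def transcribe(path: str) -> str:
    waveform, sample_rate = torchaudio.load(path)
    if sample_rate != bundle.sample_rate:
        waveform = torchaudio.functional.resample(
            waveform, sample_rate, bundle.sample_rate
        )
    with torch.inference_mode():
        emissions, _ = model(waveform)
    # Greedy CTC decoding: take the most likely label per frame, collapse
    # repeats, drop the blank token "-", and map "|" to a word boundary.
    indices = torch.unique_consecutive(torch.argmax(emissions[0], dim=-1))
    labels = bundle.get_labels()
    return "".join(
        labels[i] for i in indices if labels[i] != "-"
    ).replace("|", " ")
```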
The Fourier transform, mel scale, and spectrogram are three essential steps in the process of extracting meaningful characteristics from audio. Using a mathematical technique called the Fast Fourier Transform (FFT) algorithm, the Fourier transform breaks a signal down into its component frequencies and amplitudes. This stage is required for comprehending the frequency components of the audio signal.

Torch is an open-source machine learning package that is useful for implementing transfer learning models. It makes it easier to create and implement deep transfer learning models for mel spectrogram creation and audio recognition.

An Ubuntu 18 computer with the following hardware characteristics is used in a methodical experimental setup for the deep transfer learning recognition and mel spectrogram generation: CPU: Intel(R) Xeon(R) CPU @ 2.20GHz; RAM: 16GB; GPU: NVidia GeForce GTX 1070 with 16GB capacity.

The Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) is the dataset selected for training and testing. This dataset provides a wide variety of emotional speech categories, covering utterances that are neutral, calm, happy, sad, angry, fearful, disgusted, or surprised. The results derived from this dataset are dependable and repeatable due to the methodical approach to experimentation.

To make a dynamic and emotionally intelligent system, the Humanised Voice Assistant combines deep learning approaches, powerful NLP algorithms, Emotion AI, and ASR. Together, these technologies advance voice customisation, real-time emotion analysis, natural language understanding, and continuous learning, ultimately changing the field of human-computer interaction.
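Since the description names Torch as the toolkit and the mel spectrogram as the feature representation, the following minimal sketch shows how such features could be computed with the torchaudio companion library. The parameter values (`n_fft`, `hop_length`, `n_mels`) and the example file path are illustrative assumptions, not values specified in this application.

```python
# Illustrative mel spectrogram extraction with torchaudio; the FFT size,
# hop length, and mel-band count below are assumed values, chosen only to
# demonstrate the Fourier transform -> mel scale -> spectrogram chain.
import torch
import torchaudio

def melspectrogram_db(path: str) -> torch.Tensor:
    waveform, sample_rate = torchaudio.load(path)   # (channels, samples)
    mel_transform = torchaudio.transforms.MelSpectrogram(
        sample_rate=sample_rate,
        n_fft=1024,       # FFT window: frequency decomposition of the signal
        hop_length=512,   # stride between successive FFT windows
        n_mels=64,        # number of perceptually spaced mel bands
    )
    mel = mel_transform(waveform)                   # (channels, n_mels, frames)
    # Convert power to decibels, the usual input scaling for deep models.
    return torchaudio.transforms.AmplitudeToDB()(mel)

# Example (assumes a RAVDESS-style WAV file is available locally):
# features = melspectrogram_db("Actor_01/03-01-05-01-01-01-01.wav")
```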
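The description invokes deep transfer learning for recognition but does not disclose a specific architecture. As one hedged illustration, a mel spectrogram can be treated as a single-channel image and fed to an ImageNet-pretrained CNN whose head is replaced with the eight RAVDESS emotion classes; the choice of ResNet-18 here (via torchvision 0.13+) is an assumption for the sketch, not the claimed model.

```python
# A hedged transfer-learning sketch: adapt an ImageNet-pretrained ResNet-18
# to classify mel spectrograms into the eight RAVDESS emotion categories.
# ResNet-18 is an assumed backbone; the application names no specific model.
import torch
import torch.nn as nn
from torchvision import models

RAVDESS_EMOTIONS = [
    "neutral", "calm", "happy", "sad",
    "angry", "fearful", "disgust", "surprised",
]

def build_emotion_classifier() -> nn.Module:
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    # Spectrograms have one channel, not three: replace the first conv layer.
    model.conv1 = nn.Conv2d(1, 64, kernel_size=7, stride=2, padding=3, bias=False)
    # Replace the 1000-way ImageNet head with an 8-way emotion head.
    model.fc = nn.Linear(model.fc.in_features, len(RAVDESS_EMOTIONS))
    return model

if __name__ == "__main__":
    model = build_emotion_classifier()
    dummy_batch = torch.randn(4, 1, 64, 128)   # (batch, channel, mels, frames)
    logits = model(dummy_batch)
    print(logits.shape)                        # torch.Size([4, 8])
```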
Advantages of the Proposed Model
The portal model stands out as an effective and versatile solution, offering a plethora of advantages that cater to diverse user needs. One of its key strengths lies in centralized information access, providing users with a unified platform to seamlessly navigate a wealth of resources and services. Personalization is another notable advantage, allowing users to tailor their experience, set preferences, and efficiently organize content.

Facilitating collaboration and communication, portals serve as hubs for shared forums, messaging systems, and collaborative tools. This is particularly valuable for organizations, educational institutions, and communities seeking streamlined communication channels. The integration of various services within the portal environment enhances workflow efficiency, eliminating the need to navigate between disparate platforms.

Efficient content management is simplified through the portal model, empowering administrators to update, organize, and control access to information effectively. The scalability and flexibility of this model ensure adaptability to changing demands, making it a future-proof solution for growing organizations. Secure access control features further bolster the model's appeal, ensuring that sensitive data remains confidential and accessible only to authorized users.
Claims: The scope of the invention is defined by the following claims:
1. A humanized voice assistant with NLP and Emotion AI, comprising:
a) a humanized voice assistant, highlighting that the assistant is designed to interact in a way that feels natural and human-like;
b) an assistant that significantly changes the way users interact with computers; and
c) a conversational experience, resulting from this integration, that is unmatched and highly perceptive to the emotional context of the user.
2. The voice assistant as per claim 1, wherein the assistant dynamically adapts its responses based on real-time emotion analysis, ensuring a personalized and empathetic interaction tailored to individual users' emotional states.
3. The voice assistant as per claim 1, wherein users can customize the voice to mimic familiar individuals, fostering a sense of closeness and familiarity and providing a truly personalized experience.
4. The voice assistant as per claim 1, wherein the innovation goes beyond traditional voice assistants, incorporating comprehensive emotion analysis to understand and respond to users' emotional nuances, elevating the level of human-computer interaction.
5. The voice assistant as per claim 1, wherein the assistant continually learns and adapts, improving its understanding of user preferences over time, ensuring an evolving and tailored conversational experience.
| # | Name | Date |
|---|---|---|
| 1 | 202541014976-REQUEST FOR EARLY PUBLICATION(FORM-9) [21-02-2025(online)].pdf | 2025-02-21 |
| 2 | 202541014976-FORM-9 [21-02-2025(online)].pdf | 2025-02-21 |
| 3 | 202541014976-FORM FOR STARTUP [21-02-2025(online)].pdf | 2025-02-21 |
| 4 | 202541014976-FORM FOR SMALL ENTITY(FORM-28) [21-02-2025(online)].pdf | 2025-02-21 |
| 5 | 202541014976-FORM 1 [21-02-2025(online)].pdf | 2025-02-21 |
| 6 | 202541014976-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [21-02-2025(online)].pdf | 2025-02-21 |
| 7 | 202541014976-EVIDENCE FOR REGISTRATION UNDER SSI [21-02-2025(online)].pdf | 2025-02-21 |
| 8 | 202541014976-EDUCATIONAL INSTITUTION(S) [21-02-2025(online)].pdf | 2025-02-21 |
| 9 | 202541014976-DRAWINGS [21-02-2025(online)].pdf | 2025-02-21 |
| 10 | 202541014976-COMPLETE SPECIFICATION [21-02-2025(online)].pdf | 2025-02-21 |