
Voice Assistant System For Improving Customer Service Interactions

Abstract: A voice assistant system for improving customer service interactions, comprising: a voice input module that captures a customer's spoken words and converts them into text using speech-to-text protocols for analysis; a sentiment analysis module that processes both the text and the audio using trained machine learning models to detect emotional states such as positive, negative or neutral; a natural language processing module that adjusts the text formatting to match the customer's voice style and tone for better understanding by adding punctuation and emphasis; a response generation module that creates personalized responses based on the customer's emotions and voice style by mirroring the customer's emotional state to enhance interaction; and a feedback module that delivers the personalized responses to the customer to improve satisfaction.


Patent Information

Application #
Filing Date
13 August 2025
Publication Number
35/2025
Publication Type
INA
Invention Field
ELECTRONICS
Status
Parent Application

Applicants

SR University
Ananthasagar, Hasanparthy (PO), Warangal-506371, Telangana, India.

Inventors

1. Dr. Durgesh Nandan
School of Computer Science and Artificial Intelligence, SR University, Ananthasagar, Hasanparthy (PO), Warangal-506371, Telangana, India.
2. Manchala Srinithin
School of Computer Science and Artificial Intelligence, SR University, Ananthasagar, Hasanparthy (PO), Warangal-506371, Telangana, India.

Specification

Description: FIELD OF THE INVENTION

[0001] The present invention relates to a voice assistant system for improving customer service interactions by accurately converting spoken language into text for analyzing the emotional state of customers and generating personalized context-aware responses for better customer engagement, satisfaction and overall service quality.

BACKGROUND OF THE INVENTION

[0002] Customer service interactions struggle to fully capture human communication, particularly emotions and tone. With the rise of voice-assisted technologies, there is a growing demand for systems that not only transcribe spoken words but also understand the emotional context behind them to improve service quality. Voice-assisted systems convert speech to text but fail to address the emotional undertones in the conversation, leading to impersonal responses. Customers, however, expect interactions that feel personal and empathetic, which significantly enhances satisfaction. There is therefore a need for an advanced system that provides efficient voice assistance for customer service.

[0003] Traditional customer service systems rely on scripted responses or basic voice recognition technologies, which fail to capture the emotional tone of customer interactions. These systems lack the capability to assess underlying emotional states, such as frustration, excitement or confusion. Additionally, existing systems generate generic responses that do not take into account individual customer preferences, conversational style or context. This impersonal approach leads to a poor customer experience, as customers feel that their concerns are not truly understood or addressed in a meaningful way. The lack of emotional intelligence and adaptability highlights the need for an advanced solution that elevates the quality of customer service and enhances overall communication effectiveness.

[0004] US2017091780A1 relates to a computer-implemented method and an apparatus that facilitate linking of a customer's enterprise-related interactions on non-enterprise-related interaction channels to the enterprise. An enterprise-related query provided by a customer of the enterprise on a non-enterprise-related interaction channel is received. An enterprise response to the query is provided to the customer on the non-enterprise-related interaction channel. The provisioning of the enterprise response on the non-enterprise-related interaction channel, at least in part, simulates the effect of provisioning of a reply by the enterprise to the query of the customer on an enterprise interaction channel.

[0005] US11080721B2 relates to improving customer experiences during online commerce by providing unique experiences to customers as a result of anticipating customer needs, simplifying customer engagement based on predicted customer intent, and updating system knowledge about customers with information gathered from new customer interactions. In this way, the customer experience is improved.

[0006] Conventionally, many voice assistant systems have been developed; however, the devices mentioned in the prior art have limitations in converting a customer's spoken words into text for analysis, determining the emotional state of customers for better understanding, and generating personalized responses accordingly for a good communication experience.

[0007] In order to overcome the aforementioned drawbacks, there exists a need in the art to develop a system capable of transforming the customer's spoken words into text for analysis, assessing their emotional state for deeper insight, and generating tailored responses to ensure a more effective and engaging communication experience.

OBJECTS OF THE INVENTION

[0008] The principal object of the present invention is to overcome the disadvantages of the prior art.

[0009] An object of the present invention is to develop a system that is capable of voice assistance for improving customer service interactions by understanding spoken input and responding appropriately to improve user satisfaction and communication effectiveness.

[0010] Another object of the present invention is to develop a system that is capable of converting spoken language into text for context understanding.

[0011] Another object of the present invention is to develop a system that is capable of determining emotional state of customers through combined analysis of their spoken words and vocal characteristics for context-aware understanding.

[0012] Yet another object of the present invention is to develop a system that is capable of generating personalized responses for meaningful interactions.

[0013] The foregoing and other objects, features, and advantages of the present invention will become readily apparent upon further review of the following detailed description of the preferred embodiment as illustrated in the accompanying drawings.

SUMMARY OF THE INVENTION

[0014] The present invention relates to a voice assistant system for improving customer service interactions by determining emotional state of customers through speech and text analysis to generate personalized responses that align with the customer's mood, enabling more empathetic, context-aware interactions that enhance customer satisfaction and communication effectiveness.

[0015] According to an embodiment of the present invention, a voice assistant system for improving customer service interactions comprises a voice input module that captures a customer's spoken words and converts them into text for analysis, wherein the voice input module uses a speech-to-text protocol to convert spoken words into text with high accuracy, and a sentiment analysis module that processes the text and audio to identify the customer's emotions based on words, phrases, tone and intonation, wherein the sentiment analysis module uses a trained machine learning model to detect positive, negative or neutral emotions from the text and audio.

[0016] According to another embodiment of the present invention, the system further comprises a natural language processing (NLP) module that adjusts the text formatting to match the customer's voice style for better understanding, wherein the NLP module adjusts text formatting by adding punctuation and emphasis based on the customer's tone and speech patterns; a response generation module that creates personalized responses based on the customer's emotions and voice style, wherein the response generation module selects words and tone for responses that match the customer's emotional state to enhance interaction; and a feedback module that delivers the personalized responses to the customer to improve satisfaction.

[0017] While the invention has been described and shown with particular reference to the preferred embodiment, it will be apparent that variations might be possible that would fall within the scope of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0018] These and other features, aspects, and advantages of the present invention will become better understood with regard to the following description, appended claims, and accompanying drawings where:
Figure 1 illustrates a flowchart depicting workflow of a voice assistant system for improving customer service interactions.

DETAILED DESCRIPTION OF THE INVENTION

[0019] The following description includes the preferred best mode of one embodiment of the present invention. It will be clear from this description of the invention that the invention is not limited to these illustrated embodiments but that the invention also includes a variety of modifications and embodiments thereto. Therefore, the present description should be seen as illustrative and not limiting. While the invention is susceptible to various modifications and alternative constructions, it should be understood, that there is no intention to limit the invention to the specific form disclosed, but, on the contrary, the invention is to cover all modifications, alternative constructions, and equivalents falling within the spirit and scope of the invention as defined in the claims.

[0020] In any embodiment described herein, the open-ended terms "comprising," "comprises," and the like (which are synonymous with "including," "having," and "characterized by") may be replaced by the respective partially closed phrases "consisting essentially of," "consists essentially of," and the like, or the respective closed phrases "consisting of," "consists of," and the like.

[0021] As used herein, the singular forms “a,” “an,” and “the” designate both the singular and the plural, unless expressly stated to designate the singular only.

[0022] The present invention relates to a voice assistant system for improving customer service interactions that identifies the emotional state of customers by analyzing both speech and text, then crafts personalized, empathetic responses matching the customer's emotional tone and communication style, improving interaction quality to enhance customer engagement and satisfaction and deliver an overall good experience.

[0023] Referring to Figure 1, which illustrates a flowchart depicting the workflow of a voice assistant system for improving customer service interactions.

[0024] The system disclosed herein comprises a voice input module to accurately capture and transcribe a customer's spoken words into text for subsequent analysis. The module utilizes speech-to-text protocols, based on deep learning models such as recurrent neural networks (RNNs) or transformer-based architectures, which are trained on vast datasets of spoken language. These models are capable of recognizing various accents, speech patterns and noise environments to ensure high accuracy in transcription. When a customer speaks into the system, the audio signal is first digitized and preprocessed to remove background noise and enhance speech clarity.

[0025] The cleaned audio is then processed by the speech recognition engine, which decodes the sound into textual form by identifying phonemes, mapping them to words and applying language models to improve coherence and grammatical correctness. This transcribed text is then forwarded to downstream modules, such as sentiment analysis and natural language processing, for deeper emotional and contextual understanding, providing a highly accurate textual representation of spoken language as the foundation for the rest of the pipeline.
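The preprocessing and transcription stages of the voice input module described above can be sketched as follows. This is an illustrative simplification, not the claimed implementation: the function names, the noise-gate threshold, and the stub decoder are assumptions, and a real system would substitute a trained RNN or transformer speech-to-text model for the stub.

```python
# Illustrative sketch of the voice input module's two stages:
# (1) digitized-audio cleanup, (2) a stub speech-recognition interface.
# The noise_floor value is a hypothetical assumption, not from the spec.

def preprocess(samples, noise_floor=0.05):
    """Normalize digitized audio to [-1, 1] and gate out low-amplitude
    background noise, as the specification describes for preprocessing."""
    peak = max((abs(s) for s in samples), default=1.0) or 1.0
    normalized = [s / peak for s in samples]
    return [s if abs(s) > noise_floor else 0.0 for s in normalized]

def transcribe(cleaned_samples, model=None):
    """Stub for the speech-recognition engine: a real implementation
    decodes phonemes and applies a language model; here we only show
    the interface the downstream modules rely on."""
    if model is not None:
        return model(cleaned_samples)
    return ""  # placeholder transcript when no model is supplied
```

In a deployment, `transcribe` would wrap an acoustic model plus language model; the separation shown here keeps the cleanup step reusable across recognition engines.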

[0026] After converting voice input into text, a sentiment analysis module evaluates both the textual content and the audio characteristics of a customer's speech to accurately determine their emotional state. The sentiment analysis module uses a trained machine learning model, based on deep neural networks or ensemble models, which has been exposed to large datasets labeled with emotional cues. This module analyzes text data by examining specific words, phrases and syntactic patterns that typically reflect positive, negative or neutral sentiments. Simultaneously, it processes audio features such as tone, pitch, volume, pace and intonation using acoustic signal processing techniques.

[0027] By combining textual sentiment analysis with audio emotion recognition, the module detects subtle emotional expressions. Feature extraction protocols convert both the text and audio inputs into structured data, which are then fed into the sentiment classification model. The model outputs an emotional label such as positive, negative or neutral, providing a rich emotional profile of the customer. This combined approach ensures higher accuracy and contextual awareness. The resulting emotional insights are used to tailor interactions based on the customer's mood and state of mind, ultimately improving engagement and satisfaction.
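The text-plus-audio fusion described above can be illustrated with a toy late-fusion classifier. The word lists, pitch and energy thresholds, and scoring scheme below are hypothetical stand-ins; the specification calls for a trained machine learning model, which this rule-based sketch only approximates.

```python
# Toy late-fusion sentiment classifier: combines a lexicon-based text
# score with a crude acoustic score. All word lists and thresholds are
# illustrative assumptions, not part of the specification.

POSITIVE = {"great", "thanks", "love", "happy"}
NEGATIVE = {"angry", "broken", "terrible", "refund"}

def text_score(transcript):
    """Count sentiment-bearing words in the transcript."""
    words = transcript.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

def audio_score(mean_pitch_hz, mean_energy):
    """Treat loud, high-pitched speech as emotionally charged (negative)."""
    return -1 if (mean_pitch_hz > 220 and mean_energy > 0.7) else 0

def classify(transcript, mean_pitch_hz, mean_energy):
    """Fuse the two scores into a positive/negative/neutral label."""
    s = text_score(transcript) + audio_score(mean_pitch_hz, mean_energy)
    if s > 0:
        return "positive"
    if s < 0:
        return "negative"
    return "neutral"
```

A trained model would replace the hand-written scores with learned features, but the fusion point, combining textual and acoustic evidence before emitting one label, is the same.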

[0028] Upon analyzing the customer's emotions, a natural language processing (NLP) module refines the transcribed text to ensure it accurately reflects the customer's voice style and intent. The NLP module's primary function is to adjust text formatting by analyzing the tone, rhythm, pause patterns and speech emphasis captured in the audio input. By integrating features such as pitch rises, loudness and hesitation points, the NLP module inserts punctuation marks, emphasis indicators and other formatting elements to mirror how the customer expressed themselves verbally.

[0029] For instance, a rising tone at the end of a sentence may indicate a question, prompting the module to insert a question mark, while prolonged syllables or loud speech result in added emphasis. This transformation is achieved through machine learning and linguistic rule-based techniques, supported by context-aware language models trained on spoken-dialogue corpora. The goal is to make the written version of the customer's message not only grammatically correct but also emotionally and contextually aligned, enhancing the system's ability to understand the customer's true intent and emotional tone and leading to more empathetic and effective automated responses.
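The rising-tone and loudness rules in the preceding paragraph can be sketched as a small formatting function. The function name, the pitch-slope convention, and the use of upper-casing as an emphasis indicator are illustrative assumptions layered on the specification's description.

```python
# Rule-based sketch of the NLP formatting step: a rising final pitch
# becomes a question mark; loud words are upper-cased as an emphasis
# indicator. Thresholds and conventions are illustrative assumptions.

def format_utterance(words, end_pitch_slope, loud_words):
    """Insert punctuation and emphasis to mirror the spoken delivery.

    words           -- transcript tokens in spoken order
    end_pitch_slope -- positive if pitch rises at the utterance end
    loud_words      -- set of tokens delivered with notable loudness
    """
    styled = [w.upper() if w in loud_words else w for w in words]
    terminator = "?" if end_pitch_slope > 0 else "."
    return " ".join(styled) + terminator
```

A context-aware language model would refine this further (commas at pause points, ellipses at hesitations), but the mapping from prosodic cues to orthographic marks is the core of the step.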

[0030] After that, a response generation module generates personalized replies based on insights gathered from the customer's emotions and voice style. Drawing on data from the sentiment analysis module, which identifies whether the customer's mood is positive, negative or neutral, and the NLP module, which reveals their tone and communication style, this module generates responses that mirror the customer's emotional state and linguistic preferences. It utilizes natural language generation (NLG) techniques powered by machine learning models, such as transformer-based architectures, trained on large datasets of human interactions.

[0031] These models are capable of selecting vocabulary, sentence structure and tone, such as formal, casual, empathetic or enthusiastic, that align with the customer's detected sentiment and voice style. For example, if a customer expresses frustration in a fast, loud tone, the module might respond with calm, apologetic language and reassuring phrasing. Conversely, if the input reflects excitement, the response may echo that enthusiasm with upbeat language. Additionally, the system tailors responses by incorporating context-aware dialogue strategies, including the customer's previous interactions. This ensures that replies feel human-like, context-sensitive and emotionally attuned, significantly enhancing the quality of interaction.
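The emotion-conditioned wording choice described above can be reduced to a minimal template-selection sketch. The template texts and the name-prefixing behavior are hypothetical; the specification envisions a trained NLG model, for which these fixed templates are only placeholders.

```python
# Sketch of emotion-conditioned response selection. The templates are
# hypothetical placeholders for a trained NLG model's output.

TEMPLATES = {
    "negative": "I'm sorry about the trouble. Let me fix this for you right away.",
    "positive": "That's wonderful to hear! Happy to help you keep things going.",
    "neutral":  "Thanks for reaching out. Here is what I found for you.",
}

def generate_response(emotion, customer_name=None):
    """Pick wording whose tone mirrors the detected emotional state,
    optionally personalizing with the customer's name."""
    reply = TEMPLATES.get(emotion, TEMPLATES["neutral"])
    if customer_name:
        return f"{customer_name}, {reply[0].lower()}{reply[1:]}"
    return reply
```

In the full system, dialogue history and the customer's voice style would condition the generation; here the emotion label alone drives the tone.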

[0032] A feedback module delivers the personalized responses generated by the response generation module back to the customer in a timely and context-aware manner. The feedback module primarily ensures that the crafted reply, tailored to match the customer's emotional state and voice style, is conveyed in the most effective format. The module allows responses to be sent via chatbots, voice assistants, email or messaging platforms, depending on the user's interaction mode, and can also convert text responses back into speech using text-to-speech engines when communicating through voice interfaces, selecting tone, speed and inflection that reflect the intended emotional tone. The feedback module ensures smooth, empathetic and contextually appropriate communication to provide customer satisfaction.
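The channel routing the feedback module performs can be sketched as a small dispatcher. The channel names and the text-to-speech stub are illustrative assumptions; a real deployment would plug in an actual TTS engine and messaging integrations.

```python
# Channel-dispatch sketch for the feedback module. Channel names and
# the text-to-speech callable are illustrative assumptions.

def deliver(response_text, channel, tts=None):
    """Route the personalized response to the customer's interaction
    channel; voice channels pass through a text-to-speech engine."""
    if channel == "voice":
        if tts is None:
            raise ValueError("voice delivery requires a text-to-speech engine")
        return ("audio", tts(response_text))   # synthesized speech payload
    if channel in {"chat", "email", "sms"}:
        return ("text", response_text)          # plain-text payload
    raise ValueError(f"unsupported channel: {channel}")
```

Keeping delivery behind one function lets the earlier modules stay channel-agnostic: they produce one reply, and only this step decides its final form.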

[0033] The present invention works best in the following manner. The voice input module captures spoken words and converts them into text using a high-accuracy speech-to-text protocol. The transcribed text, along with the original audio, is then processed by the sentiment analysis module, which uses a trained machine learning model to identify the customer's emotional state, positive, negative or neutral, based on words, phrases, tone and intonation. The output is forwarded to the natural language processing (NLP) module, which refines the textual data by adjusting formatting to align with the customer's voice style, inserting punctuation and emphasis that reflect speech patterns and tone while preserving the intent and emotional nuance of the original message. The response generation module then crafts personalized replies by selecting vocabulary, tone and structure aligned with the customer's emotional state and communication style, making the response feel natural and empathetic. Finally, the feedback module delivers the personalized responses through appropriate channels, voice, text or digital platforms, enhancing the customer's experience.
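The end-to-end workflow of Figure 1 amounts to chaining the five modules. The sketch below wires hypothetical stand-ins for each stage; every callable here is an assumption standing in for the corresponding module described in the specification.

```python
# End-to-end sketch of the Figure 1 workflow, chaining stand-ins for
# the five modules. Each callable is a hypothetical placeholder.

def pipeline(audio_samples, stt, sentiment, formatter, responder, deliverer):
    """Run one customer utterance through all five modules in order."""
    text = stt(audio_samples)                  # voice input module
    emotion = sentiment(text, audio_samples)   # sentiment analysis module
    styled = formatter(text, audio_samples)    # NLP formatting module
    reply = responder(emotion, styled)         # response generation module
    return deliverer(reply)                    # feedback module
```

Passing both `text` and `audio_samples` to the sentiment and formatting stages reflects the specification's point that emotion and style are read from the audio as well as the transcript.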

[0034] Although the field of the invention has been described herein with limited reference to specific embodiments, this description is not meant to be construed in a limiting sense. Various modifications of the disclosed embodiments, as well as alternate embodiments of the invention, will become apparent to persons skilled in the art upon reference to the description of the invention.

Claims:

1) A voice assistant system for improving customer service interactions, comprising:

i) a voice input module that captures a customer's spoken words and converts them into text for analysis;

ii) a sentiment analysis module that processes the text and audio to identify the customer's emotions based on words, phrases, tone, and intonation;

iii) a natural language processing (NLP) module that adjusts the text formatting to match the customer's voice style for better understanding;

iv) a response generation module that creates personalized responses based on the customer's emotions and voice style; and

v) a feedback module that delivers the personalized responses to the customer to improve satisfaction.

2) The system as claimed in claim 1, wherein the voice input module uses a speech-to-text protocol to convert spoken words into text with high accuracy.

3) The system as claimed in claim 1, wherein the sentiment analysis module uses a trained machine learning model to detect positive, negative, or neutral emotions from the text and audio.

4) The system as claimed in claim 1, wherein the NLP module adjusts text formatting by adding punctuation and emphasis based on the customer's tone and speech patterns.

5) The system as claimed in claim 1, wherein the response generation module selects words and tone for responses that match the customer's emotional state to enhance interaction.

Documents

Application Documents

# Name Date
1 202541077338-STATEMENT OF UNDERTAKING (FORM 3) [13-08-2025(online)].pdf 2025-08-13
2 202541077338-REQUEST FOR EARLY PUBLICATION(FORM-9) [13-08-2025(online)].pdf 2025-08-13
3 202541077338-PROOF OF RIGHT [13-08-2025(online)].pdf 2025-08-13
4 202541077338-POWER OF AUTHORITY [13-08-2025(online)].pdf 2025-08-13
5 202541077338-FORM-9 [13-08-2025(online)].pdf 2025-08-13
6 202541077338-FORM FOR SMALL ENTITY(FORM-28) [13-08-2025(online)].pdf 2025-08-13
7 202541077338-FORM 1 [13-08-2025(online)].pdf 2025-08-13
8 202541077338-FIGURE OF ABSTRACT [13-08-2025(online)].pdf 2025-08-13
9 202541077338-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [13-08-2025(online)].pdf 2025-08-13
10 202541077338-EVIDENCE FOR REGISTRATION UNDER SSI [13-08-2025(online)].pdf 2025-08-13
11 202541077338-EDUCATIONAL INSTITUTION(S) [13-08-2025(online)].pdf 2025-08-13
12 202541077338-DRAWINGS [13-08-2025(online)].pdf 2025-08-13
13 202541077338-DECLARATION OF INVENTORSHIP (FORM 5) [13-08-2025(online)].pdf 2025-08-13
14 202541077338-COMPLETE SPECIFICATION [13-08-2025(online)].pdf 2025-08-13