
Advanced Speech Recognition System With Integrated Noise Suppression And Accent Adaptation

Abstract: [026] The invention pertains to an advanced speech recognition system designed to enhance the accuracy and usability of voice-controlled interfaces in diverse environments. This system integrates a microphone array, a digital signal processor (DSP), and an accent adaptation module. The microphone array captures audio from multiple directions, focusing primarily on the speaker's voice while minimizing background noise. The DSP processes these audio signals in real-time, employing sophisticated noise filtering algorithms to reduce ambient noise and improve clarity. Additionally, the accent adaptation module utilizes machine learning models to detect and adjust to the speaker's accent, thereby optimizing speech recognition accuracy across different linguistic backgrounds. This integrated approach not only improves speech recognition reliability in noisy settings but also ensures greater inclusivity by adapting to various accents, making it ideal for global applications in telecommunications, virtual assistants, and automated transcription services.


Patent Information

Application #: 202541002268
Filing Date: 10 January 2025
Publication Number: 04/2025
Publication Type: INA
Invention Field: ELECTRONICS
Status:
Parent Application:

Applicants

Soundarya M
Department of Electronics and Communication Engineering, Saveetha School of Engineering, SIMATS, Chennai, Tamil Nadu-602105, India.
Dr. Vidya Kamma
Assistant Professor, Department of Computer Science and Engineering, Neil Gogte Institute of Technology, Rangareddy, Hyderabad, Telangana-500039, India.
Ramya Palaniappan
Assistant Professor, Department of Computer Science and Engineering, Madanapalle Institute of Technology & Science, Madanapalle, Andhra Pradesh-517325, India.
Dr. M.N. Sudha
Assistant Professor, Department of Information Technology, Government College of Engineering, Erode, Tamil Nadu-638316, India.
Dr. K. Tamilselvan
Associate Professor, Department of Information Technology, AVS Engineering College, Salem, Tamil Nadu-636003, India.
Dr. C. Aarthi
Professor, Department of Electronics and Communication Engineering, Sengunthar Engineering College, Tiruchengode, Tamil Nadu-637205, India.
P. Sapthika Parthi
Assistant Professor, Department of Electronics and Instrumentation Engineering, Erode Sengunthar Engineering College, Thudupathi, Tamil Nadu-638057, India.
Dr. S. Aruna
Department of Computational Intelligence, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu-603203, India.
Er. Tatiraju V. Rajani Kanth
Senior Manager, TVR Consulting Services Private Limited, Gajularamaram, Medchal Malkajgiri, Hyderabad, Telangana-500055, India.
Dr. R. Logesh Babu
Assistant Professor III, Department of Computer Science and Business Systems, KPR Institute of Engineering and Technology, Avinashi Road, Arasur, Coimbatore, Tamil Nadu-641407, India.

Inventors

1. Soundarya M
Department of Electronics and Communication Engineering, Saveetha School of Engineering, SIMATS, Chennai, Tamil Nadu-602105, India.
2. Dr. Vidya Kamma
Assistant Professor, Department of Computer Science and Engineering, Neil Gogte Institute of Technology, Rangareddy, Hyderabad, Telangana-500039, India.
3. Ramya Palaniappan
Assistant Professor, Department of Computer Science and Engineering, Madanapalle Institute of Technology & Science, Madanapalle, Andhra Pradesh-517325, India.
4. Dr. M.N. Sudha
Assistant Professor, Department of Information Technology, Government College of Engineering, Erode, Tamil Nadu-638316, India.
5. Dr. K. Tamilselvan
Associate Professor, Department of Information Technology, AVS Engineering College, Salem, Tamil Nadu-636003, India.
6. Dr. C. Aarthi
Professor, Department of Electronics and Communication Engineering, Sengunthar Engineering College, Tiruchengode, Tamil Nadu-637205, India.
7. P. Sapthika Parthi
Assistant Professor, Department of Electronics and Instrumentation Engineering, Erode Sengunthar Engineering College, Thudupathi, Tamil Nadu-638057, India.
8. Dr. S. Aruna
Department of Computational Intelligence, School of Computing, SRM Institute of Science and Technology, Kattankulathur, Tamil Nadu-603203, India.
9. Er. Tatiraju V. Rajani Kanth
Senior Manager, TVR Consulting Services Private Limited, Gajularamaram, Medchal Malkajgiri, Hyderabad, Telangana-500055, India.
10. Dr. R. Logesh Babu
Assistant Professor III, Department of Computer Science and Business Systems, KPR Institute of Engineering and Technology, Avinashi Road, Arasur, Coimbatore, Tamil Nadu-641407, India.

Specification

Description: [001] The present invention relates to speech recognition technology. More specifically, it pertains to a system designed to improve the accuracy of speech recognition in noisy environments and adapt to various speech accents.
BACKGROUND OF THE INVENTION
[002] The following description provides information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
[003] Further, the approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
[004] Speech recognition technology has become increasingly essential in various fields such as customer service, automated transcription, and assistive technologies. Traditional speech recognition systems, however, often face challenges when operating in noisy environments or dealing with speakers who have strong accents. These limitations can significantly degrade the system's accuracy and limit its usability, especially in diverse and dynamic settings.
[005] Accordingly, on the basis of aforesaid facts, advancements in digital signal processing and machine learning have opened new avenues for enhancing speech recognition systems. By integrating sophisticated noise suppression techniques and accent adaptation capabilities, it is now possible to develop systems that not only understand spoken language more accurately but also adapt to the linguistic characteristics of different user groups. This adaptation is crucial for applications ranging from multinational corporations to accessibility tools for non-native speakers, where clear and accurate communication is key.
Therefore, it would be useful and desirable to have a system, method, apparatus, and interfaces that meet the above-mentioned needs.
SUMMARY OF THE PRESENT INVENTION
[006] The present invention introduces an advanced speech recognition system designed to overcome the limitations of conventional systems by integrating real-time noise filtering and accent adaptation capabilities. This system utilizes a combination of hardware components and software algorithms to enhance the clarity and accuracy of voice inputs, even in challenging acoustic environments. The key components include a microphone array for capturing high-quality audio, a digital signal processor (DSP) for real-time noise reduction, and an accent adaptation module that adjusts to the speaker's accent, ensuring the system remains effective across diverse user demographics.
[007] The operational methodology of the system focuses on delivering real-time processing to ensure seamless interaction with users. The microphone array captures audio from multiple directions, which is then processed by the DSP to differentiate and filter out background noise from the primary speech signal. Concurrently, the accent adaptation module employs machine learning techniques to analyze and adjust to the detected accent, refining the speech recognition process to better match the phonetic and linguistic nuances of the speaker’s language.
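For illustration only, the following minimal Python sketch shows how the capture, noise-filtering, and accent-adaptation stages described above might be chained in a real-time processing loop. The helper names (beamform, suppress_noise, AccentAdapter, Recognizer), the frame size, and the sampling rate are illustrative assumptions, not components defined in this specification; the placeholder bodies only stand in for the real DSP and machine learning stages.

```python
import numpy as np

# Illustrative stand-ins for the three stages; real implementations would
# live in the DSP firmware and the machine learning runtime.
def beamform(frames):
    # Naive average of the array channels, shape (channels, N) -> (N,).
    return frames.mean(axis=0)

def suppress_noise(signal, fs):
    # Pass-through placeholder for the DSP noise-filtering stage.
    return signal

class AccentAdapter:
    def update(self, signal):
        # Placeholder accent profile; a trained model would go here.
        return {"accent": "generic"}

class Recognizer:
    def decode(self, signal, config):
        return f"<transcript decoded with {config['accent']} profile>"

def process_frame(mic_frames, adapter, recognizer, fs=16_000):
    enhanced = beamform(mic_frames)              # 1. spatial focusing
    clean = suppress_noise(enhanced, fs)         # 2. real-time noise filtering
    profile = adapter.update(clean)              # 3. accent detection/adaptation
    return recognizer.decode(clean, config=profile)  # 4. recognition

# Example: one 32 ms frame from a four-microphone array at 16 kHz.
frame = np.random.randn(4, 512)
print(process_frame(frame, AccentAdapter(), Recognizer()))
```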
[008] The advanced speech recognition system described herein offers significant improvements in speech recognition technology by addressing two major challenges: background noise and accent diversity. By enhancing audio input quality and adapting to various accents, the system provides more reliable and accurate transcription and interaction capabilities, making it highly suitable for a wide range of applications, including telecommunications, virtual assistants, and automated transcription services in multilingual and noisy environments. This innovative approach not only improves user experience but also expands the potential use cases for speech recognition technology across global markets.
[009] In this respect, before explaining at least one object of the invention in detail, it is to be understood that the invention is not limited in its application to the details of set of rules and to the arrangements of the various models set forth in the following description or illustrated in the drawings. The invention is capable of other objects and of being practiced and carried out in various ways, according to the need of that industry. Also, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting.
[010] These together with other objects of the invention, along with the various features of novelty which characterize the invention, are pointed out with particularity in the disclosure. For a better understanding of the invention, its operating advantages and the specific objects attained by its uses, reference should be made to the accompanying drawings and descriptive matter in which there are illustrated preferred embodiments of the invention.
BRIEF DESCRIPTION OF THE DRAWINGS
[011] The present invention will be better understood, and objects other than those mentioned above will become apparent, from the following detailed description thereof. Such description refers to the illustrations in the annex, wherein:
[012] FIG. 1 illustrates the System Architecture for the speech recognition system, in accordance with an embodiment of the present invention.
[013] FIG. 2 illustrates the Detailed Module Interaction for the speech recognition system, in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION OF THE INVENTION
[014] The following sections describe various embodiments of the present invention with reference to the accompanying drawings, wherein the reference numbers used in the drawings correspond to like elements throughout the description. The invention is not, however, limited to the embodiments described here and may be embodied in several other ways; the embodiments are included so that this disclosure is thorough and complete and fully conveys the scope of the invention to those of ordinary skill in the art. Numerical values and ranges are given for many aspects of the implementations discussed in the following detailed description; these numbers and ranges are examples only and are not meant to restrict the scope of the claims. A variety of materials are likewise identified as suitable for certain aspects of the implementations; these materials are examples only and are not meant to restrict the scope of the invention.
[015] Referring now to the drawings, as illustrated in FIG. 1, the advanced speech recognition system comprises several integral components that collectively enhance its functionality. Primary among these are the microphone array, the digital signal processor (DSP), and the accent adaptation module. Each component is optimized to handle specific challenges within the speech recognition process, such as ambient noise and linguistic diversity, ensuring high-quality audio capture and processing tailored to the user's specific speech patterns.
[016] In accordance with another embodiment of the present invention, the microphone array is designed with multiple microphones strategically positioned to capture sound from various directions. This setup allows the system to focus on the primary sound source—typically the speaker's voice—while minimizing input from surrounding noise sources. The array's configuration can be adjusted dynamically based on the environment, providing flexibility and enhancing the system's ability to operate effectively in both quiet and noisy settings.
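One common way to realize such directional focusing is delay-and-sum beamforming, as a minimal frequency-domain sketch for a uniform linear array shows below. The microphone spacing, steering angle, and sampling rate are assumptions chosen for illustration, not values fixed by this specification.

```python
import numpy as np

def delay_and_sum(frames, mic_spacing=0.04, angle_deg=0.0,
                  fs=16_000, c=343.0):
    """Align and average the channels of a uniform linear array so that
    sound arriving from `angle_deg` (0 = broadside) adds coherently while
    off-axis noise partially cancels. `frames` has shape (channels, N)."""
    n_mics, n = frames.shape
    # Per-microphone arrival delay for a plane wave from the steering angle.
    delays = np.arange(n_mics) * mic_spacing * np.sin(np.radians(angle_deg)) / c

    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectra = np.fft.rfft(frames, axis=1)
    # Advance each channel by its delay (a phase shift), then average.
    aligned = spectra * np.exp(2j * np.pi * freqs * delays[:, None])
    return np.fft.irfft(aligned.mean(axis=0), n=n)

# Example: steer toward a talker 20 degrees off broadside.
frame = np.random.randn(4, 512)
enhanced = delay_and_sum(frame, angle_deg=20.0)
```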
[017] In accordance with another embodiment of the present invention, central to the noise reduction capability is the DSP, which is equipped with sophisticated algorithms to analyze and filter audio signals in real-time. The DSP performs continuous frequency analysis to identify the components of the audio signal that correspond to background noise versus those that represent clear speech. By employing adaptive filtering techniques, the DSP can effectively suppress unwanted noise, ensuring that the processed audio consists predominantly of the speaker's voice.
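As one concrete example of such adaptive filtering, the sketch below implements a normalized least-mean-squares (NLMS) noise canceller. It assumes that a noise-only reference signal is available (for instance, from a rear-facing element of the array); that reference, the filter length, and the step size are assumptions of this illustration rather than requirements of the specification.

```python
import numpy as np

def nlms_noise_canceller(primary, noise_ref, taps=32, mu=0.1, eps=1e-8):
    """Normalized LMS adaptive noise canceller. `primary` is speech plus
    noise; `noise_ref` is a correlated noise-only reference. The filter
    learns the noise path, and the error signal is the cleaned speech."""
    w = np.zeros(taps)
    out = np.zeros_like(primary, dtype=float)
    for n in range(taps, len(primary)):
        x = noise_ref[n - taps:n][::-1]          # most recent reference samples
        noise_estimate = np.dot(w, x)
        e = primary[n] - noise_estimate          # residual = enhanced speech
        w += mu * e * x / (eps + np.dot(x, x))   # NLMS weight update
        out[n] = e
    return out

# Example with synthetic data: a tone buried in noise that leaks into both inputs.
fs = 16_000
t = np.arange(fs) / fs
noise = np.random.randn(fs)
primary = np.sin(2 * np.pi * 440 * t) + 0.8 * np.convolve(noise, [0.5, 0.3], 'same')
clean = nlms_noise_canceller(primary, noise)
```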
[019] In accordance with another embodiment of the present invention, the accent adaptation module utilizes advanced machine learning models that have been trained on a diverse dataset encompassing numerous accents and dialects. This module detects the accent of the speaker as soon as speech input begins and adjusts the speech recognition algorithms accordingly. This dynamic adjustment allows the system to interpret various linguistic nuances accurately, enhancing its utility in a global context, as shown in FIG. 2.
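The following sketch illustrates, in highly simplified form, how such a module might map an utterance-level feature vector to a stored decoding profile. The nearest-centroid classifier, the 13-dimensional embeddings, and the profile contents are illustrative assumptions standing in for the trained machine learning models described above.

```python
import numpy as np

# Hypothetical per-accent decoding profiles; the real module would adjust
# phoneme-recognition patterns and intonation models, as described herein.
ACCENT_PROFILES = {
    "accent_a": {"phoneme_bias": 0.8, "intonation_model": "a"},
    "accent_b": {"phoneme_bias": 1.2, "intonation_model": "b"},
}

class NearestCentroidAccentDetector:
    """Toy stand-in for the trained accent model: classify an utterance-level
    feature vector by its nearest accent centroid, then return the matching
    decoding profile from the internal library."""
    def __init__(self, centroids):
        self.labels = list(centroids)
        self.matrix = np.stack([centroids[k] for k in self.labels])

    def select_profile(self, features):
        dists = np.linalg.norm(self.matrix - features, axis=1)
        accent = self.labels[int(np.argmin(dists))]
        return accent, ACCENT_PROFILES[accent]

# Example with made-up 13-dimensional utterance embeddings.
rng = np.random.default_rng(0)
centroids = {"accent_a": rng.normal(0, 1, 13), "accent_b": rng.normal(1, 1, 13)}
detector = NearestCentroidAccentDetector(centroids)
print(detector.select_profile(rng.normal(0, 1, 13)))
```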
[020] A significant feature of this invention is its capability to perform all processing in real-time. This real-time operation is crucial for applications requiring immediate feedback, such as interactive voice response systems and real-time communication devices. The system's ability to quickly process and adapt to the speaker's voice without noticeable delays ensures a seamless and efficient user experience.
[021] The system is designed to be easily integrated with various platforms and devices, including smartphones, computers, and specialized communication equipment. The user interface is intuitive, allowing users to interact with the system through voice commands. Feedback and settings adjustments can also be voiced directly, providing a hands-free operation that enhances accessibility and ease of use.
[022] The noise filtering process involves several steps, starting with the initial capture of the audio via the microphone array. This raw audio data is then sent to the DSP, where it undergoes a detailed analysis to separate speech from noise based on frequency, amplitude, and other acoustic characteristics. Techniques such as spectral subtraction, Wiener filtering, and beamforming are used to refine the audio signal, ensuring that only the speaker's voice is enhanced and transmitted to the speech recognition engine. Upon detecting the speaker’s accent, the system consults its internal library of accent models to select the most appropriate processing strategy. This involves adjusting specific parameters within the speech recognition algorithm, such as phoneme recognition patterns and intonation profiles, to align with the identified accent. This flexibility not only improves accuracy but also significantly reduces the incidence of misinterpretations and errors in transcription.
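Of the techniques named above, magnitude spectral subtraction is the simplest to illustrate. The sketch below estimates the noise spectrum from a few leading frames that are assumed to contain no speech, subtracts it from each frame's magnitude, and resynthesizes the signal by overlap-add; the frame length, hop size, and spectral floor are illustrative assumptions.

```python
import numpy as np

def spectral_subtraction(x, fs=16_000, frame_len=512, hop=256,
                         noise_frames=10, floor=0.05):
    """Basic magnitude spectral subtraction: estimate the noise spectrum
    from the first few (assumed speech-free) frames, subtract it from each
    frame's magnitude, and resynthesize with the original phase."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    spectra = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spectra), np.angle(spectra)

    noise_mag = mag[:noise_frames].mean(axis=0)          # noise estimate
    cleaned = np.maximum(mag - noise_mag, floor * mag)   # subtract, keep a floor

    resynth = np.fft.irfft(cleaned * np.exp(1j * phase), n=frame_len, axis=1)
    out = np.zeros(len(x))
    for i, frame in enumerate(resynth):
        out[i * hop:i * hop + frame_len] += frame * window   # overlap-add
    return out

# Example: a speech-band tone with additive noise; the leading samples are noise only.
noisy = np.concatenate([np.random.randn(4096) * 0.3,
                        np.sin(2 * np.pi * 300 * np.arange(16_000) / 16_000)
                        + np.random.randn(16_000) * 0.3])
enhanced = spectral_subtraction(noisy)
```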
[023] The machine learning models within the accent adaptation module are not static; they are designed to evolve and improve over time through continuous learning mechanisms. As the system encounters new accents or dialects, it can update its models to refine its understanding and processing of these new speech patterns. This continuous learning process ensures that the system remains effective as language evolves and as it is exposed to new linguistic environments. The versatility and robustness of this advanced speech recognition system make it ideal for a variety of applications. In customer service, it can provide accurate transcription and interaction despite noisy backgrounds, such as in call centers or public service areas. For multilingual applications, such as in international airports or global customer support, the system's ability to adapt to different accents can greatly enhance communication effectiveness. Additionally, its integration capabilities make it suitable for use in IoT devices and smart home systems, where voice commands are increasingly becoming the primary mode of interaction.
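Continuing the nearest-centroid stand-in from the earlier sketch, the continuous learning described here can be illustrated as an exponential-moving-average update of a stored accent centroid each time a new utterance is classified. The learning rate is an assumption, and a production system would more likely retrain or fine-tune the full accent model periodically.

```python
import numpy as np

def update_centroid(centroids, accent, features, lr=0.05):
    """Drift the stored centroid for `accent` toward a newly observed
    utterance embedding (exponential moving average). This is only an
    illustrative stand-in for retraining the full accent model."""
    centroids[accent] = (1 - lr) * centroids[accent] + lr * np.asarray(features)
    return centroids

# Example: refine "accent_a" after a new utterance is classified as such.
centroids = {"accent_a": np.zeros(13), "accent_b": np.ones(13)}
centroids = update_centroid(centroids, "accent_a", np.full(13, 0.5))
```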
[024] The benefits and advantages that the present invention may offer have been discussed above with reference to particular embodiments. These benefits and advantages, and any elements that may cause them to occur or become more pronounced, are not to be construed as critical, required, or essential features of any or all of the embodiments.
[025] Although specific embodiments have been used to describe the current invention, it should be recognized that these embodiments are merely illustrative and that the invention is not limited to them. The aforementioned embodiments are open to numerous alterations, additions, and improvements. These adaptations, changes, additions, and enhancements are considered to be within the purview of the invention.
Claims:

1. A speech recognition system comprising:
a microphone array configured to capture audio signals from multiple directions;
a digital signal processor (DSP) programmed to execute real-time noise filtering algorithms to suppress background noise from said audio signals; and
an accent adaptation module equipped with machine learning models to detect and adapt the speech recognition process to the speaker's accent based on said audio signals.
2. The speech recognition system as claimed in claim 1, wherein the microphone array includes at least four microphones positioned to optimize the capture of direct voice inputs while reducing the influence of ambient noise.
3. The speech recognition system as claimed in claim 1, wherein the DSP utilizes adaptive filtering algorithms that adjust parameters based on the type and level of background noise detected.
4. The speech recognition system as claimed in claim 1, wherein the accent adaptation module is trained on a dataset comprising speech samples from at least twenty different linguistic backgrounds.
5. The speech recognition system as claimed in claim 1, wherein the system further includes a user interface capable of receiving user input through voice commands and providing auditory feedback.
6. The speech recognition system as claimed in claim 1, wherein the noise filtering algorithms include spectral subtraction techniques specifically tailored for voice frequency bands.
7. The speech recognition system as claimed in claim 1, wherein the DSP is further configured to enhance the clarity of the speech signal by employing beamforming techniques that focus on the primary sound source.
8. The speech recognition system as claimed in claim 1, wherein the accent adaptation module updates its machine learning models in response to new accents detected during operation to improve accuracy over time.
9. The speech recognition system as claimed in claim 1, wherein the machine learning models within the accent adaptation module use deep learning algorithms to enhance the recognition and differentiation of accents.
10. The speech recognition system as claimed in claim 1, further including integration capabilities with external devices and platforms, enabling the system to function across various operating systems and hardware configurations.

Documents

Application Documents

# Name Date
1 202541002268-STATEMENT OF UNDERTAKING (FORM 3) [10-01-2025(online)].pdf 2025-01-10
2 202541002268-REQUEST FOR EARLY PUBLICATION(FORM-9) [10-01-2025(online)].pdf 2025-01-10
3 202541002268-FORM-9 [10-01-2025(online)].pdf 2025-01-10
4 202541002268-FORM 1 [10-01-2025(online)].pdf 2025-01-10
5 202541002268-DRAWINGS [10-01-2025(online)].pdf 2025-01-10
6 202541002268-DECLARATION OF INVENTORSHIP (FORM 5) [10-01-2025(online)].pdf 2025-01-10
7 202541002268-COMPLETE SPECIFICATION [10-01-2025(online)].pdf 2025-01-10