Device For Text To Speech Synthesis

Abstract: A device (100) for text to speech synthesis for visually impaired users is disclosed. The device (100) incorporates an image capturing unit (102) that photographs text when the user presses a button (104). The captured image undergoes Optical Character Recognition (OCR) in a connected controller (106). The recognized text is then converted to audio through a sound unit (108) linked to the controller (106). The device (100) offers a real-time solution for visually impaired individuals, giving immediate access to printed material. By integrating image capture, OCR, and speech synthesis, the device (100) aims to improve accessibility and independence for the visually impaired community. Claims: 10, Figures: 2. Figure 2 is selected.

Patent Information

Application #

Filing Date

25 November 2023

Publication Number

51/2023

Publication Type

INA

Invention Field

PHYSICS

Status

Email

Parent Application

Applicants

SR University

SR University, Ananthasagar, Warangal, Telangana-506371, India (IN) Email ID: patent@sru.edu.in Mb: 08702818333

Inventors

1. Rajeshwarrao Arabelli

3-40, Bhawsingpally, Chityal, Bhupalpally Dist-506356, India

2. Pittala Rajesham

11-52, Bheemaram, Manchiryal Dist-504231, India

3. R. Haripriya

SR University, Ananthasagar, Warangal, Telangana, India

4. S. Siri

SR University, Ananthasagar, Warangal, Telangana, India

5. T. Bhavana

SR University, Ananthasagar, Warangal, Telangana, India

6. B. Santhosh

SR University, Ananthasagar, Warangal, Telangana, India

7. M. Chandan

SR University, Ananthasagar, Warangal, Telangana, India

8. Dr. P. Venkata Ramana Rao

2-7-1075, Kanakadurga colony, Waddelally road, Hanumakonda- 506370, India

9. Durgam Rajababu

2-7-999, Kanakadurga colony, Hanamkonda – 506001, India

10. Srinivas. D

2-7-999, Kanakadurga colony, Hanamkonda – 506001, India

Claims

1. A device (100) for text to speech synthesis for visually impaired users, the device (100) comprising:
an image capturing unit (102) adapted to capture an image of a text, wherein the image of the text is captured upon pressing of a button (104) by a user;
a controller (106) connected to the image capturing unit (102), characterized in that the controller (106) is adapted to perform an Optical Character Recognition (OCR) on the text captured in the image; and
a sound unit (108) connected to the controller (106) and adapted to announce the text recognized in the captured image.

2. The device (100) as claimed in claim 1, wherein the image capturing unit (102) is a camera.

3. The device (100) as claimed in claim 1, wherein the button (104) is engraved with braille script to guide the visually impaired.

4. The device (100) as claimed in claim 1, wherein the sound unit (108) is a speaker.

5. The device (100) as claimed in claim 1, further comprising a battery to supply operational power to the controller (106).

6. A method for synthesis from text to speech for visually impaired users, the method characterised in steps of:
capturing an image of a text using an image capturing unit (102) by pressing a button (104) by a user;
performing an Optical Character Recognition (OCR) on the text in the captured image; and
announcing the text recognized in the captured image using a sound unit (108).

7. The method as claimed in claim 6, wherein the image capturing unit (102) is a camera.

8. The method as claimed in claim 6, wherein the button (104) is engraved with braille script to guide the visually impaired.

9. The method as claimed in claim 6, wherein the sound unit (108) is a speaker.

10. The method as claimed in claim 6, wherein the Optical Character Recognition (OCR) is performed on the text using a controller (106).

Date: November 14, 2023
Place: Noida
Dr. Keerti Gupta, Agent for the Applicant (IN/PA-1529)

Specification

BACKGROUND
Field of Invention
[001] Embodiments of the present invention generally relate to a device for text to speech synthesis and particularly to a device for text to speech synthesis for visually impaired users.
Description of Related Art
[002] Visual impairment, encompassing various degrees of blindness and low vision, represents a significant challenge for affected individuals in their daily lives. One critical aspect of this challenge is the limited accessibility to printed materials, including books, articles, and digital content.
[003] Traditional methods of accessing written information for the visually impaired primarily involve Braille, which requires specialized training and materials. Additionally, audio-based solutions often employ pre-recorded audiobooks, limiting the availability of real-time content.
[004] There is thus a need for an improved device for text to speech synthesis for visually impaired users that addresses the aforementioned limitations more efficiently.
SUMMARY
[005] Embodiments in accordance with the present invention provide a device for text to speech synthesis for visually impaired users. The device comprises an image capturing unit adapted to capture an image of a text, wherein the image of the text is captured upon pressing of a button by a user. The device further comprises a controller connected to the image capturing unit and adapted to perform an Optical Character Recognition (OCR) on the text captured in the image. The device further comprises a sound unit connected to the controller and adapted to announce the text recognized in the captured image.
[006] Embodiments in accordance with the present invention further provide a method for synthesis from text to speech for visually impaired users. The method comprises the steps of: capturing an image of a text using an image capturing unit when a user presses a button; performing an Optical Character Recognition (OCR) on the text in the captured image; and announcing the text recognized in the captured image using a sound unit.
[007] Embodiments of the present invention may provide a number of advantages depending on their particular configuration. First, embodiments of the present application may provide a device for text to speech synthesis for visually impaired users.
[008] Next, embodiments of the present application may provide a device for text to speech synthesis that is user friendly and cost effective.
[009] These and other advantages will be apparent from the embodiments described herein.
[0010] The preceding is a simplified summary to provide an understanding of some embodiments of the present invention. This summary is neither an extensive nor exhaustive overview of the present invention and its various embodiments. The summary presents selected concepts of the embodiments of the present invention in a simplified form as an introduction to the more detailed description presented below. As will be appreciated, other embodiments of the present invention are possible utilizing, alone or in combination, one or more of the features set forth above or described in detail below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The above and still further features and advantages of embodiments of the present invention will become apparent upon consideration of the following detailed description of embodiments thereof, especially when taken in conjunction with the accompanying drawings, and wherein:
[0012] FIG. 1 illustrates a block diagram depicting a device for text to speech synthesis for visually impaired users, according to an embodiment of the present invention; and
[0013] FIG. 2 depicts a flowchart of a method for synthesis from text to speech for visually impaired users, according to an embodiment of the present invention.
[0014] The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word "may" is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean including but not limited to. To facilitate understanding, like reference numerals have been used, where possible, to designate like elements common to the figures. Optional portions of the figures may be illustrated using dashed or dotted lines, unless the context of usage indicates otherwise.
DETAILED DESCRIPTION
[0015] The following description includes the preferred best mode of one embodiment of the present invention. It will be clear from this description that the invention is not limited to the illustrated embodiments but also includes a variety of modifications and embodiments thereto. Therefore, the present description should be seen as illustrative and not limiting. While the invention is susceptible to various modifications and alternative constructions, it should be understood that there is no intention to limit the invention to the specific form disclosed; on the contrary, the invention is to cover all modifications, alternative constructions, and equivalents falling within the scope of the invention as defined in the claims.
[0016] In any embodiment described herein, the open-ended terms "comprising", "comprises", and the like (which are synonymous with "including", "having", and "characterized by") may be replaced by the respective partially closed phrases "consisting essentially of", "consists essentially of", and the like, or by the respective closed phrases "consisting of", "consists of", and the like.
[0017] As used herein, the singular forms “a”, “an”, and “the” designate both the singular and the plural, unless expressly stated to designate the singular only.
[0018] FIG. 1 illustrates a block diagram depicting a device 100 for text to speech synthesis, according to an embodiment of the present invention. The device 100 may be useful for visually impaired users. The device 100 may serve as an assistive technology designed to facilitate enhanced accessibility and engagement with textual content. According to embodiments of the present invention, the device 100 comprises an image capturing unit 102, a button 104, a controller 106, a sound unit 108, and a battery 110.
[0019] In an embodiment of the present invention, the image capturing unit 102 may be adapted to capture an image of a text. The image of the text may be captured upon pressing the button 104 by a user, in an embodiment of the present invention. In an embodiment of the present invention, the button 104 may be placed on the device 100 within a reachable proximity of fingers of the user. The button 104 may be engraved with braille script to guide the visually impaired, in an embodiment of the present invention.
[0020] In a preferred embodiment of the present invention, the image capturing unit 102 may be a camera. Embodiments of the present invention are intended to include or otherwise cover any type of the image capturing unit 102, including known, related art, and/or later developed technologies. According to embodiments of the present invention, the button 104 may be an electronic button such as, but not limited to, a push button, a selector button, a limit button, a proximity button, a pressure button, a speed button, a temperature button, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the electronic button that may be used as the button 104, including known, related art, and/or later developed technologies.
[0021] In an embodiment of the present invention, the controller 106 may be connected to the image capturing unit 102. The controller 106 may be adapted to perform an Optical Character Recognition (OCR) on the text captured in the image, in an embodiment of the present invention.
[0022] The controller 106 may further be configured to execute computer-executable instructions to generate an output relating to the device 100. According to embodiments of the present invention, the controller 106 may be, but is not limited to, a Programmable Logic Controller (PLC), a microprocessor, a development board, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the controller 106, including known, related art, and/or later developed technologies.
[0023] In an embodiment of the present invention, the sound unit 108 may be connected to the controller 106. The sound unit 108 may be adapted to announce the text recognized in the captured image, in an embodiment of the present invention. According to embodiments of the present invention, the sound unit 108 may be, but is not limited to, an earphone, a headphone, and so forth. In a preferred embodiment of the present invention, the sound unit 108 may be a speaker. Embodiments of the present invention are intended to include or otherwise cover any type of the sound unit 108, including known, related art, and/or later developed technologies.
[0024] In an embodiment of the present invention, the battery 110 may be adapted to supply operational power to the controller 106. The battery 110 may be rechargeable, in an embodiment of the present invention.
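As a reading aid, the connections described for FIG. 1 can be written down as a small data structure. The Python sketch below is illustrative only: the specification names no programming language, the role labels are paraphrases rather than terms from the text, and the link from the button 104 to the image capturing unit 102 is an assumption, since the description states only that the image is captured when the button is pressed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    numeral: int  # reference numeral used in FIG. 1
    name: str


# Components listed in the description of FIG. 1.
IMAGE_CAPTURING_UNIT = Component(102, "image capturing unit")
BUTTON = Component(104, "button")
CONTROLLER = Component(106, "controller")
SOUND_UNIT = Component(108, "sound unit")
BATTERY = Component(110, "battery")

# (source, target, role) triples. The role strings are illustrative
# paraphrases; the button-to-camera link is an assumed wiring.
LINKS = [
    (BUTTON, IMAGE_CAPTURING_UNIT, "triggers capture"),
    (IMAGE_CAPTURING_UNIT, CONTROLLER, "supplies captured image"),
    (CONTROLLER, SOUND_UNIT, "supplies recognized text"),
    (BATTERY, CONTROLLER, "supplies operational power"),
]


def neighbours(component):
    """Return the sorted numerals of every component linked to `component`."""
    linked = set()
    for source, target, _role in LINKS:
        if source == component:
            linked.add(target.numeral)
        elif target == component:
            linked.add(source.numeral)
    return sorted(linked)
```

Calling `neighbours(CONTROLLER)` lists the image capturing unit 102, the sound unit 108, and the battery 110, which matches the description of the controller 106 as the element to which the other units are connected.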
[0025] FIG. 2 depicts a flowchart of a method 200 for synthesis from text to speech for the visually impaired using the device 100, according to an embodiment of the present invention. At step 202, the device 100 may capture the image of the text using the image capturing unit 102 when the user presses the button 104.
[0026] At step 204, the device 100 may perform the Optical Character Recognition (OCR) on the text in the captured image.
[0027] At step 206, the device 100 may announce the text recognized in the captured image using the sound unit 108.
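The three steps of the method 200 can be sketched as a short sequential routine. This is a minimal illustration under stated assumptions, not the applicants' implementation: the function name `run_method_200` and the three callables passed into it are hypothetical, and the stand-ins used in the demonstration return fixed values instead of driving a real camera, OCR engine, or speech synthesizer.

```python
from typing import Callable, List, Tuple


def run_method_200(
    capture_image: Callable[[], bytes],
    recognize_text: Callable[[bytes], str],
    announce: Callable[[str], None],
) -> List[Tuple[int, str]]:
    """Run steps 202, 204 and 206 once, in order, and return a step log."""
    log = []

    # Step 202: capture the image of the text (invoked on a button press).
    image = capture_image()
    log.append((202, "image captured"))

    # Step 204: perform OCR on the text in the captured image.
    text = recognize_text(image)
    log.append((204, "text recognized"))

    # Step 206: announce the recognized text through the sound unit.
    announce(text)
    log.append((206, "text announced"))

    return log


# Hypothetical stand-ins for demonstration only; a real device would back
# these with a camera driver, an OCR engine and a speech synthesizer.
spoken = []
steps = run_method_200(
    capture_image=lambda: b"fake-image-bytes",
    recognize_text=lambda image: "HELLO WORLD",
    announce=spoken.append,
)
```

Passing the hardware-facing operations in as callables keeps the step order separate from any particular camera, OCR, or audio library, so the same routine could be exercised with test doubles before it is connected to real components.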
[0028] While the invention has been described in connection with what is presently considered to be the most practical and various embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims.
[0029] This written description uses examples to disclose the invention, including the best mode, and also to enable any person skilled in the art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined in the claims and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal language of the claims.

Documents

Application Documents

#	Name	Date
1	202341080230-STATEMENT OF UNDERTAKING (FORM 3) [25-11-2023(online)].pdf	2023-11-25
2	202341080230-REQUEST FOR EARLY PUBLICATION(FORM-9) [25-11-2023(online)].pdf	2023-11-25
3	202341080230-POWER OF AUTHORITY [25-11-2023(online)].pdf	2023-11-25
4	202341080230-OTHERS [25-11-2023(online)].pdf	2023-11-25
5	202341080230-FORM-9 [25-11-2023(online)].pdf	2023-11-25
6	202341080230-FORM FOR SMALL ENTITY(FORM-28) [25-11-2023(online)].pdf	2023-11-25
7	202341080230-FORM 1 [25-11-2023(online)].pdf	2023-11-25
8	202341080230-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [25-11-2023(online)].pdf	2023-11-25
9	202341080230-EDUCATIONAL INSTITUTION(S) [25-11-2023(online)].pdf	2023-11-25
10	202341080230-DRAWINGS [25-11-2023(online)].pdf	2023-11-25
11	202341080230-DECLARATION OF INVENTORSHIP (FORM 5) [25-11-2023(online)].pdf	2023-11-25
12	202341080230-COMPLETE SPECIFICATION [25-11-2023(online)].pdf	2023-11-25
13	202341080230-Proof of Right [13-02-2024(online)].pdf	2024-02-13
14	202341080230-POA [10-01-2025(online)].pdf	2025-01-10
15	202341080230-FORM 13 [10-01-2025(online)].pdf	2025-01-10
16	202341080230-FORM 18 [13-01-2025(online)].pdf	2025-01-13
17	202341080230-Proof of Right [16-01-2025(online)].pdf	2025-01-16