Abstract: The development of object recognition software has transformed a wide range of industries, including manufacturing and autonomous vehicles. The proposed system integrates text-to-speech (TTS) technology to provide voice guidance that describes the objects around the user, and it incorporates the YOLOv7 (You Only Look Once) algorithm for object identification. The primary objective of the proposed system is to enable people with visual impairments to independently identify objects in a given location without requiring assistance from others. The proposed system employs deep learning, object recognition, the YOLOv7 algorithm, and image classification techniques to identify and extract features from video frames and categorize them into their respective classes. The performance of the system is evaluated through experiments, demonstrating its effectiveness and potential for real-world applications. 4 Claims, 3 Figures
Description: Field of the Invention
The field of invention for the "AUDIBLE SIGHT" device lies within the domain of assistive technology and wearable devices. The "AUDIBLE SIGHT" is an innovative device designed to help visually impaired individuals navigate their surroundings more effectively.
Objective of the Invention
The primary objective of "AUDIBLE SIGHT" is to provide a solution that permits blind people to use a camera to detect their surroundings and receive verbal feedback about them. People who are blind need assistance because they cannot perceive external obstacles. AUDIBLE SIGHT is a modern device that assists the blind in finding their path to any location, helping visually impaired people move from one place to another with confidence by making them aware of nearby obstacles.
Background of the Invention
The goal of developing "AUDIBLE SIGHT" is to provide assistance to blind people. The object detection cameras on which the technology relies are a relatively recent development. Researchers began exploring computer vision methods for object detection in the early 2000s; object recognition in pictures or video streams was traditionally accomplished via feature extraction and matching, techniques that were inaccurate and computationally costly.
The concept of "AUDIBLE SIGHT" emerged as a visionary idea, aiming to restore the sense of vision through advanced technologies. Early efforts focused on sensory substitution, exploring the possibility of converting visual information into audio modalities such as sound. In recent times, the fusion of computer vision, artificial intelligence, and auditory interfaces has opened up new possibilities for conveying visual information to the blind. The "AUDIBLE SIGHT" device showcases the relentless pursuit of scientific and technological advancement, providing hope for a future where visual impairments may be overcome through ingenuity and innovation.
WO2016086440A1 (Wearable guiding device for the blind) describes a wearable device that can help blind people navigate their environment. The device includes a sensor, a camera, a computer, and a speaker. The sensor detects obstacles in the user's path, and the camera captures images of the user's surroundings. The computer analyzes the images and the sensor data to identify objects and obstacles, and the speaker provides the user with verbal instructions about their surroundings.
US20190026939A1 (Systems and methods for blind and visually impaired person environment navigation assistance) describes a system that can help blind and visually impaired people navigate their environment. The system includes a wearable device, a computer, and a server. The wearable device includes a sensor, a camera, and a speaker. The sensor detects obstacles in the user's path, and the camera captures images of the user's surroundings. The computer analyzes the images and the sensor data to identify objects and obstacles, and the speaker provides the user with verbal instructions about their surroundings. The server stores information about the user's surroundings, which the computer can access to provide the user with more accurate instructions.
US9805619B2 (Intelligent glasses for the visually impaired) describes a system that uses digital images to help blind people navigate their environment. The system includes a camera, a computer, and a speaker. The camera captures images of the user's surroundings, and the computer analyzes the images to identify objects and obstacles. The speaker then provides the user with verbal instructions about their surroundings.
CN104000709A (Glasses for blind people) describes a pair of glasses that can help blind people navigate their environment. The glasses include a camera, a computer, and a speaker. The camera captures images of the user's surroundings, and the computer analyzes the images to identify objects and obstacles. The speaker then provides the user with verbal instructions about their surroundings. The glasses also include a GPS receiver, which can be used to provide the user with directions to their destination.
Researchers have also explored advanced sound-based imaging techniques, such as echolocation, inspired by how some animals navigate their surroundings using sound waves. By emitting sounds and analyzing the echoes that bounce back, individuals can gain information about the distance, size, and shape of objects in their environment.
Summary of the Invention
"AUDIBLE SIGHT" was conceived to offer assistive technology to those who are blind or visually impaired. Although object detection cameras already exist, research and development in this field are ongoing. The concept behind "AUDIBLE SIGHT" is to convert visual information and convey it to the user through audio, employing modern technology, auditory-based approaches, and the YOLO algorithm.
The objective is to give blind people the ability to sense and understand the visual environment, potentially enhancing their capacity for movement and object recognition. The creation of "AUDIBLE SIGHT" highlights the search for innovative ways to improve the quality of life of those who are blind or visually impaired. The main agenda of this project is to assist blind people by means of voice feedback.
Brief Description of Drawings
The invention will be described in detail with reference to the following drawings of "AUDIBLE SIGHT":
Figure-1: Example of residual blocks in the YOLO module
Figure-2: Working of the proposed system.
Figure-3: Detailed Design Diagram of the proposed system
Detailed Description of the Invention
The concept of "AUDIBLE SIGHT" for the blind rests on integrating several technologies into a gadget that can detect and identify objects and motion in the immediate surroundings. Real-time visual data recorded by cameras can be examined using the YOLO module, a common object detection approach, together with the GTTS module for speech output. The system can recognize and categorize items, giving the user useful information. By detecting movement and warning the user of potential hazards or changes in the surroundings, motion detection cameras can further improve the device's functionality. By incorporating these technologies into an "AUDIBLE SIGHT" device, blind people can get real-time input about their surroundings, allowing them to move more securely and confidently in their daily lives.
A deep learning method called the YOLO (You Only Look Once) module is employed for object recognition in pictures and videos. It distinguishes itself with high precision and real-time processing capability. Unlike conventional object detection algorithms that need many phases, YOLO conducts object detection in a single pass. The algorithm divides the input image into a grid and predicts bounding boxes and class probabilities for each grid cell. These predictions are refined using anchor boxes, and redundant detections are filtered out using non-maximum suppression. YOLO is renowned for its speed and its capacity to detect several objects in a picture concurrently. It has been extensively employed in many applications, including robotics, surveillance, and autonomous driving. YOLO's quick and effective object detection makes it a crucial tool in creating technologies like "AUDIBLE SIGHT" for the blind, enabling the identification and understanding of objects in real time.
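To make the single-pass design concrete, the following minimal Python sketch decodes a YOLO-style prediction array into boxes, scores, and class labels. The output layout assumed here (rows of [cx, cy, w, h, objectness, class scores]) is a common convention, not the exact implementation of the filed system:

```python
import numpy as np

def decode_yolo_output(preds, conf_threshold=0.25):
    """Decode a YOLO-style prediction array of shape (N, 5 + num_classes).

    Each row is assumed to hold [cx, cy, w, h, objectness, class scores...].
    """
    boxes, scores, class_ids = [], [], []
    for row in preds:
        class_scores = row[5:]
        class_id = int(np.argmax(class_scores))
        confidence = float(row[4] * class_scores[class_id])
        if confidence >= conf_threshold:
            cx, cy, w, h = row[:4]
            # Convert centre/size format to top-left x, y, width, height.
            boxes.append([cx - w / 2, cy - h / 2, w, h])
            scores.append(confidence)
            class_ids.append(class_id)
    return boxes, scores, class_ids  # redundant boxes still need NMS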
Computer vision systems use object and motion detection modules to recognize and track objects and movements in pictures or video streams. These modules carry out their functions using a variety of methods and algorithms. Object detection modules analyze the visual input to determine the existence and location of particular items, or classes of items, within the picture or video. To attain high accuracy, they frequently use deep learning techniques such as convolutional neural networks (CNNs). These modules can identify and categorize items, giving useful details about their locations, dimensions, and classifications.
Motion detection modules, on the other hand, concentrate on spotting changes in the visual scene over time. They examine a series of frames or video streams and identify regions with noticeable motion or movement. This makes it possible to recognize moving objects and follow their trajectories and speeds. Background subtraction, optical flow, or frame differencing methods typically form the foundation of motion detection modules.
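As one illustration of the background-subtraction approach mentioned above, the sketch below uses OpenCV's MOG2 subtractor to flag moving regions; the camera index, loop bound, and area threshold are assumptions chosen for illustration:

```python
import cv2

# MOG2 maintains a running background model; pixels that deviate from it
# are marked as foreground, i.e. candidate motion.
subtractor = cv2.createBackgroundSubtractorMOG2(history=500, varThreshold=16)

cap = cv2.VideoCapture(0)  # assumed camera index; any video source works
for _ in range(300):       # bounded loop for the sketch; a device would run continuously
    ok, frame = cap.read()
    if not ok:
        break
    mask = subtractor.apply(frame)
    # Keep only confident foreground pixels, then find moving regions.
    _, mask = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    moving = [c for c in contours if cv2.contourArea(c) > 500]
    if moving:
        print(f"{len(moving)} moving region(s) detected")
cap.release()
```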
Blind people who use an "AUDIBLE SIGHT" gadget with an object recognition camera and the YOLO (You Only Look Once) module can significantly improve their sense of the visual world. The YOLO module can be integrated into the "AUDIBLE SIGHT" device to detect and classify objects in real time. By analyzing the visual input from the object detection camera, the YOLO algorithm can identify objects such as pedestrians, vehicles, obstacles, and more. This information can then be conveyed to the user through non-visual feedback, such as auditory cues or haptic feedback, allowing them to understand the presence and location of objects in their surroundings. The combination of object detection cameras and the YOLO module can aid in obstacle detection and navigation for blind individuals. By detecting objects and their spatial positions, the device can provide real-time feedback to the user about potential obstacles in their path, enabling the user to navigate safely and confidently while avoiding collisions and hazards.
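One simple way to turn a detected bounding box into the kind of spatial feedback described above is to map the box centre to a coarse left/ahead/right cue. The thresholds and message wording below are illustrative assumptions, not part of the filed design:

```python
def direction_cue(box, frame_width):
    """Map a bounding box (x, y, w, h) to a coarse spoken direction."""
    x, _, w, _ = box
    centre = x + w / 2
    if centre < frame_width / 3:
        return "to your left"
    if centre > 2 * frame_width / 3:
        return "to your right"
    return "ahead of you"

# Example: direction_cue((50, 120, 80, 200), 640) -> "to your left"
```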
The "AUDIBLE SIGHT" device's object detecting cameras may provide users a thorough grasp of their surroundings. The camera can record crucial visual details like a room's layout, the presence of furniture, or the placement of things by continually scanning its surroundings. When the YOLO module processes this information, it may be converted into useful non-visual signals that help blind people construct an accurate mental map of their environment and make wise judgments. The YOLO module can help by supplying contextual data about things that the camera has spotted. For instance, it can recognize and categorize store names, street signs, and other text-based data, then turn it into aural or haptic feedback for the user.The "AUDIBLE SIGHT" technology can provide blind people a more thorough and in-depth awareness of their visual world by integrating the YOLO module with object detection cameras. They are able to move around securely, identify items and their surroundings, and communicate more successfully.
A. Input Module: The input module takes video frames as input to be processed by the YOLOv7 model. The incoming frames may be of different sizes and aspect ratios, so they are resized to the uniform size expected by the YOLOv7 model; this resizing step ensures that the object detection model can process the frames efficiently. The pixel values of the resized frames are normalized to have a mean of zero and a standard deviation of one, which helps the model learn better and converge faster during training. The blob function reorders the image channels to match the input format expected by the YOLOv7 model and groups the preprocessed frames into batches to speed up training and inference; the batch size is typically defined in the configuration file of the YOLOv7 model. After the blob function has prepared the input frames, they are fed into the YOLOv7 neural network for object detection, which outputs bounding boxes and class probabilities for all detected objects in each frame of the video.
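The preprocessing steps described above correspond closely to OpenCV's blobFromImage helper. A minimal sketch follows, assuming a 640x640 YOLOv7 input and the common 1/255 pixel scaling (the zero-mean, unit-variance normalization described above would be an alternative choice):

```python
import cv2

def preprocess(frame, input_size=(640, 640)):
    """Resize, rescale, reorder channels, and batch one video frame."""
    blob = cv2.dnn.blobFromImage(
        frame,
        scalefactor=1 / 255.0,  # normalize pixel values to [0, 1]
        size=input_size,        # uniform size expected by the network
        swapRB=True,            # OpenCV reads BGR; the model expects RGB
        crop=False,
    )
    return blob  # shape (1, 3, 640, 640), ready for net.setInput(blob)
```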
B. Object Detection Module: The object detection module is responsible for detecting objects within the input image or video frame. YOLOv7 uses a deep neural network architecture to detect objects with high accuracy and speed. The input image is preprocessed to resize it to a fixed size and normalize the pixel values. The preprocessed image is then passed through a convolutional neural network (CNN) to extract features from the image.
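A minimal sketch of this detection step, assuming the YOLOv7 model has been exported to ONNX (the file name yolov7.onnx and the single-output layout are assumptions):

```python
import cv2

net = cv2.dnn.readNetFromONNX("yolov7.onnx")  # assumed exported model file

def detect(frame):
    """Run one frame through the CNN and return raw predictions."""
    blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (640, 640),
                                 swapRB=True, crop=False)
    net.setInput(blob)
    # Single forward pass; output rows hold box coordinates,
    # objectness, and per-class probabilities.
    return net.forward()
```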
C. Non-Maximum Suppression (NMS): Non-maximum suppression (NMS) is a post-processing technique used in YOLOv7 for object detection. It eliminates redundant detections and retains only the most confident ones by comparing the predicted bounding boxes of the detected objects and removing those with a high overlap, or intersection. The algorithm selects the detection with the highest confidence score and removes all other detections that overlap strongly with it.
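OpenCV provides a ready-made implementation of this step; the sketch below applies it with illustrative thresholds:

```python
import cv2
import numpy as np

def suppress(boxes, scores, score_threshold=0.25, nms_threshold=0.45):
    """Keep only the most confident, non-overlapping detections.

    boxes:  list of [x, y, w, h]; scores: one confidence per box.
    """
    kept = cv2.dnn.NMSBoxes(boxes, scores, score_threshold, nms_threshold)
    kept = np.array(kept).reshape(-1)  # index layout varies across OpenCV versions
    return [boxes[i] for i in kept]
```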
D. Output Module: Voice guidance is synthesized so that visually impaired individuals can identify object locations by ear. This approach synthesizes the position and name of each object using Google Text-to-Speech (gTTS) technology. The output module combines the object detection and text-to-speech modules to provide voice feedback to the user: it generates a voice message that announces the name of the detected object and its location within the image or video frame. Overall, YOLOv7 with gTTS voice feedback is a powerful system that can accurately and quickly detect objects within images or video frames and provide real-time voice feedback to visually impaired individuals.
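A minimal sketch of the voice-feedback step using the gTTS package; the message format and output file name are illustrative assumptions (playback of the saved file would be handled by the device's audio layer):

```python
from gtts import gTTS

def announce(label, direction):
    """Synthesize a spoken message such as 'chair ahead of you'."""
    tts = gTTS(text=f"{label} {direction}", lang="en")
    tts.save("announcement.mp3")  # the device's audio layer plays this file back
    return "announcement.mp3"

# Example: announce("chair", "to your left") writes announcement.mp3
```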
Advantages of the Proposed Model
The project greatly improves how blind people perceive the visual environment. By utilizing object detection algorithms and cameras, the gadget can offer real-time information regarding the presence, location, and classification of items in the environment. As a result, users are better equipped to comprehend their surroundings and make sound judgments.
Enhanced Safety and Navigation: The YOLO module's integration with object detection cameras enables efficient obstacle identification and navigation. Blind people can move securely and independently by getting rapid input about potential risks, impediments, or changes in their path.
Access to Visual Information: The initiative makes visual information available to the blind that they would not otherwise have access to. Because the device converts visual data into non-visual signals, such as aural or haptic stimulation, users are able to notice and decipher crucial visual information such as writing, signs, or spatial layouts.
Real-Time Assistance: The YOLO module and object recognition cameras provide blind people with real-time information about their surroundings. This enhances the overall user experience by enabling rapid decision-making and adaptation to changing circumstances.
Increased Independence and Empowerment: The project encourages blind people to become more independent and empowered. By providing a more thorough comprehension of their surroundings, the "AUDIBLE SIGHT" gadget enables users to explore, interact, and engage with the environment on their own terms, promoting a greater sense of autonomy. The YOLO module and object detection cameras can also be incorporated into wearable or portable devices, making them practical for daily use; users can take the gadget wherever they go, improving their perception of and engagement with the visual world in a variety of contexts.
The aforementioned initiative thus offers several benefits, including improved perception, safety, access to visual information, independence, and empowerment for blind people, ultimately enhancing their quality of life and enabling them to engage actively in society.
4 Claims & 3 Figures , Claims:The scope of the invention is defined by the following claims:
Claims:
1. The proposed invention, AUDIBLE SIGHT, comprises the following special features:
a) The invention improves visual recognition for blind people, increasing the accessibility of their environment.
b) The device helps blind individuals by enhancing their independence and mobility. The device can recognize and categorize various environmental elements, giving blind users useful information about their surroundings, such as the presence of people, furniture, doors, or other objects they may encounter.
2. As per claim 1, the device assists blind individuals in creating mental maps of their environment and developing spatial awareness.
3. As per claim 1, the device is made to be adaptive and configurable to user preferences. To suit their unique sensory perception and comfort, users can change the intensity, frequency, or modality of the sensory signals.
4. As per claim 1, the device provides blind people access to visual information, enabling them to take part more fully in daily activities; the AUDIBLE SIGHT technology thereby seeks to greatly improve their quality of life.