Personal Smart Robot To Assist To Blind People Using Deep Learning

Personal Smart Robot To Assist To Blind People Using Deep Learning Techniques

Abstract: The personal smart robot for assisting blind people utilizes deep learning techniques and consists of a Raspberry Pi 4 microprocessor, PI camera, L293D motor driver module, OLED screen, power supply adapter, ultrasonic sensors, USB microphone, and speaker. It enables control via a computer, smartphone, or voice command, with wireless communication facilitating real-time interaction. The system employs Python programming for movement control, machine learning models for text and number recognition, and TensorFlow with OpenCV for emotion detection.

Patent Information

Application #

Filing Date

15 February 2025

Publication Number

09/2025

Publication Type

INA

Invention Field

COMPUTER SCIENCE

Status

Parent Application

Applicants

UTTARANCHAL UNIVERSITY

ARCADIA GRANT, P.O. CHANDANWARI, PREMNAGAR, DEHRADUN - 248007, UTTARAKHAND, INDIA

Inventors

1. AKHILESH PANDEY

ASSISTANT PROFESSOR UTTARANCHAL SCHOOL OF COMPUTING SCIENCES UTTARANCHAL UNIVERSITY DEHRADUN, UTTARAKHAND, INDIA PIN CODE-248007

2. YASHWANT SINGH BISHT

DEPARTMENT OF MECHANICAL ENGINEERING, UTTARANCHAL INSTITUTE OF TECHNOLOGY, UTTARANCHAL UNIVERSITY, DEHRADUN 248007-INDIA

3. DHARMENDRA KUMAR

ASSISTANT PROFESSOR UTTARANCHAL SCHOOL OF COMPUTING SCIENCES UTTARANCHAL UNIVERSITY DEHRADUN, UTTARAKHAND, INDIA PIN CODE-248007

4. KAMESH YADAV

ASSISTANT PROFESSOR UTTARANCHAL SCHOOL OF COMPUTING SCIENCES UTTARANCHAL UNIVERSITY DEHRADUN, UTTARAKHAND, INDIA PIN CODE-248007

Claims

1. A personal smart robotic system to assist blind people using deep learning techniques, comprising:
A Raspberry Pi 4 microprocessor configured to control the robot car via computer, smartphone, or voice command;
A PI camera attached to the robot car for surveillance and real-time video feed;
A wireless communication module for interaction between the Raspberry Pi 4 and the user device;
A L293D motor driver module for motor control;
An OLED screen for displaying information;
A power supply adapter for energy distribution;
An ultrasonic sensor for obstacle detection and avoidance;
A USB microphone for voice command input;
A speaker for audio feedback; and
A machine learning-based processing unit to analyze video feeds, recognize text, detect objects, and interpret facial expressions.

2. The robotic system as claimed in claim 1, wherein Python programming language is used to implement robot movement control and voice command processing.

3. The robotic system as claimed in claim 1, wherein a machine learning-trained model enables text and number detection, allowing the robot to read aloud detected text for the user.

4. The robotic system as claimed in claim 1, wherein emotion recognition is implemented using TensorFlow and OpenCV, enabling the robot to interpret a user’s expressions from live video feed.

5. The robotic system as claimed in claim 1, wherein facial recognition is used for computer vision, enabling the robot to identify individuals by comparing their facial images with a stored database.

6. The robotic system as claimed in claim 1, wherein an object detection and obstacle avoidance module is implemented using ultrasonic sensors, allowing autonomous navigation around obstacles.

7. The robotic system as claimed in claim 1, wherein deep learning techniques enhance the robot’s decision-making for assisting blind users with environmental awareness.

8. The robotic system as claimed in claim 1, wherein real-time notifications and alerts are provided via a voice output system to inform the user about environmental changes and detected objects.

9. The robotic system as claimed in claim 1, wherein trained deep learning algorithms continuously improve the accuracy of text recognition, facial detection, and obstacle avoidance over time.

10. The robotic system as claimed in claim 1, wherein the smart assistance system enables blind users to interact with their surroundings through computer vision, natural language processing, and deep learning algorithms.

Specification

Description: FIELD OF THE INVENTION
This invention relates to a personal smart robot that assists blind people using deep learning techniques.
BACKGROUND OF THE INVENTION
A personal smart robot exhibits intelligent behavior: it learns, demonstrates, explains, and advises its users. Smart robots are systems that understand, think, learn, and behave by implementing human intelligence in machines. Giving a robot a form of common sense through reinforcement learning would allow it to take decisions automatically and to establish an emotional relationship between robots and human beings.
Despite the attempts of smart robot researchers to emulate human intelligence and vision, this result has not yet been achieved. Most robots still cannot see, are not versatile, and do not recognize objects properly. For smart robotics technology to work effectively, it is important to prioritize addressing these inefficiencies.
The growing need to efficiently process and analyze the information contained in digital images and live video is a continuing challenge for applying image processing and computer vision to robotics and automation. Digital images are commonly processed in a brute-force style, by analyzing every pixel in the image, no matter how large or redundant the image is. Since images may contain hundreds of thousands of pixels, any sequential processing of them quickly becomes slow, even on the fastest processors, which limits the complexity of the visual tasks that can be performed successfully.
In practice, hardware-accelerated processors are frequently used in industry to run simple image processing and computer vision algorithms in real time. However, these hardware solutions are quickly overtaken as faster off-the-shelf processors are developed. The high investment needed to acquire such specialized hardware is therefore difficult to justify, especially since, perhaps a year after the equipment is acquired, any personal computer may perform at similar or even higher speed than the dedicated hardware for a fraction of its cost. Hence, from the industry standpoint, it is important to develop fast image processing and computer vision techniques that generate results efficiently by relying only on software algorithms and not on specialized hardware. This is one of the current challenges for the image processing and computer vision communities.
The efficiency problem in processing digital images comes from the fact that digital images are large representations, typically containing hundreds of thousands of data values that must be processed repeatedly. Reducing the size of digital images is a problem that has been extensively tackled from the perspective of image storage and transmission. As a result, a wide variety of standard techniques have been devised for compressing digital images, achieving large size reductions through popular formats such as GIF or JPEG.
The goal of this project is to predict, from a grayscale picture of a person’s face, which emotion the facial expression conveys. The evaluation metric is the accuracy for each emotion (the fraction of correctly classified images), supplemented by a confusion matrix that highlights which emotions are recognized better than others.
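The evaluation metric named above (per-emotion accuracy plus a confusion matrix) can be illustrated with a small sketch. The emotion labels and the sample predictions below are made up for illustration; the source does not list the emotion classes or report any results.

```python
# Hypothetical label set; the source does not list the emotion classes.
EMOTIONS = ["angry", "happy", "sad"]


def confusion_matrix(true_labels, predicted_labels, classes):
    """Count outcomes: rows are the true class, columns the predicted class."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        matrix[index[truth]][index[guess]] += 1
    return matrix


def per_class_accuracy(matrix, classes):
    """Fraction of images of each true class that were classified correctly."""
    result = {}
    for i, name in enumerate(classes):
        total = sum(matrix[i])
        # None marks a class that has no test images at all.
        result[name] = matrix[i][i] / total if total else None
    return result


# Made-up example data, for illustration only.
truth = ["happy", "happy", "sad", "angry", "sad", "happy"]
guess = ["happy", "sad", "sad", "angry", "happy", "happy"]

cm = confusion_matrix(truth, guess, EMOTIONS)
acc = per_class_accuracy(cm, EMOTIONS)
```

Off-diagonal entries of `cm` show which emotions are confused with each other, which a single overall accuracy figure would hide.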
Emotion recognition shares many challenges with detecting moving objects in video: identifying an object, continuous detection, incomplete or unpredictable actions, and so on. Text detection in natural scenes is a challenging task and is more complicated than text extraction from document images, where there is a clear distinction between background and foreground and each character is separated from its context. In natural scenes, text can appear in numerous states: dark text on a light background and vice versa, or in a wide variety of fonts, even for characters of the same word. Parts of words can be occluded by objects in the environment, which can make those parts impossible to detect. Other factors, such as camera settings, may cause blurry images or perspective distortions.
A major factor that makes text detection and recognition in natural scenes difficult is the illumination. Ambient light may create reflections on text surfaces, objects in the environment may cast shadows on them, and the apparent intensity of objects depends on the light source.
A complete face recognition system includes face detection, face preprocessing, and face recognition. The face region must first be extracted by the face detection step and separated from the background, which provides the basis for the subsequent extraction of distinguishing facial features. Recently developed face detection methods based on deep learning not only shorten detection time compared with traditional methods but also improve accuracy. Face recognition is then a process of feature extraction and comparison applied to the normalized face images in order to obtain the identity of the faces in the images.
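The comparison step of face recognition can be sketched as a nearest-neighbor lookup over stored feature vectors. This is a minimal illustration, not the method of the invention: the feature vectors, the names in the database, and the distance threshold are all hypothetical, and a real system would obtain the vectors from a trained feature extractor.

```python
import math


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(face_vector, database, threshold=0.6):
    """Return the name of the closest stored face, or None if nothing is
    within the (hypothetical) distance threshold."""
    best_name, best_distance = None, float("inf")
    for name, stored_vector in database.items():
        distance = euclidean_distance(face_vector, stored_vector)
        if distance < best_distance:
            best_name, best_distance = name, distance
    return best_name if best_distance <= threshold else None


# Made-up three-dimensional vectors; real face features have far more dimensions.
known_faces = {
    "person_a": [0.1, 0.9, 0.3],
    "person_b": [0.8, 0.2, 0.5],
}

match = identify([0.12, 0.88, 0.31], known_faces)
stranger = identify([5.0, 5.0, 5.0], known_faces)
```

Returning `None` for a distant vector lets the robot report an unknown person instead of forcing a match to the nearest database entry.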
However, the image compression techniques mentioned above were not designed for applying image processing or computer vision operations directly to the compact representations they produce. Thus, even though images are stored in compact form, they must be decompressed before they can be processed.
The suitability of alternative compact representations for basic computer vision and image processing operations is still an open issue, and this has been the basic motivation for the present work.
STATE OF THE ART/RESEARCH GAP
Smart Robot Personal Assistants (2022) is designed for blind people, helping them recognize people by detecting the faces of known and unknown persons. The robot also recognizes emotions and feelings such as a bad mood, happiness, and aggression. The model detects objects, faces, emotions, and text using computer vision with OpenCV, a widely used library for computer vision and image processing tasks. It uses deep learning models, which have proved to be a powerful tool because of their ability to handle large amounts of data. Interest in networks with hidden layers has surpassed interest in traditional techniques, especially in pattern recognition. The model uses convolutional neural networks, one of the most popular kinds of deep neural network, running on the Raspberry Pi. The movement (left or right) of the personal smart robot can be controlled by voice command. This uses natural language processing, which makes it possible to hear speech, interpret it, measure sentiment, and determine which parts are important. The robot detects obstacles during motion using ultrasonic sensing.
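The voice-control step described above can be sketched as simple keyword matching on text that has already been transcribed from speech. This is only an assumption about how such a step might look: the source mentions left and right movement, while the other commands, the keyword lists, and the function name are hypothetical, and the speech-to-text stage is not shown.

```python
# Hypothetical keyword lists; only "left" and "right" are named in the source.
COMMAND_KEYWORDS = {
    "stop": ("stop", "halt"),
    "forward": ("forward", "ahead"),
    "backward": ("backward", "back", "reverse"),
    "left": ("left",),
    "right": ("right",),
}

# "stop" is checked first so that a safety word always takes priority.
PRIORITY = ("stop", "forward", "backward", "left", "right")


def parse_command(transcript):
    """Map a transcribed sentence to a movement command, or None if no
    known keyword is present."""
    words = [word.strip(".,!?") for word in transcript.lower().split()]
    for command in PRIORITY:
        if any(word in COMMAND_KEYWORDS[command] for word in words):
            return command
    return None


print(parse_command("Please turn left now"))
```

Keyword matching is far simpler than full natural language processing, but it shows where a transcript is turned into one of a small set of movement commands.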
SUMMARY OF THE INVENTION
This summary is provided to introduce a selection of concepts, in a simplified format, that are further described in the detailed description of the invention.
This summary is neither intended to identify key or essential inventive concepts of the invention nor intended to determine the scope of the invention.
The personal smart robot for assisting blind people utilizes deep learning techniques and consists of a Raspberry Pi 4 microprocessor, PI camera, L293D motor driver module, OLED screen, power supply adapter, ultrasonic sensors, USB microphone, and speaker. It enables control via a computer, smartphone, or voice command, with wireless communication facilitating real-time interaction. The system employs Python programming for movement control, machine learning models for text and number recognition, and TensorFlow with OpenCV for emotion detection.
Additionally, the robot features facial recognition using computer vision, enabling identification by comparing images with a stored database. Ultrasonic sensors facilitate obstacle detection and avoidance, ensuring safe navigation. The deep learning-powered voice output system provides real-time notifications about environmental changes, enhancing the user’s awareness.
The smart assistance system integrates natural language processing and deep learning algorithms, improving text recognition, facial detection, and mobility assistance over time. This system significantly enhances the independence of blind users by allowing seamless interaction with their surroundings.
To further clarify the advantages and features of the present invention, a more particular description of the invention will be rendered by reference to specific embodiments thereof, which are illustrated in the appended drawings. It is appreciated that these drawings depict only typical embodiments of the invention and are therefore not to be considered limiting of its scope. The invention will be described and explained with additional specificity and detail with the accompanying drawings.
The figure shows the overall concept design of this project. A Raspberry Pi 4 serves as the microprocessor that controls the robot car through a computer, smartphone, or voice command. Surveillance video is provided by the PI camera attached to the robot car. The robot controller interface is installed on the laptop or smartphone, and the user presses a movement button, uses the keyboard, or speaks to instruct the robot car to move. Instructions and the video feed are exchanged over a wireless connection between the Raspberry Pi and the laptop or smartphone, and the robot speaks when it detects a person.
BRIEF DESCRIPTION OF THE DRAWINGS
The illustrated embodiments of the subject matter will be understood by reference to the drawings, wherein like parts are designated by like numerals throughout. The following description is intended only by way of example, and simply illustrates certain selected embodiments of devices, systems, and methods that are consistent with the subject matter as claimed herein, wherein:
FIGURE 1: Working design of P-Robo
FIGURE 2: Block Diagram of P-Robo
The figures depict embodiments of the present subject matter for the purposes of illustration only. A person skilled in the art will easily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
DETAILED DESCRIPTION OF THE INVENTION
The detailed description of various exemplary embodiments of the disclosure is described herein with reference to the accompanying drawings. It should be noted that the embodiments are described herein in such details as to clearly communicate the disclosure. However, the amount of details provided herein is not intended to limit the anticipated variations of embodiments; on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the present disclosure as defined by the appended claims.
It is also to be understood that various arrangements may be devised that, although not explicitly described or shown herein, embody the principles of the present disclosure. Moreover, all statements herein reciting principles, aspects, and embodiments of the present disclosure, as well as specific examples, are intended to encompass equivalents thereof.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments. As used herein, the singular forms “a,” “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises,” “comprising,” “includes” and/or “including,” when used herein, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof.
It should also be noted that in some alternative implementations, the functions/acts noted may occur out of the order noted in the figures. For example, two figures shown in succession may, in fact, be executed concurrently or may sometimes be executed in the reverse order, depending upon the functionality/acts involved.
In addition, the descriptions of "first", "second", “third”, and the like in the present invention are used for the purpose of description only, and are not to be construed as indicating or implying their relative importance or implicitly indicating the number of technical features indicated. Thus, features defining "first" and "second" may include at least one of the features, either explicitly or implicitly.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which example embodiments belong. It will be further understood that terms, e.g., those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The design focuses more on planning the hardware than on the code to be written. The hardware consists of:
• Night-vision Raspberry Pi controller
• L293D motor driver module
• OLED screen
• Camera design
• Power supply adapter
• Ultrasonic sensor
• USB microphone
• Speaker
• Motor control design
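To illustrate how a motor driver such as the L293D is typically driven, the sketch below maps movement commands to the logic levels of its four direction inputs (two per motor). The wiring is an assumption: which motor is on which channel, and which polarity counts as forward, depend on the actual build. The `write_pin` callback is a hypothetical stand-in for whatever GPIO library sets the pins on the Raspberry Pi.

```python
# Logic levels for (IN1, IN2, IN3, IN4). Assumed wiring: IN1/IN2 drive the
# left motor, IN3/IN4 drive the right motor, and (1, 0) spins a motor forward.
PIN_STATES = {
    "forward": (1, 0, 1, 0),
    "backward": (0, 1, 0, 1),
    "left": (0, 1, 1, 0),   # left motor reverses, right motor drives forward
    "right": (1, 0, 0, 1),  # left motor drives forward, right motor reverses
    "stop": (0, 0, 0, 0),
}

PIN_NAMES = ("IN1", "IN2", "IN3", "IN4")


def motor_signals(command):
    """Return the four input levels for a command; unknown commands stop."""
    return PIN_STATES.get(command, PIN_STATES["stop"])


def apply_command(command, write_pin):
    """Send the levels to the driver through a caller-supplied function
    write_pin(pin_name, level), for example a wrapper around a GPIO library."""
    for name, level in zip(PIN_NAMES, motor_signals(command)):
        write_pin(name, level)


# Usage with a stand-in that just prints instead of touching hardware.
apply_command("forward", lambda pin, level: print(pin, level))
```

Passing the pin-writing function in as a parameter keeps the mapping testable on a laptop, with the hardware call swapped in only on the robot.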
Software Implementation
Building the project requires a number of software packages, programming tools, and languages.
a. Raspbian OS
Raspbian is a free operating system based on Debian optimized for the Raspberry Pi hardware. An operating system is the set of basic programs and utilities that make your Raspberry Pi run. However, Raspbian provides more than a pure OS: it comes with over 35,000 packages, pre-compiled software bundled in a nice format for easy installation on your Raspberry Pi.
b. Python
Python is a high-level, interpreted, interactive, and object-oriented scripting language. It is designed to be highly readable: it frequently uses English keywords where other languages use punctuation, and it has fewer syntactical constructions than other languages.
c. HTML/CSS
Hypertext Mark-up Language, commonly abbreviated as HTML, is the standard mark-up language used to create web pages. Along with CSS and JavaScript, HTML is a cornerstone technology used to create web pages, as well as user interfaces for mobile and web applications. Cascading Style Sheets (CSS) is a style sheet language used for describing the presentation of a document written in a mark-up language. It is most often used to set the visual style of web pages and user interfaces written in HTML.
WORKING METHODOLOGY:
• Text and number detection uses a trained machine learning model.
• The code for giving voice commands and controlling the robot’s movements is implemented in Python.
• The emotion recognition module is implemented using the TensorFlow libraries and OpenCV, and identifies the user’s expression from the live webcam video.
• The facial recognition module uses computer vision to recognize a person whose face is in the database, by comparing a photo of the user with the photo stored in the database.
• The object detection and avoidance code reads the ultrasonic sensors so that the robot can avoid obstacles and move around them.
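The obstacle-avoidance step in the list above can be sketched in two parts: converting an ultrasonic echo time into a distance, and choosing a movement from the measured distances. The three-sensor layout (front, left, right), the 30 cm safety threshold, and the function names are assumptions made for illustration; reading the echo time from the sensor hardware is not shown.

```python
SPEED_OF_SOUND_CM_PER_S = 34300  # approximate value in air at about 20 degrees C
SAFE_DISTANCE_CM = 30            # hypothetical safety threshold


def echo_to_distance_cm(echo_seconds):
    """Convert the round-trip echo time into a one-way distance.

    The pulse travels to the obstacle and back, hence the division by two.
    """
    return echo_seconds * SPEED_OF_SOUND_CM_PER_S / 2


def choose_action(front_cm, left_cm, right_cm, safe_cm=SAFE_DISTANCE_CM):
    """Pick a movement command from three distance readings."""
    if front_cm > safe_cm:
        return "forward"
    if left_cm > safe_cm or right_cm > safe_cm:
        # Turn toward whichever side has more free space.
        return "left" if left_cm >= right_cm else "right"
    return "stop"  # blocked on all sides


# An echo of 2 ms corresponds to roughly 34 cm.
distance = echo_to_distance_cm(0.002)
action = choose_action(front_cm=20, left_cm=80, right_cm=40)
```

Keeping the decision rule separate from the sensor reading makes it easy to test the logic with fixed numbers before running it on the robot.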
Advantages:
• The robot can move freely around its environment, using ultrasonic sensors to detect obstacles.
• A trained machine learning model detects text and numbers in the surroundings and reads them aloud for the user.
• Computer vision reads a person’s facial features and reports the emotion they express.
• The robot aims to help the user perform simple and complex everyday tasks with ease.

Documents

Application Documents

#	Name	Date
1	202511013046-STATEMENT OF UNDERTAKING (FORM 3) [15-02-2025(online)].pdf	2025-02-15
2	202511013046-REQUEST FOR EARLY PUBLICATION(FORM-9) [15-02-2025(online)].pdf	2025-02-15
3	202511013046-POWER OF AUTHORITY [15-02-2025(online)].pdf	2025-02-15
4	202511013046-FORM-9 [15-02-2025(online)].pdf	2025-02-15
5	202511013046-FORM FOR SMALL ENTITY(FORM-28) [15-02-2025(online)].pdf	2025-02-15
6	202511013046-FORM 1 [15-02-2025(online)].pdf	2025-02-15
7	202511013046-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [15-02-2025(online)].pdf	2025-02-15
8	202511013046-EVIDENCE FOR REGISTRATION UNDER SSI [15-02-2025(online)].pdf	2025-02-15
9	202511013046-EDUCATIONAL INSTITUTION(S) [15-02-2025(online)].pdf	2025-02-15
10	202511013046-DRAWINGS [15-02-2025(online)].pdf	2025-02-15
11	202511013046-DECLARATION OF INVENTORSHIP (FORM 5) [15-02-2025(online)].pdf	2025-02-15
12	202511013046-COMPLETE SPECIFICATION [15-02-2025(online)].pdf	2025-02-15
13	202511013046-Proof of Right [22-11-2025(online)].pdf	2025-11-22