Abstract: SYSTEM AND METHOD FOR DETECTING AND UTILIZING DISTINCT GESTURES FOR EMERGENCY ASSISTANCE
The present invention discloses a system (100) for detecting and utilizing distinct gestures for emergency assistance. The system (100) comprises a camera (102) for capturing a live feed focusing on user gestures, a processing unit (104) configured to identify data points from the captured live feed using an artificial intelligence engine (108), a dataset (106) for recognizing a "signal for help" gesture, and an output unit (114) triggering a response mechanism (116) based on estimated temporal parameters. The system leverages advanced technologies such as machine learning and neural networks for precise gesture recognition and effective emergency response.
Description:
BACKGROUND
Field of Invention
[001] Embodiments of the present invention generally relate to a system for detecting hand movements and particularly to a system and method for detecting and utilizing distinct gestures for emergency assistance.
Description of Related Art
[002] Highways, particularly during nighttime, pose considerable safety challenges owing to reduced visibility and heightened risks that necessitate urgent emergency measures. These conditions are especially concerning for vulnerable groups such as girls and children, who may face heightened risks from unsafe individuals in poorly lit or isolated areas. Communication about such unsafe circumstances becomes crucial but can be hindered by limitations in existing emergency response systems.
[003] Several existing solutions and prior art address related problems but may not specifically target the unique challenges faced during nighttime highway emergencies or the specific needs of individuals in distress. The existing solutions and the prior art include algorithms for detecting animals on roads to prevent accidents caused by human-animal conflicts, distress signal recognition in urban environments using deep learning techniques, and hand movement recognition using Convolutional Neural Networks (CNNs) for gesture-based interaction systems.
[004] While these prior systems contribute significantly to their respective domains, they may not directly address the challenges of nighttime highway emergencies or provide immediate assistance to individuals in distress using a unique gesture-based approach. The ability to discreetly and effectively signal for help through gestures becomes paramount, especially for individuals facing threats in isolated or poorly lit areas. These gestures not only serve as a means of alerting others to their distress but also facilitate quick and appropriate emergency responses, ensuring the safety and well-being of vulnerable individuals, including girls and children, in potentially dangerous situations. It is also important to acknowledge the potential risks that arise if the gestures for help are recognized by malicious individuals, such as thieves or terrorists.
[005] The landscape of the existing solutions and prior art reflects a diverse range of efforts to address safety and communication challenges in various contexts. However, the existing solutions fall short in addressing specific challenges related to nighttime highway safety, such as the swift detection of distress signals, integration with existing Closed-Circuit Television (CCTV) infrastructure for immediate response, and differentiation between standard gestures and distress signals. Existing systems also lack robustness in low-light conditions, comprehensive coverage of diverse gestures, and efficient communication with emergency services.
[006] There is thus a need for a system and method for detecting and utilizing distinct gestures for emergency assistance that can overcome the limitations of the prior art in a more efficient manner.
SUMMARY
[007] An aspect of the present invention provides a system for detecting and utilizing distinct gestures for emergency assistance. The system comprises a camera configured to capture a live feed of a user. The system further comprises a processing unit connected to the camera. The processing unit is configured to receive the captured live feed of the user from the camera. The processing unit is further configured to identify data points extracted from the captured live feed by utilizing an artificial intelligence engine. The processing unit is further configured to determine if the identified data points establish a "signal for help" gesture, using a dataset. The processing unit is further configured to estimate temporal parameters selected from an urgency level of the detected gesture, a location of the user, and a severity of a situation based on the interpreted data points. The processing unit is further configured to trigger an output unit to initiate a response mechanism to generate feedback or response actions based on the estimated temporal parameters.
[008] Another aspect of the present invention provides a method for detecting and utilizing distinct gestures for emergency assistance. The method comprises the steps of: receiving a captured live feed of a user from a camera (102); identifying data points extracted from the captured live feed by utilizing an artificial intelligence engine (108); determining if the identified data points establish a "signal for help" gesture using a dataset (106); estimating temporal parameters selected from an urgency level of the detected gesture, a location of the user, and a severity of a situation based on the interpreted data points; and triggering an output unit (114) to initiate a response mechanism (116) for generating feedback or response actions based on the estimated temporal parameters.
[009] The aspect of the present invention may provide a number of advantages depending on its particular configuration. In one implementation, the present application may provide a system and method for detecting and utilizing distinct gestures for emergency assistance.
[0010] Next, embodiments of the present application may provide a system for enhancing nighttime highway safety by leveraging existing Closed-Circuit Television (CCTV) camera infrastructure.
[0011] Next, embodiments of the present application may provide a system that is seamlessly integrated with current Closed-Circuit Television (CCTV) systems to aid individuals in distress, with a specific emphasis on vulnerable populations such as women and young individuals facing potential dangers.
[0012] Next, embodiments of the present application may provide a system that employs cutting-edge technologies like computer vision and artificial intelligence.
[0013] Next, embodiments of the present application may provide a system that recognizes distress signals, particularly a "signal for help" gesture, separate from standard sign language, thereby bridging critical response gaps during emergencies and providing essential assistance to those in helpless situations.
[0014] These and other advantages will be apparent from the aforementioned embodiments of the present application described herein.
[0015] The preceding is a simplified summary to provide an understanding of some aspects of the present invention. This summary is neither an extensive nor exhaustive overview of the present invention and its various implementations. The summary presents selected concepts of the implementations of the present invention in a simplified form as an introduction to the more detailed description presented below. As will be appreciated, other implementations of the present invention are possible utilizing, alone or in combination, one or more of the features set forth above or described in detail below.
BRIEF DESCRIPTION OF THE DRAWINGS
[0016] The above and still further features and advantages of embodiments of the present invention will become apparent upon consideration of the following detailed description of embodiments thereof, especially when taken in conjunction with the accompanying drawings, and wherein:
[0017] FIG. 1A illustrates a block diagram of a system for detecting and utilizing distinct gestures for emergency assistance, according to an embodiment of the present invention;
[0018] FIG. 1B illustrates a pictorial diagram of data points recognition for using the system, according to an embodiment of the present invention;
[0019] FIG. 2 illustrates a block diagram of a processing unit for the system for detecting and utilizing distinct gestures for emergency assistance, according to an embodiment of the present invention; and
[0020] FIG. 3 illustrates a flowchart of a method for detecting and utilizing distinct gestures for emergency assistance using the system, according to an embodiment of the present invention.
[0021] The headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the description or the claims. As used throughout this application, the word “may” is used in a permissive sense (i.e., meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words “include”, “including”, and “includes” mean including but not limited to. To facilitate understanding, like reference numerals have been used, where possible, to designate like elements common to the figures. Optional portions of the figures may be illustrated using dashed or dotted lines, unless the context of usage indicates otherwise.
DETAILED DESCRIPTION
[0022] The following description includes the preferred best mode of one embodiment of the present invention. It will be clear from this description of the invention that the invention is not limited to these illustrated embodiments but that the invention also includes a variety of modifications and embodiments thereto. Therefore, the present description should be seen as illustrative and not limiting. While the invention is susceptible to various modifications and alternative constructions, it should be understood that there is no intention to limit the invention to the specific form disclosed; on the contrary, the invention is to cover all modifications, alternative constructions, and equivalents falling within the scope of the invention as defined in the claims.
[0023] In any embodiment described herein, the open-ended terms “comprising”, “comprises”, and the like (which are synonymous with “including”, “having” and “characterized by”) may be replaced by the respective partially closed phrases “consisting essentially of”, “consists essentially of”, and the like, or the respective closed phrases “consisting of”, “consists of”, and the like.
[0024] As used herein, the singular forms “a”, “an”, and “the” designate both the singular and the plural, unless expressly stated to designate the singular only.
[0025] FIG. 1A illustrates a block diagram of a system for detecting and utilizing distinct gestures for emergency assistance, according to an embodiment of the present invention. In an embodiment of the present invention, the system 100 may observe gestures such as movements of the hands, eyes, and shoulders of a user, and the posture of the user's body. The system 100 may further generate feedback based on the observed gestures, in an embodiment of the present invention. The system 100 may be adapted to observe and analyze even the most minute motions with an impressive level of sensitivity. For instance, the system 100 may accurately detect delicate finger movements indicative of a "signal for help" gesture, distinguishing them from regular hand gestures or motions. This exceptional sensitivity extends to an ability of the system 100 to comprehensively capture and assess the nuances of the gestures in various environmental conditions. For example, the system 100 may operate effectively in low-light conditions or noisy environments, ensuring reliable detection and utilization of the "signal for help" gesture. Additionally, the system 100 is capable of real-time processing and response, enabling swift and precise actions upon detecting the designated hand gesture.
[0026] Thereby, the system 100 may identify the gestures along with discerning and documenting subtle actions such as hand positions, trajectory, and speed of movement. This detailed analysis contributes to the system's high accuracy in recognizing the "signal for help" gesture, enhancing its effectiveness in providing emergency assistance and improving overall safety in public spaces. In an embodiment of the present invention, the system 100 may further enable a user to study the feedback generated. According to embodiments of the present invention, the user may be, but not limited to, a woman, a child, a man, a person among a group of people, and so forth. Embodiments of the present invention are intended to include or otherwise cover any user.
[0027] According to embodiments of the present invention, the system 100 may be installed in locations such as, but not limited to, a road, a school, a private place, a public place, a bus stand, a parking area, a garden, a company, a hospital, a marketplace, a rehabilitation center, and so forth. Embodiments of the present invention are intended to include or otherwise cover any location for installation of the system 100, including known, related art, and/or later developed technologies.
[0028] According to embodiments of the present invention, the system 100 may comprise a camera 102, a processing unit 104, a dataset 106, an artificial intelligence (AI) engine 108, a cloud computing unit 110, a storage unit 112, an output unit 114, a response mechanism 116, and a power supply unit 118.
[0029] In an embodiment of the present invention, the camera 102 may be configured to capture a gesture of the user. The camera 102 may be installed at the location, in an embodiment of the present invention. In an embodiment of the present invention, the camera 102 may be installed with an orientation such that the camera 102 is able to capture the movements of the user's hands and body accurately, enabling precise detection and interpretation of the "signal for help" gesture. This orientation may include a wide-angle view to encompass the user's entire body within the camera's field of view, ensuring comprehensive gesture recognition and effective emergency response. According to embodiments of the present invention, the mounting position for the installation of the camera 102 may be, but is not limited to, a rooftop, a mast, and so forth. Embodiments of the present invention are intended to include or otherwise cover any mounting position for the installation of the camera 102, including known, related art, and/or later developed technologies.
[0030] The camera 102 may also be configured to transmit the live feed of the user’s hand to a central monitoring unit (not shown), in an embodiment of the present invention. In an embodiment of the present invention, the central monitoring unit may be configured for continuous monitoring of the live feed of the user’s hand. In an embodiment of the present invention, the central monitoring unit may be automated using a computer system. In another embodiment of the present invention, the live feed of the user may be monitored manually by a system administrator.
[0031] According to other embodiments of the present invention, a resolution for the captured live feed of the user using the camera 102 may be in a range from 320 pixels by 240 pixels to 1920 pixels by 1080 pixels. Embodiments of the present invention are intended to include or otherwise cover any resolution for the live feed of the user captured using the camera 102, including known, related art, and/or later developed technologies.
[0032] According to other embodiments of the present invention, the camera 102 may be, but not limited to, a still camera, a video camera, a color balancer camera, a thermal camera, an infrared camera, a telephoto camera, a wide-angle camera, a Closed-Circuit Television (CCTV) camera, a web camera, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the camera 102, including known, related art, and/or later developed technologies.
[0033] In an embodiment of the present invention, the processing unit 104 may be connected to the camera 102. The processing unit 104 may further be configured to execute the computer-executable instructions to generate an output relating to the system 100. According to embodiments of the present invention, the processing unit 104 may be, but not limited to, a Programmable Logic Control (PLC) unit, a microprocessor, a development board, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the processing unit 104 including known, related art, and/or later developed technologies. In an embodiment of the present invention, the processing unit 104 may further be explained in conjunction with FIG. 2.
[0034] In an embodiment of the present invention, the dataset 106 stores a comprehensive collection of image or video data featuring the "signal for help" gesture, captured using the camera 102. The stored video data in the dataset 106 may be adapted to train a machine learning technique for estimating gesture parameters, which may further include a neural network trained on the dataset 106, in an embodiment of the present invention. The machine learning technique for extracting precise movement parameters may encompass the neural network, meticulously trained on the dataset 106 encompassing an array of gestures based on different stages of the "signal for help" gesture, thereby enabling robust and refined parameter estimation. In an embodiment of the present invention, the dataset 106 may be stored in a database (not shown).
[0035] According to embodiments of the present invention, the database may be for example, but not limited to, a distributed database, a personal database, an end-user database, a commercial database, a Structured Query Language (SQL) database, a non-SQL database, an operational database, a relational database, an object-oriented database, a graph database, and so forth. In a preferred embodiment of the present invention, the database may be a cloud database. Embodiments of the present invention are intended to include or otherwise cover any type of the database including known, related art, and/or later developed technologies.
[0036] Further, the database may be stored in a cloud server, in an embodiment of the present invention. In an embodiment of the present invention, the cloud server may be remotely located. In an exemplary embodiment of the present invention, the cloud server may be a public cloud server. In another exemplary embodiment of the present invention, the cloud server may be a private cloud server. In yet another embodiment of the present invention, the cloud server may be a dedicated cloud server. According to embodiments of the present invention, the cloud server may be, but is not limited to, a Microsoft Azure cloud server, an Amazon AWS cloud server, a Google Compute Engine (GCE) cloud server, an Amazon Elastic Compute Cloud (EC2) cloud server, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the cloud server including known, related art, and/or later developed technologies.
[0037] In an embodiment of the present invention, the artificial intelligence engine 108 may incorporate a dynamic feedback mechanism to provide feedback that may be personalized to the user's specific needs and preferences. In an embodiment of the present invention, the artificial intelligence engine 108 may utilize advanced computer vision and machine learning algorithms, such as deep learning models, convolutional neural networks (CNNs), and recurrent neural networks (RNNs), to process and interpret the captured video data for gesture detection and analysis. In another embodiment of the present invention, the artificial intelligence engine 108 may integrate with existing emergency response systems (not shown), including communication protocols, alert mechanisms, and coordination with emergency services, to provide seamless assistance upon detecting the "signal for help" gesture.
[0038] According to embodiments of the present invention, the artificial intelligence engine 108 may be selected from a variety of AI engines, including but not limited to, TensorFlow, PyTorch, Keras, Caffe, and OpenCV. Embodiments of the present invention are intended to include or otherwise cover any type of the artificial intelligence engine 108, including known, related art, and/or later developed technologies. According to embodiments of the present invention, the artificial intelligence engine 108 may utilize a MediaPipe Holistic framework for feature extraction and gesture recognition. This framework may integrate multiple machine learning models and algorithms to process video frames with precise human body structure analysis.
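By way of a non-limiting illustration only, the following Python sketch shows how a MediaPipe Holistic pipeline of the kind referenced above could extract the pose, hand, and face landmarks from a live camera feed; the camera index, variable names, and confidence thresholds are assumptions for this sketch and form no part of the disclosure.
```python
# Illustrative sketch only: a MediaPipe Holistic loop over a live feed.
import cv2
import mediapipe as mp

mp_holistic = mp.solutions.holistic

capture = cv2.VideoCapture(0)  # stand-in for the CCTV live feed (camera 102)
with mp_holistic.Holistic(min_detection_confidence=0.5,
                          min_tracking_confidence=0.5) as holistic:
    while capture.isOpened():
        ok, frame = capture.read()
        if not ok:
            break
        # MediaPipe expects RGB input, while OpenCV delivers BGR frames.
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        # results.pose_landmarks, results.left_hand_landmarks,
        # results.right_hand_landmarks, and results.face_landmarks carry the
        # body, hand, and facial data points referred to in the description.
capture.release()
```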
[0039] In an embodiment of the present invention, the cloud computing unit 110 may be connected to the processing unit 104. The cloud computing unit 110 may provide cloud data storage to the system 100, in an embodiment of the present invention. In an embodiment of the present invention, the cloud computing unit 110 may be remotely located. In an exemplary embodiment of the present invention, the cloud computing unit 110 may be a public cloud computing unit. In another exemplary embodiment of the present invention, the cloud computing unit 110 may be a private cloud computing unit. In yet another embodiment of the present invention, the cloud computing unit 110 may be a dedicated cloud computing unit. According to embodiments of the present invention, the cloud computing unit 110 may be, but is not limited to, a Microsoft Azure cloud computing unit, an Amazon AWS cloud computing unit, a Google Compute Engine (GCE) cloud computing unit, an Amazon Elastic Compute Cloud (EC2) cloud computing unit, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the cloud computing unit 110 including known, related art, and/or later developed technologies.
[0040] In an embodiment of the present invention, the storage unit 112 may be connected to the processing unit 104. The storage unit 112 may provide local data storage to the system 100, in an embodiment of the present invention. In an embodiment of the present invention, the storage unit 112 may be a non-transitory storage medium. In an embodiment of the present invention, non-limiting examples of the storage unit 112 may be a Read Only Memory (ROM), a Random-Access Memory (RAM), an Erasable Programmable Read Only Memory (EPROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a hard drive, a removable media drive for handling memory cards, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the storage unit 112, including known, related art, and/or later developed technologies.
[0041] In an embodiment of the present invention, the output unit 114 may present analyzed information and response actions to the user and relevant stakeholders. According to embodiments of the present invention, the output unit 114 may be, but not limited to, a visual feedback unit, an illuminated feedback unit, a haptic feedback unit, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the output unit 114, including known, related art, and/or later developed technologies.
[0042] In a preferred embodiment of the present invention, the output unit 114 may initiate the response mechanism 116. In an embodiment of the present invention, the processing unit 104 establishes a connection with the cloud computing unit 110 for initiating the response mechanism 116. In an embodiment of the present invention, the response mechanism 116 may be adapted to provide auditory cues. The response mechanism 116 may further comprise a sound unit (not shown) for delivering the auditory cues, in an embodiment of the present invention.
[0043] According to embodiments of the present invention, the sound unit in the response mechanism 116 may be, but not limited to, a speaker, a loudspeaker, a siren, an earphone, a headphone, a headset, an earbud, a buzzer, and so forth. Embodiments of the present invention are intended to include or otherwise cover any type of the sound unit, including known, related art, and/or later developed technologies.
[0044] In an embodiment of the present invention, the power supply unit 118 may be connected to one or more of the camera 102, the processing unit 104, and the output unit 114. The power supply unit 118 may further provide an operational power supply to one or more of the camera 102, the processing unit 104, and the output unit 114, in an embodiment of the present invention. In an embodiment of the present invention, the power supplied from the power supply unit 118 may be regulated using a regulator (not shown).
[0045] In an exemplary embodiment of the present invention, the power supply unit 118 may provide power from a battery. In another exemplary embodiment of the present invention, the power supply unit 118 may provide power from a wall-outlet power supply. In yet another exemplary embodiment of the present invention, the power supply unit 118 may supply power from any source.
[0046] In an embodiment of the present invention, the battery power supply may be from a rechargeable battery. In another embodiment of the present invention, the battery power supply may be from a non-rechargeable battery. According to embodiments of the present invention, the battery for power supply may be of any composition such as, but not limited to, a Nickel-Cadmium battery, a Nickel-Metal Hydride battery, a Zinc-Carbon battery, a Lithium-Ion battery, and so forth. Embodiments of the present invention are intended to include or otherwise cover any composition of the battery, including known, related art, and/or later developed technologies.
[0047] In an embodiment of the present invention, the wall-outlet power supply may be from a grid power line supply. In another embodiment of the present invention, the wall-outlet power supply may be from a generator line power supply. According to embodiments of the present invention, the wall-outlet power supply may be of any rating such as, but not limited to, a 110-volt supply, a 220-volt supply, and so forth. Embodiments of the present invention are intended to include or otherwise cover any rating of the wall-outlet power supply, including known, related art, and/or later developed technologies.
[0048] According to an embodiment of the present invention, the power supply unit 118 may supply an Alternating Current (AC) power supply. According to another embodiment of the present invention, the power supply unit 118 may supply a Direct Current (DC) power supply. According to yet another embodiment of the present invention, the power supply unit 118 may supply any type of power supply.
[0049] FIG. 1B illustrates a pictorial diagram 120 of data points recognition for using the system 100 to detect and interpret the "signal for help" gesture, according to an embodiment of the present invention. In this embodiment, the system 100 may find and recognize the data points by detecting the gestures of the user, specifically focusing on hand movements and facial expressions. The system 100 may further analyze these gestures to determine a context and urgency of the help signal. Additionally, the system 100 may provide auditory cues such as alarms or voice prompts to alert nearby individuals or emergency services, enhancing the effectiveness of the emergency response.
[0050] In an embodiment of the present invention, the data points may be estimated for full body pose estimations, right and left-hand landmarks for hand gesture identification, facial landmarks for capturing unique facial expressions, and so forth. The estimated data points may help in enabling an accurate interpretation of a distinct gesture such as the "signal for help" gesture.
[0051] In an exemplary embodiment of the present invention, the system 100 may process 33 data points for the full body pose estimations, enabling the detection of various human movements with high precision. Additionally, the system 100 may process 21 data points for each of the right and left-hand landmarks to identify a wide range of hand gestures, including sign language detection. Moreover, the system 100 may leverage 468 data points for the facial landmarks, enabling the capture of unique facial expressions and gestures. In other embodiments of the present invention, the system 100 may process any number of data points for the full body pose estimations, the right and left-hand landmarks, and the facial landmarks. The estimation of the data points may enable the system 100 to accurately detect the distinct gesture of the user in need.
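A minimal sketch, under the landmark counts stated above (33 pose points with visibility, 21 points per hand, and 468 facial points), of how the per-frame data points could be flattened into one fixed-length feature vector is given below; the function name and the zero-filling of missing detections are assumptions for illustration.
```python
import numpy as np

def extract_keypoints(results):
    """Flatten MediaPipe Holistic landmarks into one fixed-length vector.

    33 pose points x (x, y, z, visibility) + 468 face points x (x, y, z)
    + 2 x 21 hand points x (x, y, z) = 1662 values per frame. Missing
    detections are zero-filled so the vector length stays constant.
    """
    pose = (np.array([[p.x, p.y, p.z, p.visibility]
                      for p in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    face = (np.array([[p.x, p.y, p.z]
                      for p in results.face_landmarks.landmark]).flatten()
            if results.face_landmarks else np.zeros(468 * 3))
    left = (np.array([[p.x, p.y, p.z]
                      for p in results.left_hand_landmarks.landmark]).flatten()
            if results.left_hand_landmarks else np.zeros(21 * 3))
    right = (np.array([[p.x, p.y, p.z]
                       for p in results.right_hand_landmarks.landmark]).flatten()
             if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, face, left, right])
```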
[0052] In an embodiment of the present invention, the processing unit 104 (as shown in FIG. 1A) may comprise Long Short-Term Memory (LSTM) layers and dense layers in a neural network architecture for enabling sequential data processing to capture short-term, medium-term, and long-term temporal patterns for robust gesture interpretation and parameter estimation. The LSTM layers play a crucial role in understanding sequential patterns within the input sequence, ranging from low-level sequential features to high-level abstract patterns, ensuring comprehensive analysis and precise recognition of the "signal for help" gesture. Furthermore, the processing unit 104 performs data preprocessing to organize the collected data from the dataset 106 into sequences of frames for detecting the "signal for help" gesture. This involves setting up label mapping, loading data from specified directories, and converting the labels into categorical format using one-hot encoding. The sequences of frames, represented numerically as NumPy arrays, are then processed through the LSTM layers to extract relevant features and enable accurate detection of the distinct hand gesture in real time, ensuring effective emergency assistance.
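The following Keras sketch illustrates one plausible form of the label mapping, one-hot encoding, and stacked LSTM/dense architecture described in paragraph [0052]; the gesture classes, sequence length, and layer sizes are assumptions for illustration, not the disclosed configuration.
```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.utils import to_categorical

# Assumed gesture classes; the dataset 106 would define the real labels.
actions = np.array(["signal_for_help", "wave", "idle"])
label_map = {action: idx for idx, action in enumerate(actions)}

SEQUENCE_LENGTH = 30  # frames per gesture sequence (assumed)
FEATURES = 1662       # keypoint values per frame, as computed above

# Sequences loaded from the dataset 106 would be shaped
# (num_samples, SEQUENCE_LENGTH, FEATURES), with integer labels converted
# to one-hot vectors as in [0052]: y = to_categorical(labels).astype(int)

model = Sequential([
    Input(shape=(SEQUENCE_LENGTH, FEATURES)),
    LSTM(64, return_sequences=True, activation="relu"),   # short-term patterns
    LSTM(128, return_sequences=True, activation="relu"),  # medium-term patterns
    LSTM(64, return_sequences=False, activation="relu"),  # long-term summary
    Dense(64, activation="relu"),
    Dense(32, activation="relu"),
    Dense(actions.shape[0], activation="softmax"),  # one score per gesture
])
model.compile(optimizer="Adam", loss="categorical_crossentropy",
              metrics=["categorical_accuracy"])
# model.fit(X, y, epochs=200)  # training step once X and y are assembled
```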
[0053] FIG. 2 illustrates a block diagram of the processing unit 104 for the system 100, according to an embodiment of the present invention. The processing unit 104 may comprise the computer-executable instructions in the form of programming modules such as a data receiving module 200, a data identification module 202, a data estimation module 204, and a feedback generation module 206.
[0054] In an embodiment of the present invention, the data receiving module 200 may be configured to receive the captured live feed from the camera 102. The data receiving module 200, within the processing unit 104 of the system 100, may receive a continuous stream of the video data captured by the camera 102, focusing on the gestures and movements of the user, especially the "signal for help" gesture. The data receiving module 200 may employ video processing techniques to extract relevant features such as hand movements, facial expressions, and body posture from the live feed. The captured data is then transmitted to the data identification module 202 for further analysis.
[0055] Upon receiving the video data, the data identification module 202 may activate its gesture recognition algorithms, which are trained on a comprehensive dataset to accurately identify and classify specific gestures, including the "signal for help" gesture. This module may utilize computer vision techniques such as image segmentation, feature extraction, and pattern recognition to detect and differentiate between various gestures in real time.
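One plausible realization of the real-time classification performed by the data identification module 202 is a sliding window over the per-frame keypoint vectors, sketched below; the window length and confidence threshold are assumptions, and the model and action labels are those from the previous sketch rather than the disclosed implementation.
```python
from collections import deque

import numpy as np

SEQUENCE_LENGTH = 30        # must match the training sequence length above
CONFIDENCE_THRESHOLD = 0.8  # assumed threshold, tuned per deployment
window = deque(maxlen=SEQUENCE_LENGTH)  # rolling buffer of keypoint frames

def classify_frame(keypoints, model, actions):
    """Buffer one frame of keypoints; classify once a full sequence exists."""
    window.append(keypoints)
    if len(window) < SEQUENCE_LENGTH:
        return None  # not enough temporal context yet
    probs = model.predict(np.expand_dims(np.array(window), axis=0))[0]
    best = int(np.argmax(probs))
    # Report a gesture only when the network is sufficiently confident,
    # which helps separate the distress signal from ordinary motion.
    return actions[best] if probs[best] >= CONFIDENCE_THRESHOLD else None
```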
[0056] The identified gestures and their contextual data are then passed to the artificial intelligence engine 108. This engine may integrate advanced machine learning algorithms, including deep learning models such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), to analyze and interpret the gestures' significance and urgency levels. The artificial intelligence engine 108 may enhance an ability of the system 100 to understand complex gestures, adapt to varying environmental conditions, and provide personalized responses based on the detected gestures and their inferred meanings. Once the data points are analyzed and contextualized by the artificial intelligence engine 108, the results are sent to the data estimation module 204.
[0057] The data estimation module 204 may further refine the analysis, estimating temporal parameters such as the urgency level of the detected gesture, a location of the user, and a severity of a situation based on the interpreted gestures and their temporal patterns. The data estimation module 204 may trigger the feedback generation module 206.
[0058] The feedback generation module 206 may initiate the response mechanism 116, integrated into the system 100, to execute the appropriate response actions based on the analysis performed by the data estimation module 204 and the artificial intelligence engine 108. The feedback generation module 206 may trigger emergency alerts, notify nearby individuals or emergency services, and provide real-time guidance or instructions to the user through auditory cues or visual displays. It ensures prompt and effective assistance in critical situations by coordinating with external systems and stakeholders as needed.
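A simplified sketch of how the data estimation module 204 and the feedback generation module 206 might cooperate is shown below; the time-window urgency heuristic and the printed alert are stand-ins for the disclosed temporal-parameter estimation and response mechanism 116, and all names are assumptions.
```python
import time

def estimate_urgency(detection_times, window_seconds=10.0):
    """Toy heuristic: repeated detections in a short window imply higher
    urgency. A deployed system would fuse this with location and severity."""
    now = time.time()
    recent = [t for t in detection_times if now - t <= window_seconds]
    if len(recent) >= 3:
        return "high"
    return "medium" if len(recent) == 2 else "low"

def trigger_response(urgency, camera_location):
    """Stand-in for the response mechanism 116: a real deployment would drive
    sirens, operator consoles, or emergency-service APIs, not print()."""
    alert = {
        "gesture": "signal_for_help",
        "urgency": urgency,
        "location": camera_location,  # e.g. the fixed CCTV installation site
        "timestamp": time.time(),
    }
    print("ALERT dispatched:", alert)  # placeholder notification channel
```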
[0059] FIG. 3 illustrates a flowchart of a method 300 for detecting and utilizing the distinct gesture for the emergency assistance using the system 100, according to an embodiment of the present invention.
[0060] At step 302, the system 100 may receive the captured live feed of the user from the camera 102, focusing on the gestures and movements of the user.
[0061] At step 304, the system 100 may identify the data points extracted from the captured live feed, including the hand movements, the facial expressions, and the body posture, by utilizing the artificial intelligence engine 108.
[0062] At step 306, the system 100 may determine whether the identified data points establish the "signal for help" gesture using the dataset 106. If the gesture is established, the system 100 may move to step 308; otherwise, the system 100 may return to the step 302.
[0063] At step 308, the system 100 may estimate temporal parameters such as the urgency level of the detected gesture, the location of the user, and the severity of the situation based on the interpreted data points and their temporal patterns, leveraging neural network models and AI-driven analysis.
[0064] At step 310, the system 100 may trigger the output unit 114 to initiate the response mechanism 116 for generating the feedback or the response actions based on the estimated temporal parameters. This may include activating emergency alerts, notifying nearby individuals or emergency services, and providing real-time guidance or instructions to the user through auditory cues or visual displays, ensuring prompt and effective assistance in critical situations.
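Tying the steps together, a minimal end-to-end loop corresponding to steps 302 through 310 might look as follows; it reuses the illustrative helpers from the earlier sketches (extract_keypoints, classify_frame, estimate_urgency, trigger_response, model, actions), all of which are assumed names rather than the disclosed implementation.
```python
import time

import cv2
import mediapipe as mp

capture = cv2.VideoCapture(0)  # stand-in for the camera 102 live feed
detection_times = []
with mp.solutions.holistic.Holistic(min_detection_confidence=0.5,
                                    min_tracking_confidence=0.5) as holistic:
    while capture.isOpened():
        ok, frame = capture.read()                        # step 302
        if not ok:
            break
        results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        keypoints = extract_keypoints(results)            # step 304
        gesture = classify_frame(keypoints, model, actions)  # step 306
        if gesture == "signal_for_help":
            detection_times.append(time.time())
            urgency = estimate_urgency(detection_times)   # step 308
            trigger_response(urgency, camera_location="assumed site id")  # step 310
capture.release()
```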
[0065] Embodiments of the invention are described above with reference to block diagrams and schematic illustrations of methods and systems according to embodiments of the invention. While the invention has been described in connection with what is presently considered to be the most practical and various embodiments, it is to be understood that the invention is not to be limited to the disclosed embodiments, but on the contrary, is intended to cover various modifications and equivalent arrangements included within the scope of the appended claims.
[0066] This written description uses examples to disclose the invention, including the best mode, and also to enable any person skilled in the art to practice the invention, including making and using any devices or systems and performing any incorporated methods. The patentable scope of the invention is defined in the claims, and may include other examples that occur to those skilled in the art. Such other examples are intended to be within the scope of the claims if they have structural elements that do not differ from the literal language of the claims, or if they include equivalent structural elements with insubstantial differences from the literal language of the claims.
CLAIMS
I/We Claim:
1. A system (100) for detecting and utilizing distinct gestures, comprising:
a camera (102) configured to capture a live feed of a user; and
a processing unit (104) connected to the camera (102), characterized in that the processing unit (104) is configured to:
receive the captured live feed of the user from the camera (102);
identify data points extracted from the captured live feed by utilizing an artificial intelligence engine (108);
determine if the identified data points establish a "signal for help" gesture, using a dataset (106);
estimate temporal parameters based on the interpreted data points; and
trigger an output unit (114) to initiate a response mechanism (116) to generate feedback or response actions based on the estimated temporal parameters.
2. The system (100) as claimed in claim 1, wherein the dataset (106) comprises data for training machine learning models to recognize the "signal for help" gesture.
3. The system (100) as claimed in claim 1, wherein the artificial intelligence engine (108) estimates the temporal parameters that are selected from an urgency level of the detected gesture, a location of the user, a severity of a situation, or a combination thereof.
4. The system (100) as claimed in claim 1, wherein the processing unit (104) establishes a connection with a cloud computing unit (110) for initiating the response mechanism (116).
5. The system (100) as claimed in claim 1, wherein the processing unit (104) triggers an output unit (114) to provide feedback through one or more of auditory cues, visual displays, and haptic feedback mechanisms, or a combination thereof, for versatile communication with the user.
6. The system (100) as claimed in claim 1, wherein the artificial intelligence engine (108) utilizes a MediaPipe Holistic framework for feature extraction and gesture recognition.
7. The system (100) as claimed in claim 1, wherein the processing unit (104) comprises Long Short-Term Memory (LSTM) layers and dense layers in a neural network architecture for enabling sequential data processing to capture short-term, medium-term, and long-term temporal patterns for robust gesture interpretation and parameter estimation.
8. The system (100) as claimed in claim 1, wherein the processing unit (104) performs data preprocessing to organize the collected data from the dataset (106) into sequences of frames for detecting the "signal for help" gesture.
9. The system (100) as claimed in claim 1, wherein the camera (102), the processing unit (104), and the output unit (114) receive an operational power from a power supply unit (118).
10. A method (300) for detecting and utilizing distinct gestures using a system (100), the method (300) characterized by the steps of:
receiving a captured live feed of a user from a camera (102);
identifying data points extracted from the captured live feed by utilizing an artificial intelligence engine (108);
determining if the identified data points establish a "signal for help" gesture using a dataset (106);
estimating temporal parameters selected from an urgency level of the detected gesture, a location of the user, and a severity of a situation based on the interpreted data points; and
triggering an output unit (114) to initiate a response mechanism (116) for generating feedback or response actions based on the estimated temporal parameters.
Date: April 4, 2024
Place: Noida
Dr. Keerti Gupta
Agent for the Applicant
(IN/PA-1529)