
An Autonomous Vehicle

Abstract: The present invention relates to a method which comprises capturing at least one of audio data and video data in an environment. The method also comprises processing the at least one captured audio data and video data to determine at least one context in the environment. The processing of the at least one captured audio data and video data comprises extracting one or more parameters from the at least one captured audio data and video data and comparing the extracted one or more parameters with one or more predefined parameters related to the at least one context in the environment. The at least one context comprises at least one of an undesirable human activity, a misplaced object and a suspicious object. The method further comprises generating an output signal in response to the determination of the at least one context. [Fig. 2]


Patent Information

Application #
Filing Date
29 December 2018
Publication Number
27/2020
Publication Type
INA
Invention Field
COMPUTER SCIENCE
Status
Email
ipo@knspartners.com
Parent Application

Applicants

ZENSAR TECHNOLOGIES LIMITED
ZENSAR KNOWLEDGE PARK, PLOT # 4, MIDC, KHARADI, OFF NAGAR ROAD, PUNE-411014, MAHARASHTRA, INDIA

Inventors

1. KULKARNI, Sumant
T-307, Nammane Apartments, Judicial Layout Main Road, Talaghattapura, Bangalore -560062, Karnataka, India
2. NAMBIAR, Ullas Balan
1086 Prestige Kensington Gardens, Bangalore, KA - 560013, Karnataka, India

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENTS RULES, 2003
COMPLETE SPECIFICATION (See section 10, rule 13)
“AN AUTONOMOUS VEHICLE”
ZENSAR TECHNOLOGIES LIMITED, Zensar Knowledge Park, Plot # 4, Midc, Kharadi, Off Nagar Road, Pune-411014, Maharashtra, India
The following specification particularly describes the invention and the manner in which it is to be performed.

AN AUTONOMOUS VEHICLE
CROSS-REFERENCE TO RELATED APPLICATIONS AND PRIORITY
The present application claims priority from Indian Provisional Patent Application No. 201821049794 filed on 29th December 2018, the entirety of which is incorporated herein by reference.
TECHNICAL FIELD
The present disclosure relates to the field of autonomous vehicles. More particularly, the present disclosure relates to an autonomous vehicle for identifying suspicious objects, misplaced objects, and undesirable human activities.
BACKGROUND
The background information herein below relates to the present disclosure but is not necessarily prior art.
Generally, public places such as large corporate campuses, shopping malls, airports or university campuses require surveillance systems for maintaining safety and security, mitigating risk, increasing operational efficiency, preventing loss of products and a variety of other applications. Security personnel are posted at the gates of these places to watch for any activities which are undesirable from a safety and security point of view. Also, various indoor video surveillance systems are used to maintain safety and security inside the campuses of these places. However, such techniques require humans to manually identify suspicious items and/or undesirable activities. Therefore, such techniques are prone to human error, which is not desirable.
It is therefore desirable to provide an efficient, accurate and automatic surveillance system to monitor and identify suspicious objects, misplaced objects and undesirable human activities in an area.

SUMMARY OF THE INVENTION
One or more shortcomings discussed above are overcome, and additional advantages are provided by the present disclosure. Additional features and advantages are realized through the techniques of the present disclosure. Other embodiments and aspects of the present disclosure are described in detail herein and are considered a part of the disclosure.
According to an aspect, the present disclosure provides a method which comprises capturing at least one of audio data and video data in an environment. The method also comprises processing the at least one captured audio data and video data to determine at least one context in the environment. The processing of the at least one captured audio data and video data comprises extracting one or more parameters from the at least one captured audio data and video data and comparing the extracted one or more parameters with one or more predefined parameters related to the at least one context in the environment. The at least one context comprises at least one of an undesirable human activity, a misplaced object and a suspicious object. The method further comprises generating an output signal in response to the determination of the at least one context.
According to an aspect of the present disclosure, a first set of parameters may classify desirable human activity in the environment, a second set of parameters may classify a misplaced object in the environment and a third set of parameters may classify a suspicious object in the environment.
According to an aspect of the present disclosure, processing the at least one captured audio data and video data to determine the at least one context comprises comparing the extracted one or more parameters with the first set of parameters to determine undesirable human activity, comparing the extracted one or more parameters with the second set of parameters to determine the misplaced object and comparing the extracted one or more parameters with the third set of parameters to determine the suspicious object.

According to an aspect of the present disclosure, the one or more parameters extracted from the audio data comprises at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio and words of audio; and the one or more parameters extracted from the video data comprises at least one of shape of an object, color of an object, size of an object, posture of a human, action of a human, and body language of a human.
According to an aspect of the present disclosure, the first set of parameters comprises at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, posture of a human, action of a human and body language of a human and the second and third set of parameters comprises at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, color of an object, shape of an object and size of an object.
According to an aspect of present disclosure, the method of generating the output signal further comprises determining a distance of at least one of undesirable human activity, misplaced object, and suspicious object with respect to a reference point; and determining a location of at least one of undesirable human activity, misplaced object and suspicious object based on the respective determined distance of the at least one of undesirable human activity, misplaced object, and suspicious object.
According to an aspect of present disclosure, the method of generating the output signal further comprises processing the output signal to generate an alert or transmitting the output signal to a remote-control station for generating an alert.
Another aspect of the present disclosure provides a device comprising at least one media capturing unit configured to capture at least one of audio data and video data in an environment. The device also comprises a context determination unit operatively coupled to the at least one media capturing unit. The context determination unit includes a memory unit and at least one processing unit
operatively coupled to the memory unit. The at least one processing unit is configured to process the at least one captured audio data and video data to determine at least one context in the environment by extracting one or more parameters from the at least one captured audio data and video data. The at least one processing unit is also configured to compare the extracted one or more parameters with one or more predefined parameters related to the at least one context in the environment. The at least one context includes at least one of an undesirable human activity, a misplaced object, and a suspicious object. The device also comprises an output unit operatively coupled to the context determination unit and configured to generate an output signal in response to the determination of the at least one context.
Still another aspect of the present disclosure provides a system comprising a database unit configured to store one or more predefined parameters. The system also comprises an autonomous vehicle including at least one media capturing unit. The at least one media capturing unit is configured to capture at least one of audio data and video data in an environment. The autonomous vehicle also includes a context determination unit operatively coupled to the at least one media capturing unit. The context determination unit comprises a memory unit and at least one processing unit operatively coupled to the memory unit. The at least one processing unit is configured to process the at least one captured audio data and video data to determine at least one context in the environment by extracting one or more parameters from the at least one captured audio data and video data and compare the extracted one or more parameters with the one or more predefined parameters related to the at least one context in the environment. The at least one context comprises at least one of an undesirable human activity, a misplaced object, and a suspicious object. The autonomous vehicle also comprises an output unit operatively coupled to the context determination unit. The output unit is configured to generate an output signal in response to the determination of the at least one context and transmit the generated output signal. The system further comprises a remote-control station operatively coupled to the autonomous vehicle and
configured to receive the output signal and process the received output signal to generate an alert.
It is to be understood that the aspects and embodiments of the disclosure described above may be used in any combination with each other. Several of the aspects and embodiments may be combined together to form a further embodiment of the disclosure.
In the above paragraphs, the most important features of the disclosure have been outlined, in order that the detailed description thereof that follows may be better understood and in order that the present contribution to the art may be better appreciated. There are, of course, additional features of the disclosure that will be described hereinafter and which will form the subject of the claims appended hereto. Those skilled in the art will appreciate that the conception upon which this disclosure is based may readily be utilized as a basis for the designing of other structures for carrying out the several purposes of the invention. It is important therefore that the claims be regarded as including such equivalent constructions as do not depart from the spirit and scope of the invention.
BRIEF DESCRIPTION OF ACCOMPANYING DRAWING
Further aspects and advantages of the present disclosure will be readily understood from the following detailed description with reference to the accompanying drawings, where like reference numerals refer to identical or functionally similar elements throughout the separate views. The figures together with the detailed description below, are incorporated in and form part of the specification, and serve to further illustrate the aspects and explain various principles and advantages, in accordance with the present disclosure wherein:
Fig. 1 illustrates a block diagram of an exemplary system of an autonomous vehicle in accordance with an embodiment of the present disclosure.

Fig. 2 illustrates an autonomous device in accordance with an embodiment of the present disclosure.
Fig. 3 illustrates a flowchart of an exemplary method of generating an output based on a determined context, in accordance with an embodiment of the present disclosure.
Skilled persons in the art will appreciate that elements in the drawings are illustrated for simplicity and have not necessarily been drawn to scale. For example, the dimensions of some of the elements in the drawings may be exaggerated relative to other elements to help improve understanding of aspects of the present disclosure.
DETAILED DESCRIPTION
Referring now to the drawings, there is shown an illustrative embodiment of the disclosure, “An autonomous vehicle”. It should be understood that the disclosure is susceptible to various modifications and alternative forms; a specific embodiment thereof has been shown by way of example in the drawings and will be described in detail below. It will be appreciated as the description proceeds that the disclosure may be realized in different embodiments.
The terms “comprises”, “comprising”, or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a setup or device that comprises a list of components does not include only those components but may include other components not expressly listed or inherent to such setup or device. In other words, one or more elements in a system or apparatus preceded by “comprises… a” does not, without more constraints, preclude the existence of other or additional elements in the system, apparatus or device.
Terms like “the autonomous vehicle”, “the autonomous device” and “the device” may be used interchangeably throughout the specification.
An aspect of the present disclosure provides an autonomous vehicle that may be configured to determine at least one context such as a misplaced object, a suspicious
object and an undesirable human activity in an environment. The autonomous vehicle may generate an output signal based on the determination of the at least one context. Further, an alert may be generated based on the generated output signal. Therefore, the autonomous vehicle may provide an efficient and effective way to maintain safety and security in an environment. The manner in which the above-mentioned objective is achieved is described below with respect to the drawings.
The terminology used, in the present disclosure, is only for the purpose of explaining a particular embodiment and such terminology shall not be considered to limit the scope of the present disclosure. As used in the present disclosure, the forms "a," "an," and "the" may be intended to include the plural forms as well, unless the context clearly suggests otherwise. The terms "comprises," "comprising," "including," and "having," are open ended transitional phrases and therefore specify the presence of stated features, integers, steps, operations, elements, modules, units and/or components, but do not forbid the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. The particular order of steps disclosed in the method and process of the present disclosure is not to be construed as necessarily requiring their performance as described or illustrated. It is also to be understood that additional or alternative steps may be employed.
When an element is referred to as being "mounted on," "engaged to," "connected to," or "coupled to" another element, it may be directly on, engaged, connected or coupled to the other element. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed elements.
Fig. 1 illustrates a block diagram of an exemplary system 100 for monitoring an environment. The system 100 may be configured to determine one or more contexts such as a misplaced object, a suspicious object and/or an undesirable human activity based on comparison of one or more predefined parameters with one or more extracted parameters. The system 100 may include a database unit 102, an
autonomous vehicle 104, and a remote-control station 116 operatively coupled to the autonomous vehicle 104.
The database unit 102 may be operatively coupled to the autonomous vehicle 104 and/or the remote-control station 116. In an embodiment, the database unit 102 may be directly connected to the autonomous vehicle 104 and/or the remote-control station 116. In an alternative embodiment, the database unit 102 may be connected to the autonomous vehicle 104 and/or the remote-control station 116 via a network (not shown). The database unit 102 may be, but is not restricted to, a Random-Access Memory (RAM) unit and/or a non-volatile memory unit such as a Read Only Memory (ROM), an optical disc drive, a magnetic disc drive, a flash memory, an Electrically Erasable Programmable Read Only Memory (EEPROM), a memory space on a server or cloud and so forth.
The database unit 102 may be configured to store the one or more predefined parameters. The one or more predefined parameters may relate to one or more environments. Examples of such an environment may include, but are not limited to, a school, a college, a temple, a shopping mall and so forth. Further, the one or more predefined parameters relate to at least one context in an environment. The at least one context may include an undesirable human activity, a misplaced object and/or a suspicious object.
In an exemplary embodiment, the one or more predefined parameters may include, but are not limited to, a first set of parameters classifying desirable human activity in the environment, a second set of parameters classifying a misplaced object in the environment and a third set of parameters classifying a suspicious object in the environment. The first set of parameters may include at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, posture of a human, action of a human and body language of a human. The second and third sets of parameters may include at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, shape of an object and size of an object. In an embodiment, the one or more predefined parameters may define a range of values for each of these parameters corresponding to the context and/or environment. For example, the first set of parameters in a school campus may define ranges of values for characteristics of audio, such as pitch, language and amplitude, which are permissible within the school campus. In another example, the second set of parameters may define specific characteristics such as the color, shape and size of a misplaced object.
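By way of a purely illustrative, non-limiting sketch (assuming a Python implementation; all class, variable and parameter names below are hypothetical and do not form part of the disclosure), such predefined parameter sets with permissible value ranges could be represented as follows:

    from dataclasses import dataclass, field

    @dataclass
    class ParameterSet:
        """A predefined set of parameters with permissible ranges/values for one context."""
        context: str                                  # e.g. "desirable_human_activity"
        environment: str                              # e.g. "school", "shopping_mall"
        ranges: dict = field(default_factory=dict)    # parameter name -> (min, max)
        values: dict = field(default_factory=dict)    # parameter name -> allowed values

    # First set: audio characteristics permissible within a school campus.
    school_first_set = ParameterSet(
        context="desirable_human_activity",
        environment="school",
        ranges={"pitch_hz": (85.0, 300.0), "amplitude_db": (30.0, 75.0)},
        values={"language": {"English", "Hindi"}},
    )

    # Second set: specific characteristics (colour, shape, size) of a misplaced object.
    mall_second_set = ParameterSet(
        context="misplaced_object",
        environment="shopping_mall",
        ranges={"size_units": (4.0, 6.0)},
        values={"shape": {"rectangular"}, "colour": {"red"}},
    )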
The database unit 102 may be configured to store the one or more predefined parameters based on their relationship with the context and/or the environment. In an alternative embodiment, the database unit 102 may be configured to store the one or more parameters based on the type of parameters. For example, the parameters which relate to audio data may be stored in one category and the parameters which relate to video data may be stored in a second category. However, embodiments are intended to cover, or otherwise cover, any possible storage means for storing the one or more predefined parameters in the database unit 102. In an embodiment, the database unit 102 may be configured to communicate the one or more predefined parameters to the autonomous vehicle 104. In an alternative embodiment, the autonomous vehicle 104 may be configured to fetch the one or more predefined parameters from the database unit 102.
The autonomous vehicle 104 may include, but is not limited to, any type of unmanned vehicle, including an unmanned aerial vehicle, an unmanned terrestrial vehicle, a drone, a gyrocopter, an unmanned oceanic vehicle, etc. In alternative embodiments, the autonomous vehicle 104 may also include a semi-autonomous vehicle which may be remotely controlled by a user. The autonomous vehicle 104 may include at least one media capturing unit 106, a context determination unit 108 and an output unit 114 operatively coupled to each other.
The at least one media capturing unit 106 may be configured to capture at least one of audio data and video data in an environment. The at least one media capturing
unit 106 may include at least one audio capturing unit (not shown) configured to capture audio data in an environment. The at least one media capturing unit 106 may also include at least one video capturing unit (not shown) configured to capture video data in the environment. The audio capturing unit may be, but not limited to, a plurality of microphones. The video capturing unit may be selected from the group consisting of, but not limited to, an optical camera, a stereo camera, an infrared camera, or any combination thereof. The at least one media capturing unit 106 may be operatively coupled to the context determination unit 108. In an embodiment, the at least one media capturing unit 106 may transmit the captured audio data and/or video data to the context determination unit 108.
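As a purely illustrative, non-limiting sketch of how the video capturing unit might acquire a frame (assuming a Python implementation using the OpenCV library; the function name is hypothetical, and audio could be captured analogously with any microphone interface):

    import cv2  # assumes the opencv-python package is available

    def capture_video_frame(camera_index: int = 0):
        """Grab a single frame from a camera of the media capturing unit."""
        cap = cv2.VideoCapture(camera_index)
        try:
            ok, frame = cap.read()   # returns (success flag, image array)
            return frame if ok else None
        finally:
            cap.release()            # always free the camera device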
The context determination unit 108 may be configured to process the captured audio data and the video data to determine at least one context in the environment. The context determination unit 108 may include a memory unit 110 and at least one processing unit 112. In an exemplary embodiment, the at least one processing unit 112 may be configured to process the captured audio data and video data. Examples of the at least one processing unit 112 may include, but are not restricted to, a general-purpose processor, a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), microprocessors, microcomputers, micro-controllers, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. The memory unit 110 may be configured to store the captured audio and video data. The memory unit 110 may be, but is not restricted to, a volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or a non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes.
In an exemplary embodiment, the at least one processing unit 112 may be configured to extract one or more parameters from the at least one captured audio data and video data. The one or more parameters extracted from the audio data may include at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio and words of audio. In an exemplary embodiment, the one or more parameters extracted from the audio data may indicate one or more characteristics of audio detected from the audio data. In particular, the one or more parameters extracted from the audio data may indicate a high-pitch audio which includes abusive words, a low-pitch audio signifying undesirable activity, for example criminal activity, or audio in a language not frequently used in the environment. Further, the one or more parameters extracted from the video data may comprise at least one of shape of an object, size of an object, color of an object, posture of a human, action of a human, and body language of a human. In particular, the one or more parameters extracted from the video data may indicate one or more characteristics of an image or a video detected from the video data. For example, the one or more parameters extracted from the video data may indicate a red colored box which is rectangular in shape. In another example, the one or more parameters extracted from the video data may indicate anchor points of a human body, a height of a human body, the number of human bodies and so on.
Further, the at least one processing unit 112 may include, but is not limited to, a speech recognition unit, a voice recognition unit, an image processing unit, a face recognition unit, an event recognition unit, a context identification unit, a situation prediction unit, and an object recognition unit. The at least one processing unit 112 may be configured to implement one or more techniques to extract one or more parameters from the captured audio data and video data. The techniques to extract one or more parameters from the audio data may include, but are not limited to, speech recognition, natural language processing, and so forth. The techniques to extract one or more parameters from the video data may include, but are not limited to, object detection, human detection, image processing, artificial intelligence and so forth. Embodiments are intended to cover, or otherwise cover, all other potential techniques which can be used to perform the required operations of the at least one processing unit 112.
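Purely as a non-limiting illustration of such parameter extraction (a Python sketch; the helper functions are trivial stubs standing in for any real pitch-detection or speech-recognition technique, and their names are hypothetical):

    import math

    def estimate_pitch(samples, sample_rate):
        # Stub standing in for a real pitch-detection technique (e.g. autocorrelation).
        return 0.0

    def estimate_amplitude_db(samples):
        # Root-mean-square level of the audio samples, expressed in decibels.
        rms = math.sqrt(sum(s * s for s in samples) / max(len(samples), 1))
        return 20.0 * math.log10(rms) if rms > 0 else float("-inf")

    def extract_audio_parameters(samples, sample_rate):
        """Extract characteristics (parameters) of the captured audio data."""
        return {
            "pitch_hz": estimate_pitch(samples, sample_rate),
            "amplitude_db": estimate_amplitude_db(samples),
            # A fuller implementation could also add recognised words, language, etc.
        }

    print(extract_audio_parameters([0.1, -0.2, 0.15, -0.05], 16000))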

The at least one processing unit 112 may be configured to compare the extracted one or more parameters with the one or more predefined parameters to determine the at least one context in the environment. The at least one context comprises at least one of an undesirable human activity, a misplaced object, and a suspicious object. In an embodiment, the at least one processing unit 112 may be configured to compare the extracted one or more parameters with the first set of parameters to determine an undesirable human activity. In another embodiment, the at least one processing unit 112 may be configured to compare the extracted one or more parameters with the second set of parameters to determine a misplaced object. In yet another embodiment, the at least one processing unit 112 may be configured to compare the extracted one or more parameters with the third set of parameters to determine a suspicious object. In an example, the at least one processing unit 112 may extract, from the video data, a size of an object which is 5x5 square units, a shape of the object which is rectangular and a color of the object which is red. The at least one processing unit 112 then compares the extracted parameters with the stored second and third sets of predefined parameters to determine a context. Therefore, in case the extracted parameters match the second set of parameters, the at least one processing unit 112 may identify the object as a misplaced object. The at least one processing unit 112 may be operatively coupled to the output unit 114.
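In a non-limiting Python sketch of this comparison (the data layout and the 5x5 red rectangular object mirror the example above; the set contents are hypothetical):

    def matches(extracted, ranges, values):
        """True when every extracted parameter falls within the predefined set."""
        for name, (low, high) in ranges.items():
            if name in extracted and not (low <= extracted[name] <= high):
                return False
        for name, allowed in values.items():
            if name in extracted and extracted[name] not in allowed:
                return False
        return True

    def determine_context(extracted, second_set, third_set):
        """Classify an object as misplaced or suspicious by comparing parameter sets."""
        if matches(extracted, second_set["ranges"], second_set["values"]):
            return "misplaced_object"
        if matches(extracted, third_set["ranges"], third_set["values"]):
            return "suspicious_object"
        return None

    # Example: a red, rectangular object of size 5x5 square units.
    extracted = {"size_units": 5.0, "shape": "rectangular", "colour": "red"}
    second_set = {"ranges": {"size_units": (4.0, 6.0)},
                  "values": {"shape": {"rectangular"}, "colour": {"red"}}}
    third_set = {"ranges": {}, "values": {"shape": {"money_bag"}}}
    print(determine_context(extracted, second_set, third_set))  # -> misplaced_object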
The output unit 114 may be configured to generate an output signal in response to the determination of at least one context by the context determination unit 108. For example, in the above-mentioned situation, the output signal may be a message indicating that a misplaced object has been found. The output signal may include a message, an email, a video message, an audio message, and/or any other message signal that may indicate the determination of the at least one context. In an embodiment, the output unit 114 may transmit the output signal to a remote-control station 116.
The remote-control station 116 may receive the output signal transmitted from the output unit 114 of the autonomous vehicle 104. The remote-control station 116 may
be configured to process the output signal and generate an alert. In an embodiment, the remote-control station 116 may extract relevant information from the output signal and generate an alert. For example, if an output signal is a message indicating that a misplaced object has been detected, the remote-control station 116 may process that message and generate an alert at a lost-and-found facility. In another example, if an output signal is a message indicating that a suspicious object has been detected, the remote-control station 116 may process that message and generate an alert at a security room. In other embodiments, the remote-control station 116 may be configured to display images, videos, or play a voice recording upon receiving the output signals. In yet another embodiment, the remote-control station 116 may be configured to display a text or a graphic upon receiving the output signals. In a further embodiment, the remote-control station 116 may include an alerting unit 118 to generate an alert based on the determined context. The alerting unit 118 may generate the alert based on a predefined set of rules. The predefined set of rules may define which alert is to be generated corresponding to the context. For example, the predefined set of rules may define that a high audio alert has to be generated when somebody tries to enter office premises by tailgating. Similarly, the predefined set of rules may be based on at least one of the context and/or the environment.
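A purely illustrative Python sketch of such a predefined set of rules mapping a determined context and environment to an alert (the rules and destinations shown are hypothetical examples):

    # Hypothetical rules: (context, environment) -> alert to be generated.
    ALERT_RULES = {
        ("misplaced_object", "shopping_mall"): {"destination": "lost_and_found", "type": "message"},
        ("suspicious_object", "shopping_mall"): {"destination": "security_room", "type": "siren"},
        ("undesirable_human_activity", "office"): {"destination": "entry_gate", "type": "high_audio_alert"},
    }

    def select_alert(context, environment):
        """Look up the alert corresponding to the determined context."""
        default = {"destination": "control_station", "type": "message"}
        return ALERT_RULES.get((context, environment), default)

    print(select_alert("suspicious_object", "shopping_mall"))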
In an alternative embodiment, the alerting unit 118 may be remotely connected to the remote-control station 116 and/or the autonomous vehicle 104 and configured to generate an alert based on the output signal. The alerting unit 118 may receive the output signal from the autonomous vehicle 104 and/or the remote-control station 116. The alerting unit 118 may trigger an annunciator such as an audio alarm, a buzzer, a siren, a hooter or a flashing light. In some other embodiments, the alerting unit 118 may be fitted within the autonomous vehicle 104.
In some embodiments, the autonomous vehicle 104 may also include a location determination unit (not shown) configured to determine a current location of the autonomous vehicle 104. The location determination unit may include, but is not limited to, a Global Positioning System (GPS) sensor. The autonomous vehicle 104 may also include a distance determination unit (not shown) configured to determine a distance of a position of the at least one context with respect to a reference point. The reference point may be the current location of the autonomous vehicle 104. The autonomous vehicle 104 may also be configured to determine the location of the at least one context based on the current location of the autonomous vehicle 104 and the distance of the position of the at least one context. Embodiments are intended to cover, or otherwise cover, any other suitable location determination method which can be used to determine the location of the at least one context. The autonomous vehicle 104 may transmit the determined location of the at least one context to the remote-control station 116 along with the output signal. In some embodiments, the remote-control station 116 may be configured to generate an alert based on the determined location of the at least one context. For example, in case the autonomous vehicle 104 determines a fight between students playing in a playground, the remote-control station 116 may generate an alert message on a portable device of a security guard near the playground instead of generating an alert for all the security guards.
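By way of a non-limiting Python sketch (assuming, for illustration, a flat-earth offset valid only for short ranges and a bearing reported alongside the measured distance; all coordinates and names are hypothetical), the location of the context and the nearest guard to alert could be estimated as follows:

    import math

    EARTH_RADIUS_M = 6_371_000.0

    def offset_position(lat_deg, lon_deg, distance_m, bearing_deg):
        """Approximate the latitude/longitude a given distance and bearing away."""
        d_lat = (distance_m * math.cos(math.radians(bearing_deg))) / EARTH_RADIUS_M
        d_lon = (distance_m * math.sin(math.radians(bearing_deg))) / (
            EARTH_RADIUS_M * math.cos(math.radians(lat_deg)))
        return lat_deg + math.degrees(d_lat), lon_deg + math.degrees(d_lon)

    def nearest_guard(context_pos, guards):
        """Pick the guard whose last known position is closest to the context."""
        return min(guards, key=lambda g: math.hypot(g["pos"][0] - context_pos[0],
                                                    g["pos"][1] - context_pos[1]))

    vehicle_fix = (18.5590, 73.9400)                          # hypothetical GPS fix
    context_pos = offset_position(*vehicle_fix, 35.0, 120.0)  # 35 m away, bearing 120 deg
    guards = [{"id": "G1", "pos": (18.5591, 73.9405)},
              {"id": "G2", "pos": (18.5620, 73.9300)}]
    print(nearest_guard(context_pos, guards)["id"])           # alert only the nearest guard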
Fig. 2 illustrates an autonomous device 200 (interchangeably referred to as “the device 200”) in accordance with an embodiment of the present disclosure. The autonomous device 200 depicted in Fig. 2 is similar to the autonomous vehicle 104 depicted in Fig. 1. In the interest of brevity and clarity, the description of similar components will not be repeated.
The autonomous device 200 may include a media capturing unit 202, a context determination unit 208, an output unit 220, an alerting unit 222, a location determination unit 216, a distance determination unit 218, a central controlling unit 214 and a communication unit 224. Each of the said components of the device 200 may be operatively connected to each other.
The media capturing unit 202 may be similar to the media capturing unit 106 of the autonomous vehicle 104. The media capturing unit 202 may be configured to capture at least one of audio data and video data in an environment. In some embodiments, the media capturing unit 202 may be operatively coupled to the central controlling unit 214. The central controlling unit 214 may be configured to control the media capturing unit 202. In an embodiment, the central controlling unit 214 may be configured to allow a remote user to change the angle/position of the media capturing unit 202 to make closer inspections and improve the operation of the media capturing unit 202. The media capturing unit 202 may include a video capturing unit 204 configured to capture video data in the environment. The media capturing unit 202 may also include an audio capturing unit 206 configured to capture audio data in the environment. In an embodiment, the media capturing unit 202 may be configured to transmit the captured audio data and video data to the context determination unit 208.
The context determination unit 208 may be similar to the context determination unit 108 of the autonomous vehicle 104. The context determination unit 208 may include a memory unit 210 configured to store the captured audio data and video data. The context determination unit 208 may also include at least one processing unit 212 operatively coupled to the memory unit 210. The context determination unit 208 is configured to process the at least one captured audio data and video data and determine at least one context in the environment. The at least one context may include at least one of an undesirable human activity, a misplaced object, and a suspicious object. The context determination unit 208 may determine the at least one context in the environment by extracting one or more parameters from the at least one captured audio data and video data and comparing the extracted one or more parameters with one or more predefined parameters related to the at least one context in the environment. In an embodiment, the one or more predefined parameters may be stored in the memory unit 210. In alternative embodiments, the at least one processing unit 212 may retrieve the one or more predefined parameters from a remote server and store them in the memory unit 210. The context determination unit 208 may be communicably coupled to the central controlling unit 214. The context determination unit 208 may communicate the identified context to the central controlling unit 214.
The device 200 may also include the location determination unit 216. The location determination unit 216 may be configured to determine a location of the device 200. The location determination unit 216 may identify position data such as, but not limited to, latitude, longitude, and altitude of the device 200. The location determination unit 216 may include any location sensing device, such as a satellite navigation unit (GPS), an inertial navigation unit, or other location sensing devices. In an embodiment, the location determination unit 216 may form part of the context determination unit 208.
The device 200 may also include the distance determination unit 218. The distance determination unit 218 may be configured to determine a distance between a first reference point and a second reference point. In an exemplary embodiment, the first reference point may be the current location of the device 200. The second reference point may be a position of the determined context. For example, if the determined context relates to an undesirable human activity, the second reference point may be a position of one or more persons involved in such activity. In another example, if the determined context relates to a suspicious or misplaced object, the second reference point may be a position of the object. The distance determination unit 218 may be any suitable distance determining sensor such as, but not limited to, an ultrasonic sensor, a Light Detection and Ranging (LIDAR) sensor, a Radio Detection and Ranging (RADAR) sensor, or any combination thereof. In an embodiment, the distance determination unit 218 may form part of the context determination unit 208.
Each of the location determination unit 216 and the distance determination unit 218 may also be communicably coupled to the central controlling unit 214. The location determination unit 216 may be configured to communicate the current location of the device 200 to the central controlling unit 214. The distance determination unit
218 may be configured to communicate the determined distance to the central controlling unit 214. The central controlling unit 214 may be configured to determine a position of the context based on the current location of the device 200 and the determined distance between the device 200 and the context.
The central controlling unit 214 may be configured to communicate the determined context and the respective position of the context to the output unit 220. The output unit 220 may be similar to the output unit 114 of the autonomous vehicle 104. The output unit 220 may generate an output signal based on the determined context and/or the respective position of the context. The output unit 220 may be communicably coupled to the alerting unit 222.
The alerting unit 222 may be similar to the alerting unit 118 of the system 100. The alerting unit 222 may be configured to generate an alert based on the identified context and/or the respective position of the context.
In an embodiment, the device 200 may be communicably coupled to a remote-control station 116 (shown in Fig. 1) via the communication unit 224. The communication unit 224 may be configured to establish a connection between the device 200, one or more remote-control stations and/or a user via a communication network. The communication network may include, but is not limited to, a wireless Local Area Network (LAN), a Wide Area Network (WAN), a packet data network, a mobile telephone network (e.g., cellular networks), a wireless data network and so forth. In some embodiments, the device 200 may transmit the determined context and the respective position of the context to the remote-control station 116.
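As a non-limiting illustration of such a transmission (a Python sketch using only the standard library; the remote-control station URL is hypothetical):

    import json
    import urllib.request

    def transmit_output_signal(context, position, station_url="http://control-station.local/alerts"):
        """Send the determined context and its position to the remote-control station."""
        payload = json.dumps({"context": context, "position": position}).encode("utf-8")
        request = urllib.request.Request(station_url, data=payload,
                                         headers={"Content-Type": "application/json"})
        with urllib.request.urlopen(request, timeout=5) as response:   # network call
            return response.status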
Fig. 3 illustrates a flowchart of an exemplary method 300 of generating an output based on a determined context, in accordance with an embodiment of the present disclosure. This flowchart is merely provided for exemplary purposes, and embodiments are intended to include or otherwise cover any methods or procedures for determining a context. Fig. 3 is described with reference to Figs. 1-2.

In accordance with the flowchart of Fig. 3, at block 302, at least one of audio data and video data is captured in an environment by the media capturing unit (106, 202). In an embodiment, the media capturing unit (106, 202) may capture audio data and video data in a shopping mall.
At block 304, the at least one captured audio data and video data is processed by the context determination unit (108, 208) to determine at least one context in the shopping mall. The context determination unit (108, 208) may process the at least one captured audio data and video data by extracting one or more parameters from the at least one captured audio data and video data and comparing the extracted one or more parameters with one or more predefined parameters related to the at least one context in the shopping mall, wherein the at least one context comprises at least one of an undesirable human activity, a misplaced object and a suspicious object in the shopping mall. The one or more predefined parameters may include a first set of parameters classifying desirable human activity in the shopping mall, a second set of parameters classifying a misplaced object in the shopping mall and a third set of parameters classifying a suspicious object in the shopping mall. Further, the first set of parameters may include at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, posture of a human, action of a human and body language of a human. The second and third sets of parameters may include at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, shape of an object and size of an object. The one or more predefined parameters may be stored in the database unit 102 and/or in the memory unit (110, 210).
In an exemplary scenario, wherein the environment is a shopping mall, as a non-limiting example the first set of parameters may define types of audio such as human conversation, loudspeaker announcements and footsteps as relating to desirable human activity. Further, a type of audio such as a gun sound may be defined as undesirable human activity. In an environment like a shopping mall, a high pitch of audio between two or more persons may be defined as undesirable human activity. Moreover, language and words of audio which are non-offensive and include polite words may be regarded as desirable human activity, while language and words of audio which are offensive or aggressive and which include slang and abusive words may be regarded as undesirable human activity. Further, in an environment like a shopping mall, an action of a human, for example fast running, may be regarded as undesirable human activity, while an action of a human such as dancing may be regarded as desirable human activity. Similarly, the other one or more parameters may be defined based on the environment and the context. The context determination unit (108, 208) may extract the one or more parameters such as, but not limited to, type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio and words of audio from the audio data.
The context determination unit (108, 208) may extract the one or more parameters such as shape of an object, size of an object, posture of a human, action of a human, and body language of a human from the video data. The context determination unit (108, 208) then performs the comparison of the one or more extracted parameters and the predefined parameters. For example, in this scenario, the context determination unit (108, 208) may first identify a human based on a technique such as, but not limited to, image processing. Then the context determination unit (108, 208) may identify anchor points to identify the posture and/or action of the human body. Thereafter, the context determination unit (108, 208) may compare the action of the human body with the action defined by the one or more predefined parameters. If the action matches the action specified as fast running, the context determination unit (108, 208) may define the context as undesirable human activity.
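A non-limiting Python sketch of this step (the anchor-point coordinates, frame interval, pixel scale and speed threshold are all hypothetical, assumed values):

    FAST_RUNNING_SPEED_M_PER_S = 4.0   # assumed predefined parameter

    def estimate_speed(anchor_points, frame_interval_s, metres_per_pixel):
        """Average speed of a tracked anchor point (e.g. the hip) across frames."""
        if len(anchor_points) < 2:
            return 0.0
        total_px = 0.0
        for (x0, y0), (x1, y1) in zip(anchor_points, anchor_points[1:]):
            total_px += ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        pixels_per_s = total_px / (frame_interval_s * (len(anchor_points) - 1))
        return pixels_per_s * metres_per_pixel

    def classify_action(anchor_points, frame_interval_s=0.04, metres_per_pixel=0.01):
        speed = estimate_speed(anchor_points, frame_interval_s, metres_per_pixel)
        return ("undesirable_human_activity"
                if speed > FAST_RUNNING_SPEED_M_PER_S else "desirable_human_activity")

    print(classify_action([(100, 200), (130, 200), (162, 201), (195, 203)]))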
At block 306, an output signal is generated by the output unit (114, 220) in response to the determination of the at least one context. For example, the output signal may be a message indicating that a person who is running fast has been detected. In some embodiments, a message may be transmitted to the remote-control station and/or the alerting unit to generate an alert. The alert may be an audio hooter or an image/video displaying the running person.
Embodiments of the present disclosure may be implemented in various scenarios. One such scenario may relate to a misplaced bicycle. In this scenario, one or more predefined parameters relating to the bicycle may be predefined in the database unit 102. For example, a second set of parameters may define that an object having the shape of a bicycle (shape of object) which is X inches in size (size of object) may be regarded as a misplaced bicycle. Therefore, when the autonomous vehicle 104 identifies an object and the extracted parameters of said object match the predefined parameters of the misplaced object, the autonomous vehicle 104 may identify the object as a misplaced object (i.e. a misplaced bicycle) and generate an output signal. Thereafter, an alert may be generated based on the output signal. In this example, the alert may be an instant message to the owner of the bicycle.
In another embodiment, one or more predefined parameters relating to a suspicious object may be stored in the database unit 102. For example, a third set of parameters may define that a money-bag shaped object (shape of object), in a public place (environment), which generates a clock sound (type of sound) may be regarded as a suspicious object. Therefore, when the autonomous vehicle 104 identifies an object and the extracted parameters of said object match the predefined parameters of the suspicious object, the autonomous vehicle 104 may classify the object as a suspicious object and generate an output signal that drives the alerting unit 118 to generate a "suspicious item" alert. The alerting unit 118 may communicate the "suspicious item" alert to the security personnel at that campus.
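Expressed as a short, non-limiting Python sketch (the third-set contents shown are hypothetical):

    # Hypothetical third set of parameters for a suspicious object.
    third_set = {"shape": {"money_bag"}, "sound_type": {"ticking"}, "environment": {"public_place"}}
    extracted = {"shape": "money_bag", "sound_type": "ticking", "environment": "public_place"}

    # The object is classified as suspicious only if every extracted parameter matches.
    if all(extracted.get(name) in allowed for name, allowed in third_set.items()):
        output_signal = {"context": "suspicious_object", "alert": "suspicious item"}
        print(output_signal)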
The alerts generated by the alerting unit 118 may trigger the annunciators to notify a user or an operator about the identified misplaced object, suspicious object, or undesirable human activity.
The foregoing description of the specific embodiments so fully reveals the general nature of the embodiments herein that others can, by applying current knowledge,
readily modify and/or adapt for various applications such specific embodiments without departing from the generic concept, and, therefore, such adaptations and modifications should and are intended to be comprehended within the meaning and range of equivalents of the disclosed embodiments. It is to be understood that the phraseology or terminology employed herein is for the purpose of description and not of limitation. Therefore, while the embodiments herein have been described in terms of preferred embodiments, those skilled in the art will recognize that the embodiments herein can be practiced with modification within the spirit and scope of the embodiments as described herein.
Throughout this specification the word "comprise", or variations such as "comprises" or "comprising", will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
The use of the expression "at least" or "at least one" suggests the use of one or more elements or ingredients or quantities, as the use may be in the embodiment of the disclosure to achieve one or more of the desired objects or results.
Any discussion of documents, acts, materials, devices, articles or the like that has been included in this specification is solely for the purpose of providing a context for the disclosure. It is not to be taken as an admission that any or all of these matters form a part of the prior art base or were common general knowledge in the field relevant to the disclosure as it existed anywhere before the priority date of this application.
While considerable emphasis has been placed herein on the components and component parts of the preferred embodiments, it will be appreciated that many embodiments can be made and that many changes can be made in the preferred embodiments without departing from the principles of the disclosure. These and other changes in the preferred embodiment as well as other embodiments of the
disclosure will be apparent to those skilled in the art from the disclosure herein, whereby it is to be distinctly understood that the foregoing descriptive matter is to be interpreted merely as illustrative of the disclosure and not as a limitation.

We claim:
1. A method comprising:
capturing at least one of audio data and video data in an environment;
processing the at least one captured audio data and video data to determine at least one context in the environment, wherein the at least one captured audio data and video data is processed upon extracting one or more parameters from the at least one captured audio data and video data and comparing the extracted one or more parameters with one or more predefined parameters related to the at least one context in the environment, and wherein the at least one context comprises at least one of an undesirable human activity, a misplaced object and a suspicious object; and
generating an output signal in response to the determination of the at least one context.
2. The method of claim 1, wherein:
the one or more predefined parameters comprises:
a first set of parameters classifying desirable human activity in the environment;
a second set of parameters classifying misplaced object in the environment; and
a third set of parameters classifying suspicious object in the environment;
and
wherein processing the at least one captured audio data and video data to determine the at least one context comprises:
comparing the extracted one or more parameters with the first set of parameters to determine undesirable human activity;
comparing the extracted one or more parameters with the second set of parameters to determine the misplaced object; and
comparing the extracted one or more parameters with the third set of parameters to determine the suspicious object.

3. The method of claim 1, wherein the one or more parameters extracted from
the audio data comprises at least one of type of audio, frequency of audio, amplitude
of audio, pitch of audio, language of audio and words of audio; and
wherein the one or more parameters extracted from the video data comprises at least one of shape of an object, color of an object, size of an object, posture of a human, action of a human, and body language of a human.
4. The method of claim 2, wherein:
the first set of parameters comprises at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, posture of a human, action of a human and body language of a human; and
the second and third set of parameters comprises at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, color of an object, shape of an object and size of an object.
5. The method of claim 1, further comprising:
determining a distance of at least one of undesirable human activity, misplaced object, and suspicious object with respect to a reference point; and
determining a location of at least one of undesirable human activity, misplaced object and suspicious object based on the respective determined distance of the at least one of undesirable human activity, misplaced object, and suspicious object.
6. The method of claim 1, further comprising:
processing the output signal to generate an alert or transmitting the output signal to a remote-control station for generating an alert.
7. A device comprising:
at least one media capturing unit configured to capture at least one of audio data and video data in an environment;
a context determination unit operatively coupled to the at least one media capturing unit and comprising:
a memory unit; and
at least one processing unit operatively coupled to the memory unit and configured to process the at least one captured audio data and video data to determine at least one context in the environment by extracting one or more parameters from the at least one captured audio data and video data and compare the extracted one or more parameters with one or more predefined parameters related to the at least one context in the environment, wherein the at least one context comprises at least one of an undesirable human activity, a misplaced object and a suspicious object; and
an output unit operatively coupled to the context determination unit and configured to generate an output signal in response to the determination of the at least one context.
8. The device of claim 7, wherein the one or more predefined parameters are stored in a database unit operatively coupled to the at least one processing unit.
9. The device of claim 7, wherein:
the one or more predefined parameters comprises:
a first set of parameters classifying desirable human activity in the environment;
a second set of parameters classifying misplaced object in the environment; and
a third set of parameters classifying suspicious object in the environment;
and
wherein the at least one processing unit is configured to process the at least one captured audio data and captured video data to determine the at least one context by:
comparing the extracted one or more parameters with the first set of parameters to determine undesirable human activity;
comparing the extracted one or more parameters with the second set of parameters to determine the misplaced object; and
comparing the extracted one or more parameters with the third set of parameters to determine the suspicious object.
10. The device of claim 7, wherein the one or more parameters extracted from
the audio data comprises at least one of type of audio, frequency of audio, amplitude
of audio, pitch of audio, language of audio and words of audio; and
wherein the one or more parameters extracted from the video data comprises at least one of shape of an object, color of an object, size of an object, posture of a human, action of a human, and body language of a human.
11. The device of claim 9, wherein:
the first set of parameters comprises at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, posture of a human, action of a human and body language of a human; and
the second and third set of parameters comprises at least one of type of audio, frequency of audio, amplitude of audio, pitch of audio, language of audio, words of audio, color of an object, shape of an object and size of an object.
12. The device of claim 7, wherein the context determination unit comprises a
distance determination unit configured to determine a distance of at least one of
undesirable human activity, misplaced object and suspicious object with respect to
a reference point, wherein the reference point is a current location of the device;
and
wherein the context determination unit comprises a location determination unit operatively coupled to the distance determination unit and configured to determine a location of at least one of undesirable human activity, misplaced object and suspicious object based on the respective determined distance of the at least one of undesirable human activity, misplaced object and suspicious object.
13. The device of claim 7, wherein the at least one media capturing unit
comprises at least one audio capturing unit for capturing the audio data and at least
one video capturing unit for capturing the video data;
wherein the at least one audio capturing unit comprises one or more microphones; and
wherein the at least one video capturing unit comprises at least one of an optical camera, a stereo camera and an infrared camera.
14. The device of claim 7, wherein the at least one processing unit is configured to process the output signal to generate an alert or wherein the output unit is configured to transmit the output signal to a remote-control station for generating an alert.
15. A system comprising:
a database unit configured to store one or more predefined parameters;
an autonomous vehicle comprising:
at least one media capturing unit configured to capture at least one of audio
data and video data in an environment;
a context determination unit operatively coupled to the at least one media
capturing unit and comprising:
a memory unit; and
at least one processing unit operatively coupled to the memory unit and configured to process the at least one captured audio data and video data to determine at least one context in the environment by extracting one or more parameters from the at least one captured audio data and video data and compare the extracted one or more parameters with the one or more predefined parameters related to the at least one context in the environment, wherein the at least one context comprises at least one of an undesirable human activity, a misplaced object and a suspicious object;
an output unit operatively coupled to the context determination unit and configured to:
generate an output signal in response to the determination of the at least one context and transmit the generated output signal; and
a remote-control station operatively coupled to the autonomous vehicle and configured to:
receive the output signal; and
process the received output signal to generate an alert.

Documents

Application Documents

# Name Date
1 201821049794-STATEMENT OF UNDERTAKING (FORM 3) [29-12-2018(online)].pdf 2018-12-29
2 201821049794-PROVISIONAL SPECIFICATION [29-12-2018(online)].pdf 2018-12-29
3 201821049794-PROOF OF RIGHT [29-12-2018(online)].pdf 2018-12-29
4 201821049794-POWER OF AUTHORITY [29-12-2018(online)].pdf 2018-12-29
5 201821049794-FORM 1 [29-12-2018(online)].pdf 2018-12-29
6 201821049794-DRAWINGS [29-12-2018(online)].pdf 2018-12-29
7 201821049794-DECLARATION OF INVENTORSHIP (FORM 5) [29-12-2018(online)].pdf 2018-12-29
8 201821049794-Proof of Right (MANDATORY) [07-05-2019(online)].pdf 2019-05-07
9 201821049794-RELEVANT DOCUMENTS [19-11-2019(online)].pdf 2019-11-19
10 201821049794-FORM 13 [19-11-2019(online)].pdf 2019-11-19
11 201821049794-FORM 18 [29-12-2019(online)].pdf 2019-12-29
12 201821049794-DRAWING [29-12-2019(online)].pdf 2019-12-29
13 201821049794-CORRESPONDENCE-OTHERS [29-12-2019(online)].pdf 2019-12-29
14 201821049794-COMPLETE SPECIFICATION [29-12-2019(online)].pdf 2019-12-29
15 201821049794-ORIGINAL UR 6(1A) FORM 1-080519.pdf 2019-12-31
16 Abstract1.jpg 2020-01-01
17 201821049794-FER.pdf 2021-10-18
18 201821049794-OTHERS [12-01-2022(online)].pdf 2022-01-12
19 201821049794-FER_SER_REPLY [12-01-2022(online)].pdf 2022-01-12
20 201821049794-COMPLETE SPECIFICATION [12-01-2022(online)].pdf 2022-01-12
21 201821049794-CLAIMS [12-01-2022(online)].pdf 2022-01-12
22 201821049794-US(14)-HearingNotice-(HearingDate-20-02-2024).pdf 2024-01-08
23 201821049794-US(14)-ExtendedHearingNotice-(HearingDate-02-04-2024).pdf 2024-02-13
24 201821049794-Correspondence to notify the Controller [19-03-2024(online)].pdf 2024-03-19

Search Strategy

1 2021-05-3110-56-41E_31-05-2021.pdf