ABSTRACT
Title: A METHOD AND SYSTEM FOR EDGE TO CLOUD VIDEO COMPUTING
The present invention discloses a method and a system that enable geographically distributed Edge devices to amalgamate various sensory data with video data, to carry out video computing tasks on those video data in a collaborative manner with the help of a Cloud resident computing stack, and to correlate video metadata with sensory data in a single unified structure. The Edge devices are configured to connect with one or more video sources and are equipped with a processor, memory and storage to execute computer programs, enabling each device to perform various video computing tasks in coordination with the Video computing cloud stack. Fig. 1
FIELD OF INVENTION:
The present invention relates to an integrated computing framework involving geographically distributed devices (Edge devices) and a Cloud resident computing facility (Video computing cloud stack). More specifically, the present invention is directed to a method and system that enable a plurality of geographically distributed Edge devices to amalgamate various sensory data with video data, to carry out video computing tasks on those video data in a collaborative manner with the help of the Cloud resident computing stack, and to correlate video metadata with sensory data in a single unified structure. The Edge devices are configured to connect with one or more video sources and are equipped with a processor, memory and storage to execute computer programs, enabling each device to perform various video computing tasks in coordination with the Video computing cloud stack. Advantageously, the Edge devices extend their computing capability beyond what exists within the devices themselves with the help of the Cloud stack, and also piggyback the sensory data along with the data generated from video analysis onto the video, thereby generating a multi-sensory data stack. A set of mechanisms involving computing and communication is developed whereby sensory devices can be connected to the Edge device to initiate one or more video computing tasks depending on the sensory data value, thus serving the purpose of a flexible distributed computing platform for video computing and sensory data handling tasks in the Video IoT (Internet of Things) domain.
BACKGROUND OF THE INVENTION:
IoT is now all-pervasive, and there are many frameworks to handle and manage IoT devices. The sensory data points are usually generated by various transducers in the IoT domain, and the main focus is to transmit, store and analyse these data using some central computing facility. The cloud comes into the picture at this stage, hosting the required central computing facility. However, in the case of video, the data points are contained inside the video streams, and a significant amount of computing is required to extract the data points from the video streams using various means, e.g. video analytics. Unlike sensory data, video is voluminous and its transportation requires significant resources in terms of network bandwidth and computing infrastructure. With the introduction of multi-megapixel cameras, it has become necessary to process the video at the Edge (i.e. where the cameras are located). This in turn requires substantial computing infrastructure at the Edge, yet Edge devices number in the millions and, in many cases, they are located in adverse environments where high-end processors may not survive or do not have sufficient power for their operation.
Another challenge is to correlate sensory data with video data. This correlation is required to obtain better situational awareness and also to devise correct action plans when the sensory data shows any kind of anomaly. The correlation is usually done using some database at the central computing facility, and anybody who wants to access the correlated data points needs to consult the database and devise different mechanisms to access those data. Data sharing thus becomes a complex task.
Thus, there has been a need for an improved technique that addresses the above two problems.
OBJECT OF THE INVENTION:
It is thus the basic object of the present invention to provide a system and a method for advanced integration of sensory data with video at the source, for easy correlation of multiple data points in a single multi-sensory data stack, so as to advance the mechanisms of data correlation in both the temporal and spatial domains, data transportation and data sharing.
Another object of the present invention is to develop a system and a method to enhance the sensory data computing paradigm by involving a mechanism of collaborative computing between Edge devices and a cloud computing facility, where a given computing task is distributed between an Edge device and a cloud resident video computing stack depending on the limitations of their respective computing and communication infrastructure.
Yet another object of the present invention is to provide a system and a method that allow various agencies to append information to the same multi-sensory data stack, thereby enriching the data and creating enhanced situational awareness.
SUMMARY OF THE INVENTION:
Thus, according to the basic aspect of the present invention, there is provided a method for geographically distributed edge device based amalgamation of various sensory data with video data, and for carrying out video computing tasks on said video data involving the sensory data, comprising:
involving multiple geographically distributed edge devices and configuring each of said edge devices to connect with one or more video sources and external sensory devices;
configuring said video sources to stream the video data to its corresponding edge device;
configuring said external sensory devices to send signals to the corresponding edge device over internet protocol based connection on detection of any anomaly in the data that the sensory devices generate;
involving a storage element of the edge device to store the received data in an event queue with timestamps recording their time of generation;
involving a processor of the edge device to segment the video stream into video files, read the queue and generate a Group of Pictures (GOP) header with the sensory data synced in the timeline with the video data contained in the GOP, thus forming an extension of the video data, or a multi-sensory data stack, for the video computing tasks.
In the above method, the edge device allows the external sensory device to send periodic sensory data with a request to embed those data in the video clips, with the intention of obtaining a visual footprint of the location that the external sensory device monitors.
In the above method, the edge device is connected to a cloud resident video computing stack for data exchange, and users use a graphical user interface to configure the edge device for its operation, including requesting the edge device to perform various video computing tasks as independent activities or in response to the value of the sensory data input.
In the above method, the edge device, on receiving the service request from the users, forwards a request to the Cloud computing stack for the said service, and the Cloud computing stack in turn sends a communication to the Edge device to execute a certain program embodied in the processor, identified by an APP ID, the edge device accordingly generating the multi-sensory data stack which is then sent to the Cloud for further processing.
In the above method, the sensory data is embedded in the video files by:
configuring the sensory devices to use an API server which places sensor-generated events or periodic sensory data in the event queue;
configuring Packetizer-1 in the processor to take the video feed from the video sources, i.e. cameras, segment the video streams into chunks of some predefined time duration (video files) and embed the event data taken from the queue in the video files; and
storing the sensory data embedded video files thus generated in store 1.
In the above method, the Video computing stack divides the request into two parts, Part A: a task that is to be executed by the Cloud stack itself, and Part B: a task that is to be executed by the edge device which has requested the service, whereby a controller in the edge device, upon receiving the service request from the users, sends the APP ID corresponding to the service request to the Cloud resident video computing stack, and the video computing stack in turn divides the task into the two parts (Part A and Part B) based on a preset rule and sends the executable code corresponding to Part B to the edge device; and
wherein the cloud resident computing stack puts the program code and configuration files required to run the program in a file folder, zips the folder and sends the name of the folder to the controller; and
wherein a Code collector of the edge device receives the name of the zipped folder from the controller, downloads that particular zipped folder, unzips it and stores it in the application store, including downloading multiple such folders corresponding to multiple service requests from the users.
In the above method, the folders contain a shell script with a fixed name, and an App starter program executed on the edge device reads the store and, against each folder, spawns an execution thread and runs the corresponding shell script within it.
In the above method, the edge device includes App threads and Packetizer-2, wherein the App thread connects to store 1 to read the sensory data embedded video files and analyses the content of the video frames of the files using the algorithm intrinsic to the executable code (Part B), and the metadata it generates is pushed to an App queue by the App thread; and
wherein Packetizer-2 appends this metadata to the sensory data embedded video files, thus constructing the next layer within the multi-sensory data stack, and the files created by Packetizer-2 are stored in store 2.
In the above method, store 2 contains the composite data consisting of video, sensory data and metadata generated by the video analytics application (Part B), and this composite data is then sent to the Cloud stack for further processing corresponding to Part A.
According to another aspect of the present invention, there is provided a geographically distributed edge device based system to amalgamate various sensory data with video data and to carry out video computing tasks on said video data involving the sensory data, comprising:
multiple geographically distributed edge devices, each of said edge devices being connected with one or more video sources and external sensory devices;
said video sources stream the video data to its corresponding edge device;
said external sensory devices send signals to the corresponding edge device over internet protocol based connection on detection of any anomaly in the data that the sensory devices generate;
one or more storage elements in the edge device to store the received data in an Event queue with timestamps recording their time of generation;
a processor in the edge device to segment the video stream into video files, read the queue and generate a Group of Pictures (GOP) header with the sensory data synced in the timeline with the video data contained in the GOP, thus forming an extension of the video data, or a multi-sensory data stack, for the video computing tasks.
BRIEF DESCRIPTION OF THE ACCOMPANYING DRAWINGS:
Fig. 1 shows the components associated with the system that enables a plurality of geographically distributed Edge devices to amalgamate various sensory data with video data, to carry out video computing tasks on those video data in a collaborative manner with the help of the Cloud resident computing stack, and to correlate video metadata with sensory data in a single unified structure, in accordance with an embodiment of the present invention.
Fig. 2 shows the mechanism to embed sensory data inside video files in accordance with an embodiment of the present invention.
Fig. 3 shows the mechanism to run Part B of the video computing task in accordance with an embodiment of the present invention.
Fig. 4 shows the mechanism to infuse App generated data in the multi-sensory data stack in accordance with an embodiment of the present invention.
Fig. 5 shows the structure of a video frame in a video file in accordance with an embodiment of the present invention.
Fig. 6 shows the structure of a sensor data embedded video file in accordance with an embodiment of the present invention.
Fig. 7 shows the mechanism to insert sensory event data in the multi-sensory data stack in accordance with an embodiment of the present invention.
DESCRIPTION OF THE INVENTION WITH REFERENCE TO THE ACCOMPANYING DRAWINGS:
As stated hereinbefore, the present invention discloses a method and a system that enable geographically distributed Edge devices to amalgamate various sensory data with video data, to carry out video computing tasks on those video data in a collaborative manner with the help of the Cloud resident computing stack, and to correlate video metadata with sensory data in a single unified structure. The Edge devices are configured to connect with one or more video sources and are equipped with a processor, memory and storage to execute computer programs, enabling each device to perform various video computing tasks in coordination with the Video computing cloud stack.
The Edge device is primarily connected to various video sources and receives the video streams. The video streams are chunked into small segments of predefined time duration (say, 10 sec). The external sensory devices connected to the device are free to send signals to the device over an internet protocol based API. The external sensory devices are expected to send signals on detection of any anomaly in the data that the sensors generate. For example, an external sensory device can send a signal when the AQI (Air Quality Index) is beyond an acceptable limit, or when a motor runs slower than its normal speed.
The device also allows the external sensory device to send periodic sensory data with a request to embed those data in the video clips, with the intention of obtaining a visual footprint of the location that the external sensory device monitors. The device receives the data through the API interface and stores them in a storage element of the edge device as an Event queue, with timestamps recording their time of generation. The Packetizer within the device processor segments the video stream into video files, reads the queue and generates a Group of Pictures (GOP) header with the sensory data synced in the timeline with the video data contained in the GOP, thus forming an extension of the video data, or a multi-sensory data stack.
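Purely by way of illustration, a minimal sketch of how such a timestamped Event queue and the GOP-level embedding step could be realised is given below; the field names (sensor_id, value, ts) and the JSON encoding of the header are assumptions made for the example and are not the device's actual on-disk format.

```python
import json
import time
from collections import deque

event_queue = deque()  # Event queue: timestamped sensory data received over the API interface

def on_sensor_event(sensor_id, payload):
    """Store a received sensory datum together with its time of generation."""
    event_queue.append({"sensor_id": sensor_id, "value": payload, "ts": time.time()})

def build_gop_header(gop_start_ts, gop_end_ts):
    """Drain queued sensory data up to the end of this GOP's timeline and
    serialise them as an extension header for the corresponding video chunk."""
    synced = []
    while event_queue and event_queue[0]["ts"] <= gop_end_ts:
        synced.append(event_queue.popleft())
    header = {"gop_start": gop_start_ts, "gop_end": gop_end_ts, "sensors": synced}
    return json.dumps(header).encode("utf-8")

# Example: a 10-second chunk whose GOP header absorbs the queued events
on_sensor_event("aqi-01", {"aqi": 212})
print(build_gop_header(time.time() - 10, time.time()))
```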
The external sensory devices register themselves with the device by sending their unique GUID. Once registered, an external sensory device can use any one or more of the services that the Edge device offers, e.g., “embed data in video”, “detect number of persons in the scene and embed data in a 15 sec video clip”, “measure traffic flow and embed data in a 15 sec video clip”, etc. For example, an AQI sensor can keep sending Air Quality Index data and ask the device to detect traffic flow if the AQI data exceeds a preset threshold value.
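The registration and service request exchange could, for instance, be carried over a simple HTTP/JSON interface as sketched below; the endpoint paths, field names and threshold value are hypothetical illustrations of the GUID based registration described above, not the device's actual API.

```python
import json
import urllib.request
import uuid

EDGE_API = "http://edge-device.local:8080"   # hypothetical address of the Edge device API server

def post_json(path, body):
    """Send a JSON body to the Edge device API over an IP based connection."""
    req = urllib.request.Request(
        EDGE_API + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)

sensor_guid = str(uuid.uuid4())

# One-time registration of the external sensory device with its unique GUID
post_json("/register", {"guid": sensor_guid, "type": "aqi"})

# Periodic AQI reading; if the value exceeds a preset threshold, request the
# service "measure traffic flow and embed data in a 15 sec video clip"
reading = {"guid": sensor_guid, "aqi": 245}
if reading["aqi"] > 200:
    reading["service"] = "measure traffic flow and embed data in a 15 sec video clip"
post_json("/event", reading)
```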
The device, when registered with multiple sensory data sources, receives a set of requests from those external sensory devices and sends the requests to the Cloud resident video computing stack. The Video computing stack divides those requests into two parts,
(i) Part A: A task that is to be executed by the Cloud stack itself and
(ii) Part B: A task that is to be executed by the Edge device which has requested the service.
If the device is not already equipped with the computer program to do its task (Part B), then the Cloud computing stack is equipped to send an executable code to the device and the device is equipped to receive the code. Packetizer-1 embeds sensory data in the video files, App threads analyse the sensory data embedded packetized video files by running the executable code and generate metadata related to the video segment contained in the file, and Packetizer-2 appends this metadata to the video files and stores them in storage.
The Task scheduler periodically polls the storage and sends the extended video files (multi-sensory data stack) to the Cloud resident stack for further processing. For example, when the request is to detect traffic flow in a scene, the Cloud resident computing stack instructs the device, using the mechanism stated above, to send video clips whenever there is motion in the scene. The executable code inside the device analyses the video streams and pushes an event to the App queue when motion is detected in the scene, so that Packetizer-2 can append motion related metadata (time of motion, degree of motion) to the sensory data embedded video clips and store them in storage. If the program to detect motion by analysing the video stream is not available in the device, the Cloud resident computing stack sends an executable code to the device, the executable code having the capability of analysing the video stream and detecting motion in the scene.
The extended video clip generated by Packetizer-2 carries information on the video source, the sensors and the metadata related to the video. Further, it also embeds an application ID. The application ID tells the Cloud resident computing stack what type of further processing is required on this video clip (in this case, detecting the traffic flow). The Cloud resident computing stack analyses the video clip, determines the traffic flow in the scene and appends this data again in the video clip itself. The video clips are catalogued in the Cloud resident system and are usable by other devices/applications, which can further analyse the video clip, generate more metadata and put that in the said extended video clip, thus forming a multi-sensory data stack that contains video, sensory data as well as other metadata related to one another in time and space.
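Conceptually, the extended video clip can be pictured as the layered record sketched below, assuming a simple in-memory representation; the class and field names are illustrative and do not describe the actual byte-level container of the clip.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

@dataclass
class MultiSensoryClip:
    """Layered record carried by an extended video clip."""
    video_source: str                      # identity of the IP camera (401)
    app_id: str                            # tells the Cloud stack which Part A processing to run
    video_bytes: bytes                     # the chunked video segment itself
    sensor_events: List[Dict[str, Any]] = field(default_factory=list)   # layer added by Packetizer-1
    edge_metadata: List[Dict[str, Any]] = field(default_factory=list)   # layer added by Packetizer-2 (Part B output)
    cloud_metadata: List[Dict[str, Any]] = field(default_factory=list)  # layers appended later by the Cloud stack or other agencies

clip = MultiSensoryClip(video_source="cam-07", app_id="traffic-flow", video_bytes=b"...")
clip.sensor_events.append({"sensor": "aqi-01", "aqi": 245, "ts": 1700000000.0})
clip.edge_metadata.append({"event": "motion", "degree": 0.8, "ts": 1700000003.2})
```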
Reference is now invited to the accompanying Fig. 1, which describes the primary components in the system of the present invention. The video feed comes from IP cameras (401) connected to the Edge device (200) over a network connection. Multiple sensory data sources (402) are also connected to the device over communication channels. The Edge device is connected to a Cloud resident video computing stack (300) for data exchange. Users use a graphical user interface (100) to configure the device for its operation and also to ask the device to perform various video computing tasks as independent activities or in response to the value of the sensory data input. On receiving a service request from users, the Edge device sends a request to the Cloud computing stack for the said service, and the Cloud computing stack in turn sends a communication to the Edge device to execute a certain program, identified by an APP ID. The multi-sensory data stack generated by the device is then sent to the Cloud for further processing.
The accompanying Fig. 2 describes the mechanism to embed sensory data in video files. The sensory devices use the API server (202) to enter sensor-generated events or periodic sensory data into the Event queue (203). Packetizer-1 (201) takes the video feed from the camera, segments the video streams into chunks of some predefined time duration (video files) and embeds the Event data taken from the queue in the video files. The sensory data embedded video files thus generated are stored in store 1 (204). The structure of a video frame, that of a sensor data embedded video file, and the method of inserting sensory event data in video files are described in Fig. 5, Fig. 6 and Fig. 7 respectively.
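A rough sketch of the Packetizer-1 flow (201 to 204) is given below, assuming the camera feed has already been chunked into fixed-duration files and, for simplicity, appending the event data as a JSON trailer rather than the GOP header layout of Fig. 5 and Fig. 6.

```python
import json
import pathlib
import time

STORE_1 = pathlib.Path("store1")           # store 1 (204): sensory data embedded video files
STORE_1.mkdir(exist_ok=True)

def packetize_1(chunk_path: pathlib.Path, events: list) -> pathlib.Path:
    """Embed the queued event data into one fixed-duration video chunk and
    place the resulting file in store 1. For illustration the events are
    appended as a JSON trailer; the device itself uses the GOP header layout."""
    trailer = json.dumps({"embedded_at": time.time(), "events": events}).encode("utf-8")
    out = STORE_1 / chunk_path.name
    out.write_bytes(chunk_path.read_bytes() + b"\nSENSOR-TRAILER\n" + trailer)
    return out

# Usage with a hypothetical 10-second chunk produced from the camera feed:
# packetize_1(pathlib.Path("chunks/cam07_000.mp4"), [{"sensor": "aqi-01", "aqi": 245}])
```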
The accompanying Fig. 3 describes the functionality of the structural units within the device that are responsible for fetching the program code (Part B) from the cloud and executing it. The controller (207), upon receiving the service request from the users, sends the APP ID corresponding to the service request to the Cloud resident video computing stack (300). The Video computing stack in turn divides the task into two parts (Part A and Part B) based on a preset rule and sends the executable code corresponding to Part B to the device. In this process the cloud resident computing stack puts the program code and configuration files required to run the program in a file folder, zips the folder and sends the name of the folder to the controller (207). The Code collector (208) receives the name of the zipped folder from the controller, downloads that particular zipped folder, unzips it and stores it in the application store (206). Multiple such folders can be downloaded by the device corresponding to multiple service requests from the users.
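The Code collector behaviour might look roughly as follows, assuming the zipped folders are fetched over HTTPS from a hypothetical download location on the Cloud stack; the URL and folder name are placeholders.

```python
import io
import pathlib
import urllib.request
import zipfile

APP_STORE = pathlib.Path("application_store")     # application store (206)
CLOUD_BASE = "https://cloud-stack.example/apps"   # hypothetical download location on the Cloud stack

def collect_code(folder_name: str) -> pathlib.Path:
    """Download the named zipped folder (Part B code and configuration files)
    from the Cloud resident stack, unzip it and keep it in the application store."""
    data = urllib.request.urlopen(f"{CLOUD_BASE}/{folder_name}.zip").read()
    target = APP_STORE / folder_name
    target.mkdir(parents=True, exist_ok=True)
    zipfile.ZipFile(io.BytesIO(data)).extractall(target)
    return target

# e.g. collect_code("motion-detect-v1") once the controller forwards the folder name
```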
The folders, among other binary files and configuration files, contain a shell script with the fixed name run.sh. App starter is a program that is executed on system restart. App starter reads the store (206) and, against each folder, spawns an execution thread (210) and runs the corresponding run.sh within it. The executable code corresponding to Part B of the video computing task is thus executed in the device.
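A minimal sketch of such an App starter is shown below; the directory layout and the use of Python threads with a shell invocation are assumptions made for the example.

```python
import pathlib
import subprocess
import threading

APP_STORE = pathlib.Path("application_store")    # application store (206)

def run_app(folder: pathlib.Path) -> None:
    """Run the fixed-name shell script shipped inside a downloaded folder."""
    subprocess.run(["sh", "run.sh"], cwd=folder, check=False)

def app_starter() -> None:
    """On system restart, spawn one execution thread per folder in the store."""
    if not APP_STORE.exists():
        return
    for folder in APP_STORE.iterdir():
        if folder.is_dir() and (folder / "run.sh").exists():
            threading.Thread(target=run_app, args=(folder,), daemon=True).start()

if __name__ == "__main__":
    app_starter()
```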
The accompanying Fig. 4 describes the operation of the App threads and Packetizer-2 present in the device. The App thread (210) connects to Store 1 (204) to read the sensory data embedded video files and analyses the content of the video frames of the files using the algorithm intrinsic to the executable code (Part B). The metadata it generates is pushed to an App queue (211) by the App thread. Packetizer-2 (212) appends this metadata to the sensory data embedded video files, thus constructing the next layer within the multi-sensory data stack. The files created by Packetizer-2 are stored in Store 2 (205).
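The interaction between an App thread and Packetizer-2 could be sketched as follows, again appending the metadata as a simple JSON trailer for illustration; the analysis function is a placeholder for the algorithm contained in the Part B code.

```python
import json
import pathlib
import queue
import time

STORE_1 = pathlib.Path("store1")   # sensory data embedded video files (204)
STORE_2 = pathlib.Path("store2")   # multi-sensory data stack files (205)
STORE_2.mkdir(exist_ok=True)
app_queue: "queue.Queue[dict]" = queue.Queue()   # App queue (211)

def app_thread(analyse) -> None:
    """Part B loop: analyse each file found in Store 1 and push the metadata
    it yields (e.g. motion events) to the App queue."""
    for f in sorted(STORE_1.glob("*.mp4")):
        metadata = analyse(f)                   # algorithm intrinsic to the Part B code
        app_queue.put({"file": f.name, "ts": time.time(), "meta": metadata})

def packetizer_2() -> None:
    """Append the App generated metadata to the corresponding file, forming the
    next layer of the multi-sensory data stack, and store the result in Store 2."""
    while not app_queue.empty():
        item = app_queue.get()
        src = STORE_1 / item["file"]
        trailer = json.dumps(item).encode("utf-8")
        (STORE_2 / item["file"]).write_bytes(src.read_bytes() + b"\nAPP-TRAILER\n" + trailer)

# e.g. app_thread(lambda f: {"motion": True, "degree": 0.8}); packetizer_2()
```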
Thus Store 2 contains the composite data consisting of video, sensory data and metadata generated by the video analytics application (Part B). This composite data is then sent to the Cloud stack for further processing by the Task scheduler module present in the device. The cloud stack analyses this data (using a program corresponding to Part A).
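A possible shape of such a Task scheduler loop is sketched below; the ingest endpoint, the polling interval and the deletion of uploaded files are assumptions for the example rather than the device's actual behaviour.

```python
import pathlib
import time
import urllib.request

STORE_2 = pathlib.Path("store2")                      # composite data awaiting upload (205)
CLOUD_INGEST = "https://cloud-stack.example/ingest"   # hypothetical Part A ingest endpoint

def task_scheduler(poll_interval: float = 30.0) -> None:
    """Periodically poll Store 2 and push each multi-sensory data stack file to
    the Cloud resident stack, removing it once handed over for Part A processing."""
    while True:
        for f in sorted(STORE_2.glob("*.mp4")):
            req = urllib.request.Request(CLOUD_INGEST, data=f.read_bytes(), method="POST")
            urllib.request.urlopen(req)
            f.unlink()
        time.sleep(poll_interval)

# task_scheduler() would normally run as a background service on the Edge device
```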
Fig. 8 shows the structure of the multi-sensory data stack after Packetizer-2 (212) appends the video metadata to it.
WE CLAIM:
1. A method for geographically distributed edge device based amalgamation of various sensory data with video data, and for carrying out video computing tasks on said video data involving the sensory data, comprising:
involving multiple geographically distributed edge devices and configuring each of said edge devices to connect with one or more video sources and external sensory devices;
configuring said video sources to stream the video data to its corresponding edge device;
configuring said external sensory devices to send signals to the corresponding edge device over internet protocol based connection on detection of any anomaly in the data that the sensory devices generate;
involving a storage element of the edge device to store the received data in an event queue with timestamps recording their time of generation;
involving a processor of the edge device to segment the video stream into video files, read the queue and generate a Group of Pictures (GOP) header with the sensory data synced in the timeline with the video data contained in the GOP, thus forming an extension of the video data, or a multi-sensory data stack, for the video computing tasks.
2. The method as claimed in claim 1, wherein the edge device allows the external sensory device to send periodic sensory data with a request to embed those data in the video clips, with the intention of obtaining a visual footprint of the location that the external sensory device monitors.
3. The method as claimed in claim 1 or 2, wherein the edge device is connected to a cloud resident video computing stack for data exchange, and users use a graphical user interface to configure the edge device for its operation, including requesting the edge device to perform various video computing tasks as independent activities or in response to the value of the sensory data input.
4. The method as claimed in any one of claims 1 to 3, wherein the edge device, on receiving the service request from the users, forwards a request to the Cloud computing stack for the said service, and the Cloud computing stack in turn sends a communication to the Edge device to execute a certain program embodied in the processor, identified by an APP ID, the edge device accordingly generating the multi-sensory data stack which is then sent to the Cloud for further processing.
5. The method as claimed in any one of claims 1 to 4, wherein the sensory data is embedded in the video files by:
configuring the sensory devices to use an API server which places sensor-generated events or periodic sensory data in the event queue;
configuring Packetizer-1 in the processor to take the video feed from the video sources, i.e. cameras, segment the video streams into chunks of some predefined time duration (video files) and embed the event data taken from the queue in the video files; and
storing the sensory data embedded video files thus generated in store 1.
6. The method as claimed in any one of claims 1 to 5, wherein the Video computing stack divides the request into two parts, Part A: a task that is to be executed by the Cloud stack itself, and Part B: a task that is to be executed by the edge device which has requested the service, whereby a controller in the edge device, upon receiving the service request from the users, sends the APP ID corresponding to the service request to the Cloud resident video computing stack, and the video computing stack in turn divides the task into the two parts (Part A and Part B) based on a preset rule and sends the executable code corresponding to Part B to the edge device; and
wherein the cloud resident computing stack puts the program code and configuration files required to run the program in a file folder, zips the folder and sends the name of the folder to the controller; and
wherein a Code collector of the edge device receives the name of the zipped folder from the controller, downloads that particular zipped folder, unzips it and stores it in the application store, including downloading multiple such folders corresponding to multiple service requests from the users.
7. The method as claimed in claim 6, wherein the folders contain a shell script with a fixed name, and an App starter program executed on the edge device reads the store and, against each folder, spawns an execution thread and runs the corresponding shell script within it.
8. The method as claimed in any one of claims 1 to 7, wherein the edge device includes App threads and Packetizer-2, wherein the App thread connects to store 1 to read the sensory data embedded video files and analyses the content of the video frames of the files using the algorithm intrinsic to the executable code (Part B), and the metadata it generates is pushed to an App queue by the App thread; and
wherein Packetizer-2 appends this metadata to the sensory data embedded video files, thus constructing the next layer within the multi-sensory data stack, and the files created by Packetizer-2 are stored in store 2.
9. The method as claimed in any one of claims 1 to 7, wherein store 2 contains the composite data consisting of video, sensory data and metadata generated by the video analytics application (Part B), and this composite data is then sent to the Cloud stack for further processing corresponding to Part A.
10. A geographically distributed edge device based system to amalgamate various sensory data with video data and to carry out video computing tasks on said video data involving the sensory data, comprising:
multiple geographically distributed edge devices, each of said edge devices being connected with one or more video sources and external sensory devices;
said video sources stream the video data to its corresponding edge device;
said external sensory devices send signals to the corresponding edge device over internet protocol based connection on detection of any anomaly in the data that the sensory devices generate;
one or more storage elements in the edge device to store the received data in an Event queue with timestamps recording their time of generation;
a processor in the edge device to segment the video stream into video files, read the queue and generate a Group of Pictures (GOP) header with the sensory data synced in the timeline with the video data contained in the GOP, thus forming an extension of the video data, or a multi-sensory data stack, for the video computing tasks.