Abstract: When navigation requires the robots to automatically perform localization, state-of-the-art mechanisms fail to provide a collaborative approach for localization and navigation. The disclosure herein generally relates to robotic networks, and, more particularly, to a method for collaborative visual navigation in robotic networks. In this approach, each client robot of the robotic network generates its own local map by identifying Key-Frames (KFs) from a set of captured images and based on the estimated poses of the KFs. The local maps are then used by a server to generate a combined server map. The server also optimizes the KFs and Map Points (MPs) received from the client robots, and the optimized KFs and MPs are used by the client robots to update/optimize the local maps. Data from the local maps and the combined server map are used by the client robots for navigation. To be published with FIG. 3
FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention:
METHOD OF COLLABORATIVE VISUAL NAVIGATION IN A
ROBOTIC NETWORK
Applicant
Tata Consultancy Services Limited A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman Point, Mumbai 400021,
Maharashtra, India
Preamble to the description
The following specification particularly describes the invention and the manner in which it is to be performed.
TECHNICAL FIELD [001] The disclosure herein generally relates to robotic networks, and, more particularly, to a method for collaborative visual navigation in the robotic networks.
BACKGROUND
[002] With the increasing popularity of the industrial automation concept, robots are deployed in a variety of fields to perform various tasks. For example, robots are used for warehouse management. Such applications require the deployed robots to move around to perform the assigned tasks. In some applications, the movement of the robots may be restricted to fixed paths. However, in other applications, the robots may be required to dynamically define paths considering one or more factors, and then perform navigation.
[003] When the robots are deployed to handle larger tasks, for example, surveillance of an area, agricultural exploration, and so on, a group of robots collaborates and the tasks are divided among them. Each robot in the group carries out its assigned task, and together the larger task is completed.
[004] Visual Simultaneous Localization and Mapping (Visual SLAM) is an approach used by a group of robots for self-localization and for generating a global map of the environment in which the robots are located, which in turn can be used by the robots for navigation. A disadvantage of state-of-the-art Visual SLAM approaches is that the extent of collaboration among the robots during navigation is minimal. Also, abrupt movements of the robots may cause the robots to lose track, which in turn affects the collaborative operation of the group of robots and the completion of the task.
SUMMARY
[005] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems. For example, in one embodiment, a robotic network is provided. The robotic network includes a server and a plurality of client robots. Each of the plurality of client robots is in communication with the server for a collaborative visual navigation. During the collaborative visual navigation, each client robot in the robotic network uses one or more image sensors to collect a plurality of images of its own surroundings. Further, one or more feature-points are extracted from each of the plurality of images, by each of the plurality of client robots. Further, a depth information is calculated from the plurality of feature-points, and then the feature-points are converted as Map-Points (MPs), by each of the plurality of client robots. Further, a pose of each of the plurality of images is estimated, by each of the plurality of client robots. Further, one or more of the plurality of images are determined as Key-Frames (KFs), by each of the plurality of client robots. Then a local map is generated based on the estimated poses of the KFs and the MPs of the KFs, by each of the plurality of client robots, wherein the local map indicates past positions of the client robot with respect to a world coordinate system and positions of the estimated MPs as obstacles when the client robot is in motion. The local map and information on the determined KFs and MPs are transmitted to the server by each of the plurality of client robots, and are received by the server. Further, a map for each of the plurality of client robots is created in a server map stack, based on the information received, by the server. Then a combined server map is generated by merging the maps in the server map stack, by the server. The server then generates optimized KFs and MPs for the KFs and MPs received from each of the plurality of client robots. If at least one of the client robots is determined as having one or more KFs and MPs for which the optimized KFs and MPs have been generated, the server transmits the optimized KFs and MPs to the at least one client robot, wherein the at least one client robot uses the optimized KFs and MPs for pose optimization of images captured at any later instance of time. Each of the client robots then performs navigation based on the local map and the updated KFs and MPs.
[006] In another aspect, a method for collaborative visual navigation in a robotic network is provided. The robotic network includes a server and a plurality of client robots. Each of the plurality of client robots is in communication with the server for a collaborative visual navigation. During the collaborative visual navigation, each client robot in the robotic network uses one or more image sensors to collect a plurality of images of its own surroundings. Further, one or more feature-points are extracted from each of the plurality of images, by each of the plurality of client robots. Further, a depth information is calculated from the plurality of feature-points, and then the feature-points are converted as Map-Points (MPs), by each of the plurality of client robots. Further, a pose of each of the plurality of images is estimated, by each of the plurality of client robots. Further, one or more of the plurality of images are determined as Key-Frames (KFs), by each of the plurality of client robots. Then a local map is generated based on the estimated poses of the KFs and the MPs of the KFs, by each of the plurality of client robots, wherein the local map indicates past positions of the client robot with respect to a world coordinate system and positions of the estimated MPs as obstacles when the client robot is in motion. The local map and information on the determined KFs and MPs are transmitted to the server by each of the plurality of client robots, and are received by the server. Further, a map for each of the plurality of client robots is created in a server map stack, based on the information received, by the server. Then a combined server map is generated by merging the maps in the server map stack, by the server. The server then generates optimized KFs and MPs for the KFs and MPs received from each of the plurality of client robots. If at least one of the client robots is determined as having one or more KFs and MPs for which the optimized KFs and MPs have been generated, the server transmits the optimized KFs and MPs to the at least one client robot, wherein the at least one client robot uses the optimized KFs and MPs for pose optimization of images captured at any later instance of time. Each of the client robots then performs navigation based on the local map and the updated KFs and MPs.
[007] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[008] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:
[009] FIG. 1 illustrates a block diagram of the robotic network in which the collaborative visual navigation is performed, according to some embodiments of the present disclosure.
[010] FIG. 2 is a functional block diagram depicting components of a client robot in the robotic network of FIG. 1, according to some embodiments of the present disclosure.
[011] FIG. 3 is a block diagram depicting components of server in the robotic network of FIG. 1, in accordance with some embodiments of the present disclosure.
[012] FIGS. 4A and 4B are flow diagrams depicting steps involved in the process of the collaborative visual navigation performed by the robotic network of FIG. 1, according to some embodiments of the present disclosure.
[013] FIGS. 5A and 5B are flow diagrams depicting steps involved in the process of performing a recovery operation in the event of a track-loss scenario, by the robotic network of FIG. 1, in accordance with some embodiments of the present disclosure.
[014] FIG. 6 is a flow diagram depicting steps involved in the process of performing space recovery at each of the client robots in the robotic network of FIG. 1, in accordance with some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS [015] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed
embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope being indicated by the following claims.
[016] Referring now to the drawings, and more particularly to FIG. 1 through 6, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[017] FIG. 1 illustrates a block diagram of the robotic network in which the collaborative visual navigation is performed, according to some embodiments of the present disclosure. The robotic network 100 includes a server 102 and a plurality of client robots 101 (represented as 101.a, 101.b, …, 101.n in FIG. 1 and collectively represented as 101) that are in communication with the server 102 using one or more appropriate channels. Each client robot 101 includes a visual odometry processing unit 201, a memory 202, a communication manager 204, and an optimizer 203, in addition to other standard modules required by the client robots for movement as well as for performing one or more tasks. This is depicted in FIG. 2. The server 102 includes one or more client managers 301.a, 301.b, …, 301.n (collectively represented as client manager 301), a storage manager 302, an inter-map place recognizer 303, and an optimizer 304. This is depicted in FIG. 3. Each of the components of the client robot 101 and the server 102 may have one or more hardware processors (not shown) to perform data processing associated with the collaborative visual navigation being performed by the robotic network 100. Further, each of the robotic clients 101 as well as the server 102 is configured to store a plurality of instructions in an associated storage space, which, when executed by the one or more hardware processors of each of the components, cause each of the robotic clients 101 and the server 102 to perform one or more steps in the collaborative visual navigation.
[018] When the client robots 101 are deployed to collaboratively execute one or more tasks, each of the client robots 101 captures images of its surroundings using one or more image sensors and generates its own local map, and the local maps from all the client robots 101 are transmitted to the server 102 along with supporting information. The server 102 creates a combined server map using the local maps and the supporting information received from the client robots 101. Information in the own local map and in the combined server map is used by each of the client robots 101 to perform the navigation. Steps involved in the process of collaborative visual navigation performed by the robotic network 100 are depicted in FIGS. 4A and 4B and are now explained with reference to the hardware components depicted in FIG. 1 through FIG. 3.
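For illustration only, the overall client-to-server data flow described above may be sketched as follows in Python. The class and method names (MapPoint, KeyFrame, LocalMap, Server, receive_local_map, combined_map) are assumptions made for this sketch and do not appear in the specification; the merge shown is a naive union, whereas the actual fusion described later uses SE(3)/Sim(3) transformations.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative data containers (names are assumptions, not taken from the specification).
@dataclass
class MapPoint:
    mp_id: int
    position: tuple                     # 3D position in the world coordinate system

@dataclass
class KeyFrame:
    kf_id: int
    pose: tuple                         # estimated pose of the Key-Frame
    map_point_ids: List[int]            # identifiers of the MPs observed in this KF

@dataclass
class LocalMap:
    client_id: int
    keyframes: Dict[int, KeyFrame] = field(default_factory=dict)
    map_points: Dict[int, MapPoint] = field(default_factory=dict)

class Server:
    """Keeps one map per client robot in a server map stack and builds a combined map."""
    def __init__(self):
        self.map_stack: Dict[int, LocalMap] = {}

    def receive_local_map(self, local_map: LocalMap) -> None:
        # Step 416: create/refresh the per-client map in the server map stack.
        self.map_stack[local_map.client_id] = local_map

    def combined_map(self) -> LocalMap:
        # Step 418: merge the maps in the stack (shown here as a naive union;
        # the actual fusion uses the SE(3)/Sim(3) alignment described below).
        combined = LocalMap(client_id=-1)
        for m in self.map_stack.values():
            combined.keyframes.update(m.keyframes)
            combined.map_points.update(m.map_points)
        return combined
```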
[019] Each client robot 101 has a plurality of image sensors including at least one monocular camera, at least one stereo camera, at least one Red-Green-Blue-Depth (RGBD) camera, and any other image sensor that can capture images of the surroundings and/or of specific objects around each of the client robots 101, in any required format and resolution. Each client robot 101 collects/captures (402) one or more images of the surroundings, using one or more image sensors. In an embodiment, the client robot 101 captures the images using only one image sensor at a time. In another embodiment, the client robot 101 captures the images using more than one image sensor at a time, wherein the image sensors used may be of the same type or of different types. When more than one image sensor is used, the client robot 101 uses an appropriate input fusion technique to fuse the images captured by the different image sensors, for further processing. For example, when multiple image sensors are used by the client robot 101, the client robot 101 collects images of each scene using all the available image sensors at once. The images captured at once using the different image sensors are then fused to generate a single image.
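As one hedged example of collecting images from more than one image sensor and fusing them into a single image, the following sketch uses OpenCV to grab one frame per sensor and blends the frames by weighted averaging after resizing. The specification leaves the "appropriate input fusion technique" open, so the averaging used here is only a placeholder assumption, as are the device identifiers and image size.

```python
import cv2

def capture_and_fuse(device_ids=(0, 1), size=(640, 480)):
    """Grab one frame from each configured image sensor and fuse them into a single image.
    Weighted averaging stands in for 'an appropriate input fusion technique'."""
    frames = []
    for dev in device_ids:
        cap = cv2.VideoCapture(dev)
        ok, frame = cap.read()
        cap.release()
        if ok:
            frames.append(cv2.resize(frame, size))
    if not frames:
        return None
    fused = frames[0].astype("float32")
    for f in frames[1:]:
        fused = cv2.addWeighted(fused, 0.5, f.astype("float32"), 0.5, 0.0)
    return fused.astype("uint8")
```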
[020] The data processing carried out by each of the plurality of client robots 101 is the same, as depicted in FIG. 3 and FIG. 4. Hence, to avoid repetition of contents, the process is now explained from the perspective of a single client robot.
[021] The visual odometry processing module 201 in the client robot 101 processes the collected images using one or more image processing techniques to extract (404) a plurality of feature-points from each of the plurality of images. The feature-points are specific pixels that carry certain specific properties and can be uniquely identified in different images by the client robot 101. The visual odometry processing module 201 further calculates the depth of the feature-points by comparing the feature-points extracted from each frame with the feature-points extracted from subsequent frames. If the calculated depth is at least equal to a threshold of depth for any feature-point, that feature-point is identified as a landmark, and in turn is converted into a Map Point (MP) for the local map. The threshold of depth may be defined/configured with the visual odometry processing module 201, statically or dynamically.
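A minimal sketch of feature-point extraction (step 404) and conversion of sufficiently deep feature-points into Map Points (step 406) is given below. It assumes ORB features and depth computed from stereo disparity for simplicity, whereas the specification computes depth by comparing feature-points across subsequent frames; the focal length, baseline, principal point, and threshold values are illustrative assumptions.

```python
import cv2
import numpy as np

orb = cv2.ORB_create(nfeatures=1000)

def extract_feature_points(image):
    """Step 404: extract feature-points (keypoints and descriptors) from an image."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return orb.detectAndCompute(gray, None)

def to_map_points(kps_l, desc_l, kps_r, desc_r,
                  fx=700.0, baseline=0.12, cx=320.0, cy=240.0, depth_threshold=0.5):
    """Step 406: match features, compute depth from stereo disparity, and keep only
    feature-points whose depth is at least the threshold of depth as Map Points."""
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    map_points = []
    for m in matcher.match(desc_l, desc_r):
        ul, vl = kps_l[m.queryIdx].pt
        ur, _ = kps_r[m.trainIdx].pt
        disparity = ul - ur
        if disparity <= 0:
            continue
        depth = fx * baseline / disparity
        if depth >= depth_threshold:        # landmark check as stated in the specification
            # Back-project to 3D (assumes fx == fy and the stated principal point).
            x = (ul - cx) * depth / fx
            y = (vl - cy) * depth / fx
            map_points.append(np.array([x, y, depth]))
    return map_points
```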
[022] The visual odometry processing module 201 further estimates (408) the pose of each of the images, by establishing a relation between the 3-Dimensional MPs in the local map and the 2-Dimensional feature-points of the current images and solving this as a Perspective-n-Point (PnP) problem, and further creates a pose graph based on the estimated poses of the images. The visual odometry processing module 201 further determines (410) one or more most representative frames from among the captured images as Key-Frames (KFs), based on one or more factors such as, but not limited to, the difference in position between a current frame and the last KF, and the feature-point overlap between the current frame and the last KF. The visual odometry processing module 201 further generates (412) a local map, using the KFs, information on the poses of the KFs, and the MPs of the KFs. During this process, each KF stores the relative pose from the last inserted KF. Each KF and MP is assigned a unique identifier to avoid any ambiguity. The local map of any robotic client 101 represents the local vicinity of the client robot 101 through a connected graph among consecutive KFs and MPs. The local map includes the nearest N KFs from the current location of the camera and is periodically updated either through new KF insertion or through old KFs obtained from the server 102 when the client robot 101 visits any location that has been visited in the past. The client robot 101 then transmits (414) the local map, and supporting information including the estimated pose of each KF and the MPs of the KFs, to the server 102.
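The pose estimation (step 408) and Key-Frame decision (step 410) described above may be sketched as follows, assuming OpenCV's RANSAC-based Perspective-n-Point solver and illustrative thresholds for the position difference and feature-point overlap; the specific solver and threshold values are assumptions, not requirements of the specification.

```python
import cv2
import numpy as np

def estimate_pose(map_points_3d, feature_points_2d, K):
    """Step 408: relate 3D Map Points of the local map to 2D feature-points of the
    current image and solve the Perspective-n-Point problem (RANSAC variant assumed)."""
    obj = np.asarray(map_points_3d, dtype=np.float64)
    img = np.asarray(feature_points_2d, dtype=np.float64)
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(obj, img, K, None)
    if not ok:
        return None                      # pose could not be estimated for this frame
    R, _ = cv2.Rodrigues(rvec)
    return R, tvec, inliers

def is_keyframe(curr_position, last_kf_position, overlap_ratio,
                min_translation=0.25, max_overlap=0.7):
    """Step 410: mark the current frame as a Key-Frame when it has moved far enough from
    the last KF or shares too little feature-point overlap with it (thresholds illustrative)."""
    moved_enough = np.linalg.norm(curr_position - last_kf_position) > min_translation
    low_overlap = overlap_ratio < max_overlap
    return moved_enough or low_overlap
```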
[023] The client manager 301 in the server 102, which is configured to communicate with the robotic client 101, receives the local map and the supporting information from all the client robots 101. In an embodiment, one dedicated client manager 301 is assigned to handle each of the client robots 101, i.e., the number of client managers 301 active in the server 102 is at least equal to the number of robotic clients 101 communicating with the server 102. Further, a map manager 306 in the server 102 creates (416) a map for each of the client robots 101 and stores the maps in a server map stack maintained in the storage manager 302 of the server 102. Each map may be assigned a unique identification number by the map manager 306. The map manager further initializes an intra-map place recognizer 307 in the client manager 301 and the inter-map place recognizer 303 of the server 102. The intra-map place recognizer 307 processes the maps in the server map stack and performs loop detection and loop merging. The inter-map place recognizer 303 compares the maps in the server map stack to identify matching maps, i.e., maps having matching KFs and common MPs. Consider that Fs is the current KF from map Ms and Fd is the current KF from map Md. The matched MPs generate one set of 3D-3D point correspondences between Ms and Md, one set of 3D-2D point correspondences from Ms to Fd, another set of 3D-2D point correspondences from Md to Fs, and one set of 2D-2D point correspondences between Fs and Fd. An average scale difference from Ms to Md is then calculated by taking the ratio of the distances between a pair of matched 3D points in Ms and the same pair of matched 3D points in Md. Further, an SE(3) transformation from Ms to Md after scaling of Ms is calculated. The 3D-2D correspondence sets generate independent Sim(3) transformations from Ms to Md. Further, an average Sim(3) transformation is formulated by rotation averaging and translation averaging. The 2D-2D correspondences generate a pose of Fs in map Md. Any appropriate transformation, such as but not limited to an SE(3) or Sim(3) transformation, that produces the pose nearest to the pose calculated using the 2D-2D point correspondences, is selected and used. This transformation allows the MPs and KFs of Ms to shift into the coordinate system of Md. This in turn helps in creating a new map structure Mf to hold a fused map. Mf holds the same coordinate system as Md, which allows the MPs and KFs of Md to enter Mf directly, while the MPs and KFs of Ms transfer to Mf using the calculated Sim(3) transformation.
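Two of the computations described above, namely the average scale difference between matched 3D points of Ms and Md and a Sim(3) alignment (scale s, rotation R, translation t) from Ms to Md, may be sketched as follows with numpy. The closed-form Umeyama alignment used for the Sim(3) estimate is one standard choice and is an assumption made for this sketch, not a requirement of the specification.

```python
import numpy as np

def average_scale(points_s, points_d):
    """Average scale from Ms to Md: ratio of distances between matched 3D point pairs."""
    ratios = []
    for i in range(len(points_s) - 1):
        d_s = np.linalg.norm(points_s[i] - points_s[i + 1])
        d_d = np.linalg.norm(points_d[i] - points_d[i + 1])
        if d_s > 1e-9:
            ratios.append(d_d / d_s)
    return float(np.mean(ratios))

def sim3_umeyama(points_s, points_d):
    """Closed-form Sim(3) alignment (s, R, t) so that points_d ~= s * R @ points_s + t."""
    ps, pd = np.asarray(points_s), np.asarray(points_d)
    mu_s, mu_d = ps.mean(axis=0), pd.mean(axis=0)
    cs, cd = ps - mu_s, pd - mu_d
    cov = cd.T @ cs / len(ps)
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                       # keep a proper rotation (det(R) = +1)
    R = U @ S @ Vt
    var_s = (cs ** 2).sum() / len(ps)
    s = np.trace(np.diag(D) @ S) / var_s     # scale component of the Sim(3) transformation
    t = mu_d - s * R @ mu_s
    return s, R, t
```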
[024] By virtue of the map fusion, the maps in the server map stack are merged to generate (418) a combined server map. The combined server map contains KFs and MPs from all the client robots 101. The client robots 101 in the robotic network 100 have access to the combined server map.
[025] The server 102 fetches the KFs and MPs from each of the client robots 101 and optimizes (420) the fetched KFs and MPs to generate optimized KF poses and MP positions. The server 102 may use any suitable approach, such as but not limited to a 'Levenberg-Marquardt Bundle Adjustment', which minimizes the reprojection error. The server 102 further checks (422) whether any of the client robots 101 has KFs and MPs for which the optimized KFs and MPs have been generated. If any of the client robots 101 has been found to have the KFs and MPs for which the optimized KFs and MPs have been generated, then the optimized KFs and MPs are transmitted (424) by the server 102 to the client robot 101. Upon receiving the optimized KFs and MPs from the server 102, the client robot 101 updates and optimizes the local map using the optimized KFs and MPs.
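For reference, a Levenberg-Marquardt bundle adjustment of the kind mentioned above typically minimizes the total reprojection error over the KF poses and MP positions; one standard formulation, with notation assumed here rather than given in the specification, is:

```latex
\min_{\{T_i\},\,\{X_j\}} \;\; \sum_{(i,j)} \rho\!\left( \big\lVert\, x_{ij} - \pi\!\left(K,\, T_i X_j\right) \big\rVert^{2} \right)
```

where T_i denotes the pose of the i-th KF, X_j the 3D position of the j-th MP, x_ij the observed 2D feature-point of MP j in KF i, K the camera intrinsics, pi(.) the camera projection function, and rho(.) an optional robust cost such as the Huber kernel; the Levenberg-Marquardt algorithm iteratively solves this non-linear least-squares problem.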
[026] The client robot 101 then performs (426) navigation using the local map and the KFs and MPs in the local map (updated/optimized KFs and MPs, if received from the server 102).
[027] Sometimes, due to abrupt movements of the client robot 101 or due to any other technical reason, the client robot 101 may lose track, i.e., the client robot is unable to navigate using the local map. This is referred to as a track-loss scenario, and the steps executed in a recovery operation by the robotic network 100 in the event of the track-loss scenario are depicted in FIGS. 5A and 5B.
[028] Upon determining (502) that tracking based on the local map has failed, the client robot 101 attempts (504) to re-localize using the local map. If the client robot 101 is able to re-localize within a pre-set time period, then the client robot 101 continues the navigation. If the client robot 101 is unable to re-localize within the pre-set time period, then the client robot 101 enters (510) a track-loss mode and sends a track-loss message with a last KF-Id to the server 102. Upon receiving the track-loss message, the server 102 allocates (512) the next available client-Id to the client robot (alternatively referred to as the 'track-lost client robot 101') and then opens (514) a new client manager 301 in the server 102 with the allocated client-Id. The server 102 receives (516) the last KF-Id transmitted by the client robot 101, and transmits (518) an acknowledgement along with the new client-Id to the client robot 101. Upon receipt of the new client-Id, the client robot 101 re-initializes tracking and communication with the server 102, using the new client-Id.
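The track-loss handshake (steps 510 to 518) may be sketched as follows; the message classes, field names, and the in-memory client-manager registry are assumptions made purely for illustration and are not part of the specification.

```python
from dataclasses import dataclass

@dataclass
class TrackLossMessage:                  # sent by the track-lost client robot (step 510)
    old_client_id: int
    last_kf_id: int

@dataclass
class TrackLossAck:                      # acknowledgement returned by the server (step 518)
    new_client_id: int
    last_kf_id: int

class ServerRecoveryHandler:
    """Server-side handling of a track-loss message (steps 512 to 518)."""
    def __init__(self, next_client_id=0):
        self.next_client_id = next_client_id
        self.client_managers = {}        # client-Id -> dedicated client manager state

    def handle_track_loss(self, msg: TrackLossMessage) -> TrackLossAck:
        new_id = self.next_client_id     # step 512: allocate the next available client-Id
        self.next_client_id += 1
        # steps 514/516: open a new client manager and record the received last KF-Id
        self.client_managers[new_id] = {"last_kf_id": msg.last_kf_id}
        return TrackLossAck(new_client_id=new_id, last_kf_id=msg.last_kf_id)
```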
[029] Each of the client robots 101 captures the images of the surroundings when in motion, and the KFs are stored in the local map. Storage space in the memory 202 may be limited. For smooth functioning of the client robot 101, a certain amount of storage space may have to be free at any point of time. Steps involved in the process of freeing up memory space are depicted in FIG. 6. The client robot 101 frees up space by deleting some of the KFs in the local map. In an embodiment, the client robot 101 is configured to select (602) old KFs for deletion whenever KFs are to be deleted to free up memory space, i.e., recently captured KFs are retained and the selected old KFs are deleted. However, even the old KFs are to be maintained in the robotic network 100 so that they can be used/reused during navigation of the client robots 101. Before deleting the selected KFs from the local map, the robotic client 101 checks (604) whether the selected KFs have been backed up in the server 102. If the selected KFs have not been backed up in the server 102, then the selected KFs are transmitted (606) to the server 102 and are backed up in the database for place recognition in the storage manager 302. Further, the selected KFs are deleted (608) from the local map.
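The memory freeing procedure (steps 602 to 608) may be sketched as follows; the keyframe-count threshold and the server-side helper methods (is_backed_up, backup_keyframe) are assumptions made for this sketch.

```python
def free_up_space(local_map, server, max_keyframes=200):
    """Steps 602 to 608: when the local map exceeds a storage threshold, select the
    oldest KFs, back them up on the server if not already backed up, then delete them.
    The threshold and the server helpers (is_backed_up, backup_keyframe) are assumptions."""
    kf_ids = sorted(local_map.keyframes)             # oldest KFs carry the smallest ids
    excess = len(kf_ids) - max_keyframes
    for kf_id in kf_ids[:max(excess, 0)]:            # step 602: select old KFs for deletion
        kf = local_map.keyframes[kf_id]
        if not server.is_backed_up(kf_id):           # step 604: check the server backup
            server.backup_keyframe(kf)               # step 606: transmit the selected KF
        del local_map.keyframes[kf_id]               # step 608: delete from the local map
```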
[030] In various embodiments, steps in methods 400, 500, and 600 may be performed in the same order as depicted in FIG. 4A through FIG. 6 or in any alternate order that is technically correct. Also, one or more steps in the methods 400, 500, and 600 may be omitted.
[031] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[032] The embodiments of the present disclosure herein address the unresolved problem of collaborative visual navigation in a robotic network. The embodiments thus provide a method for collaborative visual navigation in which a server facilitates navigation of client robots using map data populated by the plurality of client robots. Moreover, the embodiments herein further provide a recovery method to re-establish connection in case of a track-loss scenario.
[033] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed including e.g. any kind of computer like a server or a personal computer, or the like, or any combination thereof. The device may also include means which could be e.g. hardware means like e.g. an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g. an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g. using a plurality of CPUs.
[034] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not
limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[035] The illustrated steps are set out to explain the exemplary
embodiments shown, and it should be anticipated that ongoing technological
development will change the manner in which particular functions are performed.
These examples are presented herein for purposes of illustration, and not
limitation. Further, the boundaries of the functional building blocks have been
arbitrarily defined herein for the convenience of the description. Alternative
boundaries can be defined so long as the specified functions and relationships
thereof are appropriately performed. Alternatives (including equivalents,
extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[036] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform
steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[037] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.
We Claim
1. A robotic network (100), comprising: a server (102); and
a plurality of client robots (101), wherein each of the plurality of client robots is in communication with the server for a collaborative visual navigation, the collaborative visual navigation comprising:
collecting (402) a plurality of images of surroundings, using one or more image sensors in each of the plurality of client robots; extracting (404) one or more feature-points from each of the plurality of images, by each of the plurality of client robots; calculating (406) a depth information from the plurality of feature-points and converting the feature-points as Map-Points (MPs), by each of the plurality of client robots;
estimating (408) pose of each of the plurality of images, by each of the plurality of client robots;
determining (410) one or more of the plurality of images as Key-Frames (KF), by each of the plurality of client robots;
generating (412) a local map based on the estimated poses of KFs, and the MPs of the KFs, by each of the plurality of client robots, wherein the local map indicates past positions of the client robot with respect to a world coordinate system and position of the estimated MPs as obstacles when each of the plurality of client robots is in motion;
transmitting (414) the local map and information on the determined KFs and MPs to the server, by each of the plurality of client robots; receiving the local map and information on the determined KFs and MPs of each of the plurality of client robots, by the server; creating (416) a map for each of the plurality of client robots in a server map stack, based on the information received, by the server; generating (418) a combined server map, by merging the maps in the server map stack, by the server;
generating (420) optimized KFs and MPs for each KFs and MPs
received from each of the plurality of client robots, by the server;
determining (422) that at least one client robot has one or more of
the KFs and MPs for which the optimized KFs and MPs have been
generated, by the server;
transmitting (424) the optimized KFs and MPs, to the at least one
client robot, wherein the at least one client robot uses the
optimized KFs and MPs for pose optimization of one or more
images captured at any later instance of time; and
performing (426) navigation based on the local map, and the
updated KFs and MPs, by each of the plurality of client robots.
2. The robotic network (100) as claimed in claim 1, wherein the robotic network performs a recovery operation when a track-loss event is detected on a client robot, comprising:
determining (502) that tracking based on the local map has failed, by at least one of the plurality of client robots;
attempting (504) to re-localize using the local map, for a pre-determined time period, by the at least one client robot;
entering (510) a track-loss mode if the attempt to re-localize within the pre-determined time period is unsuccessful, by the at least one client robot;
sending (510) a track-loss message with a last KF-Id to the server, by the at least one client robot in the track-loss mode;
allocating (512) a next available client-id for the at least one client robot in the track-loss mode, by the server;
opening (514) a new client manager of the server with the allocated client-id, by the server, wherein the client manager is dedicated to handle communication and data processing for a single client robot; receiving (516) the last KF-Id from the at least one client robot in the track-loss mode, by the server;
transmitting (518) an acknowledgement along with the new allocated client-id to the at least one client robot, by the server;
reinitializing (520) tracking from the beginning with the new client-id, by the at least one client robot; and
communicating (522) with the server using the new client-Id, by the at least one client robot in the track-loss mode.
3. The robotic network (100) as claimed in claim 2, wherein determining by
the client robot (101) that the tracking based on the local map has failed,
comprises:
determining accuracy of the estimated pose of each of the plurality
of images, by verifying the estimated pose of each image with the
relation between 3-Dimensional map points and 2-Dimensional
feature-points;
classifying the determined accuracy as one of good and bad by
comparing the determined accuracy with a threshold of accuracy;
and
determining that the tracking based on the local map has failed, if
the determined accuracy is classified as bad.
4. The robotic network (100) as claimed in claim 1, wherein the one or more image sensors comprises at least one monocular camera, at least one stereo camera, and at least one Red-Green-Blue-Depth (RGBD) camera.
5. The robotic network (100) as claimed in claim 1, wherein the server transmits optimized and updated KFs and MPs from every client robot to every other client robot in the robotic network during the collaborative visual navigation.
6. The robotic network (100) as claimed in claim 1, wherein at least a part of the KFs in the local map of each of the client robots is deleted from the
local map when the storage space consumed by the KFs reaches a threshold of storage defined for each of the plurality of client robots, comprising:
selecting (602) the KFs to be deleted, from among a plurality of
KFs in the local map, wherein oldest KFs are selected for deletion;
checking (604) whether the selected KFs have been already
transmitted to the server for backing up in the combined server
map in the server;
transmitting (606) the selected KFs to the server if the selected KFs
are identified as not transmitted to the server; and
deleting (608) the selected KFs after verifying that the selected
KFs have been backed up in the combined server map.
7. A method (400) for collaborative visual navigation, comprising:
collecting (402) a plurality of images of surroundings, using one or more image sensors in each of a plurality of client robots in a robotic network;
extracting (404) one or more feature-points from each of the plurality of images, by each of the plurality of client robots; calculating (406) a depth information from the plurality of feature-points and converting these feature-points as Map-Points (MPs), by each of the plurality of client robots;
estimating (408) pose of each of the plurality of images, by each of the plurality of client robots;
determining (410) one or more of the plurality of images as Key-Frames (KF), by each of the plurality of client robots;
generating (412) a local map based on the estimated poses of KFs and the calculated MPs, by each of the plurality of client robots, wherein the local map indicates past positions of the client robot with respect to a world coordinate system and position of the estimated MPs as obstacles when each of the plurality of client robots is in motion;
transmitting (414) the local map and information on the determined KFs and MPs to the server, by each of the plurality of client robots; receiving the local map and information on the determined KFs and MPs of each of the plurality of client robots, by the server; creating (416) a map for each of the plurality of client robots in a server map stack, based on the information received, by the server; generating (418) a combined server map, by merging the maps from the server map stack, by the server;
generating (420) optimized KFs and MPs for each KFs and MPs obtained from each of the plurality of client robots, by the server; determining (422) that at least one client robot has one or more of the KFs and MPs for which the optimized KFs and MPs have been generated, by the server;
transmitting (424) the optimized KFs and MPs, to the at least one client robot, wherein the at least one client robot uses the optimized KFs and MPs for pose optimization of one or more images captured at any later instance of time; and performing (426) navigation based on the local map, and the updated KFs and MPs, by each of the plurality of client robots.
8. The method as claimed in claim 7, wherein the robotic network (100) performs a recovery operation when a track-loss event is detected on a client robot, comprising:
determining (502) that tracking based on the local map has failed, by at least one of the plurality of client robots;
attempting (504) to re-localize using the local map, for a pre-determined time period, by the at least one client robot; entering (510) a track-loss mode if the attempt to re-localize within the pre-determined time period is unsuccessful, by the at least one client robot;
sending (510) a track-loss message with a last KF-Id to the server, by the at least one client robot;
allocating (512) a next available client-id for the at least one client robot in the track-loss mode, by the server;
opening (514) a new client manager with the allocated client-id, by the server;
receiving (516) the last KF-Id from the at least one client robot in the track-loss mode, by the server;
transmitting (518) an acknowledgement along with the new allocated client-id to the at least one client robot in the track-loss mode, by the server;
reinitializing (520) tracking from the beginning with the new client-id, by the at least one client robot in the track-loss mode; and communicating (522) with the server, using the new client-Id, by the at least one client robot in the track-loss mode.
9. The method as claimed in claim 8, wherein determining by the client robot
that the tracking based on the local map has failed, comprises:
determining accuracy of the estimated pose of each of the plurality
of images, by verifying the estimated pose of each image with the
relation between 3-Dimensional map points and 2-Dimensional
feature-points;
classifying the determined accuracy as one of good and bad by
comparing the determined accuracy with a threshold of accuracy;
and
determining that the tracking based on the local map has failed, if
the determined accuracy is classified as bad.
10. The method as claimed in claim 7, wherein the one or more image sensors
comprises at least one monocular camera, at least one stereo camera, and
at least one Red-Green-Blue-Depth (RGBD) camera.
11. The method as claimed in claim 7, wherein the KFs and MPs from every client robot are transmitted to every other client robot in the robotic network during the collaborative visual navigation, by the server.
12. The method as claimed in claim 7, wherein at least a part of the KFs in the local map of each of the plurality of client robots is deleted from the local map when the storage space consumed by the KFs reaches a threshold of storage defined for each of the plurality of client robots, comprising:
selecting (602) the KFs to be deleted, from among a plurality of
KFs in the local map, wherein oldest KFs are selected for deletion;
checking (604) whether the selected KFs have been already
transmitted to the server for backing up in the combined server
map in the server;
transmitting (606) the selected KFs to the server if the selected KFs
are identified as not transmitted to the server; and
deleting (608) the selected KFs after verifying that the selected
KFs have been backed up in the combined server map.
13. A client robot (101) in a robotic network, wherein the client robot
performs a collaborative visual navigation, comprising:
collecting a plurality of images of surroundings, using one or more
image sensors of the client robot;
extracting one or more feature-points from each of the plurality
of images, by the client robot;
calculating a depth information from the plurality of feature-points
and converting these feature-points as Map-Points (MPs), by the client
robot;
estimating pose of each of the plurality of images, by the client
robot;
determining one or more of the plurality of images as Key-Frames
(KF), by the client robot;
generating a local map based on the estimated poses of KFs and
the calculated MPs, by the client robot, wherein the local map
indicates position of the client robot with respect to a world
coordinate system and position of the estimated MPs as obstacles
when the client robot is in motion;
transmitting the local map to a server in the robotic network, by the
client robot;
performing pose-optimization using optimized KFs and MPs
received from the server;
obtaining data from a combined server map of the server, by the
client robot; and
performing the collaborative visual navigation using data in the
local map and the combined server map.
14. A server (102) in a robotic network, the server assisting at least one client robot in the robotic network in performing a collaborative visual navigation by:
receiving local maps of a plurality of client robots in the robotic
network, wherein the plurality of client robots are in
communication with the server;
creating a map for each of the plurality of client robots in a server
map stack, based on the local maps received from the plurality of
client robots;
generating a combined server map, by merging the maps in the
server map stack;
generating optimized KFs and MPs for each KFs and MPs
obtained from each of the plurality of client robots;
determining that at least one client robot has one or more of the
KFs and MPs for which the optimized KFs and MPs have been
generated; and
transmitting the optimized KFs and MPs for the one or more KFs and MPs, to the at least one client robot.