Abstract: SYSTEM AND METHOD FOR TRACKING A PERSON ON A PLAYING FIELD DURING AN ONGOING SPORT Disclosed is a system for tracking a person on a playing field during an ongoing sport. To track each person, the system 102 captures a stream of images comprising a view of the entire playing field by using a camera. The system analyses the stream of images to detect a person. Spatial information of the person is derived from the stream of images. Subsequently, a movement profile of the person is generated based on the spatial information. Further, a unique identification is assigned to the person. The person may then be tracked continuously based on the unique identification and the movement profile of the person. [To be published with Figure 1]
Description: FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION
(See Section 10 and Rule 13)
Title of invention:
SYSTEM AND METHOD FOR TRACKING A PERSON ON A PLAYING FIELD DURING AN ONGOING SPORT
APPLICANT:
Quidich Innovation Labs Pvt. Ltd.
Having address as:
No 6, Keytuo, Kondivita Rd, M.I.D.C, Andheri East, Mumbai, 400059
The following specification describes the invention and the manner in which it is to be performed.
PRIORITY INFORMATION
[001] This patent application does not take priority from any application.
TECHNICAL FIELD
[002] The present subject matter described herein, in general, relates to tracking one or more persons present on a playing field during an ongoing sport.
BACKGROUND
[003] The contemporary sports broadcasting and streaming industry is encountering several obstacles. The cost and complexity associated with the implementation of conventional player tracking for broadcasting live events pose significant barriers, rendering them inaccessible to a wide range of sports organisations. Furthermore, conventional player tracking systems sometimes depend on the utilisation of numerous cameras and sensors, hence posing challenges in their applicability within outdoor settings.
[004] The accuracy of current player tracking methods is likewise constrained. Conventional player tracking systems may encounter difficulties in accurately monitoring players when faced with occlusions or other challenging circumstances. This phenomenon has the potential to result in data and insights that are not precise or reliable.
[005] In recent years, Machine Learning (ML) has paved its way into most industries, and it may be extremely beneficial in the broadcasting industry. ML-based player tracking systems could provide the capability to effectively tackle the existing issues encountered within the live streaming market.
[006] The absence of robust player monitoring technology in sports not only hinders the sport’s growth and competitiveness but also impacts the players themselves. Data-driven insights are crucial in identifying and improving weaknesses, enhancing training regimens, preventing injuries, and prolonging careers. Moreover, technology can play a pivotal role in enhancing fan engagement by providing real-time statistics, interactive visualizations, and an enriched viewing experience. While efforts have been made to address the above lacunae, such efforts require expensive resources (financial and computational) which may not be an economically viable solution for broadcasters to implement.
SUMMARY
[007] Before the present systems and methods are described, it is to be understood that this application is not limited to the particular systems and methodologies described, as there can be multiple possible embodiments which are not expressly illustrated in the present disclosure. It is also to be understood that the terminology used in the description is for the purpose of describing the particular versions or embodiments only, and is not intended to limit the scope of the present application. This summary is provided to introduce concepts related to systems and methods for tracking a person on a playing field during an ongoing sport, and the concepts are further described below in the detailed description. This summary is not intended to identify essential features of the claimed subject matter, nor is it intended for use in determining or limiting the scope of the claimed subject matter.
[008] In one implementation, a system for tracking a person on a playing field during an ongoing sport is disclosed. The system may comprise a processor and a memory coupled to the processor. The processor may be configured to execute program instructions stored in the memory. The system may obtain a stream of images of the playing field from an image capturing unit. An image of the stream of images may comprise a view of the playing field in its entirety. Further, the system may analyse the stream of images to detect the person. Furthermore, the system may derive spatial information of the person from the stream of images. The spatial information may comprise a set of position coordinates of the person. Subsequently, the system may generate a movement profile for the person based on the spatial information using a machine learning model. The movement profile may correspond to historical movement trends of the person. A Unique Identifier (UID), from a set of UIDs, may be assigned to the person. Further, the system may track the person in the stream of images based on the UID assigned to the person and the movement profile.
[009] In another implementation, a method for tracking a person on a playing field during an ongoing sport is disclosed. In order to track a person on the playing field, initially, a stream of images of the playing field is obtained from an image capturing unit (i.e., a camera) mounted at a vantage point in a manner such that each image demonstrates an aerial view of the playing field in its entirety. Subsequently, the stream of images is analysed to detect the person in an image of the stream of images. Further, spatial information of the person may be derived. The spatial information may comprise a set of position coordinates of the person. In one aspect, the position coordinates are detected to create a bounded box around the position coordinates in a manner such that the bounded box contains the person. The bounded box may also be referred to as a bounding box. Further, the method may comprise generating a movement profile for the person based on the spatial information using a machine learning model. The movement profile corresponds to historical movement trends of the person. The movement profile may comprise a movement velocity and the set of position coordinates. The set of position coordinates may comprise position coordinates of the person in one or more images from the stream of images. The position coordinates may represent at least one of a location of the person in the playing field and pixels of the image depicting the person. Further, a Unique Identifier (UID), from a set of UIDs, is assigned to the person. The method may comprise tracking the person in the stream of images based on the UID assigned to the person and the movement profile.
[0010] In yet another implementation, a non-transitory computer readable medium embodying a program executable in a computing device for tracking a person on a playing field during an ongoing sport is disclosed. The program may comprise a program code for obtaining a stream of images of the playing field from an image capturing unit mounted at a vantage point. An image of the stream of images may comprise a view of the playing field in its entirety. The program may comprise a program code for analysing the stream of images to detect the person. The program may comprise a program code for deriving spatial information of the person from the stream of images. The spatial information may comprise a set of position coordinates of the person. The program may comprise a program code for generating a movement profile for the person based on the spatial information using a machine learning model. The movement profile may correspond to historical movement trends of the person. The program may comprise a program code for assigning a Unique Identifier (UID), from a set of UIDs, to the person. The program may comprise a program code for tracking the person in the stream of images based on the UID assigned to the person and the movement profile.
BRIEF DESCRIPTION OF THE DRAWINGS
[0011] The foregoing detailed description of embodiments is better understood when read in conjunction with the appended drawings. For the purpose of illustrating the disclosure, examples of the disclosure are shown in the present document; however, the disclosure is not limited to the specific methods and apparatus disclosed in the document and the drawings.
[0012] The detailed description is given with reference to the accompanying figures. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the drawings to refer to like features and components.
[0013] Figure 1 illustrates a network implementation of a system for tracking a person on a playing field during an ongoing sport, in accordance with an embodiment of the present subject matter.
[0014] Figure 2 illustrates the system, in accordance with an embodiment of the present subject matter.
[0015] Figures 3 – 6 illustrate examples, in accordance with an embodiment of the present subject matter.
[0016] Figure 7 illustrates a method for tracking a person on a playing field during an ongoing sport, in accordance with an embodiment of the present subject matter.
DETAILED DESCRIPTION
[0017] Some embodiments of this disclosure, illustrating all its features, will now be discussed in detail. The words "comprising," "obtaining," "analysing," "deriving," "generating," "assigning," "tracking," "having," "containing," and "including," and other forms thereof, are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise. Although any systems and methods similar or equivalent to those described herein can be used in the practice or testing of embodiments of the present disclosure, the exemplary systems and methods are now described. The disclosed embodiments are merely exemplary of the disclosure, which may be embodied in various forms.
[0018] Various modifications to the embodiment will be readily apparent to those skilled in the art and the generic principles herein may be applied to other embodiments. However, one of ordinary skill in the art will readily recognize that the present disclosure is not intended to be limited to the embodiments illustrated, but is to be accorded the widest scope consistent with the principles and features described herein.
[0019] The various challenges observed in the existing art necessitated a system for tracking a person on a playing field during an ongoing sport. In order to track each person on the playing field, an image capturing unit (i.e., a camera) is mounted at a vantage point in such a manner that the image capturing unit captures the playing field in its entirety. It may be understood that the stream of images of the entire playing field is captured by only one camera. The image capturing unit captures a stream of images of the playing field and transmits the stream of images to a backend processing unit, positioned at a base station, configured to process the stream of images. The base station is located within a vicinity of the playing field. This stream of images is then processed using an image filtering technique to filter out noise from each image and detect each person present in each image.
[0020] It may be understood that the person is continuously detected in each image from the stream of images so that they can be tracked until the end of the played sport. In one aspect, detection of each person in an image is performed by marking a bounded box around each person in a manner such that the bounded box contains a person in it. The bounded box containing a person is then assigned a unique identification, which is then tracked for as long as the sport is being played on the playing field. It may be understood that while detecting each person, if the person disappears and cannot be tracked, due to an occlusion, for a short amount of time (i.e., in a few images of the stream of images) during the ongoing sport and re-appears in subsequent images, the same unique identification would be assigned to the person as assigned to him/her upon detection in the preceding images. In other words, if the person re-appears and is detected again in the succeeding images, the person is assigned the same unique identification as assigned to him/her before.
[0021] This continuous detection of the person in the stream of images enables monitoring of the movement of the person around the playing field using spatial information including position coordinates pertaining to each person. In one aspect, the continuous detection further enables prediction of probable positions for the person. In one aspect, this continuous detection is performed based on the last recorded position of the person, a movement velocity based on the historic movement of the person, and the number of frames in which the person goes undetected. In one embodiment, the predicted probable positions may be used to confirm reappearance of the person.
[0022] Thus, in this manner, each person present on a playing field may be tracked, and the corresponding unique identifiers assigned to each person may be maintained for as long as the sport is being played on the playing field. While aspects of the described system and method for tracking a person on a playing field during an ongoing sport may be implemented in any number of different computing systems, environments, and/or configurations, the embodiments are described in the context of the following exemplary system.
[0023] Referring now to Figure 1, a network implementation 100 of a system 102 for tracking a person on a playing field during an ongoing sport is disclosed. In order to track each person on the playing field, initially, the system 102 obtains a stream of images of the playing field from an image capturing unit 101 mounted at a vantage point in a manner such that each image comprises a view of the playing field in its entirety. The system may detect the person in the stream of images. Further, the system may derive spatial information of the person. The spatial information may comprise a set of position coordinates of the person. The set of position coordinates is used to create a bounded box around the position coordinates in a manner such that each bounded box contains a person; thereafter, the system assigns a unique identification to each bounded box by using a tracking technique. In one aspect, the tracking technique may be trained using a plurality of historical annotated images comprising position coordinates pertaining to each bounding box. Subsequent to assigning the unique identification, the system 102 tracks movement of each person around the playing field by generating a movement profile. It may be understood that the movement profile corresponds to historical movement trends pertaining to the person.
[0024] In one aspect, the movement of the person is tracked based on the position coordinates by using a Machine Learning (ML) model. Thus, in this manner, the system 102 maintains the unique identification, assigned to the person, based on at least one of the position coordinates, the movement profile, and a range of unique identifications, using a filtering model. The position coordinates may correspond to at least one of a location in the playing field and pixel coordinates in an image from the stream of images. The system may detect a person in the image.
[0025] It may be understood that the proposed invention has been described considering ‘Cricket’ as the sport. However, the scope of the proposed methodology is not restricted to Cricket; it can be implemented for any other outdoor sport, including Football, Baseball, and Hockey.
[0026] Although the present disclosure is explained considering that the system 102 is implemented on a server, it may be understood that the system 102 may also be implemented in a variety of computing systems, such as a laptop computer, a desktop computer, a notebook, a workstation, a mainframe computer, a server, a network server, or a cloud-based computing environment present at the base station. It will be understood that the system 102 may be accessed by multiple users through one or more user devices 104-1, 104-2…104-N, collectively referred to as users 104 or stakeholders hereinafter, or through applications residing on the user devices 104. In one implementation, the system 102 may comprise the cloud-based computing environment in which a user may operate individual computing systems configured to execute remotely located applications. Examples of the user devices 104 may include, but are not limited to, a portable computer, a personal digital assistant, a handheld device, and a workstation. The user devices 104 are communicatively coupled to the system 102 through a network 106.
[0027] In one implementation, the network 106 may be a wireless network, a wired network, or a combination thereof. The network 106 can be implemented as one of the different types of networks, such as intranet, local area network (LAN), wide area network (WAN), the internet, and the like. The network 106 may either be a dedicated network or a shared network. The shared network represents an association of the different types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), and the like, to communicate with one another. Further the network 106 may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, and the like.
[0028] Referring now to Figure 2, an embodiment 200 of the system 102 is illustrated. In one embodiment, the system 102 may include at least one processor 202, an input/output (I/O) interface 204, and a memory 206. The at least one processor 202 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, graphics processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the at least one processor 202 is configured to fetch and execute computer-readable instructions stored in the memory 206.
[0029] The I/O interface 204 may include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like. The I/O interface 204 may allow the system 102 to interact with the user directly or through the client devices 104. Further, the I/O interface 204 may enable the system 102 to communicate with other computing devices, such as web servers and external data servers (not shown). The I/O interface 204 can facilitate multiple communications within a wide variety of networks and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. The I/O interface 204 may include one or more ports for connecting a number of devices to one another or to another server.
[0030] The memory 206 may include any computer-readable medium or computer program product known in the art including, for example, volatile memory, such as static random-access memory (SRAM) and dynamic random-access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. The memory 206 may include modules 208 and data 210.
[0031] The modules 208 include routines, programs, objects, components, data structures, etc., which perform particular tasks or implement particular abstract data types. In one implementation, the modules 208 may include an image obtaining module 212, a filtering module 214, a detection module 216 and other modules 218. The other modules 218 may include programs or coded instructions that supplement applications and functions of the system 102. The modules 208 described herein may be implemented as software and/or hardware modules that may be executed in the cloud-based computing environment of the system 102.
[0032] The data 210, amongst other things, serves as a repository for storing data processed, received, and generated by one or more of the modules 208. The data 210 may also include a database 220 and other data 222. The other data 222 may include data generated as a result of the execution of one or more modules in the other modules 218.
[0033] As there were various challenges observed in the existing art, these challenges necessitated a need for an automated system 102 which tracks a person in a large playing field using one image capturing device. In order to overcome the challenges elucidated above, a user may, at first, use the client device 104 to access the system via the I/O interface 204. The user may register themselves using the I/O interface 204 in order to use the system 102. In one aspect, the user may access the I/O interface 204 of the system 102. The system 102 may employ the image obtaining module 212, the filtering module 214, the detection module 216, and the other modules 218 for tracking each person present on a playing field, in real-time, during the ongoing sport and further generating insightful data of each person. The other modules may comprise at least one of a computer graphics module, an image manipulation module, a video codec module, and a 3-Dimensional (3D) object module. The other modules may be used to project the insightful data generated by tracking each person for a user viewing a live stream of the sport.
[0034] Further referring to Figure 2, the image obtaining module 212 obtains a stream of images of the playing field from a vantage point. It may be noted that the image obtaining module 212 obtains the stream of images from the image capturing unit 101. Further, it must be understood that the image capturing unit 101 is mounted at the vantage point in such a manner that each image comprises an aerial view of the playing field, in its entirety. An example setup 300 required for the system is illustrated in Figure 3. The image capturing unit 101 is mounted on one of the flood lights illuminating the playing field 302 and captures the aerial view of the playing field 302, in its entirety. In one aspect, the image obtaining module 212 pre-processes each image by removing the background noise 304, as shown in Figure 3.
[0035] Each image may hereinafter be ingested into one or more image processing modules to process the image. It may be noted that the image capturing unit 101 is protected and deployed in a watertight environment, i.e., a camera box which restricts water ingress in rainy seasons, thereby keeping the image capturing unit 101 safe for use. Further, the camera box comprises a cooling unit, such as a fan, to maintain the temperature of the image capturing unit 101 and keep the temperature within predefined limits. Furthermore, the camera box comprises a splitter and a first converter unit configured to convert image signals, pertaining to the stream of images, to Ethernet signals. In one aspect, the converted Ethernet signals are split and carried through the cables which connect to the base station where the system 102 is deployed.
[0036] In one embodiment, the system 102 also comprises a converter unit, hereinafter referred to as a second converter unit, which converts the Ethernet signals into Standard Definition (SD) signals and then transmits the SD signals to the detection module 216. In one aspect, each image is ingested into the second converter unit to convert the image into a downscaled image in accordance with a predefined format and size. In one aspect, the Ethernet signals converted into the Standard Definition (SD) signals are in 4K and/or 8K format.
[0037] Once the stream of images is obtained from the image capturing unit 101, the detection module 216 ingests the stream of images to analyse the images. Further, the detection module 216 is configured to derive spatial information of each person present in an image of the stream of images. The spatial information may comprise a set of position coordinates. In order to detect the position coordinates of each person, the detection module 216 comprises a Gstreamer 102-5, a streaming application which allows capturing the stream of images and feeding each image into a Computer Vision application deployed in the form of the detection module 216. In one aspect, the Gstreamer 102-5 may be enabled using a plug-in to ingest the stream of images to the filtering module 214 at ‘25’ frames per second (fps). The above process of converting the Ethernet signals into Standard Definition (SD) signals and then transmitting the SD signals to the detection module 216 is continuously performed to provide access to each image to the detection module 216, which is configured to detect the position coordinates of each person present in the image.
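By way of illustration only, the ingestion step described above may be sketched as a GStreamer pipeline in Python. The element names below are standard GStreamer elements; the test source, the output resolution, and the exact pipeline layout are assumptions made for this sketch, not the configuration used by the system.

    # Illustrative sketch, not the system's actual pipeline: rate-limit a feed
    # to 25 fps, downscale it, and hand frames to a CV application via appsink.
    import gi
    gi.require_version("Gst", "1.0")
    from gi.repository import Gst

    Gst.init(None)
    pipeline = Gst.parse_launch(
        "videotestsrc is-live=true "                           # stand-in for the camera feed
        "! videorate ! video/x-raw,framerate=25/1 "            # ingest at 25 fps
        "! videoscale ! video/x-raw,width=1920,height=1080 "   # assumed downscale target
        "! videoconvert ! appsink name=sink emit-signals=true")
    pipeline.set_state(Gst.State.PLAYING)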
[0038] In one embodiment, the position coordinates are detected by a Deep Learning - Machine Learning (ML) model trained using a plurality of training data images. The plurality of training data images, depicting the players (persons) scattered around the playing field, are captured from a variety of angles, under varying weather conditions, and with players wearing jerseys of distinct colors. In other words, the Deep Learning - Machine Learning (ML) model may be trained on a data set comprising a plurality of video recordings of a plurality of events and annotations for each person detected in the plurality of recordings. In one aspect, the Deep Learning - Machine Learning (ML) model may be trained based on object detection algorithms and object tracking algorithms.
[0039] With this training, the detection module 216 detects a set of position coordinates of each person present in the image. The set of position coordinates enables the detection module 216 to create a bounded box around the set of position coordinates in a manner such that each bounded box contains a person. It may be understood that each person is detected in each image so that the person can be tracked for as long as the sport is being played. In one aspect, the position coordinates are detected by using at least one of a person detection technique and a transformation technique. The person detection technique may be used to identify a person in an image. Further, the position coordinates of the person may be derived as pixel coordinates based on a coordinate system defined for the image. Further, the pixel coordinates may be converted to real-world coordinates corresponding to a location in the playing field. The system may use the transformation technique to calculate the real-world coordinates as the position coordinates. The detection module 216 further determines a count of persons, including players and umpires, on the playing field and generates as many bounding boxes so as to subsume a person in each bounded box.
[0040] The bounding boxes may be created based on a subset of the set of position coordinates. The set of position coordinates may include a plurality of position coordinates of the person in one or more images of the stream of images. The subset may comprise position coordinates of the person in one image. The bounding box may be created in each image based on the position coordinates of the person in that image. The position coordinates may represent at least one of a location of the person in the playing field and pixels of the image depicting the person.
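As a minimal sketch of the detection step described above, the following assumes a generic person detector (the detect_persons callable is a hypothetical placeholder) and a precomputed 3x3 homography matrix H for the pixel-to-world transformation; the specification does not prescribe either component.

    import numpy as np
    import cv2

    def derive_spatial_info(image, detect_persons, H):
        """Return (bounding_box, world_xy) pairs for each detected person.

        detect_persons: hypothetical detector returning [(x1, y1, x2, y2), ...]
        H: 3x3 homography mapping image pixels to playing-field coordinates.
        """
        results = []
        for (x1, y1, x2, y2) in detect_persons(image):
            # Take the bottom-centre of the box as the person's ground point.
            foot = np.array([[[(x1 + x2) / 2.0, float(y2)]]], dtype=np.float32)
            world_xy = cv2.perspectiveTransform(foot, H)[0][0]
            results.append(((x1, y1, x2, y2),
                            (float(world_xy[0]), float(world_xy[1]))))
        return results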
[0041] The system may generate a movement profile of each person based on the spatial information. The movement profile may comprise a movement velocity of each person and the set of position coordinates. Further, the set of position coordinates may be used to plot movement trends of the person. The movement velocity may be determined based on a distance between position coordinates of the person and a frame rate of transmission of the image stream. Consider a first image from the stream of images and a sixtieth image from the stream of images. Considering that the stream of images is transmitted at 30 frames per second, the time between the first and the sixtieth image is approximately 2 seconds. Therefore, the movement velocity may be determined as the distance between the position coordinates of the person in the first image and the position coordinates of the person in the sixtieth image divided by 2.
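The velocity computation in the worked example above may be sketched as follows; a Euclidean distance and the example’s 30 fps frame rate are assumed.

    import math

    def movement_velocity(pos_a, pos_b, frame_a, frame_b, fps=30):
        """Velocity between two detections, per the worked example above."""
        distance = math.dist(pos_a, pos_b)    # in field units, e.g. metres
        elapsed = (frame_b - frame_a) / fps   # seconds between the two frames
        return distance / elapsed

    # Example mirroring the text: positions 10 m apart, roughly 2 s of play
    # between the first and the sixtieth image at 30 fps -> 5.0 m/s.
    v = movement_velocity((0.0, 0.0), (10.0, 0.0), frame_a=0, frame_b=60)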
[0042] In an embodiment, the system may implement a movement profile builder module using a machine learning model to generate the movement profile for a person. The machine learning model may be trained using a training dataset comprising a plurality of streams of images, a plurality of persons detected in the plurality of streams of images, sets of position coordinates corresponding to the plurality of persons detected, and sample movement profiles for the plurality of persons detected. The machine learning model may learn patterns and connections between the set of position coordinates of a person and the movement profile of the person. Thus, the machine learning model is trained to output a movement profile for a person, given an input stream of images in which the person is detected.
[0043] In order to continuously track the person, the system may assign a Unique Identification (UID) to the person. In one aspect, the system may assign a UID to each person detected in the stream of images. In an embodiment, the system may annotate the bounding box comprising the person with the UID. It may be understood that the UID is a unique number assigned to each person. In one embodiment, the UID is assigned by using an Operational Support System (OSS) player tracking algorithm customized to assign the unique number to each person in the bounded box. The intent behind assigning the UID is to avoid an occlusion state whenever two or more persons overlap, leading to swapping and/or loss of identification. It is to be understood that the bounded box, once created, moves along as the person moves around the playing field. It may be noted that if the person disappears and cannot be tracked, due to the occlusion, for a short amount of time during the ongoing sport and re-appears in subsequent images, the same UID would be assigned to the person as assigned before. In other words, if the person re-appears and is detected again in the succeeding image, the person is assigned the same UID as assigned to him/her before.
[0044] In one embodiment, this continuous detection of the person, through the position coordinates in the stream of images, enables monitoring of the movement of the person around the playing field using spatial information including position coordinates pertaining to each person. The continuous detection further enables prediction of probable positions for each person who goes undetected in the preceding images of the stream of images. In one aspect, this continuous detection is performed based on the last recorded position of the person, a velocity based on the historic movement of the person, the number of frames in which the person goes undetected, and a movement pattern around the playing field. In one embodiment, the predicted probable position is one of coordinates of a point on the playing field, coordinates corresponding to an area on the playing field, and coordinates corresponding to a track followed by the person on the playing field.
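A minimal sketch of the position prediction described above, assuming a constant-velocity extrapolation from the last recorded position; the specification leaves the exact predictor open, so this is one plausible reading rather than the claimed method.

    def predict_probable_position(last_xy, velocity_xy, frames_undetected, fps=25):
        """Extrapolate a probable position for a person who went undetected.

        last_xy:           last recorded field coordinates of the person
        velocity_xy:       per-axis velocity from the movement profile (units/s)
        frames_undetected: number of frames in which the person was not detected
        """
        dt = frames_undetected / fps
        return (last_xy[0] + velocity_xy[0] * dt,
                last_xy[1] + velocity_xy[1] * dt)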
[0045] In an embodiment, each person is continuously tracked based on the unique identification (UID) assigned to the person and the movement profile of the person. The system may use the filtering module 214 to maintain the UIDs assigned to each person. Maintaining the UIDs of each person is necessary to avoid incorrect tracking of the person. The filtering module 214 may employ an ML model to ensure that the UIDs assigned to each person are unique and within a range of UIDs. The ML model employed by the filtering module may be referred to as the filtering model in the described embodiments. The range of the UIDs may be predefined based on the processing power of the system. The number of persons that can be tracked simultaneously is equal to the total number of UIDs available. The total number of UIDs available may be determined based on the range of UIDs. For example, consider that the range of UIDs is 000-100. The total number of UIDs available is 101; hence, up to 101 persons may be tracked simultaneously.
[0046] Further, the filtering module may categorize the UIDs based on the assignment of the UIDs using the filtering model. The filtering model may be trained to determine categories of the UIDs. The UIDs may be categorized as at least one of Assigned, Unassigned, and Waiting.
[0047] The UID that is assigned to a person may be categorized as Assigned, the UID available for assignment to a person may be categorized as Unassigned, and the UID assigned to a person who is not visible in an image from the stream of images is categorized as Waiting. For example, let us consider that 10 UIDs are assigned to 10 persons detected in a stream of images. In image 1 of the stream of images, all 10 persons are visible. Therefore, all 10 UIDs (1, 2, 3…10) are assigned to the persons. In this case, all 10 UIDs are categorized as Assigned.
[0048] Consider a second example: let us assume that the range of UIDs is 1-20 and 10 persons are detected in image 1. Let us assume that UIDs 1-10 are assigned to the 10 persons that are detected in image 1. In this case, the UIDs 11-20 are categorized as Unassigned.
[0049] Continuing the second example, let us assume that the person with UID 4 disappears from the stream of images in image 56 of the stream of images. The person may be occluded due to various reasons, including advertisements. The UID 4 may be categorized as Waiting.
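For illustration, the UID range and the three categories described above may be modelled as a small pool. The range 000-100 and the Waiting-to-Unassigned release follow the examples in the text; the data structure itself is an assumption of this sketch.

    from enum import Enum

    class Category(Enum):
        ASSIGNED = "assigned"
        UNASSIGNED = "unassigned"
        WAITING = "waiting"

    class UIDPool:
        """Sketch of a UID pool for the range 000-100 (101 trackable persons)."""
        def __init__(self, lo=0, hi=100):
            self.category = {uid: Category.UNASSIGNED for uid in range(lo, hi + 1)}

        def assign(self):
            for uid, cat in self.category.items():
                if cat is Category.UNASSIGNED:
                    self.category[uid] = Category.ASSIGNED
                    return uid
            raise RuntimeError("no free UID: the range bounds simultaneous tracking")

        def mark_waiting(self, uid):    # person occluded or out of the field
            self.category[uid] = Category.WAITING

        def release(self, uid):         # waiting longer than the threshold duration
            self.category[uid] = Category.UNASSIGNED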
[0050] An example output 400 of the detection module is illustrated in Figure 4. The detection module 216 detects position coordinates of each person present in the image and creates a bounded box around the position coordinates in a manner such that each bounded box contains a person. Further, the system assigns a UID to each bounded box, as 400-1, 400-2, 400-3, 400-4, …, 400-n. It may be understood that the UID assigned to each person in the bounded box may remain the same throughout the match.
[0051] To avoid incorrect or double assignment of a UID to one or more persons, the filtering module 214 may employ the filtering model. The filtering model may be a machine learning model trained using a training dataset comprising a plurality of streams of images, a plurality of movement profiles corresponding to a plurality of persons in the plurality of streams of images, the plurality of streams of images annotated with the UIDs assigned to the plurality of persons, and a log of categories of the UIDs. The log of categories of the UIDs may comprise categories of the UIDs in each image of a stream of images. In an embodiment, the log of categories of the UIDs may indicate change in categories of the UIDs in the stream of images. The filtering model may be trained to maintain the log of categories of the UIDs.
[0052] Continuing the second example explained above, let us assume that the person with UID 4 is called out of the playing field for a certain amount of time. The detection module may not be able to detect the person in an image from the stream of images when the person goes out of the playing field. The image in which the person is not detected may be labelled as an undetected image. Let us assume that the person goes out of the playing field 10 seconds after receiving the stream of images. Also, for instance, the frame rate of the stream of images is 30 frames per second. Therefore, the person may not be detected in image 301. Image 301 may be labelled as the undetected image. Further, UID 4 may be categorized as Waiting. Further, for example, a new person enters the playing field 30 seconds after receiving the stream of images. The filtering module may assign UID 11 to the new person. In another case, let us assume that 10 new persons enter the playing field 30 seconds after receiving the stream of images. The filtering module may assign the UIDs 11-20 to the 10 new persons. Now, let us assume an 11th new person enters the playing field. The filtering module may reassign the UID 4 to the 11th new person based on a duration for which the person assigned with UID 4 is not detected in the stream of images. The filtering module may reassign the UID 4 to the 11th new person if the 11th new person enters the playing field when the duration is greater than a threshold duration. In an embodiment, the UID 4 may be categorized as Unassigned after the threshold duration.
[0053] Further, the filtering model may be configured to check if a new person entering the field is one of one or more persons that are not detected in at least an image of the stream of images. The filtering model may compare the position coordinates of the new person with the movement profiles of the one or more persons. In an embodiment, the filtering model may generate a movement profile for the new person based on one or more images of the stream of images after the new person is detected in the stream of images. The filtering model may assign a new UID to the new person in case the movement profile of the new person does not match the movement profile of the one or more persons. In case the movement profile of the new person matches the movement profile of a person from the one or more persons, the filtering model may assign the UID of the person from the one or more persons to the new person.
[0054] Consider an example: in image 1 of a stream of images, 5 persons are detected. The 5 persons are assigned UIDs (1, 2, 3, 4, 5). 2 persons out of the 5 leave the playing field after 10 seconds. Therefore, image 301 may have only 3 persons, assigned with the UIDs (1, 2, 3). Let us assume that a new person is detected in image 601. Any person other than the persons already detected in the image may be referred to as a new person. It is to be noted that a person that disappears from the stream at a time and reappears may also be referred to as a new person upon reappearance. The filtering model generates a movement profile of the new person. Let us assume that the movement profile of the new person matches the movement profile of the person assigned with UID 4. The new person is reassigned UID 4. In the above example, the image 601 may be labelled as a redetected image. The UIDs (1, 2, 3, 4, 5) may be referred to as a set of UIDs. Each stream of images may have a corresponding set of UIDs comprising UIDs assigned to the persons detected in the corresponding stream of images.
[0055] In the above example, the UIDs 4 and 5 may be categorized as Waiting after the image 301 (the undetected image). Further, UID 4 may be recategorized as Assigned after the image 601 (the redetected image).
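The reappearance check described in the preceding paragraphs may be sketched as below. The similarity measure is deliberately simplified to a nearest-predicted-position test with an assumed gating radius, since the specification delegates the actual comparison to the trained filtering model.

    import math

    def match_reappearance(new_person_xy, waiting_profiles, max_distance=5.0):
        """Return the Waiting UID whose predicted position best matches a new
        detection, or None if no Waiting profile is close enough.

        waiting_profiles: {uid: predicted_xy} for UIDs categorized as Waiting.
        max_distance:     assumed gating radius in field units.
        """
        best_uid, best_d = None, max_distance
        for uid, predicted_xy in waiting_profiles.items():
            d = math.dist(new_person_xy, predicted_xy)
            if d < best_d:
                best_uid, best_d = uid, d
        return best_uid    # reassign this UID if found, else assign a fresh one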
[0056] In another embodiment, the filtering module 214 may employ a feature extraction algorithm to extract features of each person in the bounding box. Further, in case two or more bounding boxes overlap for a short period of time and move apart thereafter, the filtering module may match the features of the persons in the bounding boxes that overlapped, in order to reassign the UIDs to the two or more bounding boxes based on the features of the persons inside them.
[0057] Consider an example: on the playing field, 4 persons are detected. 2 persons are wearing a blue jersey and 2 persons are wearing a yellow jersey. The filtering module may extract features to enable the system to differentiate between the 4 persons. Further, let us assume that in an image from the stream of images, one person wearing the blue jersey and one person wearing the yellow jersey are standing at the same position coordinates. The system may create only one bounding box for both the persons, and the bounding box may be assigned the UID of any one of the two persons. In a later image, sequentially after the image, the two persons separate and the system creates two bounding boxes for the two persons. To ensure that the bounding boxes are annotated with the correct UIDs, features of the two persons may be matched with the features extracted before the two persons overlapped.
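One conventional way to realise the feature matching described above is a colour-histogram comparison over each bounding box, which would separate the blue and yellow jerseys in the example; the specification does not fix the feature extractor, so this OpenCV-based instantiation is an assumption.

    import cv2

    def jersey_histogram(image, box):
        """Hue histogram of a person crop, used as a simple appearance feature."""
        x1, y1, x2, y2 = box
        hsv = cv2.cvtColor(image[y1:y2, x1:x2], cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0], None, [32], [0, 180])
        return cv2.normalize(hist, hist).flatten()

    def reassign_after_overlap(image, boxes, stored_features):
        """Match post-overlap boxes back to the UIDs seen before the overlap."""
        assignments = {}
        for box in boxes:
            feat = jersey_histogram(image, box)
            # Pick the UID whose pre-overlap feature correlates best with this crop.
            assignments[box] = max(
                stored_features,
                key=lambda uid: cv2.compareHist(
                    stored_features[uid], feat, cv2.HISTCMP_CORREL))
        return assignments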
[0058] In one embodiment, the filtering module 214 further tracks the duration of presence of each unique identification on the playing field. In one aspect, the duration may be tracked by at least one tracking technique. Whenever a situation arises that a unique identification counter (generating a unique identification) reaches its configured threshold count, the filtering module 214 re-assigns a unique identification, previously assigned to a person in the bounded box, to a new person detected during the on-going sport. In one aspect, the filtering module 214 may re-assign the unique identification when the person’s presence on the playing field is less than a threshold duration.
[0059] In another embodiment, the filtering module 214 assigns a unique identification to the person detected in each image. It may be understood that the unique identification, assigned to the person, remains the same for as long as the ongoing sport is being played on the playing field, even when a person is intermittently detected in the stream of images. In one aspect, the assignment of the unique identification remains the same as a result of:
a) tracking movement, of the person assigned with the unique identification, based on the position coordinates corresponding to the person, wherein the movement is tracked in an initial set of continuous images of the stream of images,
b) predicting probable position coordinates of the person in a first image, of the stream of images, in which the person goes undetected, wherein the first image is sequentially obtained after the initial set of continuous images, and wherein the probable position coordinates are predicted based on a set of parameters, and
c) re-assigning the unique identification to the person detected in a second image, wherein the unique identification is the same as previously assigned to the person in the initial set of continuous images prior to going undetected in the first image, and wherein the second image is sequentially obtained after the first image.
[0060] In one aspect, the probable position coordinates are one of coordinates of a point on the playing field, coordinates corresponding to an area on the playing field, and coordinates corresponding to a track followed by the person on the playing field. In another aspect, the set of parameters comprises the position coordinates tracked in the initial set of continuous images, the velocity of the person tracked in the initial set of continuous images, a number of frames for which the person disappeared, and a movement pattern of the person on the playing field.
[0061] Thus, based on the above, it must be understood that the filtering module 214 detects one or more persons in each image. The filtering module 214 may identify the unique identifications (IDs) assigned to the one or more persons detected in the image. If a person is not assigned any unique identification (ID), the filtering module 214 checks one or more images preceding the current image to determine whether the person was present in a surrounding area in those preceding images. If yes, the filtering module 214 assigns the same unique identification (ID) previously assigned to the same person in the preceding images.
[0062] Based on the above methodology, the system creates a bounded box around the position coordinates in a manner such that each bounded box contains a person, assigns the unique identification to each bounded box and/or re-assigns the same unique identification to the same person that was assigned in the preceding images.
[0063] Once the system creates the bounded box and assigns the unique identification to each bounded box, the system tracks movement of each person around the playing field based on the unique identification assigned to each person and at least one of the position coordinates, the movement profile, and the range of unique identifications. In one aspect, the movement profile corresponds to historical movement trends pertaining to each person. The movement of each person is tracked based on the position coordinates by using a Machine Learning (ML) model.
[0064] Thus, in this manner, each person present on a playing field is tracked, in real-time, during an ongoing sport by maintaining unique identifiers assigned to each person on the playing field. While aspects of the described system and method for tracking each person present on the playing field may be implemented in any number of different computing systems, environments, and/or configurations, the embodiments are described in the context of the following exemplary system.
[0065] It may be understood that after continuously tracking each person and maintaining the unique identification, various meaningful insights can be deduced, which may then be intuitively rendered on the viewer’s device for the viewers’ consumption. A few embodiments of such meaningful insights which can be deduced based on continuously tracking and maintaining the unique identification are described hereinafter for better understanding. This includes additional modules comprising an information output module (not shown in the figure) enabling an operator to select at least two candidate objects on the image rendered as the playing field. A candidate object, of the at least two candidate objects, is a person for whom the meaningful insights are to be determined. It may be noted that the candidate object is in the bounded box which may be assigned with the unique identification. The information output module then determines the meaningful insights associated with the at least two candidate objects selected by the operator. Here it may be understood that the meaningful insights may include, but are not limited to, a distance indicating a gap between any two fielders on the playing field, a total distance covered by a fielder from the fielder’s original position for a ball delivered by a bowler, and a highlighted gap between two fielders.
[0066] Referring now to Figure 5, an example 500 of an overlay of insights on an image is illustrated. To elucidate the functioning of the information output module, consider an example where different fielders are placed on a cricketing field 302 as shown in Figure 5. In order to deduce the meaningful insights about the fielders, the operator selects two candidate objects, referred to as fielders, on the image rendered as the playing field. As shown in Figure 5, the operator selects two points which correspond to fielders positioned at ‘Mid Wicket’ and ‘Mid On’, marked as ‘A’ and ‘B’ respectively. The information output module then deduces the meaningful insights as a distance indicating a gap between the selected fielders ‘A’ and ‘B’. Here, in this use case, the information output module computes the distance using a pixel to meter conversion technique. The information output module may use any other distance computation technique to compute the distance between two fielders selected in the image. Similarly, the information output module may further deduce the meaningful insights as a highlighted gap between two selected fielders ‘C’ and ‘D’. Here, in this use case, the information output module highlights a section of the gap, as graded lines, between the two selected fielders ‘C’ and ‘D’.
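Once both fielders’ positions are expressed in a common field coordinate frame, the gap computation above reduces to a Euclidean distance. The sketch below assumes the pixel to meter conversion has already been folded into the coordinate transformation; the example coordinates are hypothetical.

    import math

    def fielder_gap_metres(world_xy_a, world_xy_b):
        """Distance between two selected fielders, assuming both positions
        were already converted from pixels to field coordinates in metres."""
        return math.dist(world_xy_a, world_xy_b)

    # Example: hypothetical coordinates for 'Mid Wicket' (A) and 'Mid On' (B).
    gap = fielder_gap_metres((42.0, 18.5), (55.0, 3.0))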
[0067] Subsequent to the determination of the meaningful insights, an image translating module (not shown in the figure) translates the position coordinates of each person, present in the image, and the meaningful insights into a destination image. It may be understood that the destination image is an image captured from a camera mounted on a drone. The destination image is captured from a certain height, preferably from a center of the pitch, such that the image captures the playing field, in its entirety, in a two-dimensional linear plane format.
[0068] The image translating module translates the position coordinates of each person present in the image into the destination image by using a homography technique. The homography technique includes marking a set of points as edges of straight lines on the image. It is to be noted that the set of points are marked in a manner such that each straight line intersects the playing field from one end to another end of the playing field. The set of points are marked in the image in a specific order.
[0069] In order to translate the position coordinates of each person and the meaningful insights into the destination image, the homography technique superimposes the destination image on the image having the set of points marked on it. In one aspect, the destination image is superimposed on the image in a manner such that the scale and aspect ratio of the destination image are the same as those of the image. The homography technique includes reproducing, as a result of the superimposition of the destination image on the image, the set of points on the destination image by using a matrix transformation technique. In other words, this reproduction of the set of points includes marking the set of points on the destination image in the same order as marked in the image.
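The translation step described above corresponds closely to standard planar homography estimation. The sketch below uses OpenCV’s findHomography and perspectiveTransform on the ordered point sets; the point correspondences themselves are assumed to come from the marking step described in the text.

    import numpy as np
    import cv2

    def translate_to_destination(src_points, dst_points, person_positions):
        """Map person positions from the camera image into the destination image.

        src_points / dst_points: the ordered sets of marked points (at least
        four correspondences) in the camera image and the drone-view image.
        person_positions: Nx2 pixel positions of persons in the camera image.
        """
        H, _ = cv2.findHomography(
            np.asarray(src_points, dtype=np.float32),
            np.asarray(dst_points, dtype=np.float32))
        pts = np.asarray(person_positions, dtype=np.float32).reshape(-1, 1, 2)
        return cv2.perspectiveTransform(pts, H).reshape(-1, 2)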
[0070] In one aspect, the above translation using the homography technique is performed to intuitively render, on the viewer’s device, each person present on the sporting field, including the candidate object and the meaningful insights, in a two-dimensional linear plane format. The candidate object and the meaningful insights are intuitively rendered on the viewer’s device by pushing the candidate object and the meaningful insights into an output pipeline which is communicatively coupled with a broadcaster’s input; the broadcaster then broadcasts a combined example broadcast object 600, comprising the candidate object and the meaningful insights, on the viewer’s device, as illustrated in Figure 6.
[0071] Thus, in this manner, the system 102 may display the meaningful insights associated with the sport on the viewer’s device. In one aspect, the output pipeline may be communicatively coupled with other downstream applications 103 providing visual storytelling tools for media content creators in the broadcast, sports digital, and esports industries. In another aspect, the output pipeline may be communicatively coupled with a cloud environment 103 for storing the candidate object and the meaningful insights for all such downstream applications 103.
[0072] Referring now to Figure 7, a method 700 for tracking a person on a playing field during an ongoing sport is shown, in accordance with an embodiment of the present subject matter. The method 700 may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, functions, etc., that perform particular functions or implement particular abstract data types. The method 700 may also be practiced in a distributed computing environment where functions are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, computer executable instructions may be located in both local and remote computer storage media, including memory storage devices.
[0073] The order in which the method 700 is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method 700 or alternate methods. Additionally, individual blocks may be deleted from the method 700 without departing from the scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof. However, for ease of explanation, in the embodiments described below, the method 700 may be considered to be implemented as described in the system 102.
[0074] At block 702, a stream of images of the playing field is obtained from an image capturing unit 101 mounted at a vantage point. The image capturing unit 101 captures an image in a manner such that each image demonstrates an aerial view of the playing field in its entirety. In one implementation, the stream of images of the playing field may be obtained by the image obtaining module 212.
[0075] At block 704, the stream of images is analysed to detect a person in the stream of images. The stream of images refers to images that are continuously received one after the other. Each image received is analysed to detect a person in the image.
[0076] At block 706, spatial information of the detected person is derived from the images in which the person is detected. In an embodiment, the system may collect all the images in which the person is detected. Further, the system may arrange the images in a sequence based on the order in which the images were received. The system may then analyse the collection of the images to derive the spatial information of the person.
[0077] At block 708, a movement profile of the person is generated. The movement profile may be generated based on the spatial information of the person. The movement profile may correspond to movement trends of the person. Further, the movement profile may be used to predict a probable future position of the person.
[0078] At block 710, a unique identification (UID) may be assigned to the person from a set of UIDs. The set of UIDs may comprise one or more UIDs based on a range of UIDs. The UID may be assigned to the person by creating a bounding box around the person based on position coordinates of the person and annotating the bounding box with the UID.
[0079] At block 712, the person is tracked in the stream of images based on the UID assigned to the person and the movement profile of the person.
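Read together, blocks 702-712 amount to a per-frame loop of the following shape; every helper name below is a hypothetical placeholder for the corresponding module described earlier, not a function defined by the specification.

    def track_playing_field(obtain_frames, detect_persons, derive_spatial_info,
                            update_profile, assign_or_reassign_uid, render_track):
        """Illustrative per-frame loop tying blocks 702-712 together."""
        profiles = {}                         # UID -> movement profile
        for frame in obtain_frames():         # block 702: stream from the camera
            for detection in detect_persons(frame):              # block 704
                spatial = derive_spatial_info(frame, detection)  # block 706
                uid = assign_or_reassign_uid(spatial, profiles)  # block 710
                profiles[uid] = update_profile(                  # block 708
                    profiles.get(uid), spatial)
                render_track(frame, uid, profiles[uid])          # block 712
        return profiles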
[0080] Although implementations for methods and systems for tracking a person on a playing field during an ongoing sport have been described in language specific to structural features and/or methods, it is to be understood that the appended claims are not necessarily limited to the specific features or methods described. Rather, the specific features and methods are disclosed as examples of implementations for tracking each person present on the playing field.
Claims:
We Claim:
1. A method for tracking a person on a playing field during an ongoing sport, the method comprising:
obtaining, by a processor, a stream of images of the playing field from an image capturing unit, wherein an image of the stream of images comprises a view of the playing field in its entirety;
analysing, by the processor, the stream of images to detect the person;
deriving, by the processor, spatial information of the person from the stream of images, wherein the spatial information comprises a set of position coordinates of the person;
generating, by the processor, a movement profile for the person based on the spatial information using a machine learning model, wherein the movement profile corresponds to historical movement trends of the person;
assigning, by the processor, a Unique Identifier (UID), from a set of UIDs, to the person; and
tracking, by the processor, the person in the stream of images based on the UID assigned to the person and the movement profile.
2. The method as claimed in claim 1, wherein an image from the stream of images is ingested into an image converter configured to convert the image into a downscaled image, wherein the image converter downscales the image in accordance with a predefined format and size.
3. The method as claimed in claim 1, wherein the movement profile comprises a movement velocity and the set of position coordinates, and wherein the set of position coordinates comprises position coordinates of the person in one or more images from the stream of images, and wherein the position coordinates represent at least one of location of the person in the playing field and pixels of the image depicting the person.
4. The method as claimed in claim 1 further comprises maintaining the set of UIDs, based on the movement profile corresponding to one or more persons, UIDs assigned to the one or more persons, and a range of UIDs, using a filtering model.
5. The method as claimed in claim 4, wherein the range of UIDs indicates a total number of UIDs available for assignment to the one or more persons.
6. The method as claimed in claim 4 further comprises categorizing a UID from the set of UIDs, based on the assignment of the UID to a person of the one or more persons, as at least one of assigned, unassigned, and waiting, wherein a UID is categorized as waiting when the person assigned the UID is not detected in an undetected image, wherein the undetected image is an image from the stream of images in which the person is not detected.
7. The method as claimed in claim 6 further comprises reassigning the UID to the person, based on the filtering model, when the person is detected again in a redetected image occurring sequentially one or more images after the undetected image, wherein the redetected image is an image from the stream of images in which the person is detected again after not being detected in the undetected image.
8. The method as claimed in claim 7, wherein the person is detected in the redetected image based on the position coordinates of the person in the redetected image and the movement profile of the person.
9. The method as claimed in claim 6 further comprises recategorizing the UID as assigned after reassigning the UID to the person.
10. The method as claimed in claim 6 further comprises recategorizing the UID as unassigned when the person is not detected in a predefined number of images sequentially after the undetected image.
11. The method as claimed in claim 1 further comprises:
creating a bounding box around the person based on the position coordinates of the person; and
annotating the bounding box with the UID assigned to the person.
12. The method as claimed in claim 4, wherein the filtering model is trained to maintain the UIDs using a training dataset comprising a plurality of streams of images, a plurality of movement profiles corresponding to a plurality of persons in the plurality of streams of images, and the plurality of streams of images annotated with the UIDs assigned to the plurality of persons.
13. The method as claimed in claim 12 further comprises updating the training dataset by adding the stream of images, the movement profile of the person, and UID assignment data of the person, wherein the UID assignment data comprises annotations of the UID assigned to the person in the stream of images.
14. A system for tracking a person on a playing field during an ongoing sport, the system comprises:
a memory; and
a processor coupled to the memory, wherein the processor is configured to execute program instructions stored in the memory for:
obtaining a stream of images of the playing field from an image capturing unit, wherein an image of the stream of images comprises a view of the playing field in its entirety;
analysing the stream of images to detect the person;
deriving spatial information of the person from the stream of images, wherein the spatial information comprises a set of position coordinates of the person;
generating a movement profile for the person based on the spatial information using a machine learning model, wherein the movement profile corresponds to historical movement trends of the person;
assigning a Unique Identifier (UID), from a set of UIDs, to the person; and
tracking the person in the stream of images based on the UID assigned to the person and the movement profile.
15. A non-transitory computer program product having embodied thereon a computer program for tracking a person on a playing field during an ongoing sport, the non-transitory computer program product storing instructions for:
obtaining a stream of images of the playing field from an image capturing unit, wherein an image of the stream of images comprises a view of the playing field in its entirety;
analysing the stream of images to detect the person;
deriving spatial information of the person from the stream of images, wherein the spatial information comprises a set of position coordinates of the person;
generating a movement profile for the person based on the spatial information using a machine learning model, wherein the movement profile corresponds to historical movement trends of the person;
assigning a Unique Identifier (UID), from a set of UIDs, to the person; and
tracking the person in the stream of images based on the UID assigned to the person and the movement profile.
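For clarity, the UID bookkeeping recited in claims 4 to 10 may be pictured as a small state machine, sketched below. The class name UIDPool and the patience threshold are illustrative assumptions standing in for the filtering model and the "predefined number of images" of claim 10; the sketch is not a definitive implementation of the claims.

```python
class UIDPool:
    """Illustrative sketch of the UID maintenance of claims 4-10: every
    UID in the range is categorized as assigned, unassigned, or waiting."""

    def __init__(self, range_of_uids: int, patience: int = 30):
        # patience stands in for the "predefined number of images" (claim 10)
        self.state = {uid: "unassigned" for uid in range(range_of_uids)}
        self.missed = {uid: 0 for uid in range(range_of_uids)}
        self.patience = patience

    def assign(self, uid: int) -> None:
        """Claim 6: a UID given to a detected person is 'assigned'."""
        self.state[uid] = "assigned"
        self.missed[uid] = 0

    def mark_undetected(self, uid: int) -> None:
        """Claim 6: if the person is not detected, the UID 'waits';
        claim 10: after too many missed images it is released."""
        self.state[uid] = "waiting"
        self.missed[uid] += 1
        if self.missed[uid] > self.patience:
            self.state[uid] = "unassigned"

    def reassign(self, uid: int) -> None:
        """Claims 7 and 9: a redetected person gets the waiting UID back,
        and the UID is recategorized as assigned."""
        if self.state[uid] == "waiting":
            self.assign(uid)
```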
| # | Name | Date |
|---|---|---|
| 1 | 202321076772-STATEMENT OF UNDERTAKING (FORM 3) [09-11-2023(online)].pdf | 2023-11-09 |
| 2 | 202321076772-REQUEST FOR EARLY PUBLICATION(FORM-9) [09-11-2023(online)].pdf | 2023-11-09 |
| 3 | 202321076772-FORM-9 [09-11-2023(online)].pdf | 2023-11-09 |
| 4 | 202321076772-FORM FOR STARTUP [09-11-2023(online)].pdf | 2023-11-09 |
| 5 | 202321076772-FORM FOR SMALL ENTITY(FORM-28) [09-11-2023(online)].pdf | 2023-11-09 |
| 6 | 202321076772-FORM 1 [09-11-2023(online)].pdf | 2023-11-09 |
| 7 | 202321076772-FIGURE OF ABSTRACT [09-11-2023(online)].pdf | 2023-11-09 |
| 8 | 202321076772-EVIDENCE FOR REGISTRATION UNDER SSI(FORM-28) [09-11-2023(online)].pdf | 2023-11-09 |
| 9 | 202321076772-EVIDENCE FOR REGISTRATION UNDER SSI [09-11-2023(online)].pdf | 2023-11-09 |
| 10 | 202321076772-DRAWINGS [09-11-2023(online)].pdf | 2023-11-09 |
| 11 | 202321076772-DECLARATION OF INVENTORSHIP (FORM 5) [09-11-2023(online)].pdf | 2023-11-09 |
| 12 | 202321076772-COMPLETE SPECIFICATION [09-11-2023(online)].pdf | 2023-11-09 |
| 13 | 202321076772-STARTUP [16-11-2023(online)].pdf | 2023-11-16 |
| 14 | 202321076772-FORM28 [16-11-2023(online)].pdf | 2023-11-16 |
| 15 | 202321076772-FORM 18A [16-11-2023(online)].pdf | 2023-11-16 |
| 16 | Abstract.jpg | 2023-12-09 |
| 17 | 202321076772-FORM-26 [05-03-2024(online)].pdf | 2024-03-05 |
| 18 | 202321076772-Request Letter-Correspondence [13-03-2024(online)].pdf | 2024-03-13 |
| 19 | 202321076772-Power of Attorney [13-03-2024(online)].pdf | 2024-03-13 |
| 20 | 202321076772-FORM28 [13-03-2024(online)].pdf | 2024-03-13 |
| 21 | 202321076772-Form 1 (Submitted on date of filing) [13-03-2024(online)].pdf | 2024-03-13 |
| 22 | 202321076772-Covering Letter [13-03-2024(online)].pdf | 2024-03-13 |
| 23 | 202321076772-Request Letter-Correspondence [26-03-2024(online)].pdf | 2024-03-26 |
| 24 | 202321076772-Power of Attorney [26-03-2024(online)].pdf | 2024-03-26 |
| 25 | 202321076772-FORM28 [26-03-2024(online)].pdf | 2024-03-26 |
| 26 | 202321076772-Form 1 (Submitted on date of filing) [26-03-2024(online)].pdf | 2024-03-26 |
| 27 | 202321076772-Covering Letter [26-03-2024(online)].pdf | 2024-03-26 |
| 28 | 202321076772-CORRESPONDENCE(IPO)-(WIPO DAS)-02-04-2024.pdf | 2024-04-02 |
| 29 | 202321076772-FER.pdf | 2024-08-20 |
| 30 | 202321076772-Proof of Right [20-02-2025(online)].pdf | 2025-02-20 |
| 31 | 202321076772-FER_SER_REPLY [20-02-2025(online)].pdf | 2025-02-20 |
| 32 | 202321076772-CLAIMS [20-02-2025(online)].pdf | 2025-02-20 |

| # | Name | Date |
|---|---|---|
| 1 | SearchHistoryE_03-07-2024.pdf | 2024-07-03 |