Abstract: Disclosed herein is a method and system for 3D tag detection. A 3D tag is created using materials in varied combinations so as to generate a random 3D pattern under light exposure. The system receives a first image and a second image of the tag, captured by an image capturing device under different illumination. Upon receiving the first image and the second image, the system determines a region of interest (ROI) in each of the first and second images and computes the difference between the two ROIs to generate a resultant image. The system determines the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on classification of a plurality of image features extracted from the resultant image. The system can also classify the input tag by using an ML model or a DL model.
FIELD OF THE DISCLOSURE
Embodiments of the present disclosure are related, in general, to 3D (three-dimensional) tag imaging, and more particularly, but not exclusively, to a method and system for detecting a 3D tag based on optical characteristics of the 3D tag.
BACKGROUND
Tag detection is a computer vision technique that aids in identifying a tag in an image or video, wherein most available tags are two-dimensional (2D). A number of tag detection techniques are leveraged in industries such as the retail industry, the imaging industry, the healthcare sector, etc. Although techniques for detection of 2D tags have been widely used in industry, detection of 3D tags from 2D imagery is a challenging problem due to the lack of data and the diversity of appearances and shapes of tags within a category. The difficulties in detecting and recognizing 3D tags in context are rooted in the loss of information that occurs when 3D world information is projected onto a 2D image. Further, identification of 3D tags having very small depth, such as less than a millimetre, cannot be performed with existing processes and devices. The availability of 3D imaging sensors has been increasing recently, but techniques for handling 3D data are complex in nature and still at an emerging stage. Further, 3D imaging sensors are also very costly.
Conventional methods for identifying an object as 3D or 2D require complex processing of captured images of the object. Further, in some cases such object identification techniques also require a specialized imaging device in order to capture images of the object. Such object dimension detection techniques include, but are not limited to, light pattern-based detection, position-based detection, detection by pixel-matching algorithms, detection by using laser emitters, image classification, etc.
US20180121775A1 proposes a multi-dimensional barcode and reader for the purpose of storing more data. The multi-dimensional barcode stores information to be electro-optically read by image capture. The multi-dimensional barcode comprises a plurality of light-modifying elements arranged along first and second directions that are orthogonal to each other in a pattern that stores first and second portions of the information. Some of the elements have different elevations along a third direction that is orthogonal to the first and second directions to store a third portion of the information, and are coloured to store a fourth portion of the information. The multi-dimensional barcode further comprises a surrounding medium for at least partially encasing at least some of the elements, the surrounding medium having a
characteristic that stores a fifth portion of the information.
US20140132725A1 discloses an electronic device for determining a depth of a 3D object image in a 3D environment image. The electronic device includes a sensor and a processor. The sensor obtains a sensor measuring value. The processor receives the sensor measuring value and obtains a 3D object image with depth information and a 3D environment image with depth information. The 3D environment image is separated into a plurality of environment image groups according to the depth information of the 3D environment image, and there is a sequence among the plurality of environment image groups. The processor selects one of the environment image groups and determines a corresponding depth of the selected environment image group as a depth of the 3D object image in the 3D environment image, according to the sequence and the sensor measuring value, to integrate the 3D object image into the 3D environment image.
US9571818B2 describes techniques for generating robust depth maps from stereo images. A robust depth map is generated from a set of stereo images captured with and without flash illumination. The depth map is more robust than depth maps generated using conventional techniques because a pixel-matching algorithm is implemented that weights pixels in a matching window according to the ratio of light intensity captured using different flash illumination levels. The depth map provides a rough estimate of depth relative to neighbouring pixels that enables the flash/no-flash pixel-matching algorithm to devalue pixels that appear to be located at different depths than the central pixel in the matching window. In addition, the ratio map may be used to filter the generated depth map to generate a smooth estimate of the depth of objects within the stereo image.
However, the existing methods do not provide any solution to identify a tag as 3D or 2D with very small depth, such as less than a millimetre. Also, the existing methods do not provide the ability to detect 3D tags with reduced processing complexity and less costly infrastructure. Therefore, there is a need for a method and system to seamlessly identify a 3D tag having small depth by processing one or more images of the tag captured using simple imaging devices with illuminating capability.
The information disclosed in this background of the disclosure section is only for enhancement of understanding of the general background of the disclosure and should not be taken as an acknowledgement or any form of suggestion that this information forms prior art already known to a person skilled in the art.
SUMMARY
One or more shortcomings of the prior art are overcome, and additional advantages are provided through the present disclosure. Additional features and advantages are realized through the techniques of the present disclosure. Other embodiments and aspects of the disclosure are described in detail herein and are considered a part of the claimed disclosure.
Accordingly, the present disclosure relates to a method of detecting a 3D tag. The method includes receiving a first image and a second image of an input tag captured by an image capturing device, wherein the first image is captured in a well-illuminated environment and the second image is captured in normal light without using an illuminating source. The method further comprises determining a region of interest (ROI) from each of the first image and the second image thus captured, and computing differences between the ROIs of the first image and the second image to generate a resultant image. The method includes determining a plurality of image features from the resultant image, wherein the image features include reflection points, grey areas, shape of raised edges, and gradient in pixel values across the raised edges. The method further includes determining the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on classification of the plurality of image features.
Further, the disclosure relates to a method of detecting a 3D tag. The method comprises receiving an image of an input tag captured by an image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source. The method includes determining the ROI (region of interest) of the captured image and determining a plurality of image textural features related to 3D tags from the determined ROI of the received image by using a Machine Learning (ML) model, wherein the textural features are extracted from reflection points, grey areas along the edges, pattern of raised features, and other features commonly related to 3D tags. The method comprises classifying the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on the plurality of image textural features by using the ML model.
Furthermore, the disclosure relates to a method of detecting a 3D tag. The method comprises receiving an image of an input tag captured by an image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source. The method includes determining the ROI (region of interest) of the captured image and classifying the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on a plurality of image features by using a Deep Learning (DL) model.
In another aspect, the disclosure relates to a system for detecting a 3D tag. The system comprises an image capturing device, a light illuminating source configured to enable a well-illuminated environment, a processor coupled with the image capturing device and the light illuminating source, and a memory communicatively coupled with the processor. The memory stores processor-executable instructions, which, on execution, cause the processor to receive a first image and a second image of an input tag captured by the image capturing device, wherein the first image is captured in the well-illuminated environment and the second image is captured in normal light without using an illuminating source. The processor is further configured to determine the ROI (region of interest) from each of the first image and the second image, and compute differences between the ROIs of the first image and the second image to generate a resultant image. The processor is further configured to determine a plurality of image features from the resultant image, wherein the image features include reflection points, grey areas, shape of raised edges, and gradient in pixel values across the raised edges. The processor is configured to determine the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on classification of the plurality of image features.
In yet another aspect, the disclosure relates to a system for detecting a 3D tag. The system comprises an image capturing device, a processor coupled with the image capturing device, and a memory communicatively coupled with the processor, wherein the memory stores processor-executable instructions, which, on execution, cause the processor to receive an image of an input tag captured by the image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source. The processor is further configured to determine the ROI (region of interest) of the captured image and determine a plurality of image textural features related to 3D tags from the determined ROI by using a Machine Learning (ML) model, wherein the textural features are extracted from reflection points, grey areas along the edges, pattern of raised features, and other features commonly related to 3D tags. The processor is configured to classify the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on the plurality of image features by using the ML model.
In still another aspect, the disclosure relates to a system for detecting a 3D tag. The system comprises an image capturing device, a processor coupled with the image capturing device, and a memory communicatively coupled with the processor, wherein the memory stores processor-executable instructions, which, on execution, cause the processor to receive an image of an input tag captured by the image capturing device, wherein the image is captured either in
a well-illuminated environment or in normal light without using an illuminating source. The processor is further configured to determine the ROI (region of interest) of the captured image and classify the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on a plurality of image features by using a Deep Learning (DL) model.
The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
BRIEF DESCRIPTION OF DRAWINGS
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. The same numbers are used throughout the figures to reference like features and components. Some embodiments of device or system and/or methods in accordance with embodiments of the present subject matter are now described, by way of example only, and with reference to the accompanying figures, in which:
Figure 1 illustrates an exemplary architecture of a proposed system to detect a 3D tag in accordance with some embodiments of the present disclosure;
Figure 2 illustrates an exemplary block diagram of a tag detection system for enabling detection of 3D tag or 2D tag, in accordance with an embodiment of the present disclosure;
Figure 3A illustrates an exemplary structure of 3D tag, in accordance with an embodiment of the present disclosure;
Figure 3B illustrates an exemplary reflection pattern of a genuine 3D tag, in accordance with an embodiment of the present disclosure;
Figure 3C illustrates an exemplary reflection pattern of a fake tag created with photocopy of the genuine 3D tag, in accordance with an embodiment of the present disclosure;
Figure 3D illustrates an exemplary image of a genuine 3D tag, in accordance with an embodiment of the present disclosure;
Figure 3E illustrates an exemplary image of a fake 2D tag, in accordance with an embodiment of the present disclosure;
Figure 4A illustrates a flowchart showing a method of detecting the 3D tag in accordance with some embodiments of the present disclosure;
Figure 4B illustrates a flowchart showing a method of detecting the 3D tag by using a Machine Learning (ML) model in accordance with some embodiments of the present disclosure;
Figure 4C illustrates a flowchart showing a method of detecting the 3D tag by using a Deep Learning (DL) model in accordance with some embodiments of the present disclosure; and
Figure 5 illustrates a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure.
The figures depict embodiments of the disclosure for purposes of illustration only. One skilled in the art will readily recognize from the following description that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the disclosure described herein.
DETAILED DESCRIPTION
In the present document, the word "exemplary" is used herein to mean "serving as an example, instance, or illustration." Any embodiment or implementation of the present subject matter described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
While the disclosure is susceptible to various modifications and alternative forms, specific embodiments thereof have been shown by way of example in the drawings and will be described in detail below. It should be understood, however, that it is not intended to limit the disclosure to the particular forms disclosed, but on the contrary, the disclosure is to cover all modifications, equivalents, and alternatives falling within the spirit and the scope of the disclosure.
The terms "comprises", "comprising", or any other variations thereof, are intended to cover a non-exclusive inclusion, such that a setup, device or method that comprises a list of components or steps does not include only those components or steps but may include other components or steps not expressly listed or inherent to such setup or device or method. In other words, one or more elements in a device or system or apparatus proceeded by "comprises... a" does not, without more constraints, preclude the existence of other elements or additional elements in the device or system or apparatus.
Embodiments of the present disclosure relate to a method and system for detecting a 3D tag with very small depth. In one embodiment, the 3D tag possesses a random structure having a very small thickness measurable in tens of microns. The 3D tag is manufactured in such a way that one or more types of images of 3D raised random patterns can be captured under different light exposure. To detect an input tag as 3D or 2D, the system receives a first image and a second image of the input tag captured by an image capturing device under two different light exposures, wherein the first image is captured in a well-illuminated environment and the second image is captured in normal light without using an illuminating source. The system determines the ROI (region of interest) from each of the first image and the second image, and computes differences between the ROIs of the first image and the second image to generate a resultant image. The system further determines a plurality of image features from the resultant image, wherein the image features include reflection points, grey areas, shape of raised edges, and gradient in pixel values across the raised edges. The system determines the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on classification of the plurality of image features.
In another embodiment, the system receives an image of an input tag captured by an image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source. The system further determines the ROI (region of interest) of the captured image and determines a plurality of image textural features related to 3D tags from the determined ROI of the received image by using a Machine Learning (ML) model, wherein the textural features are extracted from reflection points, grey areas along the edges, pattern of raised features, and other features commonly related to 3D tags. The system classifies the input tag as one of a three-dimensional (3D) tag or a two-dimensional (2D) tag based on the plurality of image features by using the ML model.
In yet another embodiment, the system receives an image of an input tag captured by an image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source. The system further determines the ROI (region of interest) of the captured image from a rectangle detected by image template-matching and classifies the input tag as one of a three-dimensional (3D) tag or a two-dimensional (2D) tag based on a plurality of image features by using a Deep Learning (DL) model.
The 3D tag is created by printing a plurality of layers in order to produce a thickness for the 3D tag, wherein the plurality of layers comprises one or more materials and each of the plurality of layers differs in the combination and concentration of said one or more materials. The layers are configured to generate a three-dimensional raised random pattern, wherein the different
layers also facilitate forming the 3D random pattern and protect the formed 3D pattern. In an example, the formed 3D random pattern can be of a very small thickness range, such as 50-500 microns. In one embodiment, the random 3D pattern exhibits different optical characteristics in response to different light exposure, wherein such optical characteristics can include, but are not limited to, reflection points, grey areas, lighter shades of grey in low-thickness locations, etc.
In the following detailed description of the embodiments of the disclosure, reference is made to the accompanying drawings that form a part hereof, and in which are shown by way of illustration specific embodiments in which the disclosure may be practiced. These embodiments are described in sufficient detail to enable those skilled in the art to practice the disclosure, and it is to be understood that other embodiments may be utilized and that changes may be made without departing from the scope of the present disclosure. The following description is, therefore, not to be taken in a limiting sense.
Figure 1 illustrates an exemplary architecture of a proposed system to detect 3D tag in accordance with some embodiments of the present disclosure.
As shown in Figure 1, the exemplary system (100) comprises one or more components configured for detecting a 3D tag based upon user requirements in near-real time. In one embodiment, the system (100) comprises a 3D tag detection system (TDS) (102) and an image database (104) communicatively coupled via a communication network (106). The communication network (106) may include, without limitation, a direct interconnection, a LAN (local area network), a WAN (wide area network), a wireless network, a point-to-point network, or another configuration. One of the most common types of network in current use is a TCP/IP (Transmission Control Protocol/Internet Protocol) network for communication between a database client and a database server. Other common Internet protocols used for such communication include HTTPS, FTP, AFS, WAP, and other secure communication protocols. The TDS (102) can be a handheld, portable user device enabling a user to capture images of a tag and identify the tag as 3D or 2D in near-real time.
The TDS (102) comprises an image capturing unit (108) and a light illuminating source (110).
The image capturing unit (108) may be a built-in imaging module embedded in a static, handheld, or portable user device such as a smart phone, mobile phone, tablet, etc. In one embodiment, the image capturing unit (108) can be a compact camera, DSLR camera, mirrorless camera, medium format camera, traditional digital camera, etc. capable of acquiring
images of objects with and without flashlight, wherein the image capturing unit (108) may also be configured separately from the TDS (102). Objects such as the 3D tag can be of very small size, in the range of micrometres to millimetres. In an example, the image capturing unit (108) may be configured with image processing capabilities so as to register the captured images, wherein the registration process aims to geometrically align the captured images.
The light illuminating source (110) can be any light source for exposing light onto the input tag. The light illuminating source (110) may be embedded within the TDS (102) or externally configured with the TDS (102).
The image database (104) is capable of storing a plurality of images of one or more tags, wherein such captured images also include images captured under different light illumination. The plurality of images are further used as training images for Machine Learning (ML) and Deep Learning (DL) processes. In one example, the image database (104) may be integrated within the TDS (102). The image database (104) may be configured, for example, as a standalone data store or as cloud data storage as illustrated.
The TDS (102) may be configured as a cloud-based implementation server or as a standalone server. In one embodiment, the TDS (102) further comprises a processor (112) and a memory (114) coupled to the processor (112) that stores processor-executable instructions. The processor (112) and the memory (114) are communicatively coupled to the image capturing unit (108). The TDS (102) further comprises one or more modules configured to determine an input tag as 3D or 2D in near-real time. In one embodiment, the one or more modules include a data acquisition module (116), an analysis engine (118), an image processing module (120), and a dimension detection module (122). The TDS (102) is configured to receive a well-illuminated image and a normal image of a tag, determine one or more image features of the tag, compare the parameter values of the image features with predefined threshold information, and determine the tag as 3D or 2D. In one embodiment, the TDS (102) further facilitates identifying the tag as 3D or 2D by capturing a single well-illuminated image or an image of the tag in normal light and processing the captured image by using a machine learning or deep learning technique. Therefore, the TDS (102) provides a 3D tag detection system capable of identifying 3D tags having small depths in a range of 50-500 microns without requiring any specialized device or light patterns.
The TDS (102) may be a typical system as illustrated in Figure 2. In one embodiment, the TDS (102) includes data (204) and modules (206). In one embodiment, the data (204) can be stored
within the memory (114). In one example, the data (204) may include imaging data (210), threshold data (212), ROI data (214), dimension data (216), and other data (218). In one embodiment, the data (204) can be stored in the memory (114) in the form of various data structures. Additionally, the aforementioned data can be organized using data models, such as relational or hierarchical data models. The other data (218) may also be referred to as a reference repository for storing recommended implementation approaches as reference data. The other data (218) may also store data, including temporary data, temporary files, additional data required for image processing of captured images of 3D tags or 2D tags, and intermediate data generated by the modules (206) while performing the various functions of the TDS (102).
The modules (206) may include, for example, the data acquisition module (116), the image processing module (120), the dimension detection module (122), and the analysis engine (118). The analysis engine (118) can further comprise a Machine Learning (ML) model (222), a Deep Learning (DL) model (224), or both. The modules (206) may also comprise other modules (226) to perform various miscellaneous functionalities of the TDS (102). It will be appreciated that such aforementioned modules may be represented as a single module or a combination of different modules. The modules (206) may be implemented in the form of software executed by the processor, hardware, and/or firmware.
In one embodiment, the data acquisition module (116) receives a plurality of design information of the 3D tag, such as tag dimensions, information on physically unclonable functions (PUF), allowable height of raised features, combination ratio of one or more 3D tag creating materials, dimensions of slopes, small hillocks, etc., from a user or external sources as input and stores the design information in the image database (104).
In an example, as illustrated in Figure 3A, a typical raised feature with a black hillock of the 3D tag is 0.1-0.5 mm high and 1-2 mm wide, with a slow rise in height, resulting in grey edges upon exposure to light.
Such raised features further generate different effects on captured images of the 3D tag. Therefore, images of a genuine 3D tag and of a fake tag created with a photocopy of the genuine 3D tag differ significantly in terms of reflection patterns, as illustrated in Figure 3B and Figure 3C.
Further, referring back to Figure 2, a plurality of images of genuine 3D tags and photocopied fake 2D tags are retrieved from external resources to train the ML model (222) and the DL model (224). A pair of such exemplary images is shown: a genuine 3D tag is illustrated in Figure 3D
and a photocopied fake 2D tag is illustrated in Figure 3E.
In operation, a user of the TDS (102) captures one or more images of an input tag by using the image capturing unit (108) and transmits the captured images for identifying the tag as one of a 3D tag or a 2D tag. The tag can be a genuine 3D tag or a fake tag. Further, the one or more captured images can also be transformed to generate fake images of 3D tags.
In one embodiment, the data acquisition module (116) receives a first image and a second image of the input tag as captured by the image capturing unit (108) at the time of scanning the tag and stores the image information as imaging data (210) in the memory (114). The image capturing unit (108) uses a light illuminating source, such as a flashlight, while capturing the first image so that the input tag is well-illuminated while capturing the first image. The flashlight illumination can fall on the input tag partially or fully, wherein one or more light illuminating sources can be integrated with the image capturing unit (108) or placed separately. Under the light illuminating source, the image capturing unit (108) captures one or more images of the tag for varied durations depending upon the random structure of the tag. The image capturing unit (108) captures the second image without using the light illuminating source, i.e., the tag is illuminated in naturally available light. In an example, the light ambience can be made totally or partially dark based on the optical nature of the tag.
The data acquisition module (116) further registers the captured first image and the second image. The registration process spatially transforms the captured images to appropriately align them by applying a plurality of image processing techniques such as image enhancement, image thresholding, feature definition, feature extraction, homography using perspective relation techniques, etc. Image processing techniques such as scale-invariant feature transform (SIFT), speeded up robust features (SURF), Oriented FAST and Rotated BRIEF (ORB), etc. can also be used for faster and more accurate image registration. The data acquisition module (116) can also receive only a single image of the tag captured by the image capturing device and register the single image for analysis by the image processing module (120) using the ML model (222) or the DL model (224). The single image can be a well-illuminated image or an image of the input tag under natural illumination.
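By way of illustration only, the following sketch shows one possible realisation of the registration step using the ORB and homography techniques mentioned above; the function name, parameter values, and OpenCV-based implementation are assumptions for explanation and do not limit the disclosure.

```python
# Illustrative sketch: register the flash (first) image onto the no-flash
# (second) image using ORB keypoints and a RANSAC-estimated homography.
import cv2
import numpy as np

def register_images(first_img, second_img, max_features=500, keep_ratio=0.8):
    """Warp first_img into the coordinate frame of second_img."""
    gray1 = cv2.cvtColor(first_img, cv2.COLOR_BGR2GRAY)
    gray2 = cv2.cvtColor(second_img, cv2.COLOR_BGR2GRAY)

    orb = cv2.ORB_create(max_features)
    kp1, des1 = orb.detectAndCompute(gray1, None)
    kp2, des2 = orb.detectAndCompute(gray2, None)

    # Match binary descriptors and keep only the best correspondences.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des1, des2), key=lambda m: m.distance)
    matches = matches[: int(len(matches) * keep_ratio)]

    src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

    # Perspective relation (homography) estimated robustly with RANSAC.
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    h, w = second_img.shape[:2]
    return cv2.warpPerspective(first_img, H, (w, h))
```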
The data acquisition module (116) receives threshold information related to one or more image features from a user or from external sources as input and stores the threshold information as threshold data (212) in the memory (114) for future reference. In one embodiment, the data acquisition module (116) receives a single well-illuminated image or an
image of the tag captured in natural light from the image capturing unit (108) for determining the dimension of the tag by implementing machine learning or deep learning techniques. In one example, the data acquisition module (116) receives a plurality of images of a plurality of tags and respective dimensional information of such tags from various external sources like public image libraries, image websites, multi-dimensional images produced by different image processing tools etc. The data acquisition module (116) further stores the received dimensional information of the plurality of tags along with the respective images in the image database (104), wherein such received data is used as training data for training the ML model or the DL model.
The image processing module (120) retrieves the registered first image and second image of the tag from the memory (114). Upon retrieving the registered images, the image processing module (120) determines the ROI (region of interest) of the retrieved images by using pattern matching techniques, geometrical image mapping techniques, etc. The ROI is an area of an image defined for determining the dimension of the tag. In an example, the ROI defines the borders of the retrieved images of the tag, wherein such boundaries are detected by observing discontinuities in brightness of the captured images. The image processing module (120) further stores the determined ROIs of the first image and the second image as ROI data (214) in the memory (114). The image processing module (120) can also determine the ROI of the single image of the tag used with the ML and DL models, and store the determined ROI of the single image as ROI data (214) in the memory (114).
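A minimal sketch of one way the ROI could be located from brightness discontinuities is given below, assuming an OpenCV-based implementation; the thresholding scheme and the helper name extract_roi are illustrative assumptions rather than the claimed method.

```python
# Illustrative sketch: take the tag ROI as the bounding rectangle of the
# largest region found after brightness-based (Otsu) thresholding.
import cv2

def extract_roi(image):
    """Return the cropped tag region and its bounding box (x, y, w, h)."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)

    # Otsu thresholding separates the tag from the background by brightness.
    _, mask = cv2.threshold(blurred, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return image, (0, 0, image.shape[1], image.shape[0])

    # The tag border is taken as the largest contour's bounding rectangle.
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    return image[y:y + h, x:x + w], (x, y, w, h)
```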
Upon determining the ROI of the retrieved images, the image processing module (120) determines differences between the ROIs of the first image and the second image, wherein basic sanity checks for image quality are applied. Upon completion of the sanity checks, the image processing module (120) geometrically resizes the first image and the second image and tunes them to the same channel, such as RGB, grey, or binary. Once both the first image and the second image are at a common level of attributes, the image processing module (120) computes a difference matrix based on differences at the pixel level or in the neighbouring level or both, based on the acquired common level of geometrical attributes. The difference matrix represents a resultant image that is computed by either subtracting a matrix of the ROI of the first image from a matrix of the ROI of the second image or by subtracting the matrix of the ROI of the second image from the matrix of the ROI of the first image. The resultant image contains the features that are available in the first image but not in the second image. In an example, there are two images of the same 3D tag: one image is well illuminated, and the other image is captured in a normal or low lighting condition; the resultant image obtained by subtracting the respective ROIs then comprises only the reflection points that are available in the well-illuminated image but not in the image captured in the normal or low lighting condition.
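The difference step described above may be sketched as follows; the ROI size and channel choice are assumptions used only to make the example self-contained.

```python
# Illustrative sketch: resize both ROIs to a common geometry, convert to the
# same grey channel, and subtract so that only flash-induced features remain.
import cv2

def difference_image(roi_flash, roi_no_flash, size=(512, 512)):
    """Resultant image = flash ROI minus no-flash ROI, pixel-wise."""
    a = cv2.cvtColor(cv2.resize(roi_flash, size), cv2.COLOR_BGR2GRAY)
    b = cv2.cvtColor(cv2.resize(roi_no_flash, size), cv2.COLOR_BGR2GRAY)

    # cv2.subtract saturates at zero, keeping only features brighter in the
    # well-illuminated image (e.g., reflection points).
    return cv2.subtract(a, b)
```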
The image processing module (120) further thresholds the difference matrix to generate a thresholded resultant image. Thresholding is a type of image segmentation in which the pixels of an image are changed to make the image easier to analyse. In an example, thresholding is performed on a grey scale image to generate a binary image. The image processing module (120) further parses the thresholded resultant image with respect to pixels or groups of pixels to determine a plurality of optical features. The plurality of optical features includes the topography of the thresholded resultant image, prominent feature points and the neighbouring region around the feature points, grey areas, geometrical relationships between features of interest, and the nature and value of such feature points. The grey areas are determined based on measured distances between reflection points. The image processing module (120) determines a count of final reflection points based on analysis of the reflection points against a set of criteria. The image processing module (120) determines the final reflection points based on the set of criteria including, for example, pixel intensity values of the prominent feature points, area of regions with reflection, radius of regions with reflection, difference between pixel intensities of prominent points for the ROIs of the first image and the second image, and boundary points for the reflection regions. The image processing module (120) stores information of the optical features and the count of final reflection points as dimension data (216) in the memory (114).
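The following sketch illustrates, under assumed numeric criteria, how the thresholding and reflection-point counting described above could be realised with connected-component analysis; the specific intensity, area, and count thresholds are illustrative and not taken from the disclosure.

```python
# Illustrative sketch: threshold the difference image, count reflection blobs
# that satisfy simple area criteria, and compare the count with a predefined
# threshold. All numeric values below are assumptions.
import cv2

def count_reflection_points(diff_img, min_intensity=200, min_area=4, max_area=400):
    _, binary = cv2.threshold(diff_img, min_intensity, 255, cv2.THRESH_BINARY)
    n, _, stats, _ = cv2.connectedComponentsWithStats(binary, connectivity=8)

    count = 0
    for label in range(1, n):                 # label 0 is the background
        area = stats[label, cv2.CC_STAT_AREA]
        if min_area <= area <= max_area:      # plausible reflection blob
            count += 1
    return count

def classify_by_reflections(diff_img, threshold_points=10):
    """Return '3D' if enough final reflection points survive, else '2D'."""
    return "3D" if count_reflection_points(diff_img) >= threshold_points else "2D"
```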
In another embodiment, the image processing module (120) retrieves the single well-illuminated image or the image of the tag with natural illumination from the imaging data (210) stored in the memory (114) and determines the ROI of the retrieved image in order to identify the tag as 3D or 2D. Upon determining the ROI of the retrieved image, the image processing module (120) determines a plurality of image features of the retrieved image by using the ML model (222). The image features include, but are not limited to, reflection points, grey areas, pattern of raised features, texture features, and other features commonly related to 3D tags. In an example, the feature extraction is performed by using one of a plurality of feature extraction techniques such as Haralick feature extraction, local binary patterns, bag of words, etc.
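As an illustration of the Haralick and local-binary-pattern extraction mentioned above, a minimal scikit-image sketch is given below; the chosen GLCM properties, distances, and LBP parameters are assumptions.

```python
# Illustrative sketch: Haralick-style GLCM properties plus a uniform LBP
# histogram computed on the grey-scale ROI (scikit-image >= 0.19 API names).
import numpy as np
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

def texture_features(gray_roi):
    """Return a 1-D feature vector for the ML classifier."""
    glcm = graycomatrix(gray_roi, distances=[1, 2], angles=[0, np.pi / 2],
                        levels=256, symmetric=True, normed=True)
    haralick = [graycoprops(glcm, prop).mean()
                for prop in ("contrast", "homogeneity", "energy", "correlation")]

    # Uniform LBP histogram captures the pattern of raised features.
    lbp = local_binary_pattern(gray_roi, P=8, R=1, method="uniform")
    hist, _ = np.histogram(lbp, bins=10, range=(0, 10), density=True)

    return np.concatenate([haralick, hist])
```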
The machine learning model is a representation of a complex computational technique of image processing. The ML model (222) is generated based on the Support Vector Machine (SVM) technique. The SVM technique comprises a set of supervised learning methods used for classification, regression, and outlier detection. The SVM technique is effective in high-dimensional spaces and in cases where the number of dimensions is greater than the number of samples.
In another embodiment, the ML model (222) is generated using the Logistic Regression technique. The Logistic Regression technique is used to solve classification problems. It is a predictive analytic technique based on the concept of probability. The Logistic Regression classification algorithm is used to predict the likelihood of a categorical dependent variable. The dependent variable in logistic regression is a binary variable with data coded as 1 or 0.
The ML model (222) is trained to determine a plurality of image features from the determined ROI of the received image by using the training data stored in the image database (104) by the data acquisition module (116). The training data includes a plurality of images of original 3D tags and 2D fake tags captured via different image capturing devices in different illuminating environments. The training data is processed in one or more phases to prepare a training dataset and a testing dataset, wherein the one or more phases include, but are not limited to, analysis of data, handling missing data, cleansing of data, deciding key factors related to the data, etc. The training set is further split into a plurality of mini-batches so that the training can be conducted in a plurality of iterations. As the ML model (222) receives each of the mini-batches as input, the ML model (222) learns from the information contained in each of the mini-batches, wherein the ML model (222) can use one of a plurality of learning mechanisms such as supervised learning, unsupervised learning, etc. The training of the ML model (222) is completed upon processing all the mini-batches. Further, the trained ML model (222) is tested with the testing dataset to ensure the desired performance of the ML model (222). Therefore, the ML model (222) receives the training data as a data feed and learns about a plurality of features of the received images. In one embodiment, upon completion of the initial training process, the ML model (222) continuously performs self-learning from the processing of one or more received images of tags in real time.
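A minimal scikit-learn training sketch consistent with the SVM and Logistic Regression options described above is shown below; the feature matrix, label coding (1 for 3D, 0 for 2D), and hyperparameters are assumptions.

```python
# Illustrative sketch: train either an SVM or a Logistic Regression classifier
# on texture features extracted from labelled genuine-3D and fake-2D tag images.
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression

def train_tag_classifier(features, labels, use_svm=True):
    """features: (n_samples, n_features) array; labels: 1 for 3D, 0 for 2D."""
    X_train, X_test, y_train, y_test = train_test_split(
        features, labels, test_size=0.2, stratify=labels, random_state=42)

    estimator = (SVC(kernel="rbf", probability=True) if use_svm
                 else LogisticRegression(max_iter=1000))
    model = make_pipeline(StandardScaler(), estimator)
    model.fit(X_train, y_train)

    print("hold-out accuracy:", model.score(X_test, y_test))
    return model
```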
In yet another embodiment, the image processing module (120) retrieves the single well-illuminated image or the image of the tag with natural illumination from imaging data (210) of the memory (114) and determines the ROI of the retrieved image in order to identify the tag as 3D or 2D. Upon determining the ROI of the retrieved image, the image processing module (120) feeds the determined ROI to the DL model (224).
The DL model (224) is a representation of a complex computational technique of image processing. The DL model (224) is generated based on the SqueezeNet architecture consisting of 26 convolution layers, 3 max pooling layers, and 2 average pooling layers. The SqueezeNet architecture is composed of "squeeze" and "expand" layers. The squeeze convolutional layer has only 1×1 filters; these are fed into an expand layer that has a mix of 1×1 and 3×3 convolution filters.
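For illustration, a SqueezeNet-style "fire" module, with a 1×1 squeeze convolution feeding parallel 1×1 and 3×3 expand convolutions, can be sketched in tf.keras as follows; the filter counts are assumptions and this is not the full 26-layer network.

```python
# Illustrative sketch of a SqueezeNet fire module (squeeze + expand layers).
from tensorflow.keras import layers

def fire_module(x, squeeze_filters=16, expand_filters=64):
    squeeze = layers.Conv2D(squeeze_filters, 1, activation="relu")(x)
    expand_1x1 = layers.Conv2D(expand_filters, 1, activation="relu")(squeeze)
    expand_3x3 = layers.Conv2D(expand_filters, 3, padding="same",
                               activation="relu")(squeeze)
    # The two expand branches are concatenated along the channel axis.
    return layers.Concatenate()([expand_1x1, expand_3x3])
```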
In another embodiment, the DL model (224) is generated using the tiny darknet architecture. The tiny darknet architecture enables the DL model to be executed on systems having low computation capacity and enables faster generation of results.
The DL model (224) is trained to classify the input tag as one of a three-dimensional (3D) tag and a two-dimensional (2D) tag based on the determined ROI of the received image by using the training data stored in the image database (104) by the data acquisition module (116). The training data includes a plurality of images of original 3D tags and 2D fake tags captured via different image capturing devices in different illuminating environments. The training data is processed in one or more phases to prepare a training dataset and a testing dataset, wherein the one or more phases include, but are not limited to, analysis of data, handling missing data, cleansing of data, deciding key factors related to the data, etc. The training set is further split into a plurality of mini-batches so that the training can be conducted in a plurality of iterations. As the DL model (224) receives each of the mini-batches as input, the DL model (224) learns from the information contained in each of the mini-batches, wherein the DL model (224) can use one of a plurality of learning mechanisms such as supervised learning, unsupervised learning, etc. The training of the DL model (224) is completed upon processing all the mini-batches. Further, the trained DL model (224) is tested with the testing dataset to ensure the desired performance of the DL model. Therefore, the DL model (224) receives the training data as a data feed and learns about a plurality of features and the classification of received images. In one embodiment, upon completion of the initial training process, the DL model (224) continuously learns by itself from the processing of one or more received images of tags in real time. The trained DL model (224) is used for detecting characteristic features of the input image from the reflective surface formed by raised edges of the input tag, wherein the features of the reflective surface of images captured from original 3D tags are different from the features of the reflective surface of images captured from flat 2D tags.
In one embodiment, the images of the training data are modified to generate augmented images by varying image brightness, contrast, blur, random oversampling, pixel swapping, colour space transforms, colour channel mixing, edge enhancement, colour jitter, daylight filter,
sharpness filter, dilation, erosion, Laplacian, histogram equalization, contrast limited adaptive histogram equalization, etc., and the DL model (224) is trained based on the augmented images. Further, sensitivity analysis is performed on the trained DL model, wherein the sensitivity analysis indicates the processing sensitivity of the DL model in detecting final reflection points. Such techniques enhance the efficiency and accuracy of the DL model (224).
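By way of example, a few of the listed augmentations (brightness/contrast jitter, blur, dilation/erosion, and contrast limited adaptive histogram equalization) are sketched below with OpenCV; the probabilities and parameter ranges are assumptions.

```python
# Illustrative augmentation sketch covering a subset of the transforms above.
import cv2
import numpy as np

def augment(image, rng=None):
    rng = rng or np.random.default_rng()
    out = image.copy()

    # Random brightness/contrast jitter.
    alpha = rng.uniform(0.8, 1.2)            # contrast factor
    beta = rng.uniform(-30, 30)              # brightness offset
    out = cv2.convertScaleAbs(out, alpha=alpha, beta=beta)

    if rng.random() < 0.5:                   # occasional Gaussian blur
        out = cv2.GaussianBlur(out, (5, 5), 0)
    if rng.random() < 0.3:                   # morphological dilation or erosion
        kernel = np.ones((3, 3), np.uint8)
        op = cv2.dilate if rng.random() < 0.5 else cv2.erode
        out = op(out, kernel, iterations=1)
    if rng.random() < 0.3:                   # CLAHE on the luminance channel
        lab = cv2.cvtColor(out, cv2.COLOR_BGR2LAB)
        lab[:, :, 0] = cv2.createCLAHE(clipLimit=2.0).apply(lab[:, :, 0])
        out = cv2.cvtColor(lab, cv2.COLOR_LAB2BGR)
    return out
```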
The dimension determination module (122) classifies the input tag as one of a 3D tag and a 2D tag based on a comparison of the count of final reflection points with a predefined reflection point threshold value. The predefined reflection point threshold value is determined based on observation of tag detection results during experiments with a plurality of images of 3D tags and 2D tags.
In one embodiment, the dimension determination module (122) classifies the input tag as one of three-dimensional (3D) tag and two-dimensional (2D) tag based on the plurality of image features by using the trained ML model.
In yet another embodiment, the dimension determination module (122) predicts a probability score for classifying the input tag as one of a 3D tag and a 2D tag based on the probability distribution of image features by using the DL model (224). The probability score indicates the likelihood of the input tag being determined as either 3D or 2D. The probability score is determined as a numerical value; in one example, the probability score can be represented as the numerical value 0.9. The dimension determination module (122) classifies the input tag as one of a 3D tag and a 2D tag based on a comparison of the predicted probability score with a predefined probability threshold score. The predefined probability threshold score includes a lower probability threshold score and an upper probability threshold score that are determined based on observation of optimum probability cut-offs for genuine 3D tag and fake 2D tag images.
In the process of determining the probability cut-offs, the dimension determination module (122) receives a plurality of testing or validation images, wherein the validation images include images of 3D tags and 2D tags, modified images of 2D tags, and images of 3D tags and 2D tags with varying illumination. Further, the dimension determination module (122) validates the DL model using the plurality of validation images and determines the optimum probability cut-offs for genuine 3D tags and fake 2D tags. The probability cut-offs include the upper probability threshold score and the lower probability threshold score.
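A minimal sketch of the two-cut-off decision described above is given below; the cut-off values and the handling of scores falling between the two cut-offs are assumptions for illustration.

```python
# Illustrative sketch: compare the DL model's 3D-probability score against
# lower and upper cut-offs determined during validation (values assumed).
def classify_by_probability(score_3d, lower=0.3, upper=0.7):
    """score_3d: predicted probability (0..1) that the input tag is 3D."""
    if score_3d >= upper:
        return "3D"
    if score_3d <= lower:
        return "2D"
    return "indeterminate"  # between cut-offs: e.g., request re-capture (assumed behaviour)
```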
Figure 4A illustrates a flowchart showing a method for detecting 3D tag in accordance with some embodiments of the present disclosure.
As illustrated in Figure 4A, the method (400) comprises one or more blocks implemented by the processor (112) to automatically detect 3D tag by using TDS (102). The method (400) may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform specific functions or implement specific abstract data types.
The order in which the method (400) is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.
At block (402), a first image and a second image of the tag are received. In one embodiment, the data acquisition module (116) receives a first image and a second image of the input tag as captured by the image capturing unit (108) at the time of scanning the tag and stores the image information as imaging data (210) in the memory (114). The image capturing unit (108) uses the light illuminating source (110), such as a flashlight, while capturing the first image so that the input tag is well-illuminated while capturing the first image. The flashlight illumination can fall on the input tag partially or fully, wherein one or more light illuminating sources can be integrated with the image capturing unit (108) or placed separately. Under the light illuminating source, the image capturing unit (108) captures one or more images of the tag for varied durations depending upon the random structure of the tag. The image capturing unit (108) captures the second image without using the light illuminating source, i.e., the tag is illuminated in naturally available light. In an example, the light ambience can be made totally or partially dark based on the optical nature of the input tag. In one embodiment, the data acquisition module (116) receives threshold information for one or more image features from a user or from external sources as input and stores the threshold information in the memory (114).
At block (404), region of interest (ROI) of the first image and the second image are determined. In one embodiment, the data acquisition module (116) further registers the captured first image and the second image. The registration process spatially transforms the captured images to appropriately align the captured images by applying plurality of image processing techniques such as image enhancement, image thresholding, feature definition, feature extraction,
homography using perspective relation techniques, etc. Upon registering the captured first image and the second image, the image processing module (120) determines the ROI (region of interest) of the registered images by using pattern matching techniques, geometrical image mapping techniques, etc. The ROI is an area of an image defined for determining the dimension of the input tag.
At block (406), the difference between the ROIs of the first image and the second image is computed to generate a resultant image. Upon determining the ROI of the registered images, the image processing module (120) determines differences between the ROIs of the first image and the second image, wherein basic sanity checks for image quality are applied. Upon completion of the sanity checks, the image processing module (120) geometrically resizes the first image and the second image and tunes them to the same channel, such as RGB, grey, or binary. Once both the first image and the second image are at a common level of attributes, the image processing module (120) computes a difference matrix based on differences at the pixel level or in the neighbouring level or both, based on the acquired common level of geometrical attributes. The difference matrix represents a resultant image that is computed by either subtracting a matrix of the ROI of the first image from a matrix of the ROI of the second image or by subtracting the matrix of the ROI of the second image from the matrix of the ROI of the first image. The resultant image contains the features that are available in the first image but not in the second image.
At block (408), a plurality of image features are determined from the resultant image. In one embodiment, the image processing module (120) further thresholds the difference matrix to generate the thresholded resultant image. Thresholding is a type of image segmentation, where the pixels of an image are changed to make the image easier to analyse. In an example, thresholding is performed on a grey scale image to generate a binary image. The image processing module (120) further parses the thresholded resultant image with respect to pixels or group of pixels to determine a plurality of optical features. The plurality of optical features include topography of the thresholded resultant image, prominent feature points and neighbouring region around the feature points, grey areas, geometrical relationship between features of interest, and nature and value of such feature points. The grey areas are determined based on measured distances between reflection points. The image processing module (120) determines count of final reflection points based on analysis of the reflection points on a set of criteria. The image processing module (120) determines the final reflection points based on the set of criteria including for example, pixel intensity value of the prominent feature points, area of regions with reflection, radius of regions with reflection, difference between pixel intensity
of prominent points for the ROIs of the first image and the second image, and boundary points for the reflection regions. The image processing module (120) stores information of the optical features and the count of final reflection points as dimension data (216) in the memory (114).
At block (410), the tag is determined as one of 3D or 2D based on the determined image features. In one embodiment, the dimension determination module (122) retrieves information related to extracted features from the memory (114) as stored by the image processing module (120) and also retrieves the threshold information for reflection points from the memory (114) as stored by the data acquisition module (116). The dimension determination module (122) classifies the input tag as one of 3D tag and 2D tag based on comparison of the count of final reflection points with a predefined threshold reflection points value.
Figure 4B illustrates a flowchart showing a method for detecting the 3D tag by using a Machine Learning (ML) model in accordance with some embodiments of the present disclosure.
As illustrated in Figure 4B, the method (430) comprises one or more blocks implemented by the processor (112) to automatically detect 3D tag by using TDS (102). The method (430) may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform specific functions or implement specific abstract data types.
The order in which the method (430) is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.
At block (432), an image of an input tag is received. In one embodiment, the data acquisition module (116) can also receive only a single image of the input tag captured by the image capturing device to be analysed by the image processing module (120) by using the ML model (222). The single image can be well-illuminated image or the image of the input tag with natural illumination. The data acquisition module (116) can further register the single image of the input tag.
In one embodiment, the data acquisition module (116) receives a plurality of images of a plurality of tags and respective dimensional information of such tags from various external
sources like public image libraries, image websites, multi-dimensional images produced by different image processing tools etc. The data acquisition module (116) further stores the received dimensional information of the plurality of tags along with the respective images in the memory (114), wherein such received data is used as training data for training the ML model (222).
At block (434), the region of interest of the received image is determined. In one embodiment, the data acquisition module (116) can further register the single image of the input tag. The image processing module (120) determines the ROI of the received single image of the input tag and stores the determined ROI of the single image as ROI data (214) in the memory (114).
At block (436), a plurality of image features are determined by using an ML model. In one embodiment, upon determining the ROI of the retrieved image, the image processing module (120) determines a plurality of image features of the retrieved image by using the ML model (222). The image features include reflection points, grey areas, pattern of raised features, texture features, and other features commonly related to 3D tags. In an example, the feature extraction is performed by using one of a plurality of feature extraction techniques such as Haralick feature extraction, local binary patterns, bag of words, etc.
The ML model (222) is a representation of a complex computational technique of image processing. The ML model (222) is generated based on the Support Vector Machine (SVM) technique. The SVM technique comprises a set of supervised learning methods used for classification, regression, and outlier detection. In one embodiment, the ML model (222) is generated using the Logistic Regression technique. The Logistic Regression technique is used to solve classification problems.
The ML model (222) is trained to determine a plurality of image features from the determined ROI of the received image by using the training data stored in the image database (104) by the data acquisition module (116). The training data includes a plurality of images of original 3D tags and 2D fake tags captured via different image capturing devices in different illuminating environments.
At block (438), the tag is classified as one of 3D or 2D based on determined image features. In one embodiment, the dimension determination module (122) classifies the input tag as one of three-dimensional (3D) tag and two-dimensional (2D) tag based on the plurality of image features by using the trained ML model (222).
In an example experiment, the SVM-based trained ML model shows 99.53% efficacy with respect to predicting genuine 3D tags and fake 2D photocopies of genuine 3D tags, as illustrated in Table 1 below.
Classes          2D (Prediction)    3D (Prediction)
2D (Actual)                  870                  1
3D (Actual)                    9               1289

Table 1
In another exemplary experiment, the Logistic Regression-based trained ML model shows 98.93% efficacy with respect to predicting genuine 3D tags and fake 2D photocopies of genuine 3D tags, as illustrated in Table 2 below.
Classes          2D (Prediction)    3D (Prediction)
2D (Actual)                  735                  6
3D (Actual)                   15               1209

Table 2
Therefore, the SVM-based ML model and the Logistic Regression-based ML model accurately identify the input tag as 3D or 2D while significantly reducing the false positive rate of tag detection.
Figure 4C illustrates a flowchart showing a method for detecting the 3D tag by using a Deep Learning (DL) model in accordance with some embodiments of the present disclosure.
As illustrated in Figure 4C, the method (450) comprises one or more blocks implemented by the processor (112) to automatically detect 3D tag by using TDS (102). The method (450) may be described in the general context of computer executable instructions. Generally, computer executable instructions can include routines, programs, objects, components, data structures, procedures, modules, and functions, which perform specific functions or implement specific abstract data types.
The order in which the method (450) is described is not intended to be construed as a limitation, and any number of the described method blocks can be combined in any order to implement the method. Additionally, individual blocks may be deleted from the methods without departing from the spirit and scope of the subject matter described herein. Furthermore, the method can be implemented in any suitable hardware, software, firmware, or combination thereof.
At block (452), an image of an input tag is received. In one embodiment, the data acquisition module (116) receives a single image of the input tag captured by the image capturing device, to be analysed by the image processing module (120) by using the DL model (224). The single image can be a well-illuminated image or an image of the input tag with natural illumination.
At block (454), region of interest (ROI) of the received image is determined. In one embodiment, the data acquisition module (116) can further register the single image of the input tag. The image processing module (120) determines the ROI of the received single image of the input tag and stores the determined ROI of the single image as ROI data (214) in the memory (114).
At block (456), the tag is classified as one of 3D or 2D based on determined ROI. In one embodiment, upon determining the ROI of the retrieved image, the image processing module (120) feeds the determined ROI to the DL model (224).
The DL model (224) is a representation of a complex computational technique of image processing. In one embodiment, the DL model (224) is based on the Squeeze Net architecture consisting of 26 convolution layers, 3 max pooling layers and 2 average pooling layers. In another embodiment, the DL model (224) is based on a customized tiny darknet architecture obtained by adding a batch normalization layer after each convolution layer except the last one. There are a total of 12 convolutional layers with 2 dense layers at the end. The tiny darknet architecture enables the DL model to be executed on a system having low computation capacity and enables faster generation of results.
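For purposes of illustration only, the following is a rough sketch of a small darknet-style binary classifier with a batch normalization layer after every convolution layer except the last one, followed by two dense layers, written with the Keras API. The filter counts, input size, and pooling schedule are assumptions chosen for the example and do not reproduce the exact architecture of the DL model (224).

```python
# Rough sketch, not the exact disclosed architecture: a small darknet-style
# 2D-vs-3D tag classifier with batch normalisation after every convolution
# except the last, followed by two dense layers. Filter counts and input size
# are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, models

def build_tiny_darknet_like(input_shape=(224, 224, 3), n_conv=12):
    inputs = layers.Input(shape=input_shape)
    x = inputs
    filters = 16
    for i in range(n_conv):
        last = (i == n_conv - 1)
        x = layers.Conv2D(filters, 3, padding="same")(x)
        if not last:
            x = layers.BatchNormalization()(x)    # BN after all but the last conv
        x = layers.LeakyReLU(0.1)(x)
        if i % 2 == 1 and filters < 256:          # periodically downsample and widen
            x = layers.MaxPooling2D()(x)
            filters *= 2
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(128, activation="relu")(x)   # first dense layer
    outputs = layers.Dense(2, activation="softmax")(x)  # probabilities for 2D and 3D
    model = models.Model(inputs, outputs)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```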
The DL model (224) is trained to classify input tag as one of three-dimensional (3D) tag and two-dimensional (2D) tag based on the determined ROI of the received image by using the training data as stored in the image database (106) by the data acquisition module (116).
The dimension determination module (122) further predicts a probability score of classifying the input tag as one of a 3D tag and a 2D tag based on probability distribution of image features by using the DL model (224). The dimension determination module (122) classifies the input tag as one of 3D tag and 2D tag based on a comparison of the predicted probability score with a predefined probability threshold score. The predefined probability threshold score includes a lower probability threshold score and an upper probability threshold score that are determined based on observation of optimum probability cut-offs for genuine 3D tag and fake 2D tag images.
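For purposes of illustration only, the comparison of the predicted probability score with the lower and upper probability threshold scores may be sketched as follows. The cut-off values and the handling of scores falling between the two cut-offs are assumptions chosen for the example; the disclosure derives the cut-offs from validation rather than fixing them.

```python
# Illustrative sketch of the threshold comparison described above. The cut-off
# values are placeholders, and the "undetermined" branch for scores between the
# two cut-offs is an assumption of this example, not stated in the disclosure.
def classify_by_probability(p_3d, lower_cutoff=0.3, upper_cutoff=0.7):
    """p_3d: probability predicted by the DL model that the input tag is 3D."""
    if p_3d >= upper_cutoff:
        return "3D"          # confidently a genuine 3D tag
    if p_3d <= lower_cutoff:
        return "2D"          # confidently a fake / flat 2D tag
    return "undetermined"    # between the cut-offs: flag for further review
```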
In another example experiment, the special features of the genuine 3D tags are shown to be robust against copying via a simple 2D photocopy, by specially training the customized Tiny Darknet based DL model to distinguish between images of the genuine 3D tags and the respective fake 2D photocopies. The DL model is specially trained to capture the special features of the 3D tag and shows 99.96% efficacy with respect to predicting genuine 3D tags as genuine and fake 2D photocopies of the genuine 3D tags as fake, as illustrated in Table 3 below. The ability of the DL model to distinguish between genuine and fake tags demonstrates that the genuine 3D tags contain special features which cannot be re-created by a simple 2D photocopy.
Classes 2D (Prediction) 3D (Prediction)
2D 7665 2
3D 0 6476
Table 3
In another exemplary experiment, the Squeeze Net based DL model, specially trained to capture the special features of the genuine 3D tag, shows a high efficacy of 99.88% with respect to detecting different types of fake 2D tags printed via different printing methods such as Dot-matrix, Flexography, Laser-jet, Deskjet, Inkjet, Embossing, Coating, etc., while the specially trained customized Tiny-Darknet based DL model failed to distinguish between the genuine 3D tags and these types of fake 2D tags with high efficacy, as illustrated in Table 4. This demonstrates that the special features embedded in the genuine 3D tags cannot be copied even by sophisticated printing methods, and that the TDS (102) is able to detect the genuine 3D tags separately from the fake 2D tags copied by sophisticated printing methods.
Tag Type True Predictions False Predictions
Flexography - 2D 2374 3
Inkjet-2D 437 1
Photocopy - 2D 4305 2
Genuine - 3D 4213 7
Table 4
In yet another exemplary experiment, the sensitivity of the Squeeze Net based DL model towards synthetic augmentations is observed in order to show that the special features of the genuine 3D tags cannot be created synthetically. Different synthetic augmentations are applied to the images of the fake 2D tags by varying image brightness, contrast, blur, random oversampling, pixel swapping, colour space transforms, colour channel mixing, edge enhancement, colour jitter, daylight filter, sharpness filter, dilation, erosion, Laplacian, histogram equalization, contrast limited adaptive histogram equalization, etc., to add synthetic features to the fake 2D tags.
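For purposes of illustration only, a few of the listed synthetic augmentations (brightness, contrast, blur, contrast limited adaptive histogram equalization, and dilation) may be sketched with the OpenCV library as follows. The parameter values are assumptions chosen for the example and are not prescribed by the present disclosure.

```python
# Sketch of a few of the synthetic augmentations listed above, using OpenCV.
# Parameter values are illustrative assumptions; the disclosure does not fix them.
import cv2
import numpy as np

def augment_variants(img_bgr):
    """Return a dict of synthetically augmented copies of a fake-2D-tag image."""
    out = {}
    out["brightness"] = cv2.convertScaleAbs(img_bgr, alpha=1.0, beta=40)      # brighten
    out["contrast"]   = cv2.convertScaleAbs(img_bgr, alpha=1.4, beta=0)       # boost contrast
    out["blur"]       = cv2.GaussianBlur(img_bgr, (5, 5), 0)                  # Gaussian blur
    gray = cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY)
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    out["clahe"]      = cv2.cvtColor(clahe.apply(gray), cv2.COLOR_GRAY2BGR)   # CLAHE
    kernel = np.ones((3, 3), np.uint8)
    out["dilation"]   = cv2.dilate(img_bgr, kernel, iterations=1)             # dilation
    return out
```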
The augmented images of the fake 2D tags are used to train the Squeeze Net based DL model, and the resulting DL model is found to show only a small sensitivity to the synthetic augmentations, as illustrated in Table 5, implying that the DL model considered is already a robust one. The results in Table 5 show the effect of the different synthetic image augmentation techniques applied on the images of the fake 2D tags, in terms of how the probability scores predicted by the DL model change for these images when the synthetic augmentations are added. As illustrated in Table 5, the probability scores predicted by the DL model used for tag detection do not change significantly for the same image of a fake 2D tag after the synthetic augmentations are added. This shows that the DL model trained on the special features of the genuine 3D tags can accurately and efficiently identify an image with synthetic augmentations as a fake tag, and that the DL based TDS (102) is able to detect the genuine 3D tags separately from the fake 2D tags even with added synthetic features.
Augmentation Type Original Image Prediction Augmented Image Prediction
Brightness 2D - Probability = 0.85 2D - Probability = 0.79
Image Enhancement 2D - Probability = 0.91 2D - Probability = 0.88
Colour Jitter 2D - Probability = 0.82 2D - Probability = 0.81
Table 5
Therefore, the Tiny Darknet based DL model and the Squeeze Net based DL model accurately identify the input tag as 3D or 2D while significantly reducing the false positive probability of tag detection.
Figure 5 illustrates a block diagram of an exemplary computer system for implementing embodiments consistent with the present disclosure.
In an embodiment, the computer system (500) may be the 3D tag detection system (102), which is used for identifying a 3D tag based upon real-time requirements. The computer system (500) may include a central processing unit ("CPU" or "processor") (508). The processor (508) may comprise at least one data processor for executing program components for executing user- or system-generated business processes. The processor (508) may include specialized processing units such as integrated system (bus) controllers, memory management control units, floating point units, graphics processing units, digital signal processing units, etc.
The processor (508) may be disposed in communication with one or more input/output (I/O)
devices (502 and 504) via I/O interface (506). The I/O interface (506) may employ communication protocols/methods such as, without limitation, audio, analog, digital, stereo, IEEE-1394, serial bus, Universal Serial Bus (USB), infrared, PS/2, BNC, coaxial, component, composite, Digital Visual Interface (DVI), high-definition multimedia interface (HDMI), Radio Frequency (RF) antennas, S-Video, Video Graphics Array (VGA), IEEE 802.11a/b/g/n/x, Bluetooth, cellular (e.g., Code-Division Multiple Access (CDMA), High-Speed Packet Access (HSPA+), Global System For Mobile Communications (GSM), Long-Term Evolution (LTE) or the like), etc.
Using the I/O interface (506), the computer system (500) may communicate with one or more I/O devices (502 and 504). In some implementations, the processor (508) may be disposed in communication with a communication network (110) via a network interface (510). The network interface (510) may employ connection protocols including, without limitation, direct connect, Ethernet (e.g., twisted pair 10/100/1000 Base T), Transmission Control Protocol/Internet Protocol (TCP/IP), token ring, IEEE 802.11a/b/g/n/x, etc. Using the network interface (510) and the communication network (110), the computer system (500) may be connected to the image database (106), and the TDS (102).
The communication network (110) can be implemented as one of several types of networks, such as an intranet or any such wireless network. The communication network (110) may either be a dedicated network or a shared network, which represents an association of several types of networks that use a variety of protocols, for example, Hypertext Transfer Protocol (HTTP), Transmission Control Protocol/Internet Protocol (TCP/IP), Wireless Application Protocol (WAP), etc., to communicate with each other. Further, the communication network (110) may include a variety of network devices, including routers, bridges, servers, computing devices, storage devices, etc.
In some embodiments, the processor (508) may be disposed in communication with a memory (530) e.g., RAM (514), and ROM (516), etc. as shown in Figure 5, via a storage interface (512). The storage interface (512) may connect to memory (530) including, without limitation, memory drives, removable disc drives, etc., employing connection protocols such as Serial Advanced Technology Attachment (SATA), Integrated Drive Electronics (IDE), IEEE-1394, Universal Serial Bus (USB), fiber channel, Small Computer Systems Interface (SCSI), etc. The memory drives may further include a drum, magnetic disc drive, magneto-optical drive, optical drive, Redundant Array of Independent Discs (RAID), solid-state memory devices, solid-state drives, etc.
The memory (530) may store a collection of program or database components, including, without limitation, user/application (518), an operating system (528), a web browser (524), a mail client (520), a mail server (522), a user interface (526), and the like. In some embodiments, computer system (500) may store user/application data (518), such as the data, variables, records, etc. as described in this invention. Such databases may be implemented as fault-tolerant, relational, scalable, secure databases such as Oracle or Sybase.
The operating system (528) may facilitate resource management and operation of the computer system (500). Examples of operating systems include, without limitation, Apple Macintosh™ OS X ™, UNIX ™, Unix-like system distributions (e.g., Berkeley Software Distribution (BSD), FreeBSD ™, Net BSD ™, Open BSD ™, etc.), Linux distributions (e.g., Red Hat™, Ubuntu™, K-Ubuntu™, etc.), International Business Machines (IBM™) OS/2™, Microsoft Windows ™ (XP ™, Vista/7/8, etc.), Apple iOS ™, Google Android ™, Blackberry ™ Operating System (OS), or the like. A user interface may facilitate display, execution, interaction, manipulation, or operation of program components through textual or graphical facilities. For example, user interfaces may provide computer interaction interface elements on a display system operatively connected to the computer system (500), such as cursors, icons, check boxes, menus, windows, widgets, etc. Graphical User Interfaces (GUIs) may be employed, including, without limitation, Apple ™ Macintosh ™ operating systems' Aqua ™, IBM™ OS/2™, Microsoft™ Windows™ (e.g., Aero, Metro, etc.), Unix X-Windows™, web interface libraries (e.g., ActiveX, Java, JavaScript, AJAX, HTML, Adobe Flash, etc.), or the like.
The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope and spirit of the disclosed embodiments. Also, the words "comprising," "having," "containing," and "including," and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items or meant to
be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural references unless the context clearly dictates otherwise.
Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term "computer-readable medium" should be understood to include tangible items and exclude carrier waves and transient signals, i.e., are non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, non-volatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
Finally, the language used in the specification has been principally selected for readability and instructional purposes, and it may not have been selected to delineate or circumscribe the inventive subject matter. Accordingly, the disclosure of the embodiments of the disclosure is intended to be illustrative, but not limiting, of the scope of the disclosure.
With respect to the use of substantially any plural and/or singular terms herein, those having skill in the art can translate from the plural to the singular and/or from the singular to the plural as is appropriate to the context and/or application. The various singular/plural permutations may be expressly set forth herein for sake of clarity.
We Claim:
1. A method of detecting a 3D tag, the method comprising:
receiving a first image and a second image of an input tag captured by an image capturing device, wherein the first image is captured in a well-illuminated environment and the second image is captured in normal light without using illuminating source;
determining region of interest (ROI) from the first image and the second image thus captured;
computing differences between the ROIs of the first image and the second image to generate a resultant image;
determining a plurality of image features from the resultant image, wherein the image features include reflection points, grey areas, shape of raised edges, gradient in pixel values across the raised edges; and
determining the input tag as one of three-dimensional (3D) tag and two-dimensional (2D) tag based on classification of the plurality of image features.
2. The method as claimed in claim 1, wherein determining region of interest (ROI) includes registering the first image and the second image for spatially transforming the captured images for appropriate alignment.
3. The method as claimed in claim 1, wherein computing differences between the ROIs comprises the steps of:
geometrically resizing the captured ROIs of the first image and the second image based on quality of the captured images and tuning in same channel as one of RGB or Grey or binary; and
computing a difference matrix based on differences at pixel level or in neighboring level or both upon resizing and tuning the ROIs of the first image and the second image to acquire common level of geometrical attributes, wherein the difference matrix represents the resultant image that is computed by either subtracting a matrix of the ROI of the first image from a matrix of ROI of the second image or by subtracting the matrix of ROI of the second image from the matrix of ROI of the first image.
4. The method as claimed in claim 1, wherein determining the plurality of image features comprises the steps of:
thresholding the difference matrix to generate the thresholded resultant image;
parsing the thresholded resultant image with respect to pixels or group of pixels to determine a plurality of optical features, wherein the plurality of optical features
include topography of the thresholded resultant image, prominent feature points and neighboring region around the feature points, grey areas, geometrical relationship between features of interest, and nature and value of such feature points, wherein grey areas are determined based on measured distances between reflection points;
determining count of final reflection points based on analysis of the reflection points on a set of criteria, wherein the final reflection points are determined based on pixel intensity value of the prominent feature points, area of regions with reflection, radius of regions with reflection, difference between pixel intensity of prominent points for the ROIs of the first image and the second image and boundary points for the reflection regions; and
classifying the input tag as one of 3D tag and 2D tag based on comparison of the count of final reflection points with a predefined threshold reflection points value.
5. The method as claimed in claim 1, wherein the 3D tag comprises a random structure with thickness ranging from 50 to 500 microns, wherein the random structure of the 3D tag creates one or more 3D features including reflection spots and thin areas at contour edges upon illumination of light.
6. A method of detecting a 3D tag, the method comprising:
receiving an image of an input tag captured by an image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source;
determining region of interest (ROI) of the captured image;
determining a plurality of image textural features related to the 3D tags from the determined ROI of the received image by using a Machine Learning (ML) model, the textural features are extracted from reflection points, grey areas along the edges, pattern of raised feature, and other features commonly related to 3D tags; and
classifying the input tag as one of three-dimensional (3D) tag and two-dimensional (2D) tag based on the plurality of image features by using the ML model.
7. The method as claimed in claim 6, wherein the ML model is generated based on Support Vector Machine technique.
8. The method as claimed in claim 6, wherein the ML model is generated using Logistic Regression technique.
9. The method as claimed in claim 6, further comprising training the ML model with training data that includes a plurality of images of original 3D tags and 2D fake tags captured via different image capturing devices in different illuminating environment.
10. A method of detecting a 3D tag, the method comprising:
receiving an image of an input tag captured by an image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source;
determining region of interest (ROI) of the captured image; and classifying the input tag as one of three-dimensional (3D) tag and two-dimensional (2D) tag based on the plurality of image features by using a Deep Learning (DL) model.
11. The method as claimed in claim 10, wherein classifying the input tag comprises the
steps of:
predicting a probability score of classifying the input tag as one of a 3D tag or a 2D tag based on probability distribution of image features;
classifying the input tag as one of 3D tag or 2D tag based on a comparison of the predicted probability score with a predefined probability threshold score, wherein the predefined probability threshold score includes a lower probability threshold score and an upper probability threshold score that are determined based on observation of optimum probability cut-offs for genuine 3D tag and fake 2D tag images.
12. The method as claimed in claim 11, wherein determining the probability cut-offs
comprises the steps of:
receiving a plurality of validation images including images of 3D tags and 2D
tags, modified images of 2D, images of 3D tags and 2D tags with varying illumination;
validating the DL model by using the plurality of validation images; and
determining the optimum probability cut-offs for genuine 3D tag and fake 2D
tag, including the upper probability threshold score and the lower probability threshold
score based on the validation of the DL model.
13. The method as claimed in claim 10, further comprising generating the DL model based on Squeeze Net architecture consisting of 26 convolution layers, 3 max pooling layers and 2 average pooling layers.
14. The method as claimed in claim 13, further comprising training the DL model with training data that includes a plurality of images of original 3D tags and 2D fake tags
captured via different image capturing devices in different illuminating environment, wherein the DL model is used for detecting characteristic features of the input image from reflective surface formed by raised edges of the input tag.
15. The method as claimed in claim 14, wherein the features of the reflective surface of images captured from original 3D tags are different from features of the reflective surface of images captured from flat 2D tags.
16. The method as claimed in claim 10, further comprising generating the DL model using tiny darknet architecture and is trained with training data, wherein the training data includes a plurality of images of original 3D tags and 2D fake tags captured via different image capturing devices in different illuminating environment.
17. The method as claimed in claim 16, wherein generating the DL model further comprises the steps of:
augmenting the images of training data by varying image brightness, contrast, blur, random oversampling, pixel swapping, colour space transforms, colour channel mixing, edge enhancement, colour jitter, daylight filter, sharpness filter, dilation, erosion, Laplacian, histogram equalization, contrast limited adaptive histogram equalization;
training the DL model based on the augmented images; and performing sensitivity analysis of the DL model, wherein the sensitivity analysis indicates processing sensitivity of the DL model to detect final reflection points.
18. A system for detecting a 3D tag, the system comprising:
an image capturing device;
a light illuminating source configured to enable a well-illuminated environment;
a processor coupled with the image capturing device and the light illuminating
source; and
a memory communicatively coupled with the processor, wherein the memory
stores processor-executable instructions, which on execution, cause the processor to: receive a first image and a second image of an input tag captured by the image capturing device, wherein the first image is captured in the well-illuminated environment and the second image is captured in normal light without using the light illuminating source;
determine region of interest (ROI) from the first image and the second image thus captured;
compute differences between the ROIs of the first image and the second image to generate a resultant image;
determine a plurality of image features from the resultant image, wherein the image features include reflection points, grey areas, shape of raised edges, gradient in pixel values across the raised edges; and
determine the input tag as one of three-dimensional (3D) tag or two-dimensional (2D) tag based on classification of the plurality of image features.
19. The system as claimed in claim 18, wherein the processor is configured to determine region of interest (ROI) by registering the first image and the second image for spatially transforming the captured images for appropriate alignment.
20. The system as claimed in claim 18, wherein the processor is configured to compute differences between the ROIs by:
geometrically resizing the captured ROIs of the first image and the second image based on quality of the captured images and tuning in same channel as one of RGB or Grey or binary; and
computing a difference matrix based on differences at pixel level or in neighbouring level or both upon resizing and tuning the ROIs of the first image and the second image to acquire common level of geometrical attributes, wherein the difference matrix represents the resultant image that is computed by either subtracting a matrix of the ROI of the first image from a matrix of ROI of the second image or by subtracting the matrix of ROI of the second image from the matrix of ROI of the first image.
21. The system as claimed in claim 18, wherein the processor is configured to determine
the plurality of image features by:
thresholding the difference matrix to generate the thresholded resultant image;
parsing the thresholded resultant image with respect to pixels or group of pixels to determine a plurality of optical features, wherein the plurality of optical features include topography of the thresholded resultant image, prominent feature points and neighbouring region around the feature points, grey areas, geometrical relationship between features of interest, and nature and value of such feature points, wherein grey areas are determined based on measured distances between reflection points;
determining count of final reflection points based on analysis of the reflection points on a set of criteria, wherein the set of criteria includes pixel intensity value of the prominent feature points, area of regions with reflection, radius of regions with reflection, difference between pixel intensity of prominent points for the ROIs of the first image and the second image and boundary points for the reflection regions; and
classifying the input tag as one of 3D tag and 2D tag based on comparison of the count of final reflection points with a predefined threshold reflection points value.
22. The system as claimed in claim 18, wherein the 3D tag comprises a random structure with thickness ranging from 50 to 500 microns, wherein the random structure of the 3D tag creates one or more 3D features including reflection spots and thin areas at contour edges upon illumination of light.
23. A system for detecting a 3D tag, the system comprising: an image capturing device;
a processor coupled with the image capturing device; and a memory communicatively coupled with the processor, wherein the memory stores processor-executable instructions, which on execution, cause the processor to: receive an image of an input tag captured by the image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source;
determine region of interest (ROI) of the captured image; determine a plurality of image textural features related to the 3D tags from the determined ROI of the received image by using a Machine Learning (ML) model, the textural features are extracted from reflection points, grey areas along the edges, pattern of raised feature, and other features commonly related to 3D tags; and
classify the input tag as one of three-dimensional (3D) tag and two-dimensional (2D) tag based on the plurality of image features by using the ML model.
24. The system as claimed in claim 23, wherein the processor is configured to generate the ML model based on Support Vector Machine technique.
25. The system as claimed in claim 23, wherein the processor is configured to generate the ML model using Logistic Regression technique.
26. The system as claimed in claim 23, wherein the processor is further configured to train the ML model with training data that includes a plurality of images of original 3D tags and 2D fake tags captured via different image capturing devices in different illuminating environment.
27. A system for detecting a 3D tag, the system comprising: an image capturing device;
a processor coupled with the image capturing device; and a memory communicatively coupled with the processor, wherein the memory stores processor-executable instructions, which on execution, cause the processor to: receive an image of an input tag captured by the image capturing device, wherein the image is captured either in a well-illuminated environment or in normal light without using an illuminating source;
determine region of interest (ROI) of the captured image; and classify the input tag as one of three-dimensional (3D) tag and two-dimensional (2D) tag based on the plurality of image features by using a Deep Learning (DL) model.
28. The system as claimed in claim 27, wherein the processor is configured to classify the input tag by:
predicting a probability score of classifying the input tag as one of a 3D tag or a 2D tag based on probability distribution of image features;
classifying the input tag as one of 3D tag or 2D tag based on a comparison of the predicted probability score with a predefined probability threshold score, wherein the predefined probability threshold score includes a lower probability threshold score and an upper probability threshold score that are determined based on observation of optimum probability cut-offs for genuine 3D tag and fake 2D tag images.
29. The system as claimed in claim 28, wherein the processor is configured to determine the probability cut-offs by:
receiving a plurality of validation images including images of 3D tags and 2D tags, modified images of 2D, images of 3D tags and 2D tags with varying illumination;
validating the DL model by using the plurality of validation images; and
determining the optimum probability cut-offs for genuine 3D tag and fake 2D tag, including the upper probability threshold score and the lower probability threshold score based on the validation of the DL model.
30. The system as claimed in claim 27, wherein the processor is further configured to generate the DL model based on Squeeze Net architecture consisting of 26 convolution layers, 3 max pooling layers and 2 average pooling layers.
31. The system as claimed in claim 30, wherein the processor is further configured to train the DL model with training data that includes a plurality of images of original 3D tags and 2D fake tags captured via different image capturing devices in different illuminating environment, wherein the DL model is used for detecting characteristic features of the input image from reflective surface formed by raised edges of the input tag.
32. The system as claimed in claim 31, wherein the features of the reflective surface of images captured from original 3D tags are different from features of the reflective surface of images captured from flat 2D tags.
33. The system as claimed in claim 27, wherein the processor is further configured to generate the DL model using tiny darknet architecture and is trained with training data, wherein the training data includes a plurality of images of original 3D tags and 2D fake tags captured via different image capturing devices in different illuminating environment.
34. The system as claimed in claim 33, wherein the processor is configured to generate the DL model by:
augmenting the images of training data by varying image brightness, contrast, blur, random oversampling, pixel swapping, colour space transforms, colour channel mixing, edge enhancement, colour jitter, daylight filter, sharpness filter, dilation, erosion, Laplacian, histogram equalization, contrast limited adaptive histogram equalization;
training the DL model based on the augmented images; and performing sensitivity analysis of the DL model, wherein the sensitivity analysis indicates processing sensitivity of the DL model to detect final reflection points.