
Systems And Methods For Offline Signature Identification And Extraction

Abstract: Conventional solutions for signature verification use neural networks and deep learning algorithms to match and verify signatures from documents, thereby requiring a huge database of signatures already available to search the signature from the document and then to verify authenticity of the signature. Availability of signatures is a concern in many business processes. Secondly, the approaches are computationally intense. The present disclosure uses a heuristic based Principal Component Analysis (PCA) approach to extract only relevant portions of the document. Subsequently the extracted signature portions are matched against legitimate signatures obtained from the users using a randomization approach. The unsupervised method of the present disclosure can be applied for identifying and extracting signatures of any size with less computation effort and without the need for a huge training database of signatures. [To be published with FIG. 2]


Patent Information

Application #:
Filing Date: 22 May 2020
Publication Number: 48/2021
Publication Type: INA
Invention Field: ELECTRONICS
Status:
Email: kcopatents@khaitanco.com
Parent Application:
Patent Number:
Legal Status:
Grant Date: 2024-11-25
Renewal Date:

Applicants

Tata Consultancy Services Limited
Nirmal Building, 9th Floor, Nariman Point Mumbai Maharashtra India 400021

Inventors

1. KOLANDAI SWAMY, Antony Arokia Durai Raj
Tata Consultancy Services Limited Unit-VIII & IX, Think Campus, KIADB Industrial Estate, Electronic City, Phase-II, Bangalore Karnataka India 560100
2. MANDAL, Indrajit
Tata Consultancy Services Limited Unit-VIII & IX, Think Campus, KIADB Industrial Estate, Electronic City, Phase-II, Bangalore Karnataka India 560100

Specification

FORM 2
THE PATENTS ACT, 1970
(39 of 1970)
&
THE PATENT RULES, 2003
COMPLETE SPECIFICATION (See Section 10 and Rule 13)
Title of invention:
SYSTEMS AND METHODS FOR OFFLINE SIGNATURE IDENTIFICATION AND EXTRACTION
Applicant
Tata Consultancy Services Limited A company Incorporated in India under the Companies Act, 1956
Having address:
Nirmal Building, 9th floor,
Nariman point, Mumbai 400021,
Maharashtra, India
Preamble to the description
The following specification particularly describes the invention and the manner in which it is to be performed.

TECHNICAL FIELD
[001] The disclosure herein generally relates to the field of image processing, and, more particularly, to systems and methods for identification and extraction of signatures from an image.
BACKGROUND
[002] Authenticating or verifying signatures in documents forms a critical part of the Business Process Services (BPS) industry, in domains such as banking, legal and government processes. The state of the art in signature verification mainly uses machine learning, neural network and deep learning algorithms to match and verify signatures from documents. Though the output from deep learning is reasonably accurate, it requires the signatures to be already available in a database, both to search for the signature in the document and to verify the authenticity of the signature. The existing methods require a huge database of signatures to train machine learning models to achieve reasonable accuracy. Hence, it takes significant computational effort to learn the model from a huge database of signatures and to score the signatures extracted from documents.
[003] Conventional signature verification methods also assume that the signatures are already extracted and available; however, in several business processes this may not always be the case. In some cases, the signatures need to be extracted from the document before they can be processed. Most of the existing solutions address signature verification as a classification problem. However, this approach may not work reliably when there is a large database of signatures.
SUMMARY

[004] Embodiments of the present disclosure present technological improvements as solutions to one or more of the above-mentioned technical problems recognized by the inventors in conventional systems.
[005] In an aspect, there is provided a processor implemented method comprising the steps of: applying, via one or more hardware processors, Principal Component Analysis (PCA) on a pixel matrix of an image of a document to be processed for identifying and extracting one or more signatures contained therein, to obtain a first principal component; identifying, via the one or more hardware processors, a set of candidate signatures using a heuristic method, wherein the candidate signatures correspond to at least one Region Of Interest (ROI) in the obtained first principal component based on a change in pattern therein; applying, via the one or more hardware processors, the PCA on the pixel matrix corresponding to each of the candidate signatures to obtain a second principal component associated therewith; and identifying, via the one or more hardware processors, a set of extracted signatures using the heuristic method, wherein the extracted signatures correspond to at least one ROI in the obtained second principal component based on a change in pattern in the obtained second principal component corresponding to each of the candidate signatures.
[006] In another aspect, there is provided a system comprising: one or more data storage devices operatively coupled to one or more hardware processors and configured to store instructions configured for execution via the one or more hardware processors to: apply, via one or more hardware processors, Principal Component Analysis (PCA) on a pixel matrix of an image of a document to be processed for identifying and extracting one or more signatures contained therein, to obtain a first principal component; identify, via the one or more hardware processors, a set of candidate signatures using a heuristic method, wherein the candidate signatures correspond to at least one Region Of Interest (ROI) in the obtained first principal component based on a change in pattern therein; apply, via the one or more hardware processors, the PCA on the pixel matrix corresponding to each of the candidate signatures to obtain a second principal component associated therewith; and identify, via the one or more hardware processors, a set of extracted signatures using the heuristic method, wherein the extracted signatures correspond to at least one ROI in the obtained second principal component based on a change in pattern in the obtained second principal component corresponding to each of the candidate signatures.
[007] In yet another aspect, there is provided a computer program product comprising a non-transitory computer readable medium having a computer readable program embodied therein, wherein the computer readable program, when executed on a computing device, causes the computing device to: apply, via one or more hardware processors, Principal Component Analysis (PCA) on a pixel matrix of an image of a document to be processed for identifying and extracting one or more signatures contained therein, to obtain a first principal component; identify, via the one or more hardware processors, a set of candidate signatures using a heuristic method, wherein the candidate signatures correspond to at least one Region Of Interest (ROI) in the obtained first principal component based on a change in pattern therein; apply, via the one or more hardware processors, the PCA on the pixel matrix corresponding to each of the candidate signatures to obtain a second principal component associated therewith; and identify, via the one or more hardware processors, a set of extracted signatures using the heuristic method, wherein the extracted signatures correspond to at least one ROI in the obtained second principal component based on a change in pattern in the obtained second principal component corresponding to each of the candidate signatures.
[008] In accordance with an embodiment of the present disclosure, the one or more hardware processors are configured to receive the image of the document to be processed; and process the received image to obtain the pixel matrix of the image of the document prior to applying PCA on the pixel matrix of the image of the document.
[009] In accordance with an embodiment of the present disclosure, the one or more hardware processors are configured to resize the received image to an empirically determined size prior to processing the received image.
[010] In accordance with an embodiment of the present disclosure, the one or more hardware processors are configured to perform the heuristic method comprising: computing mean, mode, variance and standard deviation on the pixel matrix of the image of the document; identifying regions with blank area using the computed mode and an empirically determined threshold to account for the blank area within the one or more signatures and in the image of the document; and eliminating regions with photographs using the computed standard deviation and variance and empirically determined values for the standard deviation and variance.
[011] In accordance with an embodiment of the present disclosure, the one or more hardware processors are configured to optimize the identified at least one ROI corresponding to the candidate signatures and having a maximum coordinate and a minimum coordinate associated with a left, right, top and bottom edge by reducing the maximum coordinate associated with each of the left, right, top and bottom edge iteratively by a unit value till an estimated heuristic value is reached to obtain optimum coordinates associated thereof, such that the optimum coordinates are greater than the minimum coordinate.
[012] In accordance with an embodiment of the present disclosure, the one or more hardware processors are configured to authenticate the extracted signatures by: obtaining legitimate signatures associated with each user from a database, wherein the legitimate signatures are resized to an empirically determined length and breadth; for each of the obtained legitimate signatures, generating training samples corresponding to the legitimate signatures by: randomly sampling a pixel of an image of the obtained legitimate signatures; perturbing a value of pixels in a neighborhood of the sampled pixel to a minimum extent; and storing signatures associated with the perturbed value of pixels as the training samples of the legitimate signatures, such that the training samples of the legitimate signatures account for variation of the obtained legitimate signatures; for each of the obtained legitimate signatures, generating training samples of illegitimate signatures by: randomly sampling one to three pixels of an image of the obtained legitimate signatures; perturbing the pixels that are in a neighborhood of the sampled one to three pixels to a maximum extent; and storing signatures associated with the perturbed value of the sampled one to three pixels as the training samples of the illegitimate signatures of the obtained legitimate signatures, such that the training samples of the illegitimate signatures account for variations thereof; training an ensemble classifier comprising a combination of machine learning and deep learning models using the generated training samples of the legitimate signatures and the illegitimate signatures; obtaining a first score and a second score for the extracted signatures using the trained ensemble classifier, wherein the first score is indicative of a closeness of the extracted signatures to the legitimate signatures associated with each of the obtained legitimate signatures and the second score is indicative of a closeness of the extracted signatures to the illegitimate signatures associated with each of the obtained legitimate signatures; and authenticating the extracted signatures based on the obtained first score, the second score and an estimated threshold value.
[013] In accordance with an embodiment of the present disclosure, the one or more hardware processors are configured to preprocess the extracted signatures to eliminate noise in the form of printed text and lines, prior to authenticating the extracted signatures, by: dividing pixels associated with the extracted signatures into a plurality of sub-regions; identifying lines and printed text in the extracted signatures by assessing values of the pixels in each of the plurality of sub-regions, wherein identical linear pixel values within and across the plurality of sub-regions are indicative of lines and pixel values in an empirically determined range for printed text are indicative of printed text in the plurality of sub-regions; and replacing the pixel values of the pixels associated with the identified lines and printed text with 255 to eliminate the noise.
[014] It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
BRIEF DESCRIPTION OF THE DRAWINGS
[015] The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate exemplary embodiments and, together with the description, serve to explain the disclosed principles:

[016] FIG.1 illustrates an exemplary block diagram of a system for offline signature identification and extraction, in accordance with some embodiments of the present disclosure.
[017] FIG.2 illustrates an exemplary flow diagram of a computer implemented method for offline signature identification and extraction, in accordance with some embodiments of the present disclosure.
[018] FIG.3 illustrates an image of a scanned document to be processed, in accordance with some embodiments of the present disclosure.
[019] FIG.4 illustrates a plot of a first principal component corresponding to the image of FIG.3, in accordance with some embodiments of the present disclosure.
[020] FIG.5A through FIG.5C illustrate a first Region Of Interest (ROI), a second ROI and a third ROI, respectively identified in the first principal component of FIG.4, in accordance with some embodiments of the present disclosure.
[021] FIG.6A through FIG.6C illustrate a plot of a second principal component corresponding to the first ROI, the second ROI and the third ROI of FIG.5A through FIG.5C respectively, in accordance with some embodiments of the present disclosure.
[022] FIG.7A illustrates an extracted signature from the first ROI, FIG.7B and FIG.7C illustrate extracted signatures from the second ROI and FIG.7D illustrates the extracted signature from the third ROI respectively, in accordance with some embodiments of the present disclosure.
DETAILED DESCRIPTION OF EMBODIMENTS
[023] Exemplary embodiments are described with reference to the accompanying drawings. In the figures, the left-most digit(s) of a reference number identifies the figure in which the reference number first appears. Wherever convenient, the same reference numbers are used throughout the drawings to refer to the same or like parts. While examples and features of disclosed principles are described herein, modifications, adaptations, and other implementations are possible without departing from the scope of the disclosed embodiments. It is intended that the following detailed description be considered as exemplary only, with the true scope being indicated by the following claims.
[024] The problem of signature verification exists in two scenarios: 1) Online signature verification and 2) Offline signature verification. The online signature verification is needed when a user is required to be authenticated when asked to typically sign on electronic gadgets such as tablets or other hand-held devices. Online signature verification methods use velocity, depth, pressure, etc. associated with the writing aid as factors to do verification. The offline signature verification is needed when signatures present in a hard copy (e.g. checks, legal documents, etc.) need to be authenticated. Signature verification is a critical process in the Business Process Services (BPS) industry like banking, legal, government processes, etc.
[025] Applicant addresses the problem of offline signature verification by providing a computationally less intensive approach as compared to the state-of-the-art approaches. Firstly, the approach of the present disclosure localizes signature regions from the document, thereby reducing the search space significantly from the entire document (conventional approach) to limited portions of the document. Secondly, the approach of the present disclosure does not necessitate prior training (conventional approach) to extract the signatures from the document, thereby obviating the need for a huge training database of legitimate signatures, which may not be feasible in all business processes. In the present disclosure, Principal Component Analysis (PCA) with a heuristic approach is used to extract signatures from identified portions of the image of the document. Some of the conventional methods have used PCA, but for signature verification rather than for signature localization or identification within the document as in the case of the present disclosure. In the conventional methods, machine learning algorithms were employed over the features obtained from PCA components for verification of the signatures.
[026] In the present disclosure, once extracted, the signatures are verified. Conventionally, signature verification is done by the rolling window approach, wherein an image of the document to be processed is placed on a grid. By moving across the grid and comparing, a match with an available signature is checked. Furthermore, advanced learning methodologies employing Artificial Neural Networks (ANN), Gaussian Mixture Models (GMM), Hidden Markov Models (HMM), and the like, are employed, which consume significant computational effort for the process of verification. Most of the Deep Learning algorithms used traditionally for signature verification have a time complexity of the order of O(n^k) (where k is an integer equal to 4, 5 or higher), whereas the approach of the present disclosure is of the order of O(n^3). Some of the existing approaches are based on the size of the signatures and may not work on dynamic sizes. The present disclosure uses an unsupervised approach that can work on signatures of any size.
[027] In accordance with the present disclosure, signature identification is a process of identifying possible signature regions in a document. The resulting regions may or may not contain signatures. The signature regions are identified using a change in the pattern of principal components. The principal component is obtained by applying principal component analysis on the pixel matrix. The step-by-step approach of extracting signatures from the document that need to be validated is as explained hereinafter.
[028] Referring now to the drawings, and more particularly to FIG. 1 through FIG.7D, where similar reference characters denote corresponding features consistently throughout the figures, there are shown preferred embodiments and these embodiments are described in the context of the following exemplary system and/or method.
[029] FIG.1 illustrates an exemplary block diagram of a system 100 for offline signature identification and extraction, in accordance with some embodiments of the present disclosure. In an embodiment, the system 100 includes one or more processors 104, communication interface device(s) or input/output (I/O) interface(s) 106, and one or more data storage devices or memory 102 operatively coupled to the one or more processors 104. The one or more processors 104 that are hardware processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, graphics controllers, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Among other capabilities, the processor(s) are configured to fetch and execute computer-readable instructions stored in the memory. In the context of the present disclosure, the expressions ‘processors’ and ‘hardware processors’ may be used interchangeably. In an embodiment, the system 100 can be implemented in a variety of computing systems, such as laptop computers, notebooks, hand-held devices, workstations, mainframe computers, servers, a network cloud and the like.
[030] I/O interface(s) 106 can include a variety of software and hardware interfaces, for example, a web interface, a graphical user interface, and the like and can facilitate multiple communications within a wide variety of networks N/W and protocol types, including wired networks, for example, LAN, cable, etc., and wireless networks, such as WLAN, cellular, or satellite. In an embodiment, the I/O interface(s) can include one or more ports for connecting a number of devices to one another or to another server.
[031] The memory 102 may include any computer-readable medium known in the art including, for example, volatile memory, such as static random access memory (SRAM) and dynamic random access memory (DRAM), and/or non-volatile memory, such as read only memory (ROM), erasable programmable ROM, flash memories, hard disks, optical disks, and magnetic tapes. In an embodiment, one or more modules (not shown) of the system 100 can be stored in the memory 102.
[032] FIG.2A and FIG.2B illustrate an exemplary flow diagram of a computer implemented method 200 for offline signature identification and extraction, in accordance with some embodiments of the present disclosure. In an embodiment, the system 100 includes one or more data storage devices or memory 102 operatively coupled to the one or more processors 104 and is configured to store instructions configured for execution of steps of the method 200 by the one or more processors 104. The steps of the method 200 will now be explained in detail with reference to the components of the system 100 of FIG.1. Although process steps, method steps, techniques or the like may be described in a sequential order, such processes, methods and techniques may be configured to work in alternate orders. In other words, any sequence or order of steps that may be described does not necessarily indicate a requirement that the steps be performed in that order. The steps of processes described herein may be performed in any order practical. Further, some steps may be performed simultaneously.
[033] Accordingly, in an embodiment of the present disclosure, the one or more processors 104, are configured to apply, at step 202, Principal Component Analysis (PCA) on a pixel matrix of an image of a document to be processed for identifying and extracting one or more signatures contained therein, to obtain a first principal component.
[034] In an embodiment, an input to the system 100 may be an image of the document to be processed. FIG.3 illustrates an image of a scanned document to be processed, in accordance with some embodiments of the present disclosure. The received image is then processed by the system 100 to obtain the pixel matrix that is provided as an input at step 202. In an embodiment, the processing of the received image may be preceded by resizing the received image to an empirically determined size (say, 600 x 900) needed for performing the PCA. FIG.4 illustrates a plot of the first principal component corresponding to the image of FIG.3, in accordance with some embodiments of the present disclosure.
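The computation of step 202 can be illustrated with a minimal sketch. The specification does not state how the pixel matrix is arranged for the PCA, so the sketch assumes that each image row is an observation and each column a feature, and that the first principal component is the projection of the rows onto the leading principal axis. The function name `first_principal_component` and the synthetic page are hypothetical and used only for illustration.

```python
import numpy as np

def first_principal_component(pixels: np.ndarray) -> np.ndarray:
    """Project each row of a 2-D grayscale pixel matrix onto its first principal axis."""
    x = pixels.astype(np.float64)
    centered = x - x.mean(axis=0)            # center every column (feature)
    # SVD of the centered matrix: rows of vt are the principal axes,
    # ordered by decreasing explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]                  # one score per image row

# Hypothetical stand-in for a resized scan (900 rows x 600 columns):
# a white page carrying a single band of dark marks.
rng = np.random.default_rng(0)
page = np.full((900, 600), 255, dtype=np.uint8)
page[400:450, 100:300] = rng.integers(0, 120, size=(50, 200))

scores = first_principal_component(page)
```

Under these assumptions the blank rows all receive the same score, while the rows crossing the dark band stand out, which is the kind of change in pattern the heuristic method relies on.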
[035] In an embodiment of the present disclosure, the one or more processors 104, are configured to identify or localize, at step 204, a set of candidate signatures using a heuristic method, wherein the candidate signatures correspond to at least one Region Of Interest (ROI) in the obtained first principal component based on a change in pattern in the first principal component. FIG.5A through FIG.5C illustrate a first Region Of Interest (ROI), a second ROI and a third ROI, respectively identified in the first principal component of FIG.4, in accordance with some embodiments of the present disclosure. It may be noted that only the first ROI and the third ROI contain signatures.
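One possible reading of "a change in pattern" in the first principal component is a contiguous run of rows whose score departs from the baseline of the page. The sketch below illustrates that reading only; the median baseline, the tolerance `tol` and the minimum band height `min_height` are assumptions that do not appear in the specification.

```python
import numpy as np

def candidate_row_bands(scores, tol=1.0, min_height=5):
    """Return (start, end) row ranges where the component departs from its baseline."""
    scores = np.asarray(scores, dtype=float)
    baseline = np.median(scores)             # assumed baseline: the page is mostly blank
    active = np.abs(scores - baseline) > tol
    bands, start = [], None
    for i, flag in enumerate(active):
        if flag and start is None:
            start = i                        # a band opens
        elif not flag and start is not None:
            if i - start >= min_height:      # keep only bands that are tall enough
                bands.append((start, i))
            start = None
    if start is not None and len(active) - start >= min_height:
        bands.append((start, len(active)))   # band running to the last row
    return bands

# Synthetic component: flat baseline with two wide excursions and one narrow one.
scores = np.zeros(100)
scores[20:35] = 40.0
scores[60:62] = 40.0      # too short to be kept
scores[80:100] = -25.0
bands = candidate_row_bands(scores)
```

Each returned band is only a candidate ROI; as noted for FIG.5A through FIG.5C, a candidate may or may not contain a signature.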
[036] In an embodiment of the present disclosure, the one or more processors 104, are configured to apply, at step 206, PCA on the pixel matrix corresponding to each of the candidate signatures (FIG.5A through FIG.5C) to obtain a second principal component associated therewith. FIG.6A through FIG.6C illustrate a plot of a second principal component corresponding to the first ROI, the second ROI and the third ROI of FIG.5A through FIG.5C respectively, in accordance with some embodiments of the present disclosure.
[037] In an embodiment of the present disclosure, the one or more processors 104, are configured to identify, at step 208, a set of extracted signatures using the heuristic method, wherein the extracted signatures correspond to at least one ROI in the obtained second principal component based on a change in pattern in the obtained second principal component corresponding to each of the candidate signatures. FIG.7A illustrates an extracted signature from the first ROI, FIG.7B and FIG.7C illustrate extracted signatures from the second ROI and FIG.7D illustrates the extracted signature from the third ROI respectively, in accordance with some embodiments of the present disclosure. It may be noted that four signatures were extracted from the document although only two signatures were present in the image of the document (FIG.3).
[038] In accordance with an embodiment of the present disclosure, the heuristic method comprises computing descriptive statistics such as mean, mode, variance and standard deviation on the pixel matrix of the image of the document. It may be noted that the pixel matrix of the entire image of the document is considered in the heuristic method. Regions with blank area are identified using the computed mode and an empirically determined threshold to account for the blank area within the one or more signatures. In an exemplary embodiment, the identification of the blank area in the image of the document involves checking for a change in pixel value in the image. If the pixel value stays at the value of a white region (i.e., 255) without any change, it implies that there is white space (blank area). Based on the computed mode of pixel values associated with n consecutive pixels, the blank areas in the image of the document may be identified. Furthermore, the blank area within the signature may be identified based on available mode patterns. The pattern is found using several signatures and averaging the values. For instance, if there are 1000 signature samples, the mode of the blank area is computed for each, and a range of average value and standard deviation may be identified.
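The mode-based check for blank area can be sketched as follows. The window length `n`, the use of non-overlapping windows and the `min_share` threshold (the fraction of pixels in a window that must equal the mode) are illustrative assumptions standing in for the empirically determined threshold.

```python
import numpy as np

def blank_windows(row, n=8, blank_value=255, min_share=1.0):
    """Flag each non-overlapping window of n pixels whose mode is the blank value."""
    flags = []
    for start in range(0, len(row) - n + 1, n):
        window = np.asarray(row[start:start + n])
        values, counts = np.unique(window, return_counts=True)
        mode = values[np.argmax(counts)]     # most frequent pixel value in the window
        share = counts.max() / n             # how dominant that value is
        flags.append(bool(mode == blank_value and share >= min_share))
    return flags

# One image row: blank, then a few ink pixels, then blank again.
row = [255] * 8 + [255, 30, 40, 255, 20, 10, 255, 0] + [255] * 8
flags = blank_windows(row)                        # strict: every pixel must be white
relaxed = blank_windows(row, min_share=0.3)       # tolerant of gaps inside a signature
```

Lowering `min_share` is one way to tolerate the blank area that occurs inside a signature, at the cost of treating sparse ink as blank.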

[039] The image of a document illustrated in FIG.3 contains a photograph. The regions with photographs are associated with high variation. Hence, the heuristic method of the present disclosure involves eliminating regions with photographs using the computed standard deviation and variance and empirically determined values for the standard deviation and variance. Using the pixel values of the image, the standard deviation and variance are computed. Signatures made using a pen, pencil or any other device are studied, and the standard deviation and variance are computed for several signatures. The empirically determined values for the standard deviation and variance are compared with the computed standard deviation and variance. If the difference is within an empirically determined threshold, the region is considered as a photograph and is eliminated.
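The comparison described above can be sketched as a simple tolerance test. The reference values and tolerances below are hypothetical placeholders for the empirically determined values, chosen here only so that the synthetic example behaves as described; they are not taken from the specification.

```python
import numpy as np

def is_photograph(region, ref_std, ref_var, tol_std, tol_var):
    """True when the region's spread is within tolerance of the reference values."""
    std = float(np.std(region))
    var = float(np.var(region))
    return abs(std - ref_std) <= tol_std and abs(var - ref_var) <= tol_var

# Hypothetical reference statistics and tolerances (placeholders only).
REF_STD, REF_VAR, TOL_STD, TOL_VAR = 74.0, 5460.0, 5.0, 700.0

rng = np.random.default_rng(1)
photo_like = rng.integers(0, 256, size=(60, 60))   # dense, highly varied tones
sparse_ink = np.full((60, 60), 255)                # white patch ...
sparse_ink[30, 5:55] = 0                           # ... with one thin dark stroke

photo_flag = is_photograph(photo_like, REF_STD, REF_VAR, TOL_STD, TOL_VAR)
ink_flag = is_photograph(sparse_ink, REF_STD, REF_VAR, TOL_STD, TOL_VAR)
```

In this sketch the densely varied patch is flagged for elimination while the sparse stroke is retained as a possible signature region.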
[040] In accordance with the present disclosure, to optimize the computing effort, the at least one ROI corresponding to the candidate signatures is optimized so that only the relevant ROI is identified. The optimization steps of the present disclosure may be applied to signatures of any size in the image of the document. The heuristic method described hereinabove identifies an approximate ROI corresponding to each of the one or more signatures. Each identified ROI is characterized by a maximum coordinate and a minimum coordinate associated with a left, right, top and bottom edge. The optimizing comprises reducing the maximum coordinate associated with each of the left, right, top and bottom edge iteratively by a unit value till an estimated heuristic value is reached to obtain optimum coordinates associated therewith, such that the optimum coordinates are greater than the minimum coordinate. In an embodiment, as the ROI is reduced by a unit value, a bounding box of the ROI may end approximately a unit before the signature. The optimum coordinates ensure that the ROI is as close as possible to the signature.
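The unit-step tightening of an ROI can be sketched as below. The specification does not define the estimated heuristic value at which the iteration stops; the sketch assumes, for illustration only, that an edge stops moving once the edge row or column contains at least `min_ink` pixels darker than `ink_below`.

```python
import numpy as np

def tighten_roi(pixels, top, bottom, left, right, ink_below=128, min_ink=1):
    """Shrink [top:bottom, left:right] one pixel at a time until each edge touches ink."""
    ink = np.asarray(pixels) < ink_below     # assumed definition of an ink pixel
    while bottom - top > 1 and ink[top, left:right].sum() < min_ink:
        top += 1                             # move the top edge down by one unit
    while bottom - top > 1 and ink[bottom - 1, left:right].sum() < min_ink:
        bottom -= 1                          # move the bottom edge up by one unit
    while right - left > 1 and ink[top:bottom, left].sum() < min_ink:
        left += 1                            # move the left edge right by one unit
    while right - left > 1 and ink[top:bottom, right - 1].sum() < min_ink:
        right -= 1                           # move the right edge left by one unit
    return top, bottom, left, right

# White page with a dark block standing in for a signature.
page = np.full((50, 80), 255)
page[20:30, 35:60] = 0
box = tighten_roi(page, top=10, bottom=45, left=25, right=75)
```

The guards `bottom - top > 1` and `right - left > 1` keep the box from collapsing, so the sketch can be applied to a region of any size, including one with no ink at all.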
[041] In an embodiment of the present disclosure, the one or more processors 104, are configured to authenticate, at step 210, the extracted signatures of step 208. A randomization approach has been used for the authentication or verification of the extracted signatures. Legitimate signatures associated with each user are obtained from a database, wherein the legitimate signatures are resized to an empirically determined length and breadth. All the signatures in the database are resized to a fixed size to facilitate PCA. For each of the obtained legitimate signatures, training samples corresponding to the legitimate signatures are generated by randomly sampling a pixel of an image of the obtained legitimate signatures. A value of pixels in a neighborhood of the sampled pixel is perturbed to a minimum extent. Signatures associated with the perturbed value of pixels are stored as the training samples of the legitimate signatures, such that the training samples of the legitimate signatures account for variants of the obtained legitimate signatures. Likewise, for each of the obtained legitimate signatures, training samples of illegitimate signatures are generated by randomly sampling one to three pixels of an image of the obtained legitimate signatures. The pixels in a neighborhood of the sampled one to three pixels are perturbed to a maximum extent. Signatures associated with the perturbed value of the sampled one to three pixels are stored as the training samples of the illegitimate signatures, such that the training samples of the illegitimate signatures account for associated variants. An ensemble classifier comprising a combination of machine learning and deep learning models is trained using the generated training samples of the legitimate signatures and the illegitimate signatures. A first score indicative of a closeness of the extracted signatures from step 208 to the legitimate signatures associated with each of the obtained legitimate signatures and a second score indicative of a closeness of the extracted signatures to the illegitimate signatures associated with each of the obtained legitimate signatures are obtained. A lower value of the first score represents that the extracted signatures are legitimate, while a lower value of the second score represents that the extracted signatures are illegitimate. The extracted signatures from step 208 are authenticated based on the obtained first score, the obtained second score and an estimated threshold value, wherein the estimated threshold value is further based on the obtained legitimate signatures and the training samples of the illegitimate signatures generated for each of the obtained legitimate signatures.
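The randomized generation of training samples can be sketched as follows. The specification does not quantify the "minimum" and "maximum" extent of perturbation or the size of the neighborhood; the sketch assumes a square neighborhood of a given `radius` and bounded random noise of amplitude `max_delta`, small for legitimate variants and large for illegitimate ones. The function name `perturb` and all numeric values are hypothetical.

```python
import numpy as np

def perturb(signature, n_seeds, radius, max_delta, rng):
    """Copy the image and add bounded random noise around n_seeds random pixels."""
    out = signature.astype(np.int16)                 # astype returns a copy
    h, w = out.shape
    for _ in range(n_seeds):
        r, c = int(rng.integers(0, h)), int(rng.integers(0, w))   # sampled pixel
        r0, r1 = max(r - radius, 0), min(r + radius + 1, h)
        c0, c1 = max(c - radius, 0), min(c + radius + 1, w)
        noise = rng.integers(-max_delta, max_delta + 1, size=(r1 - r0, c1 - c0))
        out[r0:r1, c0:c1] += noise                   # perturb the neighborhood
    return np.clip(out, 0, 255).astype(np.uint8)

rng = np.random.default_rng(42)
genuine = np.full((64, 128), 255, dtype=np.uint8)    # stand-in for a resized signature
genuine[30:34, 20:100] = 0

# Legitimate variants: one sampled pixel, small perturbation (assumed amplitude 5).
legit_samples = [perturb(genuine, 1, 2, 5, rng) for _ in range(10)]
# Illegitimate variants: one to three sampled pixels, large perturbation.
forged_samples = [perturb(genuine, int(rng.integers(1, 4)), 2, 255, rng)
                  for _ in range(10)]
```

The two lists would then serve as labeled inputs for training the ensemble classifier; the choice of classifier and of the scoring function is outside the scope of this sketch.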
[042] In accordance with an embodiment of the present disclosure, the step of authenticating the extracted signatures may be preceded by preprocessing the extracted signatures to eliminate noise in the form of printed text and lines by dividing pixels associated with the extracted signatures into a plurality of sub-regions. Then identical linear pixel values within and across the plurality of sub-regions are identified as lines and pixel values in an empirically determined range for printed text are identified as printed text in the plurality of sub-regions. The pixel values of the pixels associated with the identified lines and printed text are replaced with 255 (white) to eliminate the noise.
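The noise elimination described above can be sketched under two assumptions that are not stated in the specification: the sub-regions are vertical strips of the extracted image, and a row counts as a ruled line when it carries the same non-white value throughout every strip. The gray range assumed for printed text, `text_range`, is a hypothetical placeholder for the empirically determined range.

```python
import numpy as np

def remove_lines_and_print(sig, n_strips=4, text_range=(60, 90)):
    """Whiten ruled lines and pixels in the assumed printed-text gray range."""
    out = np.array(sig, dtype=np.uint8)              # work on a copy
    strips = np.array_split(np.arange(out.shape[1]), n_strips)
    for r in range(out.shape[0]):
        segments = [out[r, s] for s in strips]       # this row, one piece per sub-region
        value = int(segments[0][0])
        # Ruled line: the same non-white value within and across all sub-regions.
        if value != 255 and all((seg == value).all() for seg in segments):
            out[r] = 255
    lo, hi = text_range
    out[(out >= lo) & (out <= hi)] = 255             # printed text replaced with white
    return out

sig = np.full((20, 40), 255, dtype=np.uint8)
sig[15] = 0                # a ruled line across the full width
sig[2:5, 3:10] = 75        # printed text in the assumed gray range
sig[8:12, 5:30] = 20       # handwritten stroke, which should survive
cleaned = remove_lines_and_print(sig)
```

A handwritten stroke survives in this sketch because it neither spans every sub-region with a constant value nor falls in the assumed printed-text range.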
[043] Some applications of the systems and methods of the present disclosure include verifying whether all pages of a document are signed and whether the signature on a document is legitimate. This task, when performed manually, would require a large back-office staff for certain business processes. Automating this effort also has to address the challenge that a signature may appear in any part of the document and not necessarily at a fixed location. For structured documents, template-based solutions known in the art may work. However, the present disclosure addresses offline signature identification and verification in an unstructured document. Hence, identifying or localizing the signature in the document is a critical aspect of the method of the present disclosure, and it is achieved without dependency on available signatures and without the need for a huge database of legitimate signatures for training computationally intensive machine learning or deep learning models. The signature verification of the present disclosure involves a randomization-based approach rather than the conventional supervised classification approach.
[044] The written description describes the subject matter herein to enable any person skilled in the art to make and use the embodiments. The scope of the subject matter embodiments is defined by the claims and may include other modifications that occur to those skilled in the art. Such other modifications are intended to be within the scope of the claims if they have similar elements that do not differ from the literal language of the claims or if they include equivalent elements with insubstantial differences from the literal language of the claims.
[045] It is to be understood that the scope of the protection is extended to such a program and in addition to a computer-readable means having a message therein; such computer-readable storage means contain program-code means for implementation of one or more steps of the method, when the program runs on a server or mobile device or any suitable programmable device. The hardware device can be any kind of device which can be programmed, including, e.g., any kind of computer such as a server or a personal computer, or any combination thereof. The device may also include means which could be, e.g., hardware means such as an application-specific integrated circuit (ASIC) or a field-programmable gate array (FPGA), or a combination of hardware and software means, e.g., an ASIC and an FPGA, or at least one microprocessor and at least one memory with software processing components located therein. Thus, the means can include both hardware means and software means. The method embodiments described herein could be implemented in hardware and software. The device may also include software means. Alternatively, the embodiments may be implemented on different hardware devices, e.g., using a plurality of CPUs.
[046] The embodiments herein can comprise hardware and software elements. The embodiments that are implemented in software include but are not limited to, firmware, resident software, microcode, etc. The functions performed by various components described herein may be implemented in other components or combinations of other components. For the purposes of this description, a computer-usable or computer readable medium can be any apparatus that can comprise, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
[047] The illustrated steps are set out to explain the exemplary embodiments shown, and it should be anticipated that ongoing technological development will change the manner in which particular functions are performed. These examples are presented herein for purposes of illustration, and not limitation. Further, the boundaries of the functional building blocks have been arbitrarily defined herein for the convenience of the description. Alternative boundaries can be defined so long as the specified functions and relationships thereof are appropriately performed. Alternatives (including equivalents, extensions, variations, deviations, etc., of those described herein) will be apparent to persons skilled in the relevant art(s) based on the teachings contained herein. Such alternatives fall within the scope of the disclosed embodiments. Also, the words “comprising,” “having,” “containing,” and “including,” and other similar forms are intended to be equivalent in meaning and be open ended in that an item or items following any one of these words is not meant to be an exhaustive listing of such item or items, or meant to be limited to only the listed item or items. It must also be noted that as used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural references unless the context clearly dictates otherwise.
[048] Furthermore, one or more computer-readable storage media may be utilized in implementing embodiments consistent with the present disclosure. A computer-readable storage medium refers to any type of physical memory on which information or data readable by a processor may be stored. Thus, a computer-readable storage medium may store instructions for execution by one or more processors, including instructions for causing the processor(s) to perform steps or stages consistent with the embodiments described herein. The term “computer-readable medium” should be understood to include tangible items and exclude carrier waves and transient signals, i.e., be non-transitory. Examples include random access memory (RAM), read-only memory (ROM), volatile memory, nonvolatile memory, hard drives, CD ROMs, DVDs, flash drives, disks, and any other known physical storage media.
[049] It is intended that the disclosure and examples be considered as exemplary only, with a true scope of disclosed embodiments being indicated by the following claims.

We Claim:
1. A processor implemented method (200) comprising the steps of:
applying, via one or more hardware processors, Principal Component Analysis (PCA) on a pixel matrix of an image of a document to be processed for identifying and extracting one or more signatures contained therein, to obtain a first principal component (202);
identifying, via the one or more hardware processors, a set of candidate signatures using a heuristic method, wherein the candidate signatures correspond to at least one Region Of Interest (ROI) in the obtained first principal component based on a change in pattern therein (204);
applying, via the one or more hardware processors, the PCA on the pixel matrix corresponding to each of the candidate signatures to obtain a second principal component associated thereof (206); and
identifying, via the one or more hardware processors, a set of extracted signatures using the heuristic method, wherein the extracted signatures correspond to at least one ROI in the obtained second principal component based on a change in pattern in the obtained second principal component corresponding to each of the candidate signatures (208).
2. The processor implemented method of claim 1, wherein the step of applying principal component analysis on a pixel matrix of an image of a document is preceded by:
receiving the image of the document to be processed; and
processing the received image to obtain the pixel matrix of the image of the document.
3. The processor implemented method of claim 2, wherein the step of processing the received image is preceded by resizing the received image to an empirically determined size.
4. The processor implemented method of claim 1, wherein the heuristic method comprises:
computing mean, mode, variance and standard deviation on the pixel matrix of the image of the document;
identifying regions with blank area using the computed mode and an empirically determined threshold to account for the blank area within the one or more signatures and in the image of the document; and
eliminating regions with photographs using the computed standard deviation and variance and empirically determined values for the standard deviation and variance.
5. The processor implemented method of claim 4, further comprising optimizing the identified at least one ROI corresponding to the candidate signatures and having a maximum coordinate and a minimum coordinate associated with a left, right, top and bottom edge, the optimizing comprises reducing the maximum coordinate associated with each of the left, right, top and bottom edge iteratively by a unit value till an estimated heuristic value is reached to obtain optimum coordinates associated thereof, such that the optimum coordinates are greater than the minimum coordinate.
6. The processor implemented method of claim 1, further comprising authenticating the extracted signatures (210) by:
obtaining legitimate signatures associated with each user from a database, wherein the legitimate signatures are resized to an empirically determined length and breadth;
for each of the obtained legitimate signatures, generating training samples corresponding to the legitimate signatures by:
randomly sampling a pixel of an image of the obtained legitimate signatures;
perturbing a value of pixels in a neighborhood of the sampled pixel to a minimum extent; and
storing signatures associated with the perturbed value of pixels as the training samples of the legitimate signatures, such that the training samples of the legitimate signatures account for variation of the obtained legitimate signatures;
for each of the obtained legitimate signatures, generating training samples of illegitimate signatures by:
randomly sampling one to three pixels of an image of the obtained legitimate signatures;
perturbing the pixels that are in a neighborhood of the sampled one to three pixels to a maximum extent; and
storing signatures associated with the perturbed value of the sampled one to three pixels as the training samples of the illegitimate signatures of the obtained legitimate signatures, such that the training samples of the illegitimate signatures account for variations thereof;
training an ensemble classifier comprising a combination of machine learning and deep learning models using the generated training samples of the legitimate signatures and the illegitimate signatures;
obtaining a first score and a second score for the extracted signatures using the trained ensemble classifier, wherein the first score is indicative of a closeness of the extracted signatures to the legitimate signatures associated with each of the obtained legitimate signatures and the second score is indicative of a closeness of the extracted signatures to the illegitimate signatures associated with each of the obtained legitimate signatures; and
authenticating the extracted signatures based on the obtained first score, the second score and an estimated threshold value.
7. The processor implemented method of claim 6, wherein the step of authenticating the extracted signatures is preceded by preprocessing the extracted signatures to eliminate noise in the form of printed text and lines by:
dividing pixels associated with the extracted signatures into a plurality of sub-regions;
identifying lines and printed text in the extracted signatures by assessing values of the pixels in each of the plurality of sub-regions, wherein identical linear pixel values within and across the plurality of sub-regions are indicative of lines and pixel values in an empirically determined range for printed text are indicative of printed text in the plurality of sub-regions; and
replacing the pixel values of the pixels associated with the identified lines and printed text with 255 to eliminate the noise.
8. A system (100) comprising:
one or more data storage devices (102) operatively coupled to one or more hardware processors (104) and configured to store instructions configured for execution via the one or more hardware processors to:
apply, via one or more hardware processors, Principal Component Analysis (PCA) on a pixel matrix of an image of a document to be processed for identifying and extracting one or more signatures contained therein, to obtain a first principal component;
identify, via the one or more hardware processors, a set of candidate signatures using a heuristic method, wherein the candidate signatures correspond to at least one Region Of Interest (ROI) in the obtained first principal component based on a change in pattern therein;
apply, via the one or more hardware processors, the PCA on the pixel matrix corresponding to each of the candidate signatures to obtain a second principal component associated thereof; and
identify, via the one or more hardware processors, a set of extracted signatures using the heuristic method, wherein the extracted signatures correspond to at least one ROI in the obtained second principal component based on a change in pattern in the obtained second principal component corresponding to each of the candidate signatures.

9. The system of claim 8, wherein the one or more processors are further configured to receive the image of the document to be processed; and process the received image to obtain the pixel matrix of the image of the document prior to applying PCA on the pixel matrix of the image of the document.
10. The system of claim 9, wherein the one or more processors are further configured to resize the received image to an empirically determined size prior to processing the received image.
11. The system of claim 8, wherein the one or more processors are further configured to perform the heuristic method comprising:
computing mean, mode, variance and standard deviation on the pixel matrix of the image of the document;
identifying regions with blank area using the computed mode and an empirically determined threshold to account for the blank area within the one or more signatures and in the image of the document; and
eliminating regions with photographs using the computed standard deviation and variance and empirically determined values for the standard deviation and variance.
12. The system of claim 11, wherein the one or more processors are further configured to optimize the identified at least one ROI corresponding to the candidate signatures and having a maximum coordinate and a minimum coordinate associated with a left, right, top and bottom edge by reducing the maximum coordinate associated with each of the left, right, top and bottom edge iteratively by a unit value till an estimated heuristic value is reached to obtain optimum coordinates associated thereof, such that the optimum coordinates are greater than the minimum coordinate.

13. The system of claim 8, wherein the one or more processors are further configured to authenticate the extracted signatures by:
obtaining legitimate signatures associated with each user from a database, wherein the legitimate signatures are resized to an empirically determined length and breadth;
for each of the obtained legitimate signatures, generating training samples corresponding to the legitimate signatures by:
randomly sampling a pixel of an image of the obtained legitimate signatures;
perturbing a value of pixels in a neighborhood of the sampled pixel to a minimum extent; and
storing signatures associated with the perturbed value of pixels as the training samples of the legitimate signatures, such that the training samples of the legitimate signatures account for variation of the obtained legitimate signatures;
for each of the obtained legitimate signatures, generating training samples of illegitimate signatures by:
randomly sampling one to three pixels of an image of the obtained legitimate signatures;
perturbing the pixels that are in a neighborhood of the sampled one to three pixels to a maximum extent; and
storing signatures associated with the perturbed value of the sampled one to three pixels as the training samples of the illegitimate signatures of the obtained legitimate signatures, such that the training samples of the illegitimate signatures account for variations thereof;
training an ensemble classifier comprising a combination of machine learning and deep learning models using the generated training samples of the legitimate signatures and the illegitimate signatures;
obtaining a first score and a second score for the extracted signatures using the trained ensemble classifier, wherein the first score is indicative of a closeness of the extracted signatures to the legitimate signatures associated with each of the obtained legitimate signatures and the second score is indicative of a closeness of the extracted signatures to the illegitimate signatures associated with each of the obtained legitimate signatures; and
authenticating the extracted signatures based on the obtained first score, the second score and an estimated threshold value.
14. The system of claim 13, wherein the one or more processors are further configured to preprocess the extracted signatures to eliminate noise in the form of printed text and lines, prior to authenticating the extracted signatures, by:
dividing pixels associated with the extracted signatures into a plurality of sub-regions;
identifying lines and printed text in the extracted signatures by assessing values of the pixels in each of the plurality of sub-regions, wherein identical linear pixel values within and across the plurality of sub-regions are indicative of lines and pixel values in an empirically determined range for printed text are indicative of printed text in the plurality of sub-regions; and
replacing the pixel values of the pixels associated with the identified lines and printed text with 255 to eliminate the noise.

Documents

Application Documents

# Name Date
1 202021021574-STATEMENT OF UNDERTAKING (FORM 3) [22-05-2020(online)].pdf 2020-05-22
2 202021021574-REQUEST FOR EXAMINATION (FORM-18) [22-05-2020(online)].pdf 2020-05-22
3 202021021574-FORM 18 [22-05-2020(online)].pdf 2020-05-22
4 202021021574-FORM 1 [22-05-2020(online)].pdf 2020-05-22
5 202021021574-FIGURE OF ABSTRACT [22-05-2020(online)].jpg 2020-05-22
6 202021021574-DRAWINGS [22-05-2020(online)].pdf 2020-05-22
7 202021021574-DECLARATION OF INVENTORSHIP (FORM 5) [22-05-2020(online)].pdf 2020-05-22
8 202021021574-COMPLETE SPECIFICATION [22-05-2020(online)].pdf 2020-05-22
9 Abstract1.jpg 2020-08-07
10 202021021574-FORM-26 [16-10-2020(online)].pdf 2020-10-16
11 202021021574-Proof of Right [05-11-2020(online)].pdf 2020-11-05
12 202021021574-FER.pdf 2021-12-03
13 202021021574-FER_SER_REPLY [17-05-2022(online)].pdf 2022-05-17
14 202021021574-CLAIMS [17-05-2022(online)].pdf 2022-05-17
15 202021021574-US(14)-HearingNotice-(HearingDate-03-04-2024).pdf 2023-12-14
16 202021021574-FORM-26 [22-12-2023(online)].pdf 2023-12-22
17 202021021574-Correspondence to notify the Controller [28-03-2024(online)].pdf 2024-03-28
18 202021021574-US(14)-HearingNotice-(HearingDate-22-10-2024).pdf 2024-10-05
19 202021021574-Correspondence to notify the Controller [18-10-2024(online)].pdf 2024-10-18
20 202021021574-Written submissions and relevant documents [06-11-2024(online)].pdf 2024-11-06
21 202021021574-ORIGINAL UR 6(1A) FORM 1-121124.pdf 2024-11-14
22 202021021574-PatentCertificate25-11-2024.pdf 2024-11-25
23 202021021574-IntimationOfGrant25-11-2024.pdf 2024-11-25

Search Strategy

1 pca_signature_identificationE_02-12-2021.pdf

ERegister / Renewals

3rd: 03 Dec 2024

From 22/05/2022 - To 22/05/2023

4th: 03 Dec 2024

From 22/05/2023 - To 22/05/2024

5th: 03 Dec 2024

From 22/05/2024 - To 22/05/2025

6th: 08 Apr 2025

From 22/05/2025 - To 22/05/2026