Abstract: Methods and server systems for processing heterogeneous transaction graph to compute risk scores associated with each node are described herein. Method performed by server system includes accessing heterogeneous transaction graph and extracting fraud heterogeneous transaction graph from heterogeneous transaction graph based on plurality of node-related features including fraud transaction label. Method includes determining first and second set of weights for subset of edges of fraud heterogeneous transaction graph based on first and second edge definition criteria for each edge. Method includes computing for each node of the fraud heterogeneous transaction graph, set of first and second risk scores based on first and second set of weights. Method includes determining third set of weights for plurality of edges of heterogeneous transaction graph based on third edge definition criteria for each edge and computing set of third risk scores based on third set of weights.
Description: The present disclosure relates to a financial eco-system and, more particularly, to electronic methods and complex processing systems for processing a heterogeneous transaction graph to compute risk scores associated with each node of the heterogeneous transaction graph.
BACKGROUND
In the financial domain, with the ever-increasing number of digital payment transactions, the number of fraudulent payment transactions (or fraud transactions) is also increasing. A payment transaction is considered to be a fraud transaction if it is performed by a fraudster with the intent to defraud any entity, such as a cardholder or a merchant in the payment ecosystem. For instance, when a fraud cardholder uses a stolen payment card to perform a transaction at a legitimate merchant, then this transaction is a fraud transaction. Various types of financial fraud include first-party fraud, third-party fraud, return fraud, chargeback fraud, merchant fraud, cardholder fraud, and so on. To curb the ever-growing fraud transactions, various fraud detection techniques have been developed. Such fraud detection techniques play a crucial role in maintaining the smooth operation of the payment ecosystem.
Generally, classification-based Artificial Intelligence (AI) or Machine Learning (ML) models are used for performing fraud detection in the payment eco-system. These models are trained on historical payment transactions to learn suspicious behavior from cardholders or merchants labeled as fraud. Upon completion of the learning process, these models can predict whether another cardholder or merchant will commit fraud or not. Such predictions can be derived to label either historical or ongoing transactions as fraud or non-fraud. Conventionally, to train such models, historical transactions are converted to bipartite graphs and processed by these models to learn insights from the historical transactions to perform fraud detection. The term ‘Bipartite graph’ refers to versatile graph structures that represent a relationship between two distinct types of nodes connected via edges. For example, a bipartite transaction graph may include cardholder nodes and merchant nodes connected via edges where these edges represent transactions between the various cardholder nodes and the merchant nodes. These graph-based classification models show good performance while detecting fraud. However, such graph-based classification models only consider the interactions between two different entities, such as cardholders and merchants for detecting fraud.
As may be understood, with the implementation of the Three Domain Secure 2.0 (or 3DS2) protocol across the payment eco-system, the data that is being collected during a payment transaction has increased and diversified. Here, 3DS2 refers to a globally accepted security protocol that has been designed to protect a payment card from unauthorized online use. In various examples, the 3DS2 protocol allows a payment processor to collect data regarding various entities, such as cardholder Primary Account Number (PAN), merchant, cardholder Internet Protocol (IP) address, shipping address, cardholder email, and the like. Therefore, it is now possible to learn from the interactions of the various entities in the 3DS2-enabled payment ecosystem to perform fraud detection. It is noted that if graphs are generated using data from these various entities, they will form undirected multipartite graphs (or heterogeneous transaction graphs). However, it is difficult to learn insights from heterogeneous transaction graphs due to their inherent complexity and the lack of scalable solutions. This makes it difficult to learn the behavior of various entities in the 3DS2 network, leading to poor performance while performing fraud detection.
Thus, there exists a technological need for technical solutions for processing a heterogeneous transaction graph to compute risk scores associated with each node of the heterogeneous transaction graph.
SUMMARY
Various embodiments of the present disclosure provide methods and systems for processing a heterogeneous transaction graph to compute risk scores associated with each node of the heterogeneous transaction graph.
In an embodiment, a computer-implemented method for processing a heterogeneous transaction graph to compute risk scores associated with each node of the heterogeneous transaction graph is disclosed. The computer-implemented method performed by a server system includes accessing a heterogeneous transaction graph from a database associated with the server system. The heterogeneous transaction graph includes a plurality of node sets associated with a plurality of entity sets and a plurality of edges. Herein, each node set includes a plurality of nodes representing a plurality of entities of an entity set. Further, each node is associated with a plurality of node-related features of an individual entity and an edge of the plurality of edges indicates information related to a transactional relationship between two distinct nodes connected by the edge. The computer-implemented method further includes extracting a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features including a fraud transaction label. The fraud heterogeneous transaction graph includes, for each node set, a subset of fraud nodes, a subset of non-fraud nodes, and a subset of edges between the subset of fraud nodes and the subset of non-fraud nodes. The computer-implemented method further includes determining a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge. The first edge definition criteria include a fraud count metric and a fraud ratio metric. The computer-implemented method further includes determining a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge. The second edge definition criteria include a decline count metric and a decline ratio metric. 
The computer-implemented method further includes computing for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights. The computer-implemented method further includes determining a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge. The third edge definition criteria include an approved count metric. The computer-implemented method further includes computing for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights.
In another embodiment, a server system is disclosed. The server system includes a communication interface and a memory including executable instructions. The server system also includes a processor communicably coupled to the memory. The processor is configured to execute the instructions to cause the server system, at least in part, to access a heterogeneous transaction graph from a database associated with the server system. The heterogeneous transaction graph includes a plurality of node sets associated with a plurality of entity sets and a plurality of edges. Herein, each node set includes a plurality of nodes representing a plurality of entities of an entity set. Further, each node is associated with a plurality of node-related features of an individual entity and an edge of the plurality of edges indicates information related to a transactional relationship between two distinct nodes connected by the edge. The server system is further caused to extract a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features including a fraud transaction label. The fraud heterogeneous transaction graph includes, for each node set, a subset of fraud nodes, a subset of non-fraud nodes, and a subset of edges between the subset of fraud nodes and the subset of non-fraud nodes. The server system is further caused to determine a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge. The first edge definition criteria include a fraud count metric and a fraud ratio metric. The server system is further caused to determine a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge. The second edge definition criteria include a decline count metric and a decline ratio metric. 
The server system is further caused to compute for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights. The server system is further caused to determine a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge. The third edge definition criteria include an approved count metric. The server system is further caused to compute for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights.
In yet another embodiment, a non-transitory computer-readable storage medium is disclosed. The non-transitory computer-readable storage medium includes computer-executable instructions that, when executed by at least a processor of a server system, cause the server system to perform a method. The method includes accessing a heterogeneous transaction graph from a database associated with the server system. The heterogeneous transaction graph includes a plurality of node sets associated with a plurality of entity sets and a plurality of edges. Herein, each node set includes a plurality of nodes representing a plurality of entities of an entity set. Further, each node is associated with a plurality of node-related features of an individual entity and an edge of the plurality of edges indicates information related to a transactional relationship between two distinct nodes connected by the edge. The method further includes extracting a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features including a fraud transaction label. The fraud heterogeneous transaction graph includes, for each node set, a subset of fraud nodes, a subset of non-fraud nodes, and a subset of edges between the subset of fraud nodes and the subset of non-fraud nodes. The method further includes determining a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge. The first edge definition criteria include a fraud count metric and a fraud ratio metric. The method further includes determining a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge. The second edge definition criteria include a decline count metric and a decline ratio metric. 
The method further includes computing for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights. The method further includes determining a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge. The third edge definition criteria include an approved count metric. The method further includes computing for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights.
The foregoing summary is illustrative only and is not intended to be in any way limiting. In addition to the illustrative aspects, embodiments, and features described above, further aspects, embodiments, and features will become apparent by reference to the drawings and the following detailed description.
BRIEF DESCRIPTION OF THE FIGURES
For a more complete understanding of example embodiments of the present technology, reference is now made to the following descriptions taken in connection with the accompanying drawings in which:
FIG. 1 illustrates a schematic representation of an environment related to at least some example embodiments of the present disclosure;
FIG. 2 illustrates a simplified block diagram of a server system, in accordance with an embodiment of the present disclosure;
FIG. 3 illustrates a schematic representation of a heterogeneous transaction graph, in accordance with an embodiment of the present disclosure;
FIG. 4 illustrates an architecture depicting the processing of a heterogeneous transaction graph using a page rank model and a classification model, in accordance with an embodiment of the present disclosure;
FIGS. 5A and 5B, collectively, illustrate a process flow diagram depicting a method for computing various risk scores associated with each node of the heterogeneous transaction graph, in accordance with an embodiment of the present disclosure;
FIG. 6 illustrates a simplified block diagram of an acquirer server, in accordance with an embodiment of the present disclosure;
FIG. 7 illustrates a simplified block diagram of an issuer server, in accordance with an embodiment of the present disclosure; and
FIG. 8 illustrates a simplified block diagram of a payment server, in accordance with an embodiment of the present disclosure.
The drawings referred to in this description are not to be understood as being drawn to scale except if specifically noted, and such drawings are only exemplary in nature.
DETAILED DESCRIPTION
In the following description, for purposes of explanation, numerous specific details are set forth to provide a thorough understanding of the present disclosure. It will be apparent, however, to one skilled in the art that the present disclosure can be practiced without these specific details.
Reference in this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. The appearance of the phrase “in an embodiment” in various places in the specification does not necessarily all refer to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others. Similarly, various requirements are described which may be requirements for some embodiments but not for other embodiments.
Moreover, although the following description contains many specifics for the purposes of illustration, anyone skilled in the art will appreciate that many variations and/or alterations to said details are within the scope of the present disclosure. Similarly, although many of the features of the present disclosure are described in terms of each other, or in conjunction with each other, one skilled in the art will appreciate that many of these features can be provided independently of other features. Accordingly, this description of the present disclosure is set forth without any loss of generality to, and without imposing limitations upon, the present disclosure.
Embodiments of the present disclosure may be embodied as an apparatus, a system, a method, or a computer program product. Accordingly, embodiments of the present disclosure may take the form of an entire hardware embodiment, an entire software embodiment (including firmware, resident software, micro-code, etc.), or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit”, “engine”, “module”, or “system”. Furthermore, aspects of the present disclosure may take the form of a computer program product embodied in one or more computer-readable storage media having computer-readable program code embodied thereon.
The terms “account holder”, “user”, “cardholder”, “consumer”, “buyer”, and “customer” are used interchangeably throughout the description and refer to a person who has a payment account or a payment card (e.g., credit card, debit card, etc.) associated with the payment account, that will be used at a merchant to perform a payment transaction. The payment account may be opened via an issuing bank or an issuer server.
The term “merchant”, used throughout the description generally refers to a seller, a retailer, a purchase location, an organization, or any other entity that is in the business of selling goods or providing services, and it can refer to either a single business location or a chain of business locations of the same entity.
The terms “payment network” and “card network” are used interchangeably throughout the description and refer to a network or collection of systems used for the transfer of funds using cash substitutes. Payment networks may use a variety of different protocols and procedures to process the transfer of money for various types of transactions. Payment networks are companies that connect an issuing bank with an acquiring bank to facilitate an online payment. Transactions that may be performed via a payment network may include product or service purchases, credit purchases, debit transactions, fund transfers, account withdrawals, etc. Payment networks may be configured to perform transactions via cash substitutes that may include payment cards, letters of credit, checks, financial accounts, etc. Examples of networks or systems configured to perform as payment networks include those operated by companies such as Mastercard®.
The term “payment card”, used throughout the description, refers to a physical or virtual card linked with a financial or payment account that may be presented to a merchant or any such facility to fund a financial transaction via the associated payment account. Examples of the payment card include, but are not limited to, debit cards, credit cards, prepaid cards, virtual payment numbers, virtual card numbers, forex cards, charge cards, e-wallet cards, and stored-value cards. A payment card may be a physical card that may be presented to the merchant for funding the payment. Alternatively, or additionally, the payment card may be embodied in the form of data stored in a user device, where the data is associated with a payment account such that the data can be used to process the financial transaction between the payment account and a merchant’s financial account.
The term “payment account”, used throughout the description refers to a financial account that is used to fund a financial transaction. Examples of the financial account include, but are not limited to a savings account, a credit account, a checking account, and a virtual payment account. The financial account may be associated with an entity, such as an individual person, a family, a commercial entity, a company, a corporation, a governmental entity, a non-profit organization, and the like. In some scenarios, the financial account may be a virtual or temporary payment account that can be mapped or linked to a primary financial account, such as those accounts managed by payment wallet service providers, and the like.
The terms “payment transaction”, “financial transaction”, “event”, and “transaction” are used interchangeably throughout the description and refer to a transaction of payment of a certain amount being initiated by the cardholder. More specifically, these terms refer to electronic financial transactions including, for example, online payment, payment at a terminal (e.g., Point of Sale (POS) terminal), and the like. Generally, a payment transaction is performed between two entities, such as a buyer and a seller. It is to be noted that a payment transaction is followed by a payment transfer of a transaction amount (i.e., monetary value) from one entity (e.g., issuing bank associated with the buyer) to another entity (e.g., acquiring bank associated with the seller), in exchange for any goods or services.
The term ‘set’ refers to a collection of well-defined, unordered objects called elements or members. For example, the phrases a ‘set of entities’ and a ‘set of nodes’ refer to a collection of entities and a collection of nodes, respectively.
OVERVIEW
Various embodiments of the present disclosure provide methods, systems, user devices, and computer program products for processing a heterogeneous transaction graph to compute risk scores associated with each node of the heterogeneous transaction graph. In a specific embodiment, the server system may be embodied within a payment server associated with a payment network.
In one embodiment, the server system is configured to access a heterogeneous transaction graph from a database associated with the server system. In case the heterogeneous transaction graph is not available in the database, it may be generated by accessing a historical transaction dataset from the database. Thus, in an embodiment, the server system may be configured to access the historical transaction dataset from the database. The historical transaction dataset may include information related to each entity of a plurality of entities in each entity set and a relationship between the plurality of entities. The server system may further be configured to generate the plurality of node-related features for each entity of the plurality of entities in each entity set based, at least in part, on the historical transaction dataset.
Further, the server system may generate the heterogeneous transaction graph based, at least in part, on the historical transaction dataset and the plurality of node-related features for each entity. The heterogeneous transaction graph may include a plurality of node sets associated with a plurality of entity sets and a plurality of edges. Further, each node set may include a plurality of nodes representing the plurality of entities of an entity set. Further, each node is associated with the plurality of node-related features of an individual entity. An edge of the plurality of edges may indicate information related to a transactional relationship between two distinct nodes connected by the edge.
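By way of a non-limiting illustration, the graph generation described above may be sketched as follows. The entity sets ("pan", "merchant", "ip"), the record fields ("fraud", "approved"), the node features, and the choice to connect every pair of entities appearing in the same transaction are assumptions made only for this sketch and are not prescribed by the present disclosure.

```python
from collections import defaultdict
from itertools import combinations

# Assumed entity sets; a 3DS2 dataset may carry more (email, shipping address, ...).
ENTITY_TYPES = ("pan", "merchant", "ip")


def build_graph(transactions):
    """Build node sets and undirected edges from historical transaction records."""
    node_sets = defaultdict(dict)  # entity type -> {entity id -> node-related features}
    edges = {}                     # frozenset({node_a, node_b}) -> per-edge counters
    for txn in transactions:
        nodes = [(etype, txn[etype]) for etype in ENTITY_TYPES]
        for etype, entity_id in nodes:
            feats = node_sets[etype].setdefault(
                entity_id, {"txn_count": 0, "fraud_label": False})
            feats["txn_count"] += 1
            # A node is labelled fraud once it takes part in any fraud transaction.
            feats["fraud_label"] = feats["fraud_label"] or txn["fraud"]
        # Assumption: every pair of entities seen in one transaction shares an edge.
        for a, b in combinations(nodes, 2):
            stats = edges.setdefault(
                frozenset((a, b)),
                {"fraud": 0, "non_fraud": 0, "approved": 0, "declined": 0})
            stats["fraud" if txn["fraud"] else "non_fraud"] += 1
            stats["approved" if txn["approved"] else "declined"] += 1
    return node_sets, edges


transactions = [
    {"pan": "p1", "merchant": "m1", "ip": "ip1", "fraud": True, "approved": True},
    {"pan": "p1", "merchant": "m2", "ip": "ip1", "fraud": False, "approved": True},
    {"pan": "p2", "merchant": "m1", "ip": "ip2", "fraud": False, "approved": False},
]
node_sets, edges = build_graph(transactions)
```

In this sketch, the per-edge counters hold the raw counts from which fraud, decline, and approved metrics may later be derived.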
The server system is further configured to extract a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features including a fraud transaction label. The fraud heterogeneous transaction graph may include, for each node set, a subset of fraud nodes, a subset of non-fraud nodes, and a subset of edges between the subset of fraud nodes and the subset of non-fraud nodes. Further, the server system is configured to determine a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge. The first edge definition criteria may include a fraud count metric and a fraud ratio metric.
In an embodiment, the fraud count metric indicates a count of the fraud transactions between the distinct nodes connected by an individual edge from the subset of edges. Similarly, in an embodiment, the fraud ratio metric is a ratio of fraud transactions to non-fraud transactions between the distinct nodes connected by the individual edge.
Furthermore, the server system may be configured to determine a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge. The second edge definition criteria may include a decline count metric and a decline ratio metric. In an embodiment, the decline count metric indicates a count of declined transactions between the distinct nodes connected by an individual edge from the subset of edges. Similarly, in an embodiment, the decline ratio metric is a ratio of declined transactions to approved transactions between the distinct nodes connected by the individual edge.
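As a non-limiting illustration, the first and second sets of weights may be derived from per-edge transaction counters as sketched below. The node identifiers, the counter names, the clamping of a zero denominator to one, and the rule that an edge is retained when at least one of its endpoints is a fraud node are assumptions of this sketch only.

```python
def ratio(numerator, denominator):
    # Assumption: a zero denominator is clamped to 1 so that the ratio stays finite.
    return numerator / max(denominator, 1)


def edge_weights(stats):
    """Return the four edge metrics for one edge from its transaction counters."""
    return {
        "fraud_count": stats["fraud"],
        "fraud_ratio": ratio(stats["fraud"], stats["non_fraud"]),
        "decline_count": stats["declined"],
        "decline_ratio": ratio(stats["declined"], stats["approved"]),
    }


def extract_fraud_edges(edges, fraud_nodes):
    """Keep edges touching a fraud-labelled node and attach their weights."""
    return {
        pair: edge_weights(stats)
        for pair, stats in edges.items()
        if any(node in fraud_nodes for node in pair)
    }


# Hypothetical counters for three edges of a heterogeneous transaction graph.
edges = {
    frozenset({"pan:p1", "merchant:m1"}):
        {"fraud": 3, "non_fraud": 1, "approved": 2, "declined": 2},
    frozenset({"pan:p2", "merchant:m1"}):
        {"fraud": 0, "non_fraud": 4, "approved": 4, "declined": 0},
    frozenset({"pan:p2", "merchant:m2"}):
        {"fraud": 0, "non_fraud": 5, "approved": 5, "declined": 0},
}
fraud_nodes = {"pan:p1", "merchant:m1"}
fraud_edges = extract_fraud_edges(edges, fraud_nodes)
```

Here the edge between "pan:p2" and "merchant:m2" is dropped because neither endpoint carries a fraud label, while the remaining two edges form the subset of edges of the fraud heterogeneous transaction graph.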
The server system may be configured to compute for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights. In a non-limiting implementation, the server system may be configured to compute the set of first risk scores for each node by iteratively processing the fraud heterogeneous transaction graph till each node of the fraud heterogeneous transaction graph is traversed. The server system may iteratively process the fraud heterogeneous transaction graph using a page rank model associated with the server system. In one embodiment, the processing may include traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the first edge definition criteria. Herein, a candidate edge is not a dangling edge.
The server system may further compute a first traverse count for each node based on a number of times each node is traversed. Finally, the server system may determine the set of first risk scores for each node based, at least in part, on the first traverse count of the corresponding node. In one embodiment, the set of first risk scores may include a first rank feature and a second rank feature. The first rank feature is computed by setting the first edge definition criteria as the fraud count metric and the second rank feature is computed by setting the first edge definition criteria as the fraud ratio metric.
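One possible, non-limiting way to obtain such traverse counts is a weighted random walk, sketched below. The walk length, the minimum and maximum number of walks, the fixed seed, the treatment of zero-weight edges as non-candidate edges, and the normalization of counts into scores are assumptions of this sketch and do not represent the page rank model itself.

```python
import random
from collections import Counter, defaultdict


def traverse_counts(weighted_edges, walk_length=10, min_walks=200,
                    max_walks=5000, seed=0):
    """Count node visits over weighted random walks on an undirected graph."""
    rng = random.Random(seed)
    neighbours = defaultdict(list)
    for (a, b), weight in weighted_edges.items():
        if weight > 0:  # assumption: zero-weight edges are not candidate edges
            # Undirected edges can be walked both ways, so none is dangling.
            neighbours[a].append((b, weight))
            neighbours[b].append((a, weight))
    nodes = sorted(neighbours)
    counts = Counter()
    for walk in range(max_walks):
        current = rng.choice(nodes)  # each walk starts from a random node
        for _ in range(walk_length):
            counts[current] += 1
            targets, weights = zip(*neighbours[current])
            current = rng.choices(targets, weights=weights, k=1)[0]
        # Stop once every node has been traversed (after a minimum number of walks).
        if walk + 1 >= min_walks and len(counts) == len(nodes):
            break
    return counts


# Hypothetical fraud count weights on three edges.
weighted_edges = {
    ("pan:p1", "merchant:m1"): 3,
    ("pan:p2", "merchant:m1"): 1,
    ("pan:p1", "ip:ip1"): 2,
}
counts = traverse_counts(weighted_edges)
total = sum(counts.values())
risk_scores = {node: count / total for node, count in counts.items()}
```

Running the same routine once with fraud count weights and once with fraud ratio weights would yield two scores per node, corresponding to the first rank feature and the second rank feature.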
In a non-limiting implementation, the server system may be configured to compute the set of second risk scores for each node by iteratively processing the fraud heterogeneous transaction graph till each node of the fraud heterogeneous transaction graph is traversed. In an embodiment, the server system may iteratively process the fraud heterogeneous transaction graph using the page rank model. The processing may include traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the second edge definition criteria. Herein, a candidate edge is not a dangling edge.
Further, the server system may be configured to compute a second traverse count for each node based on a number of times each node is traversed. The server system may finally determine the set of second risk scores for each node based, at least in part, on the second traverse count of the corresponding node. In one embodiment, the set of second risk scores may include a third rank feature and a fourth rank feature. The third rank feature is computed by setting the second edge definition criteria as the decline count metric. The fourth rank feature is computed by setting the second edge definition criteria as the decline ratio metric.
In a specific embodiment, for computing the set of second risk scores for each node, the server system may further be configured to determine a subset of preferred nodes from the fraud heterogeneous transaction graph based, at least in part, on the fraud ratio metric and a first preference threshold. The server system may be configured to iteratively process the fraud heterogeneous transaction graph till each node of the fraud heterogeneous transaction graph is traversed using the page rank model. The processing may include traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the second edge definition criteria. The processing may further include randomly re-selecting another starting node from the subset of preferred nodes for the next iteration. The server system may further be configured to compute a third traverse count for each node based on a number of times each node is traversed. The server system may finally determine the second risk score for each node based, at least in part, on the third traverse count of the corresponding node.
In an embodiment, the set of second risk scores may further include a fifth rank feature and a sixth rank feature. The fifth rank feature is computed by setting the second edge definition criteria as the decline count metric. The sixth rank feature is computed by setting the second edge definition criteria as the decline ratio metric.
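As a non-limiting illustration of the preferred-node variant, the sketch below restarts each subsequent walk from a node in the subset of preferred nodes. The rule that a node is preferred when any incident edge has a fraud ratio at or above the threshold, the number and length of walks, and the fixed seed are assumptions of this sketch only.

```python
import random
from collections import Counter, defaultdict


def preferred_nodes(fraud_ratio_by_edge, threshold):
    # Assumption: a node is preferred when any incident edge meets the threshold.
    return sorted({node
                   for pair, value in fraud_ratio_by_edge.items()
                   if value >= threshold
                   for node in pair})


def personalised_counts(edge_weights, preferred, walks=300, walk_length=8, seed=0):
    """Count node visits when every walk after the first restarts at a preferred node."""
    rng = random.Random(seed)
    neighbours = defaultdict(list)
    for (a, b), weight in edge_weights.items():
        if weight > 0:  # assumption: zero-weight edges are not candidate edges
            neighbours[a].append((b, weight))
            neighbours[b].append((a, weight))
    # Only restart from preferred nodes that have at least one candidate edge.
    starts = [n for n in preferred if n in neighbours] or sorted(neighbours)
    counts = Counter()
    current = rng.choice(sorted(neighbours))  # the first walk starts at a random node
    for _ in range(walks):
        for _ in range(walk_length):
            counts[current] += 1
            targets, weights = zip(*neighbours[current])
            current = rng.choices(targets, weights=weights, k=1)[0]
        current = rng.choice(starts)  # re-select the next starting node
    return counts


# Hypothetical per-edge metrics on a small fraud heterogeneous transaction graph.
fraud_ratio = {
    ("pan:p1", "merchant:m1"): 3.0,
    ("pan:p2", "merchant:m1"): 0.0,
    ("pan:p2", "merchant:m2"): 0.2,
}
decline_count = {
    ("pan:p1", "merchant:m1"): 2,
    ("pan:p2", "merchant:m1"): 1,
    ("pan:p2", "merchant:m2"): 1,
}
preferred = preferred_nodes(fraud_ratio, threshold=1.0)
counts = personalised_counts(decline_count, preferred)
```

Restarting from preferred nodes biases the traverse counts toward the neighbourhood of nodes with a high fraud ratio, which is the intended effect of the preference threshold in this sketch.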
Moreover, the server system may be configured to determine a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge, the third edge definition criteria comprising an approved count metric. In an embodiment, the approved count metric indicates a count of approved transactions between distinct nodes connected by the individual edge. The server system is further configured to compute for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights.
In a non-limiting implementation, to compute the set of third risk scores for each node, the server system may be configured to iteratively process the heterogeneous transaction graph till each node of the heterogeneous transaction graph is traversed using the page rank model. The processing may include traversing the heterogeneous transaction graph from a selected random node of the heterogeneous transaction graph for all candidate edges satisfying the third edge definition criteria. Herein, a candidate edge is not a dangling edge. The server system may further be configured to compute a fourth traverse count for each node based on a number of times each node is traversed. The server system may determine a seventh rank feature for each node based, at least in part, on the fourth traverse count of the corresponding node. The set of third risk scores may include the seventh rank feature.
Further, to compute the set of third risk scores for each node, the server system may be configured to determine a subset of preferred nodes from the heterogeneous transaction graph based, at least in part, on the approved count metric and a second preference threshold. The server system may further be configured to iteratively process the heterogeneous transaction graph till each node of the heterogeneous transaction graph is traversed using the page rank model. The processing may include traversing the heterogeneous transaction graph from a selected random node of the heterogeneous transaction graph for all candidate edges satisfying the third edge definition criteria. The processing may further include randomly re-selecting another starting node from the subset of preferred nodes for the next iteration. The server system may then be configured to compute a fifth traverse count for each node based on a number of times each node is traversed. The server system may determine the eighth rank feature for each node based, at least in part, on the fifth traverse count of the corresponding node, the set of third risk scores comprising the eighth rank feature.
In a non-limiting implementation, the server system is further configured to receive a transaction authentication request for a transaction between a first entity and a second entity from the plurality of entity sets. The server system is further configured to generate, for each node of each node set associated with the plurality of entity sets of the heterogeneous transaction graph, a set of updated node-related features based, at least in part, on the plurality of node-related features and the corresponding set of first risk scores, the corresponding set of second risk scores, and the corresponding set of third risk scores.
Further, the server system is configured to determine the plurality of updated node-related features of the first entity and the plurality of updated node-related features of the second entity. Furthermore, the server system is configured to determine a fraud score for the transaction based, at least in part, on the plurality of updated node-related features of the first entity and the plurality of updated node-related features of the second entity. In a specific embodiment, the server system may determine the fraud score using a classification model associated with the server system. In one embodiment, the server system may label the transaction as a fraudulent transaction based, at least in part, on the corresponding fraud score of the transaction being at least equal to a threshold value. In another embodiment, the server system may label the transaction as a non-fraudulent transaction based, at least in part, on the corresponding fraud score of the transaction being lower than a threshold value.
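The labelling rule in the two embodiments above may be expressed compactly as follows. The function name and the default threshold of 0.5 are assumptions for illustration; the disclosure does not fix a threshold value.

```python
def label_transaction(fraud_score: float, threshold: float = 0.5) -> str:
    """Label a transaction from its fraud score.

    A score at least equal to the threshold yields a fraudulent label;
    a lower score yields a non-fraudulent label.
    """
    return "fraudulent" if fraud_score >= threshold else "non-fraudulent"
```

Note that a score exactly equal to the threshold is labelled fraudulent, consistent with the "at least equal to" wording.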
Various embodiments of the present disclosure offer multiple advantages and technical effects. For instance, the present disclosure aims to solve the technical problem of how to improve the performance of a classification model that is configured to perform a Risk-Based Authentication (RBA). The present disclosure solves this technical problem by providing an approach that enables the classification model to learn from the 3DS2 data.
Further, the present disclosure provides various technical effects, such as generating a set of first risk scores, a set of second risk scores, and a set of third risk scores. In other words, the present disclosure provides eight additional features that can be used by the classification model to learn additional insights, thus improving its prediction accuracy. More specifically, the presence of these eight additional features in the updated node-related features, which represent the insights or learnings from the 3DS2 data, helps to improve the prediction of a fraud score for any transaction by the classification model. In some scenarios, to perform the RBA for an ongoing transaction within the 3DS 2.0 protocol for a payment eco-system, a classification model may be utilized by the directory server or the server system. This classification model can generate the fraud score for the transaction based on the updated node-related features of both the cardholder and the merchant. Further, the directory server may append the fraud score of the transaction within a transaction authentication request for the ongoing transaction within the 3DS2 protocol. Then, the directory server transmits the transaction authentication request to the Access Control Server (ACS) associated with the issuer of the cardholder. Then, the issuer or ACS may rely on the fraud score to determine whether to approve or decline the ongoing transaction based on its internal policies. As may be understood, since the fraud score is generated in part using the eight additional features described herein, the accuracy of the fraud prediction by the said fraud score for a transaction is improved, thus allowing for improved RBA in the 3DS 2.0 protocol.
Various embodiments of the present disclosure are described hereinafter with reference to FIGS. 1 to 8.
FIG. 1 illustrates a schematic representation of an environment 100 related to at least some embodiments of the present disclosure. Although the environment 100 is presented in one arrangement, other embodiments may include the parts of the environment 100 (or other parts) arranged otherwise depending on, for example, generating a heterogeneous transaction graph, processing the heterogeneous transaction graph using AI/ML models, computing risk scores using the heterogeneous transaction graph, performing fraud classification using the computed risk scores, and the like.
The environment 100 generally includes a plurality of entities, such as a server system 102, a plurality of cardholders 104(1), 104(2), … 104(N) (collectively, referred to as ‘a plurality of cardholders 104’ and ‘N’ is a Natural number), a plurality of merchants 106(1), 106(2), … 106(N) (collectively, referred to as ‘a plurality of merchants 106’ and ‘N’ is a Natural number), an acquirer server 108, an issuer server 110, and a payment network 112 including a payment server 114, each coupled to, and in communication with (and/or with access to) a network 116. The network 116 may include, without limitation, a Light Fidelity (Li-Fi) network, a Local Area Network (LAN), a Wide Area Network (WAN), a Metropolitan Area Network (MAN), a satellite network, the Internet, a fiber optic network, a coaxial cable network, an Infrared (IR) network, a Radio Frequency (RF) network, a virtual network, and/or another suitable public and/or private network capable of supporting communication among two or more of the parts or users illustrated in FIG. 1, or any combination thereof.
Various entities in the environment 100 may connect to the network 116 in accordance with various wired and wireless communication protocols, such as Transmission Control Protocol/Internet Protocol (TCP/IP), User Datagram Protocol (UDP), 2nd Generation (2G), 3rd Generation (3G), 4th Generation (4G), 5th Generation (5G) communication protocols, Long Term Evolution (LTE) communication protocols, future communication protocols or any combination thereof. For example, the network 116 may include multiple different networks, such as a private network made accessible by the server system 102 and a public network (e.g., the Internet, etc.) through which the server system 102, the acquirer server 108, the issuer server 110, and the payment server 114 may communicate.
In an embodiment, the plurality of cardholders 104 use one or more payment cards 118(1), 118(2), … 118(N) (collectively, referred to hereinafter as a plurality of payment cards 118 and ‘N’ is a Natural number) respectively to make payment transactions.
The cardholder (e.g., the cardholder 104(1)) may be any individual, representative of a corporate entity, a non-profit organization, or any other person who is presenting payment account details during an electronic payment transaction. The cardholder (e.g., the cardholder 104(1)) may have a payment account issued by an issuing bank (not shown in figures) associated with the issuer server 110 (explained later) and may be provided a payment card (e.g., the payment card 118(1)) with financial or other account information encoded onto the payment card (e.g., the payment card 118(1)) such that the cardholder (i.e., the cardholder 104(1)) may use the payment card 118(1) to initiate and complete a payment transaction using a bank account at the issuing bank.
In an example, the plurality of cardholders 104 may use their corresponding electronic devices (not shown in figures) to access a mobile application or a website associated with the issuing bank, or any third-party payment application. In various non-limiting examples, the electronic devices may refer to any electronic devices, such as, but not limited to, Personal Computers (PCs), tablet devices, Personal Digital Assistants (PDAs), voice-activated assistants, Virtual Reality (VR) devices, smartphones, and laptops.
The plurality of merchants 106 may include retail shops, restaurants, supermarkets or establishments, government and/or private agencies, or any such places equipped with POS terminals, where customers visit to perform financial transactions in exchange for any goods and/or services or any financial transactions.
In one scenario, the plurality of cardholders 104 may use their corresponding payment accounts to conduct payment transactions with the plurality of merchants 106. Moreover, it may be noted that each of the plurality of cardholders 104 may use their corresponding plurality of payment cards 118 differently or make the payment transaction using different means of payment. For instance, the cardholder 104(1) may enter payment account details on an electronic device (not shown) associated with the cardholder 104(1) to perform an online payment transaction. In another example, the cardholder 104(2) may utilize the payment card 118(2) to perform an offline payment transaction. The term “payment transaction” refers to an agreement that is carried out between a buyer and a seller to exchange goods or services in exchange for assets in the form of a payment (e.g., cash, fiat-currency, digital asset, cryptographic currency, coins, tokens, etc.). For example, the cardholder 104(3) may enter details of the payment card 118(3) to transfer funds in the form of fiat currency on an e-commerce platform to buy goods. In another instance, each cardholder of the plurality of cardholders 104 (e.g., the cardholder 104(1)) may transact at any merchant from the plurality of merchants 106 (e.g., the merchant 106(1)).
In one embodiment, the plurality of cardholders 104 is associated with the issuer server 110. In one embodiment, the issuer server 110 is associated with a financial institution normally called an “issuer bank”, “issuing bank” or simply “issuer”, in which a cardholder (e.g., the cardholder 104(1)) may have the payment account. The issuer also issues a payment card, such as a credit card or a debit card, and provides microfinance banking services (e.g., payment transaction using credit/debit cards) for processing electronic payment transactions to the cardholder (e.g., the cardholder 104(1)).
In an embodiment, the plurality of merchants 106 is associated with the acquirer server 108. In an embodiment, each merchant (e.g., the merchant 106(1)) is associated with an acquirer server (e.g., the acquirer server 108). In one embodiment, the acquirer server 108 is associated with a financial institution (e.g., a bank) that processes financial transactions. This can be an institution that facilitates the processing of payment transactions for physical stores, merchants (e.g., the merchants 106), or institutions that own platforms that make either online purchases or purchases made via software applications possible (e.g., shopping cart platform providers and in-app payment processing providers). The terms “acquirer”, “acquiring bank”, or “acquirer server” will be used interchangeably herein.
In one embodiment, the payment network 112 may be used by the payment card issuing authorities as a payment interchange network. Examples of the plurality of payment cards 118 include debit cards, credit cards, etc. Similarly, examples of payment interchange networks include but are not limited to, a Mastercard® payment system interchange network. The Mastercard® payment system interchange network is a proprietary communications standard promulgated by Mastercard International Incorporated® for the exchange of electronic payment transaction data between issuers and acquirers that are members of Mastercard International Incorporated®. (Mastercard is a registered trademark of Mastercard International Incorporated located in Purchase, N.Y.).
As explained earlier, there exist multiple challenges in performing fraud classification using Three Domain Secure 2.0 (3DS2) data due to its complexity. In particular, there exists a need for a solution that can learn from the interactions and behavior of the various entities in the 3DS2-enabled payment ecosystem to perform fraud detection within the payment eco-system. If graphs are generated using 3DS2 data regarding various entities, such as cardholder PAN, merchant, cardholder IP address, shipping address, cardholder email, and the like, then it will be an undirected multipartite graph (or a heterogeneous transaction graph). As may be understood, it is very complex to learn insights from heterogeneous transaction graphs due to their inherent complexity and lack of scalable solutions.
The above-mentioned technical problem among other problems is addressed by one or more embodiments implemented by the server system 102 of the present disclosure. In one embodiment, the server system 102 is configured to perform one or more of the operations described herein.
In one embodiment, the environment 100 may further include a database 120 coupled with the server system 102. In an example, the server system 102 coupled with the database 120 is embodied within the payment server 114; however, in other examples, the server system 102 can be a standalone component (acting as a hub) connected to the acquirer server 108 and the issuer server 110. The database 120 may be incorporated in the server system 102, may be an individual entity connected to the server system 102, or may be a database stored in cloud storage. In one embodiment, the database 120 may store a page rank model 122, and other necessary machine instructions required for implementing the various functionalities of the server system 102, such as firmware data, operating system, and the like.
In an example, the page rank model 122 is an AI or ML-based model that is configured or trained to perform a plurality of operations. In a non-limiting example, the page rank model 122 is a link analysis model. The page rank model 122 can be a graph-based model that is specifically designed to analyze graphs where nodes represent different entities and edges represent the links between different nodes. It is noted that the page rank model 122 is explained in detail later in the present disclosure with reference to FIG. 2 and FIG. 4. In addition, the database 120 provides a storage location for data and/or metadata obtained from various operations performed by the server system 102.
In various non-limiting examples, the database 120 may include one or more Hard Disk Drives (HDD), Solid-State Drives (SSD), an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a Redundant Array of Independent Disks (RAID) controller, a Storage Area Network (SAN) adapter, a network adapter, and/or any component providing the server system 102 with access to the database 120. In one implementation, the database 120 may be viewed, accessed, amended, updated, and/or deleted by an administrator (not shown) associated with the server system 102 through a database management system (DBMS) or relational database management system (RDBMS) present within the database 120.
In an embodiment, the server system 102 is configured to access a heterogeneous transaction graph from the database 120. The heterogeneous transaction graph may include a plurality of node sets associated with a plurality of entity sets and a plurality of edges. Each node set may include a plurality of nodes that represent a plurality of entities of an entity set. In some instances, the heterogeneous transaction graph is generated using a historical transaction dataset (not shown) stored in the database 120. In some examples, the historical transaction dataset may include real-time or historical transaction data (including the 3DS2 data) of the plurality of cardholders 104 and the plurality of merchants 106. The historical transaction data may include, but is not limited to, transaction attributes, such as transaction amount, source of funds, such as bank or credit cards, transaction timestamp, transaction channel used for loading funds such as POS terminal or ATM, transaction velocity features, such as count and transaction amount sent in the past ‘x’ number of days to a particular user, transaction location information, external data sources, merchant country, merchant Identifier (ID), cardholder ID, cardholder product, cardholder Primary Account Number (PAN), Merchant Category Code (MCC), merchant location data or merchant co-ordinates, merchant industry, merchant super industry, ticket price, fraud/non-fraud transaction label, approved/decline flag, and other transaction-related data.
In an embodiment, the heterogeneous transaction graph may be generated such that each node is associated with a plurality of node-related features of an individual entity. Furthermore, each distinct type of node belonging to different node sets and different entity sets may be linked using an edge such that each edge of the plurality of edges indicates information related to a transactional relationship between two distinct nodes connected by the edge. In various non-limiting examples, the plurality of node-related features may include transaction amount, transaction count in the past x days, cardholder presence indicator, card not present indicator, cardholder fraud count in the x days, domestic/international transaction, decline count in the past x days, approval count in the past x days, etc., among other suitable features.
In one instance, when a graph is generated using the 3DS2 data, the plurality of entity sets may include cardholders, merchants, IP addresses of the cardholders, and the like. The node sets for each entity set may correspond to the plurality of cardholders 104 for the cardholder entity set, the plurality of merchants 106 for the merchant entity set, the IP addresses corresponding to the plurality of cardholders 104 for the IP address entity set, and the like.
In an example, the edge between a particular cardholder node and a particular merchant node may indicate an interaction, such as a transaction between them. Further, the edge between that particular cardholder node and an IP address node may represent that the transaction took place with the merchant node through a cardholder device with the said IP address. Furthermore, the edge between the merchant node and the IP address node may represent that the merchant received the transaction request from that IP address node for the concerned transaction. It is understood that although the heterogeneous transaction graph along with the various embodiments of the present disclosure has been explained with reference to cardholders, merchants, and IP addresses as the entities, there may exist any number of other entities derived using the 3DS2 data, such as the shipping address, email address, and so on, in the heterogeneous transaction graph. Further, the various embodiments of the present disclosure cover such other entities as well within their scope.
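One possible way to assemble such a graph from transaction records is sketched below. The record field names (`cardholder`, `merchant`, `ip`, `fraud`, `approved`) and the per-edge tallies are hypothetical and chosen only for illustration; the disclosure does not prescribe a data layout.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical record fields, one per entity set.
ENTITY_FIELDS = ("cardholder", "merchant", "ip")


def build_graph(transactions):
    """Return (node_sets, edges) for an undirected multipartite graph.

    node_sets maps an entity type to the set of entity identifiers.
    edges maps a sorted pair of (entity_type, entity_id) nodes to tallies
    of the transactions that link the two nodes.
    """
    node_sets = defaultdict(set)
    edges = defaultdict(
        lambda: {"fraud": 0, "non_fraud": 0, "approved": 0, "declined": 0}
    )
    for txn in transactions:
        nodes = [(field, txn[field]) for field in ENTITY_FIELDS]
        for entity_type, entity_id in nodes:
            node_sets[entity_type].add(entity_id)
        # Link every pair of distinct entity types seen in the transaction.
        for a, b in combinations(nodes, 2):
            stats = edges[tuple(sorted((a, b)))]
            stats["fraud" if txn["fraud"] else "non_fraud"] += 1
            stats["approved" if txn["approved"] else "declined"] += 1
    return dict(node_sets), dict(edges)
```

Additional entity sets, such as shipping address or email address, could be supported in this sketch by extending `ENTITY_FIELDS`.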
Then, the server system 102 is configured to extract a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features including a fraud transaction label. It is understood that for a historical transaction, it is generally known whether that transaction was fraudulent or non-fraudulent after a given lag period. The lag period may be defined as the period during which the fraud labels have not matured (i.e., the transactions are not labelled as fraud or non-fraud yet). Therefore, when a label associated with any node includes a feature identifying that node to be involved in a fraud transaction, this information can be used to generate the fraud heterogeneous transaction graph. It is understood that if a merchant node is responsible for a fraudulent transaction with a single cardholder node, then there is an increased likelihood of the same node being involved in further fraudulent activities as well. Therefore, the fraud heterogeneous transaction graph should include all the cardholder nodes and IP address nodes with which the said fraud merchant node has transacted over a given time period even if the other transactions are non-fraudulent. Due to this reason, it may be understood that the fraud heterogeneous transaction graph may include a subset of fraud nodes for each node set, a subset of non-fraud nodes for each node set, and a subset of edges.
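The extraction step described above may be sketched as follows. Treating every node on an edge with at least one fraud transaction as a fraud node, and the edge layout `{(u, v): {"fraud": count}}`, are assumptions made for this illustration.

```python
def extract_fraud_subgraph(edges):
    """Return (fraud_nodes, non_fraud_nodes, sub_edges).

    `edges` maps an undirected edge (u, v) to a dict holding at least a
    "fraud" transaction count (hypothetical layout).
    """
    fraud_nodes = set()
    for (u, v), stats in edges.items():
        if stats["fraud"] > 0:
            fraud_nodes.update((u, v))
    # Keep every edge touching a fraud node, even if that particular
    # relationship carries only non-fraud transactions.
    sub_edges = {
        edge: stats
        for edge, stats in edges.items()
        if edge[0] in fraud_nodes or edge[1] in fraud_nodes
    }
    sub_nodes = {node for edge in sub_edges for node in edge}
    return fraud_nodes, sub_nodes - fraud_nodes, sub_edges
```

Nodes with no connection to any fraud node are dropped, which is why the extracted graph is typically much smaller than the full heterogeneous transaction graph.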
Then, the server system 102 is configured to determine a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge. The first edge definition criteria may include a fraud count metric and a fraud ratio metric. The fraud count metric refers to a count of the fraud transactions between the distinct nodes connected by an individual edge from the subset of edges. The fraud ratio metric refers to a ratio of fraud transactions to non-fraud transactions between the distinct nodes connected by the individual edge. It is understood that since only one edge definition may be used at a time for computing the weights of the edges, the first set of weights may further include a subset of primary first weights and a subset of secondary first weights, such that the subset of primary first weights is computed using the fraud count metric as the edge definition and the subset of secondary first weights is computed using the fraud ratio metric as the edge definition.
Similarly, the server system 102 is configured to determine a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge. The second edge definition criteria may include a decline count metric and a decline ratio metric. The decline count metric refers to a count of declined transactions between the distinct nodes connected by an individual edge from the subset of edges. The decline ratio metric refers to a ratio of declined transactions to approved transactions between the distinct nodes connected by the individual edge. As described earlier, only one edge definition may be used at a time for computing the weights of the edges; therefore, the second set of weights may further include a subset of primary second weights and a subset of secondary second weights, such that the subset of primary second weights is computed using the decline count metric as the edge definition and the subset of secondary second weights is computed using the decline ratio metric as the edge definition.
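The primary (count-based) and secondary (ratio-based) weights for the first and second edge definition criteria may be computed along the following lines. The tally keys and the handling of a zero denominator (the denominator is floored at 1) are assumptions of this sketch; the disclosure does not state how an edge with no non-fraud or no approved transactions is treated.

```python
def edge_weights(edges, numerator_key, denominator_key):
    """Return (primary, secondary) weights for every edge.

    primary   : count metric, e.g. number of fraud transactions
    secondary : ratio metric, e.g. fraud count / non-fraud count
    """
    primary = {edge: float(stats[numerator_key]) for edge, stats in edges.items()}
    secondary = {
        # Assumption: floor the denominator at 1 to avoid division by zero.
        edge: stats[numerator_key] / max(stats[denominator_key], 1)
        for edge, stats in edges.items()
    }
    return primary, secondary
```

For example, `edge_weights(edges, "fraud", "non_fraud")` would give the first set of weights and `edge_weights(edges, "declined", "approved")` the second set, assuming each edge carries those four tallies.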
Then, the server system 102 is configured to determine a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge. The third edge definition criteria may include an approved count metric. The approved count metric refers to a count of approved transactions between distinct nodes connected by the individual edge.
Then, the server system 102 is configured to compute, for each node of the fraud heterogeneous transaction graph, a set of first risk scores and a set of second risk scores based, at least in part, on the first set of weights and the second set of weights, respectively. In one example, the set of first risk scores for each node includes a first rank feature and a second rank feature. In one example, the set of second risk scores for each node includes a third rank feature, a fourth rank feature, a fifth rank feature, and a sixth rank feature. In particular, the server system 102 computes the set of first risk scores and the set of second risk scores for each node using the page rank model 122 or, in some instances, the page rank model 122 with a preference vector. It is noted that since the fraud heterogeneous transaction graph is considerably less complex than the heterogeneous transaction graph due to an overall lower number of fraudulent entities in the payment network, using it to generate the scores is faster and requires comparatively lower processing resources.
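For illustration, a page rank computation over weighted edges, with an optional preference vector, may be sketched using the standard power-iteration formulation. This is a generic formulation and not necessarily the one used by the page rank model 122; the damping factor of 0.85 and the iteration count are assumed values.

```python
from collections import defaultdict


def weighted_pagerank(weights, preference=None, damping=0.85, iterations=100):
    """Power-iteration page rank on an undirected weighted graph.

    `weights` maps an edge (u, v) to a positive weight. `preference`
    optionally maps nodes to non-negative teleport preferences; nodes
    missing from it receive no teleport mass.
    """
    out = defaultdict(dict)
    for (u, v), w in weights.items():
        if w > 0:
            out[u][v] = out[u].get(v, 0.0) + w
            out[v][u] = out[v].get(u, 0.0) + w
    nodes = sorted(out)
    if preference is None:
        preference = {node: 1.0 for node in nodes}  # uniform teleport
    total = sum(preference.get(node, 0.0) for node in nodes)
    if total <= 0:
        raise ValueError("preference vector has no mass on graph nodes")
    teleport = {node: preference.get(node, 0.0) / total for node in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {node: (1.0 - damping) * teleport[node] for node in nodes}
        for u in nodes:
            strength = sum(out[u].values())
            for v, w in out[u].items():
                # Pass rank along each edge in proportion to its weight.
                nxt[v] += damping * rank[u] * w / strength
        rank = nxt
    return rank
```

Passing a `preference` mapping that is non-zero only for a subset of preferred nodes concentrates the resulting scores around those nodes.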
In an embodiment, the first rank feature is computed using the page rank model 122 with the edge definition criteria set as the fraud count metric (i.e., weights used by the page rank model are the subset of primary first weights). The second rank feature is computed using the page rank model 122 with the edge definition criteria set as the fraud ratio metric (i.e., weights used by the page rank model are the subset of secondary first weights). The third rank feature is computed using the page rank model 122 with the edge definition criteria set as the decline count metric (i.e., weights used by the page rank model are the subset of primary second weights). The fourth rank feature is computed using the page rank model 122 with the edge definition criteria set as the decline ratio metric (i.e., weights used by the page rank model are the subset of secondary second weights). The fifth rank feature is computed using the page rank model 122 with the edge definition criteria set as the decline count metric along with the preference vector set of the page rank model 122 as the fraud ratio metric and a first preference threshold. The first preference threshold indicates a condition for the fraud ratio metric that should be met for each node before it may be added to a subset of preferred nodes. For instance, the first preference threshold may indicate a threshold magnitude of how high the fraud ratio metric for any node should be before it can be segregated as a preferred node. The sixth rank feature is computed using the page rank model 122 with the edge definition criteria set as the decline ratio metric along with the preference vector set of the page rank model 122 as the fraud ratio metric and a first preference threshold. It is noted that these features along with their significance have been explained later in the present disclosure.
Then, the server system 102 is configured to compute, for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights. In an example, the set of third risk scores for each node includes a seventh rank feature and an eighth rank feature. In an embodiment, the seventh rank feature is computed using the page rank model 122 with the edge definition criteria set as the approved count metric (i.e., weights used by the page rank model 122 are the third set of weights). Further, the eighth rank feature is computed using the page rank model 122 with the edge definition criteria set as the approved count metric along with the preference vector set of the page rank model 122 as the approved count metric and a second preference threshold. The second preference threshold indicates a condition for the approved count metric that should be met for each node before it may be added to a subset of preferred nodes. For instance, the second preference threshold may indicate a threshold magnitude of how high the approved count metric for any node should be before it can be segregated as a preferred node.
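The eight rank features described above differ only in the graph used, the edge definition, and the optional preference vector. That configuration may be summarised in a small table-like structure; the class and field names below are hypothetical and serve only to organise the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RankFeatureSpec:
    name: str
    graph: str                                   # "fraud" or "full" graph
    edge_metric: str                             # edge definition criteria
    preference_metric: Optional[str] = None      # metric behind the preference vector
    preference_threshold: Optional[str] = None   # which preference threshold applies


RANK_FEATURES = [
    RankFeatureSpec("first", "fraud", "fraud_count"),
    RankFeatureSpec("second", "fraud", "fraud_ratio"),
    RankFeatureSpec("third", "fraud", "decline_count"),
    RankFeatureSpec("fourth", "fraud", "decline_ratio"),
    RankFeatureSpec("fifth", "fraud", "decline_count", "fraud_ratio", "first"),
    RankFeatureSpec("sixth", "fraud", "decline_ratio", "fraud_ratio", "first"),
    RankFeatureSpec("seventh", "full", "approved_count"),
    RankFeatureSpec("eighth", "full", "approved_count", "approved_count", "second"),
]
```

Features one and two form the set of first risk scores, features three to six the set of second risk scores, and features seven and eight the set of third risk scores.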
Then, the server system 102 is configured to generate a set of updated node-related features for each node of each node set associated with the plurality of entity sets of the heterogeneous transaction graph based, at least in part, on the plurality of node-related features for said node and a corresponding set of first risk scores of said node, a corresponding set of second risk scores of said node, and a corresponding set of third risk scores of said node. In other words, the generated scores act like additional features for their respective nodes. These scores provide additional insights into the risky behavior of their respective nodes. Further, the server system 102 is configured to determine a fraud score for a transaction based, at least in part, on the set of updated node-related features corresponding to the nodes involved in performing the said transaction. Herein, the fraud score indicates the likelihood of the transaction being a fraudulent transaction. In some instances, a classification model may be used by the server system 102 to compute the fraud score for each transaction between the plurality of cardholders 104 and the plurality of merchants 106. In various non-limiting examples, the classification model may refer to any AI/ML-based classification model, such as a logistic regression model, a Multi-Layer Perceptron (MLP) model, a Gradient boosting classifier model, and the like. Thereafter, the server system 102 may be configured to perform other operations as well. These operations, along with the operations described earlier, are explained in further detail with reference to FIG. 2.
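The feature-augmentation step and the subsequent scoring may be sketched as follows, using a logistic function as a stand-in for the classification model. The function names are hypothetical, and the coefficients are placeholders that would in practice come from a trained model.

```python
import math


def updated_features(node_features, first_scores, second_scores, third_scores):
    """Append the risk scores to the existing node-related features."""
    return list(node_features) + list(first_scores) + list(second_scores) + list(third_scores)


def fraud_score(cardholder_vec, merchant_vec, coefficients, intercept=0.0):
    """Logistic score over the concatenated updated features of both nodes.

    `coefficients` are placeholder values for this sketch, not trained weights.
    """
    x = list(cardholder_vec) + list(merchant_vec)
    if len(coefficients) != len(x):
        raise ValueError("one coefficient is required per input feature")
    z = intercept + sum(c * v for c, v in zip(coefficients, x))
    return 1.0 / (1.0 + math.exp(-z))
```

Because only the input vector grows, the same scoring function can be used before and after the risk scores are added, provided the coefficient vector is extended accordingly.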
As may be understood, the sets of first, second, and third risk scores act as additional features for the classification model, thereby enabling the classification model to learn from 3DS2 data. The present approach can be implemented without changing the existing architecture of the classification model by simply modifying the input features fed to the model. In other words, the first, second, third, fourth, fifth, sixth, seventh, and eighth rank features (which are risk score-based features) improve the performance of the classification model while predicting fraud by providing additional insights into the behavior of the various entities in the 3DS 2.0 enabled payment eco-system.
The server system 102 is a separate part of the environment 100 and may operate apart from (but still in communication with, for example, via the network 116) any third-party external servers (to access data to perform the various operations described herein). However, in other embodiments, the server system 102 may be incorporated, in whole or in part, into one or more parts of the environment 100.
The number and arrangement of systems, devices, and/or networks shown in FIG. 1 are provided as an example. There may be additional systems, devices, and/or networks; fewer systems, devices, and/or networks; different systems, devices, and/or networks; and/or differently arranged systems, devices, and/or networks than those shown in FIG. 1. Furthermore, two or more systems or devices shown in FIG. 1 may be implemented within a single system or device, or a single system or device shown in FIG. 1 may be implemented as multiple, distributed systems or devices. In addition, the server system 102 should be understood to be embodied in at least one computing device in communication with the network 116, which may be specifically configured, via executable instructions, to perform steps as described herein, and/or embodied in at least one non-transitory computer-readable media.
FIG. 2 illustrates a simplified block diagram of a server system 200, in accordance with an embodiment of the present disclosure. It is noted that the server system 200 is identical to the server system 102 of FIG. 1. In one embodiment, the server system 200 is a part of the payment network 112 or integrated within the payment server 114. In some embodiments, the server system 200 is embodied as a cloud-based and/or Software as a Service (SaaS) based architecture.
The server system 200 includes a computer system 202 and a database 204. It is noted that the database 204 is identical to the database 120 of FIG. 1. The computer system 202 includes at least one processor 206 for executing instructions, a memory 208, a communication interface 210, a user interface 212, and a storage interface 214 that communicate with each other via a bus 234.
In some embodiments, the database 204 is integrated into the computer system 202. For example, the computer system 202 may include one or more hard disk drives as the database 204. The user interface 212 is any component capable of providing an administrator (not shown) of the server system 200, the ability to interact with the server system 200. This user interface 212 may be a Graphical User Interface (GUI) or Human Machine Interface (HMI) that can be used by the administrator to configure the various operational parameters of the server system 200. The storage interface 214 is any component capable of providing the processor 206 with access to the database 204. The storage interface 214 may include, for example, an Advanced Technology Attachment (ATA) adapter, a Serial ATA (SATA) adapter, a Small Computer System Interface (SCSI) adapter, a RAID controller, a SAN adapter, a network adapter, and/or any component providing the processor 206 with access to the database 204. In one non-limiting example, the database 204 is configured to store a historical transaction dataset 216, a page rank model 218, a classification model 220, and the like. It is noted that the page rank model 218 is identical to the page rank model 122 of FIG. 1.
The processor 206 includes suitable logic, circuitry, and/or interfaces to execute operations for computing risk scores for each node of a heterogeneous transaction graph. In other words, the processor 206 includes suitable logic, circuitry, and/or interfaces to execute operations for the machine learning model, such as the page rank model 218, and the classification model 220. Examples of the processor 206 include but are not limited to, an Application-Specific Integrated Circuit (ASIC) processor, a Reduced Instruction Set Computing (RISC) processor, a Graphical Processing Unit (GPU), a Complex Instruction Set Computing (CISC) processor, a Field-Programmable Gate Array (FPGA), and the like.
The memory 208 includes suitable logic, circuitry, and/or interfaces to store a set of computer-readable instructions for performing various operations described herein. Examples of the memory 208 include a Random-Access Memory (RAM), a Read-Only Memory (ROM), a removable storage drive, a Hard Disk Drive (HDD), and the like. It will be apparent to a person skilled in the art that the scope of the disclosure is not limited to realizing the memory 208 in the server system 200, as described herein. In another embodiment, the memory 208 may be realized in the form of a database server or a cloud storage working in conjunction with the server system 200, without departing from the scope of the present disclosure.
The processor 206 is operatively coupled to the communication interface 210, such that the processor 206 can communicate with (i.e., send data to and receive data from) a remote device 222 such as the issuer server 110, the acquirer server 108, the payment server 114, or any other entity connected to the network 116 (as shown in FIG. 1).
It is noted that the server system 200 as illustrated and hereinafter described is merely illustrative of an apparatus that could benefit from embodiments of the present disclosure and, therefore, should not be taken to limit the scope of the present disclosure. It is noted that the server system 200 may include fewer or more components than those depicted in FIG. 2.
In one implementation, the processor 206 includes a data pre-processing module 224, a graph generation module 226, a risk score generation module 228, a fraud detection module 230, and a notification module 232. It should be noted that components, described herein, such as the data pre-processing module 224, the graph generation module 226, the risk score generation module 228, the fraud detection module 230, and the notification module 232 can be configured in a variety of ways, including electronic circuitries, digital arithmetic, and logic blocks, and memory systems in combination with software, firmware, and embedded technologies.
In an embodiment, the data pre-processing module 224 includes suitable logic and/or interfaces for accessing the historical transaction dataset 216 from the database 204 associated with the server system 200. In particular, the historical transaction dataset 216 may at least include information related to a plurality of entity sets. In some instances, the historical transaction dataset 216 includes the 3DS2 data for various transactions taking place within a 3DS 2.0 enabled payment ecosystem. In various non-limiting examples, the 3DS2 data may include information related to various entities, such as IP addresses of the cardholder devices used for initiating various transactions, shipping addresses of cardholders, biometric information of cardholders, emails of the cardholders, and so on.
In various non-limiting examples, the plurality of entity sets may include a cardholder entity set, a merchant entity set, an IP entity set, an email entity set, a shipping address entity set, and so on. It is noted that each entity set includes a plurality of entities of the same entity type. For instance, a cardholder entity set includes a plurality of cardholders 104. Further, the information related to these entities may include information related to a plurality of historical payment transactions performed by the entities and the relationships between different entities due to these transactions. The plurality of historical payment transactions may be performed at different times within a predetermined interval of time (e.g., 6 months, 12 months, 24 months, etc.) and stored in the historical transaction dataset 216. In other words, the historical transaction dataset 216 defines the relationships between the different entities within different entity sets within the payment eco-system. In an instance, a relationship between a cardholder's PAN, a merchant account, and an IP address of the cardholder device is established when a transaction takes place between the cardholder and the merchant using the said cardholder device. For example, when a cardholder purchases an item from a merchant using a mobile device associated with a specific IP address, a relationship is established between that cardholder, that merchant, and that IP address.
In some other non-limiting examples, the historical transaction dataset 216 includes information related to at least a merchant name identifier, cardholder name, cardholder identifier, unique merchant identifier, timestamp information (i.e., transaction date/time), geo-location related data (i.e., latitude and longitude of the cardholder/merchant), Merchant Category Code (MCC), merchant industry, merchant super industry, information related to payment instruments involved in the set of historical payment transactions, PAN, country code, transaction identifier, transaction amount, fraud transaction label or non-fraud transaction label, approved transaction label or declined transaction label, and the like.
In addition, the data pre-processing module 224 is configured to generate the plurality of node-related features for each entity of the plurality of entities in each entity set based, at least in part, on the historical transaction dataset 216. In various non-limiting examples, the data pre-processing module 224 may utilize any feature generation approaches, such as but not limited to one-hot encoding, binning, and the like to generate the node-related features. It is understood that such feature generation techniques are already known in the art, therefore the same are not explained here for the sake of brevity. The data pre-processing module 224 may further store the generated features in the historical transaction dataset 216 for further use by the various modules of the server system 200 as well. In other words, the historical transaction dataset 216 may include the plurality of node-related features for each entity.
In another embodiment, the data pre-processing module 224 is communicably coupled to the graph generation module 226 and is configured to transmit the plurality of node-related features for each entity to the graph generation module 226.
In an embodiment, the graph generation module 226 includes suitable logic and/or interfaces for generating a heterogeneous transaction graph based, at least in part, on the historical transaction dataset and the plurality of node-related features for each entity. In various non-limiting examples, the heterogeneous transaction graph may include a plurality of node sets associated with a plurality of entity sets and a plurality of edges. Each node set may include a plurality of nodes representing a plurality of entities of an entity set. Further, each node is associated with a plurality of node-related features of an individual entity. Furthermore, an edge of the plurality of edges indicates information related to a transactional relationship between two distinct nodes connected by the edge.
In particular, the node-related features for each node are fed to the graph generation module 226. Then, the graph generation module 226 determines one or more features required for the generation of the heterogeneous transaction graph by analyzing the information related to the plurality of entity sets included in the historical transaction dataset 216. For instance, one or more features corresponding to each entity from an entity set may be included in a node corresponding to the said entity. Then, these nodes are connected to each other with a plurality of edges. To that end, the nodes within the heterogeneous transaction graph may be connected with one or more edges between them. Herein, the edges may define the relationship between different nodes (i.e., nodes of different entity types). In a non-limiting example, the graph generation module 226 identifies the cardholders 104(1)-104(3) that have made payment transactions with the merchants 106(1)-106(3) using IP addresses A, B, and C based at least on the information related to historical payment transactions between the plurality of cardholders 104 and the plurality of merchants 106. Then a heterogeneous transaction graph including nodes corresponding to the cardholders 104(1)-104(3), merchants 106(1)-106(3), and IP addresses A, B, and C may be generated with edges connecting these nodes. The said edges may be generated based on the transactional relationship between these entities.
As an example, a schematic representation of a heterogeneous transaction graph has been explained further in detail later in the present disclosure with reference to FIG. 3. Upon generation of the heterogeneous transaction graph, the heterogeneous transaction graph may be stored in the database 204 associated with the server system 200. It is noted that when the server system 200 is configured to compute the risk scores, the server system 200 may access the heterogeneous transaction graph from the database 204. In a situation, where the heterogeneous transaction graph is not available, the server system 200 may generate the heterogeneous transaction graph based on the process described earlier. In one embodiment, the server system 200 is configured to extract a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features including a fraud transaction label.
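The graph generation and fraud-graph extraction described above can be sketched as follows. This is a minimal illustration only: the record fields (`pan`, `merchant`, `ip`, `fraud`), the function names, and the rule that an edge belongs to the fraud graph when at least one of its transactions carries a fraud label are assumptions made for the example and are not taken from the disclosure.

```python
from collections import defaultdict

# Hypothetical transaction records; the field names are illustrative assumptions.
TRANSACTIONS = [
    {"pan": "C1", "merchant": "M1", "ip": "A", "fraud": True},
    {"pan": "C2", "merchant": "M1", "ip": "B", "fraud": False},
    {"pan": "C2", "merchant": "M2", "ip": "B", "fraud": False},
]


def build_graph(records):
    """Build an undirected heterogeneous graph from transaction records.

    Nodes are (entity_type, identifier) tuples. Each edge is a frozenset of
    two nodes and maps to the list of transactions that created it.
    """
    edges = defaultdict(list)
    for tx in records:
        cardholder = ("cardholder", tx["pan"])
        merchant = ("merchant", tx["merchant"])
        ip = ("ip", tx["ip"])
        # One transaction links every pair of the three participating entities.
        for u, v in ((cardholder, merchant), (cardholder, ip), (merchant, ip)):
            edges[frozenset((u, v))].append(tx)
    nodes = {node for edge in edges for node in edge}
    return nodes, dict(edges)


def extract_fraud_graph(edges):
    """Keep edges with at least one fraud-labeled transaction (assumed rule)."""
    fraud_edges = {
        edge: txs for edge, txs in edges.items() if any(t["fraud"] for t in txs)
    }
    fraud_nodes = {node for edge in fraud_edges for node in edge}
    return fraud_nodes, fraud_edges


nodes, edges = build_graph(TRANSACTIONS)
fraud_nodes, fraud_edges = extract_fraud_graph(edges)
```

In this sketch each edge keeps its underlying transactions, so that edge weights can later be derived from the fraud and decline labels of those transactions.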
In another embodiment, the graph generation module 226 is communicably coupled to the risk score generation module 228 and is configured to transmit the heterogeneous transaction graph to the risk score generation module 228.
In an embodiment, the risk score generation module 228 includes suitable logic and/or interfaces for determining a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge. In an example, the first edge definition criteria include a fraud count metric and a fraud ratio metric. It is understood that since only one edge definition may be used at a time for computing the weights of the edges, the first set of weights may further include a subset of primary first weights (corresponding to the fraud count metric) and a subset of secondary first weights (corresponding to the fraud ratio metric). It is understood that the intuition behind computing weights using the fraud count metric and the fraud ratio metric is that highly influential nodes in the network of fraudulent transactions could carry higher associated risk, and thus more weight should be assigned to them. Further, non-fraud transactions (using the fraud ratio metric) are used to reduce the effect of fraud count on larger merchants, since they might be disproportionately affected by a higher fraud count.
Further, the risk score generation module 228 is configured to determine a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge. In an example, the second edge definition criteria may include a decline count metric and a decline ratio metric. Similarly, the second set of weights may further include a subset of primary second weights and a subset of secondary second weights. It is understood that the intuition behind computing weights using the decline count metric and the decline ratio metric is that in the absence of fraud labels, declined transactions could serve as a proxy for the flow of risk in the payment network.
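As a hedged illustration of the four edge metrics named above, the sketch below derives them from the transactions attached to an edge. The disclosure does not give exact formulas, so the definitions used here (a ratio is the count divided by the total number of transactions on the edge) and the field names `fraud` and `approved` are assumptions.

```python
def edge_weights(edge_transactions):
    """Compute count and ratio metrics for every edge.

    edge_transactions maps an edge key to a non-empty list of transaction
    dicts with boolean 'fraud' and 'approved' fields (assumed schema).
    """
    weights = {}
    for edge, txs in edge_transactions.items():
        total = len(txs)
        fraud = sum(1 for tx in txs if tx["fraud"])
        declined = sum(1 for tx in txs if not tx["approved"])
        weights[edge] = {
            "fraud_count": fraud,               # primary first weight
            "fraud_ratio": fraud / total,       # secondary first weight (assumed formula)
            "decline_count": declined,          # primary second weight
            "decline_ratio": declined / total,  # secondary second weight (assumed formula)
        }
    return weights


# Hypothetical edge between cardholder C1 and merchant M1 with four transactions.
example = {
    ("C1", "M1"): [
        {"fraud": True, "approved": False},
        {"fraud": False, "approved": False},
        {"fraud": False, "approved": True},
        {"fraud": False, "approved": True},
    ]
}
w = edge_weights(example)[("C1", "M1")]
```

Only one of the four metrics would be selected as the edge weight for any single run of the page rank model, which is why they are kept as separate named entries here.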
Then, the risk score generation module 228 is configured to compute for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights, respectively. In an example, the set of first risk scores may include a first rank feature and a second rank feature. In an example, the set of second risk scores may include a third rank feature, a fourth rank feature, a fifth rank feature, and a sixth rank feature. In an example, the set of third risk scores may include a seventh rank feature, and an eighth rank feature. It is noted that the term ‘rank feature’ represents a risk score that can be used as a feature for the classification model 220.
In another embodiment, to determine the set of first risk scores, the risk score generation module 228 is configured to utilize the page rank model 218. In particular, the risk score generation module 228 is configured to iteratively process the fraud heterogeneous transaction graph using the page rank model 218 till each node of the fraud heterogeneous transaction graph is traversed. The processing includes traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the first edge definition criteria. Herein, a candidate edge is not a dangling edge. This iterative process is known as the random walk algorithm. Further, the risk score generation module 228 is configured to compute a first traverse count for each node based on a number of times each node is traversed. Furthermore, the risk score generation module 228 is configured to determine the set of first risk scores for each node based, at least in part, on the first traverse count of the corresponding node. As may be understood, if the first edge definition criteria is set as the fraud count metric for the processing steps, then the output would be the first rank feature for each node. In other words, the page rank model 218 uses the subset of primary first weights (corresponding to the fraud count metric) for computing the first rank feature for each node. On the other hand, if the first edge definition criteria is set as the fraud ratio metric for the processing steps, then the output would be the second rank feature. In other words, the page rank model 218 uses the subset of secondary first weights (corresponding to the fraud ratio metric) for computing the second rank feature for each node.
In another embodiment, to determine the set of second risk scores, the risk score generation module 228 is configured to utilize the page rank model 218. In one embodiment, the risk score generation module 228 is configured to iteratively process the fraud heterogeneous transaction graph using the page rank model 218 till each node of the fraud heterogeneous transaction graph is traversed. The processing includes traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the second edge definition criteria. This iterative process is known as the random walk algorithm. Further, the risk score generation module 228 is configured to compute a second traverse count for each node based on a number of times each node is traversed. Furthermore, the risk score generation module 228 is configured to determine the set of second risk scores for each node based, at least in part, on the second traverse count of the corresponding node. As may be understood, if the second edge definition criteria is set as the decline count metric for the processing steps, then the output would be the third rank feature for each node. In other words, the page rank model 218 uses the subset of primary second weights (corresponding to the decline count metric) for computing the third rank feature for each node. On the other hand, if the second edge definition criteria is set as the decline ratio metric for the processing steps, then the output would be the fourth rank feature. In other words, the page rank model 218 uses the subset of secondary second weights (corresponding to the decline ratio metric) for computing the fourth rank feature for each node.
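The traverse-count procedure described above can be sketched as a weighted random walk. This is an illustrative simplification rather than the disclosed implementation: the fixed step budget, the seed, the restart at a random node when no candidate edge exists, and the normalization of counts into scores are all assumptions.

```python
import random
from collections import Counter


def random_walk_scores(adjacency, steps=10_000, seed=0):
    """Estimate a rank feature from traverse counts.

    adjacency maps node -> {neighbour: positive edge weight}, with edges
    already expanded in both directions. The weight is whichever edge
    definition criterion is active (e.g., fraud count or decline ratio).
    """
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    counts = Counter()
    current = rng.choice(nodes)  # selected random starting node
    for _ in range(steps):
        counts[current] += 1  # traverse count for this node
        neighbours = adjacency[current]
        if not neighbours:  # no candidate edge: restart elsewhere
            current = rng.choice(nodes)
            continue
        targets = list(neighbours)
        weights = [neighbours[t] for t in targets]
        # Follow an edge with probability proportional to its weight.
        current = rng.choices(targets, weights=weights, k=1)[0]
    return {node: counts[node] / steps for node in nodes}


# Hypothetical fraud graph: merchant M1 linked to two cardholders and one IP address.
ADJ = {
    "M1": {"C1": 3.0, "C2": 1.0, "IP_A": 1.0},
    "C1": {"M1": 3.0},
    "C2": {"M1": 1.0},
    "IP_A": {"M1": 1.0},
}
scores = random_walk_scores(ADJ)
```

In this toy graph the walk alternates between the merchant and one of its neighbours, so the merchant accumulates half of all visits, and the cardholder joined by the heaviest edge is visited more often than the other neighbours.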
Further, the page rank model 218 may also be used with a personalization vector (referred to herein as a preference vector) to bias the feature generated by the model. To that end, the risk score generation module 228 is configured to determine a subset of preferred nodes from the fraud heterogeneous transaction graph based, at least in part, on the fraud ratio metric (herein, the fraud ratio metric is the preference vector) and a first preference threshold. Here, the first preference threshold indicates the magnitude of the fraud ratio metric that should be used as the preference vector. In a non-limiting example, the first preference threshold may be defined by an administrator (not shown) of the server system 200. Further, the risk score generation module 228 is configured to iteratively process the fraud heterogeneous transaction graph using the page rank model 218 till each node of the fraud heterogeneous transaction graph is traversed. The processing includes traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the second edge definition criteria. Then, the processing includes randomly re-selecting another starting node from the subset of preferred nodes for the next iteration. This step biases the page rank model 218 to restart at nodes whose fraud ratio meets the first preference threshold (e.g., nodes with a high fraud ratio), which, in turn, biases the scores towards nodes with a high fraud ratio (which could be at higher risk of fraud).
Further, the risk score generation module 228 is configured to compute a third traverse count for each node based on a number of times each node is traversed. Furthermore, the risk score generation module 228 is configured to determine the set of second risk scores for each node based, at least in part, on the third traverse count of the corresponding node. As may be understood, if the second edge definition criteria is set as the decline count metric for the processing steps, then the output would be the fifth rank feature for each node. It is noted that fifth rank feature would be different from the third rank feature due to the presence of a bias due to the use of a preference vector during its generation. On the other hand, if the second edge definition criteria is set as the decline ratio metric for the processing steps, then the output would be the sixth rank feature. Similarly, it is noted that sixth rank feature would be different from the fourth rank feature due to the presence of a bias due to the use of a preference vector during its generation.
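The effect of restarting from preferred nodes can be illustrated with the following sketch. It is a simplified assumption-laden example: the rule "fraud ratio at least equal to the threshold", the restart probability of 1 minus a damping value of 0.85, the step budget, and the function names are hypothetical choices and are not specified in the disclosure.

```python
import random
from collections import Counter


def preferred_nodes(fraud_ratio, threshold):
    """Select nodes whose fraud ratio is at least the preference threshold."""
    return sorted(n for n, ratio in fraud_ratio.items() if ratio >= threshold)


def biased_walk_scores(adjacency, preferred, steps=20_000, damping=0.85, seed=0):
    """Random walk that always restarts from the preferred node subset."""
    rng = random.Random(seed)
    counts = Counter()
    current = rng.choice(preferred)
    for _ in range(steps):
        counts[current] += 1
        neighbours = adjacency.get(current, {})
        # Restart with probability (1 - damping), or when no edge can be followed.
        if not neighbours or rng.random() > damping:
            current = rng.choice(preferred)
            continue
        targets = list(neighbours)
        weights = [neighbours[t] for t in targets]
        current = rng.choices(targets, weights=weights, k=1)[0]
    return {node: counts[node] / steps for node in adjacency}


# Hypothetical symmetric 4-cycle: without bias every node would score equally.
CYCLE = {
    "a": {"b": 1.0, "d": 1.0},
    "b": {"a": 1.0, "c": 1.0},
    "c": {"b": 1.0, "d": 1.0},
    "d": {"c": 1.0, "a": 1.0},
}
FRAUD_RATIO = {"a": 0.6, "b": 0.1, "c": 0.0, "d": 0.2}
preferred = preferred_nodes(FRAUD_RATIO, threshold=0.5)
biased = biased_walk_scores(CYCLE, preferred)
```

Because the graph is symmetric, any difference between the scores comes from the restart bias alone: the preferred node receives the largest share of visits and the node farthest from it receives the smallest, which is the kind of difference that separates the fifth and sixth rank features from the third and fourth.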
In another embodiment, to determine the seventh rank feature of the set of third risk scores, the risk score generation module 228 is configured to utilize the page rank model 218. In particular, the risk score generation module 228 is configured to iteratively process the heterogeneous transaction graph using the page rank model 218 till each node of the heterogeneous transaction graph is traversed. The processing includes traversing the heterogeneous transaction graph from a selected random node of the heterogeneous transaction graph for all candidate edges satisfying the third edge definition criteria. This iterative process is known as the random walk algorithm. Further, the risk score generation module 228 is configured to compute a fourth traverse count for each node based on a number of times each node is traversed. Furthermore, the risk score generation module 228 is configured to determine the seventh rank feature for each node based, at least in part, on the fourth traverse count of the corresponding node. Herein, the page rank model 218 uses the set of third weights for computing the seventh rank feature for each node.
In another embodiment, to determine the eighth rank feature of the set of third risk scores, the risk score generation module 228 is configured to utilize the page rank model 218 with a preference vector. In particular, the risk score generation module 228 is configured to determine a subset of preferred nodes from the heterogeneous transaction graph based, at least in part, on the approved count metric and a second preference threshold. Here, the second preference threshold indicates the magnitude of the approved count metric that should be used as the preference vector. In a non-limiting example, the second preference threshold may be defined by an administrator (not shown) of the server system 200. Further, the risk score generation module 228 is configured to iteratively process the heterogeneous transaction graph using the page rank model 218 till each node of the heterogeneous transaction graph is traversed. Then, the processing includes randomly re-selecting another starting node from the subset of preferred nodes for the next iteration. This step biases the page rank model 218 to restart at nodes whose approved count meets the second preference threshold (e.g., nodes with a high approved count), which, in turn, biases the scores towards more commonly visited nodes (which could be of lower risk).
Further, the risk score generation module 228 is configured to compute a fifth traverse count for each node based on a number of times each node is traversed. Furthermore, the risk score generation module 228 is configured to determine the eighth rank feature for each node based, at least in part, on the fifth traverse count of the corresponding node. It is noted that the eighth rank feature would be different from the seventh rank feature due to the presence of a bias due to the use of a preference vector during its generation.
In another embodiment, the risk score generation module 228 is communicably coupled to the fraud detection module 230 and is configured to transmit the set of first, second, and third scores to the fraud detection module 230.
In an embodiment, the fraud detection module 230 includes suitable logic and/or interfaces for generating a set of updated node-related features for each node of the heterogeneous transaction graph based, at least in part, on the plurality of node-related features and a corresponding set of first risk scores, a corresponding set of second risk scores, and a corresponding set of third risk scores. In particular, the first rank feature, the second rank feature, the third rank feature, the fourth rank feature, the fifth rank feature, the sixth rank feature, the seventh rank feature, and the eighth rank feature are added to the existing plurality of node-related features to generate the set of updated node-related features.
Further, the fraud detection module 230 is configured to determine using the classification model 220, a fraud score for a transaction performed between two distinct nodes based on the set of updated node-related features corresponding to each of the distinct nodes. As may be understood, the presence of these eight additional features in the updated node-related features which represent the insights or learnings from the 3DS2 data, helps to improve the prediction of a fraud score for the said transaction by the classification model 220. In an instance, if the fraud score for a particular transaction is at least equal to a threshold value, then the said transaction may be labeled as a fraudulent transaction by the fraud detection module 230. In an alternative instance, if the fraud score for a particular transaction is lower than the threshold value, then the said transaction may be labeled as a non-fraudulent transaction by the fraud detection module 230. In a non-limiting example, the threshold value may be defined by the administrator of the server system 200.
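The feature update and thresholding steps can be sketched as below. The function names and numeric values are hypothetical, and the fraud score is treated as an input, since the internals of the classification model 220 are not described here.

```python
RANK_FEATURE_COUNT = 8  # first through eighth rank features


def update_features(node_features, rank_features):
    """Append the eight rank features to a node's existing feature vector."""
    if len(rank_features) != RANK_FEATURE_COUNT:
        raise ValueError("expected exactly eight rank features")
    return list(node_features) + list(rank_features)


def label_transaction(fraud_score, threshold):
    """Label a transaction; a score at least equal to the threshold is fraud."""
    return "fraud" if fraud_score >= threshold else "non-fraud"


# Hypothetical cardholder node with two existing features.
updated = update_features([0.2, 1.0], [0.1] * 8)
at_threshold = label_transaction(0.7, threshold=0.7)
below_threshold = label_transaction(0.69, threshold=0.7)
```

The comparison uses "greater than or equal to" so that a score exactly at the threshold is labeled fraudulent, matching the "at least equal to" wording above.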
In another embodiment, the fraud detection module 230 is communicably coupled to the notification module 232 and is configured to transmit the fraud score to the notification module 232.
In an embodiment, the notification module 232 includes suitable logic and/or interfaces for facilitating a transmission of a notification including the fraud score of a transaction to either an issuer or an acquirer in the payment network. Further, upon receiving a new transaction request or a transaction authentication request from a cardholder, the issuer may consider the fraud score associated with the said transaction to either approve or decline the transaction.
In a particular application, within the 3DS 2.0 protocol for a payment eco-system, it is understood that a Risk-Based Authentication (RBA) is performed for each payment transaction. Generally, RBA is performed by a directory server (such as a payment processor) for every e-commerce transaction initiated by a cardholder, such as the cardholder 104(1), with a merchant, such as the merchant 106(1). A Merchant Plug-In (MPI) associated with the merchant 106(1) is generally responsible for collecting the 3DS2 data of the cardholder 104(1) during the ongoing transaction. This 3DS2 data is stored by the directory server in its storage. In some instances, the server system 200 may be implemented within the directory server. In such instances, the 3DS2 data may be stored in the historical transaction dataset 216. To perform the RBA for an ongoing transaction, a classification model, such as the classification model 220, may be utilized by the directory server or the server system 200. This classification model 220 may generate the fraud score for the transaction based on the updated node-related features of both the cardholder 104(1) and the merchant 106(1). Further, the directory server may append the fraud score of the transaction within a transaction authentication request for the ongoing transaction within the 3DS2 protocol. Then, the directory server transmits the transaction authentication request to the Access Control Server (ACS) associated with the issuer of the cardholder 104(1). Then, the issuer or ACS may rely on the fraud score to determine whether to approve or decline the ongoing transaction based on its internal policies. As may be understood, since the fraud score is generated in part using the eight additional features described herein, the accuracy of the fraud prediction by the said fraud score for a transaction is improved. This allows for improved RBA in the 3DS 2.0 protocol.
FIG. 3 illustrates a schematic representation of a heterogeneous transaction graph 300, in accordance with an embodiment of the present disclosure. As described earlier, the graph generation module 226 of the server system 200 is configured to generate the heterogeneous transaction graph 300 based, at least in part, on the plurality of node-related features for each entity of the plurality of entities within each entity set.
As depicted in FIG. 3, the heterogeneous transaction graph 300 is generated for a single transaction involving three entity sets, with a single node from each entity type. In other words, the graph represents a transaction between a merchant (represented using merchant node 302), a cardholder (represented with PAN node 304 or cardholder node 304), and an IP address of the device (depicted using IP address node 306) initiating the transaction. For instance, if the cardholder performs a transaction with the merchant with a cardholder device (not shown) associated with the cardholder, the IP address of the cardholder device will be used for generating the heterogeneous transaction graph 300. As depicted, the heterogeneous transaction graph 300 includes a plurality of node sets (e.g., merchant nodes, cardholder nodes, and IP address nodes) associated with a plurality of entity sets (e.g., merchant set, cardholder set, and IP address set). Further, the nodes are connected via a plurality of edges (see, 308(1), 308(2), and 308(3)). It is noted that the plurality of edges 308(1)-308(3) indicates a relationship between the merchant node 302, the cardholder node 304, and the IP address node 306. In one instance, this relationship can be defined as a transaction performed between the cardholder and the merchant using a cardholder device with a specific IP address.
Although the heterogeneous transaction graph 300 depicts only three nodes belonging to different node sets and distinct entities, it is understood that for a complex payment ecosystem, there may be any number of nodes for each node set and any number of entity sets. Further, various data points from the 3DS2 data may be utilized to introduce different entity types to the heterogeneous transaction graph 300. For instance, cardholder email, cardholder shipping address, Merchant Category Code (MCC), cardholder biometrics, and so on may be used to describe distinct entity sets that include various nodes corresponding to each entity type. For example, the heterogeneous transaction graph 300 may be generated for multiple transactions between four entity sets including multiple nodes from each entity type as well.
For instance, the heterogeneous transaction graph 300 may include a plurality of node sets represented by NS1, NS2, …, NSi, NS(i+1), …, NSp, where NSi depicts an ith node set and p is a natural number. Here, the plurality of node sets is associated with a plurality of entity sets represented by ES1, ES2, …, ESi, ES(i+1), …, ESp, where ESi is an ith entity set and p is a natural number. It is noted that each entity set is mapped to its corresponding node set. Further, each node set such as NSi includes a plurality of nodes of the same entity types represented by N1-ESi, N2-ESi, …., Np-ESi for an ith entity set, where Ni is the ith node. In other words, each node set has different nodes of the same entity type. To that end, each node of a particular node set will correspond to a particular entity from the plurality of entities. This is represented by E1-ESi, E2-ESi, …., Ep-ESi for an entity set, ESi. For example, if the entity type is cardholder, then a cardholder node set will include a plurality of different cardholder nodes. Further, each of the nodes within the heterogeneous transaction graph 300 may be connected by a plurality of edges represented by e1, e2, …., em, where em is the mth edge where m is a natural number. In some scenarios, the heterogeneous transaction graph 300 may be generated for a particular time period T.
FIG. 4 illustrates an architecture 400 depicting the processing of a heterogeneous transaction graph 402 using a page rank model 404 and a classification model 410, in accordance with an embodiment of the present disclosure. It should be noted that the heterogeneous transaction graph 402 is similar to the heterogeneous transaction graph 300 of FIG. 3. Further, the page rank model 404 and the classification model 410 are similar to the page rank model 218 and the classification model 220 of FIG. 2, respectively. As mentioned earlier, a fraud heterogeneous transaction graph may be extracted from the heterogeneous transaction graph 402 prior to processing it for generating risk scores 406 for each node. Further, the page rank model 404 is used for generating the risk scores 406 (i.e., the set of first risk scores, the set of second risk scores, and the set of third risk scores). Each risk score may include one or more rank features that can be used as features for the classification model 410 for performing one or more classifications. For instance, the first rank feature and the second rank feature of the set of first risk scores, the third rank feature, the fourth rank feature, the fifth rank feature, and the sixth rank feature of the set of second risk scores, and the seventh rank feature and the eighth rank feature of the set of third risk scores, collectively, can be used as additional features for the classification model 410 for performing its classifications.
The page rank model 404 used for generating the above-mentioned risk scores 406 for each node receives an undirected multipartite graph, such as the heterogeneous transaction graph 402 and/or the fraud heterogeneous transaction graph, as input. Further, the page rank model 404 generates a directed graph from the undirected multipartite graph with two directed edges for each undirected edge. The server system 200 may be configured to train the page rank model 404 to generate the directed graph from the undirected multipartite graph by generating the risk scores 406 for each node based on a structure of incoming links (or edges) to the corresponding nodes. The server system 200 may initialize a set of model parameters while training the page rank model 404, which may get updated with every training iteration to improve the performance of the page rank model 404. In one embodiment, the set of model parameters may include, but is not limited to, a damping factor value, a personalization vector value, a maximum number of iterations, an error tolerance, a starting value of a page rank iteration for each node, weights of edges, availability of dangling edges, and the like.
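The conversion from an undirected graph to a directed graph with two directed edges per undirected edge can be sketched as follows. Assigning the same weight to both directions and the function names used are assumptions made for illustration.

```python
def to_directed(undirected_edges):
    """Expand each undirected edge (u, v) into the directed edges u->v and v->u.

    undirected_edges maps a (u, v) tuple to an edge weight, with one entry
    per undirected edge. Both directions receive the same weight (assumed).
    """
    directed = {}
    for (u, v), weight in undirected_edges.items():
        directed[(u, v)] = weight
        directed[(v, u)] = weight
    return directed


def incoming_links(directed_edges):
    """Group directed edges by target node, i.e., the incoming-link structure."""
    incoming = {}
    for (src, dst), weight in directed_edges.items():
        incoming.setdefault(dst, {})[src] = weight
    return incoming


# Hypothetical two-edge graph: cardholder C1, merchant M1, and IP address IP_A.
undirected = {("C1", "M1"): 2.0, ("M1", "IP_A"): 1.0}
directed = to_directed(undirected)
incoming = incoming_links(directed)
```

Grouping by target node gives, for every node, the set of nodes that link to it, which is the quantity the page rank computation sums over.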
The page rank algorithm was originally designed for ranking web pages; however, the same can be applied to rank nodes in a graph. Thus, the page rank algorithm implemented via the page rank model 404 is used in the present disclosure to generate the risk scores 406 for each node. The risk scores 406 indicate how much risk is associated with each node based on the edges connected to the corresponding node. A node connected to a highly risky node would also be risky, and a node connected to a less risky node would be less risky. In other words, the risk score generated for each node by the page rank model 404 can be defined as a probability distribution value representing the likelihood that a node is at risk due to the said node being connected to another risky node. Further, in a specific embodiment, the formula that can be used for computing the risk scores 406 for each node using the page rank model 404 may be as follows:
$$\mathrm{PR}(n_i) = \frac{1 - d}{N} + d \sum_{n_j \in M(n_i)} \frac{\mathrm{PR}(n_j)}{L(n_j)}$$ … Eqn. 1
Herein, PR stands for page rank (otherwise also referred to as risk score) for each node, with n_1, n_2, …, n_N representing nodes in the heterogeneous transaction graph 402, M(n_i) represents a set of nodes that link to n_i, L(n_j) represents a count of outbound links from the node n_j, and N is a total count of nodes in the heterogeneous transaction graph 402. Also, d refers to a damping factor. As used herein, the term ‘damping factor’ in page rank theory refers to the probability that, at any step, a person who is randomly clicking on links, and who may eventually stop doing so, will continue following the links.
Further, since the risk scores are probability distribution values, the nodes of the heterogeneous transaction graph 402, which is provided as input to the page rank model 404, may be initialized with an initial risk score value ranging between 0 and 1. Then, this value may be updated at every iteration based on Eqn. 1. Further, the sum of all the risk score values may have to be 1. Further, eigenvector calculation may be performed by a power iteration method, and the iteration may stop after the error tolerance has been reached. Alternatively, if the count of iterations reaches the maximum count before the error tolerance is reached, an exception may be raised. Moreover, a random walk-based distributed algorithm may be used by the server system 200 for training the page rank model 404 to compute the risk scores 406 for each node.
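In a non-limiting illustration, the power iteration described above may be sketched as follows. The function name page_rank, the default values of d, tol, and max_iter, and the toy node labels are assumptions made for illustration only and are not part of the disclosure. The sketch replaces each undirected edge with two directed edges, starts from a uniform score of 1/N so that the scores sum to 1, applies Eqn. 1 at every iteration, and raises an exception if the error tolerance is not reached within the maximum count of iterations.

```python
def page_rank(undirected_edges, d=0.85, tol=1e-9, max_iter=500):
    """Power-iteration page rank following Eqn. 1 (illustrative sketch)."""
    # Replace each undirected edge with two directed edges (u -> v and v -> u).
    out_links = {}
    for u, v in undirected_edges:
        out_links.setdefault(u, set()).add(v)
        out_links.setdefault(v, set()).add(u)
    nodes = sorted(out_links)
    n = len(nodes)

    # Uniform initial scores between 0 and 1 that sum to 1.
    pr = {node: 1.0 / n for node in nodes}

    for _ in range(max_iter):
        new_pr = {}
        for ni in nodes:
            # M(n_i): nodes linking to n_i. In the symmetric directed graph
            # these are exactly the neighbours of n_i.
            incoming = sum(pr[nj] / len(out_links[nj]) for nj in out_links[ni])
            new_pr[ni] = (1.0 - d) / n + d * incoming
        # L1 change between consecutive iterations is used as the error.
        err = sum(abs(new_pr[k] - pr[k]) for k in nodes)
        pr = new_pr
        if err < tol:
            return pr
    raise RuntimeError("page rank did not converge within max_iter iterations")


# Hypothetical toy graph: three cardholders transacting at one merchant.
scores = page_rank([("C1", "M1"), ("C2", "M1"), ("C3", "M1")])
```

In this toy graph the merchant node M1 receives the largest score because all three cardholder nodes link to it, while the three cardholder nodes receive equal scores.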
Upon obtaining the risk scores 406 from the page rank model 404, the risk scores 406 along with the node-related features 408 may be provided to the classification model 410 as a set of updated node-related features for each node. In an embodiment, the server system 200 may use the classification model 410 to determine a fraud score 412 for a transaction performed or being performed between two distinct nodes based on the set of updated node-related features corresponding to each of the distinct nodes. As may be understood, the presence of the risk scores 406, which are part of the updated node-related features, helps to improve the accuracy of the fraud score 412 determined for the corresponding transaction by the classification model 410.
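As a non-limiting sketch of how the set of updated node-related features may be assembled, the following example appends the risk scores of each node to its existing features and then concatenates the updated features of the two nodes involved in a transaction. The function names, the dictionary layout, and all numeric values are hypothetical and are used only for illustration.

```python
def build_updated_features(node_features, risk_scores):
    """Return, per node, the original features followed by that node's risk scores."""
    return {
        node: list(features) + list(risk_scores[node])
        for node, features in node_features.items()
    }


def transaction_feature_vector(updated, first_node, second_node):
    """Concatenate the updated features of the two nodes taking part in a transaction."""
    return updated[first_node] + updated[second_node]


# Hypothetical toy values: two raw features and two rank features per node.
node_features = {"C1": [12.0, 3.0], "M1": [250.0, 41.0]}
risk_scores = {"C1": [0.17, 0.21], "M1": [0.48, 0.44]}

updated = build_updated_features(node_features, risk_scores)
row = transaction_feature_vector(updated, "C1", "M1")
```

The resulting row is the kind of input a classification model could consume; the original feature lists are left unmodified.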
In a non-limiting implementation, the classification model 410 may include a gradient-boosting classifier-based model. As used herein, the term ‘gradient boosting classifier’ refers to an ensemble Machine Learning (ML)-based model which combines the results of multiple weaker models to create a stronger and more accurate model. In every iteration, it gives more weight to data points that were misclassified in the previous step, thereby boosting the model’s performance. Further, the model uses the concept of gradient descent that adjusts model parameters to reduce errors. In each step, the model calculates the gradient (slope) of the loss function, which represents errors, and updates model parameters to move in the direction that reduces those errors. In an embodiment, decision trees may be used as base learners, and at every step a new decision tree may be built that corrects the mistakes of the previous trees. Finally, all the trees may be added together to make a final decision. Further, a weight may be assigned to each decision tree based on its accuracy. Further, regularization and several hyperparameters may be used for proper tuning of the model to achieve the best model performance. For example, some of the hyperparameters may include a maximum depth of the model, a maximum count of iterations of the model, a learning rate, and the like. In a non-limiting implementation, the maximum depth of the model may be about 5 levels, which indicates how many levels deep a tree structure in the model can be traversed. In another non-limiting implementation, the maximum count of iterations may be about 100. Further, data points (e.g., the payment transactions between each of the cardholders, the merchants, and the IP addresses) may be classified as fraud or non-fraud based on votes assigned by the decision trees in the ensemble model to each class.
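For illustration only, the gradient boosting idea may be sketched in plain Python as follows. This is a simplified stand-in and not the disclosed model: it uses depth-1 trees (stumps) rather than trees of about 5 levels, a log-loss objective, and a fixed learning rate, and every function name and data value below is an assumption. At each round, a new stump is fitted to the negative gradient of the loss (the label minus the currently predicted probability), and its output is added to the running score.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(rows, residuals):
    """Pick the one-feature threshold split that best fits the residuals (least squares)."""
    mean = sum(residuals) / len(residuals)
    # Fallback when no split is possible: predict the mean residual everywhere.
    best = (float("inf"), 0, float("inf"), mean, mean)
    for f in range(len(rows[0])):
        for t in sorted({row[f] for row in rows}):
            left = [r for row, r in zip(rows, residuals) if row[f] <= t]
            right = [r for row, r in zip(rows, residuals) if row[f] > t]
            if not left or not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if sse < best[0]:
                best = (sse, f, t, lm, rm)
    return best[1:]


def stump_value(stump, row):
    f, t, left_value, right_value = stump
    return left_value if row[f] <= t else right_value


def fit_boosted_stumps(rows, labels, n_rounds=100, learning_rate=0.1):
    """Gradient boosting on log-loss with stumps as base learners."""
    # Start from the log-odds of the fraud class (clipped to avoid log(0)).
    p = min(max(sum(labels) / len(labels), 1e-6), 1 - 1e-6)
    base = math.log(p / (1 - p))
    raw = [base] * len(rows)
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of log-loss with respect to the raw score.
        residuals = [y - sigmoid(s) for y, s in zip(labels, raw)]
        stump = fit_stump(rows, residuals)
        stumps.append(stump)
        raw = [s + learning_rate * stump_value(stump, row) for s, row in zip(raw, rows)]
    return base, learning_rate, stumps


def fraud_score(model, row):
    base, learning_rate, stumps = model
    return sigmoid(base + learning_rate * sum(stump_value(s, row) for s in stumps))


# Hypothetical rows: [risk score of the node, transaction amount]; label 1 means fraud.
rows = [[0.02, 15.0], [0.03, 40.0], [0.05, 22.0], [0.30, 18.0], [0.35, 60.0], [0.41, 25.0]]
labels = [0, 0, 0, 1, 1, 1]
model = fit_boosted_stumps(rows, labels)
high = fraud_score(model, [0.38, 30.0])
low = fraud_score(model, [0.01, 30.0])
```

In this toy data the risk score column separates the two classes, so a transaction with a high risk score receives a fraud score above 0.5 and one with a low risk score receives a fraud score below 0.5.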
It should be noted that the classification model 410 may not necessarily be the gradient boosting classifier-based model, and it could be any other AI/ML model such as, but not limited to, an XGBoost model, a neural network-based model, a CatBoost model, a logistic regression model, and the like.
In a scenario where the nodes correspond to the cardholders, the merchants, and the IP addresses of devices used by the cardholders, the page rank model 404 generates the risk scores 406 for each of the cardholders, the merchants, and the IP addresses. FIG. 4 shows the heterogeneous transaction graph 402 having nodes such as the cardholders illustrated as ‘C’, the merchants illustrated as ‘M’, and the IP addresses illustrated as ‘IP’. The risk scores 406 generated for each node, along with the node-related features 408 of each of the cardholders, the merchants, and the IP addresses, are provided to the classification model 410, which further classifies whether any of the payment transactions between any of the cardholders, the merchants, and the IP addresses is fraudulent or not based on the fraud score 412 generated for each payment transaction between the entities such as the cardholders, the merchants, and the IP addresses. It should be noted that the conventional architectures of the page rank model 404 and the classification model 410 may not be altered; only the input provided to the classification model 410 is augmented using the page rank model 404 for generating fraud scores 412 that are more accurate than those obtained from conventional approaches.
FIGS. 5A, and 5B, collectively, illustrate a process flow diagram depicting a method 500 for computing various risk scores associated with each node of the heterogeneous transaction graph, in accordance with an embodiment of the present disclosure. The method 500 depicted in the flow diagram may be executed by, for example, the server system 200. The sequence of operations of the method 500 may not be necessarily executed in the same order as they are presented. Further, one or more operations may be grouped and performed in the form of a single step, or one operation may have several sub-steps that may be performed in parallel or in a sequential manner. Operations of the method 500, and combinations of operations in the method 500 may be implemented by, for example, hardware, firmware, a processor, circuitry, and/or a different device associated with the execution of software that includes one or more computer program instructions. The plurality of operations is depicted in the process flow of the method 500. The process flow starts at operation 502.
At 502, the method 500 includes accessing, by a server system such as the server system 200, a heterogeneous transaction graph such as heterogeneous transaction graph 300 from a database such as database 204 associated with the server system 200. In a non-limiting example, the heterogeneous transaction graph includes a plurality of node sets associated with a plurality of entity sets and a plurality of edges. Each node set includes a plurality of nodes representing a plurality of entities of an entity set and each node is associated with a plurality of node-related features of an individual entity. Additionally, an edge of the plurality of edges indicates information related to a transactional relationship between two distinct nodes connected by the edge.
At 504, the method 500 includes extracting, by the server system 200, a fraud heterogeneous transaction graph from the heterogeneous transaction graph 300 based, at least in part, on the plurality of node-related features including a fraud transaction label. In other words, only nodes that have been involved in at least one fraud transaction and the nodes connected with these nodes are selected to be extracted from the heterogeneous transaction graph 300. In an example, the fraud heterogeneous transaction graph includes, for each node set, a subset of fraud nodes and a subset of non-fraud nodes, and a subset of edges between the subset of fraud nodes and the subset of non-fraud nodes.
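A minimal sketch of the extraction at operation 504 is given below. It assumes, for illustration only, that the graph is held as a dictionary mapping a pair of nodes to the list of transactions on that edge, and that each transaction carries a hypothetical 'is_fraud' flag standing in for the fraud transaction label. A node is treated as a fraud node when it took part in at least one fraud transaction, and an edge is kept when it touches at least one fraud node; both choices are assumptions about one possible reading of the operation.

```python
def extract_fraud_subgraph(edges):
    """Return (fraud_nodes, non_fraud_nodes, sub_edges) for an edge -> transactions mapping."""
    # Fraud nodes: both endpoints of any edge carrying at least one fraud transaction.
    fraud_nodes = set()
    for (a, b), transactions in edges.items():
        if any(t["is_fraud"] for t in transactions):
            fraud_nodes.update((a, b))

    # Keep every edge that touches at least one fraud node.
    sub_edges = {
        pair: transactions
        for pair, transactions in edges.items()
        if pair[0] in fraud_nodes or pair[1] in fraud_nodes
    }

    # Non-fraud nodes: the remaining endpoints of the kept edges.
    non_fraud_nodes = {node for pair in sub_edges for node in pair} - fraud_nodes
    return fraud_nodes, non_fraud_nodes, sub_edges


# Hypothetical toy graph with cardholders (C) and merchants (M).
edges = {
    ("C1", "M1"): [{"is_fraud": True}],
    ("C2", "M1"): [{"is_fraud": False}],
    ("C3", "M2"): [{"is_fraud": False}],
}
fraud_nodes, non_fraud_nodes, sub_edges = extract_fraud_subgraph(edges)
```

Here C1 and M1 are fraud nodes, C2 is retained as a connected non-fraud node, and the edge between C3 and M2 is left out because neither endpoint was involved in a fraud transaction.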
At 506, the method 500 includes determining, by the server system 200, a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge. Herein, the first edge definition criteria include a fraud count metric and a fraud ratio metric.
At 508, the method 500 includes determining, by the server system 200, a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge. Herein, the second edge definition criteria include a decline count metric and a decline ratio metric.
At 510, the method 500 includes computing, by the server system 200, for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights, respectively. In a non-limiting example, the page rank model 218 can be used to compute the set of first risk scores for each node. Further, in another non-limiting example, the page rank model 218 with the preference vector set to the fraud count metric can be used to compute the set of second risk scores for each node.
At 512, the method 500 includes determining, by the server system 200, a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge. Herein, the third edge definition criteria include an approved count metric.
At 514, the method 500 includes computing, by the server system 200, for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights. In a non-limiting example, the page rank model 218 with the preference vector set to the approved count metric can be used to compute the set of third risk scores for each node.
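By way of a hedged illustration, a page rank computation that takes edge weights and a preference vector into account may be sketched as follows. The sketch is a weighted generalization of Eqn. 1 and is an assumption rather than the disclosed implementation: the score of a node is shared among its neighbours in proportion to the edge weights, and the term (1 - d)/N is replaced by (1 - d) times the normalized preference value of the node. How a per-edge metric is mapped to a per-node preference value is left to the caller, and all names and values below are hypothetical.

```python
def weighted_page_rank(weights, preference=None, d=0.85, tol=1e-9, max_iter=500):
    """Weighted page rank with an optional preference (teleport) vector."""
    # Two directed edges per undirected edge, both carrying the edge weight.
    out = {}
    for (u, v), w in weights.items():
        if w <= 0:
            raise ValueError("edge weights must be positive")
        out.setdefault(u, {})[v] = w
        out.setdefault(v, {})[u] = w
    nodes = sorted(out)

    # Normalize the preference vector; default to a uniform preference.
    if preference is None:
        preference = {node: 1.0 for node in nodes}
    total = sum(preference.get(node, 0.0) for node in nodes)
    teleport = {node: preference.get(node, 0.0) / total for node in nodes}

    # Total outgoing weight of each node, used to split its score.
    strength = {node: sum(out[node].values()) for node in nodes}
    pr = {node: 1.0 / len(nodes) for node in nodes}

    for _ in range(max_iter):
        new_pr = {
            ni: (1.0 - d) * teleport[ni]
            + d * sum(pr[nj] * out[nj][ni] / strength[nj] for nj in out[ni])
            for ni in nodes
        }
        err = sum(abs(new_pr[k] - pr[k]) for k in nodes)
        pr = new_pr
        if err < tol:
            return pr
    raise RuntimeError("page rank did not converge within max_iter iterations")


# Hypothetical weights, e.g. approved counts on two cardholder-merchant edges.
approved_counts = {("C1", "M1"): 5, ("C2", "M1"): 1}
uniform = weighted_page_rank(approved_counts)
biased = weighted_page_rank(approved_counts, preference={"C2": 1.0})
```

With a uniform preference, C1 scores higher than C2 because M1 passes five sixths of its score along the heavier edge. Concentrating the preference on C2 raises the score of C2 relative to the uniform case.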
FIG. 6 illustrates a simplified block diagram of an acquirer server 600, in accordance with an embodiment of the present disclosure. The acquirer server 600 is an example of the acquirer server 108 of FIG. 1. The acquirer server 600 is associated with an acquirer bank/acquirer, in which a merchant may have an account, that provides a payment card. The acquirer server 600 includes a processing module 602 operatively coupled to a storage module 604 and a communication module 606. The components of the acquirer server 600 provided herein may not be exhaustive and the acquirer server 600 may include more or fewer components than those depicted in FIG. 6. Further, two or more components may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities. Some components of the acquirer server 600 may be configured using hardware elements, software elements, firmware elements, and/or a combination thereof.
The storage module 604 is configured to store machine-executable instructions to be accessed by the processing module 602. Additionally, the storage module 604 stores information related to the contact information of the merchant, bank account number, availability of funds in the account, payment card details, transaction details, and/or the like. Further, the storage module 604 is configured to store payment transactions.
In one embodiment, the acquirer server 600 is configured to store profile data (e.g., an account balance, a credit line, details of the plurality of cardholders 104, account identification information, and a payment card number) in a transaction database 608. The details of the plurality of cardholders 104 may include, but are not limited to, name, age, gender, physical attributes, location, registered contact number, family information, alternate contact number, registered e-mail address, etc.
The processing module 602 is configured to communicate with one or more remote devices such as a remote device 610 using the communication module 606 over a network such as the network 116 of FIG. 1. The examples of the remote device 610 include the server system 102, the payment server 114, the issuer server 110, or other computing systems of the acquirer server 600, and the like. The communication module 606 can facilitate such operative communication with the remote devices and cloud servers using Application Program Interface (API) calls. The communication module 606 is configured to receive a payment transaction request performed by the plurality of cardholders 104 via the network 116. The processing module 602 receives payment card information, a payment transaction amount, customer information, and merchant information from the remote device 610 (i.e., the payment server 114). The acquirer server 600 includes a user profile database 612 and the transaction database 608 for storing transaction data. The user profile database 612 may include information of cardholders. The transaction data may include, but is not limited to, transaction attributes, such as transaction amount, source of funds such as bank or credit cards, transaction channel used for loading funds such as POS terminal or ATM machine, transaction velocity features such as count and transaction amount sent in the past x days to a particular user, transaction location information, external data sources, and other internal data to evaluate each transaction.
FIG. 7 illustrates a simplified block diagram of an issuer server 700, in accordance with an embodiment of the present disclosure. The issuer server 700 is an example of the issuer server 110 of FIG. 1. The issuer server 700 is associated with an issuer bank/issuer, in which an account holder (e.g., the plurality of cardholders 104(1)-104(N)) may have an account, which provides a payment card (e.g., the payment cards 118(1)-118(N)). The issuer server 700 includes a processing module 702 operatively coupled to a storage module 704 and a communication module 706. The components of the issuer server 700 provided herein may not be exhaustive and the issuer server 700 may include more or fewer components than those depicted in FIG. 7. Further, two or more components may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities. Some components of the issuer server 700 may be configured using hardware elements, software elements, firmware elements, and/or a combination thereof.
The storage module 704 is configured to store machine-executable instructions to be accessed by the processing module 702. Additionally, the storage module 704 stores information related to the contact information of the cardholders (e.g., the plurality of cardholders 104(1)-104(N)), a bank account number, availability of funds in the account, payment card details, transaction details, payment account details, and/or the like. Further, the storage module 704 is configured to store payment transactions.
In one embodiment, the issuer server 700 is configured to store profile data (e.g., an account balance, a credit line, details of the cardholders, account identification information, payment card number, etc.) in a database. The details of the cardholders may include, but are not limited to, name, age, gender, physical attributes, location, registered contact number, family information, alternate contact number, registered e-mail address, or the like.
The processing module 702 is configured to communicate with one or more remote devices such as a remote device 708 using the communication module 706 over a network such as the network 116 of FIG. 1. Examples of the remote device 708 include the server system 200, the payment server 114, the acquirer server 108 or other computing systems of the issuer server 700. The communication module 706 can facilitate such operative communication with the remote devices and cloud servers using API calls. The communication module 706 is configured to receive a payment transaction request performed by an account holder (e.g., the cardholder 104(1)) via the network 116. The processing module 702 receives payment card information, a payment transaction amount, customer information, and merchant information from the remote device 708 (e.g., the payment server 114). The issuer server 700 includes a transaction database 710 for storing transaction data. The transaction data may include, but is not limited to, transaction attributes, such as transaction amount, source of funds such as bank or credit cards, transaction channel used for loading funds such as POS terminal or ATM machine, transaction velocity features such as count and transaction amount sent in the past x days to a particular account holder, transaction location information, external data sources, and other internal data to evaluate each transaction. The issuer server 700 includes a user profile database 712 storing user profiles associated with the plurality of account holders.
The user profile data may include an account balance, a credit line, details of the account holders, account identification information, payment card number, or the like. The details of the account holders (e.g., the plurality of cardholders 104(1)-104(N)) may include, but are not limited to, name, age, gender, physical attributes, location, registered contact number, family information, alternate contact number, registered e-mail address, or the like of the plurality of cardholders 104.
FIG. 8 illustrates a simplified block diagram of a payment server 800, in accordance with an embodiment of the present disclosure. The payment server 800 is an example of the payment server 114 of FIG. 1. The payment server 800 and the server system 200 may use the payment network 112 as a payment interchange network. Examples of payment interchange networks include, but are not limited to, Mastercard® payment system interchange network.
The payment server 800 includes a processing system 802 configured to extract programming instructions from a memory 804 to provide various features of the present disclosure. The components of the payment server 800 provided herein may not be exhaustive and the payment server 800 may include more or fewer components than that depicted in FIG. 8. Further, two or more components may be embodied in one single component, and/or one component may be configured using multiple sub-components to achieve the desired functionalities. Some components of the payment server 800 may be configured using hardware elements, software elements, firmware elements, and/or a combination thereof.
Via a communication interface 806, the processing system 802 receives a request from a remote device 808, such as the issuer server 110, the acquirer server 108, or the server system 102. The request may be a request for conducting the payment transaction. The communication may be achieved through API calls, without loss of generality. The payment server 800 includes a database 810. The database 810 also includes transaction processing data such as issuer ID, country code, acquirer ID, and Merchant Identifier (MID), among others.
When the payment server 800 receives a payment transaction request from the acquirer server 108 or a payment terminal (e.g., IoT device), the payment server 800 may route the payment transaction request to an issuer server (e.g., the issuer server 110). The database 810 stores transaction identifiers for identifying transaction details, such as transaction amount, IoT device details, acquirer account information, transaction records, merchant account information, and the like.
In one example embodiment, the acquirer server 108 is configured to send an authorization request message to the payment server 800. The authorization request message includes, but is not limited to, the payment transaction request.
The processing system 802 further sends the payment transaction request to the issuer server 110 for facilitating the payment transactions from the remote device 808. The processing system 802 is further configured to notify the remote device 808 of the transaction status in the form of an authorization response message via the communication interface 806. The authorization response message includes, but is not limited to, a payment transaction response received from the issuer server 110. Alternatively, in one embodiment, the processing system 802 is configured to send an authorization response message for declining the payment transaction request, via the communication interface 806, to the acquirer server 108. In one embodiment, the processing system 802 executes similar operations performed by the server system 200, however, for the sake of brevity, these operations are not explained herein.
The disclosed method with reference to FIGS. 5A and 5B, or one or more operations of the server system 200, may be implemented using software including computer-executable instructions stored on one or more computer-readable media (e.g., non-transitory computer-readable media, such as one or more optical media discs, volatile memory components (e.g., DRAM or SRAM), or nonvolatile memory or storage components (e.g., hard drives or solid-state nonvolatile memory components, such as Flash memory components)) and executed on a computer (e.g., any suitable computer, such as a laptop computer, netbook, Web book, tablet computing device, smartphone, or other mobile computing devices). Such software may be executed, for example, on a single local computer or in a network environment (e.g., via the Internet, a wide-area network, a local-area network, a remote web-based server, a client-server network (such as a cloud computing network), or other such networks) using one or more network computers. Additionally, any of the intermediate or final data created and used during the implementation of the disclosed methods or systems may also be stored on one or more computer-readable media (e.g., non-transitory computer-readable media) and are considered to be within the scope of the disclosed technology. Furthermore, any of the software-based embodiments may be uploaded, downloaded, or remotely accessed through a suitable communication means. Such a suitable communication means includes, for example, the Internet, the World Wide Web (WWW), an intranet, software applications, cable (including fiber optic cable), magnetic communications, electromagnetic communications (including RF, microwave, and infrared communications), electronic communications, or other such communication means.
Although the invention has been described with reference to specific embodiments, it is noted that various modifications and changes may be made to these embodiments without departing from the broad scope of the invention. For example, the various operations, blocks, etc., described herein may be enabled and operated using hardware circuitry (for example, Complementary Metal Oxide Semiconductor (CMOS) based logic circuitry), firmware, software, and/or any combination of hardware, firmware, and/or software (for example, embodied in a machine-readable medium). For example, the apparatuses and methods may be embodied using transistors, logic gates, and electrical circuits (for example, Application Specific Integrated Circuit (ASIC) circuitry and/or in Digital Signal Processor (DSP) circuitry).
Particularly, the server system 200 and its various components may be enabled using software and/or using transistors, logic gates, and electrical circuits (for example, integrated circuit circuitry such as ASIC circuitry). Various embodiments of the invention may include one or more computer programs stored or otherwise embodied on a computer-readable medium, wherein the computer programs are configured to cause the processor or the computer to perform one or more operations. A computer-readable medium storing, embodying, or encoded with a computer program, or similar language, may be embodied as a tangible data storage device storing one or more software programs that are configured to cause the processor or computer to perform one or more operations. Such operations may be, for example, any of the steps or operations described herein. In some embodiments, the computer programs may be stored and provided to a computer using any type of non-transitory computer-readable media. Non-transitory computer-readable media includes any type of tangible storage media. Examples of non-transitory computer-readable media include magnetic storage media (such as floppy disks, magnetic tapes, hard disk drives, etc.), optical magnetic storage media (e.g., magneto-optical disks), Compact Disc Read-Only Memory (CD-ROM), Compact Disc Recordable (CD-R), Compact Disc Rewritable (CD-R/W), Digital Versatile Disc (DVD), BLU-RAY® Disc (BD), and semiconductor memories (such as mask ROM, programmable ROM (PROM), erasable PROM (EPROM), flash memory, Random Access Memory (RAM), etc.). Additionally, a tangible data storage device may be embodied as one or more volatile memory devices, one or more non-volatile memory devices, and/or a combination of one or more volatile memory devices and non-volatile memory devices. In some embodiments, the computer programs may be provided to a computer using any type of transitory computer-readable media.
Examples of transitory computer-readable media include electric signals, optical signals, and electromagnetic waves. Transitory computer-readable media can provide the program to a computer via a wired communication line (e.g., electric wires, and optical fibers) or a wireless communication line.
Various embodiments of the invention, as discussed above, may be practiced with steps and/or operations in a different order, and/or with hardware elements in configurations that are different from those which are disclosed. Therefore, although the invention has been described based on these embodiments, it is noted that certain modifications, variations, and alternative constructions may be apparent and well within the scope of the invention.
Although various embodiments of the invention are described herein in a language specific to structural features and/or methodological acts, the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as exemplary forms of implementing the claims.
Claims:
1. A computer-implemented method, comprising:
accessing, by a server system, a heterogeneous transaction graph from a database associated with the server system, the heterogeneous transaction graph comprising a plurality of node sets associated with a plurality of entity sets and a plurality of edges, each node set comprising a plurality of nodes representing a plurality of entities of an entity set, each node associated with a plurality of node-related features of an individual entity and an edge of the plurality of edges indicating information related to a transactional relationship between two distinct nodes connected by the edge;
extracting, by the server system, a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features comprising a fraud transaction label, the fraud heterogeneous transaction graph comprising, for each node set, a subset of fraud nodes, a subset of non-fraud nodes, and a subset of edges between the subset of fraud nodes and the subset of non-fraud nodes;
determining, by the server system, a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge, the first edge definition criteria comprising a fraud count metric and a fraud ratio metric;
determining, by the server system, a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge, the second edge definition criteria comprising a decline count metric and a decline ratio metric;
computing, by the server system, for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights;
determining, by the server system, a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge, the third edge definition criteria comprising an approved count metric; and
computing, by the server system, for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights.
2. The computer-implemented method as claimed in claim 1, wherein the fraud count metric indicates a count of the fraud transactions between the distinct nodes connected by an individual edge from the subset of edges and the fraud ratio metric is a ratio of fraud transactions and non-fraud transactions between the distinct nodes connected by the individual edge.
3. The computer-implemented method as claimed in claim 1, wherein the decline count metric indicates a count of declined transactions between the distinct nodes connected by an individual edge from the subset of edges and the decline ratio metric is a ratio of declined transactions and approved transactions between the distinct nodes connected by the individual edge.
4. The computer-implemented method as claimed in claim 1, wherein the approved count metric indicates a count of approved transactions between distinct nodes connected by an individual edge.
5. The computer-implemented method as claimed in claim 1, further comprising:
receiving, by the server system, a transaction authentication request for a transaction between a first entity and a second entity from the plurality of entity sets;
performing, by the server system, for each node of each node set associated with the plurality of entity sets of the heterogeneous transaction graph:
generating a set of updated node-related features based, at least in part, on the plurality of node-related features and corresponding set of first risk scores, corresponding set of second risk scores, and corresponding set of third risk scores;
determining, by the server system, the plurality of updated node-related features of the first entity and the plurality of updated node-related features of the second entity; and
determining, by a classification model associated with the server system, a fraud score for the transaction based, at least in part, on the plurality of updated node-related features of the first entity and the plurality of updated node-related features of the second entity.
6. The computer-implemented method as claimed in claim 5, further comprising:
labeling, by the server system, the transaction as a fraudulent transaction based, at least in part, on the corresponding fraud score of the transaction being at least equal to a threshold value.
7. The computer-implemented method as claimed in claim 5, further comprising:
labeling, by the server system, the transaction as a non-fraudulent transaction based, at least in part, on the corresponding fraud score of the transaction being lower than a threshold value.
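The flow of claims 5 to 7 might be sketched as follows. This is an illustrative outline under stated assumptions: the helper names, the flat list representation of features, and the `model` callable are hypothetical, and the claims do not identify a particular classification model.

```python
def updated_features(node_features, first_scores, second_scores, third_scores):
    """Append the three sets of risk scores to a node's original features."""
    return (
        list(node_features)
        + list(first_scores)
        + list(second_scores)
        + list(third_scores)
    )


def score_transaction(model, first_entity_features, second_entity_features):
    """Score a transaction from the updated features of both entities.

    `model` is any callable mapping a feature list to a fraud score; it is
    a stand-in for the unspecified classification model.
    """
    return model(list(first_entity_features) + list(second_entity_features))


def label_transaction(fraud_score, threshold):
    """Fraudulent if the score is at least the threshold, else non-fraudulent."""
    return "fraudulent" if fraud_score >= threshold else "non-fraudulent"
```

A score exactly equal to the threshold is labeled fraudulent, matching the "at least equal to" wording of claim 6.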
8. The computer-implemented method as claimed in claim 1, wherein computing the set of first risk scores for each node, comprises:
iteratively processing, by a page rank model associated with the server system, the fraud heterogeneous transaction graph till each node of the fraud heterogeneous transaction graph is traversed, wherein the processing comprises:
traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the first edge definition criteria, wherein a candidate edge is not a dangling edge;
computing, by the server system, a first traverse count for each node based on a number of times each node is traversed; and
determining, by the server system, the set of first risk scores for each node based, at least in part, on the first traverse count of the corresponding node.
9. The computer-implemented method as claimed in claim 8, wherein the set of first risk scores comprises a first rank feature and a second rank feature, wherein the first rank feature is computed by setting the first edge definition criteria as the fraud count metric and the second rank feature is computed by setting the first edge definition criteria as the fraud ratio metric.
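The traverse-count computation of claims 8 and 9 can be approximated by a weighted random walk, sketched below. Several choices are assumptions made for illustration only: the adjacency-dict representation, the `restart_prob` and `steps` parameters, reading a "candidate edge" as one with positive weight, and normalizing the traverse counts into scores. The page rank model of the claims may differ.

```python
import random


def traverse_count_scores(adj, steps=20_000, restart_prob=0.15, seed=0):
    """Estimate rank scores from visit counts of a weighted random walk.

    `adj` maps node -> {neighbor: edge_weight}. The weight stands in for the
    chosen edge definition criterion, e.g. fraud count or fraud ratio.
    """
    rng = random.Random(seed)
    nodes = sorted(set(adj) | {n for nbrs in adj.values() for n in nbrs})
    visits = dict.fromkeys(nodes, 0)
    current = rng.choice(nodes)  # start from a randomly selected node
    step = 0
    # Walk for at least `steps` moves and until every node has been traversed.
    while step < steps or min(visits.values()) == 0:
        visits[current] += 1
        step += 1
        # Assumption: candidate edges are those with positive weight.
        candidates = {n: w for n, w in adj.get(current, {}).items() if w > 0}
        if not candidates or rng.random() < restart_prob:
            current = rng.choice(nodes)  # jump to a random node
        else:
            neighbors = list(candidates)
            weights = [candidates[n] for n in neighbors]
            current = rng.choices(neighbors, weights=weights, k=1)[0]
    total = sum(visits.values())
    return {n: visits[n] / total for n in nodes}
```

Running the function once with fraud-count weights and once with fraud-ratio weights would yield two features per node, in the spirit of the first and second rank features of claim 9.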
10. The computer-implemented method as claimed in claim 1, wherein computing the set of second risk scores for each node, comprises:
iteratively processing, by a page rank model associated with the server system, the fraud heterogeneous transaction graph till each node of the fraud heterogeneous transaction graph is traversed, wherein the processing comprises:
traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the second edge definition criteria, wherein a candidate edge is not a dangling edge;
computing, by the server system, a second traverse count for each node based on a number of times each node is traversed; and
determining, by the server system, the set of second risk scores for each node based, at least in part, on the second traverse count of the corresponding node.
11. The computer-implemented method as claimed in claim 10, wherein the set of second risk scores comprises a third rank feature and a fourth rank feature, wherein the third rank feature is computed by setting the second edge definition criteria as the decline count metric and the fourth rank feature is computed by setting the second edge definition criteria as the decline ratio metric.
12. The computer-implemented method as claimed in claim 10, wherein computing the set of second risk scores for each node, further comprises:
determining, by the server system, a subset of preferred nodes from the fraud heterogeneous transaction graph based, at least in part, on the fraud ratio metric and a first preference threshold;
iteratively processing, by the page rank model, the fraud heterogeneous transaction graph till each node of the fraud heterogeneous transaction graph is traversed, wherein the processing comprises:
traversing the fraud heterogeneous transaction graph from a selected random node of the fraud heterogeneous transaction graph for all candidate edges satisfying the second edge definition criteria; and
randomly re-selecting another starting node from the subset of preferred nodes for next iteration;
computing, by the server system, a third traverse count for each node based on a number of times each node is traversed; and
determining, by the server system, the set of second risk scores for each node based, at least in part, on the third traverse count of the corresponding node.
13. The computer-implemented method as claimed in claim 12, wherein the set of second risk scores comprises a fifth rank feature and a sixth rank feature, wherein the fifth rank feature is computed by setting the second edge definition criteria as the decline count metric and the sixth rank feature is computed by setting the second edge definition criteria as the decline ratio metric.
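The preferred-node variant of claims 12 and 13 resembles a personalized random walk in which restarts land only on preferred nodes. The sketch below is illustrative and rests on assumptions: the per-node fraud ratio input `node_fraud_ratio` is hypothetical (the claims define the fraud ratio metric per edge and do not say how it is mapped to nodes), and a fixed number of steps replaces the "till each node is traversed" condition because nodes unreachable from the preferred subset would never be visited.

```python
import random


def preferred_nodes(node_fraud_ratio, threshold):
    """Select nodes whose fraud ratio meets the preference threshold."""
    return sorted(n for n, r in node_fraud_ratio.items() if r >= threshold)


def personalized_traverse_scores(adj, preferred, steps=20_000,
                                 restart_prob=0.15, seed=0):
    """Random walk whose restarts re-select a start node from `preferred`.

    `adj` maps node -> {neighbor: edge_weight}; weights stand in for the
    second edge definition criteria (decline count or decline ratio).
    """
    if not preferred:
        raise ValueError("at least one preferred node is required")
    rng = random.Random(seed)
    nodes = sorted(set(adj) | {n for nbrs in adj.values() for n in nbrs})
    visits = dict.fromkeys(nodes, 0)
    current = rng.choice(preferred)
    for _ in range(steps):
        visits[current] += 1
        candidates = {n: w for n, w in adj.get(current, {}).items() if w > 0}
        if not candidates or rng.random() < restart_prob:
            current = rng.choice(preferred)  # restart only at preferred nodes
        else:
            neighbors = list(candidates)
            weights = [candidates[n] for n in neighbors]
            current = rng.choices(neighbors, weights=weights, k=1)[0]
    total = sum(visits.values())
    return {n: visits[n] / total for n in nodes}
```

Biasing restarts toward high-fraud-ratio nodes concentrates the traverse counts in their neighborhood, so nodes close to known fraud receive higher scores than under uniform restarts.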
14. The computer-implemented method as claimed in claim 1, wherein computing the set of third risk scores for each node, comprises:
iteratively processing, by a page rank model, the heterogeneous transaction graph till each node of the heterogeneous transaction graph is traversed, wherein the processing comprises:
traversing the heterogeneous transaction graph from a selected random node of the heterogeneous transaction graph for all candidate edges satisfying the third edge definition criteria, wherein a candidate edge is not a dangling edge;
computing, by the server system, a fourth traverse count for each node based on a number of times each node is traversed; and
determining, by the server system, a seventh rank feature for each node based, at least in part, on the fourth traverse count of the corresponding node, the set of third risk scores comprising the seventh rank feature.
15. The computer-implemented method as claimed in claim 14, wherein computing the set of third risk scores for each node, further comprises:
determining, by the server system, a subset of preferred nodes from the heterogeneous transaction graph based, at least in part, on the approved count metric and a second preference threshold;
iteratively processing, by the page rank model, the heterogeneous transaction graph till each node of the heterogeneous transaction graph is traversed, wherein the processing comprises:
traversing the heterogeneous transaction graph from a selected random node of the heterogeneous transaction graph for all candidate edges satisfying the third edge definition criteria; and
randomly re-selecting another starting node from the subset of preferred nodes for next iteration;
computing, by the server system, a fifth traverse count for each node based on a number of times each node is traversed; and
determining, by the server system, an eighth rank feature for each node based, at least in part, on the fifth traverse count of the corresponding node, the set of third risk scores comprising the eighth rank feature.
16. The computer-implemented method as claimed in claim 1, further comprising:
accessing, by the server system, a historical transaction dataset from the database, the historical transaction dataset comprising information related to each entity of the plurality of entities in each entity set and a relationship between the plurality of entities;
generating, by the server system, the plurality of node-related features for each entity of the plurality of entities in each entity set based, at least in part, on the historical transaction dataset; and
generating, by the server system, the heterogeneous transaction graph based, at least in part, on the historical transaction dataset and the plurality of node-related features for each entity, the heterogeneous transaction graph comprising the plurality of node sets associated with the plurality of entity sets and the plurality of edges.
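The graph generation of claim 16 might look like the following sketch. The record keys (`cardholder`, `merchant`, `amount`, `is_fraud`), the two node sets, and the particular node features are hypothetical choices for illustration; the claims leave the dataset schema and feature set open.

```python
from collections import defaultdict


def build_heterogeneous_graph(transactions):
    """Build node sets with features, plus edges, from transaction records.

    Each record is a dict with the hypothetical keys 'cardholder',
    'merchant', 'amount', and 'is_fraud'.
    """
    node_sets = {"cardholder": {}, "merchant": {}}
    edges = defaultdict(lambda: {"count": 0, "fraud_count": 0})
    for t in transactions:
        for node_type in ("cardholder", "merchant"):
            feats = node_sets[node_type].setdefault(
                t[node_type],
                {"txn_count": 0, "total_amount": 0.0, "fraud_label": False},
            )
            feats["txn_count"] += 1
            feats["total_amount"] += t["amount"]
            # A node is marked with the fraud label once any of its
            # transactions is fraudulent (an assumption of this sketch).
            feats["fraud_label"] = feats["fraud_label"] or t["is_fraud"]
        edge = edges[(t["cardholder"], t["merchant"])]
        edge["count"] += 1
        edge["fraud_count"] += int(t["is_fraud"])
    return {"node_sets": node_sets, "edges": dict(edges)}
```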
17. A server system, comprising:
a memory configured to store instructions;
a communication interface; and
a processor in communication with the memory and the communication interface, the processor configured to execute the instructions stored in the memory and thereby cause the server system to perform at least in part to:
access a heterogeneous transaction graph from a database associated with the server system, the heterogeneous transaction graph comprising a plurality of node sets associated with a plurality of entity sets and a plurality of edges, each node set comprising a plurality of nodes representing a plurality of entities of an entity set, each node associated with a plurality of node-related features of an individual entity and an edge of the plurality of edges indicating information related to a transactional relationship between two distinct nodes connected by the edge;
extract a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features comprising a fraud transaction label, the fraud heterogeneous transaction graph comprising, for each node set, a subset of fraud nodes, a subset of non-fraud nodes, and a subset of edges between the subset of fraud nodes and the subset of non-fraud nodes;
determine a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge, the first edge definition criteria comprising a fraud count metric and a fraud ratio metric;
determine a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge, the second edge definition criteria comprising a decline count metric and a decline ratio metric;
compute for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights;
determine a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge, the third edge definition criteria comprising an approved count metric; and
compute for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights.
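The extraction step recited in claim 17 could be sketched as below. This reflects only one possible reading, stated here as an assumption: an edge is kept when exactly one of its endpoints carries the fraud transaction label, and the fraud and non-fraud node subsets are the endpoints of the kept edges. The function name and input shapes are hypothetical.

```python
def extract_fraud_graph(fraud_label, edges):
    """Extract a fraud subgraph from labeled nodes and edges.

    `fraud_label` maps node -> bool; `edges` is an iterable of (u, v) pairs.
    Assumption: only edges joining a fraud node to a non-fraud node are kept.
    """
    kept = [(u, v) for u, v in edges if fraud_label[u] != fraud_label[v]]
    touched = {n for edge in kept for n in edge}
    fraud_nodes = sorted(n for n in touched if fraud_label[n])
    non_fraud_nodes = sorted(n for n in touched if not fraud_label[n])
    return fraud_nodes, non_fraud_nodes, kept
```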
18. The server system as claimed in claim 17, wherein the server system is further caused, at least in part, to:
receive a transaction authentication request for a transaction between a first entity and a second entity from the plurality of entity sets;
perform, for each node of each node set associated with the plurality of entity sets of the heterogeneous transaction graph:
generating a set of updated node-related features based, at least in part, on the plurality of node-related features and corresponding set of first risk scores, corresponding set of second risk scores, and corresponding set of third risk scores;
determine the plurality of updated node-related features of the first entity and the plurality of updated node-related features of the second entity; and
determine, by a classification model associated with the server system, a fraud score for the transaction based, at least in part, on the plurality of updated node-related features of the first entity and the plurality of updated node-related features of the second entity.
19. The server system as claimed in claim 18, wherein the server system is further caused, at least in part, to perform one of:
label the transaction as a fraudulent transaction based, at least in part, on the corresponding fraud score of the transaction being at least equal to a threshold value; and
label the transaction as a non-fraudulent transaction based, at least in part, on the corresponding fraud score of the transaction being lower than a threshold value.
20. A non-transitory computer-readable storage medium comprising computer-executable instructions that, when executed by at least a processor of a server system, cause the server system to perform a method comprising:
accessing a heterogeneous transaction graph from a database associated with the server system, the heterogeneous transaction graph comprising a plurality of node sets associated with a plurality of entity sets and a plurality of edges, each node set comprising a plurality of nodes representing a plurality of entities of an entity set, each node associated with a plurality of node-related features of an individual entity and an edge of the plurality of edges indicating information related to a transactional relationship between two distinct nodes connected by the edge;
extracting a fraud heterogeneous transaction graph from the heterogeneous transaction graph based, at least in part, on the plurality of node-related features comprising a fraud transaction label, the fraud heterogeneous transaction graph comprising, for each node set, a subset of fraud nodes, a subset of non-fraud nodes, and a subset of edges between the subset of fraud nodes and the subset of non-fraud nodes;
determining a first set of weights for the subset of edges of the fraud heterogeneous transaction graph based, at least in part, on a first edge definition criteria for each edge, the first edge definition criteria comprising a fraud count metric and a fraud ratio metric;
determining a second set of weights for the subset of edges based, at least in part, on a second edge definition criteria for each edge, the second edge definition criteria comprising a decline count metric and a decline ratio metric;
computing for each node of the fraud heterogeneous transaction graph, a set of first risk scores, and a set of second risk scores, based, at least in part, on the first set of weights and the second set of weights;
determining a third set of weights for the plurality of edges of the heterogeneous transaction graph based, at least in part, on a third edge definition criteria for each edge, the third edge definition criteria comprising an approved count metric; and
computing for each node of the heterogeneous transaction graph, a set of third risk scores based, at least in part, on the third set of weights.
| # | Name | Date |
|---|---|---|
| 1 | 202441015505-STATEMENT OF UNDERTAKING (FORM 3) [01-03-2024(online)].pdf | 2024-03-01 |
| 2 | 202441015505-POWER OF AUTHORITY [01-03-2024(online)].pdf | 2024-03-01 |
| 3 | 202441015505-FORM 1 [01-03-2024(online)].pdf | 2024-03-01 |
| 4 | 202441015505-FIGURE OF ABSTRACT [01-03-2024(online)].pdf | 2024-03-01 |
| 5 | 202441015505-DRAWINGS [01-03-2024(online)].pdf | 2024-03-01 |
| 6 | 202441015505-DECLARATION OF INVENTORSHIP (FORM 5) [01-03-2024(online)].pdf | 2024-03-01 |
| 7 | 202441015505-COMPLETE SPECIFICATION [01-03-2024(online)].pdf | 2024-03-01 |
| 8 | 202441015505-Proof of Right [07-05-2024(online)].pdf | 2024-05-07 |
| 9 | 202441015505-Request Letter-Correspondence [25-02-2025(online)].pdf | 2025-02-25 |
| 10 | 202441015505-Power of Attorney [25-02-2025(online)].pdf | 2025-02-25 |
| 11 | 202441015505-Form 1 (Submitted on date of filing) [25-02-2025(online)].pdf | 2025-02-25 |
| 12 | 202441015505-Covering Letter [25-02-2025(online)].pdf | 2025-02-25 |